Here is what I do right now:
sort -T /some_dir/ --parallel=4 -uo file_sort.csv -k 1,3 file_unsort.csv
The file is 90 GB, and I got this error message:
sort: close failed: /some_dir/sortmdWWn4: Disk quota exceeded
Previously I didn't use the -T option, and apparently the default tmp directory was not large enough to handle this. My current directory has roughly 200 GB of free space. Is that still not enough for the sort's temporary files?
I don't know whether the --parallel option affects this or not.
Best Answer
The problem is that you seem to have a disk quota set up, and your user doesn't have the right to take up that much space in /some_dir. And no, the --parallel option shouldn't affect this.

As a workaround, you can split the file into smaller files, sort each of those separately, and then merge them back into a single file again:
The magic is GNU sort's -m option. From info sort:

    '-m'
    '--merge'
         Merge the given files by sorting them as a group.  Each input
         file must always be individually sorted.  It always works to
         sort instead of merge; merging is provided because it is
         faster, in the case where it works.

That will require you to have ~180G free for a 90G file in order to store all the pieces (the original plus the sorted copies). However, the actual sorting won't need nearly as much temporary space, since you're only ever sorting 100M chunks at a time.
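As a quick sanity check afterwards, sort -c verifies ordering without re-sorting (with -u it also flags duplicate keys) and exits non-zero at the first out-of-place line:

    sort -c -u -k 1,3 file_sort.csv && echo "file_sort.csv is sorted"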