Sort Command – Scalability of ‘sort -u’ for Gigantic Files

sort

What is a reasonable scalability limit of 'sort -u'?
(In terms of line length, number of lines, and total file size?)

What is a Unix alternative for files that exceed this limit in the "number of lines" dimension?
(Of course I could easily implement one myself, but I wondered whether there is something that can be done with a few standard Linux commands.)

Best Answer

The sort that you find on Linux comes from the coreutils package and implements an external R-way merge sort. It splits the data into chunks that it can handle in memory, stores them on disk, and then merges them. The chunks are sorted in parallel if the machine has the processors for that.
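You can steer each part of that process from the command line. A minimal sketch using GNU sort's tuning flags (`-S` for the in-memory buffer size, `-T` for the temp-file directory, `--parallel` for the number of threads); the file names here are just placeholders:

```shell
# Create a small stand-in for a huge input file.
printf 'b\na\nb\nc\na\n' > big.txt

# Sort and deduplicate: 64M memory buffer, temp files in /tmp,
# up to 2 chunks sorted in parallel.
sort -u -S 64M -T /tmp --parallel=2 big.txt -o out.txt

cat out.txt
```

With a genuinely large file you would point `-T` at a filesystem with enough free space and raise `-S` to match available RAM.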

So if there is a limit, it is the free disk space that sort can use to store the temporary files it has to merge, combined with space for the result.
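If you did want to reproduce that split/sort/merge pipeline by hand with standard tools (for instance, to sort the chunks on different machines), a sketch might look like this; the chunk size and file names are arbitrary choices for illustration:

```shell
# Stand-in input: 10..1 plus duplicates of 1..5.
seq 10 -1 1 > big.txt
seq 5 >> big.txt

# 1. Split the input into fixed-size chunks.
split -l 4 big.txt chunk.

# 2. Sort and deduplicate each chunk independently
#    (this is the step that could be distributed).
for f in chunk.*; do sort -n -u -o "$f" "$f"; done

# 3. Merge the pre-sorted chunks in a single pass; -m merges
#    without re-sorting, -u drops duplicates across chunks.
sort -n -m -u chunk.* -o sorted.txt
rm chunk.*

cat sorted.txt
```

This is essentially what GNU sort does internally, so for a single machine it rarely beats plain `sort -u`; it only helps when the chunk-sorting step can run somewhere else.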