Sort Command – Difference Between `sort -u` and `sort | uniq -u`


While exploring bash, I came across a challenge level that consisted of finding the line of text that occurs only once in a certain file.

Why is the output of sort -u file different from the output of sort file | uniq -u? Shouldn't they be the same?

Best Answer

sort -u and sort | uniq do produce the same output*: all of the lines in the input, exactly once each, in ascending order. That is the default behaviour of uniq: with no options, it collapses each run of adjacent identical lines into a single line.
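To see the equivalence concretely, here is a small sketch. The sample lines and the temporary file are made up for illustration, and LC_ALL=C is set only so that the locale caveat in the footnote does not come into play.

```shell
# Hypothetical sample data; not taken from the question.
export LC_ALL=C                       # byte-wise comparison for both sort and uniq
tmp=$(mktemp)
printf '%s\n' banana apple cherry apple banana > "$tmp"

with_flag=$(sort -u "$tmp")           # sort, keeping one copy of each line
with_pipe=$(sort "$tmp" | uniq)       # sort, then collapse adjacent duplicates

printf '%s\n' "$with_flag"
# apple
# banana
# cherry

# Both approaches give the same three lines for this input.
[ "$with_flag" = "$with_pipe" ] && echo "same output"

rm -f "$tmp"
```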

uniq -u, on the other hand, is documented as follows:

-u Suppress the writing of lines that are repeated in the input.

This is a very different behaviour: only the lines that do not repeat are output. uniq only compares adjacent lines, so when the file has been sorted first, those are exactly the lines that appear only once in the entire file (which is what you wanted).
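A quick sketch of the difference, again with made-up sample lines and LC_ALL=C assumed so that sort and uniq agree on what counts as equal:

```shell
# Hypothetical sample data; not taken from the question.
export LC_ALL=C
tmp=$(mktemp)
printf '%s\n' banana apple cherry apple banana > "$tmp"

once=$(sort "$tmp" | uniq -u)     # only lines that occur exactly once
every=$(sort -u "$tmp")           # every distinct line, once each
unsorted=$(uniq -u "$tmp")        # no sort first: uniq sees no adjacent repeats

printf 'sort | uniq -u:\n%s\n' "$once"
# cherry
printf 'sort -u:\n%s\n' "$every"
# apple
# banana
# cherry
printf 'uniq -u without sort:\n%s\n' "$unsorted"
# all five input lines, because none of the duplicates are adjacent

rm -f "$tmp"
```

The last case is why the sort step matters: without it, uniq -u cannot tell that apple and banana are repeated.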


* There are some caveats about how sort and uniq consider equality, which Stéphane has noted in this answer to a related question. For the POSIX locale or files in some normalised form, they're identical; for others, there can be distinguishable differences.