I'm trying to find a way to print only the lines from a file that don't have duplicates. If this is my file:
A
A
B
B
C
C
Y
Z
I am trying to print out only
Y
Z
Unfortunately, I keep getting
A
B
C
Y
Z
I have tried sort -u, sort | uniq -u, and grep | sort | uniq -u with the same results. I was eventually able to achieve my goal of finding the unique line using uniq -c and looking for the line that only appears one time, but I would like to know how to do this properly in the future.
Best Answer
AWK solution
{arr[$0]++}; builds an associative array that maps each line to the number of times it occurs. If a line is unique in the file, the array item for that line will be 1; otherwise it will be greater than 1.
The END block is executed once we have reached the end of the file. We iterate over the array using a for(value in array) loop, where value is the line itself, and print it if the corresponding array item equals 1, as mentioned before.
Python 3
Same idea as the awk one. Here we use the OrderedDict class to create a dictionary of lines and their counts with preserved order.
And here it is in action:
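A minimal sketch of this approach, not the answer's original script: it counts each line in an OrderedDict and keeps the lines whose count is 1. The function name `singletons` and the inline sample data are illustrative assumptions; the sample mirrors the file from the question.

```python
from collections import OrderedDict


def singletons(lines):
    """Return the lines that occur exactly once, in first-seen order.

    `singletons` is a hypothetical helper name used only for this sketch.
    """
    counts = OrderedDict()
    for line in lines:
        # Drop the newline so a final line without one still compares equal.
        line = line.rstrip("\n")
        counts[line] = counts.get(line, 0) + 1
    # Keep only the lines that were seen exactly once.
    return [line for line, seen in counts.items() if seen == 1]


if __name__ == "__main__":
    # Sample data taken from the question.
    sample = ["A\n", "A\n", "B\n", "B\n", "C\n", "C\n", "Y\n", "Z\n"]
    for line in singletons(sample):
        print(line)  # prints Y, then Z
```

Any iterable of lines works, so an open file object can be passed in place of the sample list. On Python 3.7 and later a plain dict also preserves insertion order, so OrderedDict is kept here mainly to match the description above.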
Perl
Again, the same idea as the Python script, and we're using an ordered hash (see also the Tie::IxHash documentation).
Test run:
sort and uniq variations
These have been mentioned in the comments multiple times already.
or