Ubuntu – How to find a single unique line in a file

command line, text processing

I'm looking for a way to print only the lines from a file that don't have duplicates. If this is my file:

A
A
B
B
C
C
Y
Z

I am trying to print out only

Y
Z

Unfortunately, I keep getting

A
B
C
Y
Z

I have tried sort -u, sort | uniq -u, and grep | sort | uniq -u with the same results. I was eventually able to achieve my goal of finding the unique line using uniq -c and looking for the line that only appears one time, but I would like to know how to do this properly in the future.

Best Answer

AWK solution

$ awk '{arr[$0]++};END{for(var in arr) if (arr[var] == 1) print var}' input.txt
Y
Z
  • {arr[$0]++}; builds an associative array that maps each line to the number of times it occurs. If a line is unique in the file, the array item for that line will be 1; otherwise it will be greater than 1.
  • The END block runs once the end of the file has been reached. We iterate over the array with a for(var in arr) loop and print each line whose count equals 1. Note that awk does not guarantee the iteration order of a for-in loop, so the output order may differ from the input order.

Python 3

Same idea as the awk one. Here we use the OrderedDict class to build a dictionary of lines and their counts that preserves the order in which lines were first seen.

#!/usr/bin/env python3
import sys
from collections import OrderedDict

if len(sys.argv) != 2:
    sys.stderr.write(">>> Script requires a file argument\n")
    sys.exit(1)

# Map each line to its count; OrderedDict keeps first-seen order.
lines = OrderedDict()
with open(sys.argv[1]) as fd:
    for line in fd:
        # Drop only the trailing newline so other whitespace still counts.
        tmp = line.rstrip("\n")
        lines[tmp] = lines.get(tmp, 0) + 1

# Print only the lines that occurred exactly once.
for line, count in lines.items():
    if count == 1:
        print(line)

And here it is in action:

$ ./get_unique_lines.py input.txt
Y
Z
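
As a shorter variation on the same counting idea, here is a minimal sketch that uses collections.Counter instead of a hand-rolled dictionary. The function name unique_lines and the sample list are made up for illustration. It assumes Python 3.7 or newer, where Counter (a dict subclass) iterates in insertion order.

```python
from collections import Counter


def unique_lines(lines):
    """Return the lines that occur exactly once, in first-seen order.

    `unique_lines` is a hypothetical helper name, not part of the original script.
    """
    # Strip only the trailing newline so other whitespace still counts.
    counts = Counter(line.rstrip("\n") for line in lines)
    # Counter is a dict subclass; on Python 3.7+ it iterates in insertion order.
    return [line for line, count in counts.items() if count == 1]


if __name__ == "__main__":
    # Sample data mirroring the file from the question.
    sample = ["A\n", "A\n", "B\n", "B\n", "C\n", "C\n", "Y\n", "Z\n"]
    print("\n".join(unique_lines(sample)))
```

Because an open file object yields its lines when iterated, the same function could be called as unique_lines(fd) inside a with open(...) block.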

Perl

Again, this is the same idea as the Python script, and we use an ordered hash (see also the Tie::IxHash documentation).

#!/usr/bin/perl
use strict;
use warnings;
use Tie::IxHash;

tie my %linehash, "Tie::IxHash" or die $!;

open(my $fp,'<',$ARGV[0])  or die $!;
while(my $line = <$fp> ){
    chomp $line;
    $linehash{$line}++;
}
close($fp);

for my $key (keys %linehash) {
    printf("%s\n",$key) unless $linehash{$key} > 1;
}

Test run:

$ ./get_unique_lines.pl input.txt
Y
Z

sort and uniq variations

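These have been mentioned in the comments multiple times already. Keep in mind that uniq compares only adjacent lines, so running uniq -u without sort works only when duplicate lines are already next to each other, as they are in the sample file.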

$ sort input.txt | uniq -u
Y
Z

or

$ uniq -u input.txt
Y
Z
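
To illustrate why sorting matters for the uniq approach, here is a rough Python sketch that mimics the adjacent-only comparison of uniq -u using itertools.groupby. The helper name uniq_u and the sample list are hypothetical, and this only approximates the behaviour, it is not how uniq is implemented.

```python
from itertools import groupby


def uniq_u(lines):
    """Mimic `uniq -u`: keep lines whose run of adjacent repeats has length 1.

    `uniq_u` is a hypothetical helper name used only for this illustration.
    """
    result = []
    # groupby merges only consecutive equal items, like uniq does.
    for value, run in groupby(lines):
        if len(list(run)) == 1:
            result.append(value)
    return result


if __name__ == "__main__":
    unsorted = ["A", "B", "A", "Y"]
    # The two "A" lines are not adjacent, so neither is detected as a duplicate.
    print(uniq_u(unsorted))
    # Sorting groups the duplicates, so only "B" and "Y" remain.
    print(uniq_u(sorted(unsorted)))
```

On unsorted input the repeated line slips through, which is why sort input.txt | uniq -u is the safer form when the file is not already grouped.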