I have a file with numbered lines. The numbers take up the first 7 characters of each line. I want to check the remainder of each line for duplicates and output only the duplicated lines.
For example, my file might be:
1 abcde
2 12345789
3 6789
4 000000
5 abcde
In which case I would want my output to be:
1 abcde
5 abcde
Output formatting doesn't matter much of course, though it'd be great if the duplicate strings were matched together so I can find them more easily.
I'm using Linux.
Best Answer
sort the file on the second field, and tell GNU uniq to skip the first 7 characters (-s 7) and print all repeated lines (-D):
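A minimal sketch of that pipeline, assuming GNU coreutils and a number prefix that is exactly 7 characters wide. The sample file, the `input` and `dupes` variable names, and the `LC_ALL=C` setting (used only to make the sort order predictable) are illustrative assumptions, not part of the original question:

```shell
# Hypothetical sample file: a 6-wide line number plus one space (7 characters), then the text.
input=$(mktemp)
printf '%6d %s\n' 1 abcde 2 12345789 3 6789 4 000000 5 abcde > "$input"

# Sort on the text (field 2 to end of line) so identical strings become adjacent,
# then let uniq ignore the 7-character number prefix (-s 7) and print
# every line whose remaining text is repeated (-D, a GNU extension).
dupes=$(LC_ALL=C sort -k2 "$input" | uniq -D -s 7)
printf '%s\n' "$dupes"

rm -f "$input"
```

With this sample, only lines 1 and 5 (both abcde) should be printed, next to each other because the sort has already grouped them. If several different strings are duplicated, GNU uniq's --all-repeated=separate can be used in place of -D to put a blank line between the groups.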