I have a file with multiple columns and want to identify the lines where specific column values (cols 3-6) have been duplicated.
The following code finds the duplicates, but I want to display both instances, not just the second. The other column values (cols 1, 2 and 7+) can differ between the two lines, hence the need to view both instances.
awk 'seen[$3, $4, $5, $6]++ == 1' filename
Best Answer
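uniq is the correct tool for that:
uniq -D -f2 filename
Where:
-D - prints all duplicated lines (a GNU uniq option; uniq only compares adjacent lines, so sort the input first if the duplicates are not next to each other)
-f2 - skips the first 2 fields when comparing
Edit: If fields 7 and above are not to be compared either, you need awk, with logic along these lines:
The x[] array item keyed on columns 3-6 is checked. If it is already set, the part in {...} runs (in the same statement, a variable is set to the value of that array item).
{...}: the variable and the current line $0 are then printed.
Finally, the x[] array item is set to the current line contents, for later comparison in the next iteration.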
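The steps described above can be sketched roughly as follows. This is a variant, not necessarily the exact command the answer had in mind: it remembers the first line seen for each key and uses a flag so that a key occurring three or more times does not print any line twice. The names show_dups, key, first and shown, and the sample data, are made up here for illustration.

```shell
# Print every line whose columns 3-6 also occur on another line.
# Reads the named files, or standard input when no file is given.
show_dups() {
    awk '
    # Build the comparison key from columns 3-6 only.
    { key = $3 SUBSEP $4 SUBSEP $5 SUBSEP $6 }

    # Key seen before: print the remembered first line once,
    # then the current line.
    key in first {
        if (!(key in shown)) { print first[key]; shown[key] = 1 }
        print
        next
    }

    # First time this key is seen: remember the whole line.
    { first[key] = $0 }
    ' "$@"
}

# Illustrative input: columns 1, 2 and 7 differ, columns 3-6 repeat.
show_dups <<'EOF'
a1 b1 k1 k2 k3 k4 x
a2 b2 m1 m2 m3 m4 y
a3 b3 k1 k2 k3 k4 z
a4 b4 n1 n2 n3 n4 w
a5 b5 k1 k2 k3 k4 v
EOF
```

On the sample input this prints the a1, a3 and a5 lines, in input order. For a real file, call it as show_dups filename. Unlike uniq, this does not need the input to be sorted, at the cost of keeping one line per distinct key in memory.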