What is a *NIX way of removing redundancy when I have pairwise comparisons like these in two columns:
A B
B A
A C
A D
C A
D A
B C
C B
A B and B A represent the same comparison, and I would like to remove such redundancy from the dataset. The final result should be:
A B
A C
A D
B C
Best Answer
(or, getting terser with it 'cause Chris Down's answer's so sweet)
which could be further reduced if you don't care about the spaces in your data.
FS is awk's "field separator" variable, used here to guarantee the boundaries between key fields will be properly identified. My original had them run together, $1$2, which as Stephane Chazelas pointed out would have treated A BC and AB C as duplicates.
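For illustration, here is a minimal sketch of one way to do this kind of order-insensitive deduplication. It is not necessarily the exact command from the answer above. It builds one canonical key per line by putting the two fields in sorted order, joined with FS so the field boundary survives in the key. The function name dedup_pairs and the array name seen are made up for this example, and it assumes exactly two whitespace-separated fields per line.

```shell
# dedup_pairs is a hypothetical helper name; it reads files given as
# arguments, or stdin when there are none.
dedup_pairs() {
    # Appending "" forces string (not numeric) comparison of the fields.
    # LC_ALL=C keeps that comparison byte-wise and locale-independent.
    # The key always holds the smaller field first, so "A B" and "B A"
    # map to the same key; FS inside the key keeps "A BC" and "AB C" apart.
    LC_ALL=C awk '{
        key = (($1 "") <= ($2 "")) ? $1 FS $2 : $2 FS $1
    }
    !seen[key]++' "$@"
}

# Usage with the sample data from the question:
printf '%s\n' 'A B' 'B A' 'A C' 'A D' 'C A' 'D A' 'B C' 'C B' | dedup_pairs
```

The first line seen for each pair is printed unchanged and later repeats, in either order, are dropped. Because the key is built once per line, a line whose two fields are identical (such as A A) is still printed the first time it appears.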