Lum – Sort a delimited file lexicographically by one column, numerically by another

awk, columns, sort, text formatting, text processing

I wish to sort the TSV file below (called min_ex) by the first column lexicographically and by the second column numerically.

A X, N    2.2
A, N    5.7
A, A    5.8
A, N    2.1
A, T    0.2
B G, M    2.3
B, L    0.1
B, I    0.2
B, M    9.3
B, C    9.9

I tried to do it with sort -k1,2 -n min_ex, but it doesn't work, as it results in:

A, A    5.8
A, N    2.1
A, N    5.7
A, T    0.2
A X, N    2.2
B, C    9.9
B G, M    2.3
B, I    0.2
B, L    0.1
B, M    9.3

I am also pretty sure (through experimentation) that sort is taking any blank space as the delimiter, but I don't see an option to set the separator.

I'd like to have solutions using either pure AWK or no-sed at all (preferably both, separately), and I'd like to remain as POSIX compliant as possible.

Best Answer

sort -t$'\t' -k1,1 -k2,2n

does the trick, and it’s POSIX-compliant apart from the $'\t' part. -t specifies the field delimiter (instead of blank-to-non-blank transitions, which is the default); the n suffix can be applied to single field definitions.
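For shells that lack $'...' quoting, one common workaround is to produce the tab with printf and pass it to -t through a variable. The following is a minimal sketch under a few assumptions: the variable names tab and sorted are only illustrative, the four input rows are a subset of the question's data recreated with real tabs, and LC_ALL=C is added so that the lexicographic comparison is plain byte order (drop it to use the locale's collation instead).

```shell
# Hold a literal tab in a variable; command substitution only strips
# trailing newlines, so the tab survives.
tab=$(printf '\t')

# Illustrative input: a few rows from the question, tab-separated.
# LC_ALL=C is an assumption made here to get byte-order comparison.
sorted=$(printf 'B, M\t9.3\nA, N\t5.7\nA, N\t2.1\nA X, N\t2.2\n' |
    LC_ALL=C sort -t "$tab" -k1,1 -k2,2n)

# Rows with the same first field ("A, N") come out ordered by the
# numeric value of the second field.
printf '%s\n' "$sorted"
```

On the actual file this would be sort -t "$tab" -k1,1 -k2,2n min_ex. Restricting the first key to -k1,1 matters: a key such as -k1,2 runs from field 1 through field 2, so the second column would never be compared on its own.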