Linux Shell – How to Sort by Last Column

shell sort

I'm running a script (that I have no control over) and get the following output.
I want to sort by the last (3rd) column. The columns are separated by spaces, and the 2nd column itself contains spaces and symbols.

    > ./script
    37622       (this is || test1)&&(SGD||HKD||RMB)     40010
    43944       (this is)&&(SGD||HKD)    102732
    79378       (this is||test2)&&(HKD||RMB)    205425
    457000      (test2) && (SGD||RMB||HKD||YEN)        71
    559658      (test1||test2)&&(RMB||YEN||SGD)     14043

I tried to use sort -k, but it doesn't work. Then I found the question "How to numerical sort by last column?", where the solution provided is:

    awk '{print $NF,$0}' file.txt | sort -nr | cut -f2- -d' '

My question is: how do I make use of this when I run the script?

    > ./script | <something??>

Thank you.

Best Answer

Awk

You can adapt the linked pipe in a straightforward way:

    $ ./script | awk '{ print $NF,$0 }' | sort -k1,1 -n | cut -f2- -d' '

In awk, the expression $x refers to the x-th column of the current line (counting from 1), and the predefined variable NF holds the number of columns in the current line. Since $0 denotes the complete line, print $NF,$0 prints the last column followed by the complete line. After sorting, the cut command strips this prepended key again by printing everything from the 2nd space-separated field onward.
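To make the three stages visible, here is a small sketch that runs them one at a time. The sample function is only a hypothetical stand-in for ./script and prints three of the lines from the question; the variable names are likewise just for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for ./script: three sample lines from the question.
sample() {
    printf '%s\n' \
        '37622       (this is || test1)&&(SGD||HKD||RMB)     40010' \
        '457000      (test2) && (SGD||RMB||HKD||YEN)        71' \
        '559658      (test1||test2)&&(RMB||YEN||SGD)     14043'
}

# Stage 1: prepend the last column ($NF) to each line as a sort key.
decorated=$(sample | awk '{ print $NF, $0 }')

# Stage 2: sort numerically on the prepended key only.
sorted=$(printf '%s\n' "$decorated" | sort -k1,1 -n)

# Stage 3: drop the key again (everything up to the first space).
result=$(printf '%s\n' "$sorted" | cut -d' ' -f2-)

printf '%s\n' "$decorated" '---' "$result"
```

The part above the --- shows each line with its last column copied to the front; the part below it shows the original lines, now ordered by that column.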

The -k1,1 part of sort restricts the sort key to the first column. This only matters when several lines have the same value in the first column. Without -k1,1 the whole line is the key; with -n only its leading number is compared, and lines that still tie are ordered by a last-resort comparison of the entire line. With -k1,1 the key is only the first column, but the same last-resort comparison still applies to ties, so on its own this is not a stable sort. To keep lines with equal keys in their original relative order, add -s (available in GNU and BSD sort).
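The tie-breaking behaviour is easy to check with made-up input. The following sketch assumes a sort that supports -s (GNU and BSD sort do); the ties function and its three lines are invented purely for the demonstration.

```shell
#!/bin/sh
# Invented input: two lines share the key 5, and "5 b" comes first.
ties() { printf '%s\n' '5 b' '5 a' '3 z'; }

# Key is column 1 only, but ties fall back to comparing whole lines,
# so "5 a" ends up before "5 b".
plain=$(ties | sort -k1,1 -n)

# -s disables that fallback: tied lines keep their input order.
stable=$(ties | sort -s -k1,1 -n)

printf '%s\n' "$plain" '---' "$stable"
```

If the order of lines with equal last columns matters for the script output, the -s variant is the one to put into the pipe.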

sed

Alternatively you can solve it via sort and sed:

    $ ./script | sed 's/^\(.\+[ \t]\+\)\([0-9]\+ *\)$/\2 \1/' | \
        sort -k1,1 -n | sed 's/^\([0-9]\+\) \(.\+\)$/\2 \1/'

The \ at the end of the first line escapes the newline; you can remove it and enter the whole pipe on one line.

The idea is to first move the last column to the front, sort by the first column and then put it to the back again.

This assumes that the last column is separated from the rest of the line by whitespace, i.e. [ \t]\+ (one or more spaces or tabs). Note that \+ and \t are GNU sed extensions.

The sed expressions do the swapping via group references (e.g. \2 \1). The groups are marked in the pattern with escaped parentheses: \(...\)
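To see what the two substitutions do, you can apply them to a single line. This sketch assumes GNU sed (for \+ and \t); the line is modeled on the question's sample with shortened spacing, and the variable names are only illustrative.

```shell
#!/bin/sh
# One line modeled on the sample output (two spaces between columns).
line='43944  (this is)&&(SGD||HKD)  102732'

# First substitution: group 2 (the trailing number) moves to the front,
# group 1 (everything before it, including the whitespace) follows.
moved=$(printf '%s\n' "$line" | sed 's/^\(.\+[ \t]\+\)\([0-9]\+ *\)$/\2 \1/')

# Second substitution: the leading number goes back to the end.
restored=$(printf '%s\n' "$moved" | sed 's/^\([0-9]\+\) \(.\+\)$/\2 \1/')

# Brackets make the trailing whitespace of the intermediate line visible.
printf '[%s]\n' "$moved" "$restored"
```

Note that the round trip is not byte-exact: each substitution inserts one space between the groups, so the gap before the last column ends up one space wider than in the input. The column order and contents are unchanged.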
