How to print top five highest numbers from a column


I have a text file with four whitespace-separated columns. I need to read the whole file and print the five highest values from column 3, each along with its column 1 value. The sample input is shown first, followed by the expected output.


xm|340034177|ref|RT_235820.1|   139697  192 0
xm|161622288|ref|RT_340093.1|   153819  2607    0
xm|75755638|ref|RT_557407.1|    153821  1937    0
xm|108773031|ref|RT_678101.1|   161452  1688    0
xm|30352011|ref|RT_784766.1|    150568  105 0


xm|161622288|ref|RT_340093.1|   2607
xm|75755638|ref|RT_557407.1|    1937
xm|108773031|ref|RT_678101.1|   1688
xm|340034177|ref|RT_235820.1|   192
xm|30352011|ref|RT_784766.1|    105

Best Answer

sort -k3n,3 filename | tail -5 | cut -d " " -f1,6-7

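The above command sorts the file numerically on the 3rd field in ascending order, so the largest values end up at the bottom. Piping this output to tail -5 keeps the last five lines, which hold the five highest numbers in the 3rd column. If you need only the first column and the 3rd column in the output, you can pipe that to the cut command. Note that cut -d " " treats every single space as a separator, so the field numbers 6-7 depend on the exact number of spaces between the columns in this file.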


cat filename

T_235820.1|   139697  192 0
xm|161622288|ref|RT_340093.1|   153819  2607    0
xm|75755638|ref|RT_557407.1|    153821  1937    0
xm|108773031|ref|RT_678101.1|   161452  1688    0
xm|30352011|ref|RT_784766.1|    150568  105 0
T_235820.1|   139697  192 0
xm|161622288|ref|RT_340093.1|   153819  607    0
xm|75755638|ref|RT_557407.1|    153821  937    0
xm|108773031|ref|RT_678101.1|   161452  1881    0
xm|30352011|ref|RT_784766.1|    150568  1051 0

Now, I run the above command on this file.

sort -k3n,3 filename | tail -5 | cut -d " " -f1,6-7

The output that I get is,

xm|30352011|ref|RT_784766.1|  1051
xm|108773031|ref|RT_678101.1| 1688 
xm|108773031|ref|RT_678101.1| 1881 
xm|75755638|ref|RT_557407.1|  1937
xm|161622288|ref|RT_340093.1| 2607 
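
If the exact number of spaces between the columns is not guaranteed, a variant that does not rely on cut field positions may be easier to reuse. The following is a minimal sketch, not the command from the answer above: it assumes the columns are separated by any run of spaces or tabs, and it sorts in descending order so the highest value comes first, as in the expected output of the question. The file name top5_sample.txt and the function name top5 are made up for illustration.

```shell
# Build a small sample file (hypothetical name) from the data in the question.
cat > top5_sample.txt <<'EOF'
xm|340034177|ref|RT_235820.1|   139697  192 0
xm|161622288|ref|RT_340093.1|   153819  2607    0
xm|75755638|ref|RT_557407.1|    153821  1937    0
xm|108773031|ref|RT_678101.1|   161452  1688    0
xm|30352011|ref|RT_784766.1|    150568  105 0
EOF

# Sort numerically on field 3 with the largest value first, keep five lines,
# then print fields 1 and 3. awk splits on any run of blanks, so the
# number of spaces between columns does not matter.
top5() {
    sort -k3,3nr "$1" | head -n 5 | awk '{print $1, $3}'
}

top5 top5_sample.txt
```

Here sort -k3,3nr restricts the key to field 3 and reverses the numeric order, so head -n 5 takes the place of tail -5.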


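The -n option already handles negative numbers and decimal fractions. If the column may also contain values in scientific notation (for example 1.5e3), use the -g (general numeric) flag instead of -n; GNU sort rejects the two together as incompatible. The command would then look like this:

sort -k3g,3 filename | tail -5 | cut -d " " -f1,6-7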
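
The practical difference between -n and -g shows up with scientific notation. The sketch below uses a few made-up rows (not taken from the question) and assumes a sort that supports -g, such as GNU sort. LC_ALL=C is set only to make sure the decimal point is read as ".". The helper name vals is hypothetical.

```shell
# Made-up rows: column 3 holds a plain integer, a negative decimal,
# a value in scientific notation, and a positive decimal.
vals() {
    printf '%s\n' 'a 0 250' 'b 0 -3.5' 'c 0 1e3' 'd 0 12.75'
}

# -n stops reading at the "e", so "1e3" is compared as 1.
echo "with -n:"
vals | LC_ALL=C sort -k3,3n

# -g parses the whole token, so "1e3" is compared as 1000.
echo "with -g:"
vals | LC_ALL=C sort -k3,3g
```

With -n the row holding 1e3 sorts near the bottom of the value range, while with -g it sorts as the largest value, so tail -5 can pick different rows depending on the flag.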