Extract Lines from Bottom Until Regex Match – Using AWK or SED

Tags: awk, command line, linux, regular expression, sed

I have this output.

[root@linux ~]# cat /tmp/file.txt
virt-top time  11:25:14 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.0  0.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.0  0.0  95:44:07 instance-00000372
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372

As you can see, it has two blocks, and I want to extract the last one (in the first block all the CPU values are zero, which I don't care about). In short, I want to extract the following lines. Note that there are sometimes more than two instance-* lines; otherwise I could just use "tail -n 2".

1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372

I have tried sed, awk and grep in every way I could think of, but have not gotten close to the desired result.

Best Answer

This feels a bit silly, but:

$ tac file.txt |sed -e '/^virt-top/q' |tac
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372

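GNU tac prints the file with its lines in reverse order (many non-GNU systems have tail -r instead), and sed prints lines up to and including the first one that starts with virt-top, then quits. The second tac restores the original order. You can append sed 1,2d or tail -n +3 to the pipeline to remove the two header lines.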
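
For illustration, here is a minimal sketch of the whole pipeline with the header removal added. It assumes GNU tac is available; the temporary file and the last_block variable are made up for this example and only stand in for your real virt-top output.

```shell
# Build a small sample file shaped like the virt-top output in the question.
# The temporary path is an assumption of this sketch, not part of the original.
f=$(mktemp)
cat > "$f" <<'EOF'
virt-top time  11:25:14 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.0  0.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.0  0.0  95:44:07 instance-00000372
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372
EOF

# Reverse the file, stop at the first virt-top line seen (the last one in
# the file), reverse back, then drop the two header lines with tail -n +3.
last_block=$(tac "$f" | sed -e '/^virt-top/q' | tac | tail -n +3)
printf '%s\n' "$last_block"

rm -f "$f"
```

Because tail -n +3 starts printing at the third line, this keeps working when the last block has more than two instance-* lines.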

Or in awk:

$ awk '/^virt-top/ { a = "" } { a = a $0 ORS } END {printf "%s", a}' file.txt 
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372

It just collects all the lines into a variable, and clears that variable whenever it sees a line starting with virt-top, so at the end the variable holds only the last block.
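
If you also want the awk version to drop the two header lines, one possible variation is to skip a fixed number of lines each time a new block starts. This is only a sketch: the skip counter, the result variable and the temporary sample file are names invented for this example, and it assumes every block begins with exactly two header lines.

```shell
# Sample input shaped like the question's file (temporary path is an
# assumption of this sketch).
f=$(mktemp)
cat > "$f" <<'EOF'
virt-top time  11:25:14 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.0  0.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.0  0.0  95:44:07 instance-00000372
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372
EOF

# On each virt-top line: reset the buffer and arm a two-line skip
# (the virt-top line itself plus the column-header line).
# While skip is positive, discard the line; otherwise append it to the buffer.
result=$(awk '
    /^virt-top/ { a = ""; skip = 2 }
    skip > 0    { skip--; next }
                { a = a $0 ORS }
    END         { printf "%s", a }
' "$f")
printf '%s\n' "$result"

rm -f "$f"
```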

If the file is very large, the tac+sed solution is likely to be faster: tac can read a regular file from the end and sed quits at the first match, so only the tail end of the file needs to be read, while the awk solution reads the whole file from the top.
