Extract Lines from Bottom Until Regex Match – Using AWK or SED

I have this output.

[root@linux ~]# cat /tmp/file.txt
virt-top time  11:25:14 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.0  0.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.0  0.0  95:44:07 instance-00000372
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372

You can see it has two blocks, and I want to extract the last one (in the first block all the CPU values are zero, which I don't care about). In short, I want to extract the following last lines. (Note: sometimes I have more than two instance-* lines, otherwise I could just use "tail -n 2".)

1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372

I have tried sed, awk, and grep in every way I could think of, but have not gotten close to the desired result.

$ tac file.txt | sed -e '/^virt-top/q' | tac
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372

$ awk '/^virt-top/ { a = "" } { a = a $0 ORS } END {printf "%s", a}' file.txt
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372

Best Answer

This feels a bit silly, but:

$ tac file.txt | sed -e '/^virt-top/q' | tac

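GNU tac reverses the file (many non-GNU systems have tail -r instead), and sed prints lines up to and including the first one that starts with virt-top, then quits. The second tac puts the lines back in their original order. You can append sed 1,2d or tail -n +3 to remove the two header lines.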
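As a minimal sketch of the full pipeline with the header removal appended, assuming GNU tac is available: the sample data is copied from the question, while the temporary file and the variable name last_block are only illustrative.

```shell
# Sample input copied from the question; the temp file name is arbitrary.
f=$(mktemp)
cat > "$f" <<'EOF'
virt-top time  11:25:14 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.0  0.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.0  0.0  95:44:07 instance-00000372
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372
EOF

# Reverse the file, stop at the first virt-top line seen (the last one in
# the file), reverse back, then drop the two header lines with tail -n +3.
last_block=$(tac "$f" | sed -e '/^virt-top/q' | tac | tail -n +3)
printf '%s\n' "$last_block"
rm -f "$f"
```

This should print only the instance rows of the final block, however many there are, with their original leading whitespace intact.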

Or in awk:

$ awk '/^virt-top/ { a = "" } { a = a $0 ORS } END {printf "%s", a}' file.txt

It just collects all the lines into a variable and clears that variable whenever it sees a line starting with virt-top, so at the end only the last block is left to print.
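The same collect-and-reset idea can also drop the two header lines inside awk itself. The following is only a sketch of one way to do that: the skip counter and the variable name last_rows are additions for illustration, and the sample data is copied from the question.

```shell
# Sample input copied from the question; the temp file name is arbitrary.
f=$(mktemp)
cat > "$f" <<'EOF'
virt-top time  11:25:14 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.0  0.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.0  0.0  95:44:07 instance-00000372
virt-top time  11:25:17 Host foo.example.com x86_64 32/32CPU 1200MHz 65501MB
   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM   TIME    NAME
    1 R    0    0    0    0  0.6 12.0  96:02:53 instance-0000036f
    2 R    0    0    0    0  0.2 12.0  95:44:08 instance-00000372
EOF

# On each virt-top line, reset the buffer and arrange to skip two lines
# (the virt-top line itself and the column header that follows it).
# Every other line is appended; END prints whatever the last block left.
last_rows=$(awk '
  /^virt-top/ { a = ""; skip = 2 }
  skip > 0    { skip--; next }
              { a = a $0 ORS }
  END         { printf "%s", a }
' "$f")
printf '%s\n' "$last_rows"
rm -f "$f"
```

Because the buffer is reset at every virt-top line, only the rows of the final block survive to the END rule, regardless of how many instance lines each block has.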

If the file is very large, the tac+sed solution is likely to be faster: tac reads from the end of the file and sed quits at the first match, so only the tail end of the file needs to be read, while the awk solution reads the full file from the top.

