Extract nth line matching pattern and the next N lines

text processing

I have a big file in which a pattern repeats periodically. I want to extract one specific occurrence of the pattern (for example, the 85th match) together with the next N lines.
Here is an example; the numbers in front of "members of the group" are only there to label the occurrences and do not exist in the real file.

input:

1 members of the group
...
...
2 members of the group
...
...
...
n members of the group
...
...
...

output:

85 members of the group
...
...
...
...
...

(85th match and the next 5 lines)

Best Answer

Here's one way with awk:

awk -v N=85 -v M=5 '
/PATTERN/ && ++c == N { first = NR; last = NR + M }   # Nth match: record the window
last && NR >= first && NR <= last                     # print lines inside the window
' infile

Here N selects the Nth line matching PATTERN and M is the number of lines to print after it. The script counts the matches; when the Nth match is found it saves that line number and the number of the last wanted line (NR+M), then prints every line in that range.
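Since the file is described as big, it may be worth stopping as soon as the last wanted line has been printed instead of reading to the end. The following is a minimal sketch of that variant, not the answer's original code; the function name nth_match_with_context and its argument order are hypothetical.

```shell
# Hypothetical helper: nth_match_with_context N M PATTERN [FILE...]
# Prints the Nth line matching PATTERN plus the M lines after it, then stops.
nth_match_with_context() {
    n=$1 m=$2 pattern=$3
    shift 3
    awk -v n="$n" -v m="$m" -v pat="$pattern" '
        $0 ~ pat && ++c == n { last = NR + m }    # Nth match: remember the last line wanted
        last { print; if (NR == last) exit }      # inside the window: print, quit at its end
    ' "$@"
}
```

For the example in the question this would be called as nth_match_with_context 85 5 'members of the group' infile. If the file ends before M further lines are available, whatever follows the match is printed.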

For the record, this is how you do it with sed (GNU sed syntax):

sed -nE '/PATTERN/{x;/\n{84}/{x;$!N;$!N;$!N;$!N;$!N;p;q};s/.*/&\n/;x}' infile

This uses the hold space as a counter.
Each time it encounters a line matching PATTERN it exchanges the buffers and checks whether the hold buffer already contains N-1 newline characters (84 here). If it does, it exchanges again, pulls in the next M lines with M copies of the $!N command, prints the pattern space and quits.
Otherwise it appends another newline character to the hold space and exchanges back.
This solution is less convenient: it quickly becomes cumbersome when M is large, since the sed script then has to be generated (with printf, for example), and some sed implementations limit the size of the pattern and hold spaces.
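As an illustration of building the script rather than typing $!N M times, here is a rough sketch that assumes GNU sed and a printf that supports the %*s width form. The function name nth_match_sed is hypothetical, and PATTERN must not contain a slash because it is pasted between / delimiters.

```shell
# Hypothetical helper: nth_match_sed N M PATTERN FILE   (GNU sed assumed)
nth_match_sed() {
    n=$1 m=$2 pattern=$3 file=$4
    # Build M copies of "$!N;": print M spaces, then turn each space into a command.
    pulls=$(printf '%*s' "$m" '' | sed 's/ /$!N;/g')
    # Same logic as the one-liner above, with N-1 and the $!N run filled in.
    sed -nE "/$pattern/{x;/\n{$((n - 1))}/{x;${pulls}p;q};s/.*/&\n/;x}" "$file"
}
```

With the question's numbers, nth_match_sed 85 5 PATTERN infile should expand to the same script as the hand-written command.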

Related Solutions

Print Nth Line – How to Print Nth Line Before Each Matching Pattern

This needs a buffer of lines.

Try this:

awk -v N=4 -v pattern="example.*pattern" '{i=(1+(i%N)); if (NR>N && $0 ~ pattern) print buffer[i]; buffer[i]=$0}' file

Set N to how many lines before the match the wanted line is (the Nth line before the pattern).

Set pattern to the regex to search for.

buffer is an array of N elements used as a circular store for the most recent N lines. Each time a line matches the pattern, the Nth line before it is printed.
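To make the circular buffer concrete, here is a small worked sketch on sample input. The function name nth_line_before and the sample lines are made up for illustration; it tests NR > n rather than the buffer contents, so an empty line N lines back is still printed.

```shell
# Hypothetical helper: nth_line_before N PATTERN [FILE...]
nth_line_before() {
    n=$1 pattern=$2
    shift 2
    awk -v n="$n" -v pattern="$pattern" '
        { i = NR % n }                               # slot for this line in the ring
        NR > n && $0 ~ pattern { print buffer[i] }   # slot still holds line NR-n
        { buffer[i] = $0 }                           # overwrite it with the current line
    ' "$@"
}

# Sample run: print the 2nd line before each line matching MATCH.
printf '%s\n' one two three MATCH four MATCH | nth_line_before 2 MATCH
# prints:
#   two
#   MATCH
```

The second output line is MATCH because the line two places before the final match is itself the first matching line.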

Remove line matching a pattern if next line doesn’t match another pattern

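With sed:

sed -ne '/NSAS_HOST/{N;/NOT OK/{p}};/NSAS_HOST/!p' FILE

On each NSAS_HOST line, N appends the next line; the pair is printed only if it contains NOT OK, and every other line is printed as is. Note that if one NSAS_HOST line is immediately followed by another, N consumes both and neither is printed.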

OUTPUT:

NSAS_HOST:emsacssbcon01
NOT OK main load processes
NOT OK 5.3% AXConfigurator
NOT OK eth0.orig is not UP, but ifcfg-eth0.orig sets ONBOOT=yes
NOT OK eth1.bak is not UP, but ifcfg-eth1.bak sets ONBOOT=yes
NOT OK eth1.orig is not UP, but ifcfg-eth1.orig sets ONBOOT=yes
NSAS_HOST:emsacssb03
NOT OK eth0.orig is not UP, but ifcfg-eth0.orig sets ONBOOT=yes
NOT OK eth1.orig is not UP, but ifcfg-eth1.orig sets ONBOOT=yes
NSAS_HOST:d02-b2bpgdb01
NOT OK bond0: device speed not determined
NOT OK bond1: device speed not determined
