Extract nth line matching pattern and the next N lines

text processing

I have a big file in which a pattern repeats periodically. I want to extract one specific occurrence of that pattern (say, the nth match) together with the next N lines.
Here is an example; the numbers in front of "members of the group" are not actually in the file, they only show which occurrence each line is.

input:

1 members of the group
...
...
2 members of the group
...
...
...
n members of the group
...
...
...

output:

85 members of the group
...
...
...
...
...

(85th match and the next 5 lines)

Best Answer

Here's one way with awk:

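awk -v N=85 -v M=5 '
/PATTERN/ && ++c == N { last = NR + M }  # Nth match: note the last line to print
last && NR <= last                       # print the match and the M lines after it
last && NR == last { exit }              # nothing left to print, stop reading
' infile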

Here N selects the Nth line matching PATTERN and M is the number of lines to print after it. The script counts the matching lines; when the count reaches N, it notes where that match is (the current line number NR) and prints every line from that one through line NR+M.
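
To see the counter at work, here is a small sketch on made-up data. The sample file contents and the `sample` and `result` variable names are assumptions for illustration only, and the awk program is a compact variant of the same counting idea, run with N=2 and M=2:

```sh
# Made-up sample data: three groups, each starting with the same header line
sample=$(mktemp)
printf '%s\n' 'members of the group' a1 a2 \
              'members of the group' b1 b2 b3 \
              'members of the group' c1 > "$sample"

# Second match plus the next 2 lines
result=$(awk -v N=2 -v M=2 '
    /members of the group/ && ++c == N { last = NR + M }  # Nth match: remember where to stop
    last && NR <= last                                    # print the match and the M lines after it
' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

This prints the second "members of the group" line followed by b1 and b2; b3 and everything after it are skipped.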


For the record, here's how to do it with sed (GNU sed syntax):

sed -nE '/PATTERN/{x;/\n{84}/{x;$!N;$!N;$!N;$!N;$!N;p;q};s/.*/&\n/;x}' infile

This uses the hold space as a counter.
Each time it encounters a line matching PATTERN, it exchanges the buffers and checks whether the hold buffer contains N-1 newline characters. If it does, it exchanges again, pulls in the next M lines with M $!N commands, prints the pattern space, and quits.
Otherwise it appends another newline character to the hold space and exchanges back.
This solution is less convenient: it quickly becomes cumbersome when M is large and requires some printf-fu to build the sed script (not to mention the pattern and hold space limits of some sed implementations).
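
The printf-fu could look something like the sketch below. The helper name `nth_match_sed` and the sample data are hypothetical, and it assumes GNU sed, the `seq` utility, N and M of at least 1, and a pattern that contains no "/":

```sh
# Hypothetical helper, not part of the answer above: print the Nth line
# matching an ERE and the M lines after it.
# Assumes GNU sed, N >= 1, M >= 1, and a pattern containing no "/".
nth_match_sed() {
    pattern=$1 n=$2 m=$3 file=$4
    # printf reuses its format once per argument, so this emits "$!N;" m times
    fetch=$(printf '$!N;%.0s' $(seq "$m"))
    sed -nE "/$pattern/{x;/\n{$((n - 1))}/{x;${fetch}p;q};s/.*/&\n/;x}" "$file"
}

# Made-up sample data to try it on
sample=$(mktemp)
printf '%s\n' 'members of the group' a1 a2 \
              'members of the group' b1 b2 b3 \
              'members of the group' c1 > "$sample"

out=$(nth_match_sed 'members of the group' 2 2 "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

With N=2 and M=2 this prints the second "members of the group" line followed by b1 and b2. For the case in the question, the call would be nth_match_sed 'PATTERN' 85 5 infile.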
