Ubuntu – Print the second or nth match of a sed range search between two patterns

bash, command line, printing, sed

I would like to print the nth match of a sed search which is based on two patterns, as shown below:

sed -n '/start here/,/end here/p'  'testfile.txt'

Let's say that testfile.txt contains the text below:

start here
0000000
0000000
end here
start here
123
1234
12345

123456
end here
start here
00000000
end here
00000000

00000000

and that I do not want to print the zeros between the two patterns.

The command above prints every block between the two patterns, so its output is:

start here
0000000
0000000
end here
start here
123
1234
12345

123456
end here
start here
00000000
end here

While my desired output is:

start here
123
1234
12345

123456
end here

Note that the lines must be printed exactly as they appear in testfile.txt, not concatenated.

Best Answer

I would just switch to another tool. Perl, for example:

perl -ne '$k++ if /Pattern1/; if(/Pattern1/ .. /Pattern2/){print if $k==3}' file

That will print the 3rd match. Change the $k==3 to whatever value you want. The logic is:

$k++ if /Pattern1/ : increment the value of the variable $k by one if this line matches Pattern1.
if(/Pattern1/ .. /Pattern2/){print if $k==3} : if this line is within the range from /Pattern1/ to /Pattern2/, print it, but only if $k is 3.
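
The same counter-plus-range idea can also be sketched in awk, in case Perl is not available. This is only an illustration and not part of the original answer: the function name nth_block and its argument order are made up here. Unlike the Perl one-liner, it counts a start line only when no block is currently open, and it stops reading as soon as the requested block has been printed.

```shell
# Hypothetical helper: print the Nth block delimited by two regexes, using awk.
# $1 = start regex, $2 = end regex, $3 = which block (1-based), $4 = file
# Caveat: awk's -v processes backslash escapes, so a regex containing
# backslashes may need them doubled.
nth_block() {
  awk -v s="$1" -v e="$2" -v n="$3" '
    $0 ~ s && !inside { k++; inside = 1 }   # a new block opens: count it
    inside && k == n  { print }             # print only lines of block n
    inside && $0 ~ e  {                     # the block closes on this line
      if (k == n) exit                      # requested block done: stop early
      inside = 0
    }
  ' "$4"
}
```

With the example data, `nth_block 'start here' 'end here' 2 testfile.txt` should print the second block, blank line included.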

You could wrap this in a little shell function to be able to get the Nth match more easily:

getNth(){
  pat1="$1"
  pat2="$2"
  n="$3"
  file="$4"

  perl -ne '$k++ if /'"$pat1"'/;if(/'"$pat1"'/ .. /'"$pat2"'/){print if $k=='"$n"'}' "$file"

}

You could then run it like this:

getNth Pattern1 Pattern2 3 'huge file.txt'

Using your example data:

$ perl -lne '$k++ if /start here/;if(/start here/ .. /end here/){print if $k==2}' testfile.txt
start here
123
1234
12345

123456
end here

Or:

$ getNth 'start here' 'end here' 2 testfile.txt
start here
123
1234
12345

123456
end here

Just for fun, here's another perl approach:

$ perl -lne '($k++,$l++) if /start here/; print if $l && $k==2; $l=0 if /end here/' testfile.txt 
start here
123
1234
12345

123456
end here

Or, if you like golfing (thanks @simlev):

perl -ne 'print if /^start here$/&&++$k==2../^end here$/' testfile.txt

Related Solutions

Ubuntu – Print text between two XML tags

sed is a great tool, but XML will eventually make any programmer who approaches it with a regex cry. I know. I've been there. If there is even the smallest chance that your data will change, you want a proper XML parser.

My choice would be BeautifulSoup, but it is fairly hard to drive directly from Bash. If you are willing to write an intermediary Python script, that's still an option. Otherwise xpath is a fairly classic choice. It's a command-line front end to Perl's XML::XPath module and it does some fairly powerful things.

sudo apt-get install libxml-xpath-perl

And for your example, here's how I'd do this in the xpath query language:

xpath -e '*/serverName/*' big_xml_file.xml

Again, if you need to do anything useful with this XML, consider something even stronger like BeautifulSoup and Python.

Ubuntu – How to use sed to find the strings between 2 patterns

I'd use awk for this:

awk '
    /aaa accounting exec default/ {print; exec=1; next} 
    exec {
        if (/^ /) {print; next} else if (/^!/) {print}
        exec=0
    }
' filename

To pass the pattern in as a variable, use awk's -v option together with the pattern match operator ~:

awk -v patt='aaa accounting exec default' '
    $0 ~ patt {print; exec=1; next} 
    exec {
        if (/^ /) {print; next} else if (/^!/) {print}
        exec=0
    }
' filename
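
If the header should be matched as literal text rather than as a regex, one possible variation is to hand it to awk through the environment and compare with index(). This also sidesteps the backslash-escape processing that -v applies to its value. The function name print_section, the variable HEADER and the flag name insec are all invented for this sketch, and the literal-match behaviour is an assumption about what is wanted.

```shell
# Hypothetical helper: print a header line, its indented lines, and a closing "!".
# $1 = literal header text (substring match, not a regex), $2 = file
print_section() {
  HEADER="$1" awk '
    index($0, ENVIRON["HEADER"]) { print; insec = 1; next }   # header found
    insec {
      if (/^ /) { print; next }      # indented line: still inside the section
      else if (/^!/) { print }       # a "!" line closes the section
      insec = 0
    }
  ' "$2"
}
```

It would be run as `print_section 'aaa accounting exec default' filename`.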
