Awk or sed command to match regex at specific line, exit true if success, false otherwise

awk, gnu-screen, puppet, sed, text processing

I need to determine whether a file contains a certain regex at a certain line, returning true (exit 0) if it does and false otherwise. Maybe I'm overthinking this, but my attempts proved a tad unwieldy. I have a solution, but I'm looking for others that I hadn't thought of. I could use perl, but I'm hoping to keep this as lightweight as possible, since it runs during a puppet execution cycle.

The problem is common enough: in RHEL6, screen was packaged in a way that limited the terminal width to 80 characters, unless you un-comment the line at 132. This command checks to see if that line has already been fixed:

 awk 'NR==132 && /^#termcapinfo[[:space:]]*xterm Z0=/ {x=1;nextfile} END {exit 1-x}' /etc/screenrc

Note: if the file has fewer than 132 lines, it must exit with false.
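For illustration, here is a minimal sketch of how such a check might be wired into a shell conditional. It is a variation on the command above rather than the same command: it uses exit instead of nextfile, so awk stops reading at line 132 and goes straight to the END rule. The function name line132_is_commented and the SCREENRC override are assumptions made for this example only.

```shell
#!/bin/sh
# Hypothetical wrapper (name is an assumption, not from the original post).
# "exit" in the main rule stops reading input and runs the END rule, which
# sets the final status: 0 if line 132 matched, 1 otherwise. If the file has
# fewer than 132 lines, x is never set and 1-x evaluates to 1.
line132_is_commented() {
    awk 'NR==132 { x = /^#termcapinfo[[:space:]]*xterm Z0=/; exit }
         END { exit 1-x }' "$1"
}

# SCREENRC is an assumed override so the sketch can be pointed at a test file.
screenrc=${SCREENRC:-/etc/screenrc}

if line132_is_commented "$screenrc"; then
    echo "line 132 is still commented out"
else
    echo "line 132 does not match (or the file is too short or unreadable)"
fi
```

Because the function returns awk's exit status unchanged, it can be used anywhere the shell expects a command, for example in an if or with && and ||.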

I thought sed would be of help here, but apparently then you have to do weird tricks like null-substitutions and branches. Still, I'd like to see a sed solution just to learn. And maybe there is something else I overlooked.

EDIT 1: Added nextfile to my awk solution

EDIT 2: Benchmarks

EDIT 3: Different host (idle).

EDIT 4: Mistakenly used Giles' awk time for the optimized perl run.

EDIT 5: New bench

Benchmarks

First, note: wc -l /etc/screenrc is 216.
50k iterations when line not present, measured in wall-time:

  • Null-op: 0.545s
  • My original awk solution: 58.417s
  • My edited awk solution (with nextfile): 58.364s
  • Giles' awk solution: 57.578s
  • Optimized perl solution: 90.352s Doh!
  • Sed 132{p;q}|grep -q ... solution: 61.259s
  • Cuonglm's tail | head | grep -q : 70.418s Ouch!
  • Don_chrissti's head -nX |head -n1|grep -q: 116.9s Brrrrp!
  • Terdon's double-grep solution: 65.127s
  • John1024's sed solution: 45.764s
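The post does not show how these runs were timed. As a rough sketch only, a harness along the following lines would give comparable wall-clock numbers. The function name time_it, the use of true as the null-op, the reduced iteration count, and the one-second resolution of date +%s are all assumptions of this example, not details from the measurements above.

```shell
#!/bin/sh
# Hypothetical harness: run a command N times and print elapsed wall-clock
# seconds. Output and failures of the timed command are deliberately ignored,
# since the benchmarked case is the one where the check returns false.
time_it() {
    label=$1
    n=$2
    shift 2
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 || :
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$label: $((end - start))s"
}

file=${1:-/etc/screenrc}   # file under test; defaults to the one discussed here
runs=1000                  # the figures above used 50k iterations

time_it "null-op" "$runs" true
time_it "awk" "$runs" awk 'NR==132 && /^#termcapinfo[[:space:]]*xterm Z0=/ {x=1;nextfile} END {exit 1-x}' "$file"
time_it "sed" "$runs" sed -n '132 {/^#termcapinfo[[:space:]]*xterm Z0=/q}; $q1' "$file"
```

Subtracting the null-op time from each result gives an estimate of the cost of the command itself rather than of the loop.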

Thank you John and thank you sed! At first perl appeared to be on par here, which surprised me, but that was due to a copy/paste error on my part: perl was slower, as I originally expected. Perl loads a bunch of shared libraries on startup, but as long as the OS is caching them all, it comes down to the parser and bytecode compiler. In the distant past (perl 5.2?) I found it was slower by 20%.

Benchmarks Part 2

The biggest configuration file that has practical value is /etc/services, so I've re-run these benches against that file, with the target line about two-thirds of the way into it. The file has roughly 11000 lines, so I picked line 7220 and modified the regex accordingly (so that in one case it fails and in another it succeeds; for the bench it always fails).

  • John's sed solution: 121.4s
  • Don_chrissti's {head;head}|grep solution: 138.341s
  • Cuonglm's tail|head|grep solution: 77.948s
  • My awk solution: 175.5s

Best Answer

With GNU sed:

sed -n '132 {/^#termcapinfo[[:space:]]*xterm Z0=/q}; $q1' /etc/screenrc

How it works

  • 132 {/^#termcapinfo[[:space:]]*xterm Z0=/q}

    On line 132, check for the regex ^#termcapinfo[[:space:]]*xterm Z0=. If it is found, quit (q) with the default exit code of 0. The rest of the file is skipped.

  • $q1

    If we reach the last line, $, then quit with exit code 1: q1.
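The two rules can be tried out on throwaway files. The following is a minimal sketch, assuming GNU sed (q with an exit code is a GNU extension) and seq from coreutils; the function name is_commented and the sample file contents are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: true if line 132 of the given file matches the regex.
is_commented() {
    sed -n '132 {/^#termcapinfo[[:space:]]*xterm Z0=/q}; $q1' "$1"
}

tmp=$(mktemp -d)

# 216 lines, with line 132 replaced by a commented-out termcapinfo entry.
seq 1 216 | sed '132s/.*/#termcapinfo xterm Z0=x/' > "$tmp/match"
# 216 lines, none of which match.
seq 1 216 > "$tmp/nomatch"
# Only 10 lines, so line 132 is never reached.
seq 1 10 > "$tmp/short"

for name in match nomatch short; do
    if is_commented "$tmp/$name"; then
        echo "$name: true"
    else
        echo "$name: false"
    fi
done

rm -rf "$tmp"
```

This prints true for match and false for the other two: in nomatch and short, sed reaches the last line without having quit, so $q1 sets the exit code to 1.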

Efficiency

Since it is not necessary to read past the 132nd line of the file, this version quits as soon as we reach the 132nd line or the end of the file, whichever occurs first:

sed -n '132 {/^#termcapinfo[[:space:]]*xterm Z0=/q; q1}; $q1' /etc/screenrc

Handling empty files

The version above will return true for empty files. This is because, if the file is empty, no commands are executed and sed exits with the default exit code of 0. To avoid this:

! sed -n '132 {/^#termcapinfo[[:space:]]*xterm Z0=/q1; q}' /etc/screenrc

Here, the sed command exits with code 0 unless the desired string is found, in which case it exits with code 1. The preceding ! tells the shell to invert this code to get back to the code we want. The ! modifier is supported by all POSIX shells. This version will work even for empty files. (Hat tip: G-Man)
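The difference on an empty file can be seen with a short experiment. This is a sketch only, again assuming GNU sed; the function names check_plain and check_inverted are hypothetical labels for the two versions.

```shell
#!/bin/sh
# Version that quits with 0 on a match and 1 otherwise.
check_plain() {
    sed -n '132 {/^#termcapinfo[[:space:]]*xterm Z0=/q; q1}; $q1' "$1"
}

# Inverted version: sed exits 1 on a match, and ! flips the result.
check_inverted() {
    ! sed -n '132 {/^#termcapinfo[[:space:]]*xterm Z0=/q1; q}' "$1"
}

empty=$(mktemp)   # mktemp creates a zero-length file

for check in check_plain check_inverted; do
    if "$check" "$empty"; then
        echo "$check: true"
    else
        echo "$check: false"
    fi
done

rm -f "$empty"
```

On the empty file, check_plain reports true (the false positive described above) while check_inverted reports false, because sed runs no commands, exits 0, and the ! turns that into a failure.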
