Does AWK have an ability similar to SED's to select line ranges based on text in a line rather than a line number?

Tags: awk, command line, macintosh, sed

Resolution: the files were saved with CR rather than LF line breaks.
Mosvy pointed this out, but only posted as a comment, rather than an answer, so I am unable to officially thank him for helping me to find the cause and solve the problem.

Thanks mosvy, if you come back please post as an answer so I can give you a thumbs up.

SED seems to support this:

sed '3,10d;/<ACROSS>/,$d' input.txt > output.txt

(Delete lines 3-10, then delete from the line containing "<ACROSS>" to the end of the file, and write the result to output.txt.)

I also tried with only:

sed '3,10d' input.txt > output.txt

For some reason, neither command works on my Mac.

Not sure what else to try.

I am hoping there is something very similar with AWK.

Update:

When I enter:

sed '3,10d' input.txt > output.txt

it does not delete lines 3-10; it just writes the entire file to output.txt.

When I try:

sed '/<ACROSS>/,$d' input.txt > output.txt

output.txt is blank.

Also, I'm on OS X 10.9.4.

Update 2:

Thank you to mosvy!! I wish I could upvote your comment. It was the problem solver.

It turns out the file was saved with CR rather than LF line breaks.

Converting it to LF line breaks cured everything.

Thanks to everyone who contributed.

Best Answer

The OP's problem was caused by the file using CR (\r, ASCII 13) instead of LF (\n, ASCII 10) as its line terminator, while sed expects LF. CR was the convention in classic Mac OS. As a non-Mac user, the only place I've met it in the wild in the last two decades is in PDF files, where it greatly complicates any naive PDF parser written in Perl (unlike RS in mawk and gawk, $/ in Perl cannot be a regex).
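
A minimal sketch of the diagnosis and the fix, assuming a made-up three-line sample and throwaway file names (cr.txt and lf.txt are illustrations, not the OP's files). With no LF in the file, sed sees one long line, so 3,10d never applies and /<ACROSS>/,$d deletes that single line, which is the whole file:

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)

# Hypothetical sample: three "lines" terminated by CR, as classic Mac OS editors wrote them.
printf 'one\rtwo\rthree\r' > "$tmpdir/cr.txt"

# wc -l counts LF characters, so the CR-only file appears to have no complete line.
cr_count=$(wc -l < "$tmpdir/cr.txt" | tr -d ' ')

# Translate every CR into LF.
tr '\r' '\n' < "$tmpdir/cr.txt" > "$tmpdir/lf.txt"
lf_count=$(wc -l < "$tmpdir/lf.txt" | tr -d ' ')

echo "LF count before: $cr_count, after: $lf_count"

# After the conversion, line addresses behave as expected: this drops only line 2.
kept=$(sed '2d' "$tmpdir/lf.txt")
echo "$kept"

rm -rf "$tmpdir"
```

This translation is only right for CR-only files. For a file with CRLF (Windows) line endings, deleting the CR with tr -d '\r' is the usual choice, since translating it would produce empty lines.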


As to the question from the title, yes, awk supports range patterns, and you can freely mix regexps and line number predicates (or any expression) in them. For example:

NR==1,/rex/   # all lines from the first up to (and including)
              # the one matching /rex/

/rex/,0   # from the line matching /rex/ up to the end-of-file.
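
As an illustration, here is one way the sed command from the question could be written with awk range patterns. It is a sketch run against a made-up 13-line input (the l1 ... l13 names are placeholders), not the only possible translation:

```shell
# Hypothetical input: 13 lines, with the marker on line 12.
input=$(printf '%s\n' l1 l2 l3 l4 l5 l6 l7 l8 l9 l10 l11 '<ACROSS>' l13)

# NR==3,NR==10 skips lines 3-10; /<ACROSS>/,0 skips from the marker to end-of-file
# (the end predicate 0 is never true); everything else is printed.
awk_out=$(printf '%s\n' "$input" |
    awk 'NR==3,NR==10 { next } /<ACROSS>/,0 { next } { print }')

# The sed command from the question, for comparison.
sed_out=$(printf '%s\n' "$input" | sed '3,10d;/<ACROSS>/,$d')

echo "$awk_out"
```

Both commands keep only l1, l2 and l11 for this input.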

awk's ranges differ from sed's: in awk the end predicate can also match the line that started the range, whereas sed tests an end regex only from the following line onward. sed's behavior can be emulated with:

s=/start/, !s && /last/ { s = 0; print }
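
To make the difference concrete, the following sketch runs all three variants on a made-up input in which one line matches both /start/ and /last/:

```shell
# Hypothetical sample in which line 2 matches both the start and the end pattern.
input=$(printf '%s\n' a 'start last' b last c)

# Plain awk range: the end pattern is tested on the opening line too,
# so the range is closed again on that same line.
awk_plain=$(printf '%s\n' "$input" | awk '/start/,/last/')

# sed tests the end regex only from the line after the one that opened the range.
sed_range=$(printf '%s\n' "$input" | sed -n '/start/,/last/p')

# The emulation: s is 1 on the opening line, which blocks the end test there;
# the action then resets s so later lines can close the range.
awk_emulated=$(printf '%s\n' "$input" |
    awk 's=/start/, !s && /last/ { s = 0; print }')

printf 'awk plain:\n%s\n\nsed:\n%s\n\nawk emulated:\n%s\n' \
    "$awk_plain" "$sed_range" "$awk_emulated"
```

The plain awk range prints only the "start last" line, while sed and the emulated version both print three lines, from "start last" through "last".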

However, ranges in awk are still quite limited because they are not real expressions (they cannot be negated, made part of other expressions, used in if(...), etc). Also, there is no magic: if you want to express something like a range with "context" (e.g. /start/-4,/end/+4) you'll have to roll your own circular buffer and extra logic.
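
A rough sketch of such a hand-rolled range with context follows. The variable names (before, after, buf, inside and so on) and the sample input are assumptions made for illustration, and the end pattern is tested sed-style, only after the opening line:

```shell
# Hypothetical input: the range start..end with unrelated lines around it.
input=$(printf '%s\n' a b c start x end d e f)

context_out=$(printf '%s\n' "$input" | awk -v before=2 -v after=1 '
{
    if (!inside && $0 ~ /start/) {
        # Range opens: replay the buffered preceding lines, oldest first,
        # skipping any that were already printed as trailing context.
        for (i = NR - before; i < NR; i++)
            if (i > printed) print buf[i % before]
        inside = 1
        opened = NR
    }
    if (inside || NR <= tail_until) {
        print
        printed = NR
    }
    if (inside && NR > opened && $0 ~ /end/) {
        # Range closes: keep printing for "after" more lines.
        inside = 0
        tail_until = NR + after
    }
    # Circular buffer holding the last "before" lines.
    if (before > 0) buf[NR % before] = $0
}')

echo "$context_out"
```

With before=2 and after=1 this prints b, c, start, x, end and d, each on its own line. The buffer stores only the last "before" lines, so memory use stays constant however large the input is.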
