Including the last line you'd do:
sed -n '/word/,$p'
That selects everything from the first line containing word through the last line of input and prints it.
Not including the last line:
sed '/word/,$!d;$d'
...which deletes every line outside that range and then deletes the last line.
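To see the difference between those two commands, here is a small sketch on made-up input (the sample lines and the names sample, with_last and without_last are only for illustration):

```shell
# Illustrative input: "word" appears on lines 2 and 4, not on the last line.
sample() {
  printf '%s\n' alpha 'word one' beta 'word two' gamma
}

# From the first match through the last line.
with_last=$(sample | sed -n '/word/,$p')

# Same range, minus the final input line.
without_last=$(sample | sed '/word/,$!d;$d')

printf 'including last line:\n%s\n\n' "$with_last"
printf 'excluding last line:\n%s\n' "$without_last"
```

The first result should run from "word one" through "gamma", and the second should stop at "word two".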
And to get from only the last match to the last line you have to try a little harder:
sed -e :n -e '/\n.*word/D;N;$q;bn'
It loops: it never completes the normal sed line cycle but instead appends the next input line to the pattern space buffer and branches back to do so again. But when it has at least two lines in pattern space and the last one matches word, it deletes everything in the buffer except the line that matches word. On the last line it just quits, which breaks the loop. So what gets printed is everything from the last occurring line containing word to the last line.
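A quick way to watch that loop at work, as a sketch on made-up input (the sample lines and the variable name last_match_to_end are hypothetical):

```shell
# Illustrative input: "word" appears on lines 2 and 4, not on the last line.
last_match_to_end=$(
  printf '%s\n' alpha 'word one' beta 'word two' gamma |
    sed -e :n -e '/\n.*word/D;N;$q;bn'
)

printf '%s\n' "$last_match_to_end"
```

With GNU sed this should print "word two" followed by "gamma": each time a later line matches word, D strips the older lines from the front of the buffer one at a time.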
Hmmm... maybe I made that harder than it has to be:
sed 'H;$x;/word/h;$!d'
With that one every line is appended to hold space. But lines matching word then overwrite hold space. Every line in pattern space that is not the last line is deleted. And on the last line, just after it is appended to hold space, the hold and pattern spaces are exchanged (in case the last line also contains word) and everything from the last time word overwrote hold space is printed.
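The hold-space version can be tried the same way. This is only a sketch on made-up input (the sample lines and the variable name from_last_match are hypothetical), chosen so that the first line already matches:

```shell
# Illustrative input: "word" appears on lines 1 and 3, not on the last line.
from_last_match=$(
  printf '%s\n' 'word a' b 'word c' d e |
    sed 'H;$x;/word/h;$!d'
)

printf '%s\n' "$from_last_match"
```

Here the hold space is overwritten at "word a" and again at "word c", so the output should be "word c", "d" and "e".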
You can always do:
tac < fileName | sed '/EndPattern/,$!d;/StartPattern/q' | tac
If your system doesn't have GNU tac, you may be able to use tail -r instead.
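As a sketch of how that pipeline behaves, here it is wrapped in a function and fed made-up input with two complete blocks (the function name last_block_tac and the sample lines are hypothetical, and it assumes tac is available):

```shell
# Hypothetical wrapper around the tac | sed | tac pipeline; reads stdin.
last_block_tac() {
  tac | sed '/EndPattern/,$!d;/StartPattern/q' | tac
}

# Illustrative input: two blocks with unrelated lines around them.
result=$(
  printf '%s\n' a 'StartPattern 1' x 'EndPattern 1' b \
                'StartPattern 2' y 'EndPattern 2' c |
    last_block_tac
)

printf '%s\n' "$result"
```

Reading the input backwards, sed drops everything before the first EndPattern it sees (the trailing "c"), then quits at the first StartPattern, so only the last block should come out.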
You can also do it like:
awk '
  inside {
    text = text $0 RS
    if (/EndPattern/) inside = 0
    next
  }
  /StartPattern/ {
    inside = 1
    text = $0 RS
  }
  END {printf "%s", text}' < filename
But that means reading the whole file.
Note that it may give different results from the tac approach if there's another StartPattern in between a StartPattern and the next EndPattern, or if the last StartPattern does not have an ending EndPattern, or if there are lines matching both StartPattern and EndPattern.
awk '
  /StartPattern/ {
    inside = 1
    text = ""
  }
  inside {text = text $0 RS}
  /EndPattern/ {inside = 0}
  END {printf "%s", text}' < filename
That would make it behave more like the tac+sed+tac approach (except for the unclosed trailing StartPattern case).
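To make the difference between the two awk variants concrete, here is a sketch that runs both on made-up input where a second StartPattern appears before any EndPattern (the function names variant_a and variant_b and the sample lines are hypothetical):

```shell
# First variant: once inside a block, keep accumulating until EndPattern.
variant_a() {
  awk '
    inside {
      text = text $0 RS
      if (/EndPattern/) inside = 0
      next
    }
    /StartPattern/ {
      inside = 1
      text = $0 RS
    }
    END {printf "%s", text}'
}

# Second variant: every StartPattern resets the saved text.
variant_b() {
  awk '
    /StartPattern/ {
      inside = 1
      text = ""
    }
    inside {text = text $0 RS}
    /EndPattern/ {inside = 0}
    END {printf "%s", text}'
}

# Illustrative input: a nested StartPattern before the EndPattern.
nested() {
  printf '%s\n' 'StartPattern 1' 'StartPattern 2' y 'EndPattern 2'
}

out_a=$(nested | variant_a)
out_b=$(nested | variant_b)
printf 'variant_a:\n%s\n\nvariant_b:\n%s\n' "$out_a" "$out_b"
```

The first variant should keep all four lines, while the second should start at "StartPattern 2", which is what the tac+sed+tac pipeline would give for the same input.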
That last one seems to be the closest to your edited requirements. Adding the warning is then simply:
awk '
  /StartPattern/ {
    inside = 1
    text = ""
  }
  inside {text = text $0 RS}
  /EndPattern/ {inside = 0}
  END {
    printf "%s", text
    if (inside)
      print "Warning: EOF reached without seeing the end pattern" > "/dev/stderr"
  }' < filename
To avoid reading the whole file:
tac < filename | awk '
  /StartPattern/ {
    printf "%s", $0 RS text
    if (!inside)
      print "Warning: EOF reached without seeing the end pattern" > "/dev/stderr"
    exit
  }
  /EndPattern/ {inside = 1; text = ""}
  {text = $0 RS text}'
Portability note: for /dev/stderr, you need either a system with such a special file (beware that on Linux, if stderr is open on a seekable file, that will write the text at the beginning of the file instead of at the current position within the file) or an awk implementation that emulates it, like gawk, mawk or busybox awk (those work around the Linux issue mentioned above).
On other systems, you can replace print ... > "/dev/stderr" with print ... | "cat>&2".
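For reference, a sketch of the warning variant with that cat>&2 substitution applied, run on made-up input whose last StartPattern is never closed (the function name last_block_warn and the sample lines are hypothetical):

```shell
# Hypothetical wrapper: prints the last block on stdout and, if that block
# is never closed, a warning on stderr via a pipe to cat.
last_block_warn() {
  awk '
    /StartPattern/ {
      inside = 1
      text = ""
    }
    inside {text = text $0 RS}
    /EndPattern/ {inside = 0}
    END {
      printf "%s", text
      if (inside)
        print "Warning: EOF reached without seeing the end pattern" | "cat>&2"
    }'
}

# Illustrative input: the second block has no EndPattern.
unclosed() {
  printf '%s\n' 'StartPattern 1' x 'EndPattern 1' 'StartPattern 2' y
}

block=$(unclosed | last_block_warn 2>/dev/null)   # stdout only
warning=$(unclosed | last_block_warn 2>&1 >/dev/null)   # stderr only
printf 'stdout:\n%s\n\nstderr:\n%s\n' "$block" "$warning"
```

The block on stdout should be "StartPattern 2" and "y", with the warning arriving separately on stderr.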
Best Answer
With gawk, you can use the split() function to determine fields and their separators: