Consider this sample file (line numbers are for reference only):
1 Reference duiarneutdigane uditraenturida enudtiar.
2
3 Reference uiae uiaetrtdnsu iatdne uiatrdenu diaren uidtae
4 on line 23.
5
6 uiae
7
8 uaiernd Reference uriadne udtiraeb unledut iaeru uilaedr
9 uiarnde line 234.
I was hoping to match every string beginning with “Reference” and ending with a period (i.e. ll. 1, 3–4, and 8–9) using this grep command (tst is the sample file):
grep -P '(?s)Reference.*?\.' tst
However, it only matches the first line. What I was thinking:
- (?s) enables dot-all mode, so . matches all characters, including newlines.
- .*? should make the star non-greedy, so it doesn’t match the whole file if it ends with a period.
- The expression should end with a literal period, \.
I’ve also tried awk and grep’s -z flag, but with either one I get every line of the file or miss some of the matches I expect.
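The first-line-only result can be reproduced with a small sketch. It assumes GNU grep built with -P support; the temporary file and the variable names `sample` and `line_by_line` are made up for illustration. By default grep tests the pattern against one line at a time, so `(?s)` never has a newline to cross, and only a line that contains both "Reference" and a period can match.

```shell
#!/usr/bin/env bash
# Recreate the sample file without the reference line numbers.
# (The temp file stands in for "tst"; the name is an assumption.)
sample=$(mktemp)
cat > "$sample" <<'EOF'
Reference duiarneutdigane uditraenturida enudtiar.

Reference uiae uiaetrtdnsu iatdne uiatrdenu diaren uidtae
on line 23.

uiae

uaiernd Reference uriadne udtiraeb unledut iaeru uilaedr
uiarnde line 234.
EOF

# The command from the question, plus -n to show which lines matched.
# grep applies the pattern to each line on its own.
line_by_line=$(grep -nP '(?s)Reference.*?\.' "$sample")
printf '%s\n' "$line_by_line"
```

Only line 1 should be reported, since it is the only line where "Reference" and the closing period sit on the same line.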
Best Answer
You can use this:
grep -Pzo '(?s)Reference.*?\.' tst.txt
where tst.txt is your input file. It is the same regex as yours, but with two new flags.
I added the -z flag, which makes grep treat the NUL character, rather than the newline, as the line terminator. The newlines are still in the input, so the pattern can match across them, but because the file contains no NUL characters grep sees the whole input as one big line.
The -o flag means that grep prints only the matched part.
I got the following output:
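For anyone who wants to check the effect of -z and -o, here is a minimal sketch, assuming GNU grep with -P support. The temporary file and the variable names `sample`, `matches` and `match_count` are illustrative only. With -z, grep also ends each printed match with a NUL instead of a newline, so the sketch pipes the result through tr to make the separate matches visible.

```shell
#!/usr/bin/env bash
# Recreate the sample file without the reference line numbers.
# (The temp file stands in for "tst.txt"; the name is an assumption.)
sample=$(mktemp)
cat > "$sample" <<'EOF'
Reference duiarneutdigane uditraenturida enudtiar.

Reference uiae uiaetrtdnsu iatdne uiatrdenu diaren uidtae
on line 23.

uiae

uaiernd Reference uriadne udtiraeb unledut iaeru uilaedr
uiarnde line 234.
EOF

# -z: input and output records end in NUL, so the file is one record.
# -o: print only the matched text, one output record per match.
# tr converts the NUL after each match into a newline for display.
matches=$(grep -Pzo '(?s)Reference.*?\.' "$sample" | tr '\0' '\n')
printf '%s\n' "$matches"

# Count the matches by counting the NUL terminators grep emitted.
match_count=$(grep -Pzo '(?s)Reference.*?\.' "$sample" | tr -cd '\0' | wc -c)
printf 'matches: %d\n' "$match_count"
```

Without the tr step the matches run together on a terminal, because the NUL separators are not displayed.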