When working with sed, I typically find it easiest to keep narrowing the set of possible outcomes. This is why I sometimes lean on the ! negation operator. It is very often simpler to prune uninteresting input away than it is to pointedly select the interesting kind (at least, that is my opinion on the matter).
I find this method more in line with sed's default behavior, which is to auto-print pattern space at the script's end. For simple things such as this, it can also more easily result in a robust script: one that does not depend on certain implementations' syntax extensions in order to operate (as is commonly seen with sed {functions}).
This is why I recommended you do:
sed '10,15!d;/pattern/!d;=' <input
...which first prunes any line not within the range of lines 10 through 15, and from among those that remain prunes any line which does not match pattern. If you find you'd rather have the line number sed prints on the same line as its matched line, I would probably look to paste in that case. Maybe...
sed '10,15!d;/pattern/!d;=' <input |
paste -sd:\\n -
...which will just alternate between replacing an input newline (\n) with a : character and replacing it with another newline.
For example:
seq 150 |
sed '10,50!d;/5/!d;=' |
paste -sd:\\n -
...prints...
15:15
25:25
35:35
45:45
50:50
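To see the delimiter cycling on its own, here is a small sketch with made-up input (the letters a through d are only placeholders, not part of the question). paste -s joins all of its input lines serially and takes each separator from the -d list in turn, so : and \n alternate:

```shell
# Four input lines; paste -s replaces each newline between them with
# the next delimiter from the list ":" then "\n", cycling as needed.
result=$(printf '%s\n' a b c d | paste -sd:\\n -)
printf '%s\n' "$result"
```

That should print a:b on the first line and c:d on the second, which is the same pairing that puts sed's line number next to its matched line.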
I'll use the same test file as thrig:
$ cat file
a
pat 1
pat 2
b
pat 3
Here is an awk solution:
$ awk '/pat/ && last {print last; print} {last=""} /pat/{last=$0}' file
pat 1
pat 2
How it works
awk implicitly loops over every line in the file. This program uses one variable, last, which contains the previous line if it matched the regex pat. Otherwise, it contains the empty string.
/pat/ && last {print last; print}
If pat matches this line and the previous line, last, was also a match, then print both lines.
{last=""}
Replace last with an empty string.
/pat/ {last=$0}
If this line matches pat, then set last to this line. This way it will be available when we process the next line.
Alternative for treating >2 consecutive matches as one group
Let's consider this extended test file:
$ cat file2
a
pat 1
pat 2
b
pat 3
c
pat 4
pat 5
pat 6
d
Unlike the solution above, this code treats the three consecutive matching lines as one group to be printed:
$ awk '/pat/{f++; if (f==2) print last; if (f>=2) print; last=$0; next} {f=0}' file2
pat 1
pat 2
pat 4
pat 5
pat 6
This code uses two variables. As before, last is the previous matching line. In addition, f counts the number of consecutive matches. When f reaches 2, we first print the saved previous line, and we print every matching line for which f is 2 or larger.
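To watch the counter at work, here is a small tracing sketch (my own variation, not part of the solution) that prints the value of f in front of every line of the extended test file instead of printing the groups:

```shell
tmp=$(mktemp -d)
printf '%s\n' a 'pat 1' 'pat 2' b 'pat 3' c 'pat 4' 'pat 5' 'pat 6' d > "$tmp/file2"

result=$(awk '
# Matching line: bump the run length and show it.
/pat/ { f++; print f ": " $0; next }
# Any other line ends the run.
      { f = 0; print f ": " $0 }
' "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

The lines tagged 2 or 3 (pat 2, pat 5, pat 6) are the ones printed directly by the f>=2 test; pat 1 and pat 4 are recovered from last at the moment f becomes 2.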
Adding grep-like features
To emulate the grep output shown in the question, this version prints the filename and line number before each matching line:
$ awk 'FNR==1{f=0} /pat/{f++; if (f==2) printf "%s:%s:%s\n",FILENAME,FNR-1,last; if (f>=2) printf "%s:%s:%s\n",FILENAME,FNR,$0; last=$0; next} {f=0}' file file2
file:2:pat 1
file:3:pat 2
file2:2:pat 1
file2:3:pat 2
file2:7:pat 4
file2:8:pat 5
file2:9:pat 6
Awk's FILENAME variable provides the file's name and awk's FNR provides the line number within the file.
At the beginning of each file, FNR==1, we reset f to zero. This prevents the last line of one file from being considered consecutive with the first line of the next file.
For those who like their code spread over multiple lines, the above looks like:
awk '
FNR==1{f=0}
/pat/ {f++
    if (f==2) printf "%s:%s:%s\n",FILENAME,FNR-1,last
    if (f>=2) printf "%s:%s:%s\n",FILENAME,FNR,$0
    last=$0
    next
}
{f=0}
' file file2
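To illustrate why the FNR==1 reset matters, here is a sketch with two made-up files (first.txt and second.txt are hypothetical names): the first ends with a match and the second begins with one. The same program is run without and with the reset:

```shell
tmp=$(mktemp -d)
printf '%s\n' x 'pat A' > "$tmp/first.txt"
printf '%s\n' 'pat B' y > "$tmp/second.txt"

# The grep-like program from above, minus the FNR==1 rule.
prog='/pat/{f++; if (f==2) printf "%s:%s:%s\n",FILENAME,FNR-1,last; if (f>=2) printf "%s:%s:%s\n",FILENAME,FNR,$0; last=$0; next} {f=0}'

# Without the reset, the run of matches carries over the file boundary.
without_reset=$(cd "$tmp" && awk "$prog" first.txt second.txt)
# With the reset, each file starts with f at zero.
with_reset=$(cd "$tmp" && awk "FNR==1{f=0} $prog" first.txt second.txt)

printf 'without reset:\n%s\n' "$without_reset"
printf 'with reset:\n%s\n' "$with_reset"
rm -rf "$tmp"
```

Without the reset this should report second.txt:0:pat A and second.txt:1:pat B, a bogus pair that straddles the two files. With the reset it prints nothing, since neither file contains two consecutive matches.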
Best Answer
Assume fileN contains the numbers of the lines to be modified, and target_file is the text file that contains those lines. A minimal solution needs to read each file only once.
Sorted
If the file that contains the line numbers contains one number (bigger than 1) per line, is sorted and there are no repetitions, we can use:
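A minimal sketch of such a command, under some assumptions of mine: the modification is to prefix the line with the word MARKER, the awk variable nums and the helper advance() are names I made up, and the sample files are created only for the demonstration.

```shell
tmp=$(mktemp -d)
printf '%s\n' 2 4 > "$tmp/fileN"                 # sorted, no repeats
printf '%s\n' a b c d e > "$tmp/target_file"

result=$(awk -v nums="$tmp/fileN" '
# Read the next wanted line number; use 0 (never matches) at end of list.
function advance() { if ((getline n < nums) <= 0) n = 0 }
BEGIN    { advance() }
# Current line is the wanted one: mark it and fetch the next number.
FNR == n { $0 = "MARKER " $0; advance() }
         { print }
' "$tmp/target_file")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample input, lines 2 and 4 (b and d) should come out prefixed with MARKER and the rest unchanged. Only the current line of target_file and the current number from fileN are held at any time.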
Such a command keeps only one line of each file in memory and walks both files from start to end. However, once awk has processed a line, say line 15, it won't go back to an earlier one, say line 12. So fileN has to be sorted (with no repeated numbers, each greater than 1) for this to work.
Unsorted
Of course, the naive solution is to sort the file of line numbers first with sort -nu fileN.
But if the list of line numbers may be unsorted (and contain repeats), we may use either sed, ed (the precursor of sed), or awk (covered later):
Convert each line in fileN to a sed editing command like s/^/MARKER /. Either shell printf or sed could do that:
Note that in the last case the editing is done directly on the original file. The last command, w, writes the modifications to the file. If what is needed is to print the result, then use the third option, which will print all lines.
awk
In awk, capture the whole fileN in memory and then process target_file.
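A rough sketch of that idea, again assuming the modification is a MARKER prefix and using an array name (want) that I made up. It relies on the common NR==FNR test to recognise the first file, which as far as I know misbehaves if fileN is empty:

```shell
tmp=$(mktemp -d)
printf '%s\n' 4 2 4 > "$tmp/fileN"               # unsorted, with a repeat
printf '%s\n' a b c d e > "$tmp/target_file"

result=$(awk '
# First file: remember each line number as an array key.
NR == FNR     { want[$1 + 0]; next }
# Second file: mark the lines whose number was remembered.
(FNR in want) { $0 = "MARKER " $0 }
              { print }
' "$tmp/fileN" "$tmp/target_file")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Lines 2 and 4 should be marked exactly once each, even though 4 appears twice in the list, because an array key can only exist once.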
Or, with a variable to control when the list of files with line numbers has ended:
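One way this could look, as a sketch only: the variable name list and the array name want are my own choices, and the sample files are invented. An assignment given among the file operands takes effect when awk reaches it, so list=1 covers the files of line numbers and list=0 switches to marking:

```shell
tmp=$(mktemp -d)
printf '%s\n' 4 > "$tmp/fileN"
printf '%s\n' 2 4 > "$tmp/fileK"
printf '%s\n' a b c d e > "$tmp/target_file"

result=$(awk '
# While list is set, every input line is a line number to remember.
list          { want[$1 + 0]; next }
# Afterwards, mark the lines whose number was remembered.
(FNR in want) { $0 = "MARKER " $0 }
              { print }
' list=1 "$tmp/fileN" "$tmp/fileK" list=0 "$tmp/target_file")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Here too lines 2 and 4 should be the only ones marked.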
Note that the last version allows several files with line numbers, like fileN and fileK in the example.
Also note that the awk versions do not act on a repeated line number more than once: each repeated line number is processed just once.
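Going back to the sed route described under Unsorted, the conversion step could be sketched roughly as follows. This is my own reconstruction under assumptions: each number N becomes the command Ns/^/MARKER /, the generated script goes to a scratch file I call script.sed, and fileN holds no blank lines. Unlike the awk versions, a number that appears twice would be applied twice.

```shell
tmp=$(mktemp -d)
printf '%s\n' 4 2 > "$tmp/fileN"                 # unsorted, no repeats
printf '%s\n' a b c d e > "$tmp/target_file"

# Append the editing command to every line number: "4" becomes "4s/^/MARKER /".
sed 's|$|s/^/MARKER /|' "$tmp/fileN" > "$tmp/script.sed"

# Run the generated script; sed prints every line, marked or not.
result=$(sed -f "$tmp/script.sed" "$tmp/target_file")

printf '%s\n' "$result"
rm -rf "$tmp"
```

The order of the numbers does not matter here, because sed tests every generated address against every input line.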