Bash – Modify Non-Adjacent Lines Specified by Line Numbers

bash, sed

I know the line numbers in advance, and keep them in another file:

cat linenos
2
15
42
44
... etc

As you can see, the lines are non-adjacent, so I can't use a single range for sed.
The goal is to modify those lines of the target file by, say, prepending a marker like MARKER to each of them.

The straightforward way is to call sed once for each line to be modified:

for l in $(cat linenos)
do 
  sed -i "${l}s/^/MARKER/" target_file
done

which obviously runs sed once per line number.

CAUTION: Not only is this approach inefficient, it can also go wrong if the modification is anything other than inserting a marker like this. Any sed command that deletes or inserts lines, such as d, a or r, will make the initial line numbers in linenos invalid for the subsequent sed runs in the loop.
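To make the caution concrete, here is a minimal sketch of the drift, assuming the six-line sample target_file and a linenos file holding 2 and 5. The temporary directory and the `drift_result` variable are illustrative names only, and the redirect-and-mv step stands in for `sed -i` so the sketch does not depend on one particular sed implementation.

```shell
#!/bin/sh
# Sketch: deleting lines 2 and 5 with one sed run per line number.
dir=$(mktemp -d)
printf 'line %s\n' one two three four five six > "$dir/target_file"
printf '%s\n' 2 5 > "$dir/linenos"

for l in $(cat "$dir/linenos")
do
  # Same effect as: sed -i "${l}d" target_file
  sed "${l}d" "$dir/target_file" > "$dir/tmp" && mv "$dir/tmp" "$dir/target_file"
done

# After the first run "line two" is gone, so the old line six is now line 5
# and the second run deletes it instead of "line five".
drift_result=$(cat "$dir/target_file")
printf '%s\n' "$drift_result"
rm -rf "$dir"
```

The intent was to remove "line two" and "line five", but "line five" survives and "line six" is removed instead.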

What would you suggest to improve/optimize that?

Sample linenos file

cat linenos
2
5

Sample target_file

cat target_file
line one
line two
line three
line four
line five
line six

Expected result of modified target_file

cat target_file
line one
MARKERline two
line three
line four
MARKERline five
line six

A possible approach I came up with is to build the sed script dynamically:

SEDCMD=$(for l in $(cat linenos); do echo -n "${l}s/^/MARKER/;" ; done)

sed -i -e "$SEDCMD" target_file

@steeldriver's approach below shares the idea, but is more elegant and concise.

Best Answer

Here fileN contains the numbers of the lines to be modified, and target_file is the text file that contains those lines. A minimal solution needs to read each file only once.

Sorted

If the file with the line numbers contains one number (1 or greater) per line, is sorted, and has no repetitions, we can use:

awk 'BEGIN{ getline lineN <"fileN"} {
     if(NR==lineN){$0="MARKER " $0;getline lineN <"fileN"}
     }1' target_file

This keeps only one line of each file in memory and walks both files from start to end. However, once awk has processed a line (line 15, for example), it cannot go back to an earlier one (line 12, for example). So fileN has to be sorted, free of repetitions, and contain only numbers of 1 or greater for this to work.

Unsorted

Of course, the naive solution is to sort the line numbers file first with sort -nu fileN.
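As a rough sketch of that naive route, the list can be normalised with sort -nu and then fed to the sorted awk walk. The sample data, the temporary directory, the awk variable `list` and the shell variable `sorted_result` are assumptions made for illustration; the awk program is the sorted one from above, only written with a pattern instead of an if.

```shell
#!/bin/sh
# Sketch: normalise an unsorted, repeated list, then do the single-pass awk walk.
dir=$(mktemp -d)
printf 'line %s\n' one two three four five six > "$dir/target_file"
printf '%s\n' 5 2 5 > "$dir/fileN"          # unsorted, with a repeat

# -n sorts numerically, -u drops duplicates: the list becomes 2, 5.
sort -nu "$dir/fileN" > "$dir/fileN.sorted"

sorted_result=$(awk -v list="$dir/fileN.sorted" '
  BEGIN { getline lineN < list }            # read the first wanted line number
  NR == lineN { $0 = "MARKER " $0           # mark the current line ...
                getline lineN < list }      # ... and fetch the next number
  1' "$dir/target_file")
printf '%s\n' "$sorted_result"
rm -rf "$dir"
```

With this input, lines two and five each receive one "MARKER " prefix.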

But if the list of line numbers may be unsorted (and contain repetitions), we may use sed, ed (the precursor of sed), or awk (covered later):

Convert each line in fileN to a sed editing command like Ns/^/MARKER /, where N is the line number. Either shell printf or sed can do that:

printf '%ss/^/MARKER /\n' $(<fileN) | sed -f - target_file
sed 's#$#s/^/MARKER /#' fileN       | sed -f - target_file

{ printf '%ss/^/MARKER /\n' $(<fileN); printf '%s\n' ,p Q; } | ed -Gs target_file
{ sed 's#$#s/^/MARKER /#' fileN ; echo "w"      ; } | ed target_file

Note that the last command edits the original file in place: the final w command writes the changes back to the file. If you only need to print the result, use the third command instead, which prints all lines (,p) and quits without saving (Q).
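The generated-script idea also seems to cover the deletion case the question warns about, because within a single sed run the addresses refer to input line numbers and earlier deletions do not shift them. A minimal sketch, assuming the six-line sample file: the d command replaces the marker substitution, and the script goes to a temporary file (an illustrative choice) rather than through `-f -`. The name `onepass_result` is hypothetical.

```shell
#!/bin/sh
# Sketch: delete lines 2 and 5 in one sed pass using a generated script.
dir=$(mktemp -d)
printf 'line %s\n' one two three four five six > "$dir/target_file"
printf '%s\n' 5 2 > "$dir/fileN"            # unsorted on purpose

# Turn each line number N into the sed command "Nd".
sed 's/$/d/' "$dir/fileN" > "$dir/script.sed"

# One run: every address is matched against the original input line number.
onepass_result=$(sed -f "$dir/script.sed" "$dir/target_file")
printf '%s\n' "$onepass_result"
rm -rf "$dir"
```

Here exactly "line two" and "line five" are removed, regardless of the order of the numbers in the list.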

awk

In awk, capture the whole of fileN in memory and then process target_file:

awk '{ if(NR==FNR){
                     a[$1]=1
                  }else{
                     if(a[FNR]==1){ printf("%s","MARKER ")};
                     print 
                  }
     }' fileN target_file

Or, with a variable to control when the list of files with line numbers has ended:

awk '{ if (dofile==1) {   if(a[FNR]==1){ printf("%s","MARKER ")};
                          print
                      }else{
                          a[$1]=1
                      }
     }' fileN fileK   dofile=1   target_file

Note that the last version allows several files with line numbers, like fileN and fileK in the example.

Also note that the awk versions in this section are not affected by repeated line numbers: a line whose number appears more than once in the list is still marked just once.
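The difference in how repeats are treated can be seen side by side in the following sketch. It assumes the six-line sample file, a fileN holding 2 and 5, and a hypothetical second list fileK holding 5 and 3, so that 5 occurs twice overall. The names `awk_result` and `sed_result` and the temporary script file are illustrative.

```shell
#!/bin/sh
# Sketch: a repeated line number with the awk array version vs. a generated sed script.
dir=$(mktemp -d)
printf 'line %s\n' one two three four five six > "$dir/target_file"
printf '%s\n' 2 5 > "$dir/fileN"
printf '%s\n' 5 3 > "$dir/fileK"            # hypothetical second list; 5 repeats

# awk: line numbers become array keys, so a repeat just sets the same key again.
awk_result=$(awk '{
    if (dofile == 1) {
      if (a[FNR] == 1) printf("%s", "MARKER ")
      print
    } else {
      a[$1] = 1
    }
  }' "$dir/fileN" "$dir/fileK" dofile=1 "$dir/target_file")

# sed: each list entry becomes its own command, so line 5 is substituted twice.
cat "$dir/fileN" "$dir/fileK" | sed 's#$#s/^/MARKER /#' > "$dir/script.sed"
sed_result=$(sed -f "$dir/script.sed" "$dir/target_file")

printf 'awk:\n%s\n\nsed:\n%s\n' "$awk_result" "$sed_result"
rm -rf "$dir"
```

In the awk output line five carries a single marker, while in the sed output it reads "MARKER MARKER line five". If repeats are possible and the sed route is preferred, deduplicating the list first with sort -nu avoids the double marker.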
