Ubuntu – Repeat whole line matches with grep for multiple instances on the same line

grep, multiple instances, text processing

An offshoot from this question:

When searching for the string "banana" in the following file, we would like lines 1, 2, 3 and 4 to be printed 1, 2, 3 and 7 times respectively. The number of lines grep outputs should equal the number of matches, and each output line should still be the entire input line.

There is one banana here
There are two banana banana here
There are three banana banana banana here
Basically there is no limit to how many banana banana banana banana banana banana banana we can have
In fact we need not have any too!

Note: If we drop the requirement that whole lines appear in the output, we can use:

grep -no "banana" tempfile

which returns

1:banana
2:banana
2:banana
3:banana
3:banana
3:banana
4:banana
4:banana
4:banana
4:banana
4:banana
4:banana
4:banana

Any ideas?

EDIT: This is the intended output:

1 There is one banana here
2 There are two banana banana here
2 There are two banana banana here
3 There are three banana banana banana here
3 There are three banana banana banana here
3 There are three banana banana banana here
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have

Best Answer

grep doesn't have a counter for individual matches, only the -c option, which counts the lines that contain a match. We can use awk to do this instead. As far as I understand, you want each matching line printed as many times as it contains the match. Well, here it is:

$ awk '{ for (i = 1; i <= NF; i++) if ($i == "banana") counter++; for (j = 1; j <= counter; j++) print NR, $0; counter = 0 }' input.txt
1 There is one banana here
2 There are two banana banana here
2 There are two banana banana here
3 There are three banana banana banana here
3 There are three banana banana banana here
3 There are three banana banana banana here
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have
4 Basically there is no limit to how many banana banana banana banana banana banana banana we can have

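The basic idea is to loop over each whitespace-separated word in a line and increment a counter whenever a word equals "banana" exactly. That counter then drives a second loop that prints the line number and the whole line once per match. Finally, the counter is reset and the process repeats for the next line. Note that this is an exact word comparison, so a word such as "bananas" or "banana," is not counted, whereas grep -o "banana" would match inside it.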
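If the goal is to count matches the way grep -o does (substring or regex matches rather than whole words), one possible variation is to let awk's gsub do the counting, since it returns the number of substitutions it made. The following is only a sketch under some assumptions: repeat_matches is a hypothetical helper name, the pattern is passed in as an awk regular expression through -v (so backslashes in it are subject to awk's escape processing), and the sample file is recreated from the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer):
# print "line-number line" once per match of regex $1 in file $2.
repeat_matches() {
    awk -v pat="$1" '{
        # gsub replaces every match with itself ("&"), so the line is
        # left unchanged, and returns how many matches it found.
        n = gsub(pat, "&")
        for (i = 0; i < n; i++) print NR, $0
    }' "$2"
}

# Sample input from the question, written to a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
There is one banana here
There are two banana banana here
There are three banana banana banana here
Basically there is no limit to how many banana banana banana banana banana banana banana we can have
In fact we need not have any too!
EOF

repeat_matches banana "$tmpfile"
rm -f "$tmpfile"
```

On the sample file this should print the same lines as the intended output in the question. Unlike the word-by-word comparison, it would also count "banana" inside "bananas" or "banana,", which may or may not be what you want.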