I have a file of patterns, and I want to return all the line numbers where each pattern was found, but in a wide format (one line per pattern) rather than a long one.
Example:
fileA.txt:
Germany
USA
UK
fileB.txt:
USA
USA
Italy
Germany
UK
UK
Canada
Canada
Germany
Australia
USA
I have done something like this:
grep -nf fileA.txt fileB.txt
which returned me:
1:USA
2:USA
4:Germany
5:UK
6:UK
9:Germany
11:USA
However, I want to have something like:
Germany 4 9
USA 1 2 11
UK 5 6
Best Answer
Using GNU datamash on the output of grep.

This first uses grep to get the lines from fileB.txt that exactly match the lines in fileA.txt, and outputs the matching line numbers along with the lines themselves.

I'm using -x and -F in addition to the options that are used in the question. I do this to avoid reading the patterns from fileA.txt as regular expressions (-F), and to match complete lines, not substrings (-x).

The datamash utility then parses this as lines of :-delimited fields (-t :), sorts it (-s) on the second field (-g 2; the countries), and collapses the first field (collapse 1; the line numbers) into a list for each country.

You could then obviously replace the colons and commas with tabs using tr ':,' '\t\t', or with spaces in a similar way.
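
Putting the options described above together, the pipeline might look like the following sketch. It assumes GNU datamash is installed; the temporary directory and the result variable are only there for illustration and recreate the sample files from the question.

```shell
# Illustrative setup: recreate the question's sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' Germany USA UK > "$dir/fileA.txt"
printf '%s\n' USA USA Italy Germany UK UK Canada Canada Germany Australia USA > "$dir/fileB.txt"

# grep:     -n print line numbers, -x match whole lines only,
#           -F treat patterns as fixed strings, -f read patterns from a file
# datamash: -t : use ':' as field separator, -s sort the input first,
#           -g 2 group by field 2 (country), collapse 1 joins the line numbers
# tr:       turn the ':' and ',' separators into spaces
result=$(grep -n -x -F -f "$dir/fileA.txt" "$dir/fileB.txt" |
  datamash -s -t : -g 2 collapse 1 |
  tr ':,' '  ')

printf '%s\n' "$result"
rm -r "$dir"
```

Because of the -s option, the countries come out sorted by name (Germany, UK, USA) rather than in the order they appear in fileA.txt.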