Using grep/awk/sed to sort and combine 2 files

Tags: awk, grep, join, scripting, sed

I am working from a Wi-Fi log that lists MAC addresses, and I want to find the vendors/manufacturers of the devices connected to my router. I have 2 files: one with the MACs already run through grep so that only the first 3 octets of each remain, and another with a list of vendors and the 3-octet prefixes they were issued. The problem is that the first file contains duplicates. I can still match them against the second file, but the output doesn't show how many times each prefix occurs in the first file. Examples are below.

text.txt

00:10:f6
00:10:f6
03:48:03
8f:91:34
93:ab:c6

vendor.xml

03:48:03 vendor="apple"
00:10:f6 vendor="micro"
8f:91:34 vendor="dell"
93:ab:c6 vendor="sun"
23:8b:23 vendor="acer"
00:73:ad vendor="asus"

This is what I get when I run the following code:

cat text.txt vendor.xml |grep -Ff text.txt vendor.xml |sort -u |uniq -c >> final.txt

final.txt

  1 00:10:f6 vendor="micro"
  1 03:48:03 vendor="apple"
  1 8f:91:34 vendor="dell"
  1 93:ab:c6 vendor="sun"

The result should be instead:

  2 00:10:f6 vendor="micro"
  1 03:48:03 vendor="apple"
  1 8f:91:34 vendor="dell"
  1 93:ab:c6 vendor="sun"

Is there some flag or option I am not thinking of?

Best Answer

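grep -Ff prints each matching line of vendor.xml only once, however many lines of text.txt match it, so the duplicates are already gone before uniq -c runs (and sort -u would remove them anyway). join, by contrast, emits one output line per matching pair of lines. It needs both inputs sorted on the join field: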

$ join <(sort text.txt) <(sort vendor.xml)
00:10:f6 vendor="micro"
00:10:f6 vendor="micro"
03:48:03 vendor="apple"
8f:91:34 vendor="dell"
93:ab:c6 vendor="sun"

The joined output is sorted, so duplicate lines are adjacent, and all that's left is to add uniq -c to do the counting:

$ join <(sort text.txt) <(sort vendor.xml) | uniq -c
      2 00:10:f6 vendor="micro"
      1 03:48:03 vendor="apple"
      1 8f:91:34 vendor="dell"
      1 93:ab:c6 vendor="sun"
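
If process substitution isn't available (it is a bash/ksh/zsh feature, not plain sh), the same counts can probably be had from a two-file awk pass. This is only a sketch of an alternative, not part of the join approach above: the scratch directory, the `count` array, and the `result` variable are names made up for illustration, and the `%7d` width is just an assumption chosen to resemble the uniq -c output shown above.

```shell
# Recreate the sample files from the question in a scratch directory
# (workdir is a hypothetical name used only for this sketch).
workdir=$(mktemp -d)

printf '%s\n' 00:10:f6 00:10:f6 03:48:03 8f:91:34 93:ab:c6 > "$workdir/text.txt"
printf '%s\n' \
  '03:48:03 vendor="apple"' \
  '00:10:f6 vendor="micro"' \
  '8f:91:34 vendor="dell"' \
  '93:ab:c6 vendor="sun"' \
  '23:8b:23 vendor="acer"' \
  '00:73:ad vendor="asus"' > "$workdir/vendor.xml"

# While reading the first file (NR == FNR), tally each prefix.
# While reading the second file, print only the vendor lines whose
# prefix was seen, prefixed with its tally.
result=$(awk '
  NR == FNR   { count[$1]++; next }
  $1 in count { printf "%7d %s\n", count[$1], $0 }
' "$workdir/text.txt" "$workdir/vendor.xml")

printf '%s\n' "$result"
```

With the sample files this should print the same four counted lines, but in vendor.xml order (apple first) rather than sorted order, since neither input has to be sorted. Pipe the output through sort -k2 if the sorted order matters. Note that the NR == FNR idiom assumes text.txt is not empty.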