Ubuntu – Comparing two text files

bash, command line, csv, text processing

I have 2 big csv files, file1.csv which looks like this

1,2,3,4
1,4,5,6
1,7,8,9
1,11,13,17

file2.csv which looks like this

1,2,3,4
1,7,8,9
2,4,9,10
13,14,17,18

These are just random numbers that I made up. Basically, the two files share some identical rows, and both are sorted. I want to compare file1.csv and file2.csv and then copy the rows that are present in file1.csv but not in file2.csv to file3.csv. The delimiter is a comma.

I tried

comm -2 -3 file.csv file2.csv > file3.csv

and I tried

diff -u file.csv file2.csv >> file3.csv

Neither worked: file3 came out bigger than file1 and file2. I tried different diff and comm commands; sometimes the result is bigger than file2 and about the same size as file1. I know that file3 has to be significantly smaller than file1 and file2. And of course I looked at file3, and it did not contain the results I wanted.

At this point, I know it could be done with diff or comm, but I do not know which command to use.

Best Answer

Try this command:

 grep -v -f file2.csv file1.csv > file3.csv

According to grep manual:

  -f FILE, --file=FILE
          Obtain  patterns  from  FILE,  one  per  line.   The  empty file
          contains zero patterns, and therefore matches nothing.   (-f  is
          specified by POSIX.)

  -v, --invert-match
          Invert the sense of matching, to select non-matching lines.  (-v
          is specified by POSIX.)

As Steeldriver said in his comment, it is better to also add -x and -F:

  -F, --fixed-strings
          Interpret PATTERN as a  list  of  fixed  strings,  separated  by
          newlines,  any  of  which is to be matched.  (-F is specified by
          POSIX.)
  -x, --line-regexp
          Select  only  those  matches  that exactly match the whole line.
          (-x is specified by POSIX.)
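To illustrate why these two options matter, here is a small sketch. The file names patterns.csv and data.csv and their contents are made up for the demonstration. Without -F each pattern is treated as a regular expression (so "." matches any character), and without -x a pattern may match just part of a line.

```shell
#!/bin/bash
# Hypothetical data chosen to trigger partial-line and regex matches.
workdir=$(mktemp -d)
printf '%s\n' '1,2,3' '4.5,6' > "$workdir/patterns.csv"
printf '%s\n' '11,2,3,4' '4x5,6' '1,2,3' > "$workdir/data.csv"

# Plain -v -f: patterns are regexes that may match anywhere in a line,
# so '1,2,3' also hits '11,2,3,4' and '4.5,6' also hits '4x5,6'.
loose=$(grep -v -f "$workdir/patterns.csv" "$workdir/data.csv")

# -x -F: patterns are literal strings that must equal the whole line,
# so only the exact row '1,2,3' is filtered out.
strict=$(grep -xvFf "$workdir/patterns.csv" "$workdir/data.csv")

echo "loose:  ${loose:-<nothing left>}"
echo "strict:"
echo "$strict"
```

With this data the loose form discards every line, while the strict form keeps 11,2,3,4 and 4x5,6.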

So a better command is:

 grep -xvFf file2.csv file1.csv > file3.csv

This command uses each line of file2.csv as a pattern and prints the lines of file1.csv that do not match any of them (-v).
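As a quick check, the command can be run against the sample rows from the question. This sketch recreates the two files in a scratch directory (the use of mktemp is my own choice, not part of the answer) and prints the resulting file3.csv.

```shell
#!/bin/bash
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)

printf '%s\n' '1,2,3,4' '1,4,5,6' '1,7,8,9' '1,11,13,17' > "$workdir/file1.csv"
printf '%s\n' '1,2,3,4' '1,7,8,9' '2,4,9,10' '13,14,17,18' > "$workdir/file2.csv"

# -F: fixed strings, -x: whole-line matches only,
# -v: keep non-matching lines, -f: read the patterns from file2.csv.
grep -xvFf "$workdir/file2.csv" "$workdir/file1.csv" > "$workdir/file3.csv"

cat "$workdir/file3.csv"
```

For the sample data, file3.csv contains the two rows 1,4,5,6 and 1,11,13,17, which are the rows of file1.csv that do not appear in file2.csv.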

Related Solutions

Ubuntu – AWK–Comparing the value of two variables in two different files

The following script should solve your problem:

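#!/bin/bash
A="$HOME/a.txt"
B="$HOME/b.txt"

# For every value in A, print each line of B whose third field
# exceeds that value by more than 10.
while read -r a; do
    while read -r b; do
        b3=$(echo "$b" | awk '{ print $3 }')
        c=$((b3 - a))
        if (( c > 10 )); then
            echo "$b"
        fi
    done < "$B"
done < "$A"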

Don't forget to make it executable using the following command:

chmod +x script_name
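The same comparison can probably be done in a single awk call instead of starting one awk process per line. The following is only a sketch: the sample contents of a.txt and b.txt are hypothetical (the original files are not shown), and awk compares floating-point numbers where the bash script uses integer arithmetic.

```shell
#!/bin/bash
# Hypothetical sample inputs; the real a.txt and b.txt are not shown.
workdir=$(mktemp -d)
A="$workdir/a.txt"
B="$workdir/b.txt"
printf '%s\n' 5 20 > "$A"
printf '%s\n' 'x y 18' 'p q 40' > "$B"

# While reading the first file (B): remember each line and its third field.
# While reading the second file (A): for each value, print the B lines whose
# third field exceeds it by more than 10, in the order the nested loops use.
result=$(awk '
    NR == FNR { line[++n] = $0; third[n] = $3; next }
    { for (i = 1; i <= n; i++) if (third[i] - $1 > 10) print line[i] }
' "$B" "$A")

echo "$result"
```

With these sample values, the value 5 selects both lines of b.txt and the value 20 selects only "p q 40", so that line is printed twice.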

Ubuntu – Get the unique lines of second file in result of comparing two files

You could accomplish this with grep.

Here is an example:

$ echo localhost > local_hosts

$ grep -v -f local_hosts /etc/hosts
127.0.1.1       ubuntu

# The following lines are desirable for IPv6 capable hosts
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
