I have two files: 'file1' has employee ID numbers. 'file2' has the complete database of the employees. Here is what they look like:
file1
123123
222333
file2
111222 Jones Sally
111333 Johnson Roger
123123 Doe John
444555 Richardson George
222333 Smith Jane
223456 Alexander Philip
I want to compare the two files and eliminate the entries from 'file2' that have ID numbers in 'file1'.
I found this 'awk' command which works perfectly:
awk 'FNR==NR{a[$1];next};!($1 in a)' file1 file2 > file3
The result:
file3
111222 Jones Sally
111333 Johnson Roger
444555 Richardson George
223456 Alexander Philip
WONDERFUL!
My problem is that the files are actually simplified .csv
files, and I must use a comma as a separator rather than a space. I have tried everything I can think of to make this work (i.e -F, -F',' -F"," everywhere in the command) and no success.
How do I get this to work with .csv
files?
By the way, I am on MacBook Pro, OSX Lion!
Best Answer
I'm assuming your
csv
files are something like:File1
File2
You could try using the
join
command, like so:More information about the command can be found here: man join