Text Processing – Check if All Lines of a File Occur in Another File


I have two files: file1 with about 10 000 lines and file2 with a few hundred lines. I want to check whether all lines of file2 occur in file1. That is: ∀ line ℓ ∈ file2 : ℓ ∈ file1

In case these symbols or the phrase "check whether all lines of file2 occur in file1" are unclear: duplicate lines in either file don't affect whether the check reports that the files meet the requirement.

How do I do this?

Best Answer

comm -13 <(sort -u file_1) <(sort -u file_2)

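This command outputs the lines that occur only in file_2 (file2 in the question). So if the output is empty, every line of file_2 also occurs in file_1. comm expects sorted input, which is why both files are passed through sort; the -u flag also drops duplicate lines, so repeated lines don't affect the result.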
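
To use the check in a script, the command can be wrapped so that its exit status carries the answer. The following is only a minimal sketch, assuming bash (process substitution is not available in plain sh). The function name all_lines_in is made up for illustration, and LC_ALL=C is an addition so that sort and comm agree on the collation order:

```shell
#!/usr/bin/env bash

# all_lines_in FILE_1 FILE_2   (hypothetical helper name)
# Exit status 0 if every line of FILE_2 also occurs in FILE_1, 1 otherwise.
all_lines_in() {
    # comm -13 prints only the lines unique to its second input.
    # grep -q '^' succeeds as soon as at least one such line exists
    # (an empty line counts too), and "!" inverts that status.
    ! LC_ALL=C comm -13 <(LC_ALL=C sort -u "$1") <(LC_ALL=C sort -u "$2") | grep -q '^'
}

# Usage (file names as in the command above):
#   all_lines_in file_1 file_2 && echo "all lines of file_2 occur in file_1"
```

Testing the exit status rather than capturing the output in a variable avoids one pitfall: command substitution strips trailing newlines, so a missing empty line would go unnoticed.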

From comm's man page:

   With  no  options,  produce  three-column  output.  Column one contains
   lines unique to FILE1, column two contains lines unique to  FILE2,  and
   column three contains lines common to both files.

   -1     suppress column 1 (lines unique to FILE1)

   -2     suppress column 2 (lines unique to FILE2)

   -3     suppress column 3 (lines that appear in both files)
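
To see how the columns and the -1/-3 options interact, here is a small illustration with two throwaway files. The file contents and the names one and two are invented for this example, and the inputs are already sorted:

```shell
#!/bin/sh
# Create two small, already sorted sample files in a temporary directory.
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/one"
printf 'banana\ncherry\ndate\n'  > "$dir/two"

# Full three-column output: "apple" in column one (no indent),
# "date" in column two (one tab), "banana" and "cherry" in column
# three (two tabs).
all=$(comm "$dir/one" "$dir/two")
printf '%s\n' "$all"

# With columns 1 and 3 suppressed, only the lines unique to the
# second file remain.
only_two=$(comm -13 "$dir/one" "$dir/two")
printf 'only in two: %s\n' "$only_two"   # prints "only in two: date"

rm -r "$dir"
```

Here the second file contains a line ("date") that the first one lacks, so the -13 output is not empty and the check from the answer would fail.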