grep -f Patternfile: Troubleshooting Pattern Matching Issues

bash, grep, sed, text processing

I have searched, read, and tried many of the suggested solutions for grepping lines against a list of patterns, without success, so I am asking here. I know this is a very basic question that has been covered in many forums.

But I am stuck on the following: I have two files, and I want to grep the lines from the bigger file that match the patterns in the smaller file.

I have a file_A.txt (a single column list of patterns to be matched) like:

comt241_c0_seq1
comt868_c0_seq1
comt685_c0_seq1
comt7977_c0_seq1
comt6723_c0_seq1
comt363_c0_seq1
comt384_c0_seq1

and another file_B.txt (tab delimited, with more entries than file_A)

comp5_c0_seq1   0   0   0   6   0   0   0   0   0
comt241_c0_seq1 0   0   0   0   0   0   0   0   0
comt868_c0_seq1 0   0   0   0   0   0   0   0   0
comt363_c0_seq1 0   0   0   0   0   0   0   0   0
comt384_c0_seq1 0   0   0   0   0   0   0   0   0
comp429_c0_seq1 0   0   0   0   0   0   0   0   0
comp452_c0_seq1 0   0   0   0   0   0   0   0   0
comp452_c0_seq2 0   0   0   0   0   0   0   0   0
comp483_c0_seq1 33  8   10  32  0   33  8   0   37
comt685_c0_seq1 0   0   0   0   0   0   0   0   0
comp494_c0_seq1 0   0   0   0   0   0   0   0   0
comt7977_c0_seq1    1   0   0   0   0   0   0   0   0
comp564_c0_seq1 0   0   0   0   0   0   0   0   0
comp596_c0_seq1 0   0   0   0   0   0   0   0   0
comp653_c0_seq1 10  0   0   2   0   0   0   0   0
comp724_c0_seq1 0   0   0   0   0   0   0   0   0
comt6723_c0_seq1    0   0   0   0   0   0   0   0   0

I tried grep -f file_A file_B > file_C

But it returned an empty file.

So I removed any trailing whitespace from file_A using

sed 's/[ \t]*$//' file_A > new_file_A

but that didn't work. I have tried a lot of things to remove special characters or spaces and to delimit the files properly, but each attempt gave me either extra entries or nothing.

I think there is some special character in either file_A or file_B that is causing the problem.
I am using the TextWrangler editor.

Is there another way of doing this, apart from grep?

Best Answer

I created the two files with the same contents as mentioned and used grep in the same way, and it worked fine. I hope you are using the same file names (I see the .txt extension is missing from the command in the question).

[sreeraj@server ~]$ grep -f file_A.txt file_B.txt > file_C.txt
[sreeraj@server ~]$ cat file_C.txt
comt241_c0_seq1 0   0   0   0   0   0   0   0   0
comt868_c0_seq1 0   0   0   0   0   0   0   0   0
comt363_c0_seq1 0   0   0   0   0   0   0   0   0
comt384_c0_seq1 0   0   0   0   0   0   0   0   0
comt685_c0_seq1 0   0   0   0   0   0   0   0   0
comt7977_c0_seq1    1   0   0   0   0   0   0   0   0
comt6723_c0_seq1    0   0   0   0   0   0   0   0   0
[sreeraj@server ~]$
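If the files are clean but you get extra entries, it may help to treat the patterns as fixed strings and to match whole words only. This is a minimal sketch, assuming your grep supports -F and -w (GNU and BSD grep both do). The scratch file names and the extra comt241_c0_seq10 row are made up for illustration and are not from your data:

```shell
# Demo data in a scratch directory (hypothetical names, not the asker's files).
tmp=$(mktemp -d)
printf 'comt241_c0_seq1\ncomt868_c0_seq1\n' > "$tmp/patterns.txt"
printf 'comt241_c0_seq1\t0\t0\ncomt241_c0_seq10\t5\t0\ncomt868_c0_seq1\t0\t1\ncomp5_c0_seq1\t0\t6\n' > "$tmp/table.txt"

# Plain -f treats each line as a regex and matches substrings,
# so the comt241_c0_seq10 row is reported as well.
loose=$(grep -f "$tmp/patterns.txt" "$tmp/table.txt")

# -F: patterns are fixed strings; -w: only whole-word matches count.
strict=$(grep -Fwf "$tmp/patterns.txt" "$tmp/table.txt")

printf '%s\n' "$strict"
rm -rf "$tmp"
```

It is also worth checking the pattern file for blank lines: in GNU grep an empty pattern matches every line, which would be another source of extra entries.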

You can try dos2unix on both files if you are still getting an empty file.

dos2unix file_A.txt
dos2unix file_B.txt
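If dos2unix is not installed, the same check and cleanup can be done with standard tools. This is only a sketch, and it assumes the problem really is Windows-style (CRLF) line endings, which is a guess from the symptoms. The demo files are generated on the spot in a scratch directory, and file_A.unix.txt is just a name chosen for the cleaned copy. The last command also shows a non-grep way to do the lookup with awk:

```shell
tmp=$(mktemp -d)
# Pattern file saved with CRLF line endings; table with plain LF endings.
printf 'comt241_c0_seq1\r\ncomt868_c0_seq1\r\n' > "$tmp/file_A.txt"
printf 'comp5_c0_seq1\t0\t6\ncomt241_c0_seq1\t0\t0\ncomt868_c0_seq1\t0\t0\n' > "$tmp/file_B.txt"

# Look at the raw bytes: a \r in front of each \n gives CRLF endings away.
head -n 2 "$tmp/file_A.txt" | od -c

# With the \r in place each pattern is "name\r", which never occurs in file_B.
# (grep exits non-zero when nothing matches, hence the "|| true".)
before=$(grep -c -f "$tmp/file_A.txt" "$tmp/file_B.txt" || true)

# Strip the carriage returns into a cleaned copy, then retry.
tr -d '\r' < "$tmp/file_A.txt" > "$tmp/file_A.unix.txt"
after=$(grep -c -f "$tmp/file_A.unix.txt" "$tmp/file_B.txt" || true)

# Alternative without grep: keep rows of file_B whose first column is listed in file_A.
awk_out=$(awk 'NR == FNR { want[$1]; next } $1 in want' "$tmp/file_A.unix.txt" "$tmp/file_B.txt")

echo "matches before: $before, after: $after"
rm -rf "$tmp"
```

The awk version compares the first column exactly, so it does not pick up partial matches the way an unanchored grep pattern can.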