Shell script to read from multiple files in parallel

files grep parallelism scripting

I need to write a script that runs in parallel and searches for a string in multiple files.
I tried a lot of options, but they all slow my processor down.

Best Answer

If the files are on separate disks, run one grep command on each disk.
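A minimal sketch of the one-grep-per-disk idea, assuming each directory passed in lives on its own physical disk. The helper name `grep_per_disk` and the mount points in the usage comment are hypothetical. Each grep writes to its own temporary file so that output from concurrent jobs cannot get mixed together.

```shell
#!/bin/sh
# grep_per_disk PATTERN DIR...
# Hypothetical helper: starts one recursive grep per directory and
# waits for all of them. Assumes every DIR is on a different disk.
grep_per_disk() {
    pattern=$1
    shift
    tmp=$(mktemp -d) || return 1
    i=0
    for dir in "$@"; do
        i=$((i + 1))
        # One background grep per directory, each with its own output
        # file so that lines from different jobs cannot interleave.
        grep -r -e "$pattern" "$dir" > "$tmp/$i" &
    done
    wait                # block until every background grep has finished
    cat "$tmp"/*        # print the collected matches
    rm -rf "$tmp"
}

# Example (hypothetical mount points):
# grep_per_disk 'pattern' /mnt/disk1 /mnt/disk2
```

The `-r` option is not required by POSIX, but both GNU and BSD grep provide it.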

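For files on the same disk, the bottleneck is reading from the disk. Reading from multiple files in parallel will only make it slower, because the concurrent reads compete for the same disk (on a spinning disk, the head has to seek back and forth between the files).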

If the files are on a RAID-0 array, you might get a speed increase by running two grep commands at the same time. Benchmark to see if you really gain time. The low-tech way:

grep pattern file1 file2 file3 &
grep pattern file4 file5 file6
wait
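One rough way to run the benchmark is sketched below. The helper names `run_single`, `run_pair` and `elapsed` are made up for illustration, and `run_pair` assumes it is given at least four files. The timing has only one-second resolution, so it is meaningful only for searches that take a while. Repeat runs are served from the page cache, so compare cold-cache runs; on Linux, root can drop the cache with `sync; echo 3 > /proc/sys/vm/drop_caches`.

```shell
#!/bin/sh
# One grep over every file.
run_single() {
    pattern=$1
    shift
    grep -e "$pattern" "$@" > /dev/null
}

# Two greps at once: the first three files in the background,
# the remaining files in the foreground. Needs at least four files.
run_pair() {
    pattern=$1
    shift
    grep -e "$pattern" "$1" "$2" "$3" > /dev/null &
    shift 3
    grep -e "$pattern" "$@" > /dev/null
    wait    # do not stop the clock until the background grep is done
}

# Print the wall-clock seconds a command takes (one-second resolution).
elapsed() {
    start=$(date +%s)
    # grep exits 1 when nothing matches; that is not a benchmark failure.
    "$@" || true
    end=$(date +%s)
    echo "$((end - start))"
}

# Example with the six file names used in the answer:
# elapsed run_single pattern file1 file2 file3 file4 file5 file6
# elapsed run_pair   pattern file1 file2 file3 file4 file5 file6
```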

With GNU parallel:

parallel -j 2 grep -H pattern ::: file1 file2 file3 file4 file5 file6

If you're getting files from find:

find … -print0 | parallel -0 -j 2 grep -H pattern
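If GNU parallel is not installed, `xargs -P` can stand in for it. This is a different tool from the one above, shown only as a sketch: the helper name `grep_found` and the batch size of 100 are arbitrary choices, and `-print0`, `-0` and `-P` are assumed to be available (GNU and BSD find/xargs provide them). Unlike GNU parallel, xargs does not group output per job, so lines from the two greps may interleave when there are many matches.

```shell
#!/bin/sh
# grep_found PATTERN DIR
# Hypothetical helper: greps every regular file under DIR, running
# at most two grep processes at a time.
grep_found() {
    pattern=$1
    dir=$2
    # -print0 / -0 : NUL-separated names, safe for spaces and newlines
    # -n 100       : hand each grep up to 100 files
    # -P 2         : run up to two greps at once
    # -H           : always prefix matches with the file name
    find "$dir" -type f -print0 |
        xargs -0 -n 100 -P 2 grep -H -e "$pattern"
}

# Example (hypothetical directory):
# grep_found 'pattern' /some/dir
```

xargs exits with a non-zero status if any grep batch finds no match, so the exit code alone does not tell you whether the search as a whole found something.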

Remember: if the files are on the same disk, a single grep command is the fastest.