file_A (~500MB, 1.6M lines) consists of equal-length search terms, one per line, unsorted.
file_B consists of equal-length text lines, one per line, unsorted.
I've been able to run "grep -F -f file_A file_B >> output.txt" with a file_B of any size without problems on a box with 52GB of RAM. The problem is that I'm now limited to 4GB of RAM, so file_A is too large for this to run without exhausting available memory.
Short of manually chopping up file_A into smaller pieces, is there an easy way to script this so that it greps for the first 1000 lines of file_A, then, when that's finished, automatically greps for lines 1001-2000, etc., until I've gone through all of file_A?
Best Answer
Loop through chunks of file_A, sending them as stdin to the same grep statement; adjust 1000 to your available memory:
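A minimal sketch of such a loop, assuming a POSIX shell and a grep that accepts "-f -" to read its patterns from stdin (GNU grep does). The function name grep_chunked and its argument order are made up for illustration; the default chunk size of 1000 is only a starting point to tune.

```shell
#!/bin/sh
# Hypothetical helper: grep TEXT for the fixed strings listed in PATTERNS,
# feeding grep only CHUNK pattern lines at a time to keep memory use bounded.
# Usage: grep_chunked PATTERNS TEXT [CHUNK]
grep_chunked() {
    patterns=$1          # e.g. file_A
    text=$2              # e.g. file_B
    chunk=${3:-1000}     # pattern lines per grep run (assumed default)

    # Count pattern lines; grep -c '' also counts a final unterminated line.
    total=$(grep -c '' "$patterns")

    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + chunk - 1))
        # Print lines start..end of the pattern file, quit at line end,
        # and hand that chunk to grep as its pattern list on stdin.
        sed -n "${start},${end}p;${end}q" "$patterns" |
            grep -F -f - "$text"
        start=$((end + 1))
    done
}
```

It would be called as "grep_chunked file_A file_B 1000 >> output.txt". Two caveats: a line of file_B that matches terms from more than one chunk is written once per chunk, so pipe the result through "sort -u" if duplicates matter, and sed re-reads file_A from the top for every chunk, which gets slow with very small chunk sizes. If GNU split is available, "split -l 1000 --filter='grep -F -f - file_B' file_A >> output.txt" should do the same chunking in a single pass over file_A.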