Bash – Processing commands in parallel per batch

bash · parallelism · shell-script

So, I have 10 CPU cores and 20 pieces of data to process. I want to process the data in parallel, but I am afraid that if I launch all 20 at once it will cause problems. So I want to process the data in two batches of 10. Is there a command to do this?

Add info:

The data are files, and they are quite large: each file can reach 10 GB. In my experience, if I launch more than 10 processes the PC becomes really slow and even lags, so I am limiting the number of processes to 10, which is equal to the number of cores. As for RAM, I believe the software that processes each file does not load everything at once, so RAM usage is quite low. That is why I just need to run the processing in parallel in groups of 10 files. For now, I generate 10 shell scripts that I launch in parallel, and each shell script contains a sequential list of commands, roughly like the sketch below.
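
To illustrate the current approach: a minimal plain-bash sketch that splits the files across 10 worker scripts and launches them in parallel. Here my_process and the data_* glob are placeholders, not the actual command or file names:

    # start from clean worker scripts
    rm -f worker_*.sh

    # distribute the input files round-robin across 10 sequential scripts
    i=0
    for f in data_*; do
        echo "my_process '$f'" >> "worker_$((i % 10)).sh"
        i=$((i + 1))
    done

    # launch all 10 worker scripts in parallel, then wait for them to finish
    for w in worker_*.sh; do
        bash "$w" &
    done
    wait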

Best Answer

Using GNU Parallel:

parallel my_process {} ::: files*

This will run one my_process file job per CPU thread.
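
If you specifically want at most 10 jobs running at a time (rather than one per CPU thread), you can cap the job count with -j:

parallel -j 10 my_process {} ::: files*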

You can tell GNU Parallel to make sure there is 10G of RAM free before it starts the next job:

parallel --memfree 10G my_process {} ::: files*

If free memory drops below 5G (half of the --memfree value), GNU Parallel will kill the newest job and restart it once 10G is free again.
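
The two limits can be combined; my_process and files* are the same placeholders as above:

parallel -j 10 --memfree 10G my_process {} ::: files*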
