Bash – Inner for loop when run in background in bash spawns new bash process

Tags: background-process, bash, for, pipe

I am running the script below:

LOGDIR=~/curl_result_$(date |tr ' :' '_')

mkdir $LOGDIR

for THREADNO in $(seq 20)
do
    for REQNO in $(seq 20)
    do
        time curl --verbose -sS  http://dummy.restapiexample.com/api/v1/create --trace-ascii ${LOGDIR}/trace_${THREADNO}_${REQNO} -d @- <<EOF >> ${LOGDIR}/response_${THREADNO} 2>&1
 {"name":"emp_${THREADNO}_${REQNO}_$(date |tr ' :' '_')","salary":"$(echo $RANDOM%100000|bc)","age":"$(echo $RANDOM%100000|bc)"}
EOF
        echo -e "\n-------------------------------" >> ${LOGDIR}/response_${THREADNO}
    done 2>&1 | grep real > $LOGDIR/timing_${THREADNO} &
done

After some time, if I check the number of bash processes, it shows 20 (not 1 or 21):

ps|grep bash|wc -l

My question: since I have not enclosed the inner loop in parentheses "()", I expected that no new shell process would be spawned. Why are there 20?
I want to avoid creating new shells because the CPU usage nears 100%.
I don't know if it matters, but I am using Cygwin.

Best Answer

Because you have piped the loop into grep, it must be run in a subshell. This is mentioned in the Bash manual:

Each command in a pipeline is executed in its own subshell, which is a separate process (see Command Execution Environment)
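The subshell can be observed directly by comparing `$BASHPID` inside a plain loop and inside a piped loop. This is a minimal sketch, not the script from the question; the variable names and the temporary file are only there for illustration.

```shell
#!/usr/bin/env bash
# Sketch: which process runs the loop body, with and without a pipe?
tmp=$(mktemp)          # scratch file used to get the PID out of the subshell

main_pid=$BASHPID

# Plain loop: runs in the current shell, so the assignment survives the loop.
for i in 1; do plain_pid=$BASHPID; done

# Piped loop: runs in a subshell, so an assignment would be lost.
# Report the PID through the pipe instead.
for i in 1; do echo "$BASHPID"; done | cat > "$tmp"
piped_pid=$(<"$tmp")
rm -f "$tmp"

echo "main shell: $main_pid"
echo "plain loop: $plain_pid"
echo "piped loop: $piped_pid"
```

The first two PIDs should match, while the third belongs to a separate bash process created for that side of the pipe.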

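It is possible to avoid that for the final command in the pipeline with the lastpipe shell option (shopt -s lastpipe, which only takes effect when job control is not active), but not for any of the others. Your loop is the first command in the pipeline, so lastpipe would not help it. In any case, you've put the whole pipeline into the background, which also creates a subshell.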
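A small sketch of what lastpipe does and does not change, assuming it runs as a script (job control off). The loop contents and variable names are made up for the demonstration.

```shell
#!/usr/bin/env bash
set +m               # make sure job control is off (the default in scripts)
shopt -s lastpipe    # run the last command of a pipeline in the current shell

main_pid=$BASHPID
count=0

# First stage: a for loop, still executed in a subshell.
# Last stage: a while loop, executed in the current shell because of lastpipe,
# so its variable assignments are still visible after the pipeline ends.
for i in 1 2 3; do echo "$BASHPID"; done | while read -r pid; do
    count=$((count + 1))
    first_stage_pid=$pid
done

echo "lines read by last stage: $count"
echo "main shell PID:           $main_pid"
echo "first stage PID:          $first_stage_pid"
```

The counter survives because the while loop ran in the main shell, but the PID reported by the for loop is still that of a different process.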

There is no way around this. What you're doing inherently does require separate shell processes in order to work: even ignoring the pipeline, creating a background process requires creating a process.
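The same holds with no pipe at all. As a rough illustration (the names and the temporary file are hypothetical), a loop sent to the background with & reports a PID different from the main shell's, and that PID is the one bash stores in `$!`:

```shell
#!/usr/bin/env bash
tmp=$(mktemp)        # scratch file to collect the PID seen inside the loop

main_pid=$BASHPID

# No pipeline here, only a background job.
for i in 1; do echo "$BASHPID"; done > "$tmp" &
job_pid=$!           # PID of the background job as seen by the parent
wait "$job_pid"

loop_pid=$(<"$tmp")
rm -f "$tmp"

echo "main shell:     $main_pid"
echo "background job: $job_pid"
echo "inside loop:    $loop_pid"
```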

If your issue is the CPU usage, that's caused by running everything at once. If you remove the & after the grep, all the commands will run in sequence instead of simultaneously. There will still be subshells created (for the pipeline), but those are not themselves the main issue in that case. If you need them to run simultaneously, the increased CPU usage is the trade-off you've chosen.
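A scaled-down sketch of the sequential variant follows. It keeps the structure of the question's script but drops the trailing &; `fake_request` is a hypothetical stand-in for the curl call, and the loop counts and temporary directory are arbitrary.

```shell
#!/usr/bin/env bash
logdir=$(mktemp -d)

# Hypothetical stand-in for the curl request in the question.
fake_request() { echo "response for thread $1, request $2"; }

for threadno in 1 2 3; do
    for reqno in 1 2; do
        # 'time' reports on the shell's stderr, which the loop redirects
        # into the pipe so that grep can pick out the "real" lines.
        time fake_request "$threadno" "$reqno" >> "$logdir/response_$threadno"
    done 2>&1 | grep real > "$logdir/timing_$threadno"   # no trailing & here
done

ls "$logdir"
```

Each pipeline finishes before the next outer iteration starts, so only one loop subshell (plus grep) is alive at any moment instead of all of them at once.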
