Bash – Why does ‘jobs’ always return a line for finished processes when run in a subshell within a script

bash, jobs, shell, shell-script

Normally, when a job is launched in the background, jobs will report that it is finished the first time it is run after the job's completion, and nothing for subsequent executions:

$ ping -c 4 localhost &>/dev/null &
[1] 9666
$ jobs
[1]+  Running                 ping -c 4 localhost &> /dev/null &
$ jobs
[1]+  Done                    ping -c 4 localhost &> /dev/null
$ jobs  ## returns nothing
$ 

However, when run in a subshell within a script, it seems to always produce output. This script will never exit:

#!/usr/bin/env bash
ping -c 3 localhost &>/dev/null &
while [[ -n $(jobs) ]]; do
    sleep 1
done

If I use tee in the [[ ]] construct to see the output of jobs, I see that it is always printing the Done ... line: not just once, as I expected, but, apparently, forever.

What is even stranger is that running jobs within the loop causes it to exit as expected:

#!/usr/bin/env bash
ping -c 3 localhost &>/dev/null &
while [[ -n $(jobs) ]]; do
    jobs
    sleep 1
done

Finally, as pointed out by @mury, the first script works as expected and exits if run from the command line:

$ ping -c 5 localhost &>/dev/null & 
[1] 13703
$ while [[ -n $(jobs) ]]; do echo -n . ; sleep 1; done
...[1]+  Done                    ping -c 5 localhost &> /dev/null
$ 

This came up when I was answering a question on Super User, so please don't post answers recommending better ways of doing what that loop does; I can think of a few myself. What I am curious about is:

  1. Why does jobs act differently within the [[ ]] construct? Why does it always print the Done... line there, when it doesn't when run manually?

  2. Why does running jobs within the loop change the behavior of the script?

Best Answer

You know, of course, that $(…) causes the command(s) within the parentheses to run in a subshell.  And you know, of course, that jobs is a shell builtin.  Well, it looks like jobs clears a job from the shell’s memory once its death has been reported.  But when you run $(jobs), the jobs command runs in a subshell, so it doesn’t get a chance to tell the parent shell (the one that’s running the script) that the death of the background job (ping, in your example) has been reported.  So, each time the shell spawns a subshell to run the $(jobs) thingie, that subshell still has a complete list of jobs (i.e., the ping job is there, even though it’s dead after the third iteration), and so jobs still (again) believes that it needs to report on the status of the ping job (even if it’s been dead for seconds).
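A minimal demonstration of this, using sleep as a stand-in for ping: each $(jobs) runs in a fresh subshell, so the Done entry never goes away until jobs is run in the parent shell itself.

```shell
#!/usr/bin/env bash
sleep 1 &
sleep 2                  # let the background job finish
first=$(jobs)            # subshell: reports the Done line
second=$(jobs)           # still Done: the parent never saw the report
jobs > /dev/null         # builtin runs in the parent; the job is now reported
third=$(jobs)            # empty: the parent has cleared the job table
echo "first=${first:+Done} second=${second:+Done} third=${third:-empty}"
```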

This explains why running an unadulterated jobs command within the loop causes it to exit as expected: once you run jobs in the parent shell, the parent shell knows that the job’s termination has been reported to the user.
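This can be seen by re-running the second script from the question with sleep standing in for ping, plus an iteration counter (the counter and its cap are additions, only there to prove the loop terminates):

```shell
#!/usr/bin/env bash
sleep 2 &
i=0
while [[ -n $(jobs) ]]; do
    jobs > /dev/null       # parent-shell jobs: marks the Done job as reported
    sleep 1
    i=$((i + 1))
    (( i >= 10 )) && break # safety valve; never reached once the table is cleared
done
echo "exited after $i iterations"
```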

Why is it different in the interactive shell?  Because, whenever a foreground child of an interactive shell terminates, the shell reports on any background jobs that have terminated¹ while the foreground process was running.  So, the ping terminates while the sleep 1 is running, and when the sleep terminates, the shell reports on the background job’s death.  Et voilà.


¹ It might be more accurate to say “any background jobs that have changed state while the foreground process was running.”  I believe that it might also report on jobs that have been suspended (kill -TSTP, the programmatic equivalent of Ctrl+Z) or become unsuspended (kill -CONT, which is what the fg command sends).
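The footnote’s claim about state changes can be sketched as follows, under the assumption that job control is switched on with set -m (scripts have it off by default, and without it the job table may not track stop/continue transitions):

```shell
#!/usr/bin/env bash
set -m                    # enable job control in a non-interactive shell
sleep 5 &
kill -TSTP %1             # programmatic Ctrl+Z
sleep 0.2                 # give the signal time to land
stopped=$(jobs)           # should mention "Stopped"
kill -CONT %1             # resume the job, as fg would
sleep 0.2
running=$(jobs)           # should mention "Running" again
kill %1 2>/dev/null       # clean up the background job
echo "$stopped"
echo "$running"
```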
