Bash – Why does using `yes` in bash pipelines not cause infinite loops?

bash, pipe, shell

According to its documentation, bash waits until all commands in a pipeline have finished running before continuing:

The shell waits for all commands in the pipeline to terminate before returning a value.

So why does the command yes | true finish immediately? Shouldn't yes loop forever, so that the pipeline never returns?

And a subquestion: according to the POSIX spec, a shell may either return from a pipeline after the last command finishes or wait until all the commands finish. Do common shells differ in this respect? Are there any shells where yes | true will loop forever?

Best Answer

When true exits, the read side of the pipe is closed, but yes continues trying to write to the write side. This condition is called a "broken pipe", and it causes the kernel to send a SIGPIPE signal to yes. Since yes neither handles nor ignores this signal, the default action applies and yes is killed. If it ignored the signal, its write call would instead fail with error code EPIPE. Programs that ignore SIGPIPE have to be prepared to notice EPIPE and stop writing, or they will go into an infinite loop.
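Both outcomes can be observed from the shell. The following is a minimal sketch, assuming bash (PIPESTATUS is a bash array, not POSIX) and a shell that starts with SIGPIPE at its default disposition; the variable names are illustrative only. A process killed by a signal is reported with status 128 plus the signal number, and SIGPIPE is signal 13 on Linux.

```bash
#!/usr/bin/env bash

# Default case: yes is killed by SIGPIPE once true has exited.
yes | true
default_status=("${PIPESTATUS[@]}")
echo "yes exited with ${default_status[0]}, true with ${default_status[1]}"

# Ignored case: the subshell ignores SIGPIPE and yes inherits that setting,
# so its write fails with EPIPE and yes has to stop on its own.
ignored_status=$(trap '' PIPE; yes 2>/dev/null | true; echo "${PIPESTATUS[0]}")
echo "with SIGPIPE ignored, yes exited with $ignored_status"
```

On Linux the first line should report 141 for yes. The second should report an ordinary failure status instead, because yes noticed the EPIPE error and exited by itself rather than being killed.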

If you do strace yes | true¹ you can see the kernel preparing for both possibilities:

write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=17556, si_uid=1000} ---
+++ killed by SIGPIPE +++

strace is watching events via the debugger API, which first tells it about the system call returning with an error, and then about the signal. From yes's perspective, though, the signal happens first. (Technically, the signal is delivered after the kernel returns control to user space, but before any more machine instructions are executed, so the write "wrapper" function in the C library does not get a chance to set errno and return to the application.)

¹ Sadly, strace is Linux-specific. Most modern Unixes have some command that does something similar, but it often has a different name, it probably doesn't decode syscall arguments as thoroughly, and sometimes it only works for root.

Related Solutions

Bash script wait for processes and get return code

You can do this by using a temporary directory.

# Create a temporary directory to store the statuses
dir=$(mktemp -d)

# Execute the backgrounded code. Create a file that contains the exit status.
# The filename is the PID of this group's subshell.
for i in 1 2; do
    { ssh mysql "/root/test$i.sh" ; echo "$?" > "$dir/$BASHPID" ; } &
done

# Wait for all jobs to complete
wait

# Get return information for each pid
for file in "$dir"/*; do
    printf 'PID %d returned %d\n' "${file##*/}" "$(<"$file")"
done

# Remove the temporary directory
rm -r "$dir"
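An alternative that avoids the temporary directory is to record each background PID from $! and pass it to wait, which returns that job's exit status. This is a sketch of a different technique, not the method above: job is a hypothetical stand-in for the ssh commands, and the arrays assume bash.

```bash
#!/usr/bin/env bash

# Hypothetical stand-in for the remote command: exits with the given status.
job() { return "$1"; }

# Start the jobs in the background and remember their PIDs in order.
pids=()
for code in 0 3; do
    job "$code" &
    pids+=("$!")
done

# wait PID blocks until that job has finished and returns its exit status.
statuses=()
for pid in "${pids[@]}"; do
    wait "$pid"
    statuses+=("$?")
done

for i in "${!pids[@]}"; do
    printf 'PID %d returned %d\n' "${pids[i]}" "${statuses[i]}"
done
```

Because the statuses are collected in the order the jobs were started, this variant also keeps each result tied to the command that produced it.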

Bash scripting – loop until return value is 0

The [ command is to evaluate conditional expressions. It's of no use here.

Because umount doesn't output anything on its standard output (the errors go to stderr), `sudo umount mount` expands to nothing.

So it's like:

while [ ]
do
  sleep 0.1
done

The [ command, when not passed any argument besides [ and ], returns false (a non-zero exit status), so you will not enter the loop.
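The exit statuses involved can be checked directly. A small sketch, with variable names chosen only for illustration:

```bash
# No expression at all: [ reports false (non-zero).
[ ]; no_args=$?

# A single non-empty word: [ reports true (zero).
[ some-word ]; one_word=$?

echo "no arguments: $no_args, one non-empty word: $one_word"
```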

Even if umount had output its errors on stdout, using the [ command would not have made sense, because the words resulting from that output would never have formed a valid conditional expression.

Here you want:

until sudo umount mount
do
  sleep 0.1
done

That is, you want to check the exit status of sudo/umount, not of a [ command.

If you wanted to check whether umount wrote any error or warning to its stderr, that is where [ could have been useful. -n "some-string" is a conditional expression recognised by the [ command that tests whether "some-string" is non-empty, so something like:

while [ -n "$(sudo umount mount 2>&1 > /dev/null)" ]; do
  sleep 0.1
done

But looking for the presence of error or warning messages is generally a bad idea. The umount command tells us whether it succeeded through its exit status, which is much more reliable. It could succeed and still output a warning message, or fail without outputting an error (for example when it is killed).

In this particular case, note that umount might fail because the directory is not mounted, and you would loop forever in that case, so you could try another approach like:

while mountpoint -q mount && ! sudo umount mount; do
  sleep 0.1
done

Or if "mount" may be mounted several times and you want to unmount them all:

while mountpoint -q mount; do
  sudo umount mount || sleep 0.1
done
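All of the loops above retry without limit. Where that is not acceptable, the same exit-status check can be wrapped in a bounded retry. The following is only a sketch: retry_until_success is a hypothetical helper name, its argument order is an assumption, and it relies on bash arithmetic.

```bash
#!/usr/bin/env bash

# retry_until_success MAX DELAY COMMAND [ARGS...]
# Runs COMMAND until it exits with status 0, at most MAX times,
# sleeping DELAY seconds between attempts.
# Returns 0 on success, 1 if every attempt failed.
retry_until_success() {
    local max=$1 delay=$2 attempt
    shift 2
    for ((attempt = 1; attempt <= max; attempt++)); do
        "$@" && return 0
        # Only sleep if another attempt will follow.
        ((attempt < max)) && sleep "$delay"
    done
    return 1
}

# Example call (not executed here):
#   retry_until_success 50 0.1 sudo umount mount
```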
