Bash – Why does the exit status differ when running `systemctl start; systemctl is-active` and `systemctl is-active` separately

Tags: bash, command line, exit-status

The following sequences give me the return value of the first command, not the second as I would have expected (regardless of whether I run the first command in a subshell):

sudo systemctl start x; sudo systemctl is-active --quiet x; echo $?;
(sudo systemctl start x); sudo systemctl is-active --quiet x; echo $?;

The service x is broken and could not be started, so it is not running. The following command, run stand-alone, gives me the correct return value of 3:

sudo systemctl is-active --quiet x; echo $?;

So why does command; command; echo $? give me the return value of the first command (0) instead of the return value of the second (3)?

I'm on GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu). I know that if I split it across two lines, it works:

sudo systemctl start x;
sudo systemctl is-active --quiet x; echo $?;

But I need it as a one-liner, because I'm passing it to PHP's shell_exec() function. Calling shell_exec() twice gives the same result as putting the commands on one line.

Best Answer

When I encounter an issue like this, I tend to follow Sherlock Holmes’ mantra, and consider what is left, however implausible, once the impossible is eliminated. Of course with computers nothing is impossible, however some things are so unlikely we can ignore them at first. (This makes more sense with the original title, “command; command; echo $? — return value is not correct, why?”)

In this case, if

sudo systemctl start x; sudo systemctl is-active --quiet x; echo $?;

shows that $? is 0, that means that systemctl is-active really did indicate success. The fact that a separate systemctl is-active shows that the service isn’t active strongly suggests that there’s a race between the service and the human operator typing the commands; basically, that the service does start, to a sufficient extent for systemctl start to finish, and systemctl is-active to run and find the service active, but then the service fails, so a human-entered systemctl is-active finds it inactive.

Adding a short delay between systemctl start and systemctl is-active should avoid the false positive.
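
As a rough sketch of that idea, the delay could be wrapped in a small bash function. The name start_and_check and the 2-second default are made up for illustration and are not part of systemctl or of the original answer; the right delay depends on how long the service takes to fail.

```shell
#!/usr/bin/env bash
# Sketch: start a unit, wait briefly, then report whether it is still active.
# start_and_check and its default delay are hypothetical, for illustration only.
start_and_check() {
    local unit=$1
    local delay=${2:-2}   # seconds to wait; an assumed value, tune as needed

    # May return before a broken service has actually failed.
    sudo systemctl start "$unit"

    # Give the service time to fail before asking about its state.
    sleep "$delay"

    # The exit status of this last command becomes the function's status:
    # 0 if the unit is active, non-zero otherwise (3 in the question).
    sudo systemctl is-active --quiet "$unit"
}
```

Called as start_and_check x; echo $?, this should print the status that the stand-alone check reports, provided the delay is long enough. The same idea fits on one line for shell_exec(): sudo systemctl start x; sleep 2; sudo systemctl is-active --quiet x; echo $?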
