Bash – Script dies when parent process is terminated

bash debian shell-script

I have a .NET Core service running on Debian 9; let's call it MyService. At some point this service runs a bash script, update.sh, using Process.Start() with UseShellExecute=true.

This script basically runs apt-get update; apt-get upgrade.

During the package upgrade, the MyService process is terminated. The update script is terminated along with it and apt-get upgrade is killed, leaving packages in an inconsistent state that must be fixed manually.

What I want is for update.sh NOT to be terminated when MyService is terminated.

I tried splitting update.sh into two parts, with the first launching the second in different ways: I tried starting update2.sh with setsid and with nohup, but I always get the same result.
I also tried executing the update2.sh script in a new bash shell with /bin/bash /c "update2.sh", with the same result.

How do I run a script started from a binary so that it is completely detached from the binary's process, and I can kill the binary while the script keeps running?

Here's my environment. MyService is a binary running as a service. update.sh is started by MyService.

.NET Core code to start shell script, inside MyService binary:

var process = new Process();
process.EnableRaisingEvents = true; // to avoid [defunct] sh processes
process.StartInfo.FileName = "/opt/myservice/update.sh";
process.StartInfo.Arguments = "";
process.StartInfo.UseShellExecute = true;
process.StartInfo.CreateNoWindow = true;
process.Start();
process.WaitForExit(10000);
if (process.HasExited)
{
  Console.WriteLine("Exit code: " + process.ExitCode);
}
else
{
  Console.WriteLine("Child process still running after 10 seconds");
}

update.sh:

nohup /opt/myservice/update2.sh > /opt/myservice/update.log &
systemctl stop MyService

update2.sh:

apt-get update >> /opt/myservice/update.log
apt-get -y install --only-upgrade myservice-1.0 >> /opt/myservice/update.log

update2.sh is never executed because it's terminated when MyService is terminated by update.sh.

update.sh returns exit code 143, so it seems it has been killed.

2018-08-16 14:46:14.5215|Running update script: /opt/myservice/update.sh
2018-08-16 14:46:14.5883|Update script /opt/myservice/update.sh returned: 143
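
For reference, a shell reports a process killed by a signal with an exit status of 128 plus the signal number, so 143 corresponds to signal 15, SIGTERM, which is the signal systemctl stop sends by default. A small sketch of that arithmetic; the helper name signal_from_status is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: map an exit status to the signal that caused it.
signal_from_status() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # kill -l N prints the name of signal number N (without the SIG prefix).
        kill -l "$((code - 128))"
    else
        echo "none"
    fi
}

# Demo: a child shell that sends SIGTERM to itself is reported with status 143.
status=0
sh -c 'kill -TERM $$' || status=$?
echo "status=$status signal=$(signal_from_status "$status")"
```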

UPDATE

I tried the following approaches (thanks for the suggestions):

  • setsid
  • disown
  • nohup
  • screen
  • tmux
  • unshare

Every approach has the same result: all spawned processes are terminated.
I suspect this is a .NET Core "feature".

UPDATE 2

I discovered that systemctl stop MyService, by default, explicitly kills every process spawned by the service.

https://stackoverflow.com/questions/40898077/systemd-systemctl-stop-aggressively-kills-subprocesses

If I add KillMode=process to the service unit file, the update script is not terminated when the service is terminated.

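There is NO WAY for a process spawned by a service started through systemctl to escape with these techniques: systemd tracks the service's processes by control group (cgroup), not by parent PID or session. None of the techniques I tried, including the one in the accepted answer, moves the spawned process out of the service's cgroup. Every spawned process is always killed by systemctl stop MyService unless KillMode=process is specified.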
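
One way to see why setsid and similar tools make no difference here: they change the session, while a child stays in its parent's control group. A minimal sketch, assuming a Linux /proc filesystem and setsid from util-linux; the variable names are only illustrative:

```shell
#!/bin/sh
# Commands that report the session ID (6th field of /proc/self/stat)
# and the cgroup membership of whichever process runs them.
probe_sid='cut -d" " -f6 /proc/self/stat'
probe_cgroup='cat /proc/self/cgroup'

# The session ID changes when the child is started through setsid...
echo "session, plain child: $(sh -c "$probe_sid")"
echo "session, via setsid:  $(setsid sh -c "$probe_sid")"

# ...but the cgroup membership is the same in both cases.
printf 'cgroup, plain child:\n%s\n' "$(sh -c "$probe_cgroup")"
printf 'cgroup, via setsid:\n%s\n' "$(setsid sh -c "$probe_cgroup")"
```

Run from inside a service, both cgroup listings should name that service's unit, which is what systemctl stop acts on.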

I ended up creating a separate service, MyServiceUpdater, which runs the plain updater script without any forking. Since it runs in its own control group, separate from that of MyService, everything works as expected. That was a long ride.

MyServiceUpdater example:

[Unit]
Description=Your Service Updater
After=network.target

[Service]
ExecStart=/path/to/update/script/updatescript.sh
ExecStopPost=
TimeoutStopSec=30
StandardOutput=null
WorkingDirectory=/path/to/service/directory/
KillMode=process

[Install]
WantedBy=multi-user.target
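
With a unit like this installed, MyService (or its update.sh) only has to ask systemd to start the updater, and the script then runs under the updater unit rather than as a child of MyService. A minimal sketch, assuming the unit is installed as MyServiceUpdater.service; the start_updater helper and the SYSTEMCTL override are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: ask systemd to start the updater unit and return at once.
# SYSTEMCTL may be overridden (for example with "echo systemctl") to preview
# the command instead of running it.
start_updater() {
    unit=${1:-MyServiceUpdater.service}
    # --no-block queues the start job without waiting for it to finish.
    ${SYSTEMCTL:-systemctl} --no-block start "$unit"
}
```

Calling start_updater with no argument uses the assumed unit name; because of --no-block the caller does not wait for the update to complete.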

Best Answer

On a CentOS 7 test system, with .NET Core installed via

$ sudo rpm -Uvh https://packages.microsoft.com/config/rhel/7/packages-microsoft-prod.rpm
$ sudo yum install dotnet-sdk-2.1

which results in dotnet-sdk-2.1-2.1.400-1.x86_64 being installed, and with the test code

using System;
using System.Diagnostics;
using System.ComponentModel;
namespace myApp {
    class Program {
        static void Main(string[] args) {
            var process = new Process();
            process.EnableRaisingEvents = true; // to avoid [defunct] sh processes
            process.StartInfo.FileName = "/var/tmp/foo";
            process.StartInfo.Arguments = "";
            process.StartInfo.UseShellExecute = true;
            process.StartInfo.CreateNoWindow = true;
            process.Start();
            process.WaitForExit(10000);
            if (process.HasExited) {
                Console.WriteLine("Exit code: " + process.ExitCode);
            } else {
                Console.WriteLine("Child process still running after 10 seconds");
            }
        }
    }
}

and a shell script as /var/tmp/foo, strace stalls out and shows that /var/tmp/foo is run through xdg-open, which on my system does... I'm not sure what; it seems a needless complication.

$ strace -o foo -f dotnet run
Child process still running after 10 seconds
^C
$ grep /var/tmp/foo foo
25907 execve("/usr/bin/xdg-open", ["/usr/bin/xdg-open", "/var/tmp/foo"], [/* 37 vars */] <unfinished ...>
...

A simpler solution is to exec a program directly (which in turn can be a shell script that does what you want); for .NET this requires not using the shell:

            process.StartInfo.UseShellExecute = false;

With this set, strace shows that /var/tmp/foo is being run via a (much simpler) execve(2) call:

26268 stat("/var/tmp/foo", {st_mode=S_IFREG|0755, st_size=37, ...}) = 0
26268 access("/var/tmp/foo", X_OK)      = 0
26275 execve("/var/tmp/foo", ["/var/tmp/foo"], [/* 37 vars */] <unfinished ...>

and that .NET refuses to exit:

$ strace -o foo -f dotnet run
Child process still running after 10 seconds
^C^C^C^C^C^C^C^C

because foo replaces itself with something that ignores most signals (notably not USR2; failing that there is always KILL, but avoid using that!):

$ cat /var/tmp/foo
#!/bin/sh
exec /var/tmp/stayin-alive
$ cat /var/tmp/stayin-alive
#!/usr/bin/perl
use Sys::Syslog;
for my $s (qw(HUP INT QUIT PIPE ALRM TERM CHLD USR1)) {
   $SIG{$s} = \&shandle;
}
openlog( 'stayin-alive', 'ndelay,pid', LOG_USER );
while (1) {
    syslog LOG_NOTICE, "oh oh oh oh oh stayin alive";
    sleep 7;
}
sub shandle {
    syslog LOG_NOTICE, "nice try - @_";
}

daemonize

With a process that disassociates itself from the parent and a shell script that runs a few commands (hopefully equivalent to your intended apt-get update; apt-get upgrade)

$ cat /var/tmp/a-few-things
#!/bin/sh
sleep 17 ; echo a >/var/tmp/output ; echo b >>/var/tmp/output

we can modify the .NET program to run /var/tmp/solitary /var/tmp/a-few-things

            process.StartInfo.FileName = "/var/tmp/solitary";
            process.StartInfo.Arguments = "/var/tmp/a-few-things";
            process.StartInfo.UseShellExecute = false;

which when run causes the .NET program to exit fairly quickly

$ dotnet run
Exit code: 0

and, eventually, the /var/tmp/output file does contain two lines written by a process that was not killed when the .NET program went away.
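
The source of /var/tmp/solitary is not reproduced above. As a rough stand-in, not the original program, something like the following shell function starts a command in a new session, detached from the caller's standard streams, and returns immediately. It assumes setsid from util-linux is available; note that this only detaches from the parent process and its session, not from a systemd service's control group.

```shell
#!/bin/sh
# Hypothetical stand-in for /var/tmp/solitary: run "$@" in its own session,
# in the background, with stdin/stdout/stderr detached from the caller.
solitary() {
    setsid "$@" </dev/null >/dev/null 2>&1 &
}

# When saved as a script, pass the command line straight through.
if [ "$#" -gt 0 ]; then
    solitary "$@"
fi
```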

You should probably save the output from the APT commands somewhere, and may also need something to ensure that two (or more!) updates do not try to run at the same time, etc. This version does not stop for questions and ignores any TERM signals (INT may also need to be ignored).

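#!/bin/sh
# Ignore SIGTERM; an ignored signal stays ignored in the apt-get child processes.
trap '' TERM
# Stop at the first command that fails.
set -e
apt-get --yes update
apt-get --yes upgrade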
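
The two caveats mentioned above (keeping the output and preventing concurrent runs) could be handled along these lines. This is a sketch only: it assumes flock from util-linux is installed, and the run_locked helper, the exit status 75 and the lock file path are arbitrary choices for illustration, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: run a command while holding an exclusive lock on a file.
# Returns 75 without running the command if another process holds the lock.
run_locked() {
    lockfile=$1
    shift
    (
        # Take the lock on file descriptor 9 without waiting (-n).
        flock -n 9 || { echo "another update is already running" >&2; exit 75; }
        "$@"
    ) 9>"$lockfile"
}

# Example call (not executed here); output is appended to a log file:
#   run_locked /var/lock/myservice-update.lock /opt/myservice/update2.sh \
#       >>/opt/myservice/update.log 2>&1
```

The lock is released automatically when the command exits, so a crashed update does not leave a stale lock behind.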