Shell – Program to Send Notification Email When a Process Finishes

I am a computational scientist, and I run a lot of lengthy calculations on Linux. Specifically, I run molecular dynamics (MD) simulations using the GROMACS package. These simulations can take days or weeks, running on (for example) 8 to 24 cores. I have access to several nodes of a cluster, which means that at any given time, I am running approximately 4 or 5 jobs (each on a different node, and each on 8-24 cores).

The problem is that the simulations take a variable amount of time. I like to keep all nodes working on simulations around the clock, but to start a new simulation, I need to log in with a terminal and do some manual work. I always forget how much time is left in a simulation, so I end up constantly checking on them.

Is there any way that I can receive an e-mail when a Linux process finishes? Could there be a Linux program that does this? That way I would know when to log in with a terminal and prepare the next simulation.

I am using Ubuntu Linux. Thanks for your time.

Best Answer

When a job submitted to the at daemon completes, anything it wrote to stdout or stderr is mailed to you. With the -m option, at sends mail even if the job produced no output. Jobs also run without a controlling terminal, so you don't have to worry about the effect that closing your terminal will have on the job.

example:

echo "/opt/product/bin/job.sh data123"|at -m NOW

When this job completes, the user who submitted it will receive an email containing any output the job produced. You can change the email recipient by changing the LOGNAME environment variable.
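Since the mail contains whatever the job prints, a small wrapper can make the notification more useful by reporting the exit status and timing. The following is only a sketch: run_and_report is a hypothetical helper name, not something that ships with at.

```shell
#!/bin/sh
# run_and_report: hypothetical helper, not part of at.
# Runs the given command, then prints a short summary. Because at mails
# the job's stdout and stderr, the summary ends up in the notification.
run_and_report() {
    start=$(date)
    "$@"
    rc=$?
    echo "Command: $*"
    echo "Host: $(uname -n)"
    echo "Started: $start"
    echo "Finished: $(date)"
    echo "Exit status: $rc"
    return "$rc"
}

# When saved as a script, run whatever command was passed as arguments.
if [ "$#" -gt 0 ]; then
    run_and_report "$@"
fi
```

Assuming the script is saved somewhere like /path/to/run_and_report.sh (a placeholder path) and made executable, it could be submitted with: echo "/path/to/run_and_report.sh /opt/product/bin/job.sh data123"|at -m NOW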

at has a batch mode where you can queue jobs to run when the system is not busy. This is not a very good queueing system when multiple users are competing for resources, but nonetheless, if you wanted to run jobs with it:

echo "/opt/product/bin/job.sh dataA"|batch
echo "/opt/product/bin/job.sh dataB"|batch
echo "/opt/product/bin/job.sh dataC"|batch

By default, the jobs will not start unless the system load average is below 1.5, but that limit can be adjusted (and with 24 cores I'd say you should). Jobs can run in parallel as long as they don't push the load average over the limit; if each job on its own pushes it over the limit, they will run one at a time.
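The limit is set with the -l option of atd (see the atd man page). To get a feel for whether a queued batch job would be allowed to start, you can compare the current load against a limit yourself. This is only a rough sketch: load_below_limit is a hypothetical helper, it assumes a Linux /proc/loadavg, and it approximates the daemon's check rather than reproducing it.

```shell
#!/bin/sh
# load_below_limit LIMIT [LOAD]
# Succeeds when LOAD (default: the 1-minute average from /proc/loadavg)
# is numerically below LIMIT. Hypothetical helper, not part of at.
load_below_limit() {
    limit=$1
    # Only read /proc/loadavg when no load value was passed in.
    load=${2:-$(cut -d ' ' -f 1 /proc/loadavg)}
    # awk does the floating-point comparison; "+ 0" forces numeric context.
    awk -v load="$load" -v limit="$limit" \
        'BEGIN { exit !(load + 0 < limit + 0) }'
}
```

For example, load_below_limit 1.5 && echo "batch jobs could start" tests the current load against the default limit; on a 24-core node you would pass something closer to the core count.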

You can view the job queue with atq and delete jobs with atrm.

Answer dependencies:

  1. Running atd daemon ( ps -ef|grep atd )
  2. You are allowed to submit jobs to atd (not denied by the /etc/at.deny or /etc/at.allow configuration)
  3. Functional sendmail MTA

Most systems have no problem with these requirements, but it's worthwhile to check.
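The permission check in item 2 can be sketched in shell. This follows the rules in the at man page as I understand them: root may always use at; if /etc/at.allow exists, only users listed in it may; otherwise everyone not listed in /etc/at.deny may; and if neither file exists, only root may. The function name at_allowed is hypothetical, and the file paths are parameters so it can be tried against copies.

```shell
#!/bin/sh
# at_allowed USER [ALLOW_FILE] [DENY_FILE]
# Returns 0 if USER may use at, 1 if not, 2 if the deciding file exists
# but cannot be read (on some systems only root can read these files).
# Hypothetical helper, not part of at.
at_allowed() {
    user=$1
    allow=${2:-/etc/at.allow}
    deny=${3:-/etc/at.deny}
    [ "$user" = root ] && return 0
    if [ -e "$allow" ]; then
        [ -r "$allow" ] || return 2
        # Files list one username per line; match whole lines literally.
        grep -qxF -- "$user" "$allow"
    elif [ -e "$deny" ]; then
        [ -r "$deny" ] || return 2
        ! grep -qxF -- "$user" "$deny"
    else
        return 1
    fi
}
```

Usage would look like at_allowed "$(id -un)" && echo "may submit at jobs". For item 3, at sends mail through /usr/sbin/sendmail, so [ -x /usr/sbin/sendmail ] is a quick first check that an MTA is installed, though it does not prove that delivery works.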
