I would like to avoid doing this by launching the process from a monitoring app.
Process – How to Check How Long a Process Has Been Running
Related Solutions
The traditional method to accomplish this would be to have your script check for the existence of a file in /var/run when it starts; if none exists, it creates one containing its own PID. On completion, the script removes this file. If the file does exist, the script simply exits. In this way, regardless of how frequently the script is called, it will only execute its main code if it is not already running.
The problem with this approach is that an unclean termination leaves this file present on the system, so it is often augmented with a check to see if the specified PID exists and whether that PID is for the correct script.
This method does require that you change your script rather than simply amending your crontab entry, but it is a time-honoured mechanism for solving this kind of problem.
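The PID-file approach described above could be sketched roughly as follows. This is only an illustration: the function names (pid_is_running, acquire_pidfile, release_pidfile) and the example path are made up, and the check-then-write step is not atomic, so two instances started at the same instant could still both get through.

```shell
#!/bin/sh
# Sketch of a PID-file guard. All function names here are hypothetical.

# pid_is_running PID [EXPECTED]
# Succeeds if PID exists and, when EXPECTED is given, its command line
# contains that string.
pid_is_running() {
    # kill -0 delivers no signal; it only reports whether we could signal PID.
    # Note that it also fails for a live process owned by another user.
    kill -0 "$1" 2>/dev/null || return 1
    if [ -n "${2:-}" ]; then
        ps -p "$1" -o args= | grep -F -q -- "$2" || return 1
    fi
    return 0
}

# acquire_pidfile FILE [EXPECTED]
# Returns 0 after recording our PID in FILE, or 1 if a live instance owns it.
acquire_pidfile() {
    if [ -f "$1" ]; then
        old_pid=$(cat "$1" 2>/dev/null)
        # Ignore anything that is not a plain number.
        case $old_pid in
            ''|*[!0-9]*) old_pid= ;;
        esac
        if [ -n "$old_pid" ] && pid_is_running "$old_pid" "${2:-}"; then
            return 1
        fi
        # Otherwise the file is stale (unclean exit); fall through and replace it.
    fi
    printf '%s\n' "$$" > "$1"
}

# release_pidfile FILE
release_pidfile() {
    rm -f "$1"
}

# Typical use at the top of a cron-driven script (path and name are examples):
#   acquire_pidfile /var/run/myscript.pid myscript || exit 0
#   trap 'release_pidfile /var/run/myscript.pid' EXIT
#   ... main work ...
```

Passing the optional second argument covers the "is that PID really my script" check; leaving it out only tests that the PID exists.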
I wrote process_watcher.py
process_watcher --pid 1234 --to me@gmail.com
Currently, email body looks like:
PID 18851: /usr/lib/libreoffice/program/soffice.bin --writer --splash-pipe=5
Started: Thu, Mar 10 18:33:37 Ended: Thu, Mar 10 18:34:26 (duration 0:00:49)
Memory (current/peak) - Resident: 155,280 / 155,304 kB Virtual: 1,166,968 / 1,188,216 kB
[+] indicates the argument may be specified multiple times, for example:
process-watcher -p 1234 -p 4258 -c myapp -c "exec\d+" --to person1@domain.com --to person2@someplace.com
optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     process ID(s) to watch [+]
  -c COMMAND_PATTERN, --command COMMAND_PATTERN
                        watch all processes matching the command name. (RegEx pattern) [+]
  -w, --watch-new       watch for new processes that match --command. (run forever)
  --to EMAIL_ADDRESS    email address to send to [+]
  -n, --notify          send DBUS Desktop notification
  -i SECONDS, --interval SECONDS
                        how often to check on processes. (default: 15.0 seconds)
  -q, --quiet           don't print anything to stdout
Create a GitHub issue if you need any improvements to it.
Best Answer
On Linux with the ps from procps(-ng) (and most other systems, since this is specified by POSIX):
    ps -o etime= -p "$$"
Where $$ is the PID of the process you want to check. This will return the elapsed time in the format [[dd-]hh:]mm:ss.
Using -o etime tells ps that you just want the elapsed time field, and the = at the end of that suppresses the header (without it, you get a line which says ELAPSED and then the time on the next line; with it, you get just one line with the time).
Or, with newer versions of the procps-ng tool suite (3.3.0 or above) on Linux, or on FreeBSD 9.0 or above (and possibly others), use:
    ps -o etimes= -p "$$"
(with an added s) to get the time formatted just as seconds, which is more useful in scripts.
On Linux, the ps program gets this from /proc/$$/stat, where one of the fields (see man proc) is the process start time. This is, unfortunately, specified to be the time in jiffies (an arbitrary time counter used in the Linux kernel) since the system boot. So you have to determine the time at which the system booted (from /proc/stat), the number of jiffies per second on this system, and then do the math to get the elapsed time in a useful format.
It turns out to be ridiculously complicated to find the value of HZ (that is, jiffies per second). From comments in sysinfo.c in the procps package, one can A) include the kernel header file and recompile if a different kernel is used, B) use the POSIX sysconf() function, which, sadly, uses a hard-coded value compiled into the C library, or C) ask the kernel, but there's no official interface for doing that. So, the ps code includes a series of kludges by which it determines the correct value. Wow.
So it's convenient that ps does that all for you. :)
As user @336_ notes, on Linux (this is not portable), you can use the stat command to look at the access, modification, or status change dates for the directory /proc/$$ (where again $$ is the process of interest). All three numbers should be the same, so
    stat -c%X /proc/$$
will give you the time that process $$ started, in seconds since the epoch. That still isn't quite what you want, since you still need to do the math to subtract that from the current time to get the elapsed time. I guess something like date +%s --date="now - $( stat -c%X /proc/$$ ) seconds" would work, but it's a bit ungainly. One possible advantage is that if you use the long-format output like -c%x instead of -c%X, you get greater resolution than whole-number seconds. But if you need that, you should probably use a process-auditing approach, because the timing of running the stat command is going to interfere with accuracy.
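For scripts, two of the approaches above can be wrapped in small helpers. The sketch below is illustrative only, and every function name in it is hypothetical. The first helper converts the portable [[dd-]hh:]mm:ss output of etime into seconds, for systems whose ps lacks etimes. The others do the /proc arithmetic by hand, under two assumptions that differ slightly from the description above: the start time field is taken to be in the clock ticks reported by getconf CLK_TCK (as proc(5) documents for current kernels), and /proc/uptime is used in place of the boot time from /proc/stat, which avoids one subtraction.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of ps or procps.

# etime_to_seconds "[[dd-]hh:]mm:ss" -> prints whole seconds
etime_to_seconds() {
    printf '%s\n' "$1" | awk '{
        n = split($1, part, /[-:]/)      # using $1 drops the padding ps adds
        if (n < 2 || n > 4) exit 1       # expect 2 to 4 components
        mult[0] = 1; mult[1] = 60; mult[2] = 3600; mult[3] = 86400
        total = 0
        for (i = n; i >= 1; i--) total += part[i] * mult[n - i]
        printf "%d\n", total
    }'
}

# stat_start_ticks "contents of /proc/PID/stat" -> prints field 22 (starttime)
stat_start_ticks() {
    # The command name (field 2) may contain spaces or parentheses, so cut
    # everything through the last ") " before counting fields.
    rest=${1##*") "}
    printf '%s\n' "$rest" | awk '{ print $20 }'   # field 22 of the full line
}

# elapsed_from_ticks START_TICKS TICKS_PER_SEC UPTIME_SECS -> prints seconds
elapsed_from_ticks() {
    awk -v s="$1" -v hz="$2" -v up="$3" 'BEGIN { printf "%.2f\n", up - s / hz }'
}

# proc_elapsed_seconds PID -> elapsed seconds read from /proc (Linux only)
proc_elapsed_seconds() {
    stat_line=$(cat "/proc/$1/stat") || return 1
    read -r uptime idle < /proc/uptime || return 1
    elapsed_from_ticks "$(stat_start_ticks "$stat_line")" \
        "$(getconf CLK_TCK)" "$uptime"
}

# Usage (PID 1234 is an example):
#   etime_to_seconds "$(ps -o etime= -p 1234)"
#   proc_elapsed_seconds 1234
```

Where etimes is available, it remains the simpler choice; these helpers only matter when it is missing or when ps itself cannot be used.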