Jobs submitted to the at daemon will mail you anything the job writes to stdout and stderr upon completion, and at can also be told to send mail even if the job has no output. It also has the benefit of running without a controlling terminal, so you don't have to worry about the effect that closing your terminal will have on the job.
Example:
echo "/opt/product/bin/job.sh data123" | at -m now
When this job completes, the user who submitted it will receive an email; if there is any output at all, it will be included in that mail. You can change the recipient by setting the LOGNAME environment variable before submitting.
at also has a batch mode, where queued jobs run only when the system is not busy. It is not much of a queueing system when multiple users are competing for resources, but nonetheless, if you wanted to run jobs with it:
echo "/opt/product/bin/job.sh dataA" | batch
echo "/opt/product/bin/job.sh dataB" | batch
echo "/opt/product/bin/job.sh dataC" | batch
By default the jobs will not start unless the system load average is under 1.5, but that threshold can be adjusted (and with 24 cores I'd say you should raise it). Jobs can run in parallel as long as they don't push the load average over the limit (1.5 by default, again); if a single job pushes it over, the remaining jobs will run serially.
You can view the job queue with atq, and delete jobs with atrm.
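Putting the pieces above together, a session might look like the following. This is a sketch: the -l flag on atd (which sets the load limit) is standard in common at implementations, but the job number you pass to atrm is whatever atq reports on your system.

```shell
# Start atd with a load limit better suited to a 24-core box than the
# default 1.5 (assumes you have stopped the running atd first).
atd -l 20

# Queue a job for the idle-time batch queue, then inspect and prune it.
echo "/opt/product/bin/job.sh dataA" | batch
atq        # lists queued jobs with their job numbers
atrm 3     # removes job number 3 (use a number atq actually shows)
```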
Answer dependencies:
- A running atd daemon (check with ps -ef | grep atd)
- Permission to submit jobs to atd (not denied by the /etc/at.deny / /etc/at.allow configuration)
- A functional sendmail MTA
Most systems have no problem with these requirements, but it's worthwhile to check.
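Checking all three is quick to script. A minimal sketch, assuming the usual default paths for the allow/deny files (they can differ by distro):

```shell
# Report on the three dependencies listed above.
check_at_deps() {
  # 1. Is the atd daemon running?
  if pgrep -x atd >/dev/null 2>&1; then
    echo "atd: running"
  else
    echo "atd: not running"
  fi
  # 2. Do the access-control files exist? (Their contents decide whether
  #    your user may submit jobs; absent files fall back to defaults.)
  for f in /etc/at.allow /etc/at.deny; do
    if [ -e "$f" ]; then echo "$f: present"; else echo "$f: absent"; fi
  done
  # 3. Is a sendmail binary on the PATH?
  if command -v sendmail >/dev/null 2>&1; then
    echo "sendmail: found"
  else
    echo "sendmail: not found"
  fi
}

check_at_deps
```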
When you run strace, the lines it prints are system calls. In case it wasn't obvious, epoll_wait() is one such call; you can run man epoll_wait to find out implementation details, like so:
epoll_wait, epoll_pwait - wait for an I/O event on an epoll file descriptor
The description for epoll:
The epoll API performs a similar task to poll(2): monitoring multiple file descriptors to see if I/O is possible on any of them. The epoll API can be used either as an edge-triggered or a level-triggered interface and scales well to large numbers of watched file descriptors.
So it would seem that your process is blocking on file descriptors, waiting to see if I/O is possible on any of them.
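If you want to squeeze a bit more out of strace itself first, you can filter it to just the polling calls, or ask for a summary. Both -e trace= and -c are standard strace options; <pid> is your process:

```shell
# Show only the epoll waits, hiding the rest of the syscall noise.
strace -e trace=epoll_wait,epoll_pwait -p <pid>

# Or attach for a while, hit Ctrl-C, and get a per-syscall count/time
# summary instead of a raw stream.
strace -c -p <pid>
```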
I would change my tactics a bit and make use of lsof -p <pid> to see if you can narrow down what these file descriptors actually are.
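For example (a sketch; substitute the PID you were stracing, and note that on Linux you can fall back to /proc if lsof isn't installed):

```shell
# Inspect the open file descriptors of a process. Using the current
# shell's own PID here purely as a stand-in.
pid=$$
lsof -p "$pid" 2>/dev/null || ls -l "/proc/$pid/fd"
```

The fd numbers shown here match the ones that appear as the first argument in the strace output, so you can map each epoll'd descriptor back to a file, pipe, or socket.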
I don't think there's any better method than just observing the log file growing or failing to grow. Even that isn't foolproof, since the accounting can be sent to an alternate file, and even on a normal system with nothing weird happening there will be a cron job that stops accounting, rotates the log, and restarts it, so there's a brief window in which you'll get the wrong answer.
Maybe there should be a symlink to the current accounting file somewhere in /proc, but there isn't one.
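The "watch the file grow" check can be sketched like this. The accounting file path varies by distro (/var/log/account/pacct, /var/account/pacct, ...), so it's passed in as an argument rather than assumed, and the caveats above (alternate files, the rotation window) still apply:

```shell
# grew FILE [SECONDS]: report whether FILE gained bytes over the interval.
grew() {
  f=$1
  secs=${2:-5}
  # GNU stat first, BSD stat as a fallback.
  s1=$(stat -c %s "$f" 2>/dev/null || stat -f %z "$f")
  sleep "$secs"
  s2=$(stat -c %s "$f" 2>/dev/null || stat -f %z "$f")
  if [ "$s2" -gt "$s1" ]; then
    echo "growing"
  else
    echo "not growing"
  fi
}

# e.g.: grew /var/log/account/pacct 10
```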