Linux – How to terminate Linux tee command without killing application it is receiving from

Tags: kill, linux, pipe, scripting, tee

I have a bash script that runs as long as the Linux machine is powered on. I start it as shown below:

( /mnt/apps/start.sh 2>&1 | tee /tmp/nginx/debug_log.log ) &

After it launches, I can see the tee command in my ps output as shown below:

$ ps | grep tee
  418 root       0:02 tee /tmp/nginx/debug_log.log
3557 root       0:00 grep tee

I have a function that monitors the size of the log that tee produces and kills the tee command when the log reaches a certain size:

monitor_debug_log_size() {
    ## Monitor the file size of the debug log to make sure it does not get too big
    while true; do
        cecho r "CHECKING DEBUG LOG SIZE... "
        debugLogSizeBytes=$(stat -c%s "/tmp/nginx/debug_log.log")
        cecho r "DEBUG LOG SIZE: $debugLogSizeBytes"
        if [ $((debugLogSizeBytes)) -gt 100000 ]; then
            cecho r "DEBUG LOG HAS GROWN TOO LARGE... "
            sleep 3
            #rm -rf /tmp/nginx/debug_log.log 1>/dev/null 2>/dev/null
            kill -9 `pgrep -f tee`
        fi
        sleep 30
    done
}

To my surprise, killing the tee command also kills my start.sh instance. Why is this? How can I end the tee command but have my start.sh continue to run? Thanks.

Best Answer

When tee terminates, the command feeding it will continue to run until it attempts to write more output. Then it will get a SIGPIPE (13 on most systems) for trying to write to a pipe with no readers. The default action for SIGPIPE is to terminate the process, which is why start.sh dies shortly after tee does.

If you modify your script to trap SIGPIPE and take some appropriate action (for example, stop writing output), then you should be able to have it continue after tee is terminated.
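As a rough sketch of the trap approach, assuming bash: the producer function below is a hypothetical stand-in for start.sh, and head plays the role of the killed tee. With SIGPIPE ignored, a write to the dead pipe fails with an error instead of killing the writer, so the script can react and carry on.

```shell
#!/bin/bash
# producer is a hypothetical stand-in for start.sh.
producer() {
    trap '' PIPE    # ignore SIGPIPE: a write to a dead pipe now fails instead of killing us
    local i
    for i in 1 2 3; do
        if ! echo "line $i" 2>/dev/null; then
            exec >/dev/null    # reader is gone: discard further output, keep running
        fi
        sleep 1
    done
    echo "producer survived" >&2
}

# head exits after one line and plays the role of the killed tee.
result=$( { producer | head -n 1 >/dev/null; } 2>&1 )
echo "$result"
```

Note that an ignored signal stays ignored in child processes, so programs launched by the script would also see write errors rather than SIGPIPE. Whether that is acceptable depends on what start.sh runs.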

Better yet, rather than killing tee at all, use logrotate with the copytruncate option for simplicity.

To quote logrotate(8):

copytruncate

Truncate the original log file in place after creating a copy, instead of moving the old log file and optionally creating a new one. It can be used when some program cannot be told to close its logfile and thus might continue writing (appending) to the previous log file forever. Note that there is a very small time slice between copying the file and truncating it, so some logging data might be lost. When this option is used, the create option will have no effect, as the old log file stays in place.
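The copy-then-truncate idea can be tried by hand. The sketch below is only an illustration of the mechanism, not how logrotate is implemented: it uses a temporary file as a hypothetical log and a file descriptor opened in append mode as the long-lived writer.

```shell
#!/bin/bash
log=$(mktemp)            # hypothetical log file for the demonstration

exec 3>>"$log"           # long-lived writer in append mode, like `tee -a`
echo "old entry" >&3

cp "$log" "$log.1"       # step 1: copy the log aside
: > "$log"               # step 2: truncate the original in place

echo "new entry" >&3     # the writer keeps its descriptor and keeps logging
exec 3>&-

rotated=$(cat "$log.1")
current=$(cat "$log")
echo "rotated: $rotated"
echo "current: $current"
rm -f "$log" "$log.1"
```

The append mode matters. A writer that opened the file without it (plain tee, as in the question) keeps its old file offset after the truncation, so its next write lands far past the new end of the file and leaves a hole of zero bytes in front. Starting the pipeline with tee -a avoids that.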

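Related Solutions

Linux Memory Management – Will Linux Start Killing Processes If Memory Gets Short?

overcommit_memory = 2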

When overcommit_memory is set to 2, the kernel does not overcommit: the total committed address space may not exceed swap plus a configurable percentage (overcommit_ratio) of physical RAM. When a program is allocated memory, it is therefore guaranteed access to that memory. If an allocation request would exceed that limit, the kernel simply returns a failure for the request, and it is up to the program to handle the situation gracefully. If the program does not check whether the allocation succeeded, it will often encounter a segfault when it uses the result.

In the case of the segfault, you should find a line such as this in the output of dmesg:

[1962.987529] myapp[3303]: segfault at 0 ip 00400559 sp 5bc7b1b0 error 6 in myapp[400000+1000]

The at 0 means that the application tried to access address 0, that is, it dereferenced a NULL pointer. That can be the result of using the return value of a failed memory allocation call without checking it (but it is not the only way).

overcommit_memory = 0 and 1

When overcommit_memory is set to 0 or 1, overcommit is enabled, and programs are allowed to allocate more memory than is really available.

However, when a program wants to use the memory it was allocated, but the kernel finds that it doesn't actually have enough memory to satisfy it, it needs to get some memory back. It first tries to perform various memory cleanup tasks, such as flushing caches, but if this is not enough it will then terminate a process. This termination is performed by the OOM-Killer. The OOM-Killer looks at the system to see what programs are using what memory, how long they've been running, who's running them, and a number of other factors to determine which one gets killed.

After the process has been killed, the memory it was using is freed up, and the program which just caused the out-of-memory condition now has the memory it needs.

However, even in this mode, programs can still be denied allocation requests. When overcommit_memory is 0, the kernel uses a heuristic and refuses requests that are obvious overcommits of address space. When it is set to 1, the kernel always overcommits and does not apply that heuristic, although a request can still fail for other reasons, for example when it exceeds the process's address space or resource limits.
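To see which of the three modes a machine is using, something along these lines should work. It is a minimal sketch: describe_overcommit is a hypothetical helper, and the descriptions are shorthand for the behaviour outlined above.

```shell
#!/bin/bash
# describe_overcommit is a hypothetical helper mapping the mode number to a label.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting, no overcommit" ;;
        *) echo "unknown mode: $1" ;;
    esac
}

mode_file=/proc/sys/vm/overcommit_memory
if [ -r "$mode_file" ]; then
    mode=$(cat "$mode_file")
    echo "vm.overcommit_memory = $mode ($(describe_overcommit "$mode"))"
fi
```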

You can see if the OOM-Killer is involved by looking at the output of dmesg and finding messages such as:

[11686.043641] Out of memory: Kill process 2603 (flasherav) score 761 or sacrifice child
[11686.043647] Killed process 2603 (flasherav) total-vm:1498536kB, anon-rss:721784kB, file-rss:4228kB
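Both kinds of message can be picked out of the kernel log with a simple filter. The sketch below assumes the message wording shown above, which varies between kernel versions; kernel_kill_events is a hypothetical helper, and the first sample line is made up to show a non-matching entry.

```shell
#!/bin/bash
# kernel_kill_events is a hypothetical helper: it reads kernel log text on stdin
# and keeps only segfault and OOM-Killer lines.
kernel_kill_events() {
    grep -i -E 'segfault at|out of memory|killed process'
}

# Sample input: one made-up unrelated line, then lines in the formats shown above.
sample='[    1.000000] example: unrelated kernel message
[1962.987529] myapp[3303]: segfault at 0 ip 00400559 sp 5bc7b1b0 error 6 in myapp[400000+1000]
[11686.043641] Out of memory: Kill process 2603 (flasherav) score 761 or sacrifice child
[11686.043647] Killed process 2603 (flasherav) total-vm:1498536kB, anon-rss:721784kB, file-rss:4228kB'

matches=$(printf '%s\n' "$sample" | kernel_kill_events)
printf '%s\n' "$matches"

# On a real system (reading the kernel log may require root):
#   dmesg | kernel_kill_events
```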

linux cron – Email Only Occasionally Sent on Output and Errors

Upon further testing, I suspect the & is messing with your results. As you point out, &>/dev/null is bash syntax, not sh syntax. As a result, sh treats the & as the background operator: it runs the command in a backgrounded subshell and handles the >/dev/null that follows as a separate, empty redirection. Sure, the subshell's echo writes to stderr, but my theory is that:

1. cron is not catching the subshell's stderr, and
2. the backgrounding of the subshell always completes successfully, thus bypassing your || echo ....

... causing the cron job to have no output and thus no mail. Based on my reading of the vixie-cron source, it would seem that the job's stderr and stdout would be captured by cron, but it must be getting lost by the subshell.

Test it yourself in a /bin/sh environment (assuming you do not have a file named 'bar' here):

(grep foo bar) &
echo $?
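The same effect can be shown without grep. In this small sketch, false stands in for any failing command: the status right after & only reports that the job was launched, and the job's real exit status has to be collected with wait.

```shell
#!/bin/bash
false &                       # a command that always fails, run in the background
launch_status=$?              # status of launching the job, not of the job itself

job_status=0
wait "$!" || job_status=$?    # wait returns the background job's real exit status

echo "launch: $launch_status, job: $job_status"
```

Because the launch status is 0, a trailing || echo ... after a backgrounded command never fires, whatever the command itself does.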
