Tail Log – Tail a Log Until Keywords Found or Timeout

Jenkinssedtailtimeout

I'm using a special way that can let me visit google. It is very different in the page now… So I can edit the question and comment the answers normally. I deleted more early discussions and go to the point.

I want to stop tailing and return 0 when keywords are found for the script to execute next commands. If keywords are not found after one minute, the whole script will be stopped to return an error code. I'm using set -euxo pipefail that is necessary.

timeout 1m tail -Fn 0 --pid=$(ps -ef | grep "sed /$keywords" | grep -v grep | awk '{print $2}') $log_file | sed "/$keywords/q"

The command above which I used before seamed ok when I tested. But in Jenkins, sometimes it returned "Build step 'Execute shell' marked build as failure" when the keywords were found.

I've found the reason when I restarted the program manually. It returned code 141. So I looked up the code, and found it is related with tail -f and pipe | in http://www.pixelbeat.org/programming/sigpipe_handling.html.

I modified a command in other questions for my purpose. It seemed fine except that "tail -Fn 0 balabala.log" still remained in the background, which disappeared in several minutes. But it is nearest to the target.

{ sed /"$keywords"/q; kill -13 $!; } < <(exec timeout 1m tail -Fn 0 $log_file)

It's beyond my comprehension… I looked up the usages but still feel uncertain.

I changed kill -s PIPE "$!" to kill -13 $! to shorten the script.

I'm still confused about the usage of { } < <(). They are like alien words to me…

Can exec be deleted? It seems different from not writting it.

What's wrong with the tail in the background? Is it dangerous if I start many programs in the same time?

Here is the log in Jenkins:

......
+ keywords='cloud-service-notice has been started successfully'
+ log_file=/data/jars/logs/info.cloud-service-notice.log
+ cd /data/jars/cloud-service-notice
+ nohup java -jar /data/jars/cloud-service-notice/cloud-service-notice.jar --spring.profiles.active=test
+ sed '/cloud-service-notice has been started successfully/q'
++ exec timeout 1m tail -Fn 0 /data/jars/logs/info.cloud-service-notice.log
2019-07-02 10:31:12,544 [main] INFO  o.s.c.a.AnnotationConfigApplicationContext.prepareRefresh[588] - Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@54a097cc: startup date [Tue Jul 02 10:31:12 CST 2019]; root of context hierarchy
......
2019-07-02 10:31:33,860 [main] INFO  c.enneagon.service.notice.NoticeApp.main[24] - cloud-service-notice has been started successfully
+ kill -13 20021
+ ssh web01 'cd /data/releases/cloud-service-notice/20190702-104108/.. &&' 'ls -1 | sort -r | awk '\''FNR > 20  {printf("rm -rf %s\n", $0);}'\'' | bash'
Finished: SUCCESS

The process 20021 was timeout 1m tail -Fn 0 balabala.log. After the script finished, the process 20023 tail -Fn 0 balabala.log remained, and disappeared a few minutes later.

[root@web01 scripts]# ps -ef | grep notice
root     20020     1 27 10:41 ?        00:01:07 java -jar /data/jars/cloud-service-notice/cloud-service-notice.jar --spring.profiles.active=test
root     20023     1  0 10:41 ?        00:00:00 tail -Fn 0 /data/jars/logs/info.cloud-service-notice.log
root     20461 18966  0 10:45 pts/1    00:00:00 grep --color=auto notice

I will answer my question with this command, but I'm not sure about it. I tested it in my local machines, and will put it in our production environment for further test.

After many tests, I finally changed the command to this:

{ sed /"$keywords"/q; kill $!; } < <(exec timeout 1m tail -Fn 0 $log_file)

-13 just be removed. That is the reason why tail -Fn 0 balabala.log remained. I can answer the four questions above roughly:

kill -15 is better because I added timeout in this command. $! is the pid of timeout. tail -Fn 0 balabala.log is a sub process here and using default number 15 can kill it.

It is just a process substitution and multiple processes using {...}. I can even ignore kill because after 1 minute timeout will kill itself in the background. So this command without kill is still acceptable: sed /"$keywords"/q < <(exec timeout 1m tail -Fn 0 $log_file). In this situation, it will always return 0.

Better not. There will be two upper-level processes shown when the command is executed. But it will be all right after finishing.

The reason is in "1" above. timeout was killed, but tail was left.

Best Answer

You could do something like:

sh -c 'echo "$$"; exec tail -f file' | (
  IFS= read -r pid
  timeout 60 sed "/$keyword/q"
  kill -s PIPE "$pid"
)

Unless you have enabled the pipefail option, that will exit with the exit status of kill which should be 0 unless tail has exited of its own (which shouldn't happen in practice).

With pipefail, that would exit with the exit status of tail killed by a SIGPIPE, (which most shells represent with a value of 141). You can always append a || true to force a success exit status.

Sort command and THE most recent log file

Why using -r (reverse sort order) than reaching end of output with tail?

Using normal sort order and take first entry would be quicker!

tail -f `/bin/ls -1td /path/to/log/file/*| /usr/bin/head -n1`

tail -f $(/bin/ls -1t /path/to/log/file/* | /bin/sed q)

work fine.

Nota: I like to use sed because this command is present in /bin, maybe before /usr are mounted.

tail -f `/bin/ls -1tr /path/to/log/file/* | /bin/sed -ne '$p'`

would work but, as already said: inversing sort order, than dropping whole ouput for using only last entry is not a real good idea ;-)

Warning, in last directory, * have to not match a directory, or else command tail won't know how to open it.

Same but using `find` for searching for most recent file:

read -a file < <(
    find /tmp 2>/dev/null -type f -mmin +-1 -mmin -10 -printf "%Ts %p\n" |
    sort -rn)
tail -f ${file[1]}

Note:

the -mmin +-1 ensure to not list bad timed files: in the futur.
read is builtin, create an array and prevent the use of head -n1| cut -d \ -f2
-mmin -10 could be changed or suppressed, but this prevent long sort process.

But tail support to watch about more than one file:

Try to open two shell console and try this:

In 1st console:

user@host[pts/1]:~$ touch /tmp/file_{1,2,3}
user@host[pts/1]:~$ tail -f /tmp/file_{1,2,3}
==> /tmp/file_1 <==

==> /tmp/file_2 <==

==> /tmp/file_3 <==

in second one, while keeping 1st console visible, hit many time:

user@host[pts/2]:~$ tee -a /tmp/file_$((RANDOM%3+1)) <<<$RANDOM
25285
user@host[pts/2]:~$ tee -a /tmp/file_$((RANDOM%3+1)) <<<$RANDOM
16381
user@host[pts/2]:~$ tee -a /tmp/file_$((RANDOM%3+1)) <<<$RANDOM
19766
user@host[pts/2]:~$ tee -a /tmp/file_$((RANDOM%3+1)) <<<$RANDOM
3053

1st console could look like:

==> /tmp/file_2 <==
25285

==> /tmp/file_1 <==
16381
19766

==> /tmp/file_3 <==
3053

...

In the idea of SO question, but time based, multi files

By using find command, we could watch on last minutes modified files -mmin or last days -mtime:

find /path/to/logdir -type f -mmin -10 -exec tail -f {} +

for watching for logfiles modified last 10 minutes.

Note:

Have a look at man tail, about
- -F option for long time watch
- -q option for not printing file names

Fancy formatting

find /path/to/logdir -type f -mmin -10 -exec tail -f {} + |
    sed -une 's/^==> .path.to.logdir.\(.*\) <==$/\1             /;ta;bb;
               :a;s/^\(.\{12\}\) *$/\1: /;h;bc;
               :b;G;s/^\(..*\)\n\(.*\)/\2 \1/p;:c;'

Where you could modify .path.to.logdir. and change 12 for more suitable length.

For sample, keeping our two console, stop 1st and try

user@host[pts/1]:~$ find /tmp/ -type f -mtime -1 -name 'file_?' -exec tail -f {} + |
    sed -une 's/^==> .tmp.\(.*\) <==$/\1             /;ta;bb;
        :a;s/^\(.\{12\}\) *$/\1: /;h;bc;
        :b;G;s/^\(..*\)\n\(.*\)/\2 \1/p;:c;'
file_2      :  25285
file_1      :  16381
file_1      :  19766
file_3      :  3053

than in second console, hit again some

user@host[pts/2]:~$ tee -a /tmp/file_$((RANDOM%3+1)) <<<$RANDOM

Ssh – Tail log file on multiple machines over ssh

What you see is effect of a standard stdout buffer in grep provided by Glibc. The best solution is to disable it by using --line-buffered (GNU grep, I'm not sure what other implementations might support it or something similar).

As for why this only happens in some cases:

ssh server "tail -f /var/log/server.log | grep test"

runs the whole command in the quotes on the server - thus grep waits to fill its buffer.

ssh server tail -f /var/log/server.log | grep test

runs grep on your local machine on the output tail sent through the ssh channel.

The key part here is, that grep adjusts its behaviour depending on whether its stdin is a terminal or not. When you run ssh -t, the remote command is running with a controlling terminal and thus the remote grep behaves like your local one.

Best Answer

Related Solutions

Shell – Tail -f the most recent log file