Tail Log – Tail a Log Until Keywords Found or Timeout

Jenkinssedtailtimeout

I'm using a special way that can let me visit google. It is very different in the page now… So I can edit the question and comment the answers normally. I deleted more early discussions and go to the point.

I want to stop tailing and return 0 when keywords are found for the script to execute next commands. If keywords are not found after one minute, the whole script will be stopped to return an error code. I'm using set -euxo pipefail that is necessary.

timeout 1m tail -Fn 0 --pid=$(ps -ef | grep "sed /$keywords" | grep -v grep | awk '{print $2}') $log_file | sed "/$keywords/q"

The command above which I used before seamed ok when I tested. But in Jenkins, sometimes it returned "Build step 'Execute shell' marked build as failure" when the keywords were found.

I've found the reason when I restarted the program manually. It returned code 141. So I looked up the code, and found it is related with tail -f and pipe | in http://www.pixelbeat.org/programming/sigpipe_handling.html.


I modified a command in other questions for my purpose. It seemed fine except that "tail -Fn 0 balabala.log" still remained in the background, which disappeared in several minutes. But it is nearest to the target.

{ sed /"$keywords"/q; kill -13 $!; } < <(exec timeout 1m tail -Fn 0 $log_file)

It's beyond my comprehension… I looked up the usages but still feel uncertain.

  1. I changed kill -s PIPE "$!" to kill -13 $! to shorten the script.
  2. I'm still confused about the usage of { } < <(). They are like alien words to me…
  3. Can exec be deleted? It seems different from not writting it.
  4. What's wrong with the tail in the background? Is it dangerous if I start many programs in the same time?

Here is the log in Jenkins:

......
+ keywords='cloud-service-notice has been started successfully'
+ log_file=/data/jars/logs/info.cloud-service-notice.log
+ cd /data/jars/cloud-service-notice
+ nohup java -jar /data/jars/cloud-service-notice/cloud-service-notice.jar --spring.profiles.active=test
+ sed '/cloud-service-notice has been started successfully/q'
++ exec timeout 1m tail -Fn 0 /data/jars/logs/info.cloud-service-notice.log
2019-07-02 10:31:12,544 [main] INFO  o.s.c.a.AnnotationConfigApplicationContext.prepareRefresh[588] - Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@54a097cc: startup date [Tue Jul 02 10:31:12 CST 2019]; root of context hierarchy
......
2019-07-02 10:31:33,860 [main] INFO  c.enneagon.service.notice.NoticeApp.main[24] - cloud-service-notice has been started successfully
+ kill -13 20021
+ ssh web01 'cd /data/releases/cloud-service-notice/20190702-104108/.. &&' 'ls -1 | sort -r | awk '\''FNR > 20  {printf("rm -rf %s\n", $0);}'\'' | bash'
Finished: SUCCESS

The process 20021 was timeout 1m tail -Fn 0 balabala.log. After the script finished, the process 20023 tail -Fn 0 balabala.log remained, and disappeared a few minutes later.

[root@web01 scripts]# ps -ef | grep notice
root     20020     1 27 10:41 ?        00:01:07 java -jar /data/jars/cloud-service-notice/cloud-service-notice.jar --spring.profiles.active=test
root     20023     1  0 10:41 ?        00:00:00 tail -Fn 0 /data/jars/logs/info.cloud-service-notice.log
root     20461 18966  0 10:45 pts/1    00:00:00 grep --color=auto notice

I will answer my question with this command, but I'm not sure about it. I tested it in my local machines, and will put it in our production environment for further test.


After many tests, I finally changed the command to this:

{ sed /"$keywords"/q; kill $!; } < <(exec timeout 1m tail -Fn 0 $log_file)

-13 just be removed. That is the reason why tail -Fn 0 balabala.log remained. I can answer the four questions above roughly:

  1. kill -15 is better because I added timeout in this command. $! is the pid of timeout. tail -Fn 0 balabala.log is a sub process here and using default number 15 can kill it.
  2. It is just a process substitution and multiple processes using {...}. I can even ignore kill because after 1 minute timeout will kill itself in the background. So this command without kill is still acceptable: sed /"$keywords"/q < <(exec timeout 1m tail -Fn 0 $log_file). In this situation, it will always return 0.
  3. Better not. There will be two upper-level processes shown when the command is executed. But it will be all right after finishing.
  4. The reason is in "1" above. timeout was killed, but tail was left.

Best Answer

You could do something like:

sh -c 'echo "$$"; exec tail -f file' | (
  IFS= read -r pid
  timeout 60 sed "/$keyword/q"
  kill -s PIPE "$pid"
)

Unless you have enabled the pipefail option, that will exit with the exit status of kill which should be 0 unless tail has exited of its own (which shouldn't happen in practice).

With pipefail, that would exit with the exit status of tail killed by a SIGPIPE, (which most shells represent with a value of 141). You can always append a || true to force a success exit status.

See also:

Related Question