Bash – How to capture ordered STDOUT/STDERR and add timestamp/prefixes

bash io-redirection pipe shell

I have explored almost all available similar questions, to no avail.

Let me describe the problem in detail:

I run some unattended scripts that can write lines to both standard output and standard error. I want to capture those lines in the exact order a terminal emulator would display them, and then prefix each line with "STDERR: " or "STDOUT: ".

I have tried using pipes, and even an epoll-based approach on them, to no avail. I think the solution lies in using a pty, although I am no expert at that. I have also peeked into the source code of GNOME's VTE, but that has not been very productive.

Ideally I would use Go instead of Bash to accomplish this, but I have not been able to. It seems that pipes make it impossible to keep the correct line order because of buffering.

Has anybody been able to do something similar? Or is it just impossible? I think that if a terminal emulator can do it, then it is not impossible; maybe by creating a small C program that handles the PTY(s) differently?

Ideally I would use asynchronous input to read these two streams (STDOUT and STDERR) and then re-print them according to my needs, but the order of input is crucial!

NOTE: I am aware of stderred but it does not work for me with Bash scripts and cannot be easily edited to add a prefix (since it basically wraps plenty of syscalls).

Update: added below two gists

(sub-second random delays can be added to the sample script I provided, to show that the results are consistent)

Update: a solution to this question would also solve this other question, as @Gilles pointed out. However, I have come to the conclusion that it is not possible to do what is asked here and there. When using 2>&1, both streams are correctly merged at the pty/pipe level, but to use the streams separately and in the correct order one would indeed have to use the approach of stderred, which involves syscall hooking and can be seen as dirty in many ways.
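As a minimal sketch of the 2>&1 case described above: when both descriptors point at the same pipe, a single reader sees the lines in the order they were written, but it can no longer tell which stream a line came from. The `emit` and `stamp` functions are hypothetical stand-ins (not the gists mentioned in the question), and the timestamp records when the reader saw the line, not when it was written.

```shell
#!/bin/bash
# Hypothetical sample workload: alternates between stdout and stderr.
emit() {
    echo "out 1"
    echo "err 1" >&2
    echo "out 2"
    echo "err 2" >&2
}

# Prefix every line read from stdin with the current time.
# (bash >= 4.2 could use printf '%(%H:%M:%S)T' instead of calling date.)
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$line"
    done
}

# 2>&1 makes stdout and stderr share one pipe, so the order is preserved,
# but the information about which stream each line belongs to is lost.
merged=$(emit 2>&1)
printf '%s\n' "$merged" | stamp
```

This only works because the writes all land in one pipe; as soon as the two streams go to two different pipes, the reader has no reliable way to reconstruct their relative order.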

I will be eager to update this question if somebody can disprove the above.

Best Answer

You might use coprocesses. Here is a simple wrapper that feeds both outputs of a given command to two sed instances (one for stderr, the other for stdout), which do the tagging.

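#!/bin/bash
# Keep a copy of the original stdout on fd 3 for the first sed to write to.
exec 3>&1
coproc SEDo ( sed "s/^/STDOUT: /" >&3 )
# Move the original stderr to fd 4 (this closes fd 2 in the wrapper itself).
exec 4>&2-
coproc SEDe ( sed "s/^/STDERR: /" >&4 )
# Run the given command with stderr and stdout wired to the two sed coprocesses.
eval $@ 2>&${SEDe[1]} 1>&${SEDo[1]}
# Close the write ends so that both sed instances see EOF and exit.
eval exec "${SEDo[1]}>&-"
eval exec "${SEDe[1]}>&-"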

Note several things:

  1. It is a magic incantation for many people (including me) - for a reason (see the linked answer below).

  2. There is no guarantee it won't occasionally swap a couple of lines - it all depends on the scheduling of the coprocesses. Actually, it is almost guaranteed that at some point it will. That said, if you need to keep the order strictly the same, you have to process the data from both stderr and stdout in the same process, otherwise the kernel scheduler can (and will) make a mess of it.

    If I understand the problem correctly, it means that you would need to instruct the shell to redirect both streams to one process (which can be done AFAIK). The trouble starts when that process has to decide what to act upon first - it would have to poll both data sources, and at some point it will get into a state where it is processing one stream and data arrives on both streams before it finishes. And that is exactly where it breaks down. It also means that wrapping the output syscalls, as stderred does, is probably the only way to achieve your desired outcome (and even then you might have a problem once something becomes multithreaded on a multiprocessor system).
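To make the scheduling point concrete, here is a minimal sketch of the same two-sed tagging done with plain pipes and file-descriptor juggling instead of coprocesses. It is an alternative illustration, not the wrapper from this answer: `tag_streams` and `sample` are hypothetical names. Each stream keeps its own internal order, but the two sed processes run independently, so the relative order of STDOUT and STDERR lines on a terminal is still not guaranteed.

```shell
#!/bin/bash
# tag_streams is a hypothetical helper name, not part of the wrapper above.
# fd 4 keeps the caller's stdout; fd 3 carries the command's stderr into
# the outer pipe, so each stream gets its own sed.
tag_streams() {
    {
        {
            "$@" 2>&3 | sed 's/^/STDOUT: /' >&4
        } 3>&1 | sed 's/^/STDERR: /' >&2
    } 4>&1
}

# Hypothetical sample workload.
sample() {
    echo one
    echo two >&2
    echo three
}

# Lines within one stream stay ordered; ordering across streams may vary.
tag_streams sample
```

Because the taggers are two separate processes, whichever one the kernel schedules first prints first, which is the same limitation as with the coprocess version.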

As for coprocesses, be sure to read Stéphane's excellent answer to "How do you use the command coproc in Bash?" for in-depth insight.
