Shell Pipe Tee – How to Reuse Pipe Data for Different Commands

pipe shell tee

I would like to use the same pipe for different applications, like in:

cat my_file | {
  cmd1
  cmd2
  cmd3
}

Cmd1 should consume part of the input. Cmd2 should consume another part and so on.

However, each command consumes more of the input than it really needs, because of buffering.

For example:

yes | nl | { 
  head -n 10 > /dev/null
  cat 
} | head -n 10

This prints output starting at line 912 instead of line 11.

Tee is not a good option, because each command is supposed to consume part of the stdin.

Is there a simple way to get this working?

Best Answer

You can use tee to duplicate the stream, so that several commands each process the whole stream:

( ( seq 1 10 | tee /dev/fd/5 | sed s/^/line..\ / >&4 ) 5>&1 | wc -l ) 4>&1 
line.. 1
line.. 2
line.. 3
line.. 4
line.. 5
line.. 6
line.. 7
line.. 8
line.. 9
line.. 10
10
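Where /dev/fd is not available, the same duplication can probably be done with a named pipe. This is only a sketch under that assumption: the scratch directory and the names copy and count are made up for illustration, and the count is captured in a file rather than printed directly.

```shell
tmp=$(mktemp -d)                 # scratch directory (hypothetical layout)
mkfifo "$tmp/copy"               # named pipe that carries the duplicate stream

# Second consumer: count the lines arriving on the named pipe, in the background.
wc -l <"$tmp/copy" >"$tmp/count" &

# First consumer: sed gets the stream on the ordinary pipe,
# while tee writes an identical copy into the named pipe.
lines=$(seq 1 10 | tee "$tmp/copy" | sed 's/^/line.. /')

wait                                  # let wc finish before reading its result
count=$(tr -d ' ' <"$tmp/count")      # some wc implementations pad with spaces

printf '%s\n%s\n' "$lines" "$count"
rm -r "$tmp"
```

Both consumers see all ten lines, as in the file-descriptor version above.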

Or split the stream line by line, using bash:

while read line ;do
    echo cmd1 $line
    read line && echo cmd2 $line
    read line && echo cmd3 $line
  done < <(seq 1 10)
cmd1 1
cmd2 2
cmd3 3
cmd1 4
cmd2 5
cmd3 6
cmd1 7
cmd2 8
cmd3 9
cmd1 10
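This works on a pipe because the shell's read builtin stops at the newline and leaves the rest of the stream unread, whereas head reads a whole block at a time. As a minimal sketch, the example from the question can be handled the same way. skip_lines is a hypothetical helper name, not a standard command:

```shell
# Hypothetical helper: consume exactly $1 lines from stdin using only `read`.
skip_lines() {
    n=$1
    while [ "$n" -gt 0 ] && IFS= read -r discard; do
        n=$((n - 1))
    done
}

# Drop the first 10 lines, then let cat pass the remainder through.
result=$(seq 1 20 | { skip_lines 10; cat; } | head -n 3)
printf '%s\n' "$result"
```

Here cat starts at line 11, because skip_lines never read past the tenth newline.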

Finally, there is a way to run cmd1, cmd2 and cmd3 only once each, with one third of the stream as STDIN:

( ( ( seq 1 10 |
         tee /dev/fd/5 /dev/fd/6 |
           sed -ne '1{:a;p;N;N;N;s/^.*\n//;ta;}' |
           cmd1 >&4
     ) 5>&1 |
       sed -ne '2{:a;p;N;N;N;s/^.*\n//;ta;}' |
       cmd2 >&4
  ) 6>&1 |
    sed -ne '3{:a;p;N;N;N;s/^.*\n//;ta;}' |
    cmd3 >&4
) 4>&1 
command_1: 1
command_1: 4
command_1: 7
command_1: 10
Command_2: 2
Command_2: 5
Command_2: 8
Command_3: 3
Command_3: 6
Command_3: 9

To try this, you could define:

alias cmd1='sed -e "s/^/command_1: /"' \
    cmd2='sed -e "s/^/Command_2: /"' \
    cmd3='sed -e "s/^/Command_3: /"'
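Bash does not expand aliases in non-interactive scripts by default, so inside a script shell functions are the safer stand-in. The sketch below uses functions and also shows what the three sed filters select, namely every third line starting at line 1, 2 or 3. The helper every_third and its awk expression are assumptions added for illustration, and for brevity the input is generated three times instead of being split with tee:

```shell
# Shell functions standing in for the aliases above.
cmd1() { sed -e 's/^/command_1: /'; }
cmd2() { sed -e 's/^/Command_2: /'; }
cmd3() { sed -e 's/^/Command_3: /'; }

# Hypothetical helper: print every third line, starting at line $1 (1, 2 or 3).
every_third() {
    awk -v start="$1" 'NR % 3 == start % 3'
}

part1=$(seq 1 10 | every_third 1 | cmd1)
part2=$(seq 1 10 | every_third 2 | cmd2)
part3=$(seq 1 10 | every_third 3 | cmd3)
printf '%s\n' "$part1" "$part2" "$part3"
```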

To share one stream among several processing steps within the same script, you could do:

(
    for ((i=(RANDOM&7);i--;));do
        read line;
        echo CMD1 $line
      done
    for ((i=RANDOM&7;i--;));do
        read line
        echo CMD2 $line
      done
    while read line ;do
        echo CMD3 $line
      done
) < <(seq 1 10)
CMD1 1
CMD1 2
CMD1 3
CMD2 4
CMD2 5
CMD2 6
CMD2 7
CMD2 8
CMD2 9
CMD3 10

For this, you may have to turn your separate scripts into bash functions so that you can build one overall script.
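A minimal sketch of that idea, with fixed line counts instead of random ones so the result is predictable. The helper take and the counts 3 and 2 are assumptions chosen for illustration:

```shell
# Hypothetical helper: read exactly $2 lines from stdin, printing each with label $1.
take() {
    label=$1 count=$2
    while [ "$count" -gt 0 ] && IFS= read -r line; do
        echo "$label $line"
        count=$((count - 1))
    done
}

cmd1() { take CMD1 3; }     # consumes the first 3 lines
cmd2() { take CMD2 2; }     # consumes the next 2 lines
cmd3() { while IFS= read -r line; do echo "CMD3 $line"; done; }   # takes the rest

# All three functions run in the same brace group and share one stdin.
result=$(seq 1 7 | { cmd1; cmd2; cmd3; })
printf '%s\n' "$result"
```

Because every function reads through the read builtin, each one picks up exactly where the previous one stopped.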

Another way is to make sure that no script writes anything of its own to STDOUT, then add a cat at the end of each script so that they can be chained:

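#!/bin/bash
# Consume the first n lines of stdin, then pass the rest through unchanged.

n=3   # example value (an assumption): number of lines this script handles

for ((i=0; i<n; i++)); do
    read -r line || break
    # process "$line" here; send results to a file, never to STDOUT
    echo "$line" >>output_log
done

# forward the remaining input to the next command in the pipeline
cat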

The final command could look like:

seq 1 10 | cmd1 | cmd2 | cmd3
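
To see the chaining at work without writing three script files, here is a sketch with one function playing the role of each script. The function stage, its arguments and the temporary log file are all hypothetical:

```shell
# Hypothetical stage: log the first $2 lines under label $1, then forward the rest.
stage() {
    label=$1 count=$2 log=$3
    while [ "$count" -gt 0 ] && IFS= read -r line; do
        echo "$label $line" >>"$log"   # results go to the log, not to stdout
        count=$((count - 1))
    done
    cat                                # pass the unread remainder downstream
}

logfile=$(mktemp)
rest=$(seq 1 10 |
    stage cmd1 3 "$logfile" |
    stage cmd2 3 "$logfile" |
    stage cmd3 3 "$logfile")
logged=$(cat "$logfile")
rm -f "$logfile"

printf 'logged:\n%s\nleft over:\n%s\n' "$logged" "$rest"
```

Each stage handles three lines, and whatever no stage consumed (here line 10) comes out at the end of the pipeline.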