If you get rid of the killing and shutdown stuff (which is unsafe: in an extreme but not unfathomable case, where child.py dies before the (head -n 1 shutdown; kill -9 $parent) & subshell gets to its kill, you may end up kill -9ing some innocent process), then child.py won't terminate, because your parent.py isn't behaving like a good UNIX citizen.
The cat std_out & subprocess will finish once you send the quit message, because the writer to std_out is child_original.py, which finishes upon receiving quit. At that moment it closes its stdout, which is the std_out pipe, and that close makes the cat subprocess finish.
The cat > std_in isn't finishing because it's reading from a pipe originating in the parent.py process, and the parent.py process didn't bother to close that pipe. If it did, cat > std_in, and consequently the whole child.py, would finish by itself, and you wouldn't need the shutdown pipe or the killing part (killing a process that isn't your child on UNIX is always a potential security hole, because rapid PID recycling can cause a race condition in which the PID already belongs to another process).
Processes at the right end of a pipeline generally only finish once they're done reading their stdin, but since you're not closing that (child.stdin), you're implicitly telling the child process "wait, I have more input for you", and then you go kill it because it does wait for more input from you, as it should.
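To illustrate the point about end-of-file, here is a minimal sketch (not part of the original scripts). It assumes a hypothetical stand-in child, CHILD_CODE, that copies stdin to stdout the way a filter such as cat would, and it shows that the child stays alive until the parent closes its end of the pipe.

```python
import subprocess
import sys

# Hypothetical stand-in for a filter such as cat: copy stdin to stdout until EOF.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read())"


def run_until_eof(text):
    """Feed text to the child; report whether it was still alive before close()."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # exchange str rather than bytes
    )
    child.stdin.write(text)
    child.stdin.flush()
    # The child cannot know that no more input is coming, so it keeps waiting.
    alive_before_close = child.poll() is None
    # Closing our end of the pipe is what delivers EOF to the child.
    child.stdin.close()
    output = child.stdout.read()
    exit_code = child.wait()
    return alive_before_close, output, exit_code


if __name__ == "__main__":
    print(run_until_eof("hello\n"))
```

No signal is needed here: the child exits on its own once it sees EOF.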
In short, make parent.py behave reasonably:
from __future__ import print_function
from subprocess import Popen, PIPE
import os
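# universal_newlines=True makes the pipes exchange text, so the str writes
# below work on Python 3 as well as Python 2.
child = Popen('./child.py', stdin=PIPE, stdout=PIPE, universal_newlines=True)
for letter in 'abcde':
    print('Parent writes to child: ', letter)
    child.stdin.write(letter+'\n')
    child.stdin.flush()
    response = child.stdout.readline()
    print('Response from the child:', response)
    assert response.rstrip() == letter.upper(), 'Wrong response'
child.stdin.write('quit\n')
child.stdin.flush()
# Closing stdin sends EOF to the child, which lets it terminate by itself.
child.stdin.close()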
print('Waiting for the child to terminate...')
child.wait()
print('Done!')
And your child.py can be as simple as
#!/bin/sh
cat std_out &
cat > std_in
wait #basically to assert that cat std_out has finished at this point
(Note that I got rid of those fd dup calls, because otherwise you'd need to close both child.stdin and the child_stdin duplicate.)
Since parent.py operates in a line-oriented fashion, GNU cat is unbuffered (as mikeserv pointed out), and child_original.py operates in a line-oriented fashion, you've effectively got the whole thing line-buffered.
Note on cat: "unbuffered" might not be the luckiest term, as GNU cat does use a buffer. What it doesn't do is try to fill the whole buffer before writing things out (unlike stdio). Basically, it makes read requests to the OS for a specific size (its buffer size) and writes whatever it receives, without waiting for a whole line or a full buffer. (read(2) can be lazy and give you only what is available at the moment rather than the whole buffer you've asked for.)
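The lazy behaviour of read(2) can be observed from Python, whose os.read is a thin wrapper around it. The following is only a sketch of that behaviour on an ordinary pipe; the function name is made up for illustration.

```python
import os


def short_read_demo():
    """Ask for 64 KiB from a pipe that currently holds only two bytes."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, b"a\n")
        # os.read does not wait for the requested 65536 bytes to arrive;
        # it returns as soon as some data is available.
        return os.read(read_fd, 65536)
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(short_read_demo())  # b'a\n'
```

A program that writes out each chunk as soon as read returns it therefore passes lines along promptly, even though it nominally works with a large buffer.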
(You can inspect the source code at http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/cat.c ; safe_read (used instead of plain read) is in the gnulib submodule, and it's a very simple wrapper around read(2) that abstracts away EINTR (see the man page).)
It is working.
The different parts of a pipeline are executed concurrently. The only thing that synchronises/serialises the processes in the pipeline is IO, i.e. one process writing to the next process in the pipeline and the next process reading what the first one writes. Apart from that, they are executing independently of each other.
Since there is no reading or writing happening between the processes in your pipeline, the time taken to execute the pipeline is that of the longest sleep call.
You might as well have written
time ( foo.sh & bar.sh & wait )
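The same effect can be reproduced outside the shell. This is a rough sketch, assuming two hypothetical child processes that do nothing but sleep: started together and then waited for, they take about as long as the longest sleep, not the sum.

```python
import subprocess
import sys
import time

# Hypothetical child: sleep for the number of seconds given as its argument.
SLEEPER = "import sys, time; time.sleep(float(sys.argv[1]))"


def time_concurrent(durations):
    """Start one sleeping child per duration, wait for all, return elapsed seconds."""
    start = time.monotonic()
    children = [
        subprocess.Popen([sys.executable, "-c", SLEEPER, str(d)])
        for d in durations
    ]
    for child in children:
        child.wait()
    return time.monotonic() - start


if __name__ == "__main__":
    elapsed = time_concurrent([0.5, 1.0])
    print("elapsed: %.2f s (sum of the sleeps would be 1.50 s)" % elapsed)
```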
Terdon posted a couple of slightly modified example scripts in the chat:
#!/bin/sh
# This is "foo.sh"
echo 1; sleep 1
echo 2; sleep 1
echo 3; sleep 1
echo 4
and
#!/bin/sh
# This is "bar.sh"
sleep 2
while read line; do
    echo "LL $line"
done
sleep 1
The query was "why does time ( sh foo.sh | sh bar.sh ) return 4 seconds rather than 3+3 = 6 seconds?"
To see what's happening, including the approximate time each command is executed, one may do this (the output contains my annotations):
$ time ( env PS4='$SECONDS foo: ' sh -x foo.sh | PS4='$SECONDS bar: ' sh -x bar.sh )
0 bar: sleep 2
0 foo: echo 1 ; The output is buffered
0 foo: sleep 1
1 foo: echo 2 ; The output is buffered
1 foo: sleep 1
2 bar: read line ; "bar" wakes up and reads the two first echoes
2 bar: echo LL 1
LL 1
2 bar: read line
2 bar: echo LL 2
LL 2
2 bar: read line ; "bar" waits for more
2 foo: echo 3 ; "foo" wakes up from its second sleep
2 bar: echo LL 3
LL 3
2 bar: read line
2 foo: sleep 1
3 foo: echo 4 ; "foo" does the last echo and exits
3 bar: echo LL 4
LL 4
3 bar: read line ; "bar" fails to read more
3 bar: sleep 1 ; ... and goes to sleep for one second
real 0m4.14s
user 0m0.00s
sys 0m0.10s
So, to conclude, the pipeline takes 4 seconds, not 6, due to the buffering of the output of the first two calls to echo in foo.sh.
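The buffering in question is done by the pipe itself. As a small sketch (the function name is made up for illustration), a writer can put two lines into a pipe and move on before any reader has looked at it, and a late reader then finds both lines waiting:

```python
import os


def buffered_writes_demo():
    """Write two lines before anyone reads, then read them back later."""
    read_fd, write_fd = os.pipe()
    # Neither write blocks: the kernel's pipe buffer holds the data,
    # just as it held "1" and "2" while bar.sh was still sleeping.
    os.write(write_fd, b"1\n")
    os.write(write_fd, b"2\n")
    os.close(write_fd)  # the writer is done; the reader will see EOF after the data
    with os.fdopen(read_fd, "rb") as reader:
        return reader.readlines()


if __name__ == "__main__":
    print(buffered_writes_demo())  # [b'1\n', b'2\n']
```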
Check if your tshark version has the -l option for (nearly) line-buffered output.