Node.js Pipe – Determine if Process is Connected to Another via Pipes

Tags: node.js, pipe

If I do this:

x | y

is there any way to check, during the runtime of x, to see if it's connected to y? Note that I don't know what y is, and I am not responsible for starting y.

Specifically, I am talking about the Node.js runtime, so perhaps this is a Node.js specific question. But ultimately, I am wondering if it's possible to determine given any runtime. Is it possible and how?

Is it possible to determine if the stdout/stderr are hooked up to the stdin of another process? I guess that's what this question is about.

Best Answer

To check whether one of the program's standard streams is connected to a pipe, call fs.fstat(fd) on its file descriptor (0 for stdin, 1 for stdout, 2 for stderr) and then call isFIFO() on the returned stats object, as documented at https://nodejs.org/api/fs.html#fs_class_fs_stats (FIFO == first-in-first-out == a pipe or a named pipe). The example below tests stdin:

$ </dev/null node -e 'var fs=require("fs");
   fs.fstat(0,function(err,stats){ if(err) throw(err); console.log(stats.isFIFO()); });  ' 
  false
$  : | node -e 'var fs=require("fs");
   fs.fstat(0,function(err,stats){ if(err) throw(err); console.log(stats.isFIFO()); });  ' 
  true

In C, you'd make the fstat syscall and then test the .st_mode field of the returned struct stat using the S_ISFIFO macro.
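Since the question also asks about other runtimes, here is a minimal Python 3 sketch of the same fstat plus S_ISFIFO check. The helper names `is_pipe` and `describe_stdio` are made up for illustration; only `os.fstat` and `stat.S_ISFIFO` come from the standard library.

```python
import os
import stat
import sys


def is_pipe(fd):
    """Return True if fd is a pipe or FIFO, False if not, None if fd is not open."""
    try:
        mode = os.fstat(fd).st_mode   # the same fstat(2) call you would make in C
    except OSError:
        return None
    return stat.S_ISFIFO(mode)        # the same test as the S_ISFIFO macro


def describe_stdio():
    """Map each standard stream name to the result of is_pipe()."""
    return {name: is_pipe(fd)
            for fd, name in enumerate(("stdin", "stdout", "stderr"))}


if __name__ == "__main__":
    # Report on stderr so the result stays visible when stdout is piped.
    print(describe_stdio(), file=sys.stderr)
```

Saved under a hypothetical name such as check_pipe.py, running `python3 check_pipe.py | cat` should report stdout as a pipe. As with isFIFO(), this only tells you that the descriptor is a pipe, not which process holds the other end.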

If you like to waste CPU cycles and want to use an external binary, you can execute test -p /dev/fd/$THE_FD to get the answer (or invoke that in a shell where test will be a builtin, or run stat, or launch something else capable of determining the file type).

Related Solutions

Shell – How to forward between processes with named pipes

If you get rid of the killing and shutdown stuff, child.py won't terminate, because your parent.py isn't behaving like a good UNIX citizen. (That stuff is also unsafe: in an extreme but not unfathomable case where child.py dies before the (head -n 1 shutdown; kill -9 $parent) & subshell runs its kill, you may end up kill -9ing some innocent process.)

The cat std_out & subprocess finishes once the quit message has been sent: the writer to std_out is child_original.py, which exits upon receiving quit and thereby closes its stdout, which is the std_out pipe, and that close makes the cat subprocess finish.

The cat > std_in isn't finishing because it's reading from a pipe originating in the parent.py process, and parent.py didn't bother to close that pipe. If it did, cat > std_in, and consequently the whole child.py, would finish by itself, and you wouldn't need the shutdown pipe or the killing part (killing a process that isn't your child on UNIX is always a potential security hole, should a race condition caused by rapid PID recycling occur).

Processes at the right end of a pipeline generally only finish once they're done reading their stdin, but since you're not closing that (child.stdin), you're implicitly telling the child process "wait, I have more input for you" and then you go kill it because it does wait for more input from you as it should.
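The effect of closing stdin can be seen in isolation with a minimal Python 3 sketch. It assumes a POSIX `cat` on the PATH standing in for the child; `feed_and_close` is a hypothetical helper name, not part of the original scripts.

```python
import subprocess


def feed_and_close(lines):
    """Send lines to `cat`, then close its stdin so it sees EOF and exits."""
    child = subprocess.Popen(["cat"], stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE, text=True)
    for line in lines:
        child.stdin.write(line + "\n")
    child.stdin.flush()

    # While our end of the pipe is open, cat keeps waiting for more input.
    running_before_close = child.poll() is None

    child.stdin.close()            # this is the "I have no more input" signal
    output = child.stdout.read()   # read until cat closes its stdout
    returncode = child.wait()      # returns promptly, no kill needed
    child.stdout.close()
    return running_before_close, output, returncode


if __name__ == "__main__":
    print(feed_and_close(["a", "b", "c"]))
```

The amount of data here is small enough that writing everything before reading cannot fill the pipe; for large inputs, `Popen.communicate()` is the safer choice.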

In short, make parent.py behave reasonably:

from __future__ import print_function
from subprocess import Popen, PIPE

# universal_newlines=True opens the pipes in text mode, so the str writes
# below also work on Python 3 (where pipes carry bytes by default).
child = Popen('./child.py', stdin=PIPE, stdout=PIPE, universal_newlines=True)

for letter in 'abcde':
    print('Parent writes to child: ', letter)
    child.stdin.write(letter+'\n')
    child.stdin.flush()
    response = child.stdout.readline()
    print('Response from the child:', response)
    assert response.rstrip() == letter.upper(), 'Wrong response'

child.stdin.write('quit\n')
child.stdin.flush()
# Closing stdin delivers EOF to the child, which lets cat > std_in finish.
child.stdin.close()
print('Waiting for the child to terminate...')
child.wait()
print('Done!')

And your child.py can be as simple as

#!/bin/sh
cat std_out &
cat > std_in
wait #basically to assert that cat std_out has finished at this point

(Note that I got rid of the fd dup calls, because otherwise you'd need to close both child.stdin and the child_stdin duplicate.)

Since parent.py operates in a line-oriented fashion, GNU cat is unbuffered (as mikeserv pointed out), and child_original.py operates in a line-oriented fashion, you've effectively got the whole thing line-buffered.

Note on cat: "Unbuffered" might not be the luckiest term, as GNU cat does use a buffer. What it doesn't do is try to fill the whole buffer before writing things out (unlike stdio). Basically, it makes read requests to the OS for a specific size (its buffer size) and writes whatever it receives without waiting to get a whole line or the whole buffer. (read(2) can be lazy and give you only what it can give you at the moment rather than the whole buffer you've asked for.)

(You can inspect the source code at http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/cat.c ; safe_read (used instead of plain read) is in the gnulib submodule and it's a very simple wrapper around read(2) that abstracts away EINTR (see the man page)).
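The short-read behaviour described in the note can be sketched in Python 3 with raw file descriptors. This is an illustration of the idea only, not a port of cat.c; `short_read_demo` and `copy_like_cat` are hypothetical names. Python retries interrupted calls itself (PEP 475), which is roughly the job safe_read does for cat.

```python
import os


def short_read_demo():
    """Ask for 64 KiB from a pipe that holds only 6 bytes."""
    r, w = os.pipe()
    os.write(w, b"hello\n")
    data = os.read(r, 65536)   # returns what is available, not a full buffer
    os.close(r)
    os.close(w)
    return data


def copy_like_cat(src_fd, dst_fd, bufsize=65536):
    """Forward each chunk as soon as read() returns it; return bytes copied."""
    total = 0
    while True:
        chunk = os.read(src_fd, bufsize)   # may return fewer than bufsize bytes
        if not chunk:                      # empty result means EOF
            return total
        total += len(chunk)
        while chunk:                       # a write may be partial too
            written = os.write(dst_fd, chunk)
            chunk = chunk[written:]


if __name__ == "__main__":
    print(short_read_demo())
    copy_like_cat(0, 1)   # behaves like a bare `cat` on stdin/stdout
```

Because nothing waits for a full buffer or a newline, each line written by a line-oriented producer is forwarded as soon as it arrives.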

Linux – “Leaky” pipes in linux

The easiest way would be to pipe through some program which sets non-blocking output. Here is a simple perl one-liner (which you can save as leakybuffer) that does so; with it, your a | b becomes:

a | perl -MFcntl -e \
    'fcntl STDOUT,F_SETFL,O_NONBLOCK; while (<STDIN>) { print }' | b

What it does is read the input and write it to the output (same as cat(1)), but the output is non-blocking: if a write fails, it returns an error and the data is lost, but the process continues with the next line of input, as we conveniently ignore the error. The process is kind of line-buffered, as you wanted, but see the caveat below.

You can test it with, for example:

seq 1 500000 | perl -w -MFcntl -e \
    'fcntl STDOUT,F_SETFL,O_NONBLOCK; while (<STDIN>) { print }' | \
    while read a; do echo $a; done > output

You will get an output file with lost lines (the exact output depends on the speed of your shell etc.).

In one run, the shell lost lines after 12773, but there was also an anomaly: perl didn't have enough buffer for 12774\n but did for 1277, so it wrote just that, and so the next number, 75610, did not start at the beginning of a line, making it a little ugly.

That could be improved upon by having perl detect when a write did not succeed completely, and then later try to flush the remainder of the line while ignoring new lines coming in, but that would complicate the perl script much more, so it is left as an exercise for the interested reader :)

Update (for binary files): If you are not processing newline-terminated lines (like log files or similar), you need to change the command slightly, or perl will consume large amounts of memory (depending on how often newline characters appear in your input):

perl -w -MFcntl -e 'fcntl STDOUT,F_SETFL,O_NONBLOCK; while (read STDIN, $_, 4096) { print }'

It will work correctly for binary files too (without consuming extra memory).

Update2 - nicer text file output: Avoiding output buffers (syswrite instead of print):

seq 1 500000 | perl -w -MFcntl -e \
    'fcntl STDOUT,F_SETFL,O_NONBLOCK; while (<STDIN>) { syswrite STDOUT,$_ }' | \
    while read a; do echo $a; done > output

This seems to fix the problem with "merged lines" for me.

(Note: one can verify on which lines the output was cut with this one-liner: perl -ne '$c++; next if $c==$_; print "$c $_"; $c=$_' output)
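For comparison, the drop-on-full idea can be sketched in Python 3 with one os.write() per line, which is loosely analogous to the syswrite variant. The names `leaky_copy` and `demo` are hypothetical. The sketch relies on the POSIX rule that a non-blocking pipe write of at most PIPE_BUF bytes either transfers everything or fails with EAGAIN, so short lines are never cut in half.

```python
import os


def leaky_copy(lines, out_fd):
    """Write each line with a single os.write(); drop it if the pipe is full."""
    kept = dropped = truncated = 0
    for line in lines:
        try:
            written = os.write(out_fd, line)
        except BlockingIOError:      # EAGAIN: the reader is too slow
            dropped += 1
            continue
        if written == len(line):
            kept += 1
        else:                        # only possible for lines above PIPE_BUF
            truncated += 1
    return kept, dropped, truncated


def demo(n_lines=200000):
    """Fill a pipe nobody is reading, then inspect what actually got through."""
    r, w = os.pipe()
    os.set_blocking(w, False)        # same effect as fcntl F_SETFL O_NONBLOCK
    lines = (b"%d\n" % i for i in range(1, n_lines + 1))
    counts = leaky_copy(lines, w)
    os.close(w)
    received = b""
    while True:
        chunk = os.read(r, 65536)
        if not chunk:                # EOF, since the write end is closed
            break
        received += chunk
    os.close(r)
    return counts, received


if __name__ == "__main__":
    (kept, dropped, truncated), received = demo()
    print(kept, dropped, truncated, len(received))
```

How many lines survive depends on the pipe capacity of the system, so no particular count should be expected.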
