Terminal or Pipe – How Programs Detect Stdout Connection

pipeterminal

I'm having trouble debugging a segfaulting program because the ouput right before the segfault is what I need, but this is lost if I'm piping the output to a file. According to this answer: https://unix.stackexchange.com/a/17339/22615, this is because the output buffer of the program flushes immediately when connected to a terminal but only at certain points when connected to a pipe. A few questions here:

How does a program determine what its stdout is connected to?
How does the "script" command produce the same behavior as when the program writes to a terminal?
Can this be achieved without the script command?

Best Answer

Telling if a file descriptor points to a terminal device

A program can tell if a file descriptor is associated with a tty device by using the isatty() standard C function (which generally underneath does an innocuous tty-specific ioctl() system call that would return with an error when the fd doesn't point to a tty device).

The [/test utility can do it with its -t operator.

if [ -t 1 ]; then
  echo stdout is open to a terminal
fi

Tracing libc function calls on a GNU/Linux system:

$ ltrace [ -t 1 ] | cat
[...]
isatty(1)                                      = 0
[...]

Tracing system calls:

$ strace [ -t 1 ] | cat
[...]
ioctl(1, TCGETS, 0x7fffd9fb3010)        = -1 ENOTTY (Inappropriate ioctl for device)
[...]

Telling if it points to a pipe

To determine whether a fd is associated with a pipe/fifo, one can use the fstat() system call, which returns a structure whose st_mode field contains the type and permissions of the file opened on that fd. The S_ISFIFO() standard C macro can be used on that st_mode field to determine if the fd is a pipe/fifo.

There is no standard utility that can do a fstat(), but there are several incompatible implementations of a stat command that can do it. zsh's stat builtin with stat -sf "$fd" +mode which returns the mode as a string representation whose first character represents the type (p for pipe). GNU stat can do the same with stat -c %A - <&"$fd", but also has stat -c %F - <&"$fd" to report the type alone. With BSD stat: stat -f %St <&"$fd" or stat -f %HT <&"$fd".

Telling if it's seekable

Applications generally do not care if stdout is a pipe though. They may care that it's seekable (though generally not to decide whether to buffer or not).

To test whether a fd is seekable (pipes, sockets, tty devices are not seekable, regular files and most block devices generally are), one can attempt a relative lseek() system call with an offset of 0 (so innocuous). dd is a standard utility that's an interface to lseek() but it can't be used for that test, as implementations would not call lseek() at all if you ask for an offset of 0.

The zsh and ksh93 shells have builtin seeking operators though:

$ strace -e lseek ksh -c ': 1>#((CUR))' | cat
lseek(1, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
ksh: 1: not seekable
$ strace -e lseek zsh -c 'zmodload zsh/system; sysseek -w current -u 1 0 || syserror'
lseek(1, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
Illegal seek

Disabling the buffering

The script command uses a pseudo-terminal pair to capture the output of a program, so the program's stdout (and stdin and stderr) will be a pseudo-terminal device.

When the stdout is to a terminal device, there is still generally some buffering, but it is line based. printf/puts and co will not write anything until a newline character is to be output. For other types of files, the buffering is by blocks (of a few kilo bytes).

There are several options to disable the buffering which are discussed in a number of Q&As here (search for unbuffer or stdbuf, Can't redirect cut output gives a few approaches) either by using a pseudo-terminal as can be done by socat/script/expect/unbuffer (an expect script)/zsh's zpty or by injecting code in the executable to disable the buffering as done by GNU's or FreeBSD's stdbuf.

Related Solutions

How to tell if the pipe buffer is full

I would trace your Perl script with a system call trace tool: strace (Linux), dtruss (OS X), ktrace (FreeBSD), truss (Solaris), etc. The goal would be to see how much time your Perl script spends waiting on reading from its stdin and how much time the other program spends waiting on writing to its stdout.

Here I'm testing this out with the writer as the bottleneck:

terminal 1$ gzip -c < /dev/urandom | cat > /dev/null

terminal 2$ ps auxw | egrep 'gzip|cat'
slamb    25311 96.0  0.0  2852  516 pts/0    R+   23:35   3:40 gzip -c
slamb    25312  0.8  0.0  2624  336 pts/0    S+   23:35   0:01 cat

terminal 2$ strace -p 25312 -s 0 -rT -e trace=read
Process 25312 attached - interrupt to quit
     0.000000 read(0, ""..., 4096) = 4096 <0.005207>
     0.005531 read(0, ""..., 4096) = 4096 <0.000051>

The first number here is the time since the start of the previous syscall, and the last number is the time spent in the syscall. So we can post-process with Perl a bit to aggregate it... [*]

terminal 2$ strace -p 25312 -s 0 -rT -e trace=read 2>&1 | perl -nle 'm{^\s*([\d.]+) read\(0, .*<([\d.]+)>} or next; $total_work += $1 - $last_wait; $total_wait += $2; $last_wait = $2; print "working for $total_work sec, waiting for $total_wait sec"; $last_wait = $2;'
working for 0 sec, waiting for 0.005592 sec
...
working for 0.305356 sec, waiting for 2.28624900000002 sec
...

terminal 2$ strace -p 25311 -s 0 -rT -e trace=write 2>&1 | perl -nle 'm{^\s*([\d.]+) write\(1, .*<([\d.]+)>} or next; $total_work += $1 - $last_wait; $total_wait += $2; $last_wait = $2; print "working for $total_work sec, waiting for $total_wait sec"; $last_wait = $2;'
...
working for 5.15862000000001 sec, waiting for 0.0555740000000007 sec
...

You could go fancier and make a SystemTap or DTrace script that does traces both sides at once, only tracks the correct file descriptor, and prints a nice status update every second or so with what percent of time each was waiting for the other.

[*] - Warning: my crude aggregation isn't quite right if read/write is being called on other file descriptors; it will underestimate the work time in that case.

The dtrace version is pretty neat actually.

terminal 1$ gzip -c < /dev/urandom | cat > /dev/null

terminal 2$ ps aux | egrep 'gzip| cat'
slamb    54189  95.8  0.0   591796    584 s006  R+   12:49AM  22:49.55 gzip -c
slamb    54190   0.4  0.0   599828    392 s006  S+   12:49AM   0:06.08 cat

terminal 2$ cat > pipe.d <<'EOF'
#!/usr/sbin/dtrace -qs

BEGIN
{
  start = timestamp;
  writer_blocked = 0;
  reader_blocked = 0;
}

tick-1s, END
{
  this->elapsed = timestamp - start;
  printf("since startup, writer blocked %3d%% of time, reader %3d%% of time\n",
         100 * writer_blocked / this->elapsed,
         100 * reader_blocked / this->elapsed);
}

syscall::write:entry
/pid == $1 && arg0 == 1/
{
  self->entry = timestamp;
}

syscall::write:return
/pid == $1 && self->entry != 0/
{
  writer_blocked += timestamp - self->entry;
  self->entry = 0;
}

syscall::read:entry
/pid == $2 && arg0 == 0/
{
  self->entry = timestamp;
}

syscall::read:return
/pid == $2 && self->entry != 0/
{
  reader_blocked += timestamp - self->entry;
  self->entry = 0;
}
EOF

terminal 2$ chmod u+x pipe.d
terminal 2$ sudo ./pipe.d 54189 54190
since startup, writer blocked   0% of time, reader  98% of time
since startup, writer blocked   0% of time, reader  99% of time
since startup, writer blocked   0% of time, reader  99% of time
since startup, writer blocked   0% of time, reader  99% of time
since startup, writer blocked   0% of time, reader  99% of time
^C
since startup, writer blocked   0% of time, reader  99% of time

And the SystemTap version:

terminal 1$ gzip -c /dev/urandom | cat > /dev/null

terminal 2$ ps auxw | egrep 'gzip| cat'
slamb     3405  109  0.0   4356   584 pts/1    R+   02:57   0:04 gzip -c /dev/urandom
slamb     3406  0.2  0.0  10848   588 pts/1    S+   02:57   0:00 cat

terminal 2$ cat > probes.stp <<'EOF'
#!/usr/bin/env stap

global start
global writer_pid
global writes
global reader_pid
global reads

probe begin {
  start = gettimeofday_us()
  writer_pid = strtol(argv[1], 10)
  reader_pid = strtol(argv[2], 10)
}

probe timer.s(1), end {
  elapsed = gettimeofday_us() - start
  printf("since startup, writer blocked %3d%% of time, reader %3d%% of time\n",
         100 * @sum(writes) / elapsed,
         100 * @sum(reads) / elapsed)
}

probe syscall.write.return {
  if (pid() == writer_pid && $fd == 1)
    writes <<< gettimeofday_us() - @entry(gettimeofday_us())
}

probe syscall.read.return {
  if (pid() == reader_pid && $fd == 0)
    reads <<< gettimeofday_us() - @entry(gettimeofday_us())
}
EOF

terminal 2$ chmod a+x probes.stp
terminal 2$ sudo ./pipe.stp 3405 3406
since startup, writer blocked   0% of time, reader  99% of time
...

Shell – cat into stdin then pipe into program keeps forked shell open, why

For the cases where it "works", you are leaving a process running cat which is reading its standard input, which has not been closed. Since that is not (yet) closed, cat continues to run, leaving its standard output open, which is used by the shell (also not closed).