In:
./binary < file
binary's stdin is the file, open in read-only mode. Note that bash doesn't read the file at all; it just opens it for reading on file descriptor 0 (stdin) of the process it executes binary in.
In:
./binary << EOF
test
EOF
Depending on the shell, binary's stdin will be either a deleted temporary file (AT&T ksh, zsh, bash...) that contains test\n as put there by the shell, or the reading end of a pipe (dash, yash; the shell writes test\n in parallel at the other end of the pipe). In your case, if you're using bash, it would be a temp file (bash 5.1 and later may use a pipe instead when the here-document is small enough to fit in the pipe buffer).
In:
cat file | ./binary
Depending on the shell, binary's stdin will be either the reading end of a pipe, or one end of a socket pair where the writing direction has been shut down (ksh93), and cat is writing the content of file at the other end.
When stdin is a regular file (temporary or not), it is seekable. binary may go to the beginning or end, rewind, etc. It can also mmap() it or do some ioctl()s on it like FIEMAP/FIBMAP (and if using <> instead of <, it could truncate it, punch holes in it, etc.).
Pipes and socket pairs, on the other hand, are a means of inter-process communication. There's not much binary can do with them besides reading the data (though there are also some operations, like some pipe-specific ioctl()s, that it could do on them and not on regular files).
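To make the seekable/non-seekable difference concrete, here is a minimal Python sketch (the helper names try_seek and demo are made up for illustration). It does the same lseek() to the end on a deleted temporary file and on the reading end of a pipe, and reports either the offset reached or the name of the error:

```python
import errno
import os
import tempfile


def try_seek(fd):
    """Seek to the end of fd; return the offset reached, or the errno name on failure."""
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))


def demo():
    # A regular (already unlinked) temporary file, much like a bash here-document.
    with tempfile.TemporaryFile() as f:
        f.write(b"test\n")
        f.flush()
        on_file = try_seek(f.fileno())

    # The reading end of a pipe, like stdin in: cat file | ./binary
    r, w = os.pipe()
    try:
        os.write(w, b"test\n")
        on_pipe = try_seek(r)
    finally:
        os.close(r)
        os.close(w)
    return on_file, on_pipe
```

On the regular file the seek succeeds and returns the file size; on the pipe it fails with ESPIPE ("Illegal seek"), which is the error a program gets if it assumes its stdin is always seekable.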
Most of the time, it's the missing ability to seek that causes applications to fail/complain when working with pipes, but it could be any of the other system calls that are valid on regular files but not on other types of files (like mmap(), ftruncate(), fallocate()). On Linux, there's also a big difference in behaviour when you open /dev/stdin while fd 0 is on a pipe or on a regular file.
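As a rough illustration of those other calls, the following Python sketch (failing_calls_on_pipe is a hypothetical helper name) tries ftruncate() and mmap() on the reading end of a pipe and records which ones fail. The exact errno values are system-dependent, so the sketch only collects them rather than assuming particular ones:

```python
import mmap
import os


def failing_calls_on_pipe():
    """Try calls that are valid on regular files on a pipe; return {call: errno}."""
    r, w = os.pipe()
    failures = {}
    try:
        os.write(w, b"test\n")
        try:
            os.ftruncate(r, 0)
        except OSError as e:
            failures["ftruncate"] = e.errno
        try:
            # Ask for a read-only mapping of one page's worth of the pipe.
            mmap.mmap(r, 4096, access=mmap.ACCESS_READ).close()
        except OSError as e:
            failures["mmap"] = e.errno
    finally:
        os.close(r)
        os.close(w)
    return failures
```

Both calls are expected to fail here, while the same calls on a regular file opened with suitable permissions would succeed.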
There are many commands out there that can only deal with seekable files, but when that's the case, that's generally not for the files open on their stdin.
$ unzip -l file.zip
Archive:  file.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
       11  2016-12-21 14:43   file
---------                     -------
       11                     1 file
$ unzip -l <(cat file.zip)
# more or less the same as cat file.zip | unzip -l /dev/stdin
Archive: /proc/self/fd/11
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of /proc/self/fd/11 or
/proc/self/fd/11.zip, and cannot find /proc/self/fd/11.ZIP, period.
unzip needs to read the index stored at the end of the file, and then seek within the file to read the archive members. But here, the file (regular in the first case, pipe in the second) is given as a path argument to unzip, and unzip opens it itself (typically on an fd other than 0) instead of inheriting an fd already opened by the caller. It doesn't read zip files from its stdin. stdin is mostly used for user interaction.
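The same limitation can be reproduced without unzip. The sketch below uses Python's zipfile module as a stand-in (an assumption for illustration only; it is not how unzip is implemented, but it also needs the central directory at the end of the archive). The helper names are hypothetical:

```python
import io
import os
import zipfile


def make_zip_bytes():
    """Build a tiny zip archive in memory with a single member called 'file'."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("file", "hello world")
    return buf.getvalue()


def list_from_seekable(data):
    """List members when the archive is available through a seekable object."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def list_from_pipe(data):
    """Try the same listing when the archive arrives on the reading end of a pipe."""
    r, w = os.pipe()
    os.write(w, data)  # the archive is small enough to fit in the pipe buffer
    os.close(w)
    with os.fdopen(r, "rb") as pipe_end:
        try:
            with zipfile.ZipFile(pipe_end) as zf:
                return zf.namelist()
        except (OSError, zipfile.BadZipFile) as e:
            # Locating the end-of-archive index requires seeking, which a pipe refuses.
            return type(e).__name__
```

With the seekable source the member list comes back; with the pipe, the attempt to locate the index fails and the function returns the name of the exception instead.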
If you run that binary of yours without redirection at the prompt of an interactive shell running in a terminal emulator, then binary's stdin will be inherited from its caller, the shell, which itself will have inherited it from its caller, the terminal emulator, and will be a pty device open in read+write mode (something like /dev/pts/n).
Those devices are not seekable either. So, if binary works OK when taking input from the terminal, possibly the issue is not about seeking.
If that 14 is meant to be an errno (an error code set by failing system calls), then on most systems, that would be EFAULT (Bad address). The read() system call would fail with that error if asked to read into a memory address that is not writable. That would be independent of whether the fd to read the data from points to a pipe or regular file and would generally indicate a bug[1].
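If you want to check what a given errno number means on your own system, a small Python sketch like this one can help (describe_errno is a hypothetical helper; the message text comes from the system's strerror() and may differ between systems):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, system message) for an errno number."""
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)
```

Calling describe_errno(14) on Linux and the BSDs gives EFAULT as the symbolic name, alongside the system's own message for it.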
binary possibly determines the type of file open on its stdin (with fstat()) and runs into a bug when it's neither a regular file nor a tty device.
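For reference, this is roughly what such a check looks like, sketched in Python (classify_fd is a made-up name; a real application may well distinguish fewer or more cases):

```python
import os
import stat


def classify_fd(fd):
    """Classify the file open on fd using the mode reported by fstat()."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character device"  # ttys and ptys fall in this category
    return "other"
```

Printing classify_fd(0) from a script and running that script with < file, with cat file | in front of it, or directly at a terminal should show the three situations described above. A program that only handles the "regular file" and "character device" cases has an untested path for everything else.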
Hard to tell without knowing more about the application. Running it under strace (or the truss/tusc equivalent on your system) could help us see which system call, if any, is failing here.
[1] The scenario envisaged by Matthew Ife in a comment to your question sounds quite plausible here. Quoting him:
I suspect it is seeking to the end of file to get a buffer size for reading the data, badly handling the fact that seek doesn't work and attempting to allocate a negative size (not handling a bad malloc). Passing the buffer to read which faults given the buffer is not valid.
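If that is indeed what happens, the usual fix is to treat the size obtained by seeking as an optional hint and to read until end-of-file regardless. A minimal sketch of that approach in Python (read_all and its chunk_size parameter are hypothetical names, not anything taken from the application in question):

```python
import os


def read_all(fd, chunk_size=65536):
    """Read everything left on fd; return (size_hint, data).

    size_hint is the number of bytes remaining according to lseek(),
    or None when fd is not seekable (pipe, socket, tty).
    """
    try:
        current = os.lseek(fd, 0, os.SEEK_CUR)
        end = os.lseek(fd, 0, os.SEEK_END)
        os.lseek(fd, current, os.SEEK_SET)  # restore the original position
        size_hint = end - current
    except OSError:
        size_hint = None  # never turn a failed seek into a buffer size

    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:  # an empty read means end-of-file
            break
        chunks.append(chunk)
    return size_hint, b"".join(chunks)
```

The point of the design is that a failed seek only loses the hint; the read loop does not depend on it, so the same code path works for regular files, pipes and terminals.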
Best Answer
This is an interesting question, and it deals with a part of the Unix/Linux philosophy.
So, what is the difference between programs like grep, sed, sort on the one hand and kill, rm, ls on the other hand? I see two aspects.
The filter aspect
The first kind of programs is also called filters. They take an input, either from a file or from STDIN, modify it, and generate some output, mostly to STDOUT. They are meant to be used in a pipe with other programs as sources and destinations.
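As an illustration of what "filter" means here, this is about the smallest one you can write in Python (upcase_filter and main are made-up names, and upper-casing is an arbitrary choice of transformation):

```python
import sys


def upcase_filter(lines):
    """Transform each input line without caring what the data means."""
    for line in lines:
        yield line.upper()


def main(stdin=None, stdout=None):
    """Read lines from stdin, write the transformed lines to stdout."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    stdout.writelines(upcase_filter(stdin))
```

A script that calls main() at the end can sit anywhere in a pipeline, between whatever produces its input and whatever consumes its output.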
The second kind of programs acts on an input, but the output they give is often not related to the input. kill has no output when it works regularly, neither does rm. They just have a return value to show success. They do not normally take input from STDIN, but mostly give output to STDOUT.
For programs like ls, the filter aspect does not work that well. It can certainly have an input (but does not need one), and the output is closely related to that input, but it does not work as a filter. However, for that kind of programs, the other aspect still works:
The semantic aspect
For filters, their input has no semantic meaning. They just read data, modify data, output data. It doesn't matter whether this is a list of numeric values, some filenames or HTML source code. The meaning of this data is only given by the code you provide to the filter: the regex for grep, the rules for awk or the Perl program.
For other programs, like kill or ls, their input has a meaning, a denotation. kill expects process numbers, ls expects file or path names. They cannot handle arbitrary data and they are not meant to. Many of them do not even need any input or parameters, like ps. They do not normally read from STDIN.
One could probably combine these two aspects: A filter is a program whose input does not have a semantic meaning for the program.
I'm sure I have read about this philosophy somewhere, but I don't remember any sources at the moment, sorry. If someone has some sources present, please feel free to edit.