The cat foo bar example is not what I meant. Here cat only has one input and one output at a time.
tee is an example: it outputs to all the arguments, plus its standard output at the same time. Using the same kind of ASCII art diagram as in my previous answer, here's what tee foo bar looks like when it's operating in a terminal.
   +------------------+
   |       tee        |
===|<stdin            |         +------------+
 → |                  |         |  terminal  |
   |           stdout>|=========|<input      |
   |                  |   → ##==|<           |
   |                  |     ||  +------------+
   |           stderr>|=====##
   |                  |   →
   |                  |       +-------------+
   |                3>|=======|> file "foo" |
   |                  |   →   +-------------+
   |                  |       +-------------+
   |                4>|=======|> file "bar" |
   |                  |   →   +-------------+
   |                  |
   +------------------+
In this example, tee is sending “useful” output to three channels: to the terminal (because that's where its standard output is connected), and to two files. In addition, tee has one more output channel for errors.
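To see the three copies for yourself, here's a minimal sketch. The scratch directory and the stdout_copy file are my own additions; I redirect tee's standard output to a file only so that it can be compared with foo and bar instead of scrolling past on the terminal.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)

# tee copies its standard input to standard output AND to each file argument.
# Here standard output is redirected to a third file for easy comparison.
printf 'hello\n' | tee "$workdir/foo" "$workdir/bar" > "$workdir/stdout_copy"

# All three destinations now hold the same payload.
cat "$workdir/foo" "$workdir/bar" "$workdir/stdout_copy"

# Remove the scratch directory with: rm -r "$workdir"
```

Without the final redirection, the third copy would simply appear on the terminal, as in the diagram.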
A program normally has three input/output channels, identified by their file descriptor number:
- standard input (stdin for short, file descriptor number 0);
- standard output (stdout for short, file descriptor number 1);
- standard error (stderr for short, file descriptor number 2).
The purpose of file descriptors 0, 1 and 2 is only a matter of convention — nothing enforces that a program cannot attempt to write to file descriptor 0 or read from descriptors 1 and 2 — but this is a convention that is pretty much universally followed.
If you run a program from a terminal, file descriptors 0, 1 and 2 start out connected to that terminal, unless they have been redirected. Other file descriptors start out closed, and will be used if the program opens other files.
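A quick way to check what a descriptor is connected to is the shell's test -t. The sketch below uses a helper I've called fd_kind (a made-up name, not a standard command); what it prints depends on how you run it.

```shell
# fd_kind is a hypothetical helper: it reports whether the given
# file descriptor number is connected to a terminal.
fd_kind() {
    if [ -t "$1" ]; then
        echo "fd $1: terminal"
    else
        echo "fd $1: not a terminal"
    fi
}

# Typed at an interactive prompt, fd 0 and fd 2 should be reported
# as terminals here.
fd_kind 0
fd_kind 2

# Once standard input is redirected, fd 0 is no longer the terminal.
fd_kind 0 </dev/null
```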
In particular, all commands have two outputs: standard output (for the command's payload, the “useful” output), and standard error (for error or informational messages).
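As an illustration of the two output channels, here's a small sketch. The emit function is a made-up stand-in for any command that prints both a result and a diagnostic.

```shell
# emit is a hypothetical command that writes to both output channels.
emit() {
    echo "payload"        # goes to standard output (fd 1)
    echo "warning" >&2    # goes to standard error (fd 2)
}

workdir=$(mktemp -d)

# Send each channel to its own file.
emit >"$workdir/out.txt" 2>"$workdir/err.txt"

cat "$workdir/out.txt"    # contains only the payload
cat "$workdir/err.txt"    # contains only the warning
```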
A pipeline in the shell (command1 | command2 | command3 | …) connects each command's standard output to the next command's standard input. All commands' standard error goes to the terminal (unless redirected).
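Here's a rough sketch of that behaviour. The produce function is a hypothetical command writing one line to each output channel, and tr stands in for the next command in the pipeline.

```shell
# produce is a hypothetical command: one line on stdout, one on stderr.
produce() {
    echo "data line"
    echo "diagnostic" >&2
}

# Only standard output travels through the pipe, so tr never sees
# the diagnostic; it goes straight to wherever stderr points.
piped=$(produce | tr '[:lower:]' '[:upper:]')
echo "$piped"

# Adding 2>&1 before the pipe sends standard error down the pipe too.
merged=$(produce 2>&1 | tr '[:lower:]' '[:upper:]')
echo "$merged"
```

In the first case the diagnostic still shows up on the terminal, just not upper-cased, because it bypassed tr.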
Shells provide ways to redirect other file descriptors. You've probably encountered 2>&1 or 2>file to redirect standard error. See "When would you use an additional file descriptor?" and the other posts it links to for examples of manipulations of other file descriptors.
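The same redirection syntax works for descriptors above 2. As a minimal sketch (report is a made-up function, and using fd 3 for status messages is just a choice for this example):

```shell
workdir=$(mktemp -d)

# report is a hypothetical command that writes its payload to stdout
# and a status message to file descriptor 3.
report() {
    echo "payload"
    echo "status: done" >&3
}

# 3>file opens fd 3 for the duration of the command, just as 2>file
# would for standard error.
report >"$workdir/out.txt" 3>"$workdir/status.txt"

cat "$workdir/out.txt"
cat "$workdir/status.txt"
```

If report were run without the 3>... redirection, the write to fd 3 would normally fail, since that descriptor starts out closed.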
Feature-rich shells also offer process substitution to generalize file redirection to piped commands, so that you aren't limited to a linear pipe with each command having a single input and a single output.
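For instance, here's a small sketch where one command reads from two piped commands at once. Process substitution isn't part of plain sh, so I run the snippet under bash explicitly; this assumes bash is installed, and paste is just a convenient command that takes two inputs.

```shell
# <(...) runs a command and hands paste a file name connected to that
# command's output, so paste gets two independent piped inputs.
joined=$(bash -c 'paste -d, <(printf "1\n2\n") <(printf "a\nb\n")')
echo "$joined"
```

Each line of the result pairs one line from the first command with one line from the second, separated by a comma.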
Very few commands attempt to access file descriptors above 2, except after they've opened a file (opening a file chooses a free file descriptor and returns its number to the application). One example is GnuPG, which expects to read the data to encrypt/decrypt/sign/verify on its standard input and to write the result to standard output. It can be told to read a passphrase on a different file descriptor with the --passphrase-fd option. GnuPG also has options to report status data on other file descriptors, so you can have the payload output on stdout, error messages on stderr, and status information on another file descriptor. Here's an example where the output of a piped command is used as a passphrase:
echo fjbeqsvfu | rot13 | gpg -d --passphrase-fd=3 3<&0 <file.encrypted >file.plaintext
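If you want to try the 3<&0 <file trick without GnuPG, here's a sketch of the same plumbing. The consume function is a made-up stand-in that reads a secret from fd 3 and the payload from standard input; the file name and the strings are arbitrary.

```shell
workdir=$(mktemp -d)
printf 'the payload\n' >"$workdir/payload.txt"

# consume is a hypothetical stand-in for a program like gpg: it reads
# a secret from file descriptor 3 and the payload from standard input.
consume() {
    read -r secret <&3
    read -r data
    echo "secret=$secret data=$data"
}

# Redirections are processed left to right: 3<&0 first copies the pipe
# (the current stdin) onto fd 3, then <file points stdin at the file.
result=$(echo hunter2 | consume 3<&0 <"$workdir/payload.txt")
echo "$result"
```

The order matters: writing <file before 3<&0 would make fd 3 a copy of the file instead of the pipe.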
It's an efficiency measure. The CPU runs so much faster than the serial port that if the kernel let the userspace process run every time there was a little bit of room in the buffer, it would end up making a trip to userspace and back for every single byte of data. That's very wasteful of CPU time:
$ time dd if=/dev/zero of=/dev/null bs=1 count=10000000
10000000+0 records in
10000000+0 records out
10000000 bytes (10 MB, 9.5 MiB) copied, 5.95145 s, 1.7 MB/s
real 0m5.954s
user 0m1.960s
sys 0m3.992s
$ time dd if=/dev/zero of=/dev/null bs=1000 count=10000
10000+0 records in
10000+0 records out
10000000 bytes (10 MB, 9.5 MiB) copied, 0.011041 s, 906 MB/s
real 0m0.014s
user 0m0.000s
sys 0m0.012s
The above test isn't even reading and writing a real device: the whole time difference is how often the system is bouncing between userspace and kernelspace.
If userspace doesn't want to be held up, it can use non-blocking I/O, or it can use a select() call to see if there's room to write to the device... and if there's not, it can dump the remainder into a buffer of its own and continue processing. Admittedly, that does complicate things, since now you have a buffer that you have to flush... but if you're using stdio, that's generally true anyway.
Best Answer
They're coming from the kernel. You'll see them also by running
Kernel messages are displayed on the virtual console by default; they aren't shown in X terminal emulators (such as GNOME Terminal).