About your performance question: pipes are more efficient than files because no disk I/O is needed. So cmd1 | cmd2 is more efficient than cmd1 > tmpfile; cmd2 < tmpfile (this might not be true if tmpfile lives on a RAM disk or another memory-backed device, or is a named pipe; but if it is a named pipe, cmd1 should be run in the background, since opening the pipe blocks until a reader shows up and its output can block once the pipe becomes full). If you need to keep the result of cmd1 and still send its output to cmd2, you should use cmd1 | tee tmpfile | cmd2, which allows cmd1 and cmd2 to run in parallel and spares cmd2 from reading the data back from disk.
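As a rough illustration of the tee variant, here is a minimal sketch in which seq stands in for cmd1 and wc -l for cmd2; both stand-ins and the temporary directory are assumptions made for the example, not part of the question.

```shell
# Scratch directory so the example does not leave files behind.
workdir=$(mktemp -d)

# seq plays the role of cmd1, wc -l the role of cmd2 (hypothetical stand-ins).
# tee copies the stream to tmpfile while passing it on to wc through the pipe.
count=$(seq 1 5 | tee "$workdir/tmpfile" | wc -l | tr -d ' ')

# The saved copy holds the same data, so it can be reused later without rerunning cmd1.
saved=$(wc -l < "$workdir/tmpfile" | tr -d ' ')

echo "cmd2 saw $count lines; tmpfile holds $saved lines"
rm -r "$workdir"
```

Here cmd2 never touches the disk copy; it only reads from the pipe.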
Named pipes are useful if many processes read from or write to the same pipe. They can also be useful when a program is not designed to use stdin/stdout for its I/O and needs to work with "files" instead. I put "files" in quotes because named pipes are not exactly files from a storage point of view: they reside in memory and have a limited buffer size, even though they have a filesystem entry (for reference purposes). Other things in UNIX have filesystem entries without being files: just think of /dev/null, or other entries in /dev or /proc.
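A minimal sketch of the "program that wants a file name" case, assuming a scratch directory created with mktemp and using cat as a stand-in for such a program:

```shell
workdir=$(mktemp -d)
mkfifo "$workdir/fifo"

# The writer goes to the background: opening a FIFO for writing blocks
# until some process opens it for reading.
printf 'hello\n' > "$workdir/fifo" &

# cat is handed a path, as a file-only program would be, yet the data
# arrives through the pipe and never lands on disk.
line=$(cat "$workdir/fifo")
wait

# The filesystem entry is typed as a FIFO, not as a regular file.
if [ -p "$workdir/fifo" ]; then kind="named pipe"; else kind="something else"; fi

echo "read '$line' from a $kind"
rm -r "$workdir"
```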
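As pipes (named and unnamed) have a limited buffer size, reads and writes on them can block: a write blocks while the buffer is full and a read blocks while it is empty, so the process sleeps until the other side catches up. Also, when do you receive an EOF when reading from a memory buffer? A read returns EOF once the buffer is empty and every writer has closed its end of the pipe. The full rules are well defined and can be found in the man pages (pipe(7) on Linux).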
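The EOF behaviour can be observed with a small sketch like the one below; the one-second pause is an arbitrary choice for the example. The reader does not see EOF during the pause, only once the writing side has exited and closed its end.

```shell
# The writer emits a line, pauses, then emits another and exits.
# wc -l keeps waiting during the pause because the write end is still open;
# it reports its count only after the writer is gone.
count=$( { printf 'one\n'; sleep 1; printf 'two\n'; } | wc -l | tr -d ' ' )
echo "reader counted $count lines before EOF"
```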
One thing you cannot do with pipes (named and unnamed) is seek back in the data. As they are implemented using a memory buffer, this is understandable.
About "everything in Linux/Unix is a file": I do not agree. Named pipes have filesystem entries, but are not exactly files. Unnamed pipes do not have filesystem entries (except maybe in /proc). However, most I/O operations on UNIX are done using the read/write functions, which need a file descriptor, and that includes unnamed pipes (and sockets). I do not think that we can say that "everything in Linux/Unix is a file", but we can surely say that "most I/O in Linux/Unix is done using a file descriptor".
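One way to see that an unnamed pipe is reached through a file descriptor rather than through a name of its own is to ask what descriptor 0 refers to inside a pipeline. This sketch assumes the system provides /dev/stdin (as Linux and most modern UNIX-like systems do).

```shell
# The left side writes nothing; all that matters is that fd 0 of the
# right side is the read end of an unnamed pipe.
if : | [ -p /dev/stdin ]; then
  stdin_kind="pipe"
else
  stdin_kind="not a pipe"
fi
echo "stdin of the right-hand command is: $stdin_kind"
```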
Normally, tr shouldn't be able to write that error message, because it should have been killed by a SIGPIPE signal when it tried to write something after the other end of the pipe had been closed upon termination of head.
You get that error message because somehow the process running tr has been configured to ignore SIGPIPE. I suspect that might be done by the popen() implementation in your language there.
You can reproduce it by doing:
sh -c 'trap "" PIPE; tr -dc "[:alpha:]" < /dev/urandom | head -c 8'
You can confirm that's what is happening by doing:
strace -fe signal sh your-program
(or the equivalent on your system if not using Linux). You'll then see something like:
rt_sigaction(SIGPIPE, {SIG_IGN, ~[RTMIN RT_1], SA_RESTORER, 0x37cfc324f0}, NULL, 8) = 0
or
signal(SIGPIPE, SIG_IGN)
done in one process before that same process, or one of its descendants, executes the /bin/sh that interprets that command line and starts tr and head.
If you do a strace -fe write, you'll see something like:
write(1, "AJiYTlFFjjVIzkhCAhccuZddwcydwIIw"..., 4096) = -1 EPIPE (Broken pipe)
The write system call fails with an EPIPE error instead of triggering a SIGPIPE.
In either case tr will exit. When SIGPIPE is ignored, it exits because of that write error (which is also what triggers the error message). When it is not ignored, tr is killed by the SIGPIPE. You do want it to exit, since you don't want it to carry on reading /dev/urandom after those 8 bytes have been read by head.
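The "exits because of the write error" path can be checked without strace. The sketch below is only an illustration: yes stands in for tr as a writer that never stops on its own, and file descriptor 3 is an arbitrary choice for carrying the writer's exit status out of the pipeline.

```shell
# The inner sh ignores SIGPIPE, and yes inherits that disposition.
# Once head has read one line and exited, the next write by yes fails
# with EPIPE, so yes exits on its own with an error status.
ignored_status=$(
  {
    sh -c 'trap "" PIPE; yes 2>/dev/null; echo "$?" >&3' | head -n 1 >/dev/null
  } 3>&1
)
echo "with SIGPIPE ignored, the writer exited with status $ignored_status"
```

The status is an ordinary non-zero exit code. A writer killed by SIGPIPE would instead be reported by the shell as a signal death, usually as 128 plus the signal number.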
To avoid that error message, you can restore the default handler for SIGPIPE with:
trap - PIPE
prior to calling tr:
popen("trap - PIPE; { tr ... | head -c 8; } 2>&1", ...)
As a result of the pipe in x | y, a subshell is created to contain the pipeline as part of the foreground process group. This continues to create subshells (via fork()) indefinitely, thus creating a fork bomb. The fork does not actually occur until the code is run, however, which is the final invocation of : in your code.
To disassemble how the fork bomb works:
:() - define a new function called :
{ :|: & } - the function body, which recursively pipes the calling function into another instance of the calling function, in the background
: - call the fork bomb function
This tends to not be too memory intensive, but it will suck up PIDs and consume CPU cycles.