Linux – pipe any two processes to each other

Tags: bsd, fork, linux, pipe, process

In this page from The Design and Implementation of the 4.4BSD Operating System, it is said that:

A major difference between pipes and sockets is that pipes require a
common parent process to set up the communications channel

However, if I recall correctly, the only way to create a new process is to fork an existing one. So I can't really see how two processes could fail to have a common ancestor. Am I then right to think that any pair of processes can be piped to each other?

Best Answer

Am I then right to think that any pair of processes can be piped to each other?

Not really.

The pipes need to be set up by the parent process before the child or children are forked. Once the child process is forked, its file descriptors cannot be manipulated "from the outside" (ignoring things like debuggers), so neither the parent nor any other process can do the "set up the communications channel" part after the fact.
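As a rough illustration of that ordering, here is a minimal Python sketch in which the pipe is created first and the child inherits it through fork. The function name send_through_pipe is made up for this example, and it assumes a Unix-like system where os.fork is available.

```python
import os


def send_through_pipe(message: bytes) -> bytes:
    """Create a pipe, fork, let the child write and the parent read."""
    # The pipe has to exist BEFORE the fork so the child inherits both ends.
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end.
        status = 1
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code

    # Parent: keep only the read end.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # empty read means every write end has been closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(send_through_pipe(b"hello from the child"))
```

If the os.pipe() call were moved below os.fork(), each process would get its own private pipe and nothing the child wrote would ever reach the parent.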

So if you take two random processes that are already running, you can't set up a pipe between them directly. You need to use some form of socket (or another IPC mechanism) to get them to communicate. (But note that many Unix-like systems, Linux and FreeBSD among them, allow you to send file descriptors over Unix-domain sockets.)
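The file-descriptor passing mentioned in that parenthesis can be sketched as follows. This assumes Python 3.9 or later (for socket.send_fds and socket.recv_fds) on a Unix-like system, and the function name is invented for the example. To stay self-contained the two processes here are still a parent and a child, but the pipe is created after the fork, so the parent can only obtain the read end through the socket. Between genuinely unrelated processes the same calls would run over a Unix-domain socket bound to a filesystem path.

```python
import os
import socket


def receive_pipe_created_after_fork(message: bytes) -> bytes:
    """Child creates a pipe after fork and hands its read end to the parent."""
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            parent_sock.close()
            # Created after the fork: the parent did NOT inherit this pipe.
            read_fd, write_fd = os.pipe()
            # Ship the read end across the socket (SCM_RIGHTS under the hood).
            socket.send_fds(child_sock, [b"f"], [read_fd])
            os.write(write_fd, message)
            os.close(write_fd)
            # Wait for an acknowledgement before exiting, so our copy of the
            # descriptor stays open until the parent holds its own.
            child_sock.recv(1)
            status = 0
        finally:
            os._exit(status)

    child_sock.close()
    _data, fds, _flags, _addr = socket.recv_fds(parent_sock, 1, 1)
    parent_sock.sendall(b"k")  # acknowledge receipt of the descriptor
    received_fd = fds[0]
    chunks = []
    while True:
        chunk = os.read(received_fd, 4096)
        if not chunk:  # the child closed the only write end
            break
        chunks.append(chunk)
    os.close(received_fd)
    parent_sock.close()
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(receive_pipe_created_after_fork(b"sent over a late-made pipe"))
```

Note that this still needs a socket to bootstrap the exchange, which is consistent with the answer: the pipe alone cannot connect two processes that are already running.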

Related Solutions

Shell – How to Exit less Follow Mode Without Stopping Other Processes in Pipe

Works OK for me when looking at a file that's being appended to but not when input comes from a pipe (using the F command - control-C works fine then).

See discussion at Follow a pipe using less? - this is a known bug/shortcoming in less.

Shell – On `fork`, children processes, and “subshells”

Since, according to zshall(1), $ZDOTDIR/.zshenv gets sourced whenever a new instance of zsh starts

If you focus on the word "starts" here you'll have a better time of things. The effect of fork() is to create another process that begins from exactly where the current process already is. It's cloning an existing process, with the only difference being the return value of fork. The documentation is using "starts" to mean entering the program from the beginning.

Your example #3 runs $SHELL -c 'date; printenv; echo $$', starting an entirely new process from the beginning. It will go through the ordinary startup behaviour. You can illustrate that by, for example, swapping in another shell: run bash -c ' ... ' instead of zsh -c ' ... '. There's nothing special about using $SHELL here.

Examples #1 and #2 run subshells. The shell forks itself and executes your commands inside that child process, then carries on with its own execution when the child is done.

The answer to your question #1 is the above: example 3 runs an entirely new shell from the start, while the other two run subshells. The startup behaviour includes loading .zshenv.

The reason they call this behaviour out specifically, which is probably what leads to your confusion, is that this file (unlike some others) loads in both interactive and non-interactive shells.

To your question #2:

if the shells that get created in #1 and #2 are called "subshells", what are those like the one generated by #3 called?

If you want a name you could call it a "child shell", but really it's nothing. It's no different than any other process you start from the shell, be it the same shell, a different shell, or cat.

To your question #3:

is it possible to rationalize (and maybe generalize) the empirical/anecdotal findings described above in terms of the "theory" (for lack of a better word) of Unix processes?

fork makes a new process, with a new PID, that starts running in parallel from exactly where this one left off. exec replaces the currently-executing code with a new program loaded from somewhere, running from the beginning. When you spawn a new program, you first fork yourself and then exec that program in the child. That is the fundamental theory of processes that applies everywhere, inside and outside of shells.
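The fork-then-exec pattern described above might look like this in Python. It is a sketch only: spawn is a hypothetical helper name, it assumes a Unix-like system, and os.waitstatus_to_exitcode needs Python 3.9 or later.

```python
import os


def spawn(argv):
    """Run argv as a new program via fork + exec and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this copy of the process with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed (e.g. program not found).
            os._exit(127)
    # Parent: carries on from here and waits for the child to finish.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print(spawn(["sh", "-c", "exit 3"]))
```

A subshell corresponds to the fork half alone: the child keeps running the same program's code instead of calling exec.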

Subshells are forks, and every non-builtin command you run leads to both a fork and an exec.

Note that in any POSIX-compatible shell, $$ inside a subshell still expands to the PID of the parent shell, so you may not be getting the output you expect regardless. Note also that zsh aggressively optimises subshell execution anyway, and commonly execs the last command, or doesn't spawn the subshell at all if all the commands are safe without it.
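One way to check the $$ behaviour is to run a tiny script through a shell and compare the two values it prints. The sketch below does that from Python; the helper name and the choice of sh as the default shell are assumptions made for the example.

```python
import subprocess


def pids_seen_by_shell(shell="sh"):
    """Return the value of $$ at top level and inside a subshell."""
    # The first echo runs in the shell itself, the second inside ( ... ).
    script = 'echo "$$"; (echo "$$")'
    result = subprocess.run(
        [shell, "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    outer, inner = result.stdout.split()
    return int(outer), int(inner)


if __name__ == "__main__":
    outer, inner = pids_seen_by_shell()
    print(outer, inner, outer == inner)
```

Both numbers come out the same even if the shell really did fork for the subshell, which is why $$ on its own cannot tell you whether a new process was created.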

One useful command for testing your intuitions is:

strace -e trace=process -f $SHELL -c ' ... '

That will print to standard error all process-related events (and no others) for the command ... you run in a new shell. You can see what does and does not run in a new process, and where execs occur.

Another possibly-useful command is pstree -h, which will print out and highlight the tree of parent processes of the current process. You can see how many layers deep you are in the output.
