Ubuntu – Why pipes are used instead of input redirection

output pipe redirect

I'm new to Linux systems and I can't really understand why we need two operators that can redirect output: the pipe | and the output redirection operator >. Can't we just always use the second? Most of the time I see that the pipe is used if multiple commands are chained together. If, however, the output is redirected to a file, as in echo 'hello' > filename, the output redirection operator is used. What am I missing here?

Best Answer

The key point to remember is that a pipe is an inter-process communication device that allows two processes (and that's what commands really are) to exchange data, while redirection operators manipulate where a particular process reads from or writes to, typically a file.
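To make the distinction concrete, here is a minimal sketch. The temporary directory and the use of tr as the second command are assumptions chosen only for illustration: the right-hand side of > is treated as a file name, while the right-hand side of | is started as a process.

```shell
#!/bin/sh
# Scratch directory so the example does not touch existing files (illustrative).
workdir=$(mktemp -d)

# Redirection: the word after > names a file that receives the output.
echo 'hello' > "$workdir/filename"
from_file=$(cat "$workdir/filename")

# Pipe: the word after | names a command that is run as a second process
# and reads the first command's output on its standard input.
from_pipe=$(echo 'hello' | tr '[:lower:]' '[:upper:]')

echo "$from_file"   # contents of the file: hello
echo "$from_pipe"   # output of the second process: HELLO

rm -rf "$workdir"
```

Writing echo 'hello' > tr would not run tr at all; it would simply create a file named tr in the current directory.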

In the video Unix Pipeline, Brian Kernighan, co-creator of the awk language and one of the original people who worked on AT&T Unix, explains:

First, you don't have to write one big massive program - you've got existing smaller programs that may already do parts of the job...Another is that it's possible that the amount of data you're processing would not fit if you stored it in a file...because remember, we're back in the days when disks on these things had, if you were lucky, a Megabyte or two of data...So the pipeline never had to instantiate the whole output

As you can see, in the context in which pipelines were created, they were not just a communication device; they also saved storage space and simplified development. Sure, we could use output/input redirection for everything (especially nowadays, with storage capacity in the range of terabytes), but that would be inefficient in terms of both storage and processing speed. Remember that with | you're feeding the output of one command directly to another. Consider something like command1 | grep 'something'. If you first write the output of command1 to a file, it takes time to write everything, and only then can grep go through the whole file. With a pipeline, both processes run at the same time and the output is buffered (meaning that the left-side process pauses when the buffer is full, until the right-side process is ready to read again), so the output goes directly from one command to the other, saving time.
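The two approaches can be compared side by side. This is only a sketch: produce is a hypothetical stand-in for command1, and the three sample lines are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical stand-in for command1: prints three sample lines.
produce() {
    printf '%s\n' 'something old' 'nothing here' 'something new'
}

# Approach 1: redirection only. The whole output must land in a
# temporary file before grep can start reading it.
tmpfile=$(mktemp)
produce > "$tmpfile"
via_file=$(grep 'something' < "$tmpfile")
rm -f "$tmpfile"

# Approach 2: pipeline. Both commands run concurrently and the data
# never has to exist as a file on disk.
via_pipe=$(produce | grep 'something')

printf '%s\n' "$via_pipe"
```

Both variables end up holding the same two matching lines; the difference is that the first approach needs disk space for the full output and runs the two commands one after the other.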

It is worth noting that for inter-process communication there is also the named pipe: one command can write to it with the > operator, and another can read from it with <. That is the use case where you do want a particular destination on the filesystem that multiple scripts or commands can agree on and write to. But when that is unnecessary, the anonymous pipe | is all you really need.
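A named pipe can be tried out with mkfifo. The sketch below assumes a throwaway directory and a FIFO called demo.fifo; both names are illustrative only.

```shell
#!/bin/sh
# Create the named pipe in a private scratch directory (illustrative path).
dir=$(mktemp -d)
fifo="$dir/demo.fifo"
mkfifo "$fifo"

# Writer: uses ordinary > redirection. Opening a FIFO for writing blocks
# until a reader opens it, so the writer runs in the background.
echo 'hello' > "$fifo" &

# Reader: uses ordinary < redirection on the same filesystem path.
received=$(cat < "$fifo")

wait            # let the background writer finish
rm -rf "$dir"   # remove the FIFO and its directory

echo "$received"
```

The writer and the reader here could just as well be two separate scripts started from different terminals, as long as they agree on the path of the FIFO.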
