How Do Pipelines Limit Memory Usage? – Detailed Explanation


Brian Kernighan explains in this video that the early Bell Labs attraction to small languages and programs stemmed from memory limitations:

A big machine would be 64 k-bytes–K, not M or G–and so that meant any individual program could not be very big, and so there was a natural tendency to write small programs, and then the pipe mechanism, basically input output redirection, made it possible to link one program to another.

But I don't understand how this could limit memory usage, given that the data has to be stored in RAM to be transmitted between programs.

From Wikipedia:

In most Unix-like systems, all processes of a pipeline are started at the same time [emphasis mine], with their streams appropriately connected, and managed by the scheduler together with all other processes running on the machine. An important aspect of this, setting Unix pipes apart from other pipe implementations, is the concept of buffering: for example a sending program may produce 5000 bytes per second, and a receiving program may only be able to accept 100 bytes per second, but no data is lost. Instead, the output of the sending program is held in the buffer. When the receiving program is ready to read data, the next program in the pipeline reads from the buffer. In Linux, the size of the buffer is 65536 bytes (64KB). An open source third-party filter called bfr is available to provide larger buffers if required.

This confuses me even more, as this completely defeats the purpose of small programs (though they would be modular up to a certain scale).

The only answer I can think of to my first question (that the memory limitations are problematic depending on the size of the data) is that large data sets simply weren't computed back then, and that the real problem pipelines were meant to solve was the amount of memory required by the programs themselves. But given the bolded text in the Wikipedia quote, even this confuses me: the programs are not run one at a time.

All this would make a great deal of sense if temp files were used, but it's my understanding that pipes do not write to disk (unless swap is used).

Example:

sed 'simplesubstitution' file | sort | uniq > file2

It's clear to me that sed is reading in the file and spitting it out line by line. But sort, as BK states in the linked video, is a full stop, so all of the data has to be read into memory (or does it?), then it's passed on to uniq, which (to my mind) would be a one-line-at-a-time program. But between the first and second pipe, all the data has to be in memory, no?
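For what it's worth, the streaming difference is easy to probe (a quick sketch; stdbuf is a GNU coreutils assumption, used here to defeat stdio output buffering):

```shell
# uniq is streaming: it emits the first "y" as soon as it has read it,
# and head then terminates the pipeline. Replacing uniq with sort would
# hang forever, because sort cannot emit anything before end of input.
yes | stdbuf -oL uniq | head -n 1
```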

Best Answer

The data doesn’t need to be stored in RAM. Pipes block their writers if the readers aren’t there or can’t keep up; under Linux (and most other implementations, I imagine) there’s some buffering but that’s not required. As mentioned by mtraceur and JdeBP (see the latter’s answer), early versions of Unix buffered pipes to disk, and this is how they helped limit memory usage: a processing pipeline could be split up into small programs, each of which would process some data, within the limits of the disk buffers. Small programs take less memory, and the use of pipes meant that processing could be serialised: the first program would run, fill its output buffer, be suspended, then the second program would be scheduled, process the buffer, etc.

Modern systems are orders of magnitude larger than the early Unix systems, and can run many pipes in parallel; but for huge amounts of data you’d still see a similar effect (and variants of this kind of technique are used for “big data” processing).

In your example,

sed 'simplesubstitution' file | sort | uniq > file2

sed reads data from file as necessary, then writes it as long as sort is ready to read it; if sort isn’t ready, the write blocks. The data does indeed live in memory eventually, but that’s specific to sort, and sort is prepared to deal with any issues (it will use temporary files if the amount of data to sort is too large).
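GNU sort exposes this spill-to-disk behaviour directly, which makes the point easy to check (a sketch; -S and -T are GNU sort options, and shuf is also GNU coreutils):

```shell
# Cap sort's in-memory buffer at 1 MiB (-S) and direct its temporary
# spill files to a chosen directory (-T). With far more input than fits
# in the buffer, sort merges sorted runs from disk instead of holding
# everything in RAM; the result is the same either way.
seq 2000000 | shuf | sort -n -S 1M -T /tmp | tail -n 1
# prints 2000000
```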

You can see the blocking behaviour by running

strace seq 1000000 -1 1 | (sleep 120; sort -n)

This produces a fair amount of data and pipes it to a process which isn’t ready to read anything for the first two minutes. You’ll see a number of write operations go through, but very quickly seq will stop and wait for the two minutes to elapse, blocked by the kernel (the write system call waits).
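If strace isn’t to hand, the same blocking can be timed directly (a sketch; the 64 KiB figure assumes Linux’s default pipe buffer, and the time keyword assumes bash):

```shell
# The writer pushes 70 000 bytes into the pipe: the first 65 536 fill
# the kernel buffer, then the write blocks until the sleeping reader
# wakes up and drains it. wc confirms that no data was lost.
time ( head -c 70000 /dev/zero | { sleep 2; wc -c; } )
# wc prints 70000, and the pipeline takes about two seconds overall
```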
