Shell read Command – Why Exit 1 on EOF Encounter


The bash man page says the following about the read builtin:

The exit status is zero, unless end-of-file is encountered

This recently bit me because I had the -e option set and was using the following code:

read -rd '' json <<EOF
{
    "foo":"bar"
}
EOF

I just don't understand why it would be desirable to return a non-zero exit status in this scenario. In what situation would this be useful?

Best Answer

read reads a record (line by default, but ksh93/bash/zsh allow other delimiters with -d, even NUL with zsh/bash) and returns success as long as a full record has been read.

read returns non-zero when it finds EOF while the record delimiter has still not been encountered.
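A minimal sketch of that rule, using the default newline delimiter. The variable names are illustrative and not part of the original answer. The same kind of text is read twice, once with its terminating newline and once without:

```bash
# Record terminated by a newline: read finds the delimiter and returns 0.
full=$(printf 'complete\n' | { IFS= read -r line; echo "status=$? line=$line"; })

# Same kind of data without the trailing newline: read hits EOF first and
# returns non-zero, but the text read so far is still stored in the variable.
partial=$(printf 'partial' | { IFS= read -r line; echo "status=$? line=$line"; })

echo "$full"
echo "$partial"
```

In bash this prints `status=0 line=complete` followed by `status=1 line=partial`.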

That allows you to do things like

while IFS= read -r line; do
  ...
done < text-file

Or with zsh/bash

while IFS= read -rd '' nul_delimited_record; do
  ...
done < null-delimited-list

And have that loop exit after the last record has been read.

You can still check if there was more data after the last full record with [ -n "$nul_delimited_record" ].
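That check is often folded into the loop condition itself. The sketch below uses newline-delimited input rather than NUL-delimited input, and the helper name `input` is made up for the example. It compares a plain loop with one that also processes an unterminated final record:

```bash
# Input whose last record lacks a trailing newline.
input() { printf 'one\ntwo\nthree'; }

# Plain loop: the unterminated "three" is read, but read returns non-zero,
# so the loop body never runs for it.
plain=$(input | while IFS= read -r line; do printf '[%s]' "$line"; done)

# Loop that also accepts a non-empty partial record left over at EOF.
lenient=$(input | while IFS= read -r line || [ -n "$line" ]; do printf '[%s]' "$line"; done)

echo "plain:   $plain"
echo "lenient: $lenient"
```

The first loop reports `[one][two]` and the second `[one][two][three]`.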

In your case, read's input doesn't contain any record as it doesn't contain any NUL character. In bash, it's not possible to embed a NUL inside a here document. So read fails because it hasn't managed to read a full record. It still stores what it has read until EOF (after IFS processing) in the json variable.
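One possible way to keep the here-document idiom while running with the -e option, assuming bash: since the non-zero status is expected here, ignore it explicitly and check the variable instead. This is a sketch, not the only option. The quoted `'EOF'` delimiter is a choice made for this example so that the JSON is not subject to expansion.

```bash
#!/usr/bin/env bash
set -e

# No NUL ever arrives, so read always reports EOF here; "|| true" keeps
# errexit from aborting the script on that expected non-zero status.
read -rd '' json <<'EOF' || true
{
    "foo":"bar"
}
EOF

# The text was stored regardless; verify that before using it.
[ -n "$json" ] || { echo "here document was empty" >&2; exit 1; }
printf '%s\n' "$json"
```

Because `$IFS` is left at its default here, leading and trailing whitespace (including the final newline) is stripped from the stored value. Prefix the command with `IFS=` to keep it.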

In any case, using read without setting $IFS rarely makes sense.

For more details, see Understanding "IFS= read -r line".

Related Solutions

Shell Parallelism – Executing Piped Commands in Parallel

A problem with split --filter is that the output can be mixed up, so you get half a line from process 1 followed by half a line from process 2.

GNU Parallel guarantees there will be no mixup.

So assume you want to do:

 A | B | C

But B is terribly slow, so you want to parallelize it. Then you can do:

A | parallel --pipe B | C

By default, GNU Parallel splits the input on \n and uses a block size of 1 MB. This can be adjusted with --recend and --block.
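As a rough illustration of --block, the sketch below counts lines block by block. It assumes GNU Parallel is installed (other programs named `parallel` do not support --pipe) and uses `seq` and `wc -l` as stand-ins for A and B. The 10k block size is arbitrary.

```bash
# Sketch only: requires GNU Parallel; other "parallel" programs lack --pipe.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # Split stdin into blocks of about 10 kB at newline boundaries, count the
  # lines of each block in its own wc process, then add up the counts.
  total=$(seq 1 10000 | parallel --pipe --block 10k wc -l |
    awk '{ sum += $1 } END { print sum }')
else
  total=skipped
fi
echo "lines counted: $total"
```

Since blocks end on record boundaries, no line is split between two `wc` processes, and the per-block counts add up to the 10000 input lines.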

You can find more about GNU Parallel at: http://www.gnu.org/s/parallel/

You can install GNU Parallel in just 10 seconds with:

$ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
   fetch -o - http://pi.dk/3 ) > install.sh
$ sha1sum install.sh | grep 67bd7bc7dc20aff99eb8f1266574dadb
12345678 67bd7bc7 dc20aff9 9eb8f126 6574dadb
$ md5sum install.sh | grep b7a15cdbb07fb6e11b0338577bc1780f
b7a15cdb b07fb6e1 1b033857 7bc1780f
$ sha512sum install.sh | grep 186000b62b66969d7506ca4f885e0c80e02a22444
6f25960b d4b90cf6 ba5b76de c1acdf39 f3d24249 72930394 a4164351 93a7668d
21ff9839 6f920be5 186000b6 2b66969d 7506ca4f 885e0c80 e02a2244 40e8a43f
$ bash install.sh

Watch the intro video on http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Shell – Why Ctrl-D (EOF) Exits the Shell

The ^D character (also known as \04 or 0x4, END OF TRANSMISSION in Unicode) is the default value for the eof special control character parameter of the terminal or pseudo-terminal driver in the kernel (more precisely of the tty line discipline attached to the serial or pseudo-tty device). That's the c_cc[VEOF] of the termios structure passed to the TCSETS/TCGETS ioctl one issues to the terminal device to affect the driver behaviour.

The typical command that sends those ioctls is the stty command.

To retrieve all the parameters:

$ stty -a
speed 38400 baud; rows 58; columns 191; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O;
min = 1; time = 0;
-parenb -parodd cs8 -hupcl -cstopb cread -clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff -iuclc -ixany -imaxbel iutf8
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke

That eof parameter is only relevant when the terminal device is in icanon mode.
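To look at just those two settings, something like the following can be used. The function name `show_eof_setting` is made up for this sketch, and it relies on `grep -o`, which is widely available but not required by POSIX.

```bash
# Print the eof character and the canonical-mode flag of the terminal on stdin.
show_eof_setting() {
  if [ -t 0 ]; then
    stty -a | grep -o 'eof = [^;]*'        # e.g. "eof = ^D"
    stty -a | grep -o -- '-\{0,1\}icanon'  # "icanon" if enabled, "-icanon" if not
  else
    echo "stdin is not a terminal"
  fi
}

show_eof_setting
```

The eof character itself can be changed with `stty eof` followed by the new character.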

In that mode, the terminal driver (not the terminal emulator) implements a very simple line editor, where you can type Backspace to erase a character, Ctrl-U to erase the whole line... When an application reads from the terminal device, it sees nothing until you press Return at which point the read() returns the full line including the last LF character (by default, the terminal driver also translates the CR sent by your terminal upon Return to LF).

Now, if you want to send what you typed so far without pressing Enter, that's where you can enter the eof character. Upon receiving that character from the terminal emulator, the terminal driver submits the current content of the line, so that the application doing the read on it will receive it as is (and it won't include a trailing LF character).

Now, if the current line was empty, and provided the application has fully read the previously entered lines, the read will return 0 characters.

That signifies end of file to the application (when you read from a file, you read until there's nothing more to be read). That's why it's called the eof character, because sending it causes the application to see that no more input is available.

Now, modern shells do not put the terminal in icanon mode at their prompt, because they implement their own line editor, which is much more advanced than the one built into the terminal driver. However, to avoid confusing users, their line editor gives the ^D character (or, with some shells, whatever the terminal's eof setting is) the same meaning: to signify eof.
