A while loop and a here-document – what happens when

control flow, here-document, read, text processing, variable

I have this while loop and here-document combo which I run in Bash 4.3.48(1) and I don't understand its logic at all.

while read file; do source ~/unwe/"$file"
done <<-EOF
    x.sh
    y.sh
EOF

My question consists of these parts:

  1. What does read do here? I have always used read to declare a variable and assign its value interactively, so I don't see what it is supposed to do here.

  2. What is the meaning of while read? Where does the concept of while come in here?

  3. If the here-document itself comes after the loop, how is it even affected by the loop? I mean, it comes after done, and not inside the loop, so what's the actual association between these two structures?

  4. Why does this fail?

    while read file; do source ~/unwe/"$file" done <<-EOF
        x.sh
        y.sh
    EOF
    

    I mean, done is done… So why does it matter if done <<-EOF is on the same line as the loop? If I recall correctly, I did have a case in which a for loop was a one-liner and still worked.

Best Answer

  1. The read command reads from its standard input stream and assigns what's read to the variable file (it's a bit more complicated than that, see long discussion here). The standard input stream is coming from the here-document redirected into the loop after done. If nothing is redirected into it, read will read from the terminal, interactively. In this case though, the shell has arranged to connect its input stream to the here-document.
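
As a minimal sketch of point 1, the loop below replaces source with a harmless string append, so it runs without the asker's ~/unwe scripts. The variable seen and the -r option to read are additions for illustration and are not part of the original command.

```shell
# Collect what read receives from the here-document, one line per iteration.
seen=""
while read -r file; do
    seen="$seen[$file]"    # brackets make any stray whitespace visible
done <<EOF
    x.sh
    y.sh
EOF
echo "$seen"
```

The leading blanks in the here-document are removed by read itself (default IFS splitting), not by the here-document operator: <<- strips leading tabs only, not spaces.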

  2. while read will cause the loop to iterate until the read command returns a non-zero exit status. This will happen if there are any errors, or (most commonly) when there is no more data to be read (its input stream is in an end-of-file state).

    The convention is that any utility that wishes to signal an error or "false" or "no" to the calling shell does so by returning a non-zero exit status. A zero exit status signals "true" or "yes" or "no error". This status, should you wish to inspect it, is available in $? (only for the most recently executed utility). The exit status may be used in if statements and while loops or anywhere else a test is required. For example:

    if grep -q 'pattern' file; then ...; fi
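
The same convention can be observed on read directly. This is a small sketch, with the variable names status_with_data and status_at_eof invented for illustration; /dev/null is used as an input that is immediately at end-of-file.

```shell
# read succeeds (exit status 0) while there is a line to read.
status_with_data=0
read -r line <<EOF || status_with_data=$?
some data
EOF

# At end-of-file read fails with a non-zero status; this is what ends "while read".
status_at_eof=0
read -r line </dev/null || status_at_eof=$?

echo "with data: $status_with_data, at EOF: $status_at_eof"
```

The first status is zero and the second is non-zero, so a while loop that uses read as its condition stops once the input is exhausted.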
    
  3. A here-document is a form of redirection. In this case, it's a redirection into the loop. Anything inside the loop could read from it but in this case it's only the read command that does. Do read up on here-documents. If the input was coming from an ordinary file, the last line would have been

    done <filename
    

    Seeing the loop as one single command may make this more intuitive:

    while ...; do ...; done <filename
    

    which is one case of

    somecommand <filename
    

    Some shells also support "here-strings" with <<<"string":

    cat <<<"This is the here-string"
    

    DavidFoerster points out that if either of the two scripts x.sh and y.sh reads from standard input, without explicitly being given data to read from a file or from elsewhere, the data read will actually come from the here-document.

    With an x.sh that contains only read a, this would make the variable a contain the string y.sh, and the y.sh script would never run. This is due to the fact that the standard input is redirected for all commands in the while loop (and also "inherited" by any invoked script or command) and the second line is "consumed" by x.sh before the while loop's read can read it.

    If this behaviour is unwanted, it can be avoided, but it's a bit tricky.
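
One common way to avoid it, which may or may not be what the answer has in mind, is to feed the here-document to the loop on a separate file descriptor and have only the loop's read use it. In this sketch descriptor 3 is an arbitrary choice, the inner read a stands in for a sourced script that reads standard input, and standard input is pointed at /dev/null purely to keep the example non-interactive.

```shell
seen=""
while read -r file <&3; do
    # Stand-in for "source ~/unwe/$file" where the script runs "read a":
    # it reads standard input, which is no longer the here-document.
    read -r a || true
    seen="$seen[$file]"
done 3<<EOF </dev/null
    x.sh
    y.sh
EOF
echo "$seen"    # both names reach the loop
```

Because the here-document is attached to descriptor 3 rather than to standard input, commands inside the loop cannot consume its lines unless they read from descriptor 3 explicitly.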

  4. It fails because there is no ; or newline before done. Without ; or newline before done, the word done will be taken as an argument of source, and the loop will additionally not be properly closed (this is a syntax error).

    It is almost true that any ; may be replaced by a newline (at least when it's a command delimiter). It signals the end of a command, as do |, &, && and || (and probably others that I have forgotten).
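
For comparison, a one-line form of the loop that does parse, with a ; terminating the command before done. As a sketch it appends to a variable named out (a name invented here) instead of sourcing the asker's scripts.

```shell
out=""
while read -r file; do out="$out[$file]"; done <<EOF
    x.sh
    y.sh
EOF
echo "$out"
```

The here-document body still has to start on the line after the one that carries the <<EOF operator; only the loop itself fits on a single line.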
