Bash Syntax – Why Input Redirection Order Can’t Be Reversed in While Loops

bashio-redirectionsyntax

In Bash you can move input redirection operators to the front of a command:

cat <<< "hello"
# equivalent to
<<< "hello" cat

Why are you unable to do the same for while loops?

while read -r line; do echo "$line"; done <<< "hello"
# hello

<<< "hello" while read -r line; do echo "$line"; done
# -bash: syntax error near unexpected token `do'

I find it a bit confusing since you can pipe into a while loop. Am I doing something wrong or was it just a design decision?

Best Answer

It's just a consequence of how the grammar is defined. From the POSIX Shell Grammar specification:

command          : simple_command
                 | compound_command
                 | compound_command redirect_list
                 | function_definition
                 ;

And:

simple_command   : cmd_prefix cmd_word cmd_suffix
                 | cmd_prefix cmd_word
                 | cmd_prefix
                 | cmd_name cmd_suffix
                 | cmd_name
                 ;
[...]
cmd_prefix       :            io_redirect
                 | cmd_prefix io_redirect
                 |            ASSIGNMENT_WORD
                 | cmd_prefix ASSIGNMENT_WORD
                 ;
cmd_suffix       :            io_redirect
                 | cmd_suffix io_redirect
                 |            WORD
                 | cmd_suffix WORD
                 ;

As you can see, with compound commands, redirection is only allowed after, but with simple commands, it is allowed before as well. So, when the shell sees <redirection> foo, foo is treated as a simple command, not a compound command, and while is no longer treated as a keyword:

$ < foo while
bash: while: command not found

Hence, the do is unexpected, since it's only allowed after certain keywords.

So this applies not just to while loops, but most of the ways of setting up compound commands using reserved words:

$ < foo {
bash: {: command not found
$ < foo if
bash: if: command not found
$ < foo for
bash: for: command not found

Related Solutions

Shell – Why does cat x >> x loop

On an older RHEL system I've got, /bin/cat does not loop for cat x >> x. cat gives the error message "cat: x: input file is output file". I can fool /bin/cat by doing this: cat < x >> x. When I try your code above, I get the "looping" you describe. I also wrote a system call based "cat":

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int
main(int ac, char **av)
{
        char buf[4906];
        int fd, cc;
        fd = open(av[1], O_RDONLY);
        while ((cc = read(fd, buf, sizeof(buf))) > 0)
                if (cc > 0) write(1, buf, cc);
        close(fd);
        return 0;
}

This loops, too. The only buffering here (unlike for stdio-based "mycat") is what goes on in the kernel.

I think what's happening is that file descriptor 3 (the result of open(av[1])) has an offset into the file of 0. Filed descriptor 1 (stdout) has an offset of 3, because the ">>" causes the invoking shell to do an lseek() on the file descriptor before handing it off to the cat child process.

Doing a read() of any sort, whether into a stdio buffer, or a plain char buf[] advances the position of file descriptor 3. Doing a write() advances the position of file descriptor 1. Those two offsets are different numbers. Because of the ">>", file descriptor 1 always has an offset greater than or equal to the offset of file descriptor 3. So any "cat-like" program will loop, unless it does some internal buffering. It's possible, maybe even likely, that a stdio implementation of a FILE * (which is the type of the symbols stdout and f in your code) that includes its own buffer. fread() may actually do a system call read() to fill the internal buffer fo f. This may or may not change anything in the insides of stdout. Calling fwrite() on stdout may or may not change anything inside of f. So a stdio-based "cat" might not loop. Or it might. Hard to say without reading through a lot of ugly, ugly libc code.

I did an strace on the RHEL cat - it just does a succession of read() and write() system calls. But a cat doesn't have to work this way. It would be possible to mmap() the input file, then do write(1, mapped_address, input_file_size). The kernel would do all the work. Or you could do a sendfile() system call between the input and output file descriptors on Linux systems. Old SunOS 4.x systems were rumored to do the memory mapping trick, but I don't know if any one has ever done a sendfile-based cat. In either case the "looping" wouldn't happen, as both write() and sendfile() require a length-to-transfer parameter.

Bash Arrays – Using Variables in Array Names

Some ideas:

A "parameter expansion" of a variable value (the ${...} part):
```
echo "arr$COUNTER[0] = ${arr$COUNTER[0]}"
```
will not work. You may get around by using eval (but I do not recommend it):
```
eval echo "arr$COUNTER[0] = \${arr$COUNTER[0]}"
```
That line could be written as this:
```
i="arr$COUNTER[0]"; echo "$i = ${!i}"
```
That is called indirection (the !) in Bash.
A similar issue happens with this line:
```
declare -a "arr$COUNTER=($field)"
```
Which should be split into two lines, and eval used:
```
declare -a "arr$COUNTER"
eval arr$COUNTER\=$ \$field $
```
Again, I do not recommend using eval (in this case).
As you are reading the whole file into the memory of the shell, we may as well use a simpler method to get all lines into an array:
```
readarray -t lines <"VALUES_FILE.txt"
```
That should be faster than calling awk for each line.

An script with all the above could be:

#!/bin/bash
valfile="VALUES_FILE.txt"

readarray -t lines <"$valfile"             ### read all lines in.

line_count="${#lines[@]}"
echo "Total number of lines $line_count"

for ((l=0;l<$line_count;l++)); do
    echo "Counter value is $l"             ### In which line are we?
    echo "Field = ${lines[l]}"             ### kepth only to help understanding.
    k="arr$l"                              ### Build the variable arr$COUNTER
    IFS=" " read -ra $k <<<"${lines[l]}"   ### Split into an array a line.
    eval max=\${#$k[@]}                    ### How many elements arr$COUNTER has?
    #echo "field $field and k=$k max=$max" ### Un-quote to "see" inside.
    for ((j=0;j<$max;j++)); do             ### for each element in the line.
        i="$k[$j]"; echo "$i = ${!i}"      ### echo it's value.
    done
done
echo "The End"
echo

However, still, AWK may be faster, if we could execute what you need in AWK.

A similar processing could be done in awk. Assuming the 6 values will be used as an IP (4 of them) and the other two are a number and an epoch time.

Just a very simple sample of an AWK script:

#!/bin/sh
valfile="VALUES_FILE.txt"
awk '
NF==6 { printf ( "IP: %s.%s.%s.%s\t",$1,$2,$3,$4)
        printf ( "number: %s\t",$5+2)
        printf ( "epoch: %s\t",$6)
        printf ( "\n" )
    }
' "$valfile"

Just make a new question with the details.

Best Answer

Related Solutions

Shell – Why does cat x >> x loop

Bash Arrays – Using Variables in Array Names

Related Question