Shell Text Processing – Why IFS Has No Effect in while IFS= read..

environment-variables, shell, text-processing

I might have something absolutely wrong, but it looks convincing to me that setting IFS as one of the commands in the pre-do/done list has absolutely no effect.
The outer IFS (outside the while construct) prevails in all examples shown in the script below.

What's going on here? Have I got the wrong idea of what IFS does in this situation? I expected the array-split results to be as shown in the "expected" column.


#!/bin/bash
xifs() { echo -n "$(echo -n "$IFS" | xxd -p)"; } # allow for null $IFS 
show() { x=($1) 
         echo -ne "  (${#x[@]})\t |"
         for ((j=0;j<${#x[@]};j++)); do 
           echo -n "${x[j]}|"
         done
         echo -ne "\t"
         xifs "$IFS"; echo
}
data="a  b   c"
echo -e "-----   --  -- \t --------\tactual"
echo -e "outside        \t  IFS    \tinside" 
echo -e "loop           \t Field   \tloop" 
echo -e "IFS     NR  NF \t Split   \tIFS (actual)" 
echo -e "-----   --  -- \t --------\t-----"
IFS=$' \t\n'; xifs "$IFS"; echo "$data" | while         read; do echo -ne '\t 1'; show "$REPLY"; done 
IFS=$' \t\n'; xifs "$IFS"; echo "$data" | while IFS=    read; do echo -ne '\t 2'; show "$REPLY"; done 
IFS=$' \t\n'; xifs "$IFS"; echo "$data" | while IFS=b   read; do echo -ne '\t 3'; show "$REPLY"; done
IFS=" ";      xifs "$IFS"; echo "$data" | while         read; do echo -ne '\t 4'; show "$REPLY"; done 
IFS=" ";      xifs "$IFS"; echo "$data" | while IFS=    read; do echo -ne '\t 5'; show "$REPLY"; done 
IFS=" ";      xifs "$IFS"; echo "$data" | while IFS=b   read; do echo -ne '\t 6'; show "$REPLY"; done
IFS=;         xifs "$IFS"; echo "$data" | while         read; do echo -ne '\t 7'; show "$REPLY"; done 
IFS=;         xifs "$IFS"; echo "$data" | while IFS=" " read; do echo -ne '\t 8'; show "$REPLY"; done 
IFS=;         xifs "$IFS"; echo "$data" | while IFS=b   read; do echo -ne '\t 9'; show "$REPLY"; done
IFS=b;        xifs "$IFS"; echo "$data" | while IFS=    read; do echo -ne '\t10'; show "$REPLY"; done
IFS=b;        xifs "$IFS"; echo "$data" | while IFS=" " read; do echo -ne '\t11'; show "$REPLY"; done
echo -e "-----   --  -- \t --------\t-----"

Output:

-----   --  --   --------       actual   
outside           IFS           inside                assigned   
loop             Field          loop    #              inner
IFS     NR  NF   Split          IFS     #  expected    IFS
-----   --  --   --------       -----   #  ---------  --------
20090a   1  (3)  |a|b|c|        20090a  #                              
20090a   2  (3)  |a|b|c|        20090a  #  |a  b   c|  IFS=
20090a   3  (3)  |a|b|c|        20090a  #  |a  |   c|  IFS=b
20       4  (3)  |a|b|c|        20      #                          
20       5  (3)  |a|b|c|        20      #  |a  b   c|  IFS=
20       6  (3)  |a|b|c|        20      #  |a  |   c|  IFS=b
         7  (1)  |a  b   c|             #                          
         8  (1)  |a  b   c|             #  |a|b|c|     IFS=" "
         9  (1)  |a  b   c|             #  |a  |   c|  IFS=b
62      10  (2)  |a  |   c|     62      #  |a  b   c|  IFS=
62      11  (2)  |a  |   c|     62      #  |a|b|c|     IFS=" "
-----   --  --   --------       -----      ---------   -------                        

Best Answer

(Sorry, long explanation)

Yes, the IFS variable in while IFS=" " read; do … has no effect on the rest of the code.
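This is easy to reproduce in isolation. The following is only a minimal sketch, not the asker's script: the names data, line, left, right, saved_ifs, outer_count and split_by_read are illustrative, and it reads into a named variable instead of REPLY so that it should also run in a plain POSIX sh.

```shell
data="a  b   c"
saved_ifs=$IFS      # remember the current IFS so it can be restored at the end
IFS=' '             # the "outer" IFS: a single space

# IFS=b is in effect for read only. The unquoted $line in the loop body
# is split with the outer IFS, so it yields three fields: a, b and c.
outer_count=$(printf '%s\n' "$data" | while IFS=b read -r line; do
    set -- $line
    printf '%s' "$#"
done)

# To split on "b", let read itself do the splitting into two variables.
split_by_read=$(printf '%s\n' "$data" | while IFS=b read -r left right; do
    printf '[%s][%s]' "$left" "$right"
done)

IFS=$saved_ifs
printf 'fields seen in loop body: %s\n' "$outer_count"
printf 'fields split by read:     %s\n' "$split_by_read"
```

The first loop reports 3 fields even though read was given IFS=b, which matches the "actual" column in the question. The second loop prints [a  ][   c], because there the splitting is done by read itself.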

Let's first clarify that the shell command line features two different kinds of variables:

  • shell variables (which only exist within a shell, and are local to the shell)
  • environment variables, which exist for every process. Those are usually preserved upon fork() and exec(), so child processes inherit them.

When you call a command with:

  A=foo B=bar command

the command is executed within an environment where (environment) variable A is set to foo and B is set to bar. But with this command line, the current shell variables A and B are left unchanged.
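A quick way to check this behaviour, as a hedged sketch: the child process here is simply sh -c, and child_view and shell_view are illustrative names.

```shell
# Make sure A starts out unset in the current shell.
unset A

# Prefix assignment: A exists only in the environment of this one child process.
child_view=$(A=foo sh -c 'printf "%s" "$A"')

# Back in the current shell, A is still unset.
shell_view=${A-unset}

printf 'child saw: %s\n' "$child_view"
printf 'shell has: %s\n' "$shell_view"
```

The child sees foo, while the current shell still reports A as unset.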

This is different from:

A=foo; B=bar; command

Here, shell variables A and B are defined and the command is run without environment variables A and B defined. The values of A and B are inaccessible to command.

However, if some shell variables are export-ed, the corresponding environment variables are synchronized with their respective shell variables. Example:

export A
export B
A=foo; B=bar; command

With this code, both shell variables and the shell environment variables are set to foo and bar. Since environment variables are inherited by sub-processes, command will be able to access their values.
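The difference between a plain shell variable and an exported one can be sketched as follows. This is only an illustration: sh -c stands in for command, and not_exported and exported are made-up names.

```shell
unset A B

# A is a plain shell variable: the child process does not see it.
A=foo
not_exported=$(sh -c 'printf "%s" "${A-unset}"')

# B is marked for export first; the later assignment is passed on to children.
export B
B=bar
exported=$(sh -c 'printf "%s" "${B-unset}"')

printf 'child sees A as: %s\n' "$not_exported"
printf 'child sees B as: %s\n' "$exported"
```

The child reports A as unset and B as bar.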

To jump back to your original question, in:

IFS='a' read

only read is affected. And in fact, in this case, read doesn't care about the value of the IFS variable. It uses IFS only when you ask the line to be split (and stored in several variables), like in:

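# In bash each pipeline stage runs in a subshell, so read and printf are
# grouped with { } to make i, j and k still visible when printf runs.
echo "a :  b :    c" | { IFS=":" read i j k; \
    printf "i is '%s', j is '%s', k is '%s'\n" "$i" "$j" "$k"; }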

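IFS is not used by read unless it is called with arguments. (Edit: This is not exactly true: when read is given a variable name, whitespace characters present in IFS, i.e. space and tab, are always stripped from the beginning and end of the input line. Bash's argument-less read, which fills REPLY, stores the line unmodified.)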
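The whitespace stripping can be seen with a single variable, where no field splitting is visible at all. This is a small sketch with illustrative names (input, v, trimmed, kept), not part of the original answer.

```shell
input='  x y  '

# With a space in IFS, read strips leading and trailing spaces.
trimmed=$(printf '%s\n' "$input" | { IFS=' ' read -r v; printf '[%s]' "$v"; })

# With an empty IFS, the line is stored untouched.
kept=$(printf '%s\n' "$input" | { IFS= read -r v; printf '[%s]' "$v"; })

printf 'with IFS set to a space: %s\n' "$trimmed"
printf 'with an empty IFS:       %s\n' "$kept"
```

The first call yields [x y] and the second [  x y  ], which is why while IFS= read -r line is the usual idiom for reading lines verbatim.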
