Ubuntu – Using “while read…”, echo and printf give different outcomes

bash, command line, echo

According to the question "Using "while read…" in a linux script", this command:

echo '1 2 3 4 5 6' | while read a b c;do echo "$a, $b, $c"; done

outcome:

1, 2, 3 4 5 6

but when I replace echo with printf

echo '1 2 3 4 5 6' | while read a b c ;do printf "%d, %d, %d \n" $a $b $c; done

outcome

1, 2, 3
4, 5, 6

Could someone please tell me what makes these two commands different?
Thanks~

Best Answer

It's not just echo vs printf

First, let's understand what happens in the read a b c part. read performs word splitting based on the value of the IFS variable, which by default is space, tab, and newline, and assigns the resulting words to the variables in order. If there are more words than variables, each of the first variables gets one word, and whatever is left over, separators included, goes into the last variable. Here's what I mean:

bash-4.3$ read a b c <<< "one two three four"
bash-4.3$ echo $a
one
bash-4.3$ echo $b
two
bash-4.3$ echo $c
three four

This is exactly how it is described in bash's manual (see the quote at the end of the answer).

In your case, 1 and 2 go into the a and b variables, and c takes everything else, which is 3 4 5 6.
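To see that rule applied to the exact input from the question, here is a small sketch. show_fields is a made-up helper name, not something from the question; it only puts brackets around each variable so the boundaries are visible:

```shell
# Hypothetical helper: read one line into three names and show each in brackets.
show_fields() {
    read -r a b c
    printf 'a=[%s] b=[%s] c=[%s]\n' "$a" "$b" "$c"
}

# More words than names: the last name gets the leftovers.
printf '%s\n' '1 2 3 4 5 6' | show_fields   # a=[1] b=[2] c=[3 4 5 6]

# Fewer words than names: the remaining names are empty.
printf '%s\n' '1 2' | show_fields           # a=[1] b=[2] c=[]
```

The second call matches the part of the manual that says the remaining names are assigned empty values.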

You will also often see people use while IFS= read -r line; do ... ; done < input.txt to read text files line by line. Again, IFS= is there for a reason: it disables word splitting for read, so the line lands in the variable untouched. With a single variable, read would still put the whole line into line even without it, but it would strip leading and trailing whitespace first. The -r option stops read from treating backslashes as escape characters. But that's another story, which I encourage you to study later, since while IFS= read -r variable is a very frequently used construct.
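A minimal sketch of what IFS= and -r change when reading a whole line. The sample string and the helper names plain_read and raw_read are made up for illustration:

```shell
# A line with leading spaces, trailing spaces, and a backslash in the middle.
sample='   indented \t text   '

# Hypothetical helpers: read one line from stdin and show it in brackets.
plain_read() { read line; printf '[%s]\n' "$line"; }
raw_read()   { IFS= read -r line; printf '[%s]\n' "$line"; }

# Default IFS, no -r: surrounding whitespace is trimmed and the backslash is consumed.
printf '%s\n' "$sample" | plain_read   # [indented t text]

# IFS= and -r: the line arrives exactly as written.
printf '%s\n' "$sample" | raw_read     # [   indented \t text   ]
```

Writing IFS= directly in front of read changes IFS only for that one command, so the rest of the script keeps the default.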

echo vs printf behavior

echo does what you'd expect here. It displays your variables exactly as read has arranged them. This was already demonstrated above.

printf is special, because it keeps reusing the format string until all of its arguments are consumed. So when you do printf "%d, %d, %d \n" $a $b $c, printf sees a format string with 3 decimal conversions, but there are more than 3 arguments (because your unquoted variables expand to the six separate words 1, 2, 3, 4, 5, 6), so the format is applied a second time to 4, 5, and 6. This may sound confusing, but it exists for a reason, as an improvement over the real printf() function in C, which simply ignores extra arguments.
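The format reuse is easy to reproduce with literal arguments, without read involved at all. A short sketch (the variable name reused is just for illustration):

```shell
# Two conversions, four arguments: the format is applied twice.
printf '%s-%s\n' a b c d
# a-b
# c-d

# The format from the question, given six separate arguments.
reused=$(printf '%d, %d, %d\n' 1 2 3 4 5 6)
printf '%s\n' "$reused"
# 1, 2, 3
# 4, 5, 6
```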

Another thing that affects the output is that your variables are not quoted, which allows the shell (not printf) to split them into 6 separate arguments. Compare this with quoting:

bash-4.3$ read a b c <<< "1 2 3 4"
bash-4.3$ printf "%d %d %d\n" "$a" "$b" "$c"
bash: printf: 3 4: invalid number
1 2 3

Because the $c variable is quoted, it is passed as one whole string, 3 4, which doesn't fit the %d format, since that expects a single integer. bash reports the error but still prints 3, the part it was able to convert.

Now do the same without quoting:

bash-4.3$ printf "%d %d %d\n" $a $b $c
1 2 3
4 0 0
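The difference between the quoted and unquoted calls comes down to how many arguments the shell hands over. A rough way to check that without printf's formatting getting in the way is a tiny counting function; count_args is a made-up name, and the sketch assumes IFS still has its default value:

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { printf '%d\n' "$#"; }

a=1
b=2
c='3 4'

count_args "$a" "$b" "$c"   # 3  (quoted: "3 4" stays one argument)
count_args $a $b $c         # 4  (unquoted: the shell splits $c into 3 and 4)
```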

printf again says: "OK, you have 4 items there but the format only takes 3, so I'll reuse the format and fill whatever I cannot match to actual input with a default", which is 0 for %d.
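The padding can be shown with literal arguments as well. In this sketch (variable names are only for illustration) a missing %d argument comes out as 0 and a missing %s argument as an empty string:

```shell
# Four arguments for a three-conversion format: the second pass is padded with zeros.
padded_numbers=$(printf '%d %d %d\n' 1 2 3 4)
printf '%s\n' "$padded_numbers"
# 1 2 3
# 4 0 0

# Same idea with %s: the missing argument becomes an empty string.
padded_strings=$(printf '[%s] [%s]\n' a b c)
printf '%s\n' "$padded_strings"
# [a] [b]
# [c] []
```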

And in all these cases you don't have to take my word for it. Just run strace -e trace=execve and see for yourself what the command actually "sees":

bash-4.3$ strace -e trace=execve printf "%d %d %d\n" $a $b $c
execve("/usr/bin/printf", ["printf", "%d %d %d\\n", "1", "2", "3", "4"], [/* 80 vars */]) = 0
1 2 3
4 0 0
+++ exited with 0 +++

bash-4.3$ strace -e trace=execve printf "%d %d %d\n" "$a" "$b" "$c"
execve("/usr/bin/printf", ["printf", "%d %d %d\\n", "1", "2", "3 4"], [/* 80 vars */]) = 0
1 2 printf: ‘3 4’: value not completely converted
3
+++ exited with 1 +++

Additional notes

As Charles Duffy properly pointed out in the comments, bash has its own built-in printf, which is what you're using in your command; strace, however, runs the external /usr/bin/printf, not the shell's builtin. Aside from minor differences, such as the wording of the error message above, the standard format specifiers and the behavior relevant to this question are the same.
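If you want to check which printf you are getting, something along these lines should work. It assumes an external printf is installed somewhere in PATH (on the system above that is /usr/bin/printf); the variable name external is only for illustration:

```shell
# Ask the shell how it resolves the name; in bash this reports a shell builtin.
type printf

# Going through env bypasses the builtin and runs whichever printf is first in PATH.
external=$(env printf '%d %d %d\n' 1 2 3 4)
printf '%s\n' "$external"
```

On a system like the one in the strace output above, the external command prints the same two lines as the builtin: 1 2 3 and 4 0 0.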

What also should be kept in mind is that printf is far more portable (and therefore preferred) than echo, not to mention that its syntax is familiar from C and other C-like languages that have a printf() function. See this excellent answer by terdon on the subject of printf vs echo. While you can tailor the output to your specific shell on your specific version of Ubuntu, if you are going to port scripts across different systems, you should probably prefer printf to echo. Maybe you're a beginner system administrator working with Ubuntu and CentOS machines, or maybe even FreeBSD, who knows, so in such cases you will have to make choices.
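One small illustration of why echo is the riskier choice: its handling of arguments that look like options depends on the shell. The sketch below describes bash's builtin echo; the variable names are made up:

```shell
# A value that happens to look like an echo option.
var='-n'

# bash's builtin echo treats this as its -n option, so nothing visible is printed.
echo "$var"

# printf only interprets the format string; the value is printed as data.
safe=$(printf '%s\n' "$var")
printf '%s\n' "$safe"   # -n
```

With printf the data always goes through %s, so it cannot be mistaken for an option or an escape sequence.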

Quote from bash manual, SHELL BUILTIN COMMANDS section

read [-ers] [-a aname] [-d delim] [-i text] [-n nchars] [-N nchars] [-p prompt] [-t timeout] [-u fd] [name ...]

One line is read from the standard input, or from the file descriptor fd supplied as an argument to the -u option, and the first word is assigned to the first name, the second word to the second name, and so on, with leftover words and their intervening separators assigned to the last name. If there are fewer words read from the input stream than names, the remaining names are assigned empty values. The characters in IFS are used to split the line into words using the same rules the shell uses for expansion (described above under Word Splitting).