Bash Scripting – EOF vs End-of-Input Signal in sha256sum

I was trying to compute the sha256 of a simple string, namely "abc". I found that using the sha256sum utility like this:

sha256sum file_with_string

gives results identical to:

sha256sum # enter, to read input from stdin
abc
^D

namely:

edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb

Note that a newline (the Enter after abc) was fed to stdin before the end-of-input signal.


What bugged me at first was that when I decided to verify it with an online checksum calculator, the result was different:

ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

I figured it might have something to do with the newline I fed to stdin, so this time I tried pressing ^D twice (instead of entering a newline), with the following result:

abcba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

Now, this is of course poorly formatted (due to the lack of a newline character), but that aside, it matches the one above.

After that, I realized I clearly fail to understand something about input parsing in the shell. I double-checked and there's no redundant newline in the file I specified initially, so why am I experiencing this behavior?

Best Answer

The difference is the newline. First, let's just collect the sha256sums of abc and abc\n:

$ printf 'abc\n' | sha256sum 
edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb  -
$ printf 'abc' | sha256sum 
ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad  -

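So the ba...ad sum is for the string abc, while the ed...cb sum is for abc\n. If your file gives you the ed...cb output, the file ends with a newline. Text files are conventionally expected to end with a trailing newline (POSIX defines a line as ending in one), so most editors add one for you when you create a new file. The same applies at the terminal: pressing Enter before ^D sends abc\n, whereas pressing ^D twice sends abc with no newline, which is why the two sums differ.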
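To check whether a given file really ends with a newline, you can count and dump its bytes. The following is a minimal sketch rather than a definitive recipe; the temporary directory and the names no_newline and with_newline are illustrative assumptions, not part of the original question:

```shell
# Create both variants in a scratch directory (names are hypothetical).
tmpdir=$(mktemp -d)
printf 'abc'   > "$tmpdir/no_newline"
printf 'abc\n' > "$tmpdir/with_newline"

# wc -c counts bytes; tr strips the padding some wc implementations add.
size_without=$(wc -c < "$tmpdir/no_newline" | tr -d ' ')
size_with=$(wc -c < "$tmpdir/with_newline" | tr -d ' ')
echo "$size_without $size_with"   # prints: 3 4

# od -c dumps every byte; a trailing newline is shown as \n.
od -c "$tmpdir/with_newline"

rm -rf "$tmpdir"
```

The one-byte difference in size is the newline, and that single byte is enough to produce a completely different hash.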

To get a file without a newline, use the printf approach above. Note how the file utility reports when a file has no trailing newline:

$ printf 'abc' > file
$ file file
file: ASCII text, with no line terminators

And:

$ printf 'abc\n' > file2
$ file file2
file2: ASCII text

And now:

$ sha256sum file file2
ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad  file
edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb  file2
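
If you cannot easily recreate the file, one possible way to hash its contents without the trailing newline is to pass them through a command substitution, which drops trailing newlines. This is a sketch under the assumption that the file is small, plain text with no NUL bytes; the temporary file and the variable name sum_stripped are illustrative:

```shell
# Scratch file containing abc followed by a newline (path is hypothetical).
tmpfile=$(mktemp)
printf 'abc\n' > "$tmpfile"

# $(cat ...) strips all trailing newlines; printf '%s' adds none back.
# cut keeps only the hash and discards the "  -" filename column.
sum_stripped=$(printf '%s' "$(cat "$tmpfile")" | sha256sum | cut -d ' ' -f 1)
echo "$sum_stripped"

rm -f "$tmpfile"
```

This should print the ba...ad sum shown above for abc. Keep in mind that command substitution removes every trailing newline, not just one, so the approach only suits cases where that is what you want.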