I'm curious about the theory behind how heredocs can be passed as a file to a command line utility.

Recently, I discovered I can pass a file as a heredoc.

For example:

awk '{ split($0, arr, " "); print arr[2] }' <<EOF
foo bar baz
EOF
bar

This is advantageous for me for several reasons:

  • Heredocs improve readability for multi-line inputs.
  • I don't need to memorize each utility's flag for passing the file contents from the command line.
  • I can use single and double quotes in the given files.
  • I can control shell expansion.

For example:

ruby <<EOF
puts "'hello $HOME'"
EOF
'hello /Users/mbigras'

ruby <<'EOF'
puts "'hello $HOME'"
EOF
'hello $HOME'

I'm not clear what is happening.
It seems like the shell thinks the heredoc is a file with contents equal to the value of the heredoc.
I've seen this technique used with cat, but I'm still not sure what is going on:

cat <<EOL
hello world
EOL
hello world

I know cat prints the contents of a file, so presumably this heredoc is a temporary file of some kind.

I'm confused about what precisely is going on when I "pass a heredoc to a command line program".

Here's an example using ansible-playbook.
I pass the utility a playbook as a heredoc; however, it fails, as shown using echo $?:

ansible-playbook -i localhost, -c local <<EOF &>/dev/null
- hosts: all
  gather_facts: false
  tasks:
    - name: Print something
      debug:
        msg: hello world
EOF
echo $?

However, if I pass the utility the same heredoc but precede it with /dev/stdin, it succeeds:

ansible-playbook -i localhost, -c local /dev/stdin <<EOF &>/dev/null
- hosts: all
  gather_facts: false
  tasks:
    - name: Print something
      debug:
        msg: hello world
EOF
echo $?

  • What precisely is going on when one "passes a heredoc as a file"?
  • Why does the first version with ansible-playbook fail but second version succeed?
  • What is the significance of passing /dev/stdin before the heredoc?
  • Why do other utilities like ruby or awk not need the /dev/stdin before the heredoc?

Best Answer

What precisely is going on when one "passes a heredoc as a file"?

You aren't. Here-documents provide standard input, like a pipe. Your example

awk '{ ... }' <<EOF
foo bar baz
EOF

is, as far as the input awk receives is concerned, equivalent to

echo foo bar baz | awk '{ ... }'

awk, cat, and ruby all read from standard input if they aren't given a filename to read from on the command line. That is an implementation choice.
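
As a rough illustration of that point, the sketch below feeds the same line to awk once through a here-document and once through a pipe, then does the same with a bare cat. The variable names are only for illustration; the awk program is a shortened form of the one in the question.

```shell
# Feed the same line to awk two ways: via a here-document and via a pipe.
from_heredoc=$(awk '{ print $2 }' <<EOF
foo bar baz
EOF
)
from_pipe=$(echo foo bar baz | awk '{ print $2 }')

echo "$from_heredoc"   # second field of the here-document line
echo "$from_pipe"      # second field of the piped line

# cat with no filename argument simply copies its standard input.
from_cat=$(cat <<EOF
hello world
EOF
)
echo "$from_cat"
```

In both cases the program never sees a filename; it only reads whatever the shell has attached to file descriptor 0.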

Why does the first version with ansible-playbook fail but second version succeed?

ansible-playbook does not read from standard input by default, but requires a file path instead. This is a design choice.

/dev/stdin is quite likely a symlink to /dev/fd/0, which is a way of talking about the current process's file descriptor #0 (standard input). That's something exposed by your kernel (or system library). The ansible-playbook command opens /dev/stdin like a regular filesystem file and ends up reading its own standard input, which would otherwise have been ignored.

You likely also have /dev/stdout and /dev/stderr links to FDs 1 & 2, which you can use as well if you're telling something where to put its output.
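
The behaviour can be imitated without ansible. In this sketch, needs_file is a hypothetical shell function standing in for any tool that insists on a file path and never reads standard input on its own; it assumes a system that provides /dev/stdin.

```shell
# needs_file is a hypothetical stand-in for a tool that, like ansible-playbook,
# requires a file path argument and ignores standard input otherwise.
needs_file() {
  if [ "$#" -ne 1 ]; then
    echo "usage: needs_file PATH" >&2
    return 2
  fi
  # Open the given path like any ordinary file and print its contents.
  cat "$1"
}

# Without a path, the here-document is never read and the call fails.
status_without_path=0
needs_file <<EOF 2>/dev/null || status_without_path=$?
- hosts: all
EOF

# With /dev/stdin as the path, opening that "file" reads the here-document.
content=$(needs_file /dev/stdin <<EOF
- hosts: all
EOF
)
status_with_path=$?

echo "without path: exit $status_without_path"
echo "with /dev/stdin: exit $status_with_path, read: $content"
```

The here-document is attached to standard input in both calls; only the second call gives the function a path that leads back to it.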

What is the significance of passing /dev/stdin before the heredoc?

It is an argument to the ansible-playbook command.

Why do other utilities like ruby or awk not need the /dev/stdin before the heredoc?

They read from standard input by default as a design choice, because they are made to be used in pipelines. They write to standard output for the same reason.
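
Because a here-document is just standard input, it can sit at the head of a pipeline like any other input source. A small sketch, with made-up sample data:

```shell
# The here-document feeds awk's standard input; awk's standard output
# feeds sort through the pipe.
sorted=$(awk '{ print $2 }' <<EOF | sort
a pear
b apple
c fig
EOF
)
echo "$sorted"
```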

