How can a pipe producer tell the pipe consumer that it has reached "End of File"? (unnamed pipe, not named pipe)

here-document, here-string, pipe

I have an application which requires a producer to send filenames to a consumer, and which needs the producer to tell the consumer when the last filename has been sent and end-of-file has been reached.

For simplicity, in the following examples the producer is represented by echo and printf, and the consumer by cat. I have tried, without success, to extrapolate the "here file" method, using <<EOF to tell the producer-wrapper (if such a thing exists) what to look for as the end-of-file marker. If it worked, cat should filter EOF from the output.

Ex 1)

input

{
echo "Hello World!" 
printf '\x04' 
echo "EOF"
} <<EOF |\
cat

output

bash: warning: here-document at line 146 delimited by end-of-file (wanted `EOF')
Hello World!
EOF

Ex 2)

input

{ 
echo "Hello World!" 
printf '\x04' 
echo "EOF"
} |\
cat <<EOF

output

bash: warning: here-document at line 153 delimited by end-of-file (wanted `EOF')

Is it correct that the "here files" method for indicating delimiter only works for static text, and not dynamically created text?

— the actual application —

inotifywait -m --format '%w%f' /Dir |  <consumer>

The consumer is waiting for files to be written to directory /Dir.
It would be nice if, when a file "/Dir/EOF" is written, the consumer detected a logical end-of-file condition, simply by writing the shell script as follows:

inotifywait -m --format '%w%f' /Dir |<</Dir/EOF  <consumer>

— In response to Giles answer —

Is it theoretically possible to implement

cat <<EOF
hello
world
EOF

as

SpecialSymbol="EOF"
{
    echo hello
    echo world
    echo "$SpecialSymbol"
} |\
while read -r Line; do
  # Stop at the marker line instead of passing it on
  if [[ $Line == "$SpecialSymbol" ]]; then
    break
  else
    echo "$Line"
  fi
done |\
cat

By theoretically possible I mean "would it support existing usage patterns and only enable extra usage patterns which had previously been illegal syntax?" – meaning no existing legal code would be broken.

Best Answer

For a pipe, the end of file is seen by the consumer(s) once all the producers have closed their file descriptor to the pipe and the consumer has read all the data.

So, in:

{
  echo foo
  echo bar
} | cat

cat will see end-of-file as soon as the second echo terminates and cat has read both foo\n and bar\n. There's nothing more for you to do.
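
To see this from the consumer's side, here is a minimal sketch in which a read loop stands in for cat. The produce function and the file names are hypothetical and only serve the illustration; the point is that the loop ends without any marker, simply because the producer exited and closed the pipe.

```shell
# Hypothetical producer: prints three file names, then exits,
# which closes the only write end of the pipe.
produce() {
  printf '%s\n' a.txt b.txt c.txt
}

# Consumer: read returns non-zero at end-of-file, which ends the loop.
count=$(
  produce | {
    n=0
    while IFS= read -r name; do
      n=$((n + 1))
    done
    echo "$n"
  }
)
echo "names read before end-of-file: $count"
```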

One thing to bear in mind, though: if any command on the left side of the pipe starts a background process, that process inherits a file descriptor to the pipe (its stdout), so cat will not see end-of-file until that process also exits or closes its stdout. For example:

{
  echo foo
  sleep 10 &
  echo bar
} | cat

You will see that cat does not return until 10 seconds have passed.

Here, you may want to redirect sleep's stdout to something else like /dev/null if you don't want its (non)output to be fed to cat:

{
  echo foo
  sleep 10 > /dev/null &
  echo bar
} | cat
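
The difference between the last two examples can be measured with a rough timing sketch. The sleep is shortened to 2 seconds here purely to keep the run short, and the variable names held and released are made up for this illustration.

```shell
# Case 1: the background sleep inherits the pipe as its stdout,
# so cat cannot see end-of-file until sleep exits.
start=$(date +%s)
{ echo foo; sleep 2 & echo bar; } | cat > /dev/null
held=$(( $(date +%s) - start ))

# Case 2: sleep's stdout points at /dev/null, so the last write end
# of the pipe closes as soon as the left-hand subshell exits.
start=$(date +%s)
{ echo foo; sleep 2 > /dev/null & echo bar; } | cat > /dev/null
released=$(( $(date +%s) - start ))

echo "background process holding the pipe: ${held}s"
echo "background process redirected:       ${released}s"
```

The first figure should be about 2 seconds and the second close to zero, since date +%s only has one-second resolution.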

If you want the writing end of the pipe to be closed before the last command in the subshell on the left of the | runs, you can close stdout, or redirect it elsewhere, partway through that subshell with exec, like:

{
  echo foo
  exec > /dev/null
  sleep 10
} | (cat; echo "cat is now gone")

However, note that most shells will still wait for that subshell in addition to the cat command. So while you'll see "cat is now gone" straight away (after foo is read), you'll still have to wait 10 seconds for the whole pipeline to finish. Of course, in the example above, it would make more sense to write it:

echo foo | cat
sleep 10
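
As a rough check of the exec behaviour described above, the following sketch records when cat returns and when the whole pipeline returns. The 2-second sleep, the temporary file, and the variable names are assumptions made for the illustration.

```shell
tmp=$(mktemp)        # scratch file for the time recorded on the right-hand side
start=$(date +%s)
{
  echo foo
  exec > /dev/null   # replace stdout, dropping this subshell's write end of the pipe
  sleep 2
} | {
  cat > /dev/null
  # cat has returned, so it has seen end-of-file; record when that happened.
  echo $(( $(date +%s) - start )) > "$tmp"
}
total=$(( $(date +%s) - start ))
eof_after=$(cat "$tmp")
rm -f "$tmp"
echo "cat saw end-of-file after ${eof_after}s; pipeline finished after ${total}s"
```

In a shell that waits for every part of the pipeline, cat should return almost immediately while the pipeline as a whole takes about 2 seconds.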

<<ANYTHING...content...ANYTHING is a here-document: it makes the stdin of the command a file that contains the content. It wouldn't be useful here. \4 (Ctrl-D) is the byte that, when typed at a terminal in canonical mode, makes the terminal device flush the data it holds to the application reading from it (and when there is no data, read() returns 0, which means end-of-file). It has no special meaning in a pipe, so again it is of no use here.
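
For the inotifywait case in the question, where the producer never exits on its own, one option is to let the consumer decide where the logical end-of-file is, much like the while read loop proposed in the question. This is only a sketch and not part of the answer above: the produce function is a hypothetical stand-in for inotifywait -m --format '%w%f' /Dir, and /Dir/EOF is the marker name taken from the question.

```shell
# Hypothetical stand-in for: inotifywait -m --format '%w%f' /Dir
produce() {
  printf '%s\n' /Dir/a.txt /Dir/b.txt /Dir/EOF /Dir/late.txt
}

sentinel=/Dir/EOF   # assumed marker file name, taken from the question

seen=$(
  produce | while IFS= read -r path; do
    # Reaching the sentinel is the consumer's logical end-of-file.
    [ "$path" = "$sentinel" ] && break
    printf '%s\n' "$path"
  done
)
printf '%s\n' "$seen"
```

With a real inotifywait -m on the left, the producer keeps running after the loop breaks; it is only killed by SIGPIPE when it next tries to write to the now-closed pipe.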
