Ubuntu – Why does ‘ls > ls.out’ cause ‘ls.out’ to be included in list of names

bash, command line, redirect

Why does $ ls > ls.out cause 'ls.out' to be included in the list of names of files in the current directory? Why was this behavior chosen, and not some other?

Best Answer

When the shell evaluates the command, it resolves the > redirection first, so by the time ls runs the output file has already been created.

This is also why reading from and writing to the same file with a > redirection in a single command empties the file. By the time the command runs, the file has already been truncated:

$ echo foo >bar
$ cat bar
foo
$ <bar cat >bar
$ cat bar
$ 
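The ordering can also be checked directly. The following is a minimal sketch, assuming a POSIX shell and a throwaway directory from mktemp; the names demo_dir, out.txt, created.txt and no_such_command_for_demo are made up for illustration:

```shell
#!/bin/sh
# Hypothetical scratch directory, so the demo never touches real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# out.txt is created before the inner command starts, so the check sees it.
sh -c 'if [ -e out.txt ]; then echo exists; else echo missing; fi' > out.txt
cat out.txt

# The target is created even if the command cannot be found at all.
# (no_such_command_for_demo is a made-up name.)
no_such_command_for_demo > created.txt 2>/dev/null || true
ls created.txt

# Reading and writing the same file: bar is truncated before cat reads it.
echo foo > bar
cat < bar > bar || true
wc -c < bar
```

If the redirection were applied only after the command started, the first check would report the file as missing and created.txt would never appear.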

Tricks to avoid this:

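  • cat <<<"$(ls)" > ls.out (works for any command that needs to run before the redirection is resolved)

    Redirections are processed left to right, and the command substitution inside the here-string is expanded before the > redirection that follows it is performed, so ls runs before ls.out is created. In bash the leading cat is required: a bare <<<"$(ls)" > ls.out has no command to copy the input and leaves ls.out empty (zsh, by default, supplies cat for a redirection without a command).

    $ ls
    bar  foo
    $ cat <<<"$(ls)" > ls.out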
    $ cat ls.out 
    bar
    foo
    
  • ls | sponge ls.out (works for any command that needs to run before the redirection is resolved)

    sponge reads all of its input before it opens the output file, so ls has finished before ls.out is created (sponge is provided by the moreutils package):

    $ ls
    bar  foo
    $ ls | sponge ls.out
    $ cat ls.out 
    bar
    foo
    
  • ls * > ls.out (works for ls > ls.out's specific case)

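    Filename expansion is performed before the redirection is resolved, so ls receives the expanded names as arguments, and ls.out is not among them. This only holds if ls.out does not exist yet; note also that ls * lists the contents of any subdirectories rather than their names (ls -d * avoids that):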

    $ ls
    bar  foo
    $ ls * > ls.out
    $ cat ls.out 
    bar
    foo
    $
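The same idea, running the command before the output file exists, can also be written without here-strings or sponge. This is only a sketch under stated assumptions: a POSIX shell, mktemp available, and the hypothetical names demo_dir, listing, first_result and tmp_file.

```shell
#!/bin/sh
# Hypothetical scratch directory containing two sample files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch bar foo

# Variant 1: capture first, write second.
# $(ls) has finished before the '>' on the next line creates ls.out.
listing=$(ls)
printf '%s\n' "$listing" > ls.out
first_result=$(cat ls.out)

# Start again from a clean directory for the second variant.
rm ls.out

# Variant 2: write to a temporary file outside the listed directory,
# then move it into place once ls is done.
tmp_file=$(mktemp)
ls > "$tmp_file"
mv "$tmp_file" ls.out
cat ls.out
```

Variant 1 holds the whole output in memory, so it suits short listings; variant 2 is closer to what sponge does and copes better with large output.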
    

As for why redirections are resolved before the program / script / whatever runs: I don't see a specific reason why it would be mandatory, but I do see two reasons why it's the better choice:

  • if STDIN were not redirected beforehand, the program / script / whatever would have to wait until STDIN was redirected;

  • if STDOUT were not redirected beforehand, the shell would have to buffer the program's / script's / whatever's output until STDOUT was redirected.

So a waste of time in the first case, and a waste of time and memory in the second.

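This is just what occurs to me; I'm not claiming these are the actual reasons. All in all, though, I'd guess that anyone given the choice would redirect beforehand anyway, for the reasons above. It also fits how programs are started on Unix-like systems: a process inherits its open file descriptors when it is launched, so the simplest arrangement is for the shell to set them up first.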
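To make the "redirect first, run second" order concrete, here is a rough hand-written equivalent of ls > ls.out. It is a sketch of the idea and not the shell's actual implementation; it assumes a POSIX shell, and demo_dir is a made-up name:

```shell
#!/bin/sh
# Hypothetical scratch directory containing two sample files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch bar foo

# Roughly what happens for 'ls > ls.out', spelled out by hand:
(
    exec > ls.out   # step 1: in the child, point stdout at the file (this creates it)
    exec ls         # step 2: replace the child with ls, which inherits that stdout
)
cat ls.out
```

Because ls only starts in step 2, it finds ls.out already present and lists it along with bar and foo.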