The second example:
find . -name '*.txt' -print0 | xargs -0 cat > out.txt
is completely legal and will recreate the file out.txt each time it's run, while the first will append to out.txt if it runs. But both commands are doing essentially the same thing.
What's confusing the issue is the xargs -0 cat part. People think that the redirect to out.txt is part of that command when it isn't. The redirect happens after xargs -0 cat has taken input in via STDIN and cat'ed it as a single stream out to STDOUT. The xargs is optimizing the cat'ing of the files, not their output.
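A quick way to see that the redirect belongs to the shell, not to xargs — a sketch using two hypothetical files, a.txt and b.txt:

```shell
# two sample files (hypothetical names)
echo one > a.txt
echo two > b.txt

# The shell opens out.txt and attaches it to the STDOUT of the xargs stage
# *before* xargs runs, so the redirect is not "part of" the xargs command:
printf 'a.txt\0b.txt\0' | xargs -0 cat > out.txt

# equivalent: the xargs stage grouped explicitly, with the redirect on the group
printf 'a.txt\0b.txt\0' | { xargs -0 cat; } > out.txt
```

Both forms produce an identical out.txt, because the redirection is set up by the shell around the whole xargs stage either way.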
Here's an example that shows what I'm saying. If we insert a pv -l between the xargs -0 cat and the output to the file out.txt, we can see how many lines cat has written.
Example
To show this I created a directory with 10,000 files in it.
for i in `seq -w 1 10000`;do echo "contents of file$i.txt" > file$i.txt;done
Each file looks similar to this:
$ more file00001.txt
contents of file00001.txt
The output from pv:
$ find . -name '*.txt' -print0 | xargs -0 cat | pv -l > singlefile.rpt
10k 0:00:00 [31.1k/s] [ <=>
As we can see, 10k lines were written out to my singlefile.rpt file. If xargs were passing us chunks of output, we'd see that as a reduction in the number of lines being presented to pv.
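The same check can be made without pv, by counting lines after the fact (a sketch, assuming the same directory of one-line .txt files):

```shell
# cat all the .txt files into one report, then count what was written;
# the .rpt extension keeps the output file out of find's matches
find . -name '*.txt' -print0 | xargs -0 cat > singlefile.rpt
wc -l < singlefile.rpt   # one line per input file, so 10000 in the example above
```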
From http://www.manpagez.com/man/1/ksh/:
<>word Open file word for reading and writing as standard output.
<&digit The standard input is duplicated from file descriptor
digit (see dup(2)). Similarly for the standard output
using >&digit.
<&- The standard input is closed. Similarly for the standard
output using >&-.
You will find all those details by typing man ksh.
In particular, 2>&- means: close the standard error stream, i.e. the command is no longer able to write to STDERR, which breaks the usual assumption that STDERR is open and writable.
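A sketch of the difference between closing stderr and merely discarding it (grep is chosen arbitrarily here):

```shell
# stderr writes succeed, but the text is thrown away:
grep foo /no/such/file 2>/dev/null

# stderr is closed; grep's write to fd 2 fails with EBADF and the message is lost:
grep foo /no/such/file 2>&-
```

The first form is almost always what you want; the second can make programs misbehave, since they expect fd 2 to be writable.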
To understand the concept of file descriptors, (if on a Linux system) you may have a look at /proc/*/fd (and/or /dev/fd/*):
$ ls -l /proc/self/fd
insgesamt 0
lrwx------ 1 michas users 1 18. Jan 16:52 0 -> /dev/pts/0
lrwx------ 1 michas users 1 18. Jan 16:52 1 -> /dev/pts/0
lrwx------ 1 michas users 1 18. Jan 16:52 2 -> /dev/pts/0
lr-x------ 1 michas users 1 18. Jan 16:52 3 -> /proc/2903/fd
File descriptor 0 (aka STDIN) is used by default for reading, fd 1 (aka STDOUT) is the default for writing, and fd 2 (aka STDERR) is the default for error messages. (fd 3 is in this case used by ls to actually read that directory.)
If you redirect stuff it might look like this:
$ ls -l /proc/self/fd 2>/dev/null </dev/zero 99<>/dev/random |cat
insgesamt 0
lr-x------ 1 michas users 1 18. Jan 16:57 0 -> /dev/zero
l-wx------ 1 michas users 1 18. Jan 16:57 1 -> pipe:[28468]
l-wx------ 1 michas users 1 18. Jan 16:57 2 -> /dev/null
lr-x------ 1 michas users 1 18. Jan 16:57 3 -> /proc/3000/fd
lrwx------ 1 michas users 1 18. Jan 16:57 99 -> /dev/random
Now the default descriptors no longer point to your terminal but to the corresponding redirects. (As you can see, you can also create new fds.)
One more example for <>:
echo -e 'line 1\nline 2\nline 3' > foo # create a new file with three lines
( # with that file redirected to fd 5
read <&5 # read the first line
echo "xxxxxx">&5 # overwrite the second line
cat <&5 # output the remaining line
) 5<>foo # this is the actual redirection
You can do such things, but you very seldom have to do so.
Best Answer
Looking at the two commands separately:
Here, since redirections are processed in a left-to-right manner, the standard error stream would first be redirected to wherever the standard output stream goes (possibly to the console), and then the standard output stream would be redirected to a file. The standard error stream would not be redirected to that file.
The visible effect of this would be that you get what's produced on standard error on the screen and what's produced on standard output in the file.
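The command under discussion is not shown here, but the described behaviour can be reproduced with any command that writes to both streams; a sketch, with a compound command standing in for the utility:

```shell
# 2>&1 first: stderr is duplicated onto the *old* stdout (the terminal),
# and only then is stdout redirected to the file
{ echo out; echo err >&2; } 2>&1 >stdout.log
# "err" appears on the terminal, while "out" lands in stdout.log
```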
Here, you redirect standard error to the same place as the standard output stream. This means that both streams will be piped to the tee utility as a single intermingled output stream, and that this standard output data will be saved to the given file by tee. The data would additionally be reproduced by tee in the console (this is what tee does, it duplicates data streams).
Whichever one of these is used depends on what you'd like to achieve.
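A minimal sketch of this second form, again using a compound command in place of the actual utility:

```shell
# both streams are merged onto stdout before the pipe, so tee sees both lines
{ echo out; echo err >&2; } 2>&1 | tee both.log
# both lines show up on the terminal *and* in both.log (possibly intermingled)
```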
Note that you would not be able to reproduce the effect of the second pipeline with just > (as in utility >output.log 2>&1, which would save both standard output and error in the file by first redirecting standard output to the output.log file and then redirecting standard error to where standard output is now going). You would need to use tee to get the data in the console as well as in the output file.
Additional notes:
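For comparison, the utility >output.log 2>&1 form mentioned above sends both streams to the file and nothing to the console; a sketch with a stand-in compound command:

```shell
# stdout goes to the file first, then stderr is duplicated onto it
{ echo out; echo err >&2; } >output.log 2>&1
# the terminal stays silent; output.log contains both lines
```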
The visible effect of the first command would be the same as redirecting only the standard output to the file: the standard output goes to the file and standard error goes to the console.
If a further processing step were added to the end of each of the above commands, there would be a big difference though:
In the first pipeline, more_stuff would get what's originally the standard error stream from utility as its standard input data, while in the second pipeline, since it's only the resulting standard output stream that is ever sent across a pipe, the more_stuff part of the pipeline would get nothing to read on its standard input.
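A sketch of that difference, with a stand-in utility function that writes one line to each stream, and wc -l playing the role of more_stuff:

```shell
# hypothetical stand-in for the utility under discussion
utility() { echo out; echo err >&2; }

# pipe set up first, so 2>&1 points stderr at the pipe; stdout then goes to the file
utility 2>&1 >out1.log | wc -l   # counts 1: the stderr line went into the pipe

# stdout goes to the file first, then stderr follows it there; the pipe gets nothing
utility >out2.log 2>&1 | wc -l   # counts 0: both lines went into the file
```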