Bash Grep Performance – Why ‘tac file | grep foo’ is Faster than ‘grep foo < <(tac file)’

bash efficiency grep performance

This question was motivated by "Reverse grepping", about grepping a huge file from bottom up.

tac file | grep whatever

Or a bit more efficient:

grep whatever < <(tac file)

@vinc17 said:

The < <(tac filename) should be as fast as a pipe

There are also many interesting comments from other users.

My questions:

What is the difference between | and < <()?
Why is one faster than the other?
And which is really faster?
Why did no one suggest xargs?

Best Answer

The construction <(tac file) causes the shell to:

1. Create a pipe with a name.
   - On systems such as Linux and SysV which have /dev/fd, a regular pipe is used, and /dev/fd/<the-file-descriptor-of-the-pipe> is used as the name.
   - On other systems, a named pipe is used, which requires creating an actual file entry on disk.
2. Launch the command tac file and connect it to one end of the pipe.
3. Replace the whole construction on the command line with the name of the pipe.

After the replacement, the command line becomes:

grep whatever < /tmp/whatever-name-the-shell-used-for-the-named-pipe

The shell then opens that name for reading, attaches it to grep's standard input, and executes grep, which reads its standard input (the pipe) and searches it for its first argument.
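One quick way to see the substitution is to let echo print the name instead of opening it. This is a minimal sketch, assuming bash on a system with /dev/fd; the exact descriptor number varies, and on other systems the name would be a temporary named pipe instead.

```bash
#!/usr/bin/env bash
# echo never opens the pipe; it just prints the name bash substituted
# for <( ... ). On Linux this is typically something like /dev/fd/63.
name=$(echo <(true))
echo "substituted name: $name"

# The name refers to a pipe, not to a regular file.
if [ -p <(true) ]; then
    kind=pipe
else
    kind=other
fi
echo "file type: $kind"
```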

So the end result is the same as with...

tac file | grep whatever

...in that the same two programs are launched and a pipe is still used to connect them. But the <( ... ) construction is more convoluted because it involves more steps and may involve a temporary file (the named pipe).
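The equivalence of the two forms can be checked on a small sample. This is a sketch, assuming bash and GNU tac; the temporary file and its three lines are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sample data: two of the three lines contain "foo".
tmp=$(mktemp)
printf '%s\n' 'alpha foo' 'beta' 'gamma foo' > "$tmp"

# Same two programs, connected by a pipe in both cases.
piped=$(tac "$tmp" | grep foo)
substituted=$(grep foo < <(tac "$tmp"))
rm -f "$tmp"

printf '%s\n' "$piped"
if [ "$piped" = "$substituted" ]; then
    echo "both forms produced the same output"
fi
```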

The <( ... ) construct is an extension: it is not available in the standard POSIX Bourne shell, nor on platforms that support neither /dev/fd nor named pipes. For this reason alone, because the two alternatives being considered are exactly equivalent in functionality, the more portable command | other-command form is the better choice.

The <( ... ) construction should be slower because of the additional convolution, but it's only in the startup phase and I don't expect the difference to be easily measurable.
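Since any difference would be in start-up cost, one rough way to look for it is to run each form many times on a tiny input. This is only a sketch, assuming bash and GNU tac; the run count and sample file are arbitrary choices, and the timings will vary from machine to machine.

```bash
#!/usr/bin/env bash
# Hypothetical test data: a tiny file, so start-up cost dominates the run time.
tmp=$(mktemp)
printf '%s\n' 'foo 1' 'bar' 'foo 2' > "$tmp"

runs=200
TIMEFORMAT='%R seconds'   # make bash's time keyword print wall-clock time only

echo "pipe:"
time for ((i = 0; i < runs; i++)); do
    tac "$tmp" | grep -c foo > /dev/null
done

echo "process substitution:"
time for ((i = 0; i < runs; i++)); do
    grep -c foo < <(tac "$tmp") > /dev/null
done

# Sanity check: both forms count the same number of matching lines.
pipe_count=$(tac "$tmp" | grep -c foo)
subst_count=$(grep -c foo < <(tac "$tmp"))
rm -f "$tmp"
```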

NOTE: On Linux and SysV platforms, <( ... ) does not use named pipes but instead uses regular pipes. Regular pipes (indeed all file descriptors) can be referred to by the special name /dev/fd/<file-descriptor-number>, so that's what the shell uses as a name for the pipe. In this way it avoids creating a real named pipe with a bona fide temporary filename in the real filesystem. Although the /dev/fd trick is what was used to implement this feature when it originally appeared in ksh, it is an optimization: on platforms that don't support this, a regular named pipe in the real filesystem is used as described above.
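The named-pipe fallback can be imitated by hand with mkfifo. This is a rough sketch of the idea, not what any particular shell does internally; the directory and file names are made up, and GNU tac is assumed.

```bash
#!/usr/bin/env bash
# Create a real named pipe in a private temporary directory.
dir=$(mktemp -d)
fifo=$dir/pipe
mkfifo "$fifo"
printf '%s\n' 'one foo' 'two' 'three foo' > "$dir/data"

# The writer runs in the background and blocks until a reader opens the pipe.
tac "$dir/data" > "$fifo" &

# The reader gets the pipe by name, as in: grep whatever < /tmp/some-name
result=$(grep foo < "$fifo")
wait
rm -rf "$dir"

printf '%s\n' "$result"
```

Apart from tac, nothing here goes beyond POSIX sh, which is why this form still works where <( ... ) is unavailable.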

ALSO NOTE: Describing the syntax as <<( ... ) is misleading. The construct is <( ... ), which is replaced with the name of a pipe; the other < character that precedes it is not part of that syntax but the regular, well-known operator for redirecting input from a file.
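That the two pieces are independent shows up when <( ... ) is used with and without the extra <. A small sketch, assuming bash; wc is used only because it prints a file name when it is given one.

```bash
#!/usr/bin/env bash
# <( ... ) alone: wc receives the pipe's name as an ordinary file argument,
# so it prints that name next to the count.
with_name=$(wc -l <(printf 'a\nb\n'))

# < <( ... ): the shell opens the same kind of name and redirects wc's
# standard input from it, so wc sees no file name at all.
from_stdin=$(wc -l < <(printf 'a\nb\n'))

echo "as argument: $with_name"
echo "as stdin:    $from_stdin"
```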

Standalone printf

Part of the "expense" in invoking a process is that several things have to happen that are resource intensive.

1. The executable has to be loaded from disk. This is slow, since the HDD has to be accessed to read the binary blob in which the executable is stored.
2. The executable is typically built against dynamic libraries, so some secondary files also have to be loaded (i.e. more binary blob data being read from the HDD).
3. Operating system overhead. Each process that you invoke incurs overhead: a process ID has to be created for it, space in memory has to be carved out to house the binary data loaded from the HDD in steps 1 & 2, and multiple structures have to be populated to store things such as the process's environment (environment variables etc.).

Excerpt of an strace of /usr/bin/printf:

    $ strace /usr/bin/printf "%s\n" "hello world"
    execve("/usr/bin/printf", ["/usr/bin/printf", "%s\\n", "hello world"], [/* 91 vars */]) = 0
    brk(0)                                  = 0xe91000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd155a6b000
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY)      = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=242452, ...}) = 0
    mmap(NULL, 242452, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fd155a2f000
    close(3)                                = 0
    open("/lib64/libc.so.6", O_RDONLY)      = 3
    read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\357!\3474\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0755, st_size=1956608, ...}) = 0
    mmap(0x34e7200000, 3781816, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x34e7200000
    mprotect(0x34e7391000, 2097152, PROT_NONE) = 0
    mmap(0x34e7591000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x191000) = 0x34e7591000
    mmap(0x34e7596000, 21688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x34e7596000
    close(3)                                = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd155a2e000
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd155a2c000
    arch_prctl(ARCH_SET_FS, 0x7fd155a2c720) = 0
    mprotect(0x34e7591000, 16384, PROT_READ) = 0
    mprotect(0x34e701e000, 4096, PROT_READ) = 0
    munmap(0x7fd155a2f000, 242452)          = 0
    brk(0)                                  = 0xe91000
    brk(0xeb2000)                           = 0xeb2000
    brk(0)                                  = 0xeb2000
    open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=99158752, ...}) = 0
    mmap(NULL, 99158752, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fd14fb9b000
    close(3)                                = 0
    fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd155a6a000
    write(1, "hello world\n", 12hello world
    )           = 12
    close(1)                                = 0
    munmap(0x7fd155a6a000, 4096)            = 0
    close(2)                                = 0
    exit_group(0)                           = ?

Looking through the above, you can get a sense of the additional work that /usr/bin/printf incurs because it is a standalone executable.

Builtin printf

With the builtin version of printf, all the libraries that it depends on, as well as its binary blob, were already loaded into memory when Bash was invoked. So none of that cost has to be incurred again.

Effectively, when you call Bash's builtin "commands", you're really making what amounts to a function call, since everything has already been loaded.
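The gap between the two can be seen by timing a loop of each. This is a sketch, assuming bash; the run count is arbitrary, the standalone binary is whatever type -P finds on PATH, and the absolute numbers depend on the machine.

```bash
#!/usr/bin/env bash
# type -t reports how bash resolves a bare "printf"; type -P forces a PATH
# lookup and returns the standalone executable (often /usr/bin/printf).
kind=$(type -t printf)
external=$(type -P printf)
echo "bare printf is a $kind; the standalone binary is $external"

runs=500
TIMEFORMAT='%R seconds'   # make bash's time keyword print wall-clock time only

echo "builtin:"
time for ((i = 0; i < runs; i++)); do
    printf '%s\n' 'hello world' > /dev/null
done

echo "standalone:"
time for ((i = 0; i < runs; i++)); do
    "$external" '%s\n' 'hello world' > /dev/null
done
```

Each iteration of the second loop has to fork and exec a new process, while the first loop stays inside the already running shell.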

An analogy

If you've ever worked with a programming language such as Perl, it's equivalent to calling the function system("mycmd") or using backticks (`mycmd`). When you do either of those things, you're forking a separate process with its own overhead, as opposed to using the functions that Perl offers as part of its core.

Anatomy of Linux Process Management

There's a pretty good article on IBM developerWorks that breaks down the various aspects of how Linux processes are created and destroyed, along with the different C libraries involved in the process. The article is titled: Anatomy of Linux process management - Creation, management, scheduling, and destruction. It's also available as a PDF.
