Is cmp faster than diff -q

diff()performance

In a recent Ask Ubuntu question on checking whether some files have differing content, I saw a comment stating that, if the differing sections didn't matter, cmp would be faster than diff. This Stack Overflow answer concurs, giving the reason that cmp would stop at the first differing byte. However, GNU diff has the -q (or --brief) flag, which is supposed to make it report only when files differ. It would seem logical that GNU diff would also stop comparing once any difference is found (like grep would stop searching after the first match when -l or -q is specified).

Is cmp truly faster than diff -q, in the context of Linux systems, which are likely to have the GNU version?

Best Answer

Prompted by @josten, I ran a comparison on the two. The code is on GitHub. In short¹:

user-sys real

The User+Sys time taken by cmp -s seemed to be a tad more than that of diff in most cases. However, the Real time take was pretty much arbitrary - cmp ahead on some, diff ahead on some.

Summary:

Any difference in performance is pure coincidence. Use whatever you wish.

^{¹The images are 1920x450, so do open them in a tab to see them in their full glory.}

Standalone printf

Part of the "expense" in invoking a process is that several things have to happen that are resource intensive.

The executable has to be loaded from the disk, this incurs slowness since the HDD has be be accessed in order to load the binary blob from the disk which the executable is stored as.
The executable is typically built using dynamic libraries, so some secondary files to the executable will also have to be loaded, (i.e. more binary blob data being read from the HDD).
Operating system overhead. Each process that you invoke incurs overhead in the form of a process ID having to be created for it. Space in memory will also have be carved out to both house the binary data being loaded from the HDD in steps 1 & 2, as well as multiple structures having to be populated to store things such as the processes' environment (environment variables etc.)

excerpt of an strace of /usr/bin/printf

    $ strace /usr/bin/printf "%s\n" "hello world"
    *execve("/usr/bin/printf", ["/usr/bin/printf", "%s\\n", "hello world"], [/* 91 vars */]) = 0
    brk(0)                                  = 0xe91000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd155a6b000
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY)      = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=242452, ...}) = 0
    mmap(NULL, 242452, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fd155a2f000
    close(3)                                = 0
    open("/lib64/libc.so.6", O_RDONLY)      = 3
    read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\357!\3474\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0755, st_size=1956608, ...}) = 0
    mmap(0x34e7200000, 3781816, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x34e7200000
    mprotect(0x34e7391000, 2097152, PROT_NONE) = 0
    mmap(0x34e7591000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x191000) = 0x34e7591000
    mmap(0x34e7596000, 21688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x34e7596000
    close(3)                                = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd155a2e000
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd155a2c000
    arch_prctl(ARCH_SET_FS, 0x7fd155a2c720) = 0
    mprotect(0x34e7591000, 16384, PROT_READ) = 0
    mprotect(0x34e701e000, 4096, PROT_READ) = 0
    munmap(0x7fd155a2f000, 242452)          = 0
    brk(0)                                  = 0xe91000
    brk(0xeb2000)                           = 0xeb2000
    brk(0)                                  = 0xeb2000
    open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=99158752, ...}) = 0
    mmap(NULL, 99158752, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fd14fb9b000
    close(3)                                = 0
    fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd155a6a000
    write(1, "hello world\n", 12hello world
    )           = 12
    close(1)                                = 0
    munmap(0x7fd155a6a000, 4096)            = 0
    close(2)                                = 0
    exit_group(0)                           = ?*

Looking through the above you can get a sense of the additional resources that /usr/bin/printf is having to incur due to it being a standalone executable.

Builtin printf

With the built version of printf all the libraries that it depends on as well as its binary blob have already been loaded into memory when Bash was invoked. So none of that has to be incurred again.

Effectively when you call the builtin "commands" to Bash, you're really making what amounts to a function call, since everything has already been loaded.

An analogy

If you've ever worked with a programming language, such as Perl, it's equivalent to making calls to the function (system("mycmd")) or using the backticks (`mycmd`). When you do either of those things, you're forking a separate process with it's own overhead, vs. using the functions that are offered to you through Perl's core functions.

Anatomy of Linux Process Management

There's a pretty good article on IBM Developerworks that breaks down the various aspects of how Linux processes are created and destroyed along with the different C libraries involved in the process. The article is titled:Anatomy of Linux process management - Creation, management, scheduling, and destruction. It's also available as a PDF.

Bash Performance – Is Dash Faster Than Bash?

SHELL SEQ:

Probably a useful means of bench-marking a shell's performance is to do a lot of very small, simple evaluations repetitively. It is important, I think, not just to loop, but to loop over input, because a shell needs to read <&0.

I thought this would complement the tests @cuonglm already posted because it demonstrates a single shell process's performance once invoked, as opposed to his which demonstrates how quickly a shell process loads when invoked. In this way, between us, we cover both sides of the coin.

Here's a function to facilitate the demo:

sh_bench() (                                               #dont copy+paste comments
    o=-c sh=$(command -v "$1") ; shift                     #get shell $PATH; toss $1
    [ -z "${sh##*busybox}" ] && o='ash -c'                 #cause its weird
    set -- "$sh" $o "'$(cat <&3)'" -- "$@"                 #$@ = invoke $shell
    time env - "$sh" $o "while echo; do echo; done|$*"     #time (env - sh|sh) AC/DC
) 3<<-\SCRIPT                                                                      
#Everything from here down is run by the different shells    
    i="${2:-1}" l="${1:-100}" d="${3:-                     
}"; set -- "\$((n=\$n\${n:++\$i}))\$d"                     #prep loop; prep eval
    set -- $1$1$1$1$1$1$1$1$1$1                            #yup
    while read m                                           #iterate on input
    do  [ $(($i*50+${n:=-$i})) -gt "$(($l-$i))" ] ||       #eval ok?
            eval echo -n \""$1$1$1$1$1"\"                  #yay!
        [ $((n=$i+$n)) -gt "$(($l-$i))" ] &&               #end game?
            echo "$n" && exit                              #and EXIT
        echo -n "$n$d"                                     #damn - maybe next time
    done                                                   #done 
#END
SCRIPT                                                     #end heredoc

It either increments a variable once per newline read or, as a slight-optimization, if it can, it increments 50 times per newline read. Every time the variable is incremented it is printed to stdout. It behaves a lot like a sort of seq cross nl.

And just to make it very clear what it does - here's some truncated set -x; output after inserting it just before time in the function above:

time env - /usr/bin/busybox ash -c '
     while echo; do echo; done |
     /usr/bin/busybox ash -c '"'$(
         cat <&3
     )'"' -- 20 5 busybox'

So each shell is first called like:

 env - $shell -c "while echo; do echo; done |..."

...to generate the input that it will need to loop over when it reads in 3<<\SCRIPT - or when cat does, anyway. And on the other side of that |pipe it calls itself again like:

"...| $shell -c '$(cat <<\SCRIPT)' -- $args"

So aside from the initial call to env (because cat is actually called in the previous line); no other processes are invoked from the time it is called until it exits. At least, I hope that's true.

Before the numbers...

I should make some notes on portability.

posh doesn't like $((n=n+1)) and insists on $((n=$n+1))
mksh doesn't have a printf builtin in most cases. Earlier tests had it lagging a great deal - it was invoking /usr/bin/printf for every run. Hence the echo -n above.
maybe more as I remember it...

Anyway, to the numbers:

for sh in dash busybox posh ksh mksh zsh bash
do  sh_bench $sh 20 5 $sh 2>/dev/null
    sh_bench $sh 500000 | wc -l
echo ; done

That'll get 'em all in one go...

0dash5dash10dash15dash20

real    0m0.909s
user    0m0.897s
sys     0m0.070s
500001

0busybox5busybox10busybox15busybox20

real    0m1.809s
user    0m1.787s
sys     0m0.107s
500001

0posh5posh10posh15posh20

real    0m2.010s
user    0m2.060s
sys     0m0.067s
500001

0ksh5ksh10ksh15ksh20

real    0m2.019s
user    0m1.970s
sys     0m0.047s
500001

0mksh5mksh10mksh15mksh20

real    0m2.287s
user    0m2.340s
sys     0m0.073s
500001

0zsh5zsh10zsh15zsh20

real    0m2.648s
user    0m2.223s
sys     0m0.423s
500001

0bash5bash10bash15bash20

real    0m3.966s
user    0m3.907s
sys     0m0.213s
500001

ARBITRARY = MAYBE OK?

Still, this is a rather arbitrary test, but it does test reading input, arithmetic evaluation, and variable expansion. Maybe not comprehensive, but possibly near to there.

EDIT by Teresa e Junior: @mikeserv and I have done many other tests (see our chat for details), and we found the results could be summarized like this:

If you need speed, go definitely with dash, it is much faster than any other shell and about 4x faster than bash.
While busybox's shell can be much slower than dash, in some tests it could be faster, because it has many of its own userland utilities, like grep, sed, sort, etc., which don't have as many features as the commonly used GNU utilities, but can get the work done as much.
If speed is not everything you care about, ksh (or ksh93) can be considered the best compromisse between speed and features. It's speed compares to the smaller mksh, which is way faster than bash, and it has also some unique features, like floating point arithmetic.
Although bash is famous for its simplicity, stability, and functionality, it was the slowest of all shells in the majority of our tests, and by a large margin.