After finding out that several common commands (such as read) are actually Bash builtins (and that when running them at the prompt I'm actually running a two-line shell script which just forwards to the builtin), I was looking to see if the same is true for true and false.
Well, they are definitely binaries.
sh-4.2$ which true
/usr/bin/true
sh-4.2$ which false
/usr/bin/false
sh-4.2$ file /usr/bin/true
/usr/bin/true: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=2697339d3c1923506e10af65aa3120b12295277e, stripped
sh-4.2$ file /usr/bin/false
/usr/bin/false: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=b160fa513fcc13537d7293f05e40444fe5843640, stripped
sh-4.2$
However, what I found most surprising was their size. I expected them to be only a few bytes each, as true is basically just exit 0 and false is exit 1.
sh-4.2$ true
sh-4.2$ echo $?
0
sh-4.2$ false
sh-4.2$ echo $?
1
sh-4.2$
However, I found to my surprise that both files are over 28KB in size.
sh-4.2$ stat /usr/bin/true
File: '/usr/bin/true'
Size: 28920 Blocks: 64 IO Block: 4096 regular file
Device: fd2ch/64812d Inode: 530320 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2018-01-25 19:46:32.703463708 +0000
Modify: 2016-06-30 09:44:27.000000000 +0100
Change: 2017-12-22 09:43:17.447563336 +0000
Birth: -
sh-4.2$ stat /usr/bin/false
File: '/usr/bin/false'
Size: 28920 Blocks: 64 IO Block: 4096 regular file
Device: fd2ch/64812d Inode: 530697 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2018-01-25 20:06:27.210764704 +0000
Modify: 2016-06-30 09:44:27.000000000 +0100
Change: 2017-12-22 09:43:18.148561245 +0000
Birth: -
sh-4.2$
So my question is: Why are they so big? What's in the executable other than the return code?
PS: I am using RHEL 7.4
Best Answer
In the past, /bin/true and /bin/false in the shell were actually scripts.
For instance, on a PDP-11 running Unix Version 7:
Nowadays, at least in bash, the true and false commands are implemented as shell built-in commands. Thus no executable binary files are invoked by default, both when using the false and true directives in the bash command line and inside shell scripts.
From the bash source, builtins/mkbuiltins.c:
Also per @meuh comments:
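Independently of the source excerpt and the comments referenced above, the builtin claim can be checked on a given system. What follows is only a minimal sketch, assuming a POSIX-style shell and an env program on the PATH. env is used here because, being an external program, it cannot see shell builtins and so has to execute whatever true or false file it finds on the PATH.

```shell
# Ask the current shell how it resolves each name. In bash this is expected
# to report a shell builtin, but the exact wording varies between shells.
type true
type false

# env is an external program (assumed to be installed), so these two lines
# run the executable files rather than the builtins.
env true  && echo "external true exited with $?"
env false || echo "external false exited with $?"
```

The && and || forms are used so that the snippet also behaves under set -e, where a bare failing command would stop the script.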
So it can be said with a high degree of certainty that the true and false executable files exist mainly for being called from other programs.
From now on, the answer will focus on the /bin/true binary from the coreutils package in Debian 9 / 64 bits. (It is /usr/bin/true on RedHat. Both RedHat and Debian use the coreutils package; I analysed the compiled version from the latter, having it more at hand.)
As can be seen in the source file false.c, /bin/false is compiled with (almost) the same source code as /bin/true, just returning EXIT_FAILURE (1) instead, so this answer can be applied to both binaries.
This can also be confirmed by both executables having the same size:
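One way to make that comparison on your own system is sketched below. It is not the listing from the original answer: find_external and size_of are hypothetical helper names introduced here, and the sketch assumes a POSIX-style shell with wc and tr available. The PATH is walked by hand because, in a shell where true is a builtin, a plain lookup would not point at the file.

```shell
# Find an executable file called $1 by walking $PATH by hand, so that a
# shell builtin of the same name cannot shadow it (hypothetical helper).
find_external() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        IFS=$old_ifs
        [ -n "$dir" ] || dir=.          # an empty PATH entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Size in bytes, via wc so that no GNU-specific stat flags are needed
# (hypothetical helper).
size_of() {
    wc -c < "$1" | tr -d ' '
}

true_bin=$(find_external true)
false_bin=$(find_external false)
echo "$true_bin: $(size_of "$true_bin") bytes"
echo "$false_bin: $(size_of "$false_bin") bytes"
```

The two sizes are printed rather than asserted equal, since whether they match depends on how the distribution built and packaged the two files.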
Alas, the direct answer to the question "why are true and false so large?" could be: because there are no longer such pressing reasons to care about their top performance. They are not essential to bash performance, as they are no longer used by bash (scripting).
Similar comments apply to their size: 26KB is insignificant for the kind of hardware we have nowadays. Space is not at a premium for the typical server/desktop anymore, and nobody even bothers to use the same binary for false and true, as it is just deployed twice in distributions using coreutils.
Focusing, however, on the real spirit of the question: why does something that should be so simple and small get so large?
The real distribution of the sections of /bin/true is as these charts show; the main code+data amounts to roughly 3KB out of a 26KB binary, which amounts to 12% of the size of /bin/true.
The true utility has indeed gained more cruft code over the years, most notably the standard support for --version and --help.
However, that is not the (only) main justification for it being so big. Rather, while being dynamically linked (using shared libs), it also has part of a generic library commonly used by coreutils binaries linked in as a static library. The metadata for building an ELF executable file also accounts for a significant part of the binary, it being a relatively small file by today's standards.
The rest of the answer explains how we got to build the following charts detailing the composition of the /bin/true executable binary file and how we arrived at that conclusion.
As @Maks says, the binary was compiled from C; as per my comment, it is also confirmed that it is from coreutils. We are pointing directly to the author(s) git https://github.com/wertarbyte/coreutils/blob/master/src/true.c, instead of the gnu git as @Maks did (same sources, different repositories - this repository was selected as it has the full source of the coreutils libraries).
We can see the various building blocks of the /bin/true binary here (Debian 9 - 64 bits from coreutils):
Of those:
Of the 24KB, around 1KB is for fixing up the 58 external functions.
That still leaves roughly 23KB for the rest of the code. We will show below that the actual main file - main()+usage() code - is around 1KB compiled, and explain what the other 22KB are used for.
Drilling further down the binary with readelf -S true, we can see that while the binary is 26159 bytes, the actual compiled code is 13017 bytes, and the rest is assorted data/initialisation code.
However, true.c is not the whole story, and 13KB seems pretty much excessive if it were only that file; we can see functions called in main() that are not listed in the external functions seen in the ELF with objdump -T true; functions that are present at:
Those extra functions not linked externally in main() are:
So my first suspicion was partly correct: whilst the binary is using dynamic libraries, the /bin/true binary is big *because it has some static libraries included with it* (but that is not the only cause).
Compiling C code is not usually so inefficient as to leave that much space unaccounted for, hence my initial suspicion that something was amiss.
The extra space, almost 90% of the size of the binary, is indeed extra libraries/elf metadata.
While using Hopper for disassembling/decompiling the binary to understand where functions are, it can be seen that the compiled binary code of the true.c/usage() function is actually 833 bytes, and that of the true.c/main() function is 225 bytes, which together is roughly 1KB (1058 bytes). The logic for the version functions, which is buried in the static libraries, is around 1KB.
The actual compiled main()+usage()+version()+strings+vars are only using up around 3KB to 3.5KB.
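The proportions quoted in this answer can be re-derived with a few lines of shell arithmetic. This is only a sketch: the byte counts are the ones stated in the text (26159 for the file, 13017 for the compiled code, 833 and 225 for usage() and main()), 3KB is taken as 3072 bytes, and the percent helper is a hypothetical name introduced here.

```shell
# percent PART WHOLE: integer percentage, rounded to the nearest whole
# number (hypothetical helper).
percent() {
    echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

total=26159                    # size of the binary, as quoted in the text
code=13017                     # compiled code, as quoted in the text
main_usage=$(( 833 + 225 ))    # usage() + main(), as quoted in the text

echo "main()+usage(): $main_usage bytes"
echo "compiled code share: $(percent "$code" "$total")%"
echo "main()+usage() share: $(percent "$main_usage" "$total")%"
echo "3KB share: $(percent 3072 "$total")%"
```

With these figures the compiled code comes to about half of the file, main()+usage() to about 4%, and the roughly 3KB of main code and data to about 12%, which matches the share stated earlier in the answer.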
It is indeed ironic that such small and humble utilities have become bigger in size for the reasons explained above.
related question: Understanding what a Linux binary is doing
true.c main() with the offending function calls:
The decimal size of the various sections of the binary:
Output of readelf -S true
Output of objdump -T true (external functions dynamically linked at run time)
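To reproduce the two listings named above on another machine, something along the following lines could be used. It is a sketch under stated assumptions: inspect_elf is a hypothetical wrapper name, and readelf and objdump are assumed to come from the binutils package, which may not be installed by default. Only the two invocations mentioned in this answer are used.

```shell
# inspect_elf FILE: print the section headers and the dynamic symbol table
# of FILE (hypothetical wrapper around the two commands named in the text).
inspect_elf() {
    bin=$1
    if [ ! -f "$bin" ]; then
        echo "no such file: $bin" >&2
        return 1
    fi
    for tool in readelf objdump; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found (it is part of binutils)" >&2
            return 2
        fi
    done
    readelf -S "$bin"    # section headers: name, type, offset, size
    objdump -T "$bin"    # dynamic symbol table: externally linked functions
}

# Example call (the path differs between distributions):
#   inspect_elf /bin/true
```

The path is passed in explicitly because the binary lives at /bin/true on the Debian system analysed here and at /usr/bin/true on the RHEL system in the question.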