Binary Files – Understanding the Mystery of Binary Files

This is about files that come straight out of a compiler, say g++ with the -o (output file) flag.

If they are binary, shouldn't they just be a bunch of 0's and 1's?

When you cat them, you get unintelligible output but also intact words.

If you run file on them, you get the answer immediately – there seems to be no computation. Do the binary files in fact have headers with this kind of information?

I thought a binary executable was just the compiled program, in the form of machine instructions that your CPU can instantly and unambiguously understand. If so, aren't those instructions just bit patterns? But then, what's all the other stuff in the binaries? How do you display the bits?

Also, if you somehow get hold of the manual for your processor, could you write a binary manually, one machine instruction at a time? That would be terribly inefficient, but very fascinating if you got it to work even for a "Hello World!" demo.

Best Answer

This Super User question, "Why don't you see binary code when you open a binary file with text editor?", addresses your first point quite well.

Binary and text data aren't separate things: they are simply data. It is the interpretation that makes them one or the other. If you open binary data (such as an image file) in a text editor, much of it won't make sense, because it does not fit your chosen interpretation (as text).
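To make the point about interpretation concrete, here is a minimal Python sketch. The eight byte values are made up for illustration; the same bytes are read once as text, once as integers, and once as raw hexadecimal values.

```python
import struct

# Eight arbitrary bytes; nothing in them says "text" or "binary".
data = bytes([0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x21, 0x0A, 0x00])

# Interpretation 1: ASCII text, with non-printable bytes shown as '.'.
as_text = "".join(chr(b) if 32 <= b < 127 else "." for b in data)

# Interpretation 2: two little-endian unsigned 32-bit integers.
as_ints = struct.unpack("<2I", data)

# Interpretation 3: the plain byte values in hexadecimal.
as_hex = data.hex(" ")

print(as_text)
print(as_ints)
print(as_hex)
```

A text editor or cat applies only the first interpretation, which is why the parts of an executable that happen to be strings show up as intact words and the rest looks like noise.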

Files are stored as zeros and ones (e.g. voltage/no voltage in memory, magnetization/no magnetization on a hard drive). You don't see zeros and ones when you cat the files because the 0/1 sequences wouldn't be of much use to a human; characters make more sense, and a hex dump is better for most purposes (try hexdump on a file).
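If you really do want to see the bits, a few lines of Python are enough. This is only a sketch: the helper names to_bits and hex_line are invented here, and the output format merely imitates the general shape of a hex dump rather than reproducing any particular tool.

```python
def to_bits(data: bytes) -> str:
    """Return each byte as eight 0/1 characters, separated by spaces."""
    return " ".join(f"{b:08b}" for b in data)


def hex_line(data: bytes, offset: int = 0) -> str:
    """Return one hexdump-style line: offset, hex bytes, printable characters."""
    hex_part = " ".join(f"{b:02x}" for b in data)
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in data)
    return f"{offset:08x}  {hex_part}  |{text_part}|"


# The first four bytes of an ELF executable, used as sample input.
sample = b"\x7fELF"
print(to_bits(sample))
print(hex_line(sample))
```

To try it on a real file, read some bytes with open(path, "rb").read(16) and pass them in. The bit view gets unwieldy very quickly, which is the practical reason hex dumps are preferred.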

Executable files do have a header that describes parameters such as the architecture for which the program was built, and what sections of the file are code and data. This is what file uses to identify the characteristics of your binary file.
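As a rough illustration of how little work is needed to identify such a file, the sketch below reads a few fields from the start of an ELF header, the executable format used on Linux. It is not how file is actually implemented; the function name describe_elf is made up, and the machine table lists only a handful of values from the ELF specification.

```python
import struct

ELF_CLASSES = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "little-endian", 2: "big-endian"}
# A few e_machine values from the ELF specification; not a complete table.
ELF_MACHINES = {3: "Intel 80386", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes) -> str:
    """Describe an ELF file from its first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not an ELF file"
    ei_class, ei_data = header[4], header[5]
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    _e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return ", ".join([
        "ELF " + ELF_CLASSES.get(ei_class, "unknown class"),
        ELF_DATA.get(ei_data, "unknown byte order"),
        ELF_MACHINES.get(e_machine, f"machine {e_machine}"),
    ])


# A synthetic 20-byte header: 64-bit, little-endian, ET_EXEC, x86-64.
fake = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, 62)
print(describe_elf(fake))
```

Only a lookup in the first few bytes is involved, which matches the observation that the answer comes back immediately.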

Finally: yes, you can write programs by hand, either in assembly language or by writing out the CPU opcodes directly. Take a look at Introduction to UNIX assembly programming and the Intel x86 documentation for a starting point.
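As a sketch of what "one machine instruction at a time" can look like, the Python below assembles the bytes of a tiny "Hello World!" executable by hand. It assumes x86-64 Linux and the ELF format; the load address, the layout (headers, then code, then the message) and the function names are choices made for this example, not the only way to do it.

```python
import struct

BASE = 0x400000        # load address chosen for this sketch
HEADERS = 64 + 56      # ELF header plus one program header
MSG = b"Hello World!\n"


def build_code(msg_addr: int, msg_len: int) -> bytes:
    """Hand-encoded x86-64 instructions: write(1, msg, len), then exit(0)."""
    return b"".join([
        b"\xb8" + struct.pack("<I", 1),             # mov eax, 1   (write)
        b"\xbf" + struct.pack("<I", 1),             # mov edi, 1   (stdout)
        b"\x48\xbe" + struct.pack("<Q", msg_addr),  # movabs rsi, msg_addr
        b"\xba" + struct.pack("<I", msg_len),       # mov edx, msg_len
        b"\x0f\x05",                                # syscall
        b"\xb8" + struct.pack("<I", 60),            # mov eax, 60  (exit)
        b"\x31\xff",                                # xor edi, edi (status 0)
        b"\x0f\x05",                                # syscall
    ])


def build_elf() -> bytes:
    """Return the complete file image: headers, code, message."""
    code_len = len(build_code(0, 0))
    code = build_code(BASE + HEADERS + code_len, len(MSG))
    size = HEADERS + len(code) + len(MSG)
    e_ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    elf_header = e_ident + struct.pack(
        "<HHIQQQIHHHHHH",
        2, 62, 1,            # ET_EXEC, EM_X86_64, version 1
        BASE + HEADERS,      # entry point: first byte of the code
        64, 0, 0,            # program header offset, no sections, no flags
        64, 56, 1, 0, 0, 0,  # header sizes, one program header, no sections
    )
    program_header = struct.pack(
        "<IIQQQQQQ",
        1, 5,                # PT_LOAD, readable and executable
        0, BASE, BASE,       # file offset, virtual and physical address
        size, size, 0x1000,  # size in file, size in memory, alignment
    )
    return elf_header + program_header + code + MSG


image = build_elf()
print(len(image), "bytes:", image[:16].hex(" "))
```

On an x86-64 Linux system, writing these bytes to a file and marking it executable should give a program that prints the message; that step is left out here. Note how much of the file is header rather than instructions, which answers the "what's all the other stuff" part of the question for even the smallest case.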

Related Solutions

Debian – How to Get Compiler Flags Used to Build Binaries in a (.deb) Package

The compiler flags used are a function of

the debian/rules file,
the package's build files (since the upstream author may specify flags there too),
the build system used (dh, cdbs etc.),
the default compiler settings.

To see the flags used you effectively need to at least compile the package:

debian/rules build

Trying things like

debian/rules -n

generally won't take you very far; for instance on a dh-based package it will just say

dh build

or something similar; asking dh to show what that would do (with --no-act) will produce

dh_testdir
dh_auto_configure
dh_auto_build

and so on.

There is no fool-proof, easy-to-explain way to determine the build flags by reading debian/rules; you can get some idea by looking for flags set there, and also (where appropriate) by looking for options for dpkg-buildflags (such as DEB_BUILD_MAINT_OPTIONS) and running that. For many packages the easiest way to see what flags were used is to look at the build logs for the packages shipped in the archives, starting from https://buildd.debian.org. For example the logs for coreutils on i386 show that the flags used were -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector-strong -Wformat -Werror=format-security for compilation, and -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wl,--as-needed -Wl,-z,relro for linking (thanks to Faheem Mitha for pointing out the latter!).
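Once you have a build log, picking the flags out of a compiler invocation is mostly a matter of splitting the command line. The sketch below is one rough way to do that in Python. The function name extract_flags and the grouping are invented for illustration, and the sample line is hypothetical: its flags are the compilation flags quoted above for coreutils, but the "gcc ... -c foo.c" framing is not taken from a real log.

```python
import shlex


def extract_flags(command: str) -> dict:
    """Group the option tokens of one compiler command line by rough purpose."""
    groups = {"preprocessor": [], "warnings": [], "linker": [], "other": []}
    for token in shlex.split(command):
        if not token.startswith("-"):
            continue  # compiler name, source file or object file
        if token.startswith("-Wl,"):
            groups["linker"].append(token)   # passed through to the linker
        elif token.startswith("-W"):
            groups["warnings"].append(token)
        elif token.startswith(("-D", "-I", "-U")):
            groups["preprocessor"].append(token)
        else:
            groups["other"].append(token)
    return groups


# Hypothetical log line built from the coreutils flags quoted above.
line = ("gcc -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector-strong "
        "-Wformat -Werror=format-security -c foo.c")

for group, flags in extract_flags(line).items():
    print(f"{group}: {' '.join(flags)}")
```

The prefix-based grouping is deliberately crude; it is meant for eyeballing a log, not for reconstructing the exact build configuration.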

How many binaries do I need to get ‘reasonable’ Linux coverage

If the question is “How many GNU/Linux distros do I need to target to get 90% user coverage?”, then:

Compile and link with static libraries, so that there is no need to target a particular distro. Or link to libraries that you install in the same directory as the program.

Target x86 (maybe both 32-bit and 64-bit; I don't know how many 64-bit distros lack 32-bit userland support), maybe also ARM for the Raspberry Pi, and maybe SPARC, Alpha and MIPS. You will have to do a survey to find out how many of your potential users use these.
