Linux – Understanding Linux executable formats and software distribution packages

I am having trouble understanding Linux executable formats and software distribution packages. There are so many different distributions of Linux itself, and it seems like every software package has been compiled separately for each distro. Why is this? I understand that some "packages" are made to install on different distros, but is the executable format for the software different?

Also, why do many Linux users prefer the command prompt versions of applications vs GUI versions? I can understand the need for small footprints, but even GUI apps can have small footprints if they're coded right.

Best Answer

Package Managers & Dependencies

Most Linux distributions use package managers for software installation and removal. Package managers provide several benefits: a central repository from which (almost) any piece of software can be downloaded, the organization of pieces of software into bundles that can be installed as one cohesive group, and, most importantly, automatic dependency handling and tracking of the changes each package makes so that it can be cleanly uninstalled.

A piece of software might require certain libraries or other programs to perform duties that would be redundant if they were re-implemented in that piece of software. Packages allow for the expression of these dependencies.

Differences: package formats and strategies

There exist several different package managers. Each was created because the existing ones did not meet the needs of some people. Each package manager requires packages in its own format.

Furthermore, different distributions have different requirements of the software that is included. Many pieces of software can have differing capabilities depending on the options that are given when they are compiled from source code into a machine executable. Some distributions want to provide full feature sets and a rich experience while others want to provide as lean and simple an experience as possible, and there is everything in between. Also, a distribution may decide to lay out its directory structure differently or use a different init system. It may decide to bundle the software differently: there may be a package called "dev-utils" in two different distributions, but one version of it includes yacc while the other doesn't. Because of these different needs, the distributions choose to compile the software in different ways.

This is why even if you have a package in the correct format for your package manager, it may not work if the package was intended for a different distribution. For instance, that package might rely on yacc being installed, and it expressed that dependency through requiring the "dev-utils" package, but your "dev-utils" doesn't include yacc. Now there is a package installed with an unmet dependency.

It's not really a problem.

A big part of being a Linux distribution is maintaining a central software repository. The distribution takes care of maintaining all of this for you. This actually makes it very easy to install software. You typically use the package manager to search for and select some packages, then tell it to install them; it takes care of the rest for you. The Windows software installation process includes hunting for software on 3rd-party websites, trying to locate the appropriate download link, downloading, virus-checking, and running an install program which then asks you a bunch of irrelevant questions. That whole mess isn't the standard on Linux.
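As a rough sketch of what "search, then install" looks like in practice, assuming one of four common package managers is present: the function name pkg_hint and the example package htop are placeholders of mine, not anything your distribution ships. The function only prints the commands you would run; it does not run them.

```shell
# pkg_hint is a hypothetical helper: it prints, but does not run, the
# search and install commands for whichever package manager it finds.
pkg_hint() {
    pkg=$1
    if command -v apt-get >/dev/null 2>&1; then      # Debian, Ubuntu
        echo "apt-cache search $pkg"
        echo "sudo apt-get install $pkg"
    elif command -v dnf >/dev/null 2>&1; then        # Fedora
        echo "dnf search $pkg"
        echo "sudo dnf install $pkg"
    elif command -v pacman >/dev/null 2>&1; then     # Arch Linux
        echo "pacman -Ss $pkg"
        echo "sudo pacman -S $pkg"
    elif command -v zypper >/dev/null 2>&1; then     # openSUSE
        echo "zypper search $pkg"
        echo "sudo zypper install $pkg"
    else
        echo "no known package manager found" >&2
        return 1
    fi
}
```

Calling pkg_hint htop shows the two commands for your system. The package name itself can differ between distributions, so search first.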

The repository can't possibly include everything

Now, there may be cases where a piece of software you require is not in your distribution's repository. The set of packages supplied by a distribution's repository is one of the differentiating features of distributions. When you can't find the software you need in your distribution's repositories, there are three possible avenues (really, two plus a way to really screw things up).

Community Repositories

Many distributions have unofficial repositories that are maintained by people not associated with the distribution. Ubuntu calls them PPAs, Fedora calls them Fedora People Repositories. Arch Linux doesn't have a specific name for third-party repositories, but it does have its AUR, which is a collection of "recipes" for packages (note: there is only one AUR). You might first try installing a package from one of these sources, since it is easy to uninstall the package if it doesn't work.

Compile from Source

If you can't find an unofficial repository with what you need, compiling from source is not hard. You need to have your distribution's development package installed; this includes basic things like a compiler, linker, parser, and other tools that are usually needed for compiling software. Then you find the source code of the project, which is almost always packaged in a .tgz or .tbz file (called a "tarball"). Download it into its own directory somewhere, extract it (using tar -xf filename.tgz), and go into the directory it created (there is usually just one). In that directory may be a file called README or INSTALL. If it exists, go ahead and read it; most of them tell you to do the same thing. The next few steps are done at a command line. Run ls, and look for an executable file called configure. If it exists, run it by doing ./configure; it can take a couple of minutes sometimes. That usually runs some tests to figure out how your distribution has things set up, and it makes sure you have the tools required to compile this piece of software. The next step is to run make. This actually compiles the software, and it will likely take some time: anywhere from a few minutes to hours depending on the size of the software you're compiling. Once that is done, you run make install. This installs the software, which involves copying the products of the compilation to the appropriate places in your filesystem; if those places are system-wide, you will need root privileges for this step. After that, the software is available for use.

This was a long section, but it's summarized as "README, ./configure, make, make install". That's the routine to remember.
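The routine can be sketched as a small shell function. This is only an illustration under some assumptions: build_from_tarball is a name I made up, the tarball is assumed to unpack into a single top-level directory, and --prefix is the usual option of autoconf-style configure scripts for choosing the install location (a project without a configure script will install wherever its Makefile says).

```shell
# build_from_tarball is a hypothetical helper wrapping the usual routine.
#   $1 = path to the tarball, $2 = install prefix (e.g. $HOME/.local)
build_from_tarball() {
    tarball=$1
    prefix=$2
    workdir=$(mktemp -d) || return 1   # scratch directory, left behind afterwards

    # Extract the tarball into its own directory.
    tar -xf "$tarball" -C "$workdir" || return 1

    (
        cd "$workdir" || exit 1
        # Most tarballs unpack into one top-level directory; enter it.
        set -- */
        if [ $# -eq 1 ] && [ -d "$1" ]; then
            cd "$1" || exit 1
        fi

        # Point out any build notes the project ships with.
        for f in README INSTALL; do
            [ -f "$f" ] && echo "note: read $f before building"
        done

        # Run configure only if the project provides one.
        if [ -x ./configure ]; then
            ./configure --prefix="$prefix" || exit 1
        fi

        make || exit 1           # compile
        make install || exit 1   # copy the results into place
    )
}
```

Installing into a prefix under your home directory, for example build_from_tarball project.tgz "$HOME/.local", avoids needing root and keeps the result out of the package manager's way.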

Install a package from another distribution (don't do this)

I list this only because it is an alternative, but it will almost certainly not end well. It is possible to install packages for other distributions, and you might find yourself wanting to do that. Well, don't. Don't do it until you understand your system very well. In fact, I'm not going to put any commands here showing how to do it even though it's possible. If you do get to the point where it seems like this is the only option, don't install the package using the package manager; instead, pull things out of the package and place them in your system manually, along with notes about what you've done so that you can undo it if necessary.

The command-line bit

Some people prefer the command line for the advantages it gives them. These can be summarized into three things:

Ease of automation
Speed (compared to clicking all over the place in a GUI)
Expressiveness

The biggest of these is expressiveness; there are things that can be done at a command line that are not possible in a graphical interface.
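As a small illustration of what I mean by expressiveness (the function name count_by_extension is just something I picked for this example), here is a sketch that counts the files under a directory by extension by chaining a few standard tools. Note that it treats a hidden file such as .bashrc as having the extension "bashrc".

```shell
# Count files by extension under a directory (default: current directory).
count_by_extension() {
    dir=${1:-.}
    find "$dir" -type f -name '*.*' |   # regular files with a dot in the name
        sed 's/.*\.//' |                # keep only the text after the last dot
        sort |                          # group identical extensions together
        uniq -c |                       # count each group
        sort -rn                        # most common extension first
}
```

Each stage is a general-purpose tool, so swapping one piece (say, filtering by modification time in find) gives a different report without writing a new program.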

Finally, command-line instructions are frequently given in helpful forums such as this one because it is much easier to convey the correct information than giving "click-here-then-there-then-there" type instructions.

Related Solutions

Linux – Oldest binary working on Linux

I think that /bin/true has to be the oldest working ..

Well, can you call a zero-byte file a binary?

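# Create an empty (zero-byte) file and make it executable
touch /tmp/old_true
chmod 755 /tmp/old_true
# Run it from the shell, then print its exit status
/tmp/old_true
echo $?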

Linux – Are there any Linux distributions that focus on binary backward compatibility

Basically this boils down to: you can't keep binary compatibility and introduce new features, since these goals work directly against each other in most respects. If you introduce major new features, you eventually have to change the ABI (usually shortly after the API changes). You can use versioned symbols (as glibc does, for example), but this makes the libraries grow in size (and may also incur some performance penalty while loading a binary into memory), and developers generally don't want to keep legacy code in general-purpose libraries (it contains bugs that nobody is interested in fixing).

The usual ways to get around this on the distribution side are twofold:

do not change versions - this is typical for enterprise-grade distributions like (in alphabetical order) Red Hat and SUSE, as well as for some others (Debian, Slackware, Ubuntu LTS and probably their clones).
allow installation of various versions of a library at the same time.

On the application distributor's side, this is handled the same way as on Windows: stuff everything needed into the distribution package. Yes, this is how it is often done on Windows, and it is one of the reasons a typical Windows system usually needs several times more disk space than a Linux system with the same functionality: the applications share very little among themselves and each keeps its own copies somewhere. You can think of it as every GTK/Qt application coming with its own GTK/Qt stack. It can have some advantages, but the disadvantages are also plentiful. For example, from a security point of view it is a nightmare in Technicolor™. If the binaries are statically linked, it's even in Full HD.