ELF Executable – Which Parts Get Loaded into Memory and Where

Tags: dynamic-linking, dynamic-loading, elf, linux, memory

What I already know:

An ELF executable has a number of sections. Obviously the .text and .data sections get loaded into memory, as these are the main parts of the program. But for a program to work it needs more information, especially when it is linked dynamically.

What I'm interested in are sections like .plt, .got, .dynamic, .dynsym, .dynstr, etc.: the parts of the ELF that are responsible for linking functions to addresses.

From what I've been able to figure out so far, things like .symtab and .strtab do not get loaded into (or do not stay in) memory. But are .dynsym and .dynstr used by the linker? Do they stay in memory? Can I access them from program code?

And are there any parts of an executable that reside in kernel memory?

My interest in this is mostly forensic, but any information on this topic will help. The resources I've read about these tables and dynamic linking are more high level; they only explain how things work, not anything practical about the contents in memory.

Let me know if anything is unclear about my question.

Best Answer

The following is a really good reference: http://www.ibm.com/developerworks/linux/library/l-dynamic-libraries/. It contains a bibliography at the end of a variety of different references at different levels. If you want to know every gory detail you can go straight to the source: http://www.akkadia.org/drepper/dsohowto.pdf. (Ulrich Drepper wrote the Linux dynamic linker.)

You can get a really good overview of all the sections in your executable by running a command like "objdump -h myexe" or "readelf -S myexe".
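If you would rather inspect the section table from code, here is a minimal Python sketch of roughly what those tools read. It assumes a little-endian ELF64 file, and the function name list_sections is mine, not part of any tool. For each section it reports whether the SHF_ALLOC flag is set, which is the flag that marks a section as occupying memory at run time.

```python
import struct

SHF_ALLOC = 0x2  # flag bit: the section occupies memory at run time


def list_sections(data):
    """Return [(name, is_alloc), ...] for a little-endian ELF64 image given as bytes."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    # ELF64 header: e_shoff at byte 40; e_shentsize, e_shnum, e_shstrndx at bytes 58-63.
    (e_shoff,) = struct.unpack_from("<Q", data, 40)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 58)
    # Elf64_Shdr fields: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size,
    #                    sh_link, sh_info, sh_addralign, sh_entsize
    headers = [
        struct.unpack_from("<IIQQQQIIQQ", data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]
    if not headers:
        return []
    names_off = headers[e_shstrndx][4]  # file offset of the section-name string table

    def name_at(rel):
        # Section names are NUL-terminated strings inside the name table.
        start = names_off + rel
        return data[start:data.index(b"\0", start)].decode("ascii", "replace")

    return [(name_at(h[0]), bool(h[2] & SHF_ALLOC)) for h in headers]
```

You would call it as list_sections(open("myexe", "rb").read()). On a typical dynamically linked binary, sections such as .dynsym, .dynstr, .got, .plt and .dynamic should come back with the flag set, while .symtab and .strtab should not. This matches the "A" flag in the readelf -S output.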

The .interp section contains the path of the dynamic loader that will be used to dynamically link the symbols in this object. The .dynamic section is a distillation of the program header that is formatted to be easy for the dynamic loader to read. (So it has pointers to the other sections the loader needs, such as .dynsym, .dynstr, and the hash and relocation tables.)
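To see which loader a binary asks for, you can look for the PT_INTERP program header, which covers the same bytes as the .interp section. The following is a rough sketch, again assuming a little-endian ELF64 file; interpreter_path is a name I made up for illustration.

```python
import struct

PT_INTERP = 3  # program header type whose contents are the loader path


def interpreter_path(data):
    """Return the loader path requested by a little-endian ELF64 image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    # ELF64 header: e_phoff at byte 32; e_phentsize and e_phnum at bytes 54-57.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        # Elf64_Phdr fields: p_type, p_flags, p_offset, p_vaddr, p_paddr,
        #                    p_filesz, p_memsz, p_align
        p_type, _, p_offset, _, _, p_filesz, _, _ = struct.unpack_from(
            "<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            # The path is stored as a NUL-terminated string.
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None  # no PT_INTERP entry, as in a statically linked executable
```

On a typical x86-64 Linux system this should return something like /lib64/ld-linux-x86-64.so.2, but the exact path depends on the distribution and architecture.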

The .got (Global Offset Table) and .plt (Procedure Linkage Table) are the two main structures that are manipulated by the dynamic linker. The .got is an indirection table for variables and the .plt is an indirection table for functions. Each executable or library (which are called "shared objects") has its own .got and .plt and these are tables of the symbols referenced by that shared object that are actually contained in some other shared object.

The .dynsym section contains all the information about the symbols in your shared object (both the ones you define and the external ones you need to reference). It doesn't contain the actual symbol names: those are stored in .dynstr, and each .dynsym entry holds an offset into .dynstr. .gnu.hash is a hash table used for quick lookup of symbols by name. It does not contain names either, only a Bloom filter, hash values, and indices into .dynsym that form the bucket chains.
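The relationship between the symbol table and its string table can be sketched in a few lines of Python. This assumes you have already extracted the raw bytes of both sections from a little-endian ELF64 file (for example with the offsets reported by readelf -S). The helper name read_dynsym is hypothetical.

```python
import struct

SHN_UNDEF = 0  # st_shndx value for a symbol that some other object must supply


def read_dynsym(dynsym, dynstr):
    """Decode raw .dynsym bytes into [(name, is_defined), ...] using raw .dynstr bytes."""
    entries = []
    # Each Elf64_Sym is 24 bytes: st_name, st_info, st_other, st_shndx, st_value, st_size
    for off in range(0, len(dynsym) - 23, 24):
        st_name, _info, _other, st_shndx, _value, _size = struct.unpack_from(
            "<IBBHQQ", dynsym, off)
        # st_name is a byte offset into .dynstr; the name runs up to the next NUL.
        end = dynstr.index(b"\0", st_name)
        name = dynstr[st_name:end].decode("ascii", "replace")
        entries.append((name, st_shndx != SHN_UNDEF))
    return entries
```

Entries whose second field is False are the external symbols that the dynamic linker has to find in some other shared object.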

When your shared object dereferences some symbol "foo" the dynamic linker has to go look up "foo" in all the dynamic objects you are linked against to figure out which one contains the "foo" you are looking for (and then what the relative address of "foo" is inside that shared object.) The dynamic linker does this by searching the .gnu.hash section of all the linked shared objects (or the .hash section for old shared objects that don't have a .gnu.hash section.) Once it finds the correct address in the linked shared object it puts it in the .got or .plt of your shared object.
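For reference, the two hash functions behind those lookups are small enough to sketch in Python. To my knowledge these mirror the functions used for .gnu.hash and for the older .hash section respectively. The 32-bit masking stands in for C's unsigned wraparound, and the function names are my own.

```python
def gnu_hash(name):
    """Hash used for .gnu.hash lookups: h = h * 33 + c, starting from 5381."""
    h = 5381
    for byte in name.encode():
        h = (h * 33 + byte) & 0xFFFFFFFF  # keep the value in 32 bits
    return h


def sysv_hash(name):
    """Hash used for the older .hash section (the classic System V ELF hash)."""
    h = 0
    for byte in name.encode():
        h = ((h << 4) + byte) & 0xFFFFFFFF
        high = h & 0xF0000000
        if high:
            h ^= high >> 24  # fold the top four bits back into the low bits
        h &= ~high           # then clear the top four bits
    return h
```

In both schemes the linker reduces the hash modulo the number of buckets stored in the section to pick a bucket, then walks that bucket's chain and compares names until it finds a match or reaches the end of the chain.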
