Linux – Memory layout of dynamic loaded/linked library

cdynamic-linkingelflinuxshared memory

when loading a shared library in Linux system, what is the memory layout of the shared library?

For instance, the original memory layout is the following:

+-----------+
|heap(ori)  |
+-----------+
|stack(ori) |
+-----------+
|.data(ori) |
+-----------+
|.text(ori) |
+-----------+

When I dlopen foo.so, will the memory layout be A or B?

A
+-----------+
|heap(ori)  |
+-----------+
|stack(ori) |
+-----------+
|.data(ori) |
+-----------+
|.text(ori) |
+-----------+
|heap(foo)  |
+-----------+
|stack(foo) |
+-----------+
|.data(foo) |
+-----------+
|.text(foo) |
+-----------+

B
+-----------+
|heap(ori)  |
+-----------+
|heap(foo)  |
+-----------+
|stack(foo) |
+-----------+
|stack(ori) |
+-----------+
|.data(foo) |
+-----------+
|.data(ori) |
+-----------+
|.text(foo) |
+-----------+
|.text(ori) |
+-----------+

Or anything other than A and B… ?

Best Answer

The answer is "Other". You can get a glimpse of the memory layout with cat /proc/self/maps. On my 64-bit Arch laptop::

00400000-0040c000 r-xp 00000000 08:02 1186758                            /usr/bin/cat
0060b000-0060c000 r--p 0000b000 08:02 1186758                            /usr/bin/cat
0060c000-0060d000 rw-p 0000c000 08:02 1186758                            /usr/bin/cat
02598000-025b9000 rw-p 00000000 00:00 0                                  [heap]
7fe4b805c000-7fe4b81f5000 r-xp 00000000 08:02 1182914                    /usr/lib/libc-2.21.so
7fe4b81f5000-7fe4b83f5000 ---p 00199000 08:02 1182914                    /usr/lib/libc-2.21.so
7fe4b83f5000-7fe4b83f9000 r--p 00199000 08:02 1182914                    /usr/lib/libc-2.21.so
7fe4b83f9000-7fe4b83fb000 rw-p 0019d000 08:02 1182914                    /usr/lib/libc-2.21.so
7fe4b83fb000-7fe4b83ff000 rw-p 00000000 00:00 0
7fe4b83ff000-7fe4b8421000 r-xp 00000000 08:02 1183072                    /usr/lib/ld-2.21.so
7fe4b85f9000-7fe4b85fc000 rw-p 00000000 00:00 0
7fe4b85fe000-7fe4b8620000 rw-p 00000000 00:00 0
7fe4b8620000-7fe4b8621000 r--p 00021000 08:02 1183072                    /usr/lib/ld-2.21.so
7fe4b8621000-7fe4b8622000 rw-p 00022000 08:02 1183072                    /usr/lib/ld-2.21.so
7fe4b8622000-7fe4b8623000 rw-p 00000000 00:00 0
7ffe430c4000-7ffe430e5000 rw-p 00000000 00:00 0                          [stack]
7ffe431ed000-7ffe431ef000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

You can see that the executable gets loaded in low memory, apparently .text segment, read-only data, and .bss. Just about that is "heap". In much higher memory the C library and the "ELF file interpreter", "ld-so" get loaded. Then comes the stack. There's only one stack and one heap for any given address space, no matter how many shared libraries get loaded. cat only seems to get the C library loaded.

Doing cat /proc/$$/maps will get you the memory mappings of the shell from which you invoked cat. Any shell is going to have a number of dynamically loaded libraries, but zsh and bash will load in a large number. You'll see that there's just one "[heap]", and one "[stack]".

If you call dlopen(), the shared object file will get mapped in the address space at a higher address than /usr/lib/libc-2.21.so. There's something of an "implementation dependent" memory mapping segment, where all addresses returned by mmap() show up. See Anatomy of a Program in Memory for a nice graphic.

The source for /usr/lib/ld-2.21.so is a bit tricky, but it shares a good deal of its internals with dlopen(). dlopen() isn't a second class citizen.

"vdso" and "vsyscall" are somewhat mysterious, but this Stackoverflow question has a good explanation, as does Wikipedia.

Related Solutions

ELF Executable – Which Parts Get Loaded into Memory and Where

The following is a really good reference: http://www.ibm.com/developerworks/linux/library/l-dynamic-libraries/. It contains a bibliography at the end of a variety of different references at different levels. If you want to know every gory detail you can go straight to the source: http://www.akkadia.org/drepper/dsohowto.pdf. (Ulrich Drepper wrote the Linux dynamic linker.)

You can get a really good overview of all the sections in your executable by running a command like "objdump -h myexe" or "readelf -S myexe".

The .interp section contains the name of the dynamic loader that will be used to dynamically link the symbols in this object. The .dynamic section is a distillation of the program header that is formatted to be easy for the dynamic loader to read. (So it has pointers to all the other sections.)

The .got (Global Offset Table) and .plt (Procedure Linkage Table) are the two main structures that are manipulated by the dynamic linker. The .got is an indirection table for variables and the .plt is an indirection table for functions. Each executable or library (which are called "shared objects") has its own .got and .plt and these are tables of the symbols referenced by that shared object that are actually contained in some other shared object.

The .dynsyn contains all the information about the symbols in your shared object (both the ones you define and the external ones you need to reference.) The .dynsyn doesn't contain the actual symbol names. Those are contained in .dynstr and .dynsyn has pointers into .dynstr. .gnu.hash is a hash table used for quick lookup of symbols by name. It also contains only pointers (pointers into .dynstr, and pointers used for making bucket chains.)

When your shared object dereferences some symbol "foo" the dynamic linker has to go look up "foo" in all the dynamic objects you are linked against to figure out which one contains the "foo" you are looking for (and then what the relative address of "foo" is inside that shared object.) The dynamic linker does this by searching the .gnu.hash section of all the linked shared objects (or the .hash section for old shared objects that don't have a .gnu.hash section.) Once it finds the correct address in the linked shared object it puts it in the .got or .plt of your shared object.

Non-reentrant libraries in shared memory

I believe that the really short answer is that Linux compilers arrange code into pieces, at least one of which is just pure code, and can therefore be memory mapped into more than one process' address space. Any globals get mapped such that each process gets its own copy.

You can see this using readelf, or objdump, but readelf gives a clearer picture, I think.

Here's a piece of output from readelf -e /usr/lib/libc.so.6. That's the C library, probably mapped into almost every process. The relevant part of readelf output (although all of it is interesting) is the Program Headers:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x00000034 0x00000034 0x00140 0x00140 R E 0x4
  INTERP         0x164668 0x00164668 0x00164668 0x00017 0x00017 R   0x4
      [Requesting program interpreter: /usr/lib/ld-linux.so.2]
  LOAD           0x000000 0x00000000 0x00000000 0x1adfc4 0x1adfc4 R E 0x1000
  LOAD           0x1ae220 0x001af220 0x001af220 0x02c94 0x057c4 RW  0x1000
  DYNAMIC        0x1afd90 0x001b0d90 0x001b0d90 0x000f8 0x000f8 RW  0x4
  NOTE           0x000174 0x00000174 0x00000174 0x00044 0x00044 R   0x4
  TLS            0x1ae220 0x001af220 0x001af220 0x00008 0x00048 R   0x4
  GNU_EH_FRAME   0x164680 0x00164680 0x00164680 0x06124 0x06124 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  GNU_RELRO      0x1ae220 0x001af220 0x001af220 0x01de0 0x01de0 R   0x1

The two LOAD lines are the only pieces of the file that get mapped directly into memory. The first LOAD header maps a piece of /usr/lib/libc.so.6 into memory with R and E permissions: read and execute. That's the code. Hardware features keep a program from writing to that piece of memory, so all programs can share the same pages of real, physical memory. The kernel can set up the hardware to map the same physical memory into all processes.

The second LOAD header is marked RW - read and write. This is the part with global variables that the C library uses. Each process gets its own copy in physical memory, mapped into that process' address space with the hardware permissions set to allow reading and writing. That section is not shared.

You can see these memory mappings in a running process using the /proc file system. A good command to illustrate: cat /proc/self/maps. That lists all the memory mappings that the cat process has, and from what files the kernel got them.

As far as how much you have to do to ensure that your function is allocated to memory that gets mapped into different processes, it's pretty much all down to the flags you give to the compiler. Code intended for ".so" shared libraries is compiled "position independent". Position independent code does things like refer to memory locations of variables with offsets relative to the current instruction, and jumps or branches to locations relative to the current instruction, rather than loading from or writing to absolute addresses, and jumping to absolute addresses. That means the "RE" LOAD piece of /usr/lib/libc.so and the "RW" piece only have to have be loaded at addresses that are the same distance apart in each process. In your example code, the static variable will always be at least a multiple of a page size apart from the code that references it, and it will always get loaded that distance apart in a process' address space due to the way the LOAD ELF headers are given.

Just a note about the term "shared memory": There's a user-level shared memory system, associate with "System V interprocess communications system". That's a way for more than one process to very explicitly share a piece of memory. It's fairly complicated and obscure to set up and get correct. The shared memory that we're talking about here is more-or-less completely invisible to any user process. Your example code won't know the difference if it's running as position independent code shared between multiple processes, or if it's the only copy ever.

Best Answer

Related Solutions

ELF Executable – Which Parts Get Loaded into Memory and Where

Non-reentrant libraries in shared memory

Related Question