Linux – Why are some libraries and other parts repeated in the Linux virtual memory shown by gdb?

Tags: debugging, gdb, linux, process, virtual-memory

[Image: gdb output showing the process's memory mappings]

This is the result of looking at virtual memory of a process in gdb; I have some questions regarding this:

  1. Why are some parts of the virtual memory repeated? For example, our program (stack6) and the libc library are repeated 4 times. If they have been split into different parts, then why? Why not just put them all together?

  2. Is the top path (/opt/pro…) the instruction section (text section) of our virtual memory and only contains the instructions?

  3. Why are the sizes of the 4 libc mappings different? And what is the offset for, if we already have the size and the starting address?

  4. Where are the data, bss, kernel and heap sections and why do some parts of the above picture have no info about them? Is there any better option in gdb that actually shows all the parts?

  5. Is there any program that shows the virtual memory of our process better than gdb does? I just want a good visual of the actual virtual memory, so which debugging program gives the best result?

The sections that I mentioned:

[Image: diagram of the sections mentioned above]

Best Answer

There’s one important piece of information missing from gdb’s output: the pages’ permissions. (They’re shown on Solaris and FreeBSD, but not on Linux.) You can see those by looking at /proc/<pid>/maps; the maps for your Protostar example show

$ cat /proc/.../maps
08048000-08049000 r-xp 00000000 00:0f 2925       /opt/protostar/bin/stack6
08049000-0804a000 rwxp 00000000 00:0f 2925       /opt/protostar/bin/stack6
b7e96000-b7e97000 rwxp 00000000 00:00 0
b7e97000-b7fd5000 r-xp 00000000 00:0f 759        /lib/libc-2.11.2.so
b7fd5000-b7fd6000 ---p 0013e000 00:0f 759        /lib/libc-2.11.2.so
b7fd6000-b7fd8000 r-xp 0013e000 00:0f 759        /lib/libc-2.11.2.so
b7fd8000-b7fd9000 rwxp 00140000 00:0f 759        /lib/libc-2.11.2.so
b7fd9000-b7fdc000 rwxp 00000000 00:00 0
b7fe0000-b7fe2000 rwxp 00000000 00:00 0
b7fe2000-b7fe3000 r-xp 00000000 00:00 0          [vdso]
b7fe3000-b7ffe000 r-xp 00000000 00:0f 741        /lib/ld-2.11.2.so
b7ffe000-b7fff000 r-xp 0001a000 00:0f 741        /lib/ld-2.11.2.so
b7fff000-b8000000 rwxp 0001b000 00:0f 741        /lib/ld-2.11.2.so
bffeb000-c0000000 rwxp 00000000 00:0f 0          [stack]

(The Protostar example runs in a VM which is easy to hack, presumably to make the exercises tractable: there’s no NX protection, and no ASLR.)

You’ll see above that what appear to be repeated mappings in gdb actually correspond to different mappings with different permissions. The text segment is mapped read-only and executable; the data segment is mapped read-only; BSS and the heap are mapped read-write. Ideally, the data segment, BSS and heap are not executable, but this example lacks NX support, so they are executable. Each shared library gets its own mapping for its text segment, data segment and BSS. The fourth mapping (the ---p entry for libc) is a non-readable, non-writable, non-executable segment typically used to guard against buffer overflows (although given the age of the kernel and C library used here, this might be something different).
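To make the grouping concrete, here is a minimal Python sketch that parses a few lines copied from the maps output above and collects the permission strings per backing file. The names `SAMPLE` and `perms_by_path` are made up for this illustration; the field layout (address range, permissions, offset, device, inode, optional path) is the one described in man 5 proc.

```python
from collections import defaultdict

# Lines copied from the /proc/<pid>/maps output above.
SAMPLE = """\
08048000-08049000 r-xp 00000000 00:0f 2925       /opt/protostar/bin/stack6
08049000-0804a000 rwxp 00000000 00:0f 2925       /opt/protostar/bin/stack6
b7e97000-b7fd5000 r-xp 00000000 00:0f 759        /lib/libc-2.11.2.so
b7fd5000-b7fd6000 ---p 0013e000 00:0f 759        /lib/libc-2.11.2.so
b7fd6000-b7fd8000 r-xp 0013e000 00:0f 759        /lib/libc-2.11.2.so
b7fd8000-b7fd9000 rwxp 00140000 00:0f 759        /lib/libc-2.11.2.so
"""

def perms_by_path(maps_text):
    """Return {path: [permission strings in address order]}."""
    result = defaultdict(list)
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # blank line or anonymous mapping (no path field)
        addr_range, perms, offset, dev, inode, path = fields
        result[path].append(perms)
    return dict(result)

for path, perms in perms_by_path(SAMPLE).items():
    print(path, perms)
```

Each file that looked repeated in gdb ends up with a list of distinct permission strings, one per mapping.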

The offset, when given, indicates the offset of the data within the file, which doesn’t necessarily have much to do with its position in the address space. When the file is loaded, each segment is subject to alignment constraints; for example, libc-2.11.2.so’s program headers include two “LOAD” headers:

Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
LOAD           0x000000 0x00000000 0x00000000 0x13d2f4 0x13d2f4 R E 0x1000
LOAD           0x13e1cc 0x0013f1cc 0x0013f1cc 0x027b0  0x0577c  RW  0x1000

(Use readelf -l to see this.)
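As a worked example of the alignment, the following Python sketch rounds a LOAD header to page boundaries and compares the result with the maps output. It assumes that libc's load base is the start of its first mapping (b7e97000), which is reasonable here because the first LOAD header has VirtAddr 0; the helper names `page_down`, `page_up` and `file_mapping` are hypothetical.

```python
PAGE = 0x1000  # 4 KiB, the Align value in the headers above

def page_down(x):
    """Round x down to a page boundary."""
    return x & ~(PAGE - 1)

def page_up(x):
    """Round x up to a page boundary."""
    return (x + PAGE - 1) & ~(PAGE - 1)

def file_mapping(base, offset, vaddr, filesz):
    """Page-aligned (start, end, file_offset) of the file-backed part of a LOAD header."""
    start = base + page_down(vaddr)
    end = base + page_up(vaddr + filesz)
    return start, end, page_down(offset)

BASE = 0xb7e97000  # assumed load base: start of libc's first mapping in the maps output

# Second LOAD header of libc-2.11.2.so: Offset, VirtAddr, FileSiz from readelf -l.
start, end, file_off = file_mapping(BASE, 0x13e1cc, 0x13f1cc, 0x27b0)
print(f"{start:08x}-{end:08x} offset {file_off:08x}")
# prints: b7fd6000-b7fd9000 offset 0013e000
```

The result covers the two mappings that the maps output lists at b7fd6000 and b7fd8000; this arithmetic does not explain why that range shows up as two entries with different permissions. Applied to the first header (offset 0, FileSiz 0x13d2f4), the same rounding gives b7e97000-b7fd5000, so it holds for both LOAD headers.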

These can result in multiple mappings at the same offset, with different virtual addresses, if the sections mapped to the segments have different protection flags. In stack6’s case:

Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
LOAD           0x000000 0x08048000 0x08048000 0x00604 0x00604 R E 0x1000
LOAD           0x000604 0x08049604 0x08049604 0x00114 0x00128 RW  0x1000

(This also explains the small size shown by info proc mappings for stack6: each header requests less than 4KiB, with a 4KiB alignment, so it gets two 4KiB mappings with the same offset at different addresses.)
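The stack6 case can be checked with a few lines of arithmetic. This Python sketch takes the Offset and VirtAddr columns from the table above and rounds both down to a 4KiB boundary; the variable names are illustrative only. It relies on the ELF rule that a LOAD header's offset and virtual address are congruent modulo the alignment.

```python
PAGE = 0x1000  # 4 KiB alignment, as in the Align column

# (file offset, virtual address) of stack6's two LOAD headers.
headers = [(0x000000, 0x08048000), (0x000604, 0x08049604)]

mappings = []
for offset, vaddr in headers:
    # Offset and address agree modulo the page size, so both round down together.
    assert offset % PAGE == vaddr % PAGE
    mappings.append((vaddr - vaddr % PAGE, offset - offset % PAGE))

for map_addr, map_offset in mappings:
    print(f"{map_addr:08x} maps file offset {map_offset:08x}")
# prints:
# 08048000 maps file offset 00000000
# 08049000 maps file offset 00000000
```

Both headers land on file offset 0 but at different addresses, which matches the two stack6 lines in the maps output.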

Blank mappings correspond to anonymous mappings; see man 5 proc for details. You’d need to break on mmap in gdb to determine what they correspond to.
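For a quick inventory of the anonymous mappings, a small Python sketch can pick out the lines that have no path field. The sample lines are copied from the maps output above; the function name `anonymous_mappings` is made up for this example, and entries such as [vdso] are deliberately not counted because they carry a name.

```python
# A few lines copied from the /proc/<pid>/maps output above.
SAMPLE = """\
b7e96000-b7e97000 rwxp 00000000 00:00 0
b7fd8000-b7fd9000 rwxp 00140000 00:0f 759        /lib/libc-2.11.2.so
b7fd9000-b7fdc000 rwxp 00000000 00:00 0
b7fe0000-b7fe2000 rwxp 00000000 00:00 0
b7fe2000-b7fe3000 r-xp 00000000 00:00 0          [vdso]
"""

def anonymous_mappings(maps_text):
    """Return (start, end) pairs for lines that have no path field."""
    found = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) == 5:  # range, perms, offset, dev, inode; no path
            start, end = (int(part, 16) for part in fields[0].split("-"))
            found.append((start, end))
    return found

for start, end in anonymous_mappings(SAMPLE):
    print(f"{start:08x}-{end:08x} {(end - start) // 1024} KiB")
```

This only lists the ranges and their sizes; it says nothing about which allocation created each one.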

You can’t see the kernel mappings (apart from the legacy vsyscall on some architectures) because they don’t matter from the process’s perspective (they’re inaccessible).

I don’t know of a better gdb option; I always use /proc/$$/maps.

See “How programs get run: ELF binaries” for details of the ELF format as read by the kernel, and how it maps to memory allocations; it has pointers to lots more reference material.
