Linux Kernel – Understanding 3/1 Split and Physical Map

linux, linux-kernel, memory

I'm trying to understand the Linux 3/1 split (or 2/2, 1/3, any) and how the mapping to physical memory works. Let's assume x86.

What I don't understand in particular is why the kernel's 1GiB in va[3GiB, 4GiB) is always mapped to pa[0, 1GiB). The split is at (virtual) PAGE_OFFSET.

What if I have more memory? What if I have less? Where does all memory for user-space go?

From TLDP I understand that the bottom 1GiB of physical memory is always for the kernel (why?). High memory is used (according to this post) when the virtual address space is smaller than the physical address space, because there is a lot of memory and it would otherwise be wasted (right?); on x86-64 it is not used because the virtual address space is enormous.

One reason to always keep the kernel there might be that on context switches current remains the same and there's no need to change cr3.

This answer says:

The High Memory is the segment of memory that user-space programs can address. It cannot touch Low Memory.

Low Memory is the segment of memory that the Linux kernel can address directly. If the kernel must access High Memory, it has to map it into its own address space first.

Are people overloading the terms "low memory" and "high memory"?

Finally, LDD3 says:

The kernel cannot directly manipulate memory that is not mapped into the kernel's address space. The kernel, in other words, needs its own virtual address for any memory it must touch directly. Thus, for many years, the maximum amount of physical memory that could be handled by the kernel was the amount that could be mapped into the kernel's portion of the virtual address space, minus the space needed for the kernel code itself. As a result, x86-based Linux systems could work with a maximum of a little under 1 GB of physical memory.

Does this refer to the fact that a pointer p in the kernel must hold a virtual address, not a physical one, as mapping always applies? Why this "1GiB of physical memory" restriction?

Best Answer

The 32-bit x86 is almost as obsolete today as the 16-bit 8086 was when Linux was born in the early 1990s. Back then the 4 GB virtual address space made possible by the 386 was plenty, because a typical desktop machine only had a few tens of megabytes of RAM.

Linus made the decision to split up the virtual address space so that the upper 1 GB (starting at address 0xc0000000) was reserved for the kernel, and the lower 3 GB (starting at address 0) was available to the user space process. All physical RAM was then mapped starting at PAGE_OFFSET, i.e. the addresses starting at 3 GB. 1 GB was plenty at the time, since (as I mentioned earlier) the typical amount of physical RAM was way smaller than this, and this split left a comfortable 3 GB space for user space.
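The split described above can be written down as a few lines of arithmetic. This is only a sketch for the classic 3/1 configuration; the names ADDRESS_SPACE_SIZE and is_kernel_address are made up for illustration and are not kernel identifiers.

```python
# Illustrative constants for the classic 32-bit x86 3/1 split.
ADDRESS_SPACE_SIZE = 1 << 32   # 4 GiB of virtual addresses
PAGE_OFFSET = 0xC0000000       # start of the kernel's portion (3 GiB)

def is_kernel_address(vaddr):
    """Return True if vaddr lies in the kernel's part of the address space."""
    if not 0 <= vaddr < ADDRESS_SPACE_SIZE:
        raise ValueError("not a 32-bit virtual address")
    return vaddr >= PAGE_OFFSET

user_size = PAGE_OFFSET                          # va[0, PAGE_OFFSET)
kernel_size = ADDRESS_SPACE_SIZE - PAGE_OFFSET   # va[PAGE_OFFSET, 4 GiB)
```

With a 2/2 or 1/3 split only the value of PAGE_OFFSET would change; the rest of the reasoning stays the same.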

Before switching on paging (i.e. the virtual to physical address mapping) the kernel image, containing the kernel code and static data, is loaded to the start of physical memory. (Well, not exactly, but typically starting at 2 MB, because of some quirks of the PC platform.) When paging is turned on, the memory at physical address N ends up at virtual address N + PAGE_OFFSET. This means that the kernel image occupies the lower part of the kernel memory area, typically just a few megabytes.
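As a rough sketch of the "N ends up at N + PAGE_OFFSET" rule, the translation in both directions is a plain addition or subtraction. The function names below are hypothetical, and the size of the directly mapped range is simplified to the whole kernel 1 GB; the kernel's own __va() and __pa() macros work in the same spirit.

```python
PAGE_OFFSET = 0xC0000000                    # 3 GiB, as in the 3/1 split
DIRECT_MAP_SIZE = (1 << 32) - PAGE_OFFSET   # simplification: the whole kernel 1 GiB

def phys_to_virt(paddr):
    """Kernel virtual address at which physical address paddr is visible."""
    if not 0 <= paddr < DIRECT_MAP_SIZE:
        raise ValueError("physical address is outside the direct mapping")
    return paddr + PAGE_OFFSET

def virt_to_phys(vaddr):
    """Inverse of phys_to_virt for addresses inside the direct mapping."""
    if not PAGE_OFFSET <= vaddr < (1 << 32):
        raise ValueError("virtual address is outside the direct mapping")
    return vaddr - PAGE_OFFSET

# A kernel image loaded at physical 2 MB shows up 2 MB above PAGE_OFFSET.
kernel_image_virt = phys_to_virt(0x200000)
```

Because the offset is constant, no page table walk is needed to convert between the two kinds of address inside this range.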

Note that I have talked about virtual address spaces so far, spaces that are reserved for certain things. To actually use the addresses, you will have to map physical RAM page frames to the virtual addresses. In the early days only a tiny bit of the kernel virtual address space was mapped, because there was so little physical RAM to map, but this changed dramatically quite soon when larger RAM sizes became affordable, leading to the 1 GB space not being enough to address all RAM. Thus the "high memory" mechanism was introduced, which provided a window where parts of the extra RAM were mapped as needed.

So why does the kernel need to have the RAM in its virtual address space? The thing is that the CPU can only (programmatically) access memory through virtual (mapped) addresses. Pointers in registers are pointers to the virtual address space, and so is the instruction pointer that directs program flow. The kernel needs to be able to freely access RAM, for example to zero a buffer that it will later provide to a user space process.

That the kernel has hoarded the whole of RAM into its address space does not mean that user space can't access it. More than one mapping to a RAM page frame can exist; it can be both permanently mapped into the kernel memory space and mapped at some address in a user space process while that process is running.
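The idea that one piece of memory can be reachable through more than one mapping can be tried from user space. The sketch below is only an analogy, not the kernel's own mechanism: it maps the same one-page temporary file twice with Python's mmap module (shared mappings by default) and shows that a write through one mapping is visible through the other.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

with tempfile.TemporaryFile() as f:
    f.truncate(PAGE)                        # one page of backing store
    view_a = mmap.mmap(f.fileno(), PAGE)    # first mapping
    view_b = mmap.mmap(f.fileno(), PAGE)    # second mapping of the same page
    view_a[0:5] = b"hello"                  # write through one mapping...
    seen_through_b = bytes(view_b[0:5])     # ...and read it back through the other
    view_a.close()
    view_b.close()
```

The two mappings sit at different virtual addresses in the process, yet both refer to the same underlying data, much as a page frame can appear both in the kernel's mapping and in a process's address space.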

Related Solutions

Linux – ZONE_NORMAL and its association with Kernel/User-pages

On a 32-bit architecture, linear addresses run from 0 to 0xffffffff, which gives 4'294'967'296 (4 GB) linear addresses (not physical space) with which to refer to a physical address.
Even with only 512 MB of physical storage (the real RAM stick connected to the bus), the kernel will still use the full 4 GB range of linear addresses to calculate the physical ones.

The Linux kernel divides these 4 GB (of addresses) into the user space (high memory) and the kernel space (low memory) by 3/1, so the kernel space has 1'073'741'824 (1 GB) linear addresses to use.

This 1 GB of linear addresses is accessible only by the kernel and is divided up even further.

ZONE_DMA: Contains page frames of memory below 16 MB. This is used for old ISA buses, which can address only the first 16 MB of RAM.

ZONE_NORMAL: Contains page frames of memory at and above 16 MB and below 896 MB. These are the addresses that the kernel can map and access directly.

ZONE_HIGHMEM: Contains page frames of memory at and above 896 MB. Page frames above this boundary are not generally mapped into the kernel space and are therefore not directly accessible by the kernel. Page frames from the user space can be temporarily or permanently mapped here.
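The three zone boundaries listed above amount to a simple lookup on the physical address. The following is a minimal sketch using only the 16 MB and 896 MB limits given here; zone_of and zone_sizes are hypothetical helper names, and a real system reserves some memory that this ignores.

```python
MiB = 1 << 20
ZONE_DMA_END = 16 * MiB       # upper bound of ZONE_DMA
ZONE_NORMAL_END = 896 * MiB   # upper bound of ZONE_NORMAL

def zone_of(paddr):
    """Name of the zone containing physical address paddr."""
    if paddr < 0:
        raise ValueError("physical address must be non-negative")
    if paddr < ZONE_DMA_END:
        return "ZONE_DMA"
    if paddr < ZONE_NORMAL_END:
        return "ZONE_NORMAL"
    return "ZONE_HIGHMEM"

def zone_sizes(ram_bytes):
    """Split ram_bytes of contiguous RAM starting at 0 into the three zones."""
    dma = min(ram_bytes, ZONE_DMA_END)
    normal = max(0, min(ram_bytes, ZONE_NORMAL_END) - ZONE_DMA_END)
    high = max(0, ram_bytes - ZONE_NORMAL_END)
    return {"ZONE_DMA": dma, "ZONE_NORMAL": normal, "ZONE_HIGHMEM": high}

small = zone_sizes(512 * MiB)    # machine with 512 MB: no high memory at all
large = zone_sizes(3072 * MiB)   # machine with 3 GB: most RAM is high memory
```

Under these assumptions a 512 MB machine has an empty ZONE_HIGHMEM, while on a 3 GB machine everything above 896 MB falls into it.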

How much real, physical RAM space is occupied by the different zones depends on the form and number of processes you run.

If you enter free -ml in your console, you can see the usage, including low and high memory:

             total       used       free     shared    buffers     cached
Mem:          3022       2116        905          0        105       1342
Low:           839        196        642
High:         2182       1919        263
-/+ buffers/cache:        667       2354
Swap:         2859         93       2766
