Linux – How does a 64-bit Linux Kernel manage page tables for a 32-bit application in compatibility mode

linux, linux-kernel, virtual-memory

I am currently reading the book "Understanding the Linux Kernel", which says the following:

For 32-bit architectures with no Physical Address Extension, two paging levels are sufficient. Linux essentially eliminates the Page Upper Directory and the Page Middle Directory fields by saying that they contain zero bits. However, the positions of the Page Upper Directory and the Page Middle Directory in the sequence of pointers are kept so that the same code can work on 32-bit and 64-bit architectures. The kernel keeps a position for the Page Upper Directory and the Page Middle Directory by setting the number of entries in them to 1 and mapping these two entries into the proper entry of the Page Global Directory.

So the page table hierarchy of a 64-bit Linux kernel with 4-level paging looks like this:

PML4 (Linux: PGD) -> 512 * PDPT (Linux: PUD) -> 512 * PD (Linux: PMD) -> 512 * PT
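To make the hierarchy above concrete, here is a minimal sketch of how a virtual address is split under x86-64 4-level paging: four 9-bit table indices (512 entries per table) followed by a 12-bit page offset. The function name `split_vaddr64` and the dictionary keys are made up for illustration and are not kernel identifiers.

```python
# Field widths for x86-64 4-level paging: 9 + 9 + 9 + 9 index bits, 12 offset bits.
PAGE_SHIFT = 12
INDEX_BITS = 9
LEVELS = ("pgd", "pud", "pmd", "pt")  # PML4, PDPT, PD, PT


def split_vaddr64(vaddr):
    """Return the table index for each level plus the offset within the page."""
    fields = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    shift = PAGE_SHIFT + INDEX_BITS * (len(LEVELS) - 1)  # 39 for the PGD index
    for name in LEVELS:
        fields[name] = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
        shift -= INDEX_BITS
    return fields


if __name__ == "__main__":
    # The highest address below 4 GiB, as a zero-extended 32-bit address.
    print(split_vaddr64(0xFFFFFFFF))
```

Under this split, every address below 4 GiB has a PGD index of 0 and a PUD index between 0 and 3, because bits 32 and above are zero.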

The text says that two levels are sufficient (as in normal 32-bit paging), which is why the PUD and the PMD are "eliminated", yet each of these two tables has a length of one and keeps its place in the sequence.
As I understand it, this means that the PML4 (PGD) corresponds to the PD (PMD) and consists of direct pointers to the PTs, so the PUD and the PMD are "skipped". But this doesn't seem right, because after a switch into kernel mode, paging has to be done with 64-bit page tables to access kernel pages. Furthermore, this mapping scheme doesn't allow mapping memory (e.g. kernel pages) above the 4 GB boundary.
Another explanation could be that the 32-bit address is zero-extended to 64 bits and the first entry is used in each of the first two tables of the hierarchy. The remaining bits could then select the entry in the remaining two tables and the offset within a page frame. But this doesn't seem right either, because the number of index bits per table differs between 32-bit and 64-bit mode. So this would cause trouble, too.
So there must be something I haven't considered. I hope someone can clear things up.

Best Answer

First of all, a 32-bit executable does not simply run on a 64-bit system as is, and the quoted passage describes 32-bit architectures, not that case. So there is no need to translate a 32-bit address into a 64-bit address or anything like that.

Second, 32-bit paging (without PAE) simply cannot map memory above the 4 GB boundary.

I thought a lot about the problem, and after reading the section several times I figured it out. You may have seen the similar question on StackOverflow: Linux Kernel Memory Management Paging Levels

Here is how I understood it. What I am going to explain is how four-level paging works on a 32-bit system.

The first sentence is the essential one:

[...] eliminates the Page Upper Directory and the Page Middle Directory fields by saying that they contain zero bits.

This does not mean that the kernel sets these fields to zero. It means the fields are zero bits wide, so they take up no bits of the virtual address at all. You are therefore left with the usual 32-bit address split for two-level paging.

This means the 10 most significant bits of the virtual address are used as the index into the PML4 (Linux: PGD).

The PML4 (Linux: PGD) entry points to the PDPT (Linux: PUD), which has only one entry. Because the kernel treats the index as zero, this single entry is taken.

The single entry of the PDPT (Linux: PUD) points to the PD (Linux: PMD), which again has only one entry. Again the kernel treats the index as zero, so this single entry is taken.

Finally, the single entry of the PD (Linux: PMD) points to the PT, where the middle 10 bits of the virtual address are used as the index to find the wanted page.

In short:

[1024 *] PML4 (Linux: PGD) -> 1 * PDPT (Linux: PUD) -> 1 * PD (Linux: PMD) -> 1024 * PT
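
The summary above can be sketched as code. This is only an illustration of the idea, assuming the non-PAE 32-bit x86 layout described in this answer (10 + 10 index bits, 12 offset bits); the names `BITS`, `split_vaddr32` and `entries_per_table` are hypothetical and are not kernel identifiers. The folded PUD and PMD levels are modeled as index fields that are 0 bits wide, so their index is always 0 and each table has exactly one entry.

```python
PAGE_SHIFT = 12
# Index width per level on 32-bit x86 without PAE.
# PUD and PMD are folded: they occupy 0 bits of the virtual address.
BITS = {"pgd": 10, "pud": 0, "pmd": 0, "pt": 10}
ORDER = ("pgd", "pud", "pmd", "pt")


def split_vaddr32(vaddr):
    """Split a 32-bit virtual address into per-level indices and page offset."""
    fields = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    shift = 32
    for name in ORDER:
        shift -= BITS[name]
        # A 0-bit field has mask 0, so the index of a folded level is always 0.
        fields[name] = (vaddr >> shift) & ((1 << BITS[name]) - 1)
    return fields


def entries_per_table():
    """Number of entries in a table at each level: 2 ** (index width)."""
    return {name: 1 << BITS[name] for name in ORDER}


if __name__ == "__main__":
    print(entries_per_table())
    print(split_vaddr32(0xC0001234))
```

The same four-step walk (PGD, PUD, PMD, PT) works for both layouts. Only the per-level widths change, which is the point of keeping the two middle levels in the sequence.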