In Linux, a task (the kernel's internal notion of a thread; threads can share resources such as memory and open files, and some run only inside the kernel) can run in userland, or its thread of execution can transfer into the kernel (and back) to execute a system call. A user thread can be hijacked temporarily to handle an interrupt (but that isn't really that thread running).
Whether a process is a "system process" or a regular user process is completely irrelevant in Unix; they are handled just the same. In Linux, some tasks run in-kernel to handle miscellaneous jobs. They are kernel jobs, however, not "system processes".
One big caveat: Textbooks on complex software products (compilers and operating systems are particularly egregious examples) tend to explain simplistic algorithms (often ones that haven't been used in earnest for half a century), because real-world machines and user requirements are much too complex to be handled in a way that can be described simply and systematically. Much of a compiler is ad-hoc tweaks (particularly in code optimization, where the transformations are mostly the subset of possibilities that show up in practical use). In the case of Linux, most of the code is device drivers (mentioned in passing as "device-dependent" in operating system texts), and a hefty slice of that code handles misbehaving devices, which do not comply with their own specifications or which behave differently between versions of "the same device". Often what is explained in minute detail is just the segment of the job that can be reduced to some nice theory, leaving the messy, irregular part (almost) completely out. For instance, Chris Fraser and David Hanson, in their book describing the LCC compiler, state that typical compiler texts consist mostly of explanations of lexical analysis and parsing, with very little on code generation. Yet lexical analysis and parsing make up some 5% of the code of their (engineered to be simple!) compiler, and had a negligible error rate. The complex part of the compiler is just not covered in standard texts.
First of all, a 32-bit executable cannot simply be run as-is on a 64-bit system, so there is no need to translate a 32-bit address into a 64-bit address or anything like that.
Second, 32-bit mode (without PAE) simply does not allow mapping memory above the 4 GB boundary.
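The 4 GB limit follows directly from the address width. As a quick sanity check, here is the arithmetic in Python. The helper name `addressable_gib` is made up for this illustration, and the 36-bit figure assumes the original PAE physical address width (PAE widens physical addresses only; virtual addresses stay 32 bits):

```python
GIB = 2**30  # number of bytes in one GiB

def addressable_gib(address_bits):
    """Return how many GiB can be addressed with the given number of address bits."""
    return 2**address_bits // GIB

print(addressable_gib(32))  # plain 32-bit addresses
print(addressable_gib(36))  # assumption: 36-bit physical addresses, as with original PAE
```

With 32 bits this gives 4 GiB, and with 36 bits it gives 64 GiB.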
I thought about the problem a lot, and after reading the section several times, I figured it out.
You may have seen the similar question on Stack Overflow: Linux Kernel Memory Management Paging Levels
But I will try to explain it the way I understood it.
What I want to explain is how four-level paging works on a 32-bit system.
The first sentence is essential:
[...] eliminates the Page Upper Directory and the Page Middle Directory fields by saying that they contain zero bits.
This does not mean that the kernel writes zeros into these fields; it simply treats them as zero, without this being encoded anywhere in the address.
So you have the usual 32-bit address split for two-level paging.
This means the 10 most significant bits of the virtual address are used as the index into the PML4 (Linux: PGD).
Each PML4 (Linux: PGD) entry points to a PDPT (Linux: PUD), which has only one entry. Because the kernel treats the index/offset as zero, this single entry is taken.
The single entry of the PDPT (Linux: PUD) points to a PD (Linux: PMD), which again has only one entry. And again, the kernel treats the index/offset as zero, so this single entry is taken.
Finally, the single entry of the PD (Linux: PMD) points to the PT, where the middle 10 bits of the virtual address are used as the index to find the wanted page.
Summarized in short:
[1024 *] PML4 (Linux: PGD) -> 1 * PDPT (Linux: PUD) -> 1 * PD (Linux: PMD) -> 1024 * PT
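To make the split concrete, here is a minimal Python sketch of how I picture the address being decomposed. It is only an illustration, not kernel code: the function name `split_address` and the field-width constants are my own, and I am assuming 4 KiB pages without PAE (a 10/0/0/10/12 bit layout).

```python
# Assumed field widths for a 32-bit address, 4 KiB pages, no PAE.
# PUD and PMD are "folded": they are zero bits wide.
PGD_BITS, PUD_BITS, PMD_BITS, PTE_BITS, OFFSET_BITS = 10, 0, 0, 10, 12

def split_address(vaddr):
    """Split a 32-bit virtual address into four table indices and a page offset."""
    if not 0 <= vaddr < 2**32:
        raise ValueError("not a 32-bit address")
    fields = {}
    shift = 0
    # Walk from the least significant field (offset) up to the PGD index.
    for name, bits in (("offset", OFFSET_BITS), ("pte", PTE_BITS),
                       ("pmd", PMD_BITS), ("pud", PUD_BITS), ("pgd", PGD_BITS)):
        # A zero-width field has mask 0, so its index is always 0.
        fields[name] = (vaddr >> shift) & ((1 << bits) - 1)
        shift += bits
    return fields

print(split_address(0xC0123456))
```

For `0xC0123456` the PGD index comes out as 768, the PT index as 291, and the page offset as 1110 (`0x456`). The PUD and PMD indices are 0 for every address, which is exactly the "only entry is taken" step described above.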
Compatibility.
First, note that Sun's 64-bit support goes back to 1998, with Solaris 7, well before AMD64 and even Itanium had OS support. By supporting both 32-bit and 64-bit in userland, you could let the vast majority of software run completely unchanged.
Check out the Solaris 64-bit Developer's guide (dated 2005). First, it notes that there are really 2 separate systems:
and then repeatedly emphasizes that if you've got good old C code that assumes it's 32-bit, it'll work just fine - and even continue to build just fine, as if nothing had changed:
Successful tech transitions are usually accompanied by quirky hybrids and chimeras that sometimes live past their usefulness.