The 32-bit x86 is almost as obsolete today as the 16-bit 8086 was when Linux was born in the early 1990s. Back then the 4 GB virtual address space made possible by the 386 was plenty, because a typical desktop machine had only a few tens of megabytes of RAM.
Linus made the decision to split up the virtual address space so that the upper 1 GB (starting at address 0xc0000000) was reserved for the kernel, and the lower 3 GB (starting at address 0) was available to the user space process. All physical RAM was then mapped starting at PAGE_OFFSET, i.e. the addresses starting at 3 GB. 1 GB was plenty at the time, since (as I mentioned earlier) the typical amount of physical RAM was way smaller than this, and this split left a comfortable 3 GB space for user space.
Before switching on paging (i.e. the virtual to physical address mapping) the kernel image, containing the kernel code and static data, is loaded at the start of physical memory. (Well, not exactly, but typically starting at 2 MB, because of some quirks of the PC platform.) When paging is turned on, the memory at physical address N ends up at virtual address N + PAGE_OFFSET. This means that the kernel image occupies the lower part of the kernel memory area, typically just a few megabytes.
Note that I have talked about virtual address spaces so far, spaces that are reserved for certain things. To actually use the addresses, you will have to map physical RAM page frames to the virtual addresses. In the early days only a tiny bit of the kernel virtual address space was mapped, because there was so little physical RAM to map, but this changed dramatically quite soon when larger RAM sizes became affordable, leading to the 1 GB space not being enough to address all RAM. Thus the "high memory" mechanism was introduced, which provided a window where parts of the extra RAM was mapped as needed.
So why does the kernel need to have the RAM in its virtual address space? The thing is that the CPU can only (programmatically) access memory through virtual (mapped) addresses. Pointers in registers are pointers into the virtual address space, and so is the instruction pointer that directs program flow. The kernel needs to be able to freely access RAM, for example to zero a buffer that it will later provide to a user space process.
That the kernel has hoarded the whole of RAM into its address space does not mean that user space can't access it. More than one mapping to a RAM page frame can exist; a frame can be both permanently mapped into the kernel memory space and mapped at some address in a user space process's address space when that process is chosen for execution.
Best Answer
On a modern 64-bit x86 Linux?
Yes. It calls kmap() or kmap_atomic(), but on x86-64 these will always use the identity mapping. x86-32 has a specific definition of kmap_atomic(), but I think x86-64 uses the generic definition in include/linux/highmem.h. And yes, the identity mapping uses 1 GB hugepages.
There is an LWN article which mentions kmap_atomic().
I found kmap_atomic() by looking at the PIO code.[*]
Finally, when read() / write() copy data from/to the page cache:
generic_file_buffered_read -> copy_page_to_iter -> kmap_atomic() again.
[*] I looked at PIO, because I realized that when performing DMA to/from the page cache, the kernel could avoid using any mapping. The kernel could just resolve the physical address and pass it to the hardware :-) (subject to the IOMMU). Although the kernel will still need a mapping if it wants to checksum or encrypt the data first.