Your calculation is correct. shmall can be set higher than the available virtual memory. If you tried to use all of it, the allocation would fail for other reasons, not because shmall was exceeded.
BTW there are also commands to find these IPC limits:
ipcs -l
lsipc # util-linux>=2.27
Note that even virtual memory is unlimited on Linux by default, so allocations can exceed RAM + swap (overcommit). See
https://serverfault.com/questions/606185/how-does-vm-overcommit-memory-work
How does the OOM killer decide which process to kill first?
On the other hand, you can limit the virtual memory per process using ulimit -v, which does not affect the kernel's /proc/sys/kernel/shmall either.
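To make the independence of the two limits concrete, here is a minimal sketch that reads both. It assumes a Linux system; the helper names (describe_rlimit, shmall_bytes, read_shmall) are made up for illustration. It relies on two facts: `ulimit -v` corresponds to RLIMIT_AS and is reported in KiB, and kernel.shmall is counted in pages rather than bytes.

```python
import resource


def describe_rlimit(value):
    """Render an rlimit value the way `ulimit -v` reports it (KiB or 'unlimited')."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def shmall_bytes(shmall_pages, page_size):
    """kernel.shmall is counted in pages, so multiply by the page size."""
    return shmall_pages * page_size


def read_shmall(path="/proc/sys/kernel/shmall"):
    """Return kernel.shmall in pages, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # Per-process address-space limit (what `ulimit -v` changes).
    soft, _hard = resource.getrlimit(resource.RLIMIT_AS)
    print("per-process virtual memory (soft):", describe_rlimit(soft))

    # System-wide shared memory limit, unaffected by the rlimit above.
    pages = read_shmall()
    if pages is not None:
        print("system-wide shmall:", shmall_bytes(pages, resource.getpagesize()), "bytes")
```

Changing the limit with ulimit -v and running the script again should change only the first line of output.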
First of all, you cannot simply run a 32-bit executable as-is on a 64-bit system, so there is no need to translate a 32-bit address into a 64-bit address or anything like that.
Second, 32-bit mode (without PAE) simply does not allow mapping memory above the 4 GB boundary.
I thought about the problem a lot, and after reading the section several times I figured it out.
You may have seen the similar question on Stack Overflow: Linux Kernel Memory Management Paging Levels
But I will try to explain it the way I understood it.
What I want to explain is how four-level paging works on a 32-bit system.
The first sentence is essential:
[...] eliminates the Page Upper Directory and the Page Middle Directory fields by saying that they contain zero bits.
This does not mean that the kernel sets these fields to zero. Rather, it treats them as zero without this being encoded anywhere.
So you keep the usual 32-bit address split for two-level paging.
This means you use the 10 most significant bits of the virtual address for PML4 (Linux: PGD).
The PML4 (Linux: PGD) points to the PDPT (Linux: PUD), which has only one entry. Because the kernel says the index/offset is zero, this single entry is taken.
The single entry of the PDPT (Linux: PUD) points to the PD (Linux: PMD), which again has only one entry. Again the kernel says the index/offset is zero, so this single entry is taken.
Finally, the single entry of the PD (Linux: PMD) points to the PT, where the middle 10 bits of the virtual address are used as the index to find the wanted page.
In short:
[1024 *] PML4 (Linux: PGD) -> 1 * PDPT (Linux: PUD) -> 1 * PD (Linux: PMD) -> 1024 * PT
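The address split described above can be sketched as a small function. It assumes the standard 32-bit x86 layout without PAE (10 bits PGD, 10 bits PT, 12 bits page offset) and gives the PUD and PMD fields a width of zero bits; the name split_vaddr is hypothetical and not taken from the kernel.

```python
# Field widths for 32-bit x86 without PAE:
# PGD 10 bits, PUD 0 bits, PMD 0 bits, PT 10 bits, page offset 12 bits.
PGD_BITS, PUD_BITS, PMD_BITS, PT_BITS, OFFSET_BITS = 10, 0, 0, 10, 12


def split_vaddr(vaddr):
    """Split a 32-bit virtual address into (pgd, pud, pmd, pt, offset)."""
    fields = []
    shift = 32
    for bits in (PGD_BITS, PUD_BITS, PMD_BITS, PT_BITS, OFFSET_BITS):
        shift -= bits
        # A zero-width field has mask 0, so its index is always 0.
        fields.append((vaddr >> shift) & ((1 << bits) - 1))
    return tuple(fields)


if __name__ == "__main__":
    print(split_vaddr(0xC0101234))  # (768, 0, 0, 257, 564)
```

The PUD and PMD indices come out as 0 for every address, which is exactly why those two tables need only a single entry each.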
An application can request huge pages; the kernel does not choose the page size for it, apart from the PAGE_SIZE that is compiled into the kernel. The application selects the page size in its own source code through mmap flags.
kmalloc uses the kernel's default page size, PAGE_SIZE, which is determined at compile time or at runtime. The same applies to vmalloc.
The amount of wasted memory depends on PAGE_SIZE and the data size. If the page size is 4 MB and the data is 5 MB, N = 2 pages are needed and the wasted memory is (PAGE_SIZE * N) - 5 MB = 3 MB.
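The waste calculation can be checked with a few lines of arithmetic. This is only a sketch of the formula above: it assumes the data is rounded up to whole pages, and wasted_bytes is an illustrative name, not a kernel function.

```python
MB = 1024 * 1024


def wasted_bytes(data_size, page_size):
    """Bytes left unused when data_size is rounded up to whole pages."""
    pages = -(-data_size // page_size)  # ceiling division: N pages needed
    return pages * page_size - data_size


if __name__ == "__main__":
    print(wasted_bytes(5 * MB, 4 * MB) // MB, "MB wasted")  # 3 MB wasted
```

With 4 KiB pages the same 5 MB of data fits exactly into 1280 pages and nothing is wasted, which shows the trade-off of larger page sizes.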