Linux – How does the kernel assign cpu ids to physical cores

cpulinuxlinux-kernel

One may find a lot of info about how to inspect physical cores layout in a running linux system, but I'm wondering how the physical cores are assigned a cpu id in the first place.

After experiencing on a couple of machines, it appears to me that the assignment is deterministic (i.e. booting from the same hardware will result in the same cpu ids assignment to physical processing core).
Also, it seems to me that the assignment works somehow "round-robin", such that two cores with nearby numbers are physically far away from each other. For example, in one server I'm working on (2 sockets x 8 cpus x 2 hyperthreading, x86_64): cpu0 is on socket 0, cpu1 is on socket 1, then cpu2 is again on socket 0 but it's on a different physical core from cpu0; … goes on like this until cpu15, then cpu16 is hyperthreaded on the same physical core as cpu0.

If this is deliberate as it seems, do you know of anywhere I can find this behaviour documented? What's the rationale behind it?

I'd expecially appreciate any reference to documentation or lkml post, but even a pointer to the right place in the sources could be helpful.

Best Answer

Is is hardly documented and largely depends on a platform. For x86, next available id is assigned to CPU in the function generic_processor_info()

So, for x86, cpu ids are depending on order in which we would call that function. It is called when APIC (interrupt controller) is initialized, while APIC settings are taken from ACPI MADT table and the ACPI tables are provided by BIOS.

You may try to decode them yourself using ACPI tools (acpica-tools package in CentOS):

# acpidump > acpi.dat
# acpixtract -a acpi.dat
# iasl -d apic.dat
...
# cat apic.dsl | awk -F: '/Subtable Type/ { st = $2 }
                          /Processor ID/ { id = $2 }
                          /Processor Enabled/ { print id, $2, st } '

Related Solutions

So what are logical cpu cores (as opposed to physical cpu cores)

Physical cores are just that, physical cores within the CPU. Logical cores are the abilities of a single core to do 2 or more things simultaneously. This grew out of the early Pentium 4 CPUs ability to do what was termed Hyper Threading (HTT).

It was a bit of a game that was being played where sub components of the core weren't being used for certain types of instructions while, another long running instruction might have been being executed. So the CPU could in effect work on 2 things simultaneously.

Newer cores are more full-fledged CPUs so they're working on multiple things simultaneously, but they aren't true CPUs as the physical cores are. You can read more about the limitations of the hyperthreading functionality vs. the physical capabilities of the core here on tomshardware in this article titled: Intel Core i5 And Core i7: Intel’s Mainstream Magnum Opus.

You can see the breakdown of your box using the lscpu command:

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
CPU(s):                4
Thread(s) per core:    2
Core(s) per socket:    2
CPU socket(s):         1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 37
Stepping:              5
CPU MHz:               2667.000
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              3072K
NUMA node0 CPU(s):     0-3

In the above my Intel i5 laptop has 4 "CPUs" in total

CPU(s): 4

of which there are 2 physical cores (1 socket × 2 cores/socket = 2 cores)

Core(s) per socket: 2

CPU socket(s): 1

of which each can run up to 2 threads

Thread(s) per core: 2

at the same time. These threads are the core's logical capabilities.

VirtualBox: Is it a bad idea to assign more virtual CPU cores than number of physical CPU cores

Hardware / OS / Software

Host: Linux Mint 18 Cinnamon 64-bit (fully updated); Kernel version 4.4.0-47-generic

Guest: Windows 8.1 Pro 64-bit (fully updated)

Processor: Intel Core i7-4700HQ, (6MB cache, 4 physical cores, or 8 using Hyper-Threading), CPU Benchmark

VirtualBox: Version 5.1.10 r112026 (Qt5.5.1)

Guest Additions: Installed and up-to-date

Benchmark Tool #1: WinRAR version 5.40 final 64-bit

Benchmark Tool #2: VeraCrypt version 1.19 final 64-bit

Preparation

In both cases I waited after boot until the CPU, RAM, disk drive are at stable near zero-point hits.

Method

Cloning the original virtual machine to have two identical ones.
I have, for the second pass, since the reboot disabled Antivirus pointed out at the bottom of this answer and updated WinRAR in both cases from a Beta to the Final version.
I have done the same Preparation as pointed out earlier.
The virtual machine ran in foreground, without any other CPU time hungry application running, I have disabled what I could for the purpose of the test not being influenced.
To include potential caching inside or outside the system, I ran the same test twice consequently. The benefit being almost none.

Results

WinRAR

4 cores => 7.5 minutes (shorter time is better)

WinRAR with 4 cores enabled, 1.5GiB processed in 7.5 minutes.
8 cores => 4.5 minutes (shorter time is better)

WinRAR with 8 cores enabled, 1.5GiB processed in 4.5 minutes.

VeraCrypt

4 cores => speed 2.6 GiB/s (higher speed is better)

VeraCrypt with 4 cores enabled, HW-accelerated AES (AES-NI) speed 2.6 GiB/s.
8 cores => speed 3.9 GiB/s (higher speed is better)

VeraCrypt with 8 cores enabled, HW-accelerated AES (AES-NI) speed 3.9 GiB/s.

Conclusion

I could run as many tests as necessary. But I figure, if these two, one of which is rather complex compression test, the second being a set of rather complex encryption tests, what would be the point.

Both of the benchmarks show a marked difference. I see no reason to believe, that their results are inaccurate, as I followed a rather rigorous preparation and method, moreover these tests have taken place in RAM to rule out I/O bottleneck. From my standpoint, the warning mentioned in the question may apply to some conditions, but certainly not all of them. Having shared with you these pretty remarkable results, I am certain for you to agree with me, that this warning probably should not be taken so seriously on modern CPUs featuring Hyper-Threading with the latest VirtualBox version. One thing for sure: Don't take me for the word and test it under your own conditions, before you decide to apply this setting permanently.

New benchmark

Host + Guest: Linux Mint 19.2 "Tina" - Cinnamon (64-bit); both with kernel: 5.3.0-24-generic.

Processor: Intel® Core™ i7-7700HQ; 6 MB Cache, up to 3.80 GHz, 4 physical cores, or 8 using Hyper-Threading, CPU Benchmark comparison

VirtualBox: Version 6.1.0 r135406 (Qt5.9.5)

Guest Additions: Installed and up-to-date

Benchmark Tool: VeraCrypt version 1.24 Hotfix1 64-bit final (web page, direct deb download link)

Preparation and Method

Same as previous benchmark.

Results

VeraCrypt AES encryption with 4 cores

⟶ speed 4.8 GiB/s ^{(higher speed is better)}

VeraCrypt AES encryption with 8 cores (Hyper-Threading warning issued)

⟶ speed 7.2 GiB/s ^{(higher speed is better)}

Conclusion

Wonderful 50% performance increase with Hyper-Threading enabled, but only with the AES sadly, I will have to run some more comprehensive test. Will be back in a few days with results.

Best Answer

Related Solutions

So what are logical cpu cores (as opposed to physical cpu cores)

VirtualBox: Is it a bad idea to assign more virtual CPU cores than number of physical CPU cores

Hardware / OS / Software

Preparation

Method

Results

WinRAR

VeraCrypt

Conclusion

New benchmark

Preparation and Method

Results

VeraCrypt AES encryption with 4 cores

VeraCrypt AES encryption with 8 cores (Hyper-Threading warning issued)

Conclusion

Related Question