Do higher-end server CPUs typically have slower single-thread performance?

64-bit cpu performance threads xeon

Something I have seen many times, and which multiple benchmarks confirm: Xeon CPUs, and more generally Intel CPUs targeting the server market, have slower per-thread performance than a Core iX CPU.
Even a $117 22nm Core i3 Ivy Bridge CPU will typically run Python workloads faster than a $2000 10nm Xeon Cannon Lake CPU. And that is without even enabling Turbo Boost!

Except in the case of Python (where the language doesn't have proper multithreading support), server workloads are more multithreaded and more multiprocess than the games and applications an individual runs, which explains why server CPUs sacrifice single-thread performance in order to have more cores.

While it is already known that Intel and other hardware manufacturers can no longer increase performance using single-core designs, what (in detail) does decreasing per-thread performance for the same microarchitecture bring? Why not continue to just add fewer but faster cores per chip for the same price?

Best Answer

(This post is asking for speculation and I'm happy to oblige.)

Why not continue to just add fewer but faster cores per chip for the same price?

The problem is that current technology has hit its limits, so only minor single-core performance improvements are now possible. Improvements of 10-20% just don't sound very convincing.

On the other hand, manufacturers do not wish to fall behind Moore's law, popularly read as saying that chip performance roughly doubles every 18 months with no increase in power consumption. (Strictly, Moore's law is about transistor counts doubling roughly every two years; the 18-month performance figure is a common paraphrase.) Keeping up with it needs a 100% improvement, and no single-core technology delivers that.

Solution: double the number of cores and add up their total capacity, as proof that performance is still growing by the required 100%.

In real life, this theoretical gain from adding cores is not guaranteed to increase total performance, since some computer resources are shared and may become bottlenecks, for example the RAM, the bus and the disk.
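As a rough illustration of why doubling cores rarely doubles performance, here is a minimal Python sketch using Amdahl's law, a standard model that the answer itself does not name. It treats the share of work that is stuck behind serial code or a shared resource as a fixed serial fraction. The function name and the 90% parallel figure are assumptions chosen for illustration, not measurements.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup over a single core under Amdahl's law.

    parallel_fraction: share of the work that scales with core count (0..1).
    The remainder is treated as serial, e.g. time spent waiting on a
    shared resource. This is a simplification, not a hardware model.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 90% of the work parallelizes cleanly.
    for cores in (1, 2, 4, 8, 16, 32):
        print(f"{cores:>2} cores -> {amdahl_speedup(0.9, cores):.2f}x")
```

Under this model each doubling of the core count buys less than the previous one, and the speedup can never exceed 1 / (serial fraction), however many cores are added.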

What does decreasing per-thread performance for the same microarchitecture bring?

Increasing the number of cores cannot be done indefinitely, especially in view of power consumption. For a core to run faster, it needs more power. This means that the more cores you have, the smaller each core's share of the total available power, and so the slower each one must run.

The solution here is turbo mode, whereby one core gets most of the available power budget. So you have one fast core while the others are either turned off or slowed down. But as one core cannot sustain that mode indefinitely, the solution is to switch turbo mode on for multiple cores in rotation.
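The power trade-off described above can be sketched numerically. The Python below is a toy model, not vendor data: it assumes per-core power grows with the cube of clock frequency (a common rule of thumb when voltage must rise along with frequency) and that the chip's power budget is split evenly among the active cores. The function name, the 2 W at 1 GHz baseline and the 128 W budget are all hypothetical.

```python
def per_core_frequency_ghz(budget_watts: float, active_cores: int,
                           watts_at_1ghz: float = 2.0,
                           exponent: float = 3.0) -> float:
    """Clock each active core can sustain under an evenly split power budget.

    Assumed model (illustrative only):
        power_per_core = watts_at_1ghz * frequency_ghz ** exponent
    """
    if active_cores < 1:
        raise ValueError("active_cores must be at least 1")
    per_core_watts = budget_watts / active_cores
    return (per_core_watts / watts_at_1ghz) ** (1.0 / exponent)


if __name__ == "__main__":
    budget = 128.0  # hypothetical package power budget in watts
    for cores in (1, 2, 4, 8, 16):
        f = per_core_frequency_ghz(budget, cores)
        # cores * f is a crude proxy for perfectly parallel throughput.
        print(f"{cores:>2} active cores: {f:.2f} GHz each, "
              f"aggregate {cores * f:.2f} core-GHz")
```

In this model, doubling the number of active cores lowers each core's clock by about 21% while raising aggregate throughput by about 59%, which is the trade a server part makes. The single-active-core row corresponds to the turbo case, where one core takes the whole budget. Real chips also hit voltage, thermal and frequency ceilings that this sketch ignores.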

In general, for comparable technology, a CPU with fewer cores may prove faster than a CPU with many cores in a core-to-core comparison. Other factors may influence the speed, but the choice usually comes down to the number of cores versus single-core performance. Whether turbo mode suits the workload is another question.
