Why does the Apple A12X processor have better benchmark results than the i7-4790T?

cpu ipad performance

In my workstation I have an Intel i7-4790T that I've always thought was a pretty fast CPU. But according to Geekbench 4, the Apple A12X processor in the new iPad Pro comfortably beats it. When I run Geekbench 4 I get a single-core score of around 4,000, but on the new iPad Pro the A12X processor returns around 5,000, i.e. 25% faster. In fact even the A12 and A11 score more than my i7-4790T. On the multicore test my CPU scores a shade over 11,000 while the A12X scores 18,000, which is a whopping 60% or so faster.

A preliminary question is whether Geekbench is a reliable indicator of real-world speed. For example, the only thing I do that really stresses my CPU is video resampling with Handbrake. Handbrake isn't available for iOS, but assuming it were ported, would Handbrake really process videos 60% faster on the A12X, or is the Geekbench score unrepresentative of real-world performance?

But my main question is this: leaving aside exactly how the A12X and my CPU compare, how have Apple managed to get an ARM based RISC chip to be that fast? What aspects of its architecture are responsible for the high speed?

My understanding of RISC architectures is that they do less per clock cycle but their simple design means they can run at higher clock speeds. But the A12X runs at 2.5GHz while my i7 has a base speed of 2.7GHz and will boost to 3.9GHz in single core loads. So given my i7 will run at clock speeds 50% faster than the A12X how does the Apple chip manage to beat it?
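Put as rough arithmetic, the comparison looks like this. The sketch below uses only the single-core scores and clock speeds quoted above, and it assumes the i7 holds its 3.9GHz boost for the whole single-core run, which may not be true in practice. The "points per GHz" measure is just an illustrative ratio, not something Geekbench reports.

```python
# Figures quoted in the question. The i7 boost clock is an assumption:
# it is taken to hold for the entire single-core benchmark run.
scores = {"i7-4790T": 4000, "A12X": 5000}      # Geekbench 4 single-core
clocks_ghz = {"i7-4790T": 3.9, "A12X": 2.5}    # clock during the test


def points_per_ghz(chip):
    """Benchmark points delivered per GHz of clock speed."""
    return scores[chip] / clocks_ghz[chip]


per_clock = {chip: points_per_ghz(chip) for chip in scores}
ratio = per_clock["A12X"] / per_clock["i7-4790T"]

for chip, value in per_clock.items():
    print(f"{chip}: {value:.0f} points per GHz")
print(f"A12X does about {ratio:.2f}x the benchmark work per clock")
```

On these numbers the A12X gets nearly twice the benchmark score per clock cycle, so whatever explains the result has to be about work done per cycle rather than clock speed.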

From what I can find on the Internet the A12X has much more L2 cache, 8MB vs 256KB (per core) for my i7, so that's a big difference. But does this extra L2 cache really make such a big difference to the performance?

Appendix: Geekbench

The Geekbench CPU test only stresses the CPU and the CPU-memory speeds. The details of exactly how Geekbench does this are described in this PDF (136KB). The tests appear to be exactly the sort of things we do that use lots of CPU, and it appears they would indeed be representative of the Handbrake performance that I suggested as an example.

The detailed breakdown of the Geekbench results for my i7-4790T and the A12X are:

Test            i7-4790T      A12X
Crypto            3870        3727
Integer           4412        5346
Floating Point    4140        4581
Memory Score      3279        5320
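To see where the gap actually comes from, the per-test differences can be worked out from the table. This is only a sketch over the four scores listed above; the `relative_gain` helper is a name made up for illustration.

```python
# (i7-4790T score, A12X score) for each Geekbench 4 subtest, copied from the table.
results = {
    "Crypto": (3870, 3727),
    "Integer": (4412, 5346),
    "Floating Point": (4140, 4581),
    "Memory Score": (3279, 5320),
}


def relative_gain(i7, a12x):
    """Percentage by which the A12X score exceeds the i7 score (negative if lower)."""
    return (a12x - i7) / i7 * 100


gains = {test: relative_gain(*pair) for test, pair in results.items()}

for test, gain in gains.items():
    print(f"{test:<15} {gain:+.1f}%")
```

The memory score shows by far the largest gap, and crypto is the only subtest where the i7-4790T comes out ahead.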

Best Answer

The A12X is an enormous CPU built on the latest technology, leaving far behind the older i7-4790T dating from 2014.

The first difference is the manufacturing process: the A12X is a 7 nm chip, while the i7-4790T (Haswell-DT) is built on the older 22 nm process. Smaller transistors take up less space, need less operating power, and allow faster signals across shorter chip paths.

The A12X has a whopping 10 billion transistors, while the i7-4790T has only 1.4 billion.

This allows the A12X to have six integer execution pipelines (two of which are complex units), two load/store units, two branch ports, and three FP/vector pipelines, for an estimated total of 13 execution ports, far more than the eight execution ports of the Haswell-DT architecture.

For cache size, per core, the A12 has the following: each Big core has 128 kB of L1 cache and 8 MB of L2 cache. Each Little core has 32 kB of L1 and 2 MB of L2. There’s also an additional 8 MB of SoC-wide cache (also used for other things).

The Haswell architecture has 64 KB of L1 cache per core, 256 KB of L2 cache per core, and 2–40 MB of shared L3 cache.
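As a quick sanity check on the size of the difference, the per-core figures quoted above can be compared directly. This sketch takes the cache sizes exactly as stated in this answer (they are not independently verified here) and ignores the shared L3 and SoC-wide caches, which are not per-core.

```python
KB = 1
MB = 1024 * KB

# Per-core cache sizes as quoted in the answer, expressed in KB.
a12x_big_core = {"L1": 128 * KB, "L2": 8 * MB}
haswell_core = {"L1": 64 * KB, "L2": 256 * KB}

# How many times larger each A12X cache level is than its Haswell counterpart.
ratios = {level: a12x_big_core[level] / haswell_core[level] for level in a12x_big_core}

for level, ratio in ratios.items():
    print(f"{level}: A12X big core has {ratio:.0f}x the capacity")
```

On the quoted figures that is twice the L1 and 32 times the L2 per big core.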

It can be seen that the A12X beats the i7-4790T on all points and by a large margin.

Regarding RISC vs CISC architecture, this is now a moot point on modern processors. Both architectures have evolved to the point where they now emulate each other’s features to a degree in order to mitigate weaknesses.

I quote here the chart of comparisons to Xeon 8192, i7 6700k, and AMD EPYC 7601 CPUs, compiled by Reddit (link below), where the A12 compares well even with desktop processors:

image

Sources :
