Impact of the L3 cache on performance – worth a dual-processor system

cpu xeon

I will be purchasing a new high-end system, and I would like to have a better sense of whether a dual-processor Xeon system (I am looking at the new, high-end Xeon E5-2687W) might, realistically, provide a noticeable performance improvement due to the doubling of the L3 cache (20 MB per CPU).

(This is in addition to the occasional added advantage due to the doubling of cores and RAM.)

My usage scenario is, roughly, that I have many background applications running at any time – 3 or 4 data compression/backup applications, a low-impact web server, one or two virtual machines at any given time (usually fairly idle), and perhaps 20 utility programs that utilize a noticeable (but small) portion of the CPU cores. In total, when I am not actively using the computer, about 25% of the total CPU power is utilized in my current i7-970 6-core (12 thread) system.

When I am doing routine work, the CPU utilization often exceeds 50%, and occasionally hits 75%-80%.

The Xeon E5-2687W is not only built on the Sandy Bridge architecture used by second-generation Core i7 chips (so it should improve performance for that reason), but also has 8 cores (16 threads) rather than 6 cores. For this reason, I expect to run into the 75% CPU range even less frequently. Nonetheless, the ability to double the cores and the RAM is a consideration.

However, in the end, I believe this decision comes down to whether the doubling of the L3 cache will provide a noticeable improvement. There are many benchmarks, and a lot of discussion, regarding CPU power. However, I find very little discussion of L3 cache utilization, and how increases in the L3 cache (such as doubling it with dual processors) affect performance.

For example: if there are only two processes running, but each benefits from a large L3 cache (as might be the case for background processes that frequently scan the file system), perhaps overall system performance might noticeably improve with dual CPUs, even if only a single core is active on each CPU, because each process would have double the effective L3 cache.

I am hoping that someone has a sense of the benefits of increasing (or doubling) the L3 cache size.

Note: the CPU I am considering (the Xeon E5-2687W) has 20 MB of L3 cache, so a system with dual CPUs would have 40 MB of L3 cache in total (20 MB per socket).

Best Answer

As always with caching questions, the answer is "it depends entirely on your workload". The L3 cache only helps if your running processes spend a significant share of their time accessing memory, exhibit noticeable locality of reference in the addresses they touch, and have working sets too large for the smaller per-core L1/L2 caches.
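To make the "depends on your workload" point concrete, here is a minimal sketch of a toy cache model. It assumes a fully associative cache with LRU replacement and a process that scans its working set in a loop; real L3 caches are set-associative, use other replacement policies, and sit behind hardware prefetchers, so treat the numbers as illustrative only. The names `lru_hit_rate` and `cyclic_scan` are made up for this example.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, cache_lines):
    """Replay cache-line addresses through a toy fully associative LRU cache."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for addr in accesses:
        total += 1
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / total if total else 0.0


def cyclic_scan(working_set_lines, passes):
    """Hypothetical workload: scan the same working set over and over."""
    for _ in range(passes):
        for line in range(working_set_lines):
            yield line


fits = lru_hit_rate(cyclic_scan(100, 10), cache_lines=128)
too_big = lru_hit_rate(cyclic_scan(200, 10), cache_lines=128)
doubled = lru_hit_rate(cyclic_scan(200, 10), cache_lines=256)
doubled_again = lru_hit_rate(cyclic_scan(200, 10), cache_lines=512)

print(f"{fits:.2f}")           # 0.90: working set fits in the cache
print(f"{too_big:.2f}")        # 0.00: working set exceeds the cache
print(f"{doubled:.2f}")        # 0.90: doubling the cache makes it fit
print(f"{doubled_again:.2f}")  # 0.90: doubling again gains nothing
```

In this model, doubling the cache only pays off when it moves the working set from "does not fit" to "fits". If the working set already fits, or is far larger than either size, the extra capacity changes little.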

Running many processes and threads at the same time increases the odds of thrashing the shared cache, which diminishes the performance gains you might otherwise have seen. This is also why cache size grows with core count: the more threads you have competing for memory, the larger the shared cache likely needs to be in order to be useful at all.
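The thrashing effect can be sketched with a toy model as well. The code below assumes a fully associative LRU cache and two hypothetical processes, "A" and "B", that each loop over a private working set and alternate accesses one for one. Real caches and schedulers behave less neatly, so this is an illustration of the mechanism, not a prediction for any particular CPU.

```python
from collections import OrderedDict
from itertools import chain


def lru_hit_rate(accesses, cache_lines):
    """Replay cache-line addresses through a toy fully associative LRU cache."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for addr in accesses:
        total += 1
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / total if total else 0.0


def stream(tag, working_set_lines, passes):
    """Accesses of one process: repeated scans of its own private working set."""
    return [(tag, line) for _ in range(passes) for line in range(working_set_lines)]


def interleave(a, b):
    """Alternate accesses from two equally long streams, as if both ran at once."""
    return list(chain.from_iterable(zip(a, b)))


a = stream("A", 100, 10)
b = stream("B", 100, 10)

alone = lru_hit_rate(a, cache_lines=128)                           # A has the cache to itself
shared = lru_hit_rate(interleave(a, b), cache_lines=128)           # A and B share one cache
shared_doubled = lru_hit_rate(interleave(a, b), cache_lines=256)   # shared cache twice as large

print(f"{alone:.2f}")           # 0.90
print(f"{shared:.2f}")          # 0.00: the two working sets evict each other
print(f"{shared_doubled:.2f}")  # 0.90
```

Each process alone fits comfortably, but together their combined working set overflows the shared cache and every access misses. Giving each process its own cache of the original size (the dual-socket case in the question, if each process stays on its own socket) has the same effect in this model as doubling the shared cache.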

There is an oldish article from Tom's Hardware comparing two old P4 chips with and without L3 cache for a number of rendering / graphical workloads. The numbers are rubbish, as is the whole benchmark, but it contains a nice explanation of the caching architecture in general and L3 caching in particular.

The bottom line: you likely would not notice the difference, but if you need the exact numbers, you would have to purchase both CPUs and run your workload on both of them to compare runtimes.
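If you do get access to both machines, a small timing harness along these lines is one way to compare them. This is only a sketch: `median_runtime` and `example_workload` are hypothetical names, and the workload shown is a stand-in that you would replace with something representative of your own usage.

```python
import statistics
import time


def median_runtime(workload, repeats=5):
    """Run workload() `repeats` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def example_workload():
    """Stand-in workload (an assumption): a strided walk over a list."""
    data = list(range(200_000))
    return sum(data[i] for i in range(0, len(data), 7))


print(f"median: {median_runtime(example_workload):.4f} s")
```

Run the same script on each system, with the same background load, and compare the medians. Using the median rather than a single run reduces the influence of one-off stalls.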
