GPU cores can effectively run many threads at the same time, due to the way they switch between threads for latency hiding. In fact, you need to run many threads per core to fully utilize your GPU.
A GPU is deeply pipelined, which means that even though new instructions can start every cycle, each individual instruction may take many cycles to run. Sometimes an instruction depends on the result of a previous instruction, so it can't start (enter the pipeline) until that previous instruction finishes (exits the pipeline). Or it may depend on data from memory that will take many cycles to arrive. On a CPU, this results in a "pipeline stall" (or "bubble"), which leaves part of the pipeline sitting idle for a number of cycles, just waiting for the new instruction to start. This is a waste of computing resources, but it can be unavoidable.
Unlike a CPU, a GPU core is able to switch between threads very quickly — on the order of a cycle or two. So when one thread stalls for a few cycles because its next instruction can't start yet, the GPU can just switch over to some other thread and start its next instruction instead. If that thread stalls, the GPU switches threads again, and so on. These additional threads are doing useful work in pipeline stages that would otherwise have been idle during those cycles, so if there are enough threads to fill up each other's gaps, the GPU can do work in every pipeline stage on every cycle. Latency in any one thread is hidden by the other threads.
This is the same principle that underlies Intel's Hyper-Threading feature, which makes a single core appear as two logical cores. In the worst case, threads running on those two cores will compete with each other for hardware resources, and each run at half speed. But in many cases, one thread can utilize resources that the other can't — ALUs that aren't needed at the moment, pipeline stages that would be idle due to stalls — so that both threads run at more than 50% of the speed they'd achieve if running alone. The design of a GPU basically extends this benefit to more than two threads.
You might find it helpful to read NVIDIA's CUDA Best Practices Guide, specifically chapter 10 ("Execution Configuration Optimizations"), which provides more detailed information about how to arrange your threads to keep the GPU busy.
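To make "many threads per core" concrete, here is a minimal CUDA sketch (the kernel name, array size, and block size are illustrative, not taken from any guide): it launches far more threads than the GPU has cores, so whenever one warp stalls on a memory load, the hardware scheduler has other warps ready to issue.

    #include <cuda_runtime.h>

    // Each thread scales one element. The load from global memory can stall
    // for many cycles, during which the core simply issues other resident warps.
    __global__ void scale(float *data, float factor, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] *= factor;
    }

    int main()
    {
        const int n = 1 << 20;                 // 1M elements (illustrative)
        float *d_data = nullptr;
        cudaMalloc(&d_data, n * sizeof(float));

        // 256 threads per block and ~4096 blocks: far more threads in flight
        // than the GPU has cores, so there is always other work to issue
        // while some warps wait on memory.
        const int threadsPerBlock = 256;
        const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        scale<<<blocks, threadsPerBlock>>>(d_data, 2.0f, n);

        cudaDeviceSynchronize();
        cudaFree(d_data);
        return 0;
    }

The exact block size that keeps the GPU busiest varies by kernel and hardware, which is what the execution configuration chapter of the Best Practices Guide covers.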
As another user commented, it's mostly OS-dependent.
"If a CPU has 2 logical cores, it can run two programs 100% concurrently, yes?"
Concurrently, yes; truly in parallel, not necessarily: two logical cores usually share one physical core's execution resources, so both programs make progress, but not each at full independent speed. See: https://softwareengineering.stackexchange.com/questions/190719/the-difference-between-concurrent-and-parallel-execution
"For example, say I have 100 processes running on 2 cores ... will the OS try and divide 50 on each core for load balance? Will they be randomly scattered?"
Each OS has its own scheduling algorithm. In general, the scheduler balances runnable processes across the cores dynamically rather than statically assigning 50 to each, and they are not scattered purely at random either.
"Say I launch mspaint.exe on a quad-core Intel chip ... where will it be executed from (core 1, 2, 3, 4?), and will it continue executing there until close?"
We can't know in advance which core it will run on, and it will most probably not stay on the same core from launch to close; the scheduler is free to migrate it between cores. Again, it depends on the OS scheduler.
"Is it truly possible to pick a specific core, or program for multi-cores directly, without having a transparent daemon or the OS doing it randomly for you?"
Apparently yes, by setting the process's or thread's CPU affinity: https://stackoverflow.com/questions/663958/how-to-control-which-core-a-process-runs-on
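As a hedged illustration only (Windows-specific, and the mask value is just an example), here is a minimal sketch of pinning the current process to one core with the Win32 SetProcessAffinityMask call; other systems have equivalents such as sched_setaffinity on Linux.

    #include <windows.h>
    #include <cstdio>

    int main()
    {
        // Restrict the current process to logical core 0.
        // Bit N of the mask corresponds to logical core N.
        DWORD_PTR mask = 0x1;
        if (!SetProcessAffinityMask(GetCurrentProcess(), mask))
            std::printf("SetProcessAffinityMask failed: %lu\n", GetLastError());

        // ... everything this process does from here on runs on core 0 ...
        return 0;
    }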
"How so, if all people say is 'just use threads'? Is using multi-threads mapped to cores? If so, how is using a thread tailored to a core without OS intervention if threads on a single-core do not concurrently work?"
I didn't quite understand the question here, but the basic idea with threads is that you create them and the OS runs them using its scheduling algorithm; there's no need for you to control which logical or physical core they run on (there may be cases where you'd want to do that, though I'm not sure why).
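As a tiny illustration (standard C++; the work function and thread count are just examples, not from the original answer): you create the threads and join them, and which cores they run on, and whether they migrate, is entirely up to the OS scheduler.

    #include <thread>
    #include <vector>
    #include <cstdio>

    void work(int id)
    {
        // Whatever this thread does; the OS decides which core it runs on,
        // and may move it between cores over its lifetime.
        std::printf("thread %d running\n", id);
    }

    int main()
    {
        std::vector<std::thread> pool;
        for (int i = 0; i < 4; ++i)
            pool.emplace_back(work, i);   // just create the threads...
        for (auto &t : pool)
            t.join();                     // ...the scheduler maps them to cores
        return 0;
    }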
Best Answer
A CPU is a much more general-purpose machine than a GPU. We might talk about using a GPU for "general purpose" computation (GPGPU), but the two have different strengths.
CPU cores are capable of a wide variety of operations and deal with what can, for all intents and purposes, be considered a randomly branching instruction stream: multiple programs all vying for time on the processor, controlled by the operating system. They cache and predict as much as they can while still trying to remain capable of dealing with sudden changes in the instruction stream.
GPUs, on the other hand, are designed to deal with data streams: a small program (a shader) run across a potentially vast amount of data. HD, 2K, and 4K screens contain a huge number of pixels, and a shader must run over every pixel, in successive passes, to achieve particular effects. To that end their programs are (compared to a CPU's) smaller, their per-core caches similarly smaller, but their bandwidth to memory is phenomenally higher.
They might, with suitable programming, be able to achieve the same tasks, but this difference in focus, instruction processing versus data processing, is what separates a CPU from a GPU.
As such their cores are designed to work to those strengths. For a long while GPU shader cores have operated around 1-2 GHz (modern Intel graphics cores list their speeds as 500 MHz to 1.5 GHz), while CPU cores have run anywhere between 1.5 and 4 GHz and more.
Instruction processing benefits more from the speed of individual units, because it can be difficult or impossible to break an instruction stream down into multiple streams; hence CPUs need higher clock speeds to get through instructions quickly. The problem is that the faster you run a core, the more heat it generates, so you hit a limit on how fast you can run it. (There are other technical limitations that affect clock speed, but that's something for another story.)
Data processing, on the other hand, lends itself to parallelism: the same task (program) is run on different pieces of data, so the more cores you can throw at it, the better. Running cores at a slower speed generates less heat; less heat means you can fit in more cores, and more cores mean better data throughput. Hence data tasks benefit from a different (smaller, leaner) type of core than a CPU's.
The end result is that we have two distinct types of processor: one aimed at general-purpose instruction streams, and one aimed at bulk data handling.