The Uniqueness of Mac’s Neural Engine in the Chip Industry

apple-silicon, cpu, gpu, hardware, motherboard

From Wikipedia:

The Neural Engine allows Apple to implement neural networks and machine learning in a more energy-efficient manner than using either the main CPU or the GPU. However, third-party apps cannot use the Neural Engine, leading to neural network performance similar to that of older iPhones.

I've read this Reddit thread and could hardly find the answer, so I'd love to ask:

What is the Neural Engine in new generation Mac architectures?

I know there is a CPU, a component that receives commands (instructions) and works against a memory (RAM), and I know there is a GPU that receives commands and works against another memory (matrix driven).

Is the NE also a similar component to them? If so, does it also have a unique instruction set? Specific programming/logic/scripting languages? Equivalent competitors in the market/industry?

If, as a computer person, I want to benefit from it, what apps could I develop with it, for example, that are clearly better done with it than with the GPU? I would love specific details rather than the general lines Apple provides about the chip.

Best Answer

What is the Neural Engine?

The Neural Engine is a part of Apple's processors - for example the Apple Silicon M1 that you have tagged.

The M1 die has many parts that make up the processor. That includes a CPU, a GPU, the Neural Engine, caches, RAM and other smaller parts.

Each of those parts again consists of smaller parts:

The CPU is actually a number of performance CPU cores and a number of efficiency cores. The GPU consists of a number of modules, which can be thought of as GPU cores. And the Neural Engine consists of a number of NPU cores (Neural Processing Units).

Is the Neural Engine a similar component to the CPU and GPU?

In general terms, you can think of CPU cores, GPU cores and NPU cores as a sort of "black box" that takes in commands (instructions, or whatever you would like to call them), communicates with the rest of the processor and peripherals (for example, loading/storing data in RAM or communicating over buses) and performs some sort of computation.

The Apple M1 is structured with a unified memory architecture which means that these various subsystems all access the same main memory. There is no special "matrix driven memory" for the GPU as you indicate in your question.

The Neural Engine is targeted specifically at performing the types of computation you would need when running machine learning models. Typically this means optimising for very fast matrix multiplications.
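To make the "fast matrix multiplications" point concrete, here is a plain Swift sketch of that operation in its naive form (this is an illustration of mine, not how the hardware is implemented); dedicated NPU hardware essentially performs enormous numbers of the multiply-accumulate steps in the inner loop in parallel, at low power:

```swift
// Naive (m x k) by (k x n) matrix multiplication - the kind of computation
// that dominates neural network inference and that NPUs are built to accelerate.
func matmul(_ a: [[Float]], _ b: [[Float]]) -> [[Float]] {
    let m = a.count, k = b.count, n = b[0].count
    var c = Array(repeating: Array(repeating: Float(0), count: n), count: m)
    for i in 0..<m {
        for j in 0..<n {
            var sum: Float = 0
            for p in 0..<k {
                sum += a[i][p] * b[p][j]   // one multiply-accumulate step
            }
            c[i][j] = sum
        }
    }
    return c
}
```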

Does it have a unique instruction-set or programming languages?

The Neural Engine has a unique "instruction set" (although it is not part of the ARM instruction set). Similar to programs for the CPU and GPU, you have a compiler which transforms programs (models) written in a higher-level language into commands and data structures that can be understood by the Neural Engine.
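As a small illustration of that compilation step, Core ML exposes an API for compiling a model description into a device-specific bundle at runtime. This is only a sketch: the file path and model name below are hypothetical, and Xcode normally performs this step for you at build time.

```swift
import CoreML
import Foundation

// Compile a high-level .mlmodel description into an .mlmodelc bundle holding
// the lower-level representation the runtime can dispatch to the CPU, GPU or
// Neural Engine. Path and model name are hypothetical.
do {
    let sourceURL = URL(fileURLWithPath: "/path/to/MyModel.mlmodel")
    let compiledURL = try MLModel.compileModel(at: sourceURL)
    print("Compiled model written to \(compiledURL.path)")
} catch {
    print("Compilation failed: \(error)")
}
```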

From a practical standpoint, ordinary developers will interact with the Neural Engine through Core ML. This means that they essentially use a single, specific "language" for their models.
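A minimal sketch of that workflow in Swift, assuming a hypothetical Xcode-generated model class called SentimentClassifier (the input name "text" and output name "label" are assumptions for the example, not real API):

```swift
import CoreML

// Load a (hypothetical) compiled model and run one prediction. Core ML decides
// at runtime which compute unit actually executes the work.
do {
    let config = MLModelConfiguration()
    config.computeUnits = .all   // allow scheduling on the CPU, GPU or Neural Engine

    let model = try SentimentClassifier(configuration: config)
    let output = try model.prediction(text: "The new laptop is great")
    print(output.label)
} catch {
    print("Prediction failed: \(error)")
}
```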

Can the Neural Engine do something better than the CPU or GPU?

The Neural Engine is not strictly "better" than the CPU or the GPU for performing these sort of calculations. You can run machine learning models on the CPU, on the GPU and on the Neural Engine - so the Neural Engine is not opening up some new type of computation that is impossible without it.

The main argument for the Neural Engine, compared to having extra CPU cores, is that the Neural Engine is targeted specifically at a few specific types of computation common in machine learning models. This means that it usually performs these computations using less electrical power than a CPU core would (or, looked at another way, you achieve higher performance with the same amount of electrical power). In addition, the CPU and GPU can be doing something else while the Neural Engine does its work. This means longer battery life, less fan noise, lower perceived latency, etc.
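One way to see this trade-off yourself is to load the same (hypothetical) model with different compute-unit settings and compare latency and energy use. The sketch below reuses the hypothetical SentimentClassifier class from above; .cpuOnly and .cpuAndGPU keep Core ML off the Neural Engine, while .all (or .cpuAndNeuralEngine on newer OS releases) allows Core ML to use it.

```swift
import CoreML

// Run the same prediction with different compute-unit settings and time it.
// The numerical results are the same; latency and energy use differ.
func timedPrediction(computeUnits: MLComputeUnits) throws -> TimeInterval {
    let config = MLModelConfiguration()
    config.computeUnits = computeUnits
    let model = try SentimentClassifier(configuration: config)   // hypothetical model class

    let start = Date()
    _ = try model.prediction(text: "The new laptop is great")
    return Date().timeIntervalSince(start)
}

let cpuTime = try? timedPrediction(computeUnits: .cpuOnly)
let aneTime = try? timedPrediction(computeUnits: .all)
print("CPU only: \(cpuTime ?? 0) s, Neural Engine allowed: \(aneTime ?? 0) s")
```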

Are there competitors in the market?

There exist many other neural processing units in the market. It is not something that is unique to Apple. For example, Google has its TPU, Nvidia has the NVDLA, Amazon has Inferentia, and so on.

However, Apple is perhaps in a unique situation in that the overwhelming majority of the users of their operating systems (iOS, iPadOS and macOS) have a Neural Engine (in some version) and run compatible software. This means that Apple can aggressively take advantage of it in their applications and background services.
