Artificial intelligence (AI) is transforming computing, and both AMD and Intel are now integrating their own neural processing units (NPUs), as ARM-based SoCs, including Apple’s, have been doing for years. With AMD having taken the first step on x86, we now have Ryzen AI vs Intel NPU: two solutions with their similarities and differences, different performance levels, and different specifications.
In this article we analyze those differences to better understand both systems.
At the heart of on-device artificial intelligence (AI) is a specialized processor known as the NPU (Neural Processing Unit). Unlike general-purpose CPUs and graphics-focused GPUs, NPUs are designed for one specific task: executing the mathematical operations behind machine learning algorithms. Today there are chips with dedicated NPUs used as domain-specific accelerators, as well as NPUs integrated into SoCs alongside other units, as is the case in many ARM chips for mobile devices or in Apple’s own M-series.
While CPUs handle a computer’s general-purpose tasks and GPUs excel at parallel processing for graphics (although, as you know, they are also used as general-purpose accelerators), NPUs take specialization further still. They are built specifically for the calculations that artificial neural networks need: generally simple operations such as addition, subtraction, multiplication and division, but applied to very specific data types, such as integers and low-precision floating point, and in enormous quantities.
While the CPU and GPU are designed to handle very diverse data, usually 32 or 64 bits wide, or even 128-, 256- and 512-bit vectors and beyond, the NPU is especially optimized for workloads such as:
- Convolutional Neural Networks (CNNs) – commonly used for image recognition and computer vision. They usually use FP32 floating-point data, i.e. single precision, although inference is often run at lower precision such as FP16 or INT8.
- Recurrent Neural Networks (RNNs) – used for tasks such as natural language processing and machine translation. They can use FP32 or FP64 floating-point data, depending on the precision required.
- Decision trees – simple machine learning models used for classification and regression tasks. They typically use integer data to represent feature values, most often INT8, INT16 or INT32, i.e. integers 8 to 32 bits wide.
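The low-precision integer arithmetic these units specialize in can be sketched in a few lines of NumPy. This is an illustrative, hypothetical example (a simplified symmetric quantization scheme, not any vendor’s actual pipeline): FP32 values are mapped to INT8, multiplied with integer multiply-accumulates (MACs), and rescaled back to FP32.

```python
import numpy as np

def quantize(x, scale):
    """Map FP32 values to INT8 using a per-tensor scale (symmetric scheme)."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 4)).astype(np.float32)
acts = rng.standard_normal((4,)).astype(np.float32)

# One scale per tensor, chosen so the largest value maps to 127.
w_scale = np.abs(weights).max() / 127
a_scale = np.abs(acts).max() / 127

w_q = quantize(weights, w_scale)
a_q = quantize(acts, a_scale)

# Integer multiply-accumulate: INT8 x INT8 products accumulated in INT32,
# which is exactly the kind of data path NPUs optimize for.
acc = w_q.astype(np.int32) @ a_q.astype(np.int32)

# Dequantize: a single FP32 multiply recovers an approximation of the
# full-precision result.
result = acc.astype(np.float32) * (w_scale * a_scale)

print(np.max(np.abs(result - weights @ acts)))  # small quantization error
```

The point of the detour through INT8 is that integer MAC units are far smaller and cheaper in silicon and energy than FP32 ones, so an NPU can pack thousands of them and still stay within a laptop power budget.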
Did you know that there are units called GPNPUs? A GPNPU combines GPU and NPU capabilities in a single unit, which makes it well suited to accelerating AI workloads. These hybrid units are beginning to appear in some specialized systems, such as HPC and data centers that run these workloads frequently. As you may know, GPUs are also good at this type of calculation, as the NVIDIA H-series and A-series or AMD Instinct products show. Combining both can yield even better results…
Machine learning algorithms are the backbone of AI applications. These algorithms learn patterns from data, make predictions and take decisions without explicit programming. NPUs play a crucial role in executing them efficiently, handling tasks such as training (refining the model’s parameters) and inference (making predictions with the trained model) while processing massive data sets.
Today, NPUs may seem like supporting players, helping with tasks like blurring the background in video calls or generating AI images locally on your machine. However, as AI features become more integrated into everyday applications, NPUs have the potential to become an essential part of our computers of the future.
TOPS (tera operations per second) represents the ability of an NPU to perform trillions of operations per second. This figure indicates the raw processing speed of the NPU, i.e. the number of calculations it can perform in one second. The higher the TOPS value, the higher the NPU’s theoretical performance. It is estimated that running a chatbot like Microsoft Copilot locally requires around 40 TOPS.
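Peak TOPS figures are usually derived from the hardware’s parallelism and clock speed. The numbers below are purely hypothetical (not the specs of any real NPU), but they show the back-of-the-envelope arithmetic, counting each multiply-accumulate (MAC) as two operations:

```python
# TOPS = 2 * MAC_units * clock_Hz / 1e12  (1 MAC = 1 multiply + 1 add)

mac_units = 16_384   # assumed number of parallel INT8 MAC units
clock_hz = 1.5e9     # assumed 1.5 GHz clock

tops = 2 * mac_units * clock_hz / 1e12
print(tops)  # 49.152 — above the ~40 TOPS cited for running Copilot locally
```

Note that this is a theoretical peak: real workloads rarely keep every MAC unit busy every cycle, so sustained throughput is lower.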
NPUs in different areas
Heterogeneous computing is now used to achieve better performance and energy efficiency: several types of processors work together, each dedicated to what it does best. To the CPU and GPU we must now add the NPU for these relatively new kinds of workloads. Offloading them to an NPU also prevents the CPU or GPU from being tied up with such tasks, leaving them free for their own work.
Mobile devices were the first to use these NPU units, in SoCs such as Apple’s A-series, Qualcomm Snapdragon, Samsung Exynos, MediaTek, etc. But AMD bet on bringing these solutions to the PC world with Zen 4, its Zen 4-based Ryzen chips being the first x86 processors to include an NPU. Apple had already done the same with its M-series (M1, M2, M3, M4, and so on), which includes an NPU that Apple calls the Neural Engine.
Why did NPUs start out on mobile devices? Quite simply because these devices need AI for certain applications: voice-recognition assistants, camera functions such as video stabilization, image-correction filters, bokeh effects, visual recognition, and so on. Many smart TVs also began using NPU chips a few years ago, since they rely on AI to improve image quality and upscale old content to higher, sharper resolutions. Other smart-home devices have been using these accelerators as well.
Now, with the rise of chatbot apps, NPUs also matter on PCs, which is why not only AMD has adopted them: Intel did the same shortly after with Meteor Lake, which also includes its own NPU. The same goes for servers and HPC; given cloud-based AI applications and services, these systems have been using both specialized GPUs and NPUs.
A TPU is nothing more than an NPU, although Google calls its version the Tensor Processing Unit. Apple, as mentioned, calls its own the Neural Engine. Whatever the name, it is the same kind of unit…
As AI becomes more integrated into our lives, the role of NPUs will become even more prominent, with a shift towards streamlined, efficient and ubiquitous AI processing. In the near future we will therefore see NPUs in more places, and with capabilities far superior to today’s…
What is AMD Ryzen AI?
AMD Ryzen AI is the brand that brings together AMD’s artificial intelligence (AI) technologies for Ryzen processors. The initiative combines hardware and software to deliver improved performance in these kinds of tasks. That is, it includes both a dedicated NPU, the AI Stream Engine, based on the XDNA architecture, and Ryzen AI Software, a set of software tools ranging from the driver needed to use these capabilities to developer resources such as AMD ROCm, the Ryzen AI SDK, and Ryzen AI Hub…
What is Intel NPU?
On the Intel side we find essentially the same: an NPU implemented as one of the chip’s tiles, plus a software layer composed of a driver and AI libraries. In essence, both vendors offer the same combination of hardware and software…