Shingo Kojima
Senior Principal Specialist
Published: February 27, 2024

Recent vision AI models have to deal with dynamic and complex environments, resulting in the need for more power efficiency and speed in real-time applications.

To meet these market needs, Renesas released the next-generation Dynamically Reconfigurable Processor for AI (DRP-AI) accelerator. The DRP-AI accelerator delivers high power efficiency of 10 TOPS/W, up to 10 times higher than the conventional technology. It runs complex image AI models that previously required a GPU, at a power consumption as low as that of a conventional embedded microprocessor (MPU).

In addition to this AI accelerator, the high-end RZ/V2H MPU is equipped with an image processing accelerator using the Dynamically Reconfigurable Processor (DRP), a quad-core Arm® Cortex®-A55 Linux processor running at up to 1.8GHz, a dual-core 800MHz Arm Cortex-R8 high-speed real-time processor, and an Arm Cortex-M33 sub-core for I/O processing, all in a heterogeneous multiprocessor configuration.

The combination of seven Arm-based CPU cores, next-generation DRP-AI, and DRP makes it possible to immediately apply the results of image recognition and AI judgment to mechanical control, making the RZ/V2H the ideal AI processor for next-generation autonomous robots, autonomous mobile robots, drones, and other applications. Learn more about the RZ/V2H MPU features in this blog.

Next-Generation AI Accelerator DRP-AI

The RZ/V2M, RZ/V2L, and RZ/V2MA embed Renesas' original DRP-AI accelerator. For the RZ/V2H, Renesas has advanced DRP-AI to the next generation to meet recent market needs.

To dramatically increase power efficiency, the DRP-AI applies INT8 quantization and hardware support for unstructured pruning, which has been difficult for conventional AI accelerators, to achieve inference performance of up to 80 TOPS and power efficiency of 10 TOPS/W. Learn more about unstructured pruning in the "Next Generation Highly Power-Efficient AI Accelerator (DRP-AI3): 10x Faster Embedded Processing in Advanced AI for Autonomous Systems" whitepaper.
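The two techniques named above can be illustrated in a few lines. The following is a minimal NumPy sketch of magnitude-based unstructured pruning followed by symmetric per-tensor INT8 quantization. It is a generic illustration only, not the DRP-AI toolchain or Renesas' method; the function names, the 75% sparsity target, and the random weight matrix are all assumptions chosen for the example.

```python
import numpy as np

def prune_unstructured(weights, sparsity):
    """Zero the smallest-magnitude weights, wherever they fall in the tensor."""
    k = int(round(weights.size * sparsity))
    if k == 0:
        return weights.copy()
    flat = np.abs(weights).ravel()
    # k-th smallest magnitude; every entry at or below it is dropped.
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

# Hypothetical layer weights, used only to exercise the two functions.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)

w_pruned = prune_unstructured(w, sparsity=0.75)  # assumed sparsity target
q, scale = quantize_int8(w_pruned)

sparsity_achieved = float(np.mean(q == 0))
max_error = float(np.max(np.abs(w_pruned - q.astype(np.float32) * scale)))
```

Because the zeros land at arbitrary positions rather than in whole rows or channels, a dense multiply-accumulate array cannot simply skip them, which is why hardware support for this kind of sparsity matters.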

Figure 1 below shows a comparison of AI inference performance with the other RZ/V products. In the case of ResNet-50, a typical classification Convolutional Neural Network (CNN), the RZ/V2H performance is 14 times higher than that of the RZ/V2L without pruning (Dense model) and 45 times higher with pruning.

Figure 1. RZ/V Series AI Inference Performance (Pre/Post Process are not Included)

OpenCV Accelerator by Dynamically Reconfigurable Processor (DRP)

Various methods were used in applications for image recognition and decision making even before the advent of deep learning. OpenCV, an open-source computer vision library, is one such example. Even now that AI image processing is available, OpenCV is still a very useful technology. Vision AI and OpenCV are now used together, each where it fits best.

To accelerate both AI and various image processing algorithms such as those in OpenCV, the RZ/V2H MPU is designed with a dynamically reconfigurable processor separate from DRP-AI. A DRP library for OpenCV acceleration is provided, taking full advantage of the DRP's flexibility.

Figure 2 compares the performance of the OpenCV accelerator with DRP to the RZ/V2H quad-core CPU. For example, the Sobel filter used for image edge detection runs 16 times faster with DRP acceleration, going from 7.6fps to 123fps.

Figure 2. OpenCV Accelerator Performance Benchmark
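For readers unfamiliar with the Sobel filter used in the benchmark above, the sketch below shows what it computes: two 3x3 gradient kernels applied to a grayscale image and combined into an edge magnitude. This is a plain NumPy reference written for illustration, not the OpenCV implementation and not the DRP library; the function name and the synthetic 8x8 test image are assumptions.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
KY = KX.T

def sobel_magnitude(image):
    """Gradient magnitude of a 2-D grayscale image (edge-replicated border)."""
    padded = np.pad(image.astype(np.float32), 1, mode="edge")
    h, w = image.shape
    gx = np.zeros((h, w), dtype=np.float32)
    gy = np.zeros((h, w), dtype=np.float32)
    # Accumulate the nine shifted, weighted copies of the image.
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            gx += KX[dy, dx] * window
            gy += KY[dy, dx] * window
    return np.hypot(gx, gy)

# Synthetic image: dark left half, bright right half (one vertical edge).
img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 255
edges = sobel_magnitude(img)
```

Every output pixel depends only on its 3x3 neighborhood, so the work is regular and highly parallel, the kind of per-pixel workload that lends itself to hardware acceleration.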

Heterogeneous Configuration for AI + High-speed Real-time Control

Although fast multi-core Linux processors are optimal for image AI, they require large memory resources, and it is very difficult for them to achieve the sub-ms real-time performance required for mechanical control.

To solve this issue, RZ/V2H uses a quad-core Cortex-A55 to run Linux programs including AI processing, and a dedicated high-speed real-time processor for RTOS processing in applications that require high real-time performance, such as motor control.

By connecting these different OSs via inter-processor communication using OpenAMP, the results of decisions made by the DRP-AI and Linux processor can be reflected in the mechanical control in real time by the processor for RTOS.

Figure 3. RZ/V2H Block Diagram
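Whatever transport carries the data between the Linux and RTOS sides, both ends have to agree on a compact, fixed binary layout for each message. The sketch below shows one way the Linux side might serialize a detection result before handing it to the inter-processor channel. The message fields, their sizes, and the function names are entirely hypothetical and are not taken from Renesas or OpenAMP documentation; the transport call itself is deliberately left out.

```python
import struct

# Hypothetical wire format (an assumption for illustration only):
# little-endian uint32 sequence number, uint16 class id, uint16 confidence
# in 1/65535 units, and two float32 target coordinates.
DETECTION_FORMAT = "<IHHff"
DETECTION_SIZE = struct.calcsize(DETECTION_FORMAT)

def pack_detection(seq, class_id, confidence, x, y):
    """Serialize one detection result into a fixed-size payload."""
    conf_q = max(0, min(65535, round(confidence * 65535)))
    return struct.pack(DETECTION_FORMAT, seq, class_id, conf_q, x, y)

def unpack_detection(payload):
    """Inverse of pack_detection, e.g. for testing on the Linux side."""
    seq, class_id, conf_q, x, y = struct.unpack(DETECTION_FORMAT, payload)
    return seq, class_id, conf_q / 65535, x, y

msg = pack_detection(seq=1, class_id=3, confidence=0.5, x=0.25, y=-1.5)
decoded = unpack_detection(msg)
```

A fixed-size, explicitly little-endian layout with no padding can be mirrored by a packed C struct on the real-time core, which keeps parsing cost on the control side small and predictable.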

The RZ/V2H embedded AI microprocessor with these unique features is already in mass production, and the RZ/V2H evaluation board is also available to jumpstart your next vision AI development.

Visit the RZ/V2H product page to learn more about the RZ/V2H MPU, and explore our Vision Detection Single Board Computer winning combination that can be implemented as a System-on-Module (SoM) for development scalability or as an application-driven Single Board Computer (SBC) solution.
