SH-4 32-bit RISC CPU Core Family

The SH-4 is a licensable family of dual-issue 32-bit RISC CPU cores.
The SH-4 is available in three groups: the integer SH4-400 and SH4-500 groups, and the SH4-200 group, which adds an integrated vector Floating Point Unit. SH-4 cores deliver programmable multimedia solutions; for example, the SH4-200 can execute a 384kbps, 15fps CIF MPEG4 decode in software using only 45MHz. Licensees can configure the size of the 2-way set associative instruction and data caches from 4KB to 64KB across the whole SH-4 family.
The SH4-400 group is an extremely compact integer CPU core suitable for SoCs that require class-leading performance within tight space and power consumption constraints. For example, the SH4-401S CPU core delivers up to 400 DMIPS, while the SH4-450S integer CPU core is less than 0.8mm² and consumes only 0.06mW/MHz.
The SH4-500 group is a compact CPU core with an integral MMU. It delivers the performance of the SH4-400 while allowing more complex applications and operating systems to be executed. The synthesizable SH4-501S core is only 1mm² in a generic 0.13µm CMOS process, and a system with CPU and 8KB instruction and data caches would be 3.12mm².
The SH4-200 group includes the high performance Vector FPU (VFPU), which delivers 7MFLOPS/MHz. The VFPU is an IEEE 754 conformant FPU with the performance and functionality to support audio codecs, 3D graphics and similar workloads. The SH4-210S is a synthesizable core, while the SH4-202 is a hard macro in 0.13µm with 16KB instruction and 32KB data caches and system peripherals. The SH4-202 is available today at speeds up to 366MHz and is supported by operating systems including Linux and Windows CE.NET.
SH-4 key features

- Dual issue (superscalar) CPU delivering 1.5DMIPS/MHz (Dhrystone 2.1)
- Optional 128-bit Vector Floating Point Unit (FPU)
- 16-bit encoded instruction set delivers class leading code density. The SH-4 instruction set is based on the popular SHcompact RISC instruction set and is the only licensable 32-bit CPU technology to offer an instruction set that is entirely 16-bit encoded.
- Efficient cache architecture: The SH-4 family has a 2-way set associative split cache architecture.
- Optional memory management unit (MMU) that supports virtual addressing and variable page sizes and is capable of supporting complex operating systems such as Linux and Windows CE.NET as well as real-time kernels such as ITRON.
- The SH-4 is part of the upward compatible SuperH family, and a huge range of third party products is already available. Renesas offers a C/C++ toolchain based on open source GNU technology.
- Energy efficient core:
o The SH-4 features Sleep and Standby power down modes.
o Memory accesses are minimized through the 16-bit instruction coding.
o Dual-issue performance enables the CPU to execute a task in the minimum possible time period. This enables the CPU to spend longer periods in Sleep mode.
o SH-4 based SoCs can be designed with variable voltage supplies and multiple clock domains with clock gearing (variable frequencies) to optimize overall power consumption.
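The link between finishing a task quickly and spending longer in Sleep mode can be illustrated with a rough duty-cycle estimate. The sketch below is only a simplified model: it uses the 0.06mW/MHz figure quoted above for the SH4-450S, while the function name, the Sleep-mode power parameter and the workload in the usage line are hypothetical placeholders rather than Renesas figures.

```python
def average_power_mw(task_mhz, clock_mhz, active_mw_per_mhz=0.06, sleep_mw=0.0):
    """Rough average core power for a workload needing `task_mhz` of CPU
    capacity on a core clocked at `clock_mhz`.

    active_mw_per_mhz: 0.06 is the SH4-450S figure quoted on this page.
    sleep_mw: hypothetical placeholder; no Sleep-mode power is given here.
    """
    if task_mhz > clock_mhz:
        raise ValueError("workload needs more cycles than the clock provides")
    duty = task_mhz / clock_mhz                  # fraction of time spent active
    active_mw = active_mw_per_mhz * clock_mhz    # power while running
    return duty * active_mw + (1.0 - duty) * sleep_mw


# Hypothetical 25MHz workload on a 100MHz core: active a quarter of the time.
print(average_power_mw(25, 100))
```

In this model the active term depends only on the workload, so the remaining savings come from how little the core draws while asleep and from the voltage and clock options mentioned above.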
SH-4 family performance
The SH-4 family delivers impressive performance across a range of multimedia applications.
| Benchmark | Performance |
| Dhrystone 2.1 | 1.5 DMIPS / MHz (SuperH gcc) |
| Floating point operations | 7 MFLOPS / MHz (uses FPU) |
| Complex FFT 1024 point radix 2 | 4 cycles / complex butterfly (uses FPU) |
| 16-tap, 40-sample Block FIR | 1.6MACs / cycle |
| EEMBC | See http://www.eembc.org |
| BDTIMark2000 | 750 at 240MHz (uses FPU) |
| 3D polygons | 9.3M/s at 266MHz (uses FPU) |
| ITU-T G.729 Annex C (8kbps) | Requires 25MHz and 40k bytes (uses FPU) |
| VoIP channels (full G.729) | 39MHz CPU performance per channel (uses FPU) |
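The per-MHz and per-cycle figures in the table can be turned into absolute numbers for a chosen clock speed. The following is a minimal sketch of that arithmetic, assuming the table values scale linearly and ignoring loop, memory and operating system overheads; the function names are illustrative only.

```python
DMIPS_PER_MHZ = 1.5          # Dhrystone 2.1 row
CYCLES_PER_BUTTERFLY = 4     # complex FFT row
FIR_MACS_PER_CYCLE = 1.6     # block FIR row
MHZ_PER_G729_CHANNEL = 39    # VoIP channels row


def dmips(clock_mhz):
    """Dhrystone throughput at a given clock speed."""
    return DMIPS_PER_MHZ * clock_mhz


def fft_butterfly_cycles(n_points):
    """Cycles spent in butterflies for a radix-2 FFT of n_points.

    A radix-2 FFT has log2(N) stages of N/2 butterflies each.
    """
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("n_points must be a power of two")
    stages = n_points.bit_length() - 1
    return (n_points // 2) * stages * CYCLES_PER_BUTTERFLY


def fir_cycles(taps, samples):
    """Cycles for a block FIR: one MAC per tap per output sample."""
    return taps * samples / FIR_MACS_PER_CYCLE


def g729_channels(clock_mhz):
    """Whole VoIP channels that fit in the available CPU capacity."""
    return int(clock_mhz // MHZ_PER_G729_CHANNEL)


print(dmips(266))
print(fft_butterfly_cycles(1024))
print(fir_cycles(16, 40))
print(g729_channels(266))
```

At 266MHz this reproduces the "up to 400 DMIPS" headline figure to within rounding (1.5 x 266 = 399).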
SH4-400 group
The SH4-400 group is a high performance dual issue integer 32-bit RISC CPU family with a MAC/MUL unit. It is designed for multimedia applications that require a compact CPU core able to execute both general purpose code and codecs such as audio, speech and low-bit rate video.

The key features of the SH4-400 are:
- MAC/MUL unit that delivers:
o 133MMACs/s at 266MHz
o Automatic data load and pointer increment
o 16 and 32-bit inputs
o 32 and 64-bit results
- Efficient cache architecture:
o The SH4-400 has been designed with 2-way set associative data and instruction caches that deliver a high level of system performance
o The data cache can be configured in a mixed cache/RAM mode delivering fast, real time, deterministic performance.
o Configurable cache sizes: 4KB to 64KB
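To see what the configurable cache size means for a 2-way set associative cache, the sketch below computes the number of sets and the address bits used to select one. The 32-byte line length is an assumption made for illustration, since this page does not state the line length, and the function names are hypothetical.

```python
def cache_geometry(size_kb, ways=2, line_bytes=32):
    """Return (sets, index_bits, offset_bits) for a set associative cache.

    line_bytes=32 is an assumption; the line length is not given on this page.
    """
    if size_kb not in (4, 8, 16, 32, 64):
        raise ValueError("size must be a power of two from 4KB to 64KB")
    sets = (size_kb * 1024) // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1    # byte position within a line
    index_bits = sets.bit_length() - 1           # selects one set
    return sets, index_bits, offset_bits


def set_index(address, size_kb, ways=2, line_bytes=32):
    """Set that a given address maps to."""
    sets, _, offset_bits = cache_geometry(size_kb, ways, line_bytes)
    return (address >> offset_bits) & (sets - 1)


for size_kb in (4, 8, 16, 32, 64):
    print(size_kb, cache_geometry(size_kb))

# Two addresses one way-size (2KB for a 4KB cache) apart share a set;
# with two ways both lines can stay resident at the same time.
print(set_index(0x0000, 4), set_index(0x0800, 4))
```

Under these assumptions a 4KB cache has 64 sets and a 64KB cache has 1024 sets, so each doubling of the cache size adds one index bit.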
SH4-401S Synthesizable core
The first product in the SH4-400 range is the SH4-401S, a fully synthesizable core.
The SH4-401S CPU is designed to be implemented in a range of different processes. Below are some expected implementation figures in 0.18µm and 0.13µm processes.
| | 0.18µm die size | 0.18µm clock speed | 0.13µm die size | 0.13µm clock speed |
| CPU only | 1.76mm² | 180-200MHz | 1mm² | 225-266MHz |
| CPU + 8K I and 8K D caches | 4.86mm² | 180-200MHz | 2.66mm² | 225-266MHz |
SH4-450S Synthesizable core
The SH4-450S is a lower frequency and more compact member of the SH4-400 range.
The following figures are typical size and performance figures for the SH4-450S:
| | 0.18µm die size | 0.18µm clock speed | 0.13µm die size | 0.13µm clock speed |
| CPU only | 1.30mm² | 100-120MHz | 0.78mm² | 100-133MHz |
| CPU + 8K I and 8K D caches | 4.25mm² | 100-120MHz | 2.41mm² | 100-133MHz |
SH4-500 group
The SH4-500 group builds on the SH4-400 group of cores by adding a Memory Management Unit (MMU) allowing more complex applications with virtual or protected memory requirements to execute.

The SH4-500 group delivers performance in line with the SH4-400 group. The features supported are a superset of those in the SH4-400 group.
The main features of the SH4-500 are:
- MAC/MUL unit that delivers:
o 133MMACs/s at 266MHz
o Automatic data load and pointer increment
o 16 and 32-bit inputs
o 32 and 64-bit results
- Efficient cache architecture:
o The SH4-500 has been designed with 2-way set associative data and instruction caches that deliver a high level of system performance
o The data cache can be configured in a mixed cache/RAM mode delivering fast, real time, deterministic performance.
o Configurable cache sizes: 4KB to 64KB
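The MAC/MUL features listed above (16-bit inputs, a 64-bit result, and operands fetched with an automatic pointer increment) can be modelled in a few lines. This is a behavioural sketch only, not a description of the SH-4 instructions themselves: the function names are hypothetical, and wrapping the accumulator at 64 bits is an assumption, since the overflow behaviour is not described on this page.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def mac_dot(a, b, in_bits=16, acc_bits=64):
    """Multiply-accumulate over two operand streams.

    Each step loads one operand from each stream, advances both
    "pointers", and adds the product to a wide accumulator.
    Wrap-around at acc_bits is an assumption made for this sketch.
    """
    acc = 0
    pa = pb = 0                                  # operand pointers
    for _ in range(min(len(a), len(b))):
        x = to_signed(a[pa], in_bits)
        y = to_signed(b[pb], in_bits)
        pa += 1                                  # automatic pointer increment
        pb += 1
        acc = to_signed(acc + x * y, acc_bits)
    return acc


# Four full-scale 16-bit products no longer fit in a signed 32-bit result,
# which is where the 64-bit result width matters.
total = mac_dot([32767] * 4, [32767] * 4)
print(total, total > 2**31 - 1)

# The quoted 133MMACs/s at 266MHz corresponds to one MAC every two cycles.
print(133 / 266)
```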
SH4-501S Synthesizable core
The first product in the SH4-500 family is the SH4-501S, a fully synthesizable core.
The SH4-501S CPU is designed to be implemented in a range of different processes. Below are some expected implementation figures in 0.18µm and 0.13µm general purpose processes.
| | 0.18µm die size | 0.18µm clock speed | 0.13µm die size | 0.13µm clock speed |
| CPU only | 1.76mm² | 180-200MHz | 1.00mm² | 225-266MHz |
| CPU + 8K I and 8K D caches | 5.67mm² | 180-200MHz | 3.12mm² | 225-266MHz |
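The die size tables for the three synthesizable cores can be compared with some simple subtraction. The sketch below copies the figures from the tables on this page and derives the area outside the CPU in each "CPU + caches" configuration. How that area divides between cache RAM, the MMU and other logic is not stated here, so the results are indicative only, and the helper names are illustrative.

```python
# (CPU only, CPU + 8K I and 8K D caches) in mm², from the tables on this page.
AREAS = {
    "SH4-401S": {"0.18um": (1.76, 4.86), "0.13um": (1.00, 2.66)},
    "SH4-450S": {"0.18um": (1.30, 4.25), "0.13um": (0.78, 2.41)},
    "SH4-501S": {"0.18um": (1.76, 5.67), "0.13um": (1.00, 3.12)},
}


def non_cpu_area(core, process):
    """Area of the cached system that is not the CPU itself."""
    cpu, system = AREAS[core][process]
    return round(system - cpu, 2)


def system_delta(core_a, core_b, process):
    """Difference in cached system area between two cores."""
    return round(AREAS[core_a][process][1] - AREAS[core_b][process][1], 2)


for core in AREAS:
    print(core, non_cpu_area(core, "0.18um"), non_cpu_area(core, "0.13um"))

# The SH4-401S and SH4-501S quote the same CPU-only area, so the
# difference shows up only in the cached system figure.
print(system_delta("SH4-501S", "SH4-401S", "0.13um"))
```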
SH4-200 group
The SH4-200 group is a high performance dual issue integer 32-bit RISC CPU group with an integrated vector floating point unit. It is designed for multimedia applications that require a compact CPU core able to execute both general purpose code and multimedia code such as audio, speech and video codecs. The SH4-202 running at 266MHz can deliver a full duplex CIF, 384kbps, 15fps MPEG4 codec entirely in software.
The key features of the SH4-200 group are:
- Vector Floating point unit (FPU) that delivers:
o 7MFLOPS/MHz
o Vector instructions include matrix operations for 3D graphics, speech, audio and video codecs
- Efficient cache architecture:
o The SH4-200 offers two different cache options, direct mapped and 2-way set associative data and instruction caches.
o The data cache can be configured in a mixed cache/RAM mode delivering fast, real time performance.
o Configurable cache sizes: 4KB to 64KB
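As an illustration of the kind of matrix operation used in 3D graphics, the sketch below applies a 4x4 matrix to a 4-element vector and counts the floating point operations involved. Relating that count to the 7MFLOPS/MHz figure is one possible reading and an assumption of this example, not something stated on this page; the function name is hypothetical.

```python
FLOPS_PER_CYCLE = 7          # from the 7MFLOPS/MHz figure above


def transform(matrix, vector):
    """Multiply a 4x4 matrix by a 4-element vector, counting flops."""
    flops = 0
    result = []
    for row in matrix:
        acc = row[0] * vector[0]                 # first multiply
        flops += 1
        for m, v in zip(row[1:], vector[1:]):
            acc += m * v                         # one multiply and one add
            flops += 2
        result.append(acc)
    return result, flops


identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
point, flops = transform(identity, [1.0, 2.0, 3.0, 4.0])
print(point, flops)

# If the peak rate were sustained, one transform would take this many cycles.
print(flops / FLOPS_PER_CYCLE)

# Peak floating point rate at 266MHz, in MFLOPS.
print(FLOPS_PER_CYCLE * 266)
```

Each output element needs four multiplies and three adds, so a full transform is 28 operations, which corresponds to four cycles at the quoted peak rate.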
SH4-202: Hard macro
There are a number of products available in the SH4-200 group; see the product table below.
The SH4-202 integrates the SH4-200 core, a debug port and system peripherals, providing a complete CPU system ready for integration into an SoC device. The SH4-202 supports a range of operating systems including Linux and Windows CE.NET.

Summary: SH-4 family product variants
The following product variants are available:
| Core | Process | Clock speed | Cache | Die size (mm²) | Availability | Comment |
| SH4-200 group - FPU with 2-way set associative caches | ||||||
| SH4-210S | Synthesizable core | Up to 400MHz | Customer configurable | In 0.13µm CPU+FPU = 1.53mm² | Now | |
| SH4-202 | 0.13µm in UMC-GP, TSMC-LVOD | 266MHz - 366MHz | 16KB I, 32KB D | 0.13µm Core = 7.8 - 8.2mm² | Now | Includes UDI / AUD debug port, UART, timers, interrupts, clocks |
| SH4-400 group - Integer CPU with 2-way set associative caches | ||||||
| SH4-401S | Synthesizable core | Up to 266MHz | Configurable | In 0.13µm CPU=1.00mm² | Now | |
| SH4-450S | Synthesizable core | Up to 133MHz | Configurable | In 0.13µm CPU=0.78mm² | Now | |
| SH4-500 group - Integer CPU with MMU and 2-way set associative caches | ||||||
| SH4-501S | Synthesizable core | Up to 266MHz | Configurable | In 0.13µm CPU=1.00mm² | Now | |
Notes: GP = General purpose process option. LVOD = Process targeted at high clock speed.
