
SH-4 32-bit RISC CPU Core Family



The SH-4 is a licensable family of dual-issue 32-bit RISC CPU cores.

The SH-4 is available in three groups: the integer SH4-400 and SH4-500 groups, and the SH4-200 group, which adds an integrated vector Floating Point Unit. SH-4 cores deliver programmable multimedia solutions; for example, the SH4-200 can execute an MPEG4 CIF decode at 384kbps and 15fps in software using only 45MHz of CPU performance. Across the whole SH-4 family, licensees can configure the size of the 2-way set associative instruction and data caches from 4KB to 64KB.

The SH4-400 group is an extremely compact integer CPU core suitable for SoCs that require class-leading performance within tight area and power consumption constraints. For example, the SH4-401S CPU core delivers up to 400 DMIPS, while the SH4-450S integer CPU core occupies less than 0.8mm² and consumes only 0.06mW/MHz.
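
The headline figures above follow from simple per-MHz scaling. The sketch below is illustrative only: it assumes the 1.5 DMIPS/MHz Dhrystone rating and the maximum clock speeds quoted elsewhere on this page (266MHz for the SH4-401S, 133MHz for the SH4-450S), and it assumes that the 0.06mW/MHz figure scales linearly with clock frequency. The function names are hypothetical.

```python
def dmips(dmips_per_mhz: float, clock_mhz: float) -> float:
    """Dhrystone throughput at a given clock, assuming linear scaling."""
    return dmips_per_mhz * clock_mhz


def core_power_mw(mw_per_mhz: float, clock_mhz: float) -> float:
    """Core power at a given clock, assuming linear scaling with frequency."""
    return mw_per_mhz * clock_mhz


# SH4-401S: 1.5 DMIPS/MHz at up to 266MHz
print(dmips(1.5, 266))  # 399.0, quoted as "up to 400 DMIPS"

# SH4-450S: 0.06mW/MHz at up to 133MHz, roughly 8mW for the core alone
print(f"{core_power_mw(0.06, 133):.2f} mW")
```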

The SH4-500 group is a compact CPU core with an integral MMU. It delivers the performance of the SH4-400 while allowing more complex applications and operating systems to be executed. The SH4-501S synthesizable core is only 1mm² in a generic 0.13µm CMOS process, and a system with the CPU and 8KB instruction and data caches would be 3.12mm².

The SH4-200 group includes the high performance Vector FPU (VFPU), which delivers 7MFLOPS/MHz. The VFPU is an IEEE 754 conformant FPU with the performance and functionality to support audio codecs, 3D graphics and similar workloads. The SH4-210S is a synthesizable core, while the SH4-202 is a hard macro in 0.13µm with a 16KB instruction cache, a 32KB data cache and system peripherals. The SH4-202 is available today at speeds up to 366MHz and is supported by operating systems including Linux and Windows CE.NET.


SH-4 key features



  • Dual issue (superscalar) CPU delivering 1.5DMIPS/MHz (Dhrystone 2.1)
  • Optional 128-bit Vector Floating Point Unit (FPU)
  • 16-bit encoded instruction set delivers class leading code density. The SH-4 instruction set is based on the popular SHcompact RISC instruction set and is the only licensable 32-bit CPU technology to offer an instruction set that is entirely 16-bit encoded.
  • Efficient cache architecture: The SH-4 family has a 2-way set associative split cache architecture.
  • Optional memory management unit (MMU) that supports virtual addressing and variable page sizes and is capable of supporting complex operating systems such as Linux and Windows CE.NET as well as real-time kernels such as ITRON.
  • The SH-4 is part of the upward compatible SuperH family, and a huge range of third party products is already available. Renesas offers a C/C++ toolchain based on open source GNU technology.
  • Energy efficient core:
    o The SH-4 features Sleep and Standby power down modes.
    o Memory accesses are minimized through the 16-bit instruction coding.
    o Dual-issue performance enables the CPU to execute a task in the minimum possible time, so the CPU can spend longer periods in Sleep mode.
    o SH-4 based SoCs can be designed with variable voltage supplies and multiple clock domains with clock gearing (variable frequencies) to optimize overall power consumption.
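
The link between execution time and Sleep mode residency can be made concrete with a simple duty-cycle model. The sketch below is illustrative only: it assumes power scales linearly with clock frequency while the CPU is active, reuses the 0.06mW/MHz figure quoted for the SH4-450S, and takes a Sleep mode power value that is a hypothetical placeholder, since this page does not give one. The cycle counts are also made up for the example.

```python
def average_power_mw(task_cycles, clock_mhz, period_ms,
                     active_mw_per_mhz, sleep_mw):
    """Average power for a periodic task that sleeps once its work is done."""
    busy_ms = task_cycles / (clock_mhz * 1000.0)  # 1MHz = 1000 cycles per ms
    if busy_ms > period_ms:
        raise ValueError("task does not fit in the period at this clock")
    duty = busy_ms / period_ms                    # fraction of time awake
    active_mw = active_mw_per_mhz * clock_mhz
    return duty * active_mw + (1.0 - duty) * sleep_mw


SLEEP_MW = 0.1  # hypothetical placeholder, not a figure from this page

# Same task period and clock; the second case finishes in fewer cycles.
slow = average_power_mw(1_000_000, 133, 20.0, 0.06, SLEEP_MW)
fast = average_power_mw(700_000, 133, 20.0, 0.06, SLEEP_MW)
print(f"{slow:.2f} mW vs {fast:.2f} mW")
```

Under these assumptions, any reduction in cycles per task lowers the average power directly, because the time saved is spent at the Sleep mode level rather than the active level.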

SH-4 family performance

The SH-4 family delivers impressive performance across a range of multimedia applications.

 

Benchmark                           Performance
Dhrystone 2.1                       1.5 DMIPS / MHz (SuperH gcc)
Floating point operations           7 MFLOPS / MHz (uses FPU)
Complex FFT 1024 point radix 2      4 cycles / complex butterfly (uses FPU)
16-tap, 40-sample Block FIR         1.6 MACs / cycle
EEMBC                               See http://www.eembc.org
BDTIMark2000                        750 at 240MHz (uses FPU)
3D polygons                         9.3M/s at 266MHz (uses FPU)
ITU-T G.729 Annex C (8k/s)          Requires 25MHz and 40k bytes (uses FPU)
VoIP channels (full G.729)          39MHz CPU performance per channel (uses FPU)
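
Several of these per-cycle and per-MHz figures can be turned into time or capacity estimates for a chosen clock speed. The sketch below is a rough illustration, not a benchmark: it counts only butterfly work for the FFT at the quoted 4 cycles per butterfly, assumes VoIP channels scale linearly with no system overhead, and uses 266MHz as an example clock taken from the product table. All function names are hypothetical.

```python
def fft_butterfly_cycles(points: int, cycles_per_butterfly: int = 4) -> int:
    """Butterfly cycles for a radix-2 complex FFT of `points` samples."""
    if points < 2 or points & (points - 1):
        raise ValueError("points must be a power of two")
    stages = points.bit_length() - 1        # log2(points)
    butterflies = (points // 2) * stages    # N/2 butterflies per stage
    return butterflies * cycles_per_butterfly


def microseconds(cycles: float, clock_mhz: float) -> float:
    """Convert a cycle count to microseconds at the given clock."""
    return cycles / clock_mhz


def voip_channels(clock_mhz: float, mhz_per_channel: float = 39) -> int:
    """Whole G.729 channels that fit in the available CPU clock."""
    return int(clock_mhz // mhz_per_channel)


def fir_cycles(taps: int, samples: int, macs_per_cycle: float = 1.6) -> float:
    """Cycles for a block FIR needing taps * samples multiply-accumulates."""
    return taps * samples / macs_per_cycle


print(fft_butterfly_cycles(1024))          # 20480
print(round(microseconds(20480, 266), 1))  # 77.0
print(voip_channels(266))                  # 6
print(7 * 266)                             # 1862 (MFLOPS at 7 MFLOPS/MHz)
```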

SH4-400 group

The SH4-400 group is a high performance dual issue integer 32-bit RISC CPU family with a MAC/MUL unit and is designed for a range of multimedia applications that require a compact CPU core able to execute both general purpose code and codecs such as audio, speech and low-bit rate video.



The key features of the SH4-400

  • MAC/MUL unit that delivers:
    o 133MMACs/s at 266MHz
    o Automatic data load and pointer increment
    o 16 and 32-bit inputs
    o 32 and 64-bit results
  • Efficient cache architecture:
    o The SH4-400 has been designed with 2-way set associative data and instruction caches that deliver a high level of system performance
    o The data cache can be configured in a mixed cache/RAM mode delivering fast, real time, deterministic performance.
    o Configurable cache sizes: 4KB to 64KB
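
The MAC/MUL behaviour listed above (two operands loaded through pointers that advance automatically, narrow inputs, wide accumulated result) can be modelled in a few lines. This is a behavioural sketch only, not a cycle-accurate or bit-accurate model of the SH-4 instructions; the helper names and the wrap-around accumulator behaviour are assumptions made for the example.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mac_dot(a, b, input_bits=16, acc_bits=64):
    """Multiply-accumulate over two arrays using post-incrementing pointers."""
    acc = 0
    pa = pb = 0                                 # the two data pointers
    while pa < len(a) and pb < len(b):
        x = to_signed(a[pa], input_bits)        # load first operand
        y = to_signed(b[pb], input_bits)        # load second operand
        pa += 1                                 # automatic pointer increment
        pb += 1
        acc = to_signed(acc + x * y, acc_bits)  # wrap to accumulator width
    return acc


print(mac_dot([1, 2, 3], [4, 5, 6]))  # 32
print(mac_dot([0xFFFF], [2]))         # -2, since 0xFFFF is -1 as a 16-bit value
```

The quoted rate of 133MMACs/s at 266MHz corresponds to one multiply-accumulate every two clock cycles. The wide accumulator matters because even a short run of full-scale 16-bit products exceeds the signed 32-bit range.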

SH4-401S Synthesizable core

The first product in the SH4-400 range is the SH4-401S, a fully synthesizable core.

The SH4-401S CPU is designed to be implemented in a range of different processes. Expected implementation figures in 0.18µm and 0.13µm processes are shown below.

 

                            0.18µm die size     0.18µm clock speed  0.13µm die size     0.13µm clock speed
CPU only                    1.76mm²             180-200MHz          1.00mm²             225-266MHz
CPU + 8K I and 8K D caches  4.86mm²             180-200MHz          2.66mm²             225-266MHz
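
The table allows two quick derived figures: the area added by the 8KB + 8KB cache system, and how closely the move from 0.18µm to 0.13µm tracks ideal geometric scaling. The sketch below is plain arithmetic on the SH4-401S numbers. Attributing the whole difference between the two rows to the caches is an assumption, since the page does not break the area down further, and the function names are hypothetical.

```python
def cache_system_overhead_mm2(cpu_with_caches_mm2: float,
                              cpu_only_mm2: float) -> float:
    """Area added on top of the bare CPU by the cached configuration."""
    return cpu_with_caches_mm2 - cpu_only_mm2


def ideal_area_scale(old_um: float, new_um: float) -> float:
    """Ideal area ratio if every dimension shrank with the feature size."""
    return (new_um / old_um) ** 2


print(round(cache_system_overhead_mm2(4.86, 1.76), 2))  # 3.1  (0.18µm)
print(round(cache_system_overhead_mm2(2.66, 1.00), 2))  # 1.66 (0.13µm)
print(round(ideal_area_scale(0.18, 0.13), 3))           # 0.522
print(round(1.00 / 1.76, 3))                            # 0.568, CPU-only ratio
```

On these figures the CPU-only area shrinks to about 57% of its 0.18µm size, slightly less than the ideal 52%.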

SH4-450S Synthesizable core

The SH4-450S is a lower frequency and more compact member of the SH4-400 range.
Typical size and performance figures for the SH4-450S are shown below.

                            0.18µm die size     0.18µm clock speed  0.13µm die size     0.13µm clock speed
CPU only                    1.30mm²             100-120MHz          0.78mm²             100-133MHz
CPU + 8K I and 8K D caches  4.25mm²             100-120MHz          2.41mm²             100-133MHz

 

SH4-500 group

The SH4-500 group builds on the SH4-400 group of cores by adding a Memory Management Unit (MMU), which allows more complex applications with virtual or protected memory requirements to execute.



The SH4-500 group delivers performance in line with the SH4-400 group. The features supported are a superset of those in the SH4-400 group.

The main features of the SH4-500 are:

  • MAC/MUL unit that delivers:
    o 133MMACs/s at 266MHz
    o Automatic data load and pointer increment
    o 16 and 32-bit inputs
    o 32 and 64-bit results
  • Efficient cache architecture:
    o The SH4-500 has been designed with 2-way set associative data and instruction caches that deliver a high level of system performance
    o The data cache can be configured in a mixed cache/RAM mode delivering fast, real time, deterministic performance.
    o Configurable cache sizes: 4KB to 64KB
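
To illustrate what the configurable cache size changes in a 2-way set associative cache, the sketch below derives the number of sets and splits an address into tag, index and offset fields. The 32-byte line size is an assumption made for the example, because this page does not state the line size; the function names are hypothetical.

```python
def cache_geometry(size_bytes: int, ways: int = 2, line_bytes: int = 32):
    """Return (sets, offset_bits, index_bits) for a set associative cache."""
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1   # log2(line_bytes)
    index_bits = sets.bit_length() - 1          # log2(sets)
    return sets, offset_bits, index_bits


def split_address(addr: int, size_bytes: int, ways: int = 2,
                  line_bytes: int = 32):
    """Split an address into (tag, set index, byte offset within the line)."""
    sets, offset_bits, index_bits = cache_geometry(size_bytes, ways, line_bytes)
    offset = addr & (line_bytes - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


print(cache_geometry(4 * 1024))         # (64, 5, 6)
print(cache_geometry(64 * 1024))        # (1024, 5, 10)
print(split_address(0x1234, 4 * 1024))  # (2, 17, 20)
```

With two ways, two addresses that share a set index but differ in tag can both stay resident, which is the practical benefit over a direct mapped cache of the same size.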

SH4-501S Synthesizable core

The first product in the SH4-500 family is the SH4-501S, a fully synthesizable core.

The SH4-501S CPU is designed to be implemented in a range of different processes. Expected implementation figures in 0.18µm and 0.13µm general purpose processes are shown below.

 

                            0.18µm die size     0.18µm clock speed  0.13µm die size     0.13µm clock speed
CPU only                    1.76mm²             180-200MHz          1.00mm²             225-266MHz
CPU + 8K I and 8K D caches  5.67mm²             180-200MHz          3.12mm²             225-266MHz

SH4-200 group

The SH4-200 group is a high performance dual issue integer 32-bit RISC CPU group with an integrated vector floating point unit. It is designed for multimedia applications that require a compact CPU core able to execute both general purpose code and multimedia code such as audio, speech and video codecs. The SH4-202 running at 266MHz can deliver a full duplex CIF, 384kbps, 15fps MPEG4 codec entirely in software.

The key features of the SH4-200 group

  • Vector Floating point unit (FPU) that delivers:
    o 7MFLOPS/MHz
    o Vector instructions include matrix operations for 3D graphics, speech, audio and video codecs
  • Efficient cache architecture:
    o The SH4-200 offers two different cache options, direct mapped and 2-way set associative data and instruction caches.
    o The data cache can be configured in a mixed cache/RAM mode delivering fast, real time performance.
    o Configurable cache sizes: 4KB to 64KB

SH4-202: Hard macro

A number of products are available in the SH4-200 group; see the product table below.

The SH4-202 integrates the SH4-200 core, a debug port and system peripherals, providing a complete CPU system ready for integration into an SoC device. The SH4-202 supports a range of operating systems including Linux and Windows CE.NET.



Summary: SH-4 family product variants

The following product variants are available:

Core      Process                       Clock speed      Cache                  Die size (mm²)                Availability  Comment
SH4-200 group - FPU with 2-way set associative caches
SH4-210S  Synthesizable core            Up to 400MHz     Customer configurable  In 0.13µm CPU+FPU = 1.53mm²   Now
SH4-202   0.13µm in UMC-GP, TSMC-LVOD   266MHz - 366MHz  16k I, 32k D           0.13µm Core = 7.8-8.2mm²      Now           Includes UDI / AUD debug port, UART, timers, interrupts, clocks
SH4-400 group - Integer CPU with 2-way set associative caches
SH4-401S  Synthesizable core            Up to 266MHz     Configurable           In 0.13µm CPU = 1.00mm²       Now
SH4-450S  Synthesizable core            Up to 133MHz     Configurable           In 0.13µm CPU = 0.78mm²       Now
SH4-500 group - Integer CPU with MMU and 2-way set associative caches
SH4-501S  Synthesizable core            Up to 266MHz     Configurable           In 0.13µm CPU = 1.00mm²       Now

Notes: GP = general purpose process option. LVOD = process targeted at high clock speed.
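
As a rough aid to reading the table, the variants can be captured in a small data structure and filtered by requirement. The sketch below is a hypothetical illustration, not a Renesas tool: the field names and the selection function are invented for the example, and the MMU field is left as None for the SH4-200 group because the table does not state it for those cores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Core:
    name: str
    max_mhz: int
    has_fpu: bool
    has_mmu: Optional[bool]  # None means "not stated in the table"
    synthesizable: bool


# Values transcribed from the product variants table above.
CORES = [
    Core("SH4-210S", 400, True, None, True),
    Core("SH4-202", 366, True, None, False),   # hard macro
    Core("SH4-401S", 266, False, False, True),
    Core("SH4-450S", 133, False, False, True),
    Core("SH4-501S", 266, False, True, True),
]


def select_cores(needs_fpu=False, needs_mmu=False, min_mhz=0):
    """Return names of cores that explicitly meet every stated requirement."""
    return [
        c.name for c in CORES
        if (not needs_fpu or c.has_fpu)
        and (not needs_mmu or c.has_mmu is True)
        and c.max_mhz >= min_mhz
    ]


print(select_cores(needs_mmu=True))                # ['SH4-501S']
print(select_cores(needs_fpu=True, min_mhz=300))   # ['SH4-210S', 'SH4-202']
```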