Voice recognition serves as a human-machine interface (HMI) and has been adopted in many products, including robotics and smart speakers. The voice recognition solution was developed to meet the need for more convenient functions in consumer and industrial equipment while keeping costs as low as possible. Voice recognition is becoming an important additional feature because it helps visually impaired and elderly users carry out tasks with spoken commands. Renesas' voice recognition solutions do not need an internet connection (edge voice recognition), which provides differentiation and high functionality in current products.

The voice recognition solution is implemented with an A/D converter or an I2S (Inter-IC Sound) interface together with middleware, and achieves a high recognition rate in noisy environments by using noise suppression technology.

Figure: Voice Recognition System Overview

Noise Suppression Technologies

  • Beam forming - Reduce noise arriving from directions other than the target
  • Noise suppressor - Reduce steady noise
  • Echo cancellation - Prevent or remove echo that is being created or already present

Solutions

Simple Voice Recognition Solution

This is an edge voice recognition solution using noise suppressor technologies.

Features

  • Small voice recognition solution with MEMS microphone
  • Turn an LED on and off and control infrared communication* compatible devices via voice recognition
  • Easily change voice recognition parameters by checking the voice waveform with the evaluation tool

*Supported by RX231 voice recognition solution

RX231 Voice Recognition Solution
Hardware
  • MCU: RX231 (R5F52318ADFL)
    • ROM/RAM: 512KB/64KB
    • Package: 48-pin LQFP
  • Microphone: Digital MEMS Mic x2
  • Other functions: Infrared communication, RGB LED, USB (Full Speed), push switch
  • Board size: 60mm x 40mm
Software
  • OS: Not used
  • Middleware: Advanced Media/AmiVoice Micro, Techno Mathematical/Zoom Voice

RX651 Voice Recognition Solution
Hardware
  • MCU: RX651 (R5F5651EDDFM)
    • ROM/RAM: 2MB/640KB
    • Package: 64-pin LFQFP
  • Microphone: Analog MEMS Mic x2
  • Other functions: RGB LED, USB (Full Speed), push switch
  • Board size: 60mm x 40mm
Software
  • OS: Not used
  • Middleware: Advanced Media/AmiVoice Micro, Techno Mathematical/Zoom Voice

RA6M1 Voice Recognition Solution
Hardware
  • MCU: RA6M1 (R7FA6M1AD3CFM)
    • ROM/RAM: 512KB/256KB
    • Package: 64-pin LQFP
  • Microphone: Analog MEMS Mic x2
  • Other functions: RGB LED, USB (Full Speed), push switch
  • Board size: 60mm x 40mm
Software
  • OS: Not used
  • Middleware:
    • Advanced Media/AmiVoice Micro, Techno Mathematical/Zoom Voice
    • Toshiba Digital Solutions/RECAIUS™ Voice Trigger, Techno Mathematical/Zoom Voice

Reference Designs

Hardware:
  • RX231 Voice Recognition Solution: RX231 Group Voice Recognition Demo Board (PDF | English, 日本語)
  • RX651 Voice Recognition Solution: RX651 Group Voice Recognition Demo Board (PDF | English, 日本語)
  • RA6M1 Voice Recognition Solution: RA6M1 Group Voice Recognition Demo Board (PDF | English, 日本語)

Software (source code & application notes) and voice recognition evaluation tool:
  • Contact a Renesas sales office for detailed information

RA6M3 HMI Solution

This solution enables edge voice recognition, voice playback, capacitive touch operation and environmental sensing using the 1-chip RA6M3 MCU.

Features

  • Realize voice recognition, voice playback, TFT LCD control and environmental sensor control on the single-chip RA6M3
  • Use voice recognition to change the TFT LCD settings and receive voice feedback
  • Easily change middleware (M/W) parameters while checking the voice waveform with the evaluation tool
Figure: RA6M3 HMI Solution
  RA6M3 HMI solution
Hardware EK-RA6M3G
  • MCU: RA6M3 (R7FA6M3AH3CFC)
    • Package: 176-pin LQFP
  • USB (Debug, Full Speed, High Speed)
  • Graphics expansion board
    • 4.3-inch TFT color LCD (capacitive touch overlay with controller)
    • 480 x 272 resolution
    • Backlight controller
HMI Expansion Board
  • Analog MEMS Mic x2
  • External expansion microphone circuit (MEMS type (analog output) or Electret condenser type)
  • Speaker operation circuit & speaker
  • Humidity and temperature Sensor (Renesas/HS3001)
  • Gas sensor (Renesas/ZMOD4410)
  • Arduino Uno connection
Software
  • OS: Amazon FreeRTOS
  • Middleware:
    • Advanced Media/AmiVoice Micro, Techno Mathematical/Zoom Voice, CRI Middleware/D-Amp Driver
    • Toshiba Digital Solutions/Voice Trigger, Techno Mathematical/Zoom Voice, CRI Middleware/D-Amp Driver
 * The voice playback file was created with the Toshiba Digital Solutions/RECAIUS speech synthesis middleware (Text-to-Speech)

Reference Designs

Hardware:
  • RA6M3 HMI solution: RA6M3 Group RA6M3 HMI Expansion Board (PDF | English, 日本語)

Software (source code & application notes) and voice recognition evaluation tool:
  • Contact a Renesas sales office for detailed information

High Performance HMI Solution

This solution provides noise suppression, voice recognition, voice synthesis and touch panel operation on the single-chip RZ/A1H, without an internet connection.

Features

  • Voice recognition solution with noise suppressor technology
  • Realize voice recognition, speech synthesis and TFT LCD control using the 1-chip RZ/A1H
  • Voice synthesis and LCD display function for showing results
Figure: High Performance HMI Solution
Function           Partner                         Middleware
Noise Suppressor   Techno Mathematical Co., Ltd.   Zoom Voice
Voice Recognition  Advanced Media Inc.             AmiVoice Micro
Voice Synthesis    Hitachi ULSI Systems Co., Ltd.  Ruby Talk®

Evaluation Tool

Features

Connecting the evaluation board to a PC enables the following functions:

  • Visually check the sound input as a waveform
  • Change the middleware (M/W) parameters for voice recognition and noise reduction
  • Display the recognized ID
  • Save and play back sound data from before and after noise processing
Figure: Voice Recognition Evaluation Tool

Recommended Middleware

Advanced Media/AmiVoice Micro - Voice Recognition

Advanced Media/AmiVoice Micro enables voice recognition without an internet connection, running at a lower clock frequency and in less memory than existing products.

Supported microcontrollers (MCUs): RXv2 CPU-based RX family (RX231, RX230, RX65N, RX651, and RX64M), RXv3 CPU-based RX family (RX72M and RX72N), Arm Cortex-M4 (RA6M1, RA6M2, and RA6M3), Arm Cortex-A9 (RZ/A1H and RZ/A1L)

Model                   Required Memory Size              Languages
Normal Model            ROM: Over 33KB, RAM: Over 23KB    Japanese, English, Chinese (Mandarin), Thai
High Recognition Model  ROM: Over 482KB, RAM: Over 23KB   Japanese

Required ROM/RAM Size by Number of Recognition Words

Number of Words   Normal Model (KB)   High Recognition Model (KB)
                  ROM     RAM         ROM      RAM
5                 33      23          482      23
10                54      25          681      25
20                78      28          995      28
30                96      30          1,226    30
40                109     33          1,444    33
50                117     33          1,587    33
100               143     46          2,143    46
150               160     55          2,452    55

These figures vary with the language and the content of the recognition words.

The high recognition model improves the recognition rate by using more ROM for its calculations than the normal model.
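As a rough planning aid, the published table can be encoded and queried for the largest vocabulary that fits a given memory budget. The following is a minimal sketch, not part of AmiVoice Micro: the names `AMIVOICE_KB`, `required_kb` and `max_words` are hypothetical, rounding up to the next tabulated word count is a conservative assumption, and the budget ignores application code and any other middleware.

```python
# ROM/RAM requirements in KB, copied from the published table.
# Keys are word counts; values are (ROM, RAM).
AMIVOICE_KB = {
    "normal": {5: (33, 23), 10: (54, 25), 20: (78, 28), 30: (96, 30),
               40: (109, 33), 50: (117, 33), 100: (143, 46), 150: (160, 55)},
    "high": {5: (482, 23), 10: (681, 25), 20: (995, 28), 30: (1226, 30),
             40: (1444, 33), 50: (1587, 33), 100: (2143, 46), 150: (2452, 55)},
}


def required_kb(model, words):
    """Return (rom_kb, ram_kb) from the smallest table row that covers `words`."""
    table = AMIVOICE_KB[model]
    for count in sorted(table):
        if words <= count:
            return table[count]
    raise ValueError("word count exceeds the published table (150 words)")


def max_words(model, rom_kb, ram_kb):
    """Return the largest tabulated word count that fits the budget, or 0."""
    best = 0
    for count, (rom, ram) in sorted(AMIVOICE_KB[model].items()):
        if rom <= rom_kb and ram <= ram_kb:
            best = count
    return best


if __name__ == "__main__":
    # Budget taken from the RX231 entry (ROM/RAM: 512KB/64KB).
    print(max_words("normal", 512, 64))
    print(max_words("high", 512, 64))
```

Because the real figures depend on the language and on the words chosen, the result is only a first estimate to confirm with the evaluation tool.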

Support for voice activity detection (VAD)

VAD support includes a module that detects only the sections of the input audio that contain human speech. The detection sensitivity can be adjusted to suit the usage scene and task.
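To illustrate the general idea of voice activity detection with an adjustable sensitivity, here is a minimal energy-threshold sketch. It is a textbook simplification and not the AmiVoice Micro algorithm, which is not described here. The function names, the 160-sample frame length and the default threshold are all assumptions chosen for the example; a real VAD also has to distinguish speech from loud non-speech noise, which plain energy cannot do.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(samples, frame_len=160, threshold=0.02):
    """Return one flag per full frame: True where the RMS level exceeds threshold.

    Lowering `threshold` makes detection more sensitive; raising it makes
    detection stricter.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_rms(frame) > threshold)
    return flags


if __name__ == "__main__":
    # Two frames of silence followed by two frames of a 200 Hz tone at 16 kHz.
    silence = [0.0] * 320
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(320)]
    print(detect_speech(silence + tone))
```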

Toshiba Digital Solutions/RECAIUS™ Voice Trigger - Voice Recognition

RECAIUS Voice Trigger provides voice control without an internet connection. Target phrases can be changed without speech data, so it can be used as a customized detector for your own wake words and/or voice commands.

Supported MCUs: Arm Cortex-M4 (RA6M1, RA6M2, and RA6M3), Arm Cortex-A9 and later

Supported languages: Japanese, American English and Mandarin Chinese
To be commercialized (available for evaluation): Canadian French, American Spanish, British English, French, German, Spanish, Italian

Required Memory Size

Number of Words ROM (KB) RAM (KB)
5 145 45
10 160 50
20 190 65

These figures vary with the language and the content of the recognition words.

Techno Mathematical/Zoom Voice - Noise Suppressor Technology

Zoom Voice supports two noise suppression technologies: beam forming and a noise suppressor.

Beam Forming

  • Extracts the target sound arriving from the front while reducing background noise
  • Uses two omnidirectional (non-directional) microphones
  • Effect strength can be set from 1 (weak) to 7 (strong)
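The principle behind two-microphone beam forming can be sketched with the textbook delay-and-sum method. This is an illustration only: the Zoom Voice algorithm is not described here and is not assumed to work this way, and `delay_and_sum` and its parameters are hypothetical names. Sound from the front reaches both microphones at the same time and adds coherently, while sound from the side reaches one microphone later and is partly cancelled.

```python
import math


def delay_and_sum(mic_a, mic_b, delay_samples=0):
    """Average two channels after delaying mic_b by delay_samples.

    A delay of 0 steers the beam to the front, where both microphones
    receive the target at the same time.
    """
    out = []
    for n in range(len(mic_a)):
        m = n - delay_samples
        b = mic_b[m] if 0 <= m < len(mic_b) else 0.0
        out.append(0.5 * (mic_a[n] + b))
    return out


if __name__ == "__main__":
    fs = 16000  # assumed sample rate in Hz
    n_samples = 160

    # Target from the front: identical on both microphones.
    target = [math.sin(2 * math.pi * 500 * n / fs) for n in range(n_samples)]

    # Interference from the side: a 2 kHz tone that reaches mic_b
    # 4 samples later, which is half its period, so it cancels when summed.
    def side_noise(n):
        return math.sin(2 * math.pi * 2000 * n / fs)

    mic_a = [target[n] + side_noise(n) for n in range(n_samples)]
    mic_b = [target[n] + side_noise(n - 4) for n in range(n_samples)]

    out = delay_and_sum(mic_a, mic_b)
    residual = max(abs(o - t) for o, t in zip(out, target))
    print("largest deviation from the clean target:", residual)
```

The example picks a noise frequency and arrival delay that cancel exactly; for other frequencies and directions the attenuation is only partial, which is one reason practical beam formers are more elaborate.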

Noise Suppressor

  • Up to 30dB of noise reduction (noise amplitude reduced to about 1/30)
  • The amount of noise reduction can be set according to frequency
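The "about 1/30" figure matches the standard conversion from decibels to an amplitude ratio, as the short sketch below shows. The function names are hypothetical; the formulas are the usual definitions of dB for amplitude (20 log10) and power (10 log10).

```python
def attenuation_amplitude_ratio(db):
    """Amplitude ratio remaining after `db` decibels of attenuation."""
    return 10 ** (-db / 20)


def attenuation_power_ratio(db):
    """Power ratio remaining after `db` decibels of attenuation."""
    return 10 ** (-db / 10)


if __name__ == "__main__":
    amp = attenuation_amplitude_ratio(30)
    print("amplitude ratio:", amp, "which is 1 /", 1 / amp)
    print("power ratio:", attenuation_power_ratio(30))
```

A 30dB reduction leaves roughly 1/32 of the noise amplitude, which is 1/1000 of the noise power.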

High-Speed Version Using DSP Instructions
The version that uses DSP instructions offers 30% higher processing speed.

Supported MCUs:
DSP instruction version: RXv2 CPU-based RX family (RX231, RX230, RX65N, RX651, and RX64M), RXv3 CPU-based RX family (RX72M and RX72N)
Normal version: Arm Cortex-M4 (RA6M1, RA6M2, and RA6M3), Arm Cortex-A9 (RZ/A1H and RZ/A1L)

Noise Suppressor Technology   Required Memory Size
Beam Forming                  ROM: 40KB, RAM: 10KB
Noise Suppressor              ROM: 40KB, RAM: 10KB
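When several middleware components share one MCU, their footprints can be summed for a first estimate of the remaining headroom. The sketch below assumes the requirements simply add up, which is a simplification, and uses hypothetical names (`Footprint`, `total`, `headroom`). The example values come from this page: the RX231's 512KB/64KB, the two Zoom Voice functions at 40KB/10KB each, and the AmiVoice Micro normal model at 50 words (117KB/33KB).

```python
from typing import NamedTuple


class Footprint(NamedTuple):
    rom_kb: int
    ram_kb: int


# Zoom Voice figures from the table above.
BEAM_FORMING = Footprint(40, 10)
NOISE_SUPPRESSOR = Footprint(40, 10)


def total(*parts):
    """Sum ROM and RAM over all parts (assumes nothing is shared)."""
    return Footprint(sum(p.rom_kb for p in parts),
                     sum(p.ram_kb for p in parts))


def headroom(mcu, *parts):
    """Memory left for the application; negative values mean it does not fit."""
    used = total(*parts)
    return Footprint(mcu.rom_kb - used.rom_kb, mcu.ram_kb - used.ram_kb)


if __name__ == "__main__":
    rx231 = Footprint(512, 64)
    amivoice_normal_50_words = Footprint(117, 33)
    print(headroom(rx231, BEAM_FORMING, NOISE_SUPPRESSOR,
                   amivoice_normal_50_words))
```

In this example the RAM headroom is much tighter than the ROM headroom, so RAM is the figure to watch when adding words or application buffers.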

Beam Forming and Noise Suppressor Use Case

Figure: Beam Forming and Noise Suppressor Use Case

Zoom Voice achieves a high recognition rate even in noisy environments. The effect is especially large at S/N ratios of 5dB or less.
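For reference, an S/N ratio in dB is ten times the base-10 logarithm of the signal power divided by the noise power, so 5dB means the speech carries only about 3.2 times the power of the noise. The sketch below computes this from sample lists; the function names are hypothetical, and it assumes the speech and the noise are available as separate recordings, which is usually only the case in a test setup.

```python
import math


def mean_power(samples):
    """Average of the squared sample values."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from separate signal and noise samples."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


if __name__ == "__main__":
    # Hypothetical test data: the signal has twice the amplitude of the noise.
    signal = [2.0, -2.0, 2.0, -2.0]
    noise = [1.0, -1.0, 1.0, -1.0]
    print(snr_db(signal, noise))
    # Power ratio that corresponds to 5dB.
    print(10 ** (5 / 10))
```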

The recognition rate achieved with Zoom Voice in noisy environments (with AmiVoice Micro used for voice recognition) is shown below.

Figure: Zoom Voice Recognition Rate

Note 1: The comparison used the sounds of a vacuum cleaner and a washing machine as noise sources.
Note 2: This data is based on research from Renesas.

Partners

Advanced Media, Inc.

Development and sales of voice recognition software products

Contact: https://www.advanced-media.co.jp/contact/english/

Toshiba Digital Solutions

System integration, development, manufacture and sales of ICT solutions utilizing IoT and AI technology

Contact: https://www.toshiba-sol.co.jp/en/contact/index.html
Email: tdsl-recaius-mw-sales-r1@ml.toshiba.co.jp

Techno Mathematical Co., Ltd.

Development and sales of image, acoustic and sound processing software and hardware products

Contact: http://www.tmath.co.jp/eng/contact_us/

Documentation & Downloads

Application Notes & White Papers
  • RA6M3 Group RA6M3 HMI Expansion Board (Application Note, PDF, 1.83 MB; other languages: 日本語)
  • RX231 Group Voice Recognition Demo Board (Application Note, PDF, 906 KB; other languages: 日本語)
  • RX651 Group Voice Recognition Demo Board (Application Note, PDF, 985 KB; other languages: 日本語)
  • RA6M1 Group Voice Recognition Demo Board (Application Note, PDF, 864 KB; other languages: 日本語)

Other
  • Voice Recognition Solution Product Brief (PDF, 443 KB)