Voice recognition serves as a human-machine interface (HMI) in a growing range of products, including robots and smart speakers. Renesas' voice recognition solutions were developed to add convenient functions to consumer and industrial equipment while keeping costs as low as possible. Voice recognition is also becoming an important accessibility feature, because spoken commands help visually impaired and elderly users carry out tasks. Because these solutions run at the edge and need no internet connection, they add differentiation and functionality to existing products.

The voice recognition solution is implemented with an A/D converter or I2S (Inter-IC Sound) interface plus middleware, and uses noise suppression technology to achieve a high recognition rate in noisy environments.

Figure: Voice Recognition System Overview

Noise Suppression Technologies

  • Beam forming - Reduces noise arriving from directions other than the target
  • Noise suppressor - Reduces steady noise
  • Echo cancellation - Prevents echo from being created or removes echo that is already present

Solutions

RX231, RX651, RA6M1 Voice Recognition Solution

This solution enables edge voice recognition with a small board.

RX671, RX72N Voice Recognition Solution

Start development quickly with a commercially available Renesas evaluation board

RA4M2 ECM Voice Recognition Solution

Cost-effective edge voice recognition solution using ECM (Electret Condenser Microphone)

RA4W1 Voice Recognition with Bluetooth® Low Energy Solution

This solution enables edge voice recognition, voice playback, Bluetooth® Low Energy and environmental sensing using a single RA4W1 MCU.

RX671 Voice Recognition, Capacitive Touch and Cloud Demo

This solution enables edge voice recognition, capacitive touch and LCD control using a single RX671 MCU. This solution can also use the Wi-Fi Pmod™ Expansion Board for remote control on the cloud.

RA6M3 HMI Solution

This solution enables edge voice recognition, voice playback, capacitive touch operation and environmental sensing using a single RA6M3 MCU.


RX231, RX651, RA6M1 Voice Recognition Solution

This solution enables edge voice recognition with a small board.

Features

  • Small voice recognition solution with MEMS microphone
  • Control LED on/off and infrared communication* compatible devices via voice recognition
  • Easily change voice recognition parameters by checking the voice waveform with the evaluation tool

*Infrared communication is supported only by the RX231 voice recognition solution

Figure: RX231 Voice Recognition Solution Board

Figure: RX651 Voice Recognition Solution Board

Figure: RA6M1 Voice Recognition Solution Board

| | RX231 Voice Recognition Solution | RX651 Voice Recognition Solution | RA6M1 Voice Recognition Solution |
|---|---|---|---|
| Hardware: MCU | RX231 (R5F52318ADFL); ROM/RAM: 512KB/64KB; Package: 48-pin LQFP | RX651 (R5F5651EDDFM); ROM/RAM: 2MB/640KB; Package: 64-pin LFQFP | RA6M1 (R7FA6M1AD3CFM); ROM/RAM: 512KB/256KB; Package: 64-pin LQFP |
| Hardware: Microphone | Digital MEMS Mic x2 | Analog MEMS Mic x2 | Analog MEMS Mic x2 |
| Hardware: Other functions | Infrared communication, RGB LED, USB (Full Speed), push switch | RGB LED, USB (Full Speed), push switch | RGB LED, USB (Full Speed), push switch |
| Hardware: Board size | 60mm x 40mm | 60mm x 40mm | 60mm x 40mm |
| Software: OS | Not used | Not used | Not used |
| Software: Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice |
| Software: Middleware | - | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice |

Reference Designs

| | Hardware | Software (source code & application notes) and voice recognition evaluation tool |
|---|---|---|
| RX231 Voice Recognition Solution | RX231 Group Voice Recognition Demo Board Rev.1.01 (PDF | English, 日本語) | Contact a Renesas sales office for detailed information |
| RX651 Voice Recognition Solution | RX651 Group Voice Recognition Demo Board (PDF | English, 日本語) | |
| RA6M1 Voice Recognition Solution | RA6M1 Group Voice Recognition Demo Board (PDF | English, 日本語) | |

RX671, RX72N Voice Recognition Solution

Start development quickly with a commercially available Renesas evaluation board

Features

  • Edge voice recognition solution with MEMS microphone
  • Downloadable demo software
  • Easily change voice recognition parameters by checking the voice waveform with the evaluation tool

Figure: RX671 Voice Recognition Solution

Figure: RX72N Voice Recognition Solution

| | RX671 Voice Recognition Solution | RX72N Voice Recognition Solution |
|---|---|---|
| Hardware: Board | Renesas Starter Kit+ for RX671 (Model number: RTK55671EHS10000BE) | RX72N Envision Kit (Model number: RTK5RX72N0C00000BJ) |
| Hardware: MCU | RX671 (R5F5671EHDFB); ROM/RAM: 2MB/384KB; Package: 144-pin LFQFP | RX72N (R5F572NDHDFB); ROM/RAM: 4MB+64KB/1MB; Package: 144-pin LFQFP |
| Hardware: Microphone | Digital MEMS Mic x2 | Digital MEMS Mic x2 |
| Software: OS | Not used | Not used |
| Software: Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice |
| Software: Middleware | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice |

Download

| Items | Note |
|---|---|
| RX671 Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 - Sample Code (ZIP | English, 日本語) | Supported languages: Japanese, English. Contact a Renesas sales office for sample source and evaluation tool |
| RX671 Group Voice Recognition Demonstration (Voice Trigger Middleware) | Coming soon |
| RX72N Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 - Sample Code (ZIP | English, 日本語) | Supported languages: Japanese, English. Contact a Renesas sales office for sample source and evaluation tool |
| RX72N Group Voice Recognition Demonstration (Voice Trigger Middleware) | Coming soon |

RA4M2 ECM Voice Recognition Solution

Cost-effective edge voice recognition solution using ECM (Electret Condenser Microphone)

Features

  • Small-board voice recognition solution with a low BOM cost
  • Uses a cost-efficient ECM for voice input
  • The ECM used for evaluation is selectable, and its amplifier gain is adjustable

Figure: RA4M2 ECM Voice Recognition Solution

RA4M2 ECM Voice Recognition Solution

Hardware

  • MCU: RA4M2 (R7FA4M2AD3CFL)
    • ROM/RAM: 512KB/128KB
    • Package: 48-pin LQFP
  • Op-amp: READ2303G
  • Microphone: Electret Condenser Microphone x1
  • Other functions: RGB LED, USB (Full Speed), push switch
  • Board size: 60mm x 40mm

Software

  • OS: Not used
  • Middleware: Advanced Media/AmiVoice Micro, Techno Mathematical/Zoom Voice
  • Middleware: Toshiba Digital Solutions/RECAIUS™ Voice Trigger, Techno Mathematical/Zoom Voice

Reference Designs

| Items | Note |
|---|---|
| RA4M2 Voice Recognition ECM Demo Board (PDF | English, 日本語) | Contact a Renesas sales office for how to obtain the demo board |
| RA4M2 Group Voice Recognition Demo Board Sample Software | Contact a Renesas sales office for detailed information |

Download

| Items | Note |
|---|---|
| RA4M2 Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 (PDF | English, 日本語) | Supported languages: Japanese, English, Mandarin Chinese |
| RA4M2 Group Voice Recognition Demonstration (Voice Trigger Middleware) Rev.1.00 (PDF | English, 日本語) | Supported languages: Japanese, American English, Mandarin Chinese |

RA4W1 Voice Recognition with Bluetooth® Low Energy Solution

This solution enables edge voice recognition, voice playback, Bluetooth Low Energy communication and environmental sensing using a single RA4W1 MCU.

Features

  • Voice recognition, voice playback, Bluetooth Low Energy control and environmental sensor control with a single RA4W1 MCU
  • Generates audio feedback according to the voice recognition result and sends the result to a smartphone via Bluetooth Low Energy
  • Operate a demo board and confirm sensor information via Bluetooth Low Energy using a mobile device

Figure: RA4W1 Voice Recognition with Bluetooth Low Energy Solution Demo Board

RA4W1 Voice Recognition with Bluetooth Low Energy Solution

Hardware

EK-RA4W1
  • MCU: RA4W1 (R7FA4W1AD2CNG)
    • ROM/RAM: 512KB/96KB
    • Package: 56-pin QFN
  • Bluetooth Low Energy circuit
  • USB Full Speed device
  • Arduino™ UNO connector
HMI Expansion Board
  • Analog MEMS Mic x2
  • External expansion microphone circuit (MEMS type (analog output) or Electret condenser type)
  • Speaker operation circuit & speaker
  • Humidity and temperature Sensor (Renesas/HS3001)
  • Gas sensor (Renesas/ZMOD4410)
  • Arduino Uno connection

Software

  • OS: Not used
  • Middleware: Advanced Media/AmiVoice Micro, Techno Mathematical/Zoom Voice, CRI Middleware/D-Amp Driver
  • Middleware: Toshiba Digital Solutions/Voice Trigger, Techno Mathematical/Zoom Voice, CRI Middleware/D-Amp Driver

 * The voice playback file was created by Toshiba Digital Solutions/RECAIUS speech synthesis middleware Text-to-Speech

Reference Designs

| Items | Note |
|---|---|
| RA4W1 Voice Recognition with Bluetooth Low Energy Solution Demo Board | Contact a Renesas sales office for detailed information |
| RA4W1 Voice Recognition with Bluetooth Low Energy Solution Demo Sample Software | |

Download

| Deliverables | Note |
|---|---|
| RA4W1 Group Voice Recognition Demonstration (AmiVoice Micro) (PDF | English, 日本語) | Supported languages: Japanese, English |
| RA4W1 Group Voice Recognition Demonstration (Voice Trigger Middleware) (PDF | English, 日本語) | Supported languages: Japanese, American English, Mandarin Chinese |

RX671 Voice Recognition, Capacitive Touch and Cloud Demo

This solution enables edge voice recognition, capacitive touch and LCD control using a single RX671 MCU. This solution can also use the Wi-Fi Pmod™ Expansion Board for remote control on the cloud.

Features

  • Realize voice recognition, capacitive touch, and LCD control (LCD module) using a single RX671
  • Change application settings via voice recognition and capacitive touch, and display results on an LCD
  • Enable remote control by connecting to the cloud (AWS) via a Wi-Fi module

Figure: RX671 Voice Recognition, Capacitive Touch and Cloud Demo

RX671 Voice Recognition, Capacitive Touch and Cloud Demo

Hardware

Renesas Starter Kit+ for RX671
  • MCU: RX671 (R5F5671EHDFB, encryption function supported)
    • ROM/RAM: 2MB/384KB
    • Package: 144-pin LFQFP
  • Built-in audio circuit. SSIE (Serial Sound Interface) can be evaluated.
  • Touch feature (self-capacitive type) can be evaluated.
  • Encryption engines and key management function by Trusted Secure IP can be evaluated.
  • Built-in SD memory card slot. SDHI (SD Host Interface) can be evaluated.
  • 1 channel USB Function or 1 channel USB Host can be evaluated.
Wi-Fi Pmod™ Expansion Board
  • IEEE 802.11b/g/n compliant, 2.4GHz, HT20, MCS0-7, up to 13-ch
  • 1x1 single stream system
  • One UART and one HS-UART MCU host interface
  • Full suite of AT command support

Software

  • OS: Amazon FreeRTOS
  • Middleware: Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice; Toshiba Digital Solutions/RECAIUS™ Voice Trigger (Coming soon)

Download

| Items | Note |
|---|---|
| RX671 Group Voice recognition / Touch and Cloud solution using Renesas Starter Kit+ for RX671 Rev.1.00 (PDF | English, 日本語) | |
| RX671 Group Voice recognition / Touch and Cloud solution using Renesas Starter Kit+ for RX671 Rev.1.00 - Sample Code (ZIP | English, 日本語) | Contact a Renesas sales office to request the source code. |

RA6M3 HMI Solution

This solution enables edge voice recognition, voice playback, capacitive touch operation and environmental sensing using a single RA6M3 MCU.

Features

  • Realize voice recognition, voice playback, TFT LCD control and environmental sensor control using a single RA6M3 MCU
  • Use voice recognition to change the TFT LCD settings, and get voice feedback
  • Easily change middleware (M/W) parameters while checking the voice waveform with the evaluation tool

Figure: RA6M3 HMI Solution

RA6M3 HMI Solution

Hardware

EK-RA6M3G
  • MCU: RA6M3 (R7FA6M3AH3CFC)
    • Package: 176-pin LQFP
  • USB (Debug, Full Speed, High Speed)
  • Graphics expansion board
    • 4.3-inch TFT color LCD (capacitive touch overlay with controller)
    • 480 x 272 resolution
    • Backlight controller
HMI Expansion Board
  • Analog MEMS Mic x2
  • External expansion microphone circuit (MEMS type (analog output) or Electret condenser type)
  • Speaker operation circuit & speaker
  • Humidity and temperature Sensor (Renesas/HS3001)
  • Gas sensor (Renesas/ZMOD4410)
  • Arduino Uno connection

Software

  • OS: Amazon FreeRTOS
  • Middleware: Advanced Media/AmiVoice Micro, Techno Mathematical/Zoom Voice, CRI Middleware/D-Amp Driver
  • Middleware: Toshiba Digital Solutions/Voice Trigger, Techno Mathematical/Zoom Voice, CRI Middleware/D-Amp Driver

 * The voice playback file was created by Toshiba Digital Solutions/RECAIUS speech synthesis middleware Text-to-Speech

Reference Designs

| | Hardware | Software (source code & application notes) and voice recognition evaluation tool |
|---|---|---|
| RA6M3 HMI solution | RA6M3 Group RA6M3 HMI Expansion Board (PDF | English, 日本語) | Contact a Renesas sales office for detailed information |

Evaluation Tool

Features

Connecting the evaluation board to a PC enables the following functions:

  • Visually check the sound input as a waveform
  • Change the M/W parameters for voice recognition and noise reduction
  • Display recognized ID
  • Sound data before and after noise processing can be saved and played back

Figure: Voice Recognition Evaluation Tool

Recommended Middleware


Advanced Media/AmiVoice Micro - Voice Recognition

Advanced Media/AmiVoice Micro enables voice recognition without an internet connection, at a lower clock speed and with a smaller memory footprint than existing products.

Supported microcontrollers (MCUs)

Renesas Core:

  • RXv2 (RX231/RX230, RX23W, RX65N, RX651, RX64M Group etc.)
  • RXv3 (RX671, RX66N, RX72M, RX72N Group etc.)

Arm Core:

  • Arm® Cortex®-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
  • Arm® Cortex®-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
  • Arm® Cortex®-A9 (RZ/A Series)

| Model | Required Memory Size | Languages |
|---|---|---|
| Normal Model | ROM: Over 33KB, RAM: Over 23KB | Japanese, English, Chinese (Mandarin), Thai, Korean |
| High Recognition Model | ROM: Over 482KB, RAM: Over 23KB | Japanese |

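Required ROM/RAM Size by Number of Recognition Words

| Number of Words | Normal Model ROM (KB) | Normal Model RAM (KB) | High Recognition Model ROM (KB) | High Recognition Model RAM (KB) |
|---|---|---|---|---|
| 5 | 33 | 23 | 482 | 23 |
| 10 | 54 | 25 | 681 | 25 |
| 20 | 78 | 28 | 995 | 28 |
| 30 | 96 | 30 | 1,226 | 30 |
| 40 | 109 | 33 | 1,444 | 33 |
| 50 | 117 | 33 | 1,587 | 33 |
| 100 | 143 | 46 | 2,143 | 46 |
| 150 | 160 | 55 | 2,452 | 55 |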

These figures vary with the language and the content of the recognition words.

The high recognition model improves the recognition rate by using more ROM for its calculations than the normal model.
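
As a rough illustration of how the table above might be used when sizing a design, the sketch below looks up the smallest table row that covers a given vocabulary size and compares it with an MCU's ROM/RAM. The function names and the round-up-to-the-next-row rule are assumptions made for this example and are not part of AmiVoice Micro. The check also ignores the memory needed by the application and other middleware, and the real figures depend on language and word content.

```python
# Values in KB copied from the AmiVoice Micro table above:
# words -> (normal ROM, normal RAM, high-recognition ROM, high-recognition RAM)
AMIVOICE_MEMORY_KB = {
    5: (33, 23, 482, 23),
    10: (54, 25, 681, 25),
    20: (78, 28, 995, 28),
    30: (96, 30, 1226, 30),
    40: (109, 33, 1444, 33),
    50: (117, 33, 1587, 33),
    100: (143, 46, 2143, 46),
    150: (160, 55, 2452, 55),
}


def estimate_memory_kb(num_words, high_recognition=False):
    """Return (rom_kb, ram_kb) from the smallest table row covering num_words.

    Rounding up to the next row is an assumption of this sketch.
    """
    for words in sorted(AMIVOICE_MEMORY_KB):
        if num_words <= words:
            row = AMIVOICE_MEMORY_KB[words]
            return (row[2], row[3]) if high_recognition else (row[0], row[1])
    raise ValueError("the table only covers up to 150 words")


def fits(num_words, mcu_rom_kb, mcu_ram_kb, high_recognition=False):
    """True if the estimated middleware footprint fits in the given ROM/RAM."""
    rom_kb, ram_kb = estimate_memory_kb(num_words, high_recognition)
    return rom_kb <= mcu_rom_kb and ram_kb <= mcu_ram_kb


# RX231 (512KB ROM / 64KB RAM) from the solution table:
print(fits(30, 512, 64))                         # True
print(fits(10, 512, 64, high_recognition=True))  # False (681KB ROM needed)
```

Under these assumptions, a 30-word normal model fits an RX231-class device, while even a 10-word high recognition model needs more ROM than the RX231 provides.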

Support for voice activity detection (VAD)

A voice activity detection module identifies the sections of the audio input that contain human speech, and its detection sensitivity can be adjusted to suit the usage scene and task.
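
To make the idea of an adjustable detection sensitivity concrete, here is a minimal sketch of a generic energy-based VAD. It is not the algorithm used by AmiVoice Micro, which the source does not describe. The function names, the 160-sample frame (10 ms at an assumed 16 kHz sample rate) and the threshold value are all assumptions for illustration.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each full, non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech_frames(samples, frame_len=160, threshold=0.01):
    """Flag frames whose energy exceeds threshold.

    A lower threshold makes the detector more sensitive.
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]


# Example: one silent frame, one frame of a 440 Hz tone, one silent frame.
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
print(detect_speech_frames(silence + tone + silence))  # [False, True, False]
```

A production detector would typically also use spectral features to separate speech from other loud sounds; the single threshold here only stands in for the sensitivity setting.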


Toshiba Digital Solutions/RECAIUS™ Voice Trigger - Voice Recognition

RECAIUS Voice Trigger provides voice control without an internet connection. Target phrases can be changed without speech data, so you can use it as a customized detector for your own wake words and voice commands.

Supported MCUs

Renesas Core:

  • RXv2 (RX65N, RX651, RX64M Group etc.)
  • RXv3 (RX671, RX66N, RX72M, RX72N Group etc.)

Arm Core:

  • Arm® Cortex®-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
  • Arm® Cortex®-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
  • Arm® Cortex®-A9 (RZ/A Series)

Supported languages: Japanese, American English and Mandarin Chinese
To be commercialized (available for evaluation): Canadian French, American Spanish, British English, French, German, Spanish, Italian

Required Memory Size

| Number of Words | ROM (KB) | RAM (KB) |
|---|---|---|
| 5 | 145 | 45 |
| 10 | 160 | 50 |
| 20 | 190 | 65 |

These figures vary with the language and the content of the recognition words.


Techno Mathematical/Zoom Voice - Noise Suppressor Technology

Zoom Voice supports two noise suppression technologies: beam forming and a noise suppressor.

Beam Forming

  • Extracts the target sound arriving from the front while reducing background noise
  • Uses two omnidirectional (non-directional) microphones
  • Strength can be set from 1 (weak) to 7 (strong)
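
The source does not describe how Zoom Voice implements beam forming. As a general illustration of the principle with two microphones, the sketch below shows textbook delay-and-sum beam forming: sound from the front reaches both microphones at the same time and adds coherently, while sound from the side arrives with a time offset and partly cancels. The function name and the integer-sample delay are assumptions for this example, and the 1 to 7 strength setting is not modeled.

```python
def delay_and_sum(mic_a, mic_b, delay_samples=0):
    """Average two microphone channels after delaying mic_b by delay_samples.

    delay_samples=0 steers toward the front (broadside), where sound reaches
    both microphones simultaneously.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    n = min(len(mic_a), len(mic_b))
    out = []
    for i in range(n):
        j = i - delay_samples
        b = mic_b[j] if j >= 0 else 0.0  # pad with silence before mic_b starts
        out.append(0.5 * (mic_a[i] + b))
    return out


# Front target is identical on both channels; a side noise with a period of
# four samples reaches mic_b two samples late, so it appears inverted there.
target = [1.0, 2.0, 3.0, 4.0]
noise_a = [0.5, 0.0, -0.5, 0.0]
noise_b = [-0.5, 0.0, 0.5, 0.0]
mic_a = [t + n for t, n in zip(target, noise_a)]
mic_b = [t + n for t, n in zip(target, noise_b)]
print(delay_and_sum(mic_a, mic_b))  # [1.0, 2.0, 3.0, 4.0]
```

The noise cancels completely here only because the example was constructed so that it arrives exactly half a period apart; real noise is attenuated by an amount that depends on its frequency and direction.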

Noise Suppressor

  • Noise reduction of up to 30dB (about 1/30 in amplitude)
  • Noise reduction can be set per frequency
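
The "about 1/30" figure follows from the standard decibel definition for amplitude: an attenuation of A dB scales amplitude by 10^(-A/20). The short sketch below (the helper names are ours, for illustration only) checks the arithmetic and shows the corresponding power ratio for comparison.

```python
def amplitude_ratio(attenuation_db):
    """Amplitude scale factor for a given attenuation in dB."""
    return 10 ** (-attenuation_db / 20)


def power_ratio(attenuation_db):
    """Power scale factor for a given attenuation in dB."""
    return 10 ** (-attenuation_db / 10)


print(round(amplitude_ratio(30), 4))      # 0.0316, roughly 1/32
print(round(1 / amplitude_ratio(30), 1))  # 31.6
print(round(power_ratio(30), 6))          # 0.001, i.e. 1/1000
```

So a 30dB reduction leaves roughly 1/30 of the noise amplitude, or 1/1000 of its power.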

High-Speed Version Using DSP Instructions
The version that uses DSP instructions runs 30% faster.

Supported MCUs:

Version using DSP instructions: Renesas Core:

  • RXv2 (RX231/RX230, RX23W, RX65N, RX651, RX64M Group etc.)
  • RXv3 (RX671, RX66N, RX72M, RX72N Group etc.)

Normal version: Arm Core:

  • Arm® Cortex®-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
  • Arm® Cortex®-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
  • Arm® Cortex®-A9 (RZ/A Series)

| Noise Suppressor Technology | Required Memory Size |
|---|---|
| Beam Forming | ROM: 40KB, RAM: 10KB |
| Noise Suppressor | ROM: 40KB, RAM: 10KB |

Beam Forming and Noise Suppressor Use Case

Figure: Beam Forming and Noise Suppressor Use Case

Zoom Voice achieves a high recognition rate even in noisy environments. The effect is especially large at an S/N ratio of 5dB or less.
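
For readers who want to relate the 5dB figure to recorded audio, the sketch below computes an S/N ratio in dB from separate speech and noise recordings using the usual definition, 10·log10 of the ratio of mean powers. The function names and the sample data are assumptions for illustration and do not reflect how Renesas measured the published results.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a block of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from separate signal and noise samples."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


speech = [1.0, -1.0] * 4  # placeholder "speech" with amplitude 1.0
noise = [0.5, -0.5] * 4   # placeholder noise with half the amplitude
print(round(snr_db(speech, noise), 2))  # 6.02
```

Halving the noise amplitude relative to the speech gives about 6dB, so an S/N ratio of 5dB or less means the noise amplitude is more than half that of the speech.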

The recognition rate achieved with Zoom Voice in noisy environments, using AmiVoice Micro for voice recognition, is shown below.

Figure: Zoom Voice Recognition Rate

Note 1: The comparison was done using the sound of a vacuum cleaner and washing machine as the source of noise.
Note 2: This data is based on research from Renesas.


Partners

Advanced Media, Inc.

Development and sales of voice recognition software products

Contact: https://www.advanced-media.co.jp/contact/english/


Toshiba Digital Solutions

System integration, development, manufacture and sales of ICT solutions utilizing IoT and AI technology

Contact: https://www.toshiba-sol.co.jp/en/contact/index.html
Email: tdsl-recaius-mw-sales-r1@ml.toshiba.co.jp


Techno Mathematical Co., Ltd.

Development and sales of image, acoustic and sound processing software and hardware products

Contact: http://www.tmath.co.jp/eng/contact_us/



Lab on the Cloud

Renesas' Lab on the Cloud is an online environment where Renesas solutions, including popular evaluation boards, winning combinations and software, are hosted in a remote lab that customers access and test online.

Voice Recognition Solutions

The demo system is a simple working solution that recognizes voice commands and initiates the corresponding operation. It uses a high-performance Arm® Cortex®-M4 core-based RA6M1 MCU. The MCU is highly efficient and is supported by an open and flexible ecosystem, the Flexible Software Package built on FreeRTOS, which reduces development time. The boards are trained with voice models; when they recognize a command, they return a voice response along with a voice match score.

Access the Lab

Videos

RX671 HMI+Cloud Single Chip Solution

Introducing a solution that enables HMI and cloud connectivity functions on a single chip using the RX671. This solution provides a reference demo for a comprehensive experience of voice recognition, touch keys, and cloud connectivity. Demo software can be downloaded from the Renesas website. For details, refer to the application note "RX671 Group Voice recognition / Touch and Cloud solution using Renesas Starter Kit+ for RX671 Rev.1.00 - Sample Code (ZIP | English, 日本語)".