Voice Recognition Solution

Voice recognition solution with a high recognition rate even in noisy environments

Voice recognition is used as a human-machine interface and has been adopted in many products, such as robots and smart speakers. It developed from the need to add more convenient functions to consumer and industrial equipment while keeping costs as low as possible. Voice recognition is also becoming an important additional feature because it can assist visually impaired and elderly people. Renesas provides a voice recognition solution that requires no internet connection, helping to differentiate existing products and add functionality.

System Overview

Implemented with an A/D converter or SSI (Serial Sound Interface) and middleware


Achieves a high recognition rate in noisy environments by using noise suppressor technology

Examples of noise suppressor technologies:

  • Beamforming
    Reducing noise that arrives from directions other than the target
  • Noise suppressor
    Reducing steady noise
  • Echo cancellation
    Preventing or removing echo that is being created or already present
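
As an illustration of the beamforming idea listed above, the following is a minimal delay-and-sum sketch for two microphones. It is a generic textbook technique, not the algorithm used by any middleware named on this page. The function names, the 16 kHz sampling rate, the test tones, and the 4-sample arrival difference are all assumptions chosen so that the effect is easy to see.

```python
import math

def delay_and_sum(mic_a, mic_b, delay_samples=0):
    """Average two microphone signals after delaying mic_a by delay_samples.

    A delay of 0 steers the beam straight ahead, where a frontal source
    reaches both microphones at the same time.
    """
    out = []
    for n in range(len(mic_b)):
        k = n - delay_samples
        a = mic_a[k] if 0 <= k < len(mic_a) else 0.0
        out.append(0.5 * (a + mic_b[n]))
    return out

def rms(x):
    """Root-mean-square level of a sample sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

# Hypothetical test signals (assumed values, not from the source):
# a 1 kHz target from the front and a 2 kHz interferer from the side
# that reaches microphone B 4 samples later than microphone A.
FS = 16000
N = 1600
target = [math.sin(2 * math.pi * 1000 * n / FS) for n in range(N)]
noise = [math.sin(2 * math.pi * 2000 * n / FS) for n in range(N + 4)]

mic_a = [target[n] + noise[n + 4] for n in range(N)]
mic_b = [target[n] + noise[n] for n in range(N)]

out = delay_and_sum(mic_a, mic_b)           # beam steered to the front
residual = [out[n] - target[n] for n in range(N)]

print("input level :", round(rms(mic_b), 3))
print("output level:", round(rms(out), 3))
print("residual    :", rms(residual))
```

In this contrived case the interferer is exactly half a period out of phase between the microphones, so it cancels completely while the frontal target passes unchanged. Real signals give only partial attenuation that depends on frequency and arrival angle.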

Recommended Middleware

Voice recognition/AmiVoice Micro

Features of AmiVoice Micro

Provides voice recognition without an internet connection, at a lower clock speed and with less memory than existing products

 

Two acoustic models
  • Normal model
  • High recognition model
  ※ The high recognition model improves the recognition rate at the cost of more ROM than the normal model.

 

VAD (voice activity detection) support
AmiVoice Micro includes a module that detects the sections of the audio input that contain human speech. The detection sensitivity can be adjusted to suit the usage scene and task (threshold range: 1000 to 15000).
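
The effect of a VAD threshold can be sketched with a simple energy detector. This is not the AmiVoice Micro implementation, and the source does not say what quantity its threshold is compared against. The sketch assumes 16-bit PCM input and compares the threshold with the mean absolute amplitude of each frame; the function name, frame length, and sample data are hypothetical.

```python
def detect_speech_frames(samples, frame_len=160, threshold=1000):
    """Return one boolean per frame: True when the frame's mean absolute
    amplitude exceeds the threshold (assumed meaning of the threshold)."""
    if not 1000 <= threshold <= 15000:
        raise ValueError("threshold must be in the range 1000 to 15000")
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        level = sum(abs(s) for s in frame) / frame_len
        flags.append(level > threshold)
    return flags

# Hypothetical 16-bit PCM data: a quiet frame, a loud frame, a quiet frame.
quiet = [50, -50] * 80
loud = [8000, -8000] * 80
pcm = quiet + loud + quiet

flags_sensitive = detect_speech_frames(pcm, threshold=1000)
flags_strict = detect_speech_frames(pcm, threshold=15000)

print(flags_sensitive)
print(flags_strict)
```

With the lowest threshold the loud frame is flagged as speech. With the highest threshold nothing is flagged, which illustrates why the setting has to match the loudness of the usage scene.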

Required memory size

  • Normal model
    ROM: 33[KB] or more, RAM: 23[KB] or more
  • High recognition model
    ROM: 482[KB] or more, RAM: 23[KB] or more

 

Required ROM/RAM size by number of recognition words

Number of words   Normal model [KB]    High recognition model [KB]
                  ROM       RAM        ROM       RAM
5                 33        23         482       23
10                54        25         681       25
20                78        28         995       28
30                96        30         1,226     30
40                109       33         1,444     33
50                117       33         1,587     33
100               143       46         2,143     46
150               160       55         2,452     55

* For reference only. Actual values vary with the language and the content of the recognition words.
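
The table above can be turned into a quick feasibility check against a device's memory. The sketch below copies the reference figures from the table; the helper names are hypothetical, and rounding a word count up to the next listed row is a conservative assumption rather than a documented rule. The budgets in the example are the RX231 figures quoted later on this page (ROM: 512[KB], RAM: 64[KB]).

```python
# Reference figures copied from the table above: words -> (ROM KB, RAM KB).
MEMORY_KB = {
    "normal": {5: (33, 23), 10: (54, 25), 20: (78, 28), 30: (96, 30),
               40: (109, 33), 50: (117, 33), 100: (143, 46), 150: (160, 55)},
    "high": {5: (482, 23), 10: (681, 25), 20: (995, 28), 30: (1226, 30),
             40: (1444, 33), 50: (1587, 33), 100: (2143, 46), 150: (2452, 55)},
}

def required_memory_kb(model, words):
    """Return (ROM, RAM) in KB for the smallest table row covering `words`.

    Rounding up to the next listed row is an assumption of this sketch.
    """
    table = MEMORY_KB[model]
    for count in sorted(table):
        if words <= count:
            return table[count]
    raise ValueError("word count exceeds the table (maximum 150)")

def fits(model, words, rom_budget_kb, ram_budget_kb):
    """True when the recognizer's reference footprint is within the budgets."""
    rom, ram = required_memory_kb(model, words)
    return rom <= rom_budget_kb and ram <= ram_budget_kb

# Example budgets: RX231 with ROM 512 KB and RAM 64 KB.
print(required_memory_kb("normal", 25))
print(fits("normal", 150, 512, 64))
print(fits("high", 10, 512, 64))
```

The check covers the recognizer alone. Application code and other middleware also need ROM and RAM, so a real budget should leave headroom.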


Languages

  • Normal model
    Japanese, English, Chinese (Mandarin)
  • High recognition model
    Japanese

Noise suppressor technology/Zoom Voice

Features of Zoom Voice

Supports two noise suppressor technologies

Beamforming
  • Extracts the target sound arriving from the front while reducing background noise
  • Uses two omnidirectional microphones
  • Strength can be set from 1 (weak) to 7 (strong)
Noise suppressor
  • Noise reduction of up to 30[dB] (about 1/30 in amplitude)
  • Noise reduction can be set according to frequency
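
The relationship between the 30 dB figure and "about 1/30" can be checked with the standard decibel formulas. A reduction of 30 dB corresponds to an amplitude ratio of 10^(-30/20), roughly 1/31.6, and to a power ratio of 10^(-30/10), which is 1/1000. The function names below are illustrative only.

```python
def db_to_amplitude_ratio(db):
    """Amplitude ratio remaining after a reduction of `db` decibels."""
    return 10 ** (-db / 20)

def db_to_power_ratio(db):
    """Power ratio remaining after a reduction of `db` decibels."""
    return 10 ** (-db / 10)

amp = db_to_amplitude_ratio(30)
power = db_to_power_ratio(30)

print("amplitude ratio: 1/%.1f" % (1 / amp))
print("power ratio    : 1/%.0f" % (1 / power))
```

The "about 1/30" figure therefore matches the amplitude ratio, not the power ratio.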

 

High-speed version using DSP instructions

The version that uses DSP instructions processes 30% faster than the version without them.

Supported MCUs: RX family devices with the RXv2 CPU core (RX64M, RX71M, RX231, RX230, RX65N, RX651, RX23T, RX24T, RX24U Groups)


Required memory size

ROM: 40[KB], RAM: 10[KB]

Use case for beamforming and noise suppressor

 

beam forming and noise suppressor

Zoom Voice achieves a high recognition rate even in noisy environments.

The effect is especially large at S/N ratios of 5[dB] or less.

 

Recognition rate with Zoom Voice in noisy environments (AmiVoice Micro is used for voice recognition)

Zoom Voice

Note 1: The noise sources are the sounds of a vacuum cleaner and a washing machine.

Note 2: This data is based on research by Renesas.

Solutions

High performance HMI solution

Provides a noise suppressor, voice recognition, voice synthesis, and a touch panel on a single RZ/A1H chip, without an internet connection.

 

Features

Voice recognition with noise suppressor technology

Beamforming and the noise suppressor improve the recognition rate.

Tunable voice recognition

The performance parameters of the noise suppressor and voice recognition can be changed on the touch panel to suit the usage environment.

Feedback of results through voice synthesis and the LCD

Recognition results can be fed back to the user through voice synthesis and the LCD.

Overview

High performance HMI solution


 

Used middleware

Function           Partner                         Middleware      Remarks
Noise suppressor   Techno Mathematical Co., Ltd.   Zoom Voice      -
Voice recognition  Advanced Media Inc.             AmiVoice Micro  High recognition version, normal version
Voice synthesis    Hitachi ULSI Systems Co., Ltd.  Ruby Talk®      -

Low power consumption voice recognition solution

Provides high-performance voice recognition by using the low-power technology and DSP instructions of the RX231

 

Features

Voice recognition with noise suppressor technology

Beamforming and the noise suppressor improve the recognition rate and can be tuned to the usage environment.

Infrared communication function

Controls infrared-compatible equipment according to the recognition results.

Start evaluating and developing speech recognition right away

Both a board and sample software are provided for evaluation and development.

Overview

Low power consumption voice recognition solution


 

Used middleware

Function           Partner                         Middleware      Remarks
Noise suppressor   Techno Mathematical Co., Ltd.   Zoom Voice      DSP instruction applied version
Voice recognition  Advanced Media Inc.             AmiVoice Micro  Normal version

Comparison of solutions

MPU/MCU
  • High performance HMI solution
    RZ/A1H (RAM: 10[MB], Clock: 400[MHz])
  • Low power consumption voice recognition solution
    RX231 (ROM: 512[KB], RAM: 64[KB], Clock: 54[MHz])

OS
  • High performance HMI solution
    mbed (RTOS)
  • Low power consumption voice recognition solution
    Not used

Functions
  • High performance HMI solution
    - Voice recognition: AmiVoice Micro (High recognition version, normal version)
    - Noise suppressor: Zoom Voice
    - Voice synthesis: Ruby Talk®
    - Touch screen
  • Low power consumption voice recognition solution
    - Voice recognition: AmiVoice Micro (Normal version)
    - Noise suppressor: Zoom Voice (DSP instruction applied version)
    - Infrared communication output

Recognition languages
  • High performance HMI solution
    All three languages
    - Japanese (High recognition version)
    - Chinese (Normal version)
    - English (Normal version)
  • Low power consumption voice recognition solution
    One of the following languages
    - Japanese (Normal version)
    - Chinese (Normal version)
    - English (Normal version)

Overview
  • High performance HMI solution
    - Provides a noise suppressor, voice recognition, voice synthesis, and a touch panel on a single RZ/A1H chip
    - Performance parameters of the noise suppressor and voice recognition can be changed on the touch panel
    - No internet connection required
  • Low power consumption voice recognition solution
    - Infrared communication output (infrared remote controller) according to the voice recognition results
    - Small board with MEMS microphone mounted
    - Parameters of voice recognition and the noise suppressor can be set by switch
    - No internet connection required

Reference designs

Solution                                          Application note
High performance HMI solution                     Contact us
Low power consumption voice recognition solution  r12an0091ej0100

Partners

Advanced Media, Inc. (株式会社アドバンスト・メディア)

Development and sales of voice recognition software products

CONTACT: https://www.advanced-media.co.jp/contact/english/


Techno Mathematical Co., Ltd. (株式会社テクノマセマティカル)

Development and sales of image, acoustic and sound processing software and hardware products

CONTACT: http://www.tmath.co.jp/eng/contact_us/
