Voice Recognition Solution
A solution with a high recognition rate even in noisy environments
Voice recognition is a human-machine interface that has been adopted in many products, such as robots and smart speakers. It grew out of the need to add more convenient functions to consumer and industrial equipment while keeping costs as low as possible. Voice recognition is also becoming an important additional feature because it can assist visually impaired and elderly users. Renesas provides a voice recognition solution that does not need an internet connection, helping to differentiate existing products and extend their functionality.
Implemented with an A/D converter or SSI (Serial Sound Interface) and middleware
Achieves a high recognition rate in noisy environments using noise suppressor technology
Examples of noise suppressor technologies:
- Beamforming
Reducing noise from sources other than the target
- Noise suppressor
Reducing steady noise
- Echo cancellation
Preventing or removing echo that is being created or is already present
Voice recognition/AmiVoice Micro
Features of AmiVoice Micro
- Runs voice recognition without an internet connection, at a lower clock frequency and in less memory than existing products
- Two acoustic models
  - Normal model
  - High recognition model
- Support for VAD (voice activity detection)
  - Includes a module that detects the sections containing only human speech in the input audio. The detection sensitivity can be adjusted to the usage scene and task (threshold range: 1000 to 15000).
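The role of the VAD threshold can be illustrated with a minimal energy-based sketch. This is not AmiVoice Micro's algorithm or API, neither of which is described here. It assumes 16-bit PCM samples and assumes the threshold is compared with the mean absolute amplitude of each frame; the function name `detect_speech_frames` and the frame length of 160 samples are hypothetical.

```python
def detect_speech_frames(samples, threshold=1000, frame_len=160):
    """Return one flag per full frame: True if the frame is loud enough.

    Assumption: 'threshold' is compared with the mean absolute amplitude
    of 16-bit PCM samples. The real middleware may work differently.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        level = sum(abs(s) for s in frame) / frame_len
        flags.append(level >= threshold)
    return flags


# One quiet frame followed by one loud frame.
audio = [0] * 160 + [5000, -5000] * 80
low_sensitivity = detect_speech_frames(audio, threshold=15000)
high_sensitivity = detect_speech_frames(audio, threshold=1000)
```

With these inputs, the lower threshold flags the loud frame as speech, while the highest threshold rejects both frames. In this sketch a lower threshold therefore means a more sensitive detector.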
Required memory size
- Normal model
ROM: 33[KB] or more, RAM: 23[KB] or more
- High recognition model
ROM: 482[KB] or more, RAM: 23[KB] or more
Required ROM/RAM size by number of recognition words
Word number | Normal model ROM [KB] | Normal model RAM [KB] | High recognition model ROM [KB] | High recognition model RAM [KB] |
---|---|---|---|---|
5 | 33 | 23 | 482 | 23 |
10 | 54 | 25 | 681 | 25 |
20 | 78 | 28 | 995 | 28 |
30 | 96 | 30 | 1,226 | 30 |
40 | 109 | 33 | 1,444 | 33 |
50 | 117 | 33 | 1,587 | 33 |
100 | 143 | 46 | 2,143 | 46 |
150 | 160 | 55 | 2,452 | 55 |
* For reference only. Actual sizes vary with the language and the content of the recognition words.
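For word counts between the listed rows, a rough estimate can be obtained by interpolating the table. The sketch below assumes linear interpolation between adjacent rows, which the source does not state. Because the figures are for reference only, the result is at best a planning aid. The function name `estimate_memory_kb` is hypothetical.

```python
# (word number, ROM [KB], RAM [KB]) rows copied from the table above.
TABLE = {
    "normal": [(5, 33, 23), (10, 54, 25), (20, 78, 28), (30, 96, 30),
               (40, 109, 33), (50, 117, 33), (100, 143, 46), (150, 160, 55)],
    "high": [(5, 482, 23), (10, 681, 25), (20, 995, 28), (30, 1226, 30),
             (40, 1444, 33), (50, 1587, 33), (100, 2143, 46), (150, 2452, 55)],
}


def estimate_memory_kb(words, model="normal"):
    """Estimate (ROM, RAM) in KB by linear interpolation (an assumption)."""
    rows = TABLE[model]
    if not rows[0][0] <= words <= rows[-1][0]:
        raise ValueError("word count is outside the 5 to 150 range of the table")
    for (w0, rom0, ram0), (w1, rom1, ram1) in zip(rows, rows[1:]):
        if w0 <= words <= w1:
            t = (words - w0) / (w1 - w0)
            return rom0 + t * (rom1 - rom0), ram0 + t * (ram1 - ram0)


rom_kb, ram_kb = estimate_memory_kb(25, "normal")
```

For 25 words with the normal model, this gives a value halfway between the 20-word and 30-word rows.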
Languages
- Normal model
Japanese, English, Chinese (Mandarin)
- High recognition model
Japanese
Noise suppressor technology/Zoom Voice
Features of Zoom Voice
Supports two noise suppressor technologies
Beam forming
- Extracts the target sound arriving from the front while reducing background noise
- Uses two omnidirectional microphones
- Effect strength can be set from 1 (weak) to 7 (strong)
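The idea behind two-microphone beam forming can be sketched with the simplest textbook variant, delay-and-sum with zero steering delay. This is not Zoom Voice's algorithm, which is not described here. The sketch assumes the target arrives from the front and so reaches both microphones at the same time, while an off-axis noise tone reaches the second microphone four samples later. All signals and names are illustrative.

```python
import math


def delay_and_sum(mic_a, mic_b):
    """Average two microphone signals sample by sample (steered to the front)."""
    return [(a + b) / 2 for a, b in zip(mic_a, mic_b)]


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


n = 64
# Target from the front: identical at both microphones.
target = [math.sin(2 * math.pi * k / 16) for k in range(n)]
# Noise from the side: period of 8 samples, delayed by 4 samples at mic B.
noise_a = [math.sin(2 * math.pi * k / 8) for k in range(n)]
noise_b = [math.sin(2 * math.pi * (k - 4) / 8) for k in range(n)]

mic_a = [t + v for t, v in zip(target, noise_a)]
mic_b = [t + v for t, v in zip(target, noise_b)]

output = delay_and_sum(mic_a, mic_b)
residual = [o - t for o, t in zip(output, target)]
```

In this contrived case the delay equals half the noise period, so the noise cancels almost completely and `output` is essentially the target. Real noise is broadband and arrives from many directions, so a practical beamformer attenuates it only partially.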
Noise suppressor
- Noise reduction of up to 30 dB (about 1/30 in amplitude)
- The amount of noise reduction can be set according to frequency
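The relationship between the decibel figure and the "about 1/30" ratio follows from the standard amplitude definition, attenuation in dB = 20·log10(1/ratio). The short sketch below only carries out this arithmetic; the function names are illustrative.

```python
import math


def db_to_amplitude_ratio(attenuation_db):
    """Amplitude ratio remaining after attenuating by the given number of dB."""
    return 10 ** (-attenuation_db / 20)


def amplitude_ratio_to_db(ratio):
    """Attenuation in dB that corresponds to a remaining amplitude ratio."""
    return -20 * math.log10(ratio)


remaining = db_to_amplitude_ratio(30)      # roughly 0.0316, i.e. about 1/32
exact_db = amplitude_ratio_to_db(1 / 30)   # roughly 29.5 dB
```

A 30 dB reduction therefore leaves about 3% of the original noise amplitude. In terms of power, the same 30 dB corresponds to a factor of 1/1000.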
High-speed version using DSP instructions
The version that uses DSP instructions processes 30% faster
Supported MCU: RXv2 CPU-based RX family (RX64M, RX71M, RX231, RX230, RX65N, RX651, RX23T, RX24T, RX24U Group)
Required memory size
ROM: 40[KB], RAM: 10[KB]
Use case of beam forming and noise suppressor
Zoom Voice achieves a high recognition rate even in noisy environments.
A particularly strong effect can be expected at an S/N ratio of 5[dB] or less.
Recognition rate with Zoom Voice in noisy environments (AmiVoice Micro is used for voice recognition)
Note 1: Vacuum cleaner and washing machine sounds were used as the noise sources.
Note 2: This data is based on research by Renesas.
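To make the 5[dB] S/N condition concrete, the sketch below computes an S/N ratio from signal powers and scales a noise recording so that a speech-plus-noise mix hits a chosen S/N ratio. It does not describe the measurement procedure Renesas used, and the test signals and function names are illustrative assumptions.

```python
import math


def power(signal):
    """Mean power of a signal."""
    return sum(v * v for v in signal) / len(signal)


def snr_db(speech, noise):
    """S/N ratio in dB from the powers of the two signals."""
    return 10 * math.log10(power(speech) / power(noise))


def scale_noise_for_snr(speech, noise, target_db):
    """Return the noise scaled so that snr_db(speech, result) equals target_db."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (target_db / 10)))
    return [gain * v for v in noise]


n = 200
speech = [math.sin(2 * math.pi * k / 20) for k in range(n)]
noise = [((k * 7) % 11 - 5) / 5 for k in range(n)]   # deterministic stand-in for noise

noise_5db = scale_noise_for_snr(speech, noise, 5)
mix = [s + v for s, v in zip(speech, noise_5db)]
```

At 5 dB the speech power is only about 3.2 times the noise power, which is why recognition without noise suppression becomes difficult in this range.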
High performance HMI solution
Implements noise suppression, voice recognition, voice synthesis and a touch panel on a single RZ/A1H chip, without an internet connection.
Features
Voice recognition with noise suppressor technology
The beam forming and noise suppressor functions improve the recognition rate.
Tunable voice recognition
Performance parameters of the noise suppressor and voice recognition can be changed on the touch panel to suit the usage environment.
Feedback of results by voice synthesis and LCD display
Recognition results can be fed back through the voice synthesis function and the LCD display function.
Used middleware
Function | Partner | Middleware | Remarks |
---|---|---|---|
Noise suppressor | Techno Mathematical Co., Ltd. | Zoom Voice | - |
Voice recognition | Advanced Media Inc. | AmiVoice Micro | High recognition version, normal version |
Voice synthesis | Hitachi ULSI Systems Co., Ltd. | Ruby Talk® | - |
Low power consumption voice recognition solution
Achieves high-performance voice recognition using the low power consumption technology and DSP instructions of the RX231
Features
Voice recognition with noise suppressor technology
The beam forming and noise suppressor functions improve the recognition rate and can be tuned to the usage environment.
Infrared communication function
Infrared-compatible equipment can be controlled according to the recognition results.
Immediate start of evaluation and development of the speech recognition function
Both a board and sample software are provided for evaluation and development.
Used middleware
Function | Partner | Middleware | Remarks |
---|---|---|---|
Noise suppressor | Techno Mathematical Co., Ltd. | Zoom Voice | DSP instruction applied version |
Voice recognition | Advanced Media Inc. | AmiVoice Micro | Normal version |
Comparison of solutions
Item | High performance HMI solution | Low power consumption voice recognition solution |
---|---|---|
MPU/MCU | RZ/A1H (RAM: 10[MB], Clock: 400[MHz]) | RX231 (ROM: 512[KB], RAM: 64[KB], Clock: 54[MHz]) |
OS | mbed (RTOS) | Not used |
Functions | Voice recognition: AmiVoice Micro (High recognition version, normal version); Noise suppressor: Zoom Voice; Voice synthesis: Ruby Talk®; Touch screen | Voice recognition: AmiVoice Micro (Normal version); Noise suppressor: Zoom Voice (DSP instruction applied version); Infrared communication output |
Recognition languages | All three languages: Japanese (High recognition version), Chinese (Normal version), English (Normal version) | One of the following languages: Japanese (Normal version), Chinese (Normal version), English (Normal version) |
Overview | Noise suppressor, voice recognition, voice synthesis and touch panel implemented on a single RZ/A1H chip; performance parameters of the noise suppressor and voice recognition can be changed on the touch panel; no internet connection required | Infrared communication output (infrared remote controller) according to the voice recognition results; small board with a MEMS microphone mounted; parameters of voice recognition and noise suppressor can be set by switch; no internet connection required |
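The reference memory figures on this page allow a rough check of which AmiVoice Micro configurations could sit alongside Zoom Voice on the RX231. The sketch below is simple arithmetic on those figures. It assumes the DSP version of Zoom Voice needs the same 40[KB] ROM and 10[KB] RAM as stated above, and it ignores application code, stack and buffers, so a real budget would be tighter. The function name `fits_on_rx231` is hypothetical.

```python
# Reference figures copied from this page, all in KB.
RX231 = {"rom": 512, "ram": 64}
ZOOM_VOICE = {"rom": 40, "ram": 10}   # assumed to also apply to the DSP version


def fits_on_rx231(amivoice_rom, amivoice_ram):
    """True if AmiVoice Micro plus Zoom Voice fit in the RX231 ROM and RAM."""
    rom = amivoice_rom + ZOOM_VOICE["rom"]
    ram = amivoice_ram + ZOOM_VOICE["ram"]
    return rom <= RX231["rom"] and ram <= RX231["ram"]


print(fits_on_rx231(143, 46))   # normal model, 100 words -> True
print(fits_on_rx231(482, 23))   # high recognition model, 5 words -> False
```

By these figures alone, even the smallest high recognition model exceeds the RX231 ROM once Zoom Voice is added, which is consistent with the low power solution using the normal version.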
Partners
Development and sales of voice recognition software products
Advanced Media Inc.
Development and sales of image, acoustic and sound processing software and hardware products