Alexa, can you hear me now? Low power voice interface technology evolves
Echo canceler – allows microphones to hear voice commands even while music is being played back
Direction of arrival estimation – determines the direction the voice is coming from (used in conjunction with beamforming)
Far-field beamforming – combines multiple microphone signals to improve the quality of voice recognition
Noise reduction – removes background noise and interfering sources (like TVs and air conditioners) to further improve voice recognition
Trigger word – a wake word such as “Alexa” that is recognized on the product itself
Voice services integration – back-end, cloud-based voice service
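To make the echo canceler entry in the list above concrete, here is a minimal sketch of a normalized least-mean-squares (NLMS) adaptive filter, a common textbook approach. The article does not say which algorithm the products use; the function name, tap count, and step size are assumptions for illustration only.

```python
def nlms_echo_cancel(far_end, mic, taps=4, step=0.5, eps=1e-8):
    """Remove the loudspeaker echo from the microphone signal.

    far_end: samples sent to the loudspeaker (the music being played).
    mic:     samples picked up by the microphone (echo plus any voice).
    Returns the echo-reduced residual and the learned echo-path weights.
    """
    weights = [0.0] * taps   # running estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        err = d - estimate   # what remains after removing the estimated echo
        norm = sum(h * h for h in history) + eps
        # NLMS update: step size normalized by the recent far-end energy.
        weights = [w + step * err * h / norm for w, h in zip(weights, history)]
        residual.append(err)
    return residual, weights
```

A production canceler would use far longer filters and would pause adaptation while the user is speaking; this sketch omits both.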
Microphone processing audio IP
- Attention processing
- Noise reduction
- Quiescent sound detector
- Optimized beamformers
Playback processing audio IP
- Volume management
- Bass enhancement
- Dialog enhancement
Figure 4 shows the different microphone geometries that were designed using DSP Concepts' optimization algorithms.
The performance of the voice UI depends largely on the signal-to-noise ratio (SNR): how loud the user's voice is relative to other interfering signals in the environment. DSP Concepts developed a beamformer design algorithm that optimizes the received SNR. Rather than designing for particular beam patterns, the company optimizes the SNR at the output of the microphone array, because the problem is fundamentally one of SNR. A designer can take the microphone geometry, the noise level in the room, and even the SNR of the microphones being used, and optimize with this information (Figure 2).
Figure 2. Graphs for the SNR-optimized beamformer, which focuses on the user’s voice. The graphs are for a two-element microphone array. (Image courtesy of DSP Concepts)
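The idea of optimizing output SNR rather than shaping a beam pattern can be illustrated with the textbook minimum-variance distortionless-response (MVDR) solution for a two-element array. This is a sketch only, not DSP Concepts' design algorithm, which the article does not detail. The function names, the 343 m/s speed of sound, the diffuse-plus-self-noise model, and all parameters are assumptions made for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(freq_hz, spacing_m, angle_deg):
    """Relative phase of a far-field voice at each of two microphones."""
    delay = spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [1.0 + 0j, cmath.exp(-2j * math.pi * freq_hz * delay)]


def noise_covariance(freq_hz, spacing_m, diffuse_power, sensor_power):
    """2x2 noise covariance: diffuse room noise plus microphone self-noise."""
    x = 2.0 * math.pi * freq_hz * spacing_m / SPEED_OF_SOUND
    coherence = 1.0 if x == 0 else math.sin(x) / x  # diffuse-field model
    diag = diffuse_power + sensor_power
    off = diffuse_power * coherence
    return [[diag, off], [off, diag]]


def max_snr_weights(d, rn):
    """w = Rn^-1 d / (d^H Rn^-1 d): maximizes output SNR with unit voice gain."""
    det = rn[0][0] * rn[1][1] - rn[0][1] * rn[1][0]
    inv = [[rn[1][1] / det, -rn[0][1] / det],
           [-rn[1][0] / det, rn[0][0] / det]]
    v = [inv[0][0] * d[0] + inv[0][1] * d[1],
         inv[1][0] * d[0] + inv[1][1] * d[1]]
    norm = (d[0].conjugate() * v[0] + d[1].conjugate() * v[1]).real
    return [v[0] / norm, v[1] / norm]


def output_snr(w, d, rn, signal_power):
    """Ratio of voice power to noise power at the beamformer output."""
    gain = abs(w[0].conjugate() * d[0] + w[1].conjugate() * d[1]) ** 2
    noise = sum(w[i].conjugate() * rn[i][j] * w[j]
                for i in range(2) for j in range(2)).real
    return signal_power * gain / noise
```

The inputs mirror the quantities named above: geometry (`spacing_m`), room noise (`diffuse_power`), and microphone quality (`sensor_power`). When the noise is purely uncorrelated microphone self-noise, the weights reduce to averaging the two aligned signals, which doubles the SNR (about 3 dB) relative to a single microphone.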
Attention processing refers to algorithms that wrap around the ‘wake word.’ An example is the Amlogic reference design, which includes this processing. Based on the environment, the system knows when to pay attention and when to ignore what it hears, which reduces false alarms and missed triggers. The software mimics the way a person sometimes pays close attention to something and at other times attends only peripherally.
Advanced noise reduction
The main challenge here is not so much building an echo canceler as dealing with environmental noise. In a home, a dishwasher may be running or a TV or radio playing, and these sounds need to be ignored. Different noise reduction algorithms address this: some target steady-state, or stationary, noise (fans and air conditioners), and others target directional interferers (radios and TVs).
In the smart home, people want voice UI on their air conditioners. In an automotive environment, there is road noise, engine noise, and more.
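For the stationary case (fans, air conditioners), one classic technique is spectral subtraction: estimate the noise magnitude in each frequency bin while nobody is speaking, then subtract it from later frames. The sketch below is a generic illustration and not the algorithm used in any product mentioned here. The function names and the `floor` parameter are assumptions, and a naive DFT is used only to keep the example self-contained.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (a real system would use an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft(), returning the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(frame, noise_mag, floor=0.05):
    """Subtract a per-bin noise magnitude estimate from one audio frame.

    noise_mag: magnitude spectrum measured during a noise-only interval.
    floor:     fraction of the original magnitude always kept, which
               limits the artifacts of over-subtraction.
    """
    cleaned = []
    for bin_value, noise in zip(dft(frame), noise_mag):
        mag = abs(bin_value)
        new_mag = max(mag - noise, floor * mag)
        # Keep the noisy phase; only the magnitude is modified.
        cleaned.append(cmath.rect(new_mag, cmath.phase(bin_value)))
    return idft(cleaned)
```

Directional interferers such as a TV are not stationary in this sense and are usually handled spatially, for example by the beamformer, rather than by this kind of subtraction.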
Quiescent sound detector
Battery-powered applications need low-power processing, so an algorithm listens on one of the microphones for anything that might need attention. When it detects sound, it wakes the rest of the system so that more advanced processing can be applied to that sound.
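A detector of this kind can be sketched as a frame-energy check against a slowly adapting noise floor. This is a hedged illustration, not the detector described in the article; the class name, the threshold ratio, and the smoothing constant are hypothetical choices.

```python
class QuiescentSoundDetector:
    """Flags frames whose energy rises well above the tracked noise floor."""

    def __init__(self, threshold_ratio=4.0, smoothing=0.95, initial_floor=1e-4):
        self.threshold_ratio = threshold_ratio  # how far above the floor counts as sound
        self.smoothing = smoothing              # closer to 1.0 means slower floor tracking
        self.floor = initial_floor              # running estimate of background energy

    def process(self, frame):
        """Return True if the rest of the system should be woken for this frame."""
        energy = sum(s * s for s in frame) / len(frame)
        triggered = energy > self.threshold_ratio * self.floor
        if not triggered:
            # Update the floor only on quiet frames so speech does not raise it.
            self.floor = (self.smoothing * self.floor
                          + (1.0 - self.smoothing) * energy)
        return triggered
```

The per-frame cost is one multiply-accumulate per sample on a single microphone, which is why a check like this can stay running while the heavier processing sleeps.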
DSP Concepts also offers complete, turnkey software solutions with a full suite of algorithms for designers who are building a system that connects to the cloud.