Alexa, can you hear me now? Low power voice interface technology evolves

January 15, 2018

Software IP

Echo canceler – allows microphones to hear voice commands even while music is being played back

Direction of arrival estimation – determines direction that the voice is coming from (used in conjunction with beamforming)

Far-field beamforming – combines multiple microphone signals to improve the quality of voice recognition

Noise reduction – removes background noise and interfering sources (like TVs and air conditioners) to further improve voice recognition

Trigger word – a wake word like “Alexa” that is recognized on the product itself

Voice services integration – connection to a back-end, cloud-based voice service
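To make the direction of arrival idea concrete, here is a minimal sketch for a two-microphone array. It is not DSP Concepts' algorithm; it simply assumes a far-field source, finds the sample delay between the two microphones by brute-force cross-correlation, and converts that delay to an angle. The function names, the 16 kHz sample rate, and the 7 cm spacing in the usage note are all assumptions made for illustration.

```python
import math

def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) by which signal y trails signal x.

    Brute-force cross-correlation: try every lag in [-max_lag, max_lag]
    and keep the one where x and the shifted y line up best.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                score += x[n] * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def arrival_angle_deg(lag, fs, spacing_m, c=343.0):
    """Convert a sample delay to an angle from broadside, in degrees.

    Assumes a far-field (plane wave) source, microphones spacing_m apart,
    sample rate fs in Hz, and a speed of sound c in m/s.
    """
    s = lag * c / (fs * spacing_m)
    s = max(-1.0, min(1.0, s))  # clamp against rounding and noisy lags
    return math.degrees(math.asin(s))
```

With hypothetical values of fs = 16000 and spacing_m = 0.07, a delay of two samples corresponds to an angle between 37 and 38 degrees off broadside, and a delay of zero means the talker is directly in front of the array. A beamformer can then be steered toward that angle.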

Microphone processing audio IP

  • Attention processing
  • Noise reduction
  • Quiescent sound detector
  • Optimized beamformers

Playback processing audio IP

  • Volume management
  • Bass enhancement
  • Dialog enhancement


Figure 4 shows the different microphone geometries that are designed using DSP Concepts' optimization algorithms.

The performance of a voice UI depends largely on the signal-to-noise ratio (SNR): how loud the user's voice is relative to the other interfering signals in the environment. DSP Concepts developed a beamformer design algorithm that optimizes the received SNR. Rather than designing for a particular beam pattern, the algorithm optimizes the SNR at the output of the microphone array, because the problem is fundamentally one of SNR. A designer can therefore take the microphone geometry, the noise level in the room, and even the SNR rating of the microphones being used, and optimize with this information (Figure 2).

Figure 2 Graphs for the SNR optimized beamformer which focuses upon the user’s voice. The graphs above are for a two-element microphone array. (Image courtesy of DSP Concepts)
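The idea of optimizing output SNR rather than a beam pattern can be illustrated with a deliberately simplified model. The sketch below is not DSP Concepts' design algorithm. It assumes that every microphone receives the voice at the same amplitude and that each microphone adds its own uncorrelated noise with a known variance. Under those assumptions, the weights that maximize output SNR are proportional to the inverse of each microphone's noise variance. All names are hypothetical.

```python
def max_snr_weights(noise_vars):
    """Weights that maximize output SNR under the assumed model:
    same signal amplitude at every mic, uncorrelated noise per mic.
    Each weight is proportional to 1 / noise variance, normalized to sum to 1.
    """
    inverse = [1.0 / v for v in noise_vars]
    total = sum(inverse)
    return [x / total for x in inverse]

def output_snr(weights, signal_power, noise_vars):
    """SNR (as a power ratio) at the output of the weighted sum of mics."""
    # The signal adds coherently, so its amplitude scales with sum(weights).
    signal = signal_power * sum(weights) ** 2
    # Uncorrelated noise adds in power, scaled by each weight squared.
    noise = sum(w * w * v for w, v in zip(weights, noise_vars))
    return signal / noise
```

For a two-element array in which one microphone is four times noisier than the other (noise variances 1.0 and 4.0, signal power 1.0), the optimized weights come out to 0.8 and 0.2 and give an output SNR of 1.25. Plain averaging of the two microphones gives 0.8, and the better microphone alone gives 1.0. A real design would also account for geometry, correlated room noise, and frequency dependence.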

Attention processing

These are algorithms that wrap around the ‘wake word.’ One example is the Amlogic reference design, which includes processing around the ‘wake word’ called Attention Processing. Based on the environment, the system knows when to pay attention and when to ignore sounds, which reduces false alarms and missed triggers. The software mimics the way a person sometimes pays full attention to something and at other times attends to it only peripherally.

Advanced noise reduction

The main challenge here is not so much building an echo canceller as dealing with environmental noise. In a home, a dishwasher may be running or a TV or radio may be playing, and these sounds need to be ignored. Different noise reduction algorithms address different kinds of noise: some handle steady-state or stationary noise (fans and air conditioners), and others handle directional interferers (radios and TVs).

In the smart home, people want voice UI on their air conditioners. In an automotive environment, there is road noise, engine noise, and more.

Quiescent sound detector

Battery-powered applications need low-power processing, so a lightweight algorithm listens on one of the microphones for anything that might need attention. When it detects sound, it wakes up the rest of the system so that more advanced processing can be applied.
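A minimal sketch of this kind of detector is shown below. It is not the vendor's implementation; it assumes a simple energy test against a slowly adapting noise floor on frames from a single microphone. The function names, the threshold ratio of 4, and the smoothing factor of 0.9 are illustrative assumptions.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def detect_sound(frames, threshold_ratio=4.0, alpha=0.9, min_floor=1e-9):
    """Return the index of the first frame that should wake the system.

    A frame triggers when its energy exceeds threshold_ratio times the
    running noise-floor estimate. Returns None if nothing triggers.
    """
    floor = None
    for i, frame in enumerate(frames):
        energy = frame_energy(frame)
        if floor is None:
            # Use the first frame to initialize the noise floor.
            floor = max(energy, min_floor)
            continue
        if energy > threshold_ratio * floor:
            return i  # wake the rest of the system here
        # Track slow changes in background level with exponential smoothing.
        floor = max(alpha * floor + (1.0 - alpha) * energy, min_floor)
    return None
```

Because the check is a handful of multiply-adds per sample on one channel, it can run continuously while the beamformer, noise reduction, and wake word engine stay powered down until a frame triggers.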

DSP Concepts also offers complete, turnkey software solutions with a full suite of algorithms for designers who are building a system that connects to the cloud.
