Ubiquitous sensors meet the most natural interface--speech—Part I
Bernie Brafman, Sensory, Inc. - January 8, 2013
These popular mobile applications, automotive infotainment systems, Bluetooth headsets and hands-free kits, and entertainment and home automation remote controls all require a button press to engage the voice user interface. Yet the microphone is already present and capable of capturing voice, which could be used to begin the interaction, much like the “Computer” command in Star Trek.
Historically, there have been practical reasons for the button-press requirement. First, speech recognition technology is sensitive to background noise, and recognizing “voice triggers” in real-world environments has not yielded acceptable accuracy. As a result, close-talking microphones such as headsets were required. Second, the computation needed for “always on” continuous listening for voice triggers can rapidly drain the battery in mobile devices. These limitations have made for unacceptable real-world performance. However, the compelling value of truly hands-free operation--safety and convenience--continues to spur the search for a reliable and robust solution.
Fortunately, just such a solution has arrived.
The technology
Throughout decades of embedded speech recognition innovation, users have wanted reliable, speaker-independent voice triggers that make their devices truly hands-free. Recently, experience in real-world speech recognition, combined with a breakthrough in handling noisy conditions, has made it possible to offer hands-free voice control with fast, highly accurate, and noise-robust voice triggers for a wide variety of consumer electronics.
There are three components that led to the innovation. The first is keyword spotting, whereby a phrase can be recognized without the customary preceding and following silence. This is a fundamental part of the noise robustness and reliability. Extensive experience with keyword spotting was developed over years of producing top-selling consumer electronics products. For example, in a mobile phone, the user might be able to say, “Take a picture,” to activate the camera. Keyword spotting gives the phone the ability to recognize this phrase when it is said as “I want to take a picture” or “please take a picture” or “hey watch this, I can just say take a picture and it works.” In short, keyword spotting allows voice triggers to be recognized when embedded in a sentence, without pauses or silence before or after them, and it contributes to overall noise robustness.
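To make the idea of spotting a trigger inside a longer sentence concrete, here is a minimal text-level sketch in Python. It is only an analogy: a real keyword spotter works on acoustic features rather than on words that have already been transcribed, and the function name `spot_keyword` is hypothetical, not part of any product described here.

```python
import re


def spot_keyword(utterance: str, trigger: str) -> bool:
    """Return True if the trigger phrase occurs anywhere in the utterance.

    Assumption: both inputs are plain text. This stands in for the
    acoustic matching that an actual embedded recognizer performs.
    """
    # Normalize to lowercase word tokens, dropping punctuation.
    words = re.findall(r"[a-z']+", utterance.lower())
    target = re.findall(r"[a-z']+", trigger.lower())
    n = len(target)
    if n == 0:
        return False
    # Slide a window over the utterance; no leading or trailing
    # silence (here: no sentence boundary) is required for a match.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


if __name__ == "__main__":
    for said in (
        "I want to take a picture",
        "please take a picture",
        "hey watch this, I can just say take a picture and it works",
        "take the picture",
    ):
        print(said, "->", spot_keyword(said, "Take a picture"))
```

The sliding window is the point of the sketch: the trigger is found wherever it falls in the sentence, rather than only when it is spoken in isolation.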
The second component represents a breakthrough in the handling of noisy conditions. Noise reduction techniques that are very effective in telecommunications have proven ineffective for speech recognition, or have even degraded its accuracy, because of a mismatch between the spectral characteristics of the incoming audio and those of the audio from which the speech recognition engine’s underlying models are built. Figure 1 shows a block diagram of an embedded speech recognition engine, including the Acoustic Model, a database of phonetic information used to match the user's speech to the vocabulary to be recognized. Also shown is a Speech and Noise database, which is part of the breakthrough developed to solve the challenges associated with voice triggers in noise. Technologists now have a way to use statistical methods to encode noise and include it in the recognition process in a way that dramatically improves not only noise robustness but far-field recognition as well.
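The article does not say how noise is encoded into the recognition process, so the following Python sketch should not be read as the method described here. It illustrates one generic way to reduce the mismatch between incoming audio and model-building audio: mixing recorded noise into clean speech at a chosen signal-to-noise ratio before the models are built. The function name `mix_at_snr` and the sample values are assumptions made for illustration.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the result has the requested SNR in dB.

    Assumption: speech and noise are equal-length sequences of samples.
    This is a generic noise-mixing sketch, not the article's technique.
    """
    if len(speech) != len(noise) or len(speech) == 0:
        raise ValueError("speech and noise must be non-empty and equal length")
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # Noise power needed for the target ratio: SNR = 10*log10(Ps / Pn).
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    # Synthetic stand-ins for recorded audio (hypothetical data).
    speech = [math.sin(0.1 * i) for i in range(1000)]
    noise = [((i * 37) % 11) - 5 for i in range(1000)]
    noisy = mix_at_snr(speech, noise, snr_db=10.0)
    print(len(noisy), "samples mixed at 10 dB SNR")
```

Models built from audio prepared this way see spectral characteristics closer to those of noisy field recordings, which is the mismatch the paragraph above identifies.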

Next: Applications and design considerations
About the Author
Bernard Brafman is Vice President of Business Development for Sensory, Inc., responsible for strategic business partnerships. He received his MSEE from Stanford University. He can be reached at bbrafman@sensoryinc.com
