The fundamental flaw of too many listening devices
Currently inhabiting my household are six devices that respond to an "Ok Google" voice activation prompt:
- my two Google Pixel smartphones
- my Moto 360 Android Wear smartwatch
- my Google Nexus 7 FHD tablet
- my Toshiba Chromebook 2
- and a Google Home Mini that I'm testing prior to tearing it down for your perusal.
Also currently inhabiting my household are six devices that respond to an "Alexa" voice activation prompt:
- three Amazon Echos
- two Amazon Echo Dots
- and an Amazon Tap (which gained hands-free mode support in February 2017)
And finally, two devices currently inhabiting my household respond to a "Hey Siri" voice activation prompt (my iPad 3 is too old to support the feature, so it requires a home-button press to activate Siri).
I bet you can already guess the scenario I'm about to describe. When I'm away from home, for example, I typically have both smartphones (one for work, the other for personal use) in my pockets, along with the Moto 360 strapped to my wrist. When I say "Ok Google," all three widgets wake up and respond to what I say next. The same issue applies to my wife, whose iPad and iPhone are frequently in close proximity.
The only reason we don't currently have an "Alexa" issue is that the various Echo devices are in separate rooms (and my voice is usually soft enough that the sound doesn't carry from one room to another), along with the fact that the older Fire Sticks we also currently own, located in some of those same rooms, aren't voice-only-activated. And don't get me started on the looming Microsoft Cortana calamity to come, assisted by a recently announced partnership with Amazon...
This is, perhaps obviously, a mess. And it's probably going to get worse before it (hopefully) gets better. Amazon, for example, recently added the ability to tell voices apart. But as far as I can tell, at least so far, this customization is still tied to a specific Amazon account. And our Echos are all connected to my account (among other reasons, so that we can stream Amazon Music Unlimited throughout the house without paying for it twice, once on each of our Amazon accounts), so any customization I did wouldn't enhance the Echo in her office, for example.
Eventually, I hope, the various services will become smart enough that her attempts at "Ok Google" won't successfully prompt my Android gear to respond, nor will my "Hey Siri" utterances wake up her iOS devices. But there's still the problem of my within-listening-distance smartphones and smartwatch simultaneously emerging from their slumber, for example.
Fixing this issue is going to require more system-level smarts. Ideally, for example, if my watch and one of the phones were paired and judged to be in sufficiently close proximity, the phone would ignore my voice, assuming that the watch was dealing with it (and passing data to the phone as necessary). And my other phone, even though it was not watch-paired (only one pairing can be active at a time), would realize that it was close to its paired peer and would ignore my voice, too. "Easy to say, hard to do," as a spiritual instructor from my past was fond of saying.
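To make that wish list a bit more concrete, here's a minimal Python sketch of the two deferral rules just described. Everything in it is an assumption made for illustration: the `Device` record, the `is_near` and `responders` helpers, the one-dimensional positions, and the 2-meter threshold are all hypothetical, and none of it reflects how Google, Amazon, or Apple actually arbitrate between devices. It also reads "paired peer" as the phone that currently holds the watch pairing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Device:
    name: str
    kind: str                          # "watch" or "phone"
    position_m: float                  # hypothetical 1-D position, in meters
    paired_with: Optional[str] = None  # name of the actively paired phone (watches only)


def is_near(a: Device, b: Device, threshold_m: float = 2.0) -> bool:
    """Stand-in for whatever proximity estimate real devices would use."""
    return abs(a.position_m - b.position_m) <= threshold_m


def responders(devices: List[Device], threshold_m: float = 2.0) -> List[str]:
    """Return the names of the devices that should answer the wake word."""
    by_name = {d.name: d for d in devices}

    # Rule 1: a phone defers when the watch paired to it is close by.
    paired_deferring = set()
    for d in devices:
        if d.kind == "watch" and d.paired_with in by_name:
            phone = by_name[d.paired_with]
            if is_near(d, phone, threshold_m):
                paired_deferring.add(phone.name)

    # Rule 2: any other phone defers when it is close to a phone
    # that is already deferring to its watch.
    deferring = set(paired_deferring)
    for d in devices:
        if d.kind == "phone" and d.name not in paired_deferring:
            if any(is_near(d, by_name[n], threshold_m) for n in paired_deferring):
                deferring.add(d.name)

    return [d.name for d in devices if d.name not in deferring]


# Watch on the wrist, both phones in nearby pockets: only the watch answers.
watch = Device("moto360", "watch", 0.0, paired_with="work_phone")
work = Device("work_phone", "phone", 0.3)
personal = Device("personal_phone", "phone", 0.5)
print(responders([watch, work, personal]))  # ['moto360']
```

Note that when the watch is out of range, neither rule fires and every device answers, which is exactly today's behavior. A real implementation would also need a trustworthy proximity estimate and a fallback for when the watch fails to respond, which is where the "hard to do" part comes in.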
There are probably plenty of other usage-scenario stumbles that you can brainstorm, and perhaps have even personally experienced. And you can probably think of plenty of other "fixes," too. However the problem does get solved, and to whatever degree it's fixed, it's going to require lots of upfront testing, because while the scenario I've described is frustrating when too many devices respond to your voice, it's even more frustrating when none of them do. Sound off with your own thoughts in the comments.
—Brian Dipert is Editor-in-Chief of the Embedded Vision Alliance, and a Senior Analyst at BDTI and Editor-in-Chief of InsideDSP, the company's online newsletter.