Clear calls from the road
Voice-interface technology improves the safety and clarity of hands-free communication in automobiles.
By Samuel Yu, Fortemedia -- EDN, January 19, 2006
Talking and driving—a dangerous combination, for sure, yet popular for a generation that wants to stay in touch with family and friends, no matter where they are or what they are doing. "Of all the accessories for cell phones, the ones people most want are headsets and other hands-free devices," says research analyst Linda Barrabee, who last year surveyed mobile-phone-buying habits for the Yankee Group. With the advent of a standard wireless interface in cellular phones (Bluetooth), the hands-free function in automotive telematics is set for explosive growth over the next several years.
Over the past decade, technological advances have contributed to improvements in voice quality and enhanced recognition rates for the hands-free cellular-phone function. Since the early days of using a plain microphone as the input device in the car cabin, industrial engineers have designed special housing for microphones, DSP engineers have developed voice-processing algorithms, and several companies have ventured into array microphones.
Despite these considerable developments, two user needs remain unmet: land-line-quality conversations and higher voice-recognition rates, especially at highway speeds. Microphones built on current technology pick up noise from all around the car cabin, so giving consumers what they want in such an environment requires a new technology.
Hands-free-market background
Road safety continues to be a major concern for everyone, including government agencies, car manufacturers, and drivers themselves. Anything that draws the driver's attention away from driving can make the roads hazardous—whether it is reading the newspaper, eating a sandwich, or talking on the phone. As such, hands-free cell-phone functions that allow drivers to keep both hands on the wheel and both eyes on the road are becoming essential features in today's automobiles.
Worldwide, about 25 countries, including Australia, Italy, Israel, and Japan, have passed laws restricting drivers from using handheld cell phones. Three US states, including New York, have enacted similar laws, and at least 40 other states are now proposing such legislation, according to the National Conference of State Legislatures. Although consumers must comply with these laws, they also want to enjoy their conversations. Today, there is still a noticeable difference in voice quality between using a handheld phone and using a hands-free device. Meanwhile, service providers such as On-Star continually look to improve customer satisfaction, especially with their automated voice-activated systems.
Technology trend
Historically, hands-free-telematics functions have used a single microphone as the input interface. This microphone could be unidirectional or omnidirectional and strategically placed within the cabin—on the visor, steering wheel, or rearview mirror, for example. Although this setup served its purpose in picking up the talker's voice, it also picked up all of the surrounding noise and echo, which quickly became unbearable for the far-end user.
Industrial designers quickly realized that, by using special acoustically designed microphone housings, they could block out a certain level of noise while focusing the microphone pickup at a certain location. Although this arrangement helped to increase the SNR (signal-to-noise ratio), echo still remained a pestering concern.
A big step in making hands-free telematics a widely acceptable feature was the deployment of AEC (acoustic-echo-cancellation) and noise-suppression software. Running on either general-purpose DSP platforms or dedicated ICs, these algorithms can reduce acoustic echo by 45 dB and suppress stationary noise by 10 dB. AEC and noise-suppression signal processing have significantly improved the voice quality of hands-free conversations in the car cabin. However, for many of these software-DSP products, the user experience has still not reached the point at which everyday consumers can comfortably use the hands-free function. Users have given mixed reviews of several hands-free-car-kit models that use software DSP and are on the market today, complaining about distorted sounds and robotic-sounding voices.
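To put the 45-dB and 10-dB figures above in linear terms, the following minimal sketch converts a decibel change to amplitude and power ratios using the standard definitions. The function names are illustrative and not taken from any product.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level change in dB to a linear amplitude (pressure or voltage) ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert a level change in dB to a linear power ratio."""
    return 10 ** (db / 10)


# Figures quoted in the article: 45 dB of echo reduction, 10 dB of noise suppression.
for label, db in (("acoustic-echo reduction", 45.0),
                  ("stationary-noise suppression", 10.0)):
    print(f"{label}: {db:.0f} dB -> amplitude divided by "
          f"{db_to_amplitude_ratio(db):.1f}, power divided by "
          f"{db_to_power_ratio(db):.0f}")
```

In other words, 45 dB corresponds to an echo amplitude roughly 178 times smaller, whereas 10 dB of noise suppression reduces noise amplitude by only a factor of about 3.2 (a factor of 10 in power).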
Array and small-array microphones
The big leap in voice-interface technology is the array microphone. By arranging multiple microphones in an array, companies such as Fortemedia (www.fortemedia.com), AKG (www.akg.com), Knowles Acoustics (www.knowlesacoustics.com), and even Microsoft (www.microsoft.com) can further reduce surrounding noise, providing a more natural-sounding voice. Using the information that the multiple microphones gather about the voice and the surrounding environment, an array microphone processes the signals so that it effectively forms a beam that picks up the wanted signal and cancels noise outside the beam. For example, Jaguar (www.jaguar.com) and Mercedes-Benz (www.mercedes-benz.com) have deployed hands-free car kits using array microphones in their XK models and E-Class cars, respectively.
Although it improves noise suppression, the traditional array microphone is still impractical and limited in two ways. First, it requires at least 30 mm between adjacent microphones, putting placement and space constraints on the end product. Second, it can cancel noise only on a 2-D plane, which makes it harder to pinpoint the talker and allows noise to leak into the beam. Diffuse noise, engine noise, rattling of the dashboard, and general road noise coming from above and below the pie-shaped beam cause major problems for voice-recognition-related applications.
A new technology, SAM (small-array microphone), is the next step in the voice-interface market for noise-free communication. Because a SAM places its microphones only 5 mm apart (center to center), designers can use it in practically any situation or application. SAM uses a fundamentally different algorithm from the traditional array microphone to process the voice, effectively forming a 3-D cone-shaped beam. As such, the system cancels out any noise outside the beam, whether above or below, without any leakage.
Traditional beam-forming
Traditional beam-forming uses the time delay between signals received at different microphones in the array. The microphones must therefore be far enough apart that the information each microphone receives is sufficiently different. The width of a broadside-array beam is proportional to the wavelength of the signal divided by the length of the aperture. So, at low frequencies, where wavelengths are longer, the beam is wider than at high frequencies, where wavelengths are shorter.
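The wavelength-over-aperture relationship can be checked with a few lines of arithmetic. This is a rough sketch, assuming a speed of sound of 343 m/s (air at about 20°C), a 30-mm aperture (the spacing this article quotes for a traditional two-microphone array), and the 300-Hz and 3.3-kHz edges of the telephone voice band. The names are illustrative, and the ratio indicates only relative beam width, not an exact angle.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 degrees C


def wavelength_m(frequency_hz: float) -> float:
    """Acoustic wavelength in meters at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def relative_beam_width(frequency_hz: float, aperture_m: float) -> float:
    """Wavelength divided by aperture length; beam width scales with this ratio."""
    return wavelength_m(frequency_hz) / aperture_m


APERTURE_M = 0.030  # assumption: two microphones 30 mm apart

low = relative_beam_width(300.0, APERTURE_M)
high = relative_beam_width(3300.0, APERTURE_M)

print(f"wavelength at 300 Hz:  {wavelength_m(300.0) * 1000:.0f} mm")
print(f"wavelength at 3.3 kHz: {wavelength_m(3300.0) * 1000:.0f} mm")
print(f"beam at 300 Hz is {low / high:.0f} times wider than at 3.3 kHz")
```

For a fixed aperture, the ratio depends only on frequency, so the beam at the bottom of the voice band is 11 times wider than at the top.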
Because the array must resolve differences in time delay while capturing frequencies of 300 Hz to 3.3 kHz, the microphones in a traditional array need to be at least 30 mm apart. This requirement brings about many limitations.
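To give a feel for how small these time delays are, the sketch below computes the largest possible arrival-time difference between two microphones for the 30-mm spacing and for the 5-mm spacing that this article quotes for SAM. It assumes a speed of sound of 343 m/s and, purely for scale, an 8-kHz sample rate typical of narrowband telephony; neither value comes from the article.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 degrees C
SAMPLE_RATE_HZ = 8000.0     # assumption: narrowband telephony sample rate


def max_delay_s(spacing_m: float) -> float:
    """Largest arrival-time difference between two microphones.

    This occurs when sound travels along the line joining the microphones.
    """
    return spacing_m / SPEED_OF_SOUND_M_S


for label, spacing_m in (("30-mm spacing", 0.030), ("5-mm spacing", 0.005)):
    delay_s = max_delay_s(spacing_m)
    print(f"{label}: at most {delay_s * 1e6:.1f} microseconds, "
          f"or {delay_s * SAMPLE_RATE_HZ:.2f} sample periods at 8 kHz")
```

Even at 30 mm, the maximum delay is less than one 8-kHz sample period, and at 5 mm it is six times smaller still, which illustrates why time-delay beam-forming needs the wider spacing.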
In Figure 1, the two microphones are facing 0°, meaning that the beam center is the y axis. Now, assume the signal source at Point A is playing at the same decibel level as the signal source at Point B. Also assume that Point A and Point B are the same distance away from the center of the array. In this case, the array suppresses the signal from Source A, because it can detect that Source A is outside the beam: The time delay to Microphone 1 is much longer than the time delay to Microphone 2. However, the array does not suppress the signal from Source B. To the traditional array microphone, Source B is effectively in the middle of the beam, because the time delay to Microphone 1 is exactly the same as the time delay to Microphone 2. This limitation applies to every plane throughout the z axis, as well as directly behind the array (180°). Thus, the traditional array microphone can effectively suppress noise only in a 2-D manner. (In this example, the system cancels noise only on the xy plane.)
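The geometry can be reproduced numerically. The sketch below is not taken from Figure 1; the coordinates are assumptions chosen to match the description: two microphones 30 mm apart on the x axis facing along y, Source A 60° off axis in the xy plane, and Source B raised 60° above the xy plane but centered between the microphones, both 1 m from the array center. The speed of sound (343 m/s) is also assumed.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 degrees C
SPACING_M = 0.030           # traditional-array spacing quoted in the article

# Assumed layout: microphones on the x axis, beam center along +y.
MIC_1 = (-SPACING_M / 2, 0.0, 0.0)
MIC_2 = (+SPACING_M / 2, 0.0, 0.0)


def tdoa_s(source):
    """Arrival time at Microphone 1 minus arrival time at Microphone 2, in seconds."""
    return (math.dist(source, MIC_1) - math.dist(source, MIC_2)) / SPEED_OF_SOUND_M_S


ANGLE = math.radians(60.0)  # hypothetical off-axis angle for both sources
RANGE_M = 1.0               # both sources are 1 m from the array center

# Source A: off axis within the xy plane (to the side of the beam).
SOURCE_A = (RANGE_M * math.sin(ANGLE), RANGE_M * math.cos(ANGLE), 0.0)
# Source B: off axis by the same angle, but above the xy plane (x = 0).
SOURCE_B = (0.0, RANGE_M * math.cos(ANGLE), RANGE_M * math.sin(ANGLE))
# A talker directly on the beam center, for comparison.
ON_AXIS = (0.0, RANGE_M, 0.0)

for name, source in (("Source A", SOURCE_A), ("Source B", SOURCE_B),
                     ("on-axis talker", ON_AXIS)):
    print(f"{name}: delay difference = {tdoa_s(source) * 1e6:.1f} microseconds")
```

Under these assumptions, Source A produces a clearly nonzero delay difference, whereas Source B produces the same zero difference as the on-axis talker, so a method that relies only on delay between these two microphones cannot tell them apart.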
SAM-beam-forming
SAM-beam-forming technology is unlike traditional setups. It uses one unidirectional microphone and one omnidirectional microphone. Because you can place these two microphones only 5 mm away from each other center to center, the information coming to both microphones is highly correlated—virtually the same. Consequently, the beam-forming capability relies on the intelligence of Fortemedia's patented voice-processing algorithm to decipher this information.
Because you can place the microphones of a SAM virtually right next to each other, the effective beam is a 3-D cone-shaped beam, which offers many advantages over the traditional array microphone. The setup in Figure 2 is the same as in Figure 1, except the receiving device is a SAM instead of the traditional array microphone. To the SAM, the signals from Source A and Source B are exactly the same; in this case, both are outside the beam. This situation applies all the way around the y axis, forming a 3-D cone-shaped beam. The technique effectively suppresses noise above, below, and behind the beam. Figure 3 compares the effective beam of a traditional array microphone with that of a SAM.
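The article does not describe Fortemedia's patented algorithm, and the following sketch is not that algorithm. It illustrates only one property that may help explain the cone shape: if the unidirectional microphone is modeled as an ideal cardioid pointing along the y axis and the omnidirectional microphone as having unity gain in all directions (both modeling assumptions), the level ratio between the two depends only on the angle from the y axis. The source coordinates are hypothetical, with Source A 60° to the side and Source B 60° above the axis.

```python
import math


def cos_angle_from_axis(source):
    """Cosine of the angle between the +y axis (beam center) and the source direction."""
    x, y, z = source
    return y / math.sqrt(x * x + y * y + z * z)


def cardioid_gain(source):
    """Ideal cardioid pattern aimed along +y: 1 on axis, 0 directly behind."""
    return 0.5 * (1.0 + cos_angle_from_axis(source))


def uni_to_omni_ratio(source):
    """Cardioid level relative to an ideal omnidirectional microphone (gain 1)."""
    omni_gain = 1.0  # assumption: same sensitivity in every direction
    return cardioid_gain(source) / omni_gain


ANGLE = math.radians(60.0)  # hypothetical off-axis angle
SOURCES = {
    "Source A (to the side)": (math.sin(ANGLE), math.cos(ANGLE), 0.0),
    "Source B (above)": (0.0, math.cos(ANGLE), math.sin(ANGLE)),
    "on-axis talker": (0.0, 1.0, 0.0),
    "directly behind": (0.0, -1.0, 0.0),
}

for name, source in SOURCES.items():
    print(f"{name}: unidirectional/omnidirectional ratio = "
          f"{uni_to_omni_ratio(source):.2f}")
```

In this idealized model, Source A and Source B yield the same ratio because they sit at the same angle from the axis, so any decision based on that ratio treats them identically, which is consistent with a beam that is rotationally symmetric about the y axis.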
Value to end users
We've come a long way since the early days of using just a plain microphone as the input device. Undoubtedly, we will continue to use the phone in the car. And, based on legislative developments around the world, we will need to do it using hands-free kits. So, what does this situation mean for users? With SAM technology, On-Star users, for example, will experience higher voice-recognition rates when using the automated systems; the systems will suppress engine noise, road noise, and the rattling of the dashboard. Users will also be able to barge in and interrupt the automated system when necessary. During a person-to-person conversation, the far-end user will never notice the noises and rattles from the car cabin. With the SAM, calls you make in the car will be just as clear as calls you make from your living room.


















