Feature
Signal to noise: calculating the high-resolution-audio reality-to-hype ratio
Is high-resolution audio a "sound" investment, or will it bring your next design to a crashing halt in the market? Last time, large sample sizes made their pitch. This time, high sample rates step up to the microphone.
By Brian Dipert, Technical Editor -- EDN, 2/20/2003
Part 1 of this article series discussed the theoretical benefits of high-resolution audio's large sample sizes and the degree to which the limitations of real-life equipment, recording studios, and listening environments constrain those benefits (Reference 1). In this second set of the series, we audition the other half of the technology behind high-resolution audio: high sampling rates. And, for an encore, I'll wrap up the performance with some suggestions on how to continue your audio education.
To evaluate the validity of the high-sample-rate hype, first dust off your college textbooks and recall that, according to the Nyquist sampling theorem, a given sampling rate can perfectly reproduce all frequencies below half that sample rate. Recall, too, that in a digital-audio-inclusive design, antialias filters find use during both audio capture and playback. During capture (that is, as part of the ADC stage), they keep audio frequencies above the Nyquist frequency, which is half the sample rate, from folding back into the passband. During playback (that is, as part of the DAC stage), they prevent inaudible, potentially damaging high-frequency energy from traveling beyond the filter stage to subsequent links in the playback chain, such as power amplifiers and transducers.
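The fold-back that the capture-side filter guards against can be sketched in a few lines of Python. This is a minimal illustration, not part of the original article: the function name `alias_frequency` and the test tones are assumptions chosen for the example, and the model is an ideal sampler with no filter in front of it.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency of a tone at f_hz after ideal sampling at fs_hz."""
    # Sampling cannot tell f apart from f + k*fs, so reduce modulo fs first.
    f_mod = f_hz % fs_hz
    # Anything in the upper half of that range mirrors back below fs/2.
    return min(f_mod, fs_hz - f_mod)


FS_CD_HZ = 44_100.0  # audio-CD sample rate

# Illustrative tones: one inside the passband, two above the Nyquist frequency.
for tone_hz in (10_000.0, 25_000.0, 30_000.0):
    folded_hz = alias_frequency(tone_hz, FS_CD_HZ)
    print(f"{tone_hz:8.0f} Hz is captured as {folded_hz:8.0f} Hz")
```

Under these assumptions, an unfiltered 25-kHz tone sampled at 44.1 kHz shows up at 19.1 kHz, inside the audible band, which is why the filter has to act before the ADC.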
An ideal lowpass filter would have perfect transmission in the passband, perfect rejection above the cutoff frequency, and be acoustically "transparent" in all other respects to the audio passing through it (Figure 1). When was the last time you lived in the ideal world? In real life, design engineers must balance the desire for aggressive filtering against the need for low distortion near the cutoff frequency. Make the passband-to-rejection-band transition too steep, and you might create phase shift and ripple, depending on the complexity and cost of the filter architecture you choose. Make the transition too gentle, and you attenuate audible information, allow aliased information to become audible, or both.
Modern CD players minimize the acoustic effects of antialias filters by oversampling the digital data, most commonly at an 8× rate, thereby moving aliased images of the audio beyond the 22.05- to 44.1-kHz range where 1× sampling would begin to place them. The antialias filter can, as a result, be much gentler in its passband-to-rejection-band transition. DVD-Audio and its ilk extend this technique back before the media playback stage to the point at which the recording equipment captures and digitizes the original audio.
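To put rough numbers on how much gentler the filter can be, the sketch below compares the room available for the passband-to-rejection-band transition at 1× and 8×. It is an illustration under stated assumptions rather than a filter design: the 20-kHz passband edge is an assumed figure, the function name `analog_transition_band` is hypothetical, and the model assumes an ideal digital interpolation stage has already removed every image below the oversampled rate.

```python
import math


def analog_transition_band(fs_hz: float, oversample: int,
                           passband_hz: float = 20_000.0):
    """Return (first_image_hz, width_in_octaves) for the post-DAC filter.

    Assumes the lowest remaining image begins at oversample*fs - passband,
    that is, everything below it was removed digitally.
    """
    first_image_hz = oversample * fs_hz - passband_hz
    octaves = math.log2(first_image_hz / passband_hz)
    return first_image_hz, octaves


FS_CD_HZ = 44_100.0

for ratio in (1, 8):
    image_hz, octaves = analog_transition_band(FS_CD_HZ, ratio)
    print(f"{ratio}x: first image at {image_hz / 1000:.1f} kHz, "
          f"transition band spans {octaves:.2f} octaves")
```

With these assumptions, the 1× filter must go from full transmission to full rejection in roughly a quarter of an octave, whereas the 8× filter has about four octaves in which to do the same job.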
High sampling rates also deliver a potential benefit with regard to dithering, a random pattern added to the least significant bit of a signal during sample-length reduction (Figure 2). Although dither increases background noise, it does so in a way uncorrelated to the signal and therefore more pleasing to the ear than the effect that simply truncating bits creates. With 96-kHz sampling, the dither noise distributes to a frequency band more than twice as wide as that in 44.1-kHz sampling, but less than half of this wider frequency band is audible. Perceived noise drops by approximately 3 dB as a result, and noise shaping can further decrease the audible dither noise level (see Web-only sidebar "Reading 'bitween' the lines").
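The approximately 3-dB figure can be checked with a back-of-the-envelope calculation. The sketch below assumes flat (white) dither whose total power is the same at both sample rates and is spread evenly from 0 Hz to half the sample rate. It also takes 20 kHz as the edge of the audible band. These are simplifying assumptions for illustration, and the function name `audible_noise_fraction` is hypothetical.

```python
import math

AUDIBLE_HZ = 20_000.0  # assumed upper limit of hearing


def audible_noise_fraction(fs_hz: float) -> float:
    """Fraction of flat dither power that falls inside the audible band."""
    nyquist_hz = fs_hz / 2.0
    return min(AUDIBLE_HZ, nyquist_hz) / nyquist_hz


fraction_cd = audible_noise_fraction(44_100.0)
fraction_96k = audible_noise_fraction(96_000.0)

# Change in audible dither power when moving from 44.1 kHz to 96 kHz.
change_db = 10.0 * math.log10(fraction_96k / fraction_cd)

print(f"Audible share at 44.1 kHz: {fraction_cd:.1%}")
print(f"Audible share at 96 kHz:   {fraction_96k:.1%}")
print(f"Change in audible dither noise: {change_db:.2f} dB")
```

Under these assumptions the result is a drop of a little more than 3 dB. Noise shaping, which deliberately makes the dither spectrum non-flat, is outside this simple model.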
Notice that in all the discussion so far, I haven't mentioned the most commonly touted marketing advantage of the new formats: their ability to capture sonic information higher than an audio CD's 22.05-kHz Nyquist-dictated cutoff. In reality, even the keenest-hearing children barely perceive audio information at 20 kHz, and, by middle age, even the sharpest ear can't hear anything higher than 15 kHz. Research data even suggests that the human auditory system lumps all frequencies higher than approximately 12.5 kHz into a single frequency "bin," in which humans cannot differentiate the various frequencies present. Noise shaping, a core technology that most modern audio formats use, takes advantage of these phenomena.
Though the frequency-domain benefits of ultrasonic-captured information are at best dubious, time-domain advantages may be more compelling (References 2 and 3). Ripple near an antialias filter's cutoff frequency translates to "smearing" of sharp transients in the time domain. When listeners report that DVD-Audio discs and SACDs (Super Audio CDs) sound crisper, they may be noting the reduced blurring effect that an optimized antialias filter has on abrupt edges, such as the "attacks" that percussion and brass instruments produce, and on the higher order harmonics that instruments such as distorted guitars generate (Reference 4). Even if the ears and brain don't consciously acknowledge the presence of a stimulus, it may still have an effect, a phenomenon that subliminal advertising also harnesses. Recent studies indicate increased brain activity in response to high-resolution audio, even when listeners don't report any audible difference between that audio and more conventional music formats (Reference 5).
One claimed benefit of high-resolution audio that likely holds no water is the belief that high sampling rates and consequent ultrasonic frequencies aid in precisely locating a sound source. The underlying phenomenon, the Haas effect, refers to the fact that the phase (that is, time) difference between when a sound hits one ear and when it hits the other is one of two means by which you acoustically place its source in 3-D space. (The other means is the intensity difference you perceive between one ear and the other.) The time difference between any two 44.1-kHz samples is approximately 23 µsec, yet the human auditory system can resolve phase- and time-delay differences as small as a few microseconds (a range defined in part by the distance between an average person's ears).
As Thomas Sandmann from Master Orange Entertainment notes, "One frequent opinion states that the higher sampling rate with its shorter time distance between two samples is better suited for replicating such phase-delay differences. However, this theory is without any foundation because it is indeed possible to present shorter time distances in a digital signal than the distance of two samples. The phase position of a digital audio signal is quite continual with respect to its value since the quantization and the resultant numeric values always apply only to the current amplitude in a discrete time pattern.
"The reconstruction in the D/A-conversion process results in the original waveform and also in the original phase position of the signal. Here, simply raising the sampling frequency does not result in an advantage" (Reference 6). Claims of listeners' locating sound sources better at high sample rates also ignore the results of studies indicating that at high frequencies, the ears and brain rely on intensity, not phase or time, to discern direction (Reference 7).
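Sandmann's point that a sampled signal can carry time offsets shorter than one sample period can be demonstrated numerically. The sketch below is an illustration only, with assumed values: a 1-kHz test tone, a 5-µsec delay, 16-bit amplitude quantization, and a 441-sample window that holds exactly 10 cycles. It delays the tone by less than one 44.1-kHz sample period, samples and quantizes it, and then recovers the delay from the samples by correlating against a sine and a cosine at the tone frequency.

```python
import math

FS_HZ = 44_100.0      # audio-CD sample rate
TONE_HZ = 1_000.0     # assumed test tone
N_SAMPLES = 441       # exactly 10 cycles of 1 kHz at 44.1 kHz
TRUE_DELAY_S = 5e-6   # assumed delay, well under one sample period


def sampled_tone(delay_s: float) -> list:
    """Sample a delayed sine and quantize each amplitude to 16 bits."""
    samples = []
    for n in range(N_SAMPLES):
        t = n / FS_HZ - delay_s
        value = math.sin(2.0 * math.pi * TONE_HZ * t)
        # Quantization acts on amplitude only; the sample instants are fixed.
        samples.append(round(value * 32767) / 32767)
    return samples


def estimate_delay(samples: list) -> float:
    """Recover the tone's delay from its phase at TONE_HZ."""
    omega = 2.0 * math.pi * TONE_HZ / FS_HZ
    i_sum = sum(x * math.cos(omega * n) for n, x in enumerate(samples))
    q_sum = sum(x * math.sin(omega * n) for n, x in enumerate(samples))
    # For x[n] = sin(omega*n - phi): i_sum = -(N/2)*sin(phi), q_sum = (N/2)*cos(phi)
    phase = math.atan2(-i_sum, q_sum)
    return phase / (2.0 * math.pi * TONE_HZ)


sample_period_s = 1.0 / FS_HZ
estimated_delay_s = estimate_delay(sampled_tone(TRUE_DELAY_S))

print(f"Sample period:   {sample_period_s * 1e6:.2f} usec")
print(f"True delay:      {TRUE_DELAY_S * 1e6:.2f} usec")
print(f"Estimated delay: {estimated_delay_s * 1e6:.2f} usec")
```

The recovered delay matches the 5-µsec offset closely even though the sample period is about 22.7 µsec, because the delay is encoded in the sample amplitudes rather than in the sample spacing.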
Can't get no (sonic) satisfaction
If the benefits of a migration beyond 16-bit, 44.1-kHz audio are so obscure, then why do so many people claim that the new formats sound so much better, especially when they're auditioning in nonideal listening environments? One pragmatic answer is that brains are fickle organs; if someone wants to believe that one thing is better than another, the brain happily distorts its sensory inputs to create the desired result. If you've just spent tens of thousands of dollars to upgrade your gear and music collection, that investment can be a strong perception incentive. Also, folks notice subtle differences much more when they're testing multiple options at once than when they listen to any of the options stand-alone (Reference 8).
As noted in Part 1, a well-engineered surround mix can be a powerful motivation for migrating from one format to another. An equally powerful incentive is a reformatted version of an old music classic, slowly and carefully remastered on modern equipment that is free of overflow, rounding, and truncation artifacts and employs the latest and greatest antialias filter technology. This benefit is analogous to the remastered audio CDs that sound so much better than the original, rushed mixes that audio engineers unfamiliar with the new and evolving rules of the then-nascent digital age created. As the person sitting next to me at an audition of the two-channel SACD remix of the Rolling Stones' Street Fighting Man at last October's Audio Engineering Society Convention said, "I never heard the master tape hiss so clearly before."
Remastering a music catalog has its place. However, various vendors are targeting brute-force upsampling equipment at audiophiles. Don't let their sales pitches fool you. This gear connects to a CD player's digital outputs and claims to transform your audio CDs into 24-bit, 96- or 192-kHz "higher quality" presentations. Remember that you can't conjure out of thin air more meaningful bits than those that existed in the source; padding a sample with zeros doesn't count. Also, "upsampling" doesn't differ from the "oversampling" that CD players perform. An audiophile box may use a more robust antialias filter than a bargain-basement CD player, which may lead to a slight sonic improvement, but that's the extent of the gain.
As John Atkinson from Stereophile writes, other than making active the lowest 8 bits of a 24-bit word, none of these products create any new audio information. As susceptibility to word-clock jitter increases with sampling frequency, upsampling audio data can even make things worse rather than better; and no matter how good these upsampling products can sound, they offer no conceptual advantage over traditional CD-playback systems. Atkinson is "convinced that the sonic differences...are due to the...choices in digital filters [that these products' designers] make with respect to the number of taps, passband ripple, and stopband rejection and to changes in the jitter performance" (Reference 9).
For more information...
When you contact any of the following manufacturers directly, please let them know you read about their products in EDN.
| Apex Digital www.apexdigitalinc.com | Audio Engineering Society (AES) www.aes.org | Digital Theater Systems (DTS) www.dtsonline.com |
| Master Orange Entertainment www.master-orange.de | Meridian Audio www.meridian-audio.com | Microsoft www.microsoft.com/windowsmedia |
| Philips www.philips.com | Sony www.sony.com | Stereophile Magazine www.stereophile.com |
| Terratec www.terratec.net | Toshiba www.csd.toshiba.com | University of Essex www.essex.ac.uk |
| University of Salford www.salford.ac.uk | University of Waterloo www.uwaterloo.ca | Voyetra Turtle Beach www.audiotron.net |
Author Information
Technical editor Brian Dipert enjoys listening to DTS CDs and DVDs, DVD-Audio discs and SACDs. He admits, though, that neither his dining room nor his kitchen is located within the surround-sound sweet spot. When he's not listening critically, he's equally happy auditioning two-channel audio CDs ripped to 96-kbps Microsoft Windows Media Audio format, stored on a Toshiba Magnia SG-10 home media server, and played back over a Voyetra Turtle Beach AudioTron. Contact him at 1-916-454-5242, fax 1-916-454-5101, bdipert@edn.com, and www.bdipert.com.