Signal to noise: calculating the high-resolution-audio reality-to-hype ratio

By Brian Dipert, Technical Editor -- 2/20/2003

Part 1 of this article series discussed the theoretical benefits of high-resolution audio's large sample sizes and the degree to which the limitations of real-life equipment, recording studios, and listening environments constrain those benefits (Reference 1). In this second set of the series, we audition the other half of the technology behind high-resolution audio: high sampling rates. And, for an encore, I'll wrap up the performance with some suggestions on how to continue your audio education.

Samplification

To evaluate the validity of the high-sample-rate hype, first dust off your college textbooks and recall that, according to the Nyquist sampling theorem, a given sampling rate can perfectly reproduce all frequencies below half that sampling rate. Recall, too, that in a digital-audio-inclusive design, antialias filters find use during both audio capture and playback. During capture (that is, as part of the ADC stage) they keep audio frequencies above the Nyquist frequency, which is half the sampling rate, from folding back into the passband. During playback (that is, as part of the DAC stage) they prevent inaudible, potentially damaging high-frequency energy from traveling beyond the filter stage to subsequent links in the playback chain, such as power amplifiers and transducers.
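To make the fold-back concrete, here is a minimal Python sketch that computes where an unfiltered tone lands after sampling. The function name and the test tones are illustrative choices, not anything taken from a real converter's data sheet.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Return the frequency at which a tone at f_hz appears after sampling at fs_hz."""
    # Sampling cannot tell f apart from f + k*fs, so reduce modulo fs first.
    f = f_hz % fs_hz
    # Anything above fs/2 mirrors (folds) back around the Nyquist frequency.
    return fs_hz - f if f > fs_hz / 2 else f


fs = 44_100.0  # audio-CD sampling rate, Hz
for tone in (10_000.0, 25_000.0, 30_000.0, 43_100.0):
    print(f"{tone:8.0f} Hz in -> {alias_frequency(tone, fs):8.0f} Hz out")
```

A 25-kHz tone, which no one can hear, would show up at 19.1 kHz, inside the audio band, which is why the capture-side filter has to remove it before the ADC samples it.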

An ideal lowpass filter would have perfect transmission in the passband, perfect rejection above the cutoff frequency, and be acoustically "transparent" in all other respects to the audio passing through it (Figure 1). When was the last time you lived in the ideal world? In real life, design engineers must balance the desire for aggressive filtering against the need for low distortion near the cutoff frequency. Make the passband-to-rejection-band transition too steep, and you might create phase shift and ripple, depending on the complexity and cost of the filter architecture you choose. Make the transition too gentle, and you attenuate audible information, allow aliased information to become audible, or both.

Modern CD players minimize the acoustic effects of antialias filters by oversampling the digital data, most commonly at an 8× rate, thereby moving aliased images of the audio beyond the 22.05- to 44.1-kHz range where 1× sampling would begin to place them. The antialias filter can, as a result, be much gentler in its passband-to-rejection-band transition. DVD-Audio and its ilk extend this technique back before the media playback stage to the point at which the recording equipment captures and digitizes the original audio.
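A rough Python sketch shows how much room oversampling buys the analog filter. It assumes a 20-kHz passband edge, an illustrative figure, and it looks only at the analog stage; the digital interpolation filter inside an oversampling player still does steep filtering of its own.

```python
import math


def analog_transition_band(fs_hz: float, oversample: int, passband_hz: float = 20_000.0):
    """Return (start of first image in Hz, transition-band width in octaves)."""
    rate = fs_hz * oversample              # effective rate seen by the DAC
    first_image_hz = rate - passband_hz    # lower edge of the first image
    octaves = math.log2(first_image_hz / passband_hz)
    return first_image_hz, octaves


for factor in (1, 8):
    image, octaves = analog_transition_band(44_100.0, factor)
    print(f"{factor}x: first image starts at {image / 1000:.1f} kHz, "
          f"{octaves:.2f} octaves above the passband edge")
```

Under these assumptions, the filter at 1× has roughly a quarter of an octave in which to go from full transmission to full rejection, and at 8× it has about four octaves.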

High sampling rates also deliver a potential benefit with regard to dithering, the addition of a random pattern to the least significant bit of a signal during sample-length reduction (Figure 2). Although dither increases background noise, it does so in a way uncorrelated to the signal and therefore more pleasing to the ear than the effect that simply truncating bits creates. With 96-kHz sampling, the dither noise distributes to a frequency band more than twice as wide as that in 44.1-kHz sampling, but less than half of this wider frequency band is audible. Perceived noise drops by approximately 3 dB as a result, and noise shaping can further decrease the audible dither noise level (see Web-only sidebar "Reading 'bitween' the lines").
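The 3-dB figure is easy to check with a back-of-the-envelope Python sketch. It assumes that the dither noise is white (spread evenly from dc to half the sampling rate), that its total power is the same at both rates, and that hearing stops at 20 kHz; all three are simplifying assumptions and not measurements.

```python
import math


def audible_noise_change_db(fs_old_hz: float, fs_new_hz: float,
                            audible_hz: float = 20_000.0) -> float:
    """Change in audible-band noise power (dB) when white noise of fixed total
    power is spread over fs_new/2 instead of fs_old/2."""
    frac_old = min(audible_hz, fs_old_hz / 2) / (fs_old_hz / 2)
    frac_new = min(audible_hz, fs_new_hz / 2) / (fs_new_hz / 2)
    return 10 * math.log10(frac_new / frac_old)


change = audible_noise_change_db(44_100.0, 96_000.0)
print(f"audible dither noise changes by {change:.1f} dB")
```

The result comes out a little better than 3 dB of reduction, which agrees with the approximate figure above; noise shaping is not modeled here.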

Notice that in all the discussion so far, I haven't mentioned the most commonly touted marketing advantage of the new formats: their ability to capture sonic information higher than an audio CD's 22.05-kHz Nyquist-dictated cutoff. In reality, even the keenest-hearing children barely perceive audio information at 20 kHz, and, by middle age, even the sharpest ear can't hear anything higher than 15 kHz. Research data even suggests that the human auditory system lumps all frequencies higher than approximately 12.5 kHz into a single frequency "bin," in which humans cannot differentiate the various frequencies present. Noise shaping, a core technology that most modern audio formats use, takes advantage of these phenomena.

Though the frequency-domain benefits of ultrasonic-captured information are at best dubious, time-domain advantages may be more compelling (references 2 and 3). Ripple near an antialias filter's cutoff frequency translates to "smearing" of sharp transients in the time domain. When listeners report that DVD-Audio discs and SACDs (Super Audio CDs) sound crisper, they may be noting the reduced blurring effect that an optimized antialias filter has on abrupt edges, such as the "attacks" that percussion and brass instruments produce, and on the higher order harmonics that instruments such as distorted guitars generate (Reference 4). Even if the ears and brain don't consciously acknowledge the presence of a stimulus, it may still have an effect, a phenomenon that subliminal advertising also harnesses. Recent studies indicate increased brain activity in response to high-resolution audio, even when listeners don't report any audible difference between that audio and more conventional music formats (Reference 5).

One claimed benefit of high-resolution audio that likely holds no water is the belief that high sampling rates and consequent ultrasonic frequencies aid in precisely locating a sound source. This argument invokes the Haas effect and, more precisely, the interaural time difference: the phase (that is, time) difference between when a sound hits one ear and when it hits the other, which is one of two means by which you acoustically place its source in 3-D space. (The other means is the intensity difference you perceive between one ear and the other.) The time difference between two adjacent 44.1-kHz samples is approximately 23 µsec, yet the human auditory system can resolve phase- and time-delay differences as small as a few microseconds (defined in part by the distance between an average person's ears).
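The 23-µsec figure describes sample spacing, not timing resolution. As a rough check, the following Python sketch delays a tone by 5 µsec, far less than one sample period, quantizes it to 16 bits at 44.1 kHz, and then recovers the delay from the samples alone. The 1-kHz tone, the 5-µsec delay, and the helper names are illustrative assumptions, and the single-frequency correlation used here works only for a steady tone of known frequency.

```python
import cmath
import math

FS = 44_100.0      # CD sampling rate, Hz
F_TONE = 1_000.0   # test-tone frequency, Hz (illustrative choice)
N = 441            # exactly 10 cycles of the tone at FS


def sampled_tone(delay_s: float) -> list:
    """Sample a delayed sine at FS and quantize each sample to 16 bits."""
    return [
        round(32767 * math.sin(2 * math.pi * F_TONE * (n / FS - delay_s))) / 32767
        for n in range(N)
    ]


def estimate_delay(samples: list) -> float:
    """Recover the tone's delay from its phase at F_TONE."""
    acc = sum(x * cmath.exp(-2j * math.pi * F_TONE * n / FS)
              for n, x in enumerate(samples))
    # An undelayed sine correlates to a phase of -pi/2; extra lag is the delay.
    phase_lag = -cmath.phase(acc) - math.pi / 2
    return phase_lag / (2 * math.pi * F_TONE)


est = estimate_delay(sampled_tone(5e-6))
print(f"sample period:   {1e6 / FS:.2f} usec")
print(f"estimated delay: {est * 1e6:.3f} usec")
```

The estimate should land within a few nanoseconds of the 5-µsec delay, even though the samples themselves sit almost 23 µsec apart.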

As Thomas Sandmann from Master Orange Entertainment notes, "One frequent opinion states that the higher sampling rate with its shorter time distance between two samples is better suited for replicating such phase-delay differences. However, this theory is without any foundation because it is indeed possible to present shorter time distances in a digital signal than the distance of two samples. The phase position of a digital audio signal is quite continual with respect to its value since the quantization and the resultant numeric values always apply only to the current amplitude in a discrete time pattern.

"The reconstruction in the D/A-conversion process results in the original waveform and also in the original phase position of the signal. Here, simply raising the sampling frequency does not result in an advantage" (Reference 6). Claims of listeners' locating sound sources better at high sample rates also ignore the results of studies indicating that at high frequencies, the ears and brain rely on intensity, not phase or time, to discern direction (Reference 7).

Can't get no (sonic) satisfaction

If the benefits of a migration beyond 16-bit, 44.1-kHz audio are so obscure, then why do so many people claim that the new formats sound so much better, especially when they're auditioning in nonideal listening environments? One pragmatic answer is that brains are fickle organs; if someone wants to believe that one thing is better than another, the brain happily distorts its sensory inputs to create the desired result. If you've just spent tens of thousands of dollars to upgrade your gear and music collection, that investment can be a strong perception incentive. Also, folks notice subtle differences much more when they're testing multiple options at once than when they listen to any of the options stand-alone (Reference 8).

As noted in Part 1, a well-engineered surround mix can be a powerful motivation for migrating from one format to another. An equally powerful incentive is a reformatted version of an old music classic, slowly and carefully remastered on modern equipment that is free of overflow, rounding, and truncation artifacts and employs the latest and greatest antialias filter technology. This benefit is analogous to the remastered audio CDs that sound so much better than the original, rushed mixes that audio engineers unfamiliar with the new and evolving rules of the then-nascent digital age created. As the person sitting next to me at an audition of the two-channel SACD remix of the Rolling Stones' Street Fighting Man at last October's Audio Engineering Society Convention said, "I never heard the master tape hiss so clearly before."

Remastering a music catalog has its place. However, various vendors are targeting brute-force upsampling equipment at audiophiles. Don't let their sales pitches fool you. This gear connects to a CD player's digital outputs and claims to transform your audio CDs into 24-bit, 96- or 192-kHz "higher quality" presentations. Remember that you can't create, out of thin air, more meaningful bits than those that existed in the source; padding a sample with zeros doesn't count. Also, "upsampling" doesn't differ from the "oversampling" that CD players perform. An audiophile box may use a more robust antialias filter than a bargain-basement CD player, which may lead to a slight sonic improvement, but that's the extent of the gain.
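A few lines of Python illustrate why zero padding doesn't count. The sample values are arbitrary, and the helper name is hypothetical; the point is only that the widened words round-trip exactly to the originals.

```python
def pad_16_to_24(sample_16: int) -> int:
    """Widen a signed 16-bit sample to 24 bits by appending eight zero bits."""
    return sample_16 << 8


cd_samples = [-32768, -1, 0, 1, 12345, 32767]   # arbitrary 16-bit values
padded = [pad_16_to_24(s) for s in cd_samples]

# The low 8 bits of every padded word are zero ...
low_bits_all_zero = all(p & 0xFF == 0 for p in padded)
# ... and dropping them again returns the original samples exactly.
round_trip = [p >> 8 for p in padded]

print("low 8 bits all zero:", low_bits_all_zero)
print("round trip matches: ", round_trip == cd_samples)
```

Because the mapping is exactly reversible, the 24-bit words carry the same information as the 16-bit ones, just in a bigger container.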

As John Atkinson from Stereophile writes, other than making active the lowest 8 bits of a 24-bit word, none of these products create any new audio information. As susceptibility to word-clock jitter increases with sampling frequency, upsampling audio data can even make things worse rather than better; and no matter how good these upsampling products can sound, they offer no conceptual advantage over traditional CD-playback systems. Atkinson is "convinced that the sonic differences...are due to the...choices in digital filters [that these products' designers] make with respect to the number of taps, passband ripple, and stopband rejection and to changes in the jitter performance" (Reference 9).


For more information...
When you contact any of the following manufacturers directly, please let them know you read about their products in EDN.
Apex Digital
www.apexdigitalinc.com
Audio Engineering Society (AES)
www.aes.org
Digital Theater Systems (DTS)
www.dtsonline.com
Master Orange Entertainment
www.master-orange.de
Meridian Audio
www.meridian-audio.com
Microsoft
www.microsoft.com/windowsmedia
Philips
www.philips.com
Sony
www.sony.com
Stereophile Magazine
www.stereophile.com
Terratec
www.terratec.net
Toshiba
www.csd.toshiba.com
University of Essex
www.essex.ac.uk
University of Salford
www.salford.ac.uk
University of Waterloo
www.uwaterloo.ca
Voyetra Turtle Beach
www.audiotron.net


Author Information
Technical editor Brian Dipert enjoys listening to DTS CDs and DVDs, DVD-Audio discs and SACDs. He admits, though, that neither his dining room nor his kitchen is located within the surround-sound sweet spot. When he's not listening critically, he's equally happy auditioning two-channel audio CDs ripped to 96-kbps Microsoft Windows Media Audio format, stored on a Toshiba Magnia SG-10 home media server, and played back over a Voyetra Turtle Beach AudioTron. Contact him at 1-916-454-5242, fax 1-916-454-5101, bdipert@edn.com, and www.bdipert.com.


References
  1. Dipert, Brian, "Signal to noise: calculating the high-resolution audio reality-to-hype ratio (Part 1)," EDN, Feb 6, 2003, pg 32.
  2. Dunn, Julian, "Anti-alias and anti-image filtering: the benefits of 96 kHz sampling rate formats for those who cannot hear above 20 kHz," 104th Audio Engineering Society Convention, May 16 to 19, 1998.
  3. Story, Mike, "A suggested explanation for (some of) the audible differences between high sample rate and conventional sample rate audio material," DCS Ltd, 1997.
  4. Hon, Andrew, "It's alive! Ultrasonic spectra isn't so ultra anymore," http://www.ocf.berkeley.edu/~ashon/audio/Ultrasonics.htm.
  5. Oohashi, Tsutomu, et al, "Inaudible high-frequency sounds affect brain activity: hypersonic effect," Journal of Neurophysiology, Volume 83, No. 6, June 2000, pg 3548, http://jn.physiology.org/cgi/content/abstract/83/6/3548.
  6. Sandmann, Thomas, "What do 24 bit and 96 kHz achieve?" www.terratec.de/4G/2496-en.pdf.
  7. Dipert, Brian, "Digital audio breaks the sound barrier," EDN, July 20, 2000, pg 72.
  8. Dipert, Brian, "Security scheme doesn't hold water(marking)," EDN, Dec 21, 2000, pg 35.
  9. Atkinson, John, "Upsampling or oversampling?" Stereophile, December 2000.

Reading “bitween” the lines

Attentive readers may notice that I’m side-stepping the DVD-Audio-versus-SACD (Super Audio CD) debate. Why? For one thing, I have too little space to do the topic justice. For another, the controversy has no clear answer; strong advocates and detractors exist on both sides of the tug-of-war rope. And, in some sense, the controversy is quickly becoming a nonissue; both formats have much higher quality potential than the rest of the audio-playback chain can support, and the hybrid players now on the market, such as Apex Digital’s AD-7702, handle both formats.

Instead of directly capturing each sample’s value in a multibit format, SACD encodes the level and positive or negative slope of the waveform at each sample point, in a single-bit format operating at multimegahertz rates. SACD advocates point out that this approach negates the need for decimation during audio capture and oversampling during playback, both of which degrade quality, according to Philips and Sony. However, you cannot dither single-bit formats, and SACD opponents also claim that pulse-density modulation has its own nonlinearities, which noise shaping can’t fully remove from the audible frequency range. Practically speaking, you can experience the purity of the supposed SACD advantage only if the D/A converter in the player directly handles the DSD (direct-stream-digital) bit stream versus first downsampling and transcoding it to PCM (pulse-code-modulation), and if you subsequently run the player’s analog outputs directly to a preamplifier and power amplifier versus to an audio/video receiver or another device that redigitizes them in a multibit, parallel fashion.
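For readers who have not met single-bit encoding before, here is a minimal Python sketch of a first-order delta-sigma modulator, the simplest relative of the pulse-density approach. It is an illustration under stated assumptions, not the Philips and Sony design; production modulators are higher order and add noise shaping that this sketch omits.

```python
def one_bit_modulate(samples):
    """First-order delta-sigma modulator: inputs in [-1, 1] -> list of +1.0/-1.0."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous output bit.
        integrator += x - feedback
        # The output is just the sign of the accumulated error.
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits


level = 0.25                               # constant test input (illustrative)
bits = one_bit_modulate([level] * 6400)
density = sum(bits) / len(bits)            # averaging acts as a crude lowpass filter
print(f"input level {level}, mean of the 1-bit stream {density:.4f}")
```

Each output carries only a sign, yet the density of +1s versus -1s tracks the input level, so a lowpass filter on the bit stream recovers the waveform.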

To follow the debate for yourself, peruse the papers presented at the last several years’ worth of Audio Engineering Society conferences; look especially for those from Philips and Sony, from Malcolm Hawksford of the University of Essex (Essex, UK), from Stanley Lipshitz and John Vanderkooy of the University of Waterloo (Waterloo, ON, Canada), and from James Angus of the University of Salford (Salford, UK). You can also find a pro-PCM paper from Hawksford at the Acoustic Renaissance for Audio site; note that Meridian Audio, which hosts the site, is a strong DVD-Audio advocate (Reference A).

Reference
A. Hawksford, Malcolm OJ, “Bitstream versus PCM debate for high-density compact disc,” www.meridian-audio.com/ara/bitstrea.htm.


© 2009, Reed Business Information, a division of Reed Elsevier Inc. All Rights Reserved.