Feature

Signal to noise: calculating the high-resolution-audio reality-to-hype ratio

Is high-resolution audio a "sound" investment, or will it bring your next design to a crashing halt in the market?

By Brian Dipert, Technical Editor -- EDN, 2/6/2003

Sidebars:
Put on a new record

Music labels and equipment suppliers hope that high-resolution-audio formats, such as DVD-Audio and SACD (Super Audio CD), will be the latest in a string of mostly successful upgrade "pitches" (Reference 1). Beginning with the 78-rpm record and quarter-inch tape, the audio industry has sold consumers on a series of claimed ever-higher quality formats: 33.3-rpm albums, 45-rpm singles, and eight-track tapes, all of which audio CDs and cassettes and, less successfully, DAT and MiniDiscs have now superseded. Along the way, consumers have upgraded their audio gear, refreshed their music libraries, and purchased more expensive variants of new music. With history as a guide to future trends, why should this latest format jump be any different?

Well, with every generation up-tick, the incremental quality improvement has diminished. I'd argue, in fact, that the reason that audio CDs so quickly supplanted LPs had little or nothing to do with higher sonic quality, and the even more rapid acceptance of degraded-quality MP3 and other lossy-compression formats supports this claim. The embrace of the audio CD was all about portability and durability, not first-play sound quality, and some folks still insist that LPs sound better. The latest audio formats often offer surround sound, which is a credible upgrade motivator for at least some consumers. But will large samples and high sample rates further increase consumers' temptations to pull out their wallets? Will consumers care about these features? Or, as the actions of Sony (which refused to label recent two-channel re-releases of the Rolling Stones' library as the hybrid SACDs they in fact were and sells them at conventional-CD prices) suggest, will the plethora of formats have a detrimental effect on sales?

What's the theory behind the auditory benefit claims of the new high-resolution formats? And how well does this theory hold up outside the laboratory—that is, in the real world? To set a framework for the discussion that follows, let's make sure we're using the same vocabulary, and the same definitions for the words in that vocabulary. I adapted the descriptions that follow from an Analog Devices application note (Reference 2):

  • Decibel: describes the sound-level (sound-pressure-level) ratio or power and voltage ratios, using base-10 logarithms; dBVOLTS=20×log(Vo/Vi), dBWATTS=10×log(Po/Pi), dBSPL=20×log(Po/Pi), where P in the last expression is sound pressure.
  • Dynamic range: the difference between the loudest and the quietest representable signal level or, if noise is present, the difference between the loudest (maximum-level) signal and the noise floor; measured in decibels; dynamic range=(peak level)–(noise floor) dB.
  • SNR: the difference between the nominal level and the noise floor; measured in decibels; other authors define SNR for analog systems as the ratio of the largest representable signal to the noise floor when no signal is present, which more closely parallels SNR for a digital system.
  • Headroom: the difference between nominal line level and peak level where signal clipping occurs; measured in decibels; the larger the headroom, the better the audio system handles very loud signal peaks before distortion occurs.
  • Peak operating level: the maximum representable signal level at which clipping of the signal occurs.
  • Line level: nominal operating level (0 dB or, more precisely, –10 to +4 dB).
  • Noise floor: the noise floor for human hearing is the average level of "just audible" white noise; analog-audio equipment can generate noise from components; with a DSP, noise can come from quantization errors.

Analog Devices' documentation also points out that you can assume that the sum of headroom and SNR of an electrical analog signal is equal to the dynamic range, although this statement is not entirely accurate because signals can still be audible below the noise floor. It also points out that, in undithered DSP-based systems, you cannot directly apply the SNR definition because no noise is present in the absence of a signal. In the digital domain, which this article series will primarily discuss, dynamic range and SNR both often describe the ratio of the largest representable signal to the quantization error or noise floor.
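The decibel, headroom, SNR, and dynamic-range definitions above reduce to a few lines of arithmetic. The following is a minimal Python sketch; the function names and the example levels (a +4-dB nominal level, a +24-dB clipping point, and a –80-dB noise floor) are illustrative assumptions, not figures from the Analog Devices note.

```python
import math

def db_from_voltage_ratio(v_out, v_in):
    """Decibels for a voltage (or sound-pressure) ratio: 20*log10(Vo/Vi)."""
    return 20.0 * math.log10(v_out / v_in)

def db_from_power_ratio(p_out, p_in):
    """Decibels for a power ratio: 10*log10(Po/Pi)."""
    return 10.0 * math.log10(p_out / p_in)

def headroom_db(peak_db, nominal_db):
    """Headroom: clipping level minus nominal line level."""
    return peak_db - nominal_db

def snr_db(nominal_db, noise_floor_db):
    """SNR as defined above: nominal level minus noise floor."""
    return nominal_db - noise_floor_db

def dynamic_range_db(peak_db, noise_floor_db):
    """Dynamic range: peak level minus noise floor."""
    return peak_db - noise_floor_db

# Hypothetical analog levels, chosen only to illustrate the bookkeeping.
NOMINAL_DB, PEAK_DB, NOISE_FLOOR_DB = 4.0, 24.0, -80.0

print("voltage doubling:", round(db_from_voltage_ratio(2.0, 1.0), 2), "dB")
print("power doubling:  ", round(db_from_power_ratio(2.0, 1.0), 2), "dB")
print("headroom:        ", headroom_db(PEAK_DB, NOMINAL_DB), "dB")
print("SNR:             ", snr_db(NOMINAL_DB, NOISE_FLOOR_DB), "dB")
print("dynamic range:   ", dynamic_range_db(PEAK_DB, NOISE_FLOOR_DB), "dB")
```

With these assumed levels, headroom plus SNR equals the dynamic range, which is the relationship the application note describes for an electrical analog signal.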

More bits to represent a signal mean more available quantization levels (Figure 1 and Table 1). Having more levels means lower quantization noise, a wider dynamic range, and a more accurate representation of the original signal. Again quoting the Analog Devices literature, "The maximum representable signal amplitude to the maximum quantization error for an ideal A/D converter or DSP-based digital system is calculated as: SNR_RMS (dB)=6.02×n+1.76 dB; dynamic range (dB)=6.02×n+1.76 dB≈6×n."

The documentation bases 1.76 dB on sinusoidal waveform statistics, and "this figure would vary for other waveforms"; n represents the data-word length. Providing more bits means providing better sound, then, at least to a point. How much accuracy between the sampled signal and the original is good enough, and how much is too much? Supporting more bits requires more processing muscle and more storage, both of which negatively impact cost. Ironically, Meridian Audio's Bob Stuart, one of the founding fathers of the 24-bit DVD-Audio format, along with a number of equally well-regarded peers, published a paper a few years ago that stated that 20-bit precision at a 48-kHz sampling rate and 14-bit precision at a 96-kHz sampling rate (in both cases incorporating noise shaping) were the maximum-required specifications for high-quality audio (Reference 3).
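To put numbers on the formula, here is a short Python sketch that tabulates the ideal figure for several word lengths. The function names are my own; the 6.02 and 1.76 constants come from the quoted expression. The 1.76-dB term is 10×log(3/2), which falls out of comparing the power of a full-scale sine wave with the power of uniformly distributed quantization error, and 6.02 dB per bit is 20×log(2).

```python
import math

def quantization_levels(n_bits):
    """Number of distinct codes an n-bit word can represent."""
    return 2 ** n_bits

def ideal_snr_db(n_bits):
    """Ideal full-scale-sine SNR of an n-bit quantizer: 6.02*n + 1.76 dB."""
    return 6.02 * n_bits + 1.76

def sine_term_db():
    """The sine-wave constant in the formula, 10*log10(3/2)."""
    return 10.0 * math.log10(1.5)

def per_bit_db():
    """Each extra bit halves the step size: 20*log10(2) dB."""
    return 20.0 * math.log10(2.0)

for n in (14, 16, 20, 24):
    print(f"{n:2d} bits: {quantization_levels(n):>10,d} levels, "
          f"ideal SNR = {ideal_snr_db(n):6.2f} dB")
```

These are ideal figures for an undithered full-scale sine wave; as the documentation notes, the constant would vary for other waveforms.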

Thinking along similar lines, sound engineer Thomas Sandmann from Master Orange Entertainment points out that the theoretical quantization noise of a 24-bit A/D converter at –144 dB is significantly lower than the thermal noise of a single resistor connected to the ADC input (Reference 4). And Sound and Vision editor David Ranada, in a recent review of DVD-Audio and SACD players, notes that even the best of them, with an effective dynamic range of 18 to 19 bits, delivers A-weighted noise levels approximately 34 dB "worse" than ideal, 24-bit PCM performance (Reference 5).
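Sandmann's resistor comparison is easy to check with the standard thermal-noise expression, v=sqrt(4×k×T×R×B). The Python sketch below is only an illustration: the 1-kΩ source resistance, 300K temperature, 20-kHz bandwidth, and 2V-rms full-scale level are my assumptions, not values from Reference 4. The second helper estimates how far a converter with a given number of effective bits falls short of ideal 24-bit performance, using the 6.02-dB-per-bit rule.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def thermal_noise_vrms(resistance_ohms, bandwidth_hz, temperature_k=300.0):
    """RMS thermal (Johnson) noise voltage of a resistor: sqrt(4kTRB)."""
    return math.sqrt(4.0 * BOLTZMANN * temperature_k
                     * resistance_ohms * bandwidth_hz)

def level_db_re_full_scale(v_rms, full_scale_vrms):
    """Level of v_rms relative to the converter's full-scale RMS input."""
    return 20.0 * math.log10(v_rms / full_scale_vrms)

def shortfall_db(effective_bits, nominal_bits=24):
    """Dynamic range lost relative to an ideal nominal_bits converter."""
    return 6.02 * (nominal_bits - effective_bits)

# Assumed, hypothetical operating point (not from the cited references).
noise = thermal_noise_vrms(resistance_ohms=1_000.0, bandwidth_hz=20_000.0)
level = level_db_re_full_scale(noise, full_scale_vrms=2.0)

print(f"resistor noise: {noise * 1e6:.3f} uV rms, {level:.1f} dB re full scale")
for bits in (18, 19):
    print(f"{bits} effective bits: {shortfall_db(bits):.1f} dB short of 24-bit")
```

Under these assumptions, the resistor alone sits roughly 13 dB above the –144-dB theoretical figure, and 18 to 19 effective bits land about 30 to 36 dB short of ideal. Ranada's 34-dB number is A-weighted, so an exact match with this unweighted estimate shouldn't be expected.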

The human auditory system, in an ideal anechoic listening environment, discerns a 120-dB dynamic range. Literature often quotes the typical ambient masking-noise level in a living room as 45 dB SPL (sound-pressure level); the noise level in a moving automobile is significantly higher. Quantization noise, most noticeable in audio with low signal levels, must be near to (because it's correlated to the audio) or ideally above this ambient noise floor before it's audible. Even with 16-bit audio CDs, such a scenario would require extensive signal amplification, which would likely blow out speakers and eardrums when the audio returned to nominal levels (Table 2).
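A back-of-the-envelope version of that argument: if you turn up the playback gain until the quantization noise floor just reaches the ambient level, how loud do full-scale peaks become? The Python sketch below makes several simplifying assumptions of my own. It uses the ideal 6.02×n+1.76 figure, treats the noise as broadband and unweighted, and ignores dither and noise shaping. The 45-dB-SPL ambient level is the living-room figure quoted above.

```python
def ideal_dynamic_range_db(n_bits):
    """Ideal dynamic range of an n-bit PCM word: 6.02*n + 1.76 dB."""
    return 6.02 * n_bits + 1.76

def peak_spl_when_noise_meets_ambient(n_bits, ambient_spl_db):
    """Full-scale peak SPL when playback gain puts the quantization
    noise floor at the ambient noise level."""
    return ambient_spl_db + ideal_dynamic_range_db(n_bits)

AMBIENT_SPL_DB = 45.0  # typical living-room masking level cited in the text

for n in (16, 20, 24):
    peak = peak_spl_when_noise_meets_ambient(n, AMBIENT_SPL_DB)
    print(f"{n} bits: full-scale peaks at about {peak:.0f} dB SPL")
```

Even for 16-bit audio, the result lands more than 20 dB beyond the 120-dB range that the auditory system discerns, which is the point of the speaker-and-eardrum warning.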

You may be getting the sense at this point that a 24-bit sample is overkill for audio storage. Even if you believe that 16-bit samples are insufficient, which I don't, a few bits' more resolution will keep sample size from becoming the weak link in the audio chain that begins in the recording studio and microphones and ends in the listening room and your ears. The choice of a 24-bit sample primarily results from the fact that modern memory, processing, and input/output circuits most readily handle information in 8-bit groups. But at least two scenarios exist for which I'd argue that 24 bits might not be enough.

The first situation occurs during the original recording, mixing, and mastering of the audio, before producers transfer it to optical storage, a downloadable file, or some other mass-distribution vehicle. Think about all of the operations that occur during music creation: Audio engineers combine, equalize, speed up, and slow down multiple tracks' worth of recordings; acoustically manipulate vocals to turn marginal singers into divas; and invariably compress the dynamic range of the final product for as-loud-as-possible radio broadcast. Each of these numerous steps involves arithmetic calculations that, with insufficient precision, result in overflow, rounding, and truncation errors. The effects of these incremental errors build on each other and may audibly degrade the final product (see sidebar "Put on a new record"). Even so, audio engineers are still grumbling over the significant amount of expensive hardware and software upgrades that larger samples, flowing into and out of machines at faster rates, require (references 6 to 8).
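A toy simulation can illustrate how rounding errors build up through a processing chain. The Python sketch below is not a model of any real mixing console or workstation: it passes a 997-Hz test tone through a series of hypothetical gain changes, rounds to the word length after every stage, and compares the result with an unrounded reference. The gain values, stage count, and test signal are all arbitrary assumptions.

```python
import math

def quantize(x, n_bits):
    """Round x (full scale = +/-1.0) to the nearest n-bit step."""
    step = 2.0 ** -(n_bits - 1)
    return round(x / step) * step

def chain_error_db(n_bits, stages=20, samples=4800):
    """RMS error (dB re full scale) after `stages` gain changes, each
    followed by rounding to n_bits, versus an unrounded reference."""
    # Hypothetical gains, each within about 10% of unity and all different.
    gains = [1.0 + 0.1 * math.sin(1.7 * s) for s in range(1, stages + 1)]
    err_sq = 0.0
    for i in range(samples):
        ideal = 0.5 * math.sin(2.0 * math.pi * 997.0 * i / 48000.0)
        fixed = quantize(ideal, n_bits)       # initial conversion
        for g in gains:
            ideal *= g                        # reference: no rounding
            fixed = quantize(fixed * g, n_bits)
        err_sq += (fixed - ideal) ** 2
    return 20.0 * math.log10(math.sqrt(err_sq / samples))

for bits in (16, 24):
    print(f"{bits}-bit words: one conversion {chain_error_db(bits, stages=0):7.1f} dB, "
          f"20 stages {chain_error_db(bits, stages=20):7.1f} dB")
```

In this sketch, the error after 20 stages is noticeably higher than after the initial conversion alone, and the 24-bit chain ends up far quieter than the 16-bit one. Real processing chains usually carry even wider intermediate words for exactly this reason.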

An analogous scenario occurs in the decoding and postprocessing stages of audio playback. The latest-generation audio formats, such as DVD-Audio, DTS 96/24, WMA Professional, and PCM-transformed SACD, themselves require long data words, and postdecoding tasks, such as surround virtualization and bass management, further add to the potential for calculation error and subsequent loss of acoustic "transparency." Even in the era of 16-bit audio, 32-bit fixed- and floating-point DSPs commonly found use in midrange and high-end equipment. With the migration to 24-bit audio, the 32-bit DSP will likely become pervasive, and all but the lowest-end systems will employ floating-point variants.

In Part 2 of this article series, I'll discuss the other half of the technology behind the high-resolution audio hype: high sampling rates. Have a relaxing intermission and tune in to the next issue of EDN for the rest of the show.


For more information...
When you contact any of the following manufacturers directly, please let them know you read about their products in EDN.
Analog Devices
www.analog.com
Audio Engineering Society (AES)
www.aes.org
Digigram
www.digigram.com
Digital Theater Systems (DTS)
www.dtsonline.com
Master Orange Entertainment
www.master-orange.de
Meridian Audio
www.meridian-audio.com
NEC
www.nec.com
Oktava
http://oktava.tula.net
Sony
www.sony.com
Sound and Vision Magazine
www.soundandvisionmag.com
  


Author Information
Technical editor Brian Dipert is off to listen to the culinary frequencies emanating from his microwave oven (note: not the interference it creates in his cordless phone, of which he is already intimately aware), and the quantization noise in his latest AC/DC CD. (Any excuse to "crank it up" is a good excuse.) Reach him, and his amplifier that goes to "11," at 1-916-454-5242, fax 1-916-454-5101, bdipert@edn.com, and www.bdipert.com.


References
  1. Dipert, Brian, "Destination distortion: High-resolution audio strides toward an unclear future," EDN, Jan 9, 2003, pg 36.
  2. Tomarakos, John, and Dan Ledger, "Using the low-cost, high performance ADSP-21161 SIMD digital signal processor for digital audio applications, Revision 2," Analog Devices, Aug 9, 2001, www.analog.com/UploadedFiles/Application_Notes/2792059121065L_Audio_Tutorial.pdf.
  3. Stuart, Robert et al, "A proposal for the high-quality audio application of high-density CD carriers," Version 1.2, June 23, 1995, www.meridian-audio.com/ara/araconta.htm.
  4. Sandmann, Thomas, "What do 24 bit and 96 kHz achieve?" www.terratec.de/4G/2496-en.pdf.
  5. Ranada, David, "Grading on the curve," Sound and Vision, December 2001, pg 43.
  6. Cooper, Michael, "Bridging the gap," Electronic Musician, November 2002, pg 38.
  7. Davis, Steve, "The high overhead of high bit rates," Pro Sound News, September 2002, pg 28.
  8. Smith, Noel, "Hi-res audio: savior or emperor's new clothes?" Pro Sound News, November 2002, pg 34.
 

Put on a new record

One of my off-hours hobbies is recording music. I began the pastime several years ago by running the outputs of Oktava MC-012 microphones and a Denecke PS-2 dual-channel, portable phantom power unit directly into my Sony TCD-D8 DAT deck. Later, I added a Sony SBM-1, which includes both a higher quality microphone preamp and an A/D converter whose SBM (super-bit-mapping) noise-shaping techniques deliver claimed 20-bit equivalent dynamic range, to my collection. Denecke's AD-20, a portable microphone preamp which also includes a 20-bit ADC, targets users with audio recorders that incorporate conventional coaxial and optical S/PDIF (Sony Philips Digital Interface) inputs.

Lately, though, I've switched to a NEC Versa UltraLite notebook PC as my audio-capturing platform, along with a Digigram VxPocket sound card, which supports 24-bit, 48-kHz recording. The raw 24-bit files don't necessarily sound any better than the 16-bit ones I was making before. But they hold up better through multiple editing passes before dithering and 16-bit, 44.1-kHz downsampling. And, believe me, given my elementary skills, my recordings require lots of editing before I'm happy enough to burn an audio CD!
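For readers curious about the dithering step, here is a minimal Python sketch of word-length reduction from 24 to 16 bits with triangular (TPDF) dither. It is one common approach, not necessarily what my editing software does, and it leaves out the 48-to-44.1-kHz sample-rate conversion entirely. The function name and the test values are hypothetical.

```python
import random

def requantize_24_to_16(sample_24, rng):
    """Reduce a signed 24-bit integer sample to signed 16-bit,
    adding TPDF dither before rounding."""
    # One 16-bit step equals 256 24-bit steps.
    scaled = sample_24 / 256.0
    # Sum of two uniform values gives a triangular PDF spanning +/-1 LSB.
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    value = round(scaled + dither)
    # Clamp to the 16-bit range in case dither pushes a peak over.
    return max(-32768, min(32767, value))

rng = random.Random(0)
quiet_sample = 64  # a 24-bit value worth only a quarter of one 16-bit step
outputs = [requantize_24_to_16(quiet_sample, rng) for _ in range(20000)]

print("plain rounding gives:", round(quiet_sample / 256.0))
print("dithered average:    ", sum(outputs) / len(outputs))
```

Plain rounding discards the quarter-step signal outright, whereas the dithered outputs average out close to it. That's the trade dither makes: a slightly higher noise floor in exchange for preserving low-level detail.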



©1997-2009 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.