April 10, 1997

Understanding, enhancing, and measuring PC-audio quality

Steven Harris, Crystal Semiconductor Corp

PC designers need a good analog primer, including performance characteristics and test procedures, to design a quality PC-audio system.

The original IBM PC could make only a few beeps from its limited pulse-width-modulated sound-generation system. Today's PCs have more sophisticated sound subsystems, but they still fail to live up to users' audio-quality expectations (Reference 1). As vendors add TV, digital-video-disk (DVD), 3-D-game, and CD-playing capabilities to the PC, it becomes more of a consumer item, and buyers expect a level of audio quality similar to that of a typical stereo or home-theater system. (See box, "PC meets TV.")

Fortunately, the technology to markedly improve audio quality in PCs exists. Industry standards and initiatives, which Microsoft (Redmond, WA) and Intel (Santa Clara, CA) support, are raising the standards for audio quality. (See box, "Making sense of PC-audio standards.") A critical aspect of the use of these standards and ultimately of improving audio quality is well-defined performance specifications and audio-system measurement methods. Also, because most PC engineers specialize in digital circuits and are unfamiliar with analog circuits, they need some training to achieve and measure high-quality audio.

What is audio quality?

The three fundamental metrics for assessing audio quality are SNR, frequency response, and total harmonic distortion (THD). SNR is the ratio of the signal to the background noise. Table 1 lists some common audio examples and their typical SNRs. If exposed to a loud sound, the human ear has an automatic gain-control mechanism that reduces the ear's sensitivity for several hours. Although the ear's overall perceivable range is 120 dB, the instantaneous sensitivity is much less, around 85 dB. For high-quality audio, an SNR of at least 80 dB is necessary, and more is better.
Notice the low value of 49.7 dB for 8-bit data in Table 1. This low value explains why the audio accompanying most game CD-ROMs sounds noisy. Most game audio systems truncate the audio data to 8 bits/sample, causing the typical background "hiss." As disk capacities increase and audio compression algorithms become popular, game audio quality will improve.
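As a rough cross-check on the 8-bit figure, the following sketch computes the textbook SNR of an ideal N-bit quantizer driven by a full-scale sine wave. The ideal-quantizer model and the function name are assumptions made for illustration; the article does not state how the Table 1 values were derived, and real converters fall short of this bound.

```python
import math

def ideal_quantization_snr_db(bits: int) -> float:
    """SNR in dB of an ideal N-bit quantizer with a full-scale sine input.

    Assumes uniformly distributed quantization noise, which gives the
    familiar approximation of about 6.02 * N + 1.76 dB.
    """
    # Ratio of full-scale sine rms to quantization-noise rms: 2^N * sqrt(1.5)
    return 20 * math.log10((2 ** bits) * math.sqrt(1.5))

for bits in (8, 16):
    print(f"{bits}-bit ideal SNR: {ideal_quantization_snr_db(bits):.1f} dB")
```

Under these assumptions the 8-bit result lands within a fraction of a decibel of the Table 1 value, and each additional bit of word length buys roughly 6 dB.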
THD equals nonlinearity

A perfectly linear system introduces no distortion into the program material. THD measures the system's nonlinearity. It is the ratio of the amplitude of the signal's harmonics to the amplitude of the test signal's sine wave. In well-designed systems that contain no ADCs or DACs (an all-analog signal path), distortion typically occurs at a high signal level and generates low-order harmonics. Fortunately, the human ear exhibits an effect called "masking": loud sounds mask the presence of nearby harmonics. Therefore, the ear is not particularly sensitive to low-order harmonics that large signals cause. However, in a digital system, if a large signal ever clips the ADC, the resulting sound is harsh. So, when you make a digital recording, the signal-level setting must allow plenty of headroom before clipping occurs. For small signals, nonlinearities in the ADC can result in high-order harmonic distortion. This type of distortion is audible because the frequencies of such distortion products are far from the signal frequency. Subtle positioning and ambience information are often present in the low-level sounds, so high-quality conversion of low-level signals is important for good 3-D positioning.

PC audio has unique requirements

In addition to SNR, frequency response, and THD, several design issues are unique to PC-based-audio systems. These issues include system bandwidth, volume control, audio/video synchronization, system noise, user interface, games compatibility, synthesizer quality, frequency accuracy, and transducer quality.

System bandwidth: The CPU and system resources that the audio subsystem uses affect the PC's overall performance. For example, playing back a 44.1-kHz-sample-rate, 16-bit-word stereo stream requires a 1.4-Mbps data flow from the hard disk or CD-ROM across the system bus to the DACs in the codec chip. With insufficient bus bandwidth, you get pops and clicks in the sound because of gaps in the data.
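The 1.4-Mbps figure is simply the product of sample rate, word length, and channel count. A minimal sketch of that arithmetic follows; the function name is illustrative, and the calculation ignores any bus or file-format overhead.

```python
def pcm_data_rate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

rate_bps = pcm_data_rate_bps(44_100, 16, 2)
print(f"{rate_bps} bits/s = {rate_bps / 1e6:.2f} Mbps = {rate_bps // 8} bytes/s")
```

For CD-quality stereo this works out to 1,411,200 bits/s, or 176,400 bytes every second that the bus must deliver without gaps.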
Some well-designed codecs allow the use of F-Mode DMA transfer, which cuts the system-bus use by a factor of 4, greatly reducing the pops and clicks. When playing back a movie on your PC, if sufficient system bandwidth is unavailable, the system usually gives priority to the audio, because a few missed video frames are less objectionable than are missed audio samples.

Electronic volume control: Most PC-audio systems use electronic volume controls rather than mechanical potentiometers. One potential drawback of such controls is that a change in the volume setting can create glitches, or "zipper" noise. Well-designed electronic volume controls incorporate zero-crossing detection, which allows the volume setting to change only when the signal level crosses the midscale point. This scheme renders the glitches small enough to be inaudible.

Audio/video synchronization: Although not strictly an audio problem, a PC's synchronization between audio and video typically is poor. New and improved versions of movie-player software, along with new application-programming-interface (API) definitions from Microsoft, will do much to improve audio/video synchronization (Reference 2).

System noise: A PC is a particularly hostile environment for low-level, analog audio signals. Many high-speed (greater than 10 MHz) asynchronous clock and data signals are present that can interfere with the analog audio signals. Although adequate filtering can directly reduce the coupling to analog signals, many ADCs and DACs are sensitive to high-frequency interference, causing audible artifacts. Some effects of poor digital-interference rejection are audible sounds related to mouse movement, graphics refresh, or disk-drive activity. However, with careful layout and grounding techniques, you can achieve SNRs greater than 80 dB with no activity-related noises.
Microsoft is promoting an industry trend to place the audio converters outside the main PC cabinet, using digital interfaces, such as the Universal Serial Bus (Reference 3) or IEEE-1394 (Reference 4). This placement makes it easier to achieve higher SNRs. Some PCs are mechanically noisy, primarily because of the cooling fan. If you want a PC to reproduce high-quality audio, this fan noise is unacceptable. Some newer PCs use low-speed fans and special baffles to greatly reduce the acoustic noise. Another technique is to sense the temperature inside the PC's case and increase the fan's speed only when the temperature gets too high. Another source of acoustic noise is the disk-drive rotation and head access. Newer PCs use special mounting techniques, baffles, and insulation to reduce the disk-drive sounds (Reference 5).

User interface: Part of the PC-audio experience is the user interface. New interfaces mimic the front-panel controls of a typical home stereo system, which provides a more intuitive PC-audio interface. These interfaces, such as those from Voyetra Inc (Yonkers, NY), are becoming more widely available.

Game compatibility: A measure of audio quality unique to PCs is game compatibility. The extent to which a PC's sound system can successfully reproduce audio in a high percentage of popular games is important (Reference 6). The first sound card for the PC was the SoundBlaster (Creative Labs, Stillwater, OK), which set a de-facto hardware-interface standard for PC-game authors. As alternative sound systems became available, backward compatibility with the SoundBlaster interface was mandatory. Game designers now write games that run well under Windows 95 using DirectX API calls, which separate advances in hardware from the games' software interface.

Music-synthesizer quality: The original SoundBlaster card uses FM synthesis to generate musical-instrumentlike sounds.
This technique produces only a crude approximation of real music, which is the reason most PC games make a lot of beeplike sounds. Wavetable synthesis, which is more realistic, is rapidly replacing FM synthesis. Wavetable synthesis involves storing digitized samples of real instrument sounds in memory. The synthesizer then scales these samples, repeats them, and amplitude-modulates them to yield lifelike representations of the real instrument's sound. Such wavetable synthesizers are available as a software package; one example is the Yamaha musical-instrument-digital-interface (MIDI) MIDIPLUG (www.yamaha.com). However, software implementations consume large amounts of system bandwidth, making them suitable only for technology-demonstration purposes. Wavetable synthesis is commonly available as a multichip set, including a MIDI-interpreting microcontroller, a wavetable-processing DSP, and a sample ROM. Several companies have recently introduced single-chip wavetable devices, such as the CS9236 from Crystal Semiconductor (Austin, TX), which use innovative compression techniques to minimize the necessary ROM size without compromising audio quality. Also, with the introduction of audio-focused DSPs, such as the CS4610, applications such as 3-D audio and wavetable synthesis can run simultaneously with minimal impact on the host CPU's bandwidth. These trends, plus the obvious quality improvement, will make wavetable music synthesis standard in the PC. Note that not all wavetable synthesizers sound the same, but there is no objective measurement technique for judging sound quality. As the synthesizer architects try to match instruments more closely, subjective judgment comes into play. However, it is clear that wavetable synthesis is superior to FM synthesis.

Sampling-frequency accuracy: More and more amateur and professional musicians use PCs, and they are especially interested in the quality of the synthesizer and the precision of the audio-frequency reproduction.
The accuracy of the frequencies in the synthesizer is usually high, because it is based on a quartz crystal. Unfortunately, the frequencies' accuracy during .WAV file playback may not meet musicians' requirements, because some audio codecs compromise the playback sample rate's accuracy. For example, if you record a .WAV file at a sample rate of 44.1 kHz and play back that recording at 44.05 kHz, the reproduced music is slightly shifted in frequency toward the low end of the audio band. Most users do not detect this difference in pitch, but musicians find it unacceptable. Better-quality audio codecs allow precise operation at the standard PC-audio sample-rate frequencies using a quartz crystal.

Transducer quality: Acoustic transducers have a major impact on audio quality. On the recording side, the microphone quality and housing can cause recordings to sound either very good or like a bad phone call. Mounting the microphone in the monitor housing and immediately above the display is becoming increasingly popular. If the loudspeakers are also mounted in or attached to the monitor housing, careful isolation between the microphone and the loudspeakers is necessary. With insufficient isolation, certain applications, such as karaoke machines and speakerphones, can cause the classic feedback-induced "howling." Paying careful attention to the microphone mounting and the associated cavities results in good isolation and a directional response pattern aimed at the normal user position.

Loudspeaker quality is still a sore point with most prepackaged multimedia PCs; many units produce low-quality sound. You can get after-market loudspeakers that can enhance sound quality (Reference 7), many of which include small, magnetically shielded main loudspeakers with a separate woofer unit. Loudspeaker and PC companies are working together to improve acoustic-sound quality.
More sophisticated techniques might include using signal processing in the PC to improve the loudspeaker's acoustic performance. This processing can correct frequency-response deviations in the loudspeaker or perform compression to improve the sound quality of notebook PCs' small loudspeakers.

Define the measurement paths

The first signal path is the analog-signal path (denoted "A-A"). It is the path of an analog signal from the codec's input, through the output mixer and gain block, to the analog output. The signal does not go through the ADC or DAC. You use this mode for listening to music CDs using the CD-ROM-drive analog output. When following the analog-to-digital path (A-D-PC), an analog signal travels from the analog input through the input mixer to the ADC. The digitized data then routes into memory or disk in the PC. The digital-to-analog path (PC-D-A) is a playback path. Data from the memory or disk goes through the DAC and output mixer to the analog output. The analog-to-digital-to-analog path (A-D-PC-D-A) is from the analog input through the input mixer to the ADC and finally into the memory or disk in the PC. Then, play back the data from the memory or disk to the DAC (the playback can be simultaneous with the recording), through the output mixer, and finally to the analog output. Note that some codecs provide an internal loop-back path from the ADC's output to the DAC's input, bypassing the PC bus. This path is useful for characterizing and debugging the codec's ADC and DAC but is not ideal for system measurements, because the data flowing over the PC bus can degrade analog performance. Using the A-D-PC-D-A path is a more realistic and severe test. The last path is the digital-to-analog-to-digital (PC-D-A-A-D-PC) path. You use this path to play back digital audio data from the memory or disk through the DAC to an analog output connector. Then, you route the analog signal to an analog input via an external shielded cable, usually fitted with 3.5-mm jack plugs. Set the input mixer to route the analog signal through the ADC and the subsequent digital signal into the PC memory or disk. Note that most codecs provide an internal loop-back path from the DAC output to the ADC input, bypassing the external components. 
Similar to the previous loop-back path, this path is useful only for codec characterization and debugging, not for system measurements, because any external components can degrade analog performance.

Use standard measurements to test quality

Once you clearly define these measurement paths, you can proceed to the measurements. Unfortunately, you can interpret and measure many audio-performance characteristics in different ways, which leads to inconsistencies in the final values. You can circumvent inconsistencies using the following standard measurement procedures to test the key specifications of SNR, frequency response, and THD. Unless otherwise noted, the measurement bandwidth is 20 Hz to 20 kHz, the test-signal frequency is 1 kHz, the test-system sample rate is 44.1 kHz, and the mixer settings are such that all attenuators are set to no attenuation in the signal path, with only the test channel unmuted.

When performing SNR and THD measurements, it is common to filter the signal to compensate for the ear's uneven frequency response, giving low and high frequencies less influence on the final measured value. You can use either CCIR-468 or A-weighting filters (Reference 8). The definitions given below and the sample plots are unweighted. If you use weighting, specify the type to allow the comparison of values. The measurement techniques are based on the AES17 standard (Reference 9), the EIAJ CD-measurement standard (Reference 10), and techniques in Audio Precision Inc's (Beaverton, OR) Audio Measurement Handbook (Reference 8). Note that all of the following measurement descriptions include the path name in parentheses.

Measure the SNR

Measure the SNR using a -60-dBFS (decibels referred to full scale) signal. This amplitude is low enough to be unaffected by any large-signal nonlinearities but large enough to ensure that the system under test is being exercised.
You can also use other test-signal amplitudes, provided that the signal-level setting generates no distortion components.

SNR for record (A-D-PC): Establish the 0-dBFS analog level by increasing the signal input until the ADC output data clips. Set the level just below the clipping point. Reduce the input signal to -60 dBFS. Perform an FFT on the digital data from the ADC and take the ratio of the amplitude of the -60-dB signal to the rms sum of the other frequencies. Add 60 dB to refer the answer to full scale.

SNR for playback (PC-D-A): Play back a 0-dBFS test digital-sine-wave file and measure the analog sine wave's amplitude. Set this level to 0 dBFS. Play back a -60-dBFS test digital-sine-wave file, notch out the fundamental, and measure the residual level. Compare this level to the previously measured 0-dBFS level.

SNR for analog measurement of record-and-playback path (A-D-PC-D-A): Set the system to record and simultaneously play back the recorded data. Establish a 0-dBFS level by increasing the signal input until the analog-output signal clips. Set the level just below the clipping point. Reduce the input signal to -60 dBFS. Notch out the fundamental from the analog output and measure the residual level. Compare this level to the previously set 0-dBFS level. Note that this test is useful if the only available test equipment is an analog audio analyzer. This test yields a composite result of record-and-playback SNR, which masks details of the separate playback and record values but gives a good indicator of overall quality. For example, if the SNRs of both the record and the playback paths are 80 dB, the composite SNR measures 77 dB. If the specifications for record and playback are 80 dB, a composite measurement of 80 dB guarantees that both record and playback are operating at 80 dB or better. With a composite measurement of 77 to 80 dB, it is possible that both record and playback meet the 80-dB specification.
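The 77-dB composite figure can be reproduced by adding the noise powers of the two paths. The sketch below assumes that the record and playback noise are uncorrelated and referred to the same full-scale level; the function name is illustrative.

```python
import math

def composite_snr_db(*stage_snrs_db: float) -> float:
    """Combine per-stage SNRs (dB) by summing uncorrelated noise powers."""
    # Noise power of each stage relative to a unit-power signal
    total_noise_power = sum(10 ** (-snr / 10) for snr in stage_snrs_db)
    return -10 * math.log10(total_noise_power)

print(f"80 dB + 80 dB -> {composite_snr_db(80, 80):.1f} dB")
print(f"80 dB + 90 dB -> {composite_snr_db(80, 90):.1f} dB")
```

Two equal 80-dB stages lose 3 dB in combination, whereas an 80-dB stage paired with a much quieter one yields a composite only slightly below 80 dB.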
However, a composite measurement of less than 77 dB means that the SNR of either the record or the playback path is less than 80 dB and, therefore, below the specification.

SNR for digital measurement of playback-and-record path (PC-D-A-A-D-PC): Set the system to play back and record simultaneously. Connect the analog output to an analog input with a shielded patch cable. Play back a test digital-sine-wave file at 0 dBFS. Adjust the record path's gain until the ADC output is as large as possible without clipping. Set this output level to 0 dBFS. Play back a test digital-sine-wave file at -60 dBFS. Perform an FFT on the digital data from the ADC, and take the ratio of the amplitude of the -60-dB signal to the rms sum of the other frequencies. Add 60 dB to refer the answer to full scale. This test is useful for an automated self-test and requires only the appropriate software and a patch cable. The system must be capable of simultaneous record and playback. This test assumes that the nominal full-scale analog-output signal level is approximately equal to the nominal full-scale analog-input signal level. If this is not the case, the test result indicates a value that is worse than the actual value. You can adjust the full-scale output voltage to equal the full-scale input voltage by linking analog-out to analog-in with a special patch cord that has gain or attenuation built in, as necessary.

Measure the frequency response

The typical range for frequency-response testing is 20 Hz to 20 kHz. If you use spot frequencies for testing, choose frequencies spaced at octave intervals or closer. Set the level at 1 kHz to 0 dB on the response plot. For all tests, the frequency-sweep range is 20 Hz to 20 kHz.

Frequency response for record (A-D-PC): Apply the frequency-swept sine wave with an amplitude of -20 dBFS to an analog input. Record the data to memory or hard disk. Plot the deviation in signal level vs frequency from the data.
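As an illustration of the spot-frequency approach, the sketch below builds a list of test frequencies no more than an octave apart, anchored at 1 kHz, and normalizes a set of measured levels so that the 1-kHz reading plots at 0 dB. The function names and the sample readings are hypothetical and are not taken from the article.

```python
def octave_spot_frequencies(f_ref=1000.0, f_min=20.0, f_max=20000.0):
    """Spot frequencies at octave steps from f_ref, plus the band edges."""
    freqs = {f_min, f_ref, f_max}
    f = f_ref / 2
    while f > f_min:          # step down in octaves toward the low band edge
        freqs.add(f)
        f /= 2
    f = f_ref * 2
    while f < f_max:          # step up in octaves toward the high band edge
        freqs.add(f)
        f *= 2
    return sorted(freqs)

def deviation_from_reference(levels_db, f_ref=1000.0):
    """Express each measured level relative to the level at f_ref (0 dB)."""
    ref_level = levels_db[f_ref]
    return {f: levels_db[f] - ref_level for f in sorted(levels_db)}

# Hypothetical measured levels in dBFS, for illustration only
measured = {20.0: -20.8, 1000.0: -20.1, 20000.0: -21.3}
print(octave_spot_frequencies())
print(deviation_from_reference(measured))
```

Anchoring the octave series at 1 kHz guarantees that the reference frequency is always among the measured points; the band edges at 20 Hz and 20 kHz are added explicitly because they do not fall on that series.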
Frequency response for playback (PC-D-A): Play the frequency-swept sine wave from a test data file at an amplitude of -20 dBFS to an analog output. Measure the signal level over time and store the result. Plot the deviation in signal level vs time from the data. Arrange the x-axis scale so you can replace time with frequency.

Analog frequency response for record-and-playback path (A-D-PC-D-A): Set the system to record and to simultaneously play back the recorded data. Using an audio analyzer, apply the swept sine wave at an amplitude of -20 dBFS to an analog input. Measure the simultaneous analog output, and plot the deviation in signal level vs frequency. If simultaneous record and playback is not possible, you can perform this test by recording the swept sine-wave analog signal to disk and then playing back the signal into the analog analyzer. This test is useful if the only available test equipment is an analog audio analyzer. This test yields a composite result of record and playback responses, which masks details of the separate playback and record responses but is useful to judge overall system performance.

Digital frequency response for record-and-playback path (PC-D-A-A-D-PC): Set the system to play back and record simultaneously. Play a test file consisting of the swept sine wave at an amplitude of -20 dBFS to an analog output. Connect the analog output to an analog input with a shielded patch cable. Record the analog input to the memory or disk and plot the deviation in signal level vs frequency. This test is useful for an automated self-test because it requires only the appropriate software, a patch cable, and a system capable of simultaneous record and playback. This test yields a composite result of playback and record responses.

Measure THD

As mentioned earlier, THD is the ratio of the amplitude of the signal harmonics to the amplitude of the test signal.
However, THD plus noise (THD+N), which measures both the harmonics and the noise present in the output signal, is a more common way to gauge distortion than is THD alone. For a THD+N measurement, it is important to include all nontest signal frequencies, not just multiples of the test frequency, because converters can generate aliased components anywhere in the measurement frequency band. Also, the THD+N measurement is easier to perform than is THD, because you must only filter out the test frequency and perform a broadband measurement of the residual signal, rather than perform a spectral analysis. The THD+N measurement is also often referred to full scale, rather than to the test-signal amplitude.
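The "filter out the test frequency, then measure what is left" idea can be sketched on digital data without a spectrum analysis. The example below is a minimal illustration under stated assumptions: the record holds a whole number of test-tone cycles, the notch is implemented as a least-squares sine fit that is subtracted from the data, no 20-Hz-to-20-kHz band-limiting or weighting is applied, and the result is referred to full scale. The function name and the synthetic test signal are hypothetical.

```python
import math

def thd_plus_n_percent(samples, sample_rate_hz, test_freq_hz, full_scale=1.0):
    """THD+N as a percentage of full scale, via a sine-fit notch.

    Assumes the record contains an integer number of test-tone cycles so
    that the projection below isolates the test frequency exactly.
    """
    n = len(samples)
    w = 2 * math.pi * test_freq_hz / sample_rate_hz
    # Project the data onto cosine and sine at the test frequency
    a = 2 / n * sum(x * math.cos(w * i) for i, x in enumerate(samples))
    b = 2 / n * sum(x * math.sin(w * i) for i, x in enumerate(samples))
    # Remove the fitted test tone; what remains is distortion plus noise
    residual = [x - a * math.cos(w * i) - b * math.sin(w * i)
                for i, x in enumerate(samples)]
    residual_rms = math.sqrt(sum(r * r for r in residual) / n)
    full_scale_rms = full_scale / math.sqrt(2)  # rms of a full-scale sine
    return 100 * residual_rms / full_scale_rms

# Synthetic data: a 1-kHz tone 3 dB below full scale plus a small
# third harmonic (amplitude 0.001 of full scale), 100 cycles at 44.1 kHz
fs, f0, n = 44_100, 1_000, 4_410
tone = 10 ** (-3 / 20)
signal = [tone * math.sin(2 * math.pi * f0 * i / fs)
          + 0.001 * math.sin(2 * math.pi * 3 * f0 * i / fs)
          for i in range(n)]
print(f"THD+N = {thd_plus_n_percent(signal, fs, f0):.3f} % of full scale")
```

Because the residual includes everything other than the test tone, aliased components and noise are counted along with the harmonics, which is the point of a THD+N reading. Real captures would normally also need band-limiting to the measurement bandwidth before the rms step.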
THD+N for playback (PC-D-A): Play back a test sine-wave data file at 0 dBFS. Measure the analog-output amplitude and set this level as 0 dB. Play back a -3-dBFS test sine-wave data file to the analog output. Notch out the test-frequency component from the output and measure the remaining signal. Take the ratio of this measurement to the previously measured 0-dB level. Express the answer as a percentage.
THD+N for digital measurement of playback-and-record path (PC-D-A-A-D-PC): Set the system to play back and record simultaneously. Connect the analog output to an analog input with a shielded patch cable. Play back a test digital sine-wave file at 0 dBFS. Adjust the record path's gain until the ADC's output is as large as possible without clipping. Set this level to 0 dBFS. Play back a test digital sine-wave file at -3 dBFS. Perform an FFT on the digital data from the ADC and take the ratio of the rms sum of all nontest-signal frequencies to the 0-dBFS level. Express the answer as a percentage.

For more detailed descriptions of PC-audio measurement techniques and sample equipment to use, see Reference 11.
Copyright © 1997 EDN Magazine, EDN Access. EDN is a registered trademark of Reed Properties Inc, used under license. EDN is published by Cahners Publishing Company, a unit of Reed Elsevier Inc.