

Design Feature: July 21, 1994

Asynchronous conversion thwarts incompatibility in sampling A/D systems

Robert Adams,
Analog Devices Inc

By using asynchronous sample-rate conversion, you can eliminate the problem of repeated or dropped samples that can occur when interconnecting systems with incompatible clock rates.

A technique known as "asynchronous sample-rate conversion" (ASRC) solves two problems you commonly encounter in communications between digital systems. An ASRC system acts as a universal buffer between incompatible audio sample rates and solves the problem of communication between systems where both try to act as clock "masters."

Before going into the details of ASRC, it's useful to explore the rationale behind its development. Digital audio has provided the major impetus. A decade ago, most audio signals were recorded, stored, and delivered in analog form to the end user. Today, the trend is to use digital techniques wherever possible to avoid the loss of signal quality that inevitably occurs with analog technology. An obvious example is the modern CD player, where the superb signal quality of digital sound has found nearly universal acceptance as the new standard for music distribution.

In the early days of digital audio, interconnection between pieces of equipment was implemented mostly in the analog domain, and each part of the chain came with its own A/D and D/A converters. For example, the digital output of a CD player was first converted to analog form, then sent to a preamp for further processing. Inside the preamp, an A/D converter often converted the signal back to digital form to perform special effects, such as ambience enhancement (reverb), digital equalization, or Dolby surround-sound decoding. It did not take long to realize that a direct digital connection between components would eliminate the redundant A/D and D/A converters and consequently improve the signal quality.

As digital interconnection between components becomes more common, a new problem frequently arises: synchronization. For example, assume you wish to send a digital audio signal over a synchronous network. The network must act as the clock master: It requests data at a fixed rate from the data source (Fig 1). The data source is most likely producing data at a different sample rate from that of the network and wants to act as the clock master. If the two systems are connected with no attempt at synchronization, then samples may be repeated or dropped, depending on whether the network rate is higher or lower than the sample rate of the source. This example shows that it's not easy to interconnect two systems when both want to be the clock master.
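The repeated-or-dropped-sample effect is easy to reproduce numerically. The following is a minimal sketch, assuming a hypothetical hookup in which the sink simply grabs the most recent source sample on each tick of its own clock; the function name and the integer-rate arguments are illustrative and do not come from any real system.

```python
def unsynchronized_read(source_rate, sink_rate, duration_s=1.0):
    """Count repeated and dropped source samples when the sink free-runs.

    Assumption: on every sink clock tick the sink reads the newest source
    sample, with no rate conversion in between. Rates are integers in Hz.
    """
    n_out = int(sink_rate * duration_s)
    # Index of the newest source sample at each sink tick
    # (integer arithmetic avoids floating-point rounding).
    picked = [(k * source_rate) // sink_rate for k in range(n_out)]
    pairs = list(zip(picked, picked[1:]))
    repeated = sum(1 for a, b in pairs if b == a)          # same sample read twice
    dropped = sum(b - a - 1 for a, b in pairs if b > a + 1)  # samples never read
    return repeated, dropped


if __name__ == "__main__":
    print(unsynchronized_read(44100, 48000))  # sink faster than source: repeats
    print(unsynchronized_read(48000, 44100))  # sink slower than source: drops
```

With a 44.1-kHz source and a 48-kHz sink, this model repeats about 3900 samples every second, roughly the difference between the two rates.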

Another example of a synchronization problem is a system in which you need to digitally mix multiple sources with incompatible sample rates. Fig 2 shows a typical mixing console for which all the inputs are in digital form. You can synchronize many of the input sources to a common studio reference generator, but often one or more input channels come from sources that cannot operate in a clock "slave" mode (for example, consumer CD or digital-audio tape signals, as well as audio from network connections). Again, this disparity in sample rates causes repeated or dropped samples to occur at regular intervals, resulting in audible errors.

In many cases, sample rates are nominally the same but differ by the accuracy and tolerance of the internal clock generators. Again, because both the sender and the receiver are clock masters, repeated or dropped samples occur. Table 1 summarizes some of the most popular audio sample rates and where they are used. If we included multimedia and networked audio applications, the list would become even longer.
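For nominally equal rates, a rough estimate of how often a sample slips follows directly from the clock mismatch. The sketch below assumes two free-running clocks whose frequencies differ by a fixed amount in parts per million; the 100-ppm figure in the usage line is purely illustrative and is not taken from any equipment specification.

```python
def seconds_between_slips(nominal_rate_hz, mismatch_ppm):
    """Average time between repeated or dropped samples.

    Assumption: both clocks free-run and differ by a constant
    mismatch_ppm (parts per million) around the same nominal rate.
    """
    rate_difference_hz = nominal_rate_hz * mismatch_ppm * 1e-6
    return 1.0 / rate_difference_hz


if __name__ == "__main__":
    # Two 48-kHz clocks that disagree by an assumed 100 ppm.
    print(seconds_between_slips(48000, 100))
```

In this illustrative case the rates differ by 4.8 Hz, so a sample is repeated or dropped about every 0.2 sec.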

Table 1—Popular audio sampling rates and applications
Sampling frequency (kHz)   Application
44.1                       CD, DCC, digital audio tape
32                         DCC, digital audio tape, MAC, NICAM, DSR
48                         Professional audio, digital audio tape, DCC, DAB, S-VHS
16                         MAC
18.9                       CD-I (Level C)
37.8                       CD-I (Levels A and B)
31.5                       8-mm videocassette recording
44.056                     U-matic videocassette recording
38                         Digital FM-stereo decoders


Sync vs async sample-rate converters

Fig 3 shows the difference between synchronous and asynchronous sample-rate converters. In a synchronous converter (Fig 3a), the user sets the frequency ratio FSout/FSin explicitly, and the converter then produces output samples at the requested rate. The frequency ratio for such a device must be a rational number and is usually expressed as a ratio N/M. To keep the hardware complexity reasonable, N and M are usually quite small because using large values entails a large number of stored interpolation coefficients.

While a synchronous sample-rate converter is often useful, its output is still a clock master and thus does not solve any of the problems. What you need is an asynchronous sample-rate converter (Fig 3b), which provides correctly interpolated output data when requested from an external clock signal. The output sample-rate clock signal is now an input to the device, and the rate converter produces data upon request. The output is, therefore, a sample-clock slave, solving the problem of interconnection between multiple systems that all want to be the clock master.

For example, Fig 4 shows a solution to the digital mixing-console problem in which each "wild," or unsynchronized, input uses an ASRC. The output clock of each ASRC ties to the internal system clock of the mixer. The input clock to each ASRC is derived from the serial input stream of each channel, using a PLL. The ASRC effectively decouples the internal and external sampling rates, and makes digital interconnection as easy as the analog interconnection of yesterday.


Synchronous rate-conversion theory

A synchronous sample-rate converter applies an explicit sample-frequency ratio FSout/FSin of N/M, and samples are then produced at a rate that is locked to the input rate. The ratio must be rational (N and M are both integers); furthermore, N and M should be quite small to avoid the requirement of large coefficient storage. From a signal-processing point of view, synchronous sample-rate converters are quite simple.
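To see how large N and M become for real rate pairs, you can reduce FSout/FSin to lowest terms. This is a small sketch using Python's standard fractions module; the function name is illustrative, and the rates are entered in hertz so that they stay integers.

```python
from fractions import Fraction


def conversion_ratio(fs_in_hz, fs_out_hz):
    """Return (N, M) such that FSout/FSin = N/M in lowest terms."""
    ratio = Fraction(fs_out_hz, fs_in_hz)
    return ratio.numerator, ratio.denominator


if __name__ == "__main__":
    print(conversion_ratio(32000, 48000))   # a small ratio
    print(conversion_ratio(44100, 48000))   # CD rate to professional rate
    print(conversion_ratio(44100, 44056))   # two nearly equal rates
```

Converting 32 to 48 kHz needs only N/M = 3/2, whereas 44.1 to 48 kHz needs 160/147, and 44.1 to 44.056 kHz needs 11014/11025. The last case illustrates how quickly coefficient storage grows when the rates are close but not simply related.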

Fig 5 shows an example in which N/M=4/3. First, an interpolation by a factor of four takes place. This is done by inserting three zero-valued samples between input samples, which increases the sample rate by a factor of four. A digital filter for the higher sample rate of FSin×4 then removes the images. The digital filter's cutoff frequency is less than half the input or output sample rate, whichever is lower. The output of this filter is then decimated by three, a process that involves simply discarding two of every three samples at the output of the interpolation filter.
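The three steps (zero-stuff, filter, discard) can be sketched directly. The code below is a deliberately literal, unoptimized model, assuming a Hann-windowed sinc as the low-pass filter; the filter design, the tap count, and all function names are assumptions made for illustration and are not taken from the figures.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Hann-windowed sinc; cutoff is in cycles/sample at the filter's own rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for i in range(num_taps):
        x = i - mid
        sinc = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(sinc * window)
    return taps


def resample_rational(samples, n_up, m_down, taps_per_phase=16):
    # Step 1: interpolate by inserting n_up - 1 zeros after each input sample.
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0.0] * (n_up - 1))
    # Step 2: filter at the high rate FSin x n_up. The cutoff is half of the
    # lower of the input and output rates; the gain of n_up makes up for the
    # energy lost to the inserted zeros.
    cutoff = 0.5 / max(n_up, m_down)
    taps = [n_up * t for t in lowpass_taps(n_up * taps_per_phase + 1, cutoff)]
    filtered = []
    for i in range(len(stuffed)):
        acc = 0.0
        for j, h in enumerate(taps):
            if i - j >= 0:
                acc += h * stuffed[i - j]
        filtered.append(acc)
    # Step 3: decimate by keeping one of every m_down filtered samples.
    return filtered[::m_down]


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 0.05 * k) for k in range(60)]
    out = resample_rational(tone, 4, 3)
    print(len(tone), "input samples ->", len(out), "output samples")
```

Most of the multiplications in step 2 involve zeros, and two of every three results are thrown away in step 3, which is the inefficiency that the polyphase approach removes.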

Fig 6 shows the direct-convolution model of an interpolation filter for N=4. The zero-stuffing causes only a subset of the filter coefficients to be used in the calculation of any output sample. This observation leads to the "polyphase-filter" model, in which the filter coefficients split into N separate FIR filters (Fig 7). Each "tick" of the input clock produces four simultaneous outputs. These values are then read out sequentially at the rate FSin×N, resulting in the same sequence of output values as that of the original interpolation filter.

Because each subfilter uses the same stored input data, you need to store this input data only once in a common RAM. This technique reduces the required amount of RAM because you do not need to store the zeros that were conceptually inserted in the interpolation. Also, the computation rate goes down because the digital filter's multiply-accumulate automatically skips zero-valued stored data values.
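The equivalence between the direct-convolution model and the polyphase model can be checked with a few lines of code. This is a minimal sketch with arbitrary illustrative coefficients; the function names are assumptions, and the subfilter for branch p is taken to be every Nth coefficient starting at index p.

```python
def interpolate_direct(samples, taps, n_up):
    """Zero-stuff, then convolve with the full filter (direct-convolution model)."""
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0.0] * (n_up - 1))
    out = []
    for i in range(len(stuffed)):
        out.append(sum(h * stuffed[i - j] for j, h in enumerate(taps) if i - j >= 0))
    return out


def interpolate_polyphase(samples, taps, n_up):
    """Same result from n_up short subfilters that share the unstuffed input data."""
    branches = [taps[p::n_up] for p in range(n_up)]
    out = []
    for n in range(len(samples)):      # one "tick" of the input clock
        for branch in branches:        # n_up outputs per tick, read out in order
            out.append(sum(c * samples[n - k] for k, c in enumerate(branch) if n - k >= 0))
    return out


if __name__ == "__main__":
    taps = [float(i + 1) for i in range(12)]   # arbitrary coefficients
    data = [1.0, -2.0, 3.0, 0.5, 4.0]
    a = interpolate_direct(data, taps, 4)
    b = interpolate_polyphase(data, taps, 4)
    print(all(abs(x - y) < 1e-12 for x, y in zip(a, b)))
```

The polyphase version never stores or multiplies by the inserted zeros, so each output costs one-quarter as many multiply-accumulates when N=4.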

An obvious simplification of this technique is to compute only those outputs that decimation does not discard. You can use a simple formula to select the correct polyphase-filter branch for each requested output. To produce the Kth output sample, the branch index B_K is B_K = (K × M) mod N.

An integrator that advances by M every time an output is computed, with a modulus of N, can produce the branch-selection index. You can then simplify conversion to an FIR filter with N sets of coefficients, where you choose the correct set of coefficients for each output sample according to the above equation.
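The modulo-N integrator is simple to model. The sketch below generates the branch index for each output sample; it also tracks which input sample the filter should be aligned to, by advancing an input pointer on each carry out of the integrator. That pointer bookkeeping is an assumption added for illustration and is not spelled out in the text.

```python
def branch_schedule(n_up, m_down, num_outputs):
    """Return a list of (input_index, branch_index) pairs, one per output sample."""
    schedule = []
    input_index = 0
    phase = 0                              # the modulo-N integrator
    for _ in range(num_outputs):
        schedule.append((input_index, phase))
        phase += m_down                    # advance by M for every output
        input_index += phase // n_up       # a carry moves on to the next input sample
        phase %= n_up                      # keep the integrator within 0..N-1
    return schedule


if __name__ == "__main__":
    for k, (n, b) in enumerate(branch_schedule(4, 3, 5)):
        print("output", k, "input", n, "branch", b)
```

For N/M = 4/3, the branch sequence is 0, 3, 2, 1, 0, which matches (K × M) mod N for K = 0 through 4.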


Asynchronous sample-rate-conversion theory

You can extend the described interpolation/decimation model to asynchronous conversion by simply making the interpolation ratio extremely large (Fig 8). As the interpolation ratio increases, the correlation between adjacent interpolated output samples becomes so high that the worst-case change between one interpolated sample and the next is smaller than an LSB of the desired output word length. The resampling simply grabs the "nearest" interpolated output when the user-supplied output clock requests an output sample.

By using such a high interpolation ratio, the interpolated signal resembles an analog signal for all practical purposes, and the resampling process at the new output rate is no different from sampling the original analog signal at the new rate.
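A software model of this "nearest interpolated sample" idea might look like the sketch below. It makes several assumptions that are not from the text: coefficients are computed on the fly from a Hann-windowed sinc instead of being read from a stored table, the filter spans 16 input samples, and the output rate is not lower than the input rate (so no extra band-limiting is needed). All names are illustrative.

```python
import math


def asrc_output(samples, t, num_phases=65536, half_width=8):
    """One output sample at time t (in input-sample units), nearest-phase model."""
    n = math.floor(t)
    # Snap the fractional position to the nearest of num_phases branches,
    # mimicking the choice of the closest interpolated sample.
    branch = round((t - n) * num_phases)
    frac = branch / num_phases
    acc = 0.0
    for k in range(n - half_width + 1, n + half_width + 1):
        if 0 <= k < len(samples):
            x = (n + frac) - k             # distance from input sample k
            sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
            window = 0.5 + 0.5 * math.cos(math.pi * x / half_width) if abs(x) < half_width else 0.0
            acc += samples[k] * sinc * window
    return acc


if __name__ == "__main__":
    source = [math.sin(2 * math.pi * 0.05 * k) for k in range(100)]
    step = 44100 / 48000                   # output clock faster than input clock
    for i in range(3):
        t = 20 + i * step
        print(round(asrc_output(source, t), 4), round(math.sin(2 * math.pi * 0.05 * t), 4))
```

Because the output time t can be any real number, nothing in this model requires the two rates to be related by a small N/M ratio; the only quantization is the snap to the nearest of the 65,536 phases.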

To achieve 16-bit worst-case performance, you must use an interpolation ratio of 65,536. You must use the polyphase-filter hardware-reduction techniques to avoid a huge computation rate. The length of each polyphase filter does not increase with the interpolation ratio, but the number of parallel polyphase-filter branches becomes very large for high-quality interpolation. For example, the AD1890 single-chip ASRC (see box, "A silicon solution for ASRC") uses 65,536 sets of polyphase filters, each with 64 coefficients. The IC uses a compression technique to reduce the required internal ROM to store these coefficients and uses on-the-fly decompression hardware during multiply/accumulate.
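A quick calculation shows why coefficient compression matters. The sketch below simply multiplies out the figures quoted above, assuming one polyphase branch per step of a 16-bit output word; the function name is illustrative.

```python
def coefficient_table_size(output_bits=16, taps_per_branch=64):
    """Return (number of branches, total uncompressed coefficients)."""
    branches = 2 ** output_bits            # one branch per interpolated phase
    return branches, branches * taps_per_branch


if __name__ == "__main__":
    print(coefficient_table_size())
```

Stored without compression, 65,536 branches of 64 coefficients each would require 4,194,304 coefficient words.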

Previously, we saw how you can generate a sequence of polyphase branch indexes in the synchronous case where N and M are explicitly given. In the asynchronous case, you must generate the sequence of polyphase branch-index numbers based on measurements of the relative phases of the user-supplied input and output clocks. In theory, you can directly measure clock-arrival times, but this implies that a high-frequency (several-gigahertz) clock signal is available to achieve the required measurement accuracy.

The AD1890 ASRC chip eliminates this requirement by digitally filtering a series of low-accuracy clock-arrival measurements. The process results in better than 200-psec time resolution, even though the master clock runs at only 16 MHz. In addition, the low cutoff frequency (less than 3 Hz) effectively rejects jitter on the user-supplied input and output clocks.
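The principle of trading averaging time for timing resolution can be illustrated with a simple simulation. The sketch below is not the AD1890's actual filter, which the text does not describe; it assumes a one-pole low-pass filter applied to sample-period measurements that are truncated to whole master-clock cycles. The function name and the smoothing constant are illustrative.

```python
def estimate_period_cycles(master_hz, sample_hz, num_periods=20000, smoothing=1.0 / 1024):
    """Estimate a sample-clock period, in master-clock cycles, from coarse counts."""
    estimate = None
    previous_edge = 0
    for k in range(1, num_periods + 1):
        # Coarse measurement: master-clock count at the k-th sample-clock edge,
        # truncated to a whole cycle (integer math keeps the simulation exact).
        edge = (k * master_hz) // sample_hz
        measured = edge - previous_edge    # always a whole number of cycles
        previous_edge = edge
        if estimate is None:
            estimate = float(measured)
        else:
            estimate += smoothing * (measured - estimate)   # one-pole low-pass
    return estimate


if __name__ == "__main__":
    est = estimate_period_cycles(16_000_000, 44100)
    print(est, 16_000_000 / 44100)
```

Each raw measurement here is either 362 or 363 cycles of the 16-MHz clock, yet the filtered estimate settles to within a small fraction of one 62.5-nsec master-clock period of the true value.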

When the sample rates are dynamically changing, the digitally filtered internal estimate of the input/output relative clock phases may lag behind the actual external-clock phases because of the low cutoff frequency of the digital filter. You could use extra RAM as an elastic store buffer in this case, allowing the ASRC to track dynamically changing sample rates without error.

A silicon solution for ASRC
In the past, solutions for asynchronous sample-rate conversion have used multiple DSP chips along with external dedicated hardware, often using very high-frequency clocks to measure the arrival times of external clocks to high accuracy. These solutions often carry prices exceeding $5000. As a result, system designers were often obliged to live with the problem of occasional dropped or repeated samples or in some cases had to go through an extra stage of A/D to D/A conversion.

All-digital monolithic ICs that implement a complete stereo, asynchronous sample-rate converter are now available. The AD1890 is the first such chip to enter the field. While quite complex on the inside, the chip is very easy to use from the designer's perspective (Fig A). Input data comes in serial form to the device, using a simple 3-wire interface. The L/R clock acts as the frame sync; it indicates the start of a serial word, as well as whether the received data is for the left or right channel. This signal is used internally to sense the input sample rate.

The output signal also appears in serial form. The framing signal for the output L/R clock is an input to the chip. Again, this represents the difference between synchronous and asynchronous converters—an asynchronous converter accepts an external sample-rate clock as an input to the device and produces data on demand. The chip requires a master clock with a 16- to 20-MHz frequency for proper operation. It's not necessary to synchronize this clock to either the input or the output sample frequencies. The input word length is 20 bits, which easily accommodates digital audio signals from most professional equipment. Word lengths as great as 24 bits are available at the output.

The AD1890 follows real-time variations in the input or output sample frequencies with no error. The chip features a programmable acquisition time: In "fast" mode, the device rapidly tracks large step changes in the input or output sample frequencies, and in "slow" mode, it tracks only gradual variations. One advantage inherent in the slow mode is that jitter in either the input or the output L/R clock is heavily filtered, allowing accurate rate conversion even in the presence of large amounts of clock jitter.

In slow mode, the device attenuates jitter frequencies above 3 Hz by 6 dB/octave. This feature is important in systems in which the clock must be recovered from the serial input stream. In many cases, the high-frequency losses that occur when serial data is sent over long cables result in intersymbol interference in the data stream. When recovering a clock signal from such a signal (using a PLL), large amounts of jitter may occur in the recovered clock. This jitter can cause excess noise and distortion if you use the jittery clock to clock a D/A converter. By allowing a crystal-generated local clock to establish a stable internal clock, you can avoid this type of degeneration.
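To get a feel for what 6 dB/octave above 3 Hz means at typical jitter frequencies, you can evaluate an ideal first-order low-pass response. The sketch below assumes exactly that idealized response with a 3-Hz corner; the actual transfer function of the device may differ, and the function name is illustrative.

```python
import math


def jitter_attenuation_db(jitter_hz, corner_hz=3.0):
    """Gain in dB of an ideal first-order low-pass filter at jitter_hz."""
    return -10.0 * math.log10(1.0 + (jitter_hz / corner_hz) ** 2)


if __name__ == "__main__":
    for f in (3.0, 30.0, 300.0, 3000.0):
        print(f, "Hz:", round(jitter_attenuation_db(f), 1), "dB")
```

Under this assumption, jitter at 300 Hz is attenuated by about 40 dB and jitter at 3 kHz by about 60 dB.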

Fig B shows the IC's performance in signal quality. When converting from 48 to 44.1 kHz (two common audio rates), the THD plus noise (THD+N) is better than -105 dB. For any sample-rate converter, the worst-case input signal is a full-scale signal at the band edge. For a 20-kHz, full-scale input signal, the THD+N degrades to -95 dB, or 16-bit equivalent quality. These distortion products scale with the input level: If you reduce the input by 20 dB, the distortion components decrease by 20 dB as well.


Robert Adams is manager of audio development at Analog Devices Inc, where he's worked for five years. His principal duty is to design ICs for the multimedia, consumer, and professional-audio markets. A member of the IEEE and a Fellow in the Audio Engineering Society, Robert has a BSEE degree from Tufts University, Medford, MA. His spare-time pursuits include family activities and playing the saxophone in a jazz band.




Copyright © 1995 EDN Magazine. EDN is a registered trademark of Reed Properties Inc, used under license.