Using FPGAs in consumer electronics

Phuttachad Thiencharoenwong - September 04, 2008

It may seem odd to be reading about analog-audio and -video processing when the industry is currently focusing on the analog transmission switch-off and the digital-broadcasting successor. However, for legacy reasons and because of the increase in demand from markets that have later switch-off dates, such as Eastern Europe, India, and South America, it is likely that analog decoding will be a requirement well into the next decade.

Several typical blocks constitute analog-front-end-acquisition circuits (Figure 1). You generally achieve the SIF (sound-intermediate frequency) and video decoding through the use of one or two ICs from a number of manufacturers. The system requires external memory if it includes a 3-D comb filter, as well as, perhaps, a baseband-audio-stereo ADC. Repetitions of these blocks exist in multituner-system configurations.

**Cores versus ICs**

Video- and audio-decoding IP (intellectual-property) cores conceptually allow the replacement of these ICs with FPGAs. Area-efficient implementations enable you to incorporate the functions in cost-effective FPGA products, such as Altera’s Cyclone and Xilinx’s Spartan families. One way to keep down the cost of the FPGA is to allow different FPGA images for different standards, with the main control processor downloading the appropriate image depending on the detected standard. Image sets might comprise, for example, NTSC (National Television System Committee) and BTSC (Broadcast Television Systems Committee) or PAL (phase-alternating line) and NICAM (near-instantaneous companded-audio multiplexed). Using the main control processor to initialize the FPGA also offsets the cost of an EEPROM to hold the FPGA image.
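The per-standard image selection described above can be sketched as a small lookup that the main control processor might run. This is only an illustration: the `IMAGE_SETS` table, the file names, and `select_image` are hypothetical, and both standard detection and the actual bitstream download are outside its scope.

```python
# Hypothetical mapping from a detected (video, audio) standard pair to an
# FPGA image. The file names are placeholders, not real deliverables.
IMAGE_SETS = {
    ("NTSC", "BTSC"): "decoder_ntsc_btsc.bit",
    ("PAL", "NICAM"): "decoder_pal_nicam.bit",
}


def select_image(video_standard: str, audio_standard: str) -> str:
    """Return the FPGA image name for the detected standard pair."""
    key = (video_standard.upper(), audio_standard.upper())
    try:
        return IMAGE_SETS[key]
    except KeyError:
        raise ValueError(f"no FPGA image for {key}") from None
```

Because the processor holds the images, the FPGA needs only the logic for one standard pair at a time, which is what keeps the device small.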

The FPGA does not incorporate significant analog functions; an external IC must perform the analog-to-digital conversion. Such ICs are available from a number of manufacturers, such as Analog Devices, which offers the AD9981. The ADC also performs synchronization-separation and clock-generation functions and can interface to CVBS (composite-video-broadcast-signal), YC (S-video), YPbPr (luminance plus blue- and red-difference component), and RGB (red/green/blue) inputs. An external ADC usually achieves a better SNR (signal-to-noise ratio) than that of an integrated alternative, but the cost of implementation is about 50% higher than that of a standard-IC option. However, integrating additional functions into the FPGA can mitigate this cost increase.

For example, it is common for recorders to allow users to record one channel while watching another or for televisions to have picture-in-picture features. Such capabilities require two digital tuners, two analog tuners, two decoders, and, therefore, two sets of ICs in the non-FPGA implementation. However, you can embed two decoders within one small FPGA, thereby limiting the incremental cost to one additional ADC. To reduce area, it may be possible to limit decoder options for the secondary channel by, for example, omitting SECAM (séquentiel-couleur-avec-mémoire) decoding. The total cost of the FPGA option, as a result, will be similar to or even less than that of the IC option.

**Tuner alternatives**

Next, consider how to include additional functions in the FPGA. The conventional metal-“can” analog tuner contains two main functions: an RF tuner and a demodulator. You can reduce the cost of the tuner by alternatively performing demodulation digitally within the FPGA. The video ADC can digitize the video IF. The cost savings from eliminating the demodulator will double in the case of dual-tuner products. This option also opens the possibility of using a silicon tuner, whose performance is now adequate for most consumer products. The silicon tuner, with its all-region capability, brings possibilities for cost-effective, all-region consumer products with corresponding manufacturing-cost savings. Metal-can tuners, conversely, are usually region-specific for unit-cost reasons.

Another FPGA-enabled possibility is to add local intelligence through the integration of a small microcontroller core for front-end functions, thereby offloading that requirement from the main controller. For example, a microcontroller can issue a simple high-level instruction to tune channels. Advantages of this approach include faster initial channel-acquisition times and the ability to power down more of the main controller under standby conditions; this approach has appeal in today’s “green” society.

You can incorporate additional functions for higher-end consumer products by integrating the SDI (serial-digital-interface) receiver. FPGAs now offer LVDS (low-voltage-differential-signaling) receivers and clock-recovery circuits, which permit the implementation of an SDI receiver with just an external cable-equalizer IC. This approach is considerably cheaper than using a complete SDI-receiver IC.

IC-based video-decoder options usually also strip off the information in the vertical-blanking area of the signal, but they leave the processing of that portion of the signal for the main processor. Again, having some specialized hardware and a local controller within the FPGA permits, for example, closed-caption decoding. Decoding closed-caption or Teletext subtitles is common in televisions, but less so in PVRs (personal video recorders) and DVD (digital-video-disc) recorders. However, superimposing the subtitle on the output BT656 video, as an FPGA can do, allows the recording of subtitles for deaf and partially deaf users. You could embed the decoded subtitles in the MPEG (Moving Picture Experts Group) metadata for a more elegant approach.

**Comb implementations**

Another potentially integrated function is an improved comb filter. Most consumer products use a 2-D comb, which leaves undesirable decoding artifacts that also consume valuable bandwidth if compressed. A good 3-D comb can noticeably improve quality, but it requires an external-memory device. Available 3-D-comb-IC options use a symmetrical frame comb, which for PAL requires four frames×625 lines×1440 pixels×8 or 10 bits=28.8 or 36 Mbits, thereby necessitating an external-memory device, usually an SDRAM. You can halve this requirement by using an asymmetrical comb, which has the advantage of requiring no compensating audio delay. You can halve it again for PAL by using PAL modifiers in the comb architecture. An additional halving of memory budget can occur with the use of a field comb.
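The memory arithmetic above is easy to check with a few lines of code. The sketch below uses a hypothetical helper, `comb_memory_bits`, and treats each halving as halving the number of stored frames; that reading is an assumption consistent with the figures quoted in the text.

```python
def comb_memory_bits(frames: float, lines: int = 625,
                     samples: int = 1440, bits: int = 8) -> int:
    """Storage in bits for a PAL comb spanning the given number of frames."""
    return int(frames * lines * samples * bits)


# Symmetrical four-frame comb at 8 and at 10 bits per sample.
symmetric_8 = comb_memory_bits(4, bits=8)     # 28,800,000 bits
symmetric_10 = comb_memory_bits(4, bits=10)   # 36,000,000 bits

# Successive halvings at 8 bits: asymmetrical comb, PAL modifiers, field comb.
asymmetric = comb_memory_bits(2)              # 14,400,000 bits
pal_modified = comb_memory_bits(1)            # 7,200,000 bits
field_comb = comb_memory_bits(0.5)            # 3,600,000 bits
```

The 36-Mbit figure corresponds to 10-bit samples; with 8-bit samples the same comb needs 28.8 Mbits, and three halvings bring that down to about 3.6 Mbits.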

A single-tap, 262-line comb for NTSC or a 312-line comb for PAL gives excellent results, although it does not permit perfect decoding of complex still frames. However, for real-life images, the wide aperture of the PAL-frame comb often means that the 3-D comb will fail on moving images and thereby regress to line-comb mode. A field comb offers a good compromise between memory requirements and performance. For PAL, a field comb requires 312 lines×1440 pixels×8 bits minimum=3.6 Mbits. Unfortunately, this capacity is still too large to enable the use of integrated memory in small, cost-effective FPGAs.

However, it is possible to implement a pseudo-3-D comb that fits within the memory requirements of even the smallest FPGAs. Normally, three comb modes are available to the decoder: the 3-D comb, a 2-D line comb, and a simple mode that is either a lowpass or a notch filter. The decoder chooses the appropriate comb mode, basing its decision on signals that indicate failure conditions, such as motion, which prevents the 3-D comb from operating, or diagonals, which prevent the 2-D comb from operating. An often-assumed priority is that 3-D is always the preferred mode and that simple mode is the least desirable mode. However, this prioritization is not always true; on flat areas of color, for example, simple mode is often the best mode, as measured by highest SNR or least visible artifacts. The reason for this seeming disparity is that the wide aperture of the 3-D comb means that clock jitter can leave a residual subcarrier; just 1 nsec of clock jitter across an 80-msec tap distance can result in this scenario.
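To get a feel for why 1 nsec of jitter matters, the following rough sketch estimates the leftover subcarrier when a comb subtracts two otherwise identical taps whose timing differs by the jitter. It rests on assumptions not stated in the text: a PAL subcarrier of 4.43361875 MHz, equal subcarrier amplitude at both taps, and a residual expressed relative to the amplitude at a single tap before any gain normalization.

```python
import math

PAL_SUBCARRIER_HZ = 4_433_618.75  # assumption: standard PAL subcarrier


def residual_subcarrier_db(jitter_s: float,
                           subcarrier_hz: float = PAL_SUBCARRIER_HZ) -> float:
    """Residual after subtracting two unit sinusoids offset by jitter_s."""
    phase_error = 2 * math.pi * subcarrier_hz * jitter_s  # radians
    residual = 2 * math.sin(phase_error / 2)               # difference amplitude
    return 20 * math.log10(residual)


jitter = 1e-9            # 1 nsec
tap_distance = 80e-3     # 80 msec
stability_ppm = jitter / tap_distance * 1e6   # 0.0125 ppm over the tap distance
level_db = residual_subcarrier_db(jitter)     # roughly -31 dB
```

Under these assumptions, a timing error of about 0.0125 ppm across the tap distance leaves a subcarrier only about 31 dB down, which can be visible on flat areas of color.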

You can, therefore, consider a comb architecture in which the 2-D and simple modes decode an image wherever possible. Such an approach uses the 3-D comb only when neither the simple nor the 2-D modes can operate. To create this design, the system stores in memory a 1-bit positional flag along with the frame- and field-delayed data for that flag position. On the subsequent frame or field, the design can then choose a 3-D-comb aperture for these positions in addition to the 2-D and simple modes. A number of tested images, including the ubiquitous Snell and Wilcox moving-zone plate, reveal that 3-D is rarely necessary under these constraints. It is still necessary to determine whether the 3-D comb is failing. It fails, for example, if substantial motion is in the image, especially if you use a frame aperture. A considerably reduced memory requirement is the benefit, however.
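The per-position decision can be sketched as follows. This is a minimal illustration, not the actual design: the input signals (`simple_ok`, `line_comb_ok`, `motion`, `flagged`), the preference for simple mode over 2-D when both work, and the fallback to the line comb when nothing works are all assumptions made for the example.

```python
from enum import Enum


class CombMode(Enum):
    SIMPLE = "simple"      # lowpass or notch filter
    LINE_2D = "2-D"        # line comb
    FIELD_3D = "3-D"       # frame or field comb


def choose_mode(simple_ok: bool, line_comb_ok: bool,
                motion: bool, flagged: bool):
    """Return (mode, flag_for_next_pass) for one picture position.

    flagged is the 1-bit positional flag stored on the previous frame or
    field; when it is set, delayed data for this position is available.
    """
    if simple_ok:
        return CombMode.SIMPLE, False
    if line_comb_ok:
        return CombMode.LINE_2D, False
    # Neither cheap mode works here: request delayed data for the next pass.
    if flagged and not motion:
        return CombMode.FIELD_3D, True
    # Assumed fallback while no usable delayed data exists or motion is present.
    return CombMode.LINE_2D, True
```

For example, `choose_mode(False, False, False, True)` selects the 3-D aperture, whereas the same position on its first appearance (`flagged=False`) falls back to the line comb and sets the flag for the following frame or field.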

A 1-bit plane is necessary for the 3-D comb aperture. For example, a 312-line field comb needs 312×720=224,640 bits. Also, at the flagged locations in which both the 2-D and the simple modes fail, the design needs memory to store the delayed information. Surprisingly, this memory budget is limited to just 32 kbytes and still produces substantial improvements in the decoded image. In other words, for a large range of images, there are just 32,000 pixels for which the 3-D comb is essential. This combined amount of memory is available even in small FPGAs, allowing the pseudo-3-D comb to operate on at least one decoder channel.
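Adding the two figures gives the total on-chip memory for the pseudo-3-D comb. The short calculation below assumes that 32 kbytes means 32×1024 bytes and compares the result with the conventional 8-bit PAL field comb (312 lines×1440 pixels×8 bits).

```python
FLAG_PLANE_BITS = 312 * 720          # 1-bit positional flag per pixel: 224,640
DELAYED_DATA_BITS = 32 * 1024 * 8    # assumption: 32 kbytes = 32,768 bytes
PSEUDO_3D_BITS = FLAG_PLANE_BITS + DELAYED_DATA_BITS   # 486,784 bits

FIELD_COMB_BITS = 312 * 1440 * 8     # conventional field comb: 3,594,240 bits
saving_ratio = FIELD_COMB_BITS / PSEUDO_3D_BITS        # a little over 7x
```

On these numbers, the pseudo-3-D comb needs a little under 0.5 Mbit, roughly one-seventh of the storage of a full field comb.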

**Memory budget**

A modified comb architecture yields another improvement in decoding quality. VCRs (videocassette recorders) use a color-under technique for recording. The luminance information FM-modulates onto an approximately 3.5-MHz carrier, whereas the chrominance information, along with the luminance information it contains, remodulates onto a carrier with a frequency of approximately 600 kHz to avoid problems recording at higher frequencies. Because of this remodulation of the chrominance component onto another carrier, this approach loses the phase relationship with the original subcarrier. It’s therefore no longer possible to comb the higher-frequency component of the signal to separate the luminance and chrominance. The system therefore discards the higher-frequency information, resulting in a low-resolution image.

Most comb filters use a complementary-baseband comb. Demodulating the composite video with a phase-locked subcarrier and lowpass-filtering the result produces the chrominance outputs. Combing the chrominance signals then removes the luminance components; remodulating and adding the chrominance components produces a clean chrominance-only signal, centered on and in phase with the subcarrier frequency. Subtracting this signal from the composite video produces clean luminance.
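The complementary-baseband comb can be illustrated with a toy model. The sketch below is not a broadcast-accurate decoder; it assumes an NTSC-like subcarrier whose phase flips 180° on each line, a sample rate of four times the subcarrier, a two-line comb, a static picture, and a crude moving-average lowpass. All function names are hypothetical.

```python
import math

SAMPLES_PER_CYCLE = 4  # assumption: sample rate is 4x the color subcarrier


def carrier(line, k):
    """(sin, cos) of the subcarrier at sample k; phase flips 180 deg per line."""
    phase = 2 * math.pi * k / SAMPLES_PER_CYCLE + math.pi * line
    return math.sin(phase), math.cos(phase)


def encode(luma, u, v, line):
    """Build one composite line from luma samples and constant chroma (u, v)."""
    return [y + u * carrier(line, k)[0] + v * carrier(line, k)[1]
            for k, y in enumerate(luma)]


def lowpass(x):
    """Circular moving average over one subcarrier cycle (a crude lowpass)."""
    n = len(x)
    return [sum(x[(i + j) % n] for j in range(SAMPLES_PER_CYCLE))
            / SAMPLES_PER_CYCLE for i in range(n)]


def demodulate(samples, line):
    """Multiply by the phase-locked subcarrier, then lowpass to baseband."""
    u = lowpass([2 * x * carrier(line, k)[0] for k, x in enumerate(samples)])
    v = lowpass([2 * x * carrier(line, k)[1] for k, x in enumerate(samples)])
    return u, v


def comb_decode(prev_samples, cur_samples, line):
    """Return (luma, u, v) for the current line using a two-line comb."""
    u0, v0 = demodulate(prev_samples, line - 1)
    u1, v1 = demodulate(cur_samples, line)
    # Cross-luminance alternates in sign from line to line, so averaging
    # the baseband chroma of adjacent lines cancels it.
    u = [(a + b) / 2 for a, b in zip(u0, u1)]
    v = [(a + b) / 2 for a, b in zip(v0, v1)]
    # Remodulate the clean chroma and subtract it from the composite input.
    chroma = [u[k] * carrier(line, k)[0] + v[k] * carrier(line, k)[1]
              for k in range(len(cur_samples))]
    luma = [s - c for s, c in zip(cur_samples, chroma)]
    return luma, u, v


# Luma with fine detail exactly at the subcarrier frequency, plus flat color.
luma_in = [0.5 + 0.2 * math.sin(2 * math.pi * k / SAMPLES_PER_CYCLE)
           for k in range(16)]
prev_line = encode(luma_in, 0.1, -0.05, line=0)
cur_line = encode(luma_in, 0.1, -0.05, line=1)
luma_out, u_out, v_out = comb_decode(prev_line, cur_line, line=1)
```

In this idealized case, the uncombed demodulator mistakes the fine luminance detail for color, whereas the combed path recovers both the chrominance and the full-bandwidth luminance.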

You can use a variant of this comb architecture, wherein the remodulation and the subtraction from the composite source occur before the comb filter (Figure 2). This technique produces simple chrominance components and a clean, notched luminance. Combing the chrominance components to produce combed-chrominance outputs and then subtracting them from their simple versions leaves you with the high-frequency-luminance signal. Remodulating it and adding it to the notched luminance effectively fills in the information missing from the notch. The advantage of this architecture is the ability to add the high-frequency luminance to the notched luminance using a different phase of the subcarrier. Remodulating the high-frequency luminance with the original phase relationship reconstructs the original waveform with a higher bandwidth.

You base adjustment of the phase of the second remodulator on information you derive from a highpass filter. You can determine the correct phase for the addition of the luminance signals by detecting the improved sharpness of the luminance signal for particular subcarrier phases. This approach is possible using a highpass filter, a square-law function to rectify the highpass-filter output and increase the weighting to the slope of the signals, and an accumulator to measure the amount of edge detail in the image. Correct phase adjustment is an iterative process, albeit a satisfactory one because the phase changes at a slow rate. The additional phase offset adds to the subcarrier phase you derive from the input signal. This method can substantially improve the perceived sharpness of any VCR source and reduce the discrepancy between the performance of the VCR and that of the DVD. It is ideal, for example, in transcriptions.
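The sharpness measurement and the iterative phase search can be sketched as follows. This is a simplified illustration under stated assumptions: a first difference stands in for the highpass filter, and `measure` is a hypothetical callable that decodes with a given phase offset and returns the accumulated edge energy.

```python
def edge_energy(luma):
    """Highpass (first difference), square-law rectify, then accumulate."""
    return sum((b - a) ** 2 for a, b in zip(luma, luma[1:]))


def adjust_phase(phase, step, measure):
    """One hill-climbing step: keep the phase offset with the sharpest result.

    measure(phase) is assumed to return the edge energy of the luminance
    decoded with that additional subcarrier-phase offset.
    """
    candidates = (phase - step, phase, phase + step)
    return max(candidates, key=measure)


# A sharp edge accumulates more energy than a softened one.
sharp = edge_energy([0.0, 0.0, 1.0, 1.0])   # 1.0
soft = edge_energy([0.0, 0.5, 1.0, 1.0])    # 0.5
```

Calling `adjust_phase` once per field or frame lets the offset drift toward the sharpest setting, which suits a phase that changes only slowly.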

Incorrect timing of luminance and chrominance is common with some video sources, such as VCRs. Noncoincidence of vertical edges leads to a lack of clarity in the resultant video image, specifically with regard to smearing of vertical edges. Many video decoders offer a YC-delay control, which allows the user to vary the comparative delay. However, the problem with this manual-control method is that the user needs to know the delay to be able to compensate for it, effectively rendering the control redundant. Such adjustment is difficult to do visually, especially for an unskilled user, and it requires specific video-test patterns to properly perform the adjustment. The delay can also vary over time, especially for mechanical sources, such as VCRs. However, the FPGA can incorporate methods not available in off-the-shelf IC decoders to automatically and periodically retime the luminance and chrominance.

**Audio approaches**

Audio arrives either as an SIF (sound-intermediate-frequency) signal from the tuner, carrying NICAM or BTSC, for example, or as baseband stereo accompanying the composite- or component-video inputs. ICs demodulate and decode the SIF signal; similarly, IP cores that perform these functions are available for FPGAs. As mentioned, the FPGA can take advantage of different configurations for different standards to reduce the area impact, and it can incorporate additional decoders for multituner products.

IC options may integrate the baseband-audio ADC with the video decoder. However, this approach usually results in a worse SNR than that of an external-IC alternative. It is also possible to use one of the video ADCs for the audio. By substantially oversampling the audio and then decimating the result, you can theoretically achieve the additional needed bits without using an external device, leading to further cost savings (Figure 3).
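The theoretical gain from oversampling can be estimated with the usual rule of half a bit per doubling of the sample rate. The sketch below assumes white, uncorrelated quantization noise and no noise shaping; the 27-MHz ADC rate and 48-kHz audio rate are illustrative assumptions rather than figures from the design.

```python
import math


def extra_bits(adc_rate_hz: float, audio_rate_hz: float) -> float:
    """Resolution gained by oversampling and decimating (no noise shaping)."""
    oversampling_ratio = adc_rate_hz / audio_rate_hz
    return 0.5 * math.log2(oversampling_ratio)


gain = extra_bits(27e6, 48e3)   # about 4.6 bits for a ratio of 562.5
```

On these assumptions, an 8- or 10-bit video ADC would gain roughly 4.5 bits of effective audio resolution; real results depend on how closely the converter's noise matches the white-noise model.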