

Design Feature: July 4, 1996

Fast and flexible: FIR filters in reconfigurable logic

Doug Conner,
Technical Editor

Reconfigurable logic lets you implement DSP functions in hardware, providing a mix of speed and design flexibility that isn't available in DSPs or mask-programmed ASICs. Filter-design tools make design tasks easy.

Designing DSP operations into reconfigurable logic offers flexibility and quick turnaround that approach those of a µP or DSP, yet also offers nearly the same speed as a mask-programmed ASIC. It sounds almost too good to be true. But how does it work in practice? To answer that question, I tried a hands-on design using a reconfigurable logic device and an FIR-filter design tool to obtain first-hand information on the process.

In the last issue of EDN (Reference 1), I related my experience designing a reconfigurable data-acquisition system. Part of the data-acquisition system was a reconfigurable filter-function block at the front of the system that filters data with one of four lowpass or bandpass digital filters. The data-acquisition system gave me a test bed for the 16- and 32-tap filters I had designed.

DSP tools for PLDs

I used Altera's DSP Design Kit (Table 2), a free add-on tool to its MAX+PLUS II design software, for the project. The first release of the DSP Design Kit lets you design with 8-, 16-, 24-, 32-, and 64-tap serial or parallel FIR filters. The design kit also includes a 3×3 convolution operation for video applications.

Table 2—Components and sources
Manufacturer            Product                                Price

Altera Corp             Flex 10K50 PLD                         $350 (100), available now;
San Jose, CA                                                   $199 (1000), available the end of 1996
(408) 894-7000          MAX+PLUS II Magnum design software     $4995 (other versions start at $495)
fax (408) 944-0952
http://www.altera.com

Signalogic              DSPower Block Diagram and Hypersignal  $1250 (demo software is free)
Dallas, TX
(214) 343-0069
fax (214) 343-0163

Altera's DSP Design Kit is designed for use with the company's Flex 8000 and 10K logic families. These static-RAM (SRAM)-based reconfigurable-logic devices, like other SRAM-based PLDs, offer infinite reprogrammability. You can take advantage of the reconfigurable capability to implement different functions at different times. The data-acquisition system I designed reconfigures the device to provide different filters at different times.

Altera's DSP Design Kit is one of several DSP design products available for reconfigurable logic devices and for PLDs in general. Not only do other PLD vendors offer DSP design tools and libraries, but logic-synthesis companies also offer software to help you generate a DSP design for programmable logic. You should be able to perform DSP operations on any PLD that offers the necessary logic capacity. The tools available for a particular PLD family, of course, affect the ease with which you can implement DSP functions and the resulting performance. The PLD's architecture also affects the performance of the design and whether you have certain capabilities, such as reconfigurability. (See Reference 2 for a discussion of the products available for DSP design in programmable logic.)

The FIR filter
Figure A shows a conventional four-tap FIR filter. The digital data clocks into the register at the left and shifts through the registers with each clock cycle. Although a single register is shown for each tap, the registers are as wide as the input data word.
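As a rough software model of the structure just described, the sketch below clocks samples through a shift register and forms one multiply per tap. The function name and the coefficient values are illustrative assumptions, not taken from the figure.

```python
from collections import deque

def fir_direct(samples, coeffs):
    """Direct-form FIR: one shift-register stage and one multiply per tap."""
    # Shift register, newest sample first; maxlen drops the oldest sample.
    taps = deque([0] * len(coeffs), maxlen=len(coeffs))
    out = []
    for x in samples:
        taps.appendleft(x)  # one clock cycle: every stored sample moves one stage
        out.append(sum(h * d for h, d in zip(coeffs, taps)))
    return out

# Feeding in an impulse reads the (hypothetical) coefficients back out in order.
impulse_response = fir_direct([1, 0, 0, 0, 0], [3, 5, 5, 3])
```

In hardware, each register stage is as wide as the data word and all of the multiplies happen in the same clock cycle; the Python loop only mirrors the arithmetic.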

A linear-phase-response FIR filter has symmetric filter coefficients, so h1=h4 and h2=h3. The symmetric coefficients let you change the architecture to that of Figure B, trading half the multiplies for addition operations. Altera's DSP Design Kit uses the architecture of Figure B to implement the filters.
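The trade of multiplies for additions can be checked numerically. The sketch below is a minimal model, assuming an even number of taps and made-up coefficient values: it adds each mirrored pair of samples first, multiplies once per pair, and compares the result with a plain one-multiply-per-tap sum.

```python
def fir_folded(samples, half_coeffs):
    """Symmetric FIR, even tap count: add mirrored taps, then multiply once per pair."""
    n = 2 * len(half_coeffs)
    taps = [0] * n  # shift register, newest sample at index 0
    out = []
    for x in samples:
        taps = [x] + taps[:-1]
        acc = 0
        for i, h in enumerate(half_coeffs):
            acc += h * (taps[i] + taps[n - 1 - i])  # pre-add, then a single multiply
        out.append(acc)
    return out

def fir_reference(samples, coeffs):
    """Plain convolution sum with one multiply per tap, for comparison."""
    return [sum(h * samples[k - i] for i, h in enumerate(coeffs) if k - i >= 0)
            for k in range(len(samples))]

data = [12, -7, 100, 33, -128, 127, 0, 5]
folded = fir_folded(data, [3, 5])              # two multiplies per output
reference = fir_reference(data, [3, 5, 5, 3])  # four multiplies per output
```

Both lists come out identical, which is the property that lets the folded structure stand in for the conventional one whenever h1=h4 and h2=h3.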

Although it is possible to design filters with the filter coefficients stored in registers, which allows you to vary the filter coefficients while the device is running, the DSP Design Kit stores the filter's coefficients in a ROMlike fixed format. This method offers the highest speed for the company's PLDs.

The drawback to storing the filter coefficients in a ROMlike structure is that you can only change the filter coefficients by reconfiguring the device. You can reconfigure the device in as little as 60 msec, but each set of coefficients requires a new design. You must compile the new design to generate the programming file, an operation that took me approximately 30 minutes for each design. Although it is practical to keep a number of filter configurations in mass storage and then reconfigure the device as necessary, you probably would not find it feasible to incrementally alter the filter coefficients in a widely varying adaptive-filter scheme with hundreds of configurations.

As I started this project, I had a basic understanding of the theory behind digital filters, but I had no practical experience actually designing with them. I found the tools were simple enough to use that I could jump right in despite my limited background and lack of experience in digital-filter design. The more you know, the faster you are able to develop an appropriate filter design for your application. As is the case in all design operations, you have to juggle a variety of price and performance tradeoffs to reach the best solution for your application.

Step one: filter coefficients

To start implementing a digital filter, you need a list of filter-tap coefficient values. To obtain the filter coefficients, you need to use one of the filter-generation-software tools available from DSP-design-tool vendors. For this project, I used Signalogic's DSPower Block Diagram and Hypersignal software (Table 2).

An appropriate DSP-design tool helps you evaluate the filter's response characteristics before you start the implementation process. You'll have to spend time making performance tradeoffs among a variety of parameters and characteristics, such as the number of filter taps, cutoff frequencies, roll-off characteristics, bits of resolution, and ripple in the passband and stopbands.

Some filter-design tools only show you the ideal floating-point-filter characteristics. Although this is a good start, the response of the filter may be significantly affected when you implement the filter with limited fixed-point resolution, as I found out after implementing a few filters. Floating-point-filter implementation is a wasteful extravagance in any system requiring high speed. Fixed-point implementations can offer all the resolution you need by using the necessary number of bits and scaling values appropriately. Using more taps or more bits of resolution than you really need wastes logic resources or reduces speed.
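To get a feel for how limited coefficient resolution shifts a filter's response, the sketch below compares an ideal floating-point lowpass filter with the same filter after rounding its taps to 8 bits. Everything here is an assumption for illustration: the Hamming-windowed design, the 0.1 cutoff, and the scaling rule are not the coefficients or method used in the article.

```python
import cmath
import math

def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc lowpass; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def quantize(taps, bits):
    """Round taps to signed integers of the given width, largest tap at full scale."""
    scale = (2 ** (bits - 1) - 1) / max(abs(t) for t in taps)
    return [round(t * scale) / scale for t in taps]

def gain(taps, freq):
    """Magnitude response at freq, given as a fraction of the sample rate."""
    return abs(sum(h * cmath.exp(-2j * math.pi * freq * n) for n, h in enumerate(taps)))

ideal = lowpass_taps(16, 0.1)
fixed = quantize(ideal, 8)
# Largest gain difference between the two versions from 0 to half the sample rate.
worst_error = max(abs(gain(ideal, f / 200) - gain(fixed, f / 200)) for f in range(101))
```

Rerunning `quantize` with other bit widths shows how quickly the gap closes as resolution increases, a check that takes seconds rather than a full compile and simulation.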

Even if the filter-design tool you use doesn't let you represent the results of the actual bit resolution you are using, you can see the results later in the device simulation. Although this method is workable (in fact, I used it), I recommend using a filter-design tool that lets you see the results of the actual bit resolution you plan to use. The process of repeatedly changing the design, synthesizing the filter for the PLD, and simulating the PLD is too time-consuming to be an efficient design method. Synthesizing a filter and simulating it on the 60-MHz Pentium-based computer I used took approximately 30 minutes.

At the time I was using Signalogic's DSPower Block Diagram and Hypersignal software, I wasn't able to view the filter characteristics using my selected bit resolution, although that capability is now available. Signalogic's software let me change filter characteristics and see the results with a turnaround time of less than 1 minute. Because I was seeing only the ideal floating-point-filter performance, not the results of the actual bit resolution I would use in the filter, I ended up with greater ripple in the passband of the lowpass filters in Figures 1 and 3 than I wanted. If I had been able to see the effects of the bit resolution I was using in 1 minute, instead of requiring a 30-minute compile and simulation, it would have helped me improve the filter's performance.

Following the FIR cookbook

Once you know the number of taps on your digital filter and the coefficients for each tap, you are ready to go through the semiautomatic process of implementing the filter in a PLD using the DSP Design Kit.

The DSP Design Kit software accepts floating-point, filter-tap coefficients and automatically scales them to the correct value for the bit resolution you select. Using the scaled coefficients, the software generates a file of precomputed partial products for implementing the digital filter.
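One common way to build a filter from precomputed partial products is a lookup-table scheme, often called distributed arithmetic. The sketch below shows the idea under stated assumptions: the scaling rule, the table layout, and all function names are hypothetical, and the article does not say that the kit organizes its tables this way.

```python
def scale_coeffs(coeffs, bits):
    """Scale floating-point taps to signed integers that fit in `bits` bits."""
    scale = (2 ** (bits - 1) - 1) / max(abs(c) for c in coeffs)
    return [round(c * scale) for c in coeffs]

def build_lut(int_coeffs):
    """Precompute the sum of coefficients for every on/off pattern of the taps."""
    n = len(int_coeffs)
    return [sum(c for i, c in enumerate(int_coeffs) if (addr >> i) & 1)
            for addr in range(2 ** n)]

def lut_dot(lut, samples, bits):
    """Coefficient-sample dot product using only lookups, shifts, and adds."""
    acc = 0
    for b in range(bits):
        addr = 0
        for i, s in enumerate(samples):
            addr |= ((s >> b) & 1) << i  # gather bit b of every sample
        # In two's complement, the top bit carries a negative weight.
        weight = -(1 << b) if b == bits - 1 else (1 << b)
        acc += weight * lut[addr]
    return acc

coeffs = scale_coeffs([0.05, 0.25, 0.25, 0.05], 8)
lut = build_lut(coeffs)
window = [-128, 127, -3, 64]  # 8-bit two's complement samples
result = lut_dot(lut, window, 8)
```

Because the table holds fixed sums of coefficients, changing a coefficient means rebuilding the table, which fits the ROM-like storage described earlier.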

You then specify whether you want to use a fully parallel or fully serial filter. The parallel filter provides the highest speed and uses the most logic resources. All additions and multiplies occur in parallel, making the data rate equal to the clock speed. The serial filter is slower, but consumes fewer gates. (All of my design samples used the parallel filter implementations.)

Next, you select a filter with the number of taps you want and specify the parameters. The filter parameters you specify are: the input data resolution, the internal resolution level (number of bits), the output resolution, the coefficient resolution, whether the filter is symmetrical or asymmetrical, and whether the filter is pipelined or not. You add the symbol for the filter you've just created to a schematic, providing the input signals (data input, clock, and clear) and the output signal (data output).

At this point, the design is almost complete. You need to select the device for which you will compile the design—in my case, the 50,000-gate Flex 10K50 (Table 2). You also need to select the logic-synthesis style you want to use. For all the designs, I selected the fast-logic-synthesis mode. The compilation process creates a completely placed and routed design. When compilation is complete, you are ready to download the program to the PLD and run the device. You can also simulate the device.

The DSP Design Kit software includes the ability to generate a swept-sine-wave-simulation input file. Using this automatically generated stimulus file, you can use the simulator in the MAX+PLUS II design software to see the filter's response vs frequency. Writing the output of the simulator to a table file, you can use the plotting software included in the DSP Design Kit to plot the filter output. Figures 1a through 4a show the simulation plots for the filters I created. The filters have a frequency sweep that ranges from 0 to ½ the clock frequency.
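The same kind of sweep is easy to mimic in software when you want a quick look before compiling. The sketch below is only an analogy to the kit's stimulus file, whose format the article does not describe: it steps a tone from 0 to ½ the clock frequency and records the peak filter output at each step. The coefficient values, step count, and amplitude are illustrative assumptions.

```python
import math

def fir(samples, coeffs):
    """Plain FIR convolution sum."""
    return [sum(h * samples[k - i] for i, h in enumerate(coeffs) if k - i >= 0)
            for k in range(len(samples))]

def swept_response(coeffs, steps=50, length=200, amplitude=127):
    """Peak output at evenly spaced frequencies from 0 to half the clock rate."""
    response = []
    for step in range(steps + 1):
        freq = 0.5 * step / steps  # fraction of the clock frequency
        # A cosine keeps the 0 and half-clock tones from sampling only zeros.
        tone = [round(amplitude * math.cos(2 * math.pi * freq * n)) for n in range(length)]
        settled = fir(tone, coeffs)[len(coeffs):]  # skip the start-up transient
        response.append((freq, max(abs(y) for y in settled)))
    return response

sweep = swept_response([1, 2, 3, 4, 4, 3, 2, 1])
```

Plotting the second element of each pair against the first gives an amplitude-vs-frequency curve of the same general kind as the simulation plots.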

Once you've created the digital-filter design, you can download the program to the PLD to create a running filter or, as was the case in my designs, use the filter as a hierarchical design element in a larger design. Because the filter input and output use two's complement numbers, you may need to convert the input for compatibility with your data source. I was using offset binary directly from an ADC, so I inserted conversion logic in front of the filter and then converted back to offset binary after the filter.
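Converting between offset binary and two's complement amounts to inverting the most significant bit, which is why the logic costs almost nothing. The sketch below models both directions for an 8-bit word; the function names are hypothetical, and the article does not show the actual conversion logic used in the design.

```python
def offset_to_twos(code, bits=8):
    """Offset-binary code -> signed value: flip the MSB, read as two's complement."""
    flipped = code ^ (1 << (bits - 1))
    return flipped - (1 << bits) if flipped >> (bits - 1) else flipped

def twos_to_offset(value, bits=8):
    """Signed value -> offset-binary code: take the bit pattern, flip the MSB."""
    return (value & ((1 << bits) - 1)) ^ (1 << (bits - 1))

lowest = offset_to_twos(0)      # bottom of the ADC range
midscale = offset_to_twos(128)  # middle of the ADC range
highest = offset_to_twos(255)   # top of the ADC range
```

Here code 0 maps to -128, code 128 to 0, and code 255 to 127, and applying `twos_to_offset` to each result returns the original code.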

The design I created used the filter in series with a complete data-acquisition system and a parallel port interface to a PC. All digital functions were implemented on the Flex 10K50 device, including data storage using the device's on-chip RAM. The filter, integrated with a data-acquisition system, let me test the filter response with a swept sine wave, just as in the simulation. I wrote a Visual Basic program to control the reconfigurable-logic device and to display the data from the data-acquisition system. You can see the actual filter performance in Figures 1b through 4b.

The real filters

Looking at Figures 1 through 4, you can see that the real filters and simulated filters have similar characteristics. In fact, they should have only a few differences. First, the swept sine waves used for simulation and for the actual device were slightly different. The simulated sine wave swept from a frequency of 0 to ½ the clock frequency, using 1000 data points. The actual filter was tested with an arbitrary waveform generator sweeping from 50 kHz to 5 MHz using 8000 data points. (The clock frequency was 10 MHz in the tests.) Furthermore, the arbitrary waveform generator is not synchronized to the filter clock, so every time a test runs, the amplitude vs frequency varies slightly, especially at higher frequencies, where each cycle of the sine wave is defined by two or three data points. Distortion in the analog data path and conversion also alters the results.

Table 1 lists configuration and performance statistics for the sample filters I designed and tested. The maximum clock speed shown in the table ranges from 31.6 to 42.0 MHz. The maximum clock speed values were taken from the design timing analyzer. I only functionally tested the filters at clock speeds as high as 20 MHz.

Table 1—FIR-filter designs

Filter     Filter    Number   Input       Coefficient  Internal    Output      Logic  % of Flex    Maximum
           type      of taps  resolution  resolution   resolution  resolution  cells  10K50 logic  clock speed
                              (bits)      (bits)       (bits)      (bits)      used   cells used   (MHz)

Filter 1   Lowpass   16       8           8            12          12          521    18           40.5
Filter 2   Bandpass  16       8           8            10          10          466    16           42.0
Filter 2A  Bandpass  16       8           8            17          17          599    20           31.6
Filter 3   Lowpass   32       8           8            12          12          1003   34           40.0
Filter 4   Bandpass  32       8           8            12          12          1005   34           37.5

Note that the filters I designed were generated automatically. There was absolutely no hand-tweaking of the design or the layout. I provided the filter coefficients and worked through the cookbook directions. If you know what a digital filter is, you know enough to begin using this tool.

I experimented with varying the internal and output resolution to reduce logic usage. To carry the full resolution of the 8-bit input and coefficients through the filter, the output should be 17 bits. The 17-bit value comes from adding two 8-bit inputs (a 9-bit result) and then multiplying the result by an 8-bit coefficient. I found the 10- to 12-bit values adequate for my designs, because I was only using an 8-bit output from the filter. From the 10- or 12-bit output, I selected the sign bit and the appropriate seven MSBs.
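The 9-bit and 17-bit figures can be confirmed by checking the extreme values at each stage. The sketch below follows the single pre-add and multiply described in the text; the helper names are hypothetical, and the truncation helper assumes that the seven bits kept are the ones directly below the sign bit.

```python
def signed_bits(lo, hi):
    """Smallest two's complement width that holds every integer in [lo, hi]."""
    bits = 1
    while not (-(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits

lo8, hi8 = -128, 127  # range of an 8-bit two's complement value

# Adding two 8-bit inputs.
pre_add_bits = signed_bits(2 * lo8, 2 * hi8)

# Multiplying that sum by an 8-bit coefficient.
products = [a * c for a in (2 * lo8, 2 * hi8) for c in (lo8, hi8)]
product_bits = signed_bits(min(products), max(products))

def top_bits(value, width, keep=8):
    """Keep the sign bit and the next keep-1 MSBs of a width-bit value (truncating)."""
    return value >> (width - keep)

largest_12bit_as_8 = top_bits(2047, 12)
smallest_12bit_as_8 = top_bits(-2048, 12)
```

The widths come out as 9 and 17 bits. The seventeenth bit is needed only for the single product of the two most negative values, which helps explain why a narrower internal width can be adequate in practice.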

For the 16-tap bandpass filters in Table 1, filter 2A, with 17-bit internal and output resolution, required 20% of the device, and filter 2, with 10-bit internal and output resolution, required 16% of the device. Speed also dropped from 42 to 31.6 MHz when using the higher resolution.

Note that the filters don't use any of the Flex 10K50's 20,480 bits of embedded RAM. The device has plenty of room for other logic in addition to the 16- and 32-tap filters I tested.

PLDs vs DSPs

It's also interesting to compare the speed of the filter implementation with that of a fast DSP processor. According to Greg Goslin, DSP program manager at Xilinx (San Jose, CA), a 66-MHz DSP has a theoretical maximum throughput of less than 4M samples/sec for an 8-bit, 16-tap filter. The full-resolution 8-bit, 16-tap filter implemented in the reconfigurable logic device (Table 1) uses 20% of the device and has a throughput of 31.6M samples/sec, offering about eight times the speed of the DSP processor. The Flex 10K50 costs $350 (100) now and will cost $199 (1000) at the end of 1996. If we assume an 80% utilization level (ignoring the 20,480 bits of embedded RAM that are also available) for the PLD, then the cost for the filter implementation is $50, compared with approximately $20 for the DSP. Designs that don't require the large logic capacity of the Flex 10K50 can use a smaller, less expensive device. If you need the speed, then the PLD wins on speed, power, pc-board space, and cost.
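The speed and cost figures in this comparison follow from simple ratios, sketched below. The choice of the $199 price is an inference: it is the price that reproduces the quoted $50 under the stated 80% utilization assumption.

```python
pld_rate_msps = 31.6  # full-resolution 16-tap filter, from Table 1
dsp_rate_msps = 4.0   # upper bound quoted for the 66-MHz DSP
speedup = pld_rate_msps / dsp_rate_msps

device_price = 199.0  # assumed: the 1000-piece Flex 10K50 price
filter_share = 0.20   # fraction of the device the filter occupies
usable_share = 0.80   # assumed utilization level for the whole device

# The filter takes 20 of the 80 usable percentage points, or one quarter of the price.
filter_cost = device_price * filter_share / usable_share
```

The results are a speedup of about 7.9 and a filter cost of about $49.75. Because the DSP figure is an upper bound, the speed ratio is a lower bound on the PLD's advantage.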

Although I didn't explore the serial implementations of digital filters, they should provide low to moderate speeds and require even fewer resources in the PLD, possibly making the PLD attractive for even lower speed, lower cost applications.

PLDs in general, including one-time programmable devices, still offer many of the same advantages for DSP operations as the reconfigurable logic devices I used. Those advantages are processing speed and the ability to modify a design and test it quickly. The time required to go from development to production is minimal.

Tools and technology improve with time, but old horror stories are slow to die. I still occasionally hear from designers who complain about not being able to use the automatic design tools for complex PLDs (including FPGAs) and having to manually place and route designs for performance. Because I did not try any manual placement and routing, or even try a low-level schematic design of a filter, I cannot speak about the performance improvements that might be possible using those techniques. I can say that the automatic tools are easy to use and provide designs with performance that should be sufficient for a wide variety of DSP applications.

Looking ahead
Until recently, most DSP functions have been performed on DSPs, µPs, or mask-programmed ASICs. High-density PLDs provide another path for implementing DSP functions, one that offers greater processing speed than DSP and µP devices, although high-density PLDs still offer less speed than mask-programmed devices. Expect to see a major shift in DSP hardware implementations as more designers catch on to the price and performance advantages PLDs offer for implementing DSP operations, while retaining design flexibility.


You can reach Technical Editor Doug Conner at (805) 461-9669, fax (805) 461-9640, or e-mail at edndconner@mcimail.com


References

  1. Conner, Doug, "Reconfigurable logic: built in adaptability," EDN, June 20, 1996, pg 42.

  2. Levy, Markus, "DSP design tools target FPGAs," EDN, June 20, 1996, pg 75.




Copyright © 1996 EDN Magazine. EDN is a registered trademark of Reed Properties Inc., used under license.