Design Feature: June 20, 1996
Reconfigurable logic gives you the potential to modify hardware designs on the fly. You can implement a hardware design in an appropriate SRAM-based PLD and then reprogram the device in only milliseconds to change the hardware functions it performs. The same reconfigurable-logic device can serve different functions at different times, potentially reducing the size, cost, and power consumption of your overall design (Reference 1).
But are hardware speed and software flexibility compatible? Can a PLD reprogram in real time to implement hardware functions on demand? And how do you design with these devices? To find out, I designed a reconfigurable data-acquisition system that requires signal-filtering capabilities for a variety of circumstances. The system reconfigures as necessary by implementing one of several signal-filtering options in an SRAM-based PLD. The different options are essentially separate designs (EDN1, EDN2, EDN3, EDN4, and EDN5) that load and execute as required.
Designing my data-acquisition system was much like creating any PLD-based design, except that I'd need a PLD that I could reconfigure quickly, while the system was running. The PLD I chose is Altera's Flex 10K50 (see box, "The reconfigurable PLD"). With its on-chip RAM, the 10K50 implements my entire data-acquisition system except for the analog-input path, an ADC, and a clock oscillator (see box, "The reconfigurable system"). System control and display are on a PC connected directly to the PLD via a parallel port.
| The reconfigurable PLD |
|---|
|
The reconfigurable logic device I used for this project was Altera's Flex 10K50 SRAM-based PLD. Nominally rated at 50,000 gates, this device has 2880 logic cells, 3184 total registers, and 20,480 bits of RAM that are arranged in 10 embedded-array blocks (EABs). You can configure each 2048-bit EAB with words of 1-, 2-, 4-, or 8-bit width. You can generate longer or wider RAM blocks by combining EABs. The 10K50 is available in a variety of packages having as many as 310 user I/O pins. The package in this project was a 403-pin PGA. As an SRAM-based device, the 10K50 requires programming every time it powers up. You can also reprogram any time you want to change the functions it performs. The device accommodates a variety of programming modes, including serial or 8-bit parallel programming and synchronous or asynchronous modes. It can also self-load from EPROM. The parallel synchronous mode is the fastest; running at 10 MHz, it can program the PLD's 609,000 bits in about 61 msec. Although I had originally planned to use a parallel-port connection to program my system's PLD in less than a second, I changed that plan when I decided to use Altera's demo board, which conveniently provided all the hardware I needed for the project. Instead of programming through the parallel port, I used a serial-port programming scheme and Altera's BitBlaster serial-download cable, which attaches to the demo board with a 10-pin connector. I could program or reprogram the PLD at 115,200 baud in 11 seconds.
|
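The programming times quoted in the box can be checked with simple arithmetic. The Python sketch below assumes the device takes in one configuration bit per clock in the 10-MHz synchronous mode (an assumption that reproduces the 61-msec figure) and that the serial link uses 8N1 framing, or 10 wire bits per byte (also an assumption). The function names are made up for illustration.

```python
# Rough configuration-time estimates for an SRAM-based PLD.
CONFIG_BITS = 609_000  # configuration size quoted in the article


def sync_config_time_s(bits, clock_hz):
    """Time to load `bits` when one configuration bit is taken in per clock (assumed)."""
    return bits / clock_hz


def uart_config_time_s(bits, baud, wire_bits_per_byte=10):
    """Framing-only lower bound for a serial link (8N1 framing is an assumption)."""
    n_bytes = -(-bits // 8)  # ceiling division
    return n_bytes * wire_bits_per_byte / baud


print(f"10-MHz synchronous: {sync_config_time_s(CONFIG_BITS, 10e6) * 1e3:.1f} ms")
print(f"115,200-baud serial: {uart_config_time_s(CONFIG_BITS, 115_200):.1f} s")
```

The serial estimate comes out lower than the 11 seconds measured in the project; the difference is presumably download-cable and protocol overhead, which this sketch does not model.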
Altera agreed to loan me a computer system and design software for the duration of my project. The system was a 60-MHz Pentium-based PC with 32 Mbytes of RAM and a 17-inch monitor. The software was Altera's MAX+PLUS II (Magnum edition) running under Windows 3.1 (see box, "The design software").
| The design software |
|---|
|
The design software I used for this project was Altera's MAX+PLUS II, Magnum edition. This package includes a graphical (schematic) editor, a text editor, a waveform editor, a symbol editor, a floorplan editor, a compiler, a simulator, a timing analyzer, a message processor, and software for programming a PLD. MAX+PLUS II is available for Unix workstations and for PCs running Windows NT, Windows 95, or Windows 3.1. A tutorial included with the software walks you through a sample design to familiarize you with all the software tools. The tutorial takes you through examples of graphical design, text-based AHDL (Altera Hardware Description Language) design, and waveform design. The software also supports VHDL and Verilog design, although I didn't use these capabilities. Even though most of my experience has been in schematic design, I found AHDL easy to use. You can use Boolean equations, typical software constructs (such as case statements and if-then statements), and truth tables. The design tools in MAX+PLUS II provide automatic settings for synthesizing your design, plus plenty of manual controls if you want or need detailed control over the synthesis process. I speak of synthesis because even a schematic design will use some level of synthesis to combine functions efficiently into the logic elements on a PLD. For example, you can select one-hot state machines (one register per state) or encoded states. Some selections default to different settings, depending on which device family you're using. The Flex 10K family is a register-rich device and, therefore, defaults to one-hot state machines instead of using encoded states, which are useful for saving registers on simple PLDs. The simulator is easy to use. You can work from vector files or directly with the waveform editor. You can easily create clocks, group values, and counting values with the waveform editor to provide input stimulus for testing designs. 
The software tutorial also shows you how to use the floorplanner, something I never used after running the tutorial. The floorplanner is tightly linked with the other design tools, so you can easily locate specific logic in the floorplanner from whatever tool you're using.
|
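To make the one-hot versus encoded trade-off concrete, here is a small Python sketch that counts the state registers each style needs. The seven-state example matches the size of the data-acquisition state machine in this project; the function names are illustrative and have nothing to do with MAX+PLUS II itself.

```python
from math import ceil, log2


def one_hot_registers(n_states):
    """One-hot encoding: one register per state."""
    return n_states


def encoded_registers(n_states):
    """Binary-encoded states: just enough registers to number every state."""
    return max(1, ceil(log2(n_states)))


def one_hot_codes(n_states):
    """Bit patterns for each state; exactly one bit is set in each."""
    return [format(1 << i, f"0{n_states}b") for i in range(n_states)]


n = 7
print(one_hot_registers(n), encoded_registers(n))  # 7 3
print(one_hot_codes(3))  # ['001', '010', '100']
```

One-hot costs more registers but usually needs less decoding logic in front of each register, which is why it suits a register-rich architecture.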
I began with a trip to Altera, where I picked up the computer system with software already installed. I then headed back to my office to begin work the following day. For a day and a half, I worked through a tutorial included with the MAX+PLUS software. After this basic introduction, I was ready to start designing.
Checking out the library
My first design activity was another learning experience, investigating a design approach that uses a Library of Parameterized Modules (LPM). I'd been interested in the LPM approach for quite a while, and this project seemed like a good place to check it out. LPM design doesn't limit you to the fixed 7400-series functions that make up the bulk of most schematic-level libraries for digital design; instead, it uses a relatively small collection of highly flexible functions to create a wide variety of customizable graphical building blocks. You specify parameters to custom configure functions such as counters, adders, multipliers, registers, and comparators to exactly the size and operations you need.
There isn't much to learning LPM design, as it turns out. If you can design with schematics, you can design with LPM. For designers who don't want to give up graphical design, yet can't live with the inefficiency of creating just a few gates at a time, LPM is an alternative worth considering. I was impressed enough from my brief investigation that I ended up using it to design all counters, multiple-bit registers, digital compare functions, and RAM (see box, "LPM design").
| LPM design |
|---|
|
In my project, I used LPM design exclusively for designing all counters, multiple-bit registers, digital compare functions, and RAM. Later, in another project, I used it to create 28-bit-wide add and subtract functions and a 24-bit×256-word FIFO. In most cases, LPM performed admirably. Although LPM offers standard gate constructs and would be very useful for extremely wide gates, I used ordinary schematic symbols for the two- or three-input gates that I needed for clock-enable logic. These few gates and a few lone flip-flops were the only cases where I used ordinary schematic design elements.
Small library, custom functions
LPM is very easy to use. The actual library of components is small, because each function type can generate a huge variety of function variations. Altera's LPM implementation, for example, has only 25 function types. In designing with LPM, you first select a device type. A pop-up parameter box then lets you set the parameters you want and, to avoid cluttering your schematic, lets you deselect the parameters you don't want. On-line help documentation lets you check default values for inputs you don't want to use. The parameterized entry method acts like an automatic checklist to make sure you've considered all options for the function you're using. Invariably, during the course of a design, I'll need to change some parameters. A flip-flop might need a clock enable, for example, or a counter width might need to change. It's a simple matter to bring up the parameter box and perform the necessary edits.
Increasing design efficiency
Text-based languages such as VHDL and Verilog are on the rise, but I often hear that schematic-level design is still popular with some designers using PLDs. The reason usually cited is that you give up too much performance when using synthesis tools and hardware-description languages like Verilog and VHDL. To compound the problem, these designers say, logic-synthesis tools also use more logic elements on a device.
I'm not able to verify or disprove these assertions, because I'm not using VHDL or Verilog. What I can say is that if you favor the use of schematic design methods, LPM offers features that can make your design work go faster and, in my opinion, better. LPM allows a PLD vendor to create optimum implementations for different bit widths of each function. As functions like counters, comparators, and adders grow in size from a few to many bits, the optimum implementation for speed or area may change. An LPM synthesis tool can use the best implementation for a particular function and bit width, and you don't even have to know about it. For example, Altera's Flex 10K architecture uses fast-carry logic when cascading certain functions for speed. As a designer, you don't need to know anything about the fast-carry logic as long as the LPM synthesis software uses it properly. The benefits of LPM depend on the quality of implementation by your PLD company. As a bonus, other logic-synthesis tools can take advantage of the intelligence built into LPM logic-synthesis tools. By using LPM-generated functions, these logic-synthesis tools can better achieve architecturally optimized designs. My LPM design experience wasn't completely without problems, though. The LPM multiplexer was difficult to use, so I ended up using an AHDL (Altera Hardware Description Language) multiplexer, which gave me the added benefit of being able to use a nonstandard data-select word. On the positive side, using LPM with the Flex 10K50's on-chip RAM was extremely simple. The on-chip RAM is completely self-timed, so there were no timing concerns. All you have to do is get the data and address to arrive in time to meet the setup- and hold-time requirements. The rising edge of the clock writes the data when the write-enable line is high. RAM doesn't get any simpler than this. I highly recommend LPM. It won't solve all your design needs, but, neither will text-based design tools.
|
After working with the easy LPM design tools, it was time to move on to the more difficult task of state-machine design. I usually find state machines one of the more complex parts of a design, even for relatively simple systems. So even though my data-acquisition system had only seven states, I spent a lot of time drawing state diagrams to determine how I wanted it to work.
I chose the truth-table method for implementing my state machine, primarily because I'd been impressed with the simplicity of it when I'd run through the MAX+PLUS II software tutorial. Starting with a template and referring to the tutorial example, I produced the code for my own state machine. It's mostly a matter of declaring inputs and outputs and then filling in a table. On the left side, you designate previous states as "high," "low," or "don't care"; on the right side, you list the corresponding next state.
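The truth-table idea can be sketched in a few lines of Python (not AHDL). The state names, input signals, and transitions below are hypothetical and do not reproduce the actual seven-state design; the point is only to show how rows with "don't care" entries map a current state and inputs to a next state.

```python
# Each row: (current state, start, trigger, full, next state).
# "x" marks a don't-care input. All names here are hypothetical.
TABLE = [
    ("IDLE",    1,   "x", "x", "ARMED"),
    ("IDLE",    0,   "x", "x", "IDLE"),
    ("ARMED",   "x", 1,   "x", "ACQUIRE"),
    ("ARMED",   "x", 0,   "x", "ARMED"),
    ("ACQUIRE", "x", "x", 1,   "DONE"),
    ("ACQUIRE", "x", "x", 0,   "ACQUIRE"),
    ("DONE",    "x", "x", "x", "IDLE"),
]


def next_state(state, start, trigger, full):
    """Return the next state from the first table row that matches."""
    inputs = (start, trigger, full)
    for row in TABLE:
        row_state, *pattern, target = row
        if row_state == state and all(
            p == "x" or p == v for p, v in zip(pattern, inputs)
        ):
            return target
    raise ValueError(f"no table row matches {state} with inputs {inputs}")


def run(stimulus, state="IDLE"):
    """Clock the machine once per stimulus tuple and record every state."""
    trace = [state]
    for step in stimulus:
        state = next_state(state, *step)
        trace.append(state)
    return trace


print(run([(1, 0, 0), (0, 1, 0), (0, 0, 0), (0, 0, 1), (0, 0, 0)]))
# ['IDLE', 'ARMED', 'ACQUIRE', 'ACQUIRE', 'DONE', 'IDLE']
```

Every "x" in the table is an input the synthesis tool is free to ignore for that row, so each one generally means less combinatorial logic in front of the state registers.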
The state-machine design went smoothly. The design checker found only some misspelled signal names, a missing semicolon, and a few other problems. When my design finally passed the check, I compiled it and got a state machine.
To check out the state machine, I started the simulator, pulled up a waveform editor, and selected the state machine's inputs and outputs I wanted displayed (all of them in this case). Using the waveform editor, I then created the input conditions I wanted to test. When I ran the simulator, the results appeared on the waveform display, complete with the state names I used in the text design. The state machine didn't work perfectly this first time, but it was easy to figure out its problems. I went back to the text, modified it, recompiled, and ran the simulation again. After a few iterations, I had a working "first-cut" state machine.
At this point, I was only three days into my project and already had a functional preliminary state machine and a few schematics in various stages of completion. I decided to take the next step and compile my design specifically for the 403-pin Flex 10K50 PLD. Previously, I'd been compiling with automatic device assignment, a default option that results in a compilation for the smallest PLD that the design will fit in.
When I tried to compile for the 10K50, though, the system crashed. After a few reboots and attempts to recompile, I was convinced I had a real problem. The system kept crashing, always at about the same point in the procedure. I called up Greg Steinke, the Altera application engineer assigned to answer my questions, and explained the problem. Steinke had me try a few possible fixes, all without success.
Stale software
Finally, we discovered the problem: beta-version design software. I'd picked up my design system from Altera two days before the official release of MAX+PLUS II release 6.1. Once we realized I wasn't using the official release, I e-mailed my design to Steinke and he compiled it on his system without a problem. I could still use the Flex 10K auto device assignment on my own system, but for the 10K50, I'd have to wait for the new software release to come out of production.
Unfortunately, Altera's production software gets distributed on CD-ROMs, and the system I was using didn't have a CD-ROM drive. That meant an unexpected task of selecting, ordering, and installing a drive. Two days later, and after only a few minor technical problems, I had the new drive installed and working.
In the meantime, I continued using the Flex 10K auto device assignment. I used LPM design for the most part, but in the design of a decoder and a multiplexer, I found that text-based design, using Altera's hardware-description language (AHDL), could simplify my work (see box, "HDL design").
| HDL design |
|---|
|
AHDL was also easy to learn and use. In one situation, I needed a 4-to-1 multiplexer with a 4-bit-wide data path. The multiplexer uses nonstandard select values to make it compatible with a control word from the PC that controls my system. With AHDL, generating the multiplexer was as simple as declaring signals and adding a case statement for the functions. I also used AHDL to design a decoder. The decoder clocks output data into registers, making it slightly more complex than the multiplexer. However, it was still easy for me to design, even as a first-time AHDL user. Later, in a separate project, I used AHDL to create an encoder for changing a 24-bit integer into a 12-bit floating-point value. This design really drove home the utility of a text-based design language for some functions. Using just over a page of code that very systematically described the encoder, I was able to create a design that I think would have taken much longer to create and would have been more prone to design errors using schematic design.
|
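As a rough illustration of the multiplexer described in the box, the following Python sketch (a behavioral model, not AHDL) selects one of four 4-bit inputs using nonstandard select codes. The article doesn't list the real control-word values, so the codes in `SELECT_MAP` are invented for the example, as is the choice to drive zero for unlisted codes.

```python
# Hypothetical select codes; the real control-word values aren't given in the article.
SELECT_MAP = {0b0001: 0, 0b0010: 1, 0b0100: 2, 0b1000: 3}


def mux4(select, inputs):
    """4-to-1 multiplexer with a 4-bit data path.

    Select codes not listed in SELECT_MAP drive 0, much like the default
    branch of a case statement (this default is an assumption).
    """
    if len(inputs) != 4:
        raise ValueError("mux4 needs exactly four inputs")
    index = SELECT_MAP.get(select)
    if index is None:
        return 0
    return inputs[index] & 0xF  # keep the data path 4 bits wide


data = [0x1, 0x2, 0xA, 0xF]
print(hex(mux4(0b0100, data)))  # 0xa
print(hex(mux4(0b0011, data)))  # 0x0
```

A lookup keyed on arbitrary codes plays the same role as a case statement here: the select word doesn't have to count 0, 1, 2, 3.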
I didn't make any general assumptions about whether text-based or LPM design is superior; I simply used whichever appeared to be most efficient. You can look at efficiency from the implementation point of view (how long it will take to design, verify, and make changes) or from the performance point of view (how many logic elements your design will use and how fast it will run). In the end, you'll only know which approach is better by using both, which is not something most designers, including me, usually have time for.
In this project, using a huge PLD, efficient gate utilization wasn't a top priority. Still, I think my implementations are fairly efficient. I often checked after design compilations to see how many logic elements I'd used. And I always checked the timing to see how fast a block of logic would perform.
For example, my state machine design initially ran at 34 MHz, according to the MAX+PLUS II timing analyzer. This speed was sufficient for my design, but not as fast as I had expected. Examining the state-machine truth table, I found I had constrained the states more than necessary. Going carefully through my design and changing some conditions to "don't cares," I then reduced the combinatorial logic in front of the system's registers. By the time I pared the logic to what I believed was a minimum, the timing analyzer told me I'd increased the speed to 69 MHz.
Checking the timing was easy. The timing analyzer in MAX+PLUS II has a mode for synchronous designs that tells you both the longest delays between registers and the maximum allowable clock rate for your design. It's a useful tool for keeping track of performance as you make design changes. The timing analyzer also provides a setup-and-hold matrix and a delay matrix, which are also useful for asynchronous designs. The timing-analysis tools are cross-linked, too, so you can jump right into other design tools to evaluate a design or make changes.
The design software also lets you set up timing constraints and logic-synthesis options to help you control timing and to trade off area for speed. I didn't learn how to use the timing-constraints options, but I did use the fast logic-synthesis option, which provides about twice the speed of normal logic synthesis. The fast option uses carry chains to speed up certain functions, but it can use up a lot of a PLD's capacity if you use it on all circuits.
Design checkers in MAX+PLUS II caught errors that might have been difficult to debug later. Once, for example, when I was changing some parameters for an LPM counter, I inadvertently disabled the synchronous-load function while editing a parameter box. Had the design checker not then warned me of unused inputs, I probably would have had the eventual problem, in simulation, of determining why a counter wasn't behaving properly.
Eight days into the project, I was working on the last major block for design EDN1, a data-path block that includes 16,384 bits of RAM organized as 2048 8-bit words. This memory requires eight of the 10 2-kbit RAM blocks on the Flex 10K50 PLD, which I'd be using to implement my design, but it also fits into a Flex 10K40 using all the 10K40's RAM blocks. To avoid the system crashes that still occurred when I compiled for the 10K50, I decided to compile for the 10K40 just to keep making progress. I surmised that the crashes I had been experiencing weren't specific to the 10K50, but simply related to the large number of computer operations involved in compiling for this larger device.
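The block count follows from the EAB size. The Python sketch below estimates how many 2048-bit blocks a RAM needs by tiling it from blocks configured at each of the four available word widths and taking the smallest count. It is a simplified model (assumed for illustration) that ignores any extra logic needed to combine blocks.

```python
EAB_BITS = 2048            # bits per embedded-array block
EAB_WIDTHS = (1, 2, 4, 8)  # word widths one block can be configured for


def eabs_needed(words, width):
    """Fewest EABs needed to build a words x width RAM (simplified tiling model)."""
    best = None
    for w in EAB_WIDTHS:
        depth = EAB_BITS // w          # words one block holds at this width
        columns = -(-width // w)       # blocks side by side for the word width
        rows = -(-words // depth)      # blocks stacked for the address range
        count = columns * rows
        best = count if best is None else min(best, count)
    return best


print(eabs_needed(2048, 8))  # 8
```

Eight blocks leaves two spare on a 10-block device and uses every block on a device with only eight.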
Compiling for the smaller device didn't help, though. When I tried to compile a top-level design that included all my subdesigns, the computer still kept crashing. I tried all the fixes Steinke had suggested before, without success. It was a Saturday, so I knew I couldn't get any help. What's worse, Steinke was scheduled to be out on travel all the following week.
Waiting for software
On Monday, I called up Altera's regular help line and spoke with someone named Charles. We tried some fixes without success; Charles, like Steinke, thought I needed the new software that still hadn't started shipping. After a few more calls to various people at Altera, I concluded that I'd probably be without the new software for another week.
By now, the crashing compiler had me feeling a little shaky about getting the project done on time. After some reflection, I decided my best course of action was to plan for success and keep as many things moving forward as I could. I'd need control and display software running before I could do anything with a PLD, so this seemed like a good time to get that out of the way. I spent the rest of the day laying out a software plan.
For the control software, I took a two-level approach. The first level contains low-level controls for all upload and download functions. These low-level controls provide both the foundation for my high-level function calls and an interface for low-level debug operations. It gives me access to individual registers' contents and data bits. The high-level control form provides oscilloscope-type controls and display.
My first priority in the low-level software, one that I'd addressed a few weeks earlier in preparation for my project, was determining how to work around a shortcoming of Visual Basic. Visual Basic appealed to me for its apparent simplicity in programming and creating a graphical user interface, but it doesn't provide access to a PC's parallel port except through a high-level printer driver. And, because I'd be using a PC to control my data-acquisition system, I needed low-level read and write control.
Fortunately, EDN contributing editor David Shear pointed out an EDN "Design Idea" (Reference 2) for a dynamic-link library (DLL) that was just what I needed. The DLL provides 8- and 16-bit I/O reads and writes on the ISA bus. You can use it directly as function calls or via an executable program that lets you read and write to I/O addresses manually. I downloaded the DLL software from EDN's Web site and found that it worked perfectly.
Reading and writing through my PC's printer port provided all the control my data-acquisition system needed. A simple write loop performs about 100,000 8-bit transfers per second on my PC; a write-and-read loop that uses a variety of addresses performs about 50,000. This rate was more than adequate for my project, because I needed only about 5000 transfers to completely download data after an acquisition cycle.
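A quick calculation shows why these rates were adequate. This Python sketch simply divides the number of transfers by the measured loop rates quoted above; the function name is illustrative.

```python
def download_time_s(transfers, transfers_per_s):
    """Time to move a block of data at a given transfer rate."""
    return transfers / transfers_per_s


print(download_time_s(5000, 50_000))   # 0.1  (write-and-read loop)
print(download_time_s(5000, 100_000))  # 0.05 (write-only loop)
```

Even at the slower rate, a full download after an acquisition cycle takes about a tenth of a second.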
Writing and testing the low-level control software and the high-level control and display software turned out to be time consuming, though. Mostly because of continued embellishments and my lack of familiarity with Visual Basic, these tasks consumed a whole week. Visual Basic itself worked fine. Although this was my first experience with it, I found it easy to use and able to accommodate all the operations I needed without any problems. Had I already been familiar with it, I probably could have written the software in a couple of days.
Still, by Saturday I decided that the control and display software were complete. I'd added all the bells and whistles I could think of, including a second display channel, the ability to save and recall acquisitions, and the option of plotting points or vectors.
Plugging on
Still wondering when the latest version of MAX+PLUS II would arrive, I decided to begin work on my second design, EDN2, which would include a 16-tap lowpass filter. I was 14 days into the project without a lot to show for my effort, but still taking to heart the concept of planning for success. My first design, EDN1, wouldn't compile, so I couldn't test it, and I had no idea if it would work. Yet, I was starting a second design, EDN2, that would depend on the first one being functional. Despite a relatively low confidence level, I continued to put in 12-hour days. I might fail, I reasoned, but I wouldn't have to look back and tell myself I should have tried harder.
I began the design of EDN2 by loading Altera's DSP Design Kit software, which works with MAX+PLUS II. You can use the DSP Design Kit to create a variety of digital filters having from eight to 64 taps. Now in its first release, the kit also has a 3×3 convolver for video. Presumably, subsequent releases will add more capabilities. For my purposes, the filter-design capabilities alone made the kit attractive.
Using the design kit was easy enough. You provide filter coefficients and follow a documented procedure, and the semiautomatic software generates filters for you. The software's documentation includes a detailed example for generating an eight-tap lowpass filter for a Flex 8000, an earlier Altera SRAM-based logic family. I worked through this example to get familiar with the software.
But, when I tried to compile my filter design, the system crashed. After repeated crashes and subsequent reboots, I began reflecting on computer benchmarks. Why aren't computers ever rated on a time-to-boot specification? This spec ought to be the computer world's equivalent of auto racing's quarter-mile event. When your system is crashing all the time, it's the one benchmark you really care about. The system I was using boots in a relatively quick three minutes, but when you have to boot 10 or 20 times a day, it eats up a lot of time.
Finally my patience ran out. I'd waited a week and a half for new software, and now I faced a real roadblock. Reluctantly, but necessarily, I took advantage of my status as an EDN editor. I put in a call to the "Right Person" at Altera, and all of a sudden good things started to happen. An hour later, I got a call telling me that the software still wasn't ready on a production CD-ROM, but that a CD-Recordable copy was headed my way via Fed Ex.
While I waited for the new software, I continued designing the digital filters that drop into the input path of my data-acquisition system. Because I needed to generate filter coefficients for input to the DSP Design Kit, I used a demo version of Signalogic's DSPower Block Diagram and Hypersignal software. The demo software is capable of generating filter coefficients and filter-response plots for a wide variety of filters. The complete version of DSPower lets you string together a wide variety of DSP functions and run data through them. You can even link up with a DSP board and perform the functions on the board instead of in software. In my case, I was just looking at single blocks that were either a lowpass or a bandpass filter.
When it comes to digital filters, I'm okay on the concept, but I have zero design experience. Undaunted, I called up Signalogic and got a little help over the phone from Jeff Brower, the company president. Brower explained how to use the software and where to find the filter coefficients. Soon I was off having a great time specifying filter performance parameters (such as attenuation, steepness of rolloff, and tolerable passband ripple) and seeing how close my generated filters could come to satisfying them. Because I specified the number of filter taps I wanted, the resulting filters couldn't always meet my requirements. You have the option, however, of letting the software add more taps to get the performance you specify.
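For readers who want to see what generating coefficients for a fixed number of taps involves, here is a minimal Python sketch. It uses a Hamming-windowed sinc design, a common textbook method chosen here for illustration; it is not necessarily what the Signalogic software does, and the cutoff value is arbitrary.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc lowpass; cutoff is a fraction of the sample rate (0 to 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize for unity gain at DC


def gain_db(taps, freq):
    """Magnitude response in dB at freq (fraction of the sample rate)."""
    re = sum(t * math.cos(2 * math.pi * freq * n) for n, t in enumerate(taps))
    im = -sum(t * math.sin(2 * math.pi * freq * n) for n, t in enumerate(taps))
    return 20 * math.log10(max(math.hypot(re, im), 1e-12))


taps = lowpass_taps(16, 0.1)
for f in (0.0, 0.05, 0.1, 0.2, 0.4):
    print(f"{f:.2f} x fs: {gain_db(taps, f):6.1f} dB")
```

With the tap count fixed at 16, the rolloff is only as steep as the window allows; tightening the specification means adding taps, which is the same trade-off the design software exposes.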
More problems
Finally, the CD-Recordable with the new MAX+PLUS II software arrived. Full of hope, I loaded the software into the system. But the system still crashed! Now what? I e-mailed my latest design files to Steinke at Altera, who compiled them without problems on his system.
My problem, Steinke and I figured, must be specific to my system or its configuration. Steinke and I were using the same software, but Steinke's was running under Windows NT, and mine was running under Windows 3.1. Following that logic, Steinke compiled my design on another system running Windows 3.1, without a problem. That meant that the problem I'd been having all along was in the computer system I was using or in the way the system was configured. The beta software may not have been the problem.
Over the phone, Steinke guided me through a variety of system-configuration files. I made a few changes to config.sys, autoexec.bat, and win.ini, but still my system crashed. I then ran the scandisk utility, which revealed a variety of misrepresented, damaged, and cross-linked files. This looked bad. The utility attempted to fix the corrupted files, but the compiler still kept crashing. Steinke suggested reloading Windows.
It was a Friday. I really wanted this to work and end the week on a bright note. I reloaded Windows 3.1 and crossed my fingers. My design compiled! Success! I spent some time running simulations on the full design and found that it simulated correctly as best I could tell.
Saturday, I went back to work on the DSP design software, designing filters. Friday night's celebration, I soon discovered, had been premature. Odd things were appearing, like pop-up boxes in reverse video. The system was crashing where it had never before had problems. It looked like DOS was trashed, too.
Delete everything, start over
Monday morning I talked with Steinke. Scandisk was now indicating no corrupted files and no bad clusters on my disk system, but something was seriously wrong. Steinke and I agreed we'd better start clean, so I saved my design files and started from ground zero by reloading DOS. A few hours later I had the system up and running.
Although I wouldn't be sure of it for a few more days, the chronic crashing was over. Nineteen days into the project, the computer and software were finally stable. Whether the system was corrupted before I borrowed it, the software caused the problem, or I did something to cause the problem, I'll probably never know. What I did know at that point was that I had 11 days left to complete a bunch of designs and get them working.
Design EDN1 was about as far along as I could go without a PLD and a demo board, which I hadn't yet received from Altera. Consequently, I got back to work on design EDN2, the one with a 16-tap lowpass filter. The DSP Design Kit made the job simple. It even included software for generating a swept-sine-wave stimulus file and software to plot the simulation output. (See the next issue of EDN for more details on how I generated filters with the kit.)
With the computer and software working properly at last, it didn't take long to develop the 16-tap lowpass filter. After that, I created a third design, with a 16-tap bandpass filter. I then took these two filter designs and dropped them into the front end of design EDN1, thus creating designs EDN2 and EDN3 (see box, "Integrated hierarchical design and simulation").
| Integrated hierarchical design and simulation |
|---|
|
I used hierarchical design and simulation throughout the project, both for speed of simulation and to minimize difficulties in design checkout. I found that MAX+PLUS II made hierarchical design easy and very efficient, both for text-based (HDL) design and graphical (LPM and schematic) design. A big advantage of the hierarchical approach is that you can use a design by itself as soon as you create it, and you can also save it for later reuse as part of a larger design. In my project, I first created and simulated low-level designs; when they were all working, I simply merged them into a top-level design. Working from the bottom up, I compiled and simulated every drawing or text block by itself as an independent design. At the end of a compilation, each design was complete for a specified PLD; I could plug the PLD into a board and upload the design I'd just completed, and the new logic would be implemented. Reusing a design at a higher level was equally simple. I could take a design I'd just completed, choose the "Create Default Symbol" command, and get a symbol I could drop into another design drawing, complete with labeled inputs and outputs. You can use this symbol just as you would any other schematic-design symbol, and you can also customize the symbol if you want. I found that the default symbols were adequate and had the advantage of being immediately available. Simulating designs hierarchically provides fast turnaround in compilations, especially at lower design levels, because you compile only new or revised parts of your design. In my project, simulation with timing was fast enough on these subdesigns that I never bothered to use functional simulation by itself. Simulating at lower levels also minimizes the problems you'll have later. As you go up the design hierarchy, your design becomes more complex, making problems more difficult to find and longer to correct and recompile.
|
I had to make some minor changes, though, in order to drop these filters in. The ADC in my design provides offset binary values, and the filters use two's-complement values. The filters' outputs have to be converted back to offset binary for the trigger-compare function and to remain compatible with the display software. I added another level of design hierarchy for the conversions and then inserted the design block into the 8-bit data path. Because I was using reconfigurable logic, I decided not to resimulate the revised filters in the system. They had already been simulated by themselves, and if I made a mistake putting them in the data path, I'd be able to tell later when I ran the design. And there's always the option of simulating the system later.
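The article does not show how the conversion block was built, so the following Python sketch only illustrates the arithmetic involved. It assumes 8-bit samples; the function names are hypothetical. For an 8-bit word, converting between offset binary and two's complement amounts to inverting the most significant bit, which is why the conversion costs very little logic.

```python
BITS = 8  # the article's data path is 8 bits wide


def offset_to_twos(code: int, bits: int = BITS) -> int:
    """Convert an offset-binary code (0..2^bits-1) to a signed value."""
    if not 0 <= code < (1 << bits):
        raise ValueError("offset-binary code out of range")
    return code - (1 << (bits - 1))


def twos_to_offset(value: int, bits: int = BITS) -> int:
    """Convert a signed value back to an offset-binary code."""
    half = 1 << (bits - 1)
    if not -half <= value < half:
        raise ValueError("signed value out of range")
    return value + half


def flip_msb(word: int, bits: int = BITS) -> int:
    """Invert the top bit of a raw bit pattern (what the logic would do)."""
    return word ^ (1 << (bits - 1))


# The raw two's-complement bit pattern equals the offset-binary
# pattern with its MSB inverted, for every possible 8-bit code.
for code in range(1 << BITS):
    assert (offset_to_twos(code) & 0xFF) == flip_msb(code)
```

Mid-scale (code 128) maps to zero, and the two functions are inverses of each other, so a filter output converted back to offset binary lands on the same scale the trigger comparator and display software expect.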
Serial Number 1
I now had three designs almost ready for testing in the demo board, and on day 23 of my project, I received the demo board and a Flex 10K50 PLD from Altera. The demo board is Serial Number 1. Altera said it had checked out the analog-input path to the PLD, but not much else; I'd be the first to use the parallel-port connection.
One thing left to do was to assign pinouts on my PLD design to match those on the demo board. Although my design uses only about two dozen I/O pins on the 403-pin 10K50, I needed to make sure every I/O pin was assigned to drive, receive, or be in a high-impedance state. Unfortunately, you can't leave unused pins in a high-impedance state unless they're being driven by off-chip sources. Floating pins could oscillate, causing problems.
I couldn't assign correct states right away, however, because I didn't have data sheets for all the devices on the demo board that connect to the PLD. So once again I called Steinke, who e-mailed me a pin-assignment file. I then made the pin assignments and recompiled my design.
Earlier, when I'd been letting the design software perform pin assignments automatically, the timing analyzer had indicated that design EDN1 would run reliably at a clock rate of 25.7 MHz. Now, having forced pin assignments to match those already implemented on the demo board I was using, the timing analyzer said the rate had dropped slightly, to 24.4 MHz. Examining the slowest path, I found a place where I could add a register that would add a cycle of latency but raise the maximum clock speed to 29 MHz. Taking into account the slower-than-standard PLD in use on the Altera demo board, the analyzer derated the max speed to 25.1 MHz. I probably could have gotten more speed from my design if I'd needed to, but a 25-MHz clock rate was sufficient for this project.
One timing factor that I did address was slew rate. The Flex 10K family lets you assign fast or slow output slew rates on a pin-by-pin basis. Pins driven with the fast slew rate need some type of termination to reduce ringing. Slow slew rates don't need termination and are adequate for many signals. My design requires a fast slew rate only on the ADC encode signal; the lines driving the parallel port can use the slow slew rate. So I selected the global slow slew rate and then changed the ADC encode to fast.
Finally, it was time to give EDN1 a try. I powered up the demo board and found no apparent problems. (Nothing was hot, anyway.) An onboard switching power supply generates all voltages from its 12V input, which was drawing about 300 mA. The Flex 10K50 has a typical standby current of 500 µA, so a lot of other devices on the board were obviously drawing power.
It was the moment of truth. I connected the demo board's parallel port to my PC, which was running the control and display software. I downloaded the PLD's programming data. I tried a data acquisition. Nothing happened.
With a long sigh, I considered how to proceed. Was it the software, the hardware, or both that weren't working properly? I knew the parallel port's outputs were writing data correctly; I checked that earlier while developing the software. Now I probed the read lines. Nothing appeared to be happening on them.
With another long sigh, I told myself not to panic. Next to designing circuits, debugging them is one of my favorite activities. Where else can you get such interesting puzzles to solve?
I sat back and took stock of the situation. I had borrowed a 500-MHz HP5461B digital oscilloscope with a 1-Gsample/sec maximum sample rate. I could look at any signal on the board with all the timing resolution I could want. I could capture a signal even if it only happened once. I'd written the control and display software myself, so I could modify it, if necessary, to do anything I might need for debugging the hardware. And, best of all, I could reprogram the hardware to do anything I wanted. The reconfigurable PLD was the best debugging tool I could ask for.
The basic principle of debugging a design, whether on a simulator or in hardware, is to divide and conquer. Find the boundary between what is working and what isn't working. My first step was to verify the data path between my PC and the demo board.
Copying the EDN1 top-level schematic to a new design named Debug1, I stripped out the entire data-acquisition system and replaced it with four two-input OR gates. My plan was to send data down the eight lines that write to the demo board and to read the output of the OR gates on the four read lines. From start to finish, creating the Debug1 design took about 15 minutes, including compiling the design and generating the program file. I downloaded the design through the BitBlaster serial-download cable, and soon my 50,000-gate PLD was performing the ignoble but temporarily useful function of a single 7432 quad two-input OR device.
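A loopback check of this kind can be modeled on the host side. The Python sketch below is illustrative only: the pairing of write lines to OR gates (bits 0 and 1 to gate 0, and so on) and the `read_port` callable are assumptions, since the article gives neither the pin mapping nor the port-access code.

```python
def expected_readback(write_byte: int) -> int:
    """Return the 4-bit value four two-input OR gates would drive back.

    Assumed wiring: write bits (0,1) feed gate 0, (2,3) feed gate 1,
    (4,5) feed gate 2, and (6,7) feed gate 3.
    """
    nibble = 0
    for gate in range(4):
        a = (write_byte >> (2 * gate)) & 1
        b = (write_byte >> (2 * gate + 1)) & 1
        nibble |= (a | b) << gate
    return nibble


def test_patterns():
    """All zeros, a walking one across the eight write lines, all ones."""
    return [0x00] + [1 << bit for bit in range(8)] + [0xFF]


def find_faults(read_port, patterns=None):
    """Return (written, expected, actual) for every mismatching pattern.

    read_port is a hypothetical callable: it takes the byte written to
    the board and returns the 4-bit value seen on the read lines.
    """
    faults = []
    for byte in patterns or test_patterns():
        actual = read_port(byte) & 0xF
        if actual != expected_readback(byte):
            faults.append((byte, expected_readback(byte), actual))
    return faults
```

A walking-one pattern exercises each write line separately, so a read path that never responds shows up as a mismatch on every nonzero pattern.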
The eight writes were going out, I found, but the four reads weren't coming back, and in only a minute I discovered why: I goofed. A standard parallel port has eight dedicated write lines, a 4-bit open-collector I/O port, and a 4-bit input port. I wrote the control and display software to use the 4-bit input-only port for reads; yet, for some inexplicable reason, I designated pinouts on the PLD to use the open-collector I/O port.
I got off easy. My careless mistake could have destroyed PLD output drivers. I hate to admit my mistake, but I pass along my experience because it highlights the flexibility of a RAM-based PLD for debugging. After reassigning the parallel-port pins on Debug1 to the correct locations, I recompiled and verified that everything seemed okay.
Dropping Debug1 and going back to EDN1, I corrected the parallel-port assignments, recompiled, and programmed the PLD. The PLD responded, but I could tell it wasn't really working. It was time to use the low-level control software to see what was working and what wasn't.
Debugging the software
It didn't look too bad. Manual control of the register shift ring was okay, uploading and downloading of the RAM address register was okay, and incrementing the RAM address counter was okay. But I couldn't upload and download entire register frames.
In frame uploads and downloads, the whole register shift ring cycles once around, reading or writing new values to all the registers. A single one-bit shift seemed okay, but there were problems in cycling through a complete 56-bit frame. It looked like a software bug.
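To make the frame mechanism concrete, here is a minimal Python model of a 56-bit register shift ring. It is a sketch under assumptions, not the author's control software: the class and method names are hypothetical, and the article does not give the field layout within the frame, so the model treats the ring as a plain sequence of bits.

```python
from collections import deque

FRAME_BITS = 56  # frame length given in the article


class ShiftRing:
    """Bit-level model of registers chained into one shift ring."""

    def __init__(self, bits: int = FRAME_BITS):
        self.bits = deque([0] * bits, maxlen=bits)

    def shift(self, data_in=None) -> int:
        """Shift one position and return the bit leaving the ring.

        With data_in=None the outgoing bit recirculates (a read);
        otherwise the supplied bit replaces it (a write).
        """
        out = self.bits.popleft()
        self.bits.append(out if data_in is None else data_in & 1)
        return out

    def read_frame(self):
        """Cycle the ring once around; the contents are left unchanged."""
        return [self.shift() for _ in range(len(self.bits))]

    def write_frame(self, frame):
        """Cycle the ring once around while feeding in new bits."""
        if len(frame) != len(self.bits):
            raise ValueError("frame must match the ring length")
        return [self.shift(bit) for bit in frame]
```

In this model a frame read and a frame write differ only in what is fed back into the ring on each shift, which is one reason a single-bit shift can work while a full frame write fails: the bug can sit in the loop that supplies the 56 data bits rather than in the shift itself.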
Going back to the code for shifting frames, I added suitable breakpoints and started debugging. Frame reads were working correctly, I discovered, but frame writes weren't. It took only a few minutes to find the problem and fix it.
Finally, both the software and the hardware were working correctly. Using a 10-MHz oscillator (a 20-MHz oscillator in later tests), I started capturing data (Figures 1 and 2). Trigger, holdoff, and delay functions appeared to be working properly.
The analog section of the demo board had quite a bit of noise on it, I noticed. The switching power supply is only an inch from the analog path, and digital signals are all over the board. The AD9012 flash ADC is capable of 100-Msamples/sec conversion rates and has a 160-MHz bandwidth. It appeared to be doing a credible job of converting the input signal; it just had a fairly noisy input signal.
Design EDN2, with the 16-tap lowpass filter, and design EDN3, with the 16-tap bandpass filter, also appeared to work properly. To properly test them, though, I would need a swept sine wave. I'd borrowed an HP 33120A combination function generator and arbitrary-waveform generator, but the function generator's fastest sweep time is 1 msec, and I wanted to sweep from 50 kHz to 5 MHz in 200 µsec.
No problem. Using Hewlett-Packard's Benchlink software, I could download the required waveform to the arbitrary-waveform generator. Visual Basic proved its utility once more, letting me quickly write a program to generate the swept-sine waveform and write it in a comma-separated file format. I then imported the 8000-point file into the Benchlink software and downloaded it.
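The original Visual Basic program is not reproduced in the article. As a rough equivalent, the Python sketch below generates an 8000-point swept sine from 50 kHz to 5 MHz over 200 µsec and formats it as comma-separated text. The linear sweep law, the normalized amplitude of ±1, and the eight-values-per-line layout are assumptions; the file layout Benchlink actually expects is not stated in the article.

```python
import csv
import io
import math


def swept_sine(n_points=8000, f_start=50e3, f_stop=5e6, sweep_time=200e-6):
    """Return a linear chirp sampled at n_points evenly spaced instants."""
    dt = sweep_time / n_points
    rate = (f_stop - f_start) / sweep_time  # sweep rate in Hz per second
    samples = []
    for n in range(n_points):
        t = n * dt
        # Phase is the integral of the instantaneous frequency.
        phase = 2.0 * math.pi * (f_start * t + 0.5 * rate * t * t)
        samples.append(math.sin(phase))
    return samples


def to_csv(samples, per_line=8):
    """Format the samples as comma-separated text, per_line values per row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for i in range(0, len(samples), per_line):
        writer.writerow([f"{s:.5f}" for s in samples[i:i + per_line]])
    return buf.getvalue()
```

With these numbers, 8000 points spread over 200 µsec implies a playback rate of 40 Msamples/sec, so the 5-MHz end of the sweep is represented by eight points per cycle.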
The response of the lowpass filter to the swept waveforms appears in Figure 3. Ripple in the passband and stopbands that had shown up in simulation also showed up in the real filters. Although the designs all ran using a 20-MHz clock oscillator, for the filter tests, I needed to reduce the clock to 10 MHz, because the frequency limit for arbitrary-waveform generation was 5 MHz, and I wanted to test the filters up to half the clock frequency.
I now had one more day left in my schedule, so I decided to create 32-tap lowpass and bandpass filters to add to the 16-tap filters I'd already done. The new filters worked just like the simulation said they would. The response of the 32-tap bandpass filter appears in Figure 4.
None of the designs I created came close to using all of the 10K50's capacity. Even with a full data-acquisition system and a 32-tap filter (a fully parallel filter with data rate equal to clock rate), I used only 1219 of the available 2880 logic cells (42%). The data-acquisition system without a filter (EDN1) used up 209 logic cells when optimized for speed, a minuscule 7% of the logic cells. The system used 80% of the available embedded-memory blocks, leaving 20% of the RAM available for other data-processing functions.
Use of reconfigurable logic begs comparison with an ASIC implementation, of course. A mask-programmed ASIC, in volume, can certainly offer several data-processing functions at a lower cost than you can achieve with programmable logic. A mask-programmed ASIC can also provide greater performance than is possible with reconfigurable logic. But reconfigurable logic has other advantages that ASICs don't have.
One advantage of reconfigurable logic is the potential for reducing the amount of silicon in a design. Because my project was intended only as a prototype exercise for putting reconfigurable logic to work, the filters I created are only the beginning of what you could add to a data-acquisition system. In a design that's intended to become a product, the library of reconfigurable functions in the design can get pretty big. When it does, the programmable logic that can swap functions in and out, as needed, can be less expensive than having all the functions simultaneously and permanently implemented in silicon. And when you want to add yet another function to an existing design, programmable logic looks even more attractive.
What I learned
Overall, my project with reconfigurable logic was a very positive experience. My patience was tried by a crashing computer, but once that was out of the way, the whole development process was pretty smooth. I've always found that it takes more time to figure out in detail what you want a system to do than it does to implement the design. My experience on this project confirms that opinion, plus I found that LPM and text-based tools in a well-integrated hierarchical-design system are able to reduce design time.
My experience continues to reinforce my enthusiasm for reconfigurable logic as a valuable technology for digital design. The experience also heightens my awareness of some design differences that become apparent when you spend time using reconfigurable logic.
For example, the evolutionary growth potential for a design based on reconfigurable logic should not be underestimated. Being able to see your full system running, hardware and software, is an enormous advantage, not just for verifying that the design works properly, but for improving your design. This advantage also extends to reprogrammable PLDs (those PLDs with a limited reprogramming life) and to a lesser extent even to one-time programmable PLDs. Although with other types of logic implementations, you can somewhat overcome the slower design- and implementation-cycle times by using complete system-level simulation or emulation, in the end these implementations suffer a severe evolutionary disadvantage.
Using reconfigurable logic doesn't excuse you from careful and complete system design, of course. The more up-front system design you perform, the further you'll go with each evolutionary iteration. Although the ability to quickly correct mistakes will let you recover from errors in your design, it's no substitute for good system design. In particular, the foundation of your system design will eventually limit how far your design can evolve before surpassing the system's limits. You aren't just designing the logic for a board, you're designing a system that should continue to evolve, hopefully for several years. Although your reconfigurable PLD can accommodate system-design changes, logic or devices that are external to the PLD may not be as forgiving.
For the type of system I designed, reconfigurable logic lets you create a lot of different designs that you can program at will into a single device. You can move hierarchical design blocks in and out of the system design not only at design time, but at runtime, too. You're not forced to compromise a design that has to do different things at different times. This ability to alter some or all of your hardware design at runtime opens up a whole new world of possibilities, some of which can be competitive advantages.
| The reconfigurable system |
|---|
|
To experiment with reconfigurable logic, I decided to design a reconfigurable data-acquisition system. A PLD in this system, Altera's 10K50, accepts data from an 8-bit flash ADC (AD9012) and performs all high-speed data-acquisition and -processing functions. These functions include digital filtering, triggering, and storage of data in the PLD's on-chip RAM.
A typical data-acquisition cycle begins by uploading data-acquisition triggering parameters via a register shift ring. These parameters include trigger level, trigger polarity (rising or falling edge), holdoff count, and a stop-acquisition count, which determines the window of pretrigger or post-trigger data acquired.
When acquisition is complete, the host computer downloads the RAM's contents. A control word lets you increment the RAM address, avoiding the need to use the register shift ring for each RAM-address change while downloading data. In cases where the trigger conditions are not satisfied or when you're initially examining a data stream, a "force-trigger" command lets you bypass the trigger functions and acquire data immediately. The holdoff function doesn't allow the trigger to operate unless the trigger criteria have been false for at least the entire holdoff period. The combination of holdoff function and delay makes it easy to place the acquisition window where you want it in relationship to a pulse train or a burst of data. This combination also lets you trigger on discrepancies in uniform data, such as missing pulses.
|
| Components, equipment, and sources |
|---|
EDN's Hands-On Reconfigurable-Logic Project used the hardware and software components and equipment listed below. For information about any of these items, circle the appropriate numbers on the postage-paid Information Retrieval Service card or use EDN's Express Request service. When you contact any of the suppliers directly, please let them know you read about their products in EDN.
|

You can reach Technical Editor Doug Conner at (805) 461-9669; fax (805) 461-9640; email edndconner@mcimail.com
References
Acknowledgments
I'd like to thank Paul Kemp for his Design Idea providing a DLL for reading and writing to the ISA bus. I'd also like to thank Greg Steinke of Altera for his help on the project.