EDN Access

September 11, 1998


Hands-On Project:
Synthesis Shoot-Out at the EDN Corral

Choosing the right synthesis tool set for your budget, design complexity, and project schedule is tough. We hope that this evaluation study will point you in the right direction and help you to avoid some of the problems I encountered.

Brian Dipert, Technical Editor

For a previous Hands-On Project, I developed a multibus-interface bridge chip with integrated data-transform circuits and a synchronous-DRAM (SDRAM) controller (see "Getting a handle on HDLs," EDN, May 7, 1998, pg 71). The target implementation device is a member of Xilinx's XC4000XL FPGA family, and my goal was to eventually compare schematic- versus VHDL-based versions of the design and to compare a VHDL design run through multiple synthesis compilers.

The schematic-versus-synthesis comparisons will have to wait for a future Hands-On Project, because I am still developing the schematic design! This fact hints at the answer to one of the comparisons I'd hoped to make in the study: development time of the two alternative design approaches. However, I've successfully compiled both the VHDL design and subsets of it through three front-end tool sets (Table 1) at various price and feature setpoints and through Xilinx's Alliance 1.4 back-end software. The results are sometimes predictable but often surprising.

At a glance

  • Picking the right synthesis tool set for your needs involves evaluating compilation time, design speed and efficiency, price, design-flow comprehensiveness, breadth of vendor and product-line support, and quality of technical support after the sale.
  • Running your code through multiple vendors' tools can help catch bugs you might otherwise overlook or ignore. 
  • Just because the front- and back-end tools speak the same netlist language doesn't mean integrating them in your design flow will run smoothly.
  • Evaluation results depend not only on the tool's inherent robustness but also on how much you influence the process by declaring attributes and exploring various compilation and place-and-route options.

Getting the compilers to run through the designs without errors or unexpected warnings and configuring them so that they would create netlists that the Alliance place-and-route tools would accept proved more frustrating than I had—perhaps naively—anticipated. The software also severely stressed the capabilities of my computing hardware, tangibly reminding me why workstation manufacturers can justify charging significant price premiums over standard desktop PCs. However, Microsoft's (www.microsoft.com) Windows 95, the operating-system platform for all the tools, as well as the tools themselves, were unexpectedly stable (see sidebars "The software" and "The hardware").

Project objectives and guidelines

For a detailed explanation of the VHDL design, download the design specification from the addendum to this article (see sidebar "Head to the addendum"). I compiled the following VHDL files:

  • TOPLEVEL.VHD: the design (highest-level hierarchy) with FIFO arrays created from arrays of std_logic_vectors;
  • TOPLVLRM.VHD: the design (highest-level hierarchy) with FIFO arrays created from instantiated Xilinx RAM components;
  • BACKFIFO.VHD: the 32X16-bit back-end-bus bidirectional FIFO buffer for design TOPLEVEL.VHD;
  • BACKFIRM.VHD: the 32X16-bit back-end-bus bidirectional FIFO buffer for design TOPLVLRM.VHD;
  • HOST_IF.VHD: the control and status registers, plus the state machines that decode host-bus operations and synchronize the host bus, back-end bus, SDRAM controller, and other internal circuits;
  • HOSTFIRD.VHD: the 32X16-bit unidirectional FIFO buffer used for host-bus reads for design TOPLEVEL.VHD;
  • HOFIRDRM.VHD: the 32X16-bit unidirectional FIFO buffer used for host-bus reads for design TOPLVLRM.VHD;
  • HOSTFIWR.VHD: the 32X16-bit unidirectional FIFO buffer used for host-bus writes for design TOPLEVEL.VHD;
  • HOFIWRRM.VHD: the 32X16-bit unidirectional FIFO buffer used for host-bus writes for design TOPLVLRM.VHD;
  • INTBUFF.VHD: a 16-bit, unidirectional, three-state buffer;
  • MEMTOP.VHD: the SDRAM-controller top-level hierarchy file, interconnecting files ADDRCONV.VHD, MEM_CONT.VHD, and REFRESH.VHD;
  • X1.VHD: combinatorial logic, which implements the X1 data-transform function, including dual 16-bit Boolean, shift, and arithmetic modules; and
  • X2.VHD: combinatorial logic, which implements the X2 data-transform function, including dual 16-bit arithmetic, limit, and round modules.

I take on two roles in this study. First, I am a VHDL "newbie" who doesn't understand, want to use, or have the time to learn any of the special tool-compiling options. Second, I represent the engineer who wants to write silicon-independent VHDL to enable design migration among multiple programmable-logic vendors and product families, from a programmable-logic device to an ASIC and perhaps even back again, or to easily reuse portions of a working design in another project. However, the design-file list shows that I relented and did two versions of each FIFO. One version tested the compiler's ability to infer RAM usage, and the other instantiated Xilinx dual-port RAM components.

For each front-end tool, I compiled four sets of 13 netlists, defined by specifying combinations of low or high compiler effort and area or speed emphasis. Although I defined all flip-flops in the design to set or clear in response to an active external reset, I didn't hard-code Xilinx-specific start-up attributes. I also let the tools pick their preferred state-machine-generation approach (binary, one-hot, Gray code, etc). Although I maximized tool flexibility by not explicitly defining any state values, I tried to avoid IF-THEN-ELSE logic structures, instead using CASE-WHEN alternatives whenever I could. Using IF-THEN-ELSE structures often produces slower, multistage priority-ordered logic chains, whereas using CASE-WHEN structures typically creates faster logic with all possible outputs evaluated in parallel.
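As a rough illustration of the compile matrix described above, the following Python sketch enumerates the effort and priority combinations and counts the resulting netlists. The tool, effort, and priority labels come from the article; the variable names are only illustrative.

```python
from itertools import product

# Labels taken from the article; the variable names are illustrative only.
TOOLS = ["FPGA Express", "PeakFPGA", "PLSynthesizer"]
EFFORTS = ["low", "high"]
PRIORITIES = ["area", "speed"]
N_FILES = 13  # VHDL files listed under "Project objectives and guidelines"

# Every effort/priority pairing defines one set of netlists.
settings = list(product(EFFORTS, PRIORITIES))

netlists_per_tool = len(settings) * N_FILES
total_netlists = len(TOOLS) * netlists_per_tool

print(settings)
print(netlists_per_tool, total_netlists)
```

The total matches the minimum number of back-end sessions mentioned in the sidebar "The hardware."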

I left preservation or flattening of the hierarchy in the netlists and the output-netlist format at the tool defaults. Synopsys uses Xilinx netlist format (XNF), whereas Accolade and Minc use electronic design interchange format (EDIF). Synopsys' FPGA Express is the only tool that let me enter timing constraints in the front end (and provided corresponding timing estimates after compilation in conjunction with a user-specified device speed bin). I disabled these timing constraints from passing to the back-end tools through the generated netlist. Therefore, the only time I needed to interrupt FPGA Express between its synthesize and optimize algorithms was to tell it to ignore unlinked cells (the instantiated RAM components in one version of the FIFOs) during global set/reset mapping.

For Xilinx's Alliance tool set, I made only a few departures from the default settings. If the front-end tools specified I/O buffer flip-flops, I allowed Alliance to pack them into I/O buffers for both inputs and outputs. I also didn't bother producing configuration data, because I wouldn't be programming an actual device. Finally, I modified the timing-report format to report one path per default timing constraint.

All designs, regardless of their size, targeted Xilinx's XC4085XL, in the -1 speed bin and in a 560-lead BGA package. Xilinx also offers faster (-09) and slower (-2, -3) speeds and a 559-lead PGA package. The XC4085XL provides 3136 configurable logic blocks (CLBs) in a 56X56-block matrix. Each CLB contains two four-input look-up tables (LUTs), one three-input LUT, and two flip-flops. Additional I/O flip-flops boost the total register count to 7168. Four-input LUTs can implement logic functions or alternatively find use as 16-bit embedded-RAM elements.
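The resource figures above can be cross-checked with a little arithmetic. This Python sketch uses only the numbers quoted in the paragraph; the I/O flip-flop count and the LUT-RAM ceiling are derived values, not figures stated in the article, and the constant names are illustrative.

```python
# Figures quoted in the article for the XC4085XL.
CLB_ROWS = 56
CLB_COLS = 56
FLIP_FLOPS_PER_CLB = 2
FOUR_INPUT_LUTS_PER_CLB = 2
RAM_BITS_PER_FOUR_INPUT_LUT = 16
TOTAL_REGISTERS = 7168

clbs = CLB_ROWS * CLB_COLS
clb_flip_flops = clbs * FLIP_FLOPS_PER_CLB

# Derived: whatever is left of the register total must sit in the I/O blocks.
io_flip_flops = TOTAL_REGISTERS - clb_flip_flops

# Derived upper bound, assuming every four-input LUT were used as RAM.
max_lut_ram_bits = clbs * FOUR_INPUT_LUTS_PER_CLB * RAM_BITS_PER_FOUR_INPUT_LUT

print(clbs, clb_flip_flops, io_flip_flops, max_lut_ram_bits)
```

In practice a design would never turn every LUT into RAM, so the last number is only a ceiling.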

In Part 1 of this series, I used FPGA Express as my VHDL development platform. I hoped that once I got the design and its subsets to compile correctly through FPGA Express, I'd have little problem with any other vendors' tools. The only possible struggle I predicted came from my use of the Synopsys-developed ieee.std_logic_unsigned and ieee.std_logic_signed libraries. Although both Accolade and Minc provide compatible library suites, solving this potential problem, my work wasn't over when I added Accolade's PeakFPGA to the compiler suite.

In retrospect, running the same source code through multiple tool sets was a positive experience that, because of limited time and budget, many of you will be unable to replicate. This multitool duplication caught a lot of problems up-front that might otherwise not have appeared until I ran the netlist through Alliance or, even worse, programmed and attempted to operate a device. One mistake that PeakFPGA found was that I was attempting to directly three-state flip-flop outputs on reset, instead of running the driven outputs through a separate nonclocked process with only an output-enable control signal in the sensitivity list. This experience exemplifies a recurring theme: Even though I used a high-level description language, which theoretically insulated me from the destination device details, I needed a more complete understanding of the target silicon architecture than I first thought necessary. The compilers don't detect that Xilinx FPGA logic-block flip-flops don't offer outputs that can be three-stated. Nor do they automatically insert three-state buffers, so I had to define the buffers explicitly.

While solving this problem, I discovered the source of a series of obscure warning messages I'd received in FPGA Express concerning its inability to infer proper output-buffer structures for the host data bus. Two sources within the chip can potentially drive this bus: the HOST_IF.VHD status registers and HOSTFIRD.VHD's Port A (or HOFIRDRM.VHD for the instantiated design). I had embedded three-state control within each lower level module, instead of bringing the driven outputs to a multiplexer in the top-level entity/architecture and making them three-state at that point. Other circuits that I fixed, thanks to a combination of FPGA Express and PeakFPGA warnings and error messages, included a few registers for which I'd forgotten to define set or clear conditions on reset.

I also took advantage of the cleanup opportunity to remove a few explicitly defined state-bit combinations from the SDRAM-controller design I inherited from MoSys' (www.mosys.com) Christian Green, which he had targeted for a CPLD. You can safely ignore some of the warnings that remain and that you'll see when you compile my code. For example, I reserved I/Os for the currently unimplemented test-bus signals and host-bus REQ and GNT. I also generated full flags in each of the FIFOs and ran them to HOST_IF.VHD, although my state machines didn't use those flags. Fortunately, once the code made it through FPGA Express and PeakFPGA without errors or warnings, compilation in Minc's PLSynthesizer went smoothly. As a rule, don't assume your design is solid just because you don't get any syntax or compile errors; examine the warnings, too!

Back-end issues

Now for the back-end tools, and, once again, my idealistic hopes for smooth sailing emerge. With PeakFPGA, reality pretty much matched my fantasy. The only problem I encountered stemmed from Xilinx's sending me Alliance 1.5-compatible (but 1.4-incompatible) dual-port RAM netlists (in an .NGC format). Rerunning the Accolade-generated EDIF netlists with the correct RAM files (.NGO format) produced successful translation, mapping, and place-and-route results. If I had created my own RAM components in Alliance instead of relying on Xilinx to do the work for me, I wouldn't have experienced this problem.

FPGA Express presented me with a somewhat greater struggle. I encountered a host of "partial-carry-chain" errors while mapping the XNF netlists. Synopsys at first suggested that I resolve this problem by making sure that I was using the same FPGA family, device, package, and speed bin in both FPGA Express and Alliance. Unlike PeakFPGA and PLSynthesizer, which require that you enter only a device manufacturer and product family, FPGA Express requires that you also enter a device, package, and speed bin. Unfortunately, this advice didn't solve the problem, and I had to add the ignore-relative-location-constraints (-ir) option to Alliance's mapping program for Synopsys-generated files. The bug I stumbled across is apparently specific to FPGA Express' 2.1.1 version, and the company should have it fixed with Version 2.1.2 by now.

I'm unsure whether I prefer FPGA Express' front-end FPGA technology-mapping optimizations to Accolade's and Minc's greater reliance on back-end tools. Compare and decide for yourself whether Synopsys' approach produces improved compiling times, smaller designs, or faster design performance (Table 2, Table 3, and Table 4). The problem I see, though, is that you may not know what device size or package you need until you complete a test run through the back-end tools and determine the logic, routing, and I/O-pin resources consumed. If you guess too large or small a part or the wrong package or speed bin and you're using FPGA Express, you have to rerun your design through the entire flow.

My Accolade and Synopsys problems paled in comparison with the struggles I had with Minc's PLSynthesizer. Fortunately, you probably won't encounter at least one of them. Minc had just released its 6.1 version, which for the first time supports the Xilinx 4000XL product family, so instead of installing the software from an older CD-ROM, I got the files from Minc's server via File Transfer Protocol. However, when I attempted to compile any VHDL designs with bidirectional ports, PLSynthesizer generated 0-byte IOBUF.EDN files, which created invalid connection errors in the back end. I determined that Minc forgot to include IOBUF.EDN in the library of the file set I downloaded, and, when I added the correct file, the software generated valid IOBUF.EDNs for appropriate designs.

PLSynthesizer 6.1 doesn't yet automatically insert I/O pads in the EDIF netlists, so I enabled the create-I/O-pads-from-ports (-a) option in Xilinx's translation software. Yet, I still couldn't get the designs to make it through Alliance, except for simple VHDL files, such as INTBUFF.VHD. I even tried compiling to XNF netlists using the XC4000E setting. (PLSynthesizer doesn't offer XNF output for the XC4000XL family.) However, the device families' internal architectures differ too much, leading to a host of mapping errors. Finally, I determined that the PLSynthesizer-generated LogicBlox macro was incompatible with the format that Alliance expected, causing Alliance to discard most of the design's logic. Switching to Macro+ macro generation cured that frustration.

Only one problem remained: PLSynthesizer expected references to instantiated components, such as Xilinx RAM blocks, in a long-list format different from Xilinx's default, which FPGA Express and PeakFPGA accept. Revising the VHDL for the FIFOs and again running the RAM-based designs through the flow finally provided no errors. Fortunately, because PLSynthesizer accepts scripts, I could kick off a multifile compiling session on the Pentium-133 desktop PC and temporarily switch to the Pentium II-400 or my notebook PC to do other work.

The 10,000-gate design-complexity estimates that Stephen Wasson, principal engineer at HighGate Design (www.highgatedesign.com), and I forecast in my May 7 article were more than a factor of 5 too low in some cases (Table 2, Table 3, and Table 4). The flip-flop counts for the noninstantiated FIFOs also clearly show that none of the tools could infer RAM from my array structures of std_logic_vectors. Fortunately, all the vendors correctly infer internal three states where necessary, although Accolade uses both a three-state buffer and an LUT for each signal.

Notice the varying front-end compilation times when comparing tool results. Depending on which step of the design you do your simulation in (if you do it at all), fast front-end compilation may or may not be important to you. For each vendor, notice how compile time and efficiency and performance results vary based not only on how I configured the speed-versus-area and low-versus-high effort settings, but also on the size of the design and the types of logic the design contained.

If your designs contain circuits of similar complexity to mine and you use a similar hardware and operating-system environment, your results will probably also be similar. Regardless, I strongly encourage you to conduct your own tests, either with the code I provide on EDN Access or with your own HDL designs (see sidebar "Use it; don't abuse it"). Remember, too, that I ran the front-end processes on a Pentium-133 computer and the back-end software on a Pentium II-400 PC. As a result, front- and back-end results for the same vendor don't directly correlate, although vendor-to-vendor differential comparisons for each step in the flow are still valid.

I recommend that you not directly benchmark the vendors' front-end flip-flop-count, LUT-count, and design-performance results against each other, because they're just predictions. I published the numbers, though, to enable you to judge how accurate these estimates are. Speed and size predictions before lengthy back-end translation, mapping, and placement and routing can provide valuable data, but only if they're reasonably accurate.

The tables and the back-end mapping reports show some interesting trends. Both Accolade and Synopsys conservatively defaulted to slow-slew-rate output-buffer settings for all signals, whereas Minc chose fast-slew drivers for all unidirectional signals and slow-slew drivers for bidirectional pins, such as data buses and the Ready pin. Fast-slew buffers may improve external interface timing, but their presence is not what defines my design's maximum clock speed. You should also watch out for noise issues when configuring too many outputs as fast slew.

In some cases, I found excessive loading of certain internal signals, such as one write-enable fanning to all 512 flip-flops in a FIFO array. The resultant degraded propagation delay was often the limiter to clock speed for the circuit. Especially when optimizing for speed, I expected to find more parallel logic duplication, trading off a few more gates for much less fan-out and correspondingly faster signal switching.
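To put numbers on the fan-out trade-off described above, here is a small Python sketch. It assumes the 32X16-bit FIFO organization from the file list and an even split of loads among duplicated drivers; it deliberately contains no delay model, because the article gives none. The function name is hypothetical.

```python
import math

FIFO_DEPTH = 32  # words, from the file list
FIFO_WIDTH = 16  # bits per word, from the file list

# One flip-flop per stored bit when the tools don't infer RAM.
loads = FIFO_DEPTH * FIFO_WIDTH


def fanout_per_driver(total_loads: int, copies: int) -> int:
    """Worst-case loads on each copy if the driver is duplicated evenly."""
    return math.ceil(total_loads / copies)


for copies in (1, 2, 4, 8, 16):
    print(copies, fanout_per_driver(loads, copies))
```

Each duplicated driver costs some extra logic, so the speed gain has to be weighed against the added area.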

In a register-rich FPGA architecture, such as the XC4000XL, one-hot-encoding techniques commonly produce faster state machines than the alternative binary coding. (The binary-coding approach is better suited for comparatively combinatorial-rich but register-scarce CPLDs.) Compare the flip-flop counts for HOST_IF.VHD and MEMTOP.VHD, both of which contain state machines, and you get an idea of which tool probably selected which technique. State-machine encoding is one of several factors that a designer can also explicitly specify via attributes embedded in the VHDL source code.
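The flip-flop counts give the encoding away because the two schemes need very different numbers of state registers. The Python sketch below shows the relationship; the state counts in the loop are hypothetical examples, because the article doesn't list how many states each machine has, and the function names are illustrative.

```python
def binary_state_bits(n_states: int) -> int:
    """Flip-flops needed to binary-encode n_states (Gray code needs the same)."""
    return max(1, (n_states - 1).bit_length())


def one_hot_state_bits(n_states: int) -> int:
    """One-hot encoding dedicates one flip-flop to each state."""
    return n_states


# Hypothetical state counts, chosen only to show the trend.
for n_states in (4, 8, 12, 20):
    print(n_states, binary_state_bits(n_states), one_hot_state_bits(n_states))
```

A state machine's total flip-flop count also includes any registered outputs and counters, so the difference between tools is a hint rather than proof.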

BACKFIFO.VHD contains an array of flip-flops that ports A and B can potentially write to. The most logic-efficient means of implementing this circuit probably involves logically combining Port A's and Port B's write enables to form the flip-flop clock enable. Otherwise, you end up with a more LUT-intensive circuit, feeding the flip-flop's D input. Another factor influencing logic size versus speed involves using ripple-carry or fast-carry-look-ahead circuits for adders. All of the FIFOs, plus HOST_IF.VHD, MEMTOP.VHD, and the two transform blocks, contain adders or subtracters. Compare area-versus-speed and low-versus-high-effort results for a vendor and among vendors to detect when they used each adder type.
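The two ways of building a dual-write storage bit can be compared with a small behavioral model. This Python sketch is not the article's VHDL; it assumes that Port A wins when both ports write at once (the article doesn't state the priority), and the function names are hypothetical. It exhaustively checks that the clock-enable version and the D-input feedback version behave identically.

```python
from itertools import product


def next_q_clock_enable(q, we_a, we_b, d_a, d_b):
    """Flip-flop with a clock enable: CE = we_a OR we_b, D = port data mux."""
    ce = we_a or we_b
    d = d_a if we_a else d_b  # assumption: Port A has priority
    return d if ce else q


def next_q_feedback_mux(q, we_a, we_b, d_a, d_b):
    """No clock enable: the hold path is folded into the D-input logic."""
    if we_a:
        return d_a
    if we_b:
        return d_b
    return q


# Compare both models over all 32 combinations of the five one-bit inputs.
mismatches = [
    bits
    for bits in product((0, 1), repeat=5)
    if next_q_clock_enable(*bits) != next_q_feedback_mux(*bits)
]
print(len(mismatches))  # prints 0
```

The behavior is the same, but the D-input function in the first version depends on only three signals, whereas the second depends on five, which is more than a single four-input LUT can accept.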

Postmapping reports indicate that FPGA Express is the only tool that can infer use of the flip-flops in I/O buffers. PeakFPGA also seems unable to take advantage of global routing resources for chipwide signals, such as resets, in the absence of explicit user-defined attributes. I expect that all the vendors, in their optimization efforts, will aggressively use attribution, and I'm looking forward to seeing how this attribution alters the results. One thing that baffles me about the Accolade data set is that the area-versus-speed priority option in PeakFPGA's user interface seems to produce no variation in either the front- or back-end results. Although a company representative indicates that this parameter does not influence the front-end compilation but passes to the back-end Xilinx tools, I see no evidence that this parameter makes any difference.

When comparing the tools, also look at Table 1. Balance a tool set's price against its results, especially at the gate-count complexities you typically see in your designs. Consider whether you require support for a number of programmable-logic vendors, only one vendor, or even a subset of a single vendor's product line and whether widespread device support is important to you. Does the software package include just a compiler, or does it also offer a simulator, a state-machine entry program, or a context-sensitive text editor? Finally, consider the all-in-one tool approach; it will help you avoid the front- to back-end file-communication-mismatch struggles that I faced.

Considering the amount of work I've put into this project so far, I definitely plan to leverage the design in future Hands-On articles. The next step will probably be to create a robust simulation testbench. This task includes developing a back-end VHDL model and integrating an SDRAM model from one of the memory vendors. I still think a schematic-versus-synthesis comparison will be interesting, and if I ever find time to complete the schematic design, you'll probably find the results in EDN, too. Another enticing twist on this study might be to keep the compiler tools as the constant, but vary the target device, across both multiple vendors and multiple product families for a given vendor (such as comparing the XC4000XL results to those on Xilinx's Spartan devices). As always, your feedback and suggestions are welcome!


References

  1. Dipert, Brian, “Counting on gate counts? Don’t count on it,” EDN, Aug 3, 1998, pg 52.
  2. Dipert, Brian, “Getting a handle on HDLs,” EDN, May 7, 1998, pg 71.
  3. Dipert, Brian, “Moving beyond programmable logic: if, when, how?,” EDN, Nov 20, 1997, pg 77.
  4. Dipert, Brian, “Programmable logic: Beat the heat on power consumption,” EDN, Aug 1, 1997, pg 57.
  5. Dipert, Brian, “Shattering the programmable-logic speed barrier,” EDN, May 22, 1997, pg 36.

Use it; don't abuse it

I think the results of this project reveal lots of useful information for making your own synthesis-tool decisions. However, the last thing I want you to do is to draw absolute conclusions without running additional tests of your own. Sound like a contradiction? Let me explain.

Compilation time, like the measured speed of any other software program, depends on the capabilities of the CPU, memory, hard drive, and other subsystems in the computer you run the compiler on. One vendor's compilation times might drop more steeply than another's as CPU speed increases, a second compiler might better leverage additional system memory or more hard-drive swap space, and other factors may also differ.

Software performance also varies, depending on the operating system. For example, although I used no multithreaded software, such software should become common as Microsoft (www.microsoft.com) continues to push Windows NT as the foundation for its future operating-system strategy. Multithreaded applications would show significant performance improvements in a multiprocessor computer running Windows NT or Unix instead of Windows 95/98, which can't take advantage of more than one CPU.

Note that I used the Xilinx XC4085XL as the target for all of my VHDL files, regardless of their design size. Only with the TOPLEVEL.VHD and TOPLVLRM.VHD files did I come close to running out of on-chip logic and routing. Compiling times, design efficiency, and design speed depend not only on the design complexity but also on how many silicon resources the compiler has to work with, and compilers respond differently to resource constraints. Also, I tested specific versions of each tool; the vendors are always making incremental improvements, and results several months and several revisions later (especially after the vendors see my numbers) might differ.

Each compiler handles state machines, arithmetic and Boolean units, datapath structures, memory blocks, and the like differently from its counterparts. I tried to highlight strengths and weaknesses of various circuit types by compiling both the top-level design and the hierarchical subsets. However, your results may differ depending on the relative percentage of circuits in your designs. You'll also notice different results depending on the effort level and your area-versus-speed priority. Netlist format and preservation or flattening of hierarchy can also impact the outcome.

This study also reveals that, when given little explicit guidance by the user's source code and program-configuration options, the compilers' abilities to infer nongeneric FPGA structures (that is, elements other than four-input look-up tables and logic-block flip-flops) vary. If you make VHDL code less silicon-independent, tune the compiler and place-and-route settings, or manually floorplan your design, efficiency and performance results will probably show significant improvement.

Your choice of synthesis language may dramatically change the outcome of your design. I have no idea how well any of the tools evaluated in this study handle Verilog, for either part or all of the design. VHDL-93 incompatibility wasn't an issue for me in synthesis, but it might be for you, and, if so, it will probably influence your tool-set decisions. Taking mixed-source design to the next level, you need to evaluate what happens when you integrate netlists generated by a schematic-capture package or from Abel, Palasm, or Altera's (www.altera.com) AHDL.

I dodged one key variable—comparing the compilation time and efficiency/performance results of a first-time design compilation versus those of a recompilation of a hierarchical design in which you have modified only one or a few modules. This variable would have made the study results too complicated to decipher. After compiling low-level design files in Synopsys' FPGA Express, I had to manually delete files created by the "elaborate" step to force FPGA Express to re-create them when compiling the top level. This step ensured an accurate measure of compilation time for the design. However, if I were designing in real life and tweaking only one entity/architecture before recompiling, I'd appreciate the much faster compilation time that the incremental approach offers.


Head to the addendum

In the Addendum to this article, you'll find lots of supplemental information. I've posted the VHDL source files, along with the design specification that guided their development, in this addendum. I have also included my compilation and data-collection batch files, as well as a set of vendor front-end and Xilinx back-end reports for each set of data in Table 2, Table 3, and Table 4.

Xilinx plans to take the netlists I generated and experiment with them in several ways. First, the company will run them through the Alliance Version 1.5 back-end tool set, using the same settings I did in 1.4. Version 1.5 was still in beta testing when I finished my project but should be available by now. Second, for both Alliance 1.4 and 1.5, the company plans to take a more active role in guiding the back-end process, using special user options, defining pinouts and timing constraints, and perhaps even manually floorplanning portions of the design before automated place and route. The addendum link will contain the results of each of these tests.

Finally, I'm turning my source code, my results using the vendors' tools, and a table template over to Accolade, Minc, and Synopsys, acting as tool experts, for their optimization efforts. They need to target the same device, speed bin, and package I used but are free to take advantage of any software upgrades, including those of Alliance, as long as the companies ship the software by Sept 11, 1998. Like Xilinx, the synthesis vendors can enable published tool options, define pinout and timing, and manually floorplan in the back end. They'll document the optimization steps so that you can duplicate their results.

In addition to documentation, the vendors will provide two sets of numbers. In the first set, they will add VHDL-standard and Xilinx-specific attributes in the source code, such as explicitly enabling one-hot encoding or assigning signals to global routing resources. For the second set, they will go one step further, revamping my VHDL (switching between CASE-WHEN and IF-THEN syntax, for example) to improve the tools' ability to infer special logic structures, such as clock enables.

Lack of an available comprehensive test-vector set means that you have to take Accolade's, Minc's, and Synopsys' word that they haven't changed the design's function as part of their second-set "improvements." However, I told the vendors not to fundamentally change the design's operation, such as pipelining the X1 and X2 transforms, which might boost operating frequency but would also alter data latency through the device.


The software

Early this year, I contacted all the EDA vendors that sell programmable-logic-synthesis software to gauge their interest in participating in my study. Accolade Design Automation, Minc, Synopsys, and Xilinx, with its Alliance Version 1.4 as the common back-end counterpart, have products represented in this article (Figure A). What about everyone else?

Aldec (www.aldec.com) doesn't recommend the compiler it includes in its Active-VHDL tool set for designs having more than 5000 gates, suggesting others' compilers beyond that point while still acting as a simulation, design-entry, and overall flow integrator. Cadence (www.cadence.com) advocates Synplicity's (www.synplicity.com) Synplify for programmable-logic designs. Exemplar Logic (www.exemplar.com) had planned to participate, but its Leonardo Spectrum tool set didn't complete beta testing before press time.

Orcad (www.orcad.com) also had planned to participate. I had installed and was about to run the company's Express tool when it announced at the Design Automation Conference in San Francisco in June that it would move to Exemplar's compiler engine for future versions, beginning around October. Because this time frame would be only a month after article publication, we agreed that it made no sense for me to move forward with the Express review.

Synplicity declined to participate unless my study was far more comprehensive than I realistically had hardware, manpower, or time to accomplish. The company insisted that I evaluate both Verilog and VHDL versions of designs. These designs would span a range of gate counts, multiple device targets, and multiple computer and operating-system platforms. Synplicity also wanted me to be far more critical of non-VHDL-93-compliant tools than I felt was necessary. Given that I had hoped to publish the results of the study sometime within this decade, I thanked the company for its suggestion and proceeded without it.

Veribest's (www.veribest.com) FPGA Desktop includes Synopsys' FPGA Express synthesis engine, which I was already evaluating. Finally, Synopsys' acquisition of Viewlogic Systems (www.viewlogic.com) meant that further development of Viewlogic's Aurora synthesis compiler wouldn't continue.


The hardware

All VHDL-to-netlist compilation took place on my Pentium-133 desktop computer, running Windows 95 and containing a first-generation Intel (www.intel.com) core logic chip set, 64 Mbytes of extended-data-out DRAM, and a 6.4-Gbyte hard-disk drive. Before compilation, I shut down all other programs, including virus-scanning software and Microsoft's (www.microsoft.com) FastFind background utility. NEC (www.nec.com) loaned me a 21-in. monitor, whose screen dimensions would have been a practical necessity if I'd moved forward with my schematic designs. I still appreciated its size and resolution, even in the textual VHDL environment.

I also attempted to use the Pentium-133 PC to run the back-end Xilinx Alliance software. However, my first place-and-route session using back-end timing constraints prematurely terminated after 36 hours with no end in sight, thanks to a power brownout at my home office and subsequent computer shutdown (Sacramento, CA, is notorious for power outages once the thermometer passes 100°F). After calculating how long a minimum of 156 sessions would take at this rate, I reached the following conclusions:

  • FPGA technology mapping, place and route, and final timing analysis take far longer than I had anticipated from my humble complex-PLD-fitter and low-gate-count-design background.
  • Guiding the tools with user-entered timing constraints significantly increases the place-and-route time. It also made little sense when I had no hard data in hand on what speed the design was capable of, so I decided not to use timing constraints for my portion of this project.
  • I needed an uninterruptible power supply to minimize the potential for frustrating delays resulting from power loss.
  • I needed to throw a lot more computing horsepower at the problem!
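The back-of-the-envelope calculation behind these conclusions can be sketched in a few lines of Python. This is only an illustration: it assumes that every one of the 156 sessions takes at least the 36 hours observed before the first session was cut off, and that the sessions run back to back on a single machine. Because that first session had not finished, the result is a lower bound rather than a measurement. The function name `total_runtime` is made up for this example.

```python
def total_runtime(sessions, hours_per_session):
    """Return (total hours, total days) for sessions run back to back."""
    total_hours = sessions * hours_per_session
    total_days = total_hours / 24
    return total_hours, total_days


if __name__ == "__main__":
    # 156 sessions and 36 hours come from the text; 36 hours is a floor,
    # because the first session was interrupted before it completed.
    hours, days = total_runtime(156, 36)
    print(f"At least {hours} hours, or about {days:.0f} days of nonstop computing")
```

Under those assumptions, the total comes to at least 5616 hours, or roughly 234 days of around-the-clock processing.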

American Power Conversion (www.apcc.com) sent me a Back-UPS Pro 650 power supply (suggested retail price $379.99), capable of approximately 45 minutes of system life on battery backup. Intel supplied a Pentium II-400-based PC using its SE440BX motherboard, containing 64 Mbytes of PC-100 synchronous DRAM (SDRAM) and a 4-Gbyte hard-disk drive and running Windows 95 OSR2 (Figure A). I used this system to run all of the Xilinx translate, map, and place-and-route sessions and most of the postroute timing analysis.

Windows 95 and the applications I tested that ran on top of it were all surprisingly stable. Because of the scripting capability built into Minc's PLSynthesizer and the ability to execute multiple iterations of Xilinx software from a DOS batch file, Windows 95 frequently ran all day, for several days at a stretch. My systems never required rebooting during these extended sessions, and I didn't have to deal with a single crash during the project.
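The unattended, back-to-back style of run described above can be illustrated with a small Python sketch. It is not the DOS batch file used in the project, and it does not reproduce any Xilinx or Minc command line: the function name `run_sessions` and the session names are hypothetical, and the commands in the demonstration are harmless placeholders that simply start the Python interpreter. The sketch runs each command in order, waits for it to finish, and records its exit code and elapsed time.

```python
import subprocess
import sys
import time


def run_sessions(sessions):
    """Run each (name, argv) pair in order.

    Returns a list of (name, exit_code, elapsed_seconds) tuples.
    """
    results = []
    for name, argv in sessions:
        start = time.monotonic()
        completed = subprocess.run(argv)  # blocks until this session ends
        elapsed = time.monotonic() - start
        results.append((name, completed.returncode, elapsed))
    return results


if __name__ == "__main__":
    # Placeholder commands only; substitute real tool command lines here.
    demo = [
        ("session-1", [sys.executable, "-c", "pass"]),
        ("session-2", [sys.executable, "-c", "raise SystemExit(3)"]),
    ]
    for name, code, seconds in run_sessions(demo):
        print(f"{name}: exit code {code}, {seconds:.1f} s")
```

Recording the exit code and run time of each session makes it easier to spot a failed or unusually slow run after the machine has been left alone for several days.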

I encountered only two software issues. Early in the project, Accolade's PeakFPGA began locking up the system when I attempted to compile hierarchical designs. I reinstalled the software over the existing installation, and the problem disappeared. I also couldn't complete Xilinx's postroute timing analysis using the company's TRCE timing-analysis tool on TOPLEVEL.VHD- and TOPLVLRM.VHD-created designs; the analyses ran for days without finishing, although Windows 95's System Monitor program indicated that the CPU was doing something. Loren Lacy, software marketing representative from Xilinx, successfully ran the designs through timing analysis on a 400-MHz Pentium II Windows NT-based workstation with 512 Mbytes of PC-100 SDRAM at the company's Boulder, CO, facility.


Table 1—Representative programmable-logic-synthesis vendors and products

Columns: Product | Features | Supported platforms | Evaluated version | Price

Accolade Design Automation, Kirkland, WA; 1-425-828-2122; fax 1-425-739-2163; www.acc-eda.com
PeakFPGA | VHDL synthesis (multiple vendors) | Windows 95/98/NT | 4.24C | $4500
PeakSuite | VHDL synthesis (multiple vendors); VHDL PRO+VITAL simulation | Windows 95/98/NT |  | $5495

Minc, Colorado Springs, CO; 1-719-590-1155; fax 1-719-590-7330; www.minc.com
PLSynthesizer | VHDL synthesis (single vendor) | Windows 95/NT |  | $4200
PLSynthesizer | VHDL synthesis (multiple vendors) | Windows 95/NT |  | $7000
PLSynthesizer | Verilog synthesis (single vendor) | Windows 95/NT |  | $4200
PLSynthesizer | Verilog synthesis (multiple vendors) | Windows 95/NT |  | $7000
PLSynthesizer | VHDL and Verilog synthesis (single vendor) | Windows 95/NT |  | $5400
PLSynthesizer | VHDL and Verilog synthesis (multiple vendors) | Windows 95/NT | 6.1.0.7 | $9000
PLSynthesizer (floating license) | VHDL synthesis (single vendor) | Windows 95/NT |  | $6300
PLSynthesizer (floating license) | VHDL synthesis (multiple vendors) | Windows 95/NT |  | $10,500
PLSynthesizer (floating license) | VHDL synthesis (single vendor) | SunOS, Solaris, HP-UX |  | $8400
PLSynthesizer (floating license) | VHDL synthesis (multiple vendors) | SunOS, Solaris, HP-UX |  | $14,000
PLSynthesizer (floating license) | Verilog synthesis (single vendor) | SunOS, Solaris, HP-UX |  | $8400
PLSynthesizer (floating license) | Verilog synthesis (multiple vendors) | SunOS, Solaris, HP-UX |  | $14,000
PLSynthesizer (floating license) | VHDL and Verilog synthesis (single vendor) | SunOS, Solaris, HP-UX |  | $10,800
PLSynthesizer (floating license) | VHDL and Verilog synthesis (multiple vendors) | SunOS, Solaris, HP-UX |  | $18,000

Synopsys, Mountain View, CA; 1-650-962-5000; fax 1-650-694-4249; www.synopsys.com
FPGA Express | VHDL and Verilog synthesis (single vendor) | Windows 95/98/NT |  | $5000
FPGA Express | VHDL and Verilog synthesis (multiple vendors) | Windows 95/98/NT | 2.1.1.3025 | $12,000
FPGA Express (floating license) | VHDL and Verilog synthesis (multiple vendors) | Windows 95/98/NT |  | $16,000
FPGA Compiler (upgrade to existing Design Compiler license) | VHDL and Verilog synthesis (multiple vendors) | Solaris, HP-UX |  | $15,000
FPGA Compiler | VHDL and Verilog synthesis (multiple vendors) | Solaris, HP-UX |  | $29,500

Xilinx, San Jose, CA; 1-408-559-7778; fax 1-408-559-7114; www.xilinx.com
Alliance | Translate, map, place and route, and configure (fewer than 8000 gates) | Windows 95/NT |  | $95
Alliance | Translate, map, place and route, and configure (fewer than 8000 gates) | Solaris, HP-UX, AIX |  | $750
Alliance | Translate, map, place and route, and configure (product line) | Windows 95/NT | 1.4.12 | $5995
Alliance | Translate, map, place and route, and configure (product line) | Solaris, HP-UX, AIX |  | $3995
Foundation | VHDL/Verilog synthesis (single vendor); translate, map, place and route, and configure (fewer than 8000 gates) | Windows 95/NT |  | $1495
Foundation | VHDL/Verilog synthesis (single vendor); translate, map, place and route, and configure (product line) | Windows 95/NT |  | $7995

Table 2—Accolade PeakFPGA results


Table 3—Minc PLSynthesizer results


Table 4—Synopsys FPGA Express results


Acknowledgements

As during Part 1 of this project, Loren Lacy from Xilinx was a big help. He provided welcome assistance in debugging numerous hardware and software problems at all hours. I appreciated his good humor during those dark moments when I seriously doubted whether I’d ever complete this project. I’d also like to thank synthesis-vendor representatives Dave Pellerin from Accolade Design Automation; Kevin Bush, Michel Crastes, and Freddy Engineer from Minc; and Ramine Roane from Synopsys.


Brian Dipert, Senior Technical Editor

You can reach Technical Editor Brian Dipert at 1-916-454-5242, fax 1-916-454-5101, edndipert@worldnet.att.net.



Copyright © 1998 EDN Magazine, EDN Access. EDN is a registered trademark of Reed Properties Inc, used under license. EDN is published by Cahners Business Information, a unit of Reed Elsevier Inc.