Design Feature: August 15, 1996

Wring PCI performance from programmable logic

Doug Conner,
Technical Editor

PCI cores for programmable logic—if they meet your performance requirements—can help you more quickly finish your system design. But, beware: Designing for 33-MHz PCI pushes the envelope for most PLDs, and PCI compliance is a thorny issue.

The PCI bus is perhaps today's most popular high-performance bus, finding use in many PCs and workstations plus myriad other closed and open systems. The bus comes in a number of variations, such as CompactPCI, that target specific markets (Reference 1). Its features include operation from 0 to 33 MHz, a 32-bit data width, and burst-mode transfers that let you potentially transfer data as fast as 132 Mbytes/sec. (A 64-bit-wide version doubles that data rate.) A 66-MHz version of PCI doubles the performance again and places it well beyond the capabilities of today's PLDs. Even designing for 33-MHz PCI taxes most PLDs to their limits. Nevertheless, manufacturers of PLDs, including complex PLDs and FPGAs, and the designers who use them are stepping up to the challenge.
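The peak figures follow from simple arithmetic: one data phase per clock times the bus width in bytes. The short Python sketch below reproduces them. The function name is illustrative only, and the rates are theoretical peaks that ignore the address phase and any wait states.

```python
def peak_rate_mbytes_per_sec(clock_mhz, bus_width_bits):
    """Peak burst rate, assuming one data phase on every clock cycle."""
    bytes_per_transfer = bus_width_bits / 8
    return clock_mhz * bytes_per_transfer


# Each doubling of bus width or clock rate doubles the theoretical peak.
for clock_mhz, width_bits in [(33, 32), (33, 64), (66, 32), (66, 64)]:
    rate = peak_rate_mbytes_per_sec(clock_mhz, width_bits)
    print(f"{clock_mhz} MHz, {width_bits} bit: {rate:.0f} Mbytes/sec")
```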

When considering programmable logic for PCI, first consider the alternatives: standard or custom chips. Many companies provide standard PCI-interface chips that may fit your needs. However, many designers don't use these devices because they can't find exactly what they need in a standard PCI chip (see box, "Why not use standard chips?"). In that case, designers have to use custom designs: mask-programmed ASICs, which find extensive use in high-volume applications, or programmable logic. Mask-programmed ASICs make sense from both cost and performance standpoints. ASICs are also attractive for PCI interfaces that connect to back-end applications requiring high gate counts. Even with the high gate counts available on some of today's PLDs, ASICs can offer more gates. However, many users cannot justify ASICs' NRE costs or longer design time.

Why not use standard chips?

Designers who choose PLDs for PCI applications have their reasons for not using standard chips. For example, Mark Santoro, president of Santoro Systems Engineering, used Actel's CorePCI target core and an Actel device for an asynchronous-transfer-mode telecommunications application. Santoro says he didn't use off-the-shelf chips for this application because they lack the extended-data-out (EDO) DRAM controller the application needed. He would have had to use another chip for the EDO DRAM.

"Every time you go on and off chip, you lose time," says Santoro. "If you can put the back-end application on the same chip, you reduce latency."

Another designer, Richard Lomas, president of Lomas Data Corp (Marlborough, MA), used Xilinx's LogiCore PCI Target core in an XC4013E for a PCI-based RAID (redundant array of inexpensive disks) 1 disk-mirroring application. The system provides a fully redundant disk backup. Lomas did not use an off-the-shelf chip because he designed the board around the existing interface to minimize software changes. You can't buy a standard PCI chip for that purpose, says Lomas.

Another user of PLDs for PCI is Mike Dini, president of the Dini Group. Dini has designed four PCI systems with off-the-shelf PCI chips plus three with Quicklogic devices. He also believes that off-the-shelf controllers do not do quite what you want. When he did use off-the-shelf chips, Dini says, he discovered that a custom chip would probably have better met his needs.

Programmable logic, on the other hand, makes sense for lower volumes, and it provides shorter time to market and lower cost. Eliminating the NRE cost and avoiding the need to make large production commitments make PLDs an attractive choice for PCI applications compared with ASICs, but there is more to consider, according to Raj Raghavan, president of Virtual Chips (San Jose, CA), which provides two dozen PCI cores for ASIC applications. Raghavan cautions designers that they must fully understand PCI before they can make it work with FPGAs.

"Ironically, the most sophisticated customers are in the ASIC world," says Raghavan, "and they really don't have to be that sophisticated. The tools are there, and the processes are there. You can pretty much insulate the customer."

Yet another alternative to mask-programmed ASICs and standard PLDs is laser-programmed devices from such companies as Chip Express. The devices offer masked-gate-array speeds, densities as high as 200,000 gates, one- to five-day prototype turnaround, and one-week production turnaround. The company sells 20,000-gate ASICs for $15 (10,000) without NRE charges and including three design iterations. This approach still requires a sizable production commitment compared with PLDs but much less than that of standard ASICs.

In addition to all these choices, designers face yet another roadblock: PCI compliance. Any company providing ICs for PCI should be able to provide designers with a PCI-compliance checklist. The checklist provides a summary of electrical and timing characteristics to meet the PCI specifications. A device that fails to meet the requirements of the checklist is not PCI-compliant, although the design may work in a closed-architecture system under less stringent requirements. For example, output buffers that don't exhibit the required drive over the full temperature range may operate over a more restricted range.

A device that meets all the requirements on the PCI checklist offers some level of PCI compliance, but meeting the requirements offers no guarantee that designers can design a PCI-compliant board with the device or that the board can operate without wait states in burst mode. Mike Dini, president of the Dini Group (La Jolla, CA), has designed four PCI systems with off-the-shelf PCI chips plus three with Quicklogic devices. He says that even operations such as parity-error checking, which requires the design to perform a 36-bit XOR in 30 nsec, may cause problems.
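The 36 bits come from PCI's parity definition: even parity across the 32 address/data lines and the four command/byte-enable lines. As a rough illustration of why the check is deep logic, the Python sketch below models the parity as a tree of two-input XORs and counts the levels. The function names are hypothetical, and the two-input tree is an assumption made for illustration; it does not describe how any particular PLD maps the logic.

```python
def xor_tree(bits):
    """Reduce a list of bits with two-input XORs; return (parity, tree depth)."""
    levels = 0
    while len(bits) > 1:
        paired = [bits[i] ^ bits[i + 1] for i in range(0, len(bits) - 1, 2)]
        if len(bits) % 2:  # an odd bit passes through to the next level
            paired.append(bits[-1])
        bits = paired
        levels += 1
    return bits[0], levels


def pci_parity(ad, cbe):
    """Even-parity bit over AD[31:0] and C/BE#[3:0], plus the XOR-tree depth."""
    if not 0 <= ad < 2**32 or not 0 <= cbe < 2**4:
        raise ValueError("ad must fit in 32 bits and cbe in 4 bits")
    word = (cbe << 32) | ad
    return xor_tree([(word >> i) & 1 for i in range(36)])


par, depth = pci_parity(ad=0x00000003, cbe=0x1)
print(par, depth)  # three 1s in the word, so the parity bit is 1; depth is 6
```

For 36 inputs, this tree is six levels deep, and every level adds delay that must fit inside the 30-nsec clock period alongside routing and I/O delays.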

"I doubt whether anyone is actually meeting that timing," says Dini. "Using handcrafted design, you may actually get to the 30-nsec result, but you give up some flexibility in the design."

Another critical factor in PCI design is the delay through buffers. Data must get from the PCI clock edge onto the bus within 11 nsec. During that time, the clock from the outside must go through the input buffer to the register, and the data must go from the register through the output buffer and onto the PCI bus. Raghavan says methods exist for fixing the delays on systems that are too slow, but any method takes two clock cycles instead of one. Most FPGA companies offer PCI-compliant buffers, according to Raghavan, but the companies "forget" to tell designers what the delay through these buffers is.
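The clock-to-output path can be checked as a simple sum against the 11-nsec limit. The Python sketch below does that. The class name, field names, and all delay values are made-up placeholders, not data for any real device, so substitute the worst-case figures from your vendor's data sheet.

```python
from dataclasses import dataclass

T_VAL_MAX_NS = 11.0  # clock-to-output limit quoted in the text


@dataclass
class OutputPath:
    """Delays along the clock-to-output path, in nsec (placeholder values)."""
    clock_buffer_ns: float   # clock pin to register clock input
    clock_to_q_ns: float     # register clock to register output
    output_buffer_ns: float  # register output to PCI pin

    def clock_to_out_ns(self) -> float:
        return self.clock_buffer_ns + self.clock_to_q_ns + self.output_buffer_ns

    def slack_ns(self) -> float:
        return T_VAL_MAX_NS - self.clock_to_out_ns()


# Hypothetical numbers only, chosen to show one passing and one failing case.
fast = OutputPath(clock_buffer_ns=3.0, clock_to_q_ns=1.5, output_buffer_ns=5.0)
slow = OutputPath(clock_buffer_ns=4.0, clock_to_q_ns=2.0, output_buffer_ns=6.5)
for name, path in [("fast", fast), ("slow", slow)]:
    verdict = "meets" if path.slack_ns() >= 0 else "misses"
    print(f"{name}: {path.clock_to_out_ns()} nsec, {verdict} the 11-nsec limit")
```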

"People are looking for simple answers to complex problems," says Raghavan. "It's difficult to get accurate information."

Yet another critical part of PCI-compliant design is the requirement for a 7-nsec setup time and 0-nsec hold time. According to Raghavan, a design sometimes needs analog delays to satisfy the 0-nsec hold-time requirement.
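To see why an added delay can help, refer the input register's own setup and hold times to the device pins: delay in the data path shortens the hold time seen at the pin but lengthens the setup time by the same amount. The Python sketch below captures that trade-off under simplifying assumptions: one delay value per path rather than separate minimum and maximum corners, hypothetical function names, and invented numbers throughout.

```python
PCI_SETUP_NS = 7.0  # setup time available at the pin, from the text
PCI_HOLD_NS = 0.0   # hold time available at the pin, from the text


def pin_setup_ns(reg_setup, data_delay, clock_delay):
    """Setup time the device needs at its pins."""
    return reg_setup + data_delay - clock_delay


def pin_hold_ns(reg_hold, data_delay, clock_delay):
    """Hold time the device needs at its pins."""
    return reg_hold + clock_delay - data_delay


def meets_pci_input_timing(reg_setup, reg_hold, data_delay, clock_delay):
    return (pin_setup_ns(reg_setup, data_delay, clock_delay) <= PCI_SETUP_NS
            and pin_hold_ns(reg_hold, data_delay, clock_delay) <= PCI_HOLD_NS)


# Invented values: the clock path is slower than the data path, so the pin
# needs 1.5 nsec of hold time and the check fails.
print(meets_pci_input_timing(1.0, 0.5, data_delay=2.0, clock_delay=3.0))

# Adding 2 nsec of delay to the data path removes the hold requirement
# at the cost of 2 nsec of setup margin, and the check passes.
print(meets_pci_input_timing(1.0, 0.5, data_delay=4.0, clock_delay=3.0))
```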

"That issue is also a tough design challenge," he says. "If your design has a Verilog or VHDL core, you can synthesize that core to handle logic design. However, the core still cannot handle the analog delay. Some FPGA vendors avoid this problem with a handcrafted design. However, it may be difficult to synthesize a core to an FPGA and have the core operate. You may spend two months trying to fix the 0-nsec hold-time problem."

Users report success

Not every user is suffering from these difficult timing problems, however. Mark Santoro, president of Santoro Systems Engineering (Encino, CA), used Actel's CorePCI VHDL model for a PCI target and an Act 3 family device. The resulting design had no timing problems. The design supports burst-mode read and write transfers with zero wait states. Burst transfers for writes operate at a 3-1-1-1 rate, meaning that the first transfer requires three clock cycles and subsequent transfers each require a single clock cycle. A burst transfer of four 32-bit data words requires six clock cycles and provides a data rate of 89 Mbytes/sec. Longer bursts approach the maximum 132-Mbyte/sec data rate.
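The burst arithmetic generalizes to any burst length. The Python sketch below assumes the 3-1-1-1 pattern described above, 4 bytes per word, and no wait states; the function name and defaults are illustrative. Note that a nominal 33-MHz clock gives 88 Mbytes/sec for a four-word burst, whereas a 30-nsec clock period (33.33 MHz) gives about 89 Mbytes/sec, so the quoted figure appears to use the latter.

```python
def burst_rate_mbytes_per_sec(words, clock_mhz=33.0, first_cycles=3,
                              bytes_per_word=4):
    """Effective data rate of one burst with an x-1-1-1 cycle pattern."""
    if words < 1:
        raise ValueError("a burst transfers at least one word")
    cycles = first_cycles + (words - 1)  # first word, then one clock per word
    return words * bytes_per_word * clock_mhz / cycles


for words in (1, 4, 16, 64):
    rate = burst_rate_mbytes_per_sec(words)
    print(f"{words:3d}-word burst: {rate:.1f} Mbytes/sec")
```

With these assumptions, a 64-word burst takes 66 clocks and reaches 128 Mbytes/sec, within a few percent of the 132-Mbyte/sec peak.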

Santoro's design required little manual work. He synthesized most of the design from VHDL, although he used schematics for explicit control in some parts of the design.

Dini, meanwhile, uses macros on his PCI designs when they are available. Otherwise, he uses Verilog until he encounters speed problems and then uses schematics to solve the problems.

Designers would typically rather have a PCI-core design in VHDL or Verilog that they can customize for their applications. This scenario is sometimes impossible, however, for performance reasons. Xilinx's LogiCore PCI Target design, a preplaced macro for some XC4000E-family devices, lets you customize FIFO buffers and the back-end interface (Figure 1). The PCI interface end, however, is a carefully designed macro that accommodates few modifications. Another designer, Richard Lomas, president of Lomas Data Corp (Marlborough, MA), used Xilinx's LogiCore PCI Target core in an XC4013E for a PCI-based RAID (redundant array of inexpensive disks) 1 disk-mirroring application. Lomas successfully used the LogiCore Target design without encountering timing problems.

To help you develop PCI interfaces, PLD companies offer help ranging from often-free application notes and reference designs to complete core designs with test vectors for verifying your design. Table 1 lists some representative products to help you design PCI interfaces. New products are appearing rapidly, so contact the PLD companies directly for their latest offerings. The PLD companies can also let you know about third parties providing PCI cores and other products applicable to the vendors' devices.

Table 1—Representative PLD products for PCI designs

| Manufacturer | Product | PLD company | Target PLD family | Type of design | Burst-transfer support | Datapaths supported | Design-language implementation | Simulation/design verification | Price |
|---|---|---|---|---|---|---|---|---|---|
| Actel | CorePCI-Slave | Actel | Act 3 | Target | Yes | 32 bit | VHDL, Verilog | Complete PCI testbench | $3995 |
| Actel | CorePCI-Master | Actel | Act 3 | Master | Yes | 32 bit | VHDL, Verilog | Complete PCI testbench | $4995 |
| Actel | CorePCI-Bridge | Actel | Act 3 | Bridge | Yes | 32 bit | VHDL, Verilog | Complete PCI testbench | $4995 |
| Altera | Target | Altera | Flex, Max, Flashlogic | Target | No | 32 bit | AHDL | Basic simulation coverage | Free |
| Altera | Target | Altera | Flex 8000 | Target | Yes | 32 bit | AHDL | Partial test-vector coverage | Free |
| Altera | Master/Target | Altera | Flex, Max, Flashlogic | Master/target | No | 32 bit | AHDL | Basic coverage | Free |
| AMD | PCI design kit | AMD | Mach | Target | Yes | 32 bit | Schematics and equations | Verilog simulation | Free |
| Crosspoint Solutions | CoreBank PCI | Crosspoint Solutions | CP20K | Target | Yes | 32 bit | Verilog and schematic | Testbench and stimulus | $1995 |
| Cypress Semiconductor | UltraLogic PCI Design Kit | Cypress | pASIC380, Flash370i | Target | Yes | 32 bit | VHDL | Viewsim command file | Free |
| Eureka Technology | EC100 | Altera | Flex 8000 | Target | Yes | 32 bit | Verilog | Test vectors | Less than $10,000 |
| Eureka Technology | EC110 | Altera | Flex 8000 | Target | No | 32 bit | Verilog | Test vectors | Less than $10,000 |
| Eureka Technology | EC200 | Altera | Flex 8000 | Target | Yes | 32 bit | Verilog | Test vectors | Less than $10,000 |
| Logic Innovations | Master Target | Altera | Flex 10K | Master/target | Yes | 64 bit | VHDL, Verilog, AHDL, netlist | Complete PCI testbench | From $15,000 |
| Logic Innovations | Master Target | Altera | Flex 10K | Master/target | Yes | 32 bit | VHDL, Verilog, AHDL, netlist | Complete PCI testbench | From $15,000 |
| Logic Innovations | Target | Altera | Flex 10K | Target | Yes | 32 bit | VHDL, Verilog, AHDL, netlist | Complete PCI testbench | From $15,000 |
| Lucent Technologies | PCI design kit | Lucent Technologies | ORCA | Target | Yes | 32 bit | VHDL, Verilog | Complete PCI testbench | $5000 |
| Lucent Technologies | PCI design kit | Lucent Technologies | ORCA | Master | Yes | 32 bit | VHDL, Verilog | Complete PCI testbench | $15,000 |
| Quicklogic | PCI design kit | Quicklogic | pASIC 1 | Master/target | Yes | 32 bit | Verilog and schematic | Full Verilog PCI testbench | Free |
| Xilinx | LogiCore PCI Target | Xilinx | XC4000E | Target | Yes | 32 bit | Schematic design | Fully verified design | $4995 |
| Xilinx | LogiCore PCI Initiator | Xilinx | XC4000E | Master/target | Yes | 32 bit | Schematic design | Fully verified design | $8995 |

With all the emphasis PLD manufacturers place on meeting the PCI front-end speed requirements, however, don't lose sight of the back-end interface to your application. If your PCI interface meets the burst-mode transfer rates without wait states, the back end of your design must also meet those same data rates unless it is just moving short bursts through a FIFO buffer. If performance for the high data rates is difficult for the PCI interface, then going off the PCI interface chip to another chip may also offer performance challenges.
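To get a feel for how much buffering a slower back end needs, you can model the FIFO cycle by cycle. The Python sketch below is a deliberately simple model with assumed behavior: the PCI side writes one word on every clock of the burst, and the back end removes one word every N clocks. The function and parameter names are hypothetical.

```python
def max_fifo_occupancy(burst_words, drain_every_n_clocks):
    """Peak number of words held in the FIFO during one zero-wait-state burst."""
    occupancy = 0
    peak = 0
    for clock in range(burst_words):
        occupancy += 1  # PCI side writes one word per clock
        peak = max(peak, occupancy)
        if (clock + 1) % drain_every_n_clocks == 0:
            occupancy -= 1  # back end removes one word
    return peak


for drain in (1, 2, 4):
    depth = max_fifo_occupancy(burst_words=16, drain_every_n_clocks=drain)
    print(f"drain every {drain} clock(s): FIFO must hold {depth} words")
```

In this model, a back end that keeps pace with the bus needs only a single word of buffering, whereas one that accepts a word every other clock needs a FIFO a little more than half as deep as the longest burst.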

Santoro notes that when you go on and off chip, buffer times become much longer than internal propagation times. In addition, some PLDs have space only for the target interface and none for the back-end logic. Going off-chip also adds latency.

Santoro got the PCI target interface and the extended-data-out DRAM controller he needed on a single Actel Act 3 device. Smaller PLDs, even though they may be fast enough for PCI compliance, may end up causing timing-related problems if you can't get your back-end application on the same chip. The higher density PLDs offer space for relatively large designs in addition to the PCI interface, but you must also consider the design details of your back-end application—not just space for the logic.

Most PLD companies offering designs for PCI concern themselves with having target, initiator, and bridge designs and with what level of support designs offer for functions such as burst-mode transfers. Virtual Chips offers two dozen PCI cores, because Ethernet, ATM, and communications customers have different needs in the control they need on the data flow, according to Raghavan. When people buy a PCI core or a PCI design, they need user-interface flexibility at the back end to fit their applications.

The requirements of your application can also affect how much work remains for you beyond the PCI interface. Even though you have a 32-bit-wide PCI-bus interface, your application may need a 64-bit-wide bus. Your application may operate synchronously or asynchronously with the PCI bus. Virtual Chips offers cores for ASICs that accommodate these design requirements without additional design work for the core user.

PCI cores that don't offer all these variations still save you work. Think of PCI cores as a way to trade money for reduced design time and less need for PCI-specific design knowledge. Cores that offer a less customized interface to your back-end application still offer value, just less value than those that provide a simpler back-end interface.

Verifying design compliance

Verifying that your design complies with the PCI specification is another important part of the design task and depends on the system you are designing. Designs for open use in PCI systems need full compatibility if these designs are to plug into any system and operate properly. If you use a free reference design or application note, you are usually on your own when it comes to verification. Cores or drop-in modules typically come with standard or optional testbenches, simulation models, and other means to verify that the design works.

Using the PCI bus in closed, or embedded, systems can significantly alter the verification process—and even the entire design requirements—because the design must be compatible only within the system. This use of PCI has grown into a major application of the bus. The bus's high performance and complete documentation allow its use in these closed systems. Companies can also eliminate or alter parts of the PCI standard that do not suit the application.

Santoro did not check his design for compliance with PCI because the application was a closed system and thus did not require compliance. Santoro's company used gate-level simulation of the design and then functional simulation of various modes. Santoro says that, if he were using the design in an open system, he'd want a full test suite in VHDL or Verilog with which to simulate the design.

In Lomas' case, Xilinx provided the PCI simulation, and Lomas used a Viewlogic simulator to verify as much of the design as possible. Lomas needed good verification, because the board targeted open PCI use. The designers tried the board on as many systems as possible, including 166-MHz Pentiums and 486 systems, to ensure that no problems existed. In Dini's case, the designers used simulation to verify that the designs worked. Virtual Chips also extensively tests the PCI cores that it sells.

"We have tests that run for a full week testing everything we can think of," Raghavan says. "You need a large set of tests to verify operation for all conceivable situations."

Take all these design issues into consideration if you plan to begin a PCI design using PLDs. Know your performance requirements and thoroughly research the programmable logic you plan to use.

Looking ahead

You can expect the availability of programmable-logic cores and large macros to grow rapidly, not just for PCI but for all types of complex-logic functions. The PLD manufacturers see these cores as vital to their strategy of maintaining a short time to market and simplifying design, especially for devices with more than 10,000 gates. The PLD manufacturers and a variety of third-party intellectual-property companies are working to fill the need.




You can reach Technical Editor Doug Conner at (805) 461-9669; fax (805) 461-9640; e-mail edndconner@mcimail.com

References

  1. Quinnell, Richard A, "The Mighty Morphin' PCI bus," EDN, April 25, 1996, pg 58.

For free information...
When you contact any of the following manufacturers directly, please let them know you read about their products at the EDN Magazine WWW site.
Actel Corp
Sunnyvale, CA
(800) 228-3532
fax (408) 739-1540
www.actel.com
Altera Corp
San Jose, CA
(408) 894-7000
fax (408) 944-0952
AMD
Sunnyvale, CA
(800) 222-9323
www.amd.com
Chip Express Corp
Santa Clara, CA
(408) 988-2445
fax (408) 988-2449
www.chipexpress.com
Crosspoint Solutions Inc
Milpitas, CA
(408) 324-0200
fax (408) 324-0123
Cypress Semiconductor Corp
San Jose, CA
(408) 943-2600
fax (408) 943-6848
www.cypress.com
Eureka Technology
Los Altos, CA
(415) 960-3800
fax (415) 960-3805
Logic Innovations
San Diego, CA
(619) 455-7200
fax (619) 455-7273
www.logici.com
Lucent Technologies Inc
Allentown, PA
(800) 372-2447
fax (610) 712-4106
www.attme.com/fpga
PCI Special Interest Group Publications
Portland, OR
(800) 433-5177
fax (503) 234-6762
Quicklogic Corp
Santa Clara, CA
(408) 987-2000
fax (408) 987-2012
Virtual Chips
San Jose, CA
(408) 452-1600
fax (408) 452-0952
www.vchips.com
Xilinx Inc
San Jose, CA
(408) 559-7778
fax (408) 559-7114
www.xilinx.com





Copyright © 1996 EDN Magazine. EDN is a registered trademark of Reed Properties Inc, used under license.