

Design Feature: July 18, 1996

Chip hardware and software: Why can't they just get along?

Jim Lipman,
Technical Editor

Everyone is talking about hardware/software codesign, but few companies are doing much about it. Knowing what EDA tools are available and how to apply them to systems on chips simplifies this difficult task.

Virtually all chips containing one or more processing engines need software to operate. You need to develop this software, whether it resides in on-chip ROM or externally, alongside the chip's hardware. Unfortunately, the story about a project team's hardware and software developers exchanging business cards when they meet approaches the truth. System-on-a-chip hardware and software design and implementation are often separate, parallel operations until the late stages of hardware development (Figure 1). This situation often produces specification and implementation mismatches between hardware and software components.

A design methodology that waits until a hardware prototype is available before checking software compatibility causes many problems. If testing the software exposes a problem, you cannot easily change the hardware without significantly delaying final product release. Also, hardware description languages (HDLs), such as Verilog and VHDL, along with HDL simulators, have greatly accelerated chip-hardware development over the past few years. This acceleration puts even more pressure on software-development teams to interface with hardware developers earlier in the system-design cycle. Fortunately, help is on the way in the form of EDA tools and design methodologies that can ease your hardware/software codesign burden.

System partitioning

Many factors influence how you partition a system into hardware and software (Reference 1), including:

Even though you cannot automatically do the partitioning, you can use tools from several software vendors, such as Alta Group, CPU Technology, and Synopsys (Table 1), to validate high-level system representations of partitionings and choose the best one. Today's tools, however, can only point you in the right direction. You still need to simulate your design at lower levels of abstraction to determine if you are meeting required speed, size, power, and cost specifications.
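To make the idea of comparing candidate partitionings concrete, here is a minimal sketch that enumerates every hardware/software assignment for three blocks and keeps the cheapest one that meets a latency budget. The block names, the cost and latency figures, and the helper names `evaluate` and `best_partition` are all hypothetical, and the sketch assumes the blocks execute one after another so that latencies simply add. Real tools work from far richer system models.

```python
from itertools import product

# Hypothetical per-block estimates, invented for illustration:
# (hw_cost, hw_latency, sw_cost, sw_latency) in arbitrary units.
BLOCKS = {
    "filter":   (8.0, 1.0, 1.0, 6.0),
    "codec":    (12.0, 2.0, 2.0, 9.0),
    "protocol": (5.0, 1.0, 0.5, 2.0),
}

def evaluate(assignment):
    """Return (total_cost, total_latency) for a {block: "hw" | "sw"} mapping."""
    cost = 0.0
    latency = 0.0
    for name, target in assignment.items():
        hw_cost, hw_latency, sw_cost, sw_latency = BLOCKS[name]
        if target == "hw":
            cost += hw_cost
            latency += hw_latency
        else:
            cost += sw_cost
            latency += sw_latency
    return cost, latency

def best_partition(max_latency):
    """Return the cheapest (assignment, cost, latency) that meets the budget, or None."""
    best = None
    for targets in product(("hw", "sw"), repeat=len(BLOCKS)):
        assignment = dict(zip(BLOCKS, targets))
        cost, latency = evaluate(assignment)
        if latency <= max_latency and (best is None or cost < best[1]):
            best = (assignment, cost, latency)
    return best

if __name__ == "__main__":
    print(best_partition(max_latency=10.0))
```

Exhaustive enumeration is only practical for a handful of blocks, because the number of assignments doubles with each block you add. That growth is one reason partitioning remains a largely manual, expertise-driven step.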

Table 1—Examples of EDA-tool and -equipment vendors for hardware/software codesign

Company                    Product           Function                               Price (1)         Comments
Aisys Intelligent Systems  DriveWay          µC/µP drive-code generation            $195 to $1995
Alta Group                 Envision          High-level system design/verification  $60,000           Focused on multimedia design
Aptix                      System Explorer   Hardware emulation                     $65,000
                           Replicate boards                                         $12,000 (2)
CAE Plus                   ArchGen           Hardware/software validation           $69,900
CPU Technology             SystemLab         Virtual-prototyping environment        Design dependent
Eagle Design Automation    Eaglei            Hardware/software codevelopment        $40,000 (3)       Virtual-prototyping service
                           Eaglev            ASIC validation tools                  $40,000 (3)
Ikos                       VirtuaLogic SLI   Hardware emulation                     $150,000 (5)
I-Logix                    Magnum            Hardware/software codesign             $25,000 (4)
Mentor Graphics            Seamless CVE      Hardware/software cosimulation         $75,000
                           SimExpress        Hardware emulation                     $100,000
Quickturn Design Systems   System Realizer   RTL emulation                          $240,000
                           HDL-ICE           RTL in-circuit-emulation software      $45,000
Synopsys                   COSSAP            DSP-design tool suite                  $40,000 (6)       Code generation for DSP chips
Zycad                      Paradigm RP       Rapid prototyping system               $30,000

(1) Starting prices, unless otherwise noted.
(2) Used with an existing System Explorer emulation system.
(3) Price excludes Virtual Software Processor models; prices for those models start at $5000.
(4) Price excludes Sharpshooter code generators; prices for those generators start at $10,000.
(5) Also requires a logic analyzer for operation (not included in price).
(6) Hardware- and software-implementation kits are available for $66,100 and $50,000, respectively. Both include COSSAP.

Cosimulation techniques

Throughout your system design, you need to simulate the hardware and software to see whether they work together correctly. Cosimulation lets hardware simulators and hardware models interface with system software. This interface lets you concurrently verify the software while exploring hardware trade-offs. Each type of simulation involves trade-offs between simulation speed and accuracy. You need simulation environments that adequately handle the software and hardware synchronization. You also need to be able to debug the system when problems appear. Debugging involves being able to "see" inside the internal states and register contents of hardware modules. Another factor to consider is the availability of simulation models, both for hardware and software modules. Table 2 from Reference 2 compares cosimulation techniques.

Table 2—Comparison of hardware/software cosimulation techniques

Method                  Speed (instructions/sec)        Debug capability                 Model complexity  Turnaround time  Software checking?  Hardware checking?
Nanosecond accurate     1 to 100                        Best                             Hardest           Fast             OK                  Yes
Cycle accurate          50 to 1000                      Excellent                        Hard              Fast             OK                  Yes
Instruction level       2000 to 20,000                  OK                               Medium            Fast             Yes                 OK
Synchronized handshake  Limited by hardware simulation  No processor states              None              Fast             Yes                 OK
Virtual hardware        Fast                            No processor or hardware states  None              Fast             Yes                 No
Bus functional          Limited by hardware simulation  No processor states              Easier            Fast             No                  Yes
Hardware modeler        10 to 50                        No processor states              Timing only       Fast             OK                  Yes
Emulation               Fast                            Limited                          None              Slow             OK                  OK

The type of model you use to represent a hardware component reflects a trade-off between the accuracy, or detail, of the part you are modeling and the speed of simulation. The most precise are nanosecond-accurate models, which have accurate timing information at the model's pins and include complete functionality. This type of model is good for checking system hardware details. The high accuracy comes at a price, however. The model is difficult to design, and its simulation is slow. You may also have problems getting a nanosecond-accurate model of a particular component. The part's manufacturer, fearing reverse-engineering of a model into a competitive part, may not make such a model available.

Of less complexity is the cycle-accurate, or zero-delay, model. This model supplies correct pin transitions at each clock edge. Fewer unique event times during simulation result in a speed-up of approximately a factor of 10 over a nanosecond-accurate model.

A still-faster model is one that is instruction-set-accurate. This model, also called an Instruction Set Simulation (ISS) model, correctly emulates the instruction set, meaning that the model correctly shows register and memory values but provides no timing information. This type of model is generally not useful for hardware evaluation but is good for debugging software. ISS models do not correctly model some complex processor behavior, such as pipeline timing problems, but they simulate faster than cycle-accurate models do. Generally, the faster a model simulates, the less accurately it represents the system's hardware.
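To get a feel for what these speed differences mean in practice, the following sketch converts the instruction-rate ranges listed in Table 2 into wall-clock simulation time for a software workload. The 1 million-instruction workload and the helper name `estimate_seconds` are assumptions chosen for illustration. Only the speed ranges come from the table.

```python
# Speed ranges in instructions/sec, as listed in Table 2: (slowest, fastest).
SPEED_RANGES = {
    "nanosecond accurate": (1, 100),
    "cycle accurate": (50, 1000),
    "instruction level": (2000, 20000),
}

def estimate_seconds(instructions):
    """Map each model type to (best_case_seconds, worst_case_seconds)."""
    result = {}
    for model, (slowest, fastest) in SPEED_RANGES.items():
        best_case = instructions / fastest
        worst_case = instructions / slowest
        result[model] = (best_case, worst_case)
    return result

if __name__ == "__main__":
    # Hypothetical workload: a 1 million-instruction start-up routine.
    for model, (best, worst) in estimate_seconds(1_000_000).items():
        print(f"{model}: {best / 3600:.2f} to {worst / 3600:.2f} hours")
```

Under these assumptions, the same routine that an instruction-level model finishes in under 10 minutes could occupy a nanosecond-accurate model for anywhere from a few hours to more than a week.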

Several EDA companies offer tools for the hardware/software codesign problem; none of these tools is currently completely satisfactory. Moving a step toward "easier" hardware/software coverification is Mentor Graphics with the recent announcement of the Seamless CoVerification Environment (CVE). Seamless CVE is the first product stemming from last October's merger of Mentor Graphics and Microtec Research (Santa Clara, CA), a software-development-tool maker. Seamless CVE lets software developers interact with hardware models at the behavioral level and register-transfer level (RTL). The tool's cosimulation kernel works with normally unrelated hardware- and software-design tools (Figure 2). Mentor claims that Seamless CVE performs simulation two to three times faster than does a gate-level simulation.

Seamless CVE presents a true processor model to a logic simulator. The model interacts with the system bus, so that the system correctly models and executes software and hardware interactions. The cosimulation kernel uses synchronization algorithms for timing, optimization algorithms to speed simulation, and memory-management logic to speed data transfer between hardware and software memory models. Seamless CVE supports several popular Verilog and VHDL simulators, including the company's own QuickSim II and QuickHDL and Cadence's (San Jose, CA) Verilog-XL. For very high-performance applications, you can use a hardware accelerator to speed simulation. Microtec's Xray debugger eases software development and source-code tracing. Xray links to commercial and proprietary real-time operating systems (RTOSs) and to compilers for C, C++, and assembly language. Mentor also offers processor and DSP-engine models of Advanced RISC Machines (Los Gatos, CA), Hitachi (Brisbane, CA), Intel (Folsom, CA), Motorola (Austin, TX), SGS-Thomson (Lincoln, MA), and Texas Instruments (Dallas) to assist system development.

Virtual prototypes

You use a virtual prototype in place of hardware to help define your system and to debug it after you define the hardware. The idea of a virtual prototype, a software model of a hardware system or subsystem, is conceptually simple but difficult to implement. When you use these models early in the design process, they let you do what-if analyses to decide what portion of a system you should implement in hardware and software. Cost, size, power, and time-to-market constraints are your guides in making this decision. Simulating system software on virtual prototypes helps you make hardware trade-offs; for example, should you put a larger cache on an ASIC processor chip or use faster and more expensive memory chips? After you decide what the hardware should look like, using a virtual prototype to test the application or embedded software tells you if the choice of hardware implementation and software definition is correct. If the software functions incorrectly, you must decide whether to change it or to modify the proposed hardware. Both of these tasks are still relatively easy, because you have not yet implemented your system in real hardware.

You can define virtual prototypes at various stages of your hardware's development: the conceptual, architectural, RTL, or even gate level. The earlier you define and simulate a virtual prototype, the easier it is to make any necessary changes if the system functions incorrectly. Another advantage of using high-level virtual prototypes is that you can simulate them faster than you can lower-level software models. However, because you are using software models in place of real hardware, simulation is much slower than emulating the actual hardware or using a hardware prototype.

One company, CPU Technology, offers a virtual-prototyping service using the company's SystemLab environment. SystemLab is a virtual environment that uses fully functional, clock-accurate software models and representations of the equipment you would normally use to design and validate your hardware/software system. Such equipment includes an in-circuit emulator, a logic analyzer, and a clock generator (Figure 3). CPU Technology develops virtual prototypes of the hardware blocks, including off-the-shelf components and ASICs, of your system. You use these prototypes to prove system concept, evaluate hardware architectures, and integrate software and hardware. You can also use SystemLab to modify a system before physically implementing changes. The company's models are cycle-accurate at the instruction level and allow you to run reasonably fast simulations. For example, an 8051 processor simulates in SystemLab at about 8000 clock cycles/sec on a Pentium system and about 15,000 clock cycles/sec on a system with a Pentium Pro processor. You can also run complex system simulations in SystemLab, because you only need about 10 bytes of memory for each equivalent logic gate in a design. (A 1 million-gate ASIC has a modest requirement of about 10 Mbytes of memory beyond that needed for the operating system.)
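The memory and speed figures quoted above lend themselves to a quick back-of-the-envelope check. The sketch below uses the roughly 10 bytes/gate and the 8000 and 15,000 clock cycles/sec figures from the text. The 12-MHz target clock and the function names are assumptions added for illustration.

```python
BYTES_PER_GATE = 10          # approximate figure quoted for SystemLab
PENTIUM_RATE = 8_000         # simulated 8051 clock cycles/sec on a Pentium
PENTIUM_PRO_RATE = 15_000    # simulated 8051 clock cycles/sec on a Pentium Pro

def model_memory_mbytes(gates):
    """Approximate model memory, in Mbytes (1 Mbyte taken as 1,000,000 bytes)."""
    return gates * BYTES_PER_GATE / 1_000_000

def wall_clock_seconds(target_cycles, simulated_cycles_per_sec):
    """Host time needed to simulate the given number of target clock cycles."""
    return target_cycles / simulated_cycles_per_sec

if __name__ == "__main__":
    print(model_memory_mbytes(1_000_000))  # the 1 million-gate ASIC from the text

    # Assumption: the real 8051 would run from a 12-MHz clock, so 1 sec of
    # real-time operation corresponds to 12 million clock cycles.
    cycles = 12_000_000
    print(wall_clock_seconds(cycles, PENTIUM_RATE) / 60, "minutes on a Pentium")
    print(wall_clock_seconds(cycles, PENTIUM_PRO_RATE) / 60, "minutes on a Pentium Pro")
```

With that assumed clock rate, 1 sec of target operation costs tens of minutes of host time, which illustrates why such a simulation still runs far below real time.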

RASSP program improves design process

Kevin Jorgensen, Precedence

In June 1993, the Advanced Research Projects Agency (ARPA) initiated a program to investigate and develop design methodologies to improve the design process for signal processors. ARPA chartered the Rapid Prototyping of Application Specific Signal Processors (RASSP) program to improve by a factor of four the time required to take a design from concept to prototype, design quality, and life-cycle cost. To develop technology for the RASSP program, ARPA selected EDA vendors Alta Group, Berkeley Design Technology (Fremont, CA), Mentor Graphics, Precedence (Santa Clara), and Quickturn Design Systems. The program's challenge was to leverage these vendors' expertise through a new unified design environment and to accelerate delivery of signal-processing hardware and software.

In a typical RASSP design-environment application, designers consider the trade-offs between implementing signal-processing algorithms in hardware using off-the-shelf processing cores and using software that loads into onboard ROM. RASSP divides the process into milestones, each representing an incremental refinement of the design and a step toward the virtual prototype and final implementation. Designers use commercial simulation technologies from RASSP participants to evaluate trade-offs at each milestone. A simulation backplane provides the technology that links the commercial point tools into a contiguous flow to facilitate an implementation path. Designers create a model at each milestone; they exercise this model using various simulator combinations. Partitioning is the creation of a model of the design. In an RASSP design process, hardware and software partitioning occurs early in the design process, as well as between various hardware alternatives, as the designer refines the final implementation. Partition-creation ease and the speed at which designers can evaluate partitions determine efficiency of the design process.

Although the RASSP program is specific to DSPs, the resulting developments apply to a broad range of designs. As the first complete hardware/software-codesign environment based on commercial-tool technology, the RASSP program represents an important turning point in the development of complex design solutions. You can find additional RASSP information on the Web at http://esto.sysplan.com/ETC/RASSP/.

Eagle Design Automation's Eaglei addresses the embedded-system codesign market. Eaglei contains the Virtual Product Console (VPC) and the Virtual Software Processor (VSP). VSPs are software models of popular µPs to use with the VPC, which links the models with application software. There are two types of VSPs available. The first is an enriched bus-functional model, composed of a C-code core in a Verilog or VHDL wrapper. The C code communicates with the application software, and the HDL wrapper represents the system hardware for an HDL simulator. Eagle's recently introduced ISS models represent the second kind of VSP.
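The general idea behind a bus-functional model can be sketched in a few lines. The example below is not Eagle's VSP and uses Python rather than C code in an HDL wrapper. The class names `RegisterBlock` and `BusFunctionalModel`, the register address, and the `driver_init` routine are all hypothetical. The point is only that the application side sees read and write calls, while the hardware side sees bus cycles and no instructions are executed.

```python
class RegisterBlock:
    """Stand-in for the simulated hardware: a tiny memory-mapped peripheral."""

    def __init__(self):
        self.regs = {}

    def bus_cycle(self, address, write, data=0):
        """Perform one bus cycle; return the read data, or None for a write."""
        if write:
            self.regs[address] = data & 0xFF  # 8-bit registers
            return None
        return self.regs.get(address, 0)


class BusFunctionalModel:
    """Generates processor bus transactions without executing instructions."""

    def __init__(self, hardware):
        self.hardware = hardware
        self.log = []  # record of every transaction, for debugging

    def write(self, address, data):
        self.log.append(("W", address, data))
        self.hardware.bus_cycle(address, write=True, data=data)

    def read(self, address):
        value = self.hardware.bus_cycle(address, write=False)
        self.log.append(("R", address, value))
        return value


def driver_init(cpu):
    """Application-side code: enable a (hypothetical) peripheral and read it back."""
    cpu.write(0x10, 0x01)
    return cpu.read(0x10)


if __name__ == "__main__":
    cpu = BusFunctionalModel(RegisterBlock())
    print(driver_init(cpu), cpu.log)
```

Because the model holds no processor state, you can check the hardware's response to bus traffic but cannot inspect registers or step through instructions, which matches the "no processor states" entry for bus-functional models in Table 2.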

Using the VPC and a VSP with a hardware simulator provides a hardware/software codevelopment environment (Figure 4) that can simulate as many as 5000 instructions/sec. The VPC also lets you test software against the hardware model at the behavioral, register, and gate levels of abstraction. Eagle offers VSPs for the Advanced RISC Machines ARM7; AMD (Sunnyvale, CA) 29K and X86; Intel X86 and i960; Mips Technologies (Mountain View, CA) 4000; and Motorola 68K and PowerPC. Eaglei hardware-simulator support includes Cadence Verilog-XL, Chronologic's (Los Altos, CA) VCS, Model Technology's (Beaverton, OR) V-System/VHDL, Vantage's (Fremont, CA) Optium, and Viewlogic's (Marlborough, MA) Fusion.

Hardware emulation

Moving down the design "food chain," emulation is a way of checking hardware and software compatibility early in a design, either at the RTL before synthesis or at the gate level after synthesis. At these stages, the emulated hardware is the functional equivalent of a breadboard. The emulation system configures your design within an array of PLDs, which can be commercial FPGAs or, for Mentor Graphics' SimExpress, full-custom chips. Emulation runs as fast as tens of megahertz. However, the entry cost of complex design emulation can be hundreds of thousands of dollars for a system with a capacity of hundreds of thousands of gates. Also, the time required to set up a complex system for emulation may be weeks or months (see box, "Chip-emulation advantages"). However, actual emulations take only hours, and you can rerun them as necessary when you make software or hardware changes.

Chip-emulation advantages

Setting up a complex system for emulation can require a large time commitment. Hamid Butt, the director of system development of the Business Development Unit of Toshiba America, explains how the company uses emulation for embedded-processor and MPEG-II chip designs. The chips his group designs typically have about 200,000 gates of logic plus memory. After RTL verification, Toshiba compiles the Verilog description and emulates the design on a Quickturn system. The setup time for the chip to this point in the design ranges from two to three months. The actual emulation time is a few hours (ranging from about 8 hours for an incremental compilation to about 18 hours for a full compilation). A software change that emulation shows to be necessary typically takes just a few hours to make. A required hardware modification, at the RTL, may take one to seven days to implement.

A big advantage of using emulation at RTL rather than simulating the chip with a Verilog simulator is the speed of the emulation: a three-order-of-magnitude improvement. Toshiba ran a benchmark on an MPEG-II chip with Verilog simulation using Chronologic's VCS simulator running on a 167-MHz UltraSPARC workstation. The simulation took 3½ hours to simulate 1½ frames. At a 32-frame/sec rate, this speed translates to more than three days to simulate 1 sec of data. Using emulation, Toshiba emulated the same chip in about 5 sec, an improvement by a factor of almost 1700.
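The "more than three days" figure follows directly from the benchmark numbers, as this short calculation shows. The constants come from the text. The function names are simply illustrative.

```python
SIM_HOURS = 3.5        # measured simulation time
SIM_FRAMES = 1.5       # frames simulated in that time
FRAMES_PER_SEC = 32    # frame rate quoted in the text

def hours_per_video_second():
    """Simulation hours needed to cover 1 sec of video data."""
    hours_per_frame = SIM_HOURS / SIM_FRAMES
    return hours_per_frame * FRAMES_PER_SEC

def days_per_video_second():
    """The same figure expressed in days."""
    return hours_per_video_second() / 24

if __name__ == "__main__":
    print(f"{hours_per_video_second():.1f} hours, "
          f"or {days_per_video_second():.2f} days, per second of video")
```

Each frame costs about 2⅓ hours of simulation, so 32 frames take roughly 75 hours.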

Besides Mentor Graphics, companies such as Aptix, Ikos, and Quickturn Design Systems offer emulation hardware and software. Of this group, Quickturn is the best known in the emulation arena for the company's System Realizer product line. Quickturn performs RTL emulation before synthesis. The company's top-level system can emulate systems with as many as 3 million gates as fast as 4 MHz. The company's new Quest II emulation software reduces the time to compile a design before emulation and enhances the visibility of the design during debugging. Toshiba America (San Jose, CA) uses a Quickturn emulation system to verify chip hardware at the RTL and gate level (Figure 5). Toshiba performs an optional functional check after synthesis to the gate level if a custom-designed block is on the chip. An example of such a block would be the datapath for a processor, because the custom design is more area-efficient than one done with automatic EDA tools. Toshiba does timing simulation and verification in parallel with chip floorplanning and layout to save design time. Both subflows are closely coupled; a problem in either layout or timing usually means a change in the other subflow.

Looking Ahead

Cosimulation techniques with virtual prototypes that replace actual hardware are becoming more common. Simulation needs to accelerate, however. Even the 20,000-instruction/sec speed of Instruction Set Simulation (ISS) models is slower than the real-time operation of the final systems. Look for the development of different hardware models that allow you to simulate at higher speeds, albeit with a loss of model detail, when you first simulate the system for "proof of concept."

Another area needing assistance is system partitioning into hardware and software components before implementing hardware and code. EDA tools, such as Synopsys' COSSAP, let you graphically build a DSP system using predesigned and user-specified blocks. You simulate the system at a very high level of abstraction and explore algorithmic, architectural, and implementation trade-offs. After optimizing the system, you manually partition the blocks for either hardware or software implementation. COSSAP then helps you generate C code to run on industry-standard DSP cores or chips and either VHDL or Verilog for the hardware modules. Future system-design tools from EDA vendors will include ASIC capability in the design flow and semiautomatic partitioning of the system blocks into software and hardware modules. Fully automatic partitioning is difficult and will not appear in commercial tools for at least the next three to five years.

Hardware prototypes

The most technically ideal way to test your application software on system hardware is to build and use a prototype of the hardware. Software runs at or near full system speed on a prototype and provides accurate hardware-timing checking. Using the real thing gives you a range of software-development tools from which to choose when testing and debugging the hardware/software combination. However, if you find a hardware problem, it will most likely affect system-design time; a problem that necessitates a redesign might force the final product to miss a critical design window. This situation emphasizes the necessity of doing everything possible to run hardware/software coverification earlier in the design cycle.

The EDA industry, through new tools and design methodologies, is attacking the bottleneck of hardware/software codesign for complex chips (Table 2). Current efforts focus on the use of virtual prototypes for as-yet-unimplemented hardware and speeding the cosimulation of hardware and application software early in system design. The problem of initial system partitioning into hardware and software components, based on such constraints as data throughput, cost, and power, is still in evidence. Such partitioning is largely a manual operation based on designer expertise, and EDA and hardware vendors still need to develop techniques to further automate this procedure.



You can reach Technical Editor Jim Lipman at (510) 606-1370, fax (510) 606-1563, email ednlipman@mcimail.com


References

  1. Adams, Jay, and D Thomas, "The Design of Mixed Hardware/Software Systems," Design Automation Conference Proceedings, June 1996, pg 515.
  2. Rowson, James, "Hardware/Software Co-Simulation," Design Automation Conference Proceedings, June 1994, pg 439.
  3. Bunza, Geoffrey, "A Journey into Parallel Worlds: Exploring Hardware/Software Systems Integration," Eagle Design Automation, March 1996.
  4. Zaidi, Jauher, "Hardware Software Co-simulation," Design SuperCon96, On-Chip System Design Conference Proceedings, January 1996, pg 3-1.
  5. Levy, Brian, "Early Firmware/Hardware Integration and Verification Using Hardware Emulation," Design SuperCon96, On-Chip System Design Conference Proceedings, January 1996, pg 7-1.
  6. Mittag, Larry, "Trends in hardware/software codesign," Embedded Systems Programming, January 1996, pg 36.


Representative manufacturers of EDA tools and equipment for hardware/software codesign
When you contact any of the following manufacturers directly, please let them know you read about their products at the EDN Magazine WWW site.
Aisys Intelligent Systems
Petach-Tikva, Israel
(800) 397-7922
fax (800) 625-5525
www.aisys-usa.com
Alta Group
Sunnyvale, CA
(408) 733-1595
fax (408) 523-4601
www.altagroup.com
Aptix
San Jose, CA
(408) 428-6200
fax (408) 944-0646
www.aptix.com
CAE Plus
Austin, TX
(512) 338-0165
fax (512) 338-1092
www.cae-plus.com
CPU Technology
Pleasanton, CA
(510) 224-9920
fax (510) 227-0539
Eagle Design Automation
Beaverton, OR
(503) 520-2300
fax (503) 520-2323
www.eagledes.com
Ikos
Cupertino, CA
(408) 255-4567
fax (408) 366-8699
www.ikos.com
I-Logix
Andover, MA
(508) 682-2100
fax (508) 682-5995
Mentor Graphics
San Jose, CA
(408) 685-7000
fax (408) 685-1202
www.mentorg.com
Quickturn Design Systems
Mountain View, CA
(415) 967-3300
fax (415) 967-3189
www.quickturn.com
Synopsys
San Jose, CA
(415) 962-5000
fax (415) 965-8637
www.synopsys.com
Zycad
Fremont, CA
(510) 623-4400
fax (510) 623-4550
www.zycad.com





Copyright © 1996 EDN Magazine. EDN is a registered trademark of Reed Properties Inc, used under license.