EDN Executive Editor Ron Wilson explores how IC design teams really work: the struggle for power efficiency and performance, wrestling with semiconductor processes and design methodologies, the challenges of global design teams. How do we somehow herd architecture, IP, design and verification into a successful tape-out?
Jul 23 2009 2:57PM
Most people today think of design for low power in terms of voltage islands, multiple threshold voltages, clock gating, power gating, or dynamic voltage-frequency scaling. All of these procedures apply either at the transistor level or to the power and clock nets that support the logic. And all have been boosted by at least some degree of design automation: you capture your power intent in a database, do power-efficient floorplanning, use a power-aware synthesis tool, a power-optimizing CTS tool, and so on.
But there is another whole army of techniques that manipulate the structure of the logic nets and the data itself, rather than the underlying power, clock, or device levels. These techniques rest on three principles.
First, if you can reduce the number of logic transitions that occur during execution of a task, you reduce the energy required to complete the task. There are actually two sub-parts to this principle. Architectural changes can minimize the number of transitions necessary to perform a task, for example by substituting a Gray-code counter for an arithmetic one. And in most circuits, spurious transitions get generated and propagate through many levels of logic, eating up a little energy in each parasitic capacitive load. So if you can use designs that don't glitch, and manipulate operands or gate functions to prevent spurious activity from propagating, you save joules.
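To put a rough number on the counter example, here is a minimal Python sketch that counts how many state bits flip over one full wrap-around count for a plain binary sequence and for a reflected Gray-code sequence. The function names are illustrative only, and the count covers the state register bits alone, not the next-state logic or any glitching inside it.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def to_gray(n: int) -> int:
    """Reflected binary Gray code of n."""
    return n ^ (n >> 1)


def transitions_per_cycle(width: int, encode=lambda n: n) -> int:
    """Total state-bit flips over one full count of 2**width steps,
    including the wrap from the last code back to the first."""
    period = 1 << width
    return sum(
        hamming(encode(i), encode((i + 1) % period))
        for i in range(period)
    )


binary_flips = transitions_per_cycle(4)         # 4-bit arithmetic counter
gray_flips = transitions_per_cycle(4, to_gray)  # 4-bit Gray-code counter
print(binary_flips, gray_flips)                 # prints: 30 16
```

For a 4-bit counter the binary sequence flips 30 state bits per full cycle against 16 for the Gray sequence, since a Gray code changes exactly one bit per step.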
Second, if you reduce the gate count and area of a circuit, you almost certainly will reduce both the dynamic and static power consumption, other things being equal. Reduced area means reduced capacitance.
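As a reminder of why area matters, the usual first-order CMOS power model (a textbook approximation, not something taken from the Synopsys material) can be written as follows. The symbols are the conventional ones: activity factor, switched capacitance, supply voltage, clock frequency, and total leakage current.

```latex
P_{\mathrm{dyn}} \approx \alpha \, C \, V_{DD}^{2} \, f
\qquad
P_{\mathrm{static}} \approx I_{\mathrm{leak}} \, V_{DD}
```

Fewer gates and less wire reduce the switched capacitance C, and fewer transistors mean fewer leakage paths contributing to I_leak, which is why both terms tend to fall together when area shrinks.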
Third, if you increase timing slack on a net by reorganizing the logic, you can either use more high-threshold transistors (noise margins permitting) or lower drive strengths on the net. Thus sometimes a logic optimization that is unnecessary to meet timing may actually allow you to reduce circuit power.
Today such microarchitectural techniques don't always come from an automated tool. "Some of these things can be inferred," said Synopsys product marketing director Jay Chiang, "but some of them have to be done ahead of time as IP components and then selected."
Consequently we are starting to see IP blocks that are not only friendly to power-aware synthesis, but that also employ power-friendly microarchitectures. One recent example of this trend is Synopsys DesignWare miniPower: a library of data path elements including FIFOs, counters, and arithmetic elements. Not only are the elements tuned to work with Design Compiler Ultra and Power Compiler, but they also employ a range of microarchitectural techniques.
Reducing spurious transitions is perhaps the low-hanging fruit of the exercise. If a logic transition has no meaning in the intended function of the circuit, there's no reason to waste energy by allowing it to propagate through the data path—especially with the deep logic nets that characterize such blocks as multipliers and dividers. The miniPower designers have reduced such transitions in part by careful design, but also by explicit means, such as gating segments of data paths, with the gating logic integrated into the first stage of the data path, rather than hung out in front of it. Some cells also use data tracking—circuits that keep track of when the input to a function is valid, and block the input until that time. In some cases designers can block propagation of spurious signals simply by inserting operands that will quiet the activity in the logic trees.
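The effect of blocking inputs until they are valid can be illustrated with a small behavioral model. The Python sketch below is not the miniPower implementation; it is a hypothetical model in which a multiplier's operand inputs either follow the bus every cycle or hold their last valid values, and it uses bit flips at the inputs and the product output as a crude stand-in for switching activity. Real savings depend on the glitching inside the logic, which this model ignores.

```python
def toggles(prev: int, cur: int) -> int:
    """Bit flips between two successive values on the same wires."""
    return bin(prev ^ cur).count("1")


def multiplier_activity(stream, isolate: bool) -> int:
    """Sum of bit flips at the operand inputs and product output.

    stream  -- iterable of (a, b, valid) tuples, one per clock cycle
    isolate -- if True, inputs hold their last value while valid is False
    """
    a_in = b_in = prod = 0
    flips = 0
    for a, b, valid in stream:
        if isolate and not valid:
            continue  # inputs are held, so nothing downstream toggles
        flips += toggles(a_in, a) + toggles(b_in, b)
        a_in, b_in = a, b
        new_prod = a_in * b_in
        flips += toggles(prod, new_prod)
        prod = new_prod
    return flips


# Hypothetical three-cycle stream: a don't-care value sits on the bus
# between two uses of the same valid operands.
stream = [(3, 5, True), (0xF, 0xF, False), (3, 5, True)]
baseline = multiplier_activity(stream, isolate=False)
isolated = multiplier_activity(stream, isolate=True)
print(baseline, isolated)  # prints: 28 8
```

In this toy stream the un-gated multiplier sees 28 bit flips and the gated one sees 8, because the meaningless middle value never reaches the logic.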
Another important task is minimizing the number of valid transitions needed for a task. This can involve using specialized logic cells, encoding inputs to minimize transitions during a computation, or reorganizing the logic itself for minimum transitions rather than minimum gate count or minimum delay—for instance by taking best advantage of low-activity bits in logic and clock gating.
Some of these opportunities can be inferred from the initial logic design. But often that requires some knowledge of operand-bit activity levels. In general, Chiang said, such information has to come from the user, although in some cases the synthesis tool can use empirical activity models. Given some source of activity data, the tool must select from among the alternative approaches available to it to produce the lowest-power choices for a particular data path.
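What "operand-bit activity" means in practice can be shown with a short Python sketch that derives per-bit toggle rates from a recorded operand trace. The trace values and function name are hypothetical, and a real flow would take this data from simulation activity files rather than a Python list; the point is only that quiet bits become visible once the operand has been observed under a realistic workload.

```python
def bit_toggle_rates(samples, width):
    """Fraction of consecutive sample pairs in which each bit flips.

    Index 0 of the returned list is the least significant bit.
    """
    if len(samples) < 2:
        return [0.0] * width
    counts = [0] * width
    for prev, cur in zip(samples, samples[1:]):
        diff = prev ^ cur
        for bit in range(width):
            counts[bit] += (diff >> bit) & 1
    pairs = len(samples) - 1
    return [c / pairs for c in counts]


# Hypothetical trace of an 8-bit operand that only ever uses its low bits.
trace = [3, 5, 2, 7, 4, 1, 6, 0]
rates = bit_toggle_rates(trace, 8)
quiet_bits = [bit for bit, rate in enumerate(rates) if rate == 0.0]
print(quiet_bits)  # prints: [3, 4, 5, 6, 7]
```

Bits that never toggle in the trace are candidates for the kind of low-activity treatment described above, such as gating the upper part of a data path.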
This is only a quick overview of the way DC Ultra, Power Compiler, and the new miniPower libraries are intended to work together. The result, according to Chiang, can be significant: one beta customer designing a wireless chip reported power reduction on the full chip—which was presumably a mostly-datapath baseband processor—of over 40 percent in dynamic power, and nearly 20 percent in static power compared to results from that company's existing design flow. In another case, a fast networking chip, using the libraries actually led to an increase in dynamic power, but the impact on static power was so significant that the overall power reduction in the customer's use profile was nearly a quarter compared to using the same tool flow with a different third-party IP source.
Maybe two take-aways here. First, now that tools have automated most of the mechanical techniques for power reduction, attention is going to shift to microarchitectural techniques. And this is going to force even more difficulty into the already-fraught problem of comparing competing IP offerings. Second, in architectural and logic-optimization techniques just as in power-gating or DVFS, understanding of use profiles and having accurate activity data are absolutely essential starting points—not luxuries or after-the-fact decorations. And that observation points directly to the growing, forced, and unwelcome integration of the software-development effort into the chip-design team.