Feature
Accurate power-analysis techniques support smart SOC-design choices
Power issues can plague any SOC, so every design flow needs methods for predicting power consumption, beginning at RTL.
By Jim Flynn, Synopsys Professional Services -- EDN, 12/7/2004
As power consumption becomes increasingly critical for both portable and nonportable applications, accurate techniques for predicting an SOC's (system on chip's) power have become essential. Designers need to know, for example, whether a mobile-phone SOC requires both voltage and frequency scaling or whether a network-router SOC consumes as much power as a desk lamp. A designer can make these determinations only by estimating a design's power consumption and intelligently applying power-saving methods. Conducting power analysis throughout the flow enables designers to address power issues early on and avoid the lengthy iterations caused by redesigns required to meet power budgets.
Four analysis methods apply to various points throughout the design flow: before synthesis, with rough synthesis results, after full synthesis, and after layout; a fifth method provides a vectorless technique for calculating switching activity that designers can use at any point from synthesis on. Table 1 details the power-analysis and associated power-optimization methodologies in the context of the design flow.
Most SOC-power-analysis methods depend on gate-level representations, but, by that point in the design flow, the power-saving opportunities with the greatest impact have passed. At the system and algorithm levels, for example, using a parallel approach rather than a serial implementation reduces clock frequencies, which can significantly decrease power consumption.
In one design that serially processed data samples, for example, early design models showed that the chip's power consumption would be far too high. The designers changed the architecture so that the chip processes the samples in parallel, reducing the logic's clock speed from 80 to 10 MHz. They also reduced the supply voltage from 1.8 to 1.25V. The parallel-processing logic is much larger than the serial-processing equivalent, but the reduced voltage and operating frequency combine to cut power consumption by 75%. This parallel approach succeeds because dynamic power varies with the square of the supply voltage but only linearly with frequency and switching activity. Often, the area penalty is small, but the power saving is significant, so it is worth exploring the trade-offs.
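The arithmetic behind this trade-off can be sketched with the common first-order model in which dynamic power is proportional to switched capacitance times the square of the supply voltage times frequency. The voltage and frequency figures come from the example above; the capacitance ratio of 4 is purely an assumption for illustration, because the example says only that the parallel logic is "much larger."

```python
# Relative dynamic power under the first-order model P ~ C * V^2 * f.

def relative_dynamic_power(cap_ratio: float, v_old: float, v_new: float,
                           f_old: float, f_new: float) -> float:
    """Return (new power / old power) for the given changes."""
    return cap_ratio * (v_new / v_old) ** 2 * (f_new / f_old)

# Figures from the text: 1.8 V -> 1.25 V and 80 MHz -> 10 MHz.
voltage_factor = (1.25 / 1.8) ** 2   # quadratic dependence on voltage
frequency_factor = 10.0 / 80.0       # linear dependence on frequency

# ASSUMPTION (not from the article): the parallel logic switches about
# four times the capacitance of the serial implementation.
ASSUMED_CAP_RATIO = 4.0

ratio = relative_dynamic_power(ASSUMED_CAP_RATIO, 1.8, 1.25, 80.0, 10.0)
saving_percent = (1.0 - ratio) * 100.0

print(f"voltage factor:   {voltage_factor:.3f}")
print(f"frequency factor: {frequency_factor:.3f}")
print(f"relative power:   {ratio:.3f}")
print(f"estimated saving: {saving_percent:.1f}%")
```

Under that assumed capacitance ratio, the estimated saving lands close to the 75% quoted above. A larger or smaller area penalty shifts the result proportionally, which is why the trade-off is worth exploring case by case.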
Power-related trade-offs at any level of design abstraction depend on a designer's ability to analyze power consumption, and the earlier the better. One analysis method uses a spreadsheet to estimate power at the RTL (register-transfer level). Other methods rely on tool reports and provide increasingly accurate results at each stage as additional design and library information becomes available. The vectorless-analysis method offers a way to get "full power coverage" analogous to the way static-timing analysis provides full timing coverage.
Table 2 summarizes the analysis methods. (For background on static and dynamic power consumption, see sidebar "Where an SOC consumes power.") Although this article focuses on analyzing dynamic power, note that leakage power is becoming increasingly important at process geometries of 130 nm and below.
RTL-power analysis
In the earliest stages of a design flow, a spreadsheet power analysis can provide rough but valuable estimates of a design's power consumption. If designers have not yet selected the library, this analysis can reveal the best power-conscious libraries and design architectures. After library selection, the Synopsys Design Compiler and Synopsys Power Compiler tools can supply values for use in the spreadsheets. The power-analysis spreadsheet includes approximate gate counts, rough activity-per-block values, side-by-side vendor microwatts-per-megahertz data, and relative power estimates. The analysis at this point can show that a design consumes far too much power to be practical, thus avoiding weeks of work to create a chip that is useful only as a coffee-cup warmer.
The spreadsheet-analysis method requires an estimate of each block's gate count (number of library cells of each type) and activity level. It also requires information on the amount of energy consumed by the switching of each cell type. Data from a library vendor's manual can facilitate the assignment of an appropriate power value relative to speed (in microwatts per megahertz). Designers calculate a block's internal power consumption for a particular type of cell as:
Power consumption = gate count × (microwatts/megahertz) × activity × frequency.
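As a minimal sketch, the per-cell-type formula and the block-level sum of those values might look like the following. The cell names, gate counts, microwatts-per-megahertz figures, and activity values are invented placeholders, not vendor data.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class CellGroup:
    """One spreadsheet row: all cells of a single type within a block."""
    name: str
    gate_count: int      # number of instances of this cell type
    uw_per_mhz: float    # vendor figure, microwatts per megahertz
    activity: float      # estimated fraction of cycles the cell switches

def group_power_uw(group: CellGroup, freq_mhz: float) -> float:
    """Internal power of one cell type, in microwatts."""
    return group.gate_count * group.uw_per_mhz * group.activity * freq_mhz

def block_power_uw(groups: Iterable[CellGroup], freq_mhz: float) -> float:
    """Internal active-power estimate for a block: sum over its cell types."""
    return sum(group_power_uw(g, freq_mhz) for g in groups)

# Hypothetical rows for illustration only.
EXAMPLE_BLOCK = [
    CellGroup("nand2", gate_count=5000, uw_per_mhz=0.002, activity=0.10),
    CellGroup("dff", gate_count=1000, uw_per_mhz=0.010, activity=0.25),
]

for g in EXAMPLE_BLOCK:
    print(f"{g.name:6s} {group_power_uw(g, 100.0):8.1f} uW")
print(f"block  {block_power_uw(EXAMPLE_BLOCK, 100.0):8.1f} uW")
```

Keeping one row per cell type mirrors the spreadsheet layout, so the same rows can be re-evaluated against a second vendor's microwatts-per-megahertz data for a side-by-side comparison.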
Summing these power values for all the types of cells in a block gives the block's overall internal active-power estimate. Before performing synthesis, designers need to estimate gate counts based on architectural choices and an understanding of the design. For example, they can derive approximate gate counts from features such as bus sizes, word lengths, control layers, and memory depth. After selecting the library and performing early synthesis, a designer can estimate the gate counts for a block by using a report from the synthesis tool to show the number of each instance type for the design. Assigning the activity levels is a key aspect of the power calculation. The gates of a design have different activity levels that you can estimate with or without a simulation to extract switching activity. After selecting the library, however, it is a good idea to run a functional simulation to determine the switching activity.
Designers measure the switching activity in terms of a toggle rate. The toggle rate equals the number of logic-zero-to-logic-one and logic-one-to-logic-zero transitions of a design object (for example, a net, pin, or port) per unit of time. A net having an activity of 50 logic-one-to-logic-zero transitions and 50 logic-zero-to-logic-one transitions during a 100-nsec interval has a toggle rate of one. A net having an activity of five logic-one-to-logic-zero transitions and five logic-zero-to-logic-one transitions during a 10-nsec interval also has a toggle rate of one. As these examples illustrate, a toggle rate of one indicates one activity transition per nanosecond. You can relate power and toggle rate by understanding that each transition requires some amount of energy to change the state of an internal circuit during the time interval of the state change.
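The toggle-rate definition above can be written as a small helper; the function name and the choice of nanoseconds as the time unit are illustrative assumptions that match the two examples in the text.

```python
def toggle_rate(rising: int, falling: int, interval_ns: float) -> float:
    """Transitions per nanosecond for one design object (net, pin, or port).

    rising  -- number of logic-zero-to-logic-one transitions
    falling -- number of logic-one-to-logic-zero transitions
    """
    if interval_ns <= 0:
        raise ValueError("interval must be positive")
    return (rising + falling) / interval_ns

# The two nets described in the text.
print(toggle_rate(rising=50, falling=50, interval_ns=100.0))
print(toggle_rate(rising=5, falling=5, interval_ns=10.0))
```

Both calls give the same rate because the definition normalizes the transition count by the length of the observation window.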
It is key to note that power estimates at any level of abstraction are meaningful only when the switching activity represents the chip's actual working operation. A common mistake is to use a vector set that simulates system-boot sequences. This activity rarely represents actual working conditions and therefore leads to inaccurate power estimates. An RTL simulator can automatically generate an SAIF (Switching Activity Interchange Format) file, but the activity values are accurate only if the vector set is realistic. No tool can automatically generate such vectors, because the task requires an understanding of the circuit's intent. (However, the vectorless-analysis technique presented later in this article offers another way to obtain activity values.)
Some power-analysis tools can use an SAIF file to define libraries and constraints and to annotate the design for power estimation. The Synopsys Power Compiler tool's default switching activity for nonannotated ports is 0.25 toggle per positive edge. This tool applies and propagates this value throughout the block. Table 3 lists examples of results estimated using the spreadsheet method. After using this method to calculate internal power, a designer can estimate switching power as 30% of internal power. Without accurate load and switching data, engineers need to treat this value as a rough estimate. Such estimates are more useful in comparing the power implications of various design strategies than in predicting a chip's actual power consumption. However, rough estimates at the RTL stage do provide an early warning that a design may turn out to be unacceptably hot.
Estimating leakage power
Switching power is usually the most important value to determine in early analysis, but you can also estimate leakage power based on each cell type's leakage data. Because leakage differs for high and low states, you must base the leakage analysis on the static probability that a signal is at a certain logic state.
Static probability has a value between zero and one that depends on a signal's function. For example, an active-low reset signal typically has a logic-one static probability (SP1) at or near 1.0 (100%). For a data-bus signal, you can assume a value of 0.5 (50%) for SP1 unless some architectural characteristic suggests otherwise. After selecting the library, you can calculate static probability during simulation by comparing the time a signal is at a certain logic state with the total simulation time.
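One simple way to combine these quantities is to weight a cell's high-state and low-state leakage by the static probability. The sketch below assumes a single controlling signal per cell and invented leakage figures; real library data is usually state-dependent across all of a cell's inputs, so treat this as a first-order illustration rather than a tool's actual method.

```python
def static_probability(time_high_ns: float, total_time_ns: float) -> float:
    """SP1: fraction of the simulation time a signal spends at logic one."""
    if total_time_ns <= 0:
        raise ValueError("total simulation time must be positive")
    return time_high_ns / total_time_ns

def expected_leakage_nw(leak_high_nw: float, leak_low_nw: float,
                        sp1: float) -> float:
    """Probability-weighted leakage of one cell, in nanowatts (assumed model)."""
    if not 0.0 <= sp1 <= 1.0:
        raise ValueError("static probability must lie between 0 and 1")
    return sp1 * leak_high_nw + (1.0 - sp1) * leak_low_nw

# Hypothetical numbers: a signal that is high for 750 ns of a 1000-ns run,
# driving a cell that leaks 2 nW when high and 4 nW when low.
sp1 = static_probability(750.0, 1000.0)
print(f"SP1 = {sp1:.2f}")
print(f"expected leakage = {expected_leakage_nw(2.0, 4.0, sp1):.2f} nW")
```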
Gate-level power analysis
After performing synthesis, designers can obtain fairly accurate power estimates based on actual gate counts and simulated activity. The most significant sources of inaccuracy at this point are the activity and the prelayout wire-load values. A designer can improve accuracy by generating an SAIF file from gate-level simulations. The activity values are accurate only when the simulation vectors represent actual application behavior. As for the load values, a tool such as Synopsys Physical Compiler software can improve their accuracy after physical optimization. You can produce an SPEF (Standard Parasitic Exchange Format) file annotating Steiner route and RC parasitic estimates. After layout, a gate-level simulation can generate a VCD (value-change-dump) file. VCD files log changes to signal values during a simulation and provide the design's nodal activity, structural-data hierarchical connectivity, path delays, and timing and event information.
Chip I/Os can significantly reduce accuracy if they are numerous, switching at high speed, and driving long wires. Lumped load models for the I/Os may produce overly pessimistic results, which can be a problem if design goals call for accurate rather than worst-case power estimates. For a more accurate picture, you can run Synopsys HSpice tool simulations on critical I/O-cell types with accurate distributed-impedance models. You calculate the I/O-cell power using numeric methods that determine charge and energy per rising or falling edge. Given the Synopsys HSpice tool output of current and time, you can also calculate the internal energy per transient using the trapezoidal integration method (in Matlab, for example). You can use the I/O activity recorded during analysis to scale I/O power. Finally, you combine the total I/O power with the core power for an overall power estimate.
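The trapezoidal step can be sketched in a few lines. This version uses plain Python rather than Matlab and assumes the simulator output has already been reduced to parallel lists of time and supply-current samples at a constant supply voltage; the waveform values are invented for illustration.

```python
from typing import Sequence, Tuple

def trapezoid(ys: Sequence[float], xs: Sequence[float]) -> float:
    """Integrate y over x with the trapezoidal rule."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    total = 0.0
    for i in range(1, len(xs)):
        total += 0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
    return total

def energy_per_transition(times_s: Sequence[float],
                          currents_a: Sequence[float],
                          vdd: float) -> Tuple[float, float]:
    """Return (charge in coulombs, energy in joules) for one edge.

    Assumes a constant supply voltage, so energy = vdd * integral(i dt).
    """
    charge = trapezoid(currents_a, times_s)
    return charge, vdd * charge

def average_power_w(energy_j: float, transitions_per_s: float) -> float:
    """Scale per-edge energy by the recorded I/O activity."""
    return energy_j * transitions_per_s

# Hypothetical triangular current pulse: 1 mA peak over 2 ns at 1.8 V.
times = [0.0, 1e-9, 2e-9]
currents = [0.0, 1e-3, 0.0]
charge, energy = energy_per_transition(times, currents, vdd=1.8)
print(f"charge = {charge:.3e} C, energy = {energy:.3e} J")
print(f"power at 10M transitions/s = {average_power_w(energy, 10e6):.3e} W")
```

Integrating rising and falling edges separately keeps the two energies distinct, so each can be scaled by its own recorded transition count.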
To show how power estimates vary using the methods described so far, Figure 1 shows examples based on one block (a high-speed FIR filter) in a DSP design. This example demonstrates how much the accuracy of the information available in each phase of the design and implementation cycle affects the power estimates.
Vectorless power analysis
The power-analysis methods presented here so far derive switching activity from simulation data, but this approach has a variety of limitations. Gate-level descriptions of the circuit allow the most accurate simulations for obtaining switching activity but may be difficult to obtain. The design may be too large to simulate at the gate level, or an incomplete netlist may cause inaccurate timing that makes gate-level simulations impossible. There may also be no testbench or test cases that force maximum power usage. Simulation test cases often fail to generate the values for switching activity that are necessary for power analysis. It is difficult to verify that you covered the cases that cause the highest power consumption, and this uncertainty parallels the problems of using simulations to verify timing, where there is no certainty that all the timing paths are covered. Due to this uncertainty, static-timing analysis has largely replaced simulation for timing verification.
Designers can achieve a similar degree of coverage for power analysis using a vectorless technique. In this technique, the designer annotates the worst-case switching values on all of a design's architecturally stable parts (ports and registers). For best accuracy, engineers need to propagate the switching values for the ports and registers throughout the design based on statistical/heuristic calculations. Power-analysis tools can automatically perform this task through all of the design's logic cones and write a net switching report. Designers can then use a Perl script to process the report data into a sequence of net-switching-activity annotation commands. These commands drive a tool to annotate all nets with appropriate switching data, which designers can then use with any of the aforementioned power-analysis methods. A statistical propagation technique allows designers to specify pessimistic values for port and register switching if a complete annotation of all the nets is impractical. This approach delivers the realistic power-consumption values necessary for today's SOC designs.
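The report-to-commands step can be sketched as follows. The article describes a Perl script; this sketch uses Python instead. The report layout (one net per line: name, toggle rate, static probability) is a hypothetical format invented for illustration, and the default command template is modeled on the Synopsys set_switching_activity command, whose exact options are an assumption here and should be checked against the tool documentation.

```python
from typing import List

# ASSUMPTION: command spelling and options must be verified for your tool.
DEFAULT_TEMPLATE = ("set_switching_activity -toggle_rate {tr:g} "
                    "-static_probability {sp:g} [get_nets {{{net}}}]")

def report_to_commands(report_text: str,
                       template: str = DEFAULT_TEMPLATE) -> List[str]:
    """Turn a net switching report into annotation commands.

    Expects the hypothetical layout '<net> <toggle_rate> <static_prob>',
    skipping blank lines and lines that start with '#'.
    """
    commands = []
    for line_no, line in enumerate(report_text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields")
        net, tr, sp = fields[0], float(fields[1]), float(fields[2])
        if not 0.0 <= sp <= 1.0:
            raise ValueError(f"line {line_no}: static probability out of range")
        commands.append(template.format(net=net, tr=tr, sp=sp))
    return commands

# Hypothetical report contents.
SAMPLE_REPORT = """\
# net toggle_rate static_prob
u1/clk_en 0.25 0.5

u1/data[3] 0.1 0.4
"""

for cmd in report_to_commands(SAMPLE_REPORT):
    print(cmd)
```

Passing the template as a parameter keeps the parsing separate from any one tool's command syntax, so the same script can drive a different annotation command.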
| Author Information |
| James P Flynn is a senior IC designer with Synopsys Professional Services (www.synopsys.com/sps). He has a master's degree in analog-IC design from Florida Institute of Technology. |