Power: a significant challenge in EDA design
Brian Bailey, Contributing Technical Editor - May 24, 2012
Power is the rate at which energy is consumed—not a hot topic 10 years ago but a primary design consideration today. A system’s consumption of energy creates heat, drains batteries, strains power-delivery networks, and increases costs. The rise in mobile computing initially drove the desire to reduce energy consumption, but the effects of energy consumption are now far-reaching and may cause some of the largest structural changes in the industry. This issue is important for server farms, the cloud, automobiles, chips, and ubiquitous sensor networks relying on harvested energy.
The reason for the sudden change is that physics helped with process technologies down to 90 nm. With each smaller node, voltages decreased, creating a corresponding drop in power, so power budgets generally remained fixed even as developers integrated additional capabilities. At smaller geometries, voltage scaling is more difficult and is failing to keep up. As supply voltages approach the threshold voltage, switching times increase. To compensate, designers lowered threshold voltages, but doing so caused a significant increase in leakage and switching currents.
Every stage in the design flow—from software architecture to device physics—affects power consumption. Although each team can locally optimize power consumption, no single group can create a low-power design. Conversely, any one group can destroy it. This situation is creating a new need for cooperation and cross-discipline tooling. Power issues do not stop on chip. They spread to interconnect topologies, board and system design, power controllers, and so on. Current EDA tools do not build in the concept of power, meaning that designers are adopting retrofit approaches rather than rebuilding from the ground up.
The role of physics
The power a chip consumes is the sum of switching, or dynamic, power and passive, or leakage, power. The dynamic component of power is due to the capacitive load of a design. This component charges through a PMOS transistor whenever a net makes a transition from zero to one. The energy drawn from the power supply is equal to the capacitive load multiplied by the square of the voltage. The system stores half of this energy in the capacitor; the other half is dissipated in the transistor. For the one-to-zero transition, no additional energy is drawn from the power supply, but the charge dissipates in the NMOS transistor. Assuming that the node changes at frequency $F$, then dynamic power is $F C_L V_{DD}^2$, where $C_L$ is the capacitive load and $V_{DD}$ is the voltage. Although other forms of dynamic-power consumption exist, they are much smaller.
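As a minimal sketch of the relationship described above, the following Python functions evaluate the energy drawn per zero-to-one transition and the resulting dynamic power. The function names and the example values (1 pF, 1.0 V, 100 MHz) are illustrative assumptions, not figures from any particular process.

```python
def switching_energy_j(cap_load_f, vdd_v):
    """Energy drawn from the supply for one zero-to-one transition: C_L * V_DD^2."""
    return cap_load_f * vdd_v ** 2


def dynamic_power_w(freq_hz, cap_load_f, vdd_v):
    """Dynamic power for a node switching at freq_hz: F * C_L * V_DD^2."""
    return freq_hz * switching_energy_j(cap_load_f, vdd_v)


# Hypothetical values chosen only for illustration.
cap_load_f = 1e-12   # 1 pF load
vdd_v = 1.0          # 1.0 V supply
freq_hz = 100e6      # node switches at 100 MHz

energy_j = switching_energy_j(cap_load_f, vdd_v)
stored_j = 0.5 * energy_j              # half ends up stored on the capacitor
dissipated_j = energy_j - stored_j     # half is dissipated in the PMOS transistor
power_w = dynamic_power_w(freq_hz, cap_load_f, vdd_v)

print(f"energy per transition: {energy_j:.3e} J")
print(f"dynamic power: {power_w:.3e} W")
```

Because the voltage enters as a square, rerunning the sketch with a lower supply voltage shows a much larger change than an equal percentage change in frequency or capacitance.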
Reducing the voltage has a considerable effect due to the voltage-squared term. Unfortunately, performance also depends on voltage: a higher voltage increases the gate drive, $V_{GS} - V_T$, where $V_{GS}$ is the gate-to-source voltage and $V_T$ is the threshold voltage, so lowering the voltage slows the circuit. In older technologies, the leakage power was insignificant. As device sizes have decreased, leakage has become more significant in a number of areas, including gate-oxide tunneling, subthreshold conduction, reverse-bias-junction leakage, gate-induced drain leakage, and gate current due to hot-carrier injection.
Silicon dioxide is the typical material for gate insulation. At low thickness levels, electrons can tunnel across it. The tunneling current grows exponentially as the oxide thins, so each reduction in thickness multiplies the leakage; this effect was not an issue until transistor geometries decreased to less than 130 nm. Using high-k dielectrics instead of silicon dioxide provides similar device performance with a thicker gate insulator, thus reducing this current.
Transistors have a gate-to-source threshold voltage below which the subthreshold current through the device decreases exponentially. As supply voltages decrease to reduce dynamic-power consumption, the threshold voltage also decreases, resulting in less gate-voltage swing below the threshold to turn off the device. Subthreshold conduction varies exponentially with gate voltage.
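The exponential dependence can be sketched with a common first-order model in which the subthreshold current falls by one decade for every "subthreshold swing" of gate voltage below the threshold. The model form, the function name, the reference current, and the 100-mV-per-decade swing are assumptions for illustration; they do not come from the article or from any specific process.

```python
def subthreshold_current_a(vgs_v, vt_v, i_at_threshold_a=1e-7,
                           swing_v_per_decade=0.1):
    """First-order subthreshold current for a gate voltage at or below threshold.

    The current drops by a factor of 10 for every swing_v_per_decade volts
    that vgs_v sits below vt_v. All default values are hypothetical.
    """
    return i_at_threshold_a * 10 ** ((vgs_v - vt_v) / swing_v_per_decade)


# Off-state leakage (gate at 0 V) for two hypothetical threshold voltages.
i_off_high_vt = subthreshold_current_a(0.0, 0.4)
i_off_low_vt = subthreshold_current_a(0.0, 0.3)
leakage_ratio = i_off_low_vt / i_off_high_vt

print(f"lowering the threshold by 100 mV multiplies leakage by {leakage_ratio:.1f}")
```

Under these assumptions, trimming the threshold voltage by a single swing costs roughly an order of magnitude in off-state current, which is why the threshold cannot simply track the supply voltage downward.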
The formation of a reverse bias between diffusion regions and wells or between wells and the substrate causes small reverse-bias-junction leakages. A high-electric-field effect in the drain junction of MOS transistors causes gate-induced drain leakage, which is typically handled in the fabrication technology. Gate-current leakage is due to the drift of the threshold voltage in short-channel devices and relates to high electric fields within the device. The fabrication technology also primarily controls this effect.
Designers have made a trade-off between dynamic- and static-power consumption. Lowering supply and threshold voltages has reduced the dynamic power but increased the static power. Consider a typical chip in a cell phone. When the device is operating, leakage accounts for about 10% of the consumed power; the other 90% is dynamic power. When a cell phone is in standby mode, however, which could be 90% of the time, little dynamic activity occurs in the chip, so leakage accounts for most of the power consumed. Minimizing both types of power is thus equally important.
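The cell-phone example can be worked through as arithmetic. The sketch below assumes, for simplicity, that leakage power in standby equals leakage power during operation and that standby has no dynamic activity at all; both are idealizations, and the function name is hypothetical.

```python
def energy_split(active_fraction, leakage_share_when_active):
    """Return (dynamic, leakage) energy, normalized to active power x total time.

    Assumes leakage power is identical in active and standby modes and that
    standby draws no dynamic power (both simplifications).
    """
    standby_fraction = 1.0 - active_fraction
    dynamic = active_fraction * (1.0 - leakage_share_when_active)
    leakage = (active_fraction + standby_fraction) * leakage_share_when_active
    return dynamic, leakage


# Figures from the text: active 10% of the time, leakage 10% of active power.
dynamic, leakage = energy_split(active_fraction=0.10,
                                leakage_share_when_active=0.10)
print(f"dynamic energy: {dynamic:.2f}, leakage energy: {leakage:.2f}")
```

Under these assumptions the two energy totals come out nearly equal, because the small leakage power is drawn all the time while the large dynamic power is drawn only briefly.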
Devices continue to show improvement in power consumption. For example, the Samsung 28-nm low-power process delivers 35% less active and standby power at the same frequency than the company’s 45-nm low-power process, and the 28-nm process offers a 60% reduction in active power at the same frequency compared with 45-nm low-power system-on-chip designs. Taiwan Semiconductor Manufacturing Co’s 28-nm high-performance, low-power process consumes more than 40% less standby power than the company’s 40-nm low-power process. GlobalFoundries, meanwhile, offers three power levels for its 28-nm node (Figure 1).
Moore’s Law continues unabated, and designers are packing more functions into each chip. According to Colin Baldwin, director of marketing for Open-Silicon, customers can design the next-generation device with similar unit cost and twice the performance but with an overall increase in power, even though power per device has decreased. Clock frequencies, another variable, are slowly drifting upward, but less quickly than the process entitlement in many markets. Open-Silicon finds that most customers are trying to integrate added function with only slightly increased overall power. Thus, to maintain the same overall power, it becomes necessary to look at the energy savings that can be made during other parts of the design flow.
Optimize and compare
Design involves estimation and optimization. Estimation allows you to make comparisons between possible implementation options. In addition, you can make optimizations automatically, or you can do so with tool assistance at various levels of abstraction. According to Arvind Shanmugavel, director of application engineering for Apache/Ansys, power estimation is an accurate science only when you have a complete design and a correct set of vectors. Until you have completed the design, everything is, by definition, an estimate of what will happen in the design. You should be looking for large and relative changes rather than absolute numbers in the power budgets during the early phases of the design. You can expect a 20% deviation between RTL (register-transfer-level) estimates and silicon and a 10% deviation between gate-level estimates and silicon, according to Venki Venkatesh, director of engineering at Atrenta.
If a tool states that one approach would consume less total energy than an alternative, this statement must be correct; otherwise, the tool may lead you to select the inferior approach. Unlike area and performance, power is vector-related, and you may thus need to run several simulations to get a representative sample of the design’s activity. For example, consider the choice of applying random data to an audio processor versus more typical speech data. Figure 2 shows the transition activity for a few registers in a finite-impulse-response filter (Reference 1). For an architecture that does not destroy the data correlation, the speech data switches 80% less capacitance than does the random input. The sequencing of operations can result in large variations of the switching activity due to these temporal correlations.
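The effect of data correlation on switching activity can be illustrated by counting how many register bits flip between consecutive samples. The sketch below is not the experiment behind Figure 2: it uses a slowly varying sine tone as a stand-in for correlated speech data, a 16-bit two's-complement register, and a seeded pseudorandom stream, all of which are assumptions made for illustration. It shows only the direction of the effect, not the 80% figure.

```python
import math
import random

WIDTH = 16
MASK = (1 << WIDTH) - 1


def toggles_per_sample(samples):
    """Average number of register bits that flip between consecutive samples."""
    flips = 0
    for prev, curr in zip(samples, samples[1:]):
        flips += bin((prev ^ curr) & MASK).count("1")
    return flips / (len(samples) - 1)


rng = random.Random(0)  # fixed seed so the run is repeatable
num_samples = 2000

# Uncorrelated input: every sample is an independent 16-bit value.
random_data = [rng.randrange(1 << WIDTH) for _ in range(num_samples)]

# Correlated input: a slow tone, stored as 16-bit two's complement.
correlated_data = [int(1000 * math.sin(2 * math.pi * i / 200)) & MASK
                   for i in range(num_samples)]

random_activity = toggles_per_sample(random_data)
correlated_activity = toggles_per_sample(correlated_data)
print(f"random: {random_activity:.2f} toggles/sample")
print(f"correlated: {correlated_activity:.2f} toggles/sample")
```

Feeding an estimator only the random stream would overstate the switched capacitance for a design that normally sees correlated data, which is why the choice of vectors matters.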
Some companies, however, believe that you can get close enough using statistical methods employing expected activity from counters and other recognizable pieces of logic. You can now optimize energy consumption in many ways, most of them at RTL or below. According to Shanmugavel, clock gating is among the common techniques for minimizing active-power consumption. Shutting off the clock for a circuit prevents any toggle activity of the clocks or registers in that circuit. Another technique is to employ voltage islands, which lower the operating voltage of part of a design and quadratically reduce the switching component of its active power. Designers use voltage islands for areas of a chip in which performance is noncritical and power can therefore be saved.
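A rough sketch of the savings from these two techniques follows. It treats clock gating as scaling a block's clocked switching power by the fraction of time the clock is enabled, and a voltage island as scaling switching power by the square of the voltage ratio at unchanged frequency. Both models are idealized, and the function names, the 200-mW block, the 30% enable duty, and the 0.9-V island are hypothetical.

```python
def gated_clock_power_w(ungated_power_w, enable_duty):
    """Idealized clock gating: switching power scales with clock-enable duty."""
    if not 0.0 <= enable_duty <= 1.0:
        raise ValueError("enable_duty must be between 0 and 1")
    return ungated_power_w * enable_duty


def voltage_island_power_w(power_at_nominal_w, v_nominal, v_island):
    """Switching power after moving a block to a lower-voltage island.

    Assumes the same frequency and capacitance, so power scales as V^2.
    """
    return power_at_nominal_w * (v_island / v_nominal) ** 2


block_power_w = 0.200  # hypothetical block switching power at nominal voltage

gated_w = gated_clock_power_w(block_power_w, enable_duty=0.30)
island_w = voltage_island_power_w(block_power_w, v_nominal=1.0, v_island=0.9)

print(f"clock gated: {gated_w * 1e3:.0f} mW")
print(f"voltage island: {island_w * 1e3:.0f} mW")
```

In this sketch a 10% voltage reduction removes about 19% of the block's switching power, while gating removes switching power only during the idle fraction of time; neither model accounts for leakage.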
DVFS (dynamic-voltage/frequency scaling) is by far the most complex active-state power-management technique. This approach changes the active operating voltage and frequency depending on the demand of the load. During high-load conditions, the voltage and frequency are at nominal conditions, and the chip or unit functions to its fullest extent. During low-load conditions, the voltage or the frequency scales down so that the chip runs at a lower speed with lower active-power consumption. Designers realize this technique through a combined hardware/software approach.
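The software half of such a scheme can be sketched as a policy that picks the slowest operating point able to meet the current demand. The table of operating points, the class and function names, and the assumption that throughput scales linearly with frequency are all hypothetical; a real implementation would also sequence the voltage and frequency changes through hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    freq_mhz: float
    vdd_v: float

    def relative_power(self, nominal):
        """Active power relative to nominal, using P proportional to F * V^2."""
        return (self.freq_mhz / nominal.freq_mhz) * (self.vdd_v / nominal.vdd_v) ** 2


# Hypothetical operating points, sorted from slowest to fastest (nominal last).
POINTS = [
    OperatingPoint(400, 0.8),
    OperatingPoint(800, 0.9),
    OperatingPoint(1200, 1.0),
]


def select_point(load_fraction, points=POINTS):
    """Return the slowest point whose frequency covers the demanded load.

    load_fraction is the demand as a fraction of peak throughput (0 to 1).
    """
    if not 0.0 <= load_fraction <= 1.0:
        raise ValueError("load_fraction must be between 0 and 1")
    required_mhz = load_fraction * points[-1].freq_mhz
    for point in points:
        if point.freq_mhz >= required_mhz:
            return point
    return points[-1]


nominal = POINTS[-1]
light = select_point(0.3)
print(light, f"relative power {light.relative_power(nominal):.2f}")
```

With these made-up points, a 30% load runs at the lowest point and draws roughly a fifth of nominal active power, because both the frequency and the squared voltage term shrink.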
On-die voltage regulators meet the demands of various active- and static-power requirements. ICs usually have off-chip voltage-regulator modules that can supply the voltage and current requirements for active states. However, designers are increasingly using on-die regulators as the number of voltage domains and the need for these voltage domains to respond faster to the demand current increase.
Stacking ICs that communicate with one another to minimize the signal interconnect is an emerging trend in low-power design. According to Apache’s Shanmugavel, manufacturers often stack processors and memory over a silicon interposer that makes connections using TSVs (through-silicon vias). These interposers provide a low-capacitance signal interconnect between the die, thus reducing the I/O’s active-power consumption. As the cost of 3-D ICs begins to decrease and as designers get a better understanding of thermal impact, a migration to 3-D ICs will occur across the industry.
To minimize static-power consumption, designers can use power gating, which provides the maximum amount of leakage-power savings for a device in standby. Shutting off the clocks for the functional units reduces active power, but the unit still consumes leakage power. Designers must understand several trade-offs in power gating before implementing the design.
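The article does not enumerate the trade-offs, but one that is commonly discussed is the break-even idle time: switching a domain off and on again costs energy, so gating pays off only when the block stays idle long enough for the saved leakage to cover that cost. The sketch below illustrates that single trade-off under assumed numbers; the function names and all values are hypothetical.

```python
def break_even_time_s(transition_energy_j, leakage_power_w, gated_leakage_power_w):
    """Idle time after which power gating saves net energy."""
    saved_power_w = leakage_power_w - gated_leakage_power_w
    if saved_power_w <= 0:
        raise ValueError("gating must reduce leakage power")
    return transition_energy_j / saved_power_w


def should_power_gate(expected_idle_s, transition_energy_j,
                      leakage_power_w, gated_leakage_power_w):
    """True if the expected idle period exceeds the break-even time."""
    return expected_idle_s > break_even_time_s(
        transition_energy_j, leakage_power_w, gated_leakage_power_w)


# Hypothetical block: 10 mW leakage, 0.5 mW when gated, and 2 uJ to power
# down and back up (including state save and restore).
params = dict(transition_energy_j=2e-6,
              leakage_power_w=10e-3,
              gated_leakage_power_w=0.5e-3)

t_break_even = break_even_time_s(**params)
print(f"break-even idle time: {t_break_even * 1e6:.0f} us")
print("gate for 1 ms idle:", should_power_gate(1e-3, **params))
print("gate for 100 us idle:", should_power_gate(100e-6, **params))
```

Other costs, such as wake-up latency, retention-cell area, and the inrush current when the domain powers back up, would need their own analysis and are not modeled here.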
One of the oldest techniques for reducing leakage power is to swap nominal-threshold-voltage gates with high-threshold-voltage gates. In CMOS, the subthreshold leakage decreases exponentially as the threshold voltage increases. Higher-threshold-voltage devices therefore have a lower leakage envelope than do lower-threshold-voltage devices but come at a cost of larger delays. Designers must perform a careful trade-off analysis to achieve optimal leakage savings using this technique.
Another approach for reducing static power, active back-biasing, increases the bias voltage of the substrate nodes in CMOS gates to reduce the leakage current. This biasing technique essentially increases the threshold voltage of a unit or the entire chip during standby modes, thus decreasing the leakage power. To get a feel for the adoption rates of these techniques, Synopsys collects data from customers using its Global User Survey (Figure 3).
Power adds another layer of complexity that designers must verify. It requires additional tool support, which vendors are cobbling into the tools now on the market. Power management also adds several new devices to the design, such as isolation logic, power switches, level shifters, and retention cells.
However, according to Krishna Balachandran, director of low-power-verification marketing at Synopsys, power optimizations may also involve sequential RTL transformations that must be verified against the original RTL. The absence of such verification can lead to nonfunctioning systems on chips or higher-than-desired leakage. Simulation approaches may be too slow, not cost-effective, and not exhaustive, leading to incomplete verification coverage of power optimizations. Traditional formal-equivalence tools typically target verification of combinational transformations and are inadequate for the kind of changes typically necessary for power optimizations. Most commercially available formal-equivalence tools also suffer from capacity and performance limitations that must be overcome to handle low-power designs with complex power architectures and hundreds of power domains. A new class of formal-equivalence tools with high capacity and performance targeting verification of sequential transformations must evolve to meet these new requirements.
According to Lauro Rizzatti, general manager of Eve-USA, power optimization also presents challenges for EDA vendors. Many low-power techniques are generally incongruent with RTL simulation or emulation, which abstracts out any notion of voltage. Designers must adapt these digital tools to support power intent and low-power-optimization techniques for implementation.
According to Dermott Lynch, vice president of marketing at Silicon Frontline Technology, power devices typically operate at 70 to 90% efficiency, which results in a loss of 10 to 30% of overall system power. Ely Tsern, vice president and chief technology officer for Rambus’ semiconductor-business group, adds that more aggressive power-mode transitions, with finer-grained power domains, will result in faster transition of local supply currents, which in turn can induce greater di/dt supply noise for sensitive local circuits, especially analog circuits.
Shanmugavel cautions, however, that, under all conditions, the power-delivery network should be able to sustain the load without compromising the voltage integrity. For example, when a global clock transitions and a functional unit turns on to perform a task, a transient-current demand occurs. This transient current can be three to five times that of the nominal current, depending on the functional block, which places an enormous load on the power-delivery network. You must validate the transient voltage noise on the network under these circumstances.
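A first-order feel for this transient noise can be obtained from a lumped model of the power-delivery network as a series resistance and inductance, where the droop is the sum of the resistive drop and the inductive term L·di/dt. The sketch below is an assumption-laden simplification: it ignores decoupling capacitance and resonance, and the function name and all component values are hypothetical.

```python
def supply_droop_v(i_nominal_a, surge_factor, rise_time_s, r_pdn_ohm, l_pdn_h):
    """Estimate droop for a current step using a lumped series R-L model.

    The current ramps linearly from i_nominal_a to surge_factor * i_nominal_a
    over rise_time_s. Decoupling capacitance is not modeled.
    """
    i_peak_a = i_nominal_a * surge_factor
    di_dt = (i_peak_a - i_nominal_a) / rise_time_s
    return i_peak_a * r_pdn_ohm + l_pdn_h * di_dt


# Hypothetical network: 5 mOhm, 10 pH, 1 A nominal current, 1 ns ramp.
for factor in (3.0, 5.0):
    droop = supply_droop_v(i_nominal_a=1.0, surge_factor=factor,
                           rise_time_s=1e-9, r_pdn_ohm=5e-3, l_pdn_h=10e-12)
    print(f"{factor:.0f}x surge: {droop * 1e3:.0f} mV droop")
```

Even in this simplified model the inductive term is comparable to or larger than the resistive drop for fast ramps, which is consistent with the concern about di/dt supply noise raised above; a signoff-quality analysis would use an extracted network and real current waveforms.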
You can reach Contributing Technical Editor Brian Bailey at firstname.lastname@example.org.