A turn-off: Power management complicates life for verification engineers
There is intense pressure on all levels of chip engineering to reduce power consumption. That situation, in turn, has led to increasingly dramatic—and invasive—measures to reduce power in individual blocks and circuits.
These efforts have been extremely successful, but they have come at a cost in design complexity, a frequent topic at conferences. At least as serious, and much less discussed, has been the impact on the verification process. At best, aggressive power management complicates functional verification. At worst, it can render a design unverifiable. EDN has spoken with design teams in the United States, Europe, and Asia to understand the scope of the problem in logic design and to see how the best designers are coping with the new verification challenge.
The most widely used logic-level power-management techniques fall neatly into a few categories, based on the obvious ways to reduce static and dynamic power. For static-power reduction, the only simple things you can do are to use high-threshold-voltage transistors as much as possible, reduce the voltage on the supply pins, or turn the supply off altogether. The techniques for accomplishing these tasks include multithreshold design, multivoltage design, and power gating.
For reducing dynamic power, your options are to reduce the supply voltage or reduce the frequency. Techniques for this process include clock gating and DVFS (dynamic-voltage-frequency scaling). The sidebar “The one-minute power manager: a primer” contains a brief description of each of these techniques. This article will examine the impact of each of these ideas on verification.
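Both levers follow from the standard first-order model of switching power, P = a * C * V^2 * f, where a is the activity factor, C the switched capacitance, V the supply voltage, and f the clock frequency. The sketch below only illustrates that relationship; the function name and every number in it are assumptions chosen for easy arithmetic, not figures from any design discussed in this article.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order switching power in watts: P = a * C * V^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical block: 20% activity, 1 nF switched capacitance, 1.2 V, 500 MHz.
baseline = dynamic_power(0.2, 1e-9, 1.2, 500e6)

# Halving the frequency alone halves the dynamic power.
half_freq = dynamic_power(0.2, 1e-9, 1.2, 250e6)

# Also lowering the supply to 0.9 V scales power by a further (0.9 / 1.2)^2.
scaled = dynamic_power(0.2, 1e-9, 0.9, 250e6)

print(f"baseline       {baseline:.4f} W")
print(f"f/2            {half_freq:.4f} W")
print(f"f/2 and 0.9 V  {scaled:.4f} W")
```

Because voltage enters as a square, lowering it pays off faster than lowering frequency, which is one reason voltage scaling is attractive despite the verification cost described below. Clock gating acts on the same formula by cutting the effective activity of the gated logic.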
At a coarse level, clock gating can be one of the simplest techniques for reducing dynamic power. Hence, it sees frequent use in cases in which the main problem is to limit peak dynamic power. “We use mainly clock gating today … to limit package cost,” says Laurent Ducousso, STMicroelectronics’ IP (intellectual-property)-verification and system-modeling manager.
Ducousso says that his company’s design teams define clock domains early, basing them on IP or subsystem boundaries. That approach allows the team to determine when a domain may be clock-gated based on a high-level view of the operating modes and to check the sequencing of clock gating as an added task in functional verification. The physical-design team inserts the gating circuitry during the back-end flow. In simple cases, clock gating does not impact the state within the block, so there is no need to verify state retention. But this case does not hold true for every block. “We are just getting into situations where we have to worry about state retention,” Ducousso says. “We are looking at our test benches to see what we can do with verifying these situations. Assertions may turn out to be quite useful.”
Functional verification of clock gating is just added work, not a new concept, according to Michael Floyd, IBM’s chief architect for energy management in its Power Systems Group. “Basically, clock gating adds to the state space,” Floyd says. “It needs to be timed and verified and tested in manufacturing like any other function.” For improved test coverage, designers sometimes put in modes to disable clock gating. “But some designs use the clock gate as a hold tap on latches,” Floyd comments, “so it becomes a part of the functional design, and we have to verify and test that function with clock gating turned on.”
As clock gating gets more complex, the verification team’s attention seems to shift from functional correctness to timing and signal-integrity issues. “We rely on design guidelines to get clock-gating structures functionally correct during the front-end-design process,” explains Krish Krishnamoorthy, director of advanced methodology in the System LSI group at Toshiba. “Then, we check during physical verification to make sure everything came out right.” These checks might include careful timing because, for instance, a late-arriving gate signal could gate a clock off during its active phase, with disastrous results downstream.
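The late-arriving gate signal can be illustrated with a small sampled-waveform model. The sketch below is an assumption-laden toy, not any vendor's flow: it compares a plain AND gate with a latch-based gating cell (a common remedy, though the article does not say which structure any of these teams uses) and reports the width of every high pulse so that a truncated pulse stands out. All function names and waveforms are hypothetical.

```python
HIGH_SAMPLES = 4  # samples per clock phase in this toy model


def and_gate_clock(clk, enable):
    """Naive gating: the enable reaches the clock path directly."""
    return [c & e for c, e in zip(clk, enable)]


def latch_gate_clock(clk, enable):
    """Latch-based gating: enable is sampled only while the clock is low."""
    out, latched = [], 0
    for c, e in zip(clk, enable):
        if c == 0:
            latched = e  # latch is transparent during the low phase
        out.append(c & latched)
    return out


def high_pulse_widths(wave):
    """Return the length, in samples, of each run of 1s."""
    widths, run = [], 0
    for v in wave:
        if v:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Three cycles, each starting with the low phase.
clk = ([0] * HIGH_SAMPLES + [1] * HIGH_SAMPLES) * 3
# Enable falls at sample 14, in the middle of the second high phase.
enable = [1] * 14 + [0] * 10

for name, gate in (("AND gate", and_gate_clock), ("latch cell", latch_gate_clock)):
    widths = high_pulse_widths(gate(clk, enable))
    runts = [w for w in widths if w < HIGH_SAMPLES]
    print(name, "pulse widths:", widths, "runt pulses:", runts)
```

In this model the AND gate cuts the second pulse short, while the latch holds the enable steady for the whole high phase and simply suppresses the following pulse. Real timing checks work on extracted delays rather than samples, but the failure they look for is the same.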
The complexity of this process varies. “For simple datapaths, there’s hardly any impact on the verification process,” says IBM fellow Brad McCredie. “But, if designers start using fine-grained clock gating in control logic, that’s a major whammo to the verification process.”
Early in the pursuit of power management, design teams began to use voltage islands—separating blocks that did not have to run at the maximum clock speed and supplying them with reduced voltage. This practice imposed two new tasks on verification. First, it required that the verification team make sure that correct level shifters were in place on all of the signals passing between voltage islands, including clocks. Second, it required that cell libraries had timing files for each of the voltages at which a design might use the cells and that the timing team used the right delays for the voltage at which a voltage island was operating. There are other issues, as well, when signals from one voltage domain cross another domain.
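The first of these checks is structural and lends itself to a simple audit. The following is a minimal sketch, assuming the team can export a list of island-to-island signals along with a flag saying whether a level shifter sits on each one; the data model, island names, and voltages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crossing:
    signal: str
    src_island: str
    dst_island: str
    has_level_shifter: bool


def missing_level_shifters(crossings, island_voltage):
    """Names of signals that cross between different voltages with no shifter."""
    return [
        c.signal
        for c in crossings
        if island_voltage[c.src_island] != island_voltage[c.dst_island]
        and not c.has_level_shifter
    ]


# Hypothetical islands and their supply voltages in volts.
island_voltage = {"cpu": 1.2, "dsp": 0.9, "io": 1.2}

crossings = [
    Crossing("cpu_to_dsp_data", "cpu", "dsp", True),
    Crossing("dsp_clk", "cpu", "dsp", False),       # clocks need shifters too
    Crossing("cpu_to_io_irq", "cpu", "io", False),  # same voltage, none needed
]

print(missing_level_shifters(crossings, island_voltage))
```

A production check would also confirm the shifter's direction and type, which this sketch leaves out.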
Once designers had declared and isolated voltage islands, it was a natural step to simply turn off the power to a voltage island when it was not in use. But this idea—power gating—creates a whole new set of issues for the verification team. Obviously, blocks that remain on must function correctly when power-gated blocks are off. Less obviously, turning the power off and on requires a delicate dance involving isolating the inputs and outputs, saving the internal state of the block, shutting off power, restoring power, restoring state, and reconnecting the block to the rest of the design (Figure 1).
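The ordering of that dance is what a test bench ultimately has to check. As a rough sketch, assuming the simulation can emit a trace of named power-control events, a checker might look like the following; the event names and the function are hypothetical, and real sequences usually add timing constraints that are omitted here.

```python
# Expected order of one power-down/power-up cycle, as described in the text.
EXPECTED = [
    "isolate",
    "save_state",
    "power_off",
    "power_on",
    "restore_state",
    "reconnect",
]


def check_power_cycle(trace):
    """Return (ok, message) for one power-cycle event trace."""
    # Ignore events that are not part of the power sequence.
    steps = [event for event in trace if event in EXPECTED]
    if steps == EXPECTED:
        return True, "ok"
    for i, (got, want) in enumerate(zip(steps, EXPECTED)):
        if got != want:
            return False, f"step {i}: expected {want}, got {got}"
    return False, f"expected {len(EXPECTED)} steps, got {len(steps)}"


good = ["isolate", "save_state", "bus_idle", "power_off",
        "power_on", "restore_state", "reconnect"]
bad = ["isolate", "power_off", "save_state",
       "power_on", "restore_state", "reconnect"]

print(check_power_cycle(good))
print(check_power_cycle(bad))
```

The second trace fails because power is removed before the state is saved, which is exactly the kind of sequencing error that is hard to spot by inspecting waveforms.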
“We verify power-gated designs in four stages,” explains Freescale Semiconductor’s verification team lead, Prashant Bhargava. “We examine the shut-off sequences. Then, we verify the design with the blocks in the power-off state, mainly to make sure they are properly isolated and have retained state correctly. Next, we verify the power-up sequences. And, finally, we verify the blocks in their power-on state.”
Freescale, like many other companies, relies on Cadence CPF (Common Power Format) files to capture the intent of the power management. “In the CPF file, we identify the gating signals, what blocks are gated, the power controller, the interfaces between switched and unswitched blocks, the commands to isolate blocks, and the location of level shifters,” Bhargava says. The CPF file becomes the central definition of design intent for power management and can drive some other tools—primarily, Cadence tools—downstream to direct such tasks as simulation and synthesis.
The CPF file may hold promise for centralizing the power design, but it also has its own significant issues. For example, there is no method for automatically checking the CPF file. “We can identify errors in the CPF only during the verification process,” Bhargava explains.
Sam Leung, director of digital design at Staccato Communications, seconds this point. “Verifying that what you got out of synthesis is what you asked for is still a very manual, visual process,” he says. In power gating in particular, the verification team has to manually check that the correct structures actually exist in the design. “There are so many inputs and outputs and so much manual table entry involved that errors happen,” Leung says. “You have to check the files.”
After verifying the CPF or UPF (Unified Power Format) files, there is the matter of verifying the design. “There is no tool to grab data out of the CPF file and generate a test bench for the power management,” Leung laments. “We have to create a set of test vectors just to verify the operation of the power-management functions and the power controller. Today, we can perform these checks only at the functional level, not the gate level. The reason for this [situation] is that the big challenges come in checking signals that cross power domains,” Leung explains. “You need to know if something is leaking somewhere. You can’t verify these signals in isolation; you have to work with the full-system models.”
Nilesh Ranpura, a design manager at eInfochips, also relies on CPF to organize the verification effort. “When you start verification, you have to understand which blocks are at which voltages,” Ranpura says. In other words, he adds, “you have to identify the sequences of the power-control state machine and what the voltage combinations will be in each state. This [task] isn’t automated, and the CPF data isn’t accessible to all simulation tools, but, even so, we have found that having the CPF data has reduced our verification time by about 40%.”
One of the problems Ranpura identifies is that, when you are turning power on and off in blocks, the signal states in the system are no longer limited to the simple 1, 0, X, Z, and so forth that most simulation tools offer. There are other possibilities. “If you use Boolean assertions, you sometimes have to suppress incorrect firings,” Ranpura says.
IBM’s Floyd agrees about the need for states beyond 1, 0, X, and Z in simulation (Figure 2). “When we initially started using power gating, it caused some indigestion because it increased the number of situations in which you deal with an unknown state,” he says. “But it turns out that conventional X-state analysis is too pessimistic; removing all the X states hurts the critical paths. So, we are actively developing alternatives. In principle, a formal tool can verify that a power-gated domain could not corrupt data. That [ability] is exciting.”
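The pessimism Floyd describes can be seen in a textbook example. The sketch below is a toy three-valued evaluator, not IBM's method or any simulator's actual algorithm: a powered-off domain drives X onto a multiplexer select, gate-by-gate evaluation reports X even though both data inputs agree, and an isolation clamp removes the unknown altogether. All names are hypothetical.

```python
X = "X"  # unknown value, for example the output of a powered-off block


def and3(a, b):
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return X


def or3(a, b):
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return X


def not3(a):
    return X if a == X else 1 - a


def mux_gates(sel, a, b):
    """Gate-by-gate mux: (sel AND a) OR (NOT sel AND b)."""
    return or3(and3(sel, a), and3(not3(sel), b))


def mux_exact(sel, a, b):
    """Resolve an unknown select by trying both of its possible values."""
    if sel != X:
        return mux_gates(sel, a, b)
    r0, r1 = mux_gates(0, a, b), mux_gates(1, a, b)
    return r0 if r0 == r1 else X


def isolate(value, iso_enable, clamp=0):
    """Isolation cell: drive a known clamp value while the source is off."""
    return clamp if iso_enable else value


print(mux_gates(X, 1, 1))                  # pessimistic result
print(mux_exact(X, 1, 1))                  # both inputs agree, so output is known
print(mux_gates(isolate(X, True), 1, 0))   # clamped select picks input b
```

Case-splitting on every unknown grows exponentially, which is one reason formal methods are attractive for this problem.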
Once the verification team is comfortable that the sequencer is in fact following the design intent, it must verify not only that each control signal arrives, but also that it arrives at the correct voltage and on time. “When you are switching blocks that are connected to each other, you have to verify that the level-shifter sequencing is correct,” says Ranpura. “It must exactly match the design intent, or you can generate huge numbers of errors with no easy way to trace them back to the problem.”
Other non-Boolean problems arise, as well. “Memory blocks are an interesting issue for power gating,” says NXP’s microcontroller-design manager, Avindt Chopra. “You can power-gate them, but, if you do, you had better simulate the transitions very carefully—for instance, modeling the inrush current in the memory block as you change voltage.”
Buses can be another important point to watch. “When you are gating an entire IP block,” says ST’s Ducousso, “if the block is connected to a system bus, you have to be certain that the bus interface stays electrically clean during the transitions.” Ducousso makes another interesting point with regard to buses: Misfortune follows if you power down a block that has pending bus requests. In this case and probably in a good many others, the actual state that you must inspect before powering down the system may not reside fully inside the block. Some of it may be in an adjoining block that handshakes with this one, or it may be halfway across the chip in a bus arbiter. Another important case to check is the state of DRAM controllers during power sequencing. Corrupted memory from a spurious write cycle generated during a power transition can be nearly impossible to trace.
When a design moves from power gating to DVFS, things can get even more complex. “The complexity of a DVFS design completely explodes in verification,” says NXP’s Chopra. Now, every possible combination of voltages in the voltage islands in the system represents a new corner that you must examine.
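The growth is easy to quantify. The sketch below simply enumerates the cross product of per-island supply settings; the island names and voltage lists are invented for illustration, and 0.0 is used here as a stand-in for a power-gated island.

```python
from itertools import product

# Hypothetical per-island supply options in volts; 0.0 means gated off.
island_voltages = {
    "cpu": [0.9, 1.0, 1.2],
    "dsp": [0.9, 1.2],
    "modem": [0.0, 1.0, 1.2],
}


def voltage_corners(islands):
    """Yield every combination of island voltages as a dict."""
    names = sorted(islands)
    for combo in product(*(islands[name] for name in names)):
        yield dict(zip(names, combo))


corners = list(voltage_corners(island_voltages))
print(len(corners), "voltage corners")
print(corners[0])
```

Three small islands already give 18 combinations before process, temperature, and frequency are considered, so teams generally need a way to mark which combinations the power controller can legally reach and prune the rest.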
Fortunately, the steps are about the same. First, verify that the power sequencer is doing what the designers intended. Here, however, the problem is more complex because it involves the application software as well as the hardware. In DVFS, you may change the voltage on a block because the software says the block doesn’t need to be fast for its current task. In fact, in DVFS designs, sequencing becomes so complex that designers often opt for a microcontroller or a task on a CPU in place of a state machine. This situation makes sequence verification a software-verification problem as well as a hardware-verification challenge.
Next, it is necessary to verify that the blocks in isolation are going through the correct sequence of events on voltage changes. This process can become nightmarishly complex because the sequencer, clock-gating circuitry, state-retention circuitry, and isolation buffers may all be in different voltage and clock domains and hence subject to different timing. “We have seen clock-gating signals released prematurely because of differences in voltage levels that were not captured by the simulator, for instance,” eInfochips’ Ranpura relates.
IBM’s Floyd points out that, when connected blocks are working at different voltages and frequencies, the system has in effect become globally asynchronous and locally synchronous. “We have developed tools to detect asynchronous paths and to check their attributes when they cross a synchronous domain,” Floyd explains. “For instance, we might modulate the cycle time on a clock to verify that there will be no loss of data. And we audit designs to make sure that domain crossings comply [with] one of a few acceptable techniques.”
Signal integrity can be an issue, as well. “It is not just a matter of ensuring correct functional behavior at all the new combinations of corners,” warns Toshiba’s Krishnamoorthy. “Remember that crosstalk can be different depending on voltage.” So, for instance, a signal might not be an aggressor when the block in which it is originating is at low voltage. It might become a serious problem, however, if the originating block is running at high voltage and frequency and the signal crosses a block that is turned down.
A systems approach
The details of what you need to check during verification of a power-managed design seem endless. But one thing many experts emphasize is that you must verify the design as a system, not as blocks in isolation. Many of the key issues, particularly in power-gated and DVFS designs, which can have many signal levels and asynchronous clock domains, occur not within blocks but between blocks. Thus, it becomes necessary for the verification team to examine the interactions between blocks at different combinations of power and frequency.
Emerging tools such as CPF and UPF can help keep track of these complexities on a full-chip level, but they are neither universally compatible with widely used tools nor universally used among design teams. IBM, for example, has its own internal tools for domain tagging and tag inspection during verification to keep the power state of the system known and consistent.
Even at the architectural level, power management is imposing changes to make verification more viable. “We have logical design hierarchy based on functional blocks, and we have physical hierarchy based on the implementation,” observes Rob Cosaro, systems/architecture/applications manager of business-line standard ICs and microcontrollers at NXP. “But, increasingly, we have a third hierarchy based on clock and voltage domains. Power management in effect is defining another hierarchy.” And having that hierarchy clear from architectural design may be the most important factor in making verification successful.
The one-minute power manager: a primer
As energy-conservation requirements have grown more stringent, many techniques have emerged for saving active and leakage power in logic circuits. It is not uncommon to find multiple techniques at use in different parts of a design. Here is a quick overview of these approaches.
Clock gating is one of the earliest techniques for reducing dynamic power. This method simply shuts off the clock to portions of the circuit that are inactive. It can, however, increase static power, because the clock-gating cells need to be fast and designers often implement them with large, low-threshold transistors.
Originally, designers used clock gating at the block level as a way of creating a standby mode. More recently, designers have employed fine-grained clock gating, down to the level of individual latches. Control circuitry can simply decide not to issue a clock pulse on a cycle when the data in a latch does not change. “We see designers using clock gating as a ‘hold’ tap on latches,” says Michael Floyd, IBM’s chief architect for energy management in its Power Systems Group. Thus, fine-grained clock-gating schemes can become exceedingly complex.
If some blocks can be slower than others, it makes sense to run the slower blocks at a lower frequency and turn down the supply voltage until these blocks just meet timing. This technique, known as voltage islands or multivoltage design, was also one of the early approaches. It is slightly more complex than coarse clock gating, especially in its impact on timing closure.
Power gating involves turning off the supply voltage to a block to stop both static- and dynamic-power consumption. The technique is more complex than it sounds. You have to be sure that there will be no activity needed from the block when the power is off. You also have to deal with state—including whether to preserve state while the power is off, where to save it, and how to restore it. It may also involve determining how to sequence the shutdown and power-up cycles and whether you can anticipate activity on the block early enough to perform the power-up sequence. You also must isolate the block from surrounding circuitry during power transitions.
DVFS (dynamic-voltage-frequency scaling) is a blend of voltage islands and power gating: You adjust the voltage and clock frequency of each block on the fly so that it is just meeting its deadlines for the current task. This approach requires the ability to modulate clocks on a block-by-block basis, to dynamically vary supply voltages at the same granularity, and to ensure that nothing unexpected will happen while you are switching to a new operating point. It requires fairly detailed knowledge of the application's performance requirements. And the whole chip must meet timing at every legal combination of block operating frequencies.
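The "just meeting its deadlines" policy can be sketched as a table lookup. This is a minimal illustration, assuming a fixed list of characterized operating points and a known cycle count for the task; the table values and the function name are hypothetical, and a real governor would also account for the time and energy spent switching points.

```python
# Hypothetical operating points: (frequency in Hz, supply in V), slowest first.
OPERATING_POINTS = [(200e6, 0.9), (400e6, 1.0), (600e6, 1.2)]


def pick_operating_point(cycles_needed, deadline_s, points=OPERATING_POINTS):
    """Return the slowest point that finishes in time, or None if none can."""
    for freq_hz, vdd_v in points:
        if cycles_needed / freq_hz <= deadline_s:
            return freq_hz, vdd_v
    return None


print(pick_operating_point(1e6, 0.01))   # light task
print(pick_operating_point(3e6, 0.01))   # heavier task
print(pick_operating_point(1e7, 0.01))   # cannot meet the deadline
```

Each entry in such a table is an operating point at which the block, and its interfaces to neighbors at other points, must meet timing.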
In DTVC (dynamic-threshold-voltage control), you dynamically control the threshold voltage on individual sets of transistors, thereby choosing a leakage-versus-speed point that just matches the requirements of the moment on a path. This approach is effective, but it requires special structures in the semiconductor process. It is custom-design stuff today, primarily in use by only a few advanced-processor vendors.