Panel dissects the details of managing power on the multicore SoC
In a way, we have designed the edifice of SoC power management from the foundations up—like good engineers and bad architects, as the saying might go. The effort began with process innovations, continued through device and circuit design, and worked its way up to block-level constructs such as power-gating and DVFS (dynamic voltage-frequency scaling)—each level of complexity built on the ones below it. But now that we have all of the foundation in place, we are at a bit of a loss as to how the building should look.
This becomes most apparent on the multicore SoC. One of the fundamental ideas of multicore architecture is that by switching cores on and off or by throttling them, you can match the performance and power consumption of the chip almost exactly to the real-time needs of the task load, to a degree that is not feasible with huge single-core processors. But that promise implies a lot about the foundation. Such an SoC would be a mosaic of independently controlled voltage and clock domains, all mortared together with interfaces that handle voltage translation, clock-boundary crossings, and intricacies such as data retention and non-disturbance.
All the elements of that foundation are in place. And even, to a great extent, the design tools to lay the foundation are in place. (We will look politely away from the verification question for now.) But what seems to be missing is a unified, standardized way to control all those regions so that the end-user actually gets the lowest heat dissipation, the longest battery life, or whatever. That was the topic a panel took on at the Multicore Expo yesterday afternoon.
The first question the panel was asked to tackle was the matter of design capture. Just how does the chip architect capture her power intent, if you will, so the design team can implement it? The panelists' answers were revealing.
Stephen Hamilton, application architect for wireless systems at Sonics, stated that such intent-capture must be at the system level, and must include the notion of centralized software control. But he lamented that there are no standards for expressing these concepts today. Silistix senior solutions architect Bob Adair agreed, saying that today chip designers were employing all the tricks of the trade to minimize power, and yet there were no standards for expressing what they had done or for tying the blocks’ power-management engines together. "We need an architectural capture language for power," Adair said.
Darren Jones, engineering director for microprocessor development at MIPS, offered a different way of framing the question. "If you ask an architect what their energy intent is," Jones said, "it will be to minimize energy for the required performance. So in a way what you are asking is not about energy, but about performance intent."
Srikanth Jadcherla, R&D group director in the Verification Group at Synopsys, added that there were really two important missing pieces here. "We created a language for expressing the implementation of power management hardware," Jadcherla said. "But we didn’t put into that any way of capturing the intent of the power management system."
That was one piece. The second missing element was a control network. There is no standard protocol for command or control of power-management features during operation. This lack turns every power-managed chip with external IP into a Tower of Babel, and has forced everyone to invent their own approach to controlling the power-management system. Jadcherla pointed to ACPI (the Advanced Configuration and Power Interface) in the personal-computer space as an example of what might be done. But he warned that even this mature interface, focused on a single architecture, was not scaling well into the multicore world.
Hamilton asked if the existing power-management implementation languages, UPF and CPF, could be extended to capture intent and operation as well as configuration. He asserted that existing tools were capable of processing UPF to validate the construction of the interfaces between clock domains or power domains, and could do rule checking. But he questioned whether there were tools that could, for instance, check the validity of power-up/down or DVFS sequences.
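To make Hamilton's question concrete, here is a minimal sketch of what checking a power-up/down sequence could look like. It is not drawn from UPF, CPF, or any tool the panel mentioned: the state names, the step names, and the ordering they encode (stop the clock, isolate outputs, save state, then remove power, and the reverse on the way up) are all assumptions made for illustration.

```python
# Hypothetical legal transitions for one power domain: (state, step) -> next state.
# The ordering is an assumption for illustration, not taken from UPF or CPF.
TRANSITIONS = {
    ("on", "stop_clock"): "clock_stopped",
    ("clock_stopped", "isolate"): "isolated",
    ("isolated", "save_state"): "saved",
    ("saved", "power_off"): "off",
    ("off", "power_on"): "saved",
    ("saved", "restore_state"): "isolated",
    ("isolated", "release_isolation"): "clock_stopped",
    ("clock_stopped", "start_clock"): "on",
}


def check_sequence(steps, start_state="on"):
    """Walk a list of step names; return (final_state, error).

    error is None when every step is legal, otherwise a message naming
    the first illegal step and the state it was attempted in.
    """
    state = start_state
    for index, step in enumerate(steps):
        next_state = TRANSITIONS.get((state, step))
        if next_state is None:
            return state, f"step {index} ({step!r}) is illegal in state {state!r}"
        state = next_state
    return state, None


good = ["stop_clock", "isolate", "save_state", "power_off"]
bad = ["stop_clock", "power_off"]  # skips isolation and state retention
print(check_sequence(good))
print(check_sequence(bad))
```

A real checker would have to reason about many interacting domains and about DVFS operating points as well; this sketch covers only the ordering of steps within a single domain.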
Other panelists agreed that articulating a physical structure for power management and verifying that it was structurally correct and that it operated were all within the range of existing UPF- or CPF-compliant tools. But questions remain about the protocol for controlling the hardware, and indeed about the architecture of the controller itself. Should the chip be under central control? And should that control be a state machine responding to the current state of the various blocks on the chip, or should it be a software-based machine working on data provided by the operating system or its applications?
At this point the discussion became rather complex. Adair said that much of the detailed management of power operations could actually be done in the interconnect, rather than explicitly in a central controller. Hamilton pointed out that in some cases, features of the interconnect were necessary in order for advanced power management to be workable at all. He offered the example of a multicore system in which a vital link between two cores passes through a number of other cores. If the power-management controller decides to shut down one of these intermediate cores, what happens to the link latency? "We have concluded that multi-threaded, non-blocking fabric is essential in order to keep the interconnect latency independent of the power state," he said.
Further discussion covered the intricacies of DVFS or power-gating in coherent systems. What if one shuts down a core whose cache has dirty shared data? Flushing that cache back to a shared level or, worse, to DRAM entails considerable energy expenditure. And it could significantly impact the execution time of tasks on other cores that shared the data, which might erase the energy savings of shutting the core down in the first place.
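The trade-off can be put as a rough break-even calculation: shutting a core down pays off only if the core stays off long enough for the saved leakage energy to exceed the cost of flushing and later refilling its cache. The sketch below is a first-order model under stated assumptions. The function names and every number in the example are hypothetical, and it ignores the slowdown imposed on other cores that share the data.

```python
def shutdown_break_even_s(dirty_lines, flush_energy_j_per_line,
                          refill_energy_j, leakage_power_w):
    """Seconds a core must stay off before shutting it down saves energy.

    Overhead = energy to write back the dirty lines + energy to refill the
    cache after wake-up. Savings accrue at the core's leakage power.
    """
    if leakage_power_w <= 0:
        raise ValueError("leakage_power_w must be positive")
    overhead_j = dirty_lines * flush_energy_j_per_line + refill_energy_j
    return overhead_j / leakage_power_w


def worth_shutting_down(expected_idle_s, break_even_s):
    """True when the expected idle period exceeds the break-even time."""
    return expected_idle_s > break_even_s


# Hypothetical figures: 1000 dirty lines at 10 nJ each, 5 uJ to refill,
# 10 mW of leakage saved while the core is off.
t = shutdown_break_even_s(1000, 10e-9, 5e-6, 10e-3)
print(f"break-even: {t * 1e3:.2f} ms")
print(worth_shutting_down(expected_idle_s=5e-3, break_even_s=t))
```

With these made-up figures the break-even point is about 1.5 ms, so a 5 ms idle period would justify the shutdown while a 1 ms one would not.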
Panelists also discussed verification issues that arise with aggressive multicore power management. For example, what happens when you power up a core in the midst of several other, already-operating cores? Can the inrush current of an entire 32-bit CPU core, its caches, sense amps, clock drivers, and the like cause a local brown-out on a power bus? Can the transients couple through the interconnect or the substrate to adjacent logic or memories? Most signal-integrity analyses don’t examine the size of power-up or power-down transients.
Jones mentioned that this question uncovered a bit of a secret in the CPU design world: the care that must be taken to size local power rails. "It’s a real problem," he said, "to get the rails big enough not to cause brown-outs if you power up a core during operation of the chip. We tend to err toward overdesign."
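A back-of-the-envelope view of the rail-sizing problem uses the capacitor relation I = C·dV/dt for the inrush current and Ohm's law for the resulting droop on a shared rail. The sketch below is only a first-order estimate. It ignores package inductance, decoupling capacitance, and the nonlinear behavior of the power switches, and the function names and example values are hypothetical rather than anything the panelists quoted.

```python
def inrush_current_a(domain_capacitance_f, supply_v, ramp_time_s):
    """Average current needed to charge a domain's capacitance: I = C * V / t."""
    if ramp_time_s <= 0:
        raise ValueError("ramp_time_s must be positive")
    return domain_capacitance_f * supply_v / ramp_time_s


def rail_droop_v(current_a, rail_resistance_ohm):
    """Resistive voltage drop seen on the shared rail: V = I * R."""
    return current_a * rail_resistance_ohm


def min_ramp_time_s(domain_capacitance_f, supply_v,
                    rail_resistance_ohm, droop_budget_v):
    """Shortest ramp that keeps the resistive droop within the budget."""
    if droop_budget_v <= 0:
        raise ValueError("droop_budget_v must be positive")
    return (domain_capacitance_f * supply_v * rail_resistance_ohm
            / droop_budget_v)


# Hypothetical core: 2 nF of switched capacitance, 1.0 V supply,
# 100 ns ramp, 0.5 ohm of shared rail resistance.
i = inrush_current_a(2e-9, 1.0, 100e-9)
print(f"inrush: {i * 1e3:.1f} mA, droop: {rail_droop_v(i, 0.5) * 1e3:.1f} mV")
print(f"min ramp for 5 mV budget: {min_ramp_time_s(2e-9, 1.0, 0.5, 5e-3) * 1e9:.0f} ns")
```

The model shows the two levers a designer has: widen the rails (lower resistance) or slow the ramp (lower current), which matches the bias toward overdesign that Jones described.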
Jadcherla pointed out another such transient issue: the use of header transistors for power gating instead of footer transistors. He said that for some reason, perhaps because turning off footer transistors alters ground capacitance, footer transistors tend to generate more noise on the supply rails when they are switched, and header transistors seem to generate less.
So the discussion managed to range from system-level capture of power intent to protocols for communications between power-management objects to details of implementation within power-managed blocks. It was a thoroughly informative afternoon.