ARM IP group wrestles complexity of 32-nm physical design
A presentation at the ARM Developers' Conference reveals the extent to which library developers will shield chip designers from the challenges of 32-nm lithography.
By Ron Wilson, Executive Editor -- EDN, 10/27/2009
In a fascinating short presentation at the ARM Developers' Conference last week, Dipesh Patel, vice president for physical IP at ARM, described that organization's plans to support the Common Platform 32-nm high-k/metal-gate (HKMG) process with libraries and memory compilers. In the course of the talk Patel also gave important insights into the growing complexity of physical design at 32 nm and beyond, the consequent future of cell libraries, and the rapid blurring of the lines between cell-based and custom design styles.
Patel described a library development effort that began early in 2008, and that required close cooperation both with the Common Platform process developers and with physical-design tool vendors. The resulting family of libraries is not quite like anything that has gone before it, at least in the commercial space. The family actually has several major branches. There are Foundation libraries, which, as the name suggests, are the baseline, general-purpose digital libraries. But in addition, in recognition that at 32 nm the traditional cell-based flow is breaking down, there are two more levels of library IP: Enhanced and Enhanced-Core.
The latter category appears to include IP libraries developed for implementation of specific ARM cores. Patel mentioned libraries with power-gating optimizations for implementing CPU cores, area-optimized libraries specifically for ARM's Mali graphics core, and ultra-low-power memory compilers. The decision to create libraries for specific ARM cores may simply be ARM seeking to gain competitive advantage from its expensive acquisition of Artisan Components. Or it may be telling us that the resources necessary to do custom circuit design are growing beyond the means even of CPU implementation teams.
Patel went on to give more detailed information about the cell architecture. The pin layout is designed for flexibility: some pins have multiple hit-points, and some objects can be dropped from the cell layout to improve access to interior pins. These efforts required ARM to work with the Synopsys Zroute designers to make sure the routing algorithms could exploit the new cell features.
Results seem to justify the effort. Patel claimed that he is seeing 87 to 95% silicon utilization on nine-track libraries, and 90 to 97% on 12-track libraries, using five routing layers.
Another interesting feature Patel described was what he called multi-channel library design. The multi-channel libraries provide pairs of footprint-compatible cells with two different channel lengths, without resorting to the use of an additional mask layer. That means the designer now has two closely spaced choices of channel length—and hence, threshold voltage—for each of the three Vt choices offered by the process. Fortunately, the expanded choices fit into the existing multi-Vt power and timing flows, so from the design team's point of view the new cell option means finer-grained control over the leakage/speed tradeoff with no change in design flow.
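To make the arithmetic of that finer-grained control concrete, here is a minimal sketch that enumerates the options for a single cell footprint. The label names are hypothetical; Patel's talk, as reported here, gives only the counts (three Vt choices from the process, two channel lengths per footprint).

```python
from itertools import product

# Hypothetical labels: the article gives only the counts (three Vt choices,
# two channel lengths per footprint), not the actual names.
VT_CLASSES = ["low_vt", "standard_vt", "high_vt"]
CHANNEL_LENGTHS = ["short_channel", "long_channel"]


def cell_variants(vt_classes=VT_CLASSES, channel_lengths=CHANNEL_LENGTHS):
    """Return every (Vt, channel length) pairing for one cell footprint."""
    return list(product(vt_classes, channel_lengths))


variants = cell_variants()
for vt, length in variants:
    print(f"{vt} / {length}")

# 3 Vt choices x 2 channel lengths = 6 leakage/speed points per footprint
print(len(variants))
```

Because the paired cells are footprint-compatible, swapping one variant for another would not, in principle, disturb placement, which is presumably why the options slot into existing multi-Vt flows.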
Along with describing new capabilities, Patel offered some comments that hinted at the underlying complexity of designing the lower layers at 32 nm. "We recommend against routing on Metal-1," Patel said. "The litho complexities are just too great." At another point he said, "We found that DRC-only sign-off is now insufficient. You must do lithography simulation, as well." One response to these complexities, which Patel illustrated in his foils but did not mention, was a growing rigidity of design rules. His plots showed no user routing on Metal-1, and strictly unidirectional routing on Metal-2 and Metal-3. Patel did mention that the routing approach his team used opened up channels for long runs on Metal-2 and Metal-3, freeing Metal-4 for global routes.
ARM has taped out a number of test chips to measure the progress of its development, including at least one design with multiple Cortex-M3 processor cores. Patel said that the results are refreshing: "With HKMG, scaling is back."
He said that, compared with 45 nm at the block level rather than the circuit level, the 32-nm results showed not only the expected 55% reduction in area but also a 40% drop in leakage, a 30% reduction in dynamic power, and, rather surprisingly, a 24% increase in maximum clock frequency. How much of this block-level payoff is due to the gate stack and how much is due to the advances in cell design and place-and-route technology is unclear.
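For readers who prefer ratios to percentages, the short sketch below converts the reported figures into 32-nm/45-nm multipliers. The derived leakage-per-area number is an illustration only: it assumes all four figures describe the same block under the same conditions, which the talk as reported did not confirm.

```python
# Block-level changes reported in the talk, 32 nm relative to 45 nm.
# Negative values are reductions; positive values are increases.
REPORTED_CHANGE_PCT = {
    "area": -55,
    "leakage": -40,
    "dynamic_power": -30,
    "max_clock_frequency": 24,
}


def to_ratio(change_pct):
    """Convert a percentage change into a 32-nm/45-nm multiplier."""
    return 1.0 + change_pct / 100.0


ratios = {name: to_ratio(pct) for name, pct in REPORTED_CHANGE_PCT.items()}

# Assumption: leakage and area figures refer to the same block, so their
# quotient approximates the change in leakage per unit area.
leakage_density_ratio = ratios["leakage"] / ratios["area"]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"leakage per unit area (derived): {leakage_density_ratio:.2f}x")
```

Under that assumption, total leakage falls while leakage per unit of silicon area rises, because area shrinks faster than leakage does.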
In a separate announcement, Subramani Kengeri, GlobalFoundries' vice president of design solutions, said that the company would port ARM memory compilers, Foundation, and Enhanced IP to the company's new 28-nm process. This library-level IP would, in turn, be used to produce a hardened Cortex-A9 CPU core and a full design-enablement environment that goes around the core, including shuttle-verified interconnect fabric, peripherals, and I/Os. The full kit will be available to early-access customers in the first half of next year, Kengeri said.
Commenting on the advanced features in the ARM libraries, Kengeri observed, "Standard cells aren't what they used to be. The effort now is to make the choice of cells wide enough to eliminate most of the need for custom design." That may be a necessary effort, because the complexities Patel outlined may put custom design beyond the reach of even many teams skilled in CPU implementation.















