Path-specific derating to reduce timing pessimism: a proposal
Reducing your pessimism about timing uncertainty as the design progresses can speed timing closure.
Suprio Bhattacharya, Manav Rachna College of Engineering -- EDN, June 25, 2010
This article proposes a few methods to speed timing closure (that is, cleaning up all the setup and hold violations of a chip) by reducing the pessimism in the design's timing constraints. Manufacturing inaccuracies and on-chip variations at and below 65 nm necessitate a pessimism factor in timing analysis. With design sizes growing to billions of timing paths, any excess pessimism lengthens the design cycle considerably. Therefore, any reduction in the pessimism factor should improve the design, shorten the design cycle, and, ultimately, shorten time to market.
Pessimism requirement in timing analysis
Timing constraints are the major guides to the tools and directly affect design size and performance. As the design passes through the successive stages of the design flow, the timing constraints change so that the final design delivered by the tool is optimal for realistic constraints. At the synthesis stage, the design is in its nascent state, having undergone only logical realization with logic gates. The design is logically optimized, but it lacks any physical information about the nets that would allow accurate calculation of the parasitic delays of the interconnect segments. Therefore, it becomes necessary to keep a pessimism window around the interconnect delays so that, when the design gradually passes to the next stages of placement and routing, the actual delays will lie within the window.
Necessity of derating factor
The actual design will have temperature variations across the die and voltage drops along the nets, both of which cause the speed of the components to vary. The design must tolerate such effects so that on-chip variations cause no functional failures. Designers apply a derating factor to their timing estimates to account for such OCVs (on-chip variations). The derating factor is applied to the clock paths and the data paths, and it decreases the time period available to each such path. Thus, the derating factor is one of the contributors to the pessimism window. The tools then work to meet the setup and hold times for all the paths within this reduced time period.
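The effect of derating on the available time can be illustrated with a small setup-slack calculation. This is a minimal sketch under common OCV assumptions that the article does not spell out: the launch clock and data delays are scaled by a "late" factor, and the capture clock delay by an "early" factor. The function name, parameters, and all numbers are hypothetical.

```python
def setup_slack(period, launch_clk, data, capture_clk, setup_time,
                late=1.0, early=1.0):
    """Return setup slack in ns for one path.

    Assumed derating model: launch clock and data delays are multiplied
    by `late` (>= 1), the capture clock delay by `early` (<= 1).
    """
    arrival = (launch_clk + data) * late
    required = period + capture_clk * early - setup_time
    return required - arrival


# Made-up path: 2.0 ns period, 0.5 ns clock insertion delay on both
# sides, 1.2 ns of data delay, 0.1 ns setup time.
nominal = setup_slack(2.0, 0.5, 1.2, 0.5, 0.1)
derated = setup_slack(2.0, 0.5, 1.2, 0.5, 0.1, late=1.1, early=0.9)

print(f"slack without derating: {nominal:.2f} ns")
print(f"slack with derating:    {derated:.2f} ns")
```

With these illustrative numbers the slack shrinks from 0.70 ns to 0.48 ns, which is the pessimism the tool must then close against.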
Path-specific derating
Normally designers apply a global derating factor to all the paths. A global value does not consider the possibility that numerous paths might be less affected by OCVs. If a considerable number of such paths exist, then addressing them with a reduced derating factor should improve runtime and, eventually, the design size and performance. In usual design practice, the derating factor is applied to the time period available to each path irrespective of the cells and net attributes associated with the path. So here we have an opportunity to refine this process. It is necessary to determine the on-chip variation pattern on the die so that we can set a matching derating-factor pattern.
Variation in the manufacturing process essentially affects two kinds of components: standard cells and interconnects. Thus, it is important to categorize these components into groups on the basis of the degree of variation in their manufacturing so that we can apply different derating factors to them. The possibility of manufacturing inaccuracy is greater for certain standard cells and routed nets than for the rest. Therefore, we can apply larger derating factors to such cells and nets, and smaller derating factors to those cells and nets that are less likely to suffer manufacturing inaccuracy.
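The grouping idea above can be sketched as a lookup from variability class to derating factor, applied element by element along a path and compared with a single global factor. The class names, the factor values, and the cell and net names are hypothetical placeholders; real values would have to come from foundry data.

```python
from dataclasses import dataclass

# Hypothetical derate table: variability class -> late derating factor.
DERATE = {"low": 1.03, "medium": 1.06, "high": 1.10}
GLOBAL_DERATE = 1.10  # single worst-case factor applied to everything


@dataclass
class Element:
    name: str          # cell instance or net name (illustrative only)
    delay_ns: float    # nominal delay contribution
    variability: str   # "low", "medium" or "high"


def derated_delay(path, table=None):
    """Derated path delay; uses the global factor when no table is given."""
    if table is None:
        return sum(e.delay_ns for e in path) * GLOBAL_DERATE
    return sum(e.delay_ns * table[e.variability] for e in path)


path = [
    Element("u1/NAND2", 0.10, "low"),
    Element("net_a", 0.30, "low"),
    Element("u2/AOI222", 0.20, "high"),   # complex cell, more variation
    Element("net_b", 0.40, "medium"),
]

print(f"global derating:        {derated_delay(path):.3f} ns")
print(f"path-specific derating: {derated_delay(path, DERATE):.3f} ns")
```

For this made-up path the derated delay drops from 1.100 ns to 1.056 ns, so the path recovers 0.044 ns of slack that the global factor would have taken away.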
The tools will require some time to categorize the components into such groups, which increases runtime. To reach an optimal solution, it is therefore necessary to estimate the number of low-variability paths before applying the different derating factors. For example, if there are only a few hundred low-variability paths out of billions of paths, we can avoid the extra runtime of treating such a relatively small number of paths specially by simply using the common derating factor for the entire design. However, if the relative number of such paths is quite large, then treating them separately with specific derating factors should bring a noticeable advantage to the design.
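This decision can be expressed as a simple fraction test. The sketch below is only illustrative: the function name and the 5% threshold are assumptions, and a real flow would calibrate the threshold against the measured cost of the categorization step.

```python
def worth_path_specific_derating(low_var_paths, total_paths,
                                 min_fraction=0.05):
    """Return True when enough paths would benefit from reduced derating.

    `min_fraction` is a hypothetical threshold, not a recommended value.
    """
    if total_paths <= 0:
        raise ValueError("total_paths must be positive")
    return low_var_paths / total_paths >= min_fraction


# A few hundred candidates out of billions: keep the global factor.
print(worth_path_specific_derating(300, 2_000_000_000))
# A fifth of all paths: the special treatment is likely to pay off.
print(worth_path_specific_derating(400_000_000, 2_000_000_000))
```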
There are also some obvious indications for standard cells. Standard cell layout style strongly influences the yield of the resulting dice. Hence, in most cases the standard cell layout designers religiously follow the design rules and the foundry's recommendations in designing the cells. However, functional complexity of a cell can cause some cells to be more prone to manufacturing inaccuracies—and especially lithographic variations—than others. Paths including such complex cells should require more derating. The foundry will provide the information on such cells that have greater chances of channel width or length variation. So the designer may decide upon a special derating factor for such cells.
The final hiccup: how to bring this into the flow
The placement and routing stages of the ASIC design flow are highly experimental. For a particular combination of placement and routing strategies, the design may become unroutable due to overcongestion or may have too many timing violations. Only after several experiments with placement and routing strategies may the design be good enough for tapeout. The routing process is carried out in a number of steps. The P/R (place-and-route) tool initially carries out an approximate routing in which it associates the routing "lines" with nets. After all the nets are associated, the tool extracts (calculates) the parasitic delays, that is, the distributed RC factors of the nets, based on the technology LEF file. (Library Exchange Format is a Cadence-proprietary format that lists the geometrical specifications, resistance, and capacitance values of the metal layers and dielectric layers of the chip.) The P/R tool prepares a timing graph based on this LEF file to account for the delays along all the timing paths. Based on this timing graph, the routing algorithm associates the nets with the routing tracks so that it routes all, or at least most, of the nets with minimum timing violations.
To do this, the routing tool has to go through an iterative process of routing, unrouting, and re-routing to obtain incrementally better results. At 65 nm and lower nodes, where interconnect delays dominate timing closure, the impact of the interconnect delay becomes evident only at the routing stage, when the tool actually routes the nets. Before this, the tool performs only an approximate routing, which yields only a rough estimate of the delay. And even after the design is routed, noise closure can still tweak the routing. Thus, routing is effectively frozen only at the end of design closure, after noise analysis is complete. Until then, the iterations continue. So we face a chicken-and-egg problem here: without actual routing information, we can't assign appropriate derating factors to the paths, but without appropriate derating factors the analysis will be pessimistic.
We intend to group interconnects on the basis of routing, but the route may eventually change because of congestion or noise problems. Therefore, the grouping needs to be tentative. It is necessary to have a grouping technique that is immune to minor changes in routing so that the derating factor remains appropriate. One possible approach is to identify the longer nets that run across the die. Such nets should have long continuous stretches in one direction and can be assigned to the smaller derating-factor group. We need to find more such criteria for nets that are less sensitive to OCVs, so that a considerable number of nets qualify for reduced derating pessimism.
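One way to make such a grouping tolerant of small routing changes is to classify nets by ratios rather than exact geometry. The following is a minimal sketch, assuming each net is available as a list of Manhattan segments `(x1, y1, x2, y2)`; the function name and both thresholds are hypothetical, and wirelength in the dominant direction is used as a stand-in for the net's span.

```python
def is_long_straight_net(segments, die_width, die_height,
                         min_span=0.5, min_straightness=0.8):
    """Tentatively classify a routed net as a long, mostly straight net.

    segments: list of (x1, y1, x2, y2) Manhattan wire segments.
    min_span: required dominant-direction length as a fraction of the
              die dimension in that direction (assumed threshold).
    min_straightness: required share of total wirelength that runs in
              the dominant direction (assumed threshold).
    """
    horiz = sum(abs(x2 - x1) for x1, y1, x2, y2 in segments)
    vert = sum(abs(y2 - y1) for x1, y1, x2, y2 in segments)
    total = horiz + vert
    if total == 0:
        return False
    if horiz >= vert:
        dominant, die_dim = horiz, die_width
    else:
        dominant, die_dim = vert, die_height
    return (dominant / total >= min_straightness
            and dominant / die_dim >= min_span)


# A trunk crossing most of a 1000 x 1000 die, with one short jog.
trunk = [(0, 100, 800, 100), (800, 100, 800, 150)]
# A short zigzag net confined to one corner of the die.
zigzag = [(0, 0, 100, 0), (100, 0, 100, 100), (100, 100, 200, 100)]

print(is_long_straight_net(trunk, 1000, 1000))
print(is_long_straight_net(zigzag, 1000, 1000))
```

Because the test depends on ratios, a small detour added during congestion or noise fixing changes the result only when the net sits close to a threshold, which is the kind of stability the tentative grouping needs.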
There are at least two approaches to this problem. One approach would be to use the global derating factor until routing is done with a manageable number of timing violations. At this stage the basic structure of the routes is almost set, so we can rely upon the geometry of the routes to derive updated derating factors for the nets. This methodology should deliver a cleaner design quickly, with fewer iterations. Alternatively, we could employ a variable-derating routing algorithm so that most of the pessimism is removed even at the initial level of routing. However, the increased complexity of this technique may raise questions about runtime. Since routing is an iterative process, the runtime depends upon the time required for a single iteration and the number of iterations. On larger designs, derating-aware routing will have a very large runtime for a single iteration, but the total number of iterations should be smaller. Therefore, this approach might or might not reduce the total routing execution time.
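The runtime trade-off is simple arithmetic: total time is the time per iteration multiplied by the number of iterations. The sketch below uses invented hours and iteration counts purely to show how the outcome can go either way; none of the figures are measurements.

```python
def total_runtime(hours_per_iteration, iterations):
    """Total routing time when every iteration costs the same."""
    return hours_per_iteration * iterations


def break_even_hours(baseline_hours, baseline_iters, aware_iters):
    """Largest per-iteration time at which derating-aware routing
    still matches the baseline total."""
    return total_runtime(baseline_hours, baseline_iters) / aware_iters


baseline = total_runtime(4.0, 10)    # global derating, many iterations
aware_fast = total_runtime(6.0, 6)   # slower iterations, fewer of them
aware_slow = total_runtime(9.0, 6)   # iterations too slow to pay off

print(f"baseline: {baseline} h")
print(f"derating-aware, 6 h/iteration: {aware_fast} h")
print(f"derating-aware, 9 h/iteration: {aware_slow} h")
print(f"break-even: {break_even_hours(4.0, 10, 6):.2f} h/iteration")
```

In this invented case the derating-aware router wins only if each of its six iterations takes less than about 6.67 hours.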
Epilogue
Designers have been trying every way to cut out undue pessimism from the design methodology because any undue pessimism eventually delays the time to market. Routing and noise cleanup at the final stages of the design are very iterative and consume a significant portion of the design-cycle period. With tighter timing constraints and larger design sizes, we need more realistic constraints in order to deliver a clean design in time. The proposed method to derate the paths based on their geometry is a step toward this end. It certainly should benefit designs with a large number of straight nets running across the die.


















