Rethinking static-timing analysis
by Ron Wilson, Executive Editor -- EDN, April 8, 2010
STA (static-timing analysis) was a nearly instant success at timing closure 15 years ago, and not much has changed since, apart from the addition of partitioning and scheduling algorithms that parallelize the analysis for multicore CPUs. Over the same period, the number of instances in a design, the number of modes in which you must analyze a design, and the number of process corners have all increased. Consequently, runtimes for full designs across modes and corners have become enormous: days, in some cases.
That situation has turned STA from an elegant, fast tool into a powerful and trusted, but ponderous, necessity, consuming licenses and days of precious schedule with abandon. If there were a way to dramatically speed up STA, users would need fewer licenses, could save the real-estate and power costs for huge server collections, and could employ the tool in situations in which it has become impractical today, such as checking timing constraints or evaluating ECOs (engineering change orders).
Several alternatives exist for accelerating STA. To begin with, the task is parallelizable. Most nets are independent with regard to delay, so you can organize the nets into independent sets and dispatch them to independent threads. That fact makes STA inherently friendly to multicore computing and, more practically, to execution on graphics processors.
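To make the idea of independent sets concrete, here is a minimal sketch in Python. It assumes each net can be timed from its own data alone, using the lumped-RC 50% delay t = ln(2) x R x C as a stand-in for a real delay model. The net records, the round-robin partitioning, and every function name are hypothetical, chosen only for illustration; no commercial tool's method is implied.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical net records: (name, driver resistance in ohms, load capacitance in farads).
NETS = [
    ("n1", 1000.0, 1e-12),
    ("n2", 2000.0, 2e-12),
    ("n3", 500.0, 4e-12),
    ("n4", 1500.0, 1e-12),
]

def net_delay(net):
    """50% delay of a lumped RC net: t = ln(2) * R * C (a simplified stand-in model)."""
    name, r_ohms, c_farads = net
    return name, math.log(2) * r_ohms * c_farads

def partition(items, n_sets):
    """Split items into n_sets round-robin groups (illustrative partitioning only)."""
    return [items[i::n_sets] for i in range(n_sets)]

def time_set(net_set):
    """Time every net in one set; no net reads another net's result."""
    return [net_delay(net) for net in net_set]

def parallel_delays(nets, n_workers=2):
    """Dispatch each independent set to its own worker and merge the results."""
    sets = partition(nets, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(time_set, sets)
    return dict(pair for chunk in results for pair in chunk)

delays = parallel_delays(NETS)
for name in sorted(delays):
    print(f"{name}: {delays[name] * 1e12:.1f} ps")
```

Because the sets share no state, the merge step is a plain dictionary union and needs no locking. In CPython, threads do not speed up CPU-bound work of this kind, so a real experiment would swap in ProcessPoolExecutor or native code; the structure of the dispatch stays the same.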
According to Dan Blong, technical-marketing manager at Magma Design Automation, 15 years have elapsed since anyone looked at the underlying algorithms and the code to see whether there were other ways to accelerate the analysis. A team headed by Pathmill pioneer Jacob Avidan set out two years ago to accomplish that goal. The result is Tekton, a new STA/extraction/Spice environment. Magma claims that Tekton runs significantly faster as a timing analyzer on a single CPU and dramatically faster in a multimode/multicorner analysis on a multicore machine. The company claims that Tekton can perform these tasks on any design, on one machine, in less than an hour. It further claims a throughput of 1 million nets per minute and near-linear scaling for as many as 24 CPUs.
The improved performance comes primarily from careful organization of the work to avoid repeating calculations: recognizing tasks that are unnecessary and saving intermediate results for reuse in later calculations. The speedup does not come at the expense of timing accuracy. Blong says that Tekton correlates well with PrimeTime delay results and approximates PrimeTime crosstalk results.
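Magma has not published how Tekton organizes its work, so the following Python sketch shows only the general idea of saving intermediate results: identical stage configurations are computed once per corner and then reused. The delay model, the cell names, the derating factors, and the stage list are all hypothetical.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the "expensive" model actually runs

@lru_cache(maxsize=None)
def stage_delay(cell, load_ff, corner):
    """Hypothetical stage-delay model in ps; stands in for an expensive calculation."""
    CALLS["count"] += 1
    base_ps = {"INV": 10.0, "NAND2": 14.0}[cell]
    derate = {"slow": 1.25, "typ": 1.0, "fast": 0.8}[corner]
    return (base_ps + 0.5 * load_ff) * derate

# Hypothetical path: five stages, but only two distinct (cell, load) configurations.
STAGES = [("INV", 4), ("NAND2", 8), ("INV", 4), ("INV", 4), ("NAND2", 8)]
CORNERS = ["slow", "typ", "fast"]

def path_delay(corner):
    """Sum the stage delays along the path for one corner."""
    return sum(stage_delay(cell, load, corner) for cell, load in STAGES)

report = {corner: path_delay(corner) for corner in CORNERS}
for corner, delay_ps in report.items():
    print(f"{corner}: {delay_ps:.1f} ps")
print("model evaluations:", CALLS["count"], "of", len(STAGES) * len(CORNERS), "lookups")
```

Here 15 lookups trigger only six evaluations of the model, because the cache key captures everything the result depends on. The choice of key is the design decision that matters: too coarse a key returns wrong delays, and too fine a key never hits.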
Tekton drops into existing flows, accepting PrimeTime Tcl and Perl scripts. The complete Tekton environment includes the Tekton QCP extraction tool, which correlates to QuickCap, along with crosstalk analysis, on-chip-variation analysis, and an integrated Spice engine. In addition to timing an entire design in less than an hour, the package includes an incremental mode that lets you work directly on particular blocks of IP (intellectual property) or on ECOs.
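The article does not describe how Tekton's incremental mode works. As a generic illustration of incremental timing, this Python sketch recomputes arrival times only in the fan-out cone of a changed edge instead of retiming the whole graph. The graph, the delays, and all names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical timing graph: edge (src, dst) -> delay in ps.
EDGES = {("a", "c"): 3.0, ("b", "c"): 5.0, ("c", "d"): 2.0, ("b", "e"): 4.0}

def build(edges):
    """Return fan-in and fan-out adjacency lists for the graph."""
    fanin, fanout = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        fanin[dst].append(src)
        fanout[src].append(dst)
    return fanin, fanout

def arrival_of(node, arrival, edges, fanin):
    """Latest arrival at node: max over fan-in of (arrival + delay); inputs arrive at 0."""
    return max((arrival[p] + edges[(p, node)] for p in fanin[node]), default=0.0)

def full_timing(edges):
    """Time every node in topological order (Kahn's algorithm)."""
    fanin, fanout = build(edges)
    nodes = {n for edge in edges for n in edge}
    indeg = {n: len(fanin[n]) for n in nodes}
    ready = deque(n for n in nodes if indeg[n] == 0)
    arrival = {}
    while ready:
        n = ready.popleft()
        arrival[n] = arrival_of(n, arrival, edges, fanin)
        for s in fanout[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return arrival

def incremental_timing(arrival, edges, changed_edge):
    """Update arrival in place, visiting only nodes downstream of the changed edge."""
    fanin, fanout = build(edges)
    work, visited = deque([changed_edge[1]]), []
    while work:
        n = work.popleft()
        visited.append(n)
        new = arrival_of(n, arrival, edges, fanin)
        if new != arrival[n]:  # stop propagating where nothing changed
            arrival[n] = new
            work.extend(fanout[n])
    return visited

arrival = full_timing(EDGES)
eco_edges = dict(EDGES)
eco_edges[("a", "c")] = 6.0  # hypothetical ECO: one edge gets slower
visited = incremental_timing(arrival, eco_edges, ("a", "c"))
print("revisited nodes:", visited)
print("arrival times:", dict(sorted(arrival.items())))
```

In this example the update touches only nodes c and d, and node e keeps its earlier result. The early exit when an arrival time is unchanged is what keeps the cost proportional to the affected cone and not to the design size.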