
Design quality and its impact on design closure

Steps to ensure quality early in the design can speed closure and avoid failed silicon.

Piyush Sancheti, Atrenta Inc; Sanjay Churiwala, Atrenta Inc; and Rob Knoth, Magma Design Automation -- EDN, July 15, 2010




The cost of SOC (system-on-chip) design continues to skyrocket, market windows continue to shrink, and design complexity continues to grow exponentially. These challenges are only a few of those that SOC designers face. In an effort to prevent major disasters, designers must ensure that the SOCs achieve design closure, which includes meeting certain key objectives, such as performance, power, and area. Design-closure objectives are often in conflict with each other, however. Designers must constantly trade off one for the other to ensure that the design stays within the end-user application’s requirements.

A typical SOC design starts with an RTL (register-transfer-level) description to capture user intent and a set of design constraints to drive implementation. The design team first verifies the RTL for correctness of functional intent through simulation and formal verification. The design then goes through a series of implementation steps, including synthesis and placement and routing, which eventually result in a GDSII (Graphic Design System 2) layout for silicon manufacturing. The quality of the incoming design and its associated constraints has a large impact on the designers’ ability to reach closure. However, you can ease this process by using a series of design-quality measures that start at RTL and continue throughout implementation. This article focuses on quality measures at five stages of an integrated RTL-to-GDSII implementation flow (Figure 1): presynthesis-RTL quality; postsynthesis, postscan-netlist quality; post-timing-netlist quality; postplacement-netlist quality; and postroute-netlist quality. You can expand the concept to other stages of implementation or adapt it to other flows.

Presynthesis-RTL quality


SOC designs that don’t start well will usually fail to reach closure. Quality measures at the RTL stage of design go a long way toward determining successful design closure and working silicon. Once you synthesize the design, you are to a large extent freezing the design intent, and you have limited flexibility for correcting design-quality issues inherent in RTL.

Modern SOC designs typically cater to multiple end markets to amortize the high cost of design. The same design can have multiple variations and live on for multiple generations through updates and upgrades. This scenario is prevalent in consumer electronics and automotive chips, in which manufacturers accomplish 80% or more of the design through reuse. Future generations may reuse the RTL you create for the current design and therefore it could have a longer shelf life than the current design. You must also consider commercial third-party IP (intellectual property), such as processors, digital-signal-processing blocks, and bus fabrics, and interface IP, including Ethernet, USB (Universal Serial Bus), and PCI (Peripheral Component Interconnect). The SOC team typically receives this IP in RTL.

For these reasons, you must ensure the quality of the RTL and constraints going into synthesis. Design teams often focus on functional correctness using simulation and formal verification, but spending some effort on implementation feasibility and the overall quality of RTL can go a long way toward accelerating design closure. Design teams can achieve such quality measures through a set of analyses on the RTL and design constraints.

Structural and connectivity integrity

RTL linting can weed out syntax and semantic issues and ensure compliance with coding standards. RTL designers should, however, also address more serious structural and connectivity issues at this early stage. If left undetected, they may later lead to more serious design-closure issues. Examples of these issues include excessive levels of logic between flip-flops (Figure 2), combinational loops, unintentional latches, blocking assignments in sequential blocks, variables or nonconstants in the terminating condition of a loop, asynchronous resets missing from a sensitivity list, multiply-driven nets without a tristate, undriven nets and ports, and mismatches between the left- and right-hand sides of an assignment. Although you may be able to detect and fix some or all of these issues during synthesis or later stages of implementation, it is more efficient to fix them before putting any effort into implementation.
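As a rough illustration of two of the structural checks named above, the sketch below walks a toy netlist to count levels of logic between flip-flops and to flag combinational loops. The netlist format (a dictionary from each gate's output net to its input nets) and the function name `logic_depth` are assumptions made for this example, not the interface of any particular lint tool.

```python
def logic_depth(gates, sources):
    """Return the largest number of combinational gates on any path that
    starts at one of `sources` (flip-flop outputs or primary inputs).

    gates:   dict mapping a gate's output net to the list of its input nets
    sources: set of nets that start a timing path (depth 0)

    Raises ValueError on a combinational loop and KeyError on an undriven net.
    """
    depth = {net: 0 for net in sources}
    in_progress = set()  # nets on the current depth-first search path

    def visit(net):
        if net in depth:
            return depth[net]
        if net not in gates:
            raise KeyError(f"net {net!r} is undriven")
        if net in in_progress:
            raise ValueError(f"combinational loop through net {net!r}")
        in_progress.add(net)
        depth[net] = 1 + max((visit(i) for i in gates[net]), default=0)
        in_progress.discard(net)
        return depth[net]

    return max((visit(net) for net in gates), default=0)


# Hypothetical example: three gates in series between flip-flop outputs and "d".
example = {"n1": ["q_a", "q_b"], "n2": ["n1", "q_c"], "d": ["n2"]}
print(logic_depth(example, {"q_a", "q_b", "q_c"}))  # 3
```

A real flow would compare the reported depth against a per-clock budget rather than a single global number; the recursion here is only suitable for small examples.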

Clocks and resets


A typical SOC contains heterogeneous IP from different sources. As a result, the number of asynchronous clock domains on a chip has increased dramatically. It is possible for one chip to have 20 or more clock domains. You must ensure that clocks and resets are properly designed. When data signals cross between asynchronous clock domains, you must synchronize them to prevent metastability (Figure 3). Clock synchronizers can range from multiple-flip-flop synchronizers to more exotic schemes, such as FIFO (first-in/first-out) buffers with handshaking. It is important to prevent data loss and the reconvergence of synchronized signals to ensure reliable behavior. You must also synchronize the deassertion of resets, even asynchronous resets, with the clock domain.

You should ensure not only that synchronizers are in place on crossings but also that you’ve correctly implemented the protocol. For example, a FIFO should have no overflow or underflow, and you must implement proper sequencing between requests and acknowledgments in a handshaking scheme. Functional simulation may not detect clock-domain-crossing bugs unless verification engineers create dedicated testbench scenarios for each crossing, a daunting task for designs that have thousands of such crossings. You must employ structural-analysis and formal-verification techniques to exhaustively analyze and verify clock-domain crossings.
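To make the idea of a structural clock-domain-crossing check concrete, here is a minimal sketch that flags crossings lacking a two-flip-flop synchronizer. The data model is an assumption for illustration: each flip-flop records its clock domain and the flip-flops that feed its D input, and a crossing counts as synchronized only if the receiving flop has a single fan-in and feeds exactly one further flop in the same domain. Production tools recognize many more synchronizer styles, and they pair this kind of structural pass with formal checks on the protocol.

```python
def unsynchronized_crossings(flops):
    """Return (source, destination) pairs that cross clock domains
    without a simple two-flop synchronizer.

    flops: dict name -> {"clk": domain name, "fanin": [driving flop names]}
    """
    # Build the reverse map: which flops does each flop drive?
    fanout = {name: [] for name in flops}
    for name, f in flops.items():
        for src in f["fanin"]:
            fanout[src].append(name)

    bad = []
    for name, f in flops.items():
        foreign = [s for s in f["fanin"] if flops[s]["clk"] != f["clk"]]
        if not foreign:
            continue  # not a crossing
        # First stage: exactly one driver, so nothing is merged before sampling.
        first_stage_ok = len(f["fanin"]) == 1
        # Second stage: exactly one follower, same domain, driven only by stage one.
        second_stage_ok = len(fanout[name]) == 1 and all(
            flops[dst]["clk"] == f["clk"] and flops[dst]["fanin"] == [name]
            for dst in fanout[name]
        )
        if not (first_stage_ok and second_stage_ok):
            bad.extend((src, name) for src in foreign)
    return bad


# Hypothetical design: "tx" in clk_a feeds a proper synchronizer and a raw flop.
design = {
    "tx":    {"clk": "clk_a", "fanin": []},
    "sync1": {"clk": "clk_b", "fanin": ["tx"]},
    "sync2": {"clk": "clk_b", "fanin": ["sync1"]},
    "raw":   {"clk": "clk_b", "fanin": ["tx"]},
}
print(unsynchronized_crossings(design))  # [('tx', 'raw')]
```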

Power reduction


Power has come to the forefront of design-closure concerns for a variety of reasons, including battery life, cooling costs, reliability, and energy efficiency. Studies show that more than 80% of a design’s power is determined by the time the design enters synthesis. For that reason, you must address power management early in the design flow, using architectural techniques, such as multiple voltage domains, power domains, and dynamic-voltage-frequency scaling, and RTL techniques, such as clock and data gating. Designers must start with an estimate of the power consumption of the design and selectively apply these techniques based on power targets for the design.

Voltage and power domains add new challenges for design closure. In voltage domains, it is important to insert level shifters wherever signals cross from one voltage domain to another. Similarly, you must place isolation cells on the outputs of power domains that can be shut off when not in use to ensure that unpowered outputs are not floating. These floating signals could introduce a functional error or a high-leakage path to ground. You must also ensure that the enable logic for isolation cells is in an always-on domain. Some designers insert level shifters and isolation logic at RTL, and others capture the power intent in CPF (Common Power Format) or UPF (Unified Power Format) for automatic insertion by downstream implementation tools. In either case, designers must ensure that the design has level shifters and isolation-logic cells at each such crossing.
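The three rules in this paragraph (level shifters between voltage domains, isolation on outputs of switchable domains, and isolation enables sourced from an always-on domain) can be sketched as a simple crossing audit. The dictionary layout, field names, and domain names below are hypothetical; real flows derive this information from the CPF or UPF description and the netlist.

```python
def check_power_crossings(domains, crossings):
    """Return (net, problem) pairs for domain crossings that break the rules.

    domains:   name -> {"voltage": volts, "switchable": bool}
    crossings: list of {"net", "src", "dst", "cells", "iso_enable_domain"}
               where "cells" is a set drawn from {"level_shifter", "isolation"}
    """
    problems = []
    for x in crossings:
        src, dst = domains[x["src"]], domains[x["dst"]]
        cells = x.get("cells", set())

        # Rule 1: different supply voltages need a level shifter.
        if src["voltage"] != dst["voltage"] and "level_shifter" not in cells:
            problems.append((x["net"], "missing level shifter"))

        # Rules 2 and 3: outputs of a switchable domain need isolation,
        # and the isolation enable must come from an always-on domain.
        if src["switchable"]:
            enable = x.get("iso_enable_domain")
            if "isolation" not in cells:
                problems.append((x["net"], "missing isolation cell"))
            elif enable is None or domains[enable]["switchable"]:
                problems.append((x["net"], "isolation enable not in an always-on domain"))
    return problems


# Hypothetical two-domain chip.
domains = {
    "aon": {"voltage": 1.0, "switchable": False},
    "cpu": {"voltage": 0.8, "switchable": True},
}
crossings = [
    {"net": "cpu_irq", "src": "cpu", "dst": "aon",
     "cells": {"level_shifter", "isolation"}, "iso_enable_domain": "aon"},
    {"net": "cpu_dbg", "src": "cpu", "dst": "aon", "cells": {"level_shifter"}},
    {"net": "wake", "src": "aon", "dst": "cpu", "cells": set()},
]
for net, problem in check_power_crossings(domains, crossings):
    print(net, "->", problem)
```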

With judicious use, clock gating can be an effective power-reduction technique. Most synthesis tools can automatically insert gating based on clock enables in the RTL. However, not all clock gates save power, especially when the gated registers are almost always enabled or when only a few registers are gated. In such cases, the additional gating logic can consume more power than you save by gating the clock. Excessive clock gating can also lead to timing-closure issues and contribute to routing congestion. You should instead selectively apply clock gating in places in which it has the most impact on power.
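The break-even argument above can be expressed as back-of-the-envelope arithmetic. The sketch below uses a deliberately simple first-order model, which is an assumption of this example rather than a method from the article: gating saves the clock-pin power of each register for the fraction of time the enable is low, and costs the power of one gating cell. All numbers are in arbitrary, hypothetical units.

```python
def net_gating_saving(num_regs, clk_power_per_reg, enable_duty, gate_cell_power):
    """Estimate the net power saved by gating one group of registers.

    num_regs:          registers sharing the clock gate
    clk_power_per_reg: clock-pin power per register when the clock toggles
    enable_duty:       fraction of cycles in which the enable is active (0..1)
    gate_cell_power:   power consumed by the added gating cell

    A positive result means the gate is worth keeping under this model.
    """
    saved = num_regs * clk_power_per_reg * (1.0 - enable_duty)
    return saved - gate_cell_power


# A wide, rarely enabled bank: gating pays off.
print(net_gating_saving(32, 1.0, 0.10, 4.0) > 0)   # True
# Two registers that are almost always enabled: the gate costs more than it saves.
print(net_gating_saving(2, 1.0, 0.95, 4.0) > 0)    # False
```

The model ignores data-path toggling, clock-tree buffers, and leakage, so it is useful only for ranking candidates, not for sign-off estimates.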

RTL analysis for clock gating can help in a number of ways. At RTL, you can identify global clock-gating signals, which can gate clocks for an entire design or for large register banks. A review at RTL can also analyze and prioritize the explicit clock enables that RTL designers define according to their power-saving potential and help remove those that save little or no power. Power-management designers can also identify new or implicit clock-gating opportunities that RTL designers may have overlooked. In addition, power-management specialists can generate directives for synthesis to intelligently gate clocks.

Various clock-gating opportunities are available to RTL designers (Figure 4). Power designers can do similar analysis to identify data-gating opportunities, in which a cloud of combinational logic drives an enabled register. Gating the combinational logic using the same enable that you apply to the terminal register eliminates wasted power resulting from toggles in the combinational logic when the register is disabled. For example, an N-bit multiplier, with the input data bits arriving at different times, is a candidate for data gating. The multiplier continues to multiply even though the results remain unused until all the bits of both data inputs have arrived. Data gating can be an effective technique for such datapath-intensive designs, which are common in digital-signal processing.

Design for test


Designs must have high test coverage, both for stuck-at- and at-speed-fault modes, especially in consumer electronics, which must quickly reach silicon volume with few defects. Traditionally, design teams stitch scan chains during synthesis or later stages and then estimate test coverage using ATPG (automatic-test-pattern-generation) tools. However, most testability issues are detectable and correctable at RTL, which helps ensure that the design eventually meets its test-coverage goals.

For example, the key to high stuck-at-fault coverage is to make sure that the design is fully controllable and observable in scan mode. However, achieving high stuck-at-fault coverage encounters many barriers, including nonscannable flip-flops whose inputs are unobservable and whose outputs are uncontrollable. Internally generated control signals, such as clocks or asynchronous set/clear signals, are the most common cause of this situation. Nontransparent latches are another major issue in that their inputs are unobservable and their outputs are uncontrollable. Large memories and analog- and mixed-signal blocks similarly suffer from inputs that are unobservable and outputs that are uncontrollable. The enable pins of tristates are unobservable. Combinational feedback loops also restrict testability, and test-mode values in capture mode may restrict controllability.

Despite efforts from RTL designers, some parts of the design may still not be observable and controllable and may require the insertion of additional test points. Test-coverage analysis at RTL can help detect where to place additional test points and their resulting impact on test coverage. For example, in one design, adding 12 test points increased the test coverage from less than 94% to more than 98% (Figure 5). It is easier to add test points in RTL when you fully comprehend the design’s intent than in the later stages of implementation.

In deep-submicron designs (those at the 90-nm and smaller nodes), designers worry about transition faults that can occur at normal clock speed. Stuck-at-fault testing, which typically uses a slow test clock, does not detect transition faults. Designers must perform at-speed testing, in which the test clocks are multiplexed with the system clocks. This step adds a level of complexity for timing closure. At-speed testing also introduces functional-closure challenges, such as those that occur when asynchronous clock domains share the same test clock, which could affect the at-speed test coverage. It is therefore critical to estimate at-speed test coverage at RTL and fix potential functional and timing-closure issues.

DFT (design for test) poses a unique challenge for IP reuse. IP that meets test-coverage goals in a previous design could fall short in the current design. For example, if some inputs of the IP are tied to constants in the current design, parts of the IP may become uncontrollable. This issue could affect the test coverage of the SOC. Hence, you must perform test-coverage analysis at both the block/IP level and the SOC level.
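The tied-input scenario can be illustrated with a small constant-propagation sketch: once an IP input is tied to a constant, every net whose value is forced by that constant can no longer be controlled from the chip level. The gate model (AND, OR, NOT, and BUF only), the net names, and the function name `constant_nets` are assumptions made for this example.

```python
def constant_nets(gates, tied):
    """Propagate tied-off constants through simple gates.

    gates: dict output net -> (op, [input nets]), op in {"AND", "OR", "NOT", "BUF"}
    tied:  dict net -> 0 or 1 for inputs tied to constants at integration
    Returns a dict of every net forced to a constant, including the tied inputs.
    """
    const = dict(tied)
    changed = True
    while changed:                      # iterate until no new constants appear
        changed = False
        for out, (op, ins) in gates.items():
            if out in const:
                continue
            vals = [const.get(i) for i in ins]   # None means "still controllable"
            val = None
            if op == "AND":
                if 0 in vals:
                    val = 0                      # any 0 input forces a 0
                elif all(v == 1 for v in vals):
                    val = 1
            elif op == "OR":
                if 1 in vals:
                    val = 1                      # any 1 input forces a 1
                elif all(v == 0 for v in vals):
                    val = 0
            elif op == "NOT":
                if vals[0] is not None:
                    val = 1 - vals[0]
            elif op == "BUF":
                val = vals[0]
            if val is not None:
                const[out] = val
                changed = True
    return const


# Hypothetical IP fragment with its "ip_en" input tied low in the current SOC.
ip = {
    "en_int": ("AND", ["ip_en", "mode"]),
    "n_en":   ("NOT", ["en_int"]),
    "sel":    ("OR",  ["n_en", "req"]),
    "data":   ("AND", ["req", "valid"]),
}
print(constant_nets(ip, {"ip_en": 0}))
# {'ip_en': 0, 'en_int': 0, 'n_en': 1, 'sel': 1}
```

In this toy case one tied pin makes three internal nets uncontrollable while `data` stays testable, which is why coverage measured on the standalone block can overstate coverage in the SOC.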

Design constraints


Design constraints are a critical part of the design intent. They capture the designer’s requirements on the performance, power, and area from implementation. The quality of constraints is just as critical as the quality of RTL in synthesis. At this early stage, designers usually manually define constraints for clock frequencies, input and output delays, modes of operation, and exceptions—false and multicycle paths. Because this step is the starting point for implementation, the completeness and correctness of constraints are critical in meeting design closure.

You might catch some constraint issues during synthesis if you carefully examine the synthesis transcript. You might find, for example, missing clock or mode constraints when the design is using a multiplexed clock (Figure 6). Other issues occur when input and output delays reference an incorrect clock, multiple mode constraints tie the same node to conflicting constant values, or timing exceptions are missing on asynchronous clock-domain crossings.

However, synthesis or static-timing analysis may not catch more serious issues in design constraints. These issues typically involve constraints that are simply assertions to the synthesis and static-timing analysis; they thus remain undetected. You may not catch such issues until final chip integration or, worse yet, in silicon.

For example, a generated clock may not derive from its declared source clock, or the waveform of a generated clock may differ from the implied waveform, depending on the presence and location of inverters in clock-divider circuits. Other examples include missing clock latency or uncertainty, missing delay constraints on primary inputs or outputs, block-level constraints that are more relaxed than the chip-level requirements, incorrect or insufficient timing budget along a snaking path, or an incorrect multiplier for multicycle paths.

Analysis of design constraints in RTL design can help avoid these issues. Designers can exploit this analysis to generate the clock, input, and output constraints in a correct-by-construction way, thereby eliminating many overlooked bugs. For instance, you can use clock-domain-crossing analysis and knowledge of asynchronous control signals to generate timing exceptions. At RTL, you can address inconsistencies between block- and chip-level constraints by comparing the sets of constraints in the context of the complete RTL design.
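One of the block-versus-chip inconsistencies mentioned earlier, a block constrained more loosely than the chip requires, reduces to a simple comparison once the clocks are matched up. The sketch below assumes the clock periods have already been extracted from the constraint files into plain dictionaries; the clock names and the mapping from block clocks to chip clocks are hypothetical.

```python
def relaxed_block_clocks(chip_periods_ns, block_periods_ns, block_to_chip):
    """Flag block-level clocks whose period is longer (more relaxed) than
    the chip-level clock that drives them.

    chip_periods_ns:  chip clock name  -> period in ns
    block_periods_ns: block clock name -> period in ns
    block_to_chip:    block clock name -> name of the driving chip clock
    """
    issues = []
    for blk_clk, blk_period in block_periods_ns.items():
        chip_clk = block_to_chip.get(blk_clk)
        if chip_clk is None:
            issues.append((blk_clk, "no chip-level clock mapped"))
        elif blk_period > chip_periods_ns[chip_clk]:
            issues.append((
                blk_clk,
                f"block period {blk_period} ns exceeds chip period "
                f"{chip_periods_ns[chip_clk]} ns",
            ))
    return issues


# Hypothetical constraints: the DSP block was synthesized for 5 ns,
# but the chip drives it with a 4 ns clock.
chip = {"sys_clk": 4.0, "io_clk": 10.0}
block = {"dsp_clk": 5.0, "uart_clk": 10.0, "dbg_clk": 20.0}
mapping = {"dsp_clk": "sys_clk", "uart_clk": "io_clk"}
for clk, msg in relaxed_block_clocks(chip, block, mapping):
    print(clk, "->", msg)
```

The same pattern extends to input and output delays or clock uncertainty, provided each block-level value has a well-defined chip-level counterpart.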

Postscan-netlist quality


At the postsynthesis stage, the design has gone through logic synthesis, resource sharing, Boolean optimization, and scan-chain insertion. Assuming that the RTL and constraints going into synthesis are of high quality, the resulting netlist should be in good shape. However, a lot has changed in the design, and constraints may also have changed. You can take some measures to ensure the quality of the postsynthesis, postscan netlist and constraints.

At this stage, the design may contain power and voltage domains to manage power consumption. You now must perform an exhaustive verification of these domains to ensure proper insertion of level shifters and isolation logic. Even if you inserted these level shifters and isolation logic before synthesis or scan-chain insertion, the design may still need updates. In one such example, an isolation cell is missing in the scan path because DFT designers inserted the scan logic after the implementation of power domains (Figure 7). Adding the isolation cell in the scan path fixes a potential functional bug or a high-leakage path to ground.

Such power-domain bugs also commonly occur when designers forget the required “don’t-touch” constraints in the synthesis script. This omission can cause buffer optimization during synthesis, which removes level shifters or isolation-logic cells.

SOC designs are now so large and complex that implementation may need to be hierarchical. Designers typically perform synthesis at the block level, and chip integration occurs at the netlist level. In such scenarios, you must merge block-level-synthesis constraints into chip-level constraints. Manually merging constraints can be error-prone and could lead to an incorrect set of chip-level constraints for later timing closure. You could at this stage apply constraint consistency and correctness checks. You can automatically merge constraints by using block-level constraints and the full-chip netlist to prevent bugs.

If constraints have changed at this stage, it is important to establish equivalence with the original constraints for synthesis. Just as you can perform logic-equivalence checking for design stages from RTL to netlist, you can now establish equivalence between RTL and netlist constraints. Designers can ensure the integrity of design constraints and eventual design closure by adopting constraints equivalence in their flow.

If the implementation flow is hierarchical, you might want to stitch together chip-level test logic at the netlist level. In such scenarios, ensure that the global test clocks and test-mode signals propagate to individual blocks. You should perform quality checks to ensure that required values propagate to pins on subblocks when you specify a driving condition at a primary input or an internal node of the chip. Similarly, designers can benefit from connectivity checks to ensure that a path exists between two user-specified nodes in the design with an optional sensitization condition (Figure 8). For example, imagine that Pin A in Block 1 does not connect to a primary output and that Pin B in Block 2 is observable. By establishing a path between Pin A and Pin B, you can now ensure that Pin A is observable.
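A connectivity check of this kind is essentially a graph search in which some edges are usable only under a sensitization condition. The following sketch assumes a toy netlist in which each edge optionally carries one `(signal, value)` condition; the node names and the `path_exists` helper are hypothetical and stand in for whatever the connectivity tool actually provides.

```python
from collections import deque


def path_exists(netlist, start, goal, assumed=None):
    """Breadth-first search for a path from `start` to `goal`.

    netlist: dict node -> list of (next_node, condition), where condition is
             None (always connected) or a (signal, value) pair that must hold
    assumed: dict signal -> value describing the sensitization condition
    """
    assumed = assumed or {}
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt, cond in netlist.get(node, ()):
            # Skip edges whose sensitization condition is not satisfied.
            if cond is not None and assumed.get(cond[0]) != cond[1]:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical chip: Pin A reaches Pin B only through a mux selected in test mode.
chip = {
    "blk1/A":  [("mux/in1", None)],
    "mux/in1": [("blk2/B", ("test_mode", 1))],
    "blk2/B":  [],
}
print(path_exists(chip, "blk1/A", "blk2/B", {"test_mode": 1}))  # True
print(path_exists(chip, "blk1/A", "blk2/B", {"test_mode": 0}))  # False
```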

Post-timing-netlist quality

At the post-timing stage, you must ensure that the design meets timing requirements and start analyzing timing violations from static-timing analysis. This stage is another critical one. Timing closure can be a challenge if the design is overconstrained or incorrectly constrained. Other causes of problems could be structural defects, such as combinational loops, excessive levels of logic, or unregistered outputs from blocks or IP, all of which you should have detected at earlier stages of design.

Timing exceptions fall into two broad categories: false paths and multicycle paths. False paths between two registers are those you cannot sensitize in the design or are otherwise irrelevant for timing closure. Multicycle paths, on the other hand, are possible but take multiple clock cycles to complete. Unless you identify the false and multicycle paths in the design constraints, static-timing-analysis tools assume that all paths are possible and single-cycle.
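The effect of exceptions on reported timing can be shown with a simplified setup-slack calculation. This sketch assumes an idealized model (no clock skew or uncertainty, path delay already includes the setup time) and uses integer picoseconds; the path names and delays are invented for illustration.

```python
def worst_setup_slack(path_delays_ps, period_ps, exceptions=None):
    """Return the worst setup slack over all timed paths, or None if no
    path is timed.

    path_delays_ps: dict path name -> delay in ps (including setup time)
    period_ps:      clock period in ps
    exceptions:     dict path name -> "false" (do not time the path)
                    or an integer N (multicycle path with N cycles allowed)
    """
    exceptions = exceptions or {}
    slacks = []
    for path, delay in path_delays_ps.items():
        exc = exceptions.get(path)
        if exc == "false":
            continue                         # false path: excluded from analysis
        cycles = exc if exc is not None else 1   # default: single-cycle
        slacks.append(cycles * period_ps - delay)
    return min(slacks) if slacks else None


paths = {"a->b": 1800, "cfg->alu": 3500, "test_mode->x": 2600}

# Without exceptions, every path is treated as real and single-cycle.
print(worst_setup_slack(paths, 2000))  # -1500

# Declaring one two-cycle path and one false path removes both violations.
print(worst_setup_slack(paths, 2000, {"cfg->alu": 2, "test_mode->x": "false"}))  # 200
```

The arithmetic shows why a wrong exception is dangerous: it can hide a real violation just as easily as it removes a spurious one, so each exception should be verified before it is applied.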

An incorrect timing exception can lead to a critical timing failure in silicon. On the other hand, every timing exception that remains unidentified represents an unnecessary timing-closure burden. It is therefore a fine balancing act to find just the right timing exceptions. You must at least formally verify all timing exceptions in use to ensure their validity. Another step that could accelerate timing closure is to look for additional timing exceptions, especially in those paths that are violating timing. You should formally verify every such path as a possible false- or multicycle-path candidate; if it is false or multicycle, you should add it to the list of timing exceptions for static-timing analysis. Consider the results of timing analysis on two timing-critical blocks from a multimedia design (Table 1). The timing results dramatically improve when you identify additional timing exceptions from paths that initially failed to meet timing. The impact on gate count and, hence, area is minimal.

Postplacement-netlist analysis


By the postplacement-analysis stage, the design has entered physical implementation and has undergone physical synthesis, placement, and clock-tree synthesis. You should repeat quality measures on the now fully placed netlist. You now have a more accurate assessment of the power, area, timing, and test coverage, and you can compare this estimate with the estimated results from RTL to identify blocks that may have deviated.

You can at this stage perform additional netlist-quality checks, such as floating pins or nets; clock, select, enable, or reset pins that tie to constants; unused or disabled cells; undriven or multiply-driven nets in the netlist; overloaded cells; underloaded cells, wasting area and power; pins connected to specific nets, such as tristate, clocks, and resets; scan-chain nets with more than the maximum number of elements; and high-leakage or snake paths. You should also check that pins connecting to the same net are of the same connectivity class.
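A few of the netlist checks listed above (undriven nets, multiply-driven nets, nets with no loads, and overloaded drivers) can be sketched as a single pass over a net table. The table layout, the `max_fanout` threshold, and the net names are assumptions for this example; a real checker would read the limits from the cell library.

```python
def net_issues(nets, max_fanout=16):
    """Classify simple driver and load problems on each net.

    nets: dict net name -> {"drivers": [pins], "loads": [pins], "tristate": bool}
    max_fanout: hypothetical limit on loads per driver before a net counts
                as overloaded
    Returns a dict net name -> description for every net with a problem.
    """
    issues = {}
    for name, net in nets.items():
        drivers, loads = net["drivers"], net["loads"]
        if not drivers:
            issues[name] = "undriven (floating)"
        elif len(drivers) > 1 and not net.get("tristate", False):
            issues[name] = "multiply driven without tristate"
        elif not loads:
            issues[name] = "no loads"
        elif len(loads) > max_fanout:
            issues[name] = f"overloaded ({len(loads)} loads)"
    return issues


# Hypothetical netlist fragment.
nets = {
    "n_ok":    {"drivers": ["u1/Y"], "loads": ["u2/A"]},
    "n_float": {"drivers": [], "loads": ["u3/A"]},
    "n_multi": {"drivers": ["u4/Y", "u5/Y"], "loads": ["u6/A"]},
    "n_bus":   {"drivers": ["t1/Y", "t2/Y"], "loads": ["u7/A"], "tristate": True},
    "n_dead":  {"drivers": ["u8/Y"], "loads": []},
}
for name, problem in net_issues(nets).items():
    print(name, "->", problem)
```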

Before clock-tree synthesis and before a clock network exists, you should specify, in the design constraints, the values the tools should assume for clock latency and clock slew rate. However, assuming that you’ve inserted the clock tree by this stage of the design, it is time to compute and apply the delays and slew. At this time, you must also update and verify design constraints for two critical areas. First, replace assumed transition times on individual flip-flops with transition times only on the primary inputs to the design. Second, set clock delays to propagated, rather than to a user-defined network latency on clocks.

Postroute-netlist analysis


Postroute-netlist analysis represents the home stretch for design implementation, yet design teams spend a lot of time and effort at this stage to close timing, signal integrity, manufacturability, power integrity, and a host of physical effects. Assuming that you’ve followed the quality measures in earlier stages, the design and constraints should be fairly high quality, and you should focus on these physical effects. In addition, you should place significant effort in layout and physical verification, tackling process variability and other manufacturing issues. This stage also encompasses final sign-off on power, timing, testability, and die size; hence, it is best to repeat the quality measures from earlier steps as part of the final sign-off.

In short, the quality of a design and its associated constraints have a large impact on design closure. You can, however, take a series of quality measures to improve the chances of design closure. It is also important that you take most of these measures at the early stages of design, especially at RTL, at which point you best comprehend the user’s intent. The later in the implementation flow that you address design quality, the less impact it is likely to have on design closure. If you get design goals and quality objectives right from the start, it is only a matter of staying the course during implementation.


Authors' Biographies

Piyush Sancheti is senior director of business development at Atrenta Inc, where he is responsible for Atrenta’s strategic alliances with key members of the semiconductor supply chain. Sancheti has more than 15 years of experience in various marketing and product-management roles in the EDA industry at Sequence Design, Senté, and Cadence. He has a master’s degree in computer engineering from Iowa State University (Ames, IA). You can reach him at psancheti@atrenta.com.


Sanjay Churiwala is senior director of engineering at Atrenta Inc, where he has worked for eight years. His responsibilities include EDA-tool development and application responsibility for strategic customers. He has a bachelor’s degree in technology from the Indian Institute of Technology (Mumbai). You can reach him at sanjayc@atrenta.com.



Rob Knoth is senior product manager at Magma Design Automation, where he has worked for three years. He has a bachelor’s degree in electrical engineering from Purdue University (West Lafayette, IN). You can reach him at rknoth@magma-da.com.
© 2010 Canon Communications LLC. All rights reserved.