CDC verification of billion-gate SoCs
This article describes three clock domain crossing (CDC) verification methodologies and how each can best be applied to the SoCs being designed today. Growing design size, the proliferation of internal and external protocols, and aggressive power requirements are driving an explosion in the number of asynchronous clocks in SoCs. As a result, design and verification teams spend an increasing amount of time verifying the correctness of asynchronous boundaries on the chip. Incorrect asynchronous boundaries can lead to several classes of design defects that are not encountered in simpler designs.
Metastability is one of the major sources of such defects. A flip-flop can become metastable if its data input changes too close to the active clock edge, leaving its output at an indeterminate logic value for an unbounded period of time. While metastability cannot be eliminated, it is usually tolerated by placing a multi-flop synchronizer on each asynchronous boundary, and by using those synchronizers to block the destination of the boundary from sampling while its source is changing. FIFOs and 2-phase and 4-phase handshakes are typical structures used for this type of synchronization.
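To give a feel for why adding flops to a synchronizer makes metastability tolerable, the sketch below uses the common first-order model MTBF = exp(t_r / tau) / (T_w * f_clk * f_data). This model is not taken from the article itself. The function name, its parameters, and all numeric values are illustrative assumptions; real values of tau and T_w come from the characterization of the specific flop and process.

```python
import math

def synchronizer_mtbf(f_clk_hz, f_data_hz, tau_s, t_w_s, stages=2, t_setup_s=0.0):
    """Estimate the mean time between synchronization failures, in seconds.

    First-order model: MTBF = exp(t_r / tau) / (T_w * f_clk * f_data),
    where t_r is the time available for a metastable flop to resolve.
    """
    if stages < 2:
        raise ValueError("a synchronizer needs at least two flops")
    t_clk = 1.0 / f_clk_hz
    # Assumption: each flop after the first contributes roughly one clock
    # period, minus the setup time, of resolution time.
    t_resolve = (stages - 1) * (t_clk - t_setup_s)
    return math.exp(t_resolve / tau_s) / (t_w_s * f_clk_hz * f_data_hz)

# Hypothetical numbers: 500 MHz destination clock, 50 MHz data toggle rate,
# tau = 20 ps, metastability window T_w = 30 ps, setup time = 50 ps.
params = dict(f_clk_hz=500e6, f_data_hz=50e6, tau_s=20e-12,
              t_w_s=30e-12, t_setup_s=50e-12)
mtbf_2ff = synchronizer_mtbf(stages=2, **params)
mtbf_3ff = synchronizer_mtbf(stages=3, **params)
print(f"2-flop MTBF: {mtbf_2ff:.3e} s, 3-flop MTBF: {mtbf_3ff:.3e} s")
```

Because the resolution time sits inside the exponent, each additional stage multiplies the estimated MTBF by a very large factor, which is why a small number of flops is usually enough in this model.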
Glitches on asynchronous boundaries can also cause defects, since a glitch on an asynchronous crossing can cause the destination to capture a transition that the source never intended. Data coherency issues occur when multiple synchronizers settle to their new values in different cycles and their outputs subsequently interact in downstream logic. Other failure modes exist as well. While the concepts and methodologies for verifying such issues have been extensively researched in the past ten years, practical solutions have been offered primarily at the IP level. Little work has been attempted to tackle CDC verification signoff of large system-on-chip (SoC) designs.
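The data coherency problem described above can be illustrated with a small enumeration. The sketch assumes a multi-bit bus in which every bit passes through its own synchronizer, and each synchronizer independently delivers either the old or the new bit value in a given destination cycle. The function names are hypothetical. Gray coding is shown only as one commonly used mitigation for counters; the article does not name it.

```python
from itertools import product

def to_gray(n):
    """Convert a binary integer to its reflected Gray code."""
    return n ^ (n >> 1)

def observable_values(old, new, width):
    """Return every value the destination could capture while the bus
    changes from old to new, if each bit independently shows either
    its old or its new value."""
    seen = set()
    for picks in product((0, 1), repeat=width):
        value = 0
        for bit, take_new in enumerate(picks):
            source = new if take_new else old
            value |= ((source >> bit) & 1) << bit
        seen.add(value)
    return seen

# A 3-bit counter stepping from 3 (011) to 4 (100): all three bits change.
binary = observable_values(3, 4, width=3)
# The same step in Gray code: 010 -> 110, only one bit changes.
gray = observable_values(to_gray(3), to_gray(4), width=3)

print(sorted(binary))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(sorted(gray))    # [2, 6]
```

With plain binary encoding the destination may briefly see any of the eight possible values, none of which need be the old or the new count. When only one bit changes per step, the captured value is always either the old or the new one.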
Why is SoC CDC different?
CDC analysis, even at the IP level, requires some care. Several factors come into play: which clocks are truly asynchronous, which clocks can be simultaneously active, which crossings are allowed to operate without standard synchronizers (for example, configuration signals that are known to be static through most of the operation of the system), and more. All of this is very manageable at the IP level but can quickly become overwhelming at the SoC level without a disciplined methodology.
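As a rough sketch of how these setup decisions shape the result, the example below classifies register-to-register paths using two pieces of designer intent: groups of mutually synchronous clocks, and a list of signals declared static. All names (`Path`, `classify`, the clock and signal names) are hypothetical and do not correspond to any particular CDC tool. Clock modes, which decide which clocks can be simultaneously active, are not modeled here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    src_signal: str
    src_clock: str
    dst_clock: str
    synchronized: bool  # True if a recognized synchronizer is on the path

def classify(paths, sync_groups, static_signals):
    """Sort paths into buckets based on the declared clock relationships."""
    # Clocks in the same group are treated as synchronous to each other;
    # clocks in different groups are treated as asynchronous.
    group_of = {clk: i for i, group in enumerate(sync_groups) for clk in group}
    report = {"same_domain": [], "synchronized": [],
              "waived_static": [], "violation": []}
    for p in paths:
        if group_of[p.src_clock] == group_of[p.dst_clock]:
            report["same_domain"].append(p.src_signal)
        elif p.synchronized:
            report["synchronized"].append(p.src_signal)
        elif p.src_signal in static_signals:
            report["waived_static"].append(p.src_signal)
        else:
            report["violation"].append(p.src_signal)
    return report

paths = [
    Path("cpu_req", "cpu_clk", "cpu_clk_div2", False),
    Path("usb_rx_valid", "usb_clk", "cpu_clk", True),
    Path("cfg_mode", "apb_clk", "usb_clk", False),
    Path("dma_done", "dma_clk", "cpu_clk", False),
]
sync_groups = [{"cpu_clk", "cpu_clk_div2"}, {"usb_clk"}, {"apb_clk"}, {"dma_clk"}]
report = classify(paths, sync_groups, static_signals={"cfg_mode"})
print(report)
```

A wrong entry in either input changes the outcome silently. Placing two clocks in the same group hides real crossings, and an over-broad static list waives real violations, which is why setup discipline matters more as the number of clocks grows.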
Ad-hoc and customized methodologies have been suggested in some recent publications, but they leave designers in the dark about several aspects of constraint creation and CDC verification for large SoCs, and about which methodology best fits their existing design, implementation, and verification flow.
Figure 1 below illustrates the problem. It shows a typical SoC with multiple peripheral interfaces, high performance internal compute engines, accelerators, and bus fabric. Such a design has multiple clock domains, multiple power domains and, quite probably, dynamic voltage and frequency selection in several of these domains. Approaching CDC analysis of this system without a systematic methodology would be a nightmare. In this article, we describe three different methodologies and discuss their application in a typical design flow.
Figure 1: A Sample SoC Architecture
Flat SoC verification
In flat SoC verification, the entire SoC is verified in a single run. Flat SoC verification covers all the critical issues briefly described above (metastability, glitches, and loss of coherency), in addition to the functional requirements of the asynchronous interfaces and other critical issues across data, control, clock, and reset circuitry. The size and complexity of a design are no excuse for missing a CDC bug.
The main advantage of flat SoC verification is setup simplicity. Typically, clocks, modes, and other design constraints are available at the chip level, and therefore design setup for CDC verification is straightforward. This is a significant advantage as proper setup is key to effective CDC verification and can avoid long signoff iterations.
While this approach presents obvious advantages, SoCs are sometimes assembled late in the design cycle. Critical CDC issues identified only after assembly can then lead to multiple long iterations between block design and SoC assembly, which could impact the schedule and quality of the SoC implementation.
It can also be difficult to scale flat analysis to large designs. As mentioned earlier, the volume of CDC issues identified at the full-chip level can be overwhelming. These issues are the accumulation of block design problems and SoC assembly problems. A flat flow can work well for small SoCs where the complete verification task, including CDC signoff, is assigned to a single person or team.