Pseudo hardening in SoC design
Kushagra Khorwal, Vijay Bhargava, and Abhishek Mahajan, Freescale Semiconductors - May 25, 2012
In today's ever-shrinking VLSI world, where technology changes fast and new, more complex architectures are introduced every day, time-to-market remains the key challenge for VLSI designers. Various approaches are followed today to reduce project cycle time.
One such approach is to develop the IP in parallel with the SOC, so late-stage integration of such critical blocks is no longer a rarity. Usually, an initial IP design is available to the SOC team, while the final delivery of the block happens towards the end of the project. The initial design is frozen from the IP-to-SOC interface perspective: the interface pins are mutually agreed upon and final before SOC implementation begins.
The conventional approach (illustrated in Figure 1) is to harden the block separately, so that the rest of the SOC integration proceeds as usual and is unaffected when the final block arrives late. But this approach has several disadvantages:
- A dedicated place-and-route resource to handle the implementation of the block.
- A dedicated timing resource for constraints management and analysis.
- Need to handle block merging in the design flow.
- Inefficient use of placement area and routing tracks.
- Need to create feed-throughs for neighboring blocks.
- Challenges in optimization across the hardened block (especially if it is a high-frequency block).
It becomes cumbersome to keep the constraints tuned as one moves through the project cycle. It is also extremely painful to accommodate last-minute ECOs and the consequent placement changes to neighboring blocks.
To circumvent these limitations, we have come up with the flow shown in Figure 2. The approach is to partition the soft IP in such a manner that overall die size is saved and both the timing closure cycle and the engineering effort are reduced. Below we illustrate, with examples and figures, the steps involved at different phases of the design cycle.

Figure 1: Hard partitioning, the traditional approach to block hardening.

Figure 2: New pseudo-hardening approach overcomes disadvantages of traditional approach without impact on resources and time.
IP Synthesis
It is conventional to synthesize IPs separately in cases like this. That gives flexibility, but it is not the best strategy from an optimization perspective. In this approach, the soft IP (SIP) is synthesized along with the top-level SOC. This gives the best optimization of interface paths without much iteration. Additionally, there is no need to manage separate budgeted constraints for the IP.
Boundary optimization is turned off for the SIP to remove the possibility of logic moving across the block boundary. This facilitates direct plug-in of the IP at later stages (post-route or post-noise). For scan insertion, dedicated ports are created at the boundary of the SIP. Two approaches to scan chain creation are then possible, depending on the SOC.
In the first, the block gets dedicated chains; this suits a block that is large and runs in a different frequency domain. The second approach is to create abstract scan segments for the block, which can then be merged into top-level scan chains. The latter approach is better suited to relatively small blocks with the same frequency requirements as the top level.
Logic Equivalence
Again, a single flat setup is created for LEC, as opposed to the two separate setups generally used in such cases. This saves setup creation effort and runtime. To doubly ensure that the synthesis tool has preserved the functionality of the SIP at its interface, cut points should be defined at the boundary pins of the IP. Checking logic equivalence at these boundary pins after each optimization step ensures that the functionality of the SIP is intact and that the design is ready for a new IP plug-in at any stage of the flow.

Figure 3: Separate cut points for soft IP will be defined for LEC checks.
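Because the cut points sit on the frozen boundary pins, a late IP drop only plugs in cleanly if its pin list still matches the agreed interface. The sketch below is a minimal pre-check under that assumption, not a replacement for the LEC run itself. The function name, the pin names, and the `{pin: direction}` representation are all hypothetical; in practice the two pin lists would be extracted from the initial and final netlists.

```python
def interface_mismatches(frozen_pins, new_pins):
    """Compare two {pin_name: direction} maps and report every difference.

    frozen_pins: boundary pins agreed before SOC implementation started.
    new_pins:    boundary pins of the late-stage IP delivery.
    """
    frozen_names = set(frozen_pins)
    new_names = set(new_pins)
    return {
        # Pins the SOC expects but the new IP no longer provides.
        "missing": sorted(frozen_names - new_names),
        # Pins the new IP adds that the SOC knows nothing about.
        "extra": sorted(new_names - frozen_names),
        # Pins present on both sides whose direction has changed.
        "direction_changed": sorted(
            pin for pin in frozen_names & new_names
            if frozen_pins[pin] != new_pins[pin]
        ),
    }


# Hypothetical pin lists, for illustration only.
frozen = {"clk": "in", "rst_n": "in", "req": "in", "ack": "out"}
final = {"clk": "in", "rst_n": "in", "req": "in", "ack": "out"}

report = interface_mismatches(frozen, final)
print(report)
```

An empty report in all three categories means the cut points defined for the initial IP can be reused unchanged for the final one.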
Floorplan
Floorplanning for the block is easy in this approach, as no dedicated floorplan is required. A soft region is created for the IP in the top-level floorplan itself. A soft region is preferred over hard fencing because it allows the region to be used for IO timing fixes and other optimizations across the block. A snapshot is shown in Figure 4, with the SIP highlighted.

Figure 4: A floorplan snapshot is shown for a typical Fenced IP, where the placement is forced to sit in a particular region.
Placement
Placement, too, is a single run for the complete design; there is no separate placement for the block. A few considerations apply, though. Depending on the logic change expected in later stages, module padding of 1-10% should be applied to accommodate the gate count delta. As in synthesis, the IP ports need to be preserved to allow direct late-stage plug-in of the new IP if required. Also, as mentioned earlier, the soft region for the SIP helps a great deal with IO path timing, since other logic is allowed to sit in the region marked for the SIP.
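To get a feel for what the padding means in area terms, the following is a rough sketch that sizes the soft region from the initial IP's cell area. It is a simplification: place-and-route tools typically apply padding per cell or per module rather than through a single area formula, and the target utilization, the function name, and the numbers used here are assumptions for illustration, not values from our designs.

```python
def padded_region_area(cell_area_um2, target_utilization, padding_fraction):
    """Return a soft-region area (um^2) with headroom for late logic growth.

    cell_area_um2:      total standard-cell area of the initial IP.
    target_utilization: assumed placement density inside the region, in (0, 1].
    padding_fraction:   expected gate count delta, 0.01 to 0.10 (1-10%).
    """
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    if not 0.01 <= padding_fraction <= 0.10:
        raise ValueError("padding_fraction is outside the 1-10% range")
    # Inflate the cell area by the expected growth, then divide by the
    # utilization to get the region area needed to place it.
    padded_cell_area = cell_area_um2 * (1.0 + padding_fraction)
    return padded_cell_area / target_utilization


# Hypothetical numbers: 200,000 um^2 of cells, 70% utilization, 5% padding.
area = padded_region_area(200_000.0, 0.70, 0.05)
print(f"{area:.0f} um^2")
```

With these example inputs the region works out to about 300,000 um^2, against roughly 285,700 um^2 without padding.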
Clock-Tree-Synthesis
CTS needs to be done in two steps. In the first step, the soft IP is clock-tree synthesized from pre-fixed clock roots. Most tools today can dump a macromodel for a clock-synthesized block, and we use that capability to dump one for the soft IP. A clock macromodel is simply a model of the pre-built clocks of a block, detailing the latencies and capacitances on each clock port.
The second step is to clock-synthesize the whole SOC with the macromodel pulled in for the SIP. This simple two-step approach is again the easiest way to ensure late-stage plug-in of the soft IP, since the soft IP can later be clock-tree synthesized again with the rest of the SOC untouched. The concept is shown in Figure 5.

Figure 5: This illustrates a modular approach to CTS building to facilitate last stage plug-in of the IP.
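The clock macromodel described above can be pictured as a small table keyed by clock port. The sketch below shows one possible in-memory representation; it is not the file format of any particular CTS tool, and the class name, field names, port names, and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockPortModel:
    """Per-port summary of a pre-built clock tree inside the soft IP."""
    port: str
    min_latency_ns: float  # shortest insertion delay from port to any sink
    max_latency_ns: float  # longest insertion delay from port to any sink
    input_cap_pf: float    # capacitance the top-level tree sees at the port

    @property
    def skew_ns(self):
        # Skew inside the block is the spread of its insertion delays.
        return self.max_latency_ns - self.min_latency_ns


# Hypothetical macromodel for a soft IP with two clock ports.
sip_macromodel = {
    model.port: model
    for model in [
        ClockPortModel("clk_core", 0.42, 0.48, 0.015),
        ClockPortModel("clk_bus", 0.30, 0.33, 0.010),
    ]
}

for model in sip_macromodel.values():
    print(model.port, f"skew={model.skew_ns:.3f} ns", f"cap={model.input_cap_pf} pF")
```

The top-level CTS run only needs these per-port numbers, which is why the tree inside the SIP can be rebuilt later as long as the numbers are honored.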
Routing/Noise
Nothing special needs to be done for routing and noise optimization in this approach. Since the SIP is a soft partition, nets of other modules are allowed to pass over the region, and optimization proceeds as it would for a flat design. The highlighted nets in Figure 6 are SIP nets; all other nets are allowed to traverse the region.

Figure 6: A typical fenced IP.
Advantages of this approach
No impact on tool run time was seen on designs of fewer than 2 million gates (larger designs may see an impact). The advantages are:
- Faster timing closure.
- Die size saving by allowing other instances to sit in the SIP region, which also results in faster IO timing closure.
- Shorter routes, which make it easier to meet DRVs.
- No extra engineering effort needed to close the partition.
- No LEF/LIB/floorplan needed for the soft partition.
How can this approach help when it comes to plugging in SIP at a later stage?
Once the final IP is received, it can be synthesized separately, followed by incremental placement in ECO mode (with all cells other than those of the SIP fixed in place). CTS then needs to be done separately for the SIP as described earlier, maintaining the already committed latencies and skews. Routing and noise fixing then follow in incremental mode to achieve final closure.
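One way to confirm that the rebuilt clock tree maintains the committed latencies is to compare per-port latencies before and after the plug-in. The following is a minimal sketch of such a check, assuming the latencies have already been extracted into plain dictionaries. The function name, the port names, the values, and the 20 ps tolerance are hypothetical.

```python
def latency_violations(committed_ns, rebuilt_ns, tolerance_ns=0.02):
    """Return {port: delta_ns} for clock ports that drifted beyond tolerance.

    committed_ns: {port: latency} that the top-level CTS was built against.
    rebuilt_ns:   {port: latency} measured after re-running CTS on the new IP.
    A port missing from rebuilt_ns is reported with a delta of None.
    """
    violations = {}
    for port, committed in committed_ns.items():
        if port not in rebuilt_ns:
            violations[port] = None
            continue
        delta = rebuilt_ns[port] - committed
        if abs(delta) > tolerance_ns:
            violations[port] = delta
    return violations


# Hypothetical latencies in nanoseconds.
committed = {"clk_core": 0.45, "clk_bus": 0.31}
rebuilt = {"clk_core": 0.46, "clk_bus": 0.36}

bad_ports = latency_violations(committed, rebuilt)
print(bad_ports)
```

In this example only `clk_bus` is flagged, since its latency moved by about 50 ps, while `clk_core` stays within the assumed tolerance. Any flagged port would call for another CTS iteration on the SIP before the incremental routing step.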
About the authors
Kushagra Khorwal (Kushagra@freescale.com) received his Masters in VLSI Design from GGSIP University in Delhi, India. He currently works at Freescale Semiconductors, Noida, India, as a Senior Design Engineer in the field of SOC physical design. His interests include SOC clock tree synthesis, padring design, and SOC power optimization schemes.
Vijay Bhargava (Vijaybhargava@freescale.com) is Lead Engineer at Freescale Semiconductors, Noida. In his career spanning 11 years, he has worked extensively on verification, digital IPs, power estimation/modeling, and DSP architectures. He has handled synthesis and APR activities for SOC physical design.
Abhishek Mahajan (B13294@freescale.com) is a senior design engineer at Freescale Semiconductors, Noida, India. He has four years of experience in various domains such as logical and physical Synthesis, Static Timing Analysis, Place and Route and static low power verification.
