Shaving power/area with merged logic in SoC designs

Shilpa Gupta and Gaurav Goyal, Freescale Semiconductor - December 31, 2013

Modern SoC designs must reach high operating frequencies while consuming less power. Meeting both targets at once is difficult, and it gets harder at smaller technology nodes because of various sub-micron effects. The need for a low-power solution grows as more complex features are integrated into an SoC: more features require more logic and therefore a larger die, which ultimately raises chip cost. A technique that reduces logic without degrading performance, while keeping power in check, is therefore highly desirable.

Designers often need to include several instances of the same IP in a design. When an identical module is instantiated twice, its combinational (combo) logic and flip-flops are instantiated twice as well. Instead of replicating the whole module, designers can save silicon area and gate count by sharing the combo logic between the two IPs. The proposed architecture is a flip-flop design that reduces the replicated logic in an SoC, lowering area and power and, in turn, chip cost. Figure 1 depicts the prior and proposed approaches to integrating two duplicated IPs in an SoC.

Figure 1. Architecture level diagram of 2 IPs being integrated in two approaches

Figure 2 illustrates the case in which multiple instances of the same IP are integrated on the silicon. Giving each instance its own set of combo logic inflates chip area. The proposed technique meets the same design requirement while cutting the number of combo instances in half.

The conventional circuit uses two identical combo blocks for the two IPs. In the proposed circuit, a single combo block is shared by both IPs through an architectural change in the design of the sequential element (the flip-flop, which is discussed in detail below).
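
As a rough way to reason about when sharing pays off, the area of the two approaches can be compared symbolically. The symbols below are assumptions introduced only for illustration and do not come from the article: A_combo is the area of one combo block, A_ff the area of a conventional flip-flop, A_mff the area of one merged flip-flop holding a state bit for each IP, and N the number of state bits per IP.

```latex
\begin{align}
A_{\text{conv}}   &= 2\,A_{\text{combo}} + 2N\,A_{\text{ff}} \\
A_{\text{merged}} &= A_{\text{combo}} + N\,A_{\text{mff}} \\
A_{\text{conv}} - A_{\text{merged}}
                  &= A_{\text{combo}} - N\left(A_{\text{mff}} - 2\,A_{\text{ff}}\right)
\end{align}
```

Under these assumptions, the merged structure saves area whenever the combo block is larger than the total overhead that the merged flip-flops add over two conventional flip-flops per bit.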

Figure 2. Two IPs being integrated in two approaches

The first step is to verify that the two approaches are equivalent in terms of throughput. For each module to achieve the same throughput with the merged sequential structure, the following conditions apply:
  1. A 2x clock must be provided, and the master stage must support 2x clocking, which ensures that each module works on alternate clock edges (shown in the timing waveform in Figure 5).
  2. The effective time period for each module is cut in half, but the effective throughput remains the same.
  3. A select signal 'S', which enables step 2 and toggles at frequency x, must be brought in.
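
The throughput argument above can be checked with a small behavioral model. The following Python sketch is not the authors' circuit: it assumes a hypothetical 8-bit accumulator as the combo logic, models the select S as a value that alternates on every 2x clock edge (so it toggles at frequency x), and uses illustrative function names throughout. It compares two IPs with separate combo copies clocked at x against one shared combo clocked at 2x.

```python
def combo(state, data_in):
    """Hypothetical next-state logic standing in for the IP's combo cloud."""
    return (state + data_in) & 0xFF


def run_conventional(inputs_a, inputs_b):
    """Two IPs, each with its own combo copy, both clocked at frequency x."""
    state_a = state_b = 0
    trace_a, trace_b = [], []
    for in_a, in_b in zip(inputs_a, inputs_b):
        state_a = combo(state_a, in_a)  # combo instance for IP A
        state_b = combo(state_b, in_b)  # combo instance for IP B
        trace_a.append(state_a)
        trace_b.append(state_b)
    return trace_a, trace_b


def run_merged(inputs_a, inputs_b):
    """One shared combo clocked at 2x; S selects which IP's state is updated."""
    state = [0, 0]            # one stored state per IP (assumed merged storage)
    traces = ([], [])
    inputs = (inputs_a, inputs_b)
    for tick in range(2 * len(inputs_a)):   # each iteration is one 2x clock edge
        s = tick % 2                        # S alternates per 2x edge: toggles at x
        cycle = tick // 2                   # index of the equivalent x-clock cycle
        state[s] = combo(state[s], inputs[s][cycle])  # single shared combo
        traces[s].append(state[s])
    return traces[0], traces[1]


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [10, 20, 30, 40]
    # Both structures produce one result per IP per x-clock cycle.
    print(run_conventional(a, b) == run_merged(a, b))
```

In this model each IP gets the shared combo on alternate 2x edges, so it sees half the time per evaluation but still produces one result per x-clock cycle, which matches conditions 1 and 2. Real timing closure at the 2x clock is outside what this sketch captures.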
