Sonics offers another way to approach DRAM controller complexity: in the interconnect architecture
Much has been written of late about the growing problem of interfacing SoCs to DRAM. Among other issues, the effective bandwidth of a single DDR2 channel is often insufficient. In many cases this is not because the raw bandwidth is too small, but rather because the amount of data requested, the sequence of addresses within the request, or the sequence of the requests violates the strict rules necessary to keep synchronous DRAM happy.
There have been various approaches to the problem. One approach, not recommended here, is to simply put enough DRAM controllers on the SoC so that the raw bandwidth is enormous, and whatever effective bandwidth the applications end up seeing is sufficient. Alternatively, some people have tried to explicitly control DRAM access patterns in software, either giving up on caches and explicitly managing on-chip SRAM, or moving critical loops to accelerators that use scratchpad instead of cache, managing loads and stores in microcode.
Other designers have tried to load the whole problem of translating the various needs of multitasking–sometimes multithreaded–CPUs with L1 and L2 caches, multiple accelerators, and other memory traffic sources all into the DRAM controller. This approach automatically makes the DRAM controller a proprietary design, and often makes it the differentiating feature of the SoC. It also makes it a nightmare.
Now Sonics, arguably the founders of the idea of SoC interconnect as intellectual property, are offering yet another approach: one that the company claims can often improve DRAM channel utilization to the point that architects can stay with DDR2, but one that also looks forward to DDR3 and beyond.
Unlike most other approaches, the Sonics architecture, which the company calls SX with IMT (Interleaved Multichannel Technology, in case you were wondering), embeds much of the intelligence needed to manage DRAM bandwidth in the interconnect structure, rather than in the DRAM controller. This may appear counter-intuitive, but Sonics founder and CTO Drew Wingard argues that in order to solve the real problems of next year’s SoCs, this approach is necessary.
To begin with, Wingard says, the complexity of the on-chip environment is getting out of hand. Today there may be multiple CPUs, each with its own cache controller with its own protocol, plus a variety of accelerators with their own local RAMs or prefetching algorithms. Some requests will be for cache lines, others for Bytes or words, and others still for media-specific objects such as scan lines or video macroblocks. All of this traffic would normally descend on the DRAM controller in no particular order.
The Sonics architecture offers a set of specific memory transaction requests from which each of these devices may choose the ones it needs. For instance, a device may request a cache line, or a 2-D block of 12-bit pixels. All of the requests flow into the interconnect architecture, are organized for optimum DRAM utilization, and then flow on to the DRAM controller. This requires considerable local intelligence and no little cleverness. Wingard points out, for example, that Sonics has found an algorithm that allows the interconnect protocol to keep transactions in order for the functional blocks, reorder them for best use of the DRAM channels, and yet not require a substantial SRAM reorder buffer on the chip.
Wingard points out that DDR DRAM and the needs of SoC functional blocks have been evolving in quite different directions. The minimum burst length on synchronous DRAMs keeps getting longer, while the typical burst of data required for a cache line or a video macroblock remains 32 Bytes or less. This mismatch causes inherent underutilization of the DRAM bandwidth unless you do something about it.
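The arithmetic behind that mismatch can be sketched in a few lines of Python. This is only an illustration, not anything taken from Sonics: the function name `burst_utilization`, the 64-bit (8-byte) data bus, and the burst lengths of 4 and 8 are all assumptions chosen to make the point.

```python
import math


def burst_utilization(request_bytes: int, bus_bytes: int, burst_length: int) -> float:
    """Return the fraction of transferred bytes that the requester wanted.

    Hypothetical helper: assumes the DRAM always transfers whole bursts,
    so a request smaller than one burst still pays for the full burst.
    """
    burst_bytes = bus_bytes * burst_length            # smallest transfer unit
    bursts = math.ceil(request_bytes / burst_bytes)   # whole bursts needed
    return request_bytes / (bursts * burst_bytes)


if __name__ == "__main__":
    # A 32-byte request on an assumed 8-byte-wide data bus.
    for burst_length in (4, 8):
        share = burst_utilization(32, 8, burst_length)
        print(f"burst length {burst_length}: {share:.0%} of transferred data is useful")
```

Under these assumptions, doubling the minimum burst length halves the useful share of each transfer for a 32-byte request, which is the underutilization Wingard describes.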
Sonics addresses this with a multi-channel controller, splitting traffic between two or more DRAM channels in a way that is transparent to the application processors. To keep utilization on the channels high, the Sonics architecture employs transparent, fine-grained address interleaving and automatic channel-load balancing, also built into the interconnect architecture. The attempt here is to keep the processor-specific organization near the functional processing blocks, the DRAM-specific organization in the DRAM controller, and let the interconnect architecture mediate transparently between the two domains.
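Sonics has not published its address mapping here, so the following is only a generic sketch of fine-grained interleaving, assuming a simple modulo scheme. The names `map_address` and `channel_load`, the 64-byte interleave granule, and the two-channel configuration are hypothetical.

```python
from collections import Counter


def map_address(addr: int, num_channels: int, granule_bytes: int) -> tuple[int, int]:
    """Map a flat address to (channel, address within that channel).

    Assumed scheme: consecutive granules rotate across the channels.
    """
    granule = addr // granule_bytes
    offset = addr % granule_bytes
    channel = granule % num_channels
    local_addr = (granule // num_channels) * granule_bytes + offset
    return channel, local_addr


def channel_load(start: int, length: int, request_bytes: int,
                 num_channels: int, granule_bytes: int) -> dict[int, int]:
    """Count the bytes each channel serves for a sequential stream of requests."""
    load = Counter()
    for addr in range(start, start + length, request_bytes):
        channel, _ = map_address(addr, num_channels, granule_bytes)
        load[channel] += request_bytes
    return dict(load)


if __name__ == "__main__":
    # A 1 KB sequential stream of 32-byte requests over two channels.
    print(channel_load(0, 1024, 32, num_channels=2, granule_bytes=64))
```

Because the mapping is a pure function of the address, the requesting block sees one flat address space while a sequential stream spreads evenly over the channels. A real design would also have to balance non-sequential traffic, which this sketch does not attempt.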
Looking forward, the SX with IMT architecture already anticipates continued growth in the bandwidth demands of future SoCs. It also anticipates the increased burst lengths of DDR3. Less obviously, the architecture looks ahead to entirely new DRAM interconnect schemes. As DRAMs leave their DIMMs and become members of stacked-die or stacked-package assemblies, architects see the current synchronous interface going away altogether. In its place, some speculate, we will see an interface still based on very fast serial I/O at the physical layer, but logically giving the SoC direct access to the individual physical banks within the DRAM die. Proposals such as the Serial Port Memory Technology (SPMT) interface are headed in this direction. At that point, features such as the ability to support eight physical channels take on a whole new meaning.
Clearly the new Sonics interconnect architecture, absorbing much of the functionality of the most sophisticated custom DRAM controllers out there and even anticipating a radical departure from current DDR thinking, is beyond the scope of a single short article. But perhaps these notes have suggested the wealth of complexity in the architecture, and the wealth of creative thinking about the DRAM problem that has gone into it.