Is Wide I/O a game changer?
I was recently writing an article about an emerging technology for silicon die stacking, often referred to as 3D ICs. The promise is that instead of trying to squeeze everything onto a single die, you can use multiple dies, each fabricated with a process optimized for its particular type of circuitry: memory, logic, analog, sensors, and so on. The dies are connected using Through Silicon Vias (TSVs), much like vias on a printed circuit board (PCB).
Most of the companies using this technology today have not yet combined the functional dies with TSVs. Instead, they use an interposer that acts much like a PCB: it performs the routing between the functional dies but contains no active components itself.
While there are trials and tribulations associated with any emerging technology, there are also some significant advantages. Xilinx estimates that its new Virtex-7 devices achieve a hundredfold improvement in die-to-die connectivity bandwidth per watt with one-fifth the latency. Memory technology is seeing very significant gains by using this technology to the fullest. A Synopsys paper¹ discusses the recently announced Micron Hybrid Memory Cube (HMC), in which a stack of DRAM dies sits on top of a small logic die that takes care of buffering and routing to and from the memory banks. The HMC delivers 15X the bandwidth of DDR3 (120 GB/s) using 70% less power in one-tenth the silicon real estate of existing technologies.
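It is worth doing the back-of-the-envelope arithmetic on those HMC numbers. The sketch below assumes the 120 GB/s figure is the HMC's own bandwidth (so the implied DDR3 baseline is 120/15 GB/s) and that "70% less power" means 0.3X the power; both readings are my interpretation of the figures quoted above, not claims from Micron.

```python
# Back-of-the-envelope check of the HMC figures quoted in the article.
# Assumptions (mine, not Micron's): 120 GB/s is the HMC bandwidth,
# and "70% less power" means 0.3x the power of the DDR3 baseline.

hmc_bandwidth_gbs = 120.0   # GB/s, figure from the article
speedup_vs_ddr3 = 15.0      # "15X the bandwidth of DDR3"
power_ratio = 0.30          # "70% less power"

# Implied DDR3 baseline bandwidth.
ddr3_baseline_gbs = hmc_bandwidth_gbs / speedup_vs_ddr3  # 8 GB/s

# Combined gain: 15x the bandwidth at 0.3x the power is a
# 50x improvement in bandwidth per watt.
bandwidth_per_watt_gain = speedup_vs_ddr3 / power_ratio

print(f"Implied DDR3 baseline: {ddr3_baseline_gbs:.1f} GB/s")
print(f"Bandwidth-per-watt gain: {bandwidth_per_watt_gain:.0f}x")
```

An implied DDR3 baseline of 8 GB/s is in the right ballpark for a single DDR3 channel, which lends some credibility to reading the figures this way.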
These gains come not just from the reduced loading of package and board-level parasitics, but also because average wire lengths shrink when routing in a 3D space instead of 2D. It will require considerable advances in EDA tools to perform automatic place and route in three dimensions, but that is only a matter of time and improved algorithms.
But it doesn’t stop there. JEDEC recently announced a new standard called Wide I/O for connecting memory to a processor. Cadence² notes that the current Wide I/O DRAM standard specifies four 128-bit channels, providing a 512-bit interface to DRAM. It runs at a 200 MHz single data rate (SDR) and provides roughly 100 Gbit/s of bandwidth. Current expectations are that it will deliver close to a 50% power reduction compared to LPDDR3, the next-generation mobile DRAM standard, in a dual-channel configuration. Future Wide I/O standards are expected to provide faster data rates and may boost bandwidth to as high as 2 Terabits/second.
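The quoted bandwidth follows directly from the interface width and clock rate. A quick sketch of that arithmetic, using only the figures given above (SDR means one transfer per clock cycle):

```python
# Sanity check of the Wide I/O bandwidth quoted in the article:
# four 128-bit channels at 200 MHz single data rate.

bits_per_channel = 128
channels = 4
clock_hz = 200e6            # 200 MHz
transfers_per_clock = 1     # SDR: one transfer per cycle

bus_width_bits = bits_per_channel * channels  # 512-bit interface

# Raw bandwidth in Gbit/s.
bandwidth_gbit_s = bus_width_bits * clock_hz * transfers_per_clock / 1e9

print(f"Bus width: {bus_width_bits} bits")
print(f"Raw bandwidth: {bandwidth_gbit_s:.1f} Gbit/s")
```

This works out to 102.4 Gbit/s, which the article rounds to 100 Gbit/s; the 2 Terabit/s projection for future generations would require some combination of wider interfaces and much faster data rates.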
Let us stop and think about this for a moment. In the past, we have designed systems with an expectation of slow communication between functional blocks: processing is cheap; communication is expensive in terms of both performance and power. We see the results of that in the way we architect systems and in the interfaces that connect IP blocks together. What might be possible if that equation changes and communication becomes much cheaper? Sure, systems will get faster, but can we rethink some things at a more fundamental level?