EDN Senior Technical Editor Brian Dipert exposes, analyzes and
opines on diverse topics in technology.
Feb 13 2007 3:25PM
Yesterday afternoon's microprocessor paper session at ISSCC was a feast for the senses, at least for aficionados of this particular semiconductor genre. Highlights included the first presentation in the track, from IBM on the upcoming POWER6 architecture, particularly interesting given the performance-versus-power-versus-lithography presentation of the prior evening. Fabricated on a 10-layer copper 65 nm low-k SOI process, the 341 mm² chip is "fully functional, with general availability scheduled in the middle of this year", according to the presenter. POWER6 offers 2x the frequency (4-5 GHz targeted) of POWER5, while maintaining POWER5's instruction pipeline depth and operating within the same approximate power envelope.
Constructed of more than 790 million transistors, the first iteration of the POWER6 architecture contains two µP cores, each with 64 KByte L1 instruction and data caches, a 4 MByte L2 cache (along with fast-path access to the other core's L2 cache) and a dedicated DDR SDRAM controller. Additionally, the design supports an optional on- or off-module 32 MByte unified L3 cache. POWER6 operates at main supply voltages ranging from below 0.75V to above 1.3V; in "power-sensitive applications" (albeit with an unspecified performance impact) the chip consumes less than 100W.
Intel's PR team should be patting themselves on the back right now; the current industry fixation with 'core count escalation' has given Intel far more ISSCC coverage than, frankly, it deserves. Again and again over the past few days, I've seen breathless technical press headlines trumpeting Intel's '80 core microprocessor' achievement, without qualifiers such as the following:
Intel's NoC (network-on-chip) comprises 80 'tiles', interconnected via a complex shared-crossbar routing scheme, with each 'tile' comprising two independent fully-pipelined (9-stage) single-precision floating-point MACs, 32 KBytes of single-cycle instruction memory (IMEM) and 2 KBytes of data memory (DMEM). Quoting from the published paper, "A 96-bit VLIW encodes up to eight operations per cycle. With a 10-port (6-read, 4-write) register file, the architecture allows scheduling to both FPMACs, simultaneous DMEM load and stores, packet send/receive from the mesh network, program control, and dynamic sleep instructions. A router interface block (RIB) handles packet encapsulation between the processing engine (PE) and router. The fully symmetric architecture allows any PE to send (receive) instruction and data packets to (from) any other tile."
And if nothing else, Intel's NoC is an intriguing exercise in performance-versus-power consumption tradeoff and optimization. Again quoting from the company's writeup on the project, "The simulated chip frequency versus Vcc at 110°C shows a tile maximum frequency of 3.13GHz at 1V and 4GHz at 1.2V. With all 80 tiles actively performing block-matrix operations, the chip achieves a peak performance of 1.0 TFLOPS at 1V and 1.28 TFLOPS at 1.2V. Estimated typical power consumption is 98W at 1V and 181W at 1.2V....[the chip allows] up to 27GFLOPS/W and 310 GFLOPS of total performance at 0.6V with an estimated power dissipation of 11W." Intel constructed the NoC on a conventional 65 nm process; the 275 mm² custom layout has 100 million transistors. But the question remains: what practical use does the NoC have, aside from perhaps as a niche-market coprocessor akin to ClearSpeed's products or some of the GPGPU focus areas? In response to a question from the audience, "How do you map an application onto this processor?" (which got quite a few laughs from the assembled crowd), the presenter admitted that a whole lot of hand tweaking was (at least at the moment) necessary to translate the chip's performance potential into reality.
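For readers who want to sanity-check the quoted throughput numbers, here is a rough back-of-the-envelope sketch in Python. It assumes (my assumption, not a statement from Intel's paper) that each FPMAC is credited with two floating-point operations per cycle, one multiply and one accumulate, which is the usual way peak FLOPS are counted. The function and constant names are purely illustrative.

```python
# Back-of-the-envelope check of the throughput figures quoted above.
TILES = 80             # from the paper
FPMACS_PER_TILE = 2    # from the paper
FLOPS_PER_FPMAC = 2    # ASSUMPTION: one multiply + one accumulate per cycle


def peak_tflops(freq_ghz: float) -> float:
    """Peak single-precision throughput in TFLOPS at a given tile clock."""
    gflops = TILES * FPMACS_PER_TILE * FLOPS_PER_FPMAC * freq_ghz
    return gflops / 1000.0


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Power efficiency in GFLOPS per watt."""
    return gflops / watts


# (supply voltage in V, tile clock in GHz, estimated power in W), as quoted
operating_points = [(1.0, 3.13, 98.0), (1.2, 4.0, 181.0)]

for vcc, ghz, watts in operating_points:
    tflops = peak_tflops(ghz)
    efficiency = gflops_per_watt(tflops * 1000.0, watts)
    print(f"{vcc:.1f} V: {tflops:.2f} TFLOPS peak, {efficiency:.1f} GFLOPS/W")

# The 0.6 V point is quoted directly as 310 GFLOPS at 11 W.
print(f"0.6 V: {gflops_per_watt(310.0, 11.0):.1f} GFLOPS/W")
```

Under that assumption, 80 tiles x 2 FPMACs x 2 operations x 3.13 GHz works out to roughly 1.0 TFLOPS, and the same product at 4 GHz gives 1.28 TFLOPS, matching the quoted peaks. Efficiency drops from about 10 GFLOPS/W at 1V to about 7 GFLOPS/W at 1.2V, while the quoted 0.6V figures divide out to roughly 28 GFLOPS/W, in the same neighborhood as the "up to 27GFLOPS/W" claim.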
Continue reading with 'ISSCC: Processor Plethora, Part Two'