Wednesday, December 5, 2007

Karma for MPUs: is chip binning burning up?


A few weeks ago I attended a keynote at ICCAD in which Jeff Welser, director of the SRC Nanoelectronics Research Initiative (NRI), outlined industry efforts to find a replacement for CMOS. Our coverage of that keynote is “CMOS running out of gas, new effort looks for scalable replacement, ICCAD keynoter says.”

In his presentation, Welser whipped through a plethora of fascinating, eye-catching foils describing the inevitable fate of CMOS and the need for a scalable successor. But the foil that really caught my eye was one discussing the long-standing practice of “chip binning” and how it is in jeopardy, mainly because of transistor leakage issues.

Chip binning has always been fascinating to me on many levels. What is it? It’s essentially a practice in which chip manufacturers design a chip to hit a targeted speed grade, say 2 GHz. After the chips are manufactured and tested, manufacturers find that some of the chips perform at the targeted 2 GHz, some perform at higher than 2 GHz, and even more perform below the targeted specification (some of those lower performing chips may run at 1.8 GHz, others at 1.5 GHz and some at 1 GHz…and lower).

But instead of throwing out the chips that didn’t hit the targeted performance specification, some semiconductor vendors, especially microprocessor vendors, sell most of them to us, the consumers. They simply put them in bins according to speed grade and price them accordingly. The processors that come out fastest, essentially overclocked processors, traditionally sell for a premium and go into gaming machines. The ones that hit targeted performance go into high-end home computing and business PCs. The ones that didn’t hit their targeted performance go into lower cost PCs, and only the very lowest ones get thrown out. Very little is wasted. That’s one of the reasons processor companies do so well: they get to sell most of their inventories. Other types of chips, like ASICs, have to hit performance grades and meet system specifications or customers don’t buy them. But Ma and Pa consumer for the most part don’t even know about binning.
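To make the sorting step concrete, here is a minimal sketch of how a tested lot might be dropped into speed bins. The speed grades and the function names (`assign_bin`, `bin_lot`) are hypothetical, picked to echo the 2 GHz example above; real vendors do not publish their bin boundaries, and an actual test flow checks far more than clock speed.

```python
from bisect import bisect_right

# Hypothetical speed grades in GHz, lowest to highest (illustrative only).
BIN_FLOORS_GHZ = [1.0, 1.5, 1.8, 2.0, 2.2]


def assign_bin(measured_ghz):
    """Return the highest speed grade the chip meets, or None if it is scrapped."""
    # bisect_right counts how many grade floors are <= the measured speed.
    i = bisect_right(BIN_FLOORS_GHZ, measured_ghz)
    return BIN_FLOORS_GHZ[i - 1] if i > 0 else None


def bin_lot(measured_speeds):
    """Group a lot of measured top speeds by the grade each chip is sold at."""
    bins = {}
    for speed in measured_speeds:
        bins.setdefault(assign_bin(speed), []).append(speed)
    return bins


if __name__ == "__main__":
    lot = [2.31, 2.05, 1.97, 1.62, 1.12, 0.84]
    for grade, chips in bin_lot(lot).items():
        label = "scrap" if grade is None else f"{grade} GHz bin"
        print(label, chips)
```

Under these assumed grades, a chip that tests at 1.97 GHz just misses the 2 GHz target and is sold as a 1.8 GHz part, and only chips that cannot reach the lowest grade are discarded.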

As a consumer, I’ve always wondered: when you are buying a new PC, how do you know you are getting a processor that hit its target spec? And how do you know whether you are getting a processor that badly missed its performance target? For example, if you are buying a processor that runs at 2 GHz, how do you know it wasn’t targeted at 4 GHz, which would mean you are buying something that was essentially 1 MHz away from going into the trash bin? As a consumer, one has to wonder: am I essentially buying a defective product? As a tech-savvy consumer, one has to further wonder why the processor didn’t hit its target. As far as I know, MPU vendors don’t disclose any of this info to consumers.

But binning may be undergoing a bit of karma. In his presentation, Welser briefly showed a foil describing how transistor leakage in bleeding-edge processes has put the binning process in jeopardy. Essentially, what the foil showed was that because of leakage, and more so the heat created by that leakage, manufacturers are increasingly being forced to throw out the highest performing chips (those running above their specifications) from their wafer lots. Welser said that manufacturers fear the chips running at these highest clock rates will emit too much heat and will essentially burn themselves out after running at top speed. Replacing a defective product is extremely expensive and potentially embarrassing.
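A rough way to picture what that foil implies is a second screen on top of the speed test: a chip that runs faster than target but leaks too much is scrapped instead of being sold at a premium. The sketch below assumes one leakage power reading per chip and a single cutoff; the numbers and the name `disposition` are made up for illustration and do not come from Welser’s talk.

```python
# Hypothetical limits, chosen only for illustration.
TARGET_GHZ = 2.0
MAX_LEAKAGE_WATTS = 30.0


def disposition(measured_ghz, leakage_watts):
    """Decide what happens to a tested chip under an assumed leakage screen."""
    # The leakage check comes first: a chip that runs too hot is scrapped
    # no matter how fast it is, which is how the top bin gets thinned out.
    if leakage_watts > MAX_LEAKAGE_WATTS:
        return "scrap"
    if measured_ghz > TARGET_GHZ:
        return "premium"
    return "target_or_lower"


if __name__ == "__main__":
    print(disposition(2.4, 45.0))  # fast but too leaky
    print(disposition(2.4, 20.0))  # fast and within the leakage limit
    print(disposition(1.8, 10.0))  # below target, sold in a lower bin
```

In this toy model the vendor loses exactly the parts it would have priced highest, which is the problem the foil was pointing at.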

MPU vendors have traditionally made the most profit off of these highest performing chips, so leakage in CMOS is a very big deal and is hitting their bottom lines. That’s why finding a way to squelch leakage, or better yet finding a scalable alternative, is a top priority for the industry.

As you probably know, the main reason MPU vendors went “multi-core” a couple of years back was that traditional single-core processor architectures were running into leakage/heat problems. Up until that time, THE race in MPUs was, and to a certain extent still is, in performance. MPU vendors pushed single-core architectures up to around 4.5 GHz before they realized that leakage and the associated thermal problems were too great and would cause failures and dreaded recalls. So now MPU vendors, and quite a few other chip disciplines, are going multi-core…essentially putting multiple lower performance processor cores on a single chip and creating architectures that let the cores distribute computation workloads evenly, so that no core runs too fast, and the cores together don’t create so much heat that they burn up the chip.

The practice of multi-core seems to be working to sidestep the leakage and thermal problem. But as CMOS continues to scale and seemingly becomes leakier, even with new materials like high-k in the mix, the questions are how long it will work and what its limitations will be. In short, is it a Band-Aid on a hemorrhaging wound?

“Irregardless” (as my friends in Boston say), you can expect the practice of binning to continue. Yet in the era of multi-core it may be harder than ever to determine whether you are buying the best processor. Indeed, underlying Welser’s talk is the point that the most bleeding-edge processor may in the long run not be the best processor. Increasingly, consumers are probably going to need to consider what package and cooling is offered with a system, as well as how many cores it has, what its performance is and how much on-chip memory it contains. Certainly, if you fork out the cash to buy a screaming Alienware gaming machine running on THE latest and greatest processor, be sure not to skimp on the cooling…you’ll likely need it. And just for kicks, ask the salesman which bin the processor came from.



© Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.