Turn Down the Heat … Please
Tom Reeves, VP of semiconductor and technology services at IBM, sat down with Electronic News at the site of the company’s 200mm fab and mask operation near Essex Junction, Vt., for a candid conversation about what’s next in chip manufacturing, where the problems are and where future technology will come from. What follows are excerpts of that conversation.
Electronic News: What’s the next big break in chip technology?
Reeves: Through the ’70s and early ’80s, bipolars went up to 100 watts. We had water-cooling systems, but you needed something new. Then we started with CMOS, which was a Holy Grail step-function improvement. Now, 20 years later, we’ve got 100 to 120 watt chips again. Power is everything. The efforts we’re taking to get leakage power down for cell phones or a base station or a Cisco switch are enormous. If you look at a chip in a base station or a switch, they’re 40 watts, and there are a lot of them. The total wattage gets up to 5,000 or 10,000. So the major focus now is not on Moore’s Law and how you get the next density step. We’ll get that. How you get the next performance step is harder work than it’s been, too. But the most important issue is how you manage power. Leakage power at the most advanced lithography is very challenging. And with active power, can you cool the gain? College kids were hanging some gaming systems out their dorm windows to cool them down.
Electronic News: So how do we solve these problems? Are we at a point where the road map is broken and we have to re-think what we’re doing?
Reeves: I think we can make incremental improvements. At 65 nanometers, we have solved that. At 45, we have additional ideas. But as you look at 32 nanometers and beyond, it’s still an open question. At 65 and 45 we’re using header and footer switches to turn things on and off, dynamic voltage scaling, dynamic frequency scaling. We’re qualifying processes for lower voltage levels than we would have. We’ve used voltage islands, too.
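To illustrate why dynamic voltage and frequency scaling are attractive, here is a minimal sketch using the standard textbook approximation for CMOS switching power, P = activity × C × V² × f. The function name, capacitance, voltages and frequencies are hypothetical values chosen for illustration, not IBM figures, and the model ignores the leakage power discussed elsewhere in this interview.

```python
def dynamic_power(c_eff_farads, vdd_volts, freq_hz, activity=1.0):
    """Textbook CMOS switching-power estimate: P = activity * C * V^2 * f."""
    return activity * c_eff_farads * vdd_volts ** 2 * freq_hz


# Hypothetical operating points (assumptions, not measured data).
nominal = dynamic_power(1e-9, 1.2, 2.0e9)  # 1 nF effective, 1.2 V, 2.0 GHz
scaled = dynamic_power(1e-9, 1.0, 1.5e9)   # same chip at 1.0 V, 1.5 GHz

# Fraction of nominal switching power left after scaling down.
ratio = scaled / nominal
print(f"nominal: {nominal:.2f} W, scaled: {scaled:.2f} W, ratio: {ratio:.2f}")
```

Because voltage enters as a square, lowering the supply voltage cuts switching power faster than lowering frequency alone, which is one reason to qualify a process at lower voltage levels.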
Electronic News: But isn’t that a Band-Aid approach?
Reeves: I’d call it business as usual—brute, tough-it-out engineering. It’s not like bipolar going to CMOS, though. We don’t have a panacea on our road map. There are certainly some interesting technologies—carbon nanotubes are one of them—but they all look as if they’re 20 years out, not five years out. There are some ultimate solutions that are game-changers. But at 32 nanometers and the step after that, we have CMOS and SOI.
Electronic News: What’s the big bottleneck now?
Reeves: Leakage power is now equal to active power. Leakage power used to be insignificant. The Swiss and Japanese watchmakers were the only ones worried about leakage because it affected how long the battery lasted. Sharp created a new cell phone, which is only available in Japan, with an Aquos LCD screen for watching TV. It’s the same material as in their TV screens. It tunes analog signals, digital signals and FM signals through an IBM silicon-germanium tuner, and it has an IBM EDRAM ASIC. They came to us and said they wanted 600,000 of each as fast as we could make them. It launched in early June in Japan. TV-tuning standards are different by country. A different model has to come out for Korea and Europe, and a different one—probably years from now—in the United States, because we tend to lag these standard migrations.
Electronic News: Does it matter to IBM which form factors sell best and which standards are used?
Reeves: We buffer the risk for customers. For example, we’re not sure whether ultra-wideband, WiMax or Zigbee will win, but we have collaborative design partners on all of them. Whatever wins, we’re going to ride it.
Electronic News: Are there any trends about who’s going to be using these new technologies?
Reeves: Well, the United States is embarrassingly delayed on cell phone technology. I was in Japan about a month ago and commenting about when a phone in the United States would be able to get these terrestrial TV broadcasts. Every one of our salespeople there opened up their cell phones. They all had TV tuners. The oldest phone was two years old. In Japan, they’ve had broadcast-reception phones for two years. Korea and Japan drive these new standards aggressively. Only GPS [global positioning system] in phones seems to be rolling out as fast here. This is a new market for IBM. We’ve only recently gotten into the consumer market, and it’s a very sophisticated market.
Electronic News: Will consumer drive the high-end of the chip market or will it still be computers and networking?
Reeves: The consumer market is going to ship more 65 nanometer earlier than the data-processing or the networking market. The network market drives 18-by-18 die in every generation, which the consumer market never will. Data processing and networking will always lead in difficulty at the mask house, yield engineering and test strategy. But in terms of using new litho nodes, the digital camera guys are further along in their plans of ramping manufacturing than the Ciscos and Junipers.
Electronic News: Let’s swap directions here. A year ago you said that if IBM’s customers follow its design rules, yield will be in the 90 percent range. Is that still true?
Reeves: We’ve maintained that 90 percent to 100 percent range consistently for digital ASICs. We provide the entire design environment, including a test-generation methodology. Cisco will do six ASICs for a line card, and all six will be single-pass silicon. What’s new is that we’ve extended that approach into the world of analog. In that case we won’t provide an entire ASIC design flow. We provide electrical models. But what we’re demonstrating is that we have a much tighter accuracy between the electrical models to the silicon we get back than other mixed-signal suppliers. Analog is something of a black art. But if you know what you want and how to design it, and assuming the electrical models are right, IBM can give you an environment where you have first-pass analog silicon. Other vendors are not that close in terms of electrical-to-hardware coordination. If it doesn’t work, you have to determine whether it was the design or the electrical models. It’s a very difficult process.
Electronic News: As a result of this, are you finding more buy-in from customers than in the past for your recommendations?
Reeves: I think there’s a clear trend toward buy-in. More and more, people are looking for tools to help them analyze the complexity of their potential designs before they send them out, and they are asking whether, if a design yields, they can drop their customer price. Those conversations didn’t occur five years ago.
Electronic News: Does that 90 percent number work for analog designs, too?
Reeves: No, that’s strictly for digital ASICs, where you have an IBM-managed library, an IBM timing tool and router, IBM test methodology and power management. In analog and mixed signal, we’ve taken the uncertainty out of whether silicon will behave exactly the same way as the electrical models we gave you. But in that area, the client is still picking what tool they want to use. It may be Cadence for one thing and Synopsys for another. We haven’t made an investment in an RF CMOS or a mixed-signal design system.
Electronic News: Let’s look at design for manufacturability from a different standpoint. IBM has said it needs seven of the eight cores on the Cell processor to work for Sony’s Playstation. Will there be an aftermarket for chips with fewer operational cores?
Reeves: There are a lot of chips with six cores operational, and we’ve been thinking about whether we should really throw all of those away. We also have a separate part number for chips with all eight cores good. The stuff that’s going to be for medical imaging, aerospace and defense and data uses eight cores.
Electronic News: But might it be the less-expensive version of Playstation 3?
Reeves: It could, but I don’t think Sony has thought about offering that. That doesn’t mean there aren’t good uses for a chip with four SPEs [synergistic processing elements].
Electronic News: What’s the defining factor that makes some chips better than others?
Reeves: Defects. It becomes a bigger problem the bigger the chip is. With chips that are one-by-one and silicon germanium, we can get yields of 95 percent. With a chip like the Cell processor, you’re lucky to get 10 or 20 percent. If you put logic redundancy on it, you can double that. It’s a great strategy, and I’m not sure anyone other than IBM is doing that with logic. Everybody does it with DRAM. There are always extra bits in there for memory. People have not yet moved to logic block redundancy, though.
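The relationship between die size, defects and redundancy can be sketched with two common simplifications: a Poisson yield model, Y = exp(−area × defect density), and a binomial "at least k of n blocks good" model for redundant logic. This is a rough illustration under stated assumptions (independent, identical blocks, and no defects outside the redundant blocks), not IBM's yield methodology. All function names and numbers are hypothetical and are not actual product yields.

```python
from math import comb, exp


def poisson_yield(area_cm2, defects_per_cm2):
    """Simple Poisson model: probability that a region has zero defects."""
    return exp(-area_cm2 * defects_per_cm2)


def k_of_n_yield(block_yield, n, k):
    """Probability that at least k of n independent, identical blocks are good."""
    return sum(
        comb(n, i) * block_yield ** i * (1 - block_yield) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical numbers: eight identical logic blocks, 0.2 cm^2 each,
# at an assumed defect density of 0.5 defects per cm^2.
block = poisson_yield(0.2, 0.5)
all_eight = k_of_n_yield(block, n=8, k=8)  # every block must work
seven_ok = k_of_n_yield(block, n=8, k=7)   # one block may be a spare

print(f"per-block yield:     {block:.3f}")
print(f"need 8 of 8:         {all_eight:.3f}")
print(f"need 7 of 8:         {seven_ok:.3f}")
print(f"gain from one spare: {seven_ok / all_eight:.2f}x")
```

In this toy model the benefit of a spare grows as the per-block yield falls, which is consistent with the point that redundancy matters most on large die.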
Electronic News: Do any of those cores ever go bad, so that you start out with seven and you wind up with six or five?
Reeves: There’s a reliability failure rate for all chip types. By definition, reliability failure is one point circuit that has failed. If it happens to be in an SPE, it will knock out one of the cores. We have electronic fuses now, rather than laser fuses, which you can only blow when you’re doing wafer tests. Electronic fuses you blow electrically. If you really want to be focused on reliability and up-time availability, you can design one of these chips to self-detect. You can ship it with eight cores working, blow one of them, and from a user perspective you would have self-healed it in the field.
Electronic News: But would it be as fast as the chip with eight cores?
Reeves: Yes, because the Playstation 3 only uses seven of them. You’d have a spare. That isn’t implemented in Cell, but it could be. We implemented that same strategy for IBM systems. If you take a logic hit on a chip, you don’t have any impact on performance because there is enough redundancy built in.
Electronic News: What happens if one of the cores blows on the Sony Playstation 3 if there are only seven to start with?
Reeves: It’s just like a reliability failure on your TV or DVD recorder. If it’s within warranty, you send it back. If it’s not, your game doesn’t work anymore. You’ll always have choices about how reliable you want to make a chip with burn-in. Most chips that go into the consumer marketplace on things such as camcorders or DVD players aren’t burned in. But you can add burn-in and improve reliability 5x to 10x. It’s extra cost. Certainly, a company like Sony adds that in.
Electronic News: How much extra cost?
Reeves: It’s variable. On DRAMs and SRAMs, it’s cents. On processors, because they’re so high-powered, it’s not trivial to power 100 or 1,000 at a time. With all the wattage, it can be dollars.
Electronic News: With the price Sony is going to charge, it can easily add that into the cost.
Reeves: Sony is very concerned about quality and backward compatibility. They want to get this right. They tested game after game after game. When there were about 40 Playstation 1 games that didn’t work properly, that didn’t pass their criteria for quality.
Electronic News: So does that mean the current Playstation 2 systems have a Cell processor?
Reeves: No, they have a 440 Power processor. It’s a 130-nanometer, single-core ASIC chip. It’s the same technology as if you buy a Sony DVD or a Sony Bravia TV. Sony is replacing all the Mips design points with Power design points.
Clarification: Tom Reeves, IBM’s VP of semiconductor and technology services, said he was not making any specific references to past or current Cell yields in an executive insight interview that ran last week. He was, instead, referring to large die yield challenges in general and the successful leverage provided by logic redundancy strategies. IBM does not release product-specific yield information. This clarification was made on July 14, 2006.