TierLogic lifts the veil: another take on the 3D FPGA
TierLogic, yet another large and expensive FPGA start-up that has been in stealth mode for years, today unveiled a radical approach to increasing the density and utility of large programmable logic devices. Like previously-announced Tabula, TierLogic describes their design as a 3D FPGA. But the two approaches are totally unlike each other, and neither is related to the concept of 3D ICs—involving stacked dice and through-silicon vias—that is currently the hot topic in SoC-of-the-future circles.
TierLogic’s big idea is elegant and audacious: increase the density of FPGAs by moving all the configuration memory, meaning the RAM cells that control the interconnect muxes rather than the data memory or the look-up-table (LUT) memory, out of the silicon substrate. Removing these memory bits can by itself cut die area, or at least the die area occupied by logic fabric, by more than half, according to the company’s vice president of sales and marketing, Paul Hollingworth. That saving lets TierLogic use a mature 90nm process node and still deliver a smaller die than a conventional SRAM FPGA would require, making it possible to offer the FPGAs at about half the cost of equivalent conventional parts.
But those SRAM cells have to go somewhere. That’s where TierLogic’s foundry partner Toshiba comes into the picture. Toshiba has developed a unique back-end-of-line process that puts a layer of amorphous-silicon thin-film transistors (TFTs) on top of the interconnect stack. The proprietary process uses virtually none of the wafer’s thermal budget, so it’s compatible with advanced CMOS. Yet at 180nm dimensions Toshiba can produce sufficiently fast and dense TFT SRAM cells to accommodate all the configuration memory required for the FPGA below. And since the configuration SRAM just sits there providing steering bits to the muxes (no user delay paths pass through the configuration memory), the slower, more stable TFT SRAM has no impact on user timing, except for the significant benefit of allowing the active die area to be much smaller.
So this is what TierLogic means by 3D: the chips have two separate layers of active circuitry. The substrate holds the logic cells, interconnect muxes, block memory, and other user-accessible features. The TFT layer on top of Metal-8 holds the configuration memory. The result is an FPGA that can be functionally equivalent to industry-standard devices but on a potentially smaller die, and so significantly lower in cost and power. Hollingworth said that in practice, TierLogic parts will be about 30 percent denser than economy FPGAs and offer 2.6 times the logic density of high-end conventional devices. For reasons we’ll discuss later, TierLogic is also claiming about a third better logic-cell utilization, so overall the company boasts over three times the effective logic density of existing high-end FPGAs.
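As a rough check on how those two claims combine, the sketch below multiplies the quoted 2.6x raw density figure by the utilization gain. It assumes the two factors are independent and simply multiply, which the company did not spell out; the function and constant names are illustrative only.

```python
# Back-of-envelope check of the density claim against high-end FPGAs.
# Assumption (not stated by the company): raw density and utilization
# gains are independent and multiply.

RAW_DENSITY_RATIO = 2.6    # quoted: logic density vs. high-end conventional devices
UTILIZATION_GAIN = 1 / 3   # quoted: "about a third" better logic-cell utilization


def effective_density_ratio(raw_ratio: float, utilization_gain: float) -> float:
    """Combine a raw density ratio with a fractional utilization improvement."""
    return raw_ratio * (1.0 + utilization_gain)


combined = effective_density_ratio(RAW_DENSITY_RATIO, UTILIZATION_GAIN)
print(f"Effective density vs. high-end FPGAs: {combined:.2f}x")
```

With these inputs the product lands a little under 3.5, which is consistent with the "over three times" figure.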
There is a second major advantage to this two-layer implementation: ASIC conversion. Since the TFT SRAM cells are not in any user timing paths, TierLogic can replace the TFT layer with a simple metal layer containing hard straps to power and ground busses, with no impact on user timing (except, of course, for eliminating the need for a power-up configuration mode). Eliminating the TFT layer reduces cost further, creating a mask-programmed device that is functionally and timing-equivalent to the field-programmable device, but cheaper. "This is the first time there has been an ASIC solution that really fits for volumes between a hundred and ten thousand units," Hollingworth maintains.
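Why the conversion leaves the design's behavior unchanged can be shown with a toy model: a routing mux sees only the values of its select bits, not where they come from. The sketch below is a simplified software illustration, not TierLogic's actual circuit. The names routing_mux, TftSramConfig, and MetalStrapConfig are hypothetical.

```python
from typing import Sequence, Tuple


def routing_mux(inputs: Sequence[int], select_bits: Sequence[int]) -> int:
    """Pick one input; select_bits are static configuration bits, MSB first."""
    index = 0
    for bit in select_bits:
        index = (index << 1) | (bit & 1)
    return inputs[index]


class TftSramConfig:
    """Field-programmable case: bits are loaded at power-up and can be rewritten."""

    def __init__(self, width: int) -> None:
        self._bits = [0] * width

    def load(self, bits: Sequence[int]) -> None:
        if len(bits) != len(self._bits):
            raise ValueError("wrong number of configuration bits")
        self._bits = list(bits)

    def read(self) -> Tuple[int, ...]:
        return tuple(self._bits)


class MetalStrapConfig:
    """Mask-programmed case: each bit is tied to power (1) or ground (0) permanently."""

    def __init__(self, bits: Sequence[int]) -> None:
        self._bits = tuple(bits)

    def read(self) -> Tuple[int, ...]:
        return self._bits


signals = [0, 1, 1, 0]          # candidate routing sources feeding the mux
sram = TftSramConfig(width=2)
sram.load([1, 0])               # configuration loaded at power-up
strap = MetalStrapConfig([1, 0])  # same bit pattern, fixed in metal

same = routing_mux(signals, sram.read()) == routing_mux(signals, strap.read())
print("Same routing in both devices:", same)
```

Because the select bits never change while the design runs, swapping their source from SRAM cells to fixed straps changes nothing on the user signal path in this model.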
The turn-around time for reducing a fuse map to a metal-mask and delivering the mask-programmed parts is four weeks. No redesign is necessary, nor should there be any need to reclose timing, although some customers will still have to requalify the parts. The quick turn-around is in part because TierLogic can bank all its wafers at Metal-8, and simply send the wafers to either the TFT line or to Metal-9 fabrication, as needed.
This gives TierLogic the equivalent of Altera’s HardCopy capability, with die size and cost intermediate between an FPGA and a cell-based ASIC, but without requiring the customer to redo timing closure with a new set of timing files. The company is underlining this point with an offer to early adopters: TierLogic will convert an existing production or prototype FPGA design or ASIC design to a TierLogic metal-programmed part as a service. For a minimum order of 50k units the service is free. The company will give you complete pin-compatibility with your existing part for a small NRE, or throw pin-compatibility in as well on a 100k-unit minimum order.
The tool flow for the devices is familiar: Mentor Precision Synthesis followed by proprietary mapping, routing, and analysis. One interesting point in the mapping process is that TierLogic’s LUTs are fracturable. If a path requires only a portion of a LUT—an inverter, say—the rest of the LUT is available to other nets. "Fracturing is known to be valuable—it improves our logic-cell utilization by 36 percent," Hollingworth said. "But if your configuration RAM is on your die, it’s just too costly to support fracturing."
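To illustrate the idea of fracturing, the sketch below models a 4-input LUT as a 16-entry truth table that can instead be used as two independent 3-input LUTs, so a small function such as an inverter occupies only one half. The LUT size, the split into halves, and all names here are assumptions for illustration; TierLogic has not described its actual LUT structure.

```python
from typing import Callable, List, Sequence


def make_table(func: Callable[..., int], n_inputs: int) -> List[int]:
    """Build a truth table; entry i holds func applied to the bits of i (input 0 = LSB)."""
    table = []
    for i in range(2 ** n_inputs):
        bits = [(i >> k) & 1 for k in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def lut_read(table: Sequence[int], inputs: Sequence[int]) -> int:
    """Look up the output for the given input bits (input 0 = LSB of the index)."""
    index = 0
    for k, bit in enumerate(inputs):
        index |= (bit & 1) << k
    return table[index]


class FracturableLut4:
    """16 configuration bits, usable as one 4-LUT or as two independent 3-LUTs."""

    def __init__(self) -> None:
        self.bits = [0] * 16

    def program_whole(self, func: Callable[..., int]) -> None:
        self.bits = make_table(func, 4)

    def program_halves(self, low_func: Callable[..., int],
                       high_func: Callable[..., int]) -> None:
        # Each half gets its own 8-entry table and serves a different net.
        self.bits = make_table(low_func, 3) + make_table(high_func, 3)

    def read_whole(self, inputs: Sequence[int]) -> int:
        return lut_read(self.bits, inputs)

    def read_low(self, inputs: Sequence[int]) -> int:
        return lut_read(self.bits[:8], inputs)

    def read_high(self, inputs: Sequence[int]) -> int:
        return lut_read(self.bits[8:], inputs)


lut = FracturableLut4()
# Low half: an inverter on input 0. High half: a 3-input AND for another net.
lut.program_halves(lambda a, b, c: 1 - a, lambda a, b, c: a & b & c)
print("inverter(1) =", lut.read_low([1, 0, 0]))
print("and(1,1,1)  =", lut.read_high([1, 1, 1]))
```

Without fracturing, the inverter in this model would tie up all 16 configuration bits; with it, the other half remains free for a second function.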
Apparently Tabula’s announcement persuaded TierLogic to announce a little earlier than they had intended. The company is not ready to give detailed product descriptions yet. Hollingworth did say that the mask-programmed version of the parts is available today, so TierLogic invites interested prospects to register on their site and get more detailed information. The company has already done one design that includes an on-chip MIPS R4000 CPU implemented in the logic fabric, for example.
Hollingworth expects to ship engineering samples of the field-programmable part with the TFT SRAM layer by the end of June this year, but it will be a while longer before those devices are qualified for full production. There are still issues with TFT yield, he admitted, but the company has seen a new run that appears to solve the problem. It just has to be fully evaluated.
In the future, TierLogic has several options. Hollingworth said that the engineering team has done critical-dimension analysis that indicates the TFT approach will scale to at least the 40nm node, giving the idea lots of room for evolution. And there are at least two more revolutionary ideas afoot. First, since the TFTs are relatively low-performance devices processed at low temperature, the TFT layer is compatible with just about anybody’s advanced CMOS process. So TierLogic can license a field-programmable logic fabric as IP for use inside a cell-based SoC. You could have your wafers built at your favorite foundry, passivated and shipped to Toshiba, who could strip the passivation and fabricate the TFT layer. Hollingworth said that the company has already had discussions along these lines with some prospects.
The second point Hollingworth mentioned is that Toshiba is looking at the laser-annealing process that is coming on-stream for the 32nm process node. Once the laser-annealing systems are developed and in place, the high-speed laser annealing could create the local high temperatures necessary to produce a much higher-performance TFT without impacting the thermal budget of the underlying wafer. This would in principle allow TierLogic to put not just configuration memory but signal-path devices such as embedded SRAM blocks and even some logic structures or analog circuits in the top layer, creating an even denser FPGA. But that is for the future.