Leibson's Law: It takes 10 years for any disruptive technology to become pervasive in the design community. This blog is about the disruptive technologies that either have or will win over electronic engineers, some that won't, and why. Written by Steve Leibson, Tensilica's Technology Evangelist. See my history site at www.hp9825.com. You can email me by taking the first letter of my first name, appending that to my last name, then the magic email symbol, followed by the name of the company I work for, and then a dot followed by com.
Oct 2 2008 11:41AM
If you are on a design team that’s about to develop a multi- or many-processor SOC (MPSOC), you should take a look at Tom Halfhill’s new 12-page Microprocessor Report article on Intel’s Larrabee GPU (graphics-processing unit), a many-core x86 architecture with an on-chip network linking the many processors. The article is dated September 29, 2008. (Full disclosure: In-Stat and Microprocessor Report both belong to Reed Business Information.) Tom has written a terrific analysis of the design goals and implementation decisions behind Larrabee. He’s also done his usual thorough job of positioning Larrabee and its chances within the greater scope of competitive offerings from ATI and Nvidia. But those are not the reasons I’m suggesting that you read this article. I think you should read Tom’s article because, in 12 short pages, you’ll find a tremendous amount of practical information and analysis that will guide your own design and business decisions, whether your team is developing a GPU (which is unlikely) or some other sort of MPSOC (much more likely).
Let’s talk business decisions first because, as an EDN reader, you’re less likely to care about such things, so if I leave them to the end, you won’t get that far. Don’t worry, the cool tech stuff will resume in just a few paragraphs.
Halfhill writes:
“Scheduled to debut in 2009 or 2010, Larrabee is a direct challenge to the long-established GPUs from ATI (acquired by AMD in 2006) and Nvidia. Those two vendors control 98% of the discrete-GPU market, according to Jon Peddie Research... Last year, the total market for GPUs (discrete and integrated) was 350 million units, according to Peddie.”
Here, in just three sentences, Halfhill delineates half of Larrabee’s business plan. There are 350 million GPU chips sold per year and this market is almost entirely owned by two companies. We (Intel) can get some of that. Just ten percent of this existing market is 35 million chips per year. We (Intel) should take some of that business.
But there are other, untapped—yet nonetheless obvious—markets that will drive GPU volumes up. Halfhill writes:
“Another rationale for basing Larrabee on the x86 architecture is that future PCs will be able to use their Larrabee GPUs as coprocessors for compute-intensive tasks other than graphics. Video transcoding is a good example of a highly parallel task suited for Larrabee. Although existing GPUs can do the same thing—indeed, they're doing it now, in small ways—having a GPU that shares the same architecture with the CPU could ease software development and task sharing.”
And:
“Frankly, MPR will be surprised if Larrabee doesn't trail the best GPUs from ATI and Nvidia when independent benchmark testers get hold of it. However, we doubt that superior graphics performance will be critical to its long-term success. The market for discrete GPUs is relatively flat and dominated by avid gamers. We expect Larrabee to be more important for the HPC market and integrated graphics. Scaled-down versions of Larrabee, integrated in the north-bridge chip or CPU, will likely satisfy the majority of PC users.”
If you can’t write a 1-page business explanation as clearly and succinctly as the above three paragraphs, then try harder. It’ll be worth it.
Thanks for sticking with me through the business stuff. However, if you think the business stuff isn’t important, think again. Even working in the ivory tower of HP in the 1970s, we knew that the number one objective for the company was profit. Without it, there’s no company.
Now back to our regularly scheduled tech stuff:
As I wrote at the beginning of this blog entry, Halfhill’s article is chock full of good technical stuff for MPSOC designers. The following statements remind you about the need to connect the future with various aspects of company legacy, to reduce overall development costs and to ensure a built-in base of developers:
“By adapting the x86 for Larrabee, Intel doesn't have to rewrite its assemblers, compilers, profilers, debuggers, simulators, and drivers completely from scratch. Larrabee supports the popular DirectX and OpenGL graphics APIs, and it can run other existing middleware, after some modifications.”
Next, Halfhill reminds you to look through your own prior art, to help accelerate your project’s time to market:
“Having settled on the x86 architecture, the Larrabee architects did something surprising. They didn't design an entirely new x86 core or adapt the low-power Atom core (which wasn't finished yet). Instead, pressed for time and worried about power consumption, they unearthed the brain of their new processor from Intel's graveyard. They derived the Larrabee core microarchitecture from the RTL of the original Pentium, introduced in 1993.
It's not even the Pentium Pro, Pentium II, or Pentium III—just the plain old Pentium.”
Next, Halfhill has some pithy, counterintuitive things to say about Larrabee’s on-chip network. He compresses a college class covering the issues surrounding on-chip networks into a few short paragraphs:
“Intel settled on a ring topology to simplify the design, cut costs, and get to market faster. Although a ring isn't as versatile as a mesh, it simplifies the wire routing and allows Intel to populate the network with relatively complex 64-bit x86 processor cores. Most manycore and massively parallel chips that have mesh fabrics also have simpler processor cores...
According to Intel, the latency inherent in Larrabee's ring network is less significant than the usual memory latency for loads and stores, so the ring won't slow things down. In other words, the processors will spend more time waiting for data to arrive from off-chip memory than they will spend waiting for data to circle the ring...
Intel says the rings will be limited to four to 16 processor cores. Beyond that threshold, the ring grows so large that external memory latency won't hide the internal network latency.”
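To get a feel for why ring size matters, here is a rough back-of-the-envelope sketch in Python. It is not from Halfhill's article or from Intel. The cycles-per-hop figure, the off-chip memory latency, the choice of a bidirectional ring, and the uniform-traffic model are all my own illustrative assumptions, and the function names are hypothetical. The point is only to show the shape of the tradeoff: hop counts grow with the number of ring stops while memory latency stays put.

```python
"""Back-of-the-envelope model of hop counts on a ring interconnect.

Every constant here is an illustrative assumption, not an Intel figure.
"""

HOP_CYCLES = 2                # assumed cycles per ring hop (hypothetical)
MEMORY_LATENCY_CYCLES = 200   # assumed off-chip memory latency (hypothetical)


def average_ring_hops(stops: int, bidirectional: bool = True) -> float:
    """Mean hops between two distinct ring stops under uniform traffic."""
    if stops < 2:
        return 0.0
    total = 0
    for distance in range(1, stops):
        # On a bidirectional ring a message takes the shorter way around.
        total += min(distance, stops - distance) if bidirectional else distance
    return total / (stops - 1)


def worst_case_ring_hops(stops: int, bidirectional: bool = True) -> int:
    """Hops to the farthest stop on the ring."""
    if stops < 2:
        return 0
    return stops // 2 if bidirectional else stops - 1


if __name__ == "__main__":
    for stops in (4, 8, 16, 32, 64):
        worst = worst_case_ring_hops(stops) * HOP_CYCLES
        avg = average_ring_hops(stops) * HOP_CYCLES
        share = worst / MEMORY_LATENCY_CYCLES
        print(f"{stops:3d} stops: avg {avg:5.1f} cycles, worst {worst:3d} cycles, "
              f"worst case = {share:.0%} of assumed memory latency")
```

In this toy model the worst-case trip doubles every time the number of stops doubles, so a ring that is negligible next to memory latency at a handful of cores eventually stops being negligible. Where that crossover really falls depends on numbers I have only guessed at here, so treat the output as a trend, not as a check on Intel's 16-core limit.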
And finally, it takes Halfhill only one sentence to succinctly sum it all up:
“By reverting to a 15-year-old x86 microarchitecture (albeit with many improvements) and adopting a simple ring for the on-chip network, Intel has created a very scalable basic design.”
I hope that by now you’re getting the right idea about this article. If you subscribe to Microprocessor Report, go and read this article for a great, fast introduction to the ins and outs of MPSOC design. If you don’t subscribe, this is a good time to start. (Microprocessor Report is a paid-subscription publication.) If you don’t want to buy a subscription, spend fifty bucks for the article. It will pay you back 10,000x in time saved on your next MPSOC project, and where else will you find that kind of ROI?