Brian Dipert
EDN Senior Technical Editor Brian Dipert exposes, analyzes and opines on diverse topics in technology.


Monday, August 25, 2008

The 2008 Intel Developer Forum: GPGPU Gains Relevancy

Aug 25 2008 10:05AM


One in a series of posts

It's a funny thing…back in mid 2005 (online supplements here), I could barely get Intel to acknowledge the presence of discrete GPUs in the PC platform, much less admit to their validity as a supplement to the primary CPU. Contrast this past cynicism with the two-weeks-back SIGGRAPH optimism that accompanied Intel's unveiling of the first details of its coming-within-a-year Larrabee media processor. I attended the embargoed briefing that predated Intel's week-before-SIGGRAPH announcement; Intel's Larry Seiler (senior principal engineer, and ex-ATI Technologies aka AMD graphics division employee) recycled the material one week later at IDF, where Larrabee shared key billing with SSDs and the Atom and Nehalem CPUs.

Instead of revisiting already-trodden ground, I'll begin by referencing Larrabee writeups published by my peers at other publications, for your perusal:

Particularly note the Atom-reminiscent CPU core, albeit with a graphics-tailored vector unit:

And the ring-arranged core-to-core interconnect scheme:

As I listened to Intel's pitch four weeks back, I remember repeatedly thinking "this clearly isn't just about graphics…and Intel's ultimate aspiration is to pull these functions into the CPU." Intel Senior Vice President Pat Gelsinger validated my perspective last Tuesday afternoon when, during his keynote, he made the following argument (which I'm paraphrasing):

  • Historically, as incremental capabilities get added to the PC platform, their initial implementation vehicle is as hardware on dedicated silicon, but
  • They eventually migrate to the host CPU in a software-based fashion, assisted in some cases by application-focused instruction set extensions

As such, Larrabee is a 'bridge' product. Being x86-based, it's a means by which Intel can wrest control away from the proprietary shader processors used by ATI/AMD, Nvidia and other GPU suppliers (thereby reminiscent of RISC-vs-CISC CPU wars of years past). And, as I pointed out back in mid-June, x86 cores are just as capable of crunching graphics code as any other architectural approach. But, with several Intel keynoters confidently predicting last week that Moore's Law has at least 15 years' worth of life left, the company's CPUs will rapidly migrate beyond the 8-physical-core max count forecasted for 45 nm-fabricated Nehalem, and parallel-processing functions such as graphics are an ideal way to consume any resultant silicon slack.

If this were just about graphics, I'd wager Intel wouldn't be in this particular game. The company's existing graphics core, whose tile-based rendering approach (which Larrabee also embraces, by the way) hearkens back to Intel's i740, isn't leading edge by any means. However, as I've pointed out on numerous occasions, it's adequate for the vast majority of users' needs; heavy-duty gaming and content creation are lucrative but increasingly slender slices of the overall PC pie. And with the graphics core already integrated within the core logic chipset, and with portions of the core logic chipset now getting integrated onto the CPU die with Nehalem, I think you can see where this trend is headed...

No, the Larrabee project is just as much about other GPGPU applications I've regularly mentioned, such as:

  • Still and video image encoding, decoding, transcoding and other processing tasks
  • Audio processing
  • Encryption and decryption
  • Database search acceleration, and
  • Physics processing

In this broader suite of functions, x86 has an inherent advantage over a proprietary processor core by virtue of its maturity and ubiquity, even if first-generation Larrabee won't support SSE-coded instruction streams. As I most recently noted in last Friday's Atom writeup:

The ability to tap into an enormous, mature code and development tools base with little-to-mostly-no alteration is tremendous.

Yes, Nvidia's right, CUDA is a C-based programming language. But the target graphics processor core is still proprietary, as are the development tools…and as such they're incompatible with ATI/AMD's equally proprietary Stream scheme. x86, conversely, is (for better and worse) the "One Ring to rule them all, One Ring to find them, One Ring to bring them all and in the darkness bind them." And, in part to counteract the reality that Larrabee will also be a proprietary GPU silicon platform by virtue of its x86-ness, you gotta believe (as Renee James strongly hinted in her Wednesday keynote) that Intel will be offering toolsets that create code which dynamically runs on the CPU, on Larrabee, or in parts on both. The split would depend on whether Larrabee is present in each target system, as well as on the respective CPU-vs-Larrabee core counts and other system-specific performance metrics.
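
To make that last idea concrete, here's a minimal sketch (in Python, purely for illustration) of the kind of decision such a toolset's runtime might make. To be clear, nothing here reflects an actual Intel tool: the function name `split_work`, its parameters and the assumption that work divides in proportion to core counts are all hypothetical, and a real scheduler would weigh many more system-specific metrics.

```python
def split_work(n_items, cpu_cores, accel_cores=0, accel_weight=1.0):
    """Divide n_items between the CPU and an optional accelerator.

    Hypothetical illustration only. Assumes throughput is proportional
    to core count, with accel_weight expressing how much one
    accelerator core is worth relative to one CPU core.
    Returns a (cpu_items, accel_items) tuple.
    """
    if n_items < 0 or cpu_cores < 1 or accel_cores < 0 or accel_weight <= 0:
        raise ValueError("invalid work size, core count or weight")

    # No accelerator in this system: everything runs on the CPU.
    if accel_cores == 0:
        return n_items, 0

    # Accelerator present: share the work in proportion to capacity.
    accel_capacity = accel_cores * accel_weight
    total_capacity = cpu_cores + accel_capacity
    accel_items = round(n_items * accel_capacity / total_capacity)
    return n_items - accel_items, accel_items


if __name__ == "__main__":
    print(split_work(100, cpu_cores=4))                   # CPU-only system
    print(split_work(100, cpu_cores=4, accel_cores=16))   # CPU plus accelerator
```

The point of the sketch is only that the same application binary could query what's in the box and decide at run time where each chunk of work goes; how well that works in practice depends on the tools Intel actually ships.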

However, given that graphics remains the dominant function executed on GPUs, Larrabee will need to be price/performance competitive with other suppliers' offerings a year from now when it's scheduled for release, in order to deliver a sufficient volume ramp (and developer embrace) to ensure its long-term survival. In this light, I'm glad to see that Peter Glaskowsky was misquoted last week (and had the opportunity to correct any misperceptions arising from the reporter's oversimplification of his stance). Intel's SIGGRAPH paper (which I commend to your inspection…punch 'Larrabee' into the search box at the top of the page to find the material) contained preliminary performance data as a function of number of cores…but calibrated to a 1 GHz per-core clock frequency.

We don't yet know what Larrabee's production clock rate will be (though rampant industry rumor at the moment...which I don't personally buy into...suggests something on the order of 3 GHz), nor do we know what range of core counts Intel will be shipping in various Larrabee proliferations, nor do we know what price tag Intel and its graphics card partners will associate with each of those proliferations. So, although it's fun to speculate, don't make any implement-or-not decisions until the data's more solid and more plentiful. Until then, I welcome your opinions on Larrabee's chances.
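
To see why the 1 GHz calibration leaves so much open, consider the following back-of-the-envelope sketch. It assumes, naively, that performance scales linearly with clock frequency; the function name `scale_to_clock` and the `efficiency` fudge factor are my own hypothetical constructs, not anything from Intel's paper, and the sketch deliberately contains no real benchmark numbers.

```python
def scale_to_clock(score_at_1ghz, clock_ghz, efficiency=1.0):
    """Project a 1 GHz-normalized score to another clock rate.

    Hypothetical, naive model: assumes linear scaling with clock,
    derated by an efficiency factor in (0, 1] to stand in for memory
    bandwidth limits and other effects that linear scaling ignores.
    """
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return score_at_1ghz * clock_ghz * efficiency


if __name__ == "__main__":
    # Same normalized score, very different outcomes depending on the
    # still-unknown production clock and real-world scaling efficiency.
    for clock in (1.0, 2.0, 3.0):
        print(clock, scale_to_clock(10.0, clock, efficiency=0.8))
```

Every input other than the normalized score is still a guess at this point, which is exactly why the resulting projections shouldn't drive design decisions yet.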

Stay tuned for more on GPGPU, specifically as it relates to Intel's competitors, after I see what Nvidia unveils at this week's conference.


Reader Comments


at 8/26/2008 7:14:21 AM, psykhon said:
intel's check to brian just keeeeeeep coming

at 8/26/2008 8:57:59 AM, Brian Dipert said:
Dear psykhon, as I said in reply to your comment left on my 'Atom Bomb' post: If you disagree with me, state why. Simply tossing out accusations and insults scores you no points whatsoever, and in fact further weakens your case.

at 8/26/2008 5:44:08 PM, someDude said:
I think a lot of people are too quick to assume that since Larrabee is x86 it will be easy to use existing software and tools. Sure it would be easy to use existing x86 code and tools on a single Larrabee core, but that would be pointless. Automatically parallelizing existing code is rarely easy, and using x86 does nothing to solve this problem. I predict that Larrabee will be more flexible but with less raw power than AMD and nVidia GPUs when it launches. Obviously Larrabee will improve its raw performance over time, but the GPUs from AMD and nVidia are also getting more flexible with each iteration.

at 8/26/2008 5:49:48 PM, Brian Dipert said:
Dear someDude, Intel and its partners such as Microsoft have been working on making threading support front-and-center in the tools-and-resultant-applications strategy since the long-ago premiere days of HyperThreading, yes? And arguably, even before that...SMP configurations stretch all the way back to the i386. All of this work is immediately also applicable to Larrabee.

at 8/26/2008 9:33:38 PM, someDude said:
Your arguments seem to involve a lot of handwaving. I have done a lot of distributed, SMP, and GPGPU programming. While these techniques all try to exploit parallelism, they all have different strengths and weaknesses and thus they tend to require different tools. While MPI is nice for distributed programming, it is often less than ideal for an SMP environment. GPGPU requires a different approach altogether. Look at how the Cell processor needed a whole bunch of new tools despite it being closer to existing multi-core designs than Larrabee. The fact that Larrabee uses x86 rather than PowerPC instructions is irrelevant. I would love for someone to point out a single existing tool that would really help programmers take advantage of Larrabee.

©1997-2008 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.