Getting a grip on cloud computing: NVIDIA, the Playstation-3, and big iron
Ron Wilson - August 27, 2009
With all the talk about cloud computing today, there’s surprisingly little concrete definition behind the chatter. And without specifics the importance of the idea for silicon designers remains (sorry) nebulous. That hasn’t stopped speculation, of course. At Hot Chips this week many silicon experts were willing to conjecture on just what’s going on and what it means for chip architecture.
It appears that some patterns are finally forming in the haze. For instance, when people say cloud, they are referring to a hierarchy composed of three kinds of gross structures. At the lowest level are computing kernels. These are processor cores or groups of cores enclosed within a secure perimeter and united by a single coherent address space. By this definition a kernel could be as limited as an application processor in a smart phone, as familiar as a multicore blade, or as huge as the 256-core coherent symmetric-multiprocessing system envisioned by the architects of the IBM Power7.
At the next level are clusters. These are assemblies of kernels connected by some sort of private local-area-network link carrying a message-passing protocol. Communication between tasks in different kernels is necessarily explicit, lower in bandwidth, and less deterministic than communication between tasks within a kernel.
The third level includes the systems formed when we connect clusters through public networks. Communication between clusters via wireless services or the Internet is necessarily even slower and less reliable than communication within a cluster. More significantly, communication between clusters must cross security perimeters.
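To make the distinction between the first two levels concrete, here is a minimal sketch, ours rather than anything from the conference, of the two communication styles. The thread-and-pipe plumbing is stand-in machinery: within a kernel, threads share one coherent address space and communicate implicitly; between kernels in a cluster, the same data must be marshaled into an explicit message and pushed across a link.

    // Kernel level vs. cluster level, in one file. A POSIX pipe stands in
    // for the cluster's private network link; a real system would use a
    // socket and a message-passing library.
    #include <atomic>
    #include <thread>
    #include <unistd.h>   // pipe, read, write
    #include <cstdio>

    std::atomic<long> shared_counter{0};   // coherent memory: every thread sees it

    int main() {
        // Kernel level: two threads communicate implicitly through shared memory.
        std::thread t1([]{ for (int i = 0; i < 1000; ++i) shared_counter++; });
        std::thread t2([]{ for (int i = 0; i < 1000; ++i) shared_counter++; });
        t1.join(); t2.join();
        std::printf("within a kernel (shared memory): %ld\n", shared_counter.load());

        // Cluster level: the same value must cross a link as an explicit message.
        int fd[2];
        if (pipe(fd) != 0) return 1;               // stand-in for a LAN socket
        long payload = shared_counter.load();
        write(fd[1], &payload, sizeof payload);    // sender marshals and transmits
        long received = 0;
        read(fd[0], &received, sizeof received);   // receiver must explicitly read
        std::printf("between kernels (message passing): %ld\n", received);
        return 0;
    }

The point of the sketch is the asymmetry: the first exchange is implicit and nearly free, while the second costs system calls and serialization, which is why inter-kernel communication is lower in bandwidth and less deterministic.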
From this frame of reference, Hot Chips produced some interesting ideas. One is that the build-out of the cloud may not take the form many people are expecting. Much of the press on cloud computing has come from a handful of organizations that own gigantic server farms and would really like to sell computing time to increase utilization, without getting involved in the support costs of software-as-a-service. Hence it’s easy to default into thinking of the cloud as a handful of big server farms that don’t happen to be busy doing searches. In this view of the cloud, by the above definitions, kernels are server boards and clusters are server farms.
But several of the presenters at Hot Chips offered alternative views. One came from NVIDIA CEO Jen-Hsun Huang. In a presentation that was basically about how neat NVIDIA’s toys are, Huang put a lot of emphasis on the scientific-computing capabilities of the company’s GPUs. He took the argument further, claiming that today both the personal-computing multi-core CPU and the GPU have the volume, the R&D budget, and the technology to become the kernels in the cloud. He envisioned the basic kernel of a cluster as a general-purpose CPU to handle inherently sequential tasks, tightly coupled to a GPU to handle tasks amenable to SIMD parallelism.
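A minimal CUDA sketch, ours rather than Huang’s, of that division of labor: the host CPU runs the inherently sequential setup, then hands the data-parallel inner loop to the GPU, one lightweight thread per element. The kernel, sizes, and scale factor are arbitrary illustrations.

    #include <cstdio>
    #include <cuda_runtime.h>

    // GPU side: the SIMD-friendly part, one thread per array element.
    __global__ void scale(float *y, const float *x, float a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i];
    }

    int main() {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);
        float *hx = new float[n], *hy = new float[n];
        for (int i = 0; i < n; ++i) hx[i] = float(i);      // sequential CPU work

        float *dx, *dy;
        cudaMalloc(&dx, bytes);
        cudaMalloc(&dy, bytes);
        cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice); // across the CPU-GPU pipe

        scale<<<(n + 255) / 256, 256>>>(dy, dx, 2.0f, n);  // data-parallel on the GPU
        cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost); // results back to the CPU

        std::printf("y[42] = %f\n", hy[42]);
        cudaFree(dx); cudaFree(dy);
        delete[] hx; delete[] hy;
        return 0;
    }

The two cudaMemcpy calls are the high-bandwidth CPU-GPU pipe in miniature; everything between them is the GPU’s SIMD territory.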
In fact, Huang said, NVIDIA is already engaged in some aggressive cloud-seeding, having sited about a thousand clusters of GPU-based computing platforms in, among other places, about a hundred universities around the world. He offered a view of such a cluster in the near future, in which a single GPU contains not hundreds but thousands of programmable shading engines, and a cluster comprises thousands of GPUs.
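At the node level, the building block of such a cluster might look like the hedged sketch below: enumerate however many GPUs the box holds and fan work out to each in turn. The kernel is a placeholder; only cudaGetDeviceCount and cudaSetDevice are doing the architectural work here.

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void work(float *buf, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) buf[i] = buf[i] * 2.0f + 1.0f;   // placeholder per-element work
    }

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);                 // GPUs in this one node
        const int n = 1 << 16;
        for (int d = 0; d < count; ++d) {
            cudaSetDevice(d);                       // target each GPU in turn
            float *buf;
            cudaMalloc(&buf, n * sizeof(float));
            cudaMemset(buf, 0, n * sizeof(float));
            work<<<(n + 255) / 256, 256>>>(buf, n); // placeholder kernel on device d
            cudaFree(buf);                          // implicit sync: waits for the kernel
        }
        std::printf("dispatched work to %d GPU(s)\n", count);
        return 0;
    }

A real dispatcher would keep per-device buffers live so the launches overlap rather than serialize; scaling from one node to Huang’s thousands of GPUs is then the cluster’s job, with message passing in place of cudaSetDevice.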
An alternative view came from an entirely different source: Kaveh Massoudian, CTO of power.org and strategic alliances director for the Systems and Technology Group at IBM. He pointed out that the Sony Playstation-3, which contains the somewhat legendary Cell Broadband Engine processor, is in fact arguably a cluster, and is "the lowest-cost computing element for the power available today." Massoudian’s vision could be described, in our terms, as a multitude of single-PS-3 clusters linked through public networks to form a dispersed but enormously powerful cloud. (For another view of this possibility, see here.)
Both views have their difficulties, of course. Even with the C-like CUDA language, there is no escaping the fact that the closer you get to the hardware of the NVIDIA GPU, the less it looks like a cluster of general-purpose computers. There are limits to what the hardware can physically do (the absence or scarcity of hardware double-precision floating point, for instance), and serious difficulties in software when the structure of a problem differs markedly from the structure of the image-rendering job for which the chips were originally designed. That said, the GPU has an enormous maximum processing rate and a very high-bandwidth pipe to a multi-core CPU.
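The structural mismatch has a concrete signature in CUDA code. The toy kernel below, our illustration rather than anything from the talk, shows one form of it: threads in a warp execute in lockstep, so a data-dependent branch, which costs a cluster of general-purpose computers nothing, serializes the two paths and idles half the hardware.

    #include <cstdio>
    #include <cuda_runtime.h>

    // A data-dependent branch: fine on a CPU cluster, costly inside a warp.
    __global__ void divergent(float *out, const float *in, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        if (in[i] > 0.0f)
            out[i] = in[i] * in[i];   // half the warp runs this while the rest idles...
        else
            out[i] = -in[i];          // ...then the roles reverse
    }

    int main() {
        const int n = 256;
        float h[n], *din, *dout;
        for (int i = 0; i < n; ++i) h[i] = (i % 2) ? 1.0f : -1.0f;  // worst case: alternating signs
        cudaMalloc(&din, sizeof h);
        cudaMalloc(&dout, sizeof h);
        cudaMemcpy(din, h, sizeof h, cudaMemcpyHostToDevice);
        divergent<<<1, n>>>(dout, din, n);
        cudaMemcpy(h, dout, sizeof h, cudaMemcpyDeviceToHost);
        std::printf("out[0] = %f, out[1] = %f\n", h[0], h[1]);
        cudaFree(din); cudaFree(dout);
        return 0;
    }

Image rendering rarely branches this way; irregular general-purpose workloads often do, which is the software difficulty in a nutshell.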
The PS-3 is a very different kind of cluster, with heavy-duty general-purpose computing resources, parallel floating-point hardware, and very fast internal pipes, but a much smaller total number of computing elements than a GPU. Of course, the PS-3 also has a GPU. Connections between PS-3s, however, must be made over relatively slow public networks. So the largest cluster of PS-3s, by our definition, is a single unit.
Either of these approaches could, however, sneak up on the assumption that the cloud will comprise interconnected clusters of big-iron servers. The horsepower is there, the smaller systems have the advantage of cost and volume, and both are inherently more friendly to open-source, collaborative computing efforts. It could be a contest worth watching.