Intel code lights road to many-core future
By Rick Merritt, EE Times - September 15, 2011
SAN FRANCISCO - Intel Corp. released open source code for
Parallel JS, a data-parallel version of JavaScript, in an effort to help
mainstream programmers harness multicore processors.
The tool marks one small step on a long journey to the many-core future, said Intel chief technology officer Justin Rattner in an interview with EE Times. In a Thursday keynote at the Intel Developer Forum here, Rattner demoed the new language and other research efforts aimed at easing parallel programming and reducing power consumption for PCs and servers.
Intel, Microsoft, Nvidia, and others have poured millions into university research to define the tools tomorrow's programmers will need for the many-core processors now on their drawing boards. To date, parallel programming has been confined to use by experts in highly specialized technical applications.
"We're making good progress, but there won't be one [programming] model--there will be multiple models," Rattner said in the interview.
Parallel JS represents one of those models. The language boosts performance for data-intensive, browser-based apps such as photo and video editing and 3-D gaming running on Intel chips. It is meant to appeal to mainstream Web programmers who use scripting languages.
Rattner demoed the language's capability to harness up to eight x86 cores on an Intel CPU for a high-end animation.
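As a rough illustration of the data-parallel idea, the Python sketch below applies one small function to every element of an array, splitting the array into chunks that separate workers can process independently. It is not Parallel JS and does not reflect Intel's API; the names `brighten` and `parallel_map`, the pixel values, and the default of eight workers (chosen only to echo the eight-core demo) are all assumptions made for the example. Python threads also do not give a real multicore speedup for this kind of work, so the sketch shows only the structure of the computation.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel, amount=40):
    """Hypothetical per-element function: depends only on its own input."""
    return min(255, pixel + amount)


def parallel_map(func, data, workers=8):
    """Apply func to every element, one contiguous chunk per task."""
    if not data:
        return []
    chunk_size = -(-len(data) // workers)  # ceiling division
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so chunks stay aligned
        mapped = pool.map(lambda chunk: [func(p) for p in chunk], chunks)
        return [value for chunk in mapped for value in chunk]


# Made-up grayscale pixel values, purely for illustration
pixels = [0, 100, 200, 250, 17, 230, 64, 128, 255, 5]
result = parallel_map(brighten, pixels)
```

Because each output element depends only on the matching input element, the chunks can run in any order or at the same time without changing the result, which is the property a data-parallel runtime relies on.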
"Most software written these days is in a scripting language like Java or Python, but to date those programmers have not had access to multicore tools," Rattner said. Parallel JS is "a pretty important step that gets us beyond the prevailing view that once you are beyond a few cores, multicore chips are only for technical apps," he said.
A future version of the language also will harness the graphics cores now embedded on Intel's latest processors. To that end, Rattner demoed a face recognition app that used both x86 and graphics cores.
"We are basically telling developers that it's time to think creatively about heterogeneous computing," Rattner said.
Many core, mobile outlook
In its labs, Intel also is working on ways to improve today's data-parallel tools used to run general purpose programs on graphics processors. Current tools such as OpenCL and Nvidia's Cuda use relatively low-level data primitives closely tied to hardware, Rattner said.
Intel is working on alternatives using higher-level programming abstractions such as nested vectors used in dense- and sparse-matrix arithmetic. The company could release those tools in 2012, Rattner said.
The new software represents an effort to bring to today's C++ programmers some of the concepts of the emerging school called functional programming.
"Functional programming looks to be one of the foundations for parallel programming going forward with higher levels of abstraction and more automation of parallelism," said Rattner. "The compiler can extract the parallelism and doesn't require the programmer to be as explicit as they have to be with OpenCL or Cuda," he added.
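One way to picture why a functional style leaves room for automatic parallelism: when a computation is built from a side-effect-free function and an associative combining step, the work can be split at any point and the partial results merged, with the same answer every time. The Python sketch below illustrates that property only. It is not Intel's tooling, and the names `square`, `sum_of_squares`, and `chunked_sum_of_squares` are hypothetical.

```python
from functools import reduce
from operator import add


def square(x):
    """Pure function: no shared state, output depends only on the input."""
    return x * x


def sum_of_squares(values):
    """Map a pure function over the data, then combine with an associative op."""
    return reduce(add, map(square, values), 0)


def chunked_sum_of_squares(values, n_chunks):
    """Same computation, split into independent pieces chosen by the caller."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division, at least 1
    partials = [
        sum_of_squares(values[i:i + size])
        for i in range(0, len(values), size)
    ]
    # Each partial could be computed on a different core; addition is
    # associative, so merging the partials gives the same total.
    return reduce(add, partials, 0)


data = list(range(1, 101))
total = sum_of_squares(data)
```

Here the programmer states what to compute, and the split into chunks is a separate decision that a compiler or runtime could make on its own.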
Beyond 2012, data-parallel techniques need a more fundamental change, Rattner said. Today's approaches handle tasks one at a time under strictly enforced schedulers, but that leaves some computer resources idle, wasting energy.
Tomorrow's approaches will be more asynchronous in nature but they are only at the concept stage. "Today we give up efficiency for programming convenience, but as we look to the future we can't afford to waste so much power," he said.
Throttling down PC power
In his keynote, Rattner showed work on two research projects specifically aimed at throttling back power consumption in computing.
A near-threshold voltage processor uses novel, low-voltage circuits that operate close to threshold levels. The concept CPU runs fast when needed but drops power to below 10 milliwatts when its workload is light. To demonstrate the approach, Intel built a Pentium-class chip that can run on a postage-stamp-sized solar cell.
The demo chip, called Claremont, runs just 100 millivolts above threshold to show five- to ten-fold reductions in power consumption compared to existing processors. "That's a huge number--people fight for 20% reduction, so this is almost unheard of," said Rattner.
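To put the quoted figures on one scale, a fold reduction can be converted to a percentage saving and back. The short Python sketch below does only that arithmetic; the function names are made up for the example, and it assumes "five-fold" means power drops to one-fifth of the baseline.

```python
def percent_reduction(fold):
    """Percent of power saved when consumption drops to 1/fold of baseline."""
    return (1 - 1 / fold) * 100


def fold_reduction(percent):
    """Fold reduction that corresponds to saving `percent` of baseline power."""
    return 1 / (1 - percent / 100)


five_fold = round(percent_reduction(5), 2)
ten_fold = round(percent_reduction(10), 2)
twenty_percent = round(fold_reduction(20), 2)
```

On that reading, a five-fold reduction is an 80% cut and a ten-fold reduction is a 90% cut, while the 20% saving Rattner cites as hard-won works out to a 1.25-fold reduction.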
Claremont uses only an L1 cache because associated memories still need to run several hundred millivolts above threshold, Rattner said.
Separately, researchers from Intel and Micron showed a prototype for a novel memory stacking technology they co-developed. The Hybrid Memory Cube combines a stack of DRAM die with a logic layer at the bottom, using a new interface and protocol to translate the memory information to a separate processor.
Bryan Casper, Intel's lead researcher on the project, claims the device is "the most energy efficient DRAM ever built when measured in number of bits transferred versus energy consumed." The prototype has 10 times the bandwidth and seven times the energy efficiency of the most advanced DDR3 memory modules available, he said.
Separately, Rattner demoed a standard x86 server acting as a base station, using new x86 signal processing algorithms on a Sandy Bridge CPU.
"I think we are maybe within a generation of having something worthy of production," he said. "I think Ivy Bridge probably turns the corner on delivering something very competitive to traditional DSP-based system."
At IDF, Intel engineers taught classes on new signal processing and packet processing developer kits for its processors.
This story was originally posted by EE Times.