Tuesday, June 16, 2009
Multiprocessing #3: Things to consider with multicore
When programming multicore, debugging becomes exponentially more difficult (or at least polynomially more difficult, for the literalists out there) as the number of cores grows. This is because there are so many more possible connections to be made: each of N cores can in theory talk to N-1 others, which gives N(N-1)/2 possible core-to-core links. When the cores were spread across a board, a logic analyzer gave you a lot of visibility into the traffic between them. But when the interconnect is “hidden” on the chip between cores, the architecture has to be carefully designed to expose the right debug data without adding a large wiring overhead to the chip. A chip that cannot be debugged is virtually useless in the multicore space.
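To put rough numbers on how quickly the possible connections grow, here is a minimal sketch. It assumes a fully connected topology and counts each pair of cores once; the function name pairwise_links is made up for this illustration, and real interconnects rarely wire every core to every other.

```python
from math import comb

def pairwise_links(n_cores: int) -> int:
    """Number of distinct core pairs, i.e. possible point-to-point links.

    Assumption: every core may talk directly to every other core.
    """
    return comb(n_cores, 2)  # same as n_cores * (n_cores - 1) // 2

for n in (2, 4, 8, 16, 64):
    print(f"{n:3d} cores -> {pairwise_links(n):5d} possible links")
```

Even under this simple model, going from 4 to 64 cores takes the count from 6 links to 2016, which is why on-chip debug visibility has to be planned rather than bolted on.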
Another item to consider: guaranteeing a certain order of events is critical when you have complex interactions between multiple programs. Even in a single-core embedded solution this was non-trivial, because interrupts and context switches are needed to meet real-time requirements. For multicore the problem becomes much more challenging, especially when both the order of events and the time to complete them have to be considered. Many of the trickiest debug problems in multicore programming can be traced back to events occurring in an unexpected order.
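As an illustration of making an ordering explicit instead of hoping for it, here is a small sketch using Python threads as a stand-in for programs running on separate cores. The task names ("configure", "use") and the function run_ordered are hypothetical; the point is only that a synchronization object, not thread start order, pins down the sequence.

```python
import threading

def run_ordered() -> list:
    """Run two threads and return the order in which their events were logged."""
    log = []                          # list.append is atomic in CPython
    configured = threading.Event()    # signals that "configure" has happened

    def configure_task():
        log.append("configure")
        configured.set()              # publish: the dependent task may proceed

    def use_task():
        configured.wait()             # block until configure_task has signalled
        log.append("use")

    # Start the dependent thread first on purpose: the Event, not the start
    # order, is what guarantees the sequence.
    user = threading.Thread(target=use_task)
    configurer = threading.Thread(target=configure_task)
    user.start()
    configurer.start()
    user.join()
    configurer.join()
    return log

print(run_ordered())
```

Removing the wait() call leaves the order up to the scheduler, which is the kind of bug that may appear only once in many runs and is therefore hard to reproduce.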
There are roughly two types of thread organization seen in embedded systems. In one case there are many independent threads arriving stochastically, and the goal is to make sure they all get access to resources in a reasonable time frame. Total throughput is more important than the delay in processing a particular thread. IP packet processing is a good example of this. In the other case, real-time constraints mean that latency is more important than throughput, and the processing load will tend to be bursty and periodic. Different multicore systems may do one or the other of these well.
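The two goals are measured differently, which a short sketch can make concrete. The job timings below are invented purely for illustration, and throughput_and_worst_latency is a hypothetical helper, not a standard metric definition.

```python
def throughput_and_worst_latency(jobs):
    """jobs: list of (arrival_time, completion_time) pairs, in seconds.

    Returns (jobs completed per second over the observed window,
             longest time any single job waited from arrival to completion).
    """
    first_arrival = min(arrival for arrival, _ in jobs)
    last_completion = max(done for _, done in jobs)
    throughput = len(jobs) / (last_completion - first_arrival)
    worst_latency = max(done - arrival for arrival, done in jobs)
    return throughput, worst_latency

# Made-up schedule A: 8 jobs all arrive at t=0 and drain one every 0.5 s.
packet_style = [(0.0, 0.5 * (k + 1)) for k in range(8)]

# Made-up schedule B: one job per second, each finished 0.5 s after arrival.
realtime_style = [(float(k), k + 0.5) for k in range(4)]

print(throughput_and_worst_latency(packet_style))
print(throughput_and_worst_latency(realtime_style))
```

With these numbers, schedule A completes more jobs per second but makes its last job wait 4 seconds, while schedule B completes fewer jobs per second and never keeps one waiting more than half a second.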
Finally, the programming model used to program a multicore device will dictate the kinds of algorithms that you can implement. C-based language extensions and APIs for multiprocessing are beginning to emerge, such as OpenMP and MPI in the high performance computing (HPC) space, MCAPI in the embedded space, and OpenCL/CUDA in the graphics space. A given programming model may or may not suit a particular application space, or may require a piece of code to be rewritten in a completely different way before it can be optimized to a reasonable level.
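To give a feel for what "rewritten in a completely different way" can mean, here is a hedged sketch of the same computation written twice: once as a loop with a running total, and once restructured into independent chunks whose partial results are combined at the end, similar in spirit to a reduction clause in OpenMP. Python's thread pool stands in for a real multicore programming model here; CPython threads will not actually speed up this CPU-bound work, and the function names and the n_workers parameter are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def sequential_sum_of_squares(values):
    total = 0
    for v in values:              # each iteration updates the shared total
        total += v * v
    return total

def chunked_sum_of_squares(values, n_workers=4):
    # Split the input into at most n_workers independent chunks.
    chunk = max(1, -(-len(values) // n_workers))   # ceiling division
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]

    def partial_sum(part):
        # No shared state: each worker touches only its own chunk.
        return sum(v * v for v in part)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, parts))

    return sum(partials)          # combine step (the "reduction")

data = list(range(1000))
print(sequential_sum_of_squares(data) == chunked_sum_of_squares(data))
```

The restructured version gives the same answer, but only because addition can be regrouped freely; an algorithm whose steps genuinely depend on each other would need a deeper redesign, or might not fit this model at all.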
- Alan Gatherer, CTO and TI Fellow, High-Performance and Multicore Processing Business, Texas Instruments
If you missed the previous guest post on multiprocessing, check out what David Stewart from CriticalBlue had to say about multicore.
© Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.
