Friday, June 12, 2009
Multiprocessing taxonomy #2
[Housekeeping]: There are two sets of posts going out in this series – the series here and the series in our guest blog. Today’s guest post is from David Stewart at CriticalBlue about multicore programming. I encourage you to read both series of posts as they are intended to be complementary. David was kind enough to avoid all marketing in his post, but I did expect him to at least mention his company’s multicore programming Prism tool – so I will mention it here. [End Housekeeping]
I believe that a shared multiprocessing taxonomy can help us, as a community of designers, converge on a common vocabulary and understanding of the key requirements of each application domain across chip architects, board-level-system designers, software developers, and development-tool providers. With that common ground established, all of the mentioned stakeholders can cooperate more meaningfully and discuss how to allocate technical approaches among themselves, rather than fitting together ad hoc whatever each group builds.
The search for an easy way to implement ABM appears to be the Holy Grail for the adoption of multicore designs. The argument goes that when operating systems and/or compilers can automagically parallelize legacy and/or sequential code, the world will flock to multicore. On the surface this sounds plausible, but the assumption is incomplete.
ABM is similar to channel-based multiprocessing (CBM) in that a developer needs to identify and break out where there is parallelism in a system task. Both of these types of multiprocessing share this same first gnarly problem – how do you break a task into parallel steps to take advantage of the redundant processing elements available in the computing system? Software threads are a technique that developers can use to implement this (there will be a post on threads in this series).
ABM differs from CBM on the back-end of the processing chain. CBM processing consists of similar tasks that execute independently and in parallel throughout their lifecycle. ABM back-end processing, on the other hand, must reintegrate (or aggregate) the results from the parallel processing steps back into a single transformation before the task can complete its lifecycle. This crucial difference is the unique gnarly problem of designing ABM systems.
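To make the contrast concrete, here is a minimal Python sketch of the two shapes. It is an illustration only, not code from any system discussed in this post: the function names (`handle_channel`, `partial_histogram`, `aggregate`, `run_cbm`, `run_abm`) and the histogram workload are assumptions chosen for brevity.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_channel(samples):
    """CBM-style work: a channel is processed start to finish on its own."""
    return max(samples)


def partial_histogram(chunk):
    """ABM front-end: each worker counts values in its own chunk only."""
    counts = {}
    for value in chunk:
        counts[value] = counts.get(value, 0) + 1
    return counts


def aggregate(partials):
    """ABM back-end: fold the per-worker results into one combined result."""
    total = {}
    for counts in partials:
        for value, n in counts.items():
            total[value] = total.get(value, 0) + n
    return total


def run_cbm(channels):
    # Results stay separate; no step depends on more than one channel.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(handle_channel, channels))


def run_abm(data, workers=4):
    # Split the single task, process the pieces in parallel, then aggregate.
    size = max(1, -(-len(data) // workers))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_histogram, chunks))
    return aggregate(partials)
```

In the CBM case the job is done when each worker finishes. In the ABM case the task is not complete until `aggregate` has run, and that step has to be designed alongside the split. The thread pool here shows the structure only; in standard CPython, threads will not speed up CPU-bound work like this.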
I worked for many years developing fully autonomous vehicles that included vision systems. This type of vision differs greatly from machine inspection systems because nothing about the environment is controlled or fixed in place. A machine inspection system usually relies on the item of interest being placed within a confined area, within some set of orientation limits, under some set of lighting conditions, and within some movement limits. The vision system we worked on had no control over the item of interest, nor over the conditions under which the item would be observed.
The designs used multicore implementations to split the field of view (FOV) into multiple sections because the processing clock rates would not allow a single-core system to examine all of the image pixels in the permitted cycle time (think early 1990s). Splitting the FOV among the available DSP cores was the embarrassingly parallel (and easy) part of building the system.
After each DSP completed its processing on its portion of the FOV, the system needed to integrate or aggregate the results from all of the DSPs to account for those conditions when items in the FOV crossed over the boundary between two processing sections. In this case, it was easier to transform the raw data into an intermediate form and integrate those forms. The system then performed further processing on the integrated results.
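The split-then-integrate pattern described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the vision system's actual algorithm: it assumes a single binary scanline instead of a full image, uses runs of lit pixels as the hypothetical "intermediate form", and the names `find_runs`, `merge_runs`, and `detect_objects` are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def find_runs(pixels, offset):
    """Per-section step: return (start, end) runs of lit pixels, end exclusive,
    expressed in whole-scanline coordinates."""
    runs = []
    start = None
    for i, lit in enumerate(pixels):
        if lit and start is None:
            start = i
        elif not lit and start is not None:
            runs.append((offset + start, offset + i))
            start = None
    if start is not None:  # run reaches the end of this section
        runs.append((offset + start, offset + len(pixels)))
    return runs


def merge_runs(per_section_runs):
    """Aggregation step: join runs that meet exactly at a section boundary."""
    merged = []
    for runs in per_section_runs:
        for start, end in runs:
            if merged and merged[-1][1] == start:
                merged[-1] = (merged[-1][0], end)  # same item, split by the boundary
            else:
                merged.append((start, end))
    return merged


def detect_objects(scanline, num_sections):
    size = max(1, -(-len(scanline) // num_sections))  # ceiling division
    sections = [(scanline[i:i + size], i) for i in range(0, len(scanline), size)]
    with ThreadPoolExecutor(max_workers=num_sections) as pool:
        per_section = list(pool.map(lambda s: find_runs(*s), sections))
    return merge_runs(per_section)


scanline = [0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1]
print(detect_objects(scanline, 3))  # [(1, 3), (6, 10), (11, 12)]
```

With three sections of four pixels each, the item spanning pixels 6 through 9 is seen as two separate runs by two different workers. Only the merge step recognizes it as one item, which is why the aggregation cannot be skipped even though the split itself is trivial.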
One reason I did not include pipeline or stream architectures as a separate taxonomy category is that pipeline structures are useful, but not necessarily required, in several, if not all, of the identified categories. Our vision system used a pipeline structure to support the cycle rate we needed; this required us to compensate for additional latency sources (more in the feedback multiprocessing post to come).
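The trade-off between cycle rate and latency in a pipeline can be seen in a small simulation. The sketch below is a sequential, clock-by-clock model written for illustration; `run_pipeline` and the three arithmetic stages are hypothetical stand-ins and say nothing about how the vision system's stages were actually built.

```python
def run_pipeline(frames, stages):
    """Simulate a clocked pipeline; return (cycle, frame_id, result) tuples.

    regs[i] holds the (frame_id, value) pair waiting to enter stage i.
    """
    n = len(stages)
    regs = [None] * n
    feed = iter(enumerate(frames))
    outputs = []
    cycle = 0
    while True:
        regs[0] = next(feed, None)  # accept one new frame per cycle, if any
        if all(r is None for r in regs):
            break  # nothing left in flight
        nxt = [None] * n
        for i, item in enumerate(regs):
            if item is None:
                continue
            frame_id, value = item
            result = stages[i](value)
            if i == n - 1:
                outputs.append((cycle, frame_id, result))
            else:
                nxt[i + 1] = (frame_id, result)
        regs = nxt
        cycle += 1
    return outputs


stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
print(run_pipeline([10, 20, 30], stages))
# [(2, 0, 19), (3, 1, 39), (4, 2, 59)]
```

Once the pipeline is full, one result emerges every cycle, but each frame that enters at cycle k does not emerge until cycle k + 2. Any consumer of the results has to account for that fixed delay.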
As an industry, we lack a variety of mature software development tools that help developers break a software design and implementation into parallel parts. Even more critical though, for the broad adoption of multicore for legacy systems, are development tools that also assist developers in aggregating the parallel parts into a single form so that the system can continue processing.
I shared my experience with these vision systems to provide a reference point for ABM. Please share your experiences designing systems that exhibit these characteristics so that we, as an industry, can better identify the critical gnarly problems that developers designing these types of systems need help solving. If your response would be detailed and involved, consider submitting it to me for the guest blog so that we can more easily group comments appropriately.
If you missed the previous post in this series, check out my taxonomy introduction to this multiprocessing series.
© Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.
