Feature
FROM EDN EUROPE: Mixing it: DSPs and MCUs fall in together
Once in separate chapters of the microprocessor data book, control processors and DSPs are being brought together to present an optimal solution, at the lowest cost, for a wide range of consumer and industrial tasks.
By Graham Prophet, Editor -- EDN Europe, 1/6/2005
When semiconductor device manufacturers seek to differentiate their products, they often invoke "the real world." They remind digital-system designers that the real world is analogue, and that linear circuitry is always present around their designs to interface to the environment. It's unlikely that many processor system designers will have forgotten the point, but it's a handy marketing peg to hang a product story on. Similarly, competing product families almost invariably distinguish between conventional and digital-signal-processor architectures.
As with the analogue/digital distinction, designers frequently find that their products don't respect the boundary between DSPs and microcontrollers, and that a blend of the two would provide a better solution. However, the optimal balance between the resources that the two camps offer typically differs significantly between applications. An eight-bit μC may suit a consumer product that runs basic switch handling, timing, and control functions, but may require an upgrade to add connectivity, voice recognition, or some other operation that involves signal processing. Coding the signal-processing function onto a conventional μC is often possible, but can require considerable skill, as well as a switch to a much more powerful and expensive device. It's also possible that the processing power the extra feature requires could dwarf the resources that the basic product needs. While a small DSP chip might handle the signal processing, adding a separate DSP inevitably increases a product's build cost, maybe with more impact on the final selling price than the added feature justifies. Conversely, in an application with significant signal-processing content, a modest DSP might be very capable of handling the signal path, but could present a real programming challenge if it must simultaneously handle keyboard and interface functions within the same device.
Two into one does go
Recognizing this situation, semiconductor vendors have for some time offered hybrid processors that combine features from conventional and DSP devices. Recently, vendors have significantly increased the range and versatility of such hybrids, with new products that span all levels of complexity and performance. In this product sector, two fundamentally different approaches to chip design dominate: multi-core and unified-core architectures. Essentially, the device designer can either place a control processor and a DSP on a single chip, or extend the instruction set of a control processor to accommodate DSP instructions (or vice versa). The point where a chip has sufficient DSP capability to qualify as a hybrid is open to question, since there are few signal-processing functions that ingenious programmers can't code onto a conventional core, given enough raw performance. The addition of a hardware multiply/accumulate (MAC) block and its instruction set is a good indicator of the capability that a mixed-function application needs.
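To make the multiply/accumulate idea concrete, the following is a minimal sketch, in Python rather than device code, of the arithmetic that a MAC block performs. It assumes 16-bit Q15 fractional operands and an accumulator wide enough never to overflow; the names `to_q15` and `mac_dot_q15` are illustrative and do not come from any vendor's library.

```python
from typing import Sequence

Q15_ONE = 1 << 15                         # scale factor: 1.0 maps to 32768
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def to_q15(x: float) -> int:
    """Convert a float in [-1, 1) to a saturated Q15 integer."""
    return max(INT16_MIN, min(INT16_MAX, round(x * Q15_ONE)))


def mac_dot_q15(a: Sequence[int], b: Sequence[int]) -> int:
    """Dot product of two Q15 vectors, returned as a saturated Q15 value.

    Each loop iteration is one multiply/accumulate step, the operation
    that a hardware MAC block is built to perform.
    """
    acc = 0                               # wide accumulator (Python ints do not overflow)
    for x, y in zip(a, b):
        acc += x * y                      # Q15 * Q15 gives a Q30 product
    result = (acc + (1 << 14)) >> 15      # round, then rescale Q30 back to Q15
    return max(INT16_MIN, min(INT16_MAX, result))


if __name__ == "__main__":
    a = [to_q15(0.5), to_q15(0.25)]
    b = [to_q15(0.5), to_q15(-0.5)]
    print(mac_dot_q15(a, b) / Q15_ONE)    # 0.5*0.5 + 0.25*(-0.5)
```

On a conventional core without a MAC unit, each pass through that loop expands into separate multiply, add, and overflow-handling steps, which is where the extra programming skill and raw performance mentioned above get spent.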
Microchip Technology's PIC is a familiar presence in the 8/16-bit microcontroller market, and DSP capability is now available within its dsPIC30 family (Figure 1). The most recent additions are the 30F5011 and 30F5013 parts, with performance in the 30 MIPS range. Operating across industrial and extended temperature ranges, both are flash-based chips with 66-kbyte memories.
Microchip's approach combines its 16-bit, modified Harvard-style RISC core with DSP instructions that provide a tightly coupled instruction stream. The "standard toolchain - plus" development path that the company offers typifies the approach that many suppliers take. The basic platform remains the same as for the standard processor range, which in Microchip's case comprises its MPLAB integrated-development-environment and toolset. Accommodating the DSP functionality comes from extensions to areas such as the compiler. Microchip's European business development manager, Steve Diaper, acknowledges that designers will have to learn the basics of DSP to exploit the added features, but considers the extra knowledge they require to begin effective work is minimal. Diaper sees the uses of hybrid-type products falling into two distinct groups. The first focuses on a particular applications sector, where the DSP functions are central to the main task; the second is a general-purpose sector that provides an upward migration path for 8- and 16-bit μC users. Examples of the first group include applications such as motor control, where special-purpose variants also include hardware blocks such as programmable PWM generators and quadrature-encoder interfaces. General-purpose applications include voice or fingerprint recognition, or connectivity, such as software modems and TCP/IP communications. Microchip illustrates the impact of the incremental load that such applications create with two of its demonstration boards. One shows a conventional PIC18 μC that runs a cut-down TCP/IP stack leaving little headroom for other control functions. By contrast, a full TCP/IP stack running on a dsPIC30-based board has very limited impact on the hybrid processor's control bandwidth.
DSP capability also enables a range of processing options, such as FFTs and digital filters, which conventional processors find hard to handle. Diaper notes that most users are working with libraries to implement DSP functions. Design kits for hybrid processors typically come with pre-coded routines for common functions that programmers can call from high-level code. As with any DSP, programming efficiency increases if the programmer optimizes the code in and around the inner loops of the program - unlike most control code, DSP routines are typically highly iterative.
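As an illustration of why such routines are so iterative, here is a minimal sketch of a direct-form FIR filter, one of the digital filters that a vendor library would normally supply in optimized form. It is written in Python with floating-point values purely for readability; the function name `fir_filter` is an assumption for this example and is not taken from any supplier's kit.

```python
from collections import deque
from typing import Iterable, List, Sequence


def fir_filter(samples: Iterable[float], taps: Sequence[float]) -> List[float]:
    """Apply a direct-form FIR filter: y[n] = sum over k of taps[k] * x[n - k]."""
    # Delay line holding the most recent len(taps) inputs, newest first.
    history = deque([0.0] * len(taps), maxlen=len(taps))
    output = []
    for x in samples:
        history.appendleft(x)             # the oldest sample drops off the far end
        acc = 0.0
        # Inner loop: one multiply/accumulate per tap, repeated for every sample.
        for coeff, past in zip(taps, history):
            acc += coeff * past
        output.append(acc)
    return output


if __name__ == "__main__":
    # A four-tap moving average settling on a constant input.
    print(fir_filter([4.0] * 5, [0.25] * 4))
```

For a filter with N taps, the inner loop runs N times per input sample, so almost all of the execution time sits in those two lines. That is why optimization effort concentrates there.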
Start with a DSP
Chip designers can also create hybrid μCs by starting with a DSP and adding control functionality. At Texas Instruments (TI), this approach leads to the C2000 branch of its TMS320 family. The C2000 family offers DSP performance in the 20 to 40 MIPS range using 16-bit architectures, or up to 150 MIPS in 32-bit fixed-point chips. TI has recently added the R2812 and R2811 variants with up to 20k words of on-chip SRAM. Unlimited external memory is available via an SPI interface. The 150 MIPS devices have 32-bit MAC capability, with peripherals including a 12-bit ADC that suits applications such as high-resolution measurements and metering. Software support includes a math library that writes in a 32-bit floating-point format, and a filter package that runs under The MathWorks' Matlab.
Freescale adopts a similar approach in deriving the 56800 family of hybrid processors from its 56000 architecture. Freescale targets these devices at a relatively high level, adding control capability to a DSP that's sufficiently powerful to tackle algorithms such as advanced motion control. Again, this is a unified architecture that suits efficient programming in C, with the instruction set combining DSP and controller functions. One of the most recent additions to the product line is the 56F8365, a 60 MIPS machine with 512 kbytes of flash memory, a single-cycle 16×16 multiplier, and four 36-bit accumulators. Because this device targets motion control, peripherals again include PWM outputs and encoder inputs, with CAN interfaces that further promote use in automotive and industrial environments. Other devices in the same family have peripheral sets that better suit, for example, security and medical applications. Development is via Metrowerks' CodeWarrior toolchain, with programming guidance available from the proprietary Processor Expert software.
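The value of an accumulator wider than the 32-bit product of a 16-by-16 multiply can be estimated with a little arithmetic. The sketch below, in Python, counts how many worst-case signed products fit before the accumulator overflows. It assumes plain integer products with no shifting before accumulation, which the article does not specify; a device that scales its products differently would have different headroom. The function name is hypothetical.

```python
def worst_case_mac_headroom(operand_bits: int = 16, accumulator_bits: int = 36) -> int:
    """Number of worst-case signed products that fit in the accumulator.

    Assumes two's-complement operands and unshifted integer products.
    """
    # Largest product magnitude: (-2^15) * (-2^15) = 2^30 for 16-bit operands.
    largest_product = (1 << (operand_bits - 1)) ** 2
    # Largest positive value a signed accumulator of this width can hold.
    accumulator_max = (1 << (accumulator_bits - 1)) - 1
    return accumulator_max // largest_product


if __name__ == "__main__":
    print(worst_case_mac_headroom(16, 36))   # 36-bit accumulator
    print(worst_case_mac_headroom(16, 32))   # no guard bits
```

Under these assumptions, a 36-bit accumulator absorbs 31 worst-case products, whereas a 32-bit accumulator absorbs only one. In practice real signals rarely sit at full scale, so the usable run of accumulations is normally much longer.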
Analog Devices (ADI) similarly pitches its Blackfin processors at a very high level of processing power, with recent introductions including dual-core variants with clock rates as high as 750 MHz. Taking advantage of the processor's media-engine background, ADI proposes concepts such as the single-processor head unit for signal processing that simplifies automotive dashboards. In this scheme, the processor handles all signal processing as separate threads - including AM/FM radio, multi-channel audio, and GPS - while having sufficient control capability to handle the user-interface and various displays. Devices elsewhere in the family sport interfaces, such as USB 2.0 and 100-Mbit Ethernet, that enable consumer multimedia designs. Rather than using a term such as digital-signal-controller - which more than one vendor employs - ADI calls the Blackfin an embedded media processor. It has a 32-bit RISC instruction set, with MAC blocks and dedicated media processing blocks, such as video engines, that depend on application focus. Blackfin family members operate as pure DSPs, μCs, or as any mixture of the two.
Multicore architectures form yet another approach to supporting DSP and control tasks on the same piece of silicon. One prominent example is Infineon's eponymous TriCore, which combines a peripheral processor for realtime tasks with a DSP for data flow operations and a RISC engine for overall supervision and computational throughput. Although it uses separate core blocks, the TriCore employs a single 32-bit (4 Gbyte) address space. (See Reference 1 for more on using this innovative architecture.) Hyperstone, a fabless semiconductor design house, is another company that uses separate RISC and DSP engines in its products.
Enter the hybrid RTOS
Designs at this level of complexity inevitably require real-time operating system support. Specific to the Blackfin hardware, RTOS supplier Quadros offers its RTXC/dm convergent RTOS that aims to optimize both RISC and DSP code (Figure 2). RTXC/dm combines attributes of the company's existing RTXC/ms (control processing) and /ss (data flow/signal processing) products. The system runs data flow processes as threads that have a higher priority than control tasks. Control tasks operate through an API whereas signal processing threads do not, resulting in service calls in the DSP portion of the code being three to five times faster than for the event-driven control domain. Quadros says that such measures address the different runtime needs of RISC and DSP application code. Because events typically drive control code, there are frequent changes in program flow. By contrast, DSP code needs to perform repetitive data manipulations and run to completion in a definite time, meantime responding to rapidly changing datasets. According to Quadros' president Tom Barrett, RTXC/dm combines a minimal-context executive for the DSP threads with a protected, prioritized, pre-emptive kernel for control tasks. Barrett asserts that this is the first RTOS to reflect, in the programming domain, the flexibility that hardware convergent architectures offer. He adds that the internal RTOS coding is "95% in C, and not very much is tightly bound to the specifics of the Blackfin" - meaning that the principle can and will be applied to other unified processor architectures. ADI's European marketing director, Stefan Steyerl, cites the ADSP-BF531 as a family member that the RTOS suits for cost-constrained applications, such as video surveillance. This chip is a 400 MHz device with four serial ports and a 16-bit external bus in a low-cost package, competing with multi-chip solutions that Steyerl estimates would cost three to four times as much.
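The two-class arrangement described above, with data-flow threads always dispatched ahead of control tasks, can be sketched in a few lines. The Python below is a toy model only: it is not the RTXC/dm API, every name in it is hypothetical, and it runs each item to completion without modelling pre-emption, context switching, or timing.

```python
import heapq
from typing import Callable, List, Tuple

# Lower number means higher priority. These two levels are illustrative only.
DATA_FLOW, CONTROL = 0, 1


class TwoClassScheduler:
    """Dispatches ready work so data-flow threads always run before control tasks."""

    def __init__(self) -> None:
        self._ready: List[Tuple[int, int, Callable[[], str]]] = []
        self._seq = 0                     # tie-breaker keeps FIFO order within a class

    def make_ready(self, kind: int, work: Callable[[], str]) -> None:
        """Queue a unit of work in the given class."""
        heapq.heappush(self._ready, (kind, self._seq, work))
        self._seq += 1

    def run(self) -> List[str]:
        """Run every ready item to completion, highest priority first."""
        trace = []
        while self._ready:
            _, _, work = heapq.heappop(self._ready)
            trace.append(work())
        return trace


if __name__ == "__main__":
    sched = TwoClassScheduler()
    sched.make_ready(CONTROL, lambda: "keypad scan")
    sched.make_ready(DATA_FLOW, lambda: "audio block")
    sched.make_ready(CONTROL, lambda: "display update")
    sched.make_ready(DATA_FLOW, lambda: "radio block")
    print(sched.run())
```

Even in this simplified form, the ordering shows the intent: signal-processing work with hard completion deadlines drains first, and event-driven control work fills the remaining time.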
Back in the 30 MIPS space, designs can also benefit from RTOS support. For example, Microchip offers three levels of RTOS for its dsPIC - a basic scheduler; scheduling with thread support; and a full RTOS with timing analysis. However, the company says that many of its design wins operate without using an RTOS at all.
Configurable options
A radical alternative to off-the-shelf solutions involves designing a complete processor core to suit the project in hand, using one of the configurable processor solutions from vendors such as Tensilica or ARC. This is unlikely to be a realistic option outside of very high volume applications, when SoC/ASIC implementations can create highly silicon-efficient solutions.
Configurable processor architectures use the concept of a "core core", which allows the designer to extend the instruction set and the hardware by directly extracting the data manipulations that the application's algorithms require. Although not a literal equivalent of the converged processor, such algorithms can be signal-processing operations, and the technique offers an alternative route to equivalent functionality. For example, as part of the design process, Tensilica's design software generates a new variant of the compiler and toolchain that's specific to the individual processor instance. This approach most obviously lends itself to building up complexity, adding hardware as part of the configuration process to address complex tasks. Conversely, ARC recently stated that a significant proportion of its design wins exploit the configurability of the processor in the opposite direction - defining the key algorithms, adding instructions as necessary, but then stripping the design back to the minimum configuration that fulfils the application. Here of course, the objective is to minimize silicon area for lowest possible cost.
A variant on this approach is the processor architecture that start-up Stretch introduced (Reference 2). Stretch employs a Tensilica core with tightly bound programmable logic in the datapath. The designer sets up the programmable logic elements to carry out exactly the data operations the application requires.
Also in the processor IP space, demand for added DSP functions drives product announcements such as ARM's OptimoDE. This algorithm-centric configurable core technology works from a base library of DSP functions to configure a signal processing datapath. It also produces a dedicated C compiler variant for the new design. Similarly, MIPS announced its DSP ASE (application-specific extension) technology that can improve signal-processing performance by over 300%. ASE comes with a suite of software development tools and DSP library code.
You can reach Editor Graham Prophet at gprophet@reedbusiness.com.
For more information...
For more information on products such as those discussed in this article, contact any of the following manufacturers directly, and please let them know you read about their products in EDN Europe.
| Analog Devices www.analog.com | ARC www.arc.com | ARM www.arm.com |
| Freescale www.freescale.com | Hyperstone www.hyperstone.com | Infineon www.infineon.com |
| The Mathworks www.themathworks.com | Microchip www.microchip.com | MIPS www.mips.com |
| Quadros Systems www.quadros.com | Stretch www.stretchinc.com | Tensilica www.tensilica.com |
| Texas Instruments www.ti.com | | |
| References |