
April 23, 1998


EDN's 1998 DSP 16-BIT Architecture Directory


Advanced RISC Machines Piccolo

A DSP-coprocessor module for ARM7 µP cores, Piccolo adds a 32-bit DSP instruction set and uses the ARM coprocessor interface to communicate with the ARM processor core. The µP and the Piccolo DSP share a memory bus; however, Piccolo fetches its own DSP code from on- or off-chip memory.

Piccolo's interface includes a tagged input-queue structure and an output FIFO buffer. The input queue, or reorder buffer, comprises eight 32-bit entries or 16 16-bit entries. It enables the ARM µP to preload Piccolo with data before Piccolo requires the data, essentially demultiplexing multiple input data streams for DSP algorithms. The reorder buffer allows ARM code to fetch DSP data or coefficients from memory in sequential bursts and allows the DSP code to consume the items in the required order. Piccolo automatically and transparently refills its registers from the reorder buffer as Piccolo uses and replaces old data. Register remapping allows ARM to logically remap physical registers within a loop to other register values. This ability effectively creates small circular buffers in the register banks. Piccolo sequentially returns results to the µP through the output FIFO buffer.
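The tagged input queue described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not Piccolo's actual interface: the class name `ReorderBuffer`, the methods `push` and `refill`, the string tags, and the rule that a 32-bit entry occupies two 16-bit slots are all hypothetical. It shows the ARM side loading data in sequential bursts while the DSP side consumes items in whatever order its algorithm requires.

```python
class ReorderBuffer:
    """Toy model of a tagged input queue (hypothetical, not the real hardware)."""

    SLOTS = 16  # sixteen 16-bit slots; a 32-bit entry is assumed to take two

    def __init__(self):
        self.entries = []  # (tag, width_bits, value), oldest first

    def free_slots(self):
        return self.SLOTS - sum(width // 16 for _, width, _ in self.entries)

    def push(self, tag, value, width=16):
        """ARM side: queue one item for the register named by tag."""
        if width not in (16, 32):
            raise ValueError("entries are 16 or 32 bits wide")
        if width // 16 > self.free_slots():
            return False  # buffer full; the ARM code must try again later
        self.entries.append((tag, width, value))
        return True

    def refill(self, tag):
        """DSP side: take the oldest item carrying this tag, or None if starved."""
        for i, (t, _, value) in enumerate(self.entries):
            if t == tag:
                del self.entries[i]
                return value
        return None


def dot_product_demo():
    rb = ReorderBuffer()
    for x in (1, 2, 3):        # first sequential burst: samples
        rb.push("x", x)
    for c in (10, 20, 30):     # second sequential burst: coefficients
        rb.push("c", c)
    acc = 0
    for _ in range(3):         # DSP consumes the two streams interleaved
        acc += rb.refill("x") * rb.refill("c")
    return acc, rb.refill("x")  # second value is None: the queue is drained
```

In this model, a `refill` that returns `None` corresponds to the DSP waiting for data that the ARM code has not yet loaded.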

The ARM µP handles all interrupts and data-address generation. Although the µP operates in parallel with Piccolo, the µP's performance degrades when the DSP is active, because Piccolo consumes the µP's bandwidth. In addition, because Piccolo reloads only from the reorder buffer, a data-intensive algorithm may starve the register file. Furthermore, the programmer must ensure that the ARM core responds to the needs of the DSP, because Piccolo cannot interrupt or notify the ARM core when Piccolo needs data or has full output buffers. ARM made this trade-off to achieve a smaller DSP core with minimal complexity.

Piccolo also features a private 128-instruction cache, a 16×16-bit single-cycle multiplier, a 32-bit barrel shifter, four 40-bit extended-precision accumulators, two saturation units, and register-based storage for 32 16-bit or 16 32-bit data items. Four of the registers double as 40-bit accumulators for some instructions. ARM engineers designed the saturation units to allow you to easily implement the arithmetic primitives that international telecommunication standards use. Piccolo also has a split ALU that provides single-cycle, dual 16-bit arithmetic and logical operations in one instruction word.
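As an illustration of what saturation and a split ALU provide, the following sketch models a dual 16-bit add in software. It is not Piccolo code: the function names `sat16`, `to_signed16`, and `dual_add16` are hypothetical, and the packing of two signed 16-bit halves into one 32-bit word is an assumption made for the example.

```python
INT16_MIN, INT16_MAX = -0x8000, 0x7FFF


def sat16(x):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, x))


def to_signed16(h):
    """Interpret the low 16 bits of h as a two's-complement value."""
    h &= 0xFFFF
    return h - 0x10000 if h & 0x8000 else h


def dual_add16(a, b, saturate=True):
    """Add the low and high 16-bit halves of two 32-bit words independently."""
    out = 0
    for shift in (0, 16):
        s = to_signed16(a >> shift) + to_signed16(b >> shift)
        if saturate:
            s = sat16(s)
        out |= (s & 0xFFFF) << shift  # repack; wraps when not saturating
    return out
```

For example, `dual_add16(0x7FFF0001, 0x00010001)` yields `0x7FFF0002`: the high halves clamp at the largest positive value while the low halves add normally. With `saturate=False`, the high half wraps to `0x8000`, which is the overflow behavior that saturating arithmetic avoids.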

Addressing modes

Piccolo provides hardware support for four nestable zero-overhead-loop constructs. Leveraging the resources of the ARM µP, the company limits Piccolo addressing to accessing data from the input and output buffers. Piccolo does not directly support bit-reversed addressing. Instead, ARM's FFT implementation performs the bit-reversed addressing on the ARM µP in parallel with the first stage of the FFT on Piccolo.
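The bit-reversed reordering that the ARM µP would perform can be sketched as follows. This is a generic illustration of bit-reversed indexing for a power-of-two FFT, not ARM's implementation; the function names `bit_reverse` and `bit_reversed_order` are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (index & 1)
        index >>= 1
    return r


def bit_reversed_order(samples):
    """Return the samples permuted into bit-reversed index order."""
    n = len(samples)
    bits = n.bit_length() - 1
    if n == 0 or n != 1 << bits:
        raise ValueError("length must be a power of two")
    return [samples[bit_reverse(i, bits)] for i in range(n)]
```

For an eight-point transform, `bit_reversed_order(list(range(8)))` gives `[0, 4, 2, 6, 1, 5, 3, 7]`. In the arrangement the article describes, the host processor would compute such an ordering while feeding data, so the DSP needs no bit-reversed addressing mode of its own.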

Special instructions

Unlike ARM µP instructions, Piccolo instructions are not conditionally executable. Piccolo offers intrinsic support for tasks such as Viterbi and bit manipulation. A repeat instruction indicates the loop size and the number of iterations, which can be a register value. The repeat instruction is uninterruptible; however, the ARM core can still service interrupts. You can nest repeat instructions four deep.
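The repeat semantics described above (a loop size, an iteration count, and nesting four deep) can be modeled with a toy interpreter. This is a sketch only: the tuple encoding `("repeat", size, count)` and `("op", name)`, the function `run`, and the requirement that size and count be at least one are assumptions for illustration, not Piccolo's instruction format.

```python
MAX_NEST = 4  # nesting depth stated in the article


def run(program):
    """Execute a toy program and return the names of the ops in execution order."""
    trace = []
    loops = []  # each entry: [first_pc, last_pc, remaining_iterations]
    pc = 0
    while pc < len(program):
        instr = program[pc]
        if instr[0] == "repeat":
            _, size, count = instr
            if size < 1 or count < 1:
                raise ValueError("this sketch assumes size >= 1 and count >= 1")
            if len(loops) == MAX_NEST:
                raise RuntimeError("repeat nested more than four deep")
            loops.append([pc + 1, pc + size, count])
            pc += 1
            continue
        trace.append(instr[1])
        next_pc = pc + 1
        # Close every loop that ends on this instruction, innermost first.
        while loops and pc == loops[-1][1]:
            loops[-1][2] -= 1
            if loops[-1][2] > 0:
                next_pc = loops[-1][0]  # branch back with no extra instruction
                break
            loops.pop()
        pc = next_pc
    return trace
```

As a usage example, the program `[("repeat", 3, 2), ("op", "load"), ("repeat", 1, 3), ("op", "mac"), ("op", "store")]` runs `load` followed by three `mac` operations twice, then `store` once. The loop bookkeeping lives in the `loops` stack rather than in branch instructions, which is the sense in which such loops are "zero overhead."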

Support

ARM is developing high-level-language support for Piccolo, although the company believes that most programmers will use assembly language. The unified µP and DSP architecture allows a single ARM tool chain to serve for development. Various third-party developers offer speech- and channel-coding libraries, Global System for Mobile communications, speech coders, modem implementations as high as V.34bis, and more.




Copyright © 1998 EDN Magazine, EDN Access. EDN is a registered trademark of Reed Properties Inc, used under license. EDN is published by Cahners Business Information, a unit of Reed Elsevier Inc.