April 23, 1998

EDN's 1998 DSP 16-BIT Architecture Directory

ZSP 16400 DSP

The core of the 16400 comprises a five-stage pipeline: fetch and decode, instruction grouping, operand read, execute, and write-back of results to the register file. One processing unit handles instruction scheduling, making it easier for ZSP to develop custom instructions without affecting the datapath. The pipeline-control unit groups instructions and resolves data and resource dependencies. It then dispatches the instructions to the individual functional units, which operate in parallel. Pipeline control also performs result bypassing and interrupt processing. Result bypassing moves results from functional units directly back into the pipeline without going through the write-back stage.

The 16400 provides multitasking support and uses a six-level interrupt structure with programmable priority levels; a high-priority event can interrupt all instructions.

The 16400's functional units share a register file of 16 16-bit registers. The register file serves as a source and destination for MAC operands. A 32-word instruction cache and a 32-word data cache are standard features. Separate 64-bit instruction and data buses feed these caches from a dual-port memory. With the dual-port SRAM, the device can both read and write memory in one clock cycle. Looping hardware loads instructions into the cache; if a loop is bigger than the 32-word cache, the 16400's prefetcher fetches code ahead of execution to keep the cache filled.

The 16400 uses static branch prediction; you should try to structure your code so that program flow follows the predicted backward branches. A mispredicted branch incurs a penalty of three to five cycles; the penalty is only three cycles when the target instructions are still in the cache.

The DSP core contains an integrated non-cycle-stealing DMA unit.
This unit uses a dedicated 8-kbyte portion of the dual-ported SRAM, has its own buses, and operates independently of the core processor. Two 16-bit timers are also standard features of the DSP core.

ZSP's DSP has hardware support for two nested looping constructs. It also supports eight 32-bit or 16 16-bit barrel shifters. This support is possible because the DSP performs a single-cycle shift of as many as 16 bits on each of the core's 16 registers. You can also concatenate two 16-bit registers and perform single-cycle, 32-bit shifts on that register pair.

Addressing modes

The ZSP 16400 has no hardware for bit-reversed addressing; it performs bit reversal in software, using an instruction that flips the bits. The 16400 also has no dedicated circular-buffer hardware; you can implement two circular buffers by using the data cache with two of the core's registers. The DSP has hardware support for indexed (immediate or register content), indirect, and register-to-register addressing modes.

Special instructions

The 16400 has add-compare-select and parallel add and subtract instructions for FFT and Viterbi algorithms; single-cycle bit-manipulation instructions; and specialized load instructions, which free programmers from the complexities of detailed pipeline management. Zero-overhead looping requires an "again" instruction to mark the end of the loop and to decrement the loop counter.

Support

ZSP offers a functional compiler; an optimized version should be available in July. The compiler supports new C fixed-point data types and employs a variety of C-intrinsic functions. The company also offers an assembler, a linker, a debugger, a profiler, JTAG-based emulation tools, and a $5000 hardware-evaluation board.
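
To illustrate the software bit reversal described under Addressing modes, the following is a minimal sketch in portable C of how an FFT routine might compute bit-reversed indexes and reorder a buffer without addressing hardware. It does not use the 16400's bit-flip instruction, whose mnemonic and semantics the text does not give; the function names `bit_reverse` and `bit_reverse_permute` are hypothetical, and the loop-based reversal is an assumption chosen for clarity rather than speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the low `bits` bits of `index` (1 <= bits <= 16).
 * Hypothetical helper: a generic shift-and-mask loop, not the
 * 16400's own bit-flip instruction. */
static uint16_t bit_reverse(uint16_t index, unsigned bits)
{
    uint16_t reversed = 0;

    assert(bits >= 1 && bits <= 16);
    for (unsigned i = 0; i < bits; i++) {
        /* Move the current least-significant bit of index
         * into the growing reversed value. */
        reversed = (uint16_t)((reversed << 1) | (index & 1u));
        index >>= 1;
    }
    return reversed;
}

/* Reorder an n-point buffer (n = 2^bits) in place so that the
 * element at position i moves to position bit_reverse(i). */
static void bit_reverse_permute(int16_t *data, unsigned bits)
{
    size_t n = (size_t)1 << bits;

    for (size_t i = 0; i < n; i++) {
        size_t j = bit_reverse((uint16_t)i, bits);
        if (j > i) {            /* swap each pair only once */
            int16_t tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
    }
}
```

For an 8-point buffer (`bits` = 3), index 1 (binary 001) maps to index 4 (binary 100), and index 3 (011) maps to index 6 (110); indexes 0, 2, 5, and 7 are their own reversals and stay in place.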
Copyright © 1998 EDN Magazine, EDN Access. EDN is a registered trademark of Reed Properties Inc, used under license. EDN is published by Cahners Business Information, a unit of Reed Elsevier Inc.