Feature

2002 DSP directory

Market analysis forecasts DSP sales to turn upward in 2002, with iSuppli predicting a 4% rise and Forward Concepts expecting a 32% gain.

By Robert Cravotta, Technical Editor -- EDN, 4/4/2002

Last year was a harsh one for processor-device sales. According to market analysis from iSuppli (www.isuppli.com) and Forward Concepts (www.forwardconcepts.com), DSP sales were down 30 and 45%, respectively, for 2001, a marginally worse drop than microcontroller sales suffered. The same market analysis forecasts DSP sales to turn upward in 2002, with iSuppli predicting a 4% rise and Forward Concepts expecting a 32% gain. A DSP-inventory glut for the cell-phone market at the beginning of the year was a significant factor in the downturn in sales.

Signal processing is finding its way into many applications, including traditionally control-oriented ones, which are benefiting from access to DSP operations that speed math-oriented processing and allow more precise and sensorless control designs. The lines distinguishing DSPs and microcontrollers are blurring, as both types of devices are incorporating aspects of the other. In fact, several unified, or hybrid, microsignal-processor architectures now blend DSP and control processing into a single instruction thread, offering new options for applications that include both control- and signal-processing requirements (Reference 1).

In an attempt to maintain a clear distinction between DSP and controller devices, the directory survey requested devices, cores, or extensions that not only can process signals, but also find their primary application in signal processing. The DSP had to be a software-programmable device, core, or extension that includes an assembler or a compiler in the tool set. This requirement eliminated those products that, although they may include a programmable DSP core, restrict users to only selecting and setting operating parameters. Last, listed devices or IP must be currently or soon available. These criteria eliminate potential products from inclusion, but the directory still contains more entries than ever before.

In addition to being more restrictive, this year's directory is structured differently from those of previous years. The directory lists entries alphabetically by vendor and consolidates the support section that each entry normally includes in the last entry of the vendor's section. This structure reduces the amount of duplicate information, but, more important, it emphasizes that tool sets are usually common across a vendor's product lines. Almost without exception, integrated tool sets are a strategic element of any DSP offering and play a large role in design wins (Reference 2). (This page contains devices A through D. Part 2 of the DSP directory, which includes the article vendor box, is linked at the bottom of the page.)

Adelante Technologies' Saturn

at a glance:

  • The core measures 0.5 mm² and consumes 0.25 mW/MHz in a standard 0.18-micron CMOS process.
  • The Saturn can perform 420 million MACs/sec at 210 MHz.

Adelante's Saturn is an extensible, low-power, small-area, "open" RTL DSP core and subsystem targeting wireless-baseband handsets and digital-control applications. It employs a dual Harvard architecture with two 16-bit multipliers, four 16-bit ALUs that can combine into two 40-bit ALUs (32-bit with 8-bit overflow), a shift and saturation unit, a bit-manipulation unit, a barrel shifter, a hardware-loop-control unit, a program-control unit, and two data memories configurable to 64k words and expandable to 1M word with paging. The designer extends the core via custom application-specific instructions, execution units, and coprocessors to accelerate repetitive tasks.
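As a rough check on the at-a-glance figure, peak MAC throughput is the number of MACs completed per cycle multiplied by the clock rate. The sketch below assumes that each of the Saturn's two multipliers completes one MAC every cycle; the function name and its parameters are illustrative and do not come from any vendor tool.

```python
def peak_macs_per_sec(macs_per_cycle, clock_hz, cores=1):
    """Peak MAC rate, assuming every MAC unit completes one MAC each cycle."""
    return cores * macs_per_cycle * clock_hz


# Saturn: two 16-bit multipliers at 210 MHz (a single core is assumed).
saturn = peak_macs_per_sec(macs_per_cycle=2, clock_hz=210_000_000)
print(saturn // 1_000_000, "million MACs/sec")
```

The result is a ceiling, not a benchmark: sustained rates depend on memory bandwidth and on how well the inner loop keeps both multipliers busy.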

The Saturn core integrates into Adelante's Lunar DSP subsystem, which includes program and data memory, DMA, interfaces to external processors, peripherals, and I/O (including an AMBA bus for ARM and MIPS processors), BIST, and JTAG hardware-debugging capability. Special constructs in the three-stage pipeline enable single-cycle overhead short branching and zero-overhead long branching. One nonmaskable interrupt and 16 maskable interrupts support immediate execution of service routines through single-cycle interrupt switching with simultaneous switching of the shadow X/Y-address pointers.

Addressing modes: The Saturn supports single- and dual-data-memory-operand addressing for 32-bit operands, with direct data and absolute addressing. Offset, indirect, absolute, immediate, modulo, and bit-reversed addressing support bit/nibble/byte access to memory. Two of the three X/Y-address pointers are context-sensitive.

Special instructions or integral-peripheral functions: Designers can extend the Saturn's 16-bit instruction set with 256 application-specific 96-bit VLIW (very-long-instruction-word) instructions that can fully exploit all core resources in parallel to accelerate repetitive DSP functions (for example, two-cycle execution of a 12-operation Viterbi butterfly). Designers can also integrate application-specific execution units and coprocessors into the DSP subsystem to accelerate computationally intensive functions, such as turbo coding or multichannel ADPCM (adaptive differential pulse-code modulation).

Support: The Atmosphere Development Environment supports code development and debugging for application-specific instructions and execution units. The code-development tools include a compiler, a linker, a debugger, an instruction-set simulator, and a profiler. The debugger supports use with a JTAG hardware debugger and the runtime debug block for in-circuit runtime emulation. Adelante offers design services for the development, integration, and verification of application-specific execution units and application-specific coprocessors.

Agere Systems' DSP16000

at a glance:

  • The DSP16410 can perform 800 million MACs/sec at 200 MHz.

Agere's DSP16210 and DSP16410 devices use the DSP16000 core and target digital-communications applications that benefit from large, on-chip RAM with downloadable system support. The DSP16210 includes 60k words of dual-port RAM and can address as many as 192k words of external storage in both its code/coefficient-memory address space and data-memory address space. An internal boot ROM includes system-boot code and hardware-development-system code. This device also contains a bit-manipulation unit; a two-input, 40-bit ALU with add/compare/select for enhanced signal-coding efficiency and Viterbi acceleration; and a three-input adder for single-cycle accumulation of the results of both multipliers. To optimize I/O throughput and reduce the I/O service-routine burden on the DSP core, two modular I/O units manage the simple serial-I/O port and the 16-bit parallel host-interface peripherals. They also provide transparent DMA transfers between the peripherals and on-chip, dual-port RAM.
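The add/compare/select operation mentioned above is the inner step of Viterbi decoding. The following is a minimal sketch of one radix-2 butterfly, assuming distance-style metrics (lower is better) and a symmetric pair of branch metrics; the function name, argument layout, and tie-breaking rule are illustrative and do not describe the DSP16000 instruction itself.

```python
def acs_butterfly(pm_a, pm_b, bm):
    """One radix-2 add/compare/select butterfly.

    pm_a, pm_b: path metrics of the two predecessor states (lower is better).
    bm: (bm0, bm1) branch metrics; butterfly symmetry is assumed, so the two
        successor states see the same pair in swapped order.
    Returns ((metric0, decision0), (metric1, decision1)), where a decision of
    0 means the path from state a survives and 1 means the path from state b.
    """
    bm0, bm1 = bm
    # Add: extend both predecessor paths toward each successor state.
    cand0 = (pm_a + bm0, pm_b + bm1)
    cand1 = (pm_a + bm1, pm_b + bm0)
    # Compare: pick the smaller candidate (ties go to state a).
    d0 = 0 if cand0[0] <= cand0[1] else 1
    d1 = 0 if cand1[0] <= cand1[1] else 1
    # Select: keep the surviving metric and record the decision bit.
    return (cand0[d0], d0), (cand1[d1], d1)


print(acs_butterfly(3, 5, (1, 4)))
```

A decoder runs this butterfly once per state pair for every received symbol, which is why a single-cycle add/compare/select path has such a large effect on channel-decoding throughput.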

The DSP16410 targets communications-infrastructure applications and features twin DSP16000 dual-MAC DSP cores and enhanced DMA capabilities. Each DSP core has access to a 192-kbyte block of memory (384 kbytes total), and the cores share a 4-kbyte block of memory for interprocessor communications. Its large on-chip memory supports fixed-point signal-processing functions, including equalization, channel coding, compression, and speech coding. A centralized DMA unit supports transparent peripheral-to-memory and memory-to-memory transfers. The DSP16410 includes a 16-bit parallel port with DMA support that can provide host access to all DSP memory. The two serial-I/O units also include DMA support, are compatible with TDM highways, and include hardware support for u-law and A-law companding.

Addressing modes: The DSP16000 core architecture supports immediate, register-direct, address-register-indirect, and program-counter-relative modes, as well as register-plus-displacement addressing, and circular-buffer addressing.

Special instructions or integral-peripheral functions: The special instructions are arithmetic, logical, and shift operations, and bit-manipulation instructions to implement nonlinear algorithms, such as signum, A-law and u-law conversions, half-wave and full-wave rectification, and rounding. The bit-manipulation instructions include barrel shifting, normalization and exponent computation, and bit-field insertion or extraction.
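To show why normalization and exponent computation matter for the u-law conversion named above, here is a minimal sketch of 16-bit linear to 8-bit u-law encoding and the reverse. It follows the widely circulated G.711-style reference approach; the bias of 0x84 and the clip level of 32635 are assumptions taken from that approach, and none of this is Agere library code.

```python
BIAS = 0x84    # assumed bias added before the exponent search
CLIP = 32635   # assumed clip level so that the biased magnitude fits in 15 bits


def linear_to_ulaw(sample):
    """Encode a signed 16-bit sample as one u-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Exponent search: find the highest set bit among bits 14..7.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # u-law bytes are transmitted with all bits inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte):
    """Decode one u-law byte to a signed linear sample."""
    byte = ~byte & 0xFF
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if byte & 0x80 else magnitude


print(hex(linear_to_ulaw(1000)), ulaw_to_linear(linear_to_ulaw(1000)))
```

The while loop is the part that a hardware normalize or exponent-detect instruction would replace with a single operation.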

Agere Systems' StarPro2000

at a glance:

  • The StarPro2000 can perform 3000 million MACs/sec at 250 MHz.

The StarPro2000 targets high-performance communications-infrastructure applications, such as wireless base stations, voice-over-IP gateways, remote-access servers, and radio-network controllers. The StarPro2000 integrates three StarCore SC140 quad MAC DSP cores, 768 kbytes of shared memory, three serial-I/O units, one parallel-interface unit, and two external memory-interface units. Agere enhanced each SC140 DSP core with instruction and data caches, local data memory, an interrupt controller, and bus-interface logic. The combination forms a SuperCore macrocell that is the basic building block for the StarPro family. The cores and peripherals connect via a high-speed, split-transaction-architecture, 128-bit-wide local interconnect bus that can perform 128-bit transactions every clock cycle to minimize bottlenecks for internal and external transactions. The DSP core's instruction set, large set of general-purpose registers, and short instruction pipeline make it a viable target for coding algorithms in C.

Addressing modes: The SC140 architecture supports absolute, absolute-long, absolute-jump, reverse-carry, modulo, multiple-wraparound-modulo, register-direct, address-register-indirect, and program-counter-relative addressing modes. Dual 64-bit data fetches, stores, or both can simultaneously address parallel instructions as byte, word, or long, as well as dual-word, quad-word, or dual-long.

Special instructions or integral-peripheral functions: The multipliers support mixed signed and unsigned operands in both fractional and integer formats. The SC140 architecture supports bit-mask-test/set/clear operations for both data registers and memory, and an SIMD (single-instruction-multiple-data) version of maximum and minimum additions and subtractions. The SC140 can perform eight 16-bit additions or maximum and minimum operations per cycle and includes the MAX2VIT, which works with Viterbi Shift Left to accelerate Viterbi decoding algorithms.

The StarPro2000 includes six 32-bit, general-purpose timers that can interrupt any DSP core; a 32-bit, parallel-interface port with 31-bit address space and 32-bit data bus; and two 8-bit, general-purpose bit-I/O units. The two external-memory interfaces have a 28-bit address space and a 32-bit data bus. Eight memory-to-memory DMA channels allow block-memory moves anywhere in the memory space. Two DMA channels support each of the three serial-interface ports.

Support: Agere's LUxWorks supports development for DSP16000 and StarPro2000 devices. The integrated system-level development tools include a C compiler, an assembler, a linker, and a simulator. Hardware-development platforms and in-circuit-emulation capabilities are available through Agere's TargetView JTAG communication system featuring Agere's DART for real-time data collection. Agere also provides optimized libraries for voice transcoding and echo cancellation for wired networks and 2G, 2.5G, and 3G wireless standards. Additional third-party tools are available in cooperation with the StarCore Joint Development Center.

Analog Devices' ADSP-21xx

at a glance:

  • All ADSP-21xx processors are source-code-compatible.
  • Products cost less than $4 in high volume.

All ADSP-21xx processors are source-code-compatible and feature a high-level algebraic programming syntax. All instructions execute in a single clock cycle, including multifunction instructions. ADSP-21xx processors use separate program and data buses operating on 24-bit instructions and 16-bit data. The wider instruction word allows the device to use a more complex and robust instruction set than a 16-bit operation code, and the 16-bit data word provides lower power consumption for the needed dynamic range.

Processors are available with as much as 2.4 Mbits of on-chip SRAM around the DSP core. All ADSP-21xx processors integrate a programmable DMA controller to maximize I/O throughput. The ADSP-218x supports as much as 4 Mbytes of external memory, and the ADSP-219x architecture supports 16M words of external memory. All processors support a variety of serial-communications interfaces to ADC/DACs and other processors.

Addressing modes: ADSP-21xx processors support immediate, register-direct, memory-direct, and register-indirect addressing modes. The ADSP-219x adds register, indirect-post-modify, immediate-modify, and direct- and indirect-offset addressing modes. Each address generator supports as many as four circular buffers, each with three registers. The ADSP-219x supports as many as 16 circular buffers using a data-address-generator shadow-register set and a set of base registers for additional circular-buffering flexibility.

Special instructions or integral-peripheral functions: The ADSP-21xx contains dedicated loop hardware and a "do-until" loop instruction that supports loops ranging from 0 to 16,000 iterations or loops with infinite iterations. The ADSP-218x supports four-deep nesting via its loop hardware, and the ADSP-219x supports as many as eight loops. In addition to the standard arithmetic and logic instructions, the ALU supports division primitives. The ADSP-219x program sequencer features a six-deep pipeline and supports delayed branching. The ADSP-219x buses and instruction cache provide the data flow to maintain a continuous execution rate.

Analog Devices' ADSP-21000 SHARC family

at a glance:

  • SHARC natively supports 32-bit fixed, 32-bit IEEE floating, and extended 40-bit floating-point data types.
  • Multiprocessing configurations require no glue hardware.

The ADSP-21161N is the latest member of the SHARC family of general-purpose programmable DSPs. It is based on a Super Harvard ARChitecture and has both SIMD (single-instruction-multiple-data) and SISD (single-instruction-single-data) functions. The SHARC SIMD core contains two computational blocks that each include a multiplier, an ALU, a data-register file, and a barrel shifter that can process in parallel in SIMD mode. The core contains dual data-address generators, independent data- and address-memory buses, a program sequencer with zero-overhead looping, an instruction cache, and a timer. While the core operates at full speed, the I/O processor moves data on and off chip. SHARC DSPs integrate 1 to 4 Mbits of on-chip SRAM; as many as four serial ports, six link ports, and 14 zero-overhead DMA channels; an SPI-compatible port; a synchronous-DRAM controller; a parallel host interface; cluster-multiprocessing support; and an IEEE JTAG standard 1149.1 test-access port with on-chip emulation. The two independent, on-chip dual-ported SRAM blocks are selectable between program and data memory. The independent synchronous serial ports operate in TDM multichannel mode and, on the ADSP-21065L and ADSP-21161, offer I2S mode, which is useful for audio applications.

Addressing modes: ADSP-21000 SHARC DSPs support absolute and relative-direct addressing, premodify and postmodify registering, immediate-value-indirect addressing, and modulo and bit-reverse addressing. The dual-ported memory allows independent data transfers from the core and the I/O. Three on-chip buses allow two data transfers from the core and one from I/O in one cycle.
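Bit-reverse addressing exists mainly to reorder FFT data without spending cycles on index arithmetic. The sketch below computes in software the address sequence that such hardware produces, assuming a buffer whose length is a power of two; the function names are illustrative.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n):
    """Bit-reversed address sequence for an n-point buffer (n a power of two)."""
    bits = n.bit_length() - 1
    if n <= 0 or (1 << bits) != n:
        raise ValueError("n must be a positive power of two")
    return [bit_reverse(i, bits) for i in range(n)]


print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

On a processor without this addressing mode, the inner loop above runs once per sample; with it, the address generator delivers the reordered sequence at no extra cost.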

Special instructions or integral-peripheral functions: The ADSP-21000 SHARC family features distributed on-chip bus arbitration. Devices allow you to connect as many as six processors (two for the ADSP-21065L) in parallel, plus a host. All SHARC instructions execute in one cycle. Special instructions include bit manipulation, division iteration, reciprocal of square-root seed, conditional subroutine call, single and block repeat with zero-overhead looping, average-of-two numbers, bit packing and unpacking, fixed- to and from floating-point conversion, and conditional execution of most instructions. SHARC supports IEEE-754 single-precision floating-point, 32-bit fixed-point, and a 40-bit extended IEEE format for additional accuracy.

Analog Devices' ADSP-215xx Blackfin

at a glance:

  • System architecture supports an integrated DSP and RISC microcontroller-unit instruction set.
  • Dynamic power management minimizes power consumption for power-constrained applications.

Built from the Micro Signal Architecture core jointly developed by Analog Devices and Intel, Blackfin DSPs feature dual-MACs, high clock rates, and dynamic power management for balancing system performance and power consumption. The modified Harvard architecture core combines signal- and control-processing features into a single instruction-set architecture that benefits programming in high-level languages, such as C/C++. DSP-core functional blocks and capabilities include two 16-bit MACs, two 40-bit ALUs, four 8-bit video ALUs, and a barrel shifter, plus eight 32-bit math registers with support for 8/16/32-bit integer and 16/32-bit fractional data types. The four 8-bit video ALUs address multimedia algorithms, including MPEG-2, MPEG-4, and JPEG, allowing a single device to concurrently process audio, video, imaging, and data information. The ADSP-21535 targets next-generation digital-communication systems and Internet appliances, and the ADSP-21532 targets consumer-multimedia systems.

Blackfin DSPs include support for user and supervisor modes, byte addressing, memory protection, and an orthogonal RISC instruction set. All Blackfin DSPs support a hierarchical and configurable memory model. L1 memory is physically closest to the core for highest system performance and is configurable as either SRAM or cache. L2 memory provides a larger memory space suitable for bulk storage of instructions or data. Additionally, dynamic power management permits context-sensitive control over power consumption by enabling designers to dynamically vary both the operating frequency and the voltage of the DSP core for optimizing power-consumption profiles.

Addressing modes: All Blackfin DSPs support DSP and general-purpose addressing modes, including indirect, indexed, auto-increment or -decrement, postincrement, and bit-reversed. Four sets of index, base, length, and modify registers enable circular (modulo) buffering of as many as four buffers per data-address generator. In addition, eight 32-bit registers are available for general-purpose addressing of 8-, 16-, and 32-bit data.
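The index, base, length, and modify registers described above can be modeled in a few lines. This is a minimal sketch of a post-modify update with wraparound, assuming the modify value never exceeds the buffer length in magnitude; the function names are hypothetical, and the exact wrap rule of a given address generator should be checked against the vendor's documentation.

```python
def post_modify(index, modify, base, length):
    """Advance a circular-buffer pointer and wrap it into [base, base + length)."""
    index += modify
    if index >= base + length:
        index -= length      # stepped past the end: wrap to the start
    elif index < base:
        index += length      # stepped before the start: wrap to the end
    return index


def walk(start, modify, base, length, count):
    """Return the first `count` addresses visited, starting at `start`."""
    addresses = []
    index = start
    for _ in range(count):
        addresses.append(index)
        index = post_modify(index, modify, base, length)
    return addresses


# A five-word buffer at 0x100, stepping by two words.
print([hex(a) for a in walk(0x100, 2, 0x100, 5, 6)])
```

Because the wrap happens inside the address update, a filter loop can run a delay line indefinitely without a compare-and-branch on every access.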

Special instructions or integral-peripheral functions: The Blackfin DSP instruction set includes special instructions for video and next-generation communications algorithms. Video-pixel-manipulation instructions include quad-byte operations for sum-absolute difference, average, and pack/unpack. Communications algorithms use dual-MAC instructions with rounding and saturate options in addition to add/compare/select or vector operators.
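A quad-byte sum of absolute differences is the core of block-matching motion estimation, which explains its place among the video instructions. The sketch below shows the arithmetic on four unsigned 8-bit pixels packed into a 32-bit word; the function name and the packing order are assumptions for illustration and are not the Blackfin instruction's definition.

```python
def sad4(word_a, word_b):
    """Sum of absolute differences over four unsigned bytes packed in each word."""
    total = 0
    for shift in (0, 8, 16, 24):
        pixel_a = (word_a >> shift) & 0xFF
        pixel_b = (word_b >> shift) & 0xFF
        total += abs(pixel_a - pixel_b)
    return total


print(sad4(0x01020304, 0x04030201))  # 8
```

Hardware that evaluates all four byte lanes at once replaces the whole loop, so a 16×16-pixel block comparison shrinks from 256 subtract-and-accumulate steps to 64 quad-byte operations.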

Analog Devices' ADSP-2199x

at a glance:

  • Processors target mixed-signal, embedded-control, and signal-processing applications.
  • Devices integrate a 150-MIPS DSP with a 14-bit, 18M-sample/sec ADC.

The ADSP-2199x family includes high-performance, mixed-signal DSPs that maintain full code compatibility with the ADSP-219x products. These devices integrate mixed-signal components, such as high-resolution ADCs, with a variety of peripheral components to form single-chip devices targeting embedded-signal-processing and -control applications, such as industrial measurement and control, high-end servo-motor drives, uninterruptible power supplies, high-end switched-mode power supplies, optical-networking control, and intelligent-sensor interfaces.

The ADSP-21990 and ADSP-21991 integrate a 150-MIPS, 16-bit ADSP-219x core with a 4k-word program memory, a 4k-word data memory, and an eight-channel, 14-bit, 18M-sample/sec ADC core, with dual S/H amplifiers for simultaneous sampling. An external memory interface enables direct access to as much as 1M word of external memory for program-memory expansion, data-memory expansion, or both. The ADSP-21990 is available in industrial-temperature ranges and packaged in both BGA and QFP versions. The ADSP-21991 supports increased on-chip memory to 32k-word program memory and 8k-word data memory in pin-for-pin-compatible packages.

Addressing modes: Identical to the ADSP-219x products, the ADSP-2199x products support immediate, register-direct, memory-direct, register-indirect, indirect-postmodify, immediate-modify, and direct- and indirect-offset addressing modes. The ADSP-2199x supports as many as 16 circular buffers using a data-address-generator shadow register and a set of base registers for additional circular-buffering flexibility.

Special instructions or integral-peripheral functions: The ADSP-21990 and ADSP-21991 products share all of the architectural features and special instructions of the ADSP-219x core. The key integrated peripheral of these products is the high-performance, 14-bit ADC. The embedded-control peripherals include a three-phase PWM generation unit; a 32-bit incremental-encoder interface; dual auxiliary PWM outputs; a watchdog timer; and general-purpose peripherals, such as timers, digital-I/O lines, and serial-communications and programmable-interrupt controllers. Additionally, these devices include an on-chip precision-voltage reference and an integrated power-on-reset circuit.

Analog Devices' ADSP-TS101S TigerSHARC family

at a glance:

  • TigerSHARC can perform 1500 MFLOPS or 2000 million 16-bit MACs/sec at 250 MHz.
  • Targets wireless-infrastructure and multiprocessing applications.

The ADSP-TS101S TigerSHARC floating-point DSP targets multiprocessing and 3G wireless-infrastructure applications. This static superscalar architecture blends the best features of DSP, RISC, and VLIW (very-long-instruction-word) for a high-performance DSP architecture. These features include a load/store architecture, branch prediction, large interlocked register file, fast mathematical computations, bit reversing, zero-overhead looping, background data movement with DMA, and an instruction width that varies from one to four words. Two computational blocks in TigerSHARC support 1-, 8-, 16-, and 32-bit operations. Each computational block contains a 32-entry register file, an ALU, a multiplier, and a shifter. Together, the blocks can execute two 32-bit floating-point MACs, eight 16-bit MACs with 40-bit accumulation, or two 16-bit complex MACs in a single cycle. The device can perform as many as 32 mathematical operations per cycle with 8-bit data types. Three 128-bit buses support TigerSHARC's three on-chip memory banks, which total 6 Mbits. In a given cycle, the device can fetch four 32-bit instruction words and load 256 bits of data into the register file or store it in memory.

Addressing modes: TigerSHARC has two integer ALUs in addition to the two computational blocks. It uses the ALUs primarily for data-address generation, and each unit contains a 32-bit ALU and a fully orthogonal, 32-word register file. These units can generate an address per cycle, which allows the device to send two 128-bit words to each computational unit. These units also support preaddress and postaddress modification, circular buffering, and bit reversing without an extra-cycle penalty.

Special instructions or integral-peripheral functions: Special instructions to accelerate both symbol- and chip-rate processing for 3G baseband-signal processing include a complex MAC operation for chip-rate processing and an add/compare/select operation for channel-decoding algorithms. Peripherals include four bidirectional link ports, a 14-channel DMA controller, and a 64-bit-wide external port that includes an SDRAM controller, a host interface, and support for glueless multiprocessing of as many as eight TigerSHARCs. The four link ports are byte-wide interfaces that transmit data on both the rising and the falling clock edge and offer a second method for multiprocessing with ring and 2-D mesh multiprocessing configurations.
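A complex MAC folds four real multiplies and four additions into one operation, which is why it suits chip-rate work such as despreading. The sketch below spells out that arithmetic on integer (real, imaginary) pairs and uses it to correlate samples against the conjugate of a code sequence; the function names and the despreading example are generic illustrations, not a description of the TigerSHARC instruction.

```python
def complex_mac(acc, a, b):
    """Return acc + a * b for complex integers given as (re, im) tuples."""
    acc_re, acc_im = acc
    a_re, a_im = a
    b_re, b_im = b
    return (acc_re + a_re * b_re - a_im * b_im,
            acc_im + a_re * b_im + a_im * b_re)


def correlate(samples, code):
    """Accumulate samples multiplied by the complex conjugate of the code."""
    acc = (0, 0)
    for sample, chip in zip(samples, code):
        acc = complex_mac(acc, sample, (chip[0], -chip[1]))
    return acc


code = [(1, 1), (1, -1)]
print(correlate(code, code))  # (4, 0): aligned samples add coherently
```

When the received samples match the code, every product lands on the real axis and the sum grows with the sequence length; a mismatched code tends toward a much smaller result.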

Analog Devices' SoftFone

at a glance:

  • SoftFone is a RAM-based design requiring no ROM turns.
  • Complete reference design is available.

The SoftFone, also known as the AD20msp430, integrates an ADSP-218x core and ARM controller to perform voice-coding and channel-equalizer functions for GSM cell phones and GPRS (general-packet-radio-service) mobile terminals. The RAM-based chip eliminates ROM turns, and you can implement it in different GSM/GPRS devices by changing the software.

Addressing modes: You can augment the 16 Mbits of directly addressed, on-chip flash memory with external flash and RAM.

Special instructions or integral-peripheral functions: The instruction set is the basic ADSP-218x instruction set. On-chip peripherals on all devices include a microstate machine to control events with less-than-1-bit-period resolution; bus-arbitration logic to allow direct access to and from memory and all internal buses and registers; and programmable serial ports for connection to the voiceband/baseband codecs. Some devices include USB and other standard interfaces for smart-phone and wireless-PDA devices. Complementary chips for voiceband/baseband codecs, quad-band RFIC, and power management/battery charging are available.

Support: The CrossCore development components include the VisualDSP++ software-development environment, EZ-Kit Lite evaluation systems, emulators, and DSP/math libraries. VisualDSP++ is an integrated software-development environment that includes an assembler, a C/C++ compiler, a linker, a debugger, an archiver, a loader utility for creating bootable images, VDK (VisualDSP++ kernel), advanced plotting tools, and statistical profiling. The EZ-Kit Lite evaluation system supports extension by the addition of JTAG in-circuit emulation. Emulators are available for serial-port, PCI, and USB host platforms. The VisualFone is the development system for SoftFone-based products. A complete GSM/GPRS protocol stack for SoftFone is available from TTPCom.

ARC's ARCtangent

at a glance:

  • Cores support a user-customizable instruction set, registers, buses, interrupts, caches, and memories.
  • Single tool chain supports RISC/DSP and multiprocessor software development.

The ARCtangent-A4 and ARCtangent-A5 cores are synthesizable, user-customizable, 32-bit RISC processors with optional DSP extensions. Developers add the DSP extensions with the ARChitect configuration tool, a graphical-design tool that generates RTL files and synthesis scripts. The ARCtangent-A4 has a 32-bit instruction set, and the ARCtangent-A5 has the ARCompact 16/32-bit instruction set, which allows free mixing of 16- and 32-bit instructions for greater code density without mode-switching penalties. Both cores are synthesizable with industry-standard tools and are portable to almost any foundry or process. The integrated RISC/DSP cores allow programmers to use a single tool chain for RISC and DSP software development.

Addressing modes: The ARCtangent supports as many as four banks of XY memories from 512 bytes to 16 kbytes and has a user-extendable register file. The address-generation units for the XY memories support modulo and bit-reverse addressing with variable-offset preincrement and postincrement modes.

Special instructions or integral-peripheral functions: DSP features include 16×16-, 24×24-, and dual 16×16-bit MACs with 8 guard bits for the accumulator, saturating add and subtract instructions, fractional arithmetic, normalize (find first bit), swap, minimum/maximum, 32×32-bit barrel shifter, 32×32-bit multiplier, and zero-overhead loops. The instruction set is conditional, with as many as 16 user-defined condition codes. Developers can also configure and extend the instruction set to optimize performance for specific applications.
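Guard bits and saturation solve related overflow problems at different points in a MAC loop. The sketch below models them in software, assuming a 40-bit accumulator (a 32-bit product plus 8 guard bits) and a final saturation to 32 bits; the accumulator width and all names are assumptions made for illustration.

```python
ACC_BITS = 40  # assumed: 32-bit product width plus 8 guard bits


def wrap_signed(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saturate(value, bits):
    """Clamp an integer to the signed range of the given width."""
    high = (1 << (bits - 1)) - 1
    low = -(1 << (bits - 1))
    return max(low, min(high, value))


def mac_block(xs, ys):
    """Accumulate 16x16-bit products in the guarded accumulator."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = wrap_signed(acc + x * y, ACC_BITS)
    return acc


# The running sum exceeds 32 bits after three terms, then comes back into range.
xs = [32767, 32767, 32767, -32768, -32768]
ys = [32767] * 5
total = mac_block(xs, ys)
print(total, saturate(total, 32) == total)
```

The guard bits let the intermediate sum overshoot the 32-bit range without losing information, so the final result here is exact; saturation only matters when the finished sum itself is out of range, in which case it clamps instead of wrapping to the opposite sign.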

Support: The ARCtangent RISC/DSP comes with RTL source code, extensive documentation, the ARChitect processor-configuration tool, MetaWare High C/C++ software-development tools, an assembly-language DSP-function library that is callable from C/C++ programs, customer training, and technical support. The single tool chain supports both RISC and DSP-software development. ARC also provides peripheral-IP cores, including USB, Ethernet, and Bluetooth controllers; the Precise/MQX RTOS system software; network-protocol stacks; and software for vertical-market consumer and communication applications.

BOPS' ManArray

at a glance:

  • The ManArray DSP core can achieve 100 MIPS/mW.
  • The core has power consumption of 11 to 36 mW for as many as 4000 MIPS.

The ManArray architecture is a fully scalable, configurable, synthesizable DSP architecture that is programmable and reusable in an array of implementations for communications, mobile-multimedia, and wireless applications. Each application-specific family balances the trade-offs in cost, power, and performance for targeted applications. The MoCARay configuration targets GPRS/EDGE (general-packet-radio-service/enhanced-data-rate-for-GSM-evolution) baseband layer 1 processing at less than 20 mW and turbo-codec processing at less than 50 mW, for software-defined, trimode 2G/2.5G/3G handsets. The MICoRay configuration targets full-duplex MPEG-4 CIF codec processing at less than 100 mW for high-quality videoconferencing on smartphones and PDAs. The WirelessRay configuration targets physical-layer processing for 802.11b, 802.11g, and 802.11a at less than 70 mW for wireless-LAN devices.

Addressing modes: The BOPS architecture supports SIMD (single-instruction-multiple-data), MIMD (multiple-instruction-multiple-data), and SMIMD (synchronous-multiple-instruction-multiple-data) operation. A fully programmable, patternable, scalable DMA engine supports the addressing modes and data-flow management necessary to meet the computational requirements of the high-performance, scalable DSP cores.

Special instructions or integral-peripheral functions: Each family has an enhanced instruction set targeting mobile-wireless, mobile-video, or high-performance streaming media. You can easily integrate all functions—from a RISC coprocessor to a simple PCI interface—into BOPS SOCs (systems on chip).

Support: The BOPS software-development kit integrates tools for application-software programmers, SOC designers, firmware designers, and system architects into one development environment. The Jordan and Travis evaluation boards enable BOPS customers to evaluate the ManArray architecture and develop SOCs that use BOPS' ManArray-based family of programmable DSP cores. The BOPS Halo Parallelizing C compiler enables programmers to accelerate their software-development schedule by automatically exploiting the three levels of parallelism in the ManArray architecture, including packed data, processor arrays, and indirect VLIW (very-long-instruction-word) instructions.

Cirrus Logic's CS49400

at a glance:

  • Support for AAC, DTS-ES 96/24, and THX Cinema requires no additional external logic or memory.

Cirrus Logic's CS49400 DSP integrates a 32-bit audio processor; a dedicated multistandard decoder; key peripherals; and X, Y, and program memories in a single chip targeting digital-entertainment applications. The device can support multichannel DTS 96/24, Dolby Digital, AAC, and THX Ultra2 Cinema without additional logic or memory, and it supports customer software-security keys.

Special instructions or integral-peripheral functions: Along with dual S/PDIF (Sony/Philips digital-interface) transmitters and serial and parallel host interfaces, the CS49400 includes 12 audio-input and 16 PCM-output channels.

Support: The CS49400 features an audio framework, including customizable programming, certified audio decoders, and sound-enhancement programs for DTS 96/24, Dolby Digital, AAC, and THX. The CrystalWare Software Library provides legacy audio-decoder support.

DSP Architectures' DSP24 family

at a glance:

  • The device targets low-power, software-minimized, frequency-domain signal-processing applications.
  • Radiation-hardened versions are available.

The high-performance DSP24 array-processor core, optimized for signal and image processing in the frequency domain, targets applications that perform operations on large arrays of data. It is a pass-based processor, with each function valid for one complete pass. Each operation code defines a basic flow for the desired operation that repeats for multiple pairs of data to complete one pass. For typical array-processing applications, such as FFTs, the device sets up a function code (for example, BFLY32, the radix32 butterfly), then clocks the whole data array into the DSP24 and applies the function to the whole array. The DSP24 functions incur a latency, which the MMU24 automatically compensates for when you use it in a system. The pipelined systolic structure allows you to cascade multiple DSP24s for increased performance and higher radices. This structure permits 80/100-MHz operation on an unlimited array size with support for enhanced read-only FFT, double-length FFT, dual FFT, and stacked FFT to reduce latency.
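
The pass-based flow described above can be illustrated in software. The following Python sketch is not DSP24 code and does not model its function codes; it assumes a radix-2 decimation-in-time FFT for brevity (the DSP24 supports much higher radices), and the names `butterfly_pass` and `fft_by_passes` are hypothetical. It shows only the general idea of applying one butterfly function to every data pair in the array before moving to the next pass.

```python
import cmath

def bit_reverse_permute(x):
    """Reorder a power-of-two-length list into bit-reversed index order."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for i, v in enumerate(x):
        r = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        out[r] = v
    return out

def butterfly_pass(data, span):
    """One complete pass: apply a radix-2 butterfly to every pair `span` apart."""
    out = list(data)
    for start in range(0, len(data), 2 * span):
        for k in range(span):
            w = cmath.exp(-1j * cmath.pi * k / span)  # twiddle factor
            a = data[start + k]
            b = data[start + k + span] * w
            out[start + k] = a + b
            out[start + k + span] = a - b
    return out

def fft_by_passes(x):
    """FFT of a power-of-two-length sequence, organized as whole-array passes."""
    data = bit_reverse_permute([complex(v) for v in x])
    span = 1
    while span < len(data):
        data = butterfly_pass(data, span)  # the same function covers the whole array
        span *= 2
    return data
```

Each call to `butterfly_pass` plays the role of one pass: the operation stays fixed while the whole array streams through it.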

Addressing modes: The DSP24 addressing includes digit reversing, read-only-FFT addressing, fast sine/cosine, decimation, interpolation, modulo increment/decrement, array padding, zero filling, radix2 through radix1024 patterns, and parameterized user sequences.
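
Digit reversing generalizes the familiar bit-reversed ordering of radix-2 FFTs to higher radices. As a rough illustration only, the Python sketch below computes digit-reversed indexes for a single fixed radix; the function names are hypothetical, and the DSP24's actual address generation is not documented here.

```python
def digit_reverse(index, radix, num_digits):
    """Reverse the base-`radix` digits of `index`, treated as `num_digits` wide."""
    reversed_index = 0
    for _ in range(num_digits):
        index, digit = divmod(index, radix)  # peel off the lowest digit
        reversed_index = reversed_index * radix + digit
    return reversed_index

def digit_reversed_order(radix, num_digits):
    """Digit-reversed address sequence for an array of radix**num_digits points."""
    return [digit_reverse(i, radix, num_digits)
            for i in range(radix ** num_digits)]
```

With `radix=2` the sequence reduces to ordinary bit reversal; with `radix=4` and two digits it swaps the two base-4 digits of each index.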

Special instructions or integral-peripheral functions: The DSP24 includes radix2 through radix1024 instructions. It can perform no-overhead window functions and filter/image multiplies and uses five bidirectional data ports for any-port-to-any-port data routing.

Support: The DSP24 and optional MMU24 software-development kit come with C models and optional VHDL models. Valley Technologies (www.valleytech.com) offers board and module products, including the VectorWare language.

DSP Group's OakDSPCore

at a glance:

  • The OakDSPCore handles bit-manipulation, control, and DSP instructions.
  • Power management includes three modes.

The 16-bit, fixed-point, single-MAC, licensable OakDSPCore architecture includes microcontroller instructions for higher code density. The OakDSPCore has two data buses and one program bus, configurable ROM/RAM size, a data-address-arithmetic unit, a multiplier, a 36-bit ALU, two sets of two 36-bit accumulators, and support for a C compiler. It also includes a bit-manipulation unit with a 36-bit barrel shifter, an exponent-evaluation unit that supports fast normalization, and a bit-field-operation unit. The zero-overhead-loop mechanisms include an interruptible single-word instruction loop and four-level nesting of block repeats. User-definable registers speed hardware acceleration and provide coprocessor support. It has four pipeline stages, single-cycle interrupt latency, and automatic context switching. Power management includes active-, slow-, and idle-operation modes. OakDSPCore is compatible with the PineDSPCore.

Addressing modes: The OakDSPCore supports register, single- and double-indirect, short- and long-immediate, short- and long-index, and stack-pointer addressing modes. It supports circular (modulo) buffering for all its pointers and direct addressing for the entire 64k-word data space. It also has a program-memory-indirect addressing mode.
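
Circular (modulo) buffering matters most in filter delay lines, where the pointer must wrap to the start of the buffer without a compare-and-branch. The Python sketch below emulates that wraparound in software with a modulo operation; the `CircularBuffer` class and `fir_step` function are hypothetical names for illustration and do not reflect OakDSPCore registers or instructions.

```python
class CircularBuffer:
    """Fixed-length delay line whose write pointer wraps modulo its length."""

    def __init__(self, length):
        self.data = [0] * length
        self.ptr = 0  # next write position

    def push(self, sample):
        self.data[self.ptr] = sample
        self.ptr = (self.ptr + 1) % len(self.data)  # modulo increment

    def newest_first(self):
        n = len(self.data)
        return [self.data[(self.ptr - 1 - k) % n] for k in range(n)]

def fir_step(buf, coeffs, sample):
    """Push one input sample and return one FIR output sample."""
    buf.push(sample)
    return sum(c * x for c, x in zip(coeffs, buf.newest_first()))
```

Feeding a unit impulse through `fir_step` returns the coefficients in order, which is a quick check that the wraparound addresses the samples correctly.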

Special instructions: Instructions for the OakDSPCore include single-cycle minimum/maximum calculation with pointer latching, double-precision calculations, normalization, exponent, conditional accumulator modifications, division step, read-modify (add/subtract/OR/AND/XOR)-write, test 16-bit mask bits and test bit, delayed return, interruptible single-word repeat loop and block repeat, 36-bit shift left or right in a single cycle, and a bank exchange of alternative registers.
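
The exponent and normalization instructions are easier to picture with a small model. The Python sketch below assumes the common definition, in which the exponent is the number of redundant sign bits in the 36-bit accumulator and normalization shifts them out; the directory does not give the OakDSPCore's exact semantics, so treat the function names and the handling of zero as assumptions.

```python
ACC_BITS = 36  # accumulator width cited for the OakDSPCore

def to_signed(value, bits=ACC_BITS):
    """Interpret the low `bits` of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def redundant_sign_bits(value, bits=ACC_BITS):
    """Number of left shifts possible before the signed value would overflow."""
    value = to_signed(value, bits)
    if value < 0:
        value = ~value  # for negative numbers, count leading ones instead
    return bits - 1 - value.bit_length()

def normalize(value, bits=ACC_BITS):
    """Shift out the redundant sign bits; return (mantissa, shift count)."""
    shift = redundant_sign_bits(value, bits)
    return to_signed(value << shift, bits), shift
```

A block-floating-point routine would typically use the shift count as a shared exponent for a whole buffer of samples.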

DSP Group's PalmDSPCore

at a glance:

  • Seven arithmetic units support SIMD and MIMD operations.
  • Devices feature high code density with 16- and 32-bit instruction widths.

PalmDSPCore is a family of three licensable, dual-MAC, soft DSP cores (of 16, 20, and 24 bits each) that have an instruction-level-parallelism architecture, MIMD (multiple-instruction-multiple-data) and SIMD (single-instruction-multiple-data) instructions, seven computation units working in parallel, and symmetrical cross-coupled MAC paths. PalmDSPCore has two multipliers; a three-input ALU; a three-input split adder-subtracter unit; four orthogonal, 40-bit accumulators; and a bit-manipulation unit, including insert-extract operations. The data-address-arithmetic unit contains two additional adder-subtracter units. It has integrated accelerators for FFT and Viterbi decoding, RTOS, and bit-exact standards. It has zero-overhead-loop mechanisms with infinite levels of repeat and block repeat and six pipeline stages. It also has coprocessor support and 16 user-defined registers for hardware acceleration. PalmDSPCore has high code density through variable instruction width (16 or 32 bits). Maximum PalmDSPCore program-memory space is 16M words. PalmDSPCore is a process- and library-independent, fully synthesizable soft core and is compatible with previous SmartCores generations, including Teak, TeakLite, and OakDSPCore.

Addressing modes: PalmDSPCore supports circular (modulo) buffering, register, short- and long-direct, short- and long-immediate, relative, bit-reversal, double-word, parallel, index-based, and stack-pointer addressing. It also supports a maximum quadruple-indirect addressing mode (for example, to simultaneously feed four inputs of the two multipliers or four inputs of the split ALU).

Special instructions: The device supports single, parallel, and multiparallel instruction sets. It also supports dual-MAC, complex FFT butterfly in two cycles, Viterbi decoding in two cycles, microcontroller instructions, delayed branches and return, normalization, exponent, conditional instructions (parallel moves, logic, arithmetic, and accumulator), and infinite levels of repeat and block repeat.
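
A complex radix-2 butterfly needs four real multiplications for the twiddle-factor product, so a core with two multipliers can in principle issue them in two cycles. The directory does not say how PalmDSPCore actually schedules the operation; the Python sketch below is only one plausible grouping of the arithmetic, with a hypothetical function name and with the "cycle" comments as assumptions.

```python
def complex_butterfly(ar, ai, br, bi, wr, wi):
    """Radix-2 butterfly on a = ar + j*ai and b = br + j*bi with twiddle w.

    Returns (a + w*b, a - w*b) as two (real, imag) tuples.
    """
    # Assumed cycle 1: both multipliers produce the real part of w*b.
    tr = wr * br - wi * bi
    # Assumed cycle 2: both multipliers produce the imaginary part of w*b.
    ti = wr * bi + wi * br
    upper = (ar + tr, ai + ti)
    lower = (ar - tr, ai - ti)
    return upper, lower
```

For example, with a = 1, b = j, and w = -j, the product w*b is 1, so the outputs are 2 and 0.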

DSP Group's PineDSPCore

at a glance:

  • The DSP-and-control instruction set is compact.
  • PineDSPCore is a licensable DSP core.

PineDSPCore, the first generation of the SmartCores family, is a 16-bit, fixed-point, single-MAC, licensable DSP core. It has a compact DSP-and-control instruction set for high code density. PineDSPCore has two data buses and one program bus, a configurable ROM/RAM size, and a data-address-arithmetic unit. The computation unit includes a multiplier, a 32-bit product register, a 36-bit ALU, two 36-bit accumulators with 4 guard bits, and a normalization mechanism. The ALU performs arithmetic and logic operations on the data operands and functions, such as step division and rounding. PineDSPCore also includes two zero-overhead loop mechanisms: a single-word instruction loop and a block repeat. It has user-definable registers for hardware acceleration, coprocessor support, or both. It has three pipeline stages and single-cycle interrupt latency. Power management includes active-, slow-, and idle-operation modes.
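
The 4 guard bits give the accumulator headroom above the 32-bit product, so a run of multiply-accumulates can exceed the 32-bit range without wrapping. The Python sketch below models that effect with two's-complement wraparound at a chosen width; it is a behavioral illustration with hypothetical function names, not a model of the PineDSPCore data path, and it ignores any saturation logic the core may have.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mac(samples, coeffs, acc_bits):
    """Multiply-accumulate with an accumulator that wraps at `acc_bits`."""
    acc = 0
    for x, c in zip(samples, coeffs):
        acc = wrap_signed(acc + x * c, acc_bits)
    return acc

# Four near-full-scale 16-bit products: too large for 32 bits, fine for 36.
full_scale = [32767] * 4
sum_36 = mac(full_scale, full_scale, acc_bits=36)
sum_32 = mac(full_scale, full_scale, acc_bits=32)
```

Here `sum_36` holds the true sum of the four products, whereas `sum_32` has wrapped to a negative value.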

Addressing modes: PineDSPCore supports register, single- and double-indirect, and short- and long-immediate addressing modes. It supports circular (modulo) buffering for all its pointers and direct addressing for the entire 64k-word data space. In addition, it has a program-memory indirect addressing mode.

Special instructions: Instructions include conditional accumulator modifications, conditional and unconditional call and branch, arithmetic and logical operations, round, rotate, shift, compare, division step, MAC, square, single-word repeat loop, and block repeat.
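
A division-step instruction typically produces one quotient bit per execution, so a full divide is the step placed inside a repeat loop. The Python sketch below shows textbook unsigned restoring division in that style; the exact operands and flags of the PineDSPCore instruction are not described here, so the function names and the 16-bit default are assumptions.

```python
def division_step(remainder, quotient, dividend_bit, divisor):
    """One restoring-division step: shift in a bit, then trial-subtract."""
    remainder = (remainder << 1) | dividend_bit
    if remainder >= divisor:
        return remainder - divisor, (quotient << 1) | 1
    return remainder, quotient << 1

def divide(dividend, divisor, bits=16):
    """Unsigned divide built from `bits` repeated division steps."""
    remainder = quotient = 0
    for i in reversed(range(bits)):  # most significant dividend bit first
        remainder, quotient = division_step(
            remainder, quotient, (dividend >> i) & 1, divisor)
    return quotient, remainder
```

On a DSP the loop would map naturally onto the single-word repeat mechanism, with the step executing once per iteration.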

DSP Group's Teak

at a glance:

  • The compact code has a 16-bit instruction width.
  • FFT butterfly completes in five cycles, and Viterbi decoder completes in three cycles.

The low-power, 16-bit, fixed-point, dual-MAC, licensable Teak DSP soft core has an instruction-level-parallelism architecture. The process- and library-independent, fully synthesizable soft core supports the ASIC design environment. Teak has configurable memory size; a data-address-arithmetic unit; two multipliers; a 40-bit, three-input split ALU; four 40-bit accumulators; an exponent unit; and a bit-manipulation unit. It has integrated accelerators for complex FFT; Viterbi decoding; wide automatic context switching; RTOS; and bit-exact standards, such as GSM communications. It has zero-overhead-loop mechanisms with infinite levels of repeat and block repeat. Teak has compact code with a 16-bit instruction width, including parallel instructions. You can extend Teak's program memory to 4M words. Teak has user-definable registers for hardware acceleration, coprocessor support, or both, as well as mechanisms for power-consumption reduction. It is code-compatible with the OakDSPCore and TeakLite instruction sets of SmartCores.

Addressing modes: Teak supports circular (modulo) buffering, register, short- and long-direct, short- and long-immediate, relative, bit-reversal, and short- and long-index-based addressing modes. It can also perform quadruple-indirect addressing (for example, to simultaneously feed four inputs of the two multipliers or four inputs of the split ALU).

Special instructions: Instructions include dual-MAC operations, double-word reads and writes to and from memory, and single-cycle minimum/maximum search with pointer latching. The device handles a complex FFT butterfly in five cycles and Viterbi decoding in three cycles. It also supports bit-manipulation and microcontroller instructions, double-precision multiplication, normalization, exponent, conditional instructions, division step, and infinite levels of repeat and block repeat.
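
The inner loop of a Viterbi decoder is commonly an add-compare-select (ACS) operation, and cycle counts such as the one quoted above usually refer to that kind of step. The directory does not define what Teak's three-cycle figure covers, so the Python sketch below shows only the generic ACS step, with hypothetical names and a smaller-is-better metric convention as an assumption.

```python
def add_compare_select(metric_a, branch_a, metric_b, branch_b):
    """Generic Viterbi ACS: extend two paths, keep the cheaper one.

    Returns (surviving path metric, decision bit), where the decision bit
    is 0 if path A survives and 1 if path B survives.
    """
    candidate_a = metric_a + branch_a  # add
    candidate_b = metric_b + branch_b
    if candidate_a <= candidate_b:     # compare
        return candidate_a, 0          # select
    return candidate_b, 1
```

A decoder stores the decision bits for every state and time step and later traces them back to recover the most likely bit sequence.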

DSP Group's TeakLite

at a glance:

  • Devices feature high code density.
  • Power-management modes support lower-power devices.

The 16-bit, fixed-point, single-MAC, licensable TeakLite soft DSP core is code-compatible with the OakDSPCore instruction set. It enhances the OakDSPCore in several areas, mainly portability. It is a process- and library-independent soft core; it increases operating speed by 30% in the same process technology and reduces power consumption through architectural and power-reduction mechanisms. Its design methodology better meets ASIC-design-environment requirements by employing a single-edge design (still with four pipeline stages) and full or partial testability and by using standard memories. TeakLite has a configurable memory size, a data-address-arithmetic unit, a multiplier, an ALU, four 36-bit accumulators, a bit-manipulation unit, and zero-overhead loop mechanisms for repeat and block repeat. Its instruction set includes microcontroller instructions enabling high code density. It has user-definable registers for hardware acceleration, coprocessor support, or both; cycle-stealing DMA support, burst-mode DMA support, or both; and power-management operation modes.

Addressing modes: TeakLite supports register, single- and double-indirect, short- and long-immediate, short- and long-index, and stack-pointer addressing modes. It supports circular (modulo) buffering for all its pointers and direct addressing for the entire 64k-word data space. In addition, it has a program-memory-indirect addressing mode.

Special instructions or integral-peripheral functions: Instructions include single-cycle minimum/maximum calculation with pointer latching, double-precision calculations, normalization, exponent, conditional accumulator modifications, division step, read-modify (add/subtract/OR/AND/XOR)-write, test 16-bit mask bits and test bit, delayed return, interruptible single-word repeat loop and block repeat, 36-bit shift left or right in a single cycle, and a bank exchange of alternative registers.

Support: DSP Group provides a full set of advanced GUI-based development tools, including an optimizing C/C++ compiler, an assembler, a linker, common-object-file-format converters, a debugger with an emulation interface and the Assyst extendable simulator for system-on-chip simulation, a profiler, and the Evaluation Development Platform. DSP Group has a large infrastructure of third-party vendors offering software, tools, and design services.

©1997-2009 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.