
2003 DSP directory (continued)

-- EDN, 4/3/2003

 


MOTOROLA DSP56800 AND DSP56800E

at a glance:

  • The devices use a microcontroller/DSP-hybrid architecture.
  • The devices offer a high level of peripheral integration.

Motorola’s DSP56800 family integrates the instruction set of a DSP with the control functions of an embedded microcontroller into a single core. The 56800 family of products targets applications that traditionally use 16-bit microcontrollers but also require DSP functions, such as point of sale, voice recognition, digital telephone-answering devices, motor-control systems, and applications requiring voice, audio, or data processing. Motorola’s 56800E family enhances the DSP56800 architecture by providing five times the performance (to 200 MIPS) at one-third the power consumption of the original core, and by doubling code density. It offers expanded memory addressing for as much as 4 Mbytes of program memory and 32 Mbytes of data memory. The 56800E includes 19 addressing modes and 8-, 16-, and 32-bit data types; supports fast interrupts; and supports real-time debugging.

The 56F83x flash-based DSP controllers run at 60 MHz, deliver 60 MIPS, have embedded flash memory, and support an extended temperature range of –40 to +125°C. The 56F83x family of devices targets automotive, instrumentation, and industrial-networking applications, including electronic power-assisted steering, data-acquisition equipment, and factory-automation systems.

Addressing and processing modes: The address-generation unit performs all address calculations to minimize execution time. Addressing modes specify the location of the operands—whether they are immediate values, in a register, or in memory—and provide the exact address of the operands. The architecture groups 19 addressing modes into register-direct, address-register-indirect, immediate, and absolute categories. The 56F83x family also supports parallel-instruction execution.

Special instructions or integral-peripheral functions: The 56800 includes a bus structure that allows it to move data at the speed of the DSP and offers the peripheral set of a microcontroller, such as an interrupt controller, an external memory interface, general-purpose I/O, a scalable controller-area network, ADC, a quadrature decoder, a pulse-width modulator, SCI, SSI, SPI, a quad timer module, and on-chip-emulation.

Development support: The Metrowerks (www.metrowerks.com) CodeWarrior IDE tool set supports software development across Motorola’s entire family of 16-bit controllers. It includes an optimized C compiler, an assembler, a linker, a debugger, and an instruction-set simulator. Motorola offers target development boards and companion daughtercards for market-specific applications, such as motor control, industrial, and automotive. Other third-party tool developers and consultants support 56800 devices.

NEC ELECTRONICS SPXK5

at a glance:

  • SPXK5 achieves 1000 MIPS/500 MMACs (million MACs) at 250 MHz.
  • Enhanced media instructions accelerate video codecs.

The SPXK5, a four-way VLIW (very-long-instruction-word) DSP core, has a highly orthogonal instruction set that enables efficient high-level-language code generation, and it features low power consumption. It targets multimedia applications on handheld terminals, such as 3G mobile phones and PDAs. The SPXK5 contains seven functional units, two 32-bit data buses, one 64-bit instruction bus, and register files. The functional units comprise two MAC (multiply-accumulate) units for 16×16-bit multiplication and 40/16-bit accumulation; two ALUs for addition/subtraction, shift, and logical operations; two data-address units for load and store; and one system-control unit for branch, zero-overhead looping, and conditional execution. The SPXK5’s register files include eight general-purpose, 40-bit registers, eight address registers, eight corresponding offset registers, and other system registers. The SPXK5 can issue as many as four instructions in parallel during the same clock cycle, as long as the total length of the instructions does not exceed 64 bits. All arithmetic, logical, and load/store operations other than MAC operations are single-cycle operations.
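The headline throughput figures follow from the issue width and the number of MAC units. The Python sketch below reproduces that arithmetic and the 64-bit issue limit. The instruction lengths passed to `can_issue_together` are hypothetical, and the check ignores whether a free functional unit exists for each instruction, which real hardware would also have to consider.

```python
# Figures taken from the directory entry; function names are illustrative.
CLOCK_MHZ = 250        # core clock
MAX_ISSUE = 4          # instructions issued per cycle (four-way VLIW)
MAC_UNITS = 2          # multiply-accumulate units
MAX_PACKET_BITS = 64   # total instruction length allowed per cycle


def peak_mips(clock_mhz=CLOCK_MHZ, issue_width=MAX_ISSUE):
    """Peak instruction rate in MIPS: one full issue group every cycle."""
    return clock_mhz * issue_width


def peak_mmacs(clock_mhz=CLOCK_MHZ, mac_units=MAC_UNITS):
    """Peak MAC rate in millions of MACs per second."""
    return clock_mhz * mac_units


def can_issue_together(lengths_bits):
    """True if the instructions fit the four-instruction, 64-bit limit.

    lengths_bits holds one (hypothetical) encoded length per instruction.
    """
    return (len(lengths_bits) <= MAX_ISSUE
            and sum(lengths_bits) <= MAX_PACKET_BITS)


if __name__ == "__main__":
    print(peak_mips(), "MIPS,", peak_mmacs(), "MMACs")
    print(can_issue_together([16, 16, 16, 16]))  # four short instructions
    print(can_issue_together([32, 32, 16]))      # 80 bits, too long
```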

Addressing and processing modes: The SPXK5 core architecture supports direct-addressing mode, postmodify-indirect addressing mode, and premodify-indirect-addressing mode. The postmodification operations available for the address registers include no change, increment, decrement, index addition, modulo index addition, and bit-reversed.
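As an illustration of what these post-modification options do to an address register, here is a minimal Python model. It is a sketch of the general techniques, not of the SPXK5 encoding: the mode names, the `base`/`length`/`width` parameters, and the way the bit-reversed step is computed are all assumptions made for the example.

```python
def reverse_bits(value, width):
    """Reverse the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def post_modify(addr, mode, index=0, base=0, length=1, width=0):
    """Return the address register value after one access.

    base/length describe a circular buffer (modulo mode);
    width is log2 of the buffer size (bit-reversed mode).
    """
    if mode == "none":
        return addr
    if mode == "increment":
        return addr + 1
    if mode == "decrement":
        return addr - 1
    if mode == "index":
        return addr + index
    if mode == "modulo":
        # Wrap the offset so the pointer stays inside the buffer.
        return base + (addr - base + index) % length
    if mode == "bit_reversed":
        # Step through offsets in bit-reversed counting order.
        count = (reverse_bits(addr - base, width) + 1) % (1 << width)
        return base + reverse_bits(count, width)
    raise ValueError("unknown mode: " + mode)


def walk(start, count, mode, **kwargs):
    """List the addresses visited by `count` successive accesses."""
    addresses = []
    addr = start
    for _ in range(count):
        addresses.append(addr)
        addr = post_modify(addr, mode, **kwargs)
    return addresses


if __name__ == "__main__":
    print(walk(0, 8, "bit_reversed", width=3))
    print(walk(100, 6, "modulo", index=3, base=100, length=4))
```

Bit-reversed stepping visits an eight-entry buffer in the order an in-place radix-2 FFT needs, and modulo stepping keeps a pointer inside a fixed-length delay line without an explicit wrap test.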

Special instructions or integral-peripheral functions: Eight application-specific instructions support image/video processing and Viterbi decoding and provide sufficient coverage without significantly increasing the length of instruction-encoding fields or the total code size. The PADD, PSUB, PSHIFT, PADDABS, PACKV, and UNPACKV instructions are effective for the core functions of a video encoder/decoder algorithm. These instructions can accelerate MPEG-4 video-codec performance by 20%. The add-compare-select (ACS) operation is the core of Viterbi decoding. The SPXK5 can execute the PADD, PSUB, and PMAX instructions in two cycles and complete two ACS operations.
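For readers unfamiliar with add-compare-select, the following Python sketch shows one common textbook form of a Viterbi butterfly, in which two ACS operations update two path metrics from two predecessors. The symmetric plus/minus branch metric and the "larger metric wins" convention are assumptions chosen because they line up with add, subtract, and maximum steps; the source does not say how the SPXK5 instructions are actually combined.

```python
def acs(candidate_a, candidate_b):
    """Compare two candidate path metrics and select the larger.

    Returns (surviving_metric, decision), where decision is 0 if
    candidate_a survived and 1 if candidate_b survived.
    """
    if candidate_a >= candidate_b:
        return candidate_a, 0
    return candidate_b, 1


def acs_butterfly(metric0, metric1, branch):
    """Update two states from two predecessor states.

    Assumed butterfly form: the two branches into each state carry
    metrics +branch and -branch (antipodal branch metrics).
    """
    # Add and subtract steps.
    up0, up1 = metric0 + branch, metric1 - branch
    down0, down1 = metric0 - branch, metric1 + branch
    # Compare-and-select (maximum) steps, one per new state.
    return acs(up0, up1), acs(down0, down1)


if __name__ == "__main__":
    print(acs_butterfly(10, 8, 3))
```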

Development support: NEC Electronics provides GUI-based software-development tools for SPXK5 that include an ISO/ANSI-C compiler, an assembler/linker, a simulator, and a source-level debugger. The compiler provides enhanced optimization, such as instruction scheduling, inlining, loop analysis, and machine-dependent tuning. For efficient target code, the compiler accepts programs in DSP-C—a language extension of ISO-C including key DSP features, such as fixed-point types.

OAK TECHNOLOGY PM-44IX

at a glance:

  • The four parallel-pipelined processors can perform 3700 MIPS/930 MMACs (million MACs) at 233 MHz.
  • The PM-44ix supports as many as 16 color-ink-jet and 30 monochrome laser copies per minute.

Oak Technology’s iDSP family targets image-processing applications, such as imaging-enabled printers and multifunction peripherals. The iDSP provides designers with the flexibility of a software-based image-processing option. The PM-44ix contains four symmetric parallel-pipelined processors and employs the SIMD (single-instruction-multiple-data) parallel-processing architecture to take advantage of the parallelism inherent in image data.

Addressing and processing modes: All memory accesses in the iDSP are 32 bits wide.

Special instructions or integral-peripheral functions: Special extraction and insertion units allow designers to manipulate bit fields of any size within 32-bit registers. The iDSP instruction set contains specialized instructions for coordinating parallel processing.
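The effect of bit-field extraction and insertion on a 32-bit register can be modeled in a few lines of Python. This is a generic sketch of the operations, not the iDSP instruction semantics; the function names and the offset/width argument convention (offset counted from the least significant bit) are assumptions.

```python
MASK32 = 0xFFFFFFFF


def _check(offset, width):
    """Reject fields that do not fit inside a 32-bit register."""
    if width < 1 or offset < 0 or offset + width > 32:
        raise ValueError("field must lie within a 32-bit register")


def extract_field(word, offset, width):
    """Return the `width`-bit field that starts at bit `offset`."""
    _check(offset, width)
    return (word >> offset) & ((1 << width) - 1)


def insert_field(word, offset, width, value):
    """Return `word` with the field at `offset` replaced by `value`."""
    _check(offset, width)
    mask = ((1 << width) - 1) << offset
    return ((word & ~mask) | ((value << offset) & mask)) & MASK32


if __name__ == "__main__":
    pixel = 0xABCD1234
    print(hex(extract_field(pixel, 8, 8)))
    print(hex(insert_field(pixel, 8, 8, 0xFF)))
```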

Development support: The iDSP programming environment includes an IDE, an image-processing library, and an evaluation board. Oak Technology’s worldwide direct sales and support organization supports the iDSP.

PARTHUSCEVA OAKDSPCORE

at a glance:

  • The OakDSPCore handles bit-manipulation, control, and DSP instructions.
  • Power management includes active, slow, and idle modes.

The 16-bit, four-stage-pipeline, fixed-point, single-MAC, licensable OakDSPCore architecture includes DSP and microcontroller instructions for higher code density. It is the second member of the SmartCores family. The OakDSPCore has two data buses and one program bus, configurable ROM/RAM size, a data-address-arithmetic unit, a multiplier, a 36-bit ALU, two sets of two 36-bit accumulators, and support for a C/C++ compiler. It includes a bit-manipulation unit with a 36-bit barrel shifter, an exponent-evaluation unit that supports fast normalization, and a bit-field-operation unit. The zero-overhead-loop mechanisms include an interruptible single-word instruction loop and four-level nesting of block repeats. User-definable registers speed hardware acceleration and provide coprocessor support. It has single-cycle interrupt latency and automatic context switching. Power management includes active-, slow-, and idle-operation modes. OakDSPCore is compatible with the PineDSPCore.

Addressing and processing modes: The OakDSPCore supports register, single- and double-indirect, short- and long-immediate, short- and long-index, and stack-pointer addressing modes. It supports circular (modulo) buffering for all its pointers and direct addressing for the entire 64k-word data space. It also has a program-memory-indirect addressing mode.

Special instructions or integral-peripheral functions: The OakDSPCore handles bit-manipulation, control, and DSP instructions. Instructions include single-cycle minimum/maximum calculation with pointer latching, double-precision calculations, normalization, exponent, conditional accumulator modifications, division step, read-modify (add/subtract/OR/AND/XOR)-write, test 16-bit mask bits and test bit, delayed return, interruptible single-word repeat loop and block repeat, 36-bit shift left or right in a single cycle, and a bank exchange of alternative registers.
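A minimum/maximum search with pointer latching returns both the extreme value and where it was found. The Python sketch below models the idea for a maximum search; keeping the first occurrence on ties is an assumption, since the source does not describe tie handling.

```python
def max_search(values):
    """Scan values once; return (maximum, latched_pointer).

    latched_pointer is the index at which the running maximum was
    last replaced, mimicking a pointer latched alongside the compare.
    """
    if not values:
        raise ValueError("values must not be empty")
    best = values[0]
    latched = 0
    for pointer in range(1, len(values)):
        if values[pointer] > best:   # strict compare keeps first maximum
            best = values[pointer]
            latched = pointer        # latch the address of the new maximum
    return best, latched


if __name__ == "__main__":
    print(max_search([3, 9, 2, 9, 1]))
```

Peak picking of this kind appears, for example, in codebook searches, where the index of the best match matters more than its value.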

PARTHUSCEVA PALMDSPCORE

at a glance:

  • Seven arithmetic units support SIMD and MIMD instructions.
  • The core offers parallelism using 16- and 32-bit-wide instructions.

PalmDSPCore is a family of three licensable, highly parallel, dual-MAC soft DSP cores (of 16, 20, and 24 bits) targeting 2.5G and 3G terminals, voice-over-IP gateways, streaming audio/video, and infrastructure applications. PalmDSPCore is an instruction-level-parallelism architecture including MIMD (multiple-instruction-multiple-data) and SIMD (single-instruction-multiple-data) instructions and seven computation units working in parallel. The symmetrical cross-coupled MAC (multiply-accumulate) paths allow non-FIR-oriented algorithms, such as a complex radix-2 FFT butterfly, to execute in two cycles. PalmDSPCore has internal mechanisms and special instructions to reduce power consumption, in addition to active, slow, and idle power-management modes. PalmDSPCore has two multipliers; a three-input ALU; a three-input split adder-subtracter unit; four orthogonal, 40/48/56-bit accumulators; and a bit-manipulation unit. The data-address arithmetic unit contains two additional adder-subtracter units, which can perform control functions in parallel with the arithmetic calculations in the computation and bit-manipulation units. It has six pipeline stages and zero-overhead-loop mechanisms with infinite levels of repeat and block repeat. PalmDSPCore increases code density by using variable instruction width (16 or 32 bits) so that a complete N-tap FIR filter is coded in only four words and executes in N/2+1 cycles. You can extend program memory to 32 Mbytes. PalmDSPCore is a process- and library-independent, fully synthesizable soft core, compatible with previous SmartCores generations, including Teak, TeakLite, and OakDSPCore.
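The N/2+1 cycle count for an N-tap FIR filter is what one would expect when two multipliers each consume one tap per cycle. The Python sketch below models that schedule for a single output sample. Attributing the extra cycle to combining the two partial sums is an assumption made for the illustration, as is the restriction to an even number of taps.

```python
def fir_dual_mac(samples, coeffs):
    """Compute one FIR output with two MACs per modeled cycle.

    Returns (output, cycles). samples and coeffs must have the same,
    even length; samples[i] is the input paired with coeffs[i].
    """
    taps = len(coeffs)
    if len(samples) != taps or taps == 0 or taps % 2:
        raise ValueError("need equal-length inputs with an even tap count")
    acc0 = acc1 = 0
    cycles = 0
    for i in range(0, taps, 2):
        acc0 += samples[i] * coeffs[i]          # first multiplier
        acc1 += samples[i + 1] * coeffs[i + 1]  # second multiplier
        cycles += 1
    cycles += 1  # assumed final cycle: add the two accumulators
    return acc0 + acc1, cycles


if __name__ == "__main__":
    print(fir_dual_mac([1, 2, 3, 4], [4, 3, 2, 1]))
```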

Addressing and processing modes: PalmDSPCore supports circular buffering, register, short- and long-direct, short- and long-immediate, relative, bit-reversal, double-word, parallel, index-based, conditional, and stack-pointer addressing. It also supports quadruple indirect addressing mode to simultaneously feed four inputs of the two multipliers or four inputs of the split ALU.

Special instructions or integral-peripheral functions: PalmDSPCore has single, parallel, and multiparallel instruction sets that include SIMD and MIMD instructions. The core can execute as many as 18 operations in a single cycle using a 16- or 32-bit instruction width. It supports DSP and microcontroller instructions such as dual-MAC, complex FFT butterfly in two cycles, Viterbi decoding in two cycles, vector quantization, delayed branches/return, normalization, exponent, conditional instructions (parallel moves, logic, arithmetic, and accumulator), single-cycle 40/48/56-bit shift left/right, and bit-field and insert-extract operations. For accelerating specific tasks, you can further extend the core with as many as 16 custom accelerators.

PARTHUSCEVA PINEDSPCORE

at a glance:

  • The DSP-and-control instruction set is compact.
  • PineDSPCore is a licensable DSP core.

PineDSPCore, the first generation of the SmartCores family, is a 16-bit, fixed-point, single-MAC (multiply-accumulate) unit, licensable DSP core. It has a compact DSP-and-control instruction set for high code density. PineDSPCore has two data buses and one program bus, a configurable ROM/RAM size, and a data-arithmetic-addressing unit. The computation unit includes a multiplier; a 32-bit product register; a 36-bit ALU; two 36-bit accumulators, including four guard bits; and a normalization mechanism. The ALU performs arithmetic and logic operations, such as step division and rounding. PineDSPCore includes two zero-overhead loop mechanisms: a single-word instruction loop and a block repeat. It has user-definable registers for hardware acceleration, coprocessor support, or both. It has three pipeline stages and single-cycle interrupt latency. Power management includes active-, slow-, and idle-operation modes.
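Guard bits give an accumulator headroom above its 32-bit result so that a run of additions can exceed the 32-bit range without wrapping. The Python sketch below contrasts a 36-bit accumulator with a plain 32-bit one. Saturating to 32 bits when the result is read out is an assumption for the example; the source does not describe how the PineDSPCore stores results.

```python
def wrap(value, bits):
    """Wrap value into the two's-complement range of `bits` bits."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half


def saturate(value, bits):
    """Clamp value to the two's-complement range of `bits` bits."""
    half = 1 << (bits - 1)
    return max(-half, min(half - 1, value))


def accumulate(products, acc_bits=36, out_bits=32):
    """Sum products in an acc_bits-wide register, then saturate."""
    acc = 0
    for product in products:
        acc = wrap(acc + product, acc_bits)
    return saturate(acc, out_bits)


if __name__ == "__main__":
    products = [1 << 30] * 4                # true sum is 2**32
    print(accumulate(products))              # guard bits: clamps to max
    print(accumulate(products, acc_bits=32)) # no guard bits: wraps
```

With four guard bits, the true sum survives in the accumulator and the output clamps to the largest positive value; without them, the running sum wraps and the result is meaningless.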

Addressing and processing modes: PineDSPCore supports register, single- and double-indirect, and short- and long-immediate addressing modes. It has a program-memory indirect-addressing mode and supports circular (modulo) buffering for all its pointers and direct addressing for the entire 64k-word data space.

Special instructions or integral-peripheral functions: Instructions include conditional accumulator modifications, conditional and unconditional call and branch, arithmetic and logical operations, round, rotate, shift, compare, division step, MAC, square, single-word repeat loop, and block repeat.

PARTHUSCEVA TEAK AND XPERTTEAK

at a glance:

  • All instructions are 16 bits wide, including dual-MAC instructions.
  • Teak handles complex FFT in five cycles and Viterbi decoder in three cycles.

The fully synthesizable, low-power, 16-bit, fixed-point, dual-MAC (multiply-accumulate) unit, licensable Teak DSP soft core has an instruction-level-parallelism capability. It targets cellular terminals, digital cameras, voice-over-IP gateways, and Internet-audio applications. Teak has active, slow, and idle power-management modes in addition to internal mechanisms to reduce power consumption. Teak includes a configurable memory size; a data-address-arithmetic unit; two multipliers; a 40-bit, three-input, split ALU; four 40-bit accumulators; an exponent unit; and a bit-manipulation unit. It has integrated accelerators for complex FFT; Viterbi decoding; RTOS; and bit-exact standards, such as GSM communications. It has zero-overhead-loop mechanisms with infinite levels of repeat and block repeat, vectored interrupt, small interrupt latency, and wide-automatic-context-switching. Designers can extend Teak’s program memory to 8 Mbytes and can extend the core with hardware accelerators via the user-definable registers. Teak is code-compatible with the OakDSPCore and TeakLite instruction sets of SmartCores.

The XpertTeak is a fully synthesizable, process-independent, licensable option of a low-power, programmable-DSP core, targeting cellular, image, video, audio, speech, and voice-over-packet applications. XpertTeak is available as a stand-alone DSP SOC (system on chip) or embedded with an ARM SOC. The embedded offering includes an AMBA bridge and an APB bridge. XpertTeak is based on the dual-MAC Teak DSP core with the addition of a high-performance DMA controller, buffered time-division-multiplexing ports, a host-processor interface, and timers. The power-management unit can reduce the power consumption of inactive peripherals.

Addressing and processing modes: Teak supports circular (modulo) buffering, register, short- and long-direct, short- and long-immediate, relative, bit-reversal, and short- and long-index-based addressing modes. It can also perform quadruple-indirect addressing (for example, to simultaneously feed four inputs of the two multipliers or four inputs of the split ALU). The XpertTeak has separate instruction- and data-memory ranges. It can access as much as 8 Mbytes of program memory and 8 Gbytes of data memory. It incorporates a synchronous SRAM interface for the memories.

Special instructions or integral-peripheral functions: Instructions include dual-MAC operation, read/write double words to and from memory, and single-cycle minimum/maximum search with pointer latching. Teak handles complex FFT butterfly in five cycles and Viterbi decoding in three cycles. It includes bit-manipulation and microcontroller instructions, double-precision multiplication, normalization, single-cycle exponent evaluation, conditional instructions, coprocessor support, division step, and infinite levels of repeat and block repeat. All the instructions are 16 bits wide, supporting compact code. Teak supports cycle stealing and burst-mode DMA, program boot, and code downloading.

XpertTeak includes an eight-channel DMA controller, two buffered time-division multiplexing ports, an 8/16-bit host port that facilitates interfacing to a variety of host processors in the stand-alone offering or AMBA and APB bridge in the embedded offering. XpertTeak also includes a power-management unit, serial I/O, a vectored interrupt-control unit, timers, PWM generators, JTAG, and an on-chip-emulation-module.

PARTHUSCEVA TEAKLITE AND VOPSTREAM

at a glance:

  • TeakLite is a low-power-consumption, licensable core.
  • VoPStream is a licensable design for voice-over-packet applications.

The 16-bit, fixed-point, single-MAC (multiply-accumulate)-unit, licensable TeakLite soft DSP core is code-compatible with the OakDSPCore instruction set. It enhances the OakDSPCore in portability, frequency, power consumption, and area. It is a process- and library-independent soft core, increases operating speed by 30% in the same process technology, and reduces power consumption by architecture and power-reduction mechanisms. Its design methodology better meets ASIC-design-environment requirements by employing a single-edge design (still with four pipeline stages) and full or partial testability and by using standard memories. TeakLite has a configurable memory size, a data-address-arithmetic unit, a multiplier, an ALU, four 36-bit accumulators, a bit-manipulation unit, and zero-overhead loop mechanisms for repeat and block repeat. Its instruction set includes microcontroller instructions enabling high code density. It has user-definable registers for hardware acceleration, coprocessor support, or both; cycle-stealing DMA support; and active, slow, and idle power-management-operation modes.

The VoPStream, previously known as VoPKey, is a licensable design for VoP (voice-over-packet) applications. The company based VoPStream on the TeakLite DSP architecture. It targets VoDSL (voice-over-DSL), VoCable (voice-over-cable), enterprise/residential gateways, and IP-PBX (Internet protocol-private-branch exchange) and IP (LAN) phone applications. VoPStream can operate alone, or you can use it as a subsystem in an integrated-networking/VoP SOC. VoPStream includes software for speech compression and decompression, echo cancellation, and other associated telephony functions. The software is open, allowing designers to add proprietary algorithms. An evaluation board is available, based on a mass-production chip that incorporates the VoPStream design and includes a set of development tools. The DSP firmware includes speech codecs, such as G.723, G.726, G.729, and G.711; G.168 echo cancellation; VAD; CNG; DTMF; fax/modem detection; caller ID; and many more algorithms.

Addressing and processing modes: TeakLite supports register, single- and double-indirect, short- and long-immediate, short- and long-index, stack-pointer, and program-memory-indirect addressing modes. It supports circular (modulo) buffering for all its pointers and direct addressing for the entire 64k-word data space. VoPStream supports synchronous and asynchronous interfaces to external memories, such as DRAM and flash for data storage. VoPStream can function as a slave device using the host-port interface. It can communicate with pulse-code-modulation devices and E1/T1 standard peripherals using SPI and time-division-multiplexing interfaces.

Special instructions or integral-peripheral functions: Instructions include single-cycle minimum/maximum calculation with pointer latching, double-precision calculations, normalization, single-cycle exponent evaluation, conditional accumulator modifications, division step, read-modify (add/subtract/OR/AND/XOR)-write, test 16-bit mask bits and test bit, delayed return, interruptible single-word repeat loop and block repeat, 36-bit shift left or right in a single cycle, and a bank exchange of alternative registers. It also has support for program boot and code downloading. The VoPStream on-chip serial port permits direct interface with PCM highways or to ADCs and DACs, and the 8/16-bit host port permits interfacing to a variety of host processors, including PowerPC, ARM, and MIPS. The VoPStream includes four optional on-chip sigma-delta analog codecs.

Development support: ParthusCeva provides GUI-based development tools for all the DSP cores and platforms. These tools include an optimizing C++/C-compiler, an assembler, a linker, common-object-file-format converters, a debugger with an emulation interface and a Matlab interface, the Assyst extendable simulator for system-on-chip simulation, a profiler, and the ability for multicore debugging. ParthusCeva offers evaluation-development boards for each core for specific market segments, such as image, video and VoIP (voice-over-IP), and for integrating with RISC architectures. ParthusCeva offers various algorithms, such as speech codecs, audio, and video, as well as technical training courses and design services. ParthusCeva’s infrastructure of third-party vendors offers software algorithms, development tools, and design services.

PHILIPS SAF7730

at a glance:

  • Fully integrated audio and radio processing includes ADCs and DACs.
  • The device integrates two independent radio-reception paths.

The SAF7730 mixed-signal DSP targets the automotive-car-radio market. The multicore DSP integrates several Philips DSP cores for audio and radio processing, and it integrates analog IF input, digital-radio and audio processing, sample-rate converters, and digital and analog audio output into a single device. The DSP core features a double Harvard architecture and uses a fixed-point, 2’s complement notation for 24 bits of data and 12 bits of coefficients. It combines a high clock frequency and double-precision arithmetic, allowing designers to obtain high total-harmonic-distortion and SNR performance.

Addressing and processing modes: The SAF7730 supports direct and relative addressing. Relative addressing supports decrementing the pointer by one or incrementing it by one or two.

Special instructions or integral-peripheral functions: The SAF7730 includes a delay line unit, a data I/O unit, a multiplier-accumulator unit, and a microprocessor interface for I2S control.

Development support: Philips offers an assembler, a simulator, an application board, and a real-time GUI-based monitor program. Philips has defined assembler-code-flow and application templates to enable integration of library modules from the audio-feature library, which includes a set of efficient filter implementations, into target applications.

RC MODULE NEUROMATRIX NM6403

at a glance:

  • The vector coprocessor can handle variable-length, 1- to 64-bit data.
  • Variable-length data enables speed and precision trade-offs.

RC Module’s NeuroMatrix NM6403 dual-core, application-specific DSP is based on the NeuroMatrix architecture targeting video-image processing and neural-network applications. It provides scalable performance, a programmable operand width of 1 to 64 bits, and operation as fast as 50 MHz. This flexibility allows designers to trade precision for performance to suit their applications. The NM6403 processor includes a 32/64-bit RISC processor and a 1- to 64-bit vector coprocessor that supports vector operations with elements of variable bit lengths (patent pending). Two identical programmable interfaces work with external memory, and two communication ports are hardware-compatible with Texas Instruments’ TMS320C4x, allowing designers to build multiprocessor systems.

The vector coprocessor, which has an SIMD (single-instruction-multiple-data) architecture, works on packed integer data comprising 64-bit blocks in the form of variable 1- to 64-bit words. The device supports vector-matrix or matrix-matrix multiplication. The vector coprocessor’s core looks like an array multiplier comprising cells that include a 1-bit memory (flip-flop) surrounded by several logical elements. Designers can combine the cells into several macrocells with two 64-bit programmable registers. These registers define the borders between rows and columns with macrocells. Each macrocell performs the multiplication on variable-input words using preloaded coefficients and accumulates the result from the macrocells in the column above it. The columns simultaneously calculate the results in one processor cycle. For 8-bit data and coefficients, the vector coprocessor performs 24 MAC (multiply-accumulate) operations with 21-bit results in one 20-nsec processor cycle. The number of MAC operations depends on the length and number of words packaged into a 64-bit block. The engine’s configuration can change dynamically during calculations. An application can start with maximum precision and minimum performance and dynamically increase performance by reducing the data-word lengths. To avoid arithmetic overflow, the NM6403 uses two types of saturation functions with user-programmable saturation boundaries.
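To make the idea of variable-width words inside a 64-bit block concrete, the Python sketch below packs and unpacks signed fields whose widths are chosen at run time. It illustrates only the data layout. Treating the fields as signed two's-complement values stored from the least significant bit upward is an assumption, and the function names are not part of any RC Module tool.

```python
BLOCK_BITS = 64


def pack_fields(values, widths):
    """Pack signed integers into one 64-bit block.

    widths lists the bit width of each field and must total 64.
    """
    if len(values) != len(widths) or sum(widths) != BLOCK_BITS:
        raise ValueError("need one width per value, totaling 64 bits")
    block = 0
    shift = 0
    for value, width in zip(values, widths):
        half = 1 << (width - 1)
        if not -half <= value < half:
            raise ValueError("value does not fit its field")
        block |= (value & ((1 << width) - 1)) << shift
        shift += width
    return block


def unpack_fields(block, widths):
    """Recover the signed integers stored by pack_fields."""
    values = []
    shift = 0
    for width in widths:
        raw = (block >> shift) & ((1 << width) - 1)
        if raw >> (width - 1):      # sign bit set: sign-extend
            raw -= 1 << width
        values.append(raw)
        shift += width
    return values


if __name__ == "__main__":
    widths = [32, 16, 8, 8]         # boundaries can differ per block
    block = pack_fields([-100000, 1234, -5, 7], widths)
    print(hex(block), unpack_fields(block, widths))
```

Narrower fields put more elements in each block, which is the precision-for-throughput trade the entry describes. At the quoted 24 MACs per 20-nsec cycle, the peak rate for 8-bit data works out to 1.2 billion MACs per second.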

The VLIW (very-long-instruction-word) RISC core uses a five-stage pipeline that operates with 32- and 64-bit-wide instructions. Each instruction usually executes two operations. Two 64-bit interfaces support SRAM, DRAM, and EDO DRAM and comprise two separate address-generation units that can address as much as 16 Gbytes. Each interface supports two memory banks and can support a “shared-memory” mode. Two DMA coprocessors transfer data between high-speed I/O-communication ports and external memory.

Addressing and processing modes: The NM6403 supports 32-bit immediate, base, indexed, and relative addressing.

Special instructions or integral-peripheral functions: The NM6403 processor uses vector instructions to handle packets of as many as 32 64-bit data words. These instructions may define operations such as matrix-matrix, matrix-vector, or vector-vector multiplication; vector-vector addition and subtraction with saturation of results; block moving; and bit manipulation. The NM6403 has conditional branch, call, and return instructions.

Development support: The NeuroMatrix Software Development Kit for PCs includes an ANSI X3J16/95-0029 preliminary-standard-compatible C++ compiler, an assembler, an instruction-level simulator, a cycle-accurate simulator, a linker, a source-level debugger, a load/exchange library, and a set of application-specific vector-matrix libraries. RC Module offers PCI and CompactPCI evaluation/development boards for real-time DSP and video-image-processing designs. The vector-matrix library simplifies C-language programming for FFT, DCT, Sobel, and Hadamard Transform. RC Module also provides an NM6403 Verilog behavioral model for Sun host platforms for system-level simulation and a synthesizable core targeting Samsung and Fujitsu semiconductor technologies.

SENSORY RSC-XX

at a glance:

  • RSC processors are specialized for speech recognition and synthesis.
  • RSC processors support noise-robust speech recognition and high-quality, 5-kbps speech output.

The RSC-3x and RSC-4x speech processors combine a microcontroller with advanced speech-processing technology targeting high-quality speech recognition, speech and music synthesis, speaker verification, and record and playback. These devices feature a high-performance microcontroller with generous on-chip RAM and ROM, and DSP blocks, such as an independent digital filter bank and a multiplier. The RSC-4x family also features a vector processor to accelerate signal processing.

The RSC-3x and RSC-4x use a neural network to perform speaker-independent speech recognition. Both achieve high-quality speech-synthesis output using time- and frequency-domain-compression techniques. The RSC-4x uses Sensory’s SX compression technology supporting speech playback data rates as low as 5 kbps. The RSC-4x features Hidden Markov Model speech recognition enabling text-to-speaker-independent generation of noise-robust recognition sets with just a few keystrokes. Designers have access to 24 I/O lines for general-purpose product control.

Addressing and processing modes: RSC devices support sequential addressing modes. Sensory has expanded the RSC-4x address space to 1 Mbyte.

Special instructions or integral-peripheral functions: The RSC-3x and RSC-4x processors include an on-chip microphone preamplifier and ADC for speech input and a DAC with optional amplification for speech output. This configuration enables a single-chip option for speech-in/speech-out dialogue applications. Enabling true continuous listening in battery applications, the RSC-4x includes an audio-wake-up feature that monitors the environment independently of the microcontroller, and an audio event triggers the chip to wake from a 50-mA sleep mode. Sensory has added new 16-bit data instructions to the RSC-4x. The RSC-4x family features twin DMA units, as many as four comparator inputs, a watchdog timer, and as many as eight nested interrupt sources for power-saving wakeup.

SENSORY SC-6X

at a glance:

  • The SC-6x supports speech synthesis in bit rates covering a range of quality and memory requirements.
  • As many as 14 channels of music or 10 channels of music simultaneous with speech are possible.

The SC-6x DSP family targets speech- and music-synthesis applications. The Sensory speech algorithms support long-duration speech with data rates as low as 1 kbps using MX and the industry’s highest quality using CX. Six fixed-bit rates of CX, beginning at 3 kbps, and a variable range of MX bit rates are available to mix and match your quality and memory requirements. The SC-6x family supports 14-channel, polyphonic music and as many as 10 channels of music simultaneously with speech playback. These DSPs support three low-power modes, two timer interrupts, one DAC interrupt, and five general-purpose interrupts to improve battery life and response speed to button and keyboard presses. The SC-6x devices include 32 I/O lines to support speech and music synthesis and general-purpose product control with interactive interfaces.
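The trade-off between bit rate and memory is simple arithmetic, sketched below in Python. The function names are illustrative, 1 kbps is taken to mean 1000 bits per second, and the 60-second prompt length and 64-Kbyte ROM size are made-up examples rather than figures from Sensory.

```python
def storage_bytes(duration_s, bit_rate_bps):
    """Bytes needed to store duration_s seconds of compressed speech."""
    bits = duration_s * bit_rate_bps
    return -(-bits // 8)            # round up to whole bytes


def playback_seconds(memory_bytes, bit_rate_bps):
    """Seconds of speech that fit in memory_bytes at a given bit rate."""
    return memory_bytes * 8 / bit_rate_bps


if __name__ == "__main__":
    for rate in (1000, 3000):       # 1 kbps (MX) and 3 kbps (lowest CX)
        print(rate, "bps:", storage_bytes(60, rate), "bytes per minute")
    print(playback_seconds(64 * 1024, 1000), "s in a 64-Kbyte ROM")
```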

Addressing and processing modes: The addressing modes are immediate, direct, indirect-with-postmodification, and three relative modes. The program-counter unit comprises the program counter, the data pointer, a buffer register, a code-protection write-only register, and a hardware-loop counter for strings and repeated-instruction loops. It provides addressing for program memory (onboard ROM) and includes a 16-bit arithmetic block for incrementing and loading addresses. The program-counter unit generates a ROM address as output.

Special instructions or integral-peripheral functions: The SC-6x processors offer instructions to facilitate filtering algorithms, such as FIR, FIRK, COR, and CORK. FIR is useful for adaptive filtering or applications in which coefficients come from an external source. COR instructions perform 16×16-bit multiplies and 48-bit accumulation in three clock cycles. Instructions are available to perform 16×16-bit multiplies and 32-bit accumulation in two clock cycles.

Development support: The RSC and SC-6x development-tool suite includes an in-circuit emulator with an integrated debugger and a C compiler. Quick T2SI allows text entry and generation of recognition sets, and Quick Synthesis and the SCT6000 tool provide “pushbutton” speech compression. A tool for integration of MIDI-format music in an SC-6x application includes a rich selection of music voices (instruments). Demonstration units and evaluation and prototyping tools, such as the Voice Extreme Toolkit, are available. Each kit includes required hardware and software, complete documentation, and numerous examples. Turnkey product-development and linguistics services are available directly through Sensory or through its worldwide network of third-party-development houses.

SIROYAN SRXXX AND ONEDSP ARCHITECTURE

at a glance:

  • The VLIW DSP architecture can scale to as many as 32 dual-issue clusters.
  • OneDSP can perform as many as 25.6 billion MACs at 200 MHz.

Siroyan’s OneDSP architecture uses VLIW (very-long-instruction-word) and clustering techniques to provide scalable, high-performance DSP power allowing as many as 32 execution-unit clusters in a single core. Prevalidated configuration options include setting the number of clusters and endianness, as well as the cache-memory size and configuration. Each cluster consists of general-purpose registers, accumulators, a number of execution units, cache memory, local memory, and an on-chip bus interface. The master cluster executes either scalar RISC instructions from its instruction cache or VLIW instructions from its V-cache. In multicluster designs, VLIW instructions issue in parallel from the V-cache in each of the slave execution-unit clusters.

The foundry-independent SRA328 is the first implementation of the OneDSP architecture and enables SOC (system-on-chip) designers to deploy one to eight execution-unit clusters with 32-bit datapaths in a single IP (intellectual-property) core. The core targets a range of applications, from wireless communications to digital control and speech processing. The SRA328 offers a choice of architectural configurations, memory subsystems, and selectable, application-specific instructions. A combination of architecture-design and process optimizations, including shutting down clusters when they’re not in use and a variety of power-down modes, contributes to the system’s power efficiency.

Addressing and processing modes: In addition to normal RISC addressing modes, OneDSP supports autoincrement, autodecrement, circular-buffer, and bit-reversed addressing modes.

Special instructions or integral-peripheral functions: OneDSP supports Galois-field arithmetic for Reed-Solomon-coding applications and encryption algorithms. The SRXXX cores have an integrated DMA engine capable of basic scatter/gather functions and bit-reversed addressing, and they ship with a sample system that includes an AMBA AHB and an APB bus system, an external memory interface, and an area of on-chip SRAM.
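
To give a sense of what Galois-field arithmetic involves, here is a minimal Python sketch of multiplication in GF(2^8), the field that many Reed-Solomon codes use. The function name and the reduction polynomial 0x11D are assumptions chosen for illustration; the source does not say which fields or polynomials OneDSP supports.

```python
def gf256_mul(a, b, poly=0x11D):
    """Multiply two GF(2^8) elements using shift-and-XOR (carry-less) arithmetic.

    poly is the reduction polynomial; 0x11D is a common choice and is
    an assumption here, not a documented OneDSP parameter.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^n) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly        # reduce when the degree reaches 8
        b >>= 1
    return result
```

Because addition is XOR and there are no carries, a hardware implementation of this operation is far cheaper than the loop above suggests, which is why dedicated instructions help Reed-Solomon encoders and decoders.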

Development support: Siroyan’s OneDSP development environment runs on Unix, Linux, or Windows OSs. The debugging interface connects via a Nexus 5001 interface to a debugging adapter, offering Class 4 compliance at rates reaching 100 MHz. A debugging adapter is available for connecting the debugging board on the target development board to the host computer via Ethernet, allowing programmers to share target boards. Siroyan supplies a tool chain for application-software development, including a GNU C compiler for scalar code, an optimizing C compiler for both scalar and VLIW code, an assembler, a debugger, and an OS kernel. Siroyan also works with third-party developers to deliver software and tools.

STARCORE STARCORE DSP

at a glance:

  • StarCore supports user-defined instructions.

StarCore is a new company established to license IP (intellectual property) of the StarCore DSP and peripheral blocks. This synthesizable DSP core uses a fixed-point architecture with an extensible, 16-bit instruction word, and it targets communications applications, such as 2.5, 2.75, and 3G mobile handsets, wireless base stations, and communication-infrastructure devices. Low power dissipation helps extend battery life and meet power-per-channel budgets. The StarCore DSP employs parallelism to enable compact code that requires smaller memories. Designers can use a single DSP architecture and reuse key kernels and code for midlevel as well as advanced applications.

Addressing and processing modes: The StarCore DSP architecture supports register-direct mode, address-register-indirect mode, and program-counter-relative modes. For address-register-indirect modes, the architecture supports linear, reverse-carry, modulo, and multiple-wraparound-modulo arithmetic types.

Special instructions or integral-peripheral functions: The StarCore DSP multipliers support all combinations of signed and unsigned operands and both fractional and integer formats. The architecture supports an SIMD (single-instruction-multiple-data) version of maximum and minimum additions and subtractions (MAX2, ADD2, SUB2). It can perform eight 16-bit additions or maximum and minimum operations per cycle and includes MAX2VIT, which works with Viterbi shift left to accelerate Viterbi decoding algorithms. A user-defined instruction-set-accelerator module enhances the StarCore DSP standard instruction set.
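
The reason maximum operations matter for Viterbi decoding is the add-compare-select step at the heart of the algorithm. The Python sketch below shows a generic add-compare-select for one trellis state; it is not a model of MAX2VIT itself, and the function name and the tie-breaking rule are assumptions.

```python
def add_compare_select(pm_a, bm_a, pm_b, bm_b):
    """One Viterbi add-compare-select step for a single trellis state.

    pm_* are path metrics of the two predecessor states and bm_* are the
    branch metrics of the transitions into this state. Larger metrics are
    treated as better, which is why a maximum operation applies.
    Returns (survivor_metric, decision), where decision is 0 if path A
    survives and 1 if path B survives; ties go to path A (an assumption).
    """
    cand_a = pm_a + bm_a   # add
    cand_b = pm_b + bm_b
    if cand_a >= cand_b:   # compare
        return cand_a, 0   # select
    return cand_b, 1
```

A decoder repeats this step for every state at every symbol, so performing several additions and maximum operations per cycle directly shortens the inner loop.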

Development support: StarCore provides direct support as well as services, including macro hardening, design support, and training for implementing a DSP core into an SOC (system on chip). It has established alliances with a network of leading third-party tool providers, OSs, and application software, giving developers alternatives to choose from, including Metrowerks (www.metrowerks.com), Green Hills (www.ghs.com), Altium/Tasking (www.tasking.com), Quadros (www.quadros.com), OSE Systems (www.ose.com), Trinity Convergence (www.trinityconvergence.com), Signals and Software (www.signalsandsoftware.com), HelloSoft (www.hellosoft.com), and Numerix (www.numerix-dsp.com).

STMICROELECTRONICS ST100

at a glance:

  • The ST122 can perform 1.2 GMACs (billion MACs)/sec at 600 MHz.
  • Interfaces support customizable coprocessors.

The general-purpose, 16-bit, fixed-point ST100 family architecture is suitable for integration into SOCs (systems on chip) targeting wired- and wireless-communications, automotive, or multimedia applications. The instruction set features DSP instructions as well as 32- and 16-bit microcontroller instructions for enhanced performance and code density. The architecture supports a 4-Gbyte memory space, 40-bit registers and accumulators, four idle modes for power-consumption reduction, and three zero-overhead nestable loops. It is scalable between high-performance and low-power operation.

Addressing and processing modes: The ST100 family supports 13 addressing modes, including circular, which suits FIR filtering, and bit reverse, which optimizes FFT implementations. Data-memory accesses handle bytes, half-words (16 bits), and words (32 bits).
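
Bit-reversed addressing helps FFTs because a radix-2 FFT reads or writes its data in bit-reversed index order. The following Python sketch, with hypothetical function names, shows the index sequence that such an addressing mode generates in hardware; it is a generic illustration and not ST100 code.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    rev = 0
    for _ in range(bits):
        rev = (rev << 1) | (index & 1)
        index >>= 1
    return rev


def bit_reversed_order(n):
    """Return the bit-reversed index sequence for an n-point FFT (n a power of two)."""
    bits = n.bit_length() - 1
    if n < 1 or (1 << bits) != n:
        raise ValueError("n must be a power of two")
    return [bit_reverse(i, bits) for i in range(n)]
```

For an eight-point transform, the sequence is 0, 4, 2, 6, 1, 5, 3, 7. Without hardware support, software must compute or look up each of these indexes, which adds overhead to every FFT.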

Special instructions or integral-peripheral functions: The instruction set supports predication for most of its instructions, packed arithmetic, and a special instruction for Viterbi. The ST122 core supports dual 16×32-bit MAC (multiply-accumulate operations) for audio applications and multimedia-specific instructions. The ST122 subsystem includes a scalable program-cache-memory interface, which dedicated program-cache instructions control. The ST122 core can interface with as many as four tightly coupled coprocessors to improve system performance and power consumption.

Development support: STMicroelectronics and third-party partners, such as Green Hills (www.ghs.com) and OSE Systems (www.ose.com), offer a complete suite of evaluation boards, software-development tools, and application-software libraries to support hardware and software developments of ST100-based SOCs. Development tools are available for Windows and Unix host systems.

TENSILICA XTENSA CORE WITH VECTRA DSP ENGINE

at a glance:

  • Extensible core offers additional user-defined execution units and instructions.
  • DSP options include dual or quad multiply-accumulate unit.

Tensilica’s Xtensa V processor is a configurable, extensible, and synthesizable processor core. Designers can add DSP extensions via a Web-based configurator as well as application-specific instructions to define new registers, register files, and custom data types. The Xtensa processor generator automatically builds a correct-by-construction RTL description as well as a software tool set that incorporates the new instructions. The base architecture includes a 32-bit RISC ALU, as many as 64 general-purpose registers, and 80 base instructions, including 16- and 24-bit RISC instruction encoding with combined branch instructions, such as combined compare-and-branch and zero-overhead loops, and bit manipulations, including funnel shifts and field-extract options.

DSP engines in the Vectra family are fixed-point coprocessors for Tensilica’s Xtensa architecture. The Vectra DSP engines use an SIMD (single-instruction-multiple-data) architecture that allows vector registers to maintain data, coefficients, and intermediate results of an algorithm. The Vectra engine’s large vector register file helps reduce memory-bandwidth requirements and improves overall system performance. The Vectra engine supports single- and double-width operand sizes for greater computational accuracy. The Vectra instruction set extends the capability of the basic Xtensa microprocessor core.

Addressing and processing modes: Xtensa supports both little-endian (PC-compatible) and big-endian (Internet-compatible) address models as a configuration parameter and provides optional support for zero-overhead loops and an MMU with multiple memory-protection modes. Xtensa supports 8-, 16-, 32-, 64-, and 128-bit memory references. The Vectra DSP engine’s four addressing modes include immediate and indexed with or without updates to the base register.
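
The difference between the two address models is only the order in which the bytes of a multibyte value occupy memory. This short Python sketch, with hypothetical helper names, shows how the same 32-bit word is laid out under each model; it is a generic illustration rather than Xtensa-specific code.

```python
def store_word(value, byteorder):
    """Return the four bytes of a 32-bit word in ascending address order."""
    return list(value.to_bytes(4, byteorder))


def load_word(memory, byteorder):
    """Reassemble a 32-bit word from four bytes in ascending address order."""
    return int.from_bytes(bytes(memory), byteorder)
```

Storing 0x12345678 yields the bytes 0x78, 0x56, 0x34, 0x12 in little-endian order and 0x12, 0x34, 0x56, 0x78 in big-endian order, so data that a core writes under one model must be byte-swapped before a core using the other model can read it.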

Special instructions or integral-peripheral functions: Configuration options include multipliers and MACs, DSP engines, a floating-point unit, variable processor-interface width (32-, 64-, or 128-bit), big- and little-endian byte ordering, on-chip debugging, a trace port, XLMI high-speed local interface, as many as 32 interrupts, memory-management options, local data and instruction caches, and separate ROM and RAM areas. Compound instructions include special shifts, compare/branch, and zero-overhead loop instructions. Special Vectra instructions include vector operations for load/store, arithmetic (add, multiply), binary operations, and bit packing and unpacking.

Development support: Designers use the Xtensa Processor Generator to configure and extend a family of core processors with application-specific functions. The Xtensa Processor Generator also automatically generates a complete suite of development tools, including a compiler, an assembler, a linker, and a debugger that match the particular Xtensa/Vectra hardware implementation. Tensilica also provides a cycle-accurate instruction-set simulator and a bus functional model.

Tensilica offers five main development tools, including a GNU-based software-development suite, the XCC (Xtensa C/C++ compiler), the instruction-set simulator and XTMP (Xtensa Modeling Protocol) API, the Mentor Graphics Xray debugger, and the TIE (Tensilica-instruction-extension) compiler. Third-party support includes Accelerated Technology’s (www.acceleratedtechnology.com) Nucleus Plus RTOS with Xtensa OSKit for Nucleus Plus and codelab developer suite, Wind River’s (www.windriver.com) VxWorks RTOS and the Tornado 2 development platform, Monta Vista’s (www.mvista.com) Hard Hat Linux, Sophia Systems’ (www.sophia.com) UniStac Xtensa (JTAG) in-circuit emulator, and Macraigor Systems’ (www.macraigor.com) Wiggler on-chip-debugging tool. Available third-party peripheral intellectual property includes AC3 decoder, Bluetooth, G723-1 Codec, G729AB Codec, MPEG-2 AAC decoder, MPEG-4 AAC decoder, MP3 encoder, MP3 decoder, and WMA decoder.

TEXAS INSTRUMENTS OMAP5910

at a glance:

  • OMAP5910 integrates DSP and RISC cores targeting multimedia-rich applications.
  • OMAP5910 offers system-on-a-chip functions with a flexible user interface.

The OMAP5910 embedded processor integrates a TMS320C55x DSP core with a TI-enhanced ARM925 on a single chip. The C55x DSP core has 64-kbyte dual-access RAM, 96-kbyte single-access RAM, 32-kbyte ROM, and three video-hardware accelerators (DCT/iDCT, pixel interpolation, and motion compensation). The C55x core has a six-channel DMA controller for high-speed data movement without DSP intervention. The ARM925 core is a 32-bit, pipelined RISC processor that performs 32- or 16-bit instructions and processes 32-, 16-, or 8-bit data. The ARM core includes an integrated LCD controller plus a 192-kbyte internal frame buffer. The OMAP5910 provides system DMA, multiple mailboxes, and a microprocessor-interface port for interprocessor communication.

Addressing and processing modes: The C55x DSP core supports single-data-memory-operand addressing that also supports 32-bit operands. It also supports dual-data-memory-operand addressing that parallel instructions use. The C55x DSP core supports absolute addressing, register-indirect-addressing, direct-addressing, and displacement mode. The C55x includes dedicated registers to support circular addressing for instructions that use indirect addressing.

Special instructions or integral-peripheral functions: The C55x DSP core has special instructions that can combine instructions to perform two operations. You can combine built-in parallel instructions with user-defined parallel instructions. It can also perform dedicated-function instructions, such as FIR filters, single and block repeat, eight parallel instructions, 10 multiply instructions, and eight dual-operand memory moves. The OMAP5910 includes an LCD controller, a camera interface, a USB 1.1 host and client, an MMC/SD (multimedia-card/secure-digital)-card interface, a keyboard, infrared support, I2C, Microwire, eight timers, a real-time clock, a nine-channel system DMA, eight serial ports, three UARTs, and hardware-video accelerators.

TEXAS INSTRUMENTS TMS320C2000

at a glance:

  • Devices combine performance and peripheral integration for the embedded-control industry.
  • These code-compatible DSPs target embedded-control applications.

The TMS320C2000 family of 17 code-compatible DSP controllers offers a combination of on-chip peripherals, such as flash memory, fast ADCs, and CAN modules. The family targets embedded-control applications, including consumer and industrial control (motor control, white goods, HVAC, and power tools), automotive, power-supply, and optical-networking applications. The TMS320C24x is a 16-bit DSP core. The TMS320F2810 and TMS320F2812 DSPs are 32-bit control DSPs with onboard flash memory and performance to 150 MIPS. The C28x core offers 300 MIPS of computational bandwidth with a signal-processing core optimized for control. It is fully code compatible with current devices in the C2000 family.

Addressing and processing modes: The C2000 DSP platform supports indirect and direct addressing.

Special instructions or integral-peripheral functions: The C2000 DSP platform integrates flash memory, a 12-bit ADC, an event manager optimized for pulse-width-modulation generation, CAN modules, and serial interfaces. The C28x also features a C-to-assembly ratio of 1-to-1 that allows developers to write their algorithms in a high-level language. These devices support “virtual-floating-point” programming that provides a floating-point machine on a fixed-point architecture.
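
The idea of running fractional math on a fixed-point machine rests on Q-format arithmetic, in which an integer holds a value scaled by a power of two. The Python sketch below illustrates the principle with a Q15 format; the function names and the choice of 15 fractional bits are assumptions, and the code is not TI’s library.

```python
Q = 15  # number of fractional bits (an assumed format for this example)


def to_q(x, q=Q):
    """Convert a real number to its Q-format integer representation."""
    return int(round(x * (1 << q)))


def from_q(v, q=Q):
    """Convert a Q-format integer back to a real number."""
    return v / (1 << q)


def q_mul(a, b, q=Q):
    """Multiply two Q-format values; the shift restores the scaling."""
    return (a * b) >> q
```

For example, `q_mul(to_q(0.5), to_q(0.25))` returns 4096, which `from_q` converts to 0.125. The programmer works with fractional values while the hardware performs only integer multiplies and shifts.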

TEXAS INSTRUMENTS TMS320C5000

at a glance:

  • The C5000 DSP platform offers more than 30 code-compatible devices.
  • C5501 and C5502 are 300-MHz, dual-multiply-accumulate-unit DSPs with less than 200 mW power dissipation that cost less than $10.

The TMS320C5000 DSP platform has more than 30 code-compatible devices and includes the TMS320C54x and TMS320C55x DSP generations. These devices use a modified Harvard architecture and target portable Internet, multimedia communications, and medical and biometrics applications. The TMS320C5420 and TMS320C5421 are both dual-core devices, and the TMS320C5441 is a quad-core device targeting high-channel-density applications. The TNET3010 has six C55x DSP cores, targeting high-density voice and access class E1/T1 telecom applications. Compared with a Pentium 4, it has three times the transistor count (approximately 180 million transistors), consumes 50 times less power, is 20% smaller, and supports more than 200 voice channels.

The C55x DSPs are source-code-compatible with the C54x DSPs. The C54x focuses on low power consumption, but the C55x takes power efficiency to a new level: A 300-MHz C55x delivers a maximum fivefold improvement in performance over a 120-MHz C54x and dissipates as little as one-sixth its core power. The C55x has 12 independent buses, and the C54x has eight. Both architectures include one program bus and an associated program-address bus. The C55x bus is 32 bits wide, and the C54x bus is 16 bits wide. The C55x has three data-read buses and two data-write buses; the C54x has two data-read buses and one data-write bus. Each data bus also has its own address bus. The corresponding address buses are 24 bits wide on the C55x and 16 bits wide on the C54x.
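
The bus totals follow from pairing every program or data bus with its own address bus. A small Python tally, using a hypothetical helper name, makes the arithmetic explicit:

```python
def total_buses(data_read, data_write, program=1):
    """Count buses when each program and data bus has a matching address bus."""
    data = data_read + data_write
    return 2 * (program + data)


c55x = total_buses(data_read=3, data_write=2)  # 2 * (1 + 5) = 12
c54x = total_buses(data_read=2, data_write=1)  # 2 * (1 + 3) = 8
```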

Addressing and processing modes: The C54x supports single-data-memory-operand addressing that also supports 32-bit operands. It also supports dual-data-memory-operand addressing that parallel instructions use. It provides immediate, memory-mapped, circular, and bit-reversed addressing. In addition to the C54x modes, the C55x supports absolute addressing, register-indirect-addressing, direct-addressing, and displacement mode. The C55x includes dedicated registers to support circular addressing for instructions that use indirect addressing. Programs can simultaneously use as many as five independent circular-buffer locations with as many as three independent buffer lengths. These circular buffers have no address-alignment constraints. The C54x supports two circular buffers of arbitrary lengths and locations.
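
Circular addressing lets a pointer step through a buffer and wrap to the start automatically, which is what a filter’s delay line needs. The following Python sketch models the address update for a buffer with an arbitrary base and length; the function names are hypothetical, and the code describes the general technique rather than the C54x or C55x register interface.

```python
def circular_next(addr, step, base, length):
    """Advance addr by step, wrapping inside the range [base, base + length)."""
    return base + (addr - base + step) % length


def walk(start, step, base, length, count):
    """Return the sequence of addresses that count successive accesses visit."""
    addrs = []
    addr = start
    for _ in range(count):
        addrs.append(addr)
        addr = circular_next(addr, step, base, length)
    return addrs
```

With a five-word buffer at base address 100, `walk(103, 1, 100, 5, 4)` visits 103, 104, 100, and 101. Hardware performs this wrap as part of the address update, so the loop body needs no explicit bounds check.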

Special instructions or integral-peripheral functions: The C54x performs dedicated-function instructions, such as FIR filters, single and block repeat, eight parallel instructions, multiply, accumulate, and subtract (10 multiply instructions), and eight dual-operand memory moves. The C55x also has special instructions that take advantage of the additional functional units and increase parallelism capabilities. User-defined parallelism allows you to combine instructions to perform two operations. You can also combine a built-in parallel instruction with a user-defined parallel instruction. The C5509 includes peripherals such as a USB interface and a multimedia-memory-card port.

TEXAS INSTRUMENTS TMS320C6000, TMS320DM642, AND TMS320DRI200

at a glance:

  • Performance can scale from 1200 to 4800 MIPS.
  • The TMS320DM642 consumes less than 1.5W of power.

The TMS320C6000 DSP platform, a general-purpose, VLIW (very-long-instruction-word) DSP architecture, targets advanced imaging, broadband, and wireless infrastructure. This architecture includes the fixed-point C62x and C64x DSP generations and the floating-point C67x DSP generation. The C6414, C6415, and C6416 DSPs offer maximum processor speeds of 600 MHz. These processors include large on-chip memories and target video and imaging, communications, and instrumentation applications. The C6411 device targets security, communications, and office-equipment applications. It is currently the lowest priced device in the C64x lineup, and it offers the lowest power of any device in the C6000 DSP platform. The C6713 is a floating-point DSP that is an upward migration path from the C6711. It adds I2S, I2C, and S/PDIF transmit support as well as an enhanced memory space.

The TMS320DM642 is a fully programmable, 600-MHz digital media processor using a C64x core that includes integrated multimedia and communication peripherals targeting video over IP, video-on-demand, multichannel digital-video-recording applications, and video encoding and decoding. Its C64x DSP core processor has 64 general-purpose registers of 32-bit word length and eight independent functional units that include two multipliers for a 32-bit result and six ALUs with VelociTI.2 extensions. It can complete four 32-bit MAC (multiply-accumulate) operations per cycle.

The DRI200 handles the baseband processing for HD Radio and incorporates digital channel source, data decoding, and demodulation functions. The IDM combines the memory and appropriate interfaces on a credit-card-sized board. TI bases the DRI200 on the C64x core. It is compatible with iBiquity’s (www.ibiquity.com) IBOC digital AM/FM system and can interface to an external microcontroller, DRAM, and SRAM. It is compatible with standard audio-DAC interfaces, includes JTAG emulation, and supports –40 to +85°C operation.

Addressing and processing modes: The C6000 DSP platform performs linear and circular addressing. Unlike other DSPs that have dedicated address-generation units, C6000 DSPs calculate addresses using one or more of their functional units. The DM642 performs dual 64-bit memory accesses each cycle and supports nonaligned memory accesses to enable sliding-window operations with SIMD (single-instruction-multiple-data) processing and circular addressing. The DM642 datapaths include support for packed data processing (SIMD), including quad 8-bit operations and dual 16-bit operations, useful for supporting video and image processing. Special instructions for key video-compression algorithms include sum-of-absolute-differences for motion estimation and average instructions for motion compensation.
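
Motion estimation relies on the sum of absolute differences to score how well a candidate block in a reference frame matches the block being encoded. The Python sketch below shows the computation on flattened pixel lists; the function names are hypothetical, and the code illustrates the metric rather than any DM642 instruction.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(block_a, block_b))


def best_match(target, candidates):
    """Return (index, score) of the candidate block with the lowest SAD."""
    scores = [sad(target, c) for c in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

An encoder evaluates this metric for many candidate positions per block, so an instruction that processes several 8-bit pixels at once removes most of the cost of the search.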

Special instructions or integral-peripheral functions: All C6000 DSP processors can conditionally execute all instructions, a method of reducing branching and thereby optimizing performance. On the C64x DSP, the MPYU4 instruction performs four 8×8-bit unsigned multiplies. The ADD4 instruction performs four 8-bit additions. Six of the C64x functional units can perform dual 16-bit addition/subtraction. Two of the functional units perform dual 16-bit compare, shift, minimum/maximum, and absolute-value operations. The M units also support dual 16-bit and quad 8-bit averaging operations as well as bit-expansion and bit-interleaving and -deinterleaving operations. Four of the six remaining functional units support quad 8-bit addition/subtraction operations. Two functional units support quad 8-bit compare and minimum/maximum instructions. Some instructions operate directly on packed 8- and 16-bit data. The C6411 peripheral set includes two multichannel buffered serial ports; 32-bit, 33-MHz PCI; three timers; 64-channel enhanced DMA; and 32-bit external-memory interface. The C6414, C6415, and C6416 peripheral set adds another multichannel buffered serial port, PHY interface for ATM (Utopia), and Viterbi and Turbo coprocessors.
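
Packed 8-bit operations treat one 32-bit register as four independent lanes. The Python sketch below models the lane behavior of a quad 8-bit addition and a quad 8×8-bit unsigned multiply. The function names mirror the instruction mnemonics only for readability; the wraparound on overflow and the ordering of the results are assumptions, not the documented C64x semantics.

```python
def lanes8(word):
    """Split a 32-bit word into four unsigned 8-bit lanes, lowest byte first."""
    return [(word >> shift) & 0xFF for shift in (0, 8, 16, 24)]


def add4(a, b):
    """Four independent 8-bit additions; each lane wraps modulo 256 (an assumption)."""
    result = 0
    for i, (x, y) in enumerate(zip(lanes8(a), lanes8(b))):
        # Masking to 8 bits keeps a carry from spilling into the next lane.
        result |= ((x + y) & 0xFF) << (8 * i)
    return result


def mpyu4(a, b):
    """Four unsigned 8x8-bit multiplies, returned as 16-bit products, lowest lane first."""
    return [x * y for x, y in zip(lanes8(a), lanes8(b))]
```

The key property is lane isolation: in `add4(0x01FF7F10, 0x01010101)`, the lane holding 0xFF wraps to 0x00 without carrying into its neighbor, giving 0x02008011 rather than the result of an ordinary 32-bit addition.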

The DM642 uses a two-level cache-based architecture. The Level 1 program cache is a 128-kbit, direct-mapped cache, and the Level 1 data cache is a 128-kbit, two-way set-associative cache. The peripheral set includes three configurable video ports; a 10/100-Mbps Ethernet media-access controller; a management data-I/O module; a VIC (VCXO interpolated control port); one multichannel buffered audio serial port; I2C bus module; two multichannel buffered serial ports; three 32-bit, general-purpose timers; a user-configurable 16- or 32-bit host-port interface; PCI; 16 general-purpose I/O pins; and a 64-bit, glueless external-memory interface that can interface to synchronous and asynchronous memories and peripherals.

TEXAS INSTRUMENTS TMS320DM310

at a glance:

  • The processor consumes less than 500 mW.
  • The processor provides real-time MPEG-4 video encoding and decoding.

The TMS320DM310 DSP-based programmable digital media processor integrates a C54x DSP core with an ARM9 32-bit RISC-processor core. It is one of the first devices of the digital-media-processor family to support real-time MPEG-4 video encoding and decoding at CIF (352×288-pixel) resolution and offers an integrated USB host supporting a direct-print capability. It is code-compatible with TMS320DSCx devices, and it supports both CMOS and CCD image sensors of as much as 6 million pixels with a less-than-1-sec shot-to-shot delay. Operating-system support includes Nucleus, Linux, Ultron, VxWorks, and WinCE.

Addressing and processing modes: The DM310 processor has an integrated SDRAM controller that supports SDRAM timing as fast as 100 MHz and provides continuous data access with low overhead.

Special instructions or integral-peripheral functions: The DM310 includes integrated USB-host and -device support, a real-time preview engine, integrated SRAM and external memory-interface controllers, 32 general-purpose I/Os, two serial ports, two multichannel, buffered serial ports, two UARTs, four timers, a watchdog timer, and seamless support for Compact flash, Smart Media, MMC/SD (multimedia-card/secure-digital) interfaces, and Memory Stick.

TEXAS INSTRUMENTS TMS320DSCX

at a glance:

  • Device clock speed is 100 MHz.
  • Device consumes less than 1W of power.

The TMS320DSC21, TMS320DSC24, and TMS320DSC25 DSPs are digital-imaging systems on a single chip that combine a TMS320C5000 DSP and an ARM7TDMI RISC processor, targeting media-processing and system-control functions. The chips integrate a video encoder with an on-screen display, an SDRAM controller with a bandwidth-transfer rate of 320 Mbytes/sec, and a preview engine that performs 30-frame/sec NTSC and PAL previewing (DSC21/DSC25). The DSCx family of products can achieve real-time processing of a full-resolution 2 million-pixel image with a 1-sec shot-to-shot delay. DSCx DSPs can support the capture of high-resolution still photos and can record video clips with audio and music from the Internet. These systems support digital-audio and -video formats, including real-time MPEG-1, MPEG-4, JPEG, M-JPEG, H.263, and MP3, as well as data-communication standards, such as IrDA (DSC21), USB, and RS-232.

Addressing and processing modes: Addressing modes include SDRAM, SRAM, flash-media, and removable-media interfaces. The SDRAM transfer rate is 80 Mbytes/sec with both ×32 (DSC21/24/25) and ×16 (DSC24) interface capabilities. The DSC24 enables 2-D-to-2-D data transfer from SDRAM to an on-chip image buffer, as well as direct SDRAM access via an SDRAM controller. The ARM can access the DSP via the host-port interface, and its bus controller has on- or off-chip access to general-purpose I/O, flash, Compact flash, and Smart Media applications.

Special instructions or integral-peripheral functions: In addition to the TMS320C54x DSP-generation instruction set, the DSCx DSP subsystem incorporates imaging enhancements to provide fast-block-based processing for imaging or video-encoding and -decoding functions.

Development support: Texas Instruments’ eXpressDSP real-time-software and development tools encompass development for all TMS320 devices and include the Code Composer Studio integrated development environment, the DSP/BIOS scalable real-time kernel, the TMS320 DSP algorithm standard set of coding conventions and APIs, and a third-party network. Evaluation modules, technical training classes, and customer-application support are also available.

The Innovator development kit for OMAP is a handheld, expandable, flexible development platform for OMAP. Code Composer Studio IDE for OMAP integrates all host and target tools into one environment, simplifying DSP configuration and optimization to take full advantage of the high-performance processing capabilities of the DSP core in the OMAP5910 device. CCStudio for OMAP addresses each phase of the code-development cycle, including designing, coding and building, debugging, analyzing, and optimizing. OMAP simulation includes both TMS320C5000 and ARM925 simulators for heterogeneous debugging.

The network-video-developer’s kit for the C6000 platform, based on ATEME’s (www.ateme.com) IEKC64x, is an evaluation and development board using the TMS320C6416 DSP. This kit includes applications for image, video-stream, and audio compression. Software support for the DM310 includes modules for audio, video, and imaging compression and transmission standards, including MPEG-1, MPEG-2, MPEG-4, H.263, JPEG, M-JPEG, AAC, MP3, and Windows Media Audio. Additional support for software integration includes a DSP and ARM framework, providing a “plug-and-play”-like environment and evaluation modules. The DM310’s object-code compatibility with the DSC2x platform enables developers to begin programming using DSC platforms and then port the development software to newer devices in the family as they become available.

3DSP SP-3, SP-5, AND SP-20/UNIPHY

at a glance:

  • Cores enable single-chip multifunction digital imaging.
  • 3DSP supplemented the SP-20/UniPHY core with additional IP to build 802.11a, b, and g subsystems.

The soft-IP (intellectual-property)-core, fixed-point DSP family, bus controller, peripherals, and microprocessor interfaces from 3DSP use a scalable 32-bit SuperSIMD (single-instruction-multiple-data) architecture. The core supports multiprocessor systems, program cache or direct-mapped program memory, 32 prioritized interrupts, 32 general-purpose I/O pins, and a JTAG-only debugging interface. The SP-3 core is a programmable, five-stage-pipelined DSP that targets MP3-player, home-audio (AAC, AC3), wireless-GSM-phone, GPS, and CPE (customer-premises-equipment) VOP (voice-over-packet)-processing applications.

The SP-5 core is a programmable, superscalar, dual-issue, five-stage-pipelined DSP that targets 3G wireless, VOP gateway, xDSL, MPEG-2, MPEG-4, and wireless-LAN applications. The SP-5flex core is a fully synthesizable and configurable DSP core, based on the SP-5 architecture, which supports balancing power, cost, and performance. Designers can change the memory size, register-file size, and number of function units, and they can add application-specific instruction sets. The SP-5flex targets VOP, digital-wireless, audio, video, imaging, and broadband-modem applications. The SP-5V is a programmable, superscalar, dual-issue, five-stage-pipelined DSP that targets VOP applications.

The programmable, dual-mode, nine-stage-pipelined SP-20/UniPHY DSP IP core targets multimedia applications including multimedia over wireless. UniPHY can execute at speeds of 400 MHz to 1 GHz because it supports a multiple-standard PHY implementation on the same processor. The “soft-datapath” technology and programmability enable a “softPHY” implementation that facilitates modification for changing physical-layer standards.

Special instructions or integral-peripheral functions: The 3DSP core supports two SIMD multiplier options. The first option is a dual 24×16-bit multiply that can perform two 24×16-bit multiplies, four 16×16 multiplies, or eight 8×16 multiplies in a single cycle. The second option is a dual 32×32-bit multiplier that can perform all the functions of the 24×16-bit multiplier and perform two 32×32-bit multiplies in one cycle. The 32×32-bit multiplier provides the highest quality audio processing.

The SP-20/UniPHY core combines accelerated versions of 3DSP’s SuperSIMD architecture and SP-x instruction set with an expansion-instruction mode that contains custom instructions targeting universal physical-layer signal processing for 802.11a/b/g, HiperLAN2, and xDSL. Over the last year, 3DSP supplemented its SP-20/UniPHY core with additional IP to build an 802.11a, b, and g subsystem that includes a Viterbi accelerator, MAC (multiply-accumulate) accelerator, and radio interface that a suite of optimized software for the baseband and MAC (media-access controller) supports.

Development support: Offerings from 3DSP include SOC (system-on-chip) integration tools and design services, including the DSP-Shuttle system-bus controller and the HiFI SOC-development environment. Developers using the DSP-Shuttle and HiFI software can progress from concept to silicon tape-out within nine months. The company also offers optimized suites of application software for multichannel-audio and MPEG4-video streaming multimedia, VOP, and wireless-LAN support.

The DSP-Shuttle is a high-speed, fully synthesizable and configurable DSP system-bus controller, based on the proprietary Intelligent DMA technology, to address the intense data flow of DSP applications. It features real-time and high-speed data transfer, dynamic data-dependent bandwidth allocation, and multicore support. It also provides a plug-and-play uniform interface to all system peripherals and enables low-power implementation.

The GUI-based HiFI SOC integration tool enables designers to co-develop hardware and software, configure the DSP core and memory subsystem, add peripheral devices, and test performance. They can use it to perform trade-offs among clock speed, area, and power consumption. The SOC integration tool includes Software Studio—a collection of software-development tools to edit, compile, assemble, optimize, debug, and manage application code.

Sketchpad development kits allow designers to optimize and evaluate the architectures, prototype hardware, and develop application software. The kits come ready for USB plug-and-play use with a Windows-based PC. All kits include a Sketchpad FPGA development board, Software Studio, power supply, and USB cables. Designers can download candidate hardware architectures onto the development board and evaluate the performance of each configuration using software for the application.

©1997-2009 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.