News and New Products
Global Designer: FPGAs implement high-end image-processing applications
By Pradeep Chakraborty, EDN Asia -- EDN, 7/7/2005
New FPGAs with more DSP resources and embedded-processing capabilities have made the global image-processing market more competitive. According to Rahul V Shah, ASIC manager at eInfochips, an ASIC-design and -verification-services provider, designers can offload software-implemented algorithms, such as DCT (discrete cosine transform), static Huffman coding, AES (Advanced Encryption Standard), color-space conversion, and gamma correction, to FPGA hardware. He claims that this approach improves system performance. Shah also notes that the conventional approach of implementing algorithms in software limits performance because a processor handles data serially, and raising the clock frequency beyond certain limits causes system issues. FPGAs, by contrast, offer flexible architectures and dedicated DSP blocks. Combining these features with parallel processing strikes a balance between system performance and cost. FPGAs are also reprogrammable, which results in a quick turnaround time.
According to Shah, parallel processing in software is impossible because a processor can execute instructions only one at a time. "If you want to run a DCT along with static Huffman with the same processor, one process at a time will execute," he says. "However, in hardware, because everything runs in parallel, you can have DCT and SHF [static Huffman] running in parallel at the same time without any performance hit." He adds that FPGAs provide the flexibility to upgrade to new standards and reprogram devices. For example, you can build a system-level application around DDR memory and later move it to DDR-2.
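The contrast Shah draws between one-at-a-time execution on a processor and concurrent blocks in hardware can be illustrated with a rough cycle-count model. The sketch below is not from eInfochips; the function names and the per-block cycle counts are hypothetical, and the hardware case assumes the DCT and static-Huffman stages are pipelined with a buffer between them.

```python
def serial_cycles(blocks, dct_cycles, huff_cycles):
    """One processor: each block runs DCT, then Huffman, before the next block starts."""
    return blocks * (dct_cycles + huff_cycles)


def pipelined_cycles(blocks, dct_cycles, huff_cycles):
    """Two hardware stages running concurrently on consecutive blocks.

    The first block pays the full latency of both stages; after that, one
    block completes every max(dct_cycles, huff_cycles) cycles.
    """
    if blocks == 0:
        return 0
    return dct_cycles + huff_cycles + (blocks - 1) * max(dct_cycles, huff_cycles)


# Hypothetical figures, chosen only for illustration.
BLOCKS, DCT_CYCLES, HUFF_CYCLES = 100, 64, 40

print(serial_cycles(BLOCKS, DCT_CYCLES, HUFF_CYCLES))     # 10400
print(pipelined_cycles(BLOCKS, DCT_CYCLES, HUFF_CYCLES))  # 6440
```

Under these assumed numbers, the slower stage alone sets the steady-state throughput in the pipelined case, which is one way to read the "without any performance hit" remark.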
Designers can also recombine these image-processing functions to create applications, such as video cell phones, set-top boxes, LCD projectors, keyboards, videos, mice, and digital cameras and camcorders, among others. "Implementing DSP algorithms for image-processing blocks, such as DCT, AES, and static Huffman, requires a huge amount of memory, multiplier, and accumulator blocks," says Shah. For example, a DCT at 133 MHz can take as many as 64 multiplication and addition operations. Designers can map these operations onto hardware multipliers and accumulators rather than performing them sequentially in software. Static Huffman and AES cores have high memory requirements for storing coefficient values and performing mathematical operations on incoming data. Handling these operations in software slows down system performance and overloads the CPU. Offloading these tasks to the hardware leaves the memory to store the image and frees the CPU to perform other control operations.
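One way to arrive at the figure of 64 multiplication and addition operations is a direct 8-point DCT, in which each of the eight outputs is a sum of eight products. The sketch below assumes that transform size, which the article does not state, and uses the standard orthonormal DCT-II formula; the function name and the MAC counter are illustrative only.

```python
import math

N = 8  # assumed transform length; the article does not state it


def dct_1d(samples):
    """Direct N-point DCT-II. Returns (coefficients, multiply-accumulate count)."""
    macs = 0
    coeffs = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        acc = 0.0
        for n in range(N):
            # One multiply and one add per inner-loop step.
            acc += samples[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            macs += 1
        coeffs.append(scale * acc)
    return coeffs, macs


coeffs, macs = dct_1d([10] * N)
print(macs)                 # 64
print(round(coeffs[0], 3))  # 28.284 (DC term of a constant input)
```

In software the 64 multiply-accumulate steps run one after another in the nested loop; in an FPGA, the same products could in principle be spread across dedicated multiplier blocks and computed concurrently.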
Implementing a dynamic Huffman algorithm in hardware, by contrast, would be a bad idea, because the algorithm requires dynamic calculations that need a large amount of hardware. Proper partitioning between hardware and software is therefore necessary. A well-partitioned design is more cost-effective and provides better performance than running every algorithm in software.
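The difference between the static and dynamic cases can be seen in a small sketch. With a static code, the table is fixed ahead of time, so encoding reduces to a memory lookup per symbol, whereas a dynamic scheme would have to update symbol statistics and codes as data arrives. The code table and function names below are hypothetical and chosen only for illustration; they are not taken from any particular standard or from eInfochips' cores.

```python
# Hypothetical fixed prefix-code table; a real design would use the table
# defined by its target standard.
STATIC_TABLE = {"a": "0", "b": "10", "c": "110", "d": "111"}


def encode_static(symbols):
    """Encode by pure table lookup; no statistics are gathered at run time."""
    return "".join(STATIC_TABLE[s] for s in symbols)


def decode_static(bits):
    """Decode by matching accumulated bits against the same fixed table."""
    inverse = {code: sym for sym, code in STATIC_TABLE.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free codes make the first match unique
            out.append(inverse[current])
            current = ""
    return "".join(out)


encoded = encode_static("abacad")
print(encoded)                 # 01001100111
print(decode_static(encoded))  # abacad
```

Because the table never changes, it can sit in on-chip memory, which fits the article's point that static Huffman cores are memory-heavy but well suited to hardware.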
eInfochips, www.einfochips.com.



