
You can easily design an image-compression system using an off-the-shelf Joint Photographic Experts Group (JPEG) compressor with the suggested default-parameter tables. Predicting the system's performance in terms of image quality, compression ratio, or processing time is not as simple, however. To attain predictable system performance, you must determine the algorithm parameters that achieve your design goals.
Bit rate and compression ratio measure the degree to which an image is compressed. Bit rate is the average number of compressed bits per pixel. Compression ratio is (original image size) : (compressed image size). Physical size, resolution, and the number of color components determine the original image size. The compressed image size is limited by your system or application. (If your system didn't have size limits, you wouldn't need compression.)
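The two measures are easy to compute once you know the image dimensions and the compressed size. The following is a minimal sketch; the function names and the 640×480 example image are assumptions made for illustration, not values from any particular system.

```python
def original_size_bits(width, height, components, bits_per_sample=8):
    """Uncompressed size in bits: pixels x color components x sample depth."""
    return width * height * components * bits_per_sample


def bit_rate(compressed_bytes, width, height):
    """Average number of compressed bits per pixel."""
    return compressed_bytes * 8 / (width * height)


def compression_ratio(original_bits, compressed_bytes):
    """Original size divided by compressed size (both measured in bits)."""
    return original_bits / (compressed_bytes * 8)


# Hypothetical example: a 640x480, three-component image compressed to 46,080 bytes.
original = original_size_bits(640, 480, 3)
print(compression_ratio(original, 46_080))  # 20.0, that is, 20:1
print(bit_rate(46_080, 640, 480))           # 1.2 bits per pixel
```

Working the problem in reverse is often more useful: start from the compressed size your system can tolerate and derive the bit rate your quantization tables must reach.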
Compression algorithms fall into two categories: lossless and lossy. Lossless compression isn't necessarily superior to lossy. Lossless compression, or entropy coding, strives to represent information as compactly as possible. The data's entropy, however, limits lossless compression; entropy coding can only squeeze out existing redundancy. The key to further compression is to discard information judiciously.
Lossy compression works by throwing out information that is not important. The lossy process introduces an error called distortion. Most compression algorithms for photographic images are lossy, because lossless compression is not able to condense all the image information to a practical size.
The JPEG standard defines a family of image-compression algorithms designed for color or gray-scale still photographic images (Reference 1). JPEG can also compress video frame by frame. The most important algorithm in the JPEG standard is the baseline sequential algorithm. (See box, "JPEG in a nutshell.") Baseline JPEG coding has both lossless and lossy steps.
Baseline JPEG's losses occur during a discrete cosine transform and subsequent quantization of the transform coefficients. The algorithm begins by dividing an image into 8×8-pixel blocks. It then transforms the image to the frequency domain one block at a time. Next, the algorithm performs a quantization, which involves dividing each of the 64 resulting frequency coefficients by the corresponding value in a set of 64 numbers. This set is the quantization table. Quantization rounds the quotient to the nearest integer. During decompression, the algorithm multiplies each quantized coefficient by the quantization-table value to restore an approximation of the original coefficient.
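The transform, quantize, and restore steps described above can be sketched in a few lines. This is an unoptimized illustration, not a production codec: the function names are hypothetical, the block is indexed as rows and columns, ties are rounded away from zero (an assumption), and the flat test table of 16s is made up for the example.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct_8x8(block):
    """Forward 2-D DCT of one 8x8 block (a list of 8 rows of 8 samples)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


def round_half_away(x):
    """Round to the nearest integer; ties go away from zero (assumed)."""
    magnitude = int(math.floor(abs(x) + 0.5))
    return magnitude if x >= 0 else -magnitude


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round the quotient."""
    return [[round_half_away(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]


def dequantize(levels, table):
    """Multiply each quantized level by its table entry (decoder side)."""
    return [[levels[u][v] * table[u][v] for v in range(N)] for u in range(N)]


# A perfectly flat block has energy only in the zero-frequency coefficient.
flat_block = [[20] * N for _ in range(N)]
flat_table = [[16] * N for _ in range(N)]
levels = quantize(dct_8x8(flat_block), flat_table)
restored = dequantize(levels, flat_table)
```

The only information lost is the remainder discarded by the rounding step; everything else in the chain is reversible to within arithmetic precision.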
Trade quantization for quality
A large value in the quantization table means that the corresponding coefficient is restored less accurately. However, the quantized coefficient is a small integer that requires only a few bits for its representation. The quantization-table values, thus, represent a trade-off between accuracy and bit-length for each spatial frequency. The idea behind quantization is that some spatial frequencies, typically the higher ones, are less important to the eye than others. The algorithm can, therefore, represent the less important frequencies more coarsely without seriously degrading image quality.
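To see the trade-off numerically, you can quantize one coefficient with a fine and a coarse step and compare the restoration error with the number of bits needed for the quantized magnitude. The sketch below is illustrative only: the function name and the coefficient value 203 are made up, and the magnitude bit count is just a proxy for coding cost, because the entropy coder adds its own code for each coefficient.

```python
def quantization_cost(coefficient, step):
    """Quantize one coefficient and report the accuracy/bit-length trade-off."""
    magnitude = int(abs(coefficient) / step + 0.5)  # round to nearest
    level = magnitude if coefficient >= 0 else -magnitude
    restored = level * step                         # what the decoder sees
    return {
        "level": level,
        "magnitude_bits": magnitude.bit_length(),
        "error": abs(coefficient - restored),
    }


fine = quantization_cost(203, 4)     # level 51, 6 magnitude bits, error 1
coarse = quantization_cost(203, 32)  # level 6, 3 magnitude bits, error 11
```

Halving the bit length here costs roughly a tenfold increase in error, which is acceptable only for frequencies the eye does not weigh heavily.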
Visually, the distortion introduced by quantization takes the form of rectangular artifacts. These artifacts range in size from 8×8 blocks to tiny speckles. Figure 1 shows a compressed image in which the 8×8 blocks are noticeable. These block artifacts result from quantization error in the zero-frequency (dc) coefficient of an originally smooth image.
Speckles come from quantization errors in high-frequency components. Just as high harmonics are significant in accurately reproducing a square wave, the high-frequency coefficients of an image become significant at high contrast edges.
Figure 2 shows an image having sharper contrast and more detail than Figure 1. Medium-sized and small rectangular artifacts are apparent in this image. In Figure 2 (c) and (d), for example, the 8×8 blocks containing asterisks are full of speckles. These patterns result from quantization error in high-frequency coefficients.
Obviously, then, developing the quantization table is a significant part of a JPEG system design. You can think of JPEG system design as having two parts. One part of the job is setting values for the parameters, which tend to be independent of the compression-coder implementation. These parameters include initial image resolution, color and color space, acceptable levels of distortion, and target compression ratio or compressed image size. This part of the job ends with a preliminary design of your quantization tables. The second part of the design is dealing with implementation-specific issues, such as cost, ease of development, and processing speed.
These two tasks are not separable. Image resolution and compression ratio, for example, affect processing speed. System constraints, such as available bandwidth and memory size, may limit the acceptable compressed image size and thus determine the target compression ratio. If you are designing for a particular application, however, you should start by setting values for the implementation-independent parameters. You can test the values early in the design using a simple prototype system. Further, the values also feed into the rest of the system design. Even if you are going to leave the selection of some of these parameters up to your customer, you need to make assumptions about the values to decide whether the implementation will meet the customer's needs.
Set independent parameters
Selecting the implementation-independent parameters requires that you make some trade-offs. For example, there is a trade-off between resolution and compressed-image size. A higher resolution image, in the absence of compression, shows more detail. As the number of pixels per inch increases, however, the compressed-image size also increases, although not necessarily in direct proportion. Coarse quantization reduces compressed-image size, but loses detail. If the compressed-image size is critical, you may achieve a better compromise by using a lower resolution image but compressing it more accurately, that is, with finer quantization.
Another implementation-independent parameter to consider is choice of color space. JPEG allows you to choose the color space representation and the resolution of each color component. Video images, for instance, ordinarily are divided into red, green, and blue components. In this color space, all three components need the same resolution. However, you can reduce the number of bits needed to represent an image by choosing a different color-space representation. For instance, in a YCrCb representation (Reference 2), the Y, or luminance, component represents image brightness. The other two components, called chrominance, convey hue. For human vision, it is more important to represent precisely the luminance than the chrominance. The chrominance components can, therefore, be subsampled so that there are two blocks of each of the two chrominance components for every four blocks of luminance. Reducing the number of chrominance pixels is a form of lossy compression.
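The savings from chrominance subsampling are easy to quantify. The sketch below halves the horizontal resolution of a chrominance plane and counts the samples before and after. Averaging neighboring pairs is an assumption made for the example (other filters are possible), the function names are hypothetical, and the plane width is assumed to be even.

```python
def subsample_horizontal(plane):
    """Halve horizontal resolution by averaging each pair of neighbors.

    Assumes every row has an even number of samples.
    """
    return [[(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
            for row in plane]


def total_samples(width, height, chroma_subsampled):
    """Samples in one luminance plane plus two chrominance planes."""
    luma = width * height
    chroma_width = width // 2 if chroma_subsampled else width
    return luma + 2 * chroma_width * height


print(subsample_horizontal([[10, 20, 30, 50]]))  # [[15.0, 40.0]]
print(total_samples(16, 16, False))              # 768: twelve 8x8 blocks
print(total_samples(16, 16, True))               # 512: four Y blocks, two of each chrominance
```

The input to the compressor shrinks by one-third before any transform coding takes place.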
The most slippery implementation-independent parameter is distortion, or image quality. Mean-squared error is one metric of distortion. Unfortunately, there is no good way to correlate mean-squared error with human perception of image quality. Instead, you must define image quality in terms of your application.
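For reference, mean-squared error is simple to compute, which is much of its appeal as a first-pass check while prototyping. This minimal sketch assumes the original and decompressed images are same-sized lists of rows for a single component; the function name is hypothetical.

```python
def mean_squared_error(original, decoded):
    """Average of the squared sample differences between two images."""
    total = 0
    count = 0
    for row_a, row_b in zip(original, decoded):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


print(mean_squared_error([[0, 0], [0, 0]], [[1, 1], [1, 3]]))  # 3.0
```

Note that the single large error in the example dominates the result, yet the number says nothing about where that error falls or whether a viewer would notice it.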
Many factors contribute to image quality in a given application. Display pixel size and the viewing distance affect perceived quality: Enlarging the image makes artifacts more apparent. You also need to consider which information in the image is important to the user. A user interested in the legibility of the low-contrast words "DOLLARS" and "CTS" in Figure 2 would demand less distortion than a user who only wanted to read the dark numbers. You must also address the nature of a typical image. For example, in photographs of natural scenes, small artifacts added by the compression routine may not be noticeable if they are camouflaged in textured regions. To define image quality, you need a suite of representative images for your application and set guidelines for evaluating their quality after compression.
Finally, decide on your needs for error recovery and random access within an image. In general, the JPEG decompressor operates on current data based on knowledge of prior data in the image. This dependency on prior data implies that you cannot begin decompression in the middle of the file, and the decompressor cannot recover after a bit error in the compressed data. The JPEG standard offers an optional restart capability, however, that gets around this limitation. Restart markers delimit restart intervals, which are horizontal segments of the image that can be decompressed independently. Because restart markers are easy to locate, you can have the decompressor extract only those image segments that are of interest. Also, the decompressor reinitializes itself at each restart marker, so errors do not propagate beyond the restart interval. Restart intervals are optional, and you choose their size. The intervals must contain an integral number of rows of 8×8 blocks.
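Restart markers are easy to locate because, inside entropy-coded data, any 0xFF data byte is followed by a stuffed zero byte, so the two-byte codes 0xFFD0 through 0xFFD7 can only be restart markers. The sketch below splits a scan's data into independently decodable intervals. It assumes the input holds only the entropy-coded bytes of one scan, with no fill bytes before markers; the function name is hypothetical.

```python
def split_restart_intervals(scan_data):
    """Split entropy-coded scan bytes at restart markers (0xFFD0-0xFFD7)."""
    intervals = []
    start = 0
    i = 0
    while i < len(scan_data) - 1:
        if scan_data[i] == 0xFF and 0xD0 <= scan_data[i + 1] <= 0xD7:
            intervals.append(scan_data[start:i])  # bytes before the marker
            start = i + 2                         # skip the two marker bytes
            i += 2
        else:
            i += 1
    intervals.append(scan_data[start:])           # final interval
    return intervals


# Made-up bytes: two restart markers, plus one stuffed 0xFF 0x00 data pair.
data = bytes([0x12, 0x34, 0xFF, 0xD0, 0x56, 0xFF, 0x00, 0x78, 0xFF, 0xD1, 0x9A])
pieces = split_restart_intervals(data)
```

A decoder that wants only part of the image can hand the relevant pieces to the entropy decoder and skip the rest; one that hits a bit error can discard the damaged piece and resume at the next.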
Quantization tables are key
Once you define your system requirements, you can develop quantization tables. A quantization table is the key implementation-independent parameter, because it controls distortion and bit rate. The quantization tables suggested in the JPEG standard may be appropriate for your application. The standard gives two tables: one for the luminance component and one for the chrominance components. The tables assume a viewing distance of 6 times the screen width with a luminance resolution of 720×576 pixels and a chrominance resolution of 360×576 samples (Reference 3, pg 36). Your resolution and viewing distance may be different. These tables define a standard not to achieve a target compression ratio, but to distort to the threshold of visibility. Pennebaker and Mitchell (Reference 1) found, however, that the tables yield noticeable artifacts when viewed on high-quality displays.
Unfortunately, you will find that optimum quantization tables are highly image dependent. For example, coarse quantization of the medium-frequency coefficients causes turbulence or speckling around text characters on a plain background. A complicated background may mask these artifacts, allowing coarse quantization to produce acceptable results. Indeed, you may wish that you could flip between quantization tables, using one to compress textured areas and another to compress smoother areas. Unfortunately, the existing JPEG standard does not allow changing the quantization table in the middle of compressing a component of the image. The JPEG committee is considering this as an extension.
To develop your own quantization tables, you need prototyping tools. These tools consist of JPEG-compression and -decompression software; a representative collection of images; and a display or other output device that models the one planned for your system. By altering the quantization tables and compressing the images, you can observe the trade-off between distortion and bit rate. (Figure 2 shows the effects of five quantization tables.) The challenge is gathering a sufficient collection of images.
If compressed-image size is not critical to your application, the choice of a quantization table is simple. If you require a small compressed-image size, however, you will find that achieving a quantization table that gives acceptable distortion over a range of test images takes a lot of experimenting.
A common, although unsophisticated, method for generating quantization tables is to apply scalar multipliers to the example luminance quantization table in the JPEG standard, Annex K. You manually adjust the multipliers until the images compress to the desired average bit rate. You can also try fine tuning the table by tweaking individual table values. If you need a more sophisticated method, you can try quantization-table-design algorithms developed by the researchers at Unisys Corporation (Reference 4). These algorithms give a better quality-vs-rate trade-off than does scaling the example table.
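The scaling method can be sketched as follows. The table values are the example luminance table as it is commonly reproduced; check them against your copy of the standard before relying on them. The function name is hypothetical, and the clamp to the range 1 to 255 reflects the 8-bit table entries used by baseline JPEG.

```python
# Example luminance quantization table (Annex K), as commonly reproduced.
ANNEX_K_LUMINANCE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scale_table(table, multiplier):
    """Scale every entry by `multiplier`, rounding and clamping to 1..255."""
    return [[min(255, max(1, int(q * multiplier + 0.5))) for q in row]
            for row in table]


coarser = scale_table(ANNEX_K_LUMINANCE, 2.0)  # fewer bits, more distortion
finer = scale_table(ANNEX_K_LUMINANCE, 0.5)    # more bits, less distortion
```

Because one multiplier moves all 64 entries together, this method cannot shift bits from one frequency band to another; that is what tweaking individual values or using a table-design algorithm buys you.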
Optimize Huffman tables
With any lossy compression technique, there are limits to the quality you can obtain for a given bit rate. You may discover that you cannot simultaneously meet your target quality and bit rate by adjusting the quantization table. The next step is to optimize the Huffman tables that encode the quantized coefficients. Huffman tables are much less significant than the quantization tables in setting the compression ratio. Fortunately, optimizing the tables is a straightforward process that uses a well-known algorithm in the compression world. You first need to choose your quantization tables and to compress your representative image set to the point of Huffman coding. You then collect statistics on the frequency of occurrence of each symbol. Assign short codes to the most common ones with the code length roughly in proportion to the negative log of the frequency of occurrence. Annex K of the JPEG standard gives details for generating JPEG Huffman tables.
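The core of the optimization is the classic Huffman construction, which turns symbol counts into code lengths. The sketch below shows only that core step and is not the Annex K procedure: it does not enforce JPEG's 16-bit limit on code length or the other constraints the standard places on its tables. The function name and symbol counts are made up for the example.

```python
import heapq


def huffman_code_lengths(freqs):
    """Map each symbol to a Huffman code length, given positive counts."""
    if len(freqs) == 1:
        return {symbol: 1 for symbol in freqs}
    # Heap entries: (count, unique tiebreaker, symbols in this subtree).
    heap = [(count, i, [symbol])
            for i, (symbol, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {symbol: 0 for symbol in freqs}
    tiebreak = len(heap)
    while len(heap) > 1:
        count_a, _, symbols_a = heapq.heappop(heap)
        count_b, _, symbols_b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for symbol in symbols_a + symbols_b:
            lengths[symbol] += 1
        heapq.heappush(heap, (count_a + count_b, tiebreak, symbols_a + symbols_b))
        tiebreak += 1
    return lengths


counts = {"a": 8, "b": 4, "c": 2, "d": 1, "e": 1}
print(huffman_code_lengths(counts))  # {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 4}
```

In this example each length equals the negative base-2 log of the symbol's relative frequency (8/16 gives 1 bit, 1/16 gives 4 bits), which is the proportionality the text describes. Real statistics rarely fall on exact powers of two, so the lengths are only approximately proportional.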
Once you have set system parameters and determined your quantization table, you can design the system's physical implementation. Many ready-made software and hardware implementations are available. The easiest and most flexible solution is off-the-shelf software that performs JPEG compression and decompression on a general-purpose computer. Software-based compression and decompression may be too slow, however. If you need greater speed, you can buy a hardware accelerator or design your own to supplement the software operations. If you design your own accelerator, you face three more choices: You could use a DSP and take advantage of its JPEG library, you could use a microcode-based image processor IC that handles several compression algorithms, or you could take the most hardware-oriented approach and use a JPEG codec chip or chip set.
If you use a JPEG codec or chip set, you need to evaluate many alternatives. When examining the vendors' literature, keep in mind that compression ratio and distortion claims are not meaningful. Bit rate and distortion depend on your images and quantization tables, not on the compression engine. Also, vendors' claims about processing time are tricky to translate into useful measurements. The problem is that the compression speed often depends on the local compression ratio. Further, many designers find that the Huffman coding stage or the compressed data interface speed limits the codec's throughput. If your images don't tend to compress well, or if you are using finer quantization tables, your chosen chip set may not achieve maximum speed.
Your system-design problem becomes particularly tough if the system can't tolerate temporarily halting the flow of input data. A requirement that the system process a minimum number of images or pixels per second implies a lower bound on the compressor's speed. Don't be misled by average processing speeds. For real-time work, average throughput is only meaningful when the average is taken over a time period that corresponds to the amount of buffering available.
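The difference between a long-term average and a buffer-sized average shows up clearly in a sliding-window calculation. The sketch below finds the lowest average throughput over any run of consecutive time slots. The function name and the throughput figures are made up, and the trace is assumed to be at least as long as the window.

```python
def worst_window_rate(pixels_per_tick, window):
    """Lowest average throughput over any `window` consecutive ticks."""
    current = sum(pixels_per_tick[:window])
    worst = current
    for i in range(window, len(pixels_per_tick)):
        # Slide the window one tick: add the new slot, drop the oldest.
        current += pixels_per_tick[i] - pixels_per_tick[i - window]
        worst = min(worst, current)
    return worst / window


# Hypothetical trace: the codec slows down on two busy stretches of the image.
trace = [100, 100, 20, 20, 100, 100, 100, 100]
print(sum(trace) / len(trace))      # 80.0 averaged over the whole trace
print(worst_window_rate(trace, 4))  # 60.0 over the worst four ticks
print(worst_window_rate(trace, 2))  # 20.0 over the worst two ticks
```

If the requirement were 70 pixels per tick, the overall average would look comfortable, yet a system with only four ticks of buffering would fall behind.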
Understand the slowdowns
If you must maintain a minimum throughput, some understanding of the chip set's inner workings helps you understand under what circumstances and by how much it slows down. You need to ask the vendor a lot of questions and carefully study data sheets and handbooks. Check these sources for clues to what affects throughput. The user's manual for C-Cube Microsystems' CL550/CL560 JPEG codecs (Reference 5), for example, is helpful in discussing the performance differences of two chips that are almost identical externally but have significant architectural differences.
When you are choosing a codec chip, you should also consider which functions it implements. Baseline JPEG requires several functions: raster-to-block conversion, discrete cosine transform, quantization, Huffman coding, byte stuffing and adding marker codes, generating restart intervals (if used), and assembling the rest of the information needed for a JPEG file. The more of these functions the codec chip set performs, the less the burden on other hardware or software.
Beyond those functions needed for baseline JPEG, other functions that a chip set offers may be useful in your application. LSI Logic designed a video memory controller into its L64702 (Reference 7). Several chips let you specify a rectangular window that they compress or decompress, rather than the entire image. Color-space conversion is another common feature.
Some chips offer on-the-fly mechanisms to sacrifice quality to maintain a minimum bit rate or compression ratio. C-Cube's CL560, for example, can delete the ac-transform coefficients if it gets badly delayed during compression. Zoran's ZR36050 (Reference 6) allows you to specify a target compressed-image file size. It achieves its target by running the image through twice. On the first pass, the chip determines compressed size for a given quantization table. On the second pass, it first scales the quantization table elements up or down and then compresses the image, sacrificing the high-frequency coefficients if needed. This chip also implements the lossless (non-DCT-based) JPEG algorithm in addition to baseline JPEG.
Performance issues involving the codec affect the rest of the system's design. The codec's processing speed sets limits on the system's input and output rates. Conversely, the speed at which the rest of the system can service the input and output data streams limits codec performance. The actual input and output rates usually fluctuate depending on the busyness of each region of the image. This fluctuation implies a need for buffering in order to smooth the data rate.
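A first estimate of the buffering you need can come from a simple FIFO simulation driven by measured output rates. The sketch below assumes a codec that produces a varying number of bytes per time slot and a channel that drains a fixed number of bytes per slot. The function name and the figures are made up for illustration.

```python
def peak_buffer_occupancy(produced_per_tick, drain_per_tick):
    """Largest FIFO fill level when output bursts meet a constant drain rate."""
    level = 0
    peak = 0
    for produced in produced_per_tick:
        level += produced                       # codec writes this tick's output
        peak = max(peak, level)
        level = max(0, level - drain_per_tick)  # channel removes what it can
    return peak


# Hypothetical trace: a busy image region triples the codec's output briefly.
output_bytes = [10, 50, 50, 10, 10]
print(peak_buffer_occupancy(output_bytes, drain_per_tick=30))  # 70
```

The average output here, 26 bytes per tick, is below the drain rate of 30, yet the buffer must still hold 70 bytes to ride out the burst.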
Finally, consider giving your system some flexibility. Given image-to-image variations in bit rate, for example, you may wish to implement a mechanism to select quantization tables on a per-image basis. Further, you should make provisions for handling files larger than your target. You have no prior guarantee that a given quantization table will yield your target packet size on any image in your set. Use experiments to set realistic bounds for buffering and bandwidth, and design in provisions for handling the exceptions.
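One way to select a table per image is to search for the finest table scaling that still fits the target size. The sketch below uses bisection and is an illustration only: `compressed_size` is a hypothetical caller-supplied function that compresses the image with the table scaled by a given multiplier and returns the size in bytes, and the search assumes that size never increases as the multiplier grows. The search bounds and step count are arbitrary choices.

```python
def finest_multiplier_that_fits(compressed_size, target_bytes,
                                lo=0.1, hi=16.0, steps=20):
    """Smallest table multiplier (finest quantization) that meets the target.

    Returns None when even the coarsest table overshoots, so the caller
    can handle that image as an exception.
    """
    if compressed_size(hi) > target_bytes:
        return None
    if compressed_size(lo) <= target_bytes:
        return lo
    for _ in range(steps):
        mid = (lo + hi) / 2
        if compressed_size(mid) <= target_bytes:
            hi = mid  # fits: try finer quantization
        else:
            lo = mid  # too big: must quantize more coarsely
    return hi


# Stand-in size model for demonstration only; a real system would run the codec.
def fake_size(multiplier):
    return 100_000 / multiplier


chosen = finest_multiplier_that_fits(fake_size, target_bytes=25_000)
```

Each probe costs a full compression pass, so a real-time system would more likely choose among a few precomputed tables. The `None` return path is the provision for oversized files that the text calls for.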
Debora Grosse has a BSEE and MSEE from the University of Michigan-Ann Arbor. She works for Programmable Designs Inc (Ann Arbor, MI). She acquired her expertise in image compression during a 10-year stint at Unisys Corp (Plymouth, MI), developing image-processing and compression systems. She admits to having small children instead of hobbies.