

Design Feature: January 18, 1996

JPEG parameters determine compression-system performance

Debora Grosse,
Unisys Corp

The wide availability of JPEG software and hardware simplifies image-compression system design. Before using the equipment, you must resolve quantization table design and system performance.

You can easily design an image-compression system using an off-the-shelf Joint Photographic Experts Group (JPEG) compressor with the suggested default-parameter tables. Predicting the system’s performance in terms of image quality, compression ratio, or processing time is not as simple, however. To attain predictable system performance, you must determine the algorithm parameters that achieve your design goals.

Bit rate and compression ratio measure the degree to which an image is compressed. Bit rate is the average number of compressed bits per pixel. Compression ratio is (original image size) : (compressed image size). Physical size, resolution, and the number of color components determine the original image size. The compressed image size is limited by your system or application. (If your system didn’t have size limits, you wouldn’t need compression.)
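The two measures follow directly from the image dimensions and the compressed size. The short Python sketch below works through them; the function names and the 640×480 example image are illustrative assumptions, not values from this article.

```python
def original_size_bits(width, height, components=1, bits_per_sample=8):
    """Uncompressed size in bits: pixels x color components x sample precision."""
    return width * height * components * bits_per_sample


def bit_rate(compressed_bytes, width, height):
    """Average number of compressed bits per pixel."""
    return compressed_bytes * 8 / (width * height)


def compression_ratio(compressed_bytes, width, height, components=1, bits_per_sample=8):
    """Original image size divided by compressed image size."""
    original = original_size_bits(width, height, components, bits_per_sample)
    return original / (compressed_bytes * 8)


# Hypothetical example: a 640x480 gray-scale image compressed to 30,720 bytes.
print(bit_rate(30720, 640, 480))           # bits per pixel
print(compression_ratio(30720, 640, 480))  # N, as in N:1
```

For this hypothetical image, the bit rate is 0.8 bit/pixel and the compression ratio is 10:1.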

Compression algorithms fall into two categories: lossless and lossy. Lossless compression isn’t necessarily superior to lossy. Lossless compression, or entropy coding, strives to represent information as compactly as possible. The data’s entropy, however, limits lossless compression; entropy coding can only squeeze out existing redundancy. The key to further compression is to discard information judiciously.

Lossy compression works by throwing out information that is not important. The lossy process introduces an error called distortion. Most compression algorithms for photographic images are lossy, because lossless compression is not able to condense all the image information to a practical size.

The JPEG standard defines a family of image-compression algorithms designed for color or gray-scale still photographic images (Reference 1). JPEG can also compress video frame by frame. The most important algorithm in the JPEG standard is the baseline sequential algorithm. (See box, "JPEG in a nutshell.") Baseline JPEG coding has both lossless and lossy steps.

Baseline JPEG’s losses occur during a discrete cosine transform and subsequent quantization of the transform coefficients. The algorithm begins by dividing an image into 8×8-pixel blocks. It then transforms the image to the frequency domain one block at a time. Next, the algorithm performs a quantization, which involves dividing each of the 64 resulting frequency coefficients by the corresponding value in a set of 64 numbers. This set is the quantization table. Quantization rounds the quotient to the nearest integer. During decompression, the algorithm multiplies each quantized coefficient by the quantization-table value to restore an approximation of the original coefficient.


Trade quantization for quality

A large value in the quantization table means that the corresponding coefficient is restored less accurately. However, the quantized coefficient is a small integer that requires only a few bits for its representation. The quantization-table values, thus, represent a trade-off between accuracy and bit-length for each spatial frequency. The idea behind quantization is that some spatial frequencies, typically the higher ones, are less important to the eye than others. The algorithm can, therefore, represent the less important frequencies more coarsely without seriously degrading image quality.
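A few lines of Python make the trade-off concrete. This is a minimal sketch: the coefficient value is hypothetical, and rounding ties away from zero is an assumption made here for definiteness.

```python
def quantize(coefficient, table_value):
    """Divide by the table value and round to the nearest integer."""
    quotient = coefficient / table_value
    # Assumption: round half away from zero (Python's round() rounds ties to even).
    return int(quotient + 0.5) if quotient >= 0 else -int(-quotient + 0.5)


def dequantize(level, table_value):
    """Restore an approximation of the original coefficient."""
    return level * table_value


coefficient = -213.0  # hypothetical DCT coefficient
for table_value in (4, 16, 64):
    level = quantize(coefficient, table_value)
    restored = dequantize(level, table_value)
    print(table_value, level, restored, abs(coefficient - restored))
```

For this coefficient, table values of 4, 16, and 64 give quantized levels of -53, -13, and -3. The levels shrink, so they need fewer bits, while the restoration error grows from 1 to 5 to 21.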

Visually, the distortion introduced by quantization takes the form of rectangular artifacts. These artifacts range in size from 8×8 blocks to tiny speckles. Figure 1 shows a compressed image in which the 8×8 blocks are noticeable. These block artifacts result from quantization error in the zero-frequency (dc) coefficient of an originally smooth image.

Speckles come from quantization errors in high-frequency components. Just as high harmonics are significant in accurately reproducing a square wave, the high-frequency coefficients of an image become significant at high contrast edges. Figure 2 shows an image having sharper contrast and more detail than Figure 1. Medium-sized and small rectangular artifacts are apparent in this image. In Figure 2 (c) and (d), for example, the 8×8 blocks containing asterisks are full of speckles. These patterns result from quantization error in high-frequency coefficients.

Obviously, then, developing the quantization table is a significant part of a JPEG system design. You can think of JPEG system design as having two parts. One part of the job is setting values for the parameters, which tend to be independent of the compression-coder implementation. These parameters include initial image resolution, color and color space, acceptable levels of distortion, and target compression ratio or compressed image size. This part of the job ends with a preliminary design of your quantization tables. The second part of the design is dealing with implementation-specific issues, such as cost, ease of development, and processing speed.

These two tasks are not separable. Image resolution and compression ratio, for example, affect processing speed. System constraints, such as available bandwidth and memory size, may limit the acceptable compressed image size and thus determine the target compression ratio. If you are designing for a particular application, however, you should start by setting values for the implementation-independent parameters. You can test the values early in the design using a simple prototype system. Further, the values also feed into the rest of the system design. Even if you are going to leave the selection of some of these parameters up to your customer, you need to make assumptions about the values to decide whether the implementation will meet the customer’s needs.

JPEG in a nutshell

The JPEG standard is an umbrella for a family of compression techniques. The method described here, baseline sequential, is the most common JPEG mode. The method uses Huffman coding and 8-bit/pixel source images. There is no benefit to using samples with fewer than 8 bits. If you do, the larger steps in intensity from one pixel to the next result in a somewhat poorer bit rate, as well as poorer image quality.

The JPEG standard includes options that allow substitution of a technique termed "arithmetic coding" for Huffman coding, or extension of the pixel precision to twelve bits instead of eight. Another alternative is a lossless mode, which does not use a discrete cosine transform. JPEG also offers progressive and hierarchical modes that encode the image in ways that allow the decoder to produce a poor-quality image quickly from the initial part of the file and gradually refine the quality. The JPEG standard describes each of these variations.

For simplicity, consider the compression of one color component. The standard has you separately compress each color component and interleave blocks of multiple color components in the compressed file. A single component, therefore, can adequately illustrate the JPEG algorithm.

The algorithm first divides the image into 8×8-pixel blocks. After that, it operates on a block-by-block basis. The algorithm next performs a discrete cosine transform (DCT) on each block. (The DCT is similar to a discrete Fourier transform, but better suited to this type of image processing.) The DCT’s output data is an 8×8 block of transform coefficients, with the dc term in the upper-left corner. The dc term’s value is eight times the average intensity of all pixels in the block. The other 63 coefficients are ac terms. The numbers represent the intensities of the 2-D cosine waves into which the DCT decomposes the block. Frequency increases as you move from left to right and from top to bottom in the block.
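The following Python sketch computes the 8×8 forward DCT directly from its definition, using the normalization that makes the dc term eight times the block average. It is written for clarity, not speed (real codecs use fast factorizations), and it omits the level shift that JPEG applies to the samples before the transform. The ramp-shaped test block is a hypothetical example.

```python
import math


def dct_8x8(block):
    """Forward 2-D DCT (type II) of an 8x8 block, JPEG normalization."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for v in range(8):          # vertical frequency (output row)
        for u in range(8):      # horizontal frequency (output column)
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[v][u] = 0.25 * c(u) * c(v) * total
    return out


# Hypothetical block: intensity ramps left to right; every row is identical.
block = [[10 * x + 5 for x in range(8)] for _ in range(8)]
coefficients = dct_8x8(block)
average = sum(sum(row) for row in block) / 64
print(average, coefficients[0][0])  # the dc term is eight times the average
```

Because this block varies only horizontally, all the energy lands in the top row of coefficients; every row below it is zero to within rounding error.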

The next step of the algorithm is quantization. The algorithm divides each transform coefficient by the corresponding value in an 8×8 block of integers called the quantization table. Then, the algorithm rounds the quotients to integers. The resulting quantized coefficients again form an 8×8 block.

The remaining algorithm steps are lossless and code the transform coefficients using a minimum number of bits. The algorithm orders the 64 quantized coefficients in a zigzag pattern, starting with the dc coefficient and working diagonally back and forth across the block. The purpose of the zigzag ordering is to produce long runs of zeros. The ordering puts the lowest frequencies first and the highest frequencies, in which there are typically many zero-valued coefficients, last.
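The zigzag order can be generated by walking the block's anti-diagonals and alternating direction on each one. The sketch below is one way to do that in Python; the function names and the sample block are illustrative assumptions.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column identifies one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from lower left to upper right; odd ones the reverse.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block of quantized coefficients into zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Hypothetical quantized block with only a few nonzero low-frequency terms.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[2][0] = 40, -3, 2, 1
print(zigzag_scan(block))
```

In this example the four nonzero values come out first, followed by a single run of 60 zeros.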

Separate codes for dc and ac

After ordering the coefficients, the algorithm Huffman codes the result using separate sets of Huffman codes for the dc and ac terms. A Huffman coder uses predefined codes based on the expected statistics of the image. Symbols occurring more frequently get shorter codes.

Because the dc coefficients of neighboring blocks typically vary little from one block to the next, fewer bits are needed to encode the differences between successive dc coefficients than to encode the absolute values. The JPEG algorithm, therefore, encodes the differences. Only at the beginning of the image, and at the beginning of periodic restart intervals within the image, does the algorithm use the actual dc coefficient value.
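A minimal Python sketch of the differencing step follows. It treats the predictor as zero at the start of the image and at each restart, so the first "difference" is the actual dc value. The function name, the sample values, and the choice to count the restart interval in blocks are assumptions made for illustration.

```python
def dc_differences(dc_values, restart_interval=None):
    """Replace each dc value with its difference from the previous block's value."""
    differences = []
    previous = 0
    for index, dc in enumerate(dc_values):
        if restart_interval and index % restart_interval == 0:
            previous = 0  # predictor resets, so the actual value is coded
        differences.append(dc - previous)
        previous = dc
    return differences


dc_values = [320, 322, 319, 319, 325]  # hypothetical quantized dc terms
print(dc_differences(dc_values))
print(dc_differences(dc_values, restart_interval=3))
```

Without restarts, only the first value is large; the rest are small differences such as 2, -3, 0, and 6. With a restart every three blocks, the fourth block is again coded as its actual value, 319.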

JPEG decomposes the dc coefficient difference into two parts, somewhat like exponent and mantissa, for Huffman coding. One part is the magnitude category. The magnitude category M of an integer N is the smallest integer M such that

$2^M > |N|$.

JPEG uses a Huffman code to represent the magnitude category M. It then appends M bits after the Huffman code to identify a particular member of the category. For example, if the difference is 5, it is in category 3, which comprises the set {-7, -6, -5, -4, 4, 5, 6, 7}. JPEG adds 3 bits, in this case 101, following the code for category 3.
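In Python, the magnitude category is simply the bit length of the absolute value. The sketch below also forms the appended bits. For negative values it follows the JPEG standard's convention, which the text above does not spell out: the appended bits are the low-order M bits of N - 1. The function names are illustrative.

```python
def magnitude_category(n):
    """Smallest M such that 2**M > |n|, which is the bit length of |n|."""
    return abs(n).bit_length()


def appended_bits(n):
    """The M bits that select one member of the magnitude category."""
    m = magnitude_category(n)
    if m == 0:
        return ""  # a zero difference needs no appended bits
    # Positive values are sent as-is; negative values as n + 2**m - 1.
    value = n if n > 0 else n + (1 << m) - 1
    return format(value, "0{}b".format(m))


for n in (5, -5, 7, -7, 1, -1, 0):
    print(n, magnitude_category(n), appended_bits(n))
```

A difference of 5 yields category 3 and the bits 101, matching the example above, while -5 yields category 3 and the bits 010.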

AC uses run-length codes

After coding the dc difference, JPEG codes the block’s quantized ac coefficients. The compressor counts the lengths of runs of zero-valued coefficients to temporarily represent the ac data as run-length bytes. The upper four bits of a run-length byte indicate up to 15 consecutive zero-valued coefficients. The lower four bits are the magnitude category of the coefficient following the run. The compressor uses a Huffman code to represent the run-length byte and appends bits to identify the member of the magnitude category.
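The run-length step can be sketched in Python as follows. The sketch uses two special symbols from the JPEG standard that the text above does not mention: F0 (hex) stands for a run of 16 zeros with no coefficient, and 00 marks the end of the block when only zeros remain. The function name and sample data are illustrative assumptions.

```python
def run_length_symbols(ac_coefficients):
    """Turn zigzag-ordered ac coefficients into (run-length byte, value) pairs."""
    symbols = []
    run = 0
    for value in ac_coefficients:
        if value == 0:
            run += 1
            continue
        while run > 15:
            symbols.append((0xF0, None))  # sixteen zeros, no coefficient
            run -= 16
        category = abs(value).bit_length()  # magnitude category
        symbols.append(((run << 4) | category, value))
        run = 0
    if run > 0:
        symbols.append((0x00, None))  # end of block: the rest are zeros
    return symbols


# Hypothetical block: 63 ac terms, mostly zero.
ac = [12, 0, 0, -3] + [0] * 18 + [1] + [0] * 40
for byte, value in run_length_symbols(ac):
    print(format(byte, "02X"), value)
```

Here the 63 coefficients collapse to five symbols: 04 (value 12), 22 (two zeros, then -3), F0, 21 (the rest of an 18-zero run, then 1), and 00 for the end of the block.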

The decompressor reverses all of these steps and losslessly decodes the quantized coefficients. It then multiplies each quantized coefficient by the corresponding value in the quantization table to recover an approximate representation of the transform coefficient. Finally, it performs an inverse DCT.

The JPEG standard also defines some aspects of file format. The standard has a format for the compressed data, for custom quantization tables, and for Huffman tables. Marker codes identify restart intervals and the various structural parts of the format. Marker codes are easily locatable because they are byte-aligned and of the form FFxx, where xx is nonzero. The compressor must therefore use byte stuffing, inserting a ‘00’ byte after any ‘FF’ byte created by the coding process, to avoid generating spurious marker codes.
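Byte stuffing takes only a few lines of code. The Python sketch below stuffs and unstuffs a run of entropy-coded bytes; the function names and sample bytes are illustrative, and the unstuffing routine assumes that its input contains only stuffed data with no markers.

```python
def stuff_bytes(entropy_coded):
    """Insert a 00 byte after every FF byte produced by the coder."""
    out = bytearray()
    for byte in entropy_coded:
        out.append(byte)
        if byte == 0xFF:
            out.append(0x00)
    return bytes(out)


def unstuff_bytes(stuffed):
    """Drop the 00 byte that follows each FF byte (input holds no markers)."""
    out = bytearray()
    i = 0
    while i < len(stuffed):
        out.append(stuffed[i])
        if stuffed[i] == 0xFF and i + 1 < len(stuffed) and stuffed[i + 1] == 0x00:
            i += 1  # skip the stuffed byte
        i += 1
    return bytes(out)


raw = bytes([0x12, 0xFF, 0xD8, 0xFF])  # hypothetical coder output
stuffed = stuff_bytes(raw)
print(stuffed.hex())
print(unstuff_bytes(stuffed) == raw)
```

After stuffing, the sequence FF D8 in the coder output can no longer be mistaken for a marker, because the decompressor sees FF 00 D8.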


Set independent parameters

Selecting the implementation-independent parameters requires that you make some trade-offs. For example, there is a trade-off between resolution and compressed-image size. A higher resolution image, in the absence of compression, shows more detail. As the number of pixels per inch increases, however, the compressed-image size also increases, although not necessarily in direct proportion. Coarse quantization reduces compressed-image size, but loses detail. If the compressed-image size is critical, you may achieve a better compromise by using a lower resolution image but compressing it more accurately—that is, with finer quantization.

Another implementation-independent parameter to consider is choice of color space. JPEG allows you to choose the color space representation and the resolution of each color component. Video images, for instance, ordinarily are divided into red, green, and blue components. In this color space, all three components need the same resolution. However, you can reduce the number of bits needed to represent an image by choosing a different color-space representation. For instance, in a YCrCb representation (Reference 2), the Y, or luminance, component represents image brightness. The other two components, called chrominance, convey hue. For human vision, it is more important to represent precisely the luminance than the chrominance. The chrominance components can, therefore, be subsampled so that there are two blocks of each of the two chrominance components for every four blocks of luminance. Reducing the number of chrominance pixels is a form of lossy compression.
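The sketch below shows a color-space conversion and 2:1 horizontal chrominance subsampling in Python. The conversion constants are the full-range luminance and color-difference equations commonly used with JPEG files, which is an assumption here; CCIR 601 proper also defines studio-range scaling that this sketch ignores. Averaging pixel pairs is just one simple way to subsample, and the function names are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to luminance and two chrominance values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + (b - y) / 1.772
    cr = 128 + (r - y) / 1.402
    return y, cb, cr


def subsample_horizontally(plane):
    """Halve the horizontal resolution by averaging pixel pairs (even width)."""
    return [[(row[x] + row[x + 1]) / 2 for x in range(0, len(row), 2)]
            for row in plane]


print(rgb_to_ycbcr(100, 100, 100))  # gray pixel: chrominance sits at mid-scale
print(subsample_horizontally([[10, 20, 30, 50]]))

# Sample counts for a 720x576 image with 360x576 chrominance planes.
luminance = 720 * 576
full = 3 * luminance
subsampled = luminance + 2 * (360 * 576)
print(subsampled / full)
```

With the chrominance planes at half the horizontal resolution, the compressor handles two-thirds as many samples as it would for three full-resolution components.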

The most slippery implementation-independent parameter is distortion, or image quality. Mean squared error is one metric of distortion. Unfortunately, there is no good way to correlate mean-squared error with human perception of image quality. Instead, image quality must have its definition in terms of your application.
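For reference, mean squared error is straightforward to compute, which is part of its appeal despite its weak link to perceived quality. A minimal Python sketch, using hypothetical 2×2 pixel values:

```python
def mean_squared_error(original, decompressed):
    """Average of the squared pixel differences between two equal-size images."""
    total = 0
    count = 0
    for row_a, row_b in zip(original, decompressed):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


original = [[10, 20], [30, 40]]       # hypothetical pixel values
decompressed = [[12, 20], [29, 44]]
print(mean_squared_error(original, decompressed))
```

The pixel errors here are 2, 0, -1, and 4, so the mean squared error is 21/4, or 5.25. The single number says nothing about where in the image the errors fall, which is one reason it tracks perception poorly.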

Many factors contribute to image quality in a given application. Display pixel size and the viewing distance affect perceived quality: Enlarging the image makes artifacts more apparent. You also need to consider which information in the image is important to the user. A user interested in the legibility of the low-contrast words "DOLLARS" and "CTS" in Figure 2 would demand less distortion than a user who only wanted to read the dark numbers. You must also address the nature of a typical image. For example, in photographs of natural scenes, small artifacts added by the compression routine may not be noticeable if they are camouflaged in textured regions. To define image quality, you need a suite of representative images for your application and set guidelines for evaluating their quality after compression.

Finally, decide on your needs for error recovery and random access within an image. In general, the JPEG decompressor operates on current data based on knowledge of prior data in the image. This dependency on prior data implies that you cannot begin decompression in the middle of the file, and the decompressor cannot recover after a bit error in the compressed data. The JPEG standard offers an optional restart capability, however, that gets around this limitation. Restart markers delimit restart intervals, which are horizontal segments of the image that can be decompressed independently. Because restart markers are easy to locate, you can have the decompressor extract only those image segments that are of interest. Also, the decompressor reinitializes itself at each restart marker, so errors do not propagate beyond the restart interval. Restart intervals are optional, and you choose their size. The intervals must contain an integral number of rows of 8×8 blocks.
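Locating restart markers requires nothing more than a byte scan. The Python sketch below relies on the restart-marker codes defined in the JPEG standard, FFD0 through FFD7, which cycle through eight values; the function name and the sample bytes are illustrative assumptions.

```python
def find_restart_markers(scan_data):
    """Return (offset, marker number) for each restart marker in the data."""
    positions = []
    i = 0
    while i + 1 < len(scan_data):
        if scan_data[i] == 0xFF and 0xD0 <= scan_data[i + 1] <= 0xD7:
            positions.append((i, scan_data[i + 1] - 0xD0))
            i += 2
        else:
            i += 1  # ordinary data, including stuffed FF 00 pairs
    return positions


# Hypothetical compressed data containing a stuffed FF and two restart markers.
data = bytes([0x3A, 0xFF, 0x00, 0x51, 0xFF, 0xD0, 0x77, 0xFF, 0xD1, 0x09])
print(find_restart_markers(data))
```

Because byte stuffing guarantees that an FF byte in the coded data is always followed by 00, the scan cannot mistake image data for a marker. A decompressor can jump to any offset the scan returns and begin decoding there.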


Quantization tables are key

Once you define your system requirements, you can develop quantization tables. A quantization table is the key implementation-independent parameter, because it controls distortion and bit rate. The quantization tables suggested in the JPEG standard may be appropriate for your application. The standard gives two tables: one for the luminance component and one for the chrominance components. The tables assume a viewing distance of 6 times the screen width with a luminance resolution of 720×576 pixels and a chrominance resolution of 360×576 samples (Reference 3, pg 36). Your resolution and viewing distance may be different. These tables are designed not to achieve a target compression ratio, but to keep distortion at the threshold of visibility. Pennebaker and Mitchell (Reference 3) found, however, that the tables yield noticeable artifacts when viewed on high-quality displays.

Unfortunately, you will find that optimum quantization tables are highly image dependent. For example, coarse quantization of the medium-frequency coefficients causes turbulence or speckling around text characters on a plain background. A complicated background may mask these artifacts, allowing coarse quantization to produce acceptable results. Indeed, you may wish that you could flip between quantization tables, using one to compress textured areas and another to compress smoother areas. Unfortunately, the existing JPEG standard does not allow changing the quantization table in the middle of compressing a component of the image. The JPEG committee is considering this as an extension.

To develop your own quantization tables, you need prototyping tools. These tools consist of JPEG-compression and -decompression software; a representative collection of images; and a display or other output device that models the one planned for your system. By altering the quantization tables and compressing the images, you can observe the trade-off between distortion and bit rate. (Figure 2 shows the effects of five quantization tables.) The challenge is gathering a sufficient collection of images.

If compressed-image size is not critical to your application, the choice of a quantization table is simple. If you require a small compressed-image size, however, you will find that achieving a quantization table that gives acceptable distortion over a range of test images takes a lot of experimenting.

A common, although unsophisticated, method for generating quantization tables is to apply scalar multipliers to the example luminance quantization table in the JPEG standard, Annex K. You manually adjust the multipliers until the images compress to the desired average bit rate. You can also try fine tuning the table by tweaking individual table values. If you need a more sophisticated method, you can try quantization-table-design algorithms developed by the researchers at Unisys Corporation (Reference 4). These algorithms give a better quality-vs-rate trade-off than does scaling the example table.
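The scaling method is easy to prototype. In the Python sketch below, the table values are transcribed from the example luminance table in Annex K; check them against the standard before relying on them. The function name, the rounding rule, and the clamping of each entry to the range 1 to 255 (the range an 8-bit baseline table entry can hold) are assumptions of this sketch.

```python
# Example luminance quantization table, transcribed from Annex K.
ANNEX_K_LUMINANCE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scale_table(table, multiplier):
    """Multiply every entry by a scalar, round, and clamp to 1..255."""
    return [[min(255, max(1, int(q * multiplier + 0.5))) for q in row]
            for row in table]


finer = scale_table(ANNEX_K_LUMINANCE, 0.5)    # less distortion, more bits
coarser = scale_table(ANNEX_K_LUMINANCE, 2.0)  # more distortion, fewer bits
print(finer[0])
print(coarser[0])
```

In practice, you would wrap this function in a loop that compresses your test images with each candidate multiplier and measures the resulting average bit rate.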


Optimize Huffman tables

With any lossy compression technique, there are limits to the quality you can obtain for a given bit rate. You may discover that you cannot simultaneously meet your target quality and bit rate by adjusting the quantization table. The next step is then to optimize the Huffman tables that encode the quantized coefficients. Huffman tables are much less significant than the quantization tables in setting the compression ratio. Fortunately, optimizing the tables is a straightforward process that uses a well-known algorithm in the compression world. You first need to choose your quantization tables and to compress your representative image set to the point of Huffman coding. You then collect statistics on the frequency of occurrence of each symbol. Assign short codes to the most common symbols, with each code's length roughly proportional to the negative logarithm of the symbol's relative frequency of occurrence. Annex K of the JPEG standard gives details for generating JPEG Huffman tables.
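The core of the procedure is the classic Huffman construction, sketched below in Python. This is the generic textbook algorithm, not the Annex K procedure: JPEG also limits codes to 16 bits and reserves the all-ones code, and this sketch does neither. The symbol counts are hypothetical.

```python
import heapq
from itertools import count


def huffman_code_lengths(frequencies):
    """Return {symbol: code length in bits} for a dict of symbol counts."""
    tiebreak = count()  # keeps heap entries comparable when counts are equal
    heap = [(freq, next(tiebreak), [symbol])
            for symbol, freq in frequencies.items() if freq > 0]
    heapq.heapify(heap)
    lengths = {symbol: 0 for _, _, (symbol,) in heap}
    if len(heap) == 1:
        lengths[heap[0][2][0]] = 1  # a lone symbol still needs one bit
    while len(heap) > 1:
        # Merge the two least frequent groups; their codes grow by one bit.
        freq_a, _, group_a = heapq.heappop(heap)
        freq_b, _, group_b = heapq.heappop(heap)
        for symbol in group_a + group_b:
            lengths[symbol] += 1
        heapq.heappush(heap, (freq_a + freq_b, next(tiebreak), group_a + group_b))
    return lengths


# Hypothetical counts of run-length bytes gathered from a test-image set.
counts = {0x00: 40, 0x01: 30, 0x02: 15, 0x11: 10, 0xF0: 5}
for symbol, length in sorted(huffman_code_lengths(counts).items()):
    print(format(symbol, "02X"), length)
```

For these counts, the most common symbol gets a 1-bit code and the two rarest get 4-bit codes.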

Once you have set system parameters and determined your quantization table, you can design the system’s physical implementation. Many ready-made software and hardware implementations are available. The easiest and most flexible solution is off-the-shelf software that performs JPEG compression and decompression on a general-purpose computer. Software-based compression and decompression may be too slow, however. If you need greater speed, you can buy a hardware accelerator or design your own to supplement the software operations. If you design your own accelerator, you face three more choices: You could use a DSP and take advantage of its JPEG library, you could use a microcode-based image-processor IC that handles several compression algorithms, or you could take the most hardware-oriented approach and use a JPEG codec chip or chip set.

If you use a JPEG codec or chip set, you need to evaluate many alternatives. When examining the vendors’ literature, keep in mind that compression ratio and distortion claims are not meaningful. Bit rate and distortion depend on your images and quantization tables, not on the compression engine. Also, vendors’ claims about processing time are tricky to translate into useful measurements. The problem is that the compression speed often depends on the local compression ratio. Further, many designers find that the Huffman-coding stage or the speed of the compressed-data interface limits the codec’s throughput. If your images don’t tend to compress well, or if you are using finer quantization tables, your chosen chip set may not achieve maximum speed.

Your system-design problem becomes particularly tough if the system can’t tolerate temporarily halting the flow of input data. A requirement that the system process a minimum number of images or pixels per second implies a lower bound on the compressor’s speed. Don’t be misled by average processing speeds. For real-time work, average throughput is only meaningful when the average is taken over a time period that corresponds to the amount of buffering available.
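To see why the averaging period matters, consider the Python sketch below. It takes a list of throughput measurements, one per time interval, and finds the lowest average over any window of consecutive intervals. The function name, the units, and the numbers are hypothetical; the window length stands in for the amount of buffering available.

```python
def worst_window_rate(units_per_interval, window):
    """Lowest average throughput over any `window` consecutive intervals."""
    worst = None
    for start in range(len(units_per_interval) - window + 1):
        average = sum(units_per_interval[start:start + window]) / window
        if worst is None or average < worst:
            worst = average
    return worst


# Hypothetical throughput measured over eight equal time intervals.
rates = [10, 10, 2, 2, 10, 10, 10, 10]
print(sum(rates) / len(rates))       # long-term average
print(worst_window_rate(rates, 4))   # worst case with four intervals of buffering
print(worst_window_rate(rates, 2))   # worst case with two intervals of buffering
```

The long-term average here is 8 units per interval, but a system that can buffer only four intervals of data must be designed for 6, and one that can buffer only two must be designed for 2.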


Understand the slowdowns

If you must maintain a minimum throughput, some understanding of the chip set’s inner workings helps you understand under what circumstances and by how much it slows down. You need to ask the vendor a lot of questions and carefully study data sheets and handbooks. Check these sources for clues to what affects throughput. The user’s manual for C-Cube Microsystems’ CL550/CL560 JPEG codecs (Reference 5), for example, is helpful in discussing the performance differences of two chips that are almost identical externally but have significant architectural differences.

When you are choosing a codec chip, you should also consider which functions it implements. Baseline JPEG requires several functions: raster-to-block conversion, discrete cosine transform, quantization, Huffman coding, byte stuffing and adding marker codes, generating restart intervals (if used), and assembling the rest of the information needed for a JPEG file. The more of these functions the codec chip set performs, the less the burden on other hardware or software.

Beyond those functions needed for baseline JPEG, other functions that a chip set offers may be useful in your application. LSI Logic designed a video memory controller into its L64702 (Reference 7). Several chips let you specify a rectangular window that they compress or decompress, rather than the entire image. Color-space conversion is another common feature.

Some chips offer on-the-fly mechanisms to sacrifice quality to maintain a minimum bit rate or compression ratio. C-Cube’s CL560, for example, can delete the ac-transform coefficients if it gets badly delayed during compression. Zoran’s ZR36050 (Reference 6) allows you to specify a target compressed-image file size. It achieves its target by running the image through twice. On the first pass, the chip determines compressed size for a given quantization table. On the second pass, it first scales the quantization table elements up or down and then compresses the image, sacrificing the high-frequency coefficients if needed. This chip also implements the lossless (non-DCT-based) JPEG algorithm in addition to baseline JPEG.

Performance issues involving the codec affect the rest of the system’s design. The codec’s processing speed sets limits on the system’s input and output rates. Conversely, the speed at which the rest of the system can service the input and output data streams limits codec performance. The actual input and output rates usually fluctuate depending on the busyness of each region of the image. This fluctuation implies a need for buffering in order to smooth the data rate.

Finally, consider giving your system some flexibility. Given image-to-image variations in bit rate, for example, you may wish to implement a mechanism to select quantization tables on a per-image basis. Further, you should make provisions for handling files larger than your target. You have no prior guarantee that a given quantization table will yield your target packet size on any given image in your set. Use experiments to set realistic bounds for buffering and bandwidth, and design in provisions for handling the exceptions.


Author's biography

Debora Grosse has a BSEE and MSEE from the University of Michigan, Ann Arbor. She works for Programmable Designs Inc (Ann Arbor, MI). She acquired her expertise in image compression during a 10-year stint at Unisys Corp (Plymouth, MI), developing image-processing and compression systems. She admits to having small children instead of hobbies.


References

1. Information Technology—Digital compression and coding of continuous-tone still images, International Standard ISO/IEC IS 10918-1, Oct 20, 1992.

2. Encoding parameters of digital television for studios, CCIR Recommendation 601-2, 1990.

3. Pennebaker, William B, and Joan L Mitchell, JPEG Still Image Data Compression Standard, New York, NY, Van Nostrand Reinhold, 1993.

4. Kidd, Robert C, "A Comparison of Wavelet Scalar Quantization and JPEG for Fingerprint Image Compression," Journal of Electronic Imaging, V4, no. 1, January 1995, pg 31 to 39.

5. CL550/CL560 JPEG Compression Processor User’s Manual, C-Cube Microsystems, 1993.

6. Data sheet for the ZR36050 JPEG Image Compression Processor, Zoran Corporation, September 1993.

7. L64702 JPEG Coprocessor Technical Manual, LSI Logic Corporation, July 1993.

8. Quinnell, Richard A, "Image Compression—Part 1," EDN, Jan 21, 1993, pg 62 to 71.




Copyright © 1996 EDN Magazine. EDN is a registered trademark of Reed Properties Inc, used under license.