Image Compression Article Addendum: EDN Magazine June 18, 1998 Issue

Lossless Algorithm Fundamentals

The term lossless compression signifies that although the information may look completely different at the output of the encoding stage, the corresponding decompression algorithm is guaranteed to return the data to its exact original state. Several different lossless algorithms exist, and the decision on which one to use depends not only on the available processing power and memory, but also on the characteristics of the data to be compressed.

Run-length encoded (RLE) algorithms replace a long string of same-value data with a shorter combination of the data value plus the number of sequential locations that contain this value. For example, if you have a bitmap image containing a 250-pixel horizontal blue line, a RLE compression algorithm might replace the 250 identical data values with one value for blue followed by the number 250. Another example of an application where RLE might work well is an audio file containing a few long, constant pitch tones.

What happens if your data isn't necessarily consistant, but varies only within a narrow band of values? For example, what if the average pixel-to-pixel variation in your image file was very small, or the person whose voice you were recording on your digital tape recorder spoke in a monotone? Instead of storing each data sample's full code, you could instead store the difference between it and either the preceeding sample or a common reference sample. Not surprisingly, these types of lossless algorithms are called differencing routines.

Turning to a text file example, in the English language letters such as A, E and S are more commonly used than letters like Q, X and Z. Statistical coding (also called entropy coding) compression algorithms like the well-known Huffman code take advantage of probability statistics to assign short bit strings to commonly-used data patterns, and longer bit strings for less frequently used alternatives (Example 1). A statistical coding routine will analyze the entire input file, assign codes and provide a translation 'key' table, known as a canonical tree, within the compressed file for use by the decoding algorithm.


Example 1

String to be compressed: Run Spot Run.

Original bitstream length: 104 bits

Canonical tree:

Character

Value

R

000

u

001

n

010

[space]

011

S

100

p

101

o

110

t

1110

.

1111

Resultant bitstream length: 41 bits


A fax machine is one example of a statistical coding application; the most common byte sequence is a string of data corresponding to a scanned line of white pixels. Instead of using RLE to store the value for white plus the number of white pixels, why not assign this event a much shorter single or few-bit value (as done using Huffman coding techniques in CCITT Group 3 and 4, and arithmetic coding in JBIG)? If the coding table is standardized, the encoding system doesn't even need to send the table to the decoder along with the compressed file. Morse code is an example of arithmetic coding with a standard translation table.

Another well-known compression technique, known as substitutional compression, replaces repeated iterations of bit or byte strings with locational references to the first time the string appears in the source file (Example 2). Variations of the LZ (Lempel-Ziv) substitutional algorithm are most common. By varying the size of the sliding memory window used to search backwards in the file for string matches, substitutional algorithms can trade off compression for required memory and processing power.


Example 2

String to be compressed: See Spot Run. Run Spot Run.

Compressed string: See Spot Run.[i=9,l=4][i=4,l=10]

 

String to be compressed: Blah blah blah blah blah!

Compressed string: Blah b[i=5,l=18]!

i= index, l=length


The more redundancy that exists in the original file, and the longer each redundant string, the more significant the substitutional compression results. The LZW (Lempel-Ziv-Welch) algorithm forms the compression core of GIF and compressed TIFF files, V.42bis for modems and Postscript Level 2, among other applications. Unisys and IBM both own patents on LZW. Unisys, in conjunction with Compuserve, has recently begun enforcing patent rights with imaging companies whose products support GIF and compressed TIFF.

Several of the compression techniques discussed in this article are patented, partially or in full. Either research the legal boundary conditions before developing your own compression scheme or license a compression engine from another company, unless you want to find yourself paying royalties or spending time in court arguing your case.

Partially to circumvent the need to pay LZW royalties, a consortium of various manufacturers and academic researchers has developed the PNG (Portable Network Graphics, also known as 'PNG's Not GIF') file format, with support now beginning to appear in imaging programs. PNG uses a royalty-free form of LZ encoding that, on average, also produces slightly greater compression results than LZW under comparable logic performance and memory parameters. PNG's compression techniques also include a variation of differencing known as filter algorithms.

On a per-scan line basis, you can either store each byte's data directly or in the form of a comparison to other pixels' corresponding bytes in the source file, with pixels arranged in an assumed left-to-right and top-to-bottom orientation. The sub option stores the difference between the byte under examination and its predecessor in the pixel to its left. Up stores the difference from the corresponding byte in the pixel from the scan line above it. Average uses the byte value of neighboring pixels above and to the left, while Paeth (after Alan Paeth) chooses the closest-valued of three pixels; left, above and upper left.

Back To Top Of Page

Lossy Algorithm Fundamentals

Although lossless compression can impressively shrink the size of a carefully-tailored demonstration source file, any selected algorithm will produce varying results across all possible sets of data patterns. If you don't carefully consider the source file entropy (degree of randomness) characteristics, the data overhead necessary for proper subsequent decode might result in a file larger than the original. A lossless compressor must expand some input sequences if it compresses others. This is easily proven by a counting argument: there are fewer bit patterns of length < N than there are of length N, but to have lossless compression you must have a distinct compressed representation of every uncompressed string.

Intelligent compression algorithms will not blindly use their results but will compare the encoder output to the original file and select the smaller of the two. Robust algorithm APIs also allow the user to vary the performance-versus-compression ratio to meet specific system requirements and capabilities. The average lossless compression ratio works out to around 2:1 for text and continuous-tone images. To do any better than this, you'll need to remove data from the source file.

The term lossy compression can be deceptive, since when discussing images (and audio, for that matter), we're talking about information that's naturally analog. Whenever you digitize an analog signal, no matter how many bits of precision you use, you're throwing away theoretically useful data. The key, of course, is the word theoretically. You need to use only enough bits to adequately represent the signal for your users' quality expectations and differentiate it from system and environment-induced noise, while also meeting cost, performance and other overall design criteria.

In the context of this article, lossy compression means permanently discarding portions of the information after it's digitized so that when subsequently decoded, it does not match the initially digitized bitstream. At first glance, lossy compression may seem to be a bad idea, until you consider the impressive compression ratios that it provides. Keep in mind too that if done properly, lossy compression will remove information in the source file that the human sensory system wouldn't be able to detect and interpret anyway, mitigating distortion effects. Many lossy compression algorithms also subsequently run their output data patterns through one or a few lossless compression schemes.

Image compression algorithms commonly exploit several ideosyncracies of the human visual system. First, the eye and brain are more sensitive to detail found in the darker areas of an image, and less responsive to changes in highlight information. A nonlinear quantization routine will maximize accuracy where it's most needed by more precisely representing lowlight and, to a lesser degree, midrange pixels.

Secondly, the eye and brain will quickly notice errors in adjacent pixels that are supposed to be the same color. However, when the color significantly changes from one pixel to another, such as at the edge between two objects, the compression algorithm can represent each pixel's colors less accurately without the viewer noticing the error. The brain will 'correct' the edge colors using information from surrounding pixels.

The eye contains only approximately half as many color (chrominance) receptors as it has black-and-white brightness (luminance) receptors. This phenomenon explains why when you look at a dimly-illuminated scene, such as at night, the color information you interpret is limited and often inaccurate. If an image format stores data in terms of its luminance (Y) and chrominance (Cr and Cb) values, it can discard (called desampling, subsampling or decimating) roughly half of of the chrominance data without noticeably degrading image quality. The eye's color receptors are also more sensitive to green and green-derived colors than other frequencies in the visible spectrum.

Human beings, especially if the come from cultures that read words horizontally on a printed page, tend to view the world from a horizontally-prioritized perspective. This fact means that accurately storing pixel-to-pixel variations is more important when moving from left-to-right than from top-to-bottom of an image. Digital camera manufacturers have to be careful about using this approach when initially capturing an image, however, since users can hold the camera both horizontally and vertically to take pictures.

Finally, when you look at most natural photographic scenes, you'll see a high proportion of pixels that only slightly vary in chrominance and luminance from their nearest neighbors. Examples include a blue sky, a field of green grass and a white wall. Even if the pixels aren't exactly the same (thereby precluding lossless compression techniques) you can use frequency-conversion algorithms to nonlinearly quantize the data and reduce its size. Video also contains a large amount of frame-to-frame redundancy which a compression scheme can exploit to significantly reduce the average frame data size.

Most lossy techniques do not work well with computer-generated images which contain a large number of significant pixel-to-pixel luminance and chrominance transitions, such as complex graphics and even text on a printed page. Since lossy compression is also tuned to the unique attributes of the human eye, nerves and brain, a computer vision system most likely will not respond in the same way. As one company representative stated during the research phase of this article, 'I don't think I'd want a lossy-compressed image to be the data guiding a missile'.

Back To Top Of Page

Additional Resources Overview

The best places I've found for you to continue your compression research, aside from the literature in the References section of this article, are the comp.graphics.misc and comp.compression newsgroups. You can access corresponding FAQs at www.faqs.org/faqs/graphics and www.faqs.org/faqs/compression-faq.

Back To Top Of Page

JPEG

For JPEG, use the links found at www.jpeg.org or consult Tom Lane's FAQ found at www.faqs.org/faqs/jpeg-faq. Pegasus Imaging's JPEG Wizard at www.jpg.com/jpeg-wizard.html interactively shows you the quality results of compressing at various levels an input file you provide.

Back To Top Of Page

PNG

PNG information resides at www.cdrom.com/pub/png. Detailed explanations of PNG's underlying LZ compression scheme are at www.cdrom.com/pub/infozip and www.cdrom.com/pub/infozip/zlib, and filter algorithm information is at www.w3.org/TR/PNG-Filters.html.

Back To Top Of Page

MPEG

For MPEG, two useful information launching pads are the MPEG.ORG site at www.mpeg.org and the official MPEG committe homepage at drogo.cselt.stet.it/mpeg/. You can find MPEG FAQs at www.faqs.org/faqs/mpeg-faq and bmrc.berkeley.edu/projects/mpeg2-v38/faq_v38.html. The latter document, titled "Questions that should be frequently asked about MPEG" is both comprehensive and an enjoyable and often humourous read. Finally, C-Cube Microsystems provides an MPEG overview at www.c-cube.com/technology/mpeg.html.

Back To Top Of Page

Wavelets

Analog Devices' FTP site (ftp.analog.com/pub/dsp/adv601) contains VideoLab and VideoPipe schematics, a wavelet demonstration utility that converts bitmap files and also interfaces to Adobe Premiere, and other documentation and utilities. Bell Labs researcher Jelena Kovacevic has created another interesting wavelet site at cm.bell-labs.com/who/jelena.wavelet.html which includes a number of Java-based wavelet applets. Wim Sweldens, another Bell Labs researcher, has created the wavelet.org site and publishes a free monthly newsletter. Intel Corporation provides wavelet-based Indeo Video 5 software at developer.intel.com/ial/indeo/video.

Compression Engines offers a variety of wavelet software on their website, available for free download, including a plugin for Microsoft Internet Explorer and Netscape Navigator, their standalone image compressor and a streaming video decompressor that runs under Microsoft ActiveMovie. Infinop's free software includes Adobe Photoshop plugins for Windows 95/NT and Macintosh PowerPC, the standalone Lightning Strikes image compressor, a web-resident Java applet and both streaming and LiveCast video decompressors. Summus Technologies provides free image viewers for Internet Explorer and Navigator, and the 4U2C standalone image compressor.

Back To Top Of Page

Fractals

Yuval Fisher, visiting professor at the University of California, San Diego, has created an extensive list of fractal compression information at inls.ucsd.edu/y/Fractals/. The 1-hour video Colors of Infinity describes the theory behind fractals and is hosted by science fiction writer Arthur C. Clarke with soundtrack by David Gilmour of Pink Floyd. It contains incredibly beautiful fractal graphics and interviews with Benoit B. Mandelbrot, Stephen Hawking, professor Ian Stewart and David Barnsley, cofounder of Iterated Systems. You can order a copy by calling 1-800-257-5126. Iterated System's website contains a number of documents, useful weblinks and software for free download, including the standalone Fractal Imager compression program, Fractal Viewer web browser plugin and ClearVideo streaming video decoder. Altamira Group also provides a free demo version of their Iterated Systems-powered Adobe Photoshop plugin.

Back To Top Of Page

Image Comparisons

The graphics below are the original, JPEG-compressed, wavelet-compressed and fractal-compressed images used in Figure 2 of the version of this article that appeared in the June 18 issue of EDN Magazine. In case the quality differences between them aren't obvious in print, you can download and view them in any image editing program that can handle TIFF files.

  Figure 2a-Original (TIFF) (906 Kbytes)

  Figure 2b-Enlarged Original (TIFF) (64 Kbytes)

  Figure 2c-Enlarged and JPEG-Compressed (TIFF) (45 Kbytes)

  Figure 2d-Enlarged and Wavelet-Compressed (TIFF) (44 Kbytes)

  Figure 2e-Enlarged and Fractal-Compressed (TIFF) (48 Kbytes)

Back To Top Of Page

Example Fractal Images

Another good set of fractal links, this time of a non-academic nature, is at the Brain Games WebGuide. If you'd like to design and observe your own fractal creations, two good freeware packages are Tierazon and Fractint. In either case, I highly recommend 24-bit color graphics for maximum enjoyment.

The first image below, the Mandelbrot set, plus the seven that follow it are created by Lee Skinner, using Fractint. Lee sells a CD-ROM full of spectacular images like these. Please contact him or visit his website if you'd like more information.

  GIF (324 Kbytes)

  JPEG (295 KBytes)   PNG (900 KBytes)

  JPEG (282 KBytes)   PNG (760 KBytes)

  JPEG (224 KBytes)   PNG (589 KBytes)

  JPEG (253 KBytes)   PNG (795 KBytes)

  JPEG (324 KBytes)   PNG (838 KBytes)

  JPEG (123 KBytes)   PNG (493 KBytes)

  JPEG (404 KBytes)   PNG (776 KBytes)

Back To Top Of Page

Give Me Feedback!

Your comments on this article and website addendum and suggestions for future article topics are welcome.

Brian Dipert
edndipert@worldnet.att.net
http://members.aol.com/bdipert
1864 52nd Street
Sacramento, CA 95819
Phone: 916-454-5242
Fax: 916-454-5101

Back To Top Of Page