Channels ▼
RSS

Design

Replacing a Dictionary with a Square Root

Source Code Accompanies This Article. Download It Now.


Oct01: Algorithm Alley

Tom is a consultant based in Boulder, Colorado. He can be contacted at http://www.profcon.com/cargill/.


In the widely used GIF format, images are compressed using a representation that can be produced by applying the Lempel-Ziv-Welsh algorithm (see "A Technique for High-Performance Data Compression," by T. Welsh, IEEE Computer, June 1984) to the pixels of the raw image. However, creating a GIF encoding does not require the use of the Lempel-Ziv-Welsh (LZW) algorithm. A GIF encoding that is produced by means other than LZW is still a GIF encoding. To emphasize the point, a GIF encoding might be produced (in principle) by generating all possible encodings, ordered by size, until one is found that decodes satisfactorily. Indeed, for some inputs, this theoretical approach yields better compression than LZW.

Deciding how to program the encoding of a GIF image is complicated (until 2003) by a patent (U.S. Patent 4,558,302. "High speed data compression and decompression apparatus and method") held by Unisys on the LZW algorithm. By all accounts, CompuServe was unaware of the patent when LZW was chosen for GIF. Programmers wishing to generate GIFs without addressing this legal issue must consider alternative algorithms. In Compressed Image File Formats (ACM Press, 1999), John Miano discusses an approach that is very simple, at the cost of negative compression:

The easiest way to implement a GIF encoder without using LZW is to simply encode each data byte using 9 bits and output a clear code after every 254 codes.

Can you do better than 9 bits per pixel without infringing the patent?

Line Drawing Images

Recently, in programming a GIF generator, I had already started using Miano's approach when it struck me that, because all the images were line drawings, they would yield to a significantly more effective algorithm. Like Miano's, this algorithm uses none of the mechanisms of LZW. Unlike Miano's, it produces satisfactory compression for line drawings.

For these purposes, the crucial property of a line drawing is that most of it is empty; that is, it is dominated by long runs of background color. The dominance of background color runs makes a line drawing an ideal candidate for run-length encoding (see Fundamentals of Interactive Computer Graphics, by James D. Foley and Andries Van Dam, Addison-Wesley, 1982). While LZW does not produce an explicit run-length encoding, its generic dictionary mechanism naturally discovers and exploits runs in the input. LZW does not distinguish consecutive occurrences of the same value from other sequences of values; it merely exploits repeated sequences. When processing a line drawing, the contents of an LZW dictionary is therefore dominated by ever-longer runs of background color.

Though motivated by line drawings, the proposed algorithm works well on any image, or other data, that would compress well under run-length encoding. For example, an image with gradient block fill will compress well if the gradient runs vertically, but not if the gradient runs horizontally. A vertical gradient produces horizontal rows of uniform color that are, therefore, runs that can be exploited; a horizontal gradient produces runs in vertical columns, which cannot be exploited.

An Alternative Algorithm

The algorithm I propose here is derived by studying the input-output behavior of LZW over runs of a single input value, but without depending on its implementation in any way. The analysis is based on feeding runs into LZW and examining its output. For such inputs, LZW's output can be characterized by a simple formula based on the "triangle numbers." The proposed algorithm exploits this formula by inverting the underlying quadratic form into a closed expression that is based on a square root. In effect, LZW's dictionary is replaced by taking the square root of the length of a run.

Analyzing LZW

The basis of the proposed algorithm can be seen by looking at the output from LZW, as implemented in Listing One — a Java method called compress that's modeled on code from Miano. For simplicity of analysis, the input alphabet is restricted to the letters "a" through "z," and the dictionary codes are represented by strings such as "(12)." In practice, GIF literal values and dictionary codes are represented by variable-length bit fields, a complexity that's moot for the purposes of this article.

Note that this code embodies part of the algorithm that is protected by the LZW patent, which places legal constraints on its use. However, if you are not interested in the details of LZW, you can safely ignore all of the implementation of the compress method. All that matters is that it maps an input string to an output string. Indeed, a central point of this article is that there is no need to know LZW's implementation; it is to be replaced completely.

To illustrate the basic operation of the method, the call

compress("mississippi")

yields the result

"miss(1)(3)ppi"

The output starts with four literal values from the input, followed by the dictionary entry (1), which represents is. Next comes dictionary code (3), representing si, followed by three more literal values. The 11-letter input has been compressed to 7 letters and 2 dictionary codes in the output. The degree of compression usually grows with the length of the input, until it reaches an asymptote.

The string "mississippi" is representative of arbitrary input that contains repeated sequences in an arbitrary manner. LZW happens to exploit the repeated is and si because of the particular behavior of its dictionary, as shown in the compress method. However, for encoding line drawings, we are interested in how LZW encodes runs of a single character. As presented here, we are interested in how the compress method encodes a string such as "aaaaaaaaaaaaaaa." Analyzing the numerical patterns in the input-output mapping of the compress method over such strings permits an equivalent closed-form implementation.

Listing Two shows the effects of LZW's dictionary mechanism on inputs that are runs of the same character. It prints the LZW encoding of the strings that are from 1 to 20 repetitions of the single character "a." Figure 1 is the output of generateCompressMapping (Listing Two).

By observation, the output always begins with an instance of the underlying value in the run, in this case a. Then, for inputs with a length that is in the sequence 1, 3, 6, 10, 15,..., the "triangle numbers," the output is a uniformly increasing sequence of dictionary codes. The length of the sequence grows by one at each new triangle number. For an input length that is not a triangle number, the output is the same as the output of the preceding triangle number, with one additional value or code.

The triangle numbers are given by the formula N(N+1)/2, for N=1, 2, 3,... It is the square of N in this formula that results in a square root in the proposed algorithm.

For inputs that are runs, the proposed algorithm is constructed by inverting the underlying quadratic form, N(N+1)/2. For a given run length, it determines the largest triangle number that does not exceed the length. From that position in the triangle number sequence, and the difference between the run length and the triangle number, the output can be generated directly.

This algorithm, shown as the encodeRun method in Listing Three, operates exclusively in terms of arithmetic — there is no dictionary. Its key operation is a square root that determines the index within the triangle sequence (saved in the index variable). The square root arises from finding a root of a quadratic equation. The variable triangle holds the triangle number itself. The method's input parameters are the repeated value and the number of repetitions.

For the special case of runs of a single value, this method always yields the same result as the corresponding call to the compress method. For example, compress("xxxxx") and encodeRun('x',5) yield the same result, "x(0)(0)."

Applying the Algorithm

Using the algorithm to encode GIF images is straightforward. For the purposes of this article, I have ignored details such as establishing the GIF color table and dealing with the resulting bit-field lengths. Focusing purely on the compression step that is usually performed by LZW, the modification is easy to code. For that compression step, an encoder merely breaks the input image into runs, and then processes each run with the algorithm shown as encodeRun (Listing Three).

Measurements

In pathological cases, the proposed algorithm can match — and even exceed — the compression achieved by the conventional use of LZW when encoding a GIF image. However, for the kind of line drawings considered here, a GIF encoded by encodeRun is about four or five times the size of that generated by LZW.

Figure 2 is a sequence diagram, a mixture of straight lines and text. This image is 457 X 308 pixels. For this, LZW generates a GIF that is 2164 bytes. An encodeRun-based encoder generates a GIF that is 9462 bytes. For further comparison, the trivial LZW-free encoding suggested by Miano requires about 9 bits per encoded pixel, or about 160 KB for this GIF.

Conclusion

Data that would compress well under a run-length encoding mechanism can be encoded in a manner that is compatible with LZW's output, but independent of its patented algorithm. The technique is useful for encoding line-drawing images in the GIF format. While the level of compression does not normally approach that of LZW, it significantly exceeds a previously published mechanism, and may be sufficient in many settings.

DDJ

Listing One

String compress(String input) {
  Properties dictionary = new Properties();
  for( char letter = 'a'; letter<='z'; ++letter )
    dictionary.put(""+letter, ""+letter);
  int generatedCode = 0;
  String output = "";
  String last = "";
  for( int i=0; i<input.length(); ++i ) {
    String current = last + input.charAt(i);
    if( dictionary.get(current)==null ) {
      output += dictionary.get(last);
      String newCode = "("+generatedCode+")"; 
      dictionary.put(current, newCode);
      ++generatedCode;
      last = input.substring(i,i+1);
    } else
      last = current;
  }
  output += dictionary.get(last);
  return output;
}

Back to Article

Listing Two

void generateCompressMapping() {
  String run = "";
  for( int i=1; i<=20; ++i ) {
    run += 'a';
    String compressed = compress(run);
    System.out.println((i<10?" ":"")+i+": "+compressed);
  }
}

Back to Article

Listing Three

String encodeRun(char input, int repetitions) {
  String output = ""+input;
  int index = ((int)Math.sqrt(8*repetitions+1)-1)/2;
  for( int i=1; i<index; ++i )
    output += "("+(i-1)+")";
  int triangle = index*(index+1)/2;
  int remainder = repetitions-triangle;
  if( remainder==1 )
    output += input;
  if( remainder>1 )
    output += "("+(remainder-2)+")";
  return output;
}

Back to Article


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video