Algorithm Alley

Last month, Tom introduced algorithms and test programs for compressing and decompressing Windows bitmap files. This month, he presents the remaining test programs and a C++ utility that compresses real 256-color bitmap files.


September 01, 1993
URL:http://www.drdobbs.com/database/algorithm-alley/184409077

SEP93: ALGORITHM ALLEY

I'm not much of a gambler. Never mind what Jeff Duntemann had to say about my skills at the rubber-chicken toss in Las Vegas at the Circus Circus hotel one Comdex ago (DDJ, March 1991). True, I beat the feathers off Jeff that evening as we catapulted flacid poultry into pots rotating on a lazy Susan ringed by other chicken-hearted players. But you mustn't believe my friend Jeff when he claims high-stakes innocence. As I recall, it took three of us to drag him away complaining "this time for sure!" from the quarter avalanche machine--or whatever they call those jukebox-size boxes into which one flings quarters at layers of coins resting under oscillating pushers that promise to shove riches onto your lap on every turn. Jeff told me that he can tell just by looking, which machine is about to pay off. Maybe he's right, but I'll stick to chickens.

Actually, during that same trip, I did win $6.50 on five aces at a nickel poker slot machine, so perhaps I do have a smoldering Midas touch. Frankly, I doubt it, but when it comes to testing algorithms, who couldn't use a high-power lucky star? Rubber chickens and two-bit avalanches might not help in software development, but sometimes, there's simply no other way to prove an algorithm's correctness except to trust in chance.

Monte Carlo Method

The term "Monte Carlo Method," named after the famed Monaco casino, came into being during the mid-40's from its use in mathematical problems that were solved with random numbers. Now the term generally refers to test procedures that simulate a computer program's input data with a random-number generator. A typical Monte Carlo test repeatedly puts a program through its paces until an error is detected, or until the programmer decides enough testing is enough. In short to, "Monte Carlo" your code, as some would say, you write a program to create sample data files at random, and then feed those files into the program until you are satisfied that it's working correctly.

How do you know when to stop a Monte Carlo test? You don't. It's a gut feeling--like the one you get when you just know the dealer is about to turn up an ace. That feeling might steer you wrong in blackjack, but in programming, if you run a Monte Carlo test long enough--and if your test procedures are sound--it's a good bet the code is working. For a Monte Carlo test to be trustworthy, however, it should follow other tests that prove assumptions you've made about a program or algorithm. Only after you are reasonably sure that a program is working should you subject it to Monte Carlo scrutiny. Just throwing a bunch of random numbers at a program won't do any more for your code than pitching elastic birds into pots has done for my bank account.

The value of the Monte Carlo method is especially high for a program that parses data files of many different shapes and sizes, such as the Windows bitmap compressor introduced last month. With it you can test how well a program handles common data sequences. You can also test the code's outer limits. But you can never test all possible sequences the program will ever meet. That's where the Monte Carlo method comes in. The secret is to create random test files that closely resemble the real data the program will spend its life consuming.

Last month, I introduced algorithms and two test programs for compressing and decompressing Windows bitmap files. To simplify debugging, rather than mangle real bitmaps, I programmed the tests to operate on phony bitmaps stored in text files. This month, I'll list the remaining test programs (including a Monte Carlo sample-file generator) and the final C++ utility that can compress real 256-color bitmap files. I'll also point out a few quirks in the algorithms that gave me trouble.

Test Suite

Listing One (page 132), CREATE.CPP, is a C++ program that creates sample bitmap files in the format illustrated in Figure 1, repeated from last month. This isn't a real bitmap file, of course. It's a text representation of pixels that makes debugging and testing easier. The first two values represent the number of pixels per line (0A) and the number of scan lines in the image (04). The remaining values are hexadecimal bytes, one per pixel.

Just outputting pixel values at random would create unsuitable test files with few same-color runs. Such files would not exercise the algorithm's primary compression technique in which pixel groups such as 07 07 07 07 are run-length encoded as two values, 04 07 (four pixels of color 07). To create more realistic test files in the format illustrated by Figure 1, CREATE uses a simple but effective method listed in Pascal in Example 1. First, set a pixel to a random value from 0 to 255. Then, in a For loop (NP represents the number of pixels in a scan line), use a random function to decide whether to change the current pixel to another value. In this way, the program outputs pixel runs as might be found in a typical bitmap. The key here is to use an expression such as If (random(100)>=80) to decide whether to change an aspect of the output rather than directly use random()'s result. In this case, the if statement causes pixels to change color only about 20 percent of the time.

Listing Two (page 132), COMPARE.CPP, completes the test suite by comparing two sample bitmap files, ignoring the two-value information line at top (see Figure 1). Listing Three (page 132), MONTE.BAT, uses CREATE and COMPARE along with TPACK and TUNPACK from last month's column to test the bitmap compression and decompression algorithms in Monte Carlo style. Run MONTE.BAT with all four C++ programs compiled in the current directory. The test ends automatically upon detecting any errors, or you can press Crtl+C to end the test when you are satisfied the code is working.

Bitmap Compressor

Listing Four (page 132), BPACK.CPP, is the final bitmap compression utility. I wrote the code in C++, but I did not use IOStreams or classes, so the program should be easily ported to Pascal, C, and other languages. The program can compress only 256-color, Windows bitmap files. If you don't have any files in that format, load any 16-color bitmap into Windows Paintbrush, and save it under a different name with Type set to "256-Color bitmap." Or, just create a new image. If your file is named MYBITS.BMP, compress it with the DOS command BPACK MYBITS.BMP PACKED.BMP, then compare file sizes. Most files compress to about one half of their original size, but in some cases, BPACK has produced compression ratios as good as 80 percent.

Of course, you can also use a compression utility such as PKZIP or LHA to compress bitmap files. To view images compressed this way, however, you first have to decompress them to other files. To view images compressed by BPACK, you simply view them. Most Windows display drivers can directly display packed bitmaps. Unfortunately, though, not all Windows programs can do the same. For example, loading a packed bitmap into Paintbrush produces the error "The format of this file is not supported." (You'd think Windows' own programs would recognize bitmap compression, but evidently, this isn't the case.)

To view a packed bitmap image, therefore, you must use another utility such as the BITZOOM.CPP program that I wrote for Borland's The World of ObjectWindows video, or the program MDIBITS.PAS in my book, Borland Pascal 7.0 Programming for Windows (Bantam, 1993). The Pascal program has a bug, however, that prevents it from loading packed bitmaps. To repair the problem, add a global Word variable named Result to file UBITMAP.PAS, and change all instances of BlockRead(X,Y,Z) to BlockRead(X,Y,Z,Result). Without the Result parameter, BlockRead forces a run-time error of 100 if the number of bytes read from a file do not match the requested quantity in parameter Z. The repaired files are included on disk and on line with this month's files.

Quirks

The following are some suggestions for improving BPACK, and a few details about quirks in the compression algorithms. See last month's column for explanations about terms used in these items:

Your Turn

I may not be a gambler, but I'll wager this isn't the last "Algorithm Alley" on data compression--one of the most fascinating subjects in programming. Feel free to send comments, code, and algorithms to me in care of DDJ. You can also upload text files to my CompuServe ID, 73627,3241. Or, if you want to chat in person, try the rubber chicken arena at Circus Circus. I'm usually there.

Figure 1: Sample "fake" bitmap text file

0A 04
01 01 01 02 02 02 02 02 03 03
01 02 03 04 05 06 07 08 09 0A
01 02 03 04 04 04 01 02 03 04
01 02 01 02 01 02 01 02 01 02

Example 1: Generating pixel runs at random

pixel := random(256);
for I := 0 to NP - 1 do
begin
  if (random(100) >= 80)
    then pixel := random(256);
  Write(pixel)
end;



_ALGORITHM ALLEY_
by Tom Swan


[LISTING ONE]

/* ----------------------------------------------------------- *\
**  create.cpp -- Create random test bitmap file               **
**     Copyright (c) 1993 by Tom Swan. All rights reserved.    **
\* ----------------------------------------------------------- */

#include <iostream.h>
#include <iomanip.h>
#include <fstream.h>
#include <stdlib.h>
#include <time.h>

typedef unsigned char Byte;

void Error(const char *msg);
void RandomizeScanLine(Byte *sl, int np);
int RandomRange(int low, int high);
void PutByte(Byte b);

int main()
{
  int np;  // Number of pixels per scan line
  int ns;  // Number of scan lines

  cout << setiosflags(ios::uppercase);
  cout << setfill('0') << hex;
  randomize();
  np = RandomRange(10, 640);
  ns = RandomRange(4, 480);
  cout << setw(2) << np << ' ' << setw(2) << ns << endl;
  Byte *sl = new Byte[np];  // Allocate scan line
  if (!sl) Error("out of memory");
  while (ns-- > 0) {
    RandomizeScanLine(sl, np);
    for (int i = 0; i < np; ++i)
      PutByte(sl[i]);
    cout << endl;
  }
  delete sl;
  return 0;
}
// Display error message and halt
void Error(const char *msg)
{
  cerr << endl << "Error: " << msg << endl;
  exit(1);
}
// Insert random pixel values into scan line sl
void RandomizeScanLine(Byte *sl, int np)
{
  int pixel = random(256);
  for (int i = 0; i < np; ++i) {
    if (random(100) >= 80)
      pixel = random(256);  // 20% chance of new pixel value
    sl[i] = pixel;
  }
}
// Return integer at random from low to high
int RandomRange(int low, int high)
{
  return low + random((high - low) + 1);
}
// Write byte b in hex in 2 columns with leading 0
// plus one blank to cout
void PutByte(Byte b)
{
  cout << setw(2) << (unsigned int)b << ' ';
}



[LISTING TWO]


/* ----------------------------------------------------------- *\
**  compare.cpp -- Compare files created by tpack and tunpack  **
**     Copyright (c) 1993 by Tom Swan. All rights reserved.    **
\* ----------------------------------------------------------- */

#include <iostream.h>
#include <iomanip.h>
#include <fstream.h>
#include <stdlib.h>

void Error(const char *msg);
int main(int argc, char *argv[])
{
  int np, ns;  // Number of pixels, number of scan lines
  int b1, b2;  // Bytes from files 1 (original) and 2 (converted)

  if (argc <= 2)
    Error("filenames missing (enter two names)");
  ifstream ifsOriginal(argv[1], ios::in);
  if (!ifsOriginal)
    Error("unable to open file #1");
  ifstream ifsConverted(argv[2], ios::in);
  if (!ifsConverted)
    Error("unable to open file #2");
  // Skip number of pixels np and number of scan line ns
  // at beginning of original test data
  ifsOriginal >> hex >> np >> ns;
  while (!ifsOriginal.eof()) {
    ifsOriginal >> hex >> b1;
    ifsConverted >> hex >> b2;
    if (b1 != b2) Error("Files are different");
  }
  cerr << endl << "Files match" << endl;
  return 0;
}
// Display error message and halt
void Error(const char *msg)
{
  cerr << endl << "Error: " << msg << endl;
  exit(1);
}



[LISTING THREE]


@echo off
rem
rem monte.bat -- Monte Carlo batch file for bitmap tests
rem
:REPEAT
echo.
echo Starting new test
echo Deleting old bitmap.00? files
del bitmap.00?
echo Creating bitmap.000 test file
create >bitmap.000
echo Packing bitmap.000 to bitmap.001
tpack bitmap.000 >bitmap.001
echo Unpacking bitmap.001 to bitmap.002
tunpack bitmap.001 >bitmap.002
echo Comparing bitmap.000 and bitmap.002
compare bitmap.000 bitmap.002
if errorlevel 1 goto ERROR
goto REPEAT
:ERROR
echo.
echo ERROR: File mismatch found!!!
echo.
:END



[LISTING FOUR]


/* ----------------------------------------------------------- *\
**  bpack.cpp -- Pack (compress) a Windows bitmap file         **
**     Copyright (c) 1993 by Tom Swan. All rights reserved.    **
\* ----------------------------------------------------------- */

#include <stdio.h>
#include <stdlib.h>
#include <alloc.h>
#include <windows.h>

// Miscellaneous definitions
#define FALSE 0
#define TRUE 1
// State-machine definitions
#define READING 0         // General reading mode
#define ENCODING 1        // Encoding same-color pixel runs
#define ABSMODE 2         // Absolute-mode encoding
#define SINGLE 3          // Encoding short absolute-mode runs
#define ENDOFLINE 4       // End of scan line detected
// Type declarations
typedef unsigned char Byte;
struct BitmapStruct {
  BITMAPFILEHEADER bfh;   // Bitmap file header
  BITMAPINFOHEADER bih;   // Bitmap info header
  RGBQUAD *bmiColors;     // Pointer to color table (variable size)
  int clrSize;            // Number of colors in table
};
// Function prototypes
void Instruct();
void Error(char *msg);
int Odd(int v);
void ReadBitmapHeader(FILE *inpf, BitmapStruct &rbs);
int IsCompressible(BitmapStruct &rbs);
void WriteBitmapHeader(FILE *outf, BitmapStruct &rbs);
void WriteBitmapBits(FILE *inpf, FILE *outf, const BitmapStruct &rbs);
void PackRLE8(FILE *outf, int np, const Byte *sl);
void PutByte(FILE *outf, Byte b);

// Global variable
long imageSize;
int main(int argc, char *argv[])
{
  BitmapStruct bs;
  puts("\nBitmap file compressor by Tom Swan");
  if (argc <= 2) Instruct();
  FILE *inpf = fopen(argv[1], "rb");
  if (!inpf) Error("Can't open input file");
  ReadBitmapHeader(inpf, bs);
  if (!IsCompressible(bs)) Error("Cannot compress this file");
  FILE *outf = fopen(argv[2], "wb");
  if (!outf) Error("Can't open output file");
  printf("Compressing file...");
  WriteBitmapHeader(outf, bs);      // Write dummy header
  WriteBitmapBits(inpf, outf, bs);  // Compress bitmap image
  bs.bih.biCompression = BI_RLE8;   // Mark bitmap as compressed
  bs.bih.biSizeImage = imageSize;   // Modify image size value
  fseek(outf, 0, SEEK_END);         // Seek to eof for next statement
  bs.bfh.bfSize = ftell(outf);      // Modify file size in bytes
  WriteBitmapHeader(outf, bs);      // Write real header
  fclose(inpf);
  fclose(outf);
  printf("\n%s --> %s\n", argv[1], argv[2]);
  delete bs.bmiColors;
  return 0;
}
// Display instructions and exit program
void Instruct()
{
  puts("\nSyntax: BPACK infile outfile");
  puts("\nEnter the name of a bitmap (infile) to compress.");
  puts("The program packs the bitmap if possible, and");
  printf("stores the results in a new file (outfile).\n");
  puts("The original bitmap file is not changed in any way.");
  puts("This version is limited to 8-bit (256-color) files");
  puts("that are not already compressed.");
  exit(0);
}
// Display error message and exit program
void Error(char *msg)
{
  printf("\nERROR: %s\n", msg);
  exit(1);
}
// Return true if v is odd
int Odd(int v)
{
  return v & 0x01;
}
// Read bitmap headers and color table into rbs
void ReadBitmapHeader(FILE *inpf, BitmapStruct &rbs)
{
  if (fread(&rbs.bfh, sizeof(rbs.bfh), 1, inpf) != 1)
    Error("Cannot read bitmap file header");
  if (fread(&rbs.bih, sizeof(rbs.bih), 1, inpf) != 1)
    Error("Cannot read bitmap info header");
  if (rbs.bih.biClrUsed != 0) {
    rbs.clrSize = (int)rbs.bih.biClrUsed;
  } else switch (rbs.bih.biBitCount) {
    case 1: rbs.clrSize = 2; break;
    case 4: rbs.clrSize = 16; break;
    case 8: rbs.clrSize = 256; break;
    case 24: rbs.clrSize = 0; break;
    default: Error("biBitCount not 1, 4, 8, or 24");
  }
  if (rbs.clrSize == 0) Error("clrSize == 0");
  rbs.bmiColors = new RGBQUAD[rbs.clrSize];
  if (!rbs.bmiColors)
    Error("bmiColors is null. Out of memory.");
  if (fread(rbs.bmiColors, sizeof(RGBQUAD),
    rbs.clrSize, inpf) != rbs.clrSize)
    Error("Cannot read color table");
}
// Returns true if bitmap header rbs is compressible
// Required format: MS DIB, uncompressed, 8-bit (256-color)
int IsCompressible(BitmapStruct &rbs)
{
  if (rbs.bfh.bfType != 0x4d42) return FALSE;
  if (rbs.bih.biSize != sizeof(BITMAPINFOHEADER)) return FALSE;
  if (rbs.bih.biBitCount != 8) return FALSE;
  if (rbs.bih.biCompression != BI_RGB) return FALSE;
  return TRUE;
}
// Write bitmap headers and color table in bs to outf
void WriteBitmapHeader(FILE *outf, BitmapStruct &rbs)
{
  rewind(outf);
  if (fwrite(&rbs.bfh, sizeof(rbs.bfh), 1, outf) != 1)
    Error("writing bitmap file header");
  if (fwrite(&rbs.bih, sizeof(rbs.bih), 1, outf) != 1)
    Error("writing bitmap info header");
  if (fwrite(rbs.bmiColors, sizeof(RGBQUAD),
    rbs.clrSize, outf) != rbs.clrSize)
    Error("writing color table");
}
// Read pixel data from inf, compress and write to outf
void WriteBitmapBits(FILE *inpf, FILE *outf, const BitmapStruct &rbs)
{
  int np;      // Number of pixels per scan line
  int ns;      // Number of scan lines
  int slSize;  // Size of one scan line in bytes
  Byte *sl;    // Pointer to scan line
  // Assign miscellaneous sizes
  np = (int)rbs.bih.biWidth;
  ns = (int)rbs.bih.biHeight;
  // Allocate scan line buffer
  if (Odd(np))
    slSize = np + 1;  // Must have an even number of bytes
  else
    slSize = np;
  if (slSize <= 0) Error("slSize <= 0");
  sl = new Byte[slSize];
  if (!sl) Error("out of memory");
  // Read and compress scan lines
  while (ns-- > 0) {
    if (fread(sl, 1, slSize, inpf) != slSize)
      Error("reading pixel scan line");
    PackRLE8(outf, np, sl);
  }
  delete sl;
  PutByte(outf, 0);  // Mark end of bitmap
  PutByte(outf, 1);
}
// Compress and write np pixels in sl to output file outf
void PackRLE8(FILE *outf, int np, const Byte *sl)
{
  int slx = 0;           // Scan line index
  int state = READING;   // State machine control variable
  int count;             // Used by various states
  Byte pixel;            // Holds single pixels from sl
  int done = FALSE;      // Ends while loop when true
  int oldcount, oldslx;  // Copies of count and slx

  while (!done) {
    switch (state) {
      case READING:
      // Input:
      // np == number of pixels in scan line
      // sl == scan line
      // sl[slx] == next pixel to process
        if (slx >= np)                      // No pixels left
          state = ENDOFLINE;
        else if (slx == np - 1) {           // One pixel left
          count = 1;
          state = SINGLE;
        } else if (sl[slx] == sl[slx + 1])  // Next 2 pixels equal
          state = ENCODING;
        else                                // Next 2 pixels differ
          state = ABSMODE;
        break;
      case ENCODING:
      // Input:
      // slx <= np - 2 (at least 2 pixels in run)
      // sl[slx] == first pixel of run
      // sl[slx] == sl[slx + 1]
        count = 2;
        pixel = sl[slx];
        slx += 2;
        while ((slx < np) && (pixel == sl[slx]) && (count < 255)) {
          count++;
          slx++;
        }
        PutByte(outf, (Byte)count);  // Output run-length-encoded unit
        PutByte(outf, pixel);
        state = READING;
        break;
      case ABSMODE:
      // Input:
      // slx <= np - 2 (at least 2 pixels in run)
      // sl[slx] == first pixel of run
      // sl[slx] != sl[slx + 1]
        oldslx = slx;
        count = 2;
        slx += 2;
        // Compute number of bytes in run
        while ((slx < np) && (sl[slx] != sl[slx - 1]) && (count < 255)) {
          count++;
          slx++;
        }
        // If same-color run found, back up one byte
        if ((slx < np) && (sl[slx] == sl[slx - 1]))
          if (count > 1)
            count--;
        slx = oldslx;  // Restore scan-line index
        // Output short absolute runs of less than 3 pixels
        if (count < 3 )
          state = SINGLE;
        else {
          // Output absolute-mode run
          PutByte(outf, 0);
          PutByte(outf, (Byte)count);
          oldcount = count;
          while (count > 0) {
            PutByte(outf, sl[slx]);
            slx++;
            count--;
          }
          if (Odd(oldcount))
            PutByte(outf, 0);  // End run on word boundary
          state = READING;
        }
        break;
      case SINGLE:
      // Input:
      // count == number of pixels to output
      // slx < np
      // sl[slx] == first pixel of run
      // sl[slx] != sl[slx + 1]
      while (count > 0) {
        PutByte(outf, 01);
        PutByte(outf, sl[slx]);
        slx++;
        count--;
      }
      state = READING;
      break;
      case ENDOFLINE:
        PutByte(outf, 0);
        PutByte(outf, 0);
        done = TRUE;
        break;

      default:
        Error("unknown state in PackRLE8()");
        break;
    }
  }
}
// Write byte b to output file outf. Increments global imageSize variable
void PutByte(FILE *outf, Byte b)
{
  if (fwrite(&b, 1, 1, outf) != 1)
    Error("writing byte to output file");
  imageSize++;
}




Copyright © 1993, Dr. Dobb's Journal

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.