Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Understanding Photomosaics


Nov01: Understanding Photomosaics

Manuel and Marcelo are professors at the National University of Mexico. They can be contacted at [email protected] and [email protected], respectively.


A mosaic is an image traditionally composed of small pieces of material, such as stone or glass. A photomosaic, on the other hand, is a digital image like Figure 1 that is made up of other digital images, pieced together by software. Photomosaics are generally credited to Robert Silvers, who developed the technique while he was a student at MIT. Silvers subsequently patented the process, founded Runaway (a company devoted to photomosaics; http://www.photomosaic.com/), and coauthored (with Michael Hawley) Photomosaics (Henry Holt & Company, 1997, ISBN 0-805-05170-8).

How good photomosaics are created is something that has not been formally analyzed, however. For one thing, the idea is relatively new; for another, the quality of a good photomosaic depends in part on properties of the human eye and how we see colors and images. Still, there are several issues to address in terms of the criteria for considering a specific image as part of the photomosaic process.

  • How can you define something similar to a minimum color distance between a part of the original picture and a bunch of photos in a library?
  • How big can each mosaic be before it loses the details of the original image?

  • How many photos should you have in your library?

  • How many repetitions can you have before the photomosaic looks monotonous?

For us, the key to understanding some of these problems came from the article "The Recognition of Faces," by Leon Harmon (Scientific American, November 1973). The first image in Harmon's article was Abraham Lincoln, made from a collection of solid gray mosaics. Harmon, a Bell Labs researcher who considered himself a "cyberartist," wrote a program called "Magna Dott," which was probably the first mosaic filter for a digital picture. The important thing here is the study of the redundancy of a picture, which in some way takes the essence of the picture with a simple mosaic filtering. The Lincoln image is particularly good for this effect. In fact, Salvador Dali took the Lincoln image and used it in his painting "Gala Mirando al Mar" (http://members.es.tripod.de/Salvador_Dali/dali132.htm). If you see the painting from, say, 60 feet away, you see the Abraham Lincoln mosaic picture. Closer to the painting, however, you instead see Dali's wife. The painting was widely known as "Lincoln in Dalivision," and certain editions were sold with a special lens to facilitate the viewing of the distinct images.

These two images — Harmon's and Dali's — hold the key to the secret of photomosaics. Instead of writing a mosaic filter that exchanges an area of the original picture with a solid color, you can write an enhanced mosaic filter that exchanges the area of the processed photo with some image of a very similar color. When we understood that this was the key for photomosaic software, we began writing software to work with this particular idea. The key elements of the process are:

  1. Take a scanned photo (source image) to process.

  2. Make a grid over this photo.

  3. Look at every cell on the grid.

  4. Calculate the average color of each cell.

  5. Find the nearest image in the library with the same average color of the cell and substitute it in that cell.

  6. Repeat the process for each cell in the grid.

Building a Photomosaic System

A photomosaic system is not a single program. It requires different programs to accomplish different tasks. Consequently, we first wrote a program to preprocess all images in the library. Its output is to a so-called index file — a text file in which each line contains the average color description of each image in the collection. Example 1 shows the format. Since you have to preprocess the entire image library, the photomosaic creation software uses this text file to find out which is the best image for each cell over the grid of the source image.

Once an index file is present, the photomosaic program can run properly. It needs the source image file, the directory where the index file is present (normally the same directory as the image library), and the output filename.

The system generates two output files — a text file, which is used by a JPEG generator program, and an HTML file (an HTML table), which lets you see the results immediately. Typically, the photomosaic generator software takes between two to five minutes to create the output.

Our JPEG generator program takes the output text file created by the photomosaic generator software to build the final image. The output file is simply a set of image filenames and a <br> symbol (a break instruction) when the software needs to go to the next line, as a line-feed/carriage-return in a document. The software uses Borland's Delphi 5 JPEG unit. However, this unit has limitations in terms of the size of the JPEG it can create. Most of the time, the JPEG image can be created without any type of problem, but if you want a very large mosaic, the system refuses to write the generated JPEG to hard disk. One solution (admittedly not the best) is to divide the source images into small parts, then join them using Photoshop.

The source code for the complete implementation of this system, including sample data (such as images) is available electronically; see "Resource Center," page 5.

Image Library and Color Average

Although our approach looked straightforward, we quickly realized that the photomosaic process required tens-of-thousands of high-quality digital images. (Silver claims a library of 100,000 or more scanned pictures.) Consequently, we started with approximately 6000 high-quality scanned images, all of which have 24-bit colors.

The first step was to understand how to find an average color for a picture. We knew, of course, that color pixels are made up of the three color components red, green, and blue (RGB). The combination of different amounts of RGB makes very specific colors. Our idea was to take the total sum of each RGB pixel on a photo, then divide the result with the total number of pixels used. We wrote a program just to find out the average color of all images in the library. We then created an output file (an "index") with this information because we didn't want to have to recalculate them for every cell on the grid. The average color of each picture in the image library is an attempt to make the software fast enough. Example 2 is the formal equation that defines this procedure, where n and m are width and height of each image in the library.

Color Metrics

At this point, we had a file with the average color for each of the 6000 images. Next, we needed some color criteria to establish the minimum distance in color between a region (cell) of the original image and all the images in the library. The simplest idea was to use Euclidean distance in terms of the RGB average colors. We used the equation in Example 3(a), where R1, G1, and B1 are the average RGB component of the cell and R2, G2, and B2 are the average RGB of each picture. The best fit should be the picture from the library with the minimum color distance to the cell analyzed. We called it a "color metric."

Additionally, we found a second metric, see Example 3(b), that doesn't depend on the RGB directly. With this metric, the minimum linear difference between the color number of the cell against all the color numbers of each image in the library should give the best fit. We called this a "linear metric."

It seemed to be enough metrics to try, but we later ran across a study by Thiadmer Riemersma about how humans perceive colors (http://www.compuphase .com.cmetric.htm). We applied this last metric, which we called the "Riemersma metric," using the equation in Example 3(c).

Riemersma takes into consideration the way human eyes see red, green, and blue. Therefore, the fundamental algorithm to produce a photomosaic is:

  1. Take each cell grid from the source picture.

  2. Calculate in run time the average color of the cell.

  3. Find the minimum distance between this number compared with the average color of each image in the library (using one of the color metrics).

  4. Substitute the cell with this image.

  5. Keep going until each cell on the source image has been processed; see Figure 2.

The output was another problem. Our idea was to create a program to build an output image taking the selected images from the library. Luckily, we came up with a simple solution: The output of the program would be a web page; in fact, an HTML table like Example 4. Later, we wrote a complete program (available electronically) to handle all the selected images necessary to build a photomosaic (in JPEG format).

Initial Results

The success of our initial attempts clearly depended on the capabilities of the metric we used and how well they behaved in terms of the output picture. Figures 3, 4, 5, and 6 represent the source image to process, and the linear, Euclidean, and Riemersma metrics, respectively. In these instances, we are still not dealing with image repetition — only with how good the photomosaic looks compared with the original source image. The least successful approach was the linear metric, while both the Euclidean and Riemersma metric worked equally well.

In the process, we discovered other issues central to understanding mosaic technology:

  • Quality of a source picture. Not all images are good enough to be processed by photomosaic software. High-contrast images work the best. Of course, if the image has a lot of colors, it is better. A good balance between contrast and brightness is the best approach. Fortunately, we can manipulate these parameters using applications such as Photoshop, Retriever, ACDSee, and so on.
  • Library of images. For good results, the photomosaic software must have a large selection of high-quality images. Good selection criteria would be to find 24-bit color images with a resolution of 800x600 pixels, more or less. Small pictures, even with 24-bit color, generally aren't good enough.

We did run across a royalty-free CD-ROM from Global Star Software (http://www.globalstarsoftware.com/) with 10,000 high-quality images, and decided to replace our old collection with the new one. Surprisingly, however, our 6000-image collection produced better results. We determined that the problem was likely the spectrum of the image collection — how many different color images do we have in this collection? We processed the index file (which has all the information of RGB average for each image in the library), and found that our original 6000 image library had a better distribution of color images. The new 10,000 image library, however, had too many dark images. The moral of the story? A balanced collection of images is good. Too many light or dark pictures is bad.

Improvements

There is, of course, room for improvement to our photomosaics software. The most popular idea to enhance a photomosaic is to use a technique called "blending." (For more information, see "Alpha Blending Graphic Images," by Tim Wittenburg, DDJ, August, 1995.) This technique actually blends the original image with the photomosaic in some percentage. Blending enhances a photomosaic because if the criteria for selecting images is just based on the metric used, you would have to be lucky to find not only the best color image to substitute, but one with a shape similar to that of the original picture. This involves too much luck. Our software doesn't do blending; instead, we use Adobe's Photoshop program.

Good choices to enhance a photomosaic are cropping images or using only part of the image selected. Silvers does this frequently. But clearly more time is needed to implement these algorithms.

The preprocessing is arduous, however. Suppose each library image is preprocessed using a 3x3 or 5x5 grid (in these terms, we're using a 1x1 grid). Instead of having one average color for the entire image, you have nine average colors for 3x3 regions of a picture. Now, you will have to change the color metric used in order to take into account this new information about each image in the library. This probably significantly increases the process time of a photomosaic. Actually, a 550x550 24-bit color source image can be processed in about three minutes.

To repeat the same image around the photomosaic can result in monotony. Nevertheless, it is not easy to avoid repetitions if you don't have a huge collection of images. One technique is to mark the image selected and avoid using it until some parameter changes. For example, our photomosaic software can avoid around 200 repetitions (the number can be changed by users). When the software finds an image already used, it tries to find the second best fit. If the second best fit was also already used, it tries to find the third best fit, and so on. When the counter of images used equals the repetition limit, it clears to zero, clears the marks, and restarts operations. Avoiding repetition makes a more spectacular photomosaic, even if the best images weren't always selected.

The first time we tested the photomosaic program, we needed to understand what the software was doing and which image was being selected for each cell in the original picture. We then added debugging tools. The system can show the image selected and the cell being worked on. Also, a progress bar shows the advancement of the processing. Finally, just for the record, the system shows a message window with the total processing time.

Maybe a photomosaic doesn't need to be forced to find the best fit (the minimum color distance) for each cell in the grid. Maybe we can give some tolerance range to the selection process. That means giving an extra parameter to select the images between at least some specific values. We have not tried this idea yet, but it would be another parameter to play with. Some further testing could be to show how good it might be to have a tolerance factor.

For More Information

Castelle, Michael. "Fast Multiresolution Image Querying," http://www.grail.cs.washington.edu/projects/query/.

De Bonet, J.S. and P. Viola. "Novel Statistical Multiresolution Techniques for Image Synthesis, Discrimination, and Recognition: A Non-Parametric Multi-Scale Statistical Model for Natural Images." http://www.ai.mit.edu/~jsd/Research/Publications, 1997.

Husney, Jordan. Linux Image Montage Project (LIMP), http://linux-green.lanl.gov/linux/listsgimp-announce/html/001.html.

Tran, Nicolas. "Generating Photomosaics: An Empirical Study," ACM Communications, 1998.

Troebs, Michael. "Juggle, A Java Photomontage Program," October 1999, http://www.stud.uni-hannover.de/~michaelt/juggle/.

DDJ


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.