Java and Digital Images

May99: Java and Digital Images

David and Johnny are cofounders of Object Guild Inc., specializing in object-oriented consulting, training, and software. They can be contacted at [email protected] and [email protected], respectively.

The ability to capture, store, and retrieve images is an often-overlooked feature that can benefit many applications. The recent introduction of low-cost video-capture hardware has created a significant market for videoconferencing and online collaboration software. In addition, image capture, storage, and retrieval capabilities are potentially useful in more mainstream software applications. Consider, for example, a patient-care application that stores a patient's photograph to reduce the chances of misidentification. Other applications of low-cost video-capture hardware include inventory control, surveillance, security systems, or adding marketing appeal to demos of software that lacks highly visible features. (Demos that take snapshots of people's faces and store them continuously can be very effective at demonstrating a Java application's database capabilities, for example.)

C++ applications have imaging and video libraries readily available. On Windows, the standard API for accessing video-capture devices is Video for Windows. A C++ program written against this API should work with any Windows-compatible camera. But what if you're developing in Java?

Interfacing Java applications to a video-capture device poses a special challenge because there is currently no easy way to access the camera from Java. (The Java Media Framework API from JavaSoft does not address video capture in the 1.0 release.)

The Java VM presents a barrier between applications and C/C++ APIs used to access the video camera. To access these APIs from Java, you must not only write JNI methods, but must also address image conversion problems, performance issues, and thread synchronization:

  • Captured images need to be converted to an image format readable from Java. Images returned by Video for Windows may be in one of several different formats, depending on the resolution, color model used, and bits per pixel.
  • Pixels need to be copied into the Java VM's memory space. If the application needs to capture frame-by-frame video, these memory transfers need to be optimized for speed.
  • Many low-level video-capture APIs use callback functions. Handling the callbacks requires synchronizing multiple threads in both Java and C++.
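The thread-synchronization problem in the last item can be sketched on the Java side as a small hand-off object. This is only an illustration under stated assumptions: the class name FrameMailbox and its methods are hypothetical and are not part of any API described in this article, and in a real system deliver() would be reached from the native callback through JNI.

```java
/** Hypothetical hand-off point between a capture callback thread and Java consumers. */
final class FrameMailbox {
    private int[] latest;   // most recent frame, or null if none has been delivered
    private long sequence;  // number of frames delivered so far

    /** Called on the capture (callback) thread whenever a frame is ready. */
    synchronized void deliver(int[] pixels) {
        latest = pixels.clone();  // copy, because the caller may reuse its buffer
        sequence++;
        notifyAll();              // wake any thread waiting for a newer frame
    }

    /** Called on a Java thread; blocks until a frame newer than lastSeen exists. */
    synchronized int[] awaitNewerThan(long lastSeen) throws InterruptedException {
        while (sequence <= lastSeen) {
            wait();               // releases the lock while waiting
        }
        return latest;
    }

    /** Number of frames delivered so far. */
    synchronized long sequence() {
        return sequence;
    }
}
```

Because both methods synchronize on the same object, a consumer never observes a half-written frame, and the loop around wait() guards against spurious wakeups.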

There are three approaches to incorporating video or image capture into a Java application, each with different usability/complexity tradeoffs:

No integration. Implement the image capture feature as an "open file" dialog, allowing users to select GIF or JPEG image files for the Java application to load. It is up to users to run third-party image capture utilities.

This approach avoids the problem altogether. The application gets images from a file, which could have come from a separate image-capture program connecting to a video device, or from any other source. All the application needs to do is read a GIF or JPEG image file, a trivial task in Java.

This may be an appropriate solution if the need for image capture is uncommon, but it is cumbersome for users. Not only must they run a separate application to capture and save the image, they must also remember where the image file was saved and locate it in a Java file dialog.

Loose integration. When image capture is needed, the Java application executes a separate C/C++ application that lets users interactively capture images. The application saves an image as a GIF or JPEG image file in a predefined location, and signals the Java application, which retrieves the file.

This is really an automated version of the "no integration" approach. The application spawns the image capture program for users, and automatically retrieves the file when users have closed the image-capture program.

Users do not need to manually start a separate application, and do not need to worry about saving and retrieving the image file.

This solution burdens you (the programmer) with the need to write a custom image-capture program in C/C++. Installation is more complex, as a separate native executable must be installed along with the Java application.

The biggest drawback with this approach is cosmetic. If the Java application relies on custom widgets or Swing components for its user interface, the native image-capture screen will unavoidably have a different look-and-feel from the rest of the application, which presents an unprofessional appearance to users.
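The Java half of the loose-integration approach can be sketched as follows. This is a minimal illustration, not the authors' code: the class LooseCapture, the helper command line, and the convention that exit code 0 means "a picture was saved" are all assumptions, and ProcessBuilder and javax.imageio are library facilities that postdate this article.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.imageio.ImageIO;

/** Hypothetical launcher for an external, native image-capture program. */
final class LooseCapture {
    /** Runs the helper, waits for it to exit, then loads the file it was asked to write. */
    static BufferedImage capture(List<String> helperCommand, File output)
            throws IOException, InterruptedException {
        if (output.exists() && !output.delete()) {
            throw new IOException("Cannot remove stale capture: " + output);
        }
        Process helper = new ProcessBuilder(helperCommand).inheritIO().start();
        int exitCode = helper.waitFor();  // blocks until the user closes the helper
        if (exitCode != 0) {
            return null;                  // assumed convention: nonzero means cancelled or failed
        }
        return loadIfPresent(output);
    }

    /** Returns the saved image, or null if the helper wrote no file or no readable image. */
    static BufferedImage loadIfPresent(File output) throws IOException {
        return output.isFile() ? ImageIO.read(output) : null;
    }
}
```

Deleting any stale file before launching the helper keeps an old snapshot from being mistaken for a new one.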

Full integration. As you might guess, this approach involves being able to directly access the video device from Java, using a combination of Java and native C++ methods to connect to the camera, capture image frames, and convert them to Java's image format.

From the usability standpoint, this is the best approach. Users need not perform extra steps to connect to the camera; when they want to take a picture, it is done through a normal Java dialog. This is essentially an "all Java" solution, albeit with underlying native code in the back end. Thus, it provides smoother UI flow for users, and more capabilities, allowing video as well as still-frame capture.

The bad news is that this approach is the most difficult to implement. You must interface with the camera driver in C++, implement efficient image format conversions and memory transfer operations, and handle JNI memory-management and thread-control issues.

The good news is that it is possible to encapsulate this solution in a set of classes with a public API. If the API is designed properly, this approach becomes no harder to implement than the no integration approach. At Object Guild, we've implemented such an API, called "Grabber for Java."

Description of Grabber

Grabber consists of a set of Java classes and a native method DLL that provide access to a video-capture device directly from a Java application. It defines an API for connecting to and disconnecting from the camera, adjusting image size, color depth, and frame rate for video capture, and capturing still images. It also includes Swing and AWT GUI classes that make it easy to perform basic tasks, such as continuous video capture and changing settings through dialogs.

The central goal in designing the Grabber API was to make video simple to incorporate into an application, while providing the power and extensibility to do more complicated tasks.

The video device is represented abstractly by the class com.objectguild.camera.VideoGrabber, which locates and connects to the video hardware device; performs frame-by-frame image capture, implementing all necessary image format conversions; and can return an image or raw pixel data, as a "snapshot." (The source code for a program that demonstrates this is available electronically; see "Resource Center," page 5.)

Before capturing images, the program must connect to the camera. This involves locating the camera driver, initializing the underlying Java and C++ classes, and signaling the camera to start capturing images. VideoGrabber reduces these tasks to a single connect() method, which throws an exception if the connection attempt fails. The disconnect() method invokes the low-level API calls to disconnect from the device, and frees memory on the C++ side.

VideoGrabber also defines methods for setting and retrieving image dimensions, color depth (bits per pixel), and frame rate.

Full-motion video in Grabber is automatic: you can install a specialized Canvas object as an observer of the VideoGrabber object. When the VideoGrabber is connected with a frame rate greater than 0, it updates the canvas whenever a new frame is captured. To take a snapshot, you call the snapshotImage() method, which returns an instance of java.awt.Image containing the latest captured frame. To store a snapshot in a database, VideoGrabber provides two lower-level methods: snapshotPixels(), which returns an int[] array containing the raw pixel data for the image, and getColorModel(), which returns the ColorModel associated with the pixels.
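As a rough illustration of the database case, raw pixel data of the kind snapshotPixels() returns could be packed into a single byte array for a BLOB column. The class PixelBlob and its layout (width, height, then pixels) are hypothetical, and the sketch assumes a direct color model in which each int is a complete pixel; with an indexed color model the palette would have to be stored as well.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that serializes an int[] pixel array for database storage. */
final class PixelBlob {
    /** Packs width, height, and pixels (in that order, big-endian) into one byte array. */
    static byte[] pack(int width, int height, int[] pixels) {
        if (pixels.length != width * height) {
            throw new IllegalArgumentException("pixel count does not match dimensions");
        }
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 * pixels.length);
        buf.putInt(width).putInt(height);
        buf.asIntBuffer().put(pixels);  // the view starts after the 8-byte header
        return buf.array();
    }

    /** Reads the image width back from a packed blob. */
    static int width(byte[] blob) {
        return ByteBuffer.wrap(blob).getInt(0);
    }

    /** Reads the image height back from a packed blob. */
    static int height(byte[] blob) {
        return ByteBuffer.wrap(blob).getInt(4);
    }

    /** Recovers the pixel array from a packed blob. */
    static int[] unpackPixels(byte[] blob) {
        ByteBuffer buf = ByteBuffer.wrap(blob);
        int[] pixels = new int[buf.getInt() * buf.getInt()];
        buf.asIntBuffer().get(pixels);
        return pixels;
    }
}
```

Storing the dimensions alongside the pixels means the image can be rebuilt later without consulting the camera settings that were in effect at capture time.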

A common task for applications incorporating live video capture is to open a window showing full-motion video camera images. To simplify this task, the API includes a class called VideoGrabberCanvas. This class, in conjunction with VideoGrabber, uses the Observer design pattern to allow the canvas to update itself automatically, whenever a new frame is captured from the camera; see Figure 2.

The Observer pattern provides a means of defining a one-to-many relationship between a single observable object and one or more observer objects, in which the observable object has no specific knowledge of its observers. The observable object can issue change notifications that are interpreted by each observer as it sees fit.

To implement this pattern, VideoGrabber extends java.util.Observable. It notifies its observers when a new frame is captured, or when the image dimensions or color depth are changed. VideoGrabberCanvas implements the java.util.Observer interface; when notified that the image has changed, it gets the latest image from the VideoGrabber and draws it on the canvas. VideoGrabberCanvas' constructor takes a VideoGrabber object as a parameter, and automatically registers itself as an observer of the VideoGrabber; see Listing One. Thus, once a VideoGrabberCanvas is instantiated and added to an AWT or Swing window, all the image updating and painting is done automatically, whenever the VideoGrabber is connected.
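The relationship described above can be sketched with two small stand-in classes. FrameSource and FrameView are hypothetical names used only for illustration; they mimic the roles of VideoGrabber and VideoGrabberCanvas without any camera or GUI code. Note that java.util.Observable and java.util.Observer have been deprecated since Java 9, so current compilers emit a warning for this code.

```java
import java.util.Observable;
import java.util.Observer;

/** Stand-in for VideoGrabber: announces each new frame to its observers. */
class FrameSource extends Observable {
    private int[] latestFrame = new int[0];

    void frameCaptured(int[] pixels) {
        latestFrame = pixels;
        setChanged();        // without this call, notifyObservers() does nothing
        notifyObservers();
    }

    int[] latestFrame() {
        return latestFrame;
    }
}

/** Stand-in for VideoGrabberCanvas: registers itself, then pulls each new frame. */
class FrameView implements Observer {
    int framesSeen;
    int[] shown;

    FrameView(FrameSource source) {
        source.addObserver(this);  // self-registration in the constructor
    }

    @Override
    public void update(Observable source, Object arg) {
        shown = ((FrameSource) source).latestFrame();
        framesSeen++;              // a real canvas would call repaint() here
    }
}
```

The source never refers to FrameView by type, so any number of views (or a recorder, or a motion detector) could observe the same source without changes to it.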

Another common task is to prompt users to take a snapshot. For example, in an application where users are entering identification information for an individual, it may be desirable to allow users to take a snapshot of the individual, to be included in the person's profile. In this case, a dialog would come up containing a canvas showing real-time video input from the camera. Users would click the OK button to take a snapshot and close the dialog. Grabber provides AWT and Swing versions of a dialog class containing a self-updating canvas. This class has the static method Image TakePicture(Frame,VideoGrabber) that, when called, opens a modal dialog and returns an image or null, depending on whether users took a picture or canceled the operation.

The combination of a simple yet comprehensive API and GUI support classes yields the ability to incorporate video capture with few lines of code. Listing Two is a button listener that causes a modal dialog to pop up, allowing the user to position the camera, and then take a snapshot.


Although Grabber for Java initially supported only Windows-compatible cameras, such as the Connectix Color QuickCam, its clean design allows the addition of support for any hardware/operating system platform with no coding changes for the applications, and minimal coding changes for Grabber itself. To achieve this goal, we isolate platform-specific code using multiple levels of abstraction, on both the Java and C++ sides; see Figure 1.

The first level of abstraction is the public API, defining the class com.objectguild.camera.VideoGrabber. This is what the applications use to connect to the camera.

The second level of abstraction is a protected Java class that sits between the VideoGrabber class and native code. This class, com.objectguild.camera.VideoDevice, is a Java-side representation of a video camera. It defines the native method interface, and is responsible for loading the proper native implementation DLL. Different camera devices or operating systems can be specified by subclassing VideoDevice. The set of native methods is surprisingly small. It includes methods for initializing, connecting, and disconnecting; a method for retrieving the contents of the last scanned frame into an int[] array; and a method for retrieving the current color map.

The native method implementations are simple delegators to a C++ class called VideoCam, which is the third level of abstraction. It defines an abstract interface to a generic video camera. Subclasses of VideoCam work with specific low-level video-capture APIs. The current Grabber implementation interfaces with the Video for Windows API. The Linux version will use a different subclass that talks directly to the Connectix QuickCam.

Since VideoCam defines a low-level API, to add support for a different OS or camera, you need only change the VideoCam C++ class, and subclass VideoDevice, overriding the method to load a different DLL.
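The Java side of this layering might look roughly like the sketch below. Every name in it is an assumption made for illustration: the classes VideoDeviceSketch, VfwVideoDevice, and QuickCamLinuxDevice, the native method names, and the library names are hypothetical and are not taken from Grabber's source.

```java
/** Hypothetical Java-side device class: declares the native interface and picks the library. */
abstract class VideoDeviceSketch {
    /** Each platform or camera subclass names its own native library. */
    protected abstract String libraryName();

    /** Loads the native implementation; must be called before any native method. */
    final void loadNativeLibrary() {
        System.loadLibrary(libraryName());
    }

    // A small native interface along the lines the article describes.
    native void initialize();
    native boolean connect();
    native void disconnect();
    native void readFrame(int[] pixels);  // copies the last scanned frame into pixels
    native int[] colorMap();
}

/** Windows variant: the native library would wrap Video for Windows. */
final class VfwVideoDevice extends VideoDeviceSketch {
    @Override
    protected String libraryName() {
        return "grabber_vfw";       // hypothetical library name
    }
}

/** Linux variant: the native library would talk to the camera directly. */
final class QuickCamLinuxDevice extends VideoDeviceSketch {
    @Override
    protected String libraryName() {
        return "grabber_quickcam";  // hypothetical library name
    }
}
```

Only the one overridden method differs between platforms, which is why application code written against the top-level class would not need to change.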

Even though you see only the top-most interface (the VideoGrabber class), using multiple layers of abstract classes provides a great deal of flexibility in adding support for different devices and operating systems.

Video for Windows

We chose to interface Grabber for Java with Video for Windows (VFW) because that is the de facto standard for video-capture devices on Windows systems. Because VideoGrabber talks to Video for Windows instead of a lower-level device driver, the VideoGrabber can connect to any camera built for Windows PCs.

The ability to support many cameras with the same code made VFW the obvious choice. However, in accessing VFW from Java, we ran into some setbacks resulting from VFW's tight integration with Windows. Among the problems were:

  • The need to create an invisible window, because each VFW function expects a handle to a window as a parameter.
  • Event conflicts. The whole program would freeze inside a VFW function call if the function was invoked while a Java button was in the pressed position. Some creative use of threads was required to work around this problem.

Once VFW grabs the frame, it invokes the callback function. This function must convert the image data from Windows' memory image format to Java's image format, and copy the converted data to a Java array.

Image format conversion is complicated by several idiosyncrasies of the Windows image format. In Windows' 24-bit image format, each pixel is represented by 3 bytes representing the blue, green, and red color components (BGR). In Java, the byte order for each pixel is Red-Green-Blue (RGB). Also, the Windows bitmap format stores the image upside-down. The first horizontal line in the bitmap corresponds to the last horizontal line in the displayed image. Java expects the bitmap to store the image right-side up. Thus, in copying the image data from the C array to the Java array, the line order must be reversed, and the order of the bytes in each pixel must be reversed as well.
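The two reversals can be shown in a few lines. Grabber performs this conversion in C++ inside the callback; the Java version below is only a sketch of the arithmetic, with a hypothetical class name. It assumes an uncompressed 24-bit bottom-up bitmap whose rows are padded to a multiple of 4 bytes, and it produces the packed 0xAARRGGBB ints used by Java's default color model.

```java
/** Hypothetical converter from a 24-bit bottom-up Windows bitmap to Java pixels. */
final class DibConverter {
    /**
     * Converts BGR bytes stored bottom row first (each row padded to a multiple
     * of 4 bytes) into top-down packed 0xAARRGGBB ints with full opacity.
     */
    static int[] bgrBottomUpToArgb(byte[] dib, int width, int height) {
        int stride = (width * 3 + 3) & ~3;           // bytes per padded source row
        int[] argb = new int[width * height];
        for (int y = 0; y < height; y++) {
            int srcRow = (height - 1 - y) * stride;  // reverse the line order
            for (int x = 0; x < width; x++) {
                int i = srcRow + x * 3;
                int b = dib[i] & 0xFF;               // mask to treat bytes as unsigned
                int g = dib[i + 1] & 0xFF;
                int r = dib[i + 2] & 0xFF;
                argb[y * width + x] = 0xFF000000 | (r << 16) | (g << 8) | b;
            }
        }
        return argb;
    }
}
```

Forgetting the row padding is a common source of skewed images whenever the width in bytes is not already a multiple of 4.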

Unfortunately, 24-bit BGR is only one of several possible Windows bitmap formats. The Windows image may also be in 4-, 8-, or 16-bit format, where each pixel is represented by an integer offset into a color palette. In this case, VideoGrabber detects the image format used, and creates a corresponding color palette in Java.
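For palette-based formats, the Java counterpart of a Windows color table is java.awt.image.IndexColorModel. The sketch below shows one way such a model could be built; the class and method names are hypothetical, and it assumes the palette arrives as Windows RGBQUAD entries, which store blue, green, red, and a reserved byte in that order.

```java
import java.awt.image.IndexColorModel;

/** Hypothetical helper that turns a Windows color table into a Java color model. */
final class PaletteSketch {
    /** Builds an opaque IndexColorModel from RGBQUAD bytes (blue, green, red, reserved). */
    static IndexColorModel fromRgbQuads(byte[] quads, int bitsPerPixel) {
        int size = quads.length / 4;  // one palette entry per 4 bytes
        byte[] r = new byte[size];
        byte[] g = new byte[size];
        byte[] b = new byte[size];
        for (int i = 0; i < size; i++) {
            b[i] = quads[i * 4];
            g[i] = quads[i * 4 + 1];
            r[i] = quads[i * 4 + 2];  // the fourth byte is reserved and ignored
        }
        return new IndexColorModel(bitsPerPixel, size, r, g, b);
    }
}
```

With a color model like this, the pixel array can stay as palette indices and Java resolves each index to a color when the image is drawn.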


To illustrate Grabber, we built a simple Java application that displays video-camera input, lets users take a snapshot, and saves the snapshot image to a JPEG file.

The window layout contains two canvas panels, side-by-side. The left panel displays the continuously updating image from the camera; the right panel displays the latest snapshot. Users can connect to and disconnect from the camera, alter the VideoGrabber's settings, take a snapshot, and save the snapshot to a JPEG file; see Figure 3.

The left panel contains a VideoGrabberCanvas, which is initialized with an instance of VideoGrabber when the application starts up. The right panel, which displays the snapshot, is a simple subclass of the Swing class JComponent. It paints the snapshot image, drawing a white border around it to simulate a photograph.

The left-most round button toggles the camera on and off. Listing Three is the code for doing this. To turn the camera on, it calls vc.startup() (where vc references the VideoGrabber instance). This method spawns a thread that connects to the camera and repeatedly captures frames at the default frame rate. To turn the camera off, vc.shutdown() is called, which disconnects from the camera and terminates the capture thread.

The middle button spawns com.objectguild.camera.ControlPanelFrame, a dialog for changing the VideoGrabber's dimensions, color depth, and frame rate; see Listing Four.

The rightmost button takes a picture by calling vc.snapshotImage() and causing the snapshot canvas to paint the image returned by that method; see Listing Five.

This application demonstrates the ease of incorporating video capture using Grabber for Java. Listings Three, Four, and Five contain virtually all of the camera-specific code; most of the development effort for this application went into laying out the components, and adding image-saving capability (using a public-domain Java JPEG class).

Grabber for Java is currently deployed at several beta sites. Current developments at the time of this writing include Linux support, and support for real-time video streaming and storage.


Listing One

public class VideoGrabberCanvas extends JComponent implements Observer {
    VideoGrabber vg;

    public VideoGrabberCanvas (VideoGrabber camera) {
        this.vg = camera;


Listing Two

Image img;
// Create an action listener which spawns a modal dialog.
captureButton.addActionListener(new ActionListener() {
  public synchronized void actionPerformed(ActionEvent e) {
    try {
      // Modal dialog blocks this thread until "picture" is taken.
      img = SnapshotDialog.TakePicture(TestDialog.this, vc);
    } catch (ConnectFailedException ex) {
                                     "Unable to connect to camera");



Listing Three

/** If camera is connected, shuts it down; otherwise, connects to device. */
private void toggleConnect () {
  if (connected) {
    connectButton.setToolTipText("Connect to camera");
    videoCanvas.repaint();  // clear the canvas
  } else {
    try {
    } catch (ConnectFailedException ex) {
      JOptionPane.showMessageDialog(this, "Unable to connect to camera",
                                    "Bummer", JOptionPane.ERROR_MESSAGE);
    connectButton.setToolTipText("Disconnect from camera");
  connected = !connected;
class ConnectItemListener implements ItemListener {
  public void itemStateChanged (ItemEvent e) {


Listing Four

  settingsButton = createButton(SettingsText, SettingsUpIcon, 
                                SettingsDownIcon, "Change camera settings");
class SettingsListener implements ActionListener {
  public void actionPerformed (ActionEvent e) {
    if (control == null) {
      control = new ControlPanelFrame(vc);


Listing Five

  snapshotButton = createButton(CameraText, CameraUpIcon, CameraDownIcon,
                                "Take picture");
class SnapshotListener implements ActionListener {
  public void actionPerformed (ActionEvent e) {

    image = vc.snapshotImage();


Copyright © 1999, Dr. Dobb's Journal
