Stereoscopic Imaging

By Andy Ramm, September 01, 1997

Stereoscopic imaging gives you the ability to deliver a user interface that provides a natural 3-D view of an object or scene, dramatically increasing technical proficiency and the ability to interpret multidimensional data.

Dr. Dobb's Journal September 1997:

Creating images in SGI GL and OpenGL

Andy is a product manager for StereoGraphics and can be contacted at [email protected].

Stereoscopic imaging is a critical element of many design and prototyping, molecular modeling and computational chemistry, GIS, and virtual reality applications. The ability to deliver a user interface that provides a natural 3-D view of an object or scene dramatically increases technical proficiency and the ability to interpret multidimensional data.

In the past, complicated mathematical models had to be used to create stereoscopic views within graphics-based applications. Today, many of the complex algorithms for creating a stereoscopic view are handled in OpenGL for Windows NT or UNIX workstations, or by rendering engines such as Microsoft's Direct3D, Argonaut's BRender, Criterion's Renderware, and the like. In this article, I'll present a methodology for creating stereoscopic views in graphical applications and provide examples of how this has been done.

Field Sequential Stereoscopic Display

The effect of three-dimensionality results from how the brain processes what our eyes see. The distance between human eyes causes each eye to see an image from a slightly different perspective. The brain interprets information from these two perspectives to create the perception of three dimensions -- an effect known as "stereopsis."

Stereopsis is recreated in a computer environment by delivering separate left-eye and right-eye views in sequence. This can be accomplished on a standard computer display if the user is wearing the appropriate shutter glasses. These glasses employ liquid crystal in the lenses that alternately darken and become clear to block the incorrect view while transmitting the correct one. CrystalEyes and SimulEyes (both from StereoGraphics, the company I work for) fall into this category, as do systems from NuVision, VRex, and others.

The advantage of shuttering eyewear for 3-D vision as opposed to head-mounted displays (HMDs) is largely based on resolution and refresh rates. These glasses are used in conjunction with high-resolution computer displays and can leverage the quality of their high-definition graphics capabilities. In contrast, HMDs use small liquid-crystal displays that -- even in the most advanced examples -- can only deliver VGA resolution at low refresh rates.

The three formats commonly used when developing software for field-sequential stereoscopic display are interlaced, above-and-below, and page flipping.

In interlace mode, left and right views are rendered on alternating lines with the left perspective drawn on the even lines, and right perspective on odd ones. This accommodates computers with limited rendering power running at lower field rates, and requires no assistance to implement beyond the correct driver software.
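
The interlaced layout can be illustrated with a minimal C sketch. It assumes 8-bit single-channel images stored row by row and counts scanlines from zero, so line 0 is treated as even; the function name `interlace_stereo` and the buffer layout are hypothetical and are not taken from any driver.

```c
#include <stddef.h>
#include <string.h>

/* Weave two equally sized 8-bit images into one interlaced frame.
 * Even scanlines (0, 2, 4, ...) are copied from the left view and odd
 * scanlines from the right view. All three buffers hold width * height
 * bytes in row-major order. Counting scanlines from zero is an assumption. */
void interlace_stereo(const unsigned char *left, const unsigned char *right,
                      unsigned char *out, size_t width, size_t height)
{
    for (size_t y = 0; y < height; ++y) {
        const unsigned char *src = (y % 2 == 0) ? left : right;
        memcpy(out + y * width, src + y * width, width);
    }
}
```

Each eye ends up with half the vertical resolution, which is the price paid for needing nothing beyond driver support.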

The above-and-below standard is perhaps more effective and does not require additional hardware support inside the computer. In this scenario, the left view is rendered in the upper half of the screen at half of its vertical resolution, while the right view is rendered in the lower half. These images are separated by a variable number of black lines that will operate as a vertical blanking interval on final display.

When output, the video signal is processed by a device that doubles the field rate of the video signal. In this example, a 60-Hz above-and-below video signal is transformed on the fly to a 120-Hz field-sequential signal with each perspective expanded to full-screen size. (StereoGraphics' Sync Doubling Emitter is one such device.)
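
As a rough sketch of the above-and-below layout, the following C helper computes the two viewport rectangles for a given window size. The `Viewport` struct and the function name are hypothetical; the arithmetic mirrors the `glViewport()` calls in Listing One, where a `winAdjust` value plays the role of `half_gap`.

```c
/* One viewport rectangle in window pixels, with the origin at the
 * bottom-left corner (the convention glViewport uses). */
typedef struct {
    int x, y, width, height;
} Viewport;

/* Split a window into an "above" viewport for the left eye and a "below"
 * viewport for the right eye. The two are separated by 2 * half_gap blank
 * lines centered on the window, which act as the blanking interval. */
void above_below_layout(int win_width, int win_height, int half_gap,
                        Viewport *left_eye, Viewport *right_eye)
{
    int half = win_height / 2;

    left_eye->x = 0;
    left_eye->y = half + half_gap;
    left_eye->width = win_width;
    left_eye->height = half - half_gap;

    right_eye->x = 0;
    right_eye->y = 0;
    right_eye->width = win_width;
    right_eye->height = half - half_gap;
}
```

The right size for the gap depends on the display timing, so in practice `half_gap` is tuned rather than computed.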

The most advanced means to display stereoscopic images is via page-flipping, which is most prevalent among stereo-ready workstations such as SGI, Sun, DEC, or Intergraph. The feature is also available on the current crop of 3-D rendering chips from companies such as Rendition, 3D Labs, S3, and others. Page flipping involves a quad buffering scheme in which two full-resolution stereo pairs are rendered into two sets of buffers. As one stereo pair is being displayed, a second pair is being rendered into the frame buffer. This scheme produces the highest resolution and highest field-rate stereo images.
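
The bookkeeping behind quad buffering can be modeled in a few lines of C. This is only an illustrative model with hypothetical names, not driver code, and the assignment of even fields to the left eye is an assumption. On stereo-capable OpenGL hardware, the application selects the back buffers with glDrawBuffer(GL_BACK_LEFT) and glDrawBuffer(GL_BACK_RIGHT) and leaves the swap to the windowing system.

```c
/* Indices into a pool of four full-resolution buffers. */
typedef struct {
    int front_left, front_right;  /* stereo pair currently displayed */
    int back_left, back_right;    /* stereo pair currently being rendered */
} QuadBuffer;

/* Start with buffers 0 and 1 on screen and 2 and 3 as render targets. */
QuadBuffer quad_buffer_init(void)
{
    QuadBuffer q = {0, 1, 2, 3};
    return q;
}

/* Exchange the displayed pair with the freshly rendered pair. */
void quad_buffer_swap(QuadBuffer *q)
{
    int tmp = q->front_left;
    q->front_left = q->back_left;
    q->back_left = tmp;

    tmp = q->front_right;
    q->front_right = q->back_right;
    q->back_right = tmp;
}

/* Buffer scanned out for a given display field: even fields show the
 * left-eye buffer and odd fields the right-eye buffer (an assumption). */
int quad_buffer_displayed(const QuadBuffer *q, unsigned field)
{
    return (field % 2 == 0) ? q->front_left : q->front_right;
}
```

Because both eyes keep a full-resolution buffer, neither view gives up vertical resolution as it does in the other two formats.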

Creating Stereoscopic Images

Computing a stereoscopic image involves drawing two monocular viewpoints from different positions. The two views are produced via a horizontal perspective shift along the x-axis, with a horizontal shift of the resulting images (I call this horizontal image translation, or HIT) to establish the zero parallax setting (ZPS).

Figure 1 illustrates the geometry of the basic algorithm for producing computer-generated electro-stereoscopic images, showing left and right cameras located in data space. Their interaxial separation is given by t_c. The camera lens axes are parallel to each other in the z direction. The distance from the cameras to the object of interest is d_o.

Imagine that the cameras are two still or video cameras with lens axes that are parallel, mounted on a flat base. t_c is the interaxial separation or distance between the cameras or, more accurately, between the centers of the lenses. The cameras are mounted so that the lenses' axes (lines passing through the centers of the lenses and perpendicular to the imaging planes) always remain parallel. Imagine you're looking at an object in this data space. The two cameras will produce two different viewpoints because they are horizontally shifted by the distance of t_c.

If you were to take a look at the images overlaid one on top of the other, you would see two images that are not exactly superimposed, because the cameras are t_c apart. Since the cameras are pointing straight ahead with parallel lens axes, they will see different parts of the object. Wherever the image from one view is exactly superimposed on the other, the parallax is zero, and that portion of the object (or scene) appears at the plane of the screen. If the images are horizontally shifted so that some part of the image is perfectly superimposed, that part of the image is at ZPS.

Remember that the object exists not only in the x- and y-axes, but also in the z-axis. Since it is a 3-D object, you can achieve ZPS for only one point (or set of points located in a plane) in the object. Suppose you use ZPS on the middle of the object. In this case, those portions of the object that are behind the ZPS will have positive parallax, and those in front of the ZPS will have negative parallax.
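
The sign of the parallax follows from the parallel-camera geometry. Assuming an ideal pinhole model with focal length f_c, a point at distance d projects into the two images with a horizontal offset of -f_c * t_c / d (right image minus left image). HIT adds the constant f_c * t_c / d_o that cancels this offset at the ZPS distance d_o. A minimal sketch, with a hypothetical function name:

```c
/* Image-plane parallax (right-image position minus left-image position)
 * of a point at distance d, for two parallel pinhole cameras with focal
 * length f_c and interaxial separation t_c, after HIT has set the
 * parallax of points at distance d_o to zero. */
double image_parallax(double f_c, double t_c, double d_o, double d)
{
    return f_c * t_c * (1.0 / d_o - 1.0 / d);
}
```

Points farther away than d_o give a positive result (behind the screen), and nearer points give a negative result (in front of the screen).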

The degree to which you HIT the images is more than a matter of taste. ZPS should be used to produce the best compromise of parallax for the entire image. This approach involves no rotation and produces no geometric distortion or vertical parallax.

If HIT is used as described, with other conditions (angle of view of lenses, distance of object from the lenses, t_c) kept constant, there is no change in depth content. It will take a moment for your eyes to reconverge for the different values of parallax, but the total depth content of the image remains the same. After your eyes readjust, the image will appear to be just as deep as it was before.

However, you may have produced, through HIT, images that appear to be entirely in front of or behind the screen. It is best to place the ZPS at or near the center of the object to reduce the breakdown of accommodation and convergence.

The Parallax Factor

The use of small t_c values and a wide angle of view (at least 40 degrees horizontal) tends to produce the best parallax. The equation in Figure 2 illustrates the relationship. Figure 2 is the depth range equation used by stereographers, describing how the maximum value of parallax (P_m) changes as you change the camera setup. Imagine that by applying HIT, the parallax value of an object at distance d_o becomes zero. You say you have achieved ZPS at the distance d_o. An object at some maximum distance d_m will now have a parallax value of P_m.
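
Figure 2 itself is not reproduced in this text. As a hedged reconstruction, the depth range equation is usually written in the stereography literature in the form below, which agrees with every dependency described in this section (parallax grows with M, f_c, and t_c, and vanishes when d_m equals d_o). Treat it as an assumption rather than a copy of the figure.

```latex
P_m = M \, f_c \, t_c \left( \frac{1}{d_o} - \frac{1}{d_m} \right)
```

Here M is the magnification from the image plane to the display, f_c is the lens focal length, and t_c is the interaxial separation.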

The aim is to produce the strongest stereoscopic effect without exceeding a maximum of 1.5 degrees of parallax. The value of magnification (M) will change the value of P_m. For example, the image seen on a big screen will have greater parallax than the image seen on a small screen. A 24-inch-wide monitor will have twice the parallax of a 12-inch monitor, all things being equal.
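
To relate the 1.5-degree limit to a distance measured on the screen, you also need the viewing distance, which the text does not specify. The sketch below therefore takes it as a parameter. It uses the small-angle approximation (angle in radians is roughly parallax divided by viewing distance), which is very accurate at angles this small. The function names are hypothetical.

```c
#define DEG_PER_RAD 57.29577951308232

/* Angular parallax in degrees for a given on-screen parallax and viewing
 * distance (both in the same unit), using the small-angle approximation. */
double parallax_angle_deg(double screen_parallax, double view_dist)
{
    return screen_parallax / view_dist * DEG_PER_RAD;
}

/* Largest on-screen parallax that stays within limit_deg degrees at the
 * given viewing distance; the result is in the unit of view_dist. */
double max_screen_parallax(double limit_deg, double view_dist)
{
    return limit_deg / DEG_PER_RAD * view_dist;
}
```

Since screen parallax scales linearly with magnification, showing the same image on a screen twice as wide from the same viewing distance doubles the angular parallax as well.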

Reducing t_c reduces the value of screen parallax. Reducing the lens focal length f_c (using a wide-angle lens) also reduces P_m. The most important factor controlling the stereoscopic effect is the distance t_c between the two camera lenses: the greater t_c, the greater the parallax values and the stronger the stereoscopic depth cue. The converse is also true.

If you are looking at objects that are very close to the camera -- say, insects or small objects -- t_c may be low and still produce a strong stereoscopic effect. On the other hand, if you're looking at distant hills, t_c may have to be hundreds of meters in order to produce any sort of stereoscopic effect. So changing t_c is a way to control the depth of a stereoscopic image.

Small values of t_c can be employed when using a wide angle of view (low f_c). From perspective considerations, you see that the relative juxtaposition of the close portion of the object and the distant portion of the object, for the wide-angle point of view, will be exaggerated. It is this perspective exaggeration that scales the stereoscopic depth effect. When using wide-angle views, t_c can be reduced (see D.L. MacAdam, "Stereoscopic Perceptions of Size, Shape, Distance and Direction," SMPTE Journal, 1954, 62:271-93).

Creating the Image in SGI GL and OpenGL

To produce stereoscopic graphics that both appear natural and are mathematically correct, you need to render stereo pairs using two asymmetrical-frustum perspective (also called offset perspective) projections with projection axes that are parallel to each other.

This involves projecting from two slightly different viewpoints in the scene. Naturally, the left eye's viewpoint should be offset somewhat to the left of the right eye's viewpoint. Since GL's perspective-projection functions are based on the center of projection (the viewpoint) residing at the coordinate origin, this viewpoint offset is simulated by translating the scene elements in the opposite direction. Thus, you want to translate the scene to the right for the left-eye projection, and to the left for the right-eye projection. This preprojection translation is accomplished by calling GL's translate() function just after the function that performs the perspective projection. Because the most recently specified transformation is applied to vertices first, the translation takes effect before the projection.

Asymmetrical-frustum perspective projections may be done in OpenGL using the standard OpenGL functions glFrustum() and glTranslatef() (or glTranslated()), instead of the similar GL functions window() and translate().

How much separation should there be between the stereoscopic viewpoints? For stereoscopic graphics that are both compelling and comfortable, it is important to have enough stereoscopic interaxial separation, but not too much. Generally, both negative parallax (projected display offset that causes something to appear to pop out of the screen) and positive parallax (projected display offset that causes something to appear to sink into the screen) should range from 0 to 3 percent of the width of the display window. This can usually be achieved using a moderately wide-angle field of view, and a camera separation that is about 5 percent of the distance from the viewpoints to the center of interest in the scene being rendered.
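
These rules of thumb can be checked numerically. The sketch below assumes ideal pinhole cameras with parallel axes and zero parallax set at the center of interest. Under that model, parallax as a fraction of window width is t_c * (1/d_o - 1/d) divided by twice the tangent of half the horizontal field of view. The function and parameter names are hypothetical.

```c
/* Camera separation from the 5 percent rule of thumb. */
double camera_separation(double dist_to_interest)
{
    return 0.05 * dist_to_interest;
}

/* Parallax of a point at distance d as a signed fraction of the window
 * width, with zero parallax at distance d_o.
 * tan_half_fov is the tangent of half the horizontal field of view, that
 * is, half the width of the view volume at unit distance.
 * Positive results sink into the screen, negative results pop out. */
double parallax_fraction(double t_c, double d_o, double d,
                         double tan_half_fov)
{
    return t_c * (1.0 / d_o - 1.0 / d) / (2.0 * tan_half_fov);
}
```

For example, with the center of interest at distance 10, a separation of 0.5, and a horizontal field of view of about 53 degrees (tan_half_fov = 0.5), the 3 percent limits correspond to scene depths from 6.25 to 25.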

It is important to use asymmetrical-frustum projections for generating stereo pairs. If you use parallel perspective projections without asymmetrical frustums, the result will be uncomfortable graphics that reside entirely in negative parallax viewing space (everything coming out of the screen). GL provides two functions for implementing perspective projections, perspective() and window(). Of these, only window() permits asymmetrical-frustum projections. (You may also implement projections by defining transformation matrices.)

The frustum of the left-eye perspective projection should take in a slightly wider angle to the right of its projection axis than to the left of it, and the frustum of the right-eye projection should take in a slightly wider angle to the left of its projection axis than to the right. Scene elements that reside in a plane at which the two frustums intersect will be rendered at zero parallax. Scene elements at zero parallax will appear to reside at the display surface, neither popping out of the screen nor sinking into it. Ideally, scene elements close to the center of interest in the scene should be rendered at or near zero parallax.
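
The frustum geometry described above can be sketched as a small C helper that computes the left and right clipping bounds for each eye. It is a sketch under stated assumptions, not the author's code. The struct and function names are hypothetical, the monoscopic frustum is assumed to be symmetric, and the zero-parallax plane is assumed to lie at distance d_o from the viewpoints.

```c
/* Frustum bounds at the near plane plus the scene translation for one eye. */
typedef struct {
    double left, right, bottom, top;  /* near-plane clipping bounds */
    double translate_x;               /* x translation applied to the scene */
} EyeFrustum;

/* eye:    -1 for the left eye, +1 for the right eye
 * half_w: half width of the symmetric (monoscopic) frustum at the near plane
 * half_h: half height of the frustum at the near plane
 * near_z: distance to the near clipping plane
 * t_c:    separation between the two viewpoints
 * d_o:    distance at which the two frustums intersect (zero parallax) */
EyeFrustum stereo_frustum(int eye, double half_w, double half_h,
                          double near_z, double t_c, double d_o)
{
    /* The left eye's window slides right and the right eye's slides left,
     * by the half separation scaled down to the near plane. */
    double shift = -eye * 0.5 * t_c * near_z / d_o;
    EyeFrustum f;

    f.left = -half_w + shift;
    f.right = half_w + shift;
    f.bottom = -half_h;
    f.top = half_h;
    /* The scene moves right for the left eye and left for the right eye. */
    f.translate_x = -eye * 0.5 * t_c;
    return f;
}
```

In OpenGL, these values would be passed to glFrustum(f.left, f.right, f.bottom, f.top, near_z, far_z), followed by glTranslated(f.translate_x, 0, 0). A point on the viewing axis at distance d_o then projects to the center of both views, which is the zero-parallax condition.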

Example 1 puts the viewpoint translation and the asymmetrical-frustum projection into GL code. Note that, except with square-pixel stereo, stereo projections should be rendered at a vertically squashed aspect ratio of about 0.48 (=492/1024), since the stereo display mode expands the display vertically by about a factor of two. Example 1 has already taken aspect ratio into consideration.

It is important to note that Microsoft Direct3D, Argonaut BRender, Criterion Renderware, and other third-party rendering engines do not support this method of frustum adjustment. However, you can still effect HIT by using clipped viewports. While this somewhat reduces usable viewport area, the loss is minimal and usually not detected by the end user.
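
One way to read the clipped-viewport approach is as a crop: each view is rendered at full width, and opposite edges are discarded so that the two images slide horizontally relative to each other. The sketch below follows that interpretation, which is an assumption, as are the names. A positive shift adds positive parallax (the right image moves right relative to the left image).

```c
/* Horizontal crop of one rendered view: first source column and width. */
typedef struct {
    int src_x, width;
} Crop;

/* Compute the columns of each rendered view to show so that the right
 * image is displaced by shift_px pixels relative to the left image.
 * Both crops are displayed at the same screen position. */
void hit_crop(int image_width, int shift_px, Crop *left_eye, Crop *right_eye)
{
    int mag = (shift_px < 0) ? -shift_px : shift_px;
    int usable = image_width - mag;

    if (usable < 0) {
        usable = 0;  /* shift larger than the image: nothing left to show */
    }
    left_eye->width = usable;
    right_eye->width = usable;
    if (shift_px >= 0) {
        left_eye->src_x = shift_px;  /* drop left edge of the left view */
        right_eye->src_x = 0;        /* drop right edge of the right view */
    } else {
        left_eye->src_x = 0;         /* drop right edge of the left view */
        right_eye->src_x = mag;      /* drop left edge of the right view */
    }
}
```

The usable width shrinks by the size of the shift, which is the reduction in viewport area mentioned above.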

The translation step for creating left and right viewpoint separation could still be implemented as a postprojection matrix operation. However, for third-party rendering engines, you can take advantage of specifying multiple viewpoints in the 3-D scene hierarchy. Separate left and right viewpoint "cameras" can be attached to the main monoscopic viewpoint for the scene. Then rendering left and right views can be performed by selecting the corresponding camera and viewport, without resorting to explicit translation matrix operations.

The Working Model

Listing One, adapted from the OpenGL TQUAD.C example in the Windows NT SDK, demonstrates how to display a stereoscopic image pair using the above/below format. (The complete TQUAD.C file is also available at http://www.StereoGraphics.com/.)

The application software renders the left and right stereoscopic views onto the single client area as two OpenGL viewports. The rendering generates the stereo views in pairs on the same window surface, regardless of single- or double-buffered hardware operations by the PC graphics controller.

Adding basic stereoscopic support to an application does not require an overwhelming amount of time or effort. In fact, many programmers find that most of the time dedicated to the effort is spent fine-tuning HIT and parallax values to achieve the best-looking images. More information and stereoscopic software examples and code splices can be found at http://www.StereoGraphics.com/.

Listing One

#if STEREO
#define EYE_OFFSET 0.200    // default viewpoint separation
#define EYE_ADJUST -0.070   // default horizontal image shift adjustment

int winWidth, winHeight;    // client window
int winAdjust=30;       // Y height adjustment to "above" viewport 

GLfloat eyeDist=4.5;        // Z distance for zero parallax setting
GLfloat eyeOffset=EYE_OFFSET;   // X offset for left/right viewpoint separation
GLfloat eyeAdjust=EYE_ADJUST;   // X adjustment for horizontal image shift
#endif

static void CALLBACK Reshape(int width, int height)
{
#if STEREO
    winWidth = width;
    winHeight = height;

    // viewport and perspective matrixes must be updated every drawn frame to 
    // render left/right views in above/below format onto same client window
#else
    glViewport(0, 0, (GLint)width, (GLint)height);
    
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glFrustum(-1, 1, -1, 1, 1, 10);
    gluLookAt(2, 2, 2, 0, 0, 0, 0, 0, 1);
    glMatrixMode(GL_MODELVIEW);
#endif
}
static void CALLBACK Draw(void)
{
#if STEREO
    // calculate delta-X translation for left or right perspective view
    GLfloat dx0 = eyeAdjust;    // asymmetrical viewing pyramid offset
    GLfloat dx1 = eyeOffset;    // viewpoint separation

    // set "above" viewport for left eye rendering
    glViewport(0, (GLint)(winHeight/2 + winAdjust), (GLint)winWidth, 
                                           (GLint)(winHeight/2 - winAdjust));
    // set left-eye perspective 
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glFrustum(-1-dx0, 1-dx0, -1, 1, 1, 10); // skew viewer symmetry leftward
    glTranslatef(0+dx1, 0, 0);          // move rightward for left eye view
    glTranslatef(0, 0, -3);             // pull back to look at
    glMatrixMode(GL_MODELVIEW);
#endif // STEREO

    glLoadIdentity();
    glRotatef(xRotation, 1, 0, 0);
    glRotatef(yRotation, 0, 1, 0);
    glRotatef(zRotation, 0, 0, 1);

    glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);

    glColor3f(1.0, 1.0, 1.0);
    switch (whichQuadric) {
      case 0:
        glTranslatef(0, 0, -height/20.0);
        gluCylinder(quadObj, radius1/10.0, radius2/10.0, height/10.0,
                    slices, stacks);
        break;
      case 1:
        gluSphere(quadObj, radius1/10.0, slices, stacks);
        break;
      case 2:
        gluPartialDisk(quadObj, radius2/10.0, radius1/10.0, slices,
                       stacks, angle1, angle2);
        break;
      case 3:
        gluDisk(quadObj, radius2/10.0, radius1/10.0, slices, stacks);
        break;
    }
#if STEREO
    // set "below" viewport for right eye rendering
    glViewport(0, 0, (GLint)winWidth, (GLint)(winHeight/2 - winAdjust));

    // set right-eye perspective
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glFrustum(-1+dx0, 1+dx0, -1, 1, 1, 10); // skew viewer symmetry rightward
    glTranslatef(0-dx1, 0, 0);          // move leftward for right eye view
    glTranslatef(0, 0, -3);             // pull back to look at
    glMatrixMode(GL_MODELVIEW);

    // render scene again for right-eye view
    glLoadIdentity();
    glRotatef(xRotation, 1, 0, 0);
    glRotatef(yRotation, 0, 1, 0);
    glRotatef(zRotation, 0, 0, 1);

    glColor3f(1.0, 1.0, 1.0);
    switch (whichQuadric) {
      case 0:
        glTranslatef(0, 0, -height/20.0);
        gluCylinder(quadObj, radius1/10.0, radius2/10.0, height/10.0,
                    slices, stacks);
        break;
      case 1:
        gluSphere(quadObj, radius1/10.0, slices, stacks);
        break;
      case 2:
        gluPartialDisk(quadObj, radius2/10.0, radius1/10.0, slices,
                       stacks, angle1, angle2);
        break;
      case 3:
        gluDisk(quadObj, radius2/10.0, radius1/10.0, slices, stacks);
        break;
    }
#endif // STEREO
    glFlush();

    if (doubleBuffer) {
        auxSwapBuffers();
    }
}
