

Robotic Control & 3D GUIs

By Hank Jones and Martin Snyder, January 01, 2003


Hank is a Ph.D. candidate in Aeronautics and Astronautics at Stanford University. Martin is president of Ethermoon Entertainment.

Outside the realm of science fiction, robots and advanced user interfaces have a meager common history. For the most part, interfaces for real robots are extensions of output systems designed by development engineers. Only recently has the number of sophisticated interfaces increased due to the availability of easy-to-use window- and graphics-creation tools. In this article, we describe a GUI based on the OpenGL 3D API that we use to operate GPS-enabled robots.

The proliferation of high-performance systems with 3D graphics capability makes it possible to implement real-time computer systems that accurately display dynamic real-world environments. The most important prerequisite for such systems is knowledge of the position and motion of objects in the world. An increasing number of mobile machines and electronics are enabled with just this type of positioning information, thanks to inexpensive global-positioning system (GPS) receivers.

The basic tasks for real-world UIs include:

  • Collecting the data from available sensors.
  • Displaying this data.
  • Issuing commands that alter the state of the system.

While robotics has not evolved to the point where it is possible to develop generic solutions to the first and third tasks, technologies for displaying real-time positional information are widespread enough to allow for such a solution. Here, we focus on one implementation of a real-time display, with only a general discussion of how our robots send data and receive commands.

System Description

The Free Flying Space Robot (FFSR) testbed at Stanford University's Aerospace Robotics Laboratory consists of three robots (0.5m×0.5m×0.75m; 75 kg) developed for research on advanced spaceborne motion, manipulation, and construction tasks; see Figure 1. The FFSRs are designed to operate autonomously, with onboard electrical power, computation, sensing, and propulsion. A bank of batteries powers a Motorola MVME 167-33 MHz (68040) real-time processor that has ably controlled all basic robot functions for over 10 years. The robots wirelessly send live video to remote PCs for vision processing, the only aspect of their operation not conducted using onboard resources. Compressed air is carried in three tanks and used for levitation and propulsion via eight cold-gas thrusters. A large horizontal momentum wheel allows efficient orientation changes, and two arms with grippers enable complex manipulation tasks.

Position sensing of the robot is accomplished with four GPS antennae that provide both position and attitude using Differential Carrier Phase (DCPGPS) techniques. An indoor constellation of pseudosatellite transmitters provides a GPS environment simulating low Earth orbit. From an implementation standpoint, this system is no different than the more familiar outdoor satellite-based GPS system.

As Figure 1 shows, the workspace of the robots contains objects to be manipulated. The positions of these objects are calculated relative to the robot using onboard video cameras. Objects are detected and classified using unique infrared LED patterns on their surfaces. A consequence of this sensing method is that the measurements are unreliable. Even with a low-pass filter to remove most of the noise, every measurement is likely to contain some error, and object positions are biased by the robots' own position errors. The errors tend to increase as objects get farther away. These conditions can cause problems when two robots are supposed to perform a cooperative task with an object.
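The article does not say which low-pass filter the robots use. As a minimal sketch, assuming a first-order exponential filter applied to one measured coordinate (the class name LowPassFilter and the parameter alpha are hypothetical), the smoothing step might look like this:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not taken from the FFSR source: a first-order
// exponential low-pass filter for a single measured coordinate.
class LowPassFilter {
public:
    // alpha in (0, 1]: smaller values smooth more but respond more slowly.
    explicit LowPassFilter(double alpha)
        : m_alpha(alpha), m_hasValue(false), m_value(0.0) {
        assert(alpha > 0.0 && alpha <= 1.0);
    }

    // Feed one raw measurement and return the filtered estimate.
    double Update(double measurement) {
        if (!m_hasValue) {
            m_value = measurement;   // seed the filter with the first sample
            m_hasValue = true;
        } else {
            m_value += m_alpha * (measurement - m_value);
        }
        return m_value;
    }

    double Value() const { return m_value; }

private:
    double m_alpha;
    bool   m_hasValue;
    double m_value;
};
```

A filter of this kind only attenuates sample-to-sample noise. It cannot remove a constant offset, which is why the biases inherited from the robots' own position errors survive filtering.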


Four applications run onboard in a real-time VxWorks environment:

  • A low-level, servo-control feedback controller for thrusters, momentum wheel, arms, and grippers.
  • A first-order theorem prover to determine robot capability depending on the state of the robot and its environment.
  • A trajectory planner to build collision-free paths.
  • A task manager to execute high-level commands.

These applications, as well as the GUI and any other off-board programs, communicate over IP via RTI's NDDS publish-subscribe middleware. Although direct sockets could be used (with slightly less overhead) for simpler systems, the publish-subscribe model provides flexibility and robustness for distributing information.
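To illustrate why the publish-subscribe model is flexible, here is a small in-process sketch. It is not the NDDS API; the MessageBus class and its methods are hypothetical and show only the decoupling idea: publishers name a topic and never learn who is listening.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Illustrative in-process message bus. This is NOT the NDDS API.
class MessageBus {
public:
    typedef std::function<void(const std::string&)> Handler;

    // Register interest in a topic.
    void Subscribe(const std::string& topic, Handler handler) {
        assert(handler != nullptr);
        m_handlers[topic].push_back(handler);
    }

    // Deliver a payload to every subscriber of the topic and
    // return the number of subscribers reached.
    std::size_t Publish(const std::string& topic,
                        const std::string& payload) const {
        auto it = m_handlers.find(topic);
        if (it == m_handlers.end()) return 0;
        for (const Handler& handler : it->second) handler(payload);
        return it->second.size();
    }

private:
    std::map<std::string, std::vector<Handler> > m_handlers;
};
```

With this shape, a second consumer of robot state can be added by calling Subscribe() again, without touching the publishing code.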

GUI Software

The GUI software is written in C++ for a Windows 2000 environment. We used the GLT library, a descendant of the GlutMaster library, to provide a C++ wrapper for the native C OpenGL library and to permit cross-platform capability. GLT provides classes for "examiner" windows from which the robot GUI display window class C3DGUIWindow was derived. This derived window class sets up mouse event handling for zooming, panning, and rotating the view. Listing One shows the setup and initialization calls we used.

GLT also provides a GltShape class for basic drawing capability in an object-based model. Derived classes draw specific objects ranging from cubes and cylinders to dodecahedrons and teapots. Our CGUIShape class is a container class for a list of GltShape objects, allowing assembly of complex structures.

We created a CGUIData class to contain all the relevant state data about a particular object, including its position, orientation, and data age. We then created a CGUIEntity class that inherited from both the CGUIShape and CGUIData classes to give us a one-class-per-physical-object structure. For those readers familiar with design patterns, CGUIEntity is an implementation of the well-known Composite pattern. The CGUIShape child list lets us arrange the objects in a logical tree structure based on physical location and sensing capabilities. Figure 2 shows how this construct is used to instantiate the robot system.

We included a virtual FFSR class as a child object of the robot class because each robot must sense other robots to interact with them and include them in their plans. This sensing is done directly by the robot, as relative position measurements are important for the robot's physical safety. Although the robots could sense one another by subscribing to other robots' position reports, this is generally unsatisfactory since the errors using this method can be large enough to cause collisions. The onboard sensor is the best sensor for planning and movement.

OpenGL is capable of effectively handling any scale of scene, from microscopic to universal, by using appropriate dimensions and a corresponding viewport. In our case, we used the actual dimensions of all physical entities, which gave us the added advantages of avoiding normalization calculations and providing a better sense of feedback about object velocities, relative positions, and collision possibilities. The GPS receivers onboard the robots sense their location as if they were in orbit, but apply a coordinate translation so that the center of the table is at the (0, 0, 0) x-y-z position. The robot thus publishes state update packets, CStatePkt (Listing Two), whose coordinates range from -3 to +3 meters in x, y, and z. However, OpenGL could have easily handled the raw orbital data with a change in the viewport values.

Because we required a high screen-refresh rate to make robot motions appear smooth, we needed solid performance when drawing the scene. We used the OpenGL display list functionality (a good idea for almost any OpenGL application) so that our draw commands would get compiled and, if possible, stored in the graphics card memory. Using display lists can lead to significant performance improvements, and it certainly did in our case. We create the display list during each object's initialization (Listing Three). The three important calls to create the display list are glGenLists(), to create a valid list index, and glNewList() and glEndList(), which bracket the drawing calls. There are some limitations on what may be in a display list; see the OpenGL documentation for details.

One of the more rewarding capabilities of OpenGL is the use of bitmapped textures to give objects a more realistic appearance. We included a GltTexture (from the GLT library) in many of our CGUIEntity-derived classes to provide this improved look. Use of textures on planes, as on the table, floor, and wall in Figures 3 and 4, is relatively straightforward. All we had to do was specify, on each vertex of the plane, which relative coordinate of the texture should be attached to that vertex. In Listing Three, we show an alternative method that lets us use glutSolidCube() to draw the six planes that make up the FFSR body. This method is also useful for other Glut calls that create more complex shapes.

Our GUI application also uses the GLUI library to enable cross-platform UI controls, particularly dialog boxes. GLUI is built on GLUT, as is GLT, so the two libraries worked together well. The dialog boxes are used to provide operators with a choice of high-level commands for the robot system. When operators choose a robot or robots and then select an object in the environment, the robot is notified of the selection and responds with a list of possible commands as determined by the onboard first-order theorem prover. This list is then displayed as a column of buttons in a dialog box, affording operators only the currently possible commands.

Receiving Data

To get data from the robots, the GUI subscribes to all published sensing data. This data comes in two forms: the direct GPS measurements of the robots, and the position and orientation of any nearby objects as determined by the robots. Each data packet comes with the name of the entity being described as well as the data source so that subscribers can distinguish between an object sensed by one robot and an object (perhaps the same one) sensed by another.

When new data comes in, it is assimilated into the data structure. The CGUIEntity base class lets this happen recursively: each CGUIEntity checks whether the name and source of the data packet match its own and, if not, passes the packet to each of its children in turn. When a home for the data packet is found, the state of the entity and its data age are updated. Listing Four includes parts of the code that accomplish this data assimilation (note the use of the Chain of Responsibility pattern).
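The recursive hand-off described above can be sketched without any OpenGL or GLT dependencies. The StatePacket and Entity types below are hypothetical, simplified stand-ins for CStatePkt and CGUIEntity, and the matching rule (name and source both equal) follows the prose description rather than the exact logic of Listing Four.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for CStatePkt and CGUIEntity.
struct StatePacket {
    std::string name;    // entity being described, e.g. "Box1"
    std::string source;  // who measured it, e.g. "Louie"
    double x, y, yaw;
};

class Entity {
public:
    Entity(const std::string& name, const std::string& source)
        : m_name(name), m_source(source),
          m_x(0.0), m_y(0.0), m_yaw(0.0), m_lastUpdate(-1.0) {
        assert(!name.empty());
    }

    Entity* AddChild(const std::string& name, const std::string& source) {
        m_children.push_back(std::unique_ptr<Entity>(new Entity(name, source)));
        return m_children.back().get();
    }

    // Returns true as soon as one node in this subtree accepts the packet.
    bool Assimilate(const StatePacket& pkt, double now) {
        if (pkt.name == m_name && pkt.source == m_source) {
            m_x = pkt.x; m_y = pkt.y; m_yaw = pkt.yaw;
            m_lastUpdate = now;          // resets the data age
            return true;
        }
        for (auto& child : m_children) {
            if (child->Assimilate(pkt, now)) return true;
        }
        return false;                    // no home found in this subtree
    }

    // Seconds since the last update, or a negative value if never updated.
    double Age(double now) const {
        return m_lastUpdate < 0.0 ? -1.0 : now - m_lastUpdate;
    }
    double X() const { return m_x; }

private:
    std::string m_name, m_source;
    double m_x, m_y, m_yaw;
    double m_lastUpdate;
    std::vector<std::unique_ptr<Entity> > m_children;
};
```

Because the source is part of the match, a box sensed by one robot and the same box sensed by another occupy separate nodes, and a display could use Age() to dim or hide stale entries.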

Displaying the Environment

Displaying the environment takes place in two parts: setting the current viewport correctly and then recursively drawing all the items in the tree. The relevant calls take place in the OnDisplay() function; see Listing Five.

Again, the viewport is maintained by the GlutWindow base class and is manipulated through mouse and keyboard event handlers. The GLU utility library that accompanies OpenGL provides gluLookAt() as a simple mechanism for changing the viewport. We used a CWindowPOV class to maintain the important viewpoint values (the location of the observer, the location of the focus of attention, and a vector pointing up); see Listing Six. Most useful viewport manipulations can be achieved by changing the values in one of these three parameter sets.
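To see what the three parameter sets mean geometrically, the sketch below derives the camera axes from an eye point, a focus point, and an up hint, following the construction documented for gluLookAt(). The Vec3 and ViewBasis types and the ComputeViewBasis() function are hypothetical names used for illustration only.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 Sub(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}
double Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
Vec3 Cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 Normalize(const Vec3& v) {
    double n = std::sqrt(Dot(v, v));
    assert(n > 0.0);                  // fails for a degenerate view setup
    return {v.x / n, v.y / n, v.z / n};
}

// Orthonormal camera axes implied by an eye/focus/up triple.
struct ViewBasis { Vec3 forward, side, up; };

ViewBasis ComputeViewBasis(const Vec3& eye, const Vec3& focus, const Vec3& upHint) {
    ViewBasis b;
    b.forward = Normalize(Sub(focus, eye));           // viewing direction
    b.side    = Normalize(Cross(b.forward, upHint));  // camera "right"
    b.up      = Cross(b.side, b.forward);             // true up, perpendicular to forward
    return b;
}
```

The up hint does not need to be perpendicular to the viewing direction, but it must not be parallel to it, which is why looking straight down the z axis with a z-up hint is a degenerate case.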

The function glutPostRedisplay() requests a redraw of the OpenGL viewport. This routine is called whenever the scene should be redrawn, as determined by the user's input or by the OS. It is also called by a timer loop, so the screen updates regardless of other factors to show robot motion. Following standard practice, we use GLUT's double-buffering facility. Double buffering is the process of drawing into an offscreen image, then updating the actual display in one operation. This increases performance and reduces the flickering artifacts of a direct screen update.

Drawing takes place by calling the Draw() member function of the application's root CGUIEntity instance, which then calls all of the other members of the tree recursively. CGUIEntity implements Draw() by translating and rotating the current display matrix by the raw values of the object state (thanks to the use of real-world dimensions throughout) and calling on the appropriate display list set during the object initialization process. Listing Five is the base class Draw().
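The transform that Draw() builds can be checked on paper or in plain C++. The sketch below applies the same 3-2-1 sequence to a single point, ignoring scale: because OpenGL post-multiplies each call onto the current matrix, a vertex is effectively rotated about x (roll) first, then y (pitch), then z (yaw), and translated last. The Pose type and BodyToWorld() function are hypothetical, and angles are in radians here, whereas glRotatef() expects degrees.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Hypothetical pose type: position in meters, angles in radians.
struct Pose { double x, y, z, yaw, pitch, roll; };

Vec3 RotateX(const Vec3& p, double a) {
    double c = std::cos(a), s = std::sin(a);
    return {p.x, c * p.y - s * p.z, s * p.y + c * p.z};
}
Vec3 RotateY(const Vec3& p, double a) {
    double c = std::cos(a), s = std::sin(a);
    return {c * p.x + s * p.z, p.y, -s * p.x + c * p.z};
}
Vec3 RotateZ(const Vec3& p, double a) {
    double c = std::cos(a), s = std::sin(a);
    return {c * p.x - s * p.y, s * p.x + c * p.y, p.z};
}

// Map a point from the object's body frame to table (world) coordinates.
Vec3 BodyToWorld(const Pose& pose, const Vec3& body) {
    assert(std::isfinite(pose.yaw) && std::isfinite(pose.pitch) &&
           std::isfinite(pose.roll));
    Vec3 p = RotateX(body, pose.roll);   // applied to the vertex first
    p = RotateY(p, pose.pitch);
    p = RotateZ(p, pose.yaw);
    return {p.x + pose.x, p.y + pose.y, p.z + pose.z};  // translation last
}
```

For a robot at (2, 1, 0) with a yaw of 90 degrees, a point 0.25 m ahead of the robot along its body x axis lands at (2, 1.25, 0) in table coordinates.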

From this point on, OpenGL takes care of most of the hard work. The lighting, material properties, textures, unseen surface clipping, and antialiasing processes are all done automatically. Figure 3 shows the result for a typical robot experiment.

Object Correspondence

Notice in Figure 3 that many objects appear twice. This is an artifact of the sensing and data-reporting methods of the robots. Objects that appear twice are sensed by two robots but with a slight disagreement about where the object is actually located. This object correspondence problem is fundamental in systems without a global point of view. Dealing with correspondence is critical for improving the clarity of the user interaction, as well as for enabling the robots to conduct cooperative tasks using these objects.

We designed and implemented an independent software agent we call the Correspondence Agent (CA) to provide assistance with this issue. The CA subscribes to all of the robot sensor publications, just as the GUI does. By identifying similar, overlapping objects, the CA determines which packets of information appear to be describing the same object. For each apparently unique object, an instance of a CCorrespondenceGroup class is created, which includes a unique identification number and a list of the robot-sensed entities that the CA believes to be of the same object.
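The article does not list the rules the CA applies. As a sketch under stated assumptions, suppose two reports describe the same object when they have the same type, come from different robots, and lie within a distance threshold of each other. The SensedEntity and CorrespondenceGroup structs and the BuildGroups() function below are hypothetical and only loosely modeled on the CCorrespondenceGroup class described above.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical report of one object as sensed by one robot.
struct SensedEntity {
    std::string name, type, source;
    double x, y;                       // meters, table coordinates
};

struct CorrespondenceGroup {
    int id;                            // unique identification number
    std::vector<SensedEntity> members; // reports believed to be one object
};

// Assumed rule: same type, a source not yet in the group, and close
// enough to the group's first member.
bool CanJoin(const CorrespondenceGroup& g, const SensedEntity& r, double maxDist) {
    for (const SensedEntity& m : g.members) {
        if (m.source == r.source) return false;  // one report per robot
    }
    const SensedEntity& first = g.members.front();
    if (first.type != r.type) return false;
    return std::hypot(first.x - r.x, first.y - r.y) <= maxDist;
}

std::vector<CorrespondenceGroup> BuildGroups(
        const std::vector<SensedEntity>& reports, double maxDist) {
    assert(maxDist >= 0.0);
    std::vector<CorrespondenceGroup> groups;
    for (const SensedEntity& r : reports) {
        bool placed = false;
        for (CorrespondenceGroup& g : groups) {
            if (CanJoin(g, r, maxDist)) {
                g.members.push_back(r);
                placed = true;
                break;
            }
        }
        if (!placed) {                 // start a new group for this report
            CorrespondenceGroup g;
            g.id = static_cast<int>(groups.size()) + 1;
            g.members.push_back(r);
            groups.push_back(g);
        }
    }
    return groups;
}
```

This greedy grouping depends on the order of the reports and on the chosen threshold, which is one reason an operator override for cases the rules miss is useful.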

The CA also listens to publications of correspondence assertions, typically generated by the GUI according to input by operators. This functionality was added when we saw some cases where two objects were displayed that were clearly the same to operators, but did not fall within the rules the CA had been given.

The GUI subscribes to the correspondence publications and turns off the display of duplicate objects accordingly. (The robots would also benefit from knowing the correspondence between objects for planning cooperative tasks, but this capability has not yet been implemented.) As Figure 4 shows, the screen is much less cluttered and operators are able to command cooperative tasks thanks to the CA. A keystroke toggles this functionality in case a view of all objects would be useful to operators.

There are many options for displaying these duplicate objects. We could gray out all but one of each object, connect the objects with a web of lines or some other structure, or display an object at the position of the average of the group members' states. We chose to simply display the first object for three reasons: It reduces screen clutter, is simple to implement without creating a new entity to draw, and more closely reflects reality, since at least one robot believes the object is in that particular location.


Although our implementation involves robots in a research lab, this basic technique has been used to operate helicopters, submarines, and space systems. We encourage you to try a test implementation of a simple system. Most GPS receivers generate output to a serial line that can be tapped, with some receiver manufacturers also providing basic SDKs. A laptop with such a receiver becomes a surrogate robot, and a wide variety of projects can proceed from there.
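As a starting point for such an experiment, the sketch below parses one line of receiver output. It assumes the receiver emits NMEA 0183 text and handles only the GGA (fix data) sentence; the article does not name a format, so treat this as one plausible case. The GpsFix struct and the function names are hypothetical, and the sentence checksum is not verified.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the fields this sketch extracts.
struct GpsFix { double latitudeDeg, longitudeDeg, altitudeM; };

// Split a sentence on commas, keeping empty interior fields.
std::vector<std::string> SplitFields(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, ',')) fields.push_back(field);
    return fields;
}

// Convert NMEA "ddmm.mmmm" or "dddmm.mmmm" to decimal degrees.
double NmeaToDegrees(const std::string& value) {
    double raw = std::atof(value.c_str());
    double degrees = std::floor(raw / 100.0);
    double minutes = raw - degrees * 100.0;
    return degrees + minutes / 60.0;
}

// Parse a GGA sentence. Returns false for other sentence types or when
// the receiver reports no fix. The checksum is NOT verified here.
bool ParseGga(const std::string& sentence, GpsFix* fix) {
    assert(fix != nullptr);
    std::vector<std::string> f = SplitFields(sentence);
    if (f.size() < 10 || f[0].size() < 6) return false;
    if (f[0].compare(3, 3, "GGA") != 0) return false;
    if (f[2].empty() || f[4].empty()) return false;
    if (f[6].empty() || f[6] == "0") return false;   // fix quality 0 = no fix
    fix->latitudeDeg  = NmeaToDegrees(f[2]) * (f[3] == "S" ? -1.0 : 1.0);
    fix->longitudeDeg = NmeaToDegrees(f[4]) * (f[5] == "W" ? -1.0 : 1.0);
    fix->altitudeM    = std::atof(f[9].c_str());
    return true;
}
```

A laptop program could read lines from the serial port, pass each one to ParseGga(), convert the result to a local frame, and publish it as a state packet for the display.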


Listing One

#include "glutm/window.h"
// C3DGUIWindow is derived from GlutWindow, which performs all system 
// initialization for us. InitializeWindowValues performs additional 
// initialization for our app to specify drawing parameters we require.
void C3DGUIWindow::InitializeWindowValues()
{
  glEnable(GL_CULL_FACE); // An optimization that prevents OpenGL from 
                          // rendering sides of an object that cannot be seen.
  glCullFace(GL_BACK);    // So, don't worry about rendering the 'back' side 
                          // of objects since they should all be solids
  glDepthRange(0.0, 1.0);
  glClearColor(0.3, 0.3, 0.5, 0.0);  // Provides a sky blue background
}


Listing Two

class CStatePkt
{
public:
    char*  m_sName; // e.g. "Huey"
    char*  m_sType; // e.g. "FFSR"
    char*  m_sSource;   // e.g. "Louie"
    double x;       // e.g. 2.348756;  x = 0.0 at table center
    double y;       // e.g. 1.235445;  y = 0.0 at table center
    double z;       // usually 0.0 (Robots and objects move on table surface)
    double yaw;     // ranges from 0 to 2*pi (radians)
    double roll;    // usually 0.0 for our robots and objects
    double pitch;   // usually 0.0 for our robots and objects
};


Listing Three

// nWrapParam can be GL_REPEAT or GL_CLAMP
// GL_REPEAT will tile the texture; GL_CLAMP will stretch it out
// nEnvParam can be GL_MODULATE or GL_DECAL (other values are possible 
//    but we don't use them)
// GL_MODULATE makes the texture somewhat see-through; GL_DECAL is 
//    not see-through at all
// nWrapParam and nEnvParam are basically the only options 
//    needed to specify in initialization
bool CGUIShape::InitializeTexture(GLenum nWrapParam, GLenum nEnvParam)
{
  if (m_pTextureData == NULL) return false;  
     // m_pTextureData is defined in the constructor, and points to
     //   the proper texture bitmap structure if it exists
  m_pTexture = new GltTexture();
  m_nTextureEnvParam = nEnvParam;
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, nWrapParam);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, nWrapParam);
  return true;
}

// CGUIFreeFlyer is a child class of CGUIEntity, which inherits from CGUIShape
int CGUIFreeFlyer::CreateDisplayList()
{
  int nIndex;
  double xScale, yScale, zScale;
  InitializeTexture(GL_REPEAT, GL_DECAL);
  GLfloat sReflectPlane[] = {1.0, 1.0, 0.0, 0.0};
  GLfloat tReflectPlane[] = {0.0, 0.0, 1.0, 0.0};
  GLUquadric *qobj;
  // The FFSR is represented graphically by a cube. It has to be scaled to 
  // achieve the proper dimensions
  zScale = FFSR_HEIGHT;
  // ...
  nIndex = glGenLists(1);           // Get a valid list index
  glNewList(nIndex, GL_COMPILE);    // Open the display list
  glColor3f(1.0, 1.0, 1.0);
  // Start Texture-related calls
  if (m_pTexture->id() != 0) {
    glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, m_nTextureEnvParam);
    glTexGenfv(GL_T, GL_OBJECT_PLANE, tReflectPlane);
    glTexGenfv(GL_S, GL_OBJECT_PLANE, sReflectPlane);
    // ...
  }
  // End Texture-related calls
  glTranslatef(0, 0, FFSR_HEIGHT * 0.5);
  glScalef(xScale, yScale, zScale);
  // ...
  // Texture-disabling calls
  if (m_pTexture->id() != 0) {
    // ...
  }
  glEndList();                      // Close the display list
  return nIndex;
}


Listing Four

int CGUIEntity::ConsiderIncomingStatePkt(CStatePkt *pPkt)
{
  int nDigested = 0;
  if (ShouldDigestThisStatePkt(pPkt)) {
    nDigested = ProcessStatePkt(pPkt);  
    // printf("Home for packet from %s was found!\n", pPkt->sSource);
  }
  // Send it to the child objects to see if one of them will digest it
  // Won't send it to the child nodes if it returned an answer above
  CGUIEntity* pChild;
  for (int j=0; !nDigested && j<size(); j++) {
    if ((pChild = (CGUIEntity*)(*this)[j].get()) != NULL) {
      nDigested = pChild->ConsiderIncomingStatePkt(pPkt);
    }
  }
  return nDigested;
}

// Handles packets for objects that it senses (i.e. served as the source)
bool CGUIEntity::ShouldDigestThisStatePkt(CStatePkt *pPkt)
{
  return (strcmp(pPkt->sSource, GetName()) == 0);
}

CGUIEntity* CGUIEntity::FindObjectPtrByName(char *sName)
{
  if (strcmp(Name(), sName) == 0) return this;
  // Send it to the child objects to see if one of them is the right name
  //  Check the child nodes, and return their answer if they give one
  CGUIEntity* pChild;
  CGUIEntity* result = NULL;
  for (int j=0; j<size(); j++) {
    if ((pChild = (CGUIEntity*)(*this)[j].get()) != NULL) {
      result = pChild->FindObjectPtrByName(sName);
      if (result != NULL) return result;
    }
  }
  return NULL;
}

// The following function is virtual for CGUIEntity. It is implemented
// by CGUIRobot for all robots, including FFSRs (which inherit from CGUIRobot)
bool CGUIRobot::ProcessStatePkt(CStatePkt *pPkt)
{
  CGUIEntity* pObject = NULL;
  if (!(pObject = FindObjectPtrByName(pPkt->sObjName))) {
    return false;
  }
  if (!pObject->SetStateData(pPkt)) {
    return false;
  }
  // printf("Processed Object %s from %s\n", pPkt->sObjName, pPkt->sSource);
  return true;
}


Listing Five

void OnDisplay()
{
  // ...
}

void CGUIShape::Draw() const
{
  if (!ShouldDraw()) return;
  // Save time by not calling this if there are no children; 
  // Default is that children exist
  if (m_bHasChildrenToDraw) GltShapes::draw();    
    // Calling GltShapes::draw(), not GltShape::draw() to get 
    // the child objects drawn
  if (GetDisplayListID()) {     // A display list exists
    int nMode = 0;
    glGetIntegerv(GL_RENDER_MODE, &nMode);
    if (nMode == GL_SELECT) glLoadName(GetDisplayListID());
    glPushMatrix();             // Keep this object's transform from leaking
      glTranslatef(translation()[0], translation()[1], translation()[2]);
      glScalef(scale()[0], scale()[1], scale()[2]); 
      // Perform a 3-2-1 coordinate transformation
      glRotatef(rotation()[0]*RAD2DEG, 0, 0, 1);
      glRotatef(rotation()[1]*RAD2DEG, 0, 1, 0);
      glRotatef(rotation()[2]*RAD2DEG, 1, 0, 0);
      if (m_nDisplayListID != 0) glCallList(GetDisplayListID());
    glPopMatrix();
  }
}


Listing Six

// Basic starting POV looks at the origin of the coordinate system from the 
// (1,1,1) position with the z axis pointing up in the viewport
  SetPOV(1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0);

void CWindowPOV::SetPOV(double EX, double EY, double EZ, double CX, 
                    double CY, double CZ, double UX, double UY, double UZ)
{
  SetEyeLocation(EX, EY, EZ);
  SetFocusPoint(CX, CY, CZ);
  SetUpVector(UX, UY, UZ);
}

void C3DGUIWindow::UpdateWindowViewport(CWindowPOV *POVinfo)
{
  glMatrixMode(GL_MODELVIEW); // Ensure you're manipulating model view matrix,
  glLoadIdentity();           // then reset it before calling gluLookAt()
  gluLookAt(POVinfo->Eye(0), POVinfo->Eye(1), POVinfo->Eye(2), 
    POVinfo->Focus(0), POVinfo->Focus(1), POVinfo->Focus(2),
    POVinfo->Up(0), POVinfo->Up(1), POVinfo->Up(2));
}
