Marc is a member of the research staff at Digital Equipment Corp.'s Systems Research Center in Palo Alto, CA. He can be reached at [email protected]
As illustrated by names like "Visual C++," "Visual Basic," "VisualAge," "Visual Objects," and the like, the current fashion among programming environments is to include the word "visual" wherever possible in the tool's moniker. Strictly speaking, most of these packages are not really visual, in that they do not use a purely (or even predominantly) visual notation to represent a computation. Instead, the visual aspects often serve as a rudimentary graphical scaffold on which pieces of program text are hung.
There is a range of approaches among these packages. Visual C++ provides little more than a unified GUI to conventional program-development tools (compiler, text editor, debugger, and resource editor). In tools such as Visual Basic, the primary interface is a direct-manipulation UI builder that allows you to lay out widgets in a window without having to write any code--until later in the game. Packages such as VisualAge even allow you to build simple applications in a purely visual way, by connecting predefined components (represented by icons) with line segments. However, in all of these packages, you eventually arrive at a point where you must resort to a language such as C++ or Basic to get the job done.
A true visual-programming language, on the other hand, can be considered "executable graphics," with no hidden text Although commercial instances of visual-programming languages are scarce, visual programming as a discipline has been around almost as long as computer graphics: William Sutherland (now director of research at Sun Microsystems) implemented the first visual-programming environment in 1965, two years after his brother Ivan created "Sketchpad," the first computer-graphics application. Sutherland's system was far ahead of its time: It had an integrated development environment, where you could use a pen-based graphics editor to interactively draw a dataflow diagram, then immediately execute and debug it. In fact, it must have appeared quite outlandish to most, given that at the time, the TX-2 at MIT's Lincoln Labs was the only computer in the world that had the required graphics hardware.
Indeed, the field of visual programming remained mostly dormant until the mid-1980s, when graphics hardware became widely available. Since then, there has been considerable interest in the research community (the IEEE Symposium on Visual Languages just went into its 11th year), and there is a small but growing number of commercial software systems that meet the stringent definition of a visual-programming language; for example, National Instruments' LabView or Pictorius's Prograph.
If commodity graphics hardware was the enabling factor for the success of visual languages, then we can expect that the eventual arrival of cheap, high-quality, virtual-reality hardware will foster a new crop of visual-programming languages -- ones that use a three-dimensional instead of a two-dimensional notation.
This hypothesis was the motivation for my work on Cube, which to my knowledge is the first 3-D visual programming language. My goals in this project were to show that programming in 3-D is feasible, investigate whether the third dimension can provide a richer level of expression (as opposed to being a mere eye-catcher), and gain a better understanding of what new tools and techniques are needed for building and using 3-D programming environments. My ambitions did not go so far as to build a full-strength system. The current implementation of Cube is still very much a prototype, and the Cube programs I have written are the classic toy examples found in entry-level programming textbooks. However, the language, despite being purely graphical, has first-class computational strength, incorporating such notions as recursion, higher-order predicates, and user-defined types. In a few respects, Cube's computational expressiveness goes beyond some conventional text languages; for example, it allows multiple textual solutions to a computation.
Visual Language Pros and Cons
Not surprisingly, the arguments for the merits of visual languages are as old as visual languages themselves.
- The human mind is visually oriented. Evolution has equipped us with a powerful visual cortex; humans can process visual information rapidly, and are very good at discovering graphical relationships (such as connectivity or inclusion) in complex pictures. So, visual notation provides for fast information transfer, and moves part of the mental workload from the cognitive to the perceptual level.
- Graphical representations provide a syntactically rich language. The elements of a picture have a multitude of attributes, such as shape, size, position, orientation, color, and texture, all of which can be used to encode meaning. Also, graphical representations allow for concrete metaphors, such as icons in place of names.
There are also a number of problems associated with graphical representations:
- Screen space. Visual languages use screen real estate much less frugally than textual ones. However, this problem can be alleviated by use of high-resolution displays.
- Input. While it might be faster to read a visual program than a textual one, it takes longer to write one--most people type faster than they draw. However, only a small fraction of a programmer's time is actually spent on entering code. A much larger fraction is spent on designing and debugging, and on understanding other people's code.
- Naming. It is harder to come up with good icons than with good names. Presumably, this is a cultural rather than an intrinsic problem.
A number of arguments have been made, by me and others, for 3-D over 2-D visual languages:
- More information. Three-dimensional allows us to use one more dimension to convey semantic information. (In Cube, you use the third dimension to stack 2-D dataflow diagrams on top of each other.)
- Easier layout. The metaphor most commonly used in visual languages is the dataflow metaphor, in which programs are represented by boxes connected by lines. In 2-D, it is often impossible to avoid intersecting lines (not every graph is planar), whereas in 3-D, such intersections can easily be avoided.
- Space efficiency. A 3-D representation alleviates the screen- space problem. Although the physical screen real estate remains unchanged, the virtual space of a 3-D image is larger than that of a 2-D image. The user can access different parts of the space by looking at it from different positions and angles. The drawback, however, is that parts of the picture may be occluded at times.
- Three-dimensional notations are well-suited, if not ideal, for programming in virtual realities.
To illustrate programming with a truly visual language, I'll first present a program that converts temperature values from Celsius to Fahrenheit and vice versa. Recall that x degrees Celsius correspond to 1.8x+32 degrees Fahrenheit. The program in Figure 1(a) describes this relation. It consists of two opaque green "predicate cubes," labeled with the symbols "*" and "+", respectively, and four transparent green "holder cubes." Two of the holder cubes are filled with opaque green "value cubes" that are labeled "1.8" and "32," respectively. The other two holder cubes are empty: The left one is meant to receive a Celsius temperature value, and the right one, a Fahrenheit value.
In Cube, addition and multiplication are viewed as ternary predicates rather than binary functions. This is similar to writing plus(a,b,c) in place of a+b=c. So, each of the two predicate cubes has three arguments (also known as "ports") shown as labeled indentations in the cube's side walls. The label of each port identifies the argument as being a, b, or c in the notation above. The two "input" ports of the multiplication predicate are connected by a "pipe" to the left empty holder cube and to the holder cube containing the value 1.8; the "output" port of the multiplication predicate is connected to one "input" port of the addition predicate, its other "input" port being connected to the holder cube containing the value 32, and its "output" port being connected to the right empty holder cube.
Cube, like most other visual languages, is based on a dataflow metaphor. Holder cubes and ports of predicate cubes are connected by pipes, and values flow though these pipes. Pipes are undirected; data simply flows from full to empty holder cubes and ports. Cube differs in this respect from most other dataflow languages, which use directed connections. Also, there is no real distinction between the "input" and the "output" of a predicate. If you put a value cube (say, 10) into the empty holder cube on the left and run the program, the system will fill the right holder cube with the value 50. Likewise, if you put a value (say, 68) into the right empty holder cube, as in Figure 1(b), the system will fill the left holder cube with the value 20; see Figure 1(c).
If you are familiar with Prolog, you'll find Cube's undirected dataflow resembles the undirected nature of logic variables in Prolog. Indeed, the underlying semantics of Cube are quite similar to those of Prolog.
This example has introduced some important syntactic elements of Cube (holder cubes, predicate cubes, value cubes, ports, and pipes), and has explained the dataflow metaphor and how Cube programs are undirected (that is, input and output arguments are often interchangeable). However, it did not show us a convincing use of the third dimension; the temperature-conversion program is essentially a flat dataflow diagram.
The Classic Factorial
The next example is a program for computing the factorial of a number. We can define "factorial" recursively as follows: The factorial of 0 is 1 (this is called the "base case"), and the factorial of n (where n > 0) is n times the factorial of n-1 (the recursive case).
Figure 2(a) shows the Cube definition of factorial: A transparent green cube, called a "predicate definition cube," with the icon "!" on its top. Inside the definition cube are two transparent boxes, called "planes," that are stacked on top of each other. Each plane contains a dataflow diagram. There are two pipes coming out of each plane and leading into two ports set into the side of the definition cube. I'll call the left pipe the "input pipe" and the right pipe the "output pipe."
Once defined, the factorial predicate can be used anywhere within a Cube program, including inside its own definition. During execution, every opaque predicate cube referring to factorial is replaced by (a copy of) the transparent cube defining it. Values that flow through the "input" port into the cube are split up and flow in parallel into each plane, feeding into the data flow diagram within that plane. In the process of evaluation, a plane might fail, in which case it is taken out of the computation, or it might succeed, in which case its result flows out of the plane towards the output port of the predicate.
From this description, it is apparent that the evaluation of a Cube program can yield any number of solutions. If all planes of a predicate fail, then there are no solutions, if more than one plane succeeds, then there are multiple solutions. As it happens, all the programs presented here yield, at most, one solution.
The upper plane in Figure 2(b) describes the base case. The plane contains two holder cubes, the left one containing the value 0, and the right one containing 1. The left holder cube is connected to the input pipe, and the right holder cube is connected to the output pipe. During execution, the outside world must either send the input pipe a value compatible with the value inside the connected holder (in this case, 0) or not send any value at all; otherwise, the plane will fail. The same holds true for the output pipe.
The lower plane in Figure 2(c) describes the recursive case. The input pipe enters the plane on the left side and splits up. One of its ends connects to one port of a predicate cube labeled ">"; the other port of this cube is connected to a holder cube containing the value 0. This will ensure that during execution, the value flowing through the input pipe is greater than 0 (and cause the plane to fail if it isn't).
The input pipe also leads into one input port of a subtraction predicate, whose second input port is connected to a holder cube containing 1. The output of the subtraction predicate flows into the input port of the factorial predicate (that is, a recursive use of the predicate we are about to define), and the output of factorial flows into one input port of a multiplication predicate. The other input port is connected to the input pipe, and its output port is connected to the output pipe.
Having defined factorial, you can use it in the same manner as addition or multiplication. Figure 2(d) is just such an application. The program contains the factorial-definition cube and a predicate cube referring to it. The input port of this predicate is connected to a holder cube containing the value 5, while its output port is connected to an empty holder cube. Running the program will fill the empty holder cube with the value 120; see Figure 2(e).
Mapping a Predicate Over a List
The final example is the Cube analog of map, the classic textbook example of a higher-order function. The standard version of map takes a unary function f (such as, factorial) and a list [x1, ... , xn] as its arguments, and returns a list [f(x1), ... , f(xn)]. Alternatively, you can give the following recursive definitions of map: "Applying map to a function f and the empty list yields the empty list" (the base case), and "Applying map to a function f and a list with head h and tail t yields a list whose head is f(h) and whose tail is the result of applying map recursively to f and to t" (the recursive case).
Recall that Cube is a logic programming language, and that results are treated as arguments. So, the map predicate cube takes three arguments: a binary predicate (such as factorial) and two lists (the second list being the "output"). There are two special value cubes for describing lists: a cube that denotes the empty list, and another cube that attaches a value to the front of a list. (In Scheme and other functional languages, these two "constructors" are referred to as nil and cons.) The two value cubes are not Cube primitives; they are derived from a user-supplied type-definition cube that describes the list type.
Figure 3(a) shows the predicate-definition cube for map. It has two ports in its sides that take the input and the output list, and a third port set into its top that takes the binary predicate. The port is labeled "-->"; you can use the argument not only by routing a pipe to it, but also through a predicate cube labeled with the same icon. Finally, the definition cube contains two planes, one for the base case and one for the recursive case. Pipes run from the ports for the input and the output list into both planes.
The lower plane in Figure 3(b) describes the base case. The plane contains two holder cubes, each of which contains a value cube representing the empty list. The left holder cube is connected to the input pipe, and the right holder cube, to the output pipe. The plane will fail if the outside world puts a nonempty list into either pipe; otherwise, it will succeed and put empty lists into both pipes.
The upper plane in Figure 3(c) describes the recursive case. The input pipe enters the plane from the left and connects to a holder cube that is filled with a cons value cube. The output pipe enters the plane from the right and connects to a holder cube that is filled with another cons value cube. True to the spirit of Cube's bidirectional nature, the cons cube can be used to compose a list (that is, to attach an element in front of a list) as well as to decompose it (that is, to split a list into head and tail).
If a list flows through the input pipe into the left holder cube, it is split into a head and a tail. The head value flows out through the upper pipe and into the input port of the predicate cube labeled "-->", the binary predicate passed in as an argument. The tail value flows out through the lower pipe and into the input port of the map predicate (a recursive reference to the predicate we are currently defining). The output ports of the binary predicate and of map flow to the right cons cube, where they are composed into a list. This list finally leaves the plane through the output pipe on the right. So, the meaning of this plane can be stated as follows: "Given a nonempty list, split it into a head and a tail, apply the user-supplied binary predicate "-->" to the head and map recursively to the tail, and combine the results into a new list."
Figure 3(d) is an example of the map predicate. The program contains the map-definition cube and a predicate cube referring to it. The input port of map is connected to a holder cube filled with the list [1,2,3], the output port is connected to an empty holder cube, and the factorial predicate is slotted into the port on the top of map. The program is supposed to apply factorial to each element of the list [1,2,3]. Running the program will cause the result--the list [1,2,6]--to flow into the holder cube on the right. Figure 3(e) shows a close-up of the result.
The Implementation of Cube
All figures in this article were created by a prototype implementation of a Cube environment written in Modula-3. It consists of four major functional components: a renderer, a type-inference system, an interpreter, and a rudimentary editor.
The renderer does not rely on any dedicated 3-D hardware. I use a variation of the z-buffer algorithm combined with alpha blending to perform hidden-surface removal and to deal with transparent surfaces. The renderer delivers very realistic pictures, but is rather slow: It takes about three seconds on a 233-MHz Digital AlphaStation 400 4/233 to render Figure 1 at a resolution of 640x512 pixels--too slow to navigate the scene at interactive speeds. Therefore, the renderer displays a wireframe rendition of the scene whenever the viewpoint or the scene content changes, and starts (or restarts) a background thread to generate the high-quality rendition. If the scene remains unchanged long enough for the background thread to complete its task, the high-quality image eventually replaces the wireframe rendering.
The type-inference system translates the external representation of a Cube program into a simpler, intermediate representation, and then employs the Hindley-Milner type-inference algorithm (the algorithm used by virtually all statically typed functional languages) on it. The Hindley-Milner algorithm not only verifies that a given program is type correct, it also infers the type of each variable used. Cube uses these inferred types to provide feedback to the user: It fills each empty holder cube with a type cube representing the inferred type of the holder cube. For example, the empty holder cubes in Figure 1 would be filled with the type cube representing the real numbers.
The interpreter translates the internal representation of Cube programs into another intermediate representation, and then evaluates this intermediate form. Cube is an inherently concurrent language: The various planes of a predicate definition and the various predicates in each plane are evaluated in parallel. The interpreter simulates this concurrency by using a time-slicing approach.
Evaluating a Cube program yields a (possibly empty) set of solutions that the user can interactively browse and examine. When the user selects a particular solution, the interpreter fills each empty holder cube with a value cube representing the computed instantiation of the holder cube.
Finally, the editor allows the user to interactively construct new Cube programs. The current system implements only a core part of the needed functionality (that is, the ability to create predicate and holder cubes, to fill them with values, and to connect them through pipes).
When I implemented Cube, I had no access to virtual-reality hardware, so the input is entirely mouse based. As it turns out, editing 3-D scenes with a 2-D input device is extremely cumbersome. A more-appropriate input device, such as a data glove, would speed up the editing process considerably.
The proliferation of low-cost, high-resolution graphics displays has laid the groundwork that will allow visual languages to move into the mainstream of computing. I believe that their initial success will be in providing intuitive programming environments with a shallow learning curve to nonprogrammers or casual programmers: For instance, by enabling laboratory technicians to construct virtual lab instruments (as done by LabVIEW), by enabling scientists to build customized data visualizations (as done by AVS and IRIS Explorer), or by enabling end users to easily connect software components.
Professional programmers, on the other hand, still prefer textual environments, potentially with GUI front ends (as in Visual C++) and containing GUI builders (as in Delphi). However, visual languages are making some inroads here as well, for instance, in the form of the dataflow language that is part of VisualAge.
At the moment, 3-D visual programming is still the domain of research. This might change if virtual-reality hardware becomes commonplace. In fact, looking at the history of graphical user interfaces and of 2-D visual languages, I believe it is actually quite likely to change.
The Cube project has not produced the programming environment of the next century (nor was it intended to), but it has shown the feasibility of 3-D visual programming, and it has helped in identifying the research problems that need to be tackled in order to make programming in virtual realities a (more than virtual) reality.
For more information on Cube, visit http://www.research.digital.com/SRC/personal/najork/cube.html on the World Wide Web.
<B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=16">Figure 1(a)</A>:</B> The temperature-conversion program. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=17">Figure 1(b)</A>:</B> Converting 68 degrees Fahrenheit to Celsius. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=18">Figure 1(c)</A>:</B> The temperature-conversion program after evaluation. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=19">Figure 2(a)</A>:</B> The factorial predicate. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=20">Figure 2(b)</A>:</B> The upper plane. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=21">Figure 2(c)</A>:</B> The lower plane. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=22">Figure 2(d)</A>:</B> Computing the factorial of 5. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=23">Figure 2(e)</A>:</B> The factorial predicate after evaluation. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=24">Figure 3(a)</A>:</B> The map predicate. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=25">Figure 3(b)</A>:</B> The lower plane. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=26">Figure 3(c)</A>:</B> The upper plane. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=27">Figure 3(d)</A>:</B> Mapping factorial over the list [1,2,3]. <B><a href="/showArticle.jhtml?documentID=ddj9512a&pgno=28">Figure 3(e)</A>:</B> The map predicate after evaluation.
Copyright © 1995, Dr. Dobb's Journal