Database

The Active Expressions Library

By Mauricio De Simone and Gregory V. Wilson, August 01, 1998

The Active Expressions library combines the conciseness and checkability of language extensions with the portability and extensibility of a library.

Dr. Dobb's Journal August 1998: The Active Expressions Library

Mauricio De Simone developed the Active Expressions library while at the University of Waterloo. He can be reached at [email protected]. Greg is the author of Practical Parallel Programming (MIT Press, 1995) and coeditor with Paul Lu of Parallel Programming Using C++ (MIT Press, 1996). Greg can be reached at [email protected].

Daniel Slotnick began work on the world's first parallel computer, the ILLIAC-IV, more than 30 years ago. Today, multiprocessor workstations and PCs are common, but parallel programs are still rare. The main reason for this is that they are hard to write. As soon as you create a second thread of control (never mind a 200th), you have to start dealing with synchronization, communication, resource allocation, deadlock, and a host of other headaches. According to John Reynders of the Los Alamos National Laboratory, the result is that the time spent writing and debugging a parallel program is often greater than the time it would have taken to calculate the same results using conventional hardware.

In fact, while parallel programming is intrinsically harder than sequential programming, many parallel programming systems make it harder still. Some systems use proprietary language extensions, or entirely new languages. Many of these are available on only one platform, or fail to keep up with the evolution of their base language, as in the case of some of the parallel dialects of C++ surveyed in Parallel Programming Using C++ (Gregory V. Wilson and Paul Lu, editors, MIT Press, 1996).

Other systems (such as the POSIX threads standard and PVM) encapsulate parallelism, synchronization, and communication in libraries. These systems tend to be longer-lived and more portable than proprietary languages or dialects, but they also have drawbacks. For instance, they tend to be more cumbersome than language extensions: Even a "Hello world" program can require several dozen lines of obscure function calls.

Another drawback of many libraries is that programmers have to give up type safety. If a sending process has to package messages into byte blocks, for example, there is no way for the compiler to verify that the receiving process is unpacking those bytes correctly. Off-by-one and type-casting errors (such as sticking the first four bytes of a string into an integer by mistake) are both common and notoriously difficult to track down, especially in heterogeneous environments.

In this article, we'll describe the Active Expressions library, which combines the conciseness and "checkability" of language extensions with the portability and extensibility of a library. Active Expressions is written in C++, and makes heavy use of templates and operator overloading. It does not try to do everything (it would make more sense, for example, to do matrix algebra using Todd Veldhuizen's Blitz++ library) but its conciseness and performance are proof that you can sometimes have the best of both worlds. Active Expressions was developed by Mauricio De Simone while at the University of Waterloo; its implementation borrows ideas from ABC++, which was developed by Bill O'Farrell, Greg Wilson, and others at IBM.

The Active Expressions library has been used on a variety of UNIX platforms, and its run-time system handles both shared-memory systems and distributed-memory (or networked) machines (for more information, see http://www.interlog.com/ ~mdesimon/Ae/). Performance figures are available from the web site; as they show, Active Expressions is 5 to 10 percent slower than "naked" threads on shared-memory systems, but several times faster than PVM on LANs.

Developing Active Expressions

The development of Active Expressions began with a simple question: What does the expression a | b | c mean? In UNIX, it means "pipe the output of a to b, and the output of b to c." In C, it means "calculate the bitwise OR of a, b, and c."

Its meaning in C++ depends on the types of a, b, and c and on how the | operator has been overloaded. If the operands a, b, and c are threads (or C++ handles for threads) and if | has been defined to mean "run in parallel," then the expression can have the same meaning inside a C++ program as it has on the UNIX command line. What's more, if * (multiplication) is defined to mean "replicate," users can write expressions such as producer | (20 * worker) | consumer to create a task farm in which a single producer creates jobs, a pool of 20 workers execute jobs as they become available, and a consumer collates the results; see Figure 1.

Using overloaded operators to implement parallelism has several advantages over using either function libraries or language extensions. Since C++ selects overloaded operators based on argument types, Active Expressions can be made typesafe. At the same time, since the core of the library is just a set of classes and operators, a user can extend it to meet his application's needs. For example, the || operator can then be defined to mean "broadcast," and / to mean "concurrent," so that source || (N * search) | (check / save) creates the system in Figure 2, in which a source broadcasts instructions to N search engines, whose outputs are both checked and saved.

Active Expressions have three parts: the base classes from which users derive active components (like the producer and consumer in the examples), the overloaded operators that connect these components together, and the run-time system that makes it all go. Every Active Component (AC) belongs in one of the following categories:

closed. The AC does not accept input, and cannot generate output.
input[T]. The AC can accept objects of type T.
output[T]. The AC produces objects of type T.
input/output[T1,T2]. The AC can accept T1s, and produces T2s.

The C++ Active Expressions library contains four template base classes -- C_Ac, I_Ac<Ti>, O_Ac<To>, and IO_Ac<Ti, To>. To create Active Components, you derive application-specific classes from these base classes, specifying the type of data each can accept or produce. These derived classes must override the pure virtual method run() to specify their behavior. This method is called by the run-time system, after a thread for the Active Component has been created.

For example, suppose that a task farm is going to be used to calculate the Mandelbrot Set. The producer encapsulates individual jobs in "tiles," shown in Listing One. These tiles are passed to workers, which calculate colors for xPixels*yPixels covering the rectangle from (xLo,yLo) to (xHi,yHi) and store these pixels in pixel blocks; see Listing Two. Note the use of the template array class Array<> to create an array of colors. Listing Three shows the producer, worker, and consumer classes for this program.

Defining Operators

Active components can communicate with the outside world using files, sockets, or any other mechanism, but can only communicate with each other through their ports. Each component's ports are strongly typed: If a program tries to connect ports with different types, the compiler detects the mismatch and reports an error, just as it would report an error if the wrong type of argument was passed to a function call.

An input-only AC inherits a single port called pin from I_Ac. This port is an instance of the template class I_Port<Ti> and has an overloaded stream operator operator>> which can be used for reading. Similarly, output ports are derived from the base class O_Port<To>. An output-only AC has one such (output) port, while an input-output AC has one of each. Using these ports, we can flesh out the definition of the Mandelbrot Set Worker's run() method as shown in Listing Four.

Active Components are just building blocks; the real power of the Active Expressions library lies in the operators that can be used to combine these components. In C++, if the function in Listing Five(a) has been defined for types Mulder, Scully, and Skinner, then the assignment statement in Listing Five(b) becomes Listing Five(c).

Our first attempt to create Active Expressions uses standard operator overloading. The overloaded operators take Active Expressions as arguments, and return Active Expressions as results. These Active Expressions are not concurrent threads themselves, but rather passive proxies for threads. Executing the overloaded operators builds a parse tree in memory; evaluation of that parse tree then creates the parallel structure defined by the expression tree.

To make this more concrete, consider the pipe operator in Listing Six(a), where AeNode is a simple class that holds pointers to operands of type AeNode; see Listing Six(b). Given this definition, and assuming that the base classes for active components are all derived from AeNode, a structure such as AeNode head = n1 | n2 | n3 | n4; is compiled into AeNode head = operator|(n1, operator|(n2, operator|(n3, n4)));, which, if inlining is used, in turn becomes AeNode head = *new AeNode(n1, *new AeNode(n2, *new AeNode(n3, n4)));.

This structure holds the information needed to create a dataflow network, but does no type-checking. To get the safety that type-checking provides, you must use template classes, organized into the hierarchy; see Figure 3. Using this hierarchy, the definition of the pipe operator in Listing Five can be changed to validate the types of connections, as in Listing Seven(a). This operator is essentially the same as the one shown before, but its usage will only be triggered by connections between O_AeNodes and I_AeNodes (or O_Acs and I_Acs, if these are derived from their node counterparts). Three definitions of operator| are needed in addition to Listing Seven(a) to handle the four kinds of connectivity that can arise; see Listing Seven(b). The type of the result of each operator reflects the semantics of the result of the connection made. This ensures that the entire tree is constructed correctly.

The operators in Listing Seven ensure correct connectivity, but not correct typing. To guarantee this, you must refine the class hierarchy one more time by templating all of the involved classes. Once this has been done, the operators will look like Listing Eight. This, our final definition, checks both connectivity and typing at compile time.

Implementing Operators

Defining operators is only half the battle -- you must also implement them. For each operator, you define a class hierarchy that mimics the structure in Figure 4. Each category of this hierarchy is derived directly from the appropriate Active Expression node type. The pipe representative needs an extra template parameter (Tcon) that holds the type information for the connection that would otherwise be lost in the construction of the tree. These representative objects can be used as return values for the operators describe before; see Listing Nine.

And that's all there is to it. For every operator, four implementations must be defined, and four accompanying classes derived to store descriptive information. A Perl script can be used to generate the required definitions for new operators.

Returning to the Mandelbrot Set

Let us return to the Mandelbrot Set task farm specified by the expression producer | (20 * worker) | consumer. When executed, this expression constructs a simple tree, but does not actually create or run any threads. For that to happen, the expression must be activated by assigning it to a variable of type C_Ae (a closed expression). The overloaded assignment operator operator= invokes the run-time system, which walks the expression tree, creating, connecting, and running threads as required. Thus, to actually calculate and display the Mandelbrot Set, you would use the code shown in Listing Ten.

The same effect can also be achieved using the ! operator, which instantiates and runs an active expression !(producer | (20 * worker) | consumer);. (Because ! is sometimes called "bang," this is referred to as "banging an expression.")

The expression tree describing the concurrent system does not disappear once the expression finishes running. By calling the method do() on the handle mandelbrot, a program can rerun the Active Expression. An interactive task farm could therefore be written as in Listing Eleven. The loop control in the code relies on a type conversion operator const int(), defined for C_Ae, which returns the error code produced by the most recent execution of the Active Expression, or 0 if no error occurred.

One advantage of delaying execution until the value of an active expression is required is that it lets you encapsulate concurrency in application-specific ways. A task farm, for example, can be encapsulated as in Listing Twelve, which can then be used by you to write code like !(databaseReader(inputFileName) || TaskFarm(p, w, c) | (check / save));.

The greatest strength of library-based systems like Active Expressions is the ease with which they can be extended. For example, consider a dataflow loop like that shown in Figure 5. This introduces a new node type that has two pairs of ports. One pair connects the node to the rest of the program; the second pair connects the node to an input/output expression. This keeps data circulating as long as the value returned by the Boolean virtual method loop() (which the user must override) is True. Listing Thirteen shows the base class for a loop. (The input and output types of the loop must be the same to allow for the possibility of zero traversals. The equivalent of a do-while loop can also be defined, with different input and output types.)

The expression to be looped is given to the node as a constructor argument, as in result = something | loop(a|b|c) | somethingElse;. Adding this class to the library took only a few minutes; adding support for other constructs, such as two-dimensional grids, took somewhat longer, but was still much easier than it would have been with most other systems.

Conclusion

Future work on Active Expressions will focus on three areas. The first, a port to Win32 and Microsoft Visual C++, is under way. The second is I/O: While it would clearly be useful to be able to connect active expressions directly to standard C++ streams, the semantics of this is tricky to define. For example, if N threads are reading concurrently, should the whole file be broadcast to each thread, should each get one record in order (assuming the file is record-based), should each get one segment of the file, or should reading be interleaved randomly?

The third area for future work is error handling. If a thread in a pool of workers throws an exception, should that exception cause the whole expression to terminate? Should it somehow be passed downstream? Or should it propagate to the outside world only after the expression as a whole has terminated?

DDJ

Listing One

struct MandelTile{
    float xLo, yLo, xHi, yHi;   // boundary coordinates of tile
    int xPixels, yPixels;       // number of pixels in tile
};

Back to Article

Listing Two

struct PixelBlock{
    int xPixels, yPixels;       // number of pixels in tile
    Array<color> pixels;        // 24-bit color, xPixels*yPixels long
};

Back to Article

Listing Three

class Producer : public O_Ac<MandelTile>{
    virtual void run(){
        while (tiles to generate){
            generate and output tiles
        }
    }
};
class Worker : public IO_Ac<MandelTile, PixelBlock>
{
    virtual void run(){
        MandelTile tile;
        PixelBlock block;
        while (input stream open){
            read a job spec into tile
            calculate and output block
        }
    }
};
class Consumer : public IO_Ac<MandelTile, PixelBlock>
{
    virtual void run(){
        PixelBlock block;
        while (input stream open){
            read pixels into block
            display
        }
    }
};

Back to Article

Listing Four

void Worker::run(){    MandelTile tile;
    PixelBlock block;
    while (! pin.eos()){        // eos() is the end-of-stream test
        pin >> tile;
        calculate
        pout << block;
    }
}

Back to Article

Listing Five

<b>(a)</b>
Skinner operator+(Mulder a, Scully b){
    ...implementation...
    return instance of Skinner;
}

<b>(b)</b>
Mulder fox;Scully dana;
Skinner walter;
 ...
walter = fox + dana;

<b>(c)</b>
walter = operator+(fox, dana);

Back to Article

Listing Six

<b>(a) </b>
AeNode & operator| (const AeNode& lhs, const AeNode& rhs){
    return *(new AeNode(lhs, rhs));
}

<b>(b)</b>
class AeNode{
    public :
        AeNode(const AeNode& lhs, const AeNode& rhs)
        : _left(&lhs), _right(&rhs)
        {}
    protected :
        AeNode * _left, * _right;
}

Back to Article

Listing Seven

(a)

C_AeNode & operator|(const O_AeNode & lhs, const I_AeNode & rhs){
    return *new C_AeNode(lhs, rhs);
}

<b>(b) </b>
O_AeNode& operator|(const O_AeNode& lhs, const IO_AeNode& rhs){
    return *new O_AeNode(lhs, rhs);
}
I_AeNode& operator|(const IO_AeNode& lhs, const I_AeNode& rhs)
{
    return *new I_AeNode(lhs, rhs);
}
IO_AeNode& operator|(const IO_AeNode& lhs, const IO_AeNode& rhs)
{
    return *new IO_AeNode(lhs, rhs);
}

Back to Article

Listing Eight

template <class T>C_AeNode&
operator | (const O_AeNode<T>& lhs, const I_AeNode<T>& rhs)
{
    return *new C_AeNode(lhs, rhs);
}
 ...other variations defined similarly...

Back to Article

Listing Nine

template <class T>C_AeNode<T>&
operator | (const O_AeNode<T>& lhs, const I_AeNode<T>& rhs)
{
    return *new C_Pipe<T>(lhs, rhs);
}

Back to Article

Listing Ten

C_Ae mandelbrot;mandelbrot = producer | (20 * worker) | consumer;

Back to Article

Listing Eleven

{    // run first time
    C_Ae mandelbrot = producer | (20 * worker) | consumer;
    while ((mandelbrot == 0) && (user isn't bored)){
        get new parameters
        producer.reset(new parameters);
        mandelbrot.do();
    }
    // mandelbrot automatically destroyed here
}

Back to Article

Listing Twelve

template <class Job, class Result>class IO_Ac<Job, Result> &
TaskFarm(
    O_Ac<Job>& producer,
    IO_Ac<Job,Result>& worker,
    I_Ac<Result>& consumer
){
  int n = 3 * number of processors in the system;
  return producer | (n*worker) | consumer;
}

Back to Article

Listing Thirteen

template<class T>class Loop_Ae
{
public :
    // main-line ports
I_Port<T> pin;
    O_Port<T> pout;
    // looping ports
    I_Port<T> loopIn;
    O_Port<T> loopOut;
    // run() pre-defined
    virtual void run(){
        while (! pin.eos()){
            T data;
            pin >> data;
            while (loop()){
                loopOut << data;
                loopIn  >> data;
            }
            pout << data;
        }
    }
    // user must override control
    virtual const bool loop() = 0;
};

Back to Article

1 2 3 4 5 6 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Database

The Active Expressions Library

Developing Active Expressions

Defining Operators

Implementing Operators

Returning to the Mandelbrot Set

Conclusion

Listing One

Listing Two

Listing Three

Listing Four

Listing Five

Listing Six

Listing Seven

Listing Eight

Listing Nine

Listing Ten

Listing Eleven

Listing Twelve

Listing Thirteen

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Database Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Database

The Active Expressions Library

Developing Active Expressions

Defining Operators

Implementing Operators

Returning to the Mandelbrot Set

Conclusion

Related Reading

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Database Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content