In recent issues of *CUJ*, Thomas Becker explained some principles of template metaprogramming [1, 2]. In this article, we’ll take up the metaprogramming thread and give you some insights into a powerful metaprogramming tool known as the expression template.

As Becker stated, you can use templates to calculate values at compile time, such as the factorial of **n**. Becker also described the principle of type traits and showed how to strip the reference property off of a type. We’ll get a little more ambitious in this article: we’ll show you how to evaluate complex expressions at compile time, thereby reducing the run-time effort for expression evaluation to its bare minimum. Our goal is to demonstrate how to prepare an expression at compile time for repeated, highly efficient evaluation at run time.

Consider the integral:

∫₁⁵ x/(1+x) dx

You can approximate this integral by evaluating the expression **x/(1+x)** for **n** equidistant points in the interval **[1.0,5.0]**. A function that calculates integrals for arbitrary arithmetic expressions could look like the following, provided you manage to implement expression templates that can be evaluated repeatedly for different values:

```cpp
template <class ExprT>
double integrate(ExprT e, double from, double to, size_t n)
{
    double sum = 0, step = (to - from) / n;
    for (double i = from + step/2; i < to; i += step)
        sum += e.eval(i);
    return step * sum;
}
```

**ExprT** in this example would somehow represent an expression such as **x/(1+x)**.

### Expression Templates for Arithmetic Expressions

An expression template for the solution described in the preceding section requires the compile-time evaluation of arithmetic expressions consisting of unary and binary arithmetic operators and variable and constant operands. You’ll find a solution for this type of problem in the classic GOF patterns book [3], which contains a pattern for exactly this case: the Interpreter pattern.

The Interpreter pattern provides a way to represent a language in the form of an abstract syntax tree and an interpreter that uses the syntax tree to interpret language constructs. Actually, the Interpreter pattern is a special case of the Composite pattern. The part-whole relationship of the Composite pattern corresponds to the relationship of an expression and its sub-expressions in the Interpreter pattern:

- The leaf is a *terminal* expression.
- The composite is a *non-terminal* expression.
- Evaluation of the components is *interpretation* of the syntax tree and its expressions.

In this case, the goal is to use the Interpreter pattern’s syntax tree to represent arithmetic expressions such as **(a+1)\*c** or **log(abs(x-N))**. The syntax tree has two types of terminal nodes: numeric literals and numeric variables. The non-terminal nodes are unary or binary expressions consisting of one or two sub-expressions. The expressions have different semantics, such as **+**, **-**, **\***, **/**, **++**, **--**, **exp**, **log**, and **sqrt**.

Consider the concrete example of an expression, say **(x+2)*3**. The composite
structure, that is, the syntax tree for this expression, is shown in Figure
1.

The classic object-oriented technique for implementing the Interpreter pattern, as it is suggested in the GOF book, involves the classes shown in Figure 2.

Listing 1 shows samples of the corresponding
source code for an implementation of this technique. The base class for **UnaryExpr**
is implemented in analogy to class **BinaryExpr**, and all the concrete unary
and binary expressions follow the example of class **Sum**.

Listing 2 shows how to use the Interpreter
pattern for evaluating the expression **(x+2)*3**. First an expression object
**expr** is created that represents the expression **(x+2)*3**, and then
the expression object is told to evaluate itself. Naturally, this is an utterly
inefficient way to calculate the result of a primitive expression such as **(x+2)*3**.
But hold on; we will now turn the object-oriented approach into a template-based
solution.
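Since Listings 1 and 2 are not reproduced here, the following is a minimal sketch of the object-oriented implementation they describe. The class names follow the article; the exact signatures, the use of a pointer for the variable reference, and the memory management are our assumptions:

```cpp
#include <cassert>

// Abstract base class; declares the virtual evaluation function.
class Expr {
public:
    virtual ~Expr() {}
    virtual double eval() const = 0;
};

// Terminal expression: a numeric literal.
class Literal : public Expr {
    double val_;
public:
    explicit Literal(double v) : val_(v) {}
    double eval() const { return val_; }
};

// Terminal expression: refers to a numeric variable.
class Variable : public Expr {
    const double* var_;
public:
    explicit Variable(const double& v) : var_(&v) {}
    double eval() const { return *var_; }
};

// Non-terminal expression: owns its two sub-expressions on the heap.
class BinaryExpr : public Expr {
protected:
    Expr* left_;
    Expr* right_;
public:
    BinaryExpr(Expr* l, Expr* r) : left_(l), right_(r) {}
    ~BinaryExpr() { delete left_; delete right_; }
};

class Sum : public BinaryExpr {
public:
    Sum(Expr* l, Expr* r) : BinaryExpr(l, r) {}
    double eval() const { return left_->eval() + right_->eval(); }
};

class Product : public BinaryExpr {
public:
    Product(Expr* l, Expr* r) : BinaryExpr(l, r) {}
    double eval() const { return left_->eval() * right_->eval(); }
};

// (x+2)*3, built and evaluated the object-oriented way.
double evalExample(double x) {
    Product expr(new Sum(new Variable(x), new Literal(2)), new Literal(3));
    return expr.eval();
}
```

Every node costs a heap allocation, and every evaluation step is a virtual call; this is the overhead the template-based version eliminates.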

### A Compile-Time Version of Arithmetic Expression Evaluation

In the object-oriented implementation, the classes representing the terminal and non-terminal expressions have a common base class that defines the operation they have in common. This technique is well known: commonalities are expressed by means of a common base class. In template programming, commonalities are expressed in terms of naming conventions and name commonalities. A virtual function in the object-oriented approach becomes a plain, non-virtual function with a certain name. Classes no longer derive from a common base class. Instead, the classes are stand-alone classes that happen to have a member function with the same name and a compatible signature (in the preceding example, an evaluation function named **eval()**). The expression template solution eliminates all base classes.

Next, we express all the non-terminal expressions, such as **Sum** or **Product**, as classes generated from class templates **UnaryExpr** and **BinaryExpr**, both of which are parameterized on structural information. These class templates take the types of their sub-expression(s) as type template arguments. In addition, we parameterize the expression class templates on the type of the operation that they represent. That is, the actual operation (**+**, **-**, **\***, **/**, **++**, **--**, **abs**, **exp**, **log**) will be provided as a function object, and its type is one of the template arguments of the expression class template.

We implement the terminal expressions as regular (non-template) classes, pretty much like they are implemented in the object-oriented approach.

Instead of using run-time recursion, we use compile-time recursion for the expression template: we replace the recursive invocation of the virtual evaluation function with recursive-template instantiation of the expression class templates. Remember the calculation of the factorial of **n** as demonstrated in [1]. This calculation is based on the fact that instantiation of templates is a recursive process. The compiler may find that for the instantiation of one template it may need an instantiation of another template. The compiler then instantiates that other template, which may require the instantiation of yet another template, and so forth. Many of the template metaprogramming techniques take advantage of the recursive nature of the template instantiation process and use it to perform recursive calculations. We will use this effect to replace the recursive calls of the evaluation function **eval()** with recursive instantiation of expression class templates.
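As a refresher, the compile-time factorial from [1] is typically written along these lines (a standard sketch; the exact form in [1] may differ):

```cpp
#include <cassert>

// Recursive template instantiation: Factorial<5> requires Factorial<4>,
// which requires Factorial<3>, and so on down to the specialization.
template <unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

// Explicit specialization terminates the recursion.
template <>
struct Factorial<0> {
    static const unsigned long value = 1;
};
```

The same recursive-instantiation mechanism will unfold our expression class templates at compile time.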

Figure 3 shows the classes in the template-based solution. Listing 3 shows the source code for this implementation.

The class template for **UnaryExpr** is analogous to class **BinaryExpr**.
For operations, we can use the pre-defined STL function object types (**plus**,
**minus**, **multiplies**, **divides**, etc.), or we can define our
own function object types as needed. A binary expression representing a sum,
for instance, would be of type **BinaryExpr< ExprT1, ExprT2, plus<double>
>**. Since this type name is rather unwieldy, we add a creator function
for a more convenient solution. Listing
4 shows two examples of the creator functions. (For more information, see
sidebar, “Creator Functions”).
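Listings 3 and 4 are not shown here; under the naming conventions described above, the expression class template and its creator functions might be sketched like this (details are our assumption):

```cpp
#include <cassert>
#include <functional>

// Terminal expressions: plain (non-template) classes with an eval() function.
class Literal {
    double val_;
public:
    Literal(double v) : val_(v) {}
    double eval() const { return val_; }
};

class Variable {
    const double* var_;
public:
    Variable(const double& v) : var_(&v) {}
    double eval() const { return *var_; }
};

// Non-terminal expression: parameterized on its sub-expression types and
// on the operation, which is supplied as a function object type.
template <class ExprT1, class ExprT2, class BinOp>
class BinaryExpr {
    ExprT1 e1_;
    ExprT2 e2_;
    BinOp  op_;
public:
    BinaryExpr(ExprT1 e1, ExprT2 e2, BinOp op = BinOp())
        : e1_(e1), e2_(e2), op_(op) {}
    double eval() const { return op_(e1_.eval(), e2_.eval()); }  // non-virtual
};

// Creator functions hide the unwieldy type names.
template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::plus<double> >
makeSum(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::plus<double> >(e1, e2);
}

template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::multiplies<double> >
makeProd(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::multiplies<double> >(e1, e2);
}

// (x+2)*3 as a compile-time composed expression object.
double evalExample(double x) {
    return makeProd(makeSum(Variable(x), Literal(2)), Literal(3)).eval();
}
```

Note that the creator functions deduce the sub-expression types automatically, so the user never spells out the nested template type.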

Listing 5 shows how to use the template-based
implementation of an interpreter for evaluating the expression **(x+2)*3**.
An expression object **expr** represents the expression **(x+2)*3**, and
then the expression object is told to evaluate itself. The type of the expression
object is not visible in Listing 5,
thanks to the creator function, but the type would be:

BinaryExpr< BinaryExpr<Variable, Literal, plus<double> >, Literal, multiplies<double> >

### Evaluation

What have we gained by re-implementing the Interpreter pattern using templates instead of inheritance? Well, under the assumption that the compiler inlines all the creator functions and constructors and **eval()** functions (which is likely since all of them are trivial) the expression:

makeProd(makeSum(Variable(x),Literal(2)),Literal(3)).eval()

evaluates to nothing more than **(x+2)*3**.

Compare this to the effect of:

```cpp
Product expr(new Sum(new Variable(x), new Literal(2)), new Literal(3));
expr.eval();
```

(See Listing 2). The object-oriented
approach in Listing 2 results in a
number of allocations from the heap and subsequent constructions plus a number
of invocations of the virtual **eval()** function. Most likely, none of the
calls to **eval()** will be inlined because compilers typically do not inline
functions that are invoked through a pointer.

In essence, the template-based solution is much faster at run time than the object-oriented implementation.

### Further Elaboration of the Expression Template Solution

We will now tweak the expression templates a bit to turn them into something really useful. Our first improvement addresses readability. We want to make an expression such as:

makeProd(makeSum(Variable(x),Literal(2)),Literal(3)).eval()

more readable by making it look more like the expression that it represents, namely **(x+2)*3**. We can achieve this goal by means of operator overloading. We will make the expression look like **eval((v+2)*3.0)** with just a few minor modifications.

The first change is to rename the creator functions so that they are overloaded operators; that is, we rename **makeSum** into **operator+**, **makeProd** into **operator\***, and so forth. The effect is that the term:

makeProd(makeSum(Variable(x),Literal(2)),Literal(3))

turns into:

((Variable(x) + Literal(2)) * Literal(3))

That’s good, but not good enough. We would like to write the expression as **((x+2)*3)**. Hence, our goal is to eliminate the creation of a **Variable** and a **Literal**, which still clutter the expression.

We will start by asking what an expression such as **x+2** means, now that
we have renamed the creator function **makeSum** into **operator+**. (Listing
6 shows the implementation of **operator+**.)

**x+2** is the same as **operator+(x,2)**, which formerly was **makeSum(x,2)**. Thus **x+2** creates a binary expression object that represents a sum and is constructed with the **double** variable **x** and the **int** literal **2** as constructor arguments. More precisely, **x+2** is an unnamed object created as **BinaryExpr<double,int,plus<double>>(x,2)**. Note that the type of the object is not quite what we want: we need an object of type **BinaryExpr<Variable,Literal,plus<double>>**, but automatic template argument deduction does not know that **x** is a **Variable** and **2** is a **Literal** in our context. The compiler deduces type **double** from argument **x** and type **int** from argument **2**, because deduction considers only the types of the arguments passed to the function.

It turns out that we must nudge the compiler a little bit. If we pass an object
of type **Variable** instead of the original variable **x**, then the
automatic argument deduction would at least yield type **BinaryExpr<Variable,int,plus<double>>**,
which is a little closer to the goal. (We will address the remaining **int**-to-**Literal**
conversion in a minute.) For this reason, a minimal degree of cooperation from
the users’ side is inevitable: users must wrap their variables into objects
of type **Variable** to make the expression work, as shown in Listing
7.

By using a **Variable** object **v** instead of a plain numeric variable, we ensure that an expression such as **v+2** evaluates to an unnamed object equivalent to **BinaryExpr<Variable,int,plus<double>>(v,2)**. Such a **BinaryExpr** object has two data members, which are of type **Variable** and **int** respectively. The evaluation function **BinaryExpr<Variable,int,plus<double>>::eval()** would return the sum of the result of the evaluation of its two data members. The crux is that the **int** data member does not know how to evaluate itself; we need to turn the literal **2** into an object of type **Literal**, which does know how to evaluate itself. How can we automatically convert constants of any numeric type to objects of type **Literal**? In order to solve the problem, we’ll define expression traits. For more information on traits, see the sidebar, “Traits.”

The traits technique solves the problem of converting numeric literals to
objects of type **Literal**. For each expression type, we define expression
traits that provide information about how sub-expressions should be stored inside
the expression objects of which they are operands. All entities of numeric types
shall be stored as objects of type **Literal**; all objects of type **Variable**
shall be stored as they are, namely as **Variable**s; and all non-terminal
expression objects shall also be stored as they are. Listing
8 shows the definition of the expression traits.

The expression traits class defines a nested type **expr_type**, which represents the expression object’s expression type. A general expression traits template defines the expression type for all expressions that are class types, such as **BinaryExpr**, **UnaryExpr**, or **Variable**. In addition, there are specializations of the general class template for all those types that are built-in numeric types such as **short**, **int**, **long**, **float**, **double**, etc. For all non-class expressions, the expression type is defined as type **Literal**.

Inside the definition of the **BinaryExpr** and **UnaryExpr** classes, we will use the expression traits to determine the data member type that will hold the sub-expression.

Thanks to the expression traits, an expression object of type **BinaryExpr<Variable,int,plus<double>>** will contain its two operands as objects of type **Variable** and **Literal**, as desired.

Now we have ensured that an expression such as **((v + 2) * 3).eval()**,
where **v** is a **Variable** that wraps a **double** variable **x**,
will evaluate to **(x+2)*3**. Let us do some minor tweaking to make the expression
even more readable. Most people find it weird to invoke a member function of
something that looks like an expression. If we define a helper function, we
can turn the expression **((v + 2) * 3).eval()** into something like **eval((v
+ 2) * 3)**, which looks more natural to most people but is otherwise equivalent.
Listing 9 shows the helper function.
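Listings 6 through 9 are not reproduced here, but putting the pieces together, the traits, the overloaded operators, and the **eval()** helper might be sketched as follows (the details, in particular which numeric types get traits specializations, are our assumption):

```cpp
#include <cassert>
#include <functional>

class Literal {
    double val_;
public:
    Literal(double v) : val_(v) {}     // non-explicit: converts numerics
    double eval() const { return val_; }
};

class Variable {
    const double* var_;
public:
    Variable(const double& v) : var_(&v) {}
    double eval() const { return *var_; }
};

// Expression traits: decide how a sub-expression is stored. Class-type
// expressions are stored as they are; built-in numeric types as Literal.
template <class ExprT>
struct exprTraits {
    typedef ExprT expr_type;
};
template <> struct exprTraits<int>    { typedef Literal expr_type; };
template <> struct exprTraits<double> { typedef Literal expr_type; };

template <class ExprT1, class ExprT2, class BinOp>
class BinaryExpr {
    typename exprTraits<ExprT1>::expr_type e1_;  // traits pick the member type
    typename exprTraits<ExprT2>::expr_type e2_;
    BinOp op_;
public:
    BinaryExpr(ExprT1 e1, ExprT2 e2, BinOp op = BinOp())
        : e1_(e1), e2_(e2), op_(op) {}
    double eval() const { return op_(e1_.eval(), e2_.eval()); }
};

// The creator functions, renamed into overloaded operators.
template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::plus<double> >
operator+(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::plus<double> >(e1, e2);
}
template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::multiplies<double> >
operator*(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::multiplies<double> >(e1, e2);
}

// Helper so that users can write eval(expr) instead of expr.eval().
template <class ExprT>
double eval(ExprT e) { return e.eval(); }

double evalExample(double x) {
    Variable v(x);
    return eval((v + 2) * 3);
}
```

The operator templates never fire for purely built-in operands (overloaded operators require at least one class-type operand), so ordinary arithmetic is unaffected.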

Figure 4 illustrates how the expression
**((v + 2) * 3).eval()**, where **v** is a **Variable** that wraps
a **double** variable **x**, gradually unfolds at compile time to the
expression **(x+2)*3**.

### Repeated Evaluation of an Expression Object

What is the benefit of expression objects? Each expression object represents the syntactical decomposition of an arithmetic expression; it’s a syntax tree that knows how to interpret itself to produce a numeric value. Basically, we’ve set up machinery for evaluation of expressions — something that is built into the language anyway. So, what is the point? A final small adjustment to the solution will reveal the point.

So far, the syntax tree interpretation is rather static. Each syntax tree is created and interpreted only once. A more dynamic usage model is possible, where a given syntax tree is evaluated repeatedly for different input values. Remember, we ultimately want to calculate integrals such as:

∫₁⁵ x/(1+x) dx

using an integration function such as the one below, which approximates the integral by evaluating an expression for a specified number of equidistant points in an interval:

```cpp
template <class ExprT>
double integrate(ExprT e, double from, double to, size_t n)
{
    double sum = 0, step = (to - from) / n;
    for (double i = from + step/2; i < to; i += step)
        sum += e.eval(i);
    return step * sum;
}
```

We would like to use this function as shown below to approximate the integral of **x/(1+x)** from the example above:

```cpp
Identity x;
cout << integrate(x/(1.0+x), 1.0, 5.0, 10) << endl;
```

We need an expression object that can be interpreted repeatedly, something
that our expression templates do not allow yet. It just takes a minor modification
to turn our static syntax tree interpretation into a repeatable interpretation.
We just have to change all evaluation functions of all the expression class
templates so that they accept an input argument, namely the value for which
they evaluate themselves. The non-terminal expressions will pass on the argument
to their sub-expressions. The **Literal** will accept the argument and ignore
it; it will continue returning the constant value that it represents. The **Variable**
will no longer return the value of a variable that it refers to but will return
the value of its argument. For this reason we rename it to **Identity**.
Listing 10 shows the modified classes.
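Listing 10 is not shown here; the modified classes, with the evaluation functions taking an input argument, might look like this minimal sketch (our assumptions, using the midpoint rule from the **integrate** function above):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>

// Literal accepts the argument and ignores it; it keeps returning its value.
class Literal {
    double val_;
public:
    Literal(double v) : val_(v) {}
    double eval(double) const { return val_; }
};

// Variable, renamed to Identity: returns the value of its argument.
class Identity {
public:
    double eval(double d) const { return d; }
};

template <class ExprT> struct exprTraits { typedef ExprT expr_type; };
template <> struct exprTraits<int>    { typedef Literal expr_type; };
template <> struct exprTraits<double> { typedef Literal expr_type; };

template <class ExprT1, class ExprT2, class BinOp>
class BinaryExpr {
    typename exprTraits<ExprT1>::expr_type e1_;
    typename exprTraits<ExprT2>::expr_type e2_;
    BinOp op_;
public:
    BinaryExpr(ExprT1 e1, ExprT2 e2, BinOp op = BinOp())
        : e1_(e1), e2_(e2), op_(op) {}
    // Non-terminals pass the argument on to their sub-expressions.
    double eval(double d) const { return op_(e1_.eval(d), e2_.eval(d)); }
};

template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::plus<double> >
operator+(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::plus<double> >(e1, e2);
}
template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::divides<double> >
operator/(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::divides<double> >(e1, e2);
}

// Midpoint-rule integration, evaluating the expression object repeatedly.
template <class ExprT>
double integrate(ExprT e, double from, double to, std::size_t n) {
    double sum = 0, step = (to - from) / n;
    for (double i = from + step/2; i < to; i += step)
        sum += e.eval(i);
    return step * sum;
}

double approxIntegral() {
    Identity x;
    return integrate(x / (1.0 + x), 1.0, 5.0, 1000);
}
```

The exact value of the integral is 4 − ln 3 ≈ 2.9014, which the midpoint rule approximates closely for a sufficiently large **n**.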

If we add non-terminal expressions for incorporation of numeric functions (such as **sqrt**, **sqr**, **exp**, **log**, etc.), we can even calculate the Gauss distribution:

```cpp
double sigma = 2.0, mean = 5.0;
const double Pi = 3.141593;
cout << integrate(1.0/(sqrt(2*Pi)*sigma)
                    * exp(sqr(x-mean)/(-2*sigma*sigma)),
                  2.0, 10.0, 100)
     << endl;
```

For implementation of non-terminal expressions for the numeric functions, such
as **sqrt**, **exp**, and **sqr** in the example, we can use functions
from the Standard C library. We would implement these non-terminal expressions
as creator functions that create a unary expression, whose operation is one
of the predefined C functions (see Listing
11).
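Listing 11 is not reproduced here; one way such a creator function might look is sketched below. We wrap the C library function in a small function object rather than storing a function pointer, which is one of several possible designs (our assumption):

```cpp
#include <cassert>
#include <cmath>

class Identity {
public:
    double eval(double d) const { return d; }
};

// Non-terminal for unary operations; the operation is a function object.
template <class ExprT, class UnOp>
class UnaryExpr {
    ExprT e_;
    UnOp  op_;
public:
    UnaryExpr(ExprT e, UnOp op = UnOp()) : e_(e), op_(op) {}
    double eval(double d) const { return op_(e_.eval(d)); }
};

// Function object wrapping the C library function.
struct SqrtOp {
    double operator()(double d) const { return std::sqrt(d); }
};

// Creator function: sqrt(expr) builds a unary expression node.
template <class ExprT>
UnaryExpr<ExprT, SqrtOp> sqrt(ExprT e) {
    return UnaryExpr<ExprT, SqrtOp>(e);
}

double evalExample(double d) {
    Identity x;
    return sqrt(x).eval(d);
}
```

Because the overloaded **sqrt** takes a class-type expression, it does not clash with the C library’s **sqrt(double)**.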

With these additions in place, we now have a powerful, high-performance solution for evaluating arithmetic expressions. The techniques demonstrated in this article allow you to set up expression templates for logical expressions. By renaming the evaluation function from **eval()** to **operator()**, which is the function call operator, you can easily turn the expression objects into function objects, which you can then use in conjunction with STL algorithms. Below is an example of a logical expression that is used as a predicate for counting elements in a list:

```cpp
list<int> l;
Identity x;
count_if(l.begin(), l.end(), x >= 0 && x <= 100);
```

The example nicely illustrates the gain in readability that expression templates enable, at zero cost in terms of run-time performance. Provided that the expression templates are already in place, they are easy and convenient to use. All expression template libraries are built on principles similar to the ones discussed in this article.
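The predicate variant described above might be sketched as follows. Renaming **eval()** to **operator()** turns the expression objects into function objects; the comparison and logical operators shown here are our assumptions about how such a library would define them:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <list>

class Literal {
    double val_;
public:
    Literal(double v) : val_(v) {}
    double operator()(double) const { return val_; }
};

class Identity {
public:
    double operator()(double d) const { return d; }
};

template <class ExprT> struct exprTraits { typedef ExprT expr_type; };
template <> struct exprTraits<int>    { typedef Literal expr_type; };
template <> struct exprTraits<double> { typedef Literal expr_type; };

// eval() renamed to operator(): expression objects become function objects.
template <class ExprT1, class ExprT2, class BinOp>
class BinaryExpr {
    typename exprTraits<ExprT1>::expr_type e1_;
    typename exprTraits<ExprT2>::expr_type e2_;
    BinOp op_;
public:
    BinaryExpr(ExprT1 e1, ExprT2 e2, BinOp op = BinOp())
        : e1_(e1), e2_(e2), op_(op) {}
    double operator()(double d) const { return op_(e1_(d), e2_(d)); }
};

template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::greater_equal<double> >
operator>=(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::greater_equal<double> >(e1, e2);
}
template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::less_equal<double> >
operator<=(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::less_equal<double> >(e1, e2);
}
// Note: overloading && gives up the built-in short-circuit evaluation.
template <class ExprT1, class ExprT2>
BinaryExpr<ExprT1, ExprT2, std::logical_and<double> >
operator&&(ExprT1 e1, ExprT2 e2) {
    return BinaryExpr<ExprT1, ExprT2, std::logical_and<double> >(e1, e2);
}

long countInRange() {
    std::list<int> l;
    l.push_back(-5); l.push_back(50); l.push_back(150); l.push_back(100);
    Identity x;
    return std::count_if(l.begin(), l.end(), x >= 0 && x <= 100);
}
```

One design caveat: an overloaded **operator&&** cannot short-circuit, unlike the built-in operator, which is usually acceptable for side-effect-free numeric predicates.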

### References

Quite a number of expression template libraries are available for free download. Some of these libraries are mentioned in the list of references below. However, this list is by no means comprehensive or representative. If you are interested in further information on expression templates, we recommend one of the directories ([5, 6]).

[1] Thomas Becker. “STL & Generic Programming: C++ Template Metaprogramming,” *C/C++ Users Journal*, August 2002.


[2] Thomas Becker. “STL & Generic Programming: More on C++ Template Metaprogramming,” *C/C++ Users Journal*, October 2002.


[3] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.
*Design Patterns: Elements of Reusable Object-Oriented Software* (Addison-Wesley,
1995).


[4] T. Veldhuizen. “Expression Templates,” *C++
Report*, June 1995, pp. 26-31. The article has been reprinted in the book
*C++ Gems* edited by Stanley Lippman (SIGS Books & Multimedia, 1996).


[5] Research Centre Jülich, <www.fz-juelich.de/zam/cxx/>.
An impressive directory of C++ resources such as books, articles, FAQs, other
C++ pages, compilers, libraries, etc. See in particular the links to other C++
libraries at <www.fz-juelich.de/zam/cxx/extmain.html#lib>.


[6] The Object-Oriented Numerics Page, <www.oonumerics.org/oon/#libraries>.
A directory of links to freely available numeric libraries.


[7] The Blitz Project, <http://oonumerics.org/blitz/>. A C++ class library for scientific computing that uses template techniques to achieve high performance.

[8] PETE (Portable Expression Template Engine), <www.acl.lanl.gov/pete/>. A portable C++ framework to easily add powerful expression templates.

[9] POOMA (Parallel Object-Oriented Methods and Applications), <www.acl.lanl.gov/pooma/>. An object-oriented framework for applications in computational science requiring high-performance parallel computers.

[10] MET (Matrix Expression Templates), <http://met.sourceforge.net/>. A C++ matrix class library. Overloaded C++ operators let one write simple code such as **u = m*v + w**.

[11] MTL (Matrix Template Library), <www.osl.iu.edu/research/mtl/>. A high-performance generic component library that provides linear algebra functionality for a wide variety of matrix formats. Uses an STL-style approach. Provides generic algorithms corresponding to the mathematical operations that define linear algebra. Containers, adaptors, and iterators are used to represent and to manipulate matrices and vectors.

[12] Jeremy G. Siek and Andrew Lumsdaine. “The Matrix Template Library: A Generic Programming Approach to High Performance Numerical Linear Algebra,” <www.lsc.nd.edu/downloads/research/mtl/papers/iscope_final.pdf>.

[13] FACT! (Functional Additions to C++ through Templates and Classes), <www.kfa-juelich.de/zam/FACT/start/index.html>. FACT! is a library that offers lambda expressions built on top of PETE. By using FACT!'s lambda syntax, the programmer can define generic functions for use with STL on the fly.

[14] FC++ (Functional Programming in C++), <www.cc.gatech.edu/~yannis/fc++/>. FC++ is a library for functional programming in C++, which you can use to define your own higher-order polymorphic functions.

[15] BLL (The Boost Lambda Library), <www.boost.org/libs/lambda/doc/index.html>. BLL is a C++ template library that implements lambda abstractions for C++. The primary motivation for BLL is to provide flexible and convenient means to define unnamed function objects for STL algorithms.

[16] Phoenix (A parser used by Spirit), <http://spirit.sourceforge.net/index.php?doc=docs/phoenix_v1_0/index.html>. Phoenix is a framework that opens up functional programming techniques such as Lambda (unnamed functions) and Currying (partial function evaluation), similar to FC++ and BLL.

### Acknowledgements

Our thanks to Gary Powell, who brought FACT!, FC++, BLL, and Phoenix to our attention.

### About the Authors

**Angelika Langer** works as an independent freelance trainer/consultant.
Her main area of interest and expertise is object-oriented software development
in C++ and Java. She can be reached at **langer@camelot.de**.

**Klaus Kreft** is a senior consultant at Siemens Business Services in Germany. He can be reached at **klaus.kreft@siemens.com**.