Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Loops, Metaloops, & C++


August, 2004: Loops, Metaloops, & C++

Roshan Naik is a software-design engineer for Hewlett-Packard. He can be contacted at [email protected].


Metaprogramming is about writing code that performs some task at compile time. You could optimize code (loop unrolling, for instance), compute a value (factorial of a number), manipulate compile-time data structures (such as type lists), or perform compile-time selection of appropriate algorithms to be executed at runtime. But no matter how useful metacode might be, there are restrictions on what can be done with it. The biggest hurdle to the endemic use of meta code is that most information needed to perform computations is only available at runtime.

Selection and looping have become the two key cornerstones of the Turing/von Neuman model of computation. Both these ideas can be simulated in C++ metacode using partial specialization and recursion in templates.

Metaloops are implemented in C++ using recursive template instantiation. Typically, implementing metaloops requires the creation of a primary template class to perform recursion and a specialization to end the recursion. This pattern is implemented each time a metaloop is needed. If you can decouple the logic of creating a loop from the computation performed within it, then you can reuse the code that implements the looping logic.

It is somewhat common these days to encounter a template named something like If<> or IfThenElse<> that handles compile-time selection [1]. This does not seem to be the case for metaloops, however. Metaloops are fundamentally more complex than metaselection. It is difficult to have a general-purpose looping construct that can satisfy every need effectively. Sometimes less general-purpose solutions can be more effective. In this article, I'll examine ideas around encapsulating metaloops and consider their usefulness.

Taking a First Stab

Listing 1 demonstrates the traditional-style metacode to perform looping. The code uses metalooping to print all numbers in an inclusive range.

The first class is the primary template that provides the recursion, and the second template, which is a specialization of the first, is used to terminate the recursion. The primary template does one iteration worth of work (in this case, print an integer), then moves on to the next iteration by recursively instantiating itself. The template argument start is incremented prior to each recursive instantiation. The recursion continues until the value of start is equal to the value of end. At this point, the specialization kicks in and ends the recursion. The template parameter start implements a loop counter, and end marks the inclusive upper limit for start.

It would simplify things if you could package this metalooping logic into some kind of library construct (let's call it Loop) that can be used as in Listing 2. Here, Loop<> repeatedly invokes the function or function object passed to it with sequential numbers in the range [1,20]. You can see in Listing 1 that the loop's body is tightly coupled with the looping logic, but in Listing 2 these have been decoupled using the Loop<> template. The loop's body is implemented as a unary function (or function object) and passed to Loop<>. Assuming the compiler can inline call to print, the code is unrolled into 20 inline cout statements, in both Listings 1 and 2. Metaloops are automatically unrolled at compile time. You should be able unroll a loop as long as you can deterministically tell at compile time (within the limits of the language) exactly how many times the loop needs to be executed. In reality, this statement is a simplification and does not tell the whole story. You may have already figured out that there can be more involved with loops than simple loop counters. Consequently, Loop<> is not always going to fit your needs.

Loop<>'s implementation (Listing 3) is straightforward, and is based on the traditional metalooping technique. A more robust implementation would also include an overloaded version of the eval member function taking a const T& expr parameter to allow use of const function objects as arguments.

Loop<> provides basic flexibility in specifying the start and end values of the range for the "loop counter," and makes this loop counter available to each iteration. Being a sophisticated template programmer, you try to use Loop<> for all your metalooping needs and soon run into its limitations. You may like to use nested metalooping to unroll a loop that walks through your two-dimensional array and initializes each element. Listing 4 implements Loop2<>, which adds support for nested looping (one level only).

Using Loop2<> for nested loops is straightforward. Listing 5 initializes a two-dimensional array with random values. The ternary function elem_init is used to implement the loop's body. It assigns a random value to the specified element of an array. You first bind the 5×6 array my_array to the first parameter of the elem_init function using boost::bind (from the Boost library), then use the resulting binary function object as the outer loop's body. The outer loop binds a value to the first parameter and passes the resulting unary function to the inner loop. The inner loop finally executes the loop's body. To initialize a 5×6 array, you need an outer loop to generate array indices in the range [0-4] and an inner loop to generate array indices in [0,5]. Loop2<0, 4, Loop2<0, 5> >::eval(....) provides you with this nested looping infrastructure.

What if you need to increment the value of your loop counter by some arbitrary amount (say 2, to print every alternate number in the range)? In that case, you need to be able to customize the increment logic. What if you require more "state" than simply an integral loop counter? For instance, you may want to have typelists or some other complex compile-time data structure as part of the loop's state. This means that you need to replace the loop counter with some custom-defined state and the looping condition should be customizable, too. You should be able to implement a compile-time predicate that is aware of your custom-defined state and use this predicate to control the loop. So is Loop<> any good? If your needs are restricted to simple loop unrolling instead of sophisticated compile-time algorithms (which I feel is true more often), then Loop<>, or a custom variant of it, can be a simple, effective, and easy-to-use tool.

Compile-Time Selection Techniques

There are several useful techniques for performing compile-time selection. These will be required for implementing more advanced looping schemes. In Listing 6, If_else<> demonstrates a typical implementation of one compile-time selection technique. The first parameter b is defined as a Boolean value and the remaining two are type parameters. If b is true, then If_else<>::result is typedefed to the first type; otherwise, to the second type. In Listing 7, you can see If_else<> being used. It first defines a simple predicate, isSmall, which verifies if a type has size less than 2. (Later on, the predicate is used along with the If_else<> to determine the type for foo.)

A common optimization to the aforementioned technique relies on C++'s rule dealing with instantiation of members of a template class. In short, the rule says that the compiler instantiates a member of a template class only when (and if) it is explicitly referenced. This is different from the rule for nontemplate classes, where all members are instantiated along with the class. The declaration of foo in Listing 7 instantiates A<T1>, A<T1>:: result, B<T1>, B<T1>:: result, isSmall<T1>, isSmall<T1>::result, If_else<...>, and If_else<...>::result. The order of instantiation should also to be noted. isSmall, A, and B will be instantiated in an unspecified order prior to instantiating If_else<>.

In Listing 7, you don't really need to instantiate both A<T1>::result and B<T1>::result, since you will only use one or the other. For simple cases, it's a nonissue, but sometimes it can be expensive in terms of compilation time. A<T1>::result and B<T1>::result could both be another If_else<>, which, in turn, contains several more levels of selection. Or it could have been some expensive metacode dealing with long typelists. To get around redundant instantiation of template members, you need to delay any referencing of those members until it is absolutely required. C++ Templates, by David Vandevoorde and Nicolai Josuttis (Addison-Wesley, 2002), provides an analysis of how compile time is reduced by taking advantage of delayed template member instantiation using square root calculation as an example.

Listing 8 shows a modified version of the If_else template called If. References to A<T1> and B<T1>'s member result has been tucked inside the If template in Listing 8. Since the first argument to If<> is false, the template specialization If<false, T, F> kicks into play and references only B<T1>::result. Consequently, A<T1>::result is not referenced or instantiated. This selection technique implies that the two type parameters T and F need to provide a member type named result. In some cases, you may want to pass simple types such as int and float that do not have such members. This is can be solved by using a simple adapter class template that wraps the nonconformant type.

template<typename T>  struct wrap {  typedef T result; };
typedef If_else <  false , 
                    wrap<int> , 
                    wrap<char>  >::result  bar;

If<> selects one of two types based on a Boolean value. You may sometimes need to select at compile time a piece of code to execute at runtime, based upon a Boolean value:

template <bool b , typename T, typename F>
struct If_exec
{
    static void exec() {
        T::exec();
    }
};
template <typename T , typename F>
struct If_exec <false , T , F>
{
    static void exec() {
        F::exec();
    }
};

If_exec<>::exec() invokes T::exec() if b is true, or else F::exec(). Since operator() cannot be defined as a static member function, you require the function object to define a static member function called exec instead. If_exec<> turns out to be a very useful tool for unrolling loops that need to conditionally execute some code in each iteration.

A Grand Looping Scheme

Listing 9 is a solution that overcomes Loop<>'s limitations. The template class For models the language-provided for looping construct. Figure 1 shows the five different aspects that make up a for construct in C/C++.

You can simplify things by performing the initialization just prior to invoking For<>. That leaves you with four aspects to deal with, instead of five, when implementing For<>. It is up to users to define each of these four aspects and pass them as template parameters to For<>. For<> takes these and orchestrates them into a metaloop resembling the runtime for. Contrast this with Loop<>, which defined most of these policies internally.

For<> consists of two members. The first member is a type called result and the second is a function called exec(). Member function exec() is used for loop unrolling type tasks. The member type result is used for metaloops that produce either a constant value (such as factorial of a number) or a type (such as a typelist sorted on type size) at compile time. result and exec() provide two independent services and do not depend on each other.

Input to For<> is comprised of the looping condition (condition), body of loop (body), the increment logic (increment), and the preinitialized state (state). body performs one iteration of a loop. For<>::exec() requires body to define a static member function, also called exec(), that invokes zero or more (runtime) executable statements. On the other hand, For<>::result requires body to define a member type, called result, representing the output of the body's actions on the loop's state. For<> always requires body to be a template that has only one type of argument (which will be the loop's state). Both condition and increment take state as an argument.

Here's how For<>::result does its job: body and increment are expected to operate on the state argument and make the new state output available in their result member. condition, on the other hand, only runs some checks on state and provides a true/false value in its result. state is an arbitrary, user-defined data type whose internals are known only to condition, increment, and body. For<> takes the results of their operations and orchestrates the loop.

For<> performs the following steps (similar to for barring initialization). Again, For<> mimics the steps preformed by the for construct (barring initialization):

  1. if (looping condition is true).
  2. a. execute body (passing in state)
  3. b. execute increment (passing in the state obtained by executing body)
  4. c. continue on to next iteration
  5. else leave state unmodified and end the loop

For<> employs If<> to make decisions in steps 1 and 2. For steps 1(a) and (b), it uses the body and increment parameters, respectively. And for step 1(c), a recursive reference to For<> is used. When making this recursive reference, condition, increment, and body are passed on without any change. state, on the other hand, is expected to change at the end of each iteration. The new state for the next iteration is increment< body<state>::result >::result. This essentially executes body first, then increment on state to obtain the new state. If neither increment nor body modify state, you could end up with infinite recursion, as condition<state>::result produces true ad infinitum. Next, iteration is pursued only if condition<state>::result is true; else, the current state is returned as the final result of the looping operation. Since state can be any arbitrary type, wrap<> adapts state's interface to If<>.

There are a couple of interesting details to note in the implementation of For<>. First is the order of instantiation of templates. All arguments to a template are instantiated (in an unspecified order) prior to instantiating the template itself. Thus, the compiler instantiates state, then condition<state>, and finally condition<state>::result. Similarly, template arguments to the recursive For<> reference are instantiated prior to that For<> reference. There is no guarantee on which of three arguments to If<> is instantiated first, and For<> does not rely on their order of instantiation. But it does rely on the body operation being "executed" (that is, body<...>::result being instantiated) prior to the increment operation and making body's result available to increment. Making body<state>::result an argument to increment achieves the desired behavior.

The second interesting detail to note is the use of If<> to delay the instantiation of For<>::result. Consider what would happen if you used If_else<> instead of If<> as follows:

If_else< condition<...>::result , For<...>::result , wrap<...>::result >, 

This would cause the compiler to instantiate For<...>::result and wrap<...>::result upfront, regardless of the value of condition<...>::result. Instantiating For<...>::result at this point means an attempt to compute the next iteration. This creates an infinite dependency chain where each iteration requires the next iteration to be computed, all along ignoring condition<...>::result's indications to stop/continue the recursion.

Listing 10 is a metaprogram using For<> to compute square roots. The template class sqroot makes use of For<> to implement its metalooping. The algorithm searches for the square root of a number N in the range [0,N]. In each iteration, the range is halved and the search continues in either the first half or the second half. If the square of the midpoint of this range is greater than or equal to N, then the search continues in the first half; otherwise, in the second half. Four templates need to be defined to implement the various aspects of the loop. sqrt_state holds the number (num) whose square root is to be found and two variables (low and high) marking a range in which the search for the square root is to be carried out. sqrt_body updates this range on each iteration of the loop. sqrt_cond checks to see whether the state::low has crossed state::high, in which case you have found the answer. In this example, sqrt_incr leaves the state unmodified. You need to compare this For<>-based solution to a solution that uses hand-implemented metaloops. Comparing Listings 10 and 11, you can clearly see that the For<>-based solution needed more lines of code. It did the looping for me, but I ended up writing more code to define the different arguments it needed. A bit of a shock when you consider that Loop<>, which did very little, bought a lot more, where applicable, than the "wonderfully customizable" For<>.

Nesting

An interesting point to note about For<> is that its high degree of generality automatically supports the creation of arbitrarily deep levels of nested loops. Listing 12 illustrates the general idea. Nesting is achieved by passing another For<> in the body argument. The inner and outer loops are required to share the same state as the outer loop passes its state to the inner loop. Each loop generally has its own unique increment, condition, and body. When defining the inner loop, you do not explicitly pass any state to the inner loop. You pass state only to the outer loop, which, in turn, passes state to the inner loop on each iteration of the outer loop. Defining the inner loop without the state requires simulating template typedefs (forthcoming in C++0x), which lets you take an existing template with multiple parameters and bind only some of the parameters to arguments. In Listing 12, inner_loop<> binds all parameters of the inner For<> loop except state. Nesting is then achieved by passing inner_loop<> to a For<> in the body argument. Deeply nested metaloops is one area where For<> can reduce your lines of code and at the same time improve readability.

Conclusion

A general-purpose tool like For<> is difficult to implement and likely to have usability issues for simple loops. But simple tools such as Loop<>, having relatively fewer areas of application, are fairly easy to invent and use. It is likely that a small collection of such simpler classes stand a good chance of tackling most (if not all) of your metalooping needs better than a general-purpose class such as For<>.

Template meta code often gets very dense due to the syntactic requirements and, in some cases, the need to work around some language limitations. Classes that encapsulate compile-time selection can help in reducing code without introducing overhead. One issue to keep in mind when writing meta code that uses recursive templates is the depth of recursion. Different compilers support different depths of recursively nested template instantiations. The C++ Standard "recommends" (in Annex B) a minimum depth of 17 to be supported. Fortunately, many compilers report a clear error message once their recursion limit has been reached.

Acknowledgments

Thanks to Julius Gawlas and Vijay Krishna for their help on this article.

References

  1. [1] Boost's MPL library (http://www.boost.org/libs/mpl/doc/) is one place to find a class similar to the template class If<> described in this article. There does not seem to be a template equivalent to If_exec<> in Boost.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.