Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Standard C/C++


August 1996/Standard C/C++

Standard C/C++

P. J. Plauger

Algorithms

STL lets you apply algorithms to collections of objects, even if you don't understand the objects or how they are contained. Becoming familiar with STL's stock algorithms is well worth the effort.


Introduction

Algorithms are the workhorses of the Standard Template Library. They take the form of numerous template functions. Thus, algorithms are the part of STL that most closely resembles a traditional function library. The major difference, as I keep emphasizing, is that these are template functions. They are not supplied as a linkable library of precompiled object modules. Rather, they are defined completely in the STL headers. You can specialize each template function numerous ways, greatly enhancing its usefulness as a generic program component.

With rare exceptions, these template functions use iterator arguments to manipulate sequences. That is why I have devoted my initial columns on STL to a detailed presentation of iterators, dry as that topic may be at times. You are now better prepared to appreciate the implications of all these algorithms, knowing as you do the broad potential of iterators.

The template functions are fairly self-contained. You can drop them into all sorts of places with remarkably little additional supporting code, particularly if the iterators you use are conventional object pointers. Thus, the machinery I present over the next few months should be more quickly and obviously useful in any C++ code that you write.

These algorithms work well with the STL containers I describe later on. Indeed, many of the member functions in container template classes use them to advantage. But please note, it is not necessary to understand containers to use these algorithms. None of the algorithms use the STL containers directly. Rather, they manipulate the sequences controlled by container objects, using iterators supplied by the member functions associated with those objects. That is why I save containers for last. You can understand algorithms without knowing about containers, but it is harder to describe containers without a knowledge of algorithms.

I have already presented a handful of algorithms in earlier headers. Examples are uninitialized_copy, uninitialized_fill, and uninitialized_fill_n in the header <memory>. (See "Standard C/C++: The Header <memory>,'' CUJ, July 1996.) Two headers define the remaining algorithms: <algorithm> and <numerics>.

Listing 1 shows all the template functions declared in the header <algorithm>. As you can see, it is quite a long list. Listing 2 shows all the template functions declared in the header <numeric>, which is much more reasonable in size. Don't ask me why the algorithms are distributed among headers so unevenly -- I can't justify the choices made within the standards committees here.

Function Objects

One additional header, <functional>, defines a number of template classes that describe function objects, which can be used to advantage with many algorithms.

A critical part of many algorithms is a certain test that it must perform. For example, to find the smallest element of a sequence requires some definition of the predicate "smallest.'' More specifically, the algorithm must repeatedly determine the boolean result of an expression such as smaller(x, y), where x and y are elements of the sequence. An obvious solution in this case is to use the expression x < y -- and indeed many algorithms do so. You could, for example, declare the template function:


template<class FwdIt>
    FwdIt smallest(FwdIt first,
                   FwdIt last);

which returns the iterator that designates the smallest element in the interval [first, last), using operator< to compare elements.

But it seems a pity to hardwire such a critical aspect of the algorithm, particularly when the template machinery allows so much latitude in accessing the sequence itself. That's where function objects come in. A function object doesn't necessarily store any data. Like the iterator tags described in <iterator>, the type itself conveys much of the needed information. Here is one way to declare a function object f:

struct smaller_int {
    bool operator()(int x, int y)
        {return (x < y); }
    } f;

You can treat f for all the world as the name of a function. If the signature is correct, for example, the expression f(x, y) calls the member function defined for smaller_int.

Thus, you can declare a version of the template function smallest that looks something like:

template<class FwdIt, class Pred>
    FwdIt smallest(FwdIt first,
                   FwdIt last,
                   Pred pr);

and supply what you mean by smallest when you actually call the function, as in:

int get_smallest(int *first,
                 int *last)
    {return (*smallest(first, last,smaller_int())); }

The third argument generates a trivial object whose sole purpose is to determine the actual type of the parameter Pred when the template function is specialized.

Note, by the way, that this third argument can be a pointer to a function. You could rewrite the example as:

bool is_smaller_int(int x, int y)
    {return (x < y); }
int get_smallest(int *first, int *last)
    {return (smallest(first, last, &is_smaller_int)); }

(The ampersand is optional.) Both C and C++ allow the expression pfn(x, y) as an alternative to the sometimes more revealing form (*pfn)(x, y), so the template specialization produces valid code.

Function objects have several advantages over pointers to functions:

  • A function object can define multiple overloads.
  • It can make use of its stored value when the function is called.
  • It can be replaced by a function pointer as needed.

An algorithm characterized by some critical predicate typically occurs in two forms in the Standard Template Library, much like the template function smallest shown above. One form is hardwired with the most likely form of the predicate. A second form is parametrized on the type of an added function object argument.

The net effect of this approach is to nearly double the volume of code required to implement algorithms. But the payoff is that you get slightly more compact and efficient code for the most used predicates, without sacrificing the power that comes with using function objects when you need them.

You will find that function objects can be very powerful indeed.

Common Terms

The descriptions of algorithms in the months to come employ several shorthand phrases:

  • The phrase "in the range [A, B)'' means the sequence of zero or more discrete values beginning with A up to but not including B. A range is valid only if B is reachable from A -- you can store A in an object N (N = A), increment the object zero or more times (++N), and have the object compare equal to B after a finite number of increments (N == B).
  • The phrase "each N in the range [A, B)'' means that N begins with the value A and is incremented zero or more times until it equals the value B. The case N == B is not in the range.
  • The phrase "the lowest value of N in the range [A, B) such that X'' means that the condition X is determined for each N in the range [A, B) until the condition X is met.
  • The phrase "the highest value of N in the range [A, B) such that X'' usually means that X is determined for each N in the range [A, B). The function stores in K a copy of N each time the condition X is met. If any such store occurs, the function replaces the final value of N (which equals B) with the value of K. For a bidirectional or random-access iterator, however, it can also mean that N begins with the highest value in the range and is decremented over the range until the condition X is met.
  • Expressions such as X - Y, where X and Y can be iterators other than random-access iterators, are intended in the mathematical sense. The function does not necessarily evaluate operator- if it must determine such a value. The same is also true for expressions such as X + N and X - N, where N is an integer type.

Several algorithms make use of a predicate that must impose a strict weak ordering on pairs of elements from a sequence. For the predicate pr(X, Y):

  • "strict'' means that pr(X, X) is false
  • "weak'' means that X and Y have an equivalent ordering if !pr(X, Y) && !pr(Y, X) (X == Y need not be defined)
  • "ordering'' means that pr(X, Y) && pr(Y, Z) implies pr(X, Z)

Some of these algorithms implicitly use the predicate X < Y. Other predicates that typically satisfy the "strict weak ordering'' requirement are X > Y, less(X, Y), and greater(X, Y). (The last two expressions involve function objects defined in <functional>.) Note, however, that predicates such as X <= Y and X >= Y do not satisfy this requirement.

A sequence of elements designated by iterators in the range [first, last) is "a sequence ordered by operator<'' if, for each N in the range [0, last - first) and for each M in the range (N, last - first) the predicate !(*(first + M) < *(first + N)) is true. (Note that the elements are sorted in ascending order.) The predicate function operator<, or any replacement for it, must not alter either of its operands. Moreover, it must impose a strict weak ordering on the operands it compares.

A sequence of elements designated by iterators in the range [first, last) is "a heap ordered by operator<'' if, for each N in the range [1, last - first) the predicate !(*first < *(first + N)) is true. (The first element is the largest.) Its internal structure is otherwise known only to the template functions make_heap, pop_heap, and push_heap. (defined in <algorithm>.) As with an ordered sequence, the predicate function operator<, or any replacement for it, must not alter either of its operands, and it must impose a strict weak ordering on the operands it compares.

Using Algorithms

You use the STL algorithms much like you use a conventional function library. Include the header that defines the template function you wish to use. Be warned, however, that the failure modes for an ill formed program can be quite different. If you call a conventional function incorrectly, the translator either issues a diagnostic or quietly converts one or more arguments to the expected types. The latter case can lead to curious runtime behavior, but it is the sort of curiosity that modern debuggers are designed to help pinpoint.

If you specialize a template function incorrectly, the translator may do the same, or it may choose a wrong type for a template parameter. Such an incorrect type can generate truly mysterious diagnostics, particularly for a template that specializes other templates. With multiple specializations of a template, even runtime debugging can be a challenge.

To mitigate this problem, the single most important thing you can do is minimize the use of implicit type conversion in function argument expressions. This is generally a good idea anyway, particularly in the presence of multiple function overloads. The type of an argument expression can influence the choice of overloads, possibly incorrectly if no exact match exists for the argument type. You are fortunate if the translator reports an ambiguity, rather than guessing wrong.

A template function can use the argument type to determine a template parameter type, so the opportunities for going astray are all the more varied. The code present makes judicious use of type casts, in argument expressions, to minimize the likelihood that the type will prove incorrect. I encourage you to adopt a similar style.

Coming Soon

Starting next month, I will devote several successive columns to presenting all the algorithms defined in the headers <algorithm> and <numeric>. As you can see, they are quite numerous. Fortunately, many are small, and their descriptions are largely self contained. Bear with me.

P.J. Plauger is senior editor of C/C++ Users Journal. He is convener of the ISO C standards committee, WG14, and active on the C++ committee, WG21. He developed Lib Suite ++, the Plum-Hall validation suite for the Standard C++ library. His latest books are The Draft Standard C++ Library, Programming on Purpose (three volumes), and Standard C (with Jim Brodie), all published by Prentice-Hall. You can reach him at [email protected].


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.