Channels ▼
RSS

<algorithm>: find


August, 2005: <algorithm>: find

Bjorn is author of Beyond The C++ Standard Library: An Introduction to Boost. He can be contacted at [email protected]


In this series of articles, I focus on clause 25 of the C++ Standard, which is the Algorithms Library. These algorithms are designed to work with iterators from the standard containers [2] (std::vector, std::map, and so forth) and other sequences. The first part of this series of articles covers the nonmodifying sequence operations, the second part talks about the mutating sequence operations, and the third discusses sorting and related operations. These divisions are named after the categorization of algorithms defined in the C++ Standard. The topic of this article is an algorithm called find. Its purpose is to find a specific element in a sequence.

Here is how the C++ Standard defines find() and find_if():

25.1.2 Find [lib.alg.find]
  template<class InputIterator, class T>
    InputIterator find(
      InputIterator first, 
      InputIterator last,
      const T& value);

  template<class InputIterator, class Predicate>
    InputIterator find_if(
      InputIterator first, 
      InputIterator last,
      Predicate pred);

1 Requires: Type T is EqualityComparable (20.1.1).
2 Returns: The first iterator i in the range [first, last) for which the following corresponding conditions hold: *i == value, pred(*i) != false. Returns last if no such iterator is found.
3 Complexity: At most last - first applications of the corresponding predicate.

Header: <algorithm>
Namespace: std
Category: Non-modifying sequence operation

The find() function expects three arguments: an iterator first to the first element of the sequence, an iterator last to the last element of the sequence, and an element that is used for finding a match, using the element type's equality operator (operator==). Note that the element (if such an element exists—it is common to pass a past-the-end iterator for the container) for the last iterator is never dereferenced, nor is it tested for equality with value. find() returns an iterator to the first matching element, or the iterator last if the element could not be found.

The find_if() function also expects three arguments: an iterator first to the first element of the sequence, an iterator last to the last element of the sequence, and a unary predicate that is used for finding a match, using the predicate's function call operator. find_if() returns an iterator to the first matching element, or the iterator last if the element could not be found.

For a find() example, suppose you have a container with elements of type std::string. You need to test whether one of the container's elements equals "Hammerfall." Here's how you can use std::find() for that task:

#include <deque>
#include <algorithm>
#include <string>
#include <iostream>

int main()
{
  typedef std::deque<std::string> stuff_container;
  typedef stuff_container::iterator iterator;

  stuff_container stuff;

  // Code for adding "stuff" omitted
  
  // Here comes the call to std::find
  iterator it=std::find(
    stuff.begin(),
    stuff.end(),
    "Hammerfall");
  
  if (it!=stuff.end())
  {
    // Found it!
    std::cout << "Hammerfall was found!\n";
  }
  else
  {
    // Didn't find it
    std::cout << "Couldn't locate Hammerfall.\n";
  }
}

The call to find() passes an iterator for the first element of the deque instance stuff as the beginning of the range, a past-the-end iterator for stuff as the end of the range, and an implicitly created std::string with the value "Hammerfall" as the third argument. If one of the elements equals "Hammerfall," find() will return an iterator to that element. If not, the iterator last (in this case, the past-the-end iterator) is returned.

Figure 1 shows a conceptual view of how an std::deque with four elements (with values "Rainbow," "Metallica," "Hammerfall," and "Iron Maiden") could be searched for an element with the value "Hammerfall" using std::find().

For a find_if() example, suppose you have a container with elements of type std::string. You need to test if there exists a string longer than five characters. Here's how you can use std::find_if() with a special predicate:

#include <deque>
#include <algorithm>
#include <string>
#include <iostream>

// Predicate for determing if a string is longer than N characters
template <int N> bool 
  string_length_greater_than(const std::string& s)
{
  return s.size()>N;
}

int main()
{
  typedef std::deque<std::string> stuff_container;
  typedef stuff_container::iterator iterator;

  stuff_container stuff;

  // Code for adding stuff omitted

  // Find a string with size>5
  iterator it=std::find_if(
    stuff.begin(),
    stuff.end(),
    &string_length_greater_than<5>);
  
  if (it!=stuff.end())
  {
    // Found it!
    std::cout << 
      "This string is more than 5 characters long: " << 
        (*it) << '\n';
  }
  else
  {
    // Didn't find it
    std::cout << "Couldn't find such a long string.\n";
  }
}

The call to find_if() passes an iterator for the first element of the deque stuff as the beginning of the range, a past-the-end iterator for stuff as the end of the range, and a pointer to the function string_length_greater_than<5>(). If one of the elements in stuff has more than five characters, find_if() will return an iterator to that element. If not, the iterator last (in this case, the past-the-end iterator) is returned.

Basic find

For a less contrived example, suppose that you have a container with elements of type programmer. You need to find one specific element, a programmer with the name "Joe the Programmer." That can be accomplished using std::find(). You use find() together with an element that is equal to the one you're searching the sequence for, like you saw in the preceding example using find(). However, if you don't have such an element, or if the search needs to be performed using a relation that is not defined by operator==, you need to use find_if(). The following implementation defines the class programmer in a namespace [3] called programming, together with an enumeration, programming_language, used for identifying different programming languages.

#include <list>
#include <algorithm>
#include <string>
#include <iostream>

namespace programming
{
  // An enumeration for popular
  // programming languages
  enum programming_language
  {
    cpp,
    java,
    python,
    perl
  };

  // A class representing a programmer
  class programmer
  {
    std::string name_;
    programming_language primary_language_;
  public:

    // Constructor
    programmer(
      const std::string& name,
      programming_language primary_language)
      : name_(name),
        primary_language_(primary_language)
    {
    }

    // Get the programmer's name
    const std::string& name() const
    {
      return name_;
    }

    // Get the primary programming language
    programming_language primary_language() const
    {
      return primary_language_;
    }
  };

  bool operator==(const programmer& lhs,const programmer& rhs)
  {
    return 
      lhs.name()==rhs.name() &&
      lhs.primary_language()==rhs.primary_language();
  }
}

// Start of the program
int main()
{
  typedef std::list<programming::programmer> programmer_container;
  typedef programmer_container::iterator     iterator;

  programmer_container programmers;

  // Code for initially adding (hiring) programmers omitted

  // Find Joe the Programmer
  programming::programmer joe("Joe the Programmer",programming::cpp);
  
  // Here comes the call to std::find
  iterator it=std::find(
    programmers.begin(),
    programmers.end(),
    joe);
  
  if (it!=programmers.end())
  {
    // Give Joe a raise
    std::cout << 
      "Hey Joe, I've got good news!\n";
  }
  else
  {
    // Joe seems to have quit his job
    std::cout << 
      "Where can I find a great C++ programmer these days?\n";
  }
}

The call to find() passes an iterator for the first element of the list instance programmers as the beginning of the range, a past-the-end iterator for programmers as the end of the range, and a programmer instance that identifies Joe. For this to work, there must be a way for find() to determine element equality, and in C++, this is done using operator==. We say that classes that support operator== are EqualityComparable [4], which is a requirement for elements used with the algorithm find_if(). Therefore, we had to define such an operator for comparing programmers:

bool operator==(const programmer& lhs,const programmer& rhs)
{
  return 
    lhs.name()==rhs.name() &&
    lhs.primary_language()==rhs.primary_language();
}

The operator doesn't need to be made a friend of programmer, although that is very common for operators, because programmer provides the information needed to calculate equality through the public interface—through the member functions name() and primary_language(). This operator makes it possible for find() to test each element in the range [first,last), and return an iterator to the first element for which operator== returns true.

Sometimes the preceding solution doesn't work, because you might not be able to create a programmer from scratch and pass that to find(). Furthermore, you might need to use another criteria for what constitutes a match for find(), for example, to find a programmer that uses C++, regardless of his or her name. That question is not answered by operator==, and we need to turn to find_if() for a solution. Because find_if() accepts a predicate, you can easily implement a function used for finding C++ programmers:

#include <functional>
namespace programming
{
  // Enumeration for programming languages omitted
  // Definition of programmer omitted
  // Definition of programmer's operator== omitted
  
  // A predicate for finding a certain type of programmer
  class programming_language_finder : 
    public std::unary_function<programmer,bool>
  {
    programming_language language_;
  public:
    programming_language_finder(programming_language language)
      : language_(language)
    {
    }

    bool operator()(const programmer& p) const
    {
      return p.primary_language()==language_;
    } 
  };
}

With this unary predicate, you can use std::find_if() to locate programmers with any primary programming language. The function object inherits from the template class std::unary_function, available through the header <functional>. unary_function requires two template arguments—the first is the argument type for the function call operator, and the second is the return type of the function call operator. For the programming_language_finder function object, the argument type is programmer, and the return type is bool, and so the derivation looks like this:

class programming_language_finder : 
  public std::unary_function<programmer,bool>

If you're wondering what the reason is for deriving from unary_function, hang on a minute. Before we get there, let's have a look at how programming_language_finder can be used. Let's look for C++ programmers, shall we?

#include <list>
#include <algorithm>
#include <string>
#include <iostream>

namespace programming
{
  // Enumeration for programming languages omitted
  // Definition of programmer omitted
  // Definition of programmer's operator== omitted
  // Definition of programming_language_finder omitted
}

int main()
{
  typedef std::list<programming::programmer> programmer_container;
  typedef programmer_container::iterator     iterator;

  programmer_container programmers;

  // Code for adding (hiring) programmers omitted
  
  // Find a C++ programmer
  iterator it=
    std::find_if(
      programmers.begin(),
      programmers.end(),
      programming::programming_language_finder(programming::cpp));
  
  if (it!=programmers.end())
  {
    // Talk to the C++ programmer
    std::cout << "Hey " << (*it).name() << 
      ", done any RAII lately?\n";
  }
  else
  {
    // No C++ programmers? That's awful!
    std::cout << 
      "Where can I find a great C++ programmer these days?\n";
  }
}

The call to find_if() creates an object of type programming_language_finder, passing programming::cpp to the constructor. By passing programming::cpp, you configure the function object to return true when its function call operator is invoked with a programmer argument that has a primary programming language that equals programming::cpp; i.e., a C++ programmer.

But you may want to use this function object in a different way. What if you need to find a programmer that does not have C++ as his/her primary programming language? Do you write a new predicate? No, of course not—or at least you shouldn't need to when dealing with well-behaved function objects. That's the reason why programming_language_finder derives from std::unary_function: to make it adaptable. std::unary_function adds two typedefs to programming_language_finderresult_type and argument_type—that allow other classes to determine the return type and argument type for the function call operator. Now we are able to use another component of the C++ Standard Library, namely std::not1, which is a negator. By applying std::not1 to a programming_language_finder set up to find C++ programmers, you will get a function object that returns true when it is passed a programmer that is not a C++ programmer; i.e., the return value from the function call operator is negated:

#include <list>
#include <algorithm>
#include <string>
#include <iostream>
#include <functional>

namespace programming
{
  // Enumeration for programming languages omitted
  // Definition of programmer omitted
  // Definition of programmer's operator== omitted
  // Definition of programming_language_finder omitted
}

int main()
{
  typedef std::list<programming::programmer> programmer_container;
  typedef programmer_container::iterator     iterator;

  programmer_container programmers;

  // Code for adding (hiring) programmers omitted
  
  // Find a non-C++ programmer
  iterator it=
    std::find_if(
      programmers.begin(),
      programmers.end(),
      std::not1(
        programming::programming_language_finder(
          programming::cpp)));
  
  if (it!=programmers.end())
  {
    std::cout << (*it).name() << '\n';
  }
  else
  {
    // No C++ programmers? That's awful!
    std::cout << 
      "Where can I find a great C++ programmer these days?\n";
  }
}

Remember to include the header <functional> when using one of the predefined function objects, such as std::not1, from the C++ Standard Library.

When using find(), operator== is required for the element type of the container. For find_if(), a suitable function or function object is needed. Always make such function objects adaptable, either through providing the required typedefs—for unary predicates, argument_type and result_type—or by deriving from std::unary_function.

Intermediate find

It is sometimes convenient to be able to define the predicates directly when calling find_if(); i.e., when there is no reasonable predicate to use, and there is little reason to write a separate function object for it, because it won't need to be reused. A particularly nice library that supports creating predicates directly at the call site is Boost.Lambda [5, 6, 7]. Using Lambda, writing predicates can be both terse and extremely readable. For example, suppose you have an std::list<int> that contains values in an unpredictable order. What you need to do is find a value that is greater than or equal to 15, and less then 25. Obviously, plain old find() won't cut it, because this is not a job for operator==. You need to use find_if(), and you need to provide a suitable predicate. Even the binders and the function objects—std::bind1st() and std::bind2nd() for binding, std::greater_equal and std::less for the comparisons—won't help here, because you need to perform two tests: the value should be >=15 and <25. So, the C++ Standard Library can't help with everything here, but Boost.Lambda can. Here's how:

#include <list>
#include <algorithm>
#include <iostream>
#include "boost/lambda/lambda.hpp" // For lambda expressions

int main()
{
  typedef std::list<int> value_container;
  typedef value_container::iterator iterator;

  value_container values;

  // Code for adding values omitted
  values.push_back(10);
  values.push_back(20);
  values.push_back(5);

  using namespace boost::lambda;

  iterator it=
    std::find_if(
      values.begin(),
      values.end(),
      _1>=15 && _1<25);  

  if (it!=values.end())
  {
    std::cout << 
      "Found a matching value: " << (*it) << '\n';
  }
}

The lambda expression that creates the predicate is directly coded as the third argument to find_if():

std::find_if(
  values.begin(),
  values.end(),
  _1>=15 && _1<25);  

The actual lambda expression is _1>=15 && _1< 25. The strange-looking _1 is called a placeholder, and it is used instead of the actual argument, which is supplied for each element in the range when find_if() executes (each element until a match is found, of course). With the help of a using directive [8]—using namespace boost::lambda—the placeholders are brought into the current scope, relieving you from having to type boost::lambda::_1. So, the expression reads, "If the value is greater than or equal to 15 and the value is less than 25, return true. Else return false." Very succinct and to the point. Always remember to include boost/lambda/lambda.hpp when you want to use lambda expressions. Using Boost.Lambda this way is very powerful, and it's possible to do more, as the next example will show. Rather than relying only on relational operators like in the preceding example, you'll see that it is also possible to create lambda expressions that bind to member functions. Reusing the programmer class, you can create a predicate directly at the call site that returns true only when a programmer has either C++ (programming::cpp) or Java (programming::java) as his or her primary programming language: <

#include <list>
#include <algorithm>
#include <string>
#include <iostream>

#include "boost/lambda/lambda.hpp" // For lambda expressions
#include "boost/lambda/bind.hpp"   // For bind expressions

namespace programming
{
  // Enumeration for programming languages omitted
  // Definition of programmer omitted
}

int main()
{
  typedef std::list<programming::programmer> programmer_container;
  typedef programmer_container::iterator     iterator;

  programmer_container programmers;

  // Code for adding (hiring) programmers omitted

  using namespace boost::lambda;

  // Find a programmer that has 
  // C++ of Java as his/her primary programming language
  iterator it=
    std::find_if(
      programmers.begin(),
      programmers.end(),
      bind(&programming::programmer::primary_language,_1)
        ==programming::cpp || 
          bind(&programming::programmer::primary_language,_1)==
            programming::java);

  if (it!=programmers.end())
  {
    // Found a programmer!
    std::cout << "Hey " << (*it).name() <<
      ", feel like programming today?\n";
  }
}

The learning curve for using binders [9, 10] like the one above is much steeper than for lambda expressions involving relational operators and arithmetic operators. Still, it is sometimes very convenient to create lambda expressions like this one:

bind(&programming::programmer::primary_language,_1)
  ==programming::cpp || 
    bind(&programming::programmer::primary_language,_1)==
      programming::java);

In the call to boost::lambda::bind(), the first argument is a pointer to member to programmer::primary_language, the member function that returns the primary programming language. It is required because the expression we are trying to create needs to call that member function. bind creates a function object that delays the function call until it is invoked, and the placeholder _1 is then substituted for the actual argument, which is a programmer. Two similar binders are created: one tests if the programmer's primary programming language is C++, the other if it's Java. If you learn how to read the bind expressions—and that doesn't take long—I think you'll agree that the rest of the expression is straightforward: It's just a placeholder and a logical or.

Lambda expressions are fantastic tools together with the C++ Standard Library algorithms, because they expect a function object to perform operations that are often quite straightforward. When that's the case, creating new classes or functions for those simple tasks is not very tempting, mainly because there is no real opportunity for reusing the function objects, and because they need to be written in another source code location, which makes it a little more difficult to understand the code.

Associative Containers and find

All of the associative containers except std::bitset provide a member function find() that you can use to search for elements. When provided, you should always opt to use the member functions, because they offer much better performance. So, if you are operating on an std::map, std::multimap, std::set, or set::multiset, you should use the member function find(). Of course, if you need to search using another predicate than element equality, you will still need to use the free function find_if(): there is never such a member function, because it is not possible to provide a better performance guarantee than the free function find_if() does. The performance guarantee for find() on the aforementioned containers is logarithmic, compared with the linear complexity of the free function find(). It is worth noting that the member function find() does not allow you to search a part of the container's elements, so if you need to search only a subset of the elements, you must still use the free function. For example, if you have an std::set containing programmers, here is how you should search once again for "Joe the Programmer," this time using std::set's member function find():

#include <set>
#include <string>
#include <iostream>

namespace programming
{
  // Enumeration for programming languages omitted
  // Definition of programmer omitted
  // Definition of programmer's operator== omitted
  // Definition of programming_language_finder omitted

  bool operator<(const programmer& lhs,const programmer& rhs)
  {
     return lhs.name()<rhs.name() &&
            lhs.primary_language()<rhs.primary_language();
  }
}

int main()
{
  typedef std::set<programming::programmer> programmer_container;
  typedef programmer_container::iterator    iterator;

  programmer_container programmers;

  // Code for adding (hiring) programmers omitted
  
  programmers.insert(
    programming::programmer("Joe the Programmer",programming::cpp));
  programmers.insert(
    programming::programmer("Jane the Hacker",programming::perl));

  // Find Joe the Programmer
  programming::programmer joe("Joe the Programmer",programming::cpp);
  
  // Here comes the call to set::find()
  iterator it=programmers.find(joe);
  if (it!=programmers.end())
  {
    // Give Joe a raise
    std::cout << "Hey Joe, I've got good news!\n";
  }
  else
  {
    // Hire a good C++ programmer and give him a hefty salary
    std::cout << 
      "Where can I find a great C++ programmer these days?\n";
  }
}

The diligent reader will have noted that operator< for programmers was added to support storing programmers in std::sets; the reason is that by default, sets compare elements using operator<. That's an interesting observation, because it also means that when searching for elements in one of the associative containers an equivalence relation is used (with the help of operator<) rather than equality (with the help of operator==). Thus, if you are storing elements of a type where equivalence and equality are different, you might still need to use the free function find() even if the performance is worse than the member function.

Related Topics

If you find yourself (no pun intended) calling find() multiple times on the same sequence, it might be much more efficient to sort the sequence (see separate articles on sort() and stable_sort()) and then call lower_bound() followed by a test for equality. The technique of sorting and then calling lower_bound() when doing multiple searches is discussed in the article about lower_bound().

If you need to find one of a set of elements, you should not be using find() with a special predicate, but rather look at the algorithm find_first_of().

Table 1 lists all of the algorithms in find()'s category.

Summary

You have seen how the algorithm find() can help you locate elements without resorting to loops. find() and find_if() are very often used, because the task of locating elements with certain properties is indeed common in everyday programming. You have seen how important it is to have your classes support tests for equality through the operator==, and how easy it is to write predicates for use with find_if(), rather than resorting to loop constructs and defining ad-hoc predicates directly in the loop together with the actual operations on the result. Finally, in the intermediate section, you saw how to use Boost.Lambda for creating the function objects directly at the call site. The use of lambda expressions isn't only useful for find_if(): It has been shown that many of the C++ Standard Library algorithms are good candidates for using them.

Acknowledgments

I have chosen to acknowledge not only the people who have directly contributed to this article, but also those who have made the STL what it is today. There are more than those mentioned, of course, and I appreciate if you are able to make me aware of contributors to which I have failed to give due credit.

  • Alexander Stepanov, David Musser, and Meng Lee: Thank you for creating the STL and its foundation.
  • Andrew Koenig and Bjarne Stroustrup: Thank you for making sure that the STL made it into the C++ Standard.
  • Matthew Wilson, thank you for reviewing the article.

References

[1] The C++ Standard, ISO/IEC 14992-98, often referred to as "C++98," is available for purchase from the ANSI standards store at http://webstore.ansi.org/ansidocstore.
[2] To start learning about Standard Template Libray (STL) containers, I would suggest reading The C++ Standard Library : A Tutorial and Reference, by Nicolai Josuttis.
[3] If you are unsure of how namespaces in C++ work, I recommend that you read Bjarne Stroustrup's immortal tome The C++ Programming Language, 3rd Edition.
[4] EqualityComparable and a number of other important concepts are defined in the C++ Standard. Their meaning is typically obvious, so you can easily figure out which functionality is required for classes to reify these concepts.
[5] Download the Boost libraries at hhttp://www.boost.org/. Boost.Lambda is one of the great libraries in Boost, and you can read more about it at http://www.boost.org/doc/html/lambda.html.
[6] I have written an article that introduces Boost.Lambda for C/C++ Users Journal, and it can be found online at http://www.cuj.com/documents/s=9765/cuj0312karlsson/.
[7] One of the chapters in my book Beyond the C++ Standard Library: An Introduction to Boost talks in detail about Boost.Lambda.
[8] See reference [3].
[9] There are also binders in the C++ Standard Library: std::bind1st and std::bind2nd. I recommend that you don't bother learning how to use them if you don't know how already. Rather, use modern binders, such as the ones in Boost.Lambda.
[10] Another library that contains binders—and nothing but binders—is Boost.Bind, which you can read more about at http://www.boost.org/libs/bind/bind.html.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.