Channels ▼


Efficient Use of Lambda Expressions and std::function

std::function and Its Costs

Another Boost option, std::function, implements a type erasure mechanism that allows a uniform treatment of different functor types. Its predecessor boost::function dates back to 2001 and was introduced into TR1 in 2005 as std::tr1::function. Now, it's part of C++11 and has been promoted to namespace std.

We shall see a few details of three different implementations of std::function and related classes: Boost, the Microsoft C++ Standard library (MSLIB for short), and the GNU Standard C++ Library (a.k.a. libstdc++, but referred to here as GCCLIB). Unless otherwise stated, we shall generically refer to the relevant library types and functions as if they belonged to namespace std, regardless of the fact that Boost's do not. I will cover two compilers: Microsoft Visual Studio 2010 (MSVC) and the GNU Compiler Collection 4.5.3 (GCC) using option -std=c++0x. I'll consider these compilers, compiling their corresponding aforementioned standard libraries, and also compiling Boost. Using std::function, find_root's declaration becomes

double find_root(std::function<double(double)> const& f);

Generally, std::function<R(T1, T2, ..., TN)> is a functor class that wraps any functor object that takes N arguments of types T1, ..., TN and returns a value convertible to R. It provides template conversion constructors that accept such functor objects. In particular, closure types are implicitly converted to std::function. There are two hidden and preventable costs at construction.

First, the constructor takes the functor object by value, implying that a copy is made. Furthermore, the constructor forwards this copy to a series of helper functions, many of which also take it by value, implying further copies. For instance, MSLIB and GCCLIB make four copies, and Boost makes seven. However, the large number of copies is not the culprit for the biggest performance hit.

The second issue is related to the functor's size. The three implementations follow the standard's recommendation to apply a small object optimization so as to avoid dynamic memory allocation. Hence, they use a data member to store a copy of the wrapped functor object; but because the object's size is known only at construction, the storage may not be big enough to hold the copy. In this case, the copy is created on the heap through a call to new (unless a custom allocator is specified) and only a pointer to this copy is stored by the data member. The exact size beyond which the heap is used depends on the platform and alignment considerations. The best cases for common platforms are 12 bytes, 16 bytes, and 24 bytes for MSLIB, GCCLIB, and Boost, respectively.

Improving Performance of std::function

Clearly, to address these performance issues, copies and big objects should be avoided. The natural idea is working with references instead of copies. However, we all know that this is not generally possible because you might want the std::function object to outlive the original functor.

This is an old issue, as STL algorithms also take functors arguments by value. A good solution was implemented by Boost a long time ago and is now also part of C++11 as well. The template class std::reference_wrapper wraps a reference to an object and provides automatic conversion to the wrapped type making the std::reference_wrapper usable in many circumstances where the wrapped type is expected. The size of std::reference_wrapper is the size of a reference and, thus, small. Additionally, there are two template functions — std::ref and std::cref — to ease the creation of non-const and const std::reference_wrappers, respectively. (They act like std::make_pair does to create std::pairs.)

Back to the first example: To avoid the multiple copies of is_multiple_of (which actually don't cost much since this is a small class) we can use:

is_multiple_of f(n);
std::count_if(v.begin(), v.end(), std::cref(f));

Applying the same idea to the lambda expression yields:

auto f([n](int i){return i%n == 0;});
std::count_if(v.begin(), v.end(), std::cref(f));

Unfortunately, things get a bit more complicated and depend on the compiler and library.

  • Boost in both compilers (change std::cref to boost::cref): It doesn't work because boost::reference_wrapper<T> is not a functor.
  • MSLIB: Currently, it doesn't work, but should in the near future. Indeed, to handle types returned by functor objects, MSLIB uses std::result_of which, on TR1, depends on the functor type having a member result_type — a typedef to the type returned by operator(). Notice that is_multiple_of has this member type, but the closure type doesn't (as per C++11). In C++11, std::result_of has changed and is defined in terms of decltype. We are in a transition period and MSLIB still follows TR1, but the next release of MSLIB is supposed to follow C++11.
  • GCCLIB: It works.

In addition, as per C++11, functor classes originating from lambda expressions are not adaptable — they don't contain certain type members required by STL adaptors — and the following code is illegal:

std::not1([n](int i){ return i%n == 0; });

In this case, std::not1 requires argument_type. Notice that is_multiple_of defines it.

The previous issue takes a slightly different form when std::function is involved. By definition, std::function wraps functors only. Hence, when its constructor receives an object of type std::reference_wrapper<T>, it assumes that T is a functor class and behaves accordingly. For instance, the following lines are legal with Boost and GCCLIB, but not yet with MSLIB (though they should be in the next release):

auto f1([n](int i){return i%n == 0;});
std::function<bool(int)> f(std::cref(f1));
std::count_if(v.begin(), v.end(), f);

It's worth mentioning that std::functions wrapping unary and binary functor classes are adaptable and can be given to STL adaptors (for example, std::not1).

I'm led to conclude that if you don't want heap storage (and custom allocators), then Boost and GCCLIB are good options. If you are aiming for portability, then you should use Boost. For developers using MSVCLIB, the performance issue remains unsolved until the next release. For those who can't wait, here is a workaround that turns out to be portable (works with GCC and MSVC).

The idea is obvious: Keep the closure type small. This size depends on the variables that are captured by the lambda expression (that is, appear inside the square brackets []). For instance, the lambda expression previously seen

[n](int i){ return i%n == 0; };

captures the variable int n and, for this reason, the closure type has an int data member holding a copy of n. The more identifiers we put inside [], the bigger the size of the closure type gets. If the aggregated size of all identifiers inside [] is small enough (for example, one int or one double), then the heap is not used.

One way to keep the size small is by creating a struct enclosing references to all identifiers that normally would go inside [] and putting only a reference to this struct inside []. You use the struct members in the body of the lambda expression. For instance, the following lambda expression

double a;
double b;
// ...
[a, b](double x){ return a * x + b; };

yields a closure type with at least 2 * sizeof(double) = 16 bytes, which is enough for MSLIB to use the heap. The alternative is

double a;
double b;
// ...
struct {
	const double& a;
	const double& b;
} p = { a, b };
[&p](double x){ return p.a * x + p.b; };

In this way, only a reference to p is captured, which is enough for MSLIB, GCC, and Boost to avoid the heap.

A final word on the letter of the law: The standard says that the closure type can have different sizes and alignments. This means that the aforementioned workaround might not work. More precisely, the code remains legal, but the heap might be used if the object gets big enough. However, neither MSVC nor GCC do this.


I would like to thank Lorenz Schneider and Victor Bogado for their comments and careful reading of this article.

(Update: A few code items that could prevent compilation under C++11 compilers have been updated. [December 2013])

Cassio Neri has a PhD in Mathematics. He works in the FX Quantitative Research at Lloyds Banking Group in London.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.