Mark Nelson

Dr. Dobb's Bloggers

C++11's async Template

May 30, 2012

C++11 brings rich support for threading to the language, and one of the features that really works for me is the function template async. This provides a great mechanism for spinning off worker threads and easily collecting their results. In this article, I'll show you how I used async to introduce threading to C++ beginners in a nice, non-threatening fashion.

C/C++ and Threads

Even though the languages lacked standardized support for threading, people have been writing multithreaded C and C++ programs for quite a long time. Libraries like pthreads work pretty well and give you a shot at reasonable portability. But I found that beginners would often stumble when using these as their introduction to multithreading.

The biggest annoyance for beginners is passing parameters to threads and returning values from them. The pthreads library handles this in the conventional way for a C API: every thread function must have the same signature, which means relying on void pointers, dynamic allocation, and casting. Plenty of places for the newcomer to stumble.

And returning data from a thread to the caller? You're on your own. Ad hoc methods are easy enough to implement, but again, it's just one more place to make a mistake.
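To make the complaint concrete, here is a minimal sketch of the pthreads idiom described above. The names SumJob, sum_worker, and sum_with_pthreads are hypothetical, invented for this illustration; only pthread_create and pthread_join are real library calls. Because the caller joins before returning, the job record can live on the stack here. A detached thread would need the dynamic allocation mentioned earlier.

```cpp
#include <pthread.h>

// Hypothetical job record: inputs and the result share one struct,
// because a pthreads thread function receives a single void pointer.
struct SumJob {
    int  first;
    int  last;
    long total;   // filled in by the worker
};

// Every pthreads entry point must have this exact signature.
extern "C" void *sum_worker( void *arg )
{
    SumJob *job = static_cast<SumJob *>( arg );   // unchecked cast
    job->total = 0;
    for ( int i = job->first; i <= job->last; ++i )
        job->total += i;
    return nullptr;   // the result travels through the struct instead
}

// Runs the sum on a worker thread and waits for it; returns -1 on failure.
long sum_with_pthreads( int first, int last )
{
    SumJob job = { first, last, 0 };
    pthread_t tid;
    if ( pthread_create( &tid, nullptr, sum_worker, &job ) != 0 )
        return -1;
    if ( pthread_join( tid, nullptr ) != 0 )
        return -1;
    return job.total;
}
```

Nothing stops a caller from handing sum_worker a pointer to the wrong type, and the compiler cannot help: that is the kind of mistake the article has in mind.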

C++11 async()

C++11 solves most of these problems quite nicely with its thread support. In particular, for straightforward worker thread applications, the new async function template is perfect.

When using async(), a thread is modeled as a standard C++ function, as is the case with most other libraries. However, you can pass your choice of parameters to the function: full type safety is observed, and you can pass values by copy or by reference.

The function prototype for async is a bit of a mess, but in a nutshell, you call async with the following arguments:

  • A launch policy of either std::launch::async, which asks the runtime to create an asynchronous thread, or std::launch::deferred, which indicates you simply want to defer the function call until a later time (lazy evaluation). This argument is optional. If you omit it, the default policy of std::launch::async | std::launch::deferred applies, which leaves the choice to the implementation.
  • The name of the function you are calling.
  • The arguments to be passed to the function. Normally these are passed just as you would when calling the function, but if you wish to pass an argument by reference, you need to wrap it in a call to std::ref().
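The three arguments above can be sketched as follows. The worker count_by_length and the wrapper run_async_example are hypothetical names made up for this illustration; std::async, std::launch, std::ref, std::cref, and std::future are the standard library pieces.

```cpp
#include <cstddef>
#include <functional>   // std::ref, std::cref
#include <future>
#include <string>
#include <vector>

// Hypothetical worker: counts the words of a given length and reports
// the longest word seen through a reference parameter.
int count_by_length( const std::vector<std::string> &words,
                     std::size_t length,
                     std::string &longest )
{
    int count = 0;
    for ( const auto &w : words ) {
        if ( w.size() == length )
            ++count;
        if ( w.size() > longest.size() )
            longest = w;
    }
    return count;
}

int run_async_example( std::string &longest )
{
    std::vector<std::string> words = { "ax", "box", "taxman", "vixen" };
    // Policy first, then the function, then its arguments.
    // words and longest go by reference; the length is copied.
    std::future<int> result = std::async( std::launch::async,
                                          count_by_length,
                                          std::cref( words ),
                                          std::size_t( 3 ),
                                          std::ref( longest ) );
    return result.get();   // blocks until the worker has finished
}
```

Calling get() before words goes out of scope matters here: the worker only holds a reference to the vector, so the vector has to outlive the call.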

The actual prototype is shown below:

template< class Function, class... Args >
std::future<typename std::result_of<Function(Args...)>::type>
    async( std::launch policy, Function&& f, Args&&... args );

You'll note a few instances of C++11 variadic template syntax — the dot-dot-dot (not really an ellipsis) following the word class in the template type parameter list, and following the argument name Args. In both usages, this syntax means 0 or more arguments of this type. (How these variadic arguments are processed is a topic for another post.) The key point is that the function template async is instantiated to accept a typesafe list of arguments that have to match the function you are using to instantiate a thread.
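For readers who have not seen the syntax before, here is a small sketch of the two places the dot-dot-dot appears. Both count_args and call_with are hypothetical helpers written for this illustration, as is add3. The real async does considerably more work (it copies its arguments and stores them for the new thread), so treat this only as a demonstration of the notation.

```cpp
#include <cstddef>
#include <utility>

// Args is a template parameter pack: zero or more types.
// The unnamed function parameter pack accepts zero or more values.
template< class... Args >
std::size_t count_args( Args&&... )
{
    return sizeof...( Args );   // number of types in the pack
}

// Forwards whatever arguments it receives to f, which is roughly the
// job a library function has when it hands your arguments to your function.
template< class Function, class... Args >
auto call_with( Function &&f, Args&&... args )
    -> decltype( std::forward<Function>( f )( std::forward<Args>( args )... ) )
{
    return std::forward<Function>( f )( std::forward<Args>( args )... );
}

// Hypothetical three-argument function used to exercise call_with.
inline int add3( int a, int b, int c ) { return a + b + c; }
```

A call such as call_with( add3, 1, 2 ) fails to compile, because the pack no longer matches the signature of add3. That is the type safety the prototype buys you.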

My Teaching Example

To give students a feel for threading using C++11, I asked them to create a simple program called WordSearch that identifies possible Scrabble plays using a simple syntax. A single command-line argument is passed to the program — a template expression for searching through the dictionary. I had the students identify specific characters with the actual letter to be played, and the period character for places where any letter could be played. A typical run might look like this:

markn@ubuntu:~ $ ./WordSearch ..x..n
Found 5 matches for ..x..n
markn@ubuntu:~ $ 

The words are identified by brute force, working through all the entries in a copy of the Scrabble dictionary. (One of the nice things about using this pattern is that I can check the program output against grep, since the periods form a regular expression.)

To get started with the program, I asked the students to simply read the words from the Scrabble dictionary into a deque<string>:

    ifstream f( "sowpods.txt" );
    if ( !f ) {
        cerr << "Cannot open sowpods.txt in the current directory\n";
        return -1;
    }
    string word;
    deque<string> backlog;
    while ( f >> word )
        backlog.push_back( word );

I then asked them to create a function called find_matches that would locate all the matches in that deque, and return them to the caller in a vector<string>. A typical implementation might look like this:

vector<string> find_matches( string pattern, deque<string> &backlog )
{
    vector<string> results;
    while ( backlog.size() ) {
        string word = backlog.front();
        backlog.pop_front();   // consume the word so the loop terminates
        if ( match( pattern, word ) )
            results.push_back( word );
    }
    return results;
}

The implementation of match() should be trivial.
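One possible version is sketched below. It assumes that a pattern must cover the whole word, that the period is the only wildcard, and that the pattern and the dictionary use the same letter case. None of those details are spelled out above, so adjust to taste.

```cpp
#include <string>

// Sketch of match(): true when word has the same length as pattern and
// agrees with it at every position that is not a '.' wildcard.
bool match( const std::string &pattern, const std::string &word )
{
    if ( pattern.size() != word.size() )
        return false;
    for ( std::string::size_type i = 0; i < pattern.size(); ++i )
        if ( pattern[ i ] != '.' && pattern[ i ] != word[ i ] )
            return false;
    return true;
}
```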

Calling this routine and printing the results is easy enough now:

    vector<string> words = find_matches( pattern, backlog );
    cerr << "Found " << words.size()
         << " matches for " << pattern
         << endl;
    for ( auto s : words )
        cout << s << "\n";



Thank you for the article! I think Bjarne's message that async must not be used for IO makes sense. There are better ways of doing IO, e.g. via concurrency::create_task.


Microsoft Visual C++ 11 is available to all licensees of recent versions of Microsoft Windows for free.


As others felt, it would be more appropriate to make the backlog data structure read-only. The threads should only read the data 'designated' to them, not touching the others - opening the scaleout possibilities.

One good way of doing it is to use an indexable data structure, such as a vector or map, for the backlog instead of a deque.

Then each thread will operate only on data related to it. For example, for N threads, thread1 will read all (x%N==0) indices, thread2 will read all (x%N==1) indices, thread3 will read all (x%N==2) indices etc... this way no two threads can access same data.
My personal favorite would be a map of vectors for the backlog: index the strings by their first character in the map, create 26/52 threads (as many as there are entries in the map), and let each thread work on the set of strings for one character. (If the map is sparse, this may be inefficient.) Of course, these two techniques can be combined (since the map uses a vector or similar indexable data structure to store each set of strings anyway).
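A rough sketch of the x%N scheme, for anyone who wants to try it. The names matches_pattern, scan_stride, and parallel_find are hypothetical, and the matcher is only an assumed stand-in for the article's match(). The backlog is a read-only vector, so the workers need no locks.

```cpp
#include <cstddef>
#include <functional>   // std::cref
#include <future>
#include <string>
#include <vector>

// Assumed stand-in for match(): '.' matches any letter, lengths must agree.
static bool matches_pattern( const std::string &pattern, const std::string &word )
{
    if ( pattern.size() != word.size() )
        return false;
    for ( std::size_t i = 0; i < pattern.size(); ++i )
        if ( pattern[ i ] != '.' && pattern[ i ] != word[ i ] )
            return false;
    return true;
}

// One worker reads indices first, first + stride, first + 2 * stride, ...
std::vector<std::string> scan_stride( const std::vector<std::string> &words,
                                      const std::string &pattern,
                                      std::size_t first, std::size_t stride )
{
    std::vector<std::string> found;
    for ( std::size_t i = first; i < words.size(); i += stride )
        if ( matches_pattern( pattern, words[ i ] ) )
            found.push_back( words[ i ] );
    return found;
}

// Launches one async task per worker and concatenates their results.
std::vector<std::string> parallel_find( const std::vector<std::string> &words,
                                        const std::string &pattern,
                                        std::size_t workers )
{
    if ( workers == 0 )
        workers = 1;
    std::vector<std::future<std::vector<std::string>>> futures;
    for ( std::size_t i = 0; i < workers; ++i )
        futures.push_back( std::async( std::launch::async, scan_stride,
                                       std::cref( words ), std::cref( pattern ),
                                       i, workers ) );
    std::vector<std::string> all;
    for ( auto &f : futures ) {
        std::vector<std::string> part = f.get();   // wait for this worker
        all.insert( all.end(), part.begin(), part.end() );
    }
    return all;
}
```

The results come back grouped by worker rather than in dictionary order, so sort them afterwards if the order matters.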



I agree that your course is a good introduction to C++11 threads. However your WordSearch.cpp program is about as confusing as it gets for beginners.

It makes no sense to change the backlog deque. Using a vector to return the results and making the original deque a read-only vector is definitely the way to go. And it will be much faster when the number of threads increases.

I am afraid your students will be misinformed about what threading is all about after seeing examples like this one. I side with Bjarne on this one: WordSearch.cpp is definitely an antipattern.


Threads should operate on different parts (ranges) of immutable input. There is absolutely no reason for threads to mutate the input deque. No mutation = no race conditions = no need for locks.