C++11's async Template
Converting to Multithreading
At this point we have a single-threaded program. Now, thanks to the excellent work of the C++11 committee, we can convert this to a threaded implementation in two easy steps.
First, we wrap the function call in an async wrapper:
auto f1 = async( launch::async, find_matches, pattern, backlog );
The return type from async() is called a future. It doesn't contain anything particularly useful immediately after the function call, but after the thread exits, it will provide a simple way to get the value returned by find_matches, the function that is now running as a separate thread. We can retrieve that value with a call to member function get():
vector<string> words = f1.get();
Since the function was called with std::launch::async, the call to get() will presumably have to join() the thread, which will block until the thread is complete. After the thread has been joined, the future object returns the value returned by find_matches. It really couldn't be much easier, could it? I'm able to retrieve a complex object as if it were returned directly from a thread, and I don't need to write any declarations or create any glue code to make it happen.
Using Multiple Threads
This has all been pretty easy so far, but to actually learn something about threading, it helps to surface some of the issues that you will encounter when traveling down this path. To force the issue, I next ask my students to modify the program so that it runs three copies of find_matches at once.
Just calling async three times is not quite enough. I also have to make sure that I pass argument deque<string> backlog by reference. Since all three threads are going to be pulling data out of it simultaneously, I need to make sure that they are working on the same copy of the data.
Because of the way C++ infers the types of arguments, it isn't quite possible for async to know that backlog is a reference argument. To make that happen, I need to include the <functional> header and wrap reference arguments in a call to ref. The result looks like this:
auto f1 = async( launch::async, find_matches, pattern, ref(backlog) );
auto f2 = async( launch::async, find_matches, pattern, ref(backlog) );
auto f3 = async( launch::async, find_matches, pattern, ref(backlog) );
This fires up three threads more or less at once, and as my students are quick to see, will inevitably result in a crash. Standard library objects like deque are not threadsafe — making them so would probably violate the guiding principle that you don't pay for things you don't use.
The simplest solution to this problem is to use the mutex class added to C++11, and lock all access to the deque. A version of find_matches that uses a global mutex m results in code like this:
vector<string> find_matches( string pattern, deque<string> &backlog )
{
vector<string> results;
for ( ; ; ) {
m.lock();
if ( backlog.size() == 0 ) {
m.unlock();
return results;
}
string word = backlog.front();
backlog.pop_front();
m.unlock();
if ( match( pattern, word ) )
results.push_back( word );
}
}
A slight increase in complexity, but with this lock in place the program runs to completion without trouble. A run with a slightly different pattern produced the following results:
markn@ubuntu~ $ ./WordSearch ..x..l. Found 9 matches for ..x..l. in thread 1 maxilla maxwell mixable mixedly mixible saxauls sextile sixfold sixthly Found 7 matches for ..x..l. in thread 2 buxomly hexapla vexedly vexilla vixenly waxable waxbill Found 8 matches for ..x..l. in thread 3 boxball boxfuls fixable fixedly foxhole taxable taxably textile markn@ubuntu:~ $
Using process substitution under bash you can actually check that WordSearch is returning the same thing that grep sees when processing the data. For example:
./WordSearch ..x..l. 2> /dev/null | sort | diff - <(grep "^..x..l.$" sowpods.txt)
A very simple way to dip your toes in the water, courtesy of C++11.

