Improving Futures and Callbacks in C++ To Avoid Synching by Waiting


The C++11 standard provides several long-requested concurrency features, such as std::thread and std::future. While these are a welcome addition to the language, in this article I will show that they are insufficient for all but the most basic concurrency needs. I will argue that the primitives in C++11 are particularly ill-suited for modern applications that must deal with the concurrency imposed by I/O operations while exploiting multiple cores at the same time.

Fortunately, many of these limitations can be addressed by augmenting C++11 futures with continuations, based on experience with the Parallel Patterns Library (PPL) at Microsoft. The reader is expected to have a working knowledge of C++11 and some experience writing parallel code, but familiarity with the PPL is optional.

Connected and Multicore

We take it for granted that the software we use daily is both connected to the Internet and able to harness multiple cores. It is natural to think of the Internet (or the cloud) and multiple cores not as two distinct capabilities of a program, but as a single elastic compute resource. As Herb Sutter puts it in Welcome To The Parallel Jungle, "The network is just another bus to more compute cores."

However, a developer building a modern connected multicore application faces two distinct challenges:

  • Building a well-performing connected program requires dealing with the latency and unpredictability typical of I/O operations. This is difficult, and getting it right results in an application that is responsive and scalable, but not necessarily faster.
  • Building a well-performing multicore program is a parallel programming job — a different challenge altogether, often requiring a different tool set and different skills. When this is done right, the program runs faster, although its speed is usually orthogonal to the responsiveness and the scalability.

Taking the view of the cloud as a natural extension of multicore behooves us to find a programming model that, at the very minimum, gives us a way to efficiently compose the I/O operations and the multicore operations.

Concurrency in C++

In the last decade, the software industry has developed many tools for multicore programming in C++. Libraries such as Intel's TBB (Threading Building Blocks) or Microsoft's PPL (Parallel Patterns Library) are the state of the art. These tools excel at parallel decomposition — partitioning serial code into multiple "chores" that run on multiple cores.

But there is more to being "connected and multicore" than just parallelism. Well-performing concurrent programs must combine the connected components, with their inherent latency and unreliability, with the parallel components. Put another way, if parallelism is about decomposing the program into independent parts, concurrency is about both decomposing and composing the program from the parts that work well individually and together.

I believe that it's in the composition of connected and multicore components where today's C++ libraries are still lacking.

The Dreaded Wait

Most of the concurrency primitives in C++11 are composed via waiting. One can spawn a thread (by creating an instance of std::thread), then wait for it to finish by calling the join method. Likewise, the result of a future object (represented by std::future) can be retrieved by calling the get method — during which the calling thread waits for the result to become available.

Why is this a problem?

Waiting on the GUI thread means that the user of the application is rewarded with the "hourglass" or the "spinning donut" while the thread is waiting for an operation to complete. This is bad enough for a CPU-bound operation, but the length of an I/O-bound call can be truly unpredictable and therefore very long.

Clearly, the GUI thread of the application is a scarce resource, and we want to return it to the message pump as soon as possible — but let's not kid ourselves by thinking that all we need to do is offload the long-running operation to another thread. If we did that, how would we synchronize the two threads without waiting?

The woes of composition-by-waiting are not limited to GUI programs. By default, at creation, a thread reserves 1 MB of stack space on Windows and 8 MB on Linux. This value is configurable, but reducing the stack size may break programs with deep call chains or multiple stack-allocated objects. In other words, not only GUI threads are expensive — all threads are. This can be felt acutely in a multithreaded server application. If many threads decide to block at the same time, waiting can bring the server to its knees very quickly.

Continuations

In C/C++, continuations are commonly known as "callbacks," and they are often used for asynchronous programming. The concept is not unique to C++, but because the language has been a laggard in adopting mainstream functional programming features such as lambda expressions, C++ libraries that use continuations consistently are still rare.

The concept of the continuation was pioneered by Scheme, which introduced the style of programming where instead of returning a value, a function takes an additional parameter — the continuation that is invoked to process the return value of the function. Naturally, the continuation itself is also a function that can take continuations, and so on.

Continuations make the flow of control explicit — instead of invoking a function and waiting for it to complete, a program written in a continuation-passing style specifies explicitly what to do with the return value when it is available.

For concurrent programs, continuations are a boon because they allow us to avoid blocking waits — which, as I stated above, greatly hinder responsiveness and scalability.

JavaScript has made continuations ubiquitous in Web programming. Because JavaScript is single-threaded, waiting for the server to produce the data would freeze the browser. Instead, JavaScript uses a technique known as AJAX, where the act of issuing a request to the server is separated from the act of handling the data retrieved from the server:

var http = new XMLHttpRequest();
http.open("GET", "customer.html");
http.onreadystatechange = function() {
    if (http.readyState == 4) {
        var serverResponse = http.responseText;
        // process serverResponse here ...
    }
};
http.send();

More recently, Node.js has been very successful at capturing the mindshare of the developer community thanks to its use of continuations for server-side programming. In Windows Runtime, which powers the Metro-style apps in Windows 8, the concept of continuations is used holistically for all potentially long-running operations. Continuations are, in fact, the only way of working with asynchronous operations.

Tasks, Futures, and Promises

A beloved child has many names, as the saying goes. The concept of the "task" — also known as the future or the promise, depending on the language and library — represents a relatively straightforward idea.


Comments:

ubm_techweb_disqus_sso_-736c982abb368dc90b438f24fb751580
2014-12-06T21:42:31

Hello Artur,

I understand your code with PPL, task and ".then" method. But I can't compile it with Visual Studio 2010. I downloaded the source code of the PPL book from http://parallelpatternscpp.cod.... But I still didn't find the ".then" method and "task". Your code seems to me to be pseudocode?

But I'm still looking for your proposed concept with continuations/callbacks. Please give me a hint where I can find some working code.

Edit: With Visual Studio 2012 it works great. You only have to include "ppltask.h" See http://msdn.microsoft.com/en-u...
Thanks,
Thomas


IanEmmons
2013-01-28T23:09:46

I don't understand why continuations solve any of the problems outlined in this article. If the continuation code is simply run on another thread out of the thread pool, then why not put that code into the original asynchronous method? In other words, the point of calling get() on a future is to get the result back to the thread of control that spawned the asynchronous task. If the continuation simply runs on another thread pool thread, I still have to synchronize with the originating thread to get the result back where it needs to be, and therefore the continuation hasn't solved much of anything. What am I missing here?


ubm_techweb_disqus_sso_-f39d8a5e69ada869afcb28769aad62b3
2012-07-27T07:22:23

@benzen: good point, in addition to the “simple” wait you can also poll, or do a “busy” wait. I don’t consider it a viable alternative, for many reasons:
- Polling doesn’t work on the client, where you need to return the thread to the event loop ASAP, or on the server, where you want to let the thread process other requests (unless you want to take a stack dive in do_other_task).
- The approach doesn’t compose – think how you’d join two or more futures this way.
- You always tend to poll either too often or not often enough, meaning extra overhead or unneeded delays.
- Finally, a “simple” wait can often be detected by the thread pool, allowing it to spawn a new thread to pick up the slack. No such luck with polling – you’re spinning a hamster wheel until you run out of little pieces of work to pick up.


ubm_techweb_disqus_sso_-1189d74e9a2c1c5da478d5fbe018ed4e
2012-07-26T14:30:44

You don't have to wait with futures.

while (future.wait_for(std::chrono::seconds(0)) != std::future_status::ready)
{
    do_other_task();
}

future.get();

The interface could be better (say, a std::future::is_ready member), but it is false to say that you must wait on get.


