Parallel

How Much Scalability Do You Have or Need?

By Herb Sutter, August 03, 2007

How many cores (or hardware threads) can your code harness to get its answers faster?

In your application, how many independent pieces of work are ready to run at any given time? Put another way, how many cores (or hardware threads, or nodes) can your code harness to get its answers faster? And when should the answers to these questions not be "as many as possible"?

Table 1 summarizes three "orders" of throughput available in a given application or operation, using big-Oh notation to describe the amount of CPU-intensive work that will typically be ready to be actively executed at a given time. The rest of this article overviews these orders of scalability and summarizes when each one is appropriate.

Last month [2], we looked at three "pillars" of concurrency: Pillar 1, isolation by structuring work asynchronously and communicating through messages; Pillar 2, scalable throughput by exploiting parallelism in algorithms and data structures; and Pillar 3, dealing with mutable shared state. This month's topic delves into throughput-oriented techniques using techniques in Pillar 1 and Pillar 2.

Order	O(1): Single-Core	O(K): Fixed	O(N): Scalable
Tagline	One thing at a time	Explicit threading	Re-enable the free lunch
Summary	Sequential applications, and bottlenecked parallel applications	Explicitly express how much work can be done in parallel	Express lots of latent concurrency in a way that can be efficiently mapped to N cores
Examples	Multithreaded code convoyed on a global lock or message queue, occasional or intermittent background work	Pipelining, hardwired division of tasks, regular or continuous background computation	Tree traversal, quicksort, compilation
Applicability	Single-core hardware, single-threaded OS, or nonCPU-bound app	Hardware with fixed concurrency, or app whose CPU-bound parts have limited scalability	Hardware with variable (esp. growing) concurrency, and app with CPU-bound parts that are scalably parallelizable
Examples	Code targeting legacy hardware, small embedded systems, single-core game consoles; simple text processor	Game targeting one multicore game console generation; code whose key operations are order-sensitive (e.g., can be pipelined but not fully parallelized)	Mainstream desktop or server software with CPU-bound features and targeting commodity hardware or future upgradeable game consoles
Pillar (see Notes [2]), and today's mainstream tools	Pillar 1: Threads, message queues, futures		Pillar 2: Thread pools, futures, OpenMP

Table 1: Orders of throughput.

1 2 3 4 5 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Parallel

How Much Scalability Do You Have or Need?

Table 1: Orders of throughput.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Parallel Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Parallel

How Much Scalability Do You Have or Need?

Table 1: Orders of throughput.

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Parallel Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content