
Clay Breshears

Dr. Dobb's Bloggers

"Task Parallelism Does Not Exist"

July 31, 2012

I went to the Intel reception during the 2011 IDF event. The DJ was spinning dance or house or techno music to the eager crowd. It was rumored that will.i.am was in the house and might make an appearance soon. I was standing outside admiring the San Francisco skyline at night with a drink in hand when a noted parallel programming expert and evangelist dropped the little gem I've taken for the title of this post. Luckily I wasn't taking a sip of my bourbon at that time or I might have done a spit-take all over him. Instead, I gave him my most significant look of incredulity.


My mind quickly took up the opposite position to the statement. I knew I had implemented many different task parallel algorithms (e.g., my iterative Quicksort codes that have been featured in this forum). These had been effectively parallelized and achieved some speedup over the serial equivalent.

Having achieved the desired consternation, and before I could voice any counterexamples to his controversial statement, the expert explained. If you want any chance of having a scalable parallel algorithm, something suitable for many-core processors, you wouldn't implement a task parallel solution. What you need is data parallel code. As the size of the workload increases, you can be assured that more and more cores can be put to effective use.

I had to admit it was a fair assessment; pretty obvious once you stop to think about it. There really is no other way to program the hundreds of cores becoming more and more prevalent in GPGPUs and other coprocessor and accelerator add-ons. Few truly task parallel solutions will contain enough work to use more than a handful of threads. Plus, the SIMD execution model was designed with data parallel and vector computations in mind.

Going over in my mind the types of parallel computations that I have been exposed to over the years, I realized that most of the solutions were data parallel. The whole panorama of matrix-based and grid-based scientific algorithms has the most obvious examples. Whether it was computing the probable path of a hurricane, the flow of air over a helicopter rotor, or the temperature changes wrought by ocean currents in the Strait of Juan de Fuca, dividing up large data sets into smaller parts and executing the same computations on each part was the standard methodology.
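The "divide the data, run the same computation on each part" pattern described above can be sketched in a few lines. This is only an illustration of the structure, not code from any of the applications mentioned: the function names (`split_into_chunks`, `sum_of_squares`, `data_parallel_sum_of_squares`) and the sum-of-squares workload are hypothetical stand-ins, and Python threads are used purely to show the shape of the decomposition (in CPython they will not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_chunks):
    """Divide data into num_chunks contiguous, nearly equal parts."""
    size, extra = divmod(len(data), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The single computation that every worker applies to its own chunk."""
    return sum(x * x for x in chunk)


def data_parallel_sum_of_squares(data, num_workers):
    """Same operation on every chunk, followed by a final reduction."""
    chunks = split_into_chunks(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

The point of the sketch is that `num_workers` is a free parameter: as the data grows, the same code can be cut into more chunks, each with roughly the same amount of work.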

Even with all of this, my expert friend finally did admit that task parallelism was still a possible solution. His statement had been meant to elicit shock and awe and to persuade programmers to first think about a data parallel solution. If you only have four or eight cores available, a task parallel solution might be sufficient and could fully utilize the resources of the platform.

Since that September night, as I've reflected more and more on this conversation, I thank my lucky stars that I wasn't in the middle of taking a swig of my drink when I first heard that controversial statement. Being in a city that I wasn't all that familiar with, I'm not sure I could have found a reputable dry cleaner to get alcohol stains out of clothes. I've also come to realize that my favorite set of algorithms, sorting, is going to be task parallel in implementation. While the number of tasks may (eventually) be relatively large, that number depends more on the initial order of the elements to be sorted than on the size of the data.

Consider a parallel Quicksort algorithm. Whether you implement an iterative version or a recursive one (creating tasks for the recursive calls), there will be a marked difference in the number of tasks and the amount of computation per task between a data set whose keys are randomly distributed and a data set whose keys begin in nearly sorted order. In the first case you would expect the partitioning operation to divide the subarrays into roughly equal-sized sets, while the second data set will more likely generate empty partitions on one side of the pivot element. In either case, it is easy to see that the amount of work for tasks generated from the partition step can vary widely, which is not a situation that is handled well by SIMD or data parallel computations.
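To make the imbalance concrete, here is a small serial sketch that records the size of every subarray an iterative Quicksort would hand out as a task. It is not the Quicksort code from my earlier posts: the function names are hypothetical, and it assumes the first element of each subarray is chosen as the pivot, which is the choice that makes already-sorted input degenerate.

```python
import random


def partition(a, lo, hi):
    """Partition a[lo..hi] around the first element; return the pivot's final index."""
    pivot = a[lo]
    i = lo
    for j in range(lo + 1, hi + 1):
        if a[j] < pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[lo], a[i] = a[i], a[lo]
    return i


def quicksort_task_sizes(values):
    """Sort a copy iteratively; return (sorted list, size of each non-trivial task)."""
    a = list(values)
    sizes = []
    stack = [(0, len(a) - 1)]  # each entry stands for one task
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue  # empty or single-element subarray: nothing to do
        sizes.append(hi - lo + 1)
        p = partition(a, lo, hi)
        stack.append((lo, p - 1))
        stack.append((p + 1, hi))
    return a, sizes


n = 64
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)

_, random_sizes = quicksort_task_sizes(shuffled)
_, sorted_sizes = quicksort_task_sizes(range(n))

print("random input: tasks =", len(random_sizes), " total work =", sum(random_sizes))
print("sorted input: tasks =", len(sorted_sizes), " total work =", sum(sorted_sizes))
```

With sorted input and this pivot choice, every partition leaves one side empty, so the task sizes simply count down from `n` to 2 and there is never more than one useful task available at a time. The shuffled input produces a tree of tasks whose sizes shrink much faster, but they are still far from uniform.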

So, what was the point of this post? Like the expert, I wanted to reinforce the idea that data parallel solutions are going to be more desirable and more scalable. Even though you might initially come up with a task parallel algorithm, you should take some time to see if a reworking of the algorithm or the data representation can better facilitate a more data-parallel version. For one example of this idea, look back to the prefix scan version of the Partition operation within Quicksort that utilized a data parallel computation within parallel sorting tasks.
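For readers who have not seen the idea, a partition built from a prefix scan looks roughly like the sketch below. It is a serial stand-in written for this discussion, not the earlier implementation; the names `exclusive_scan` and `scan_partition` are hypothetical. Each of the three steps (flagging, scanning, scattering) treats every element the same way, which is what makes the approach a candidate for data parallel execution.

```python
from itertools import accumulate


def exclusive_scan(flags):
    """Exclusive prefix sum: result[i] is the sum of flags[0..i-1]."""
    if not flags:
        return []
    return [0] + list(accumulate(flags))[:-1]


def scan_partition(values, pivot):
    """Stable partition of values around pivot using flags, scans, and a scatter.

    Returns (partitioned list, number of elements less than pivot).
    """
    # Step 1 (map): every element is flagged independently of the others.
    less = [1 if v < pivot else 0 for v in values]
    not_less = [1 - f for f in less]

    # Step 2 (scan): prefix sums give each element its slot within its group.
    less_pos = exclusive_scan(less)
    not_less_pos = exclusive_scan(not_less)
    num_less = sum(less)

    # Step 3 (scatter): each element writes to a distinct destination index,
    # so in a parallel setting these writes would not conflict.
    out = [None] * len(values)
    for i, v in enumerate(values):
        if less[i]:
            out[less_pos[i]] = v
        else:
            out[num_less + not_less_pos[i]] = v
    return out, num_less


print(scan_partition([5, 2, 8, 1, 9, 3], 5))  # ([2, 1, 3, 5, 8, 9], 3)
```

The amount of work per element is fixed no matter how the keys are ordered, so the partition step itself stays balanced even when the sorting tasks around it do not.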
