Channels ▼


Designing Parallel Algorithms: Part 2

Global Communication

Figure 6: A centralized summation algorithm that uses a central manager task (S) to sum N numbers distributed among N tasks. Here, N=8 , and each of the 8 channels is labeled with the number of the step in which they are used.

A global communication operation is one in which many tasks must participate. When such operations are implemented, it may not be sufficient simply to identify individual producer/consumer pairs. Such an approach may result in too many communications or may restrict opportunities for concurrent execution. For example, consider the problem of performing a parallel reduction operation, that is, an operation that reduces N values distributed over N tasks using a commutative associative operator such as addition:

Let us assume that a single "manager'' task requires the result S of this operation. Taking a purely local view of communication, we recognize that the manager requires values etc., from tasks 0, 1, etc. Hence, we could define a communication structure that allows each task to communicate its value to the manager independently. The manager would then receive the values and add them into an accumulator (Figure 6). However, because the manager can receive and sum only one number at a time, this approach takes time to sum N numbers -- not a very good parallel algorithm!

This example illustrates two general problems that can hinder efficient parallel execution in algorithms based on a purely local view of communication:

  1. The algorithm is centralized : it does not distribute computation and communication. A single task (in this case, the manager task) must participate in every operation.
  2. The algorithm is sequential : it does not allow multiple computation and communication operations to proceed concurrently.

We must address both these problems to develop a good parallel algorithm.

Distributing Communication and Computation

We first consider the problem of distributing the computation and communication associated with the summation. We can distribute the summation of the N numbers by making each task i , 0<i<N-1 , compute the sum:

Figure 7: A summation algorithm that connects N tasks in an array in order to sum N numbers distributed among these tasks. Each channel is labeled with the number of the step in which it is used and the value that is communicated on it.

The communication requirements associated with this algorithm can be satisfied by connecting the N tasks in a one-dimensional array (Figure 7). Task N-1 sends its value to its neighbor in this array. Tasks 1 through N-2 each wait to receive a partial sum from their right-hand neighbor, add this to their local value, and send the result to their left-hand neighbor. Task 0 receives a partial sum and adds this to its local value to obtain the complete sum. This algorithm distributes the N-1 communications and additions, but permits concurrent execution only if multiple summation operations are to be performed. (The array of tasks can then be used as a pipeline, through which flow partial sums.) A single summation still takes N-1 steps.

Uncovering Concurrency: Divide and Conquer

Opportunities for concurrent computation and communication can often be uncovered by applying a problem-solving strategy called divide and conquer. To solve a complex problem (such as summing N numbers), we seek to partition it into two or more simpler problems of roughly equivalent size (e.g., summing N/2 numbers). This process is applied recursively to produce a set of subproblems that cannot be subdivided further (e.g., summing two numbers). The strategy is summarized in Algorithm 2.1. The divide-and-conquer technique is effective in parallel computing when the subproblems generated by problem partitioning can be solved concurrently. For example, in the summation problem, we can take advantage of the following identity (, n an integer):

The two summations on the right-hand side can be performed concurrently. They can also be further decomposed if n>1 , to give the tree structure illustrated in Figure 8. Summations at the same level in this tree of height can be performed concurrently, so the complete summation can be achieved in rather than N steps.

Figure 8: Tree structure for divide-and-conquer summation algorithm with N=8 . The N numbers located in the tasks at the bottom of the diagram are communicated to the tasks in the row immediately above; these each perform an addition and then forward the result to the next level. The complete sum is available at the root of the tree after steps.

In summary, we observe that in developing an efficient parallel summation algorithm, we have distributed the N-1 communication and computation operations required to perform the summation and have modified the order in which these operations are performed so that they can proceed concurrently. The result is a regular communication structure in which each task communicates with a small set of neighbors.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.