Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

C/C++

Going Superlinear


Parallel Search: Exploiting Nonuniformity

Let's say we want to search an unsorted collection of size N for a value that matches some criteria. First, consider a simple sequential search:

  • If no match exists, the sequential search has no choice but to look at all N elements.
  • If there's one match, on average, the sequential search has to look at half of the elements to find the match, which is still O(N).
  • If there are M matches, it typically has to look at proportionately fewer elements, or O(N/M).

These results are summarized in the first column of Table 1.

A simple P-way parallel search, where each worker explores 1/P-th of the collection, will normally be P times faster than the sequential search even in the worst case. After all, we're just searching P elements at each step instead of only one. (See the second column of Table 1.) Figures 2 and 3 show sample sequential and parallel searches side by side.

[Click image to view at full size]

Figure 2: Sequential versus parallel search, single match.

[Click image to view at full size]

Figure 3: Sequential versus parallel search matches evenly distributed.

[Click image to view at full size]

Table 1: Sequential versus parallel search, matches distributed uniformly versus nonuniformly.

But something interesting happens when the distribution of solutions is not uniform, because then averages no longer apply the same way. To illustrate what happens when matches are clustered, consider Figure 4: We still have M matches in the collection, as before, but let's assume that they are distributed, not evenly throughout, but with higher probability in some regions.

[Click image to view at full size]

Figure 4: Sequential versus parallel search, when matches are clustered.

For simple sequential search, this is no big news: It fares neither better nor worse, and on average still takes as long to find a match as it would if the matches were distributed uniformly throughout the collection. Even if the sequential algorithm writer knows in advance that matches tend to be clustered, it's hard to see how to take advantage of that without also knowing in advance where the clusters are.

For the simple parallel search, on the other hand, clustering is great news, because a parallel search directly takes advantage of any locality there may be in the matches. Consider: Any clustering effect means that some subrange workers will end up discovering that they are impoverished, having relatively fewer matches, or none at all, in their assigned search areas. But other workers will enjoy a correspondingly higher number of matches in their areas, which means—and this is the important part—they will have a higher probability of finding a match each time they test an element, which in turn means that on average they will need to visit fewer elements to find a match. Like Forty-Niners panning and digging for nonuniformly distributed gold in California (near Sutter's Mill; no relation), many workers will find little or nothing, while a lucky few, or just one, will hit the Motherlode. And "just one" is all it takes; any clustering at all is immediately helpful because the time to perform the total search is the minimum of the workers' runtimes—we're done as soon as any worker finds a match.

The final row of Table 1 illustrates this effect with a simplified example case that defines a clustering factor, C: We still have M matches, but the matches are concentrated in 1/C-th of the subranges (not necessarily contiguous) to be explored by the parallel workers. When C=1, we're just in the uniform distribution case. But when C>1, the lucky workers' probability of success on each test is higher by a factor of C, and because in this example we are guaranteed that one of them will find the answer, this has the same effect on the performance of the entire search as if all workers enjoyed the same boost; that is, it's as if the whole collection contained not just M matches, but CM matches. Hence the parallel search enjoys not just the expected linear speedup of P, but a superlinear speedup of CP—the greater the clustering factor, the greater the improvement above and beyond the linear speedup from using a larger number of processors [1].

This isn't limited to simple array-like collections. Many kinds of collections naturally enjoy clustered matches. For example, when we traverse a tree or graph using depth-first search, the parallel version of the algorithm typically has each parallel worker searching a localized subtree, and in many problem domains the solutions tend to be clustered in subtrees. When that happens, one worker will have a disproportionately better chance of being assigned an area with more matches, and we can expect superlinear improvement. Example applications range from searching game trees (for example, a bad move at one node in the tree can cause the entire subtree below it containing subsequent moves to have no good solutions), to CAD and VLSI circuit layout algorithms (see [2]).


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.