PLINQ: Parallel Queries in .NET


PLINQ Operators and Methods

You can modify the behavior of a PLINQ query with a variety of clauses and methods, all of which are extension methods on ParallelQuery<TSource>. Most of these are the same clauses and methods available to LINQ, and you can use them either independently or together. However, PLINQ also adds some constructs of its own, which this section describes.

The ForAll Operator

You create a PLINQ query to parallelize your code. In most circumstances, the next step is to iterate the results by using a foreach or for loop. Because the query most likely uses deferred execution, it runs at that point, and the results are processed in the iterations of the foreach loop. There is only one problem: The foreach loop is sequential. This is a classic "hurry-up-and-wait" scenario. After executing a PLINQ query, you might want to extend parallelism to handle the results in parallel as well.

The Parallel.ForEach method of the Task Parallel Library is useful for parallelizing the same operation over a collection of values. It would appear natural to adhere to the same model to process the results of a PLINQ query. PLINQ returns a ParallelQuery<TSource> type, which represents multiple streams of data. However, Parallel.ForEach expects a single stream of data, which it then partitions into multiple streams. For this reason, the multistream output of the query must first be merged into a single stream before Parallel.ForEach can partition it again. There is a performance cost for this conversion.

The solution is the ParallelQuery<TSource>.ForAll method. The ForAll method directly accepts multiple streams, so it avoids the overhead of the Parallel.ForEach method. Here is a prototype of the ForAll method. The first parameter is the target of the extension method, which is a ParallelQuery type. The second parameter is an Action delegate, for which you can supply a named method, a lambda expression, or an anonymous method. The delegate is called once for each element of the collection, with that element passed as its parameter.

public static void ForAll<TSource>(
  this ParallelQuery<TSource> source,
  Action<TSource> action
)

Here is a short demonstration that illustrates how to use the ForAll operator. In this example, you will perform a parallel query on a string array and then select and display strings longer than two characters in length.

Perform a parallel query of a string array

1. Create a console application for C# in Visual Studio. In the Main method, define a string array.

string[] stringArray = { "A", "AB", "ABC", "ABCD" };

2. Perform a PLINQ query on the string array. Select strings with a length greater than two.

var results = from value in stringArray.AsParallel()
  where value.Length > 2 select value;

3. Call the ForAll operator on the results. In the lambda expression, display the current item.

results.ForAll((item) => Console.WriteLine(item));

Here is the source code for the entire application:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ForAll
{
  class Program
  {
    static void Main(string[] args)
    {
      string[] stringArray = { "A", "AB", "ABC", "ABCD" };
      var results = from value in stringArray.AsParallel()
        where value.Length > 2 select value;
      results.ForAll((item) => Console.WriteLine(item));
      Console.WriteLine("Press Enter to Continue");
      Console.ReadLine();
    }
  }
}

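The application will display ABC and ABCD as the result. Because ForAll processes the items in parallel, the order in which the two strings appear is not guaranteed.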
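The following is a minimal sketch, not part of the original application, of a common variation: collecting the items inside ForAll instead of printing them. Because the delegate can run on several threads at once, the sketch assumes a thread-safe collection (ConcurrentBag<T>) is appropriate. The class and method names (ForAllSketch, LongerThan) are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

static class ForAllSketch
{
    // Returns the strings longer than minLength. The items are gathered
    // inside ForAll, so the collection must tolerate concurrent Add calls.
    public static string[] LongerThan(string[] source, int minLength)
    {
        var collected = new ConcurrentBag<string>();

        source.AsParallel()
              .Where(value => value.Length > minLength)
              .ForAll(item => collected.Add(item));

        // ForAll does not preserve order, so sort before returning
        // to give callers a deterministic result.
        return collected.OrderBy(s => s, StringComparer.Ordinal).ToArray();
    }

    static void Main()
    {
        string[] stringArray = { "A", "AB", "ABC", "ABCD" };
        foreach (string item in LongerThan(stringArray, 2))
        {
            Console.WriteLine(item);
        }
    }
}
```

A plain List<T> would not be safe here, because List<T>.Add is not designed for concurrent callers.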

Using ParallelExecutionMode

So far, we have used the AsParallel method to convert LINQ to PLINQ. It is a simple change to a LINQ query, but it changes how the query is executed.

A PLINQ query is not guaranteed to actually execute in parallel. The overhead of parallel execution, such as thread-related costs, synchronization, and the parallelization code itself, can exceed the performance gain. Determining the relative performance benefit of the PLINQ query is an inexact science based on several factors. Here are some of the considerations that might affect the performance of a PLINQ query:

  • Length of operations
  • Number of processor cores
  • Result type
  • Merge options

One of the biggest factors is the duration of the parallel operations, such as the Select clause. Dependencies and the synchronization that results from them adversely affect the performance of any parallel solution. Furthermore, shorter operations might not be worth parallelizing, because the associated overhead might exceed the duration of the operation. For small operations, you could change the chunking to improve the balance of execution to overhead. Custom partitioners, including those that change the chunk size, are an option.
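As a rough illustration of changing the chunk size, the sketch below uses the range overload Partitioner.Create(fromInclusive, toExclusive, rangeSize) from System.Collections.Concurrent, so that each worker receives a block of indices rather than one element at a time. The workload (summing squares), the class name ChunkingSketch, and the range size of 100 are assumptions chosen only for illustration; a suitable range size depends on your own measurements.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

static class ChunkingSketch
{
    // Sums the squares of 0..count-1. Each partition is a
    // Tuple<int, int> holding an inclusive start and exclusive end,
    // so the per-element work is batched into ranges of rangeSize.
    public static long SumOfSquares(int count, int rangeSize)
    {
        var ranges = Partitioner.Create(0, count, rangeSize);

        return ranges.AsParallel()
            .Select(range =>
            {
                long local = 0;
                for (int i = range.Item1; i < range.Item2; i++)
                {
                    local += (long)i * i;
                }
                return local;   // one partial sum per range
            })
            .Sum();
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(1000, 100));   // prints 332833500
    }
}
```

Each delegate call now does 100 multiplications instead of one, which spreads the fixed per-call overhead over more work.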

The number of processor cores might affect the performance of your parallel application, including PLINQ. However, you should typically ignore the number of processor cores, because that's mostly beyond your control. Maintaining hardware independence in your application is important for both scalability and portability.

PLINQ does not consider all of the above factors when deciding to execute a query in parallel. Based on the shape of the query and the clauses used, PLINQ decides to execute a query either in parallel or sequentially. You can override this default by using the WithExecutionMode method with a value of the ParallelExecutionMode enumeration as a parameter. The two options are ParallelExecutionMode.ForceParallelism and ParallelExecutionMode.Default. Use the ParallelExecutionMode.ForceParallelism value to require parallel execution.

The ParallelExecutionMode.Default value defers to PLINQ for the appropriate decision on the execution mode. Here is an example that forces a parallel PLINQ query.

var results = from item in data.AsParallel()
  .WithExecutionMode(
    ParallelExecutionMode.ForceParallelism
  )
  select item;
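For context, here is a small self-contained sketch that uses WithExecutionMode in method syntax. The data, the filter, and the names ExecutionModeSketch and EvenSquares are hypothetical; the sketch only shows where the call sits in a query and makes no claim about whether PLINQ would otherwise have run this particular query sequentially.

```csharp
using System;
using System.Linq;

static class ExecutionModeSketch
{
    // Squares the even values and returns them in ascending order,
    // asking PLINQ to run the query in parallel regardless of its
    // own analysis of the query shape.
    public static int[] EvenSquares(int[] data)
    {
        return data.AsParallel()
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .Where(item => item % 2 == 0)
            .Select(item => item * item)
            .OrderBy(item => item)
            .ToArray();
    }

    static void Main()
    {
        int[] data = Enumerable.Range(1, 10).ToArray();
        foreach (int value in EvenSquares(data))
        {
            Console.WriteLine(value);   // 4, 16, 36, 64, 100
        }
    }
}
```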

Using WithMergeOptions

How the result of your query expression is handled can also affect performance. For example, the following PLINQ query returns a List<T> type. Converting the query results to a list requires that all of the results be buffered before the complete list can be returned.

intArray.AsParallel()
  .Where((value)=>value>5)
  .ToList();

In this code, the results are fully buffered because ToList must return a complete list. In other circumstances, PLINQ might also buffer the results, but that is mostly transparent to your code.

Using the .NET Framework 4 thread pool, PLINQ uses multiple threads to execute the query in parallel. The results of these parallel operations are then merged back into the thread that consumes them. The merge option describes the buffering used when merging results from the various threads.

Here are the merge options as defined in the ParallelMergeOptions enumeration:

  • NotBuffered: The results are not buffered. For operations such as the ForAll operation, NotBuffered is the default.
  • FullyBuffered: The results are fully buffered, which can delay receipt of the first result.
  • AutoBuffered: This option is similar to NotBuffered, except that the results are returned in chunks.
  • Default: The default is AutoBuffered.

You can override the default buffer preference with the WithMergeOptions operator.
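As a sketch of how this might look, the query below requests ParallelMergeOptions.NotBuffered so that results can reach the consuming foreach loop as they are produced. The input values and the names MergeOptionsSketch and GreaterThanFive are hypothetical, and on an array this small the choice of merge option is unlikely to make a measurable difference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MergeOptionsSketch
{
    // Returns the values greater than five in ascending order.
    public static List<int> GreaterThanFive(int[] intArray)
    {
        var query = intArray.AsParallel()
            .WithMergeOptions(ParallelMergeOptions.NotBuffered)
            .Where(value => value > 5);

        // The foreach loop runs on the calling thread only, so a plain
        // List<int> is safe here. The arrival order is not guaranteed.
        var seen = new List<int>();
        foreach (int value in query)
        {
            seen.Add(value);
        }

        seen.Sort();   // sort for a deterministic result
        return seen;
    }

    static void Main()
    {
        int[] intArray = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        Console.WriteLine(string.Join(", ", GreaterThanFive(intArray)));
    }
}
```

WithMergeOptions can be applied only once in a query, so it is usually placed directly after AsParallel.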

Using AsSequential

The difference between PLINQ and LINQ starts with the AsParallel method. As we've seen, converting from LINQ to PLINQ is often as simple as adding a call to AsParallel to a LINQ query. Here is a basic LINQ query:

numbers.Select(/* selection */ )
  .OrderBy( /* sort */ );

Here is a parallel version of the same query, with the required AsParallel method added.

numbers.AsParallel()
  .Select( /* selection */ )
  .OrderBy( /* sort */ );
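To make the two queries concrete, the sketch below fills in the placeholders with an assumed selection (squaring each number) and an assumed sort (ascending). These choices and the names AsParallelSketch, SquaresSequential, and SquaresParallel are hypothetical. Because the query ends with OrderBy, both versions return the values in the same order.

```csharp
using System;
using System.Linq;

static class AsParallelSketch
{
    // The basic LINQ query.
    public static int[] SquaresSequential(int[] numbers)
    {
        return numbers.Select(n => n * n)
            .OrderBy(n => n)
            .ToArray();
    }

    // The same query with AsParallel added.
    public static int[] SquaresParallel(int[] numbers)
    {
        return numbers.AsParallel()
            .Select(n => n * n)
            .OrderBy(n => n)
            .ToArray();
    }

    static void Main()
    {
        int[] numbers = { 3, -1, 2 };
        Console.WriteLine(string.Join(", ", SquaresSequential(numbers)));   // 1, 4, 9
        Console.WriteLine(string.Join(", ", SquaresParallel(numbers)));     // 1, 4, 9
    }
}
```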

