Channels ▼
RSS

Design

Custom Parallel Partitioning With .NET 4


Using Partitioners

IList<T> implements IEnumerable<T>, and thus the first custom partitioner we implemented isn't functionally necessary given that we have the second one. However, the interlocked operations and indexing into the IList<T> are, in general, significantly cheaper operations than taking locks and using a shared enumerator. As such, we'd like to be able to use the IList<T> partitioner when it's appropriate to do so, and fall back to the IEnumerable<T> partitioner only when we have to. Listing 3 contains a static Create helper method that does just that.


public static class SingleItemPartitioner
{
    public static OrderablePartitioner<T> Create<T>(IEnumerable<T> source)
    {
        return source is IList<T> ?
            new SingleItemIListPartitioner<T>((IList<T>)source) :
            new SingleItemIEnumerablePartitioner<T>(source);
    }
}

Listing 3: A Load-Balancing Orderable Partitioner for IEnumerable<T>

(The full source for SingleItemPartitioner, including the individual partitioners, is available here from Microsoft.)

With SingleItemPartitioner in hand, we can now set about to use our custom partitioning. For example, Listing 4 shows a hypothetical situation where we're processing an enumerable of stock ticks by streaming it to a Parallel.ForEach loop, and then doing similarly with PLINQ:


IEnumerable<StockTickEvent> ticks = ...;
var results = new ConcurrentBag<StockTickEventResult>();
Parallel.ForEach(SingleItemPartitioner.Create(ticks), tick =>
{
     var result = ProcessStockTick(tick);
     results.Add(result);
});
return results.ToArray();
 ...

IEnumerable<StockTickEvent> ticks = ...;
return SingleItemPartitioner.Create(ticks)
       .AsParallel()
       .Select(tick => ProcessStockTick(tick))
       .ToArray();

Listing 4: Using a Custom Partitioner

Of course, while it's possible to implement custom partitioning, you'd don't want to have to in most cases. As with most extensibility points in .NET, there are default implementations of partitioning available, and only if those default implementations don't suite your needs should you consider utilizing a custom partitioner. This might be because you have detailed knowledge of the layout of your data structure, so you know how to best take advantage of cache locality in how you group elements into partitions. It may be because your data structure doesn't implement and isn't convertible to IEnumerable<T>, and thus the only way to process it in parallel with PLINQ or Parallel.ForEach is by writing your own partitioner. It might be for a multitude of reasons. In general, use the built-in partitioners first, and only if they don't meet your needs explore alternatives.

The System.Collections.Concurrent.Partitioner static class provides several overloads of a Create method for creating partitioners for different purposes. The list of overloads as of the .NET Framework 4 is shown in Listing 5:


public static OrderablePartitioner<TSource> Create<TSource>(
    IEnumerable<TSource> source);
public static OrderablePartitioner<TSource> Create<TSource>(
    TSource[] array, bool loadBalance);
public static OrderablePartitioner<TSource> Create<TSource>(
    IList<TSource> list, bool loadBalance);
public static OrderablePartitioner<Tuple<int, int>> Create(
    int fromInclusive, int toExclusive);
public static OrderablePartitioner<Tuple<long, long>> Create(
    long fromInclusive, long toExclusive);
public static OrderablePartitioner<Tuple<int, int>> Create(
    int fromInclusive, int toExclusive, int rangeSize);
public static OrderablePartitioner<Tuple<long, long>> Create(
    long fromInclusive, long toExclusive, long rangeSize);

Listing 5: Partitioner.Create Overloads in .NET 4

When Parallel.ForEach is called with an enumerable, it delegates to the first overload to get a partitioner for that source. This enumerable partitioning algorithm is publicly available, so if you have a need to write your own parallelization construct similar to Parallel.ForEach or PLINQ, you can utilize the built-in partitioning algorithms, or any custom implementations for that matter. For example, the code in Listing 6 implements a very simple foreach loop. It takes advantage of the built-in enumerable partitioning capabilities to create one partition on the enumerable per processor. It then spins up one System.Threading.Tasks.Task instance per partition, and that task iterates through the partition's enumerator, executing the body Action<T> delegate for each item found. The using construct ensures that the enumerator is properly disposed when the partition has finished with it.


static void SimpleParallelForEach<T>(IEnumerable<T> source, Action<T> body)
{
    var partitions = Partitioner.Create(source).
        GetPartitions(Environment.ProcessorCount);
    var tasks = (from partition in partitions select Task.Factory.StartNew(() =>
                 {
                     using (partition)
                         while (partition.MoveNext()) body(partition.Current);
                 })).ToArray();
    Task.WaitAll(tasks);
}

Listing 6: Custom Parallelization Construct Using a Partitioner

For More Information

Again, "Samples for Parallel Programming with the .NET Framework 4" (available for download at http://code.msdn.microsoft.com/ParExtSamples) includes the sample code presented in this article, along with several additional custom partitioner implementations, and much more. This partitioning support is part of a significantly larger effort to make parallel programming with .NET leagues easier, and this article has only scratched the surface of what's possible. For more information, check out the Parallel Computing developer center on MSDN at http://msdn.com/concurrency, as well as the blog of the team that built this parallelism support at http://blogs.msdn.com/pfxteam. We'd love to hear about any work you do to parallelize your applications in .NET 4, so please feel free to get in touch. Happy parallelizing!


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video