Using Partitioners
IList<T> implements IEnumerable<T>, and thus the first custom partitioner we implemented isn't functionally necessary given that we have the second one. However, the interlocked operations and indexing into the IList<T> are, in general, significantly cheaper operations than taking locks and using a shared enumerator. As such, we'd like to be able to use the IList<T> partitioner when it's appropriate to do so, and fall back to the IEnumerable<T> partitioner only when we have to. Listing 3 contains a static Create helper method that does just that.
public static class SingleItemPartitioner
{
    public static OrderablePartitioner&lt;T&gt; Create&lt;T&gt;(IEnumerable&lt;T&gt; source)
    {
        return source is IList&lt;T&gt; ?
            new SingleItemIListPartitioner&lt;T&gt;((IList&lt;T&gt;)source) :
            new SingleItemIEnumerablePartitioner&lt;T&gt;(source);
    }
}
(The full source for SingleItemPartitioner, including the individual partitioners, is available here from Microsoft.)
With SingleItemPartitioner in hand, we can now put our custom partitioning to use. For example, Listing 4 shows a hypothetical situation where we process an enumerable of stock ticks by streaming it to a Parallel.ForEach loop, and then do the same with PLINQ:
IEnumerable&lt;StockTickEvent&gt; ticks = ...;
var results = new ConcurrentBag&lt;StockTickEventResult&gt;();
Parallel.ForEach(SingleItemPartitioner.Create(ticks), tick =>
{
    var result = ProcessStockTick(tick);
    results.Add(result);
});
return results.ToArray();

...

IEnumerable&lt;StockTickEvent&gt; ticks = ...;
return SingleItemPartitioner.Create(ticks)
    .AsParallel()
    .Select(tick => ProcessStockTick(tick))
    .ToArray();
Of course, while it's possible to implement custom partitioning, you don't want to have to in most cases. As with most extensibility points in .NET, default implementations of partitioning are available, and only if those defaults don't suit your needs should you consider a custom partitioner. That might be because you have detailed knowledge of your data structure's layout, so you know how best to take advantage of cache locality when grouping elements into partitions. It might be because your data structure doesn't implement and can't be converted to IEnumerable&lt;T&gt;, so the only way to process it in parallel with PLINQ or Parallel.ForEach is to write your own partitioner. It might be for a multitude of reasons. In general, use the built-in partitioners first, and explore alternatives only if they don't meet your needs.
The System.Collections.Concurrent.Partitioner static class provides several overloads of a Create method for creating partitioners for different purposes. The list of overloads as of the .NET Framework 4 is shown in Listing 5:
public static OrderablePartitioner&lt;TSource&gt; Create&lt;TSource&gt;(
    IEnumerable&lt;TSource&gt; source);
public static OrderablePartitioner&lt;TSource&gt; Create&lt;TSource&gt;(
    TSource[] array, bool loadBalance);
public static OrderablePartitioner&lt;TSource&gt; Create&lt;TSource&gt;(
    IList&lt;TSource&gt; list, bool loadBalance);
public static OrderablePartitioner&lt;Tuple&lt;int, int&gt;&gt; Create(
    int fromInclusive, int toExclusive);
public static OrderablePartitioner&lt;Tuple&lt;long, long&gt;&gt; Create(
    long fromInclusive, long toExclusive);
public static OrderablePartitioner&lt;Tuple&lt;int, int&gt;&gt; Create(
    int fromInclusive, int toExclusive, int rangeSize);
public static OrderablePartitioner&lt;Tuple&lt;long, long&gt;&gt; Create(
    long fromInclusive, long toExclusive, long rangeSize);
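The range-based overloads are particularly handy when parallelizing loops whose bodies are very small: rather than paying per-element overhead, each partition receives a contiguous [fromInclusive, toExclusive) range of indices as a Tuple and iterates it sequentially. Here's a small sketch (the data and sum are illustrative, not from the article) that uses Partitioner.Create(int, int) to sum an array in chunks:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class RangePartitionDemo
{
    static void Main()
    {
        var data = new double[1000000];
        for (int i = 0; i < data.Length; i++) data[i] = i;

        double total = 0;
        object gate = new object();

        // Partition the index space into contiguous [from, to) ranges so each
        // loop body processes a chunk of elements rather than a single one,
        // amortizing the per-delegate-invocation overhead.
        Parallel.ForEach(Partitioner.Create(0, data.Length), range =>
        {
            double localSum = 0;
            for (int i = range.Item1; i < range.Item2; i++)
                localSum += data[i];
            lock (gate) total += localSum; // combine per-range results safely
        });

        Console.WriteLine(total); // sum of 0..999999, i.e. 499999500000
    }
}
```

Accumulating into a local sum per range and merging under a lock only once per partition keeps synchronization off the hot path.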
When Parallel.ForEach is called with an enumerable, it delegates to the first overload to get a partitioner for that source. This enumerable partitioning algorithm is publicly available, so if you have a need to write your own parallelization construct similar to Parallel.ForEach or PLINQ, you can utilize the built-in partitioning algorithms, or any custom implementations for that matter. For example, the code in Listing 6 implements a very simple foreach loop. It takes advantage of the built-in enumerable partitioning capabilities to create one partition on the enumerable per processor. It then spins up one System.Threading.Tasks.Task instance per partition, and that task iterates through the partition's enumerator, executing the body Action<T> delegate for each item found. The using construct ensures that the enumerator is properly disposed when the partition has finished with it.
static void SimpleParallelForEach&lt;T&gt;(IEnumerable&lt;T&gt; source, Action&lt;T&gt; body)
{
    var partitions = Partitioner.Create(source)
        .GetPartitions(Environment.ProcessorCount);
    var tasks = (from partition in partitions
                 select Task.Factory.StartNew(() =>
                 {
                     using (partition)
                         while (partition.MoveNext())
                             body(partition.Current);
                 })).ToArray();
    Task.WaitAll(tasks);
}
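To see the loop in action, here's a small self-contained sketch that reproduces the method from Listing 6 and drives it with a hypothetical workload (squaring the numbers 1 through 100 into a thread-safe collection):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class SimpleForEachDemo
{
    // The method from Listing 6, repeated here so the sketch compiles on its own.
    static void SimpleParallelForEach<T>(IEnumerable<T> source, Action<T> body)
    {
        var partitions = Partitioner.Create(source)
            .GetPartitions(Environment.ProcessorCount);
        var tasks = (from partition in partitions
                     select Task.Factory.StartNew(() =>
                     {
                         using (partition)
                             while (partition.MoveNext())
                                 body(partition.Current);
                     })).ToArray();
        Task.WaitAll(tasks);
    }

    static void Main()
    {
        // The body runs concurrently on multiple tasks, so the results
        // collection must be thread-safe; ConcurrentBag fits the bill.
        var squares = new ConcurrentBag<long>();
        SimpleParallelForEach(Enumerable.Range(1, 100), i => squares.Add((long)i * i));
        Console.WriteLine(squares.Count); // 100
        Console.WriteLine(squares.Sum()); // 338350
    }
}
```

Note that the elements arrive in no particular order, another consequence of partitioning: each task drains its own partition independently.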
For More Information
Again, "Samples for Parallel Programming with the .NET Framework 4" (available for download at http://code.msdn.microsoft.com/ParExtSamples) includes the sample code presented in this article, along with several additional custom partitioner implementations, and much more. This partitioning support is part of a significantly larger effort to make parallel programming with .NET dramatically easier, and this article has only scratched the surface of what's possible. For more information, check out the Parallel Computing developer center on MSDN at http://msdn.com/concurrency, as well as the blog of the team that built this parallelism support at http://blogs.msdn.com/pfxteam. We'd love to hear about any work you do to parallelize your applications in .NET 4, so please feel free to get in touch. Happy parallelizing!