Dr. Dobb's | Parallel Extensions to the .NET Framework

Parallel Extensions to the .NET Framework

Parallel extensions to the .NET Framework. Do we need them?

April 23, 2008
URL:http://www.drdobbs.com/windows/parallel-extensions-to-the-net-framework/207401588

Matt is director of technology for Lab49 (www.lab49.com), a company that builds advanced technology applications for the financial services industry. You can read his blog at https://mdavey.wordpress.com.

Microsoft Research and the CLR team at Microsoft are on to a winner with their "Parallel Extensions to the .NET Framework," formally known as the "Parallel FX Library." The aim of Parallel Extensions is to help software engineers take advantage of the multiprocessors/multicore hardware that is increasingly common on the desktop and server.

Parallel Language Integrated Query (PLINQ), part of Parallel Extensions, appears to be gaining the most column inches in blog land. PLINQ is an extension to Language Integrated Query (LINQ) that became available in .NET Framework 3.5. LINQ has spawned numerous pipeline extensions since its debut, some available at CodePlex; Streaming LINQ, for instance. Why do we need Parallel Extensions to the .NET Framework? The general message coming from Microsoft is:

"The free lunch is over," no more Moore's Law.
"Multicore is the future for performance and scalability" (at least, according to Intel).

This therefore leads to the simple equation: Concurrency == Parallel Extensions for the .NET Framework.

MSDN magazine has published a number of Parallel FX articles (msdn.microsoft.com/msdnmag/issues/07/10/PLINQ), so rather than regurgitate the same code here, I'll assess the impact of Parallel Extensions from the perspective of investment banking software projects. Putting business logic aside, many investment bank applications live or die on performance. If technology can reduce the latency of processing trades, then its uptake within a bank will happen swiftly. PLINQ offers a simple way to execute queries in parallel. The downside to this is that code that wasn't originally intended to execute in parallel could now cause applications to exhibit unexpected behavior if not thoroughly tested.

Today, most banking software engineers don't have multiprocessor/multicore hardware, but the business side of the banks (traders/sales) do. The mismatch between the development hardware and the production hardware means that software engineers writing .NET code can't be sure that their application will perform correctly once it is deployed. Coupled to this concern is the fact that the testing hardware often doesn't mirror production hardware, which again means that threading bugs in production can't be replicated in development. To take advantage of these new approaches, it is important for banks to ensure that their software engineers have appropriate hardware and software prior to adopting Parallel Extensions.

PLINQ solves the short-term problem of letting developers take advantage of multiprocessor/multicore hardware; however, PLINQ doesn't address a way to access the ever-expanding compute-grid infrastructure that investment banks are building.

Microsoft Research has an answer to the PLINQ compute-grid problem—DryadLINQ (research.microsoft.com/research/sv/DryadLINQ). Dryad enables reliable, distributed computing on thousands of servers for large-scale data-parallel applications. Microsoft AdCenter already deploys Dryad in a live production system. DryadLINQ enables data parallel applications to run over the Dryad distributed computing environment. Unfortunately, at this time, DryadLINQ is only available for Microsoft internal use.

Given that PLINQ went CTP in late 2007, could we build a grid variant of LINQ today to start taking advantage of compute grids? If we assume for this discussion we're going to use DigiPede (www.digipede.net) as our distributed computing solution, we could develop a C# 3.0 extension method that submits work to the grid:


namespace Home.GridLinq
{
  public static class GridEnumerable {
      public static IEnumerable<T> OnGrid<T>(         this IEnumerable<T> source)
      {
// Submit work for DigiPede (see http:mdavey.wordpress.com)
      }
   }
}

Using the OnGrid extension method above, we could then price trades in parallel on the DigiPede grid:


var prices = trades.OnGrid().Select(t => t.Price)

Assuming that trades is a collection of Trade class objects that follow the Digipede Worker pattern (Serializable), the LINQ expression above effectively serializes the Trade objects out onto the compute grid, letting Digipede agents execute pricing, and returning the results back to the client LINQ application.

Where are we? Sitting in the financial vertical, PLINQ can't be released quickly enough. As long as banks understand the implications, PLINQ will improve the trading cycle. Microsoft, however, needs to continue to push in the concurrency field, making developers' lives easier by giving them access to cores locally as well as remotely, and ensuring that the development/debugger tools keep up with the library as it advances.