New Garbage Collectors Designed With Parallelism in Mind

On the one hand, Garbage Collectors simplify developers' lives; on the other, they can become the greatest enemies of a parallelized algorithm's performance. At last, Java 7 and .NET 4 are going to offer new Garbage Collectors designed for multicore microprocessors with large memories.

I work with many programming languages, including unmanaged C++, C# and Java. One of the most exciting features of both C# and Java is their Garbage Collectors. Most developers tend to forget about releasing unused resources, and the usual recommendation is to let the Garbage Collectors do their work. It is a "Mother Nature will provide" approach.

One of the great advantages of designing an algorithm that is going to be programmed in unmanaged C++ is that the developer is responsible for releasing resources at the right time. This is very important in complex algorithms running many concurrent tasks on multicore microprocessors. Parallelized algorithms usually require more memory than their serial versions, so choosing the right time to release resources is crucial to achieving the best possible performance. There is no Garbage Collector marking elements to be released in the next collection. You don't have to trust the Garbage Collector's fortune-telling capabilities. You know the algorithm and you know exactly what you want to do. You control all the variables.

However, when you work with C# or Java and you trust the Garbage Collectors' algorithms, your algorithm can be the next victim of their inaccuracies. As mentioned above, parallelized algorithms usually require more memory than their serial versions. Therefore, they put great pressure on the Garbage Collectors, which can introduce serious performance problems into algorithms with outstanding designs.
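
To make that pressure visible, here is a minimal Java sketch that runs an allocation-heavy workload on one thread per logical core and reports how many collections happened and how long they took. It uses the standard `java.lang.management` API. The class name `GcPressureDemo`, the workload `allocateBuffers` and the buffer sizes are hypothetical and chosen only for illustration. The numbers it prints depend entirely on the JVM, the collector and the heap settings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class GcPressureDemo {

    // Total collections reported by every collector in this JVM.
    // A collector returns -1 when the value is undefined, so clamp at 0.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    // Hypothetical workload: allocate short-lived buffers and touch each one.
    // Every iteration adds exactly 1 to the checksum.
    static long allocateBuffers(int iterations, int bufferSize) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] buffer = new byte[bufferSize];
            buffer[i % bufferSize] = 1;
            checksum += buffer[i % bufferSize];
        }
        return checksum;
    }

    // Runs the same workload on 'tasks' threads and sums the checksums.
    static long runInParallel(int tasks, final int iterations, final int bufferSize)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<Long>> results = new ArrayList<Future<Long>>();
            for (int t = 0; t < tasks; t++) {
                Callable<Long> task = () -> allocateBuffers(iterations, bufferSize);
                results.add(pool.submit(task));
            }
            long sum = 0;
            for (Future<Long> result : results) {
                sum += result.get();
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        long countBefore = totalGcCount();
        long timeBefore = totalGcTimeMillis();

        long checksum = runInParallel(cores, 20_000, 64 * 1024);

        System.out.println("Threads: " + cores + ", checksum: " + checksum);
        System.out.println("Collections during run: " + (totalGcCount() - countBefore));
        System.out.println("Approx. GC time (ms): " + (totalGcTimeMillis() - timeBefore));
    }
}
```

Running it with a different number of tasks, or with a different collector selected on the command line, gives a rough feel for how much of the run is spent collecting rather than computing.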

The main problem is that the algorithms used in the current versions of the Garbage Collectors were not optimized to run on microprocessors with a large number of cores. They were optimized for multiprocessor systems. However, a Core i7, for example, offers 8 logical cores in a single physical microprocessor, which is completely different from a system with 8 physical microprocessors. Garbage Collectors are really complex, and the hardware available nowadays is different from the hardware that was available a few years ago.

Luckily, .NET 4.0 and Java 7 will offer new Garbage Collectors optimized for multicore microprocessors. Both were designed to target the new microarchitectures, support high levels of concurrency, manage larger memory and reduce the latencies introduced during an application's execution. Of course, they have many differences, because the JVM (Java Virtual Machine) and .NET's CLR (Common Language Runtime) are very different. However, the Garbage Collectors are changing in similar directions.

This is great news for C# and Java developers thinking seriously about multicore programming.

.NET's new CLR 4 will offer a new Garbage Collector mode, Background GC, which reduces latency, among other improvements. You can watch the video of the presentation given by Joshua Goodman at the Lang.NET Symposium 2009. CLR 4 is available in .NET Framework 4.0 Beta 1 and Visual Studio 2010 Beta 1.

Java 7 will offer the new G1 Garbage Collector, also known as Garbage-First. G1 has been available as an early preview since Java 6 Update 14. An excellent white paper explains its technical details, and the slides of "The Garbage-First Garbage Collector", by Tony Printezis and Paul Ciciora, are also worth a look.
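
If you want to try G1, the Java 6 Update 14 preview enabled it with the HotSpot options `-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC`; later JVMs need only `-XX:+UseG1GC`. The following sketch, written against a current JDK rather than the preview, lists the collectors the running JVM reports so you can check which one is actually active. The class name `ActiveCollectors` is hypothetical, and the check for names starting with "G1" is an assumption about how HotSpot labels its G1 collectors, not a documented guarantee.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectors {

    // Names of the collectors the running JVM exposes through JMX.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Assumption: G1 collectors are reported with names that start with "G1".
    static boolean looksLikeG1(List<String> names) {
        for (String name : names) {
            if (name.startsWith("G1")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("G1 in use (judging by name): " + looksLikeG1(names));
    }
}
```

Launching the same class with and without the G1 option is a quick way to confirm that the option took effect before measuring anything.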
