Multicore Testing Requires Real Parallelism to Happen

Testing an application designed to run concurrent code can become a nightmare on old-fashioned testing platforms. Multicore testing requires new techniques, new expertise and new hardware. For example, you cannot guarantee a parallelized application's accuracy by testing it only on computers with single-core microprocessors.

I'm going to borrow a sentence from Bram Stoker's "Dracula":

"We learn from failure, not from success!"

One of the most frustrating experiences in multicore programming is a parallelized application that generates unexpected, seemingly random problems. It is even more annoying when that application has already passed the testing process. Why could this happen? Because testing techniques also have to Go Parallel.

Usually, the best computers (workstations or servers) are dedicated to running the final version of the applications. Nowadays, a server is very likely to have at least four logical processing cores (four hardware threads).

You can parallelize an existing algorithm and debug it using a dual-core CPU (two logical processing cores, two hardware threads). Then, you could perform an extensive testing process on many different dual-core computers (again, two logical processing cores, two hardware threads). The application could offer accurate results and work as expected. However, when you run the application on the server, something could go wrong. A hidden bug could appear, a bug generated by an interleaving of instructions that the tests never explored.

Two hardware threads do not guarantee real concurrency during the whole time the algorithm is scheduled to run in parallel. The great problem is that the operating system, its scheduler, the kernel and all the other processes and software threads are competing for processing time. They can prevent real concurrency from happening, because the two threads are not always running in parallel. This situation can hide some concurrency bugs. It's a question of timing. Some instructions are not running in parallel; they are not running at the same time because other threads are stealing processing time.

However, when you move to the parallel processing power offered by the server, the additional hardware threads (logical cores) offered by this computer enable the software threads to run in parallel. Hence, real concurrency will happen. Pure concurrency bugs will appear because the instructions that produce the problem will run at exactly the same time.
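To make this kind of hidden bug concrete, here is a minimal Java sketch of a lost-update race. The class name LostUpdateDemo and its methods are hypothetical, invented for this illustration; the article itself does not show any code. Several threads increment a plain field and an AtomicLong side by side. The plain counter can lose updates whenever two threads really run the increment at the same time, and how often that happens depends on the hardware threads available and on scheduling, so the result varies from machine to machine and from run to run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class LostUpdateDemo {
    // Plain field: unsafeCounter++ is a read-modify-write, not one atomic step.
    private long unsafeCounter = 0;
    // Reference counter: incrementAndGet() is atomic.
    private final AtomicLong safeCounter = new AtomicLong();

    /** Starts `threads` workers; each increments both counters `iterations` times. */
    public void run(int threads, int iterations) {
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    startGate.await(); // hold every worker until all are started
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < iterations; i++) {
                    unsafeCounter++;               // racy increment
                    safeCounter.incrementAndGet(); // atomic increment
                }
            });
            workers[t].start();
        }
        startGate.countDown(); // release all workers together
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
    }

    public long unsafeTotal() { return unsafeCounter; }

    public long safeTotal() { return safeCounter.get(); }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        int iterations = 1_000_000;
        LostUpdateDemo demo = new LostUpdateDemo();
        demo.run(threads, iterations);
        System.out.println("hardware threads: " + threads);
        System.out.println("expected total:   " + (long) threads * iterations);
        System.out.println("atomic counter:   " + demo.safeTotal());
        System.out.println("plain counter:    " + demo.unsafeTotal());
    }
}
```

The atomic counter always reaches the expected total. The plain counter can only match it or fall short, and a run that happens to match proves nothing: it may simply mean the threads did not overlap often enough on that machine.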

How can you detect these pure concurrency bugs? You have to use appropriate hardware to let real parallelism happen. You cannot test a parallelized algorithm by running it on single-core microprocessors. You need more logical cores, more hardware threads. You have to use hardware that is adequate for the kind of parallelization you want to create. This doesn't mean that you need 256 logical cores to develop an application capable of scaling to that number of cores. It does mean that, sometimes, two logical cores aren't enough.
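One simple way to act on this advice is to make a concurrency test check the machine it runs on before trusting its own result. The following is a minimal Java sketch under that assumption; the class HardwareThreadGuard and its method names are hypothetical. It relies on Runtime.availableProcessors(), which reports the number of processors available to the JVM. That value can be lower than the physical core count when the process is restricted by the operating system or a container.

```java
public class HardwareThreadGuard {
    /** Number of logical processors the JVM reports as available to it. */
    public static int hardwareThreads() {
        return Runtime.getRuntime().availableProcessors();
    }

    /**
     * Returns true when the machine offers at least one hardware thread
     * per software thread the test intends to run at the same time.
     */
    public static boolean canExerciseParallelism(int softwareThreads) {
        return hardwareThreads() >= softwareThreads;
    }

    public static void main(String[] args) {
        // Hypothetical default: the test wants four threads running at once.
        int wanted = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        if (canExerciseParallelism(wanted)) {
            System.out.println("OK: " + hardwareThreads()
                    + " hardware threads for " + wanted + " software threads");
        } else {
            System.out.println("WARNING: only " + hardwareThreads()
                    + " hardware threads; " + wanted
                    + " software threads will be time-sliced, so a passing"
                    + " test says little about real concurrency");
        }
    }
}
```

A test suite could call canExerciseParallelism before its concurrency tests and report them as inconclusive, rather than passed, when the check fails.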

Once you face these horrible, difficult-to-detect bugs, you'll learn to create better parallelized algorithms. You'll learn many things from failure. The recently launched Intel® Parallel Studio offers an excellent toolbox to detect these bugs. It is available for the C and C++ programming languages.

Most modern IDEs (Integrated Development Environments) are adding features to help developers detect and solve these bugs. However, I do believe Intel® Parallel Studio is the most complete toolbox. I'd love to see versions for .NET and the JVM (Java Virtual Machine) in the future.

Don't forget to check your testing platforms and environments before deploying the final version of a parallelized application. By doing so, you'll avoid terrifying concurrency nightmares.

