Visualizing Parallelism and Concurrency in Visual Studio 2010 Beta 2
Visual Studio 2010 Beta 2 includes many interesting improvements to its multicore programming features. The parallelism and concurrency profiling tools let developers visualize the behavior of a multithreaded application on multicore microprocessors and collect resource contention data. If you want to translate multicore power into application performance, you have to make sure that your concurrent software threads actually run in parallel on the available hardware threads. Visual Studio 2010 Beta 2 improves many of the profiling reports related to parallelism and concurrency.
The IDE uses the name Concurrency. However, I'd rather talk about both parallelism and concurrency. When you create a multithreaded application, using either task-based programming or raw threads, you're creating concurrent code. That doesn't mean the concurrent code is going to run in parallel all the time. It depends on the decisions taken by the operating system scheduler, the underlying hardware and the synchronization between threads, among other factors. Therefore, you need to evaluate whether the programmed concurrency really takes advantage of the parallel hardware. Are the software threads running in parallel on the existing hardware threads? The new Concurrency profiling tools in Visual Studio 2010 Beta 2 provide useful information to answer this question. In other words, the tool lets you visualize parallelism and concurrency, not just concurrency.
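To make the distinction concrete, here is a minimal Python sketch (not taken from Visual Studio or .NET) of the upper bound that the hardware imposes on parallelism. The function name max_parallel_threads is hypothetical; os.cpu_count() is the standard-library call that reports the number of logical cores.

```python
import os


def max_parallel_threads(software_threads, hardware_threads=None):
    """Upper bound on how many software threads can execute at the same instant."""
    if hardware_threads is None:
        # os.cpu_count() reports logical cores; it can return None on some platforms.
        hardware_threads = os.cpu_count() or 1
    return min(software_threads, hardware_threads)


# Eight concurrent software threads on a machine with four hardware threads:
# at most four of them can run in parallel at any given instant.
print(max_parallel_threads(8, hardware_threads=4))  # prints 4
```

This is only an upper bound. The scheduler and the synchronization between threads decide how close the application gets to it, which is what the profiler views help you see.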
This option works with the Premium and Ultimate editions of Visual Studio 2010 Beta 2. It also requires Windows Vista, Windows 7, Windows Server 2008 or Windows Server 2008 R2.
Before beginning, you must run Visual Studio 2010 Beta 2 as Administrator. Then, you can open the multithreaded solution you want to analyze and select Analyze, Launch Performance Wizard… from the main menu. I'm going to explain some of the results you get when you select the Concurrency method (parallelism and concurrency, in my own parallel programming vocabulary) and activate both Collect resource contention data and Visualize the behavior of a multithreaded application, as shown in the following picture:
Specifying the desired profiling method.
If you're working on a 64-bit operating system, you'll probably see a dialog box with this message: "To enable complete call stacks on x64 platforms, executive paging must be disabled. A reboot is then required. To make this change, click "Yes", save your work, and then reboot.", as shown in the following picture:
On 64-bit operating systems, the IDE will disable executive paging and force you to reboot.
Keep in mind that the application takes longer to run while it is being profiled. Once the application finishes or the profiling session is interrupted, Visual Studio will start analyzing the generated report.
A minor criticism: the IDE usually takes a long time to analyze the report. It doesn't take advantage of multicore to run this CPU-intensive process. I think that multicore programming analysis tools should themselves be optimized for multicore. However, remember that I'm talking about Beta 2. As a multicore developer, I expect multicore development environments to take full advantage of modern multicore microprocessors.
The first graph will show a concurrency visualization, displaying the wall clock time, as shown in the following picture:
Visualizing the behavior of a multithreaded application.
Then, you can click on CPU utilization and Visual Studio will display the average CPU utilization for the analyzed process on a graph, considering the available hardware threads (logical cores). In this case, the average CPU utilization was 86%, as shown in the following picture:
Visualizing the CPU utilization.
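One way to read a figure like 86% is as the share of the total capacity of all logical cores that the process consumed during the run. The Python sketch below works through that arithmetic. It is an assumption about how to interpret the number, not a description of how the profiler computes it, and the function name and the sample timings are hypothetical.

```python
def average_cpu_utilization(process_cpu_seconds, wall_seconds, logical_cores):
    """Percentage of the available core-seconds that the process actually used."""
    capacity = wall_seconds * logical_cores  # core-seconds available during the run
    return 100.0 * process_cpu_seconds / capacity


# Hypothetical run: 10 seconds of wall clock time on 4 logical cores,
# during which the process accumulated 34.4 seconds of CPU time.
print(round(average_cpu_utilization(34.4, 10.0, 4), 1))  # prints 86.0
```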
However, you have to be careful whilst analyzing this graph. As I explained in my previous post, "TMonitor: Understanding What Happens With Each Hardware Thread", some technologies like Enhanced Intel SpeedStep Technology and Intel Turbo Boost Technology affect the CPU utilization. Besides, a high CPU utilization percentage could mean huge synchronization overheads. Remember to measure speedup and scalability considering the execution time with different hardware threads (logical cores) before profiling.
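Speedup and scalability can be checked with very little code before you open the profiler. The following Python sketch uses the usual definitions: speedup is the execution time with one hardware thread divided by the execution time with n hardware threads, and efficiency is the speedup divided by n. The timings are hypothetical placeholders; replace them with your own measurements.

```python
def speedup(serial_seconds, parallel_seconds):
    """How many times faster the run is compared with one hardware thread."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, hardware_threads):
    """Speedup per hardware thread; 1.0 would mean perfect scaling."""
    return speedup(serial_seconds, parallel_seconds) / hardware_threads


# Hypothetical execution times, in seconds, keyed by the number of hardware threads used.
timings = {1: 12.0, 2: 6.5, 4: 4.0}

for threads, seconds in sorted(timings.items()):
    s = speedup(timings[1], seconds)
    e = efficiency(timings[1], seconds, threads)
    print(f"{threads} hardware thread(s): speedup {s:.2f}x, efficiency {e:.0%}")
```

If the efficiency drops quickly as you add hardware threads, a high CPU utilization percentage in the profiler is more likely to reflect overhead than useful work.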
Then, you can click on Threads and Visual Studio will display visual timelines for disk activity, the main thread and all the worker threads. This is a very useful visualization because it helps you separate execution time from synchronization time. Visual Studio uses different colors, as shown in the following visible timeline profile:
Visual Studio uses different colors to fill the timelines and offers a very clear summary.
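The kind of summary that these colors convey can be approximated by adding up the time a thread spends in each state. The Python sketch below assumes a timeline is available as a list of (state, seconds) segments. That data layout, the state labels and the function name are hypothetical and are not the profiler's own format.

```python
from collections import defaultdict


def time_share_by_state(segments):
    """Return the percentage of total time spent in each state."""
    totals = defaultdict(float)
    for state, seconds in segments:
        totals[state] += seconds
    whole = sum(totals.values())
    if whole == 0:
        return {}
    return {state: 100.0 * seconds / whole for state, seconds in totals.items()}


# Hypothetical timeline for one worker thread.
timeline = [
    ("execution", 3.0),
    ("synchronization", 1.0),
    ("execution", 3.0),
    ("synchronization", 1.0),
]
print(time_share_by_state(timeline))
```

A thread whose synchronization share is large compared with its execution share is a good candidate for a closer look.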
The following visualization shows the result of running an application that creates groups of worker threads to take advantage of four hardware threads (logical cores). It does not use the work-stealing queues offered by .NET Framework 4 Beta 2:
Visualizing timelines for each worker thread.
The application uses raw threads, and the timelines make it very easy to see that it creates new threads instead of reusing existing ones to schedule tasks. To optimize the application, it is very important to reduce both the thread creation overhead and the amount of synchronization. The profiler shows exactly where each of them occurs.
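The difference between creating a raw thread per work item and reusing a small set of worker threads can be sketched in a few lines. The example below is written in Python for illustration and is not the profiled application, which targets .NET. All function names are hypothetical. It counts how many distinct threads end up running 16 small work items in each approach. In CPython, the global interpreter lock prevents these threads from executing Python code in parallel, so the sketch illustrates only the thread creation pattern, not a speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def work(item, seen, lock):
    """Record which thread ran this item, then do a trivial computation."""
    with lock:
        seen.add(threading.current_thread().name)
    return item * item


def run_with_raw_threads(items):
    """Create one new thread per work item; return how many threads ran work."""
    seen, lock = set(), threading.Lock()
    threads = [
        threading.Thread(target=work, args=(item, seen, lock), name=f"raw-{n}")
        for n, item in enumerate(items)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(seen)


def run_with_pool(items, workers):
    """Reuse at most `workers` threads; return how many threads ran work."""
    seen, lock = set(), threading.Lock()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda item: work(item, seen, lock), items))
    return len(seen)


items = list(range(16))
print("raw threads used:", run_with_raw_threads(items))      # one thread per item
print("pooled threads used:", run_with_pool(items, workers=4))  # never more than 4
```

With raw threads, every work item pays the thread creation cost. With the pool, that cost is paid at most four times, no matter how many items are scheduled.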
Finally, you can click on Cores and Visual Studio will display how each software thread was executed on each available hardware thread (logical core). In this case, the application ran on a quad-core CPU with 4 hardware threads (4 logical cores and 4 physical cores), as shown in the following picture:
Visualizing the software threads running on the available hardware threads (logical cores).
The profiler also summarizes the cross-core context switches, the total context switches and the percentage of context switches that cross cores.
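The relationship between these three numbers is simple arithmetic. The following Python sketch shows it with hypothetical counts; the function name is an assumption as well.

```python
def cross_core_percent(cross_core_switches, total_switches):
    """Percentage of all context switches that moved a thread to another core."""
    if total_switches == 0:
        return 0.0
    return 100.0 * cross_core_switches / total_switches


# Hypothetical counts for one profiling session.
print(cross_core_percent(250, 1000))  # prints 25.0
```

A high percentage means that threads migrate between cores often, which can hurt cache locality.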
These new visualization options are really useful for optimizing applications. They help developers using Visual Studio to successfully translate multicore power into application performance. There are many additional options; this is just an introduction to the new views. I'll be adding real-life examples related to parallel programming and profiling using the new features found in Visual Studio 2010 Beta 2.