Mario Hewardt is the founder of The High-Tech Avenue, a consulting and educational company that specializes in Windows and .NET. Mario is also the author of Advanced .NET Debugging and Advanced Windows Debugging.
Far too often, performance is overlooked and tackled too late in the release schedule. And while developers spend long hours chasing elusive -- but "standard" -- bugs (crashes, resource leaks and the like), performance bugs go unnoticed for long periods of time or are not addressed at all. After all, "it's not crashing, just taking a little longer than expected."
I would bet that all developers have been in situations where all the standard bugs have been taken care of and the application works satisfactorily in staging -- but once taken to production, it comes to a grinding halt. After weeks of troubleshooting (all the while revenue is impacted) by going through log files, performance counters and other interesting data points, the problem is finally isolated and a fix is implemented and deployed.
One of the interesting challenges when troubleshooting performance problems is finding a suitable toolset that can quickly help you find the root cause of the problem. It is common for performance problems to lie outside of your direct code base. Imagine a credit-card processing server where the RAID controller has been incorrectly configured, causing dropped transactions. In this case, you can look at your credit-card transaction code until you are blue in the face and never find the cause. What is needed is a tool that is able to tell you about the system as a whole to help identify bottlenecks.
Fortunately, that tool exists -- the Microsoft Windows Performance Toolkit. In this article I focus on a subset of the toolkit called XPERF, a powerful tool that helps with overall system performance analysis.
Installing XPERF
XPERF is part of the Microsoft Windows Performance Toolkit (MSWPT), which in turn is part of the Windows SDK. Since the MSWPT is a fairly lean toolkit, it might seem like overkill to have to download the entire Windows SDK to get just that toolkit. Fortunately, the Windows SDK Web Installer lets you specify a subset of the Windows SDK to download and install, and MSWPT is part of the Development Tools subset. To use the Windows SDK Web Installer, go to the Microsoft web site and follow the download instructions for the particular Windows SDK version. For example, on my machine, I would use the following link to install the Windows 7 SDK:
http://www.microsoft.com/downloads/details.aspx?FamilyID=c17ba869-9671-4330-a63e-1fd44e0e2505&displaylang=en
Once the Windows SDK Web Installer starts running, I deselect all the options except for the Development Tools. An interesting caveat here is that once the installation completes, you're almost all the way there but not quite. The installer doesn't actually install MSWPT; rather, it copies the installer package (MSI) to your installation path. The final step in the installation process is to manually invoke the MSI to get the toolkit installed. For example, on my machine, I installed the Development Tools to the following folder:
C:\Program Files\Microsoft SDKs\Windows\v7.0
Upon successful installation, the following files are available under the bin folder:
- wpt_ia64.msi
- wpt_x64.msi
- wpt_x86.msi
Each of the installation packages represents the toolkit for a given architecture. If you run the package corresponding to the architecture of choice, the installation process is straightforward and installs the toolkit in the folder of choice. On my machine, I left the default options and it installed into:
C:\Program Files\Microsoft Windows Performance Toolkit
In this folder, you will see the two binaries that we will be using throughout this article:
- xperf.exe -- Drives the configuration and collection of performance data
- xperfview.exe -- Used to analyze the performance data collected from a previous run of xperf.exe
Architecture
As with any type of troubleshooting, success is directly proportional to the amount of diagnostics data that is available for analysis. I'm sure we've all been in a situation where we were debugging a heap corruption only to find out that the crash dump file we were analyzing didn't contain full memory information, thereby limiting our chances of finding the root cause. The same is true when debugging performance problems -- the more data that is available, the higher the success rate. By increasing the amount of diagnostics data that is logged, however, we also increase the pressure on the system as a whole. One of the biggest questions that always comes up with any type of tracing is: how much can we trace before it adversely affects the system?
The answer is that it depends on the tracing technology that is utilized. I can very easily write a simple tracer that writes and flushes to a file every time the log method is called. Of course, that approach is highly inefficient, and I probably won't be able to log enough data to be useful before the system is affected. XPERF faces that same problem. For it to be a useful tool (across the system as a whole) it must be able to log large amounts of diagnostics data. How does it go about doing that without affecting the system overall? The answer lies in Event Tracing for Windows (ETW), a general-purpose and super-efficient tracing mechanism that is built into Windows (both in kernel and user mode). In addition to being extremely performant, ETW lets you dynamically enable/disable logging without having to restart the system and/or application. Figure 1 illustrates the overall architecture of ETW.
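To make the cost of the "write and flush on every call" tracer more concrete, here is a minimal Python sketch that contrasts it with a tracer that batches events in memory before writing them out. The class names, the `batch_size` parameter, and the `run_demo` helper are all hypothetical and exist only for this illustration. It is an analogy for why batching reduces pressure on the system, not a description of how ETW is implemented.

```python
import os
import tempfile


class NaiveTracer:
    """Writes and flushes to the file on every log call (the inefficient approach)."""

    def __init__(self, path):
        self._file = open(path, "w", encoding="utf-8")
        self.flush_count = 0

    def log(self, message):
        self._file.write(message + "\n")
        self._file.flush()  # one flush per event
        self.flush_count += 1

    def close(self):
        self._file.close()


class BufferedTracer:
    """Collects events in memory and writes them out in batches."""

    def __init__(self, path, batch_size=1000):
        self._file = open(path, "w", encoding="utf-8")
        self._batch_size = batch_size  # hypothetical tuning knob
        self._buffer = []
        self.flush_count = 0

    def log(self, message):
        self._buffer.append(message)
        if len(self._buffer) >= self._batch_size:
            self._flush()

    def _flush(self):
        if not self._buffer:
            return  # nothing pending, so no I/O is needed
        self._file.write("\n".join(self._buffer) + "\n")
        self._file.flush()  # one flush per batch
        self._buffer.clear()
        self.flush_count += 1

    def close(self):
        self._flush()  # write out any partial batch
        self._file.close()


def run_demo(event_count=5000):
    """Log the same events with both tracers and return their flush counts."""
    with tempfile.TemporaryDirectory() as tmp:
        naive = NaiveTracer(os.path.join(tmp, "naive.log"))
        buffered = BufferedTracer(os.path.join(tmp, "buffered.log"))
        for i in range(event_count):
            naive.log(f"event {i}")
            buffered.log(f"event {i}")
        naive.close()
        buffered.close()
        return naive.flush_count, buffered.flush_count
```

With the default settings, the naive tracer performs one flush per event while the buffered tracer performs one per thousand events. The trade-off is that buffered events can be lost if the process dies before a batch is written.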

Fundamentally, ETW consists of four primary components:
- Session: The base entity responsible for writing traces to the trace files.
- Producer: A producer writes event traces to one or more sessions.
- Controller: A controller can configure ETW sessions as well as enable and/or disable producers.
- Consumer: An application that reads event traces from the ETW trace logs.
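The relationship between these four components can be sketched as a small toy model in Python. Every class, method, and event name here is hypothetical and chosen only to mirror the list above; none of it is the actual ETW API. The sketch assumes the simplest possible behavior: a producer silently drops events until a controller enables it.

```python
class Session:
    """Receives events from enabled producers and stores them as a trace."""

    def __init__(self, name):
        self.name = name
        self.trace = []  # stands in for the trace file

    def write(self, producer_name, payload):
        self.trace.append((producer_name, payload))


class Producer:
    """Writes events to every attached session, but only while enabled."""

    def __init__(self, name):
        self.name = name
        self.enabled = False
        self.sessions = []

    def emit(self, payload):
        if not self.enabled:
            return  # a disabled producer drops the event
        for session in self.sessions:
            session.write(self.name, payload)


class Controller:
    """Configures sessions and enables or disables producers at run time."""

    def start_session(self, name):
        return Session(name)

    def enable(self, producer, session):
        if session not in producer.sessions:
            producer.sessions.append(session)
        producer.enabled = True

    def disable(self, producer):
        producer.enabled = False


class Consumer:
    """Reads events back from a session's trace."""

    def read(self, session, producer_name=None):
        return [
            payload
            for name, payload in session.trace
            if producer_name is None or name == producer_name
        ]


def demo():
    controller = Controller()
    session = controller.start_session("demo")
    disk = Producer("disk_io")

    disk.emit("read 4 KB")    # dropped: producer not enabled yet
    controller.enable(disk, session)
    disk.emit("write 8 KB")   # recorded in the session
    controller.disable(disk)
    disk.emit("read 16 KB")   # dropped again, no restart required

    return Consumer().read(session, "disk_io")
```

In this model, only the event emitted while the producer was enabled ends up in the trace, which is the toy equivalent of switching logging on and off without restarting the application.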
Based on the architecture depicted in Figure 1, XPERF is both a controller (it controls what is being collected, i.e., which producers are enabled) and a consumer, since it can display the results of the collection in a meaningful fashion to the user. For example, I can configure XPERF to collect certain kernel mode diagnostics data and then use the resulting file to analyze the results.
The key takeaway here is that XPERF is based on an incredibly powerful tracing mechanism native to the Windows operating system. It has the ability to tell (control) Windows exactly what to collect in an extremely efficient manner -- far more efficient than any other tracing mechanism.