Introduction to Power Debugging

If you know how to use them, even the simplest of debugging tools can be powerful.


August 29, 2007
URL:http://www.drdobbs.com/tools/introduction-to-power-debugging/201802993

Previously in the Advanced Products research group at Citrix Systems, Toby is currently working in Processor Enabling at Intel Corporation. He can be contacted at [email protected].


The debugger is one of the simplest tools available to a developer, and yet it is also one of the most powerful. In the simplest case, the debugger can display a stack trace of a thread and set break points on functions. This only requires the debugger to walk the stack or to modify and restore memory locations, yet the outcome can be very powerful: one or two simple commands can solve what would otherwise be a head-banging problem.

Developers may spend hours modifying source code to help track down the problems they are seeing. This is a good thing to do and is often necessary; however, there are tricks and shortcuts that require nothing more than the debugger.

In this article, I will not be covering how to do stack traces or how to track down an access violation. Instead I will show how to use the debugger in non-conventional ways and demonstrate advanced debugging and reverse engineering techniques using very simple debugger features.

The tools I will be using in this article are "cdb", "ntsd" and "windbg", which are freely downloadable debuggers provided by Microsoft (http://www.microsoft.com/whdc/devtools/ddk/default.mspx). There are many other great tools and debuggers, such as Bounds Checker, Softice, Ollydebug and VTune; however, these are beyond the scope of what will be covered here.

Debugger Feature: Break Point Command Strings

The Microsoft debuggers support the automatic execution of several commands upon encountering a breakpoint. This allows you to automate what you can do at a break point. You can use this to add modifications to the program execution at runtime and avoid recompiling.

The debugger command syntax is bp <Address> "<Command String>". There are other parameters that can also be specified, but this is the simplest form. The command string can contain several commands separated by semicolons. The limitation is that the commands will not be executed if you single-step onto the break point location instead of running to it.

The command string may also contain what is known as a conditional break point. This is essentially an if-then-else statement in the command string whose syntax resembles the C ternary operator. The form used in this article is j <expression> ? '<then>' ; '<else>', with the two branches separated by a semicolon. This allows you to conditionally break into the debugger or display different messages depending on the value of some variable or register.

Using Command Strings as "OutputDebugString"

Let's say that you want to track every file creation in an application, every registry key or perhaps window creation. Or you may want to track a function call that's internal to your own source. Perhaps this isn't even your application but another application you have installed on your system that you do not have the source code to modify. How do you go about doing this?

There are many ways to do this and it all depends on what you want to accomplish. There are many tools you can use, such as FileMon (http://www.sysinternals.com/), RegMon (http://www.sysinternals.com/), Microsoft Spy++ and application verifiers, as well as advanced debugger features. There is, however, a simple solution that may sometimes be all that you need: just a break point.

The problem with a traditional breakpoint on a call such as CreateFile is that you would need to manually verify the parameters and then manually type g. CreateFile is also called all the time by a lot of components outside of your own code. The solution is simply to display the parameters you want to the debugger and go. This can be accomplished by using bp kernel32!CreateFileW "du poi(esp + 4);g".

The command string contains two commands, the last of which is g to continue. This allows the preceding commands, only one in this case, to display output and then automatically continue executing the application. The du command displays the memory at the given address as a Unicode string. The poi operator is essentially a pointer dereference: the first parameter, a pointer to the file name, is stored on the stack at ESP + 4, so poi(esp + 4) reads the pointer from that stack location and du displays the string it points to.

Figure 1 shows a partial stack example of CreateFileW. When a break point set on a function's entry is hit, ESP points to the return address. The stack grows downward in memory, so the parameters sit above the return address and you simply need to add multiples of 4 to ESP to reach any parameter.

Figure 1: Partial stack example of CreateFileW.

Automatic Break Points

The diagram from the previous section brings up an interesting point: the return address of a function call is in a known location at the time of the break point. You can display it or even set a break point on it automatically by using the command bp kernel32!CreateFileW "bp poi(esp);g".

Figure 2 demonstrates debugging Notepad with a more complex break point. The break point on CreateFileW will display the file being opened and then automatically create a conditional break point on its return address. That second break point displays "Handle Is Valid" if the handle returned is valid or "Handle Is Not Valid" if it is not. The output has been highlighted and you will notice that there was no interaction from the user.

Figure 2: Debugger window using automatic break points.

The break point used is the following: bp Kernel32!CreateFileW "du poi(esp+4); bp poi(esp) \"j eax == ffffffff ? '.echo Handle Is Not Valid;g' ; '.echo Handle Is Valid;g'\";g". Note the syntax: to insert the quotes for the automatic break point's command string, you need to escape them with a backslash. Single quotes, as opposed to double quotes, are used to group multiple commands together within the conditional break point.

The insertion of automatic break points can be quite useful for avoiding modifying and recompiling your code to place debug output after every function call. This example only displayed a string; however, it could instead stop at a failure and even display the GetLastError value. You can also generate log files of the output using the .logopen command.

There are alternative methods of creating similar functionality for certain types of events. These include using the application verifier to break on system event errors such as file open and create failure. These, however, most likely would not cover libraries or internal APIs that you may want to track. This is also a simple alternative to using tools like application verifier.

Debugger Feature: Custom Debug Extensions

The Microsoft debuggers have an SDK that allows you to write your own debug extension. The extension simply exports debug commands, and the debugger passes your function the parameters entered by the user. Extensions are provided with APIs that allow them to read from and write to the memory of the process being debugged. They are also provided with an API to write text out to the debug window.

In the past, the majority of debug extensions were written to simply dump the contents of data structures in a readable format. This use has all but disappeared with the addition of the dt command and structural information in the PDB symbol files. The dt command provides the ability to dump data structures that are contained in the PDB files. The PDBs provided by the Microsoft symbol server even contain a lot of the internal data structures in Windows. The use of PDBs for this purpose is better since structural information would be automatically updated each build rather than requiring you to modify your extension when the structure changes.

The command set of the debugger itself has also become richer, which makes debug extensions less necessary. Even so, there are still use cases for the debug extension and I will demonstrate a few. I will not go into any details of how to write a debug extension but an article on how to do this can be found at the following URL: http://www.codeproject.com/debug/cdbntsd4.asp.

Injecting & Extracting Binary Data

I was once debugging an application that wasn't properly displaying a bitmap. The problem was that the bitmap was received from the network or from a device and was only visible in the memory space of the process. I could, of course, have attempted to modify the code to write the bitmap to disk, but a lot of the code was contained in a library that I did not have the source for. I could still have modified my own source to read and write to the disk, but I thought of another option that would be more dynamic and reusable.

That other option was to write a debug extension that had the ability to inject binary data into a process as well as extract it. This debug extension could then be used in other cases as a general solution while modifying just this code could not. I was able to extract the bitmap from memory and display it in another image viewer. I was also able to inject other bitmaps into the memory space overwriting that bitmap for use by the application.

This feature has many other uses, such as exporting data to files for binary comparison, extracting files that exist in memory only and the like. I have not included a demonstration; however, the source code for the included debug extension includes !importfile and !exportfile for experimentation and modification.

Poor Man's Performance Timing

There have been times that I've written some code that I would have liked to get performance timing on. There have even been times when I wanted to get performance of an application that I didn't have the source for. The application may have been crunching numbers or talking with hardware and through debugging I had determined the locations I needed to measure. I just needed a way that I could do the measuring.

It is easy when you have the source to just modify the code with QueryPerformanceCounter and OutputDebugString and then recompile. After all, I just want an estimate, and the code I'm attempting to measure isn't a few instructions but rather a more complex operation. The performance will always vary depending on the machine and the system load, so nanosecond accuracy isn't necessary.

The flaw with modifying the code is that it would only work for that piece of code. It wouldn't work with applications I don't have the source to and I would be required to modify all locations that I want to time. I could buy a tool, but perhaps the tool is overkill for what I need and I don't want to spend the money. What about writing a debug extension?

The real question is how much accuracy is lost when using the debug extension as opposed to modifying the source code. In order to find out, we need to measure the timing of some simple code, shown in Example 1.

for (uiIndex = 0; uiIndex < g_Loops; uiIndex++)
{
    dSampleX *= dSampleY;
}
Example 1: Double precision multiplication in a loop.

The data shown in Table 1 is a comparison of two different methods to retrieve performance data. The first listed is "QPC", which is short for the "QueryPerformanceCounter" API. This is an abstraction over whatever method the operating system implements for querying a performance counter. The second is the RDTSC instruction. While the first method could possibly use RDTSC as well, going through the API involves more overhead than using the instruction yourself. The first method, however, could be implemented in a variety of ways, including ACPI hardware ports.

            Modified Source                 Debug Extensions
# of Loops  QPC             RDTSC           QPC             RDTSC
10          0.006705 ms     0.005133 ms     0.569346 ms     0.564379 ms
100         0.051403 ms     0.049727 ms     0.633879 ms     0.604160 ms
1000        0.498667 ms     0.495724 ms     1.074438 ms     1.042093 ms
10000       5.010413 ms     4.993571 ms     5.577245 ms     5.498423 ms
100000      50.437975 ms    50.366829 ms    51.285289 ms    49.725790 ms
1000000     508.223811 ms   507.440129 ms   508.854617 ms   495.261641 ms
Table 1: Timing comparisons

Accuracy and precision are the two factors that make up performance timing. Accuracy is how close your numbers are to the actual time, and precision is the interval in which you are measuring. In the above, our precision is milliseconds, and when doing performance monitoring via a debug extension I would suggest measuring only operations for which millisecond resolution is fine enough.

            Modified Source                 Debug Extensions
            QPC             RDTSC           QPC             RDTSC
Overhead    0.006705 ms     0.000051 ms     0.435530 ms     0.421219 ms
Table 2: Performance-monitoring overhead

The data listed in Table 2 is the performance timing of doing nothing and essentially illustrates how much overhead each method typically has. Example output is shown in Figure 3 when using the RDTSC debug extension.

Figure 3: Using the RDTSC debug extension.

The following are some of the key factors when interpreting the performance numbers:

  1. The Hardware: The hardware is one of the factors in determining the speed of a segment of code. The performance monitoring I performed was using a Pentium 4M 1.8 GHz laptop.
  2. The Machine Load: The system load will affect the outcome of performance numbers. You can show this by moving your mouse or clicking on an application while performance numbers are being displayed. You can attempt to be more accurate by increasing the priority of the timing thread up to "real time"; however, depending on what that thread is doing, you may end up skewing the results or locking up the system. The thread being timed may, for example, depend on another thread to perform I/O on its behalf.
  3. What is being timed: The other factor is essentially what is being timed. If you are timing a hardware operation, for example, context switching may skew your results less, as your thread will essentially go into a "WAIT" period. The hardware may then crunch or move the data independently of the CPU, and when it's finished the thread will be scheduled again. In an example of straight number crunching, however, as demonstrated here, a context switch will interrupt the processing and thus is more likely to skew the final time.

In my timing you will note that I generally did not get one straight time for any of these samples. In fact, they can vary by several milliseconds due to context switching, for example. You will notice that the longer the operation you are timing, the less the overhead of using a debug extension skews the results. In fact, you may notice that in some cases the result was actually lower with the debug extension than with the modified code! This may be a result of the debug extension running in the context of another thread and as such not sharing its time slice with the thread being timed.

This example demonstrates that if you are looking simply for estimates within a few milliseconds then a debug extension may just do the trick. However, if you need more accurate timing then a more isolated environment using increased thread priorities and possibly specialized tools may be a more appropriate solution.

Debugger Feature: Symbol Search

The Microsoft debuggers provide the functionality of searching symbols for function names using wild cards. The search can be limited to a single module or, by using a wild card for the module name as well, run across all loaded modules. The syntax for this is x <modulename>!<functionname>, in which both names can use a combination of letters and * to represent wild cards.

Poor Man's MSDN

This is a very useful feature for those who forget or want to find a function name. It's local, it's fast and it doesn't require you to know the name of the function. Just attach the debugger or start a process in the debugger that loads that DLL. You can then do a wildcard search for functions. I even do cdb rundll32 x.dll bogus in order to simply load a particular DLL and search for functions. I find this feature useful when I'm not even debugging anything!

Conclusion

The debugger is a powerful tool that can be used for a wide variety of purposes beyond just tracking down an access violation. The debugger is not a substitute for the appropriate tools; however, it does provide functionality that just may be "good enough" for what you are attempting to accomplish.
