Debugging As Science: A Concrete Example
Last week, I argued that debugging is more science than art. I would like to continue that discussion with an example from my own experience.
Once upon a time, a colleague brought me a C program. He had spent several days trying to figure out why it didn't work. Not only that, but it failed in a remarkable way: It would terminate, apparently normally, but not produce any output.
The debugging tools available on the machine I was using were rudimentary, but they did offer the ability to trace each function as it was called. Doing so produced large amounts of trace output, of course, but it also showed me that the last function it called was printf. In other words, the program was trying to print, but something caused the program to terminate without actually printing anything. Moreover, the program was failing in this way the first time it tried to print anything.
It's hard to figure out why a program is crashing in a call to printf by inserting more calls to printf. Instead, I decided to try a scientific approach by making a hypothesis: The reason printf terminated without printing anything was that it was trying to allocate memory for an output buffer, and that allocation attempt was failing for some reason. I picked this particular hypothesis because I could not imagine any other way in which printf might fail without indicating that failure in some way.
My next question was why the buffer allocation might have failed; my hypothesis was that the memory allocator's internal data structures had become corrupted in some way. Experience showed that such corruption could happen very easily if a program allocated less memory than it actually needed. In short, my hypothesis was that the program was failing because a call to malloc was being passed a value that was too small.
Moreover, I hypothesized that whatever was given to malloc was only a little too small, not much too small. My reasoning was that if the value was much too small, then the result would be to overwrite a lot of malloc’s internal storage, which I figured might scramble things so badly as to cause an overt crash rather than a mere allocation failure.
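To see why a slightly undersized allocation can cause this kind of silent corruption, here is a minimal sketch of my own (not code from the program in question, and the names are hypothetical): the request leaves out the null terminator, so the copy writes one byte past the end of the block, which may land in the allocator's own bookkeeping.

#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration of the suspected failure mode: the request
   forgets the null terminator, so strcpy writes one byte past the end
   of the block -- possibly into malloc's internal data structures. */
char *copy_of(const char *msg) {
    char *p = malloc(strlen(msg));   /* one byte too small */
    if (p == NULL) return NULL;
    strcpy(p, msg);                  /* overruns the block by one byte */
    return p;
}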
How might I test this hypothesis? My first step was to write a substitute for malloc:
void *malloc1(size_t n) { return malloc(n + 8); }
This function behaved just like malloc, except that it allocated 8 bytes more memory than was requested. I added this function to the program I was testing, and then used a text editor to change every call to malloc into a call to malloc1. As soon as I did this, the program worked!
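To make the substitution concrete, here is a hypothetical call site (not from the original program) showing what the text-editor change amounts to; one pleasant property of this padding trick is that free needs no change, because malloc1 still returns the start of the allocated block.

#include <stdlib.h>

void *malloc1(size_t n) { return malloc(n + 8); }

/* Hypothetical call site, before and after the edit. */
void example(size_t n) {
    char *buf = malloc1(n);   /* was: char *buf = malloc(n); */
    /* ... use buf exactly as before ... */
    free(buf);                /* unchanged: the pointer still marks the block's start */
}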
Now that I was reasonably certain that the problem was that one of the calls to malloc was allocating too little memory, all that remained was to find which call to malloc was the culprit. I could have changed the calls to malloc1 back to calls to malloc one by one, but in this case I thought it would be easier to start by examining each call and trying to prove that it was allocating the right amount of memory.
It did not take me long to find the problem: One call to malloc was in a context something like this:
char *s = malloc(strlen(s1) + strlen(s2) + 1);
strcpy(s, s1);
strcat(s, ".");
strcat(s, s2);
Here, the + 1 was intended to account for the period between the values of s1 and s2, but it failed to account for the null terminator at the end of the result. Because of this failure, s always pointed to one byte less memory than was necessary to contain the result. Changing the 1 to 2 in this code fixed the problem permanently.
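For reference, the corrected fragment looks like this, with the 2 covering both the period and the terminating null:

char *s = malloc(strlen(s1) + strlen(s2) + 2);   /* 1 for the period, 1 for the null terminator */
strcpy(s, s1);
strcat(s, ".");
strcat(s, s2);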
The essence of science is forming hypotheses about how the world works, then testing those hypotheses by experiment. It usually takes just a single failed experiment to disprove a hypothesis, and yet no number of successful experiments will completely prove it. Nevertheless, each time we do an experiment that tends to confirm a hypothesis, our faith in that hypothesis grows.
For example, we can hypothesize that the gravitational attraction between two objects is inversely proportional to the square of the distance between them. If this hypothesis is true, it will predict properties of planetary orbits that we can confirm by observation. No such observations will prove that the hypothesis is correct, but the more observations we make that are consistent with it, the more confident we are in its correctness.
Similarly, in this example, I hypothesized that this program's failure was due to memory overrun, a common cause of such failures. I constructed an experiment that, if it failed, would disprove my hypothesis. The experiment succeeded, which — although it did not prove the hypothesis — strengthened my faith in it. With this partial confirmation, I felt it was worthwhile to inspect every place the program allocated memory. Without the confirmation, I might have tried to narrow the search further by other means.
Part of the reason I say that debugging is more science than art is how often I have been able to use this kind of hypothesis/experiment technique to find bugs in programs.