
Andrew Koenig

Dr. Dobb's Bloggers

Debugging by Hypothesis

December 13, 2012

Last week's article got an unusually large number of comments, so I'm going to continue this theme with another example. This time, the failing program was a compiler.


There's a technical term for people who claim that their programs don't compile because of a compiler failure rather than a bug in their own programs: arrogant. In this particular case, though, this arrogance was justified, because I would compile a program and get a bunch of error messages, and then I would compile exactly the same program again without any problems at all. Moreover, the error messages seemed to have nothing to do with my program: They would complain about invalid characters that my program most definitely did not have.

It's particularly hard to find out what is wrong when doing the same thing twice gives different results. Even worse was that the failures were rare: The spurious error messages would occur only about one time in 10. Moreover, I had nothing to do with the compiler in question, so I couldn't fix the bug even if I knew what it was. On the other hand, I needed to be able to compile my programs. What could I do?

I decided that my only chance of getting anywhere was to figure out how to reproduce the problem. If I could say to the compiler folks: "When you do X, Y, and Z, your compiler occasionally produces spurious error messages," then I might be able to get them to fix it.

This particular compiler had several phases, which were run from a shell script. As a result, the logical first step was to figure out which phase was failing. Once I had done that, I could easily change the shell script to capture a copy of the input to that phase. My hope was that I could run just that phase with that particular input, thereby provoking a failure. I would then be able to give this input file to the compiler group and tell them that their compiler occasionally, but not always, failed with this particular input.
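The capture step can be sketched in a few lines of shell. Everything here is hypothetical — the article never names the phases — so `phase1` and `phase2` are stand-in functions; the point is only that inserting a `tee` between two phases records an exact copy of the suspect phase's input without otherwise changing the pipeline:

```shell
#!/bin/sh
# Sketch only: "phase1" and "phase2" are hypothetical stand-ins for the
# real compiler phases, which the article does not name.
phase1() { tr 'a-z' 'A-Z'; }          # pretend front end
phase2() { wc -c | tr -d ' '; }       # pretend failing phase

printf 'int main;\n' |
    phase1 |
    tee captured-input.txt |          # keep an exact copy of phase2's input
    phase2 > phase2-output.txt

# The captured file can now be replayed later:  phase2 < captured-input.txt
```

Because `tee` copies bytes verbatim, the replayed run should, in principle, see exactly what the live run saw — which is what makes the twist at the end of this story so interesting.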

This is probably a good time to take a step back and look at the principles behind this strategy.

  • An important early step in debugging any problem is to learn how to make it fail.
  • If you can divide a failing program into several parts (compiler phases in this case), and you can show that the failure is happening in a particular part, the problem will be easier to find.
  • If someone else is going to be fixing the problem you have found, it is important to bundle up all the information needed to produce a failure.

These principles seem straightforward enough. However, an important one is missing:

  • Once you have bundled up the information needed to produce a failure, you must verify that your information actually does produce a failure.
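That verification step is easy to mechanize: rerun the suspect phase on the captured input many times and count failures. A minimal sketch, with a hypothetical `suspect_phase` standing in for the real one (and, as in the story, never failing):

```shell
#!/bin/sh
# Hypothetical repro harness: substitute the real phase and the real
# captured input file for "suspect_phase" and /dev/null.
suspect_phase() { return 0; }         # stand-in that, as in the story, never fails

failures=0
for i in 1 2 3 4 5 6 7 8 9 10; do
    suspect_phase < /dev/null > /dev/null 2>&1 || failures=$((failures + 1))
done

echo "failures: $failures/10"
if [ "$failures" -gt 0 ]; then
    echo "failure reproduced"
else
    echo "did not reproduce"          # the result described below
fi
```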

I would have been embarrassed indeed if I had not realized this last principle. For when I ran the failing compiler phase on the input file that I had captured from a prior failure, it worked perfectly. Every time. Apparently, something about the act of capturing the input file caused the problem to go away.

I invite readers to speculate about what might cause this state of affairs before I continue the story next week.

With regard to race conditions, it would be interesting to see what would happen if you put a sleep statement between each part of the compilation script. Try 1 minute, 5 seconds, and 10 minutes. If one or more never shows the error, that might be informative.

I ran into one case where the files couldn't be opened until a fraction of a second after the other program closed them. I ended up putting in a number of one second sleep statements.
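The commenter's delay idea amounts to a one-line change in the driver script. A sketch with hypothetical stand-in phases, showing only where the sleep would go:

```shell
#!/bin/sh
# Hypothetical two-phase script with the commenter's delay inserted.
# "phase1"/"phase2" are stand-ins; the point is only where the sleep goes.
phase1() { echo "intermediate result" > stage1.out; }
phase2() { cat stage1.out; }

phase1
sleep 1          # the commenter suggests also trying 5 s, 1 min, 10 min
phase2 > final.out
```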

Does the same problem occur if you use the same compiler on another computer? Could it be hardware related, perhaps problems in the disk drives? Are there any system event logs that you can look at to see if other things happened at the time of the failures. Can you run self-tests on the memory and disk drives?


Working with the Microsoft compiler, I have received "Internal compiler error" many times. Then I would restart Visual Studio and it would compile.


I would like to say "it sounds like my wife", but I would get in trouble.


Some hardware is sensitive to alignment, and before execute-disable bits were common, could run off into data or uninitialized areas attempting to execute whatever was there, leading to some rather odd output. Depending on the hardware involved, a misaligned instruction output by the compiler could have caused this. The position of the compiler and/or output areas in memory at compile time would affect the generated code.

Also, an optimizing compiler could be more likely to run into this if the compiler writer found a seldom used instruction in the instruction set, had the compiler emit this instruction under the right circumstances, *and* got something wrong with the semantics of the seldom-used instruction. Early optimizers also did weird things to code, not all of them properly invariant.


I had a similar problem when programming on a CDC Cyber series mainframe in the '70s. The program would produce pages of errors on perfectly good lines of code. No one at the university had a clue about the problem. I finally found out that I was requesting too much memory for an array. But rather than saying anything about that, the compiler complained about everything else in the program.


I found a compiler bug once myself. Once in about 20 years of coding. :-) Compiler bugs certainly can happen, but it's pretty rare. Always assume it's your code first and you'll be right a lot more often than being like those that always assume the compiler is junk. :-)


Some characters were not present in your source code, yet they appeared in the error messages... I wonder whether you redirected both stdout and stderr into your captured file. Could it be that one phase wrote some garbage to stderr and the next phase received it? Just guessing, though.
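This guess is easy to illustrate. In the sketch below (names hypothetical), a phase's stderr is merged into the data stream with `2>&1` before a pipe, so any diagnostic it happens to print lands in the next phase's input as unexpected characters:

```shell
#!/bin/sh
# Sketch of the commenter's guess: merging stderr into a pipeline
# (2>&1 before the |) mixes diagnostics into the next phase's input.
# "noisy_phase" is a hypothetical stand-in.
noisy_phase() {
    echo "legitimate output"
    echo "warning: something transient" >&2   # occasional stderr noise
}

noisy_phase 2>&1 | cat > next-phase-input.txt
```

If the warning is printed only occasionally, the contamination — and hence the failure — would itself be occasional, which fits the symptom.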


A simple way to get GCC to output random and spurious error messages (sometimes crashes, SIGSEGVs, what have you) is to run it in a virtual machine that does not have enough memory — e.g., Ubuntu with just 512 MB of RAM in my case. It took me quite a while to figure out what went wrong; it was all so random that I suspected a multithreading race condition. Indeed, after I limited the number of concurrent compilations to 1, it worked! But in reality this only reduced peak memory consumption, so it was not proof of my initial hypothesis (a hypothesis can only be invalidated, not proven); it merely covered up the underlying problem, a lack of RAM. One might also argue that the real problem was not running out of memory but GCC's very poor (actually nonexistent) error handling, which made it next to impossible to deduce what was going wrong: none of the random errors hinted at a memory problem. In short: bad error handling is a huge waste of time, not for programmers but for end users.


Uninitialized memory or uninitialized variables are typical causes of such problems. If another compiler step releases system memory containing random data, such problems may occur.
But maybe the failing compiler phase isn't the problem, and the previous phase produces a damaged output stream if its output FIFO (feeding the next compiler step) is drained too slowly.


That's interesting. I guess it was some kind of race condition. Was the later phase reading the input file before it was completely flushed to disk?