Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

A parable about undefined behavior

July 26, 2008

Once upon a time there was a Very Large System written in C.  The VLS developers  had found a troublesome bug, which they traced down to strange behavior in the C compiler.  Specifically, one of the developers had written a statement that looked like this:

    if ((c = a - b) == 0) { /* ... */ }

They had discovered that when a was a large positive number and b was a large negative number, it was possible that the result of the subtraction would overflow.  In that case, the value of c might be zero, but the code in the braces would not be executed.

They duly reported this bug to the compiler's author, and were informed that this behavior was not a bug: If the evaluation of a - b overflows, the result is undefined and the implementation can do as it pleases.

The VLS developers were incensed.  Surely, if you assign zero to a variable and then test the value of the variable, you should be entitled to assume that the test will show that the variable is zero.  Not so, said the compiler author.  If your program causes an overflow to occur, that program is invalid, and the compiler has no obligations about what it should do with invalid programs.

On the surface, this attitude may seem like needless stubbornness on the compiler author's part.  However, the reason for his attitude becomes apparent when we look a little more closely at the computer's instruction set.  This particular computer did comparisons by means of a tiny register called the condition code.  This register contained two bits, which meant that it could represent one of four states.  Those states were usually negative, zero, positive, and overflow.  Many arithmetic operations, particularly including addition and subtraction, would set the condition code to reflect the result of the operation.  One could test the state of the condition code with a conditional jump instruction.

Now consider the code that the compiler generated for the expression ((c = a - b) == 0).  It would begin by loading a into a register and subtracting b.  This subtraction would set the condition code to reflect the result.  The next instruction would store this result in c, an operation that does not change the condition code.  Therefore, it would be possible to follow this store instruction by a conditional jump that would test whether the result of the subtraction (and therefore the value of c) is zero.

What happens if the subtraction overflows?  Then the condition code is set to the overflow state, and checking against zero fails (because overflow state is not the same as zero state).  In other words, the actual value of c cannot affect the outcome of the conditional jump if an overflow has occurred.

Of course, it would have been possible to change the compiler to insert an extra instruction that would test the value of c.  Doing so, however, would have imposed the time and space overhead of that instruction for all programs, not just ones in which overflows happened to occur at inopportune times.  He took the view that the language definition says that overflow is undefined, and therefore that it would be inappropriate to try to define it for the sake of mere convenience.

When a programming language definition talks about undefined behavior in a context that is not obvious, it is often possible to discover the reason by thinking about what the implementation would have to do in order to define the behavior reliably in that context.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video