Bugs in Dark Corners
A common thread that has appeared when I've discussed elegance versus trickery (1, 2, 3, 4) is the notion that unfamiliar code is tricky code. Today I would like to point out that programming in an unfamiliar style has another hazard: It is more likely to uncover the compiler bugs that often lurk in dark corners.
The corners don't even have to be particularly dark, as the following three examples illustrate.
The first involves a simple statement:
*p++ = f();
I know someone who encountered a compiler that would generate incorrect code from this simple statement whenever p was a pointer to short. I suppose that the compiler realized that short should generally be promoted to int, and incorrectly promoted the pointer as well. However, one could get the compiler to generate correct code by rewriting the statement as
short temp = f(); *p++ = temp;
or as
*p = f(); ++p;
The author of the program that contained this statement had no choice but to rewrite it; even if the compiler were fixed on one machine, there was no reason to believe that it would be fixed on the other machines that might run this program.
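To make the equivalence concrete, here is a minimal sketch of the three forms side by side. The function f and its counter next_value are hypothetical stand-ins (the original program's f is not shown), and the fill_* names are mine. On a conforming compiler, all three loops should store the same values and advance p the same way.

```c
#include <stddef.h>

/* Hypothetical stand-in for the f() in the text: returns 0, 1, 2, ... */
int next_value = 0;
short f(void) { return (short)next_value++; }

/* Form 1: the original statement that the buggy compiler mishandled. */
void fill_original(short *p, size_t n) {
    while (n-- > 0)
        *p++ = f();
}

/* Form 2: store the result in a temporary first, then assign and advance. */
void fill_with_temp(short *p, size_t n) {
    while (n-- > 0) {
        short temp = f();
        *p++ = temp;
    }
}

/* Form 3: separate the assignment from the pointer increment. */
void fill_split(short *p, size_t n) {
    while (n-- > 0) {
        *p = f();
        ++p;
    }
}
```

Resetting next_value to zero and filling three arrays of the same size, one per function, gives a quick way to check whether a suspect compiler treats the forms identically.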
My second example involved a program I once wrote to generate an encrypted checksum of a file. I forget why, but somewhere in the code I multiplied two unsigned long values and expected a result that would be truncated to the size of an unsigned long. Unfortunately, one of the compilers on which I wanted to run the program yielded zero instead of the correct result whenever the multiplication overflowed. I ultimately had to rewrite that part of the program to cater to the broken compiler.
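For reference, the C standard defines unsigned arithmetic as wrapping modulo one more than the largest representable value, so the truncated product is the required result, not an accident. The sketch below is not the code from my checksum program; the function names are illustrative. It shows the multiplication together with a portable test for whether a given product overflows, which is the case the broken compiler got wrong.

```c
#include <limits.h>

/* Multiply two unsigned long values. A conforming compiler reduces the
   result modulo ULONG_MAX + 1; it must not yield zero on overflow. */
unsigned long wrapping_mul(unsigned long a, unsigned long b) {
    return a * b;
}

/* Return nonzero when a * b does not fit in an unsigned long.
   The division avoids performing the overflowing multiplication. */
int mul_would_overflow(unsigned long a, unsigned long b) {
    return a != 0 && b > ULONG_MAX / a;
}
```

One easy probe is ULONG_MAX times 2, which overflows and should wrap to ULONG_MAX - 1 regardless of the width of unsigned long.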
The final example comes from a former colleague who once tried the experiment of creating a text file that contained a single line that was 25,000 characters long. He wanted to see how various text-processing programs handled such a file as their input. What he found was that almost all such programs quietly gave incorrect results. Typically, they would ignore the part of the line that was beyond an undocumented length limit. Sometimes they would crash or give nonsensical results. Yet users of these programs did not have any clue in advance that the programs would not behave as documented when presented with unreasonably long input lines.
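The usual culprit in such programs is a fixed-size line buffer. As an illustration of the alternative, here is a minimal sketch of a line reader that grows its buffer as needed. The name read_long_line and the initial capacity of 128 are my own choices, not taken from any of the programs my colleague tested.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length from fp into a heap-allocated buffer.
   On success, returns the buffer (the caller must free it) and stores
   the line length, excluding the newline, in *len. Returns NULL at end
   of file when nothing was read, or when memory allocation fails. */
char *read_long_line(FILE *fp, size_t *len) {
    size_t cap = 128, n = 0;
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL)
        return NULL;
    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        /* Keep room for this character plus the terminating null. */
        if (n + 1 >= cap) {
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[n++] = (char)ch;
    }
    if (ch == EOF && n == 0) {
        free(buf);
        return NULL;
    }
    buf[n] = '\0';
    *len = n;
    return buf;
}
```

Writing 25,000 characters followed by a newline to a temporary file and reading it back with this function repeats my colleague's experiment in miniature; the reported length should be 25,000 rather than some silent limit.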
What these three examples have in common is that they expose a discrepancy between how systems — particularly compilers — are supposed to behave and how they behave in practice. Their intended behavior depends on their intended use; their actual behavior depends on their actual use. This discrepancy is a powerful argument against using tools in unexpected ways.
This line of thinking, in turn, tends to argue against any kind of change in how we use tools that have become familiar. It is often easier to convince people to use a completely new tool than it is to convince them to change how they use tools that they are already using — or, for that matter, to use new features in old tools.
I will have more to say next week about how this phenomenon affects the way tools evolve.

