The term "optimization" refers to the techniques a compiler uses to produce code that is appreciably smaller and/or faster than code generated by comparable non-optimized compilers. Over the past year, C compiler manufacturers have been jumping on the optimization bandwagon as a means of both improving their product and of differentiating it from other compilers on dealers' shelves. Unfortunately, they don't go into much detail explaining exactly what they mean by optimization other than saying their product is "faster and better" than other available compilers.
There are several optimization techniques available to compiler developers of which one or more method may be implemented within an individual compiler. These techniques include register allocation, constant propagation, dead assignment elimination, dead code removal copy propagation, and common subexpression elimination. For a detailed description of each of these methods, see "Optimizing Compilers for C" by Richard Relph (DDJ, August, 1987). You may also want to refer to Compiler Principles, Techniques, and Tools (also known as "The Dragon Book") by A. Aho, J. Ullman, and R. Sethi (Addison-Wesley).
Of all these methods, register allocation is perhaps the most advanced and important optimization technique and a key aspect of approach used by developers such as Watcom, Datalight/Zortech, and others. With register allocation, the number of memory references in a program are kept to a minimum by keeping variables and temporary results in registers. This allows shorter instructions to be generated which results in smaller, faster code.
The compilation process always begins by translating source into some internal, intermediate level representation. Once an intermediate representation of a function has been built, algorithms can be used to perform live variable analysis. A variable is live at a point in the flow graph if its value at that point could be used at some other point in the flow graph. Intuitively, a variable is live if its value may be used "later on" in the function.
Live variable information is used to build a data structure called the conflict graph. Every variable in the function has a corresponding node in the conflict graph. A link between two nodes is present whenever two variables are live at the same point in the flow graph. These links indicate that the two variables may not share the same register. The method used to assign registers to variables is called a coloring algorithm since it is similar to the problem of coloring a map so that no two adjacent countries have the same color. Since a machine has a limited number of registers, the object is to color (assign a register to) as many nodes of the conflict graph as possible without assigning the same color (register) to any two adjacent nodes. When a node is left uncolored, the corresponding variable must be assigned a position in memory or on the stack.
Of course, allocating registers intelligently requires information about the use of the variables. A variable that is heavily used in a loop is a better candidate for a register than a variable that is only used twice at the beginning of a function. Analysis may be performed on the flow graph to determine the loop nesting depth of each basic block. The uses of each variable are found and a weight is assigned to each use. Higher weights are assigned to uses within one or more loops. These weights are added up and stored in the conflict graph. The coloring algorithm takes these weights into account when assigning registers.
Although optimization can occur over several ranges statement, block, function, module, and program the compilation process, the most advanced are global optimization techniques that occur over the range of an entire function rather than just a single statement. Global optimizations are applied to intermediate representations of functions that consist of instructions formed by a set of operands and one operator. These instruction sets are organized into sequences called blocks that have one label at the beginning and one jump or return at the end. Blocks are linked together to form the control flow of the function.
In some cases, it is convenient for compilers to let the register allocator introduce some anomalies that are removed later in a post-optimization phase. One post-optimization technique involves keeping track of the value of a register at any point in the block. The data structure used to keep track of this information is sometimes called a register scoreboard. Each instruction is examined and the scoreboard is updated to reflect the effect that the instruction has on the register. ff a register is stored in a memory location, the scoreboard remembers that the memory location and the register contain the same value. When a move instruction is encountered, the scoreboard is consulted to determine if the move is redundant. When a memory reference is found in an instruction, the scoreboard is checked to find out if a register contains the appropriate value, By carefully designing the register scoreboard, a significant gain can be achieved. -- eds.
All the major players in the C compiler marketplace claim to support the standard language as defined by Kernighan and Ritchie, but do they really? And if not, what are the discrepancies?
An outfit called The Austin Code Works (11100 Leafwood Ln., Austin, TX 78750; 512-258-0785) sells a test suite to find out. It's called the C Compiler Torture Test. For $20, you get a PC-compatible disk containing 23 C programs and a READ.ME file that tells you how to use them.
The Torture Test measures a compiler's compliance with the K&R "white book." In a total of some 6600 lines of code, the suite exercises the entire standard. Particularly stressed are language features likely to be used by advanced C programmers: complex constructs, cascading defines, pointer manipulation such as multiple indirection, and so forth.
Testing with the suite occurs on two levels: compile-time and run-time. In some cases, the compiler might choke on certain constructs. This is a clear indication of noncompliance. A clean compile means that the compiler accepted the source language, but it doesn't mean that the result is correct.
Consequently, after torturing the compiler, you run each test program to see that it does what is expected. For example, dereferencing a chain of pointers should ultimately lead to a known variable that the program can compare with a hard-coded value. When the result is wrong, the program issues an error message. The messages are cryptic, but the test programs are structured so that you can look up the error message in the source text and determine from comments what the problem is.
Since the Torture Test programs contain only vanilla K&R code (regardless of its level of difficulty), a perfectly complying compiler should compile them without a whimper and the resulting programs should run correctly. A failure at-either level of testing pinpoints lack of compliance with the standard.
The C Torture Test is a valuable resource to both compiler writers and C users. Although delivered on a PC diskette, the code can be ported to any platform and used as is. And at only $20, nobody can complain about the price. -- eds.