Our C specialist helps you become a loop specialist.
January 01, 2003
URL:http://www.drdobbs.com/statements-and-loops/184401602
Several years ago, Jill, a friend of mine, was interviewing an applicant for a job with her company. Jill's employer had several openings for programmers working in different specialties: compilers, operating systems, user interfaces. Jill, wishing to match the applicant with the proper job opening, asked, "What type of programming do you like to do?"
The job applicant paused for a second to consider the question carefully. He then answered, "Loops. I like to write loops."
I have never quite made up my mind if the job candidate was being overly specific or overly general. On one hand, I have never heard, "Whoa, we need a loop here. Better call in a specialist." On the other hand, loops are useful in almost all forms of programming, even compilers, where recursive algorithms give loops a run for their money.
This column is dedicated to everyone who has ever written a loop and realized that a lot of work was going to happen in it. Which brings us to this month's subject, the changes to statements made in the 1999 revision of the C Standard. Although C99 did not create any new types of statements, it did introduce some new rules regarding the existing statements that increase their flexibility and perhaps allow you to avoid bugs. Ultimately, the statements most affected are loops.
The first property necessary to understanding declarations following statements is that a statement cannot use an identifier before it is declared, for the simple reason that the name is not yet accessible. This rule should not be surprising, since it has always been true that declarations cannot use identifiers declared in later declarations. Listing 1 shows invalid references to identifiers declared later in the block.
Some readers might think that struct and union tags are an exception to this rule. However, merely mentioning an identifier after the struct or union keyword declares that identifier as a struct or union tag, even if you do not provide a brace-enclosed list of declarations for the members of the new struct or union type. The rules for tags are somewhat involved (see 6.7.2.3 of [2]) and have been the same since the early days of C and even forward into C++ (although C++ rules out a few contexts; for example, a cast may not declare a new type as a side effect). Tags do not violate the no-reference-before-declaration rule, and I will not discuss them specifically any further.
The second property necessary to understanding declarations following statements is that if an object declared in a block with automatic storage duration (that is, declared without the static or extern keywords) is initialized, the initialization happens at run time when the declaration is reached. In other words, the initialization functions like an assignment statement, and every time the declaration is executed, the object will receive the specified value.
Thus, there is no difference between this pair of statements in a block:
    int x;
    x = f();
and this statement:
    int x = f();
In both cases, x will be set to the return value of calling f every time the statements are executed.
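A minimal sketch of this rule follows. The helper next_value and the function sum_three are hypothetical names invented for illustration; they are not part of the column's listings. The declaration inside the loop body is reached three times, so its initializer runs three times, while the static object inside the helper is initialized only once.

```c
/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
int next_value(void)
{
    static int counter = 0;   /* static: initialized once, not on every call */
    return ++counter;
}

/* Hypothetical demo: adds up the values x receives on three trips
   through the loop body. */
int sum_three(void)
{
    int total = 0;
    int i;
    for (i = 0; i < 3; ++i) {
        int x = next_value(); /* automatic: initialized each time it is reached */
        total += x;
    }
    return total;
}
```

The first call to sum_three returns 1 + 2 + 3; a second call continues from where the static counter left off.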
Consider Listing 2. There could be hundreds of statements between the declaration of sum and the first assignment to sum. The compiler will be no help at all if a programmer "cleaning up" the function manages to move the printf that references sum to before sum gets a value. In a large function, and sometimes even in a small function, it is very easy to lose track of the region of program text that sets the value of a variable.
In contrast, what if the loop computing sum had been written as:
    int sum = 0;
    for (i = 0; i < 5; ++i)
        sum += a[i];
Then there would be no vast region of program text in which you could reference sum before it has the correct value. The compiler would prohibit such references.
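As a rough illustration, here is how the function from Listing 2 might look with the declaration moved to the point of first use. The name ex2_c99, the extra printf, and the return value are assumptions added for this sketch so the result can be checked; they are not in the original listing.

```c
#include <stdio.h>

/* Sketch of a C99 rewrite of ex2: sum does not exist until it has a value. */
int ex2_c99(int a[5])
{
    int i;
    printf("starting\n");     /* a statement may now precede a declaration */

    int sum = 0;              /* declared and initialized where first needed */
    for (i = 0; i < 5; ++i)
        sum += a[i];

    printf("sum=%d\n", sum);
    return sum;               /* returned only so a caller can check it */
}
```

Moving the final printf above the declaration of sum in this version would be a compile-time error rather than a silent use of an unset variable.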
The grammar for C++ just makes a declaration another type of statement. Thus, wherever you can have a statement, you can have a declaration. However, the grammar for C99 says that a compound statement (brace-enclosed block) is a sequence of block items, and block items are either statements or declarations.
At first glance, this appears to accomplish the same thing, since you can now put statements and declarations in a block in any order. But, while C and C++ both agree you can put a goto label on a statement, C++ considers declarations to be statements, and C99 does not. So the following is valid C++:
    // C++ only
    loop: int x = 0;
but not valid C99. You can write the equivalent in C99:
    // valid C99 and C++
    loop: ;
    int x = 0;
since empty statements (a single semicolon) are valid in both C and C++. (However, I hope you really do not care about the ins and outs of goto labels.)
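For the curious, the labeled empty statement can be sketched in a complete function. The function passes_until and its variables are hypothetical, chosen only to make the behavior observable: the label attaches to the empty statement, and the declaration that follows is initialized again each time control passes through it.

```c
/* Hypothetical demo: counts the passes through the labeled point
   needed for total to reach at least n. */
int passes_until(int n)
{
    int total = 0;
    int passes = 0;

again: ;                      /* the label is on the empty statement */
    int step = 2;             /* a declaration, initialized on every pass */
    total += step;
    ++passes;
    if (total < n)
        goto again;

    return passes;
}
```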
The second difference between the C99 and C++ grammars shows up in contexts where the language permits only a single statement. C++ permits a declaration there, but C99 does not. For example:
    // C++, not C99
    for (i = 0; i < 5; ++i)
        int x;
That code probably looks pretty alien to old C programmers. So alien that there might be a moment of panic wondering what the code does. Do you end up with five variables named x? No. In C++, the body of a loop is an implicit block, so the loop means the same as:
    for (i = 0; i < 5; ++i) {
        int x;
    }
In other words, create and destroy a variable named x five times. Most compilers will eliminate all code for such a loop. Many will even complain that a variable was declared but never referenced.
You might ask: if C99 does not permit a declaration in such a context, why does it borrow the C++ rule that the body of a for, while, or do-while loop is a block? The primary motivation is that it gives well-defined semantics to compound literals [3]. Compound literals are a new form of structured constant that allows you to create an unnamed object by "casting" a brace-enclosed initializer to the right type. In Listing 3, the function diagonal draws a diagonal line of the indicated length by calling drawpixel. The function drawpixel takes an argument that is a pointer to a point. The call to drawpixel in diagonal creates an unnamed object of type struct POINT using the compound-literal syntax and passes the address of that unnamed object to drawpixel. The lifetime of that unnamed object is the implicit block that is the body of the loop.
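To try the idea from Listing 3 without a graphics library, one possible sketch supplies a stand-in drawpixel that merely records what it was given. The globals pixels_drawn and last_pixel, and the body of drawpixel, are assumptions made for this example only.

```c
struct POINT { int x, y; };

/* Hypothetical bookkeeping in place of real drawing. */
int pixels_drawn = 0;
struct POINT last_pixel;

void drawpixel(struct POINT *p)
{
    last_pixel = *p;          /* copy the value; the unnamed object is temporary */
    ++pixels_drawn;
}

void diagonal(int len)
{
    int y;
    for (y = 0; y < len; ++y)
        drawpixel(&(struct POINT){y, y});  /* unnamed object lives for the loop body */
}
```

Because the unnamed object belongs to the loop body's implicit block, drawpixel copies the point rather than keeping the pointer for later use.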
C99 and C++ not only make the bodies of for, while, and do-while loops implicit blocks, they also do the same for the then and else clauses of if statements and the body of a switch statement.
Not only are the various bodies of loops and switch statements and the then and else clauses of if statements implicit blocks, but the entire statement itself is another implicit block containing those blocks. Thus:
    for (/*...*/; /*...*/; /*...*/)
        /* stmt */
means exactly the same as:
    {
        for (/*...*/; /*...*/; /*...*/) {
            /* stmt */
        }
    }
The same holds for the if, switch, while, and do-while statements. In case you are worried, entering and exiting a block, even one that reserves storage, takes little or no time. Except when variable length arrays [4, 5, 6, 7] are used, most compilers generate code to allocate stack space only once upon entering a function. (The amount of space allocated is the minimum amount necessary to handle the maximum requirements of any of the blocks in the function.)
Again, compound literals provide part of the motivation for making these entire statements implicit, local blocks. However, there is an additional, more obvious reason that applies only to the for statement. C99 adopted the feature from C++ and Java where the first item in the parenthesized list following the for keyword (the "initializer" clause) can be either a declaration or an expression. Let's rewrite that loop from Listing 2:
    int sum = 0;
    for (int i = 0; i < 5; ++i)
        sum += a[i];
Now not only has the declaration of sum been moved to the first point where it is needed, but so has the declaration of i. The scope of i is just the loop itself. It cannot be referenced before the loop or after it. Since the loops are separate scopes, all of the loops in an enclosing block can have their own index variable named i. (C++ programmers beware: some older C++ compilers do not consider the loop itself to be a block, and any index variable you declare will persist to the end of the explicit block enclosing the for loop.)
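The separate scopes can be observed with a small, admittedly contrived sketch. The function scope_demo is hypothetical; it declares an outer i purely to show that the loop indexes never touch it.

```c
/* Hypothetical demo: two loops, each with its own i, inside a block
   that also has an i of its own. */
int scope_demo(const int a[], int n)
{
    int i = 100;                  /* outer i, hidden inside each loop */
    int sum = 0;
    int negatives = 0;

    for (int i = 0; i < n; ++i)   /* this i exists only for this loop */
        sum += a[i];

    for (int i = 0; i < n; ++i)   /* a second, unrelated i */
        if (a[i] < 0)
            ++negatives;

    /* Here i names the outer object again, and it is still 100. */
    return i + sum + negatives;
}
```

Some compilers warn when an inner declaration hides an outer one, so in real code it is usually cleaner not to reuse the name in the enclosing block.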
Note that C99 did not pick up the C++ feature that allows a declaration as the controlling condition of a while, if, or switch statement. The most common use of declarations in those contexts is an idiom involving the C++-only feature of run-time type identification. C programmers would likely never find declarations useful in those contexts.
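If you ever do want a variable whose scope is limited to a single if statement in C, an ordinary explicit block gives roughly the same effect. The following is only a sketch; length_after_colon is a hypothetical function invented for this example.

```c
#include <string.h>

/* Hypothetical demo: returns the length of the text after the first ':'
   in s, or -1 if s contains no ':'. */
int length_after_colon(const char *s)
{
    {   /* explicit block: p is visible only to the if statement inside it */
        const char *p = strchr(s, ':');
        if (p != NULL)
            return (int)strlen(p + 1);
    }
    /* p is out of scope here */
    return -1;
}
```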
[2] ANSI/ISO/IEC 9899:1999, Programming Languages -- C. 1999. Available in Adobe PDF format for $18 from <www.techstreet.com/ncitsgate.html>.
[3] Randy Meyers. "The New C: Compound Literals," C/C++ Users Journal, June 2001.
[4] Randy Meyers. "The New C: Why Variable Length Arrays," C/C++ Users Journal, October 2001.
[5] Randy Meyers. "The New C: Variable Length Arrays, Part 2," C/C++ Users Journal, December 2001.
[6] Randy Meyers. "The New C: Variable Length Arrays, Part 3: Pointers and Parameters," C/C++ Users Journal, January 2002.
[7] Randy Meyers. "The New C: Variable Length Arrays, Part 4: VLA typedefs and Flexible Array Members," C/C++ Users Journal, March 2002.
Listing 1: Invalid references to identifiers declared later
    void ex1()
    {
        // y and MyInt are not declared
        // so next line is wrong
        int x = (MyInt) y;
        typedef int MyInt;
        float y = 0.0;

        float a;
        // b and MyFloat are not declared
        // yet, so next line is wrong
        a = (MyFloat) b;
        typedef float MyFloat;
        int b = 0;
    }
Listing 2: Potential loop during reference of sum
    #include <stdio.h>

    void ex2(int a[5])
    {
        int sum;
        int i;
        // lots of other declarations

        // lots of statements
        sum = 0;
        for (i = 0; i < 5; ++i)
            sum += a[i];
        // lots of statements
        printf("sum=%d\n", sum);
    }
Listing 3: diagonal draws a diagonal line by calling drawpixel
    struct POINT {int x, y;};

    void drawpixel(struct POINT *p);

    void diagonal(int len)
    {
        int y;
        for (y = 0; y < len; ++y)
            drawpixel(&(struct POINT) {y, y});
    }