The narrative above raises the question: Why is all of this important for beginning programmers to understand? First, if all students use the same terms in a consistent way, there is less chance for them to misunderstand the topic at hand. Most textbooks use the terms "data definition" and "data declaration" as though they are synonyms. They are not. Making the distinction early makes teaching more complex topics easier, as you will see in a moment.
Second, taking the time to explain what a symbol table is and just some of the information it contains helps students better understand error messages issued by the compiler. For example, after spending one lecture on the concepts discussed above, we have never had a student ask what a "Duplicate Definition" error message means. They immediately understand what it means and how to correct it. While most modern programming languages require a variable to be defined before it can be used in an expression, understanding the concepts behind a symbol table makes it clear to beginning students why they must define a variable before they can use it. Even though such things may be intuitively obvious to us, they are not to beginning students.
Finally, topics like value types versus reference types and pass-by-value versus pass-by-reference become much easier to explain using the concepts that we demonstrate in the remainder of this article. As someone once said: "If the only tool you have is a hammer, all your problems begin to look like a nail." The difference between value and reference variables is often hard for students to comprehend, and the techniques described here can be another tool to use when teaching such topics.
We find that the following diagrams make it easier to present the concepts to the students. We start off with a simple definition of an integer variable:
int i; // Statement 1
We can then represent this statement as in Figure 1.
We tell the students that the lvalue and rvalue boxes have question marks because the compiler has not yet sent a request to the operating system for storage. In other words, all the compiler has done to this point is check the syntax of Statement 1 (which is okay) and check the symbol table to see if variable i is already defined at the same scope level. Figure 1 represents the state of variable i at this point in the program.
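The symbol table check described here can be sketched in C. This is only a toy illustration under our own assumptions, not how any particular compiler is implemented: the struct layout, the names table, is_defined, and add_symbol, and the return codes are all hypothetical.

```c
#include <string.h>

#define MAX_SYMBOLS 16

/* One row of a toy symbol table (hypothetical layout, for teaching only). */
struct symbol {
    char name[32];
    int scope_level;
    int has_lvalue;   /* 0 while the lvalue box still holds a question mark */
};

static struct symbol table[MAX_SYMBOLS];
static int symbol_count = 0;

/* Returns 1 if name is already in the table at this scope level, else 0. */
int is_defined(const char *name, int scope_level)
{
    for (int k = 0; k < symbol_count; k++) {
        if (table[k].scope_level == scope_level &&
            strcmp(table[k].name, name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Adds a symbol with no storage assigned yet.
   Returns 0 on success, -1 for a duplicate definition, -2 if the table is full. */
int add_symbol(const char *name, int scope_level)
{
    if (is_defined(name, scope_level)) {
        return -1;   /* what a "Duplicate Definition" error would report */
    }
    if (symbol_count == MAX_SYMBOLS) {
        return -2;
    }
    struct symbol *entry = &table[symbol_count++];
    strncpy(entry->name, name, sizeof entry->name - 1);
    entry->name[sizeof entry->name - 1] = '\0';
    entry->scope_level = scope_level;
    entry->has_lvalue = 0;   /* corresponds to the state shown in Figure 1 */
    return 0;
}
```

Under these assumptions, calling add_symbol("i", 0) twice succeeds the first time and returns -1 the second time, while add_symbol("i", 1) succeeds because the scope level differs.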
The compiler then asks the operating system's memory manager for 4 bytes of storage. (See column 3, Table 5.) Assuming the memory manager finds 4 contiguous bytes of storage, it passes back the memory address of those 4 bytes (e.g., assume memory address 900,000). Our diagram now looks like Figure 2.
Note, because variable i now has an lvalue, we have a data definition for variable i.
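Students can see an lvalue for themselves with a few lines of C. The sketch below is a minimal illustration; the helper names lvalue_of_i and bytes_for_i are hypothetical, the address on a real machine will not be 900,000, and the size of an int is platform dependent (4 bytes is common but not guaranteed).

```c
#include <stddef.h>
#include <stdint.h>

int i;   /* Statement 1: a data definition, so storage is reserved for i */

/* Returns the lvalue (memory address) of i as an integer value. */
uintptr_t lvalue_of_i(void)
{
    return (uintptr_t)&i;
}

/* Returns the number of bytes of storage reserved for i on this platform. */
size_t bytes_for_i(void)
{
    return sizeof i;
}
```

Printing the two results (for example with printf and a cast to unsigned long long) gives students a concrete number to write into the lvalue box of the diagram.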
What does the rvalue represent? The rvalue is what is stored at the lvalue. (Again, the term harkens back to the old assembly language days, when it represented the "register value" of a data item.) In other words, the rvalue is the current value of variable i. We have left the rvalue unknown because, at this juncture, some compilers may initialize the value to 0, while others leave the 4 bytes unchanged, so the rvalue is whatever random bit pattern happens to be in memory at that (lvalue) memory address. (We always teach our students to never assume the compiler initializes a variable with a meaningful rvalue.)
Now consider the statement:
i = 10; // Statement 2
After the compiler checks the statement for proper syntax, it processes the statement by going to the symbol table, finding the lvalue for the variable (900,000) and depositing "4 bytes with a value of 10" at that memory address. The state of variable i is transformed to reflect the state in Figure 3.
Note that the rvalue is now 10. Also, note that, if variable i were only a data declaration, there would be no lvalue and, hence, no way to change its value. At this point, we review the fact that an assignment statement is always concerned with moving whatever is on the right side of the expression into the rvalue of whatever is on the left side of the expression. We also make the assertion that the assignment operator can only be used with variables that have been defined previously at some point in the program.
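The effect of Statement 2 can also be checked in code. The following C sketch is an illustration under our own assumptions: the helper names assign_ten and rvalue_of_i are hypothetical, and the extern line is simply one way C expresses a data declaration without a definition.

```c
#include <stdint.h>

extern int i;   /* data declaration: names i and its type, reserves no storage */
int i;          /* data definition: storage is reserved, so i has an lvalue */

/* Performs Statement 2 and returns 1 if the lvalue of i did not change. */
int assign_ten(void)
{
    uintptr_t before = (uintptr_t)&i;   /* lvalue before the assignment */
    i = 10;                             /* Statement 2: only the rvalue changes */
    uintptr_t after = (uintptr_t)&i;    /* lvalue after the assignment */
    return before == after;
}

/* Returns the current rvalue of i. */
int rvalue_of_i(void)
{
    return i;
}
```

If the definition line is removed and only the extern declaration remains, the program fails to link because no storage for i exists anywhere, which mirrors the point that a declaration alone provides no lvalue to assign into.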
Quite honestly, some students' eyes are a little glazed over at this point, suggesting that this approach is less than intuitively obvious to them. Fortunately, that is easily resolved.