Declarations and Reliability
When I wrote about
auto a while ago, some readers said that they thought
auto was a bad idea because it allowed programmers to get by without knowing what types they were using. I explained my thinking about this issue. Nevertheless, the discussion left me with a nagging doubt, which I would like to discuss here.
The first programming language I learned was FORTRAN, which had (and still has) the interesting characteristic that if you do not declare a variable, the compiler creates it for you. In fact, the early version of FORTRAN that I learned did not even allow users to declare variables. So, for example, if you intended to write
FOO = 1.0
and actually wrote
FOE = 1.0
and later wrote
FOO = FOO * 2.0
the compiler would not complain that you had a variable named
FOE that you never used, nor would it complain that you used a variable named
FOO without giving it an initial value. Instead, it would blithely take whatever garbage happened to be in the memory assigned to
FOO and multiply it by 2.0, thereby yielding other garbage.
I remember reading the claim — though I don't remember where — that one of the great advances of Algol over other languages (presumably FORTRAN) was that it required its programmers to declare every variable. In effect, when you write an Algol program, you have to state the name of every variable at least twice: once when you declare it, and then again when you use it. If those two statements don't match, the compiler will reject your program.
The question, then, is this: If it is a good thing that a compiler requires you to state the name of every variable at least twice, why is it not a bad thing when the compiler figures out the types of your variables for you? For example:
auto it = v.begin();
Why should the programmer not be required to know that
v is a
vector<string>, and therefore be required to write
vector<string>::iterator it = v.begin();
or something similar?
After thinking about it carefully, I believe that the difference between these two cases is that FORTRAN not only creates a variable if you do not declare it, but it also creates a value for that variable. To see why FORTRAN's behavior is dangerous, consider what would happen if a Python programmer were to try to execute
FOE = 1.0
FOO = FOO * 2.0
When it came time to execute the second statement, the implementation would complain that no variable named
FOO exists, because in Python, a variable does not exist until a value has been assigned to it.
Similarly, in C++, when I execute
auto it = v.begin();
I am not asking the compiler to invent anything for me. I've written
v.begin(), which is a legitimate expression, and I've said that I want to put the result of that expression somewhere. If I want to use that result, then its type is going to have to be consistent with the way(s) in which I intend to use it, so the compiler will check whatever aspects of the result's type are necessary in order for me to use that result.
In short, although I agree with the claim that Algol's requirement to declare variables makes programs more reliable, I now think that the reason for this reliability is not that it requires programmers to state the name of every variable twice. Rather, I think the problem with the FORTRAN treatment is that the compiler tries to guess what the programmer meant. In cases such as
auto, no guesswork is involved.