Loop Invariants And Testing: Often Possible, Often Difficult
We continue last week's discussion by talking about invariants and testing.
We can think of an invariant as being a claim or belief about the state of program that we expect to be true each time control reaches a particular point in the program. Accordingly, if such a claim turns out not to be true, we have discovered a bug in the program. Therefore, one obvious way to detect bugs in a program is to turn some or all of its loop invariants into
If this strategy works, that's great: If you can turn an invariant into an
assert statement, and the
assert fails, you know that the invariant has turned out not to hold when you expected to do so. Therefore, encountering such a failure means that you have found a bug in the program. What complicates matters is that such failures can be hard to induce for several reasons.
One reason is that it may not be easy to turn an invariant into code. For example, consider the power function that we discussed last week: Its invariant is
Our desired result is
r * xn
This invariant is hard to implement for two reasons. First, it claims that two quantities are equal, namely "our desired result" and
r * xn . Neither of these quantities is easy to compute, because to compute either of them, we must successfully run the very program that we are trying to test. This, then, is an example of an invariant that does not help us much with debugging, even though it does help us think about our program.
Sometimes, however, we can use an invariant to tell us something about our program that we might be able to test. For example, consider the
problem that Dijkstra posed that I discussed in August. In my solution to that problem, I described the program's state each time through the loop as
< ww, bucket k contains a red pebble.
< xx, bucket k contains a white pebble.
xx <=k <
bb, bucket k has not yet been examined.
bb <=k <
NL, bucket k contains a blue pebble.
This description makes a fine invariant — or at least most of it. If we have not examined the buckets with index k such that
xx <= k
< bb, then our invariant cannot make any claims about the contents of those buckets, because we don't know what those contents are. However, we most definitely can write an
assert statement that will check the other three parts of this four-part invariant.
Of course, such testing will be slow. One nice aspect of
assert statements is that one can leave them in the code but turn them off once testing is complete. That way, they continue to serve as documentation. In case you are tempted to argue that
assert statements should never be turned off because doing so decreases code reliability, please observe that this particular
assert statement has the effect of reinspecting every pebble that has already been inspected every time through the program's main loop. In other words, it turns a linear algorithm into a quadratic one — never a good idea. This kind of speed loss is often a side effect of
In addition to invariants that are difficult or impossible to check with
assert statements, and invariants that can be checked usefully, there is a third kind: Sometimes, it is possible for an
assert statement to appear to succeed even though the code is broken. Suppose, for example, that we have three pointers (or iterators). Two of them, named
end, define a range of elements in a sequence; the third one, named
p, is expected, as part of a loop invariant, to point either to an element of that sequence or one past the end. We might imagine expressing that invariant as an
assert statement such as
assert(begin <= p && p <= end);
assert statement fails, the code is definitely broken because
p is out of range. However, if the
assert succeeds, we are not assured that the invariant actually holds. For example,
- Technically speaking, comparing pointers to elements of two different arrays evokes undefined behavior. Therefore, if
phappens to point to an element of a completely different array from the one delimited by
assertis permitted to succeed or fail. Of course we would hope, on a well-behaved C++ implementation, that the
assertwould fail under those circumstances, but it is not required to do so.
- We translated the requirement that
ppoint to an element of the sequence into a relationship between three pointers or iterators. However, that translation did not capture the requirement that
endactually delimit a valid sequence! So, for example, if someone were to free the memory occupied by that sequence, the
assertmight well still succeed because the values of the pointers are still within bounds.
In short, invariants are sometimes, but not always, useful as debugging tools. When they are useful, they can detect the presence of bugs, but not always their absence. Therefore, it is plausible to say that invariants are usually more useful as an intellectual tool than as a software-engineering tool. Nevertheless, they do provide not only a tool for thinking about programs but also a way of documenting those thoughts that can sometimes detect bugs. In that sense, invariants are useful not only in theory but also in practice.
Next week, we will show how an invariant can help us solve a problem that is easy in principle, but for which practical solutions often get the details wrong — namely, binary search.