The Trouble with Testing
There is a school of thought that says that when you set out to write a program, you should write the tests first. Your goal is then to write the simplest program that passes the tests. Once you have done so, you're done.
This approach is certainly appealing. In particular, having automated tests that capture as much as possible of a program's desired behavior is an excellent idea. But what if a program has to have a characteristic that you don't know how to test?
One possible answer to this question is that there are no such characteristics -- if you can't figure out how to write a test for a program's behavior, you have no way of observing the behavior, and therefore you should not care whether the program behaves that way.
Although that point of view is appealing, I can think of several counterexamples.
The first is that even a simple program may have so many possible inputs that there is no way to test them all, and it might not even be possible to think of a set of representative samples. As an example, consider a program that does a floating-point multiplication. Suppose you are trying to verify that the program conforms to the relevant IEEE floating-point standard. There isn't enough time in the world for you to test every possible pair of input values, so you have to select them somehow. But the moment you do so, you run into the risk that a bug may lurk in one of the combinations you didn't select.
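To make the selection problem concrete, here is a minimal C++ sketch of the kind of hand-picked sample such a test must rely on. The function names (`sample_inputs`, `commutes`) and the particular property checked are my own illustration, not a prescribed test suite; each operand has 2^64 bit patterns, so the pair space is 2^128 and any finite list is a judgment call.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// A hand-picked sample of "interesting" doubles. Any such list is a
// judgment call: the full input space for one operand alone has 2^64
// bit patterns.
std::vector<double> sample_inputs() {
    const double inf = std::numeric_limits<double>::infinity();
    return {
        0.0, -0.0,                                  // signed zeros
        1.0, -1.0,
        std::numeric_limits<double>::min(),         // smallest normal
        std::numeric_limits<double>::denorm_min(),  // smallest subnormal
        std::numeric_limits<double>::max(),
        inf, -inf,
        std::numeric_limits<double>::quiet_NaN(),
    };
}

// One testable property of IEEE multiplication: it is commutative.
// (We treat any two NaN results as equal, since NaN != NaN.)
// Checking this over the sample exercises edge cases, but it cannot
// rule out a bug in a pair of values we didn't select.
bool commutes(double a, double b) {
    double x = a * b;
    double y = b * a;
    return (x == y) || (std::isnan(x) && std::isnan(y));
}
```

Even running `commutes` over every pair drawn from `sample_inputs` checks only 100 of the 2^128 possible pairs.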
The second counterexample might come from a requirement that a program not leak memory. It's hard to express such a requirement in terms of specific numbers, because the requirement really governs asymptotic behavior. In such circumstances, it is typically a judgment call as to whether the requirement has been met in a particular case.
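One way to approximate such a test is to instrument allocation and check that the number of live blocks returns to its baseline for workloads of several sizes -- an asymptotic check rather than a fixed-number one. The sketch below is hypothetical: `tracked_alloc`, `tracked_free`, and `run_workload` are names of my own invention, and a real system might instead override global `operator new`/`operator delete` or use a heap profiler.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical instrumentation: count allocations that are still live.
static std::size_t live_blocks = 0;

void* tracked_alloc(std::size_t n) { ++live_blocks; return ::operator new(n); }
void tracked_free(void* p)         { --live_blocks; ::operator delete(p); }

// A workload that allocates and then frees. "Not leaking" means the
// live count after the run is back to what it was before the run, no
// matter how large n is -- a claim about growth, not about any one number.
void run_workload(std::size_t n) {
    std::vector<void*> blocks;
    for (std::size_t i = 0; i < n; ++i) blocks.push_back(tracked_alloc(64));
    for (void* p : blocks) tracked_free(p);
}
```

Even this only samples a few workload sizes; whether those sizes are representative is exactly the judgment call the requirement forces on us.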
The third counterexample is related: It is often important that a program be robust against incorrect input. In particular, programs, such as web applications, that take input from the public have to be robust against malicious attempts to manipulate them with inappropriate input. In such cases, the cost of forgetting to test for a particular hazard may be the loss of the entire system of which the program is a part. Worse, that hazard might lie in a library component that there is no easy way to test directly.
What can we do in such circumstances? The best approach I can think of is to try to verify the program's behavior independently of testing it. Here I am using the word "verify" to mean reasoning about the program with an eye toward showing that certain kinds of failures are impossible. If, for example, you can locate every place in a program where memory is allocated and prove that the memory is always freed, then you have just increased your confidence that the system as a whole doesn't leak memory. One way of proving such behavior might be to show that memory is allocated only in constructors and is always freed in the corresponding destructors. In that case, the program can only leak memory if it leaks objects, and it may well be much easier to prove that that cannot happen.
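The allocate-in-constructor, free-in-destructor discipline described above is what C++ programmers call RAII. A minimal sketch, with a hypothetical `Buffer` class and a live-object counter added purely for demonstration:

```cpp
#include <cstddef>

static std::size_t live_buffers = 0;  // instrumentation for the demo only

// RAII: memory is allocated only in the constructor and freed only in
// the destructor. If no Buffer object leaks, no memory leaks -- so the
// leak argument reduces to an argument about object lifetimes.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new char[n]) { ++live_buffers; }
    ~Buffer() { delete[] data_; --live_buffers; }
    Buffer(const Buffer&) = delete;             // forbid copying, so each
    Buffer& operator=(const Buffer&) = delete;  // allocation has one owner
private:
    char* data_;
};

void use_buffers() {
    Buffer a(128);
    Buffer b(256);
    // Both destructors run when this scope ends, even if the code
    // between construction and here were to throw an exception.
}
```

The point of the discipline is that it turns a global property (no call to allocate is ever left unmatched) into a local one (each object's constructor and destructor are paired), and local properties are far easier to reason about.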
I don't want to minimize the importance of testing. Indeed, I have some interesting examples that show just how important testing can be. But even when a program has passed all its tests, I think it's a mistake to assume that it's incapable of improvement.