Test cases exist to verify that operations do in fact have the results they are expected to have. "Monkey at the keyboard" work, where what should happen is unknown or what happens is ignored, may be doing something, but it certainly is not testing.
Verification, in fact, is the most important part of testing. Why is it, then, that it is often left to the end, something to be done once the test case is doing whatever it is supposed to do? Part of the reason is that getting the test case to do whatever it is supposed to do is painful. It takes so much effort to figure out exactly which widgets need to be clicked in what order, and how to know when the application is done processing the last click so that the test case can move on to the next click, that by the time all that works testers have little energy left to spend on verifying that all those clicks and moves and keystrokes actually did the right thing.
Problem: Reusing Verification Is Hard
Determining exactly what should happen and when, and figuring out where to get the information necessary to determine that it in fact did happen, is also a time-consuming and painful process. That knowledge would ideally be reused across test cases, but generalizing and centralizing verification code into shared helpers tends to be much more problematic than doing the same for execution code. The amount of data necessary to properly parameterize common verification methods can become quite large, and the calculations required to robustly determine expected state in all cases quickly become complicated.
An alternative to these large and complicated all-encompassing verification methods is a family of more focused verification methods. This typically reduces the number of arguments that must be passed to each method but it simultaneously increases the number of methods that must be called, for a net gain near zero. Also, while the methods themselves do get smaller they don't necessarily become any less complex.
Problem: Verification Is Intertwined With The Test Case
Regardless of whether the verification is done inline in the test case or in helper methods called from the test case, the code verifying the test case is intermixed with the code executing the test case. Initial state data must be gathered before individual operations are executed. Expected state can be calculated at any time from when initial state is recorded until just before actual state is verified. Verification that actual state matches expected state must be done sometime after each operation is executed; often immediately after, if subsequent steps in the test case will destroy the current actual state. All of this makes it difficult to differentiate between execution code and verification code. What code needs to change if a step in the test case changes? What code needs to change if the expected result of an operation changes? These questions can be quite difficult to answer.
Problem: Verification Is Less Than Comprehensive
In part because of all these complications, the set of properties that are typically verified is nowhere near the complete set that would be necessary for truly comprehensive verification (that is, verifying every property after every operation). Doing so would cause verification code to overwhelm the execution code to the point that searching out execution code would be akin to searching for a needle in a haystack. Calculating all this expected state would be complex and error prone. Checking all this state would require verification call after verification call after verification call. This copious amount of work is generally deemed not worth the trouble: the work required to add this verification code to every test case is hard to justify, and the effort required to update every test case when verification details change is harder still.
This is especially true since for any particular operation most properties will be unchanged. Experienced testers, though, will recognize that this is exactly how the most insidious bugs are manifested: changes in something that should be completely unaffected by the operation. If a technique existed that:
- Gave all the benefits of checking every property after every action while avoiding the tedium of explicitly acting to do so,
- Allowed everything to be verified all the time without requiring every test case be visited each time an expected result changed, and
- Allowed the definition of "everything" to be modified without requiring every test case be updated,
the balance of power would be changed. Bugs currently missed by not checking everything all the time could be caught.
Solution: Decouple Verification From Test Cases
Loosely Coupled. Loosely Coupled Comprehensive Verification is that technique, and it is easy to explain and almost as easy to adopt and implement. Just before a test case executes an operation, it notifies the Verification Manager that it is about to do so and provides any relevant details. The test case next executes the operation, and then finally it notifies the Verification Manager that it has completed the operation. That's it as far as the test case is concerned.
Nothing about verification is embedded in the test case. Test cases must of course broadcast the details of each operation they execute, but these details are typically far fewer and much less complicated than the arguments required when shared verification helpers are called directly.
As always, the devil is in the details, and by no means have those details been eliminated. The key here, though, is that those details are almost completely decoupled from the test cases. Test case changes do not affect verification just as verification changes do not require editing test cases.
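The notification protocol described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `RecordingVerificationManager`, `verified_operation`, `operation_beginning`, and `operation_ended` are hypothetical, and the manager here merely records what it is told rather than verifying anything.

```python
from contextlib import contextmanager


class RecordingVerificationManager:
    """Hypothetical stand-in that only records the notifications it receives."""

    def __init__(self):
        self.events = []

    def operation_beginning(self, operation, **details):
        self.events.append(("begin", operation, details))

    def operation_ended(self, operation):
        self.events.append(("end", operation))


@contextmanager
def verified_operation(manager, operation, **details):
    """Bracket one operation with the two notifications.

    If the operation raises, the "ended" notification is not sent,
    because the operation did not complete.
    """
    manager.operation_beginning(operation, **details)
    yield
    manager.operation_ended(operation)


def rename_document_case(app, manager):
    """A test case: it executes steps and contains no verification code."""
    with verified_operation(manager, "rename", new_name="report"):
        app["name"] = "report"  # the operation itself


manager = RecordingVerificationManager()
app = {"name": "untitled"}
rename_document_case(app, manager)
```

The test case states only what it is about to do and that it has done it; what "rename" is expected to change lives elsewhere.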
Baselined. Additionally, those details are somewhat less complicated. Recall that when the Verification Manager receives notification that something is about to happen, that notification includes any parameters necessary to implement the operation. The Verification Manager responds by grabbing a copy of the complete current state of the application; this forms the basis of the expected state when the operation is complete. It then passes the Operation Beginning message on to a collection of Expected State Generators.
Each Expected State Generator is responsible for knowing what should happen to a particular portion of the application state in response to each possible operation. If an operation should not have any effect on its part of the model, the Expected State Generator does not need to do anything. When an operation should have an effect on its part of the model, the Expected State Generator updates the expected state data structure however is appropriate.
When the Verification Manager is notified that the operation is complete, it responds by again grabbing a copy of the complete current state of the application. It then compares this data structure against its cached expected state data structure and logs any differences.
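The baseline, generate, and compare cycle might look like the following minimal sketch. It assumes, purely for illustration, that application state can be captured as a flat dictionary of property names to values; the class layout, the `take_snapshot` callable, and the `title_generator` function are hypothetical names, not part of the original design.

```python
import copy

_MISSING = object()  # marks a property absent from one of the two snapshots


class VerificationManager:
    def __init__(self, take_snapshot, generators):
        self._take_snapshot = take_snapshot  # callable returning current state
        self._generators = list(generators)
        self._expected = None
        self.differences = []

    def operation_beginning(self, operation, **details):
        # Baseline: expected state starts as a copy of the current state.
        self._expected = copy.deepcopy(self._take_snapshot())
        # Each generator adjusts only the part of the state it owns.
        for generate in self._generators:
            generate(operation, details, self._expected)

    def operation_ended(self, operation):
        # Compare every property, not just those the operation should touch.
        actual = self._take_snapshot()
        for key in sorted(set(self._expected) | set(actual)):
            want = self._expected.get(key, _MISSING)
            got = actual.get(key, _MISSING)
            if want != got:
                self.differences.append((operation, key, want, got))


def title_generator(operation, details, expected):
    """Knows how operations affect the title; ignores everything else."""
    if operation == "rename":
        expected["title"] = details["new_title"]


app_state = {"title": "untitled", "zoom": 100}
manager = VerificationManager(lambda: app_state, [title_generator])

manager.operation_beginning("rename", new_title="report")
app_state["title"] = "report"
app_state["zoom"] = 50  # simulated bug: rename should not touch zoom
manager.operation_ended("rename")

print(manager.differences)  # [('rename', 'zoom', 100, 50)]
```

No generator mentions `zoom`, yet the unexpected change is still reported, because the baseline copy carries every untouched property forward as its own expected value.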
Isolated. This loose coupling between the verification model and the rest of the system makes it very flexible. The source of the information verification is given about each action is irrelevant, so a test case could call directly into verification just as easily as any test infrastructure might. Similarly, the details of how each Expected State Generator performs its calculations have no bearing on other Expected State Generators or on the rest of the system, so these details can be changed at will. The verification model itself does not know anything about the rest of the system save that some part of the system informs verification about its actions. Thus if any of these details change (as expected state generation is likely to do frequently), just that one small part of the system is affected.
The one point at which verification is coupled to the rest of the system is at the definitions of the Operation Beginning and Operation Ended events. Wherever these events are defined, action implementers must raise them and Loosely Coupled Comprehensive Verification must listen to them. Both sides must agree on the information provided for each operation and on its semantics.
Easily Changed. A useful side effect of decoupling verification details from test cases is that the set of properties being verified can start small and simple. As application code comes online and the feature team comes to agreement as to what exactly is expected to happen in particular cases, these details can expand. As testers have time to implement the necessary calculations, they can expand even further. It is true that test cases may initially be executing with minimal or even no verification, but they are running, test cases can be debugged, and crashes will be found.
Helping expected state calculations come online over time is the ability to say "I don't care" what happens to a particular property as a result of a particular operation. If the feature team hasn't yet decided what should happen in a particular case, the Expected State Generator can set the associated state values to "I don't care", causing those values to be ignored during the actual-to-expected state comparison. When the feature team does decide what should happen, the tester can simply update the Expected State Generator appropriately and suddenly every test case will automatically expect the new behavior. No changes to any test case are necessary; everything just continues to work.
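One way to represent "I don't care" is a sentinel value that the comparison step skips. The sketch below is an assumption about how this could be done, not a description of the original implementation; `DONT_CARE`, `compare_states`, and `selection_generator` are hypothetical names, and state is again modeled as a flat dictionary.

```python
DONT_CARE = object()  # sentinel: this property is exempt from comparison


def compare_states(expected, actual):
    """Return (property, expected, actual) for every mismatch that matters."""
    differences = []
    for key, want in expected.items():
        if want is DONT_CARE:
            continue  # behavior not yet decided, so any value is accepted
        got = actual.get(key)
        if got != want:
            differences.append((key, want, got))
    return differences


def selection_generator(operation, details, expected):
    """Where the selection lands after a paste is still undecided."""
    if operation == "paste":
        expected["selection"] = DONT_CARE


expected = {"text": "ab", "selection": (0, 0)}
selection_generator("paste", {}, expected)
actual = {"text": "ab", "selection": (2, 2)}

undecided = compare_states(expected, actual)

# Once the team decides, only the generator's output changes.
expected["selection"] = (1, 1)
decided = compare_states(expected, actual)
```

Here `undecided` is empty and `decided` reports the selection mismatch; the test cases that perform the paste are identical in both situations.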
Follow-On Failures Eliminated. Another benefit of Loosely Coupled Comprehensive Verification is the near elimination of follow-on failures. The typical strongly coupled verification hardcodes expected values throughout the test case; if a given step has an unexpected result, the test case cannot adapt to expect that incorrect value in subsequent verifications and so they fail as well. With Loosely Coupled Comprehensive Verification, though, the first step of a verification cycle is to gather the current state of the application. Thus, Expected State Generators base their calculation on that state, not on some predetermined or hardcoded value. If an action has an unexpected result, that failure will be caught, but baselining prevents it from affecting subsequent actions and verifications.
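A toy comparison makes the difference concrete. The counter application, the deliberately buggy step, and all function names below are invented for illustration; the only point is the contrast between hardcoded expectations and expectations computed from a fresh baseline.

```python
def add_one(state):
    state["count"] += 1


def buggy_add_one(state):
    state["count"] += 2  # simulated product bug


def predict_add_one(baseline):
    """Expected state generator: the count should rise by exactly one."""
    baseline["count"] += 1
    return baseline


def hardcoded_failures(operations, expected_counts):
    """Strongly coupled style: expected values fixed in advance."""
    state = {"count": 0}
    failures = []
    for index, (operation, want) in enumerate(zip(operations, expected_counts)):
        operation(state)
        if state["count"] != want:
            failures.append(index)
    return failures


def baselined_failures(operations):
    """Loosely coupled style: expected state derived from the current state."""
    state = {"count": 0}
    failures = []
    for index, operation in enumerate(operations):
        expected = predict_add_one(dict(state))  # baseline taken before the step
        operation(state)
        if state != expected:
            failures.append(index)
    return failures


operations = [add_one, buggy_add_one, add_one]
print(hardcoded_failures(operations, [1, 2, 3]))  # [1, 2]
print(baselined_failures(operations))             # [1]
```

With hardcoded values the bug in the second step also fails the third; with baselining only the step that actually misbehaved is reported.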
Many thanks to everyone who reviewed this paper, most especially Mike Gallacher. Thanks also to the entire Designer Tools team for assisting our efforts to make testing more efficient and productive.
Michael J. Hunter is Designer Tools Test Technical Lead at Microsoft. He can be contacted at Michael.J.Hunter@microsoft.com .