The subject for this and next month's columns is testing, with a focus on OO systems. This month we'll cover the basics, looking at tests for the code that I've presented over the past few months: the ExtendedLogger classes. You may want to go back and reread those articles so that you can see what we're testing. Next month I'll focus more on mock-based testing (and mock-based design).
The test code from this month is pretty straightforward, but serves as a vehicle for talking about core issues.
I've mentioned this before, but the one essential principle that underlies all object-oriented systems is "abstraction." Your objects should be black boxes. You know what the interface to the object looks like, but the way in which the object works should be completely hidden. This hard-core abstraction has a highly desirable side effect: you should be able to completely change the implementation of an object (literally throw out all the field definitions and all the method bodies) and, as long as the interface hasn't changed, the clients (the objects that use the class you just modified) should be unaware that you've made any changes.
The opacity that OO abstraction entails controls the way an object is put together in really fundamental ways. For one thing, there can be no so-called "properties" (getter or setter, "accessor" or "mutator," methods that just return or modify fields), because these methods provide too much information to the outside world about how the object works internally. You typically can't change the field that underlies the getter/setter, much less eliminate that field altogether, without severely impacting all the clients. It's a maintenance nightmare.
Not only does hiding the implementation this way improve maintainability, but you can't do any sort of Agile development without it. All Agile processes require you to easily introduce new business requirements into your code. If all of your classes are coupled to each other through getters and setters, however, it's almost impossible to do that. Very small changes ripple out to the entire program very quickly, and making even trivial modifications to one class can touch just about every class in the system by the time you're done. You just can't shoehorn that much work into the short cycle time required by all Agile processes.
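To make the throw-out-the-implementation claim concrete, here's a minimal sketch. Everything in it (the Account interface, the two implementations, and the AccountClient class) is hypothetical and invented for illustration; it isn't code from this series. The two implementations share no fields at all, yet the client can't tell them apart because it only asks the account to do work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: clients ask the account to do work, never for its fields.
interface Account {
    void deposit(long cents);
    // Returns false (and changes nothing) if the funds are insufficient.
    boolean withdraw(long cents);
}

// First implementation: a single running total.
class RunningTotalAccount implements Account {
    private long balance;

    @Override public void deposit(long cents) { balance += cents; }

    @Override public boolean withdraw(long cents) {
        if (cents > balance) return false;
        balance -= cents;
        return true;
    }
}

// Second implementation: no balance field at all, just a transaction log.
class LedgerAccount implements Account {
    private final List<Long> entries = new ArrayList<>();

    @Override public void deposit(long cents) { entries.add(cents); }

    @Override public boolean withdraw(long cents) {
        long available = 0;
        for (long entry : entries) available += entry;
        if (cents > available) return false;
        entries.add(-cents);
        return true;
    }
}

// A client that works identically against either implementation.
class AccountClient {
    static boolean payRent(Account account, long rentInCents) {
        return account.withdraw(rentInCents);
    }
}
```

Had Account exposed a getBalance()/setBalance() pair, the switch to a ledger would probably have forced changes on every caller that did its own arithmetic on the balance.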
On a purely structural level, the basic principle is the Law of Demeter (LoD): "Talk only to your friends." (The phrase was coined by Ian Holland, but was popularized in many books, most notably Hunt and Thomas's The Pragmatic Programmer. For the classicists amongst us, the name has only a tenuous relationship to the Greek goddess of the harvest. The Law came out of Project Demeter, whose central philosophy was "grow software in small steps," which certainly should be the law, but doesn't seem to have a pithy name.) A great example (from Freeman, Mackinnon, Pryce, and Walnes's paper Mock Roles, Not Objects):
"Programmers should avoid writing code that looks like:
dog.getBody().getTail().wag(); colloquially known as a 'Train Wreck.' This is bad because this one line depends on the interfaces and implied structure of three different objects. This style laces structural dependencies between unrelated objects throughout a code base. The solution is described by the heuristic 'Tell, Don't Ask,' so we rewrite our example as:
dog.expressHappiness() and let the implementation of the dog decide what this means."
That is, you may choose to implement expressHappiness() inside the dog by calling body.wagTail(), but you now have the option of doing something else entirely. The problem with the "train wreck" anti-pattern is that changing any of the classes in the chain breaks the entire chain, so it's very fragile. The "friends" in "talk only to your friends" are your immediate neighbors in the network of objects that comprise your program. Talk only to objects to which you have a direct reference (stored in a field or passed into a method as an argument). A train wreck is, of course, a series of getters.
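Here's one way the "Tell, Don't Ask" version of the quoted example might look. This is a sketch under my own assumptions: the Tail, Body, Dog, and Owner types are hypothetical, and only the method names expressHappiness(), wagTail(), and wag() come from the text above. Notice that each object talks only to a collaborator it holds directly, and that there are no getters to chain.

```java
// Hypothetical collaborators, named after the quoted example.
interface Tail {
    void wag();
}

class Body {
    private final Tail tail;

    Body(Tail tail) { this.tail = tail; }

    // Body talks only to its own Tail.
    void wagTail() { tail.wag(); }
}

class Dog {
    private final Body body;

    Dog(Body body) { this.body = body; }

    // The client says what it wants; the Dog decides how. Today that means
    // wagging, but the implementation could change without touching callers.
    void expressHappiness() { body.wagTail(); }
}

class Owner {
    // The train-wreck version would be
    //     dog.getBody().getTail().wag();
    // which can't even be written here, because no getters exist.
    void comeHome(Dog dog) { dog.expressHappiness(); }
}
```

The Owner depends on exactly one method of one class. If the Dog later expresses happiness in some other way, the Owner doesn't change.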
Note, by the way, that there's a big difference between a getter or setter that's doing nothing but providing public access to a field and a method that implements a well-thought-out interface by returning or modifying a field. The first case breaks the abstraction. In the second case, the fact that the easiest way to implement some method is to simply set a value in no way precludes you from completely changing that method to do something else entirely in the future.
That "Tell, Don't Ask" aphorism is, I think, more helpfully expressed as "Ask for help, not for information." I discussed the principle, in the context of making your code smaller, in Solving the Configuration Problem for Java Apps, the article that started this series, but it really fundamentally changes the way that you structure your code. You end up thinking about the responsibilities of the objects and the operations that they have to support to implement those responsibilities. You'll need fields to implement the methods, but that's just an implementation detail that doesn't even come up at design time, and is never exposed to the outside world.
So how does all this apply to testing?
The notion that changes to the object shouldn't affect the clients applies in spades to tests. If both your test and the object that you're testing change at the same time, you have no idea whether a failure indicates a broken test or a broken object. Ideally, you want to make changes to the object, run the tests, and then move on if the tests pass. If your tests are suspect, you can't do that. To make the test immune to changes in the tested object, the object must be a black box to the test. That is, the test must work entirely through an interface that provides no direct access to the inner workings of the object. The test should act like any other client, and radical changes to the implementation of the object under test shouldn't affect the tests themselves. You just can't get that level of isolation if you expose implementation with things like getters and setters.
Having solid tests also changes the way that you work. I typically get nervous if I add more than about 20 lines of code to a class without running at least a subset of my unit tests, and I've found that I work much faster when I work that way. Bugs tend to be very easy to find, because they're probably somewhere in those 20 lines. If you test once a day, you need to look at an entire day's work to find the bug. I'm typically not doing formal "test driven development" (I'm not building the tests before I write any code) but instead add unit tests incrementally as the code evolves. I do add tests when I've added something new to the object. When I find a bug, I generally add a test that makes sure that the bug doesn't reappear. All these tests really free up your ability to experiment, since you can immediately tell if an experiment fails. More to the point, you can't really refactor if you don't have the tests in place, because you have no other way of telling whether or not your refactoring has broken something.
This test-often strategy depends on the tests testing behavior without knowing how the object works under the covers. That is, if I need to introduce significant structural changes to a class without changing the tests, then the tests can't assume anything about how the object is implemented. If, for example, my test validates the success of a method call by using getters to examine the fields that represent the interior state of the object, the test will fail outright if I need to change (or eliminate) those fields, and can fail subtly if I introduce a field that the test doesn't know about. Once I've determined that a test works properly, I never want to change it again.
OO tests, then, stimulate an object and then observe that object's behavior (which is often best done with mocks; I'll discuss those next month). The tests do not look at the object's state to judge whether or not the test succeeds; they just look at the object's behavior. When I call dog.expressHappiness(), my test succeeds when I observe that wagTail() is called. I would never look inside the dog at the isHappy field, nor would I have a getIsHappy() method and call it in a test. Sometimes, the result of the stimulus might be quite distant from the stimulus itself. For example, I might verify that a notifyByEmail() method on a Customer object worked by seeing that the email actually arrives at the correct place, effectively testing the entire email chain. I couldn't do that, however, if I didn't have a fully functional email system, and I would have written many smaller tests in the process of getting that email system working (all of which would still be in place).
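The stimulate-then-observe idea can be sketched without any mocking library at all. In the hypothetical code below I've modeled Body as an interface so that the test can hand the Dog a recording stand-in; that's my assumption for the sake of the example, as are the RecordingBody and DogBehaviorTest names. A real test would probably live in a test framework rather than in a static method that returns a boolean.

```java
// Hypothetical collaborator interface; the Dog is handed its Body rather than creating it.
interface Body {
    void wagTail();
}

class Dog {
    private final Body body;
    // Whatever other fields the Dog keeps (an isHappy flag, say) stay private
    // and are never examined by the test.

    Dog(Body body) { this.body = body; }

    void expressHappiness() { body.wagTail(); }
}

// Hand-rolled test double that records the one behavior the test cares about.
class RecordingBody implements Body {
    private int wagCount;

    @Override public void wagTail() { wagCount++; }

    boolean wasWaggedTimes(int expected) { return wagCount == expected; }
}

class DogBehaviorTest {
    // Returns true when the dog responds to the stimulus by wagging exactly once.
    static boolean expressHappinessWagsTail() {
        RecordingBody body = new RecordingBody();
        Dog dog = new Dog(body);

        dog.expressHappiness();          // stimulus

        return body.wasWaggedTimes(1);   // observed behavior, not internal state
    }
}
```

Because the test never touches the Dog's fields, I could add, rename, or remove those fields without editing the test.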
Boundary-level testing is particularly useful when you need to modify a complicated existing system that has no unit tests (which is all too often the case; maybe I work for too many startups). Start introducing unit tests at the outer edges of the system (test that the mail arrives). The obvious problem is that, when (not if) something doesn't work, you have no idea where the problem lies, but at least you can now tell that there is a problem. Over time, as you work on the system, you can add unit tests that verify that the innards of the system are working as expected.
Note, by the way, that this same look-at-behavior-not-state argument applies to examining the state of the database as well. It's a really bad idea to validate a method call by looking at the database to see if some row has been modified in some way (because changes to the schema break all the tests). A better approach looks at the way that database-dependent objects behave. For example, let's say that a "User" object in your system is defined by a user name expressed as an email address and a password. When you create a new user, you could look at the database to see if the new user was created as expected. However, if you can log in successfully, you can infer that the database was set up correctly without having to look at the database at all (assuming that the log-in process was validated by a previous test, of course).
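As a rough illustration of the create-then-log-in test, consider the sketch below. The UserDirectory class is hypothetical, and its in-memory Map is only a stand-in for a real database table (a real system would also store a salted hash rather than the password itself). The point is that the test never reads the storage; it infers that the user was stored correctly from the fact that logging in works.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical service; the Map stands in for a real database table.
class UserDirectory {
    // Sketch only: a real system would store a salted hash, not the password.
    private final Map<String, String> passwordsByEmail = new HashMap<>();

    // Returns false if the email address is already registered.
    boolean createUser(String email, String password) {
        return passwordsByEmail.putIfAbsent(email, password) == null;
    }

    // Returns true only for a registered email with the matching password.
    boolean logIn(String email, String password) {
        return password.equals(passwordsByEmail.get(email));
    }
}

class UserDirectoryTest {
    // Validates createUser() purely through behavior: a new user can log in
    // with the right password and cannot log in with the wrong one.
    static boolean newUserCanLogIn() {
        UserDirectory directory = new UserDirectory();
        directory.createUser("pat@example.com", "s3cret");
        return directory.logIn("pat@example.com", "s3cret")
            && !directory.logIn("pat@example.com", "wrong");
    }
}
```

If the Map were replaced by a real schema, or the schema changed later, this test would stay exactly as it is.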
If you can't test an object via behavior, then you should really wonder if the work you just did has any value at all. Ultimately, it's the behavior of the system that you care about, not its internal state. If all the work that you just did doesn't change any behavior, then what have you accomplished? This argument applies just as much to the object level as it does to the system level. If an object's behavior doesn't change, then what use was the change you just made?
You can also use behavior-based tests to identify (and eliminate) useless parts of the code. If, for example, changing a method argument doesn't change any behavior, then that argument (and all the code that uses the argument) is unnecessary and should be purged. Small programs are faster, easier to understand, and easier to maintain. If a method is never called, it shouldn't exist.