Channels ▼
RSS

Design

In Praise of Small Classes


If you've been doing OO programming for a while, you've surely run into the seemingly endless essays on testability. This issue they debate is how to write code to make it more amenable to automated testing. It's a vein that is particularly intriguing to exponents of test-driven development (TDD), who argue that if you write tests first, as in the orthodox approach to TDD, your code will be inherently testable.

In real life, however, this is not always how it happens. TDD developers frequently shift to the standard code-before-tests approach when hacking away at a complex problem or one in which testability is not easily attained. They then write tests after the fact to exercise the cod; then modify the code to increase code coverage. There are good reasons why code can be hard to test, even for the most disciplined developers. A simple example is testing private methods; a more complex one is handling singletons. These are issues at the unit testing level. At higher levels, such as UAT, a host of tools help provide testability. Those products, however, tend to focus principally on the GUI aspect (and startlingly few of those handle GUIs on mobile devices). In other areas, such as document creation, there is no software that provides automated UAT-level validation because parsing and recreating the content of a report or output document is often an insuperable task.

I don't want to get off my main point, however. Which is that what makes code untestable is frequently not anything I've touched on so far, but rather excessive complexity. High levels of complexity, generally measured with the suboptimal cyclomatic complexity measure (CCR), is what the agile folks correctly term a "code smell." Intricate code doesn't smell right. According to numerous studies, it generally contains a higher number of defects and it's hard — sometimes impossible — to maintain. Fortunately, there are many techniques available to the modern programmer to reduce complexity. One could argue that Martin Fowler's masterpiece, Refactoring, is almost entirely dedicated to this topic. (Michael Feathers' Working Effectively With Legacy Code is the equivalent tome for the poor schlemiels who are handed a high-CCR codebase and told to fix it.)

My question, though, is how to avoid creating complexity in the first place? This topic too has been richly mined by agile trainers, who offer the same basic advice: Follow the Open-Closed principle, obey the Hollywood principle, use the full panoply of design patterns, and so on. All of this is good advice; but ultimately, it doesn't cut it. When you're deep into a problem such as parsing text or writing business logic for a process that calls on many feeder processes, you don't think about Liskov Substitution or the Open-Closed principle. Typically, you write the code that works and you change it minimally once it passes the essential tests. In other words, as you're writing the code there is little to tell you, "Whoa! You're doing it wrong."

For that, you need another measure, one which I've found to be extraordinarily effective in reducing initial complexity and greatly expanding testability: class size. Small classes are much easier to understand and to test.

If small size is an objective, then the immediate next question is, "How small?" Jeff Bay, who contributed a brilliant essay entitled "Object Calisthenics" (in the book The Thoughtworks Anthology) that touches on this topic, suggests the number should be in the 50-60 line range. Essentially, what fits on one screen.

Most developers, endowed as we are with the belief that our craft does not and should not be constrained to hard numerical limits, will scoff at this number (or at any number of lines) and will surely conjure up an example that is irreducible to such a small size. Let them enjoy their big classes. But I suspect they are wrong about the irreducibility.

I have lately been doing a complete rewrite of some major packages in a project I contribute to. These are packages that were written in part by a contributor whose style I never got the hang of. Now that he's moved on, I want to understand what he wrote and convert it to a development style that looks familiar to me and is more or less consistent with the rest of the project. Since I was dealing with lots of large classes, I decided this would be a good time to hew closely to Bay's guideline. At first, predictably, it felt like a silly straitjacket. But I persevered, and things began to change under my feet. Here is what was different:

Big classes became collections of small classes. I began to group these classes in a natural way at the package level. My packages became a lot "bushier." I also found that I spent more time in managing the package tree, but this grouping feels more natural. Previously, packages were broken up at a coarse-grained level that dealt with major program components and they were rarely more than two or three levels deep. Now, their structure is deeper and wider and is a useful roadmap to the project.

Testability jumped dramatically. By breaking down complex classes into their organic parts and then reducing those parts to the minimum number of lines, each class did one small thing that I could test. The top level class, which replaced its complex forebear, became a sort of main line that simply coordinated the actions of multiple subordinate classes. This top class generally was best tested at the UAT level, rather than with unit tests.

The single-responsibility principle (SRP), which states that each class should do only one thing, became the natural result of the new code, rather than a maxim that needed to apply consciously.

And finally, I have enjoyed an advantage foretold by Bay in his essay: I can see the entire class in the IDE without having to scroll. Dropping in to look at something is now quick. If I use the IDE to search, the correct hit is easy to locate, because the package structure leads me directly to the right class. In sum, everything is more readable; and on a conceptual level, everything is more manageable.

Update: The follow-up editorial, "Making Large Classes Small (In 5 Not-So-Easy Steps)," is available here.

— Andrew Binstock
Editor in Chief
alb@drdobbs.com
Twitter: platypusguy


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video