The Missing Theory of Refactoring

So what is refactoring? Refactoring is an attempt to bring software more in line with these general principles of good design. What is a bad smell? A bad smell is the absence of one of the design tenets. A really bad odor is caused by breaking several design principles at once. A given refactoring transformation applies in certain situations (pre-conditions) to improve the software's compliance with good design.

For example, consider the known bad smells "switch statement" and "duplicated code". Both result from violating the principle of component singularity; they are special cases of the more general problem of missing singularity. In the same way, the smells "feature envy", "data clumps", "shotgun surgery", and "parallel inheritance hierarchies" are all special cases of missing functional locality. The transformations known to remove these smells work because they restore the underlying design principles that were broken.
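To make the singularity case concrete, here is a minimal Python sketch. The shape names, functions, and classes are hypothetical and chosen only for illustration; the transformation shown is the well-known Replace Conditional with Polymorphism. In the "before" version the same decision on `kind` is repeated in two functions (a switch statement that is also duplicated code). In the "after" version each shape's behavior is stated exactly once.

```python
import math


# Before: the same switch on `kind` is duplicated in two functions, so the
# knowledge of which shapes exist lives in more than one place.
def area_before(kind, size):
    if kind == "circle":
        return math.pi * size ** 2
    elif kind == "square":
        return size ** 2
    raise ValueError(f"unknown shape: {kind}")


def perimeter_before(kind, size):
    if kind == "circle":
        return 2 * math.pi * size
    elif kind == "square":
        return 4 * size
    raise ValueError(f"unknown shape: {kind}")


# After: each shape is one class, and the choice of shape is made once,
# when the object is created.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

    def perimeter(self):
        return 2 * math.pi * self.radius


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

    def perimeter(self):
        return 4 * self.side
```

With this structure, supporting a new shape means adding one class rather than editing every function that switches on `kind`.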

I assert that every refactoring transformation has between one and seven real effects -- improving the listed design criteria. I assert that there are only seven basic smells, one for breaking each design principle. Current refactoring literature presents the transformations and smells ad hoc; they might be valid, but there is no apparent scheme or explanation for them. The design principles solve this problem by providing a theoretical foundation for refactoring.

The relationship between refactoring and the design guidelines is like the relationship between the substances of the world and the periodic table. There are seemingly thousands of different kinds of "stuff" on earth. But the discovery of the elements and the periodic table overturned this view by saying, "There are actually only a small number of elements, which combine in myriad ways to form the many objects we see." In the same way, the software principles combine to form many smells and correcting transformations. And by Ockham's Razor, a simpler theory is preferred over a theory with additional constructs.

So here are answers to the questions I posed.

  1. How does a programmer know when to refactor? Refactor when software violates one or more of the principles of good design.
  2. Which refactoring should be applied in a given situation? Use the transformation that most easily reestablishes good design where it is currently broken.
  3. Why is refactoring an improvement to software? Because there are universal principles of proper software design. Refactoring helps software conform to these guidelines.

As a theory should, the software design rules also suggest undiscovered phenomena: new bad smells and new refactorings to correct them. Confirming the existence of these phenomena is evidence the theory is correct. (This approach is common in physics. A theory predicts certain behavior. When the behavior is observed experimentally, it supports the truth of the theory.)

For software, a new bad smell predicted by the design rules is "reinventing the wheel", which violates the principle of minimality. An example of this odor is the common practice by programmers of writing their own sort routines. Since many operating systems and middleware systems provide highly optimized sorting services, a new sort method can unnecessarily increase the total size of software. A new refactoring transformation to remove this smell is Replace with System Service. This refactoring is applied by completely removing a long run of code (or a method or class) and replacing it with the invocation of a built-in feature. When used appropriately, this change is clearly an improvement to the software.
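As a rough illustration of Replace with System Service, consider the following Python sketch. The function names are hypothetical, and the hand-written bubble sort merely stands in for whatever home-grown routine a program might contain. The "system service" here is Python's built-in `sorted()`.

```python
# Before: a hand-written bubble sort, a hypothetical "reinvented wheel".
def sort_scores_before(scores):
    result = list(scores)  # copy so the caller's list is not modified
    n = len(result)
    for i in range(n):
        for j in range(n - 1 - i):
            if result[j] > result[j + 1]:
                result[j], result[j + 1] = result[j + 1], result[j]
    return result


# After: Replace with System Service. The loop is deleted and the
# built-in sorted() is invoked instead.
def sort_scores_after(scores):
    return sorted(scores)
```

Both versions return a new sorted list and leave the input untouched, so callers are unaffected by the change while the program itself gets smaller.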

In addition to this example, I strongly suspect the seven software design principles predict other bad smells and transformations to correct them, and that confirmation of these predictions will further support the theory. I leave this as a topic for future articles and other researchers.


My thanks to Judy Stafford for suggesting I look into a possible link between refactoring and my ideas about software design theory.
