Hays is a lead engineer at MITRE, and can be contacted at [email protected] Raphael is chief scientist at Eidea Labs and can be reached at [email protected] They are coauthors of AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis (John Wiley & Sons, 1998).
The emergence of the design patterns movement has gone a long way toward codifying a concise terminology for conveying sophisticated computerscience thinking. Still, the likelihood of success for practicing managers and developers is grim. According to J. Johnson in "Creating Chaos" (American Programmer, July 1995), for instance, studies of hundreds of corporate software-development projects indicate that five out of six projects are considered unsuccessful, about a third are canceled, and those remaining deliver software at typically twice the expected budget, having taken twice as long to be developed as originally planned.
While we can assume that all software was intended to solve some problem, these solutions are sometimes as bad as (or worse than) the problem they originally were intended to solve. It is these repeated failures, or negative solutions, that provide the source for the study of what we call "AntiPatterns."
AntiPatterns versus Design Patterns
Unlike design patterns, AntiPatterns start with an existing solution (or legacy approach). Most patterns assume a situation where there are few, if any, preexisting concerns that affect the design of a system. However, experience has demonstrated that situations with legacy and existing problems are much more commonplace in practice. The AntiPattern background and general form amplifies the problem in a way that helps you recognize the problematic structure, symptoms, and consequences. The AntiPattern then presents a common solution that refactors the system to improve benefits and minimize consequences.
As Erich Gamma et al. point out in Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley, 1994), design patterns focus on a particular software-design problem or issue. They contain a detailed discussion of the forces that influence the issue. In the design-pattern solution, the forces are either resolved or an appropriate tradeoff between the forces is established. This results in a divergence-convergence pattern where the context and forces diverge from the problem to provide a greater scope, and the solution diverges by resolving the forces into a single focused solution.
AntiPatterns, on the other hand, are based on a different rhetorical structure. AntiPatterns begin with a compelling problematic solution. From this solution, as Figure 1 illustrates, a discussion of the root causes focuses how the problematic solution is the result of incorrectly resolving the forces for a specific underlying set of problems. This convergence from a concrete situation to the more abstract underlying forces is a key component in communicating an understanding of how and why the problem exists. The Symptoms section provides additional clues for recognizing the improper resolution of the key forces. In the Consequences section, the implications of the problematic solution are discussed, thus providing a divergence similar to a design-patterns discussion of context. Finally, the Refactored Solution provides a better convergence of the underlying forces that lead to a better understanding of the problem and an effective method of resolving it.
AntiPatterns are therefore a natural extension to design patterns. Rather than codifying pure computer science, AntiPatterns focus on repeated software failures in an attempt to understand, prevent, and recover from them. To developers, AntiPatterns are a tool that bridge the gap between architectural concepts and real-world implementations. They are used to illustrate the common, critical problems faced by most software developers and enable learning from other developers' experiences. Only with an understanding of the causes and motivations of an AntiPattern can you ensure that your mistakes are not continually repeated within an organization.
The use of a template defines the difference between design patterns and AntiPatterns and other forms of technical discussion. Templates ensure that important questions are answered about each pattern in a pattern language, pattern catalog, or pattern system. The full AntiPattern template provides a wide range of information about a particular class of problems, including a discussion of the AntiPattern background, the general form, symptoms and consequences, root causes, a refactored solution, and an example detailing how the refactoring process can be applied to create an effective solution. A catalog of AntiPatterns is available electronically (see "Resource Center," page 3). For this discussion, however, we'll use an abbreviated version of a pair of AntiPatterns (the Lava Flow and Stovepipe System). The solutions include the following:
- General Form. Often including a diagram, the general form section identifies the general characteristics of this AntiPattern. This is not an example, but a generic version. A prose description explains the diagram (if any), and provides the general description of this AntiPattern. The refactored solution resolves the general AntiPattern posed by this section.
- Symptoms. Symptoms provide the observable indicators that suggest an AntiPattern might exist in a specific situation. They manifest through comments from individuals, the observed structure of software, organizational changes, the inability to meet targeted goals, and so on.
- Typical Causes. This section summarizes the key underlying causes that typically lead to the existence of a particular AntiPattern.
- Known Exceptions. AntiPattern behaviors and processes may not always be wrong. There are often specific occasions when this is the case. This section briefly identifies the primary exceptions to each full AntiPattern.
- Consequences. The Consequences section describes the harmful effects of the AntiPattern. They include the continuing damage caused by the AntiPattern as well as likely outcomes if corrective measures are not taken.
- Refactored Solution. This section explains a refactored solution that resolves the forces in the AntiPattern identified in the General Form section. (This solution is described without variations that are addressed in the Known Exceptions section.) The solution can be structured in terms of solution steps.
AntiPattern Name: Lava Flow
As Figure 2 illustrates, the Lava Flow AntiPattern is commonly found in systems that originated as research but ended up in production.
It is characterized by the lava-like "flows" of previous developmental versions strewn about the code landscape, but now hardened into a basalt-like, immovable, generally useless mass of code that no one can remember much (if anything) about. This is the result of earlier developmental phases where developers tried out several ways of accomplishing things, perhaps while in a research mode. Predictably, a deadline arises and, in the rush to deliver, sound design is cast to the winds and the documentation of the system becomes sketchy, irrelevant, or altogether nonexistent.
The result is several fragments of code -- wayward variables, classes, and procedures that are not clearly related to the overall system. In fact, these flows are often so complicated and spaghetti-like that they seem important but no one can really explain what they do or why they exist. Sometimes a developer can remember certain details, but everyone has decided to "leave well enough alone" since the code in question "doesn't really cause any harm, and might actually be critical, and we just don't have time to mess with it."
While it can be fun to dissect these flows, there is usually no time in a production schedule for such meanderings. Instead, people take the expedient route and work around them. This AntiPattern is common in innovative design shops where proof of concept or prototype code rapidly moves into production. It is poor design, for several key reasons:
- Lava Flows are expensive to analyze, verify, and test. All such effort is expended entirely in vain and is an absolute waste. In practice, verification and testing are rarely possible.
- Lava Flow code can be expensive to load into memory, wasting important resources and impacting performance.
- As with many AntiPatterns, you lose many of the inherent advantages of an object-oriented design. In this case, you lose the ability to leverage modularization and reuse without further proliferating the Lava Flow blobs.
- Frequent, unjustifiable variables and code fragments in the system.
- Undocumented, complex, important-looking functions, classes, or segments that don't clearly relate to the system architecture.
- Very loose "evolving" system architecture.
- Whole blocks of commented-out code with no explanation or documentation.
- Lots of "in flux" or "to be replaced" code areas.
- Unused (dead) code.
- Unused, inexplicable, or obsolete interfaces in header files.
- R&D code placed into production without thought toward configuration management.
- Uncontrolled distribution of unfinished code. Implementation of several trial approaches toward implementing some functionality.
- Code written by a single developer.
- Lack of configuration management or compliance with process-management policies.
- Lack of architecture, or nonarchitecture-driven development. This is especially prevalent with highly transient development teams.
- Iterative development process. Often the goals of the software project are unclear or change repeatedly. To cope with the changes, the project requires rework, backtracking, and prototype development efforts. In response to a demonstration deadline, there is a tendency to make hasty changes or to code on-the-fly in order to deal with immediate problems. The code is never cleaned up, leaving architectural consideration and documentation postponed indefinitely.
- Architectural scars. Sometimes architectural commitments are made during requirements analysis, only to be deemed obsolete after some amount of development. The system architecture may be reconfigured, but these in-line mistakes are seldom removed. It may not even be feasible to comment-out unnecessary code, especially in modern development environments where hundreds of individual files comprise the code of a system.
Small-scale, throwaway prototypes in an R&D environment are ideally suited for implementing the Lava Flow AntiPattern. It is essential to deliver rapidly and the result is not required to be sustainable.
- If existing Lava Flow code is not removed, it can continue to proliferate as code is reused in other areas.
- If the process that leads to Lava Flow is not checked, there can be a geometric growth as successive developers, too rushed or intimidated to analyze the original flows, continue to produce new, secondary flows as they try to work around the original ones.
- As the flows compound and harden, it rapidly becomes impossible to document the code or understand its architecture enough to make improvements.
In cases where Lava Flow exists already, the cure can be painful. An important principle is to avoid architecture changes during active development. In particular, this applies to computational architecture -- the software interfaces defining the systems-integration solution. Management can postpone development until a clear architecture has been defined and disseminated to developers. Defining the architecture may require one or more system-discovery activities. System discovery is required to locate the components that are really used and are necessary to the system. System discovery also identifies those lines of code that can be safely deleted. This activity is tedious; it can require the investigative skills of an experienced software detective. As suspected dead code is eliminated, bugs will be introduced. When this happens, resist the urge to immediately fix the symptoms without fully understanding the cause of the error. Studying the dependencies will help you better define the target architecture.
To avoid Lava Flow, it is important to establish system-level software interfaces that are stable, well defined, and clearly documented. In the long term, up-front investment in quality software interfaces can produce big dividends, especially compared to the cost of jack-hammering away hardened globules of Lava Flow code. Tools such as source-code control systems (SCCS) assist in configuration management. An SCCS is bundled with most UNIX environments and provides a basic capability to record histories of updates to configuration-controlled files.
AntiPattern Name: Stovepipe System
The Stovepipe System AntiPattern concerns how the subsystems are coordinated within a single system.
The key problem in a Stovepipe System is the lack of common subsystem abstractions (see Figure 3).
Subsystems are integrated in an ad hoc manner using multiple integration strategies and mechanisms. All subsystems are integrated point-to-point. The integration approach for each pair of subsystems is not easily leveraged toward the integration of other subsystems. The system implementation is brittle because there are many implicit dependencies upon system configuration, installation details, and system state. The system is difficult to extend since extensions add additional point-to-point integration links. As each new capability and change is integrated, system complexity increases. Complexity increases throughout the lifecycle of the Stovepipe system. System extension and maintenance become increasingly intractable.
- Large semantic gap between architecture documentation and implemented software. Documentation does not correspond to system implementation.
- Architects unfamiliar with key aspects of integration solution.
- Project is overbudget and has slipped its schedule for no obvious reason.
- Requirement changes are costly to implement. System maintenance generates surprising costs.
- System may comply with most paper requirements but does not meet user expectations.
- Users need to invent workarounds to cope with limitations of system.
- Complex system and client installation procedures that defy attempts to automate.
Multiple infrastructure mechanisms used to integrate subsystems. Lack of a common mechanism makes the architecture difficult to describe and modify.
- Lack of abstraction. Each interface is unique to each subsystem.
- Insufficient use of metadata. Metadata is not available to support system extensions and reconfigurations without software changes.
- Tight coupling between implemented classes, requiring excessive client code that is service specific.
- Lack of architectural vision.
R&D software production will often use the Stovepipe System AntiPattern to achieve a rapid solution. This is acceptable for prototypes. Sometimes, lack of knowledge about a domain may require a Stovepipe System to be developed initially to gain domain knowledge either for building a more robust system or for evolving the initial system into an improved system (see "Big Ball of Mud," by Brian Foote and Joseph Yoder, Proceedings of Pattern Languages of Programming, 1997). Finally, sometimes the wish to use a vendor's product can lead an organization to conclude that the benefits outweigh the consequences of this AntiPattern.
- Lack of interoperability with other systems. Inability to support integrated system management and intersystem security capabilities.
- Changes to the system become increasingly difficult.
- System modifications become increasingly likely to cause serious new bugs.
The refactored solution is a component architecture that provides for flexible substitution of software modules (see Figure 4). Subsystems are modeled abstractly; so there are fewer exposed interfaces than there are subsystem implementations. The substitution can be both static (compile-time component replacement) and dynamic (run-time dynamic binding). The key to defining the component interfaces is to discover the appropriate abstractions. The subsystem abstractions will model the interoperability needs of the system without exposing unnecessary differences between subsystems and implementation-specific details.
To define a component architecture, decide upon a base level of functionality that the majority of applications should support. In general, the base level should be small and focus upon a single aspect of interoperability such as data interchange or conversion. Define a set of system interfaces that support this base level of functionality. Because most services will have an additional interface to express finer-grained functional needs, the component interface should be small.
Having a base level of component services available to all clients in the domain encourages the development of thin clients that will work well with existing and future services without modification. Thin clients are those that do not require detailed knowledge of the services and architecture of the system; a framework may support and simplify their access to complex services. Having several plug-compatible implementations available increases the robustness of clients as they potentially have many options for fulfilling their service request.
Applications will have clients that are written to more specialized (vertical) interfaces. Vertical clients should remain unaffected by the addition of the new component interfaces. Clients that only require the base level of functionality can be written to the horizontal interfaces, which should be more stable and easily supported by other applications. The horizontal interface should hide all the lower-level details of a component via abstraction and should only provide the base-level functionality. The client should be written to handle whatever data types are indicated by the interface to support any future interchange of the horizontal component implementations.
Figure 5 is a typical Stovepipe system. There are three client subsystems and six service subsystems. Every subsystem has a unique software interface. Every subsystem instance is modeled as a class in the class diagram. When the system is constructed, there is unique interface software for each client corresponding to each of the integrated subsystems. If additional subsystems are added or substituted, the clients must be modified with additional code integrating new unique interfaces.
The refactored solution to this example considers the common abstractions between the subsystems (Figure 6). Since there are two services of each type, it is possible for each model to have one or more service interfaces in common. Then, each particular device or service can be wrapped to support the common interface abstraction. If additional devices are added to the system from these abstract subsystem categories, they can be integrated transparently to the existing system software.
AntiPatterns provide guidance in identifying and resolving real-world problems in software development. Like design patterns, they are effective in capturing expertise by documenting commonly recurring software processes and designs. However, they go beyond current design pattern approaches in providing detailed assistance in recognizing when a solution exists. Additionally, AntiPatterns provide step-by-step guidance in changing an existing dysfunctional situation into a beneficial one that avoids the harmful impending consequences. The AntiPattern form has already appeared sporadically on the Internet, with informal AntiPatterns written by authors from the design-pattern community. Based on the Internet forums, the consensus is that AntiPatterns are a worthwhile research area. We hope this article and future work will be instrumental in promoting the AntiPatterns concept into the mainstream of the software development community.
Many thanks to our coauthors of AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis, Dr. Thomas Mowbray, chief scientist at Blueprint Technologies ([email protected]), and William J. Brown, process director for product development at Concept Five Technologies ([email protected]pt5.com), and John Wiley & Sons publishers.
Copyright © 1998, Dr. Dobb's Journal