Shawn is a software consultant with 13 years of experience in medical and embedded application engineering. He can be contacted at firstname.lastname@example.org.
Cleanroom software engineering is a process for developing high-quality software with certified reliability. Originally developed by Harlan Mills, the "cleanroom" name was borrowed from the electronics industry, where clean rooms help prevent defect during fabrication. In that sense, cleanroom software engineering focuses on defect prevention, rather than defect removal, and formal verification of a program's correctness. The Cleanroom Reference Model (http://www.sei.cmu.edu/publications/documents/96.reports/ 96.tr.022.html) provides guidelines for defining development teams and project roles, most importantly, the distinction between testers and developers. Having testing and development in the same group is a conflict of interest, while splitting them into separate groups allows for natural competition to push the project toward higher quality results.
Cleanroom differs from other formal methods in that it doesn't require mathematically defined requirementsthose stated in plain English are adequate. These requirements are divided into tagged statements for traceability. The process of tagging requirements in small verifiable statements allows for tracing and verification of each requirement throughout the process. Moreover, since attempts to document requirements are likely to have errors, inconsistencies, and omissions, Cleanroom refines many of these through the "Box Structure Development Method," a process that treats software as a set of communicating state machines that separate behavioral and implementation concerns (http://www.cs.utk.edu/sqrl/papers/ 199905_sjp_hmc_slides.pdf).
To illustrate the Cleanroom development process, I'll present a GUI that eliminates individual Save and Quit features. The project requirements are split between an abstract GUI and data-editing requirements. Partitioning the tagged requirements along lines of "separation of concerns" is the first step toward discovery of natural divisions in the software architecture. The source code implementation of this GUI is available electronically; see "Resource Center," page 5.
Increments, Architectures, and Schedules
Once initial requirements are established and reviewed by the client, the first step is to plan increments, architecture, and the schedule. Increments are checkpoints to the schedule and treated as deliverables. The architecture defines the interfaces to the system's black box (Figure 1), a mechanism that accepts stimuli and produces responses. Stimuli are any outside input that the software can detect and to which it is expected to react. Stimuli can come from interrupts, function calls, data queues, and the like. A response is any action from the software that crosses the external interfacefunction calls, writes to output streams, and so on. Stimuli and responses are identified from the requirements. Derived requirements are added for any additional stimulus or identified response. The results of this process should be approved and reviewed by clients or project teams.
Table 1 lists some of the initial requirements for the GUI project I present here (complete tables are available electronically). At this point, the requirements are fairly complete, even though this example has implicit assumptions and odd rabbit trails of logic. Resolving issues such as these is typically done during implementation or (worse) after delivery to the client.
In Figure 2 (the architecture), black boxes are circles and the arrows are the interfaces between them. The design choice was to follow the Document/View pattern. Tables 2 and 3 list some of the stimuli/responses for the GUI application (App) state machine. There is no initial requirement for IN (initialize), the first stimulus in Table 2. Since many systems must perform some kind of initialization, this is the first derived requirement. Every stimulus/response should have at least one requirement. Each of these derived requirements should be recorded and reviewed with clients. In my GUI example, some of the stimuli/responses were discovered later in the process. The process of adding derived requirements and communicating with clients verifies assumptions and ensures that the right software is developed. These reviews should be recorded in an accessible manner, such as internal web pages. Table 1(b) lists some derived requirements.
The preliminary work sets the stage for one of the most powerful Cleanroom toolsenumeration, a stepwise construction technique that results in complete, consistent, and verifiably correct specifications. With enumeration, every possible stimulus history is considered, starting with all those of length one, and proceeding forward to length two, three, and so on. At each stage, a stimulus history may have a set of responses, and may be equivalent to a previously identified sequence. Equivalence denotes when a stimulus is equal for both sequences. If a sequence is equal to an earlier sequence, it no longer needs to be extended. Illegal sequences can arise due to environment or design, and are ignored. It can be beneficial in unit testing to generate notification on illegal stimulus occurrences.
Once the extension of sequences has terminated, the process terminates, resulting in a system that has been completely and consistently defined. The process of enumeration is also useful in reverse engineering. Stimuli are applied and observed responses recorded in an enumeration. This is straightforward and based on mathematics. Each sequence is traced back to the requirements that generated it, creating a trace for later verification.
Table 4 is the start of the enumeration for the GUI example. Most sequences are classified as illegal. While enumeration guarantees that all cases are considered for a state box, making it possible to consider the entire set of use cases of the state box, it still seems tedious.
The good news is that enumeration has a provision for simplifying the processabstract stimuli. Consider the blocking "about" box (Table 1, Requirement 4). Once the "about" stimulus occurs, there is no possible other stimuli than the OK button, returning the application to where it began. Since this is always the case, it can be removed from the enumeration by the "deletion abstraction." Both partition abstractions, which can be used to group sets of stimuli sequences into abstract stimuli and bundle abstractions; and where several stimuli are handled in an equivalent manner (say, Ctrl+X or Exit from the user menu), both trigger an exit. Table 5 lists some of the abstract stimuli applied to this GUI example. Now the enumeration can proceed with less clutter.
Table 6 lists the enumeration using these abstractions. As the process goes forward, more derived requirements are identified. For example, in line 43, an automatic save of user edits is attempted and failed, just as the user had requested to open a new file. What should the GUI application do in such a case? The answer is shown in derived requirement D6it should not risk losing user edits without user permission. Other customers may have had a different idea about how the GUI should handle such cases, and this is the reason for reviews. I added the "Query" response and the "QY" and "QN" stimuli. Discovering missed cases can lead to rethinking the requirements because the client's original intents can change drastically when such considerations are made. In any event, you get credit for uncovering usage scenarios that had been left unconsidered and, in the process, iron out the hidden details.
Enumeration should be reviewed following the open-source software approachthe more it's reviewed, the less likely there are to be mistakes. In the full enumeration, some sequences had no equivalence; these are "canonical sequences" and form the basis of the state space of the black box. The sequence-analysis step invents a set of variables that uniquely identify each canonical state (Table 7). In this example, several of the state variables actually indicate which thread the code is executing in. The bottom of Table 7 shows the states without thread information. The actual physical states of the GUI example are: not running, running-no file, running-file, running-edits. The thread information helps specify the flow that occurs when handling user requests.
When defining the specification functions for the system, the responses are mapped to each stimulus coming into the system versus the state variables that were invented. These functions can also rely upon underlying specified data types that can be defined in structurally inductive form, such as queues and arrays. Table 8 lists some sample specification functions.
Coding the Design
Coding directly follows from the specification functions (again, the source code for the GUI example is available electronically). Most coding involves ensuring that stimuli are routed properly, and choosing exiting methods for handling the responses. The resulting logic is simple due to the canonical analysis. The first GUI example I coded for this article was based on existing codeand the example failed. The code was examined, stimuli and responses identified, and an enumeration was completed. However, the enumeration collapsed to three states, which I thought was an error because the code consisted of pages and pages of logic looking at state variables. I recoded it based on the newly defined, simple specification function, and it worked like the original, which had been developed by accretionand the actual function was hidden in the large amount of unreduced logic.
A state box does not have to be explicitly implemented in the code because the code itself is treated as a state box. A function call can be viewed as a stimulus/response pair. An object maps neatly into a combination of a state box and data structuring.
At this point, a potential for errors still exists, and testing should be performed. The goal of certification testing is to certify the reliability of a given system within statistical limits. Test-case generation is done by creating a usage model, which is a stochastic Markov chain. While the earlier state box specification focused on behavior, the usage model focuses on usage. Things such as "The user will spend 90 percent of his time editing" are considerations for the usage model. Users are outside the box, issuing stimuli. One state in the usage model can be several states in the behavioral model, and vice versa. Figure 3 illustrates the usage model for the GUI example.
From this model, you can derive model coverage, which is a set of test cases that will test every transition in the usage model. Statistically generated test cases are possible by several means, and the amount of statistical testing is determined by the level of certification and reliability that the project is striving to achieve. I used JUMBL from SQRL (http://www.cs.utk.edu/sqrl/) to generate the test cases of the system given in the GUI source listings. The two methods used were the arc-coverage "Rural Chinese Postman" algorithm (see "An Optimization Technique for Protocol Conformance Test Generation Based on UIO Sequences and Rural Chinese Postman Tours," by Alfred V. Aho, Anton T. Dahbura, David Lee, and M. Umit Uyar, Proceedings of the 8th Symposium on Protocol Specification, Testing, and Verification; June 1988) and random generation (see "Planning and Certifying Software System Reliability," by J.H. Poore, H.D. Mills, and D. Mutchler. Cleanroom Software Engineering: A Reader, Blackwell Publishers, 1996; ISBN 185554654X). Table 9 shows a sample test case.
Formal verification of software has been an area of intense debate in the software community (Communications of the ACM, August 2002). Cleanroom demonstrates that formal verification is not only possible in practice, but can be rewarding because a large cost of software development is maintenance (IBM Systems Journal, 1994). For more information, see Cleanroom Software Engineering: Technology and Process, by Stacy J. Prowell et al., (Addison-Wesley, 1999; ISBN 0201854805).