If you have seven years to develop and test a software product with a fixed specification, then quality is readily achievable. This article, however, concerns practical software quality for software professionals who operate under less ideal conditions. I outline a philosophy that centers on quality gates and analyze how to make quality gates as effective as possible.
The key to measuring and achieving software quality is developing tests, known as quality gates, through which your software must pass. The tighter the gates, the higher the software quality is guaranteed to be. Quality gates are built up over time and become as valuable as the product itself.
Quality gates take various forms. They can be processes (two peers must review and sign off on every change before check-in), results from beta trials (90% of beta sites will rate the software as defect-free), or actual tests run either manually or automatically. To be a gate, success or failure has to be measurable before release. Retrospective measurements, such as the number of defects per 1,000 lines of code, are not quality gates.
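As a rough illustration of what "measurable before release" can mean, the beta-trial gate above could be reduced to a single pass/fail function. This is only a sketch: the function name and parameters are hypothetical, and the 90% threshold is simply the figure used in the example above.

```c
#include <stdbool.h>

/* Hypothetical beta-trial gate: passes only if at least 90% of the
 * beta sites rated the software as defect-free. Integer arithmetic
 * avoids floating-point rounding at the threshold. */
bool beta_gate_passes(unsigned defect_free_sites, unsigned total_sites)
{
    /* With no sites, or inconsistent counts, nothing has been measured,
     * so the gate cannot be said to have passed. */
    if (total_sites == 0 || defect_free_sites > total_sites)
        return false;

    return (unsigned long long)defect_free_sites * 100ULL
        >= (unsigned long long)total_sites * 90ULL;
}
```

The point of the sketch is that the gate yields a yes-or-no answer that is available before the release decision is made, which a defects-per-1,000-lines figure is not.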
Most software professionals think that defects exist in software because of bugs in the code. A more productive perspective is that defects exist because of bugs in the quality gates; that is, the quality gates were not designed or applied well enough to trap all errors. The central idea in this view is that bugs do not exist in code; rather, they exist only in quality gates.
In other words, code is never correct or incorrect. It is only compliant or noncompliant with the quality gates. To fix a defect, the quality gates are improved to detect the defect. Eventually, the code is made compliant with the tightened quality gates, but fixing the code is secondary to fixing the quality gates.
Some of the best quality gates are test suites that verify that, given appropriate input, the expected outputs are produced. Many organizations put enormous amounts of work into their test suites, but they are not always as effective as they could be. Three reasons account for this ineffectiveness.
1. Lack of Visibility. A software tester has very little visibility into the internal state of a system. In fact, today’s software testers are like cardiologists without an EKG. They exercise the patient as hard as possible, and if the patient keels over (core dumps), the tester concludes there must have been a problem. What the software tester lacks is an EKG-equivalent that can detect subtle conditions indicating future trouble. In software, examples of such conditions include a hash table with duplicate entries or a doubly linked list that is not correctly linked. In general, a software tester needs to be able to run a test that exercises part of an application and then run extensive application-specific internal consistency checks. You can combat the lack of visibility by requiring application writers to provide external hooks (environment variables, functions) that perform these consistency checks, so that the software tester can invoke them after each test.
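A consistency hook of the kind described above might look like the following sketch for the doubly linked list case. The Node and List types and the list_check name are hypothetical, chosen only for illustration; a real application would expose checks for its own data structures.

```c
#include <stddef.h>

/* Hypothetical doubly linked list; all names here are illustrative. */
typedef struct Node {
    struct Node *prev;
    struct Node *next;
    int value;
} Node;

typedef struct {
    Node  *head;
    Node  *tail;
    size_t count;   /* number of nodes the list believes it holds */
} List;

/* Consistency hook a tester can call after each test.
 * Returns 0 if the list is correctly linked, nonzero otherwise. */
int list_check(const List *list)
{
    size_t seen = 0;
    const Node *prev = NULL;
    const Node *n;

    for (n = list->head; n != NULL; n = n->next) {
        if (n->prev != prev)
            return 1;   /* back pointer disagrees with the forward walk */
        if (++seen > list->count)
            return 2;   /* more nodes than recorded: a cycle or a bad count */
        prev = n;
    }
    if (prev != list->tail)
        return 3;       /* tail does not point at the last node reached */
    if (seen != list->count)
        return 4;       /* fewer nodes than recorded */
    return 0;
}
```

A test suite would call list_check after every operation it exercises and treat any nonzero result as a failure, even if the program shows no outward symptom.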
2. Delay Between Cause And Effect. Many software defects do not cause immediate visible symptoms, but if the program continues to be tested, the symptom may appear.
In Figure 1, each branch represents a different code path. For example, the initial three-way branch might be the New, Open, and Close menu entry code. In this example, a bug is triggered by the open code, but no symptom appears. Only later, perhaps when the file is accidentally re-opened by the user, will a symptom (crash) appear. Certainly this delay between cause and effect makes tracking causes difficult, but more importantly, it means test suites must be sophisticated enough to trigger an error, and they must go on to trigger a symptom. The combinatorics of exercising code are bad enough, but the extra work needed to trigger symptoms in addition to errors makes the job nearly impossible.
Figure 1: The delay between cause and effect.
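The gap between cause and symptom can be made concrete with a small, deliberately buggy sketch. Everything here is hypothetical (the FileTable type, the function names, and the bug itself); it is not taken from any real application. The "Open" path leaves a counter stale without any visible effect, a later code path that trusts the counter is where a symptom would surface, and a consistency check reports the problem immediately.

```c
#include <string.h>

#define MAX_OPEN 4
#define NAME_LEN 32

typedef struct {
    char names[MAX_OPEN][NAME_LEN];
    int  in_use[MAX_OPEN];
    int  open_count;    /* should equal the number of in_use slots */
} FileTable;

/* Deliberately buggy "Open" path: it marks a slot as used but forgets
 * to update open_count. Nothing visibly goes wrong here (the cause). */
int table_open(FileTable *t, const char *name)
{
    int i;
    for (i = 0; i < MAX_OPEN; i++) {
        if (!t->in_use[i]) {
            strncpy(t->names[i], name, NAME_LEN - 1);
            t->names[i][NAME_LEN - 1] = '\0';
            t->in_use[i] = 1;
            /* BUG: t->open_count++ is missing */
            return i;
        }
    }
    return -1;          /* table is full */
}

/* A later code path that trusts open_count; only here does the
 * wrong answer (the symptom) become visible. */
int table_has_open_files(const FileTable *t)
{
    return t->open_count > 0;
}

/* Consistency check: returns 0 when open_count matches the slots in use. */
int table_check(const FileTable *t)
{
    int i, used = 0;
    for (i = 0; i < MAX_OPEN; i++)
        used += t->in_use[i] ? 1 : 0;
    return used == t->open_count ? 0 : 1;
}
```

A test that only calls table_open sees a valid slot index and passes. Calling table_check right after it flags the error at its cause, so the suite no longer has to find the particular later path that turns the error into a symptom.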
3. Working By Accident. Many programs work only by luck. Much of the time, that luck holds and users remain content. The following code fragment is a classic case of working by accident:
    char* x = malloc(100);
    fill_and_sort(x, 100);
    if (x[100] == ERRVAL) abort();

On the third line of this code fragment, the memory location x[100] is being read and compared against the constant ERRVAL. The problem, of course, is that x[99] is the last valid memory location of x, and that x[100] has uninitialized random contents. Most of the time (good luck), the memory x[100] is not equal to ERRVAL, and the call to abort() is not performed. Sometimes (bad luck), the memory x[100] happens to be equal to ERRVAL, and the program core dumps. Note that no memory is being corrupted in this example, and the good-luck case has no lingering effects of the defect. Thus, a program that is working by accident produces correct results, and the application is entirely internally consistent afterwards, yet this program will fail periodically in the field.
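One way a gate could take luck out of this case is to route reads through a bounds-checked accessor, so that a read past the end of the buffer fails every time rather than only when the stray byte happens to match. The sketch below is an assumption-laden illustration: the Buffer type and buffer_get function are hypothetical and are not part of the fragment above.

```c
#include <stddef.h>

/* Hypothetical buffer that carries its own length. */
typedef struct {
    char  *data;
    size_t len;
} Buffer;

/* Bounds-checked read. Returns 0 and stores the byte in *out when
 * index is within the buffer; returns -1, leaving *out untouched,
 * when it is not. */
int buffer_get(const Buffer *b, size_t index, char *out)
{
    if (b == NULL || b->data == NULL || out == NULL)
        return -1;
    if (index >= b->len)
        return -1;      /* out of range: fails deterministically */
    *out = b->data[index];
    return 0;
}
```

With a 100-byte buffer, a read at index 99 succeeds and a read at index 100 is rejected on every run, so a test that exercises this path fails reliably instead of passing by accident.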