Tools

Testing Testers

By Ron van der Wal, February 01, 1997

In an ideal world, development progresses smoothly from requirements to completion. In the real world, errors creep in. Ron examines a collection of commercially available automated testing tools that aid in ferreting out those errors.

Dr. Dobb's Journal February 1997: Testing Testers

Ron, author of the Tarma Simulation Framework, can be contacted at [email protected].

Sidebar: "Instrumentation Techniques"
Sidebar: "Static and Dynamic Testing"

In an ideal world, software development would progress smoothly from requirements to completion. In the real world, however, errors creep in. To get them out, various automated checking tools are available, including BoundsChecker, CodeGuard, Purify, PC-lint, and others. Basically, these tools watch your program in operation and report on errors. (The exception is PC-lint, which operates on your source code.) Which checking tool is best for you? Every vendor claims its tool is the best -- and they all have the numbers to prove it. So what are you to do?

To answer this question, I worked with a number of tools in my development efforts, using a number of tests -- some from vendors, some of my own (mine are available electronically; see "Availability," page 3) -- to compare the tools. What I discovered (not surprisingly) is that each tool has strengths and weaknesses. No single checking tool is ideal in all respects, but neither are there any bad tools among the crop that I examined -- it just depends on what you're looking for.

The testing tools I examine here fall into two categories -- static and dynamic. Static testers do their job without executing the program under test, and dynamic testers monitor its execution (for more information, see the accompanying text box entitled "Static and Dynamic Testing"). PC-lint is the only static tester in the roundup; all others are dynamic testers. Dynamic testers use several different methods to monitor the program, but one way or another, they all insert some sort of probe into your program. These probes are known as "instrumentation," and many differences between the tools can be traced back to their instrumentation method (see accompanying text box entitled "Instrumentation Techniques").

BoundsChecker 4.0 Professional

NuMega's BoundsChecker is probably the best-known tool in its class. The BoundsChecker family started several years ago with tools for DOS and 16-bit Windows; the most recent versions (the ones examined here) target Win32 platforms. In particular, I tested BoundsChecker 4.0 Professional for Windows NT and its companion for Windows 95.

BoundsChecker can be used as a stand-alone program loader and tester. If you don't do anything else, BoundsChecker will intercept calls to Windows API functions and heap-related C and C++ run-time library functions while your program is running and record invalid parameters, invalid pointers, error-return codes from API functions, and a general event trace. At program termination, memory and other resource leaks are reported. If a problem is detected, BoundsChecker by default pops up a dialog box that shows the type of error, its location (if source debug information is available), and several options, among which are the abilities to suppress further reporting of the same error and to break into a debugger at the error location. All error reports are collected in BoundsChecker's log window, which can be saved at the end of the session. As a final option, BoundsChecker can perform a Win32 compliance check, which signals the presence or use (your choice) of Win32 API functions that differ among Win32s, Windows 95, and Windows NT.

If Microsoft Visual C++ 4.0 (and later) is your C++ compiler, you can take advantage of BoundsChecker's Integrated Debugging mode. In this mode, you don't need the BoundsChecker program loader but can instead use the Visual C++ workbench's debugger to run your program. The rest is the same: BoundsChecker sits in the background and pops up if it detects a problem. The resulting error log ends up in the Visual C++ Output window, under a separate BoundsChecker tab.

All this is included in the Standard edition of BoundsChecker. If you have the Professional edition, you can improve error detection by adding compile-time instrumentation (CTI). In essence, this is a preprocessing step on your C and C++ source code that adds numerous checks to pointer operations and the like that help detect several additional types of errors. Once the extra instrumentation is in place, operation is identical to the Standard edition. One thing to be aware of: CTI is currently supported only in conjunction with Microsoft Visual C++ (4.0 and later).

BoundsChecker's operation can be tweaked extensively through options accessible from its program loader (or through the Visual C++ workbench) and by means of configuration, suppression, and library specification files. Furthermore, version 4.0 adds the ability to specify custom validation modules, which let you add virtually any kind of checking or logging, using the same routines that BoundsChecker uses internally for the built-in checks.

CodeGuard 32/16

Borland's CodeGuard is a companion tool to Borland's family of C++ compilers. It made its debut as a 16-bit add-on tool for Borland C++ 4.5 and is now part of the Borland C++ 5.0 Development Suite, with 16- and 32-bit versions covering both Win16 and Win32 programs (DOS programs are not supported by CodeGuard). During testing, I concentrated on the 32-bit version. The 16-bit version is similar, with the exception of a number of pointer checks that rely on the 32-bit CPU model and are therefore not available in 16-bit mode.

Once installed, CodeGuard becomes part of the IDE. In fact, the C++ compiler knows enough about CodeGuard to insert the CodeGuard instrumentation code into the object code of C and C++ programs during compilation if the right options are given. With instrumentation code in place, the program is then linked to the CodeGuard library, which intercepts both C runtime and Windows API functions. At runtime, the CodeGuard DLL tracks pointer usage with the aid of the instrumented code, and API usage through the intercepted entry points. If an error is detected, CodeGuard by default pops up a message box (no specific information -- just that an error was found) and writes a report to a log file. If the program is run from within the Borland IDE or Turbo Debugger, a breakpoint will occur, and the debugger will become active. At program termination, a leak search is performed and added to the log file. If so configured, CodeGuard will also add a function-call profile to the log file containing the number of times each intercepted function is called. The whole process from compilation onward can also be performed from the command line, in which case CodeGuard does report and log errors, but no breakpoints will occur.

CodeGuard's operation is configurable through a separate setup program (accessible through the IDE) or by directly editing a configuration file that records the options that apply to a given executable. Through this configuration file, you can determine the amount of memory and pointer validation, the details of API checks, and set several other options.

Purify 4.0 NT

Pure Atria's Purify has been known for several years as a UNIX tool. With Purify 4.0 NT it is available for PC platforms, but initially only if they run Windows NT.

Purify operates as a program loader and tester. For instrumentation, it uses a technique called "Object Code Insertion" (OCI), which takes an executable module (either .EXE or .DLL), analyzes the code within, and inserts additional instructions that check pointers and memory accesses. This happens for each module that is used by a given program, including third-party modules such as system DLLs. (Don't worry, the original module is never modified; Purify works with instrumented copies.) As a result, every piece of code that is executed by your program will be instrumented, regardless of its origin. However, OCI alone does not catch all errors; to detect memory leaks, Purify uses an approach that resembles the "mark" phase in "mark and sweep" garbage-collection schemes. It recursively follows potential pointers to identify reachable heap blocks. Any blocks that cannot be reached at all are then considered leaks; any blocks that have pointers into them, but not to their start addresses, are reported as potential leaks.

After the instrumentation phase, Purify runs the instrumented program and tracks all detected errors. The error reports are collected in a log view, which is updated while the program is running. Unlike the other dynamic testers, Purify does not pop up a message box when it detects an error, but it can be configured to cause a debugger breakpoint in those cases.

Purify configuration is done from within the loader. Apart from settings that determine features such as the size of the deferred free queue and whether memory and handle leak checks are performed at program termination, Purify's output can be tailored to your needs through the use of filter sets. As its name implies, a filter determines which types of error reports are shown and which aren't. This is purely a display matter: The unwanted error reports are hidden, but remain present in the overall log. A sophisticated filter manager makes it easy to create and combine different filters and share them across programs. Finally, the Purify run-time support can be accessed through a documented API, which allows your program to communicate with Purify while it is being tested -- for example, to test for new memory leaks created between two locations in your program. Incidentally, this is the only case for which you need to recompile and link your source code in order to work with Purify.

PC-lint 7.0

Gimpel Software's PC-lint is the only test tool examined here that performs a static analysis of your program to detect potential problems. It is inspired by the well-known UNIX lint utility, but has been greatly improved over the years by Gimpel Software. Version 7.0 can produce well over 500 different diagnostics relating to C and C++ syntax, usage, and programming style, and includes fairly sophisticated techniques such as strong typing (yes, really strong, not the C/C++ idea of strong) and interstatement value tracking.

To operate PC-lint, you invoke it on your source file(s) as if it were a compiler. PC-lint parses the source code, processes include files, and so on, and complains (to stdout) about what it considers to be illegal, dangerous, or just bad style. Generally, it finds a lot to complain about. Its inspiration is drawn from the C and C++ (draft) standards, but also from expert advice from the books of Cargill, Coplien, Meyers, Murray, Plum, and Saks. As a result, PC-lint acts very much as a C and C++ programming-in-the-small style oracle. There is overlap with the dynamic testers, though: By virtue of its initialization and value tracking, and also because it recognizes more general problems (for example, absence of copy constructors or assignment operators in classes with pointer data members, or absence of delete operations in destructors of the same), it will frequently spot potential memory leaks or array out-of-bound accesses without actually executing the program.

PC-lint is configurable to the extreme. You can specify options on the command line, in response files, or even embed them as comments in your source code. The options range from enabling and disabling of certain diagnostics (in general, per module or per identifier) through specification of its operating environment (to allow PC-lint to mimic, say, a Microsoft C++ compiler, complete with the right definition of _MSC_VER, integer and pointer sizes, and include directories) to tailoring its output format. The latter is particularly useful because it allows you to add PC-lint as a tool to your favorite IDE or editor, and then use its diagnostic output processing to jump to the right source-code location in response to PC-lint's messages. To get you started, PC-lint comes with configuration files for a few dozen C and C++ compilers.

Evaluating the Tools

The tests I ran included several buggy programs provided by the respective vendors (obviously designed to bring out the best in their tools, and the worst in their competitors' tools), some in-depth test programs of my own, and a number of real programs that I worked on during the test period. Table 1 presents a summary of the results.

Tools are only effective if you actually use them. So, no matter how sophisticated their tests are, they must be easy to operate or they become shelfware. In fact, they should become part of the development cycle, since we all know that "testing quality into a product" as a final step in the development process is a sure way to doom your program to fail.

I could come up with carefully wrought analyses of which user interface is the best, which options make most sense, and so forth, and declare a winner on these grounds. I will not do so. Instead, I just kept a tally of how often I actually used each tool, and to which tools I turned if I had a problem. This is not quite as scientific, but it is probably more honest and more indicative of which tools stood the test of practice. You'll have to bear with my personal preferences (or aberrations) as far as my development environment goes: Throughout the test period, I mostly used Borland C++ 5.0 and Microsoft Visual C++ 4.1, and occasionally used Symantec C++ 7.21 and Watcom C++ 10.6. Programs under test were Win32 console, OWL, and MFC GUI applications, ranging from small (a few hundred lines) to large (up to 60,000 lines). All programs were written in C++.

The result: BoundsChecker (with RTI, not CTI) and Purify were the tools I used most often, with about equal frequency. The reason is quite simple: Neither requires changes to the build process (Purify's OCI is performed automatically when a program is loaded). Their usage frequencies are about the same because I use one as a second opinion to the other. I feel more comfortable with BoundsChecker's memory-leak detection and API validation, but Purify inspires more confidence in its in-depth pointer checks. On the other hand, Purify can only be used in conjunction with Microsoft-generated code, which made it unsuitable for the other compilers (which BoundsChecker could handle). The same goes for the target platform, but since I primarily use Windows NT, this was less of a problem.

BoundsChecker with CTI came in lower in the usage count for two reasons: One is that the instrumentation process requires a separate build, and the other is that the resulting executable often ran too slow for my admittedly limited patience. As a result, I used BoundsChecker Pro with CTI primarily if I needed a very thorough test; it has no equal in this respect. (By the way, this is also what NuMega recommends.) CodeGuard is also a special case. First of all, it depends on the Borland C++ compiler, and then it requires a separate build (possibly also of the libraries used by the program at hand). In the end, I mostly used it for one specific 16-bit OWL application; unfortunately, it sometimes crashed along with the program.

What about PC-lint? This is one tool that requires iron discipline to use. Despite my best intentions, I must confess that I used it far less than I planned to. Apart from my obvious lack of discipline, I attribute this to the fact that I tired of tweaking PC-lint's options over and over again. PC-lint would benefit greatly if Gimpel Software (or some kind soul in the programming community) would make an interactive option tweaker available. More than anything else, configuring PC-lint to report the important things without flooding you with minor quibbles is a chore that hampers day-to-day use of the tool. Especially if you routinely use different C++ compilers with different libraries and frameworks (as I do), you get bogged down in configuration file upon configuration file. This is a pity, because PC-lint truly deserves to be a fully integrated part of your development environment.

Conclusions

BoundsChecker will be the testing tool of choice for many situations. It is broad in scope and, in combination with CTI, catches almost any error that relates to memory usage. However, you should be aware of the fact that to use CTI you need to add a separate build variant to your development process, that a CTI-instrumented executable can be significantly slower than a regular one, and finally, that only Microsoft Visual C++ compilers are supported with CTI.

CodeGuard is a good tool for Borland C/C++ users. As part of the Development Suite, it is inexpensive and covers a large number of common errors. On the other hand, it requires a separate build variant, and its error detection is sometimes flaky and may even crash the program.

Purify combines excellent error detection capabilities with an easy-to-use instrumentation step, requiring no changes to your development process. Moreover, run-time performance with instrumentation is reasonable to good, so there is little reason not to use it. Regrettably, it lacks extensive API validation and is currently only available for Microsoft Visual C++ programs running under Windows NT.

PC-lint is in a different category altogether. It can be extremely useful for both C and C++ developers, but effective use requires quite some configuration work. Nevertheless, I would strongly recommend it for use with C and C++ programming, since it will uncover much dubious or outright incorrect code that many other tools will miss.

For More Information

NuMega Technologies
9 Townsend West
Nashua, NH 03063
603-889-2386
http://www.numega.com/

Borland International
100 Borland Way
Scotts Valley, CA 95066
408-431-1000
http://www.borland.com/

Pure Atria
1309 South Mary Avenue
Sunnyvale, CA 94087
408-720-1600
http://www.pureatria.com/

Gimpel Software
3207 Hogarth Lane
Collegeville, PA 19426
610-584-4261
http://www.gimpel.com/

DDJ

1 2 3 4 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Tools