Adding Exception Testing to Unit Tests

By Ben Stanley, April 01, 2001

Exceptions can add a bewildering number of potential execution paths to otherwise simple code. Here is a way to test those extra paths without writing a bazillion test cases.

Introduction

Much water has gone under the bridge since Tom Cargill expressed reservations about the reliability of code that uses exception handling [1]. Tom pointed out that naive programming of exception handling typically leads to resource leaks and incoherent object states. Exceptions should arise only under extreme conditions such as low memory or lack of disk space, which makes exception-related problems difficult to reproduce.

Testing is no substitute for writing robust and complete code. Before attempting to write code that must be exception safe, you should read Herb Sutter’s excellent book, Exceptional C++, for some robust exception handling techniques [2, Items 8-19]. Once the code is written, a solid testing regime can help flush out any remaining bugs, and increase confidence that the code operates correctly. Rigorous testing methods have been developed for normal execution paths [3, Chapter 25]. This article describes a simple method of adding exhaustive testing of the exception paths to the test suite.

Current testing methodology seeks to construct a set of tests to verify the integrity of a unit of software. Usually, this unit is a class, although it could be a group of collaborating classes. For the test to be thorough, there must be at least one test to exercise each path of execution through every function or class method in the unit. Thus, there are at least as many tests in the test suite as there are normal execution paths through the software, and usually more. If you seek to test exceptional paths in addition to normal execution paths, you will have to add more tests. How many? Consider the following code snippet:

String EvaluateSalaryAndReturnName(
  Employee e )
{
  if( e.Title()=="CEO" or
      e.Salary() > 100000 )
  {
    cout << e.First() << " "
         << e.Last()
         << " is overpaid" << endl;
  }
  return e.First()+" "+e.Last();
}

Herb Sutter claims that there are no fewer than 20 possible exceptional code paths through this function, compared to the three normal paths [2, Item 18]. That’s a lot of extra paths to test! Even worse, it is less than obvious what these paths are, or how to cause them to execute. Luckily, it is possible to test these 20 extra exceptional paths using only the test suite for the three normal paths, and some extra template code, which is shared by all your tests.

A simplified outline of the method is as follows:

1) Identify places where exceptions may be thrown. These will be called exception points. Each exception point causes an additional exceptional code path.

2) Divide the unit test suite up into sets of tests related to one ‘feature’ of the class. These tests will be called feature tests.

3) Conceptually number the exception points in each feature test, in order of execution. Each exception point may be labeled with more than one number.

4) Execute each feature test inside a for loop that causes an exception to be thrown from each numbered exception point in turn.

I demonstrate this method of exception testing by three different examples, which use a common framework for performing the testing.

Design of the Exception Testing System

The design of the exception testing system is shown in Figure 1. Boxes indicate code modules. Boxes with single borders are generic and do not change for testing separate systems — these modules are supplied in the online source code that accompanies this article (see www.cuj.com/code). The boxes with double borders indicate code that must be supplied by the user. The arrows indicate function call dependencies between the code modules. The dotted arrow indicates a function call dependency that is usually not necessary.

The central idea in the design is the enumeration of the exception points. This is done by placing a call to the exception point counter class at each point in the code where an exception may directly arise. The exception counter class maintains two counts: the number of the current exception point, sExceptionPointCount, and the number of the exception point where a test exception should be thrown, sThrowCount. The code under test calls function CouldThrow at each point where an exception could be thrown. The CouldThrow method will throw an exception if it is at the right exception point. The test harness driver has a loop which enumerates all the exception points in turn, by first calling SetThrowCount and then calling the unit test in the test harness.

Since most exception points are due to memory management problems, we can take a giant shortcut by replacing the default implementation of operator new and operator delete with debugging versions that have been modified to call CouldThrow every time new is called. In many cases, this shortcut eliminates the need to write a debugging version of the classes used by the code under test.

Some exception points will be impractical to instrument with a call to CouldThrow. Such exception points may alternatively be exercised by designing the test suite so that it sets up the conditions that cause the exception as one of the test cases. This will still cause the exception code path to be tested.

Any remaining exception points that are not exercised by calling CouldThrow or explicitly tested by the test suite will not be tested and are deemed to be outside the domain of the test.

Implementation

The TestDriver Function

The TestDriver template function causes all the exception points in a test to actually throw an exception, one at a time. It requires a functor class containing one method with the signature static void DoTest(). This function is assumed to perform a sequence of tests to exercise one aspect of a class’s functionality:

template<class Func>
void TestDriver()
{
  int i = 0;
  do {
    // ...
    Counter::SetThrowCount( ++i );
    try {
      Func::DoTest();
    }
    catch( TestException& e ){}
    catch( bad_alloc& e ) {}
    catch( ... ) {
      Counter::Fail(
        "Bad Exception");
    }
    // ...
  } while( Counter::HasThrown() );
  Counter::SetThrowCount( -1 );
}

A more complete version of this function is shown in Listing 1 (Counter.h). The loop enumerates the exception points, using i as the counter. The call to SetThrowCount at the top of the loop instructs the Counter class to throw an exception at the specified exception point. Testing starts at the first exception point. Then, the function under test is called, inside a try/catch block. It is expected that the function will throw an exception. The Counter::CouldThrow method throws a TestException. However, operator new translates this into a bad_alloc exception for testing purposes, so we have to be prepared to deal with that here. Any other kind of exception that is triggered by bad inputs should be caught by the test harness, so any other strange exception caught here causes a test failure. This should be changed if it is unsuitable for your application.

An interesting point here is the loop termination condition. There must be a finite number of exception points in a test function. Once i exceeds that number, no exception point matches the throw count, so the test function returns normally, testing the normal path of execution for that test as a by-product. In this case, no exception has been thrown. This condition is detected by the Counter::HasThrown method, which is used to terminate the loop.

At the end of the exception tests, the throw point is set to -1 to prevent exceptions from being thrown in the testing program.

The parts of the excerpt above marked with an ellipsis (...) stand for code, shown in Listing 1, that checks for memory leaks that may have occurred during a test. It is assumed that each test is self-contained, and that no memory should remain allocated after the test. You can also walk the heap and check that everything is okay, if your debugging memory manager supports such things.

The ... areas also contain code to print out the number of exception paths that were tested.
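
To make the loop termination concrete, here is a minimal, self-contained sketch of the driver loop. It is not the code from Listing 1: MiniCounter, MiniTestDriver, and ThreePoints are hypothetical names, the leak checks are omitted, and the driver returns the number of execution paths instead of printing it.

```cpp
#include <cstdio>
#include <new>
#include <typeinfo>

// Hypothetical stand-ins for the article's TestException and Counter.
struct TestException {};

class MiniCounter {
  static int sExceptionPointCount;  // exception points passed in this run
  static int sThrowCount;           // exception point that should throw
  static bool sHasThrown;           // did the current run throw?
public:
  static void SetThrowCount(int c) {
    sExceptionPointCount = 0;
    sThrowCount = c;
    sHasThrown = false;
  }
  static void CouldThrow() {
    if (++sExceptionPointCount == sThrowCount) {
      sHasThrown = true;
      throw TestException();
    }
  }
  static bool HasThrown() { return sHasThrown; }
};
int MiniCounter::sExceptionPointCount = 0;
int MiniCounter::sThrowCount = -1;
bool MiniCounter::sHasThrown = false;

// Runs Func::DoTest once per exception point, then once more without
// throwing. Returns the number of execution paths exercised.
template <class Func>
int MiniTestDriver() {
  int i = 0;
  do {
    MiniCounter::SetThrowCount(++i);
    try {
      Func::DoTest();
    } catch (TestException&) {
      // expected: thrown by CouldThrow
    } catch (std::bad_alloc&) {
      // expected: thrown by an instrumented operator new
    } catch (...) {
      std::printf("Bad exception in %s\n", typeid(Func).name());
    }
  } while (MiniCounter::HasThrown());
  MiniCounter::SetThrowCount(-1);  // disable further test exceptions
  return i;
}

// A test with exactly three exception points.
struct ThreePoints {
  static void DoTest() {
    MiniCounter::CouldThrow();
    MiniCounter::CouldThrow();
    MiniCounter::CouldThrow();
  }
};
```

With this sketch, MiniTestDriver<ThreePoints>() runs the test four times: three runs that each end in an exception, and a final run that completes normally and ends the loop.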

Test Harness Functions

The test harness functions are supplied as functor objects. This allows the name of the test set to be reported automatically through RTTI. Each test harness functor should have the following form:

class TestDefaultConstruct {
public:
  static void DoTest() {
    String s;
    Counter::Test( s == "", "correct value" );
  }
};

The above code just tests the default constructor of a String class. It does so by constructing a String using the default constructor, and then testing its value against the expected value. The static function Counter::Test accepts a Boolean result from a test as its first argument — a true counts as a pass, false counts as a fail. The second argument is a description of the test. This is printed in the case of a failure, to help in diagnosing what happened.

A proper test harness will have one of these classes to test each aspect of functionality of your class or module. You could incorporate test data from a file, or just hard-code the expected results from simple hand-worked examples. The key point is to compare the actual state of the class against what you expect it should be, given the functions you called. If the test harness detects anything amiss, it will let you know.
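
The bookkeeping behind such a test can be sketched as follows. This is an illustration under assumptions, not the Counter class from the listings: Tally, RunNamedTest, and TestArithmetic are hypothetical names. The test name comes from typeid, whose output format is implementation-defined.

```cpp
#include <cstdio>
#include <typeinfo>

// Hypothetical tally class, standing in for the pass/fail bookkeeping
// that the article attributes to Counter::Test.
class Tally {
  static int sPassed;
  static int sFailed;
public:
  static void Test(bool ok, const char* description) {
    if (ok) {
      ++sPassed;
    } else {
      ++sFailed;
      std::printf("****    Failed test %s\n", description);
    }
  }
  static int Passed() { return sPassed; }
  static int Failed() { return sFailed; }
};
int Tally::sPassed = 0;
int Tally::sFailed = 0;

// Reports the functor's class name through RTTI, then runs it.
template <class Func>
void RunNamedTest() {
  std::printf("Doing test %s\n", typeid(Func).name());
  Func::DoTest();
}

// A functor in the form described above; the second check fails on
// purpose to show the failure report.
class TestArithmetic {
public:
  static void DoTest() {
    Tally::Test(2 + 2 == 4, "sum");
    Tally::Test(2 * 3 == 7, "product (deliberately wrong)");
  }
};

// Runs the demo and returns the number of failed checks.
int RunTallyDemo() {
  RunNamedTest<TestArithmetic>();
  return Tally::Failed();
}
```

Because the functor is a type and not a function pointer, the driver can name the test without the test author passing a string.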

Instrumenting the Code Under Test

This section refers to the code that you wish to demonstrate operates correctly in the presence of exceptions. You can test individual functions, simple classes, or template classes.

If you are testing a class, it is useful to add an extra debugging method, named Consistent, that checks whether the data members of the class are in a consistent state. This is also known as checking the class invariants. For example, if a stack class has a null pointer to its array element, but its size member says it has five elements, then the class state is inconsistent.
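
As an illustration, a Consistent method for a simple stack might look like the sketch below. IntStack and its members are hypothetical and are not taken from the listings; the invariants checked are only the ones named in the comments.

```cpp
#include <cstddef>

// Hypothetical stack of ints with an invariant-checking method.
class IntStack {
  int* mData;             // heap array, or null when nothing is allocated
  std::size_t mCapacity;  // number of slots in mData
  std::size_t mSize;      // number of slots in use
public:
  IntStack() : mData(nullptr), mCapacity(0), mSize(0) {}
  ~IntStack() { delete[] mData; }
  IntStack(const IntStack&) = delete;
  IntStack& operator=(const IntStack&) = delete;

  void Push(int v) {
    if (mSize == mCapacity) {
      const std::size_t newCapacity = mCapacity ? mCapacity * 2 : 4;
      int* p = new int[newCapacity];  // may throw; state is untouched so far
      for (std::size_t i = 0; i < mSize; ++i) p[i] = mData[i];
      delete[] mData;
      mData = p;
      mCapacity = newCapacity;
    }
    mData[mSize++] = v;
  }
  std::size_t Size() const { return mSize; }

  // Invariants: the size never exceeds the capacity, and the array
  // pointer is null exactly when the capacity is zero.
  bool Consistent() const {
    if (mSize > mCapacity) return false;
    return (mData == nullptr) == (mCapacity == 0);
  }
};

// Checks the invariants before and after several reallocations.
bool StackStaysConsistent() {
  IntStack s;
  if (!s.Consistent()) return false;
  for (int i = 0; i < 10; ++i) s.Push(i);
  return s.Consistent() && s.Size() == 10;
}
```

A test harness would call Consistent after every operation, including after a caught exception, since that is when a half-updated object is most likely.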

Instrumenting Classes

If your code under test uses any other classes, you may need to create debugging versions of them that call Counter::CouldThrow at every exception point. However, if the only exception points are calls to the standard operator new and operator delete, then you don’t have to change anything: replacement versions of operator new and operator delete supply the required calls to CouldThrow. Other exception points will have to be treated on an individual basis.

Exception Point Counting

The exception point counting code is at the bottom of the call graph diagram (Figure 1). It is really very simple:

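class Counter {
private:
  // Number of exception points passed so far in the current run
  static int sExceptionPointCount;
  // Number of the exception point that should throw
  static int sThrowCount;
  // ...
public:
  static void CouldThrow() {
    if( ++sExceptionPointCount ==
      sThrowCount )
        throw TestException();
  }
  static void SetThrowCount(int c){
    // Restart the numbering for the next run
    sExceptionPointCount = 0;
    sThrowCount = c;
  }
  // ...
};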

The SetThrowCount method is used to set the number of the exception point that will throw. The CouldThrow method will throw the exception at the appropriate exception point. There is a lot of other functionality included in this class in the full version (see Listing 1, Counter.h and Listing 2, Counter.cpp); its operation should be self-evident from an inspection of the code.

A Debugging Memory Manager

I have included a debugging memory manager with the online listings for two reasons:

1) To find simple memory errors in the program under test; and

2) To allow calls to operator new and operator delete to be instrumented to call Counter::CouldThrow.

The overloading of operator new and operator delete is a convenience — these versions automatically call CouldThrow. This functionality is useful because the majority of exception points in a typical application are due to memory allocation.

The debugging memory manager used in this testing facility additionally maintains a linked list of all allocated blocks, and a count of how many have been allocated. This allows the program to check if any memory leaks have occurred within a single test, and to also check that pointers passed to delete are actually pointers to validly allocated blocks. This was the minimum necessary functionality to detect the bugs that are in the examples. A more professional debugging memory manager may be substituted, as long as its operator new and operator new[] functions can be modified to call Counter::CouldThrow. Techniques for writing debugging memory managers are discussed by Steve Maguire [4].
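
The instrumentation idea can be sketched as follows. This is a simplified illustration, not the memory manager from the online listings: it only counts live blocks instead of keeping a linked list, and everything in namespace sketch is a hypothetical name. The replacement operator new calls the exception point counter and translates the test exception into bad_alloc, as described earlier.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

namespace sketch {  // hypothetical names, for illustration only
struct TestException {};
int gExceptionPointCount = 0;  // exception points passed so far
int gThrowCount = -1;          // point that should throw; -1 disables
long gLiveBlocks = 0;          // blocks allocated and not yet freed

void SetThrowCount(int c) { gExceptionPointCount = 0; gThrowCount = c; }
void CouldThrow() {
  if (++gExceptionPointCount == gThrowCount) throw TestException();
}
}  // namespace sketch

// Replacement global operator new: every allocation is an exception point.
void* operator new(std::size_t size) {
  try {
    sketch::CouldThrow();
  } catch (sketch::TestException&) {
    throw std::bad_alloc();  // callers expect bad_alloc from new
  }
  void* p = std::malloc(size ? size : 1);
  if (!p) throw std::bad_alloc();
  ++sketch::gLiveBlocks;
  return p;
}

// Replacement global operator delete: keeps the live block count honest.
void operator delete(void* p) noexcept {
  if (p) {
    --sketch::gLiveBlocks;
    std::free(p);
  }
}

// Makes the second allocation fail, then checks that nothing leaked.
bool SecondAllocationFails() {
  const long before = sketch::gLiveBlocks;
  sketch::SetThrowCount(2);
  void* first = ::operator new(16);
  bool threw = false;
  try {
    void* second = ::operator new(16);
    ::operator delete(second);
  } catch (std::bad_alloc&) {
    threw = true;
  }
  sketch::SetThrowCount(-1);
  ::operator delete(first);
  return threw && sketch::gLiveBlocks == before;
}
```

The default array forms, operator new[] and operator delete[], forward to these two functions, so this sketch covers array allocations as well. Comparing gLiveBlocks before and after a test is the simplest form of the leak check described above.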

Test Examples

The following examples demonstrate how to use the testing method on a function and on a template class.

Testing a Function

The function to be tested is the EvaluateSalaryAndReturnName function mentioned in the introduction. In order to test it, you need a String class and an Employee class; these are provided with the online source listings.

There are three normal paths of execution through this code, so three test cases are needed to exercise it. A single test case is shown here, which forms one part of the test harness box in the design in Figure 1:

class TestPath1 {
public:
  static void DoTest() {
    String s;
    s=EvaluateSalaryAndReturnName(
      Employee( "Homer", "Simpson",
        "Nuclear Plant Controller",
        25000 ) );
    Counter::Test(
      s == "Homer Simpson",
      "Correct return value" );
  }
};

Here I have constructed a test case which will fail to enter the if statement within the EvaluateSalaryAndReturnName function, so nothing should be printed. To fully test this function, the output would have to be redirected into an internal string buffer or file. This is possible on Unix by closing and reopening the standard output stream, but is left as an exercise for the reader. Our interest here is the exception paths that we can cause to execute through the function using the TestDriver function.
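
One portable alternative to closing and reopening the standard output stream, when the code under test writes through std::cout, is to swap the stream buffer of std::cout for that of a string stream. The sketch below shows this different technique; CoutCapture, Announce, and CaptureAnnouncement are hypothetical names that do not appear in the listings.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical RAII guard: while it is alive, everything written to
// std::cout lands in an internal string buffer instead.
class CoutCapture {
  std::ostringstream mBuffer;
  std::streambuf* mOld;  // the original buffer, restored on destruction
public:
  CoutCapture() : mOld(std::cout.rdbuf(mBuffer.rdbuf())) {}
  ~CoutCapture() { std::cout.rdbuf(mOld); }
  CoutCapture(const CoutCapture&) = delete;
  CoutCapture& operator=(const CoutCapture&) = delete;
  std::string Text() const { return mBuffer.str(); }
};

// Stand-in for the printing done inside the function under test.
void Announce(const std::string& name) {
  std::cout << name << " is overpaid" << std::endl;
}

// Captures what Announce prints so that a test can compare it.
std::string CaptureAnnouncement() {
  CoutCapture capture;
  Announce("Test Person");
  return capture.Text();
}
```

Because the guard restores the original buffer in its destructor, the redirection is undone even when the code under test exits by an exception, which matters when every run but the last is expected to throw.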

After constructing the test cases, a suitable main is needed to call them:

int main()
{
  TestDriver<TestPath1>();
  // TestDriver<TestPath2>();
  // ...
  Counter::PrintTestSummary();
  return 0;
}

A full test harness will call further tests. With only TestPath1 enabled, the program prints the following output when built and run:

Doing test 9TestPath1
18 execution paths were tested.
Test results:
Total Tests: 19
Passed     : 19
Failed     : 0

(The strange test name is just how g++ prints out the class name when you use RTTI.) The important thing that the output shows here is that 18 execution paths were tested, even though there was only one normal path of execution through the code accessible to the test suite. This is due to all the exception points being tested, one by one. Each exception point gives rise to a separate execution path. The number is different from the claimed 20 possible paths for several reasons:

1) The test harness contains extra exception points outside the code under test. Thus, some execution paths only differ outside the function being tested here. This is acceptable, since we still know these paths are being tested.

2) The path that is being tested does not cover all the possible exception points in the function, since half of the if test and the body of the if statement are not executed. Thus, not all of the exceptional paths are revealed yet.

At this point, I should point out that e.Salary() might return a user-defined type. The comparison between this user-defined type and 100000 would be performed by a user-defined operator>, which could throw. This possibility is included in Herb Sutter’s claimed count of 20 exception points. However, this test program has not tested these things, because this particular program does not return a user-defined type from e.Salary(). The program has only tested what could actually throw.

A complete test set is supplied as part of the code with the article. While developing this test set, I found two paths of execution that neglect to call the String destructor. (This was verified by inspecting a function call trace.) These appear to be caused by bugs in the code generated by g++ 2.95.2! This testing methodology has also found bugs in other compilers.

Testing a Template Class

The template class to be tested here is the same one used in Tom Cargill’s article on exception handling — David Reed’s stack class [6]. This class is a good example for showing that the test method finds problems. The code is shown in Listing 3 (Stack_Reed.h). The original code assumed that new returned zero when it failed — I have changed the code to assume that new throws bad_alloc. I have changed a few method names to be consistent with Herb Sutter’s implementations, to allow a common test harness. I have also added a Consistent method, which checks on the internal consistency of the Stack’s data.

The template class, Stack, is instantiated with a template argument of type TestClass (see Listing 4, TestClass.h). TestClass behaves like a very temperamental integer — it can be copied, assigned to, added, and so on, all with the possibility of throwing an exception. The only method of TestClass that does not throw is the destructor. This throw capability is achieved by inserting a call to Counter::CouldThrow into each of the methods, thus allowing us to test the behavior of Stack under hostile conditions.
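
A class of this kind can be sketched as follows. This is not Listing 4: TemperamentalInt and the counter functions in namespace sketch are hypothetical stand-ins for TestClass and Counter.

```cpp
namespace sketch {  // hypothetical stand-in for the article's Counter
struct TestException {};
int gExceptionPointCount = 0;  // exception points passed so far
int gThrowCount = -1;          // point that should throw; -1 disables

void SetThrowCount(int c) { gExceptionPointCount = 0; gThrowCount = c; }
void CouldThrow() {
  if (++gExceptionPointCount == gThrowCount) throw TestException();
}
}  // namespace sketch

// Behaves like an int, except that every operation other than the
// destructor is an exception point.
class TemperamentalInt {
  int mValue;
public:
  explicit TemperamentalInt(int v = 0) : mValue(v) { sketch::CouldThrow(); }
  TemperamentalInt(const TemperamentalInt& o) : mValue(o.mValue) {
    sketch::CouldThrow();
  }
  TemperamentalInt& operator=(const TemperamentalInt& o) {
    sketch::CouldThrow();  // throw before touching any state
    mValue = o.mValue;
    return *this;
  }
  ~TemperamentalInt() {}  // never throws
  friend TemperamentalInt operator+(const TemperamentalInt& a,
                                    const TemperamentalInt& b) {
    sketch::CouldThrow();
    return TemperamentalInt(a.mValue + b.mValue);
  }
  friend bool operator==(const TemperamentalInt& a,
                         const TemperamentalInt& b) {
    sketch::CouldThrow();
    return a.mValue == b.mValue;
  }
  int Value() const { return mValue; }
};

// Forces the next assignment to throw and checks that the target of
// the assignment keeps its old value.
bool ThrowingAssignmentLeavesTargetUnchanged() {
  sketch::SetThrowCount(-1);
  TemperamentalInt a(1), b(2);
  sketch::SetThrowCount(1);  // the next exception point throws
  bool threw = false;
  try {
    a = b;
  } catch (sketch::TestException&) {
    threw = true;
  }
  sketch::SetThrowCount(-1);
  return threw && a.Value() == 1;
}
```

Instantiating a container with such a type turns every element copy, assignment, and comparison inside the container into a numbered exception point, with no change to the container's source.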

The complete stack test suite is shown in Listing 5 (TestStack.cpp). The following Stack test from the suite exposes a fault in the copy constructor:

class TestCopyConstruct2 {
public:
  static void DoTest() {
    Stack<TestClass> a;
    a.Push( TestClass(1) );
    a.Push( TestClass(2) );
    {
      Stack<TestClass> b( a );
      // ...
    }
    // ...
  }
};

When this test is run, it produces the following output:

Doing test 18TestCopyConstruct2
****    Failed test Memory leak
  (1 block) at exception point 27.
****    Failed test Memory leak
  (1 block) at exception point 28.
29 execution paths tested.

It seems that there are two memory leaks when exceptions are thrown during the tests. To identify the cause of the leaks, the execution paths must be identified. The program may be converted so that only the faulty execution path is used, for the convenience of debugging. To do this, follow this procedure:

1) Comment out all the tests except the one that causes the problem.

2) Hard-code the exception point into the TestDriver function, using i = exceptionPoint-1 before the start of the do loop.

3) Put a break before the while statement in the TestDriver function to prevent other tests from running.

This program will follow only the faulty execution path. This makes it easier to trace the program in the debugger to find where the leak occurs. The exact point where the exception is thrown may be found by placing a breakpoint on the throw statement in CouldThrow. To facilitate further testing, it is best to perform these modifications on a copy of the original files.

It turns out that these memory leaks are caused by an exception from TestClass::operator= while the elements are being copied from one array to the other in the Stack copy constructor. The memory for the array is not deallocated. The exception leaves the constructor, leaving the class only partially constructed, so the destructor is not called. The second leak is caused by having two elements on the stack to copy, so the same mechanism is repeated by throwing from the second assignment in a subsequent test. This fault can be fixed by inserting a try/catch block into the constructor to delete the array in the case of an exception during copying. See Listing 6 (Stack_Reed_Fixed.h) for the fixed version.
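
The shape of such a repair is sketched below. This is not Listing 6: SketchStack and Element are hypothetical, and Element counts its live instances so that the demonstration can confirm the array is released when the second assignment throws.

```cpp
#include <cstddef>

struct TestException {};

// Hypothetical element type whose assignment can be made to throw.
struct Element {
  static int sLive;       // number of Element objects currently alive
  static int sCountdown;  // N > 0: the Nth assignment throws; 0 disables
  int value;
  Element() : value(0) { ++sLive; }
  Element(const Element& o) : value(o.value) { ++sLive; }
  ~Element() { --sLive; }
  Element& operator=(const Element& o) {
    if (sCountdown > 0 && --sCountdown == 0) throw TestException();
    value = o.value;
    return *this;
  }
};
int Element::sLive = 0;
int Element::sCountdown = 0;

template <class T>
class SketchStack {
  T* mData;
  std::size_t mSize;
public:
  SketchStack() : mData(nullptr), mSize(0) {}
  SketchStack(const SketchStack& other)
      : mData(new T[other.mSize]), mSize(other.mSize) {
    try {
      for (std::size_t i = 0; i < mSize; ++i)
        mData[i] = other.mData[i];  // may throw
    } catch (...) {
      // The destructor will not run for a partially constructed
      // object, so the array must be released here.
      delete[] mData;
      throw;
    }
  }
  SketchStack& operator=(const SketchStack&) = delete;
  ~SketchStack() { delete[] mData; }

  void Push(const T& v) {
    T* p = new T[mSize + 1];
    try {
      for (std::size_t i = 0; i < mSize; ++i) p[i] = mData[i];
      p[mSize] = v;
    } catch (...) {
      delete[] p;
      throw;
    }
    delete[] mData;
    mData = p;
    ++mSize;
  }
};

// Throws from the second element assignment during copy construction
// and checks that no Element objects are left behind.
bool CopyConstructorDoesNotLeak() {
  SketchStack<Element> a;
  a.Push(Element());
  a.Push(Element());
  const int liveBefore = Element::sLive;
  Element::sCountdown = 2;
  bool threw = false;
  try {
    SketchStack<Element> b(a);
  } catch (TestException&) {
    threw = true;
  }
  Element::sCountdown = 0;
  return threw && Element::sLive == liveBefore;
}
```

Removing the try/catch block from the copy constructor makes the check fail, because the two elements of the new array are never destroyed.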

The test harness (Listing 5, TestStack.cpp) and exception testing system can be used to find a complete list of the problems with this Stack class. I have done this, and the results are shown in Table 1. There are comments in the fixed version of Stack indicating where each repair was made. With this number of faults, this class is probably better scrapped and rewritten according to the guidelines in Exceptional C++ [2]. Indeed, it is impossible to pass the test suite without changing the interface to the stack class (problem 8 from Table 1). One of the stack classes included in Exceptional C++ is included with this article as Listing 7 (Stack_Sutter_1.h). Another (Stack_Sutter_3.h) is included with the online sources. These classes have a Top method, and the Pop method has been modified so as not to return anything. These changes address problem 8 from Table 1. The same test suite can be used to test all these implementations. Both of Sutter’s implementations pass the test suite without modification.

Testing Exceptions Thrown by the Class Under Test

Now that I have discussed how to test exceptions caused by memory faults, I will turn to exceptions that are thrown by the class under test. How do we test these? By setting up the conditions that cause the exception to be thrown. For example, the Pop method of the Stack class throws an exception if you try to pop an empty stack. We can test this exception path by writing a test such as the following:

class TestPop {
public:
  static void DoTest() {
    Stack<TestClass> a;
    try {
      a.Pop();
      Counter::Fail("Pop empty");
    } catch(const char* ) {
      Counter::Pass("Pop empty");
    }
    Counter::Test( a.Consistent(),
      "a internal state");
    Counter::Test( 0 == a.Size(),
      "a correct size");
  }
};

This particular test tests Sutter’s version of the class, in which Pop has been modified not to return the popped element. The test sets up the stack class to be empty so that Pop will fail. The test is subsequently written so that Pop must throw an exception to pass the test. The stack must also subsequently have consistent internal state, and have zero size as well.

It would also have been possible to test this path of execution by placing a call to Counter::CouldThrow at the point where the Pop could fail. This would not have required such a carefully designed test suite. However, it would have required modification of the code undergoing testing, which is usually undesirable.

Discussion

Using this technique imposes the following requirements:

1) You must have the source code of the class or function under test. (Object code may be sufficient if the class does not use any templates.)

2) You must write an exhaustive test suite for the functionality of that class or function, including for any exceptions that the class or function itself may throw due to being misused in any way.

Meeting the above requirements allows us to:

1) Test whether the class or function under test leaks memory under any circumstances, including due to exception propagation;

2) Easily exercise exception handling code for exceptions that are caused by out-of-memory conditions; and

3) Exercise exception handling code for exceptions that are caused by misusing the class (and the misuse is included in the test suite).

Thus, this testing method is applicable to class-based testing or unit testing. It does not allow us to do any of the following:

1) Test pre-compiled binaries of libraries or complete programs; or

2) Test exceptions due to any reason other than memory failure or conditions deliberately set up by the test suite. An example of such conditions would be I/O faults. (I/O faults could be tested using this method if the I/O library were instrumented with calls to CouldThrow.)

Therefore this method, as it stands, is not applicable to integrated system testing.

This is not a perfect solution, but is much better than having no means of performing this testing at all.

Conclusion

This method allows you to test a function, normal class, or template class for its exception handling integrity. It requires some extra effort over that required to write a standard test suite. The tests must be crafted with some care for the results of the tests to be meaningful. However, if you go to the trouble of writing good tests and fixing any problems found with your code, you can be correspondingly more confident in your code. This technology has already been used to improve the quality of student assignments for simple classes. It has also uncovered incorrect exception handling code output from some compilers. This article only scratches the surface of what can be done by automated unit testing.

Acknowledgements

I gratefully thank Herb Sutter for encouraging me to write this article in the first place, and for assisting with reviewing it.

This technique was invented independently by Matt Arnold, and subsequently used by David Abrahams [7] to write a generic test suite for the STL [8].

References

[1] Tom Cargill. “Exception Handling: A False Sense of Security,” C++ Report, Vol. 6, No. 9, November-December 1994. Also available at http://meyerscd.awl.com/.

[2] Herb Sutter. Exceptional C++ — 47 Engineering Puzzles, Programming Problems, and Solutions (Addison-Wesley, 2000).

[3] Steve McConnell. Code Complete (Microsoft Press, 1993).

[4] Steve Maguire. Writing Solid Code (Microsoft Press, 1993).

[5] Scott Meyers. More Effective C++ (Addison-Wesley, 1996). Also available as a CD; see http://www.meyerscd.awl.com

[6] David Reed. “Exceptions: Pragmatic Issues with a New Language Feature,” C++ Report, October 1993.

[7] David Abrahams. “Exception Safety in Generic Components,” Dagstuhl Conference on Generic Programming, April 27 - May 1, 1998. Online at http://www.cs.rpi.edu/~musser/gp/dagstuhl/gpdag.html.

[8] David Abrahams and Boris Fomitchev, “Exception Handling Test Suite,” available at http://www.stlport.org/doc/eh_testsuite.html.

Ben Stanley graduated from the Australian National University with Honors in Theoretical Physics in 1994. He is now doing a PhD in Robotics at the University of Wollongong, Australia. He has lectured some first and second year C++ units. When he’s not busy writing his thesis, he makes puzzles.
