
Pseudorandom Testing


August, 2004: Pseudorandom Testing

Test harnesses & verifiable datasets

Guy is a video game design consultant and author of Infinite Game Universe (Charles River Media, 2002). He also conducts research into the use of pseudorandom and genetic algorithms in software applications. Guy can be contacted at http://www.lecky-thompson.net/.


Part of the process involved in taking a software project from planning to completion is geared toward testing the code that makes up the core of the application. There are numerous tools that you can use to test the user interface (WinRunner comes to mind), which let scripted test cases be executed to simulate the actions of users.

This is well and good, and sophisticated sets of tests can be created that let most bugs in the UI design be caught relatively efficiently. This efficiency stems from the principle that the test tool can simulate user actions without the actual user being present. An additional bonus is that it does not get bored, and can run all day and all night—something you can't expect of a human.

However, the drawback is that you have to wait until the UI is ready before testing can commence. Often, the inner workings of the application, the libraries that are to be used, and the functionality that provides the data processing are ready before the user interface.

This means that programmers and test teams need to devise methods of validating these core parts of the system prior to releasing them as part of the final package. Failure to do this results in an incredible amount of inefficiency as bugs are only found late in the development process.

Part of the problem can be resolved by using rigorous design methodologies and principles for determining that the software should be correct, but you're never sure that the programmers will capture the exact nature of the proven design in the implementation.

What's required is a test harness that offers the functionality of a UI test tool, but can be used in conjunction with the libraries that make up the core processing functionality of the application. Because these will be entirely nonstandard and change from application to application, a generic tool is difficult to envision.

The success of tools such as WinRunner is based on the fact that a GUI such as that offered by Microsoft Windows is composed of standard pieces. There are buttons, text boxes, menus, and the like, which can each be tested in a generic fashion, as well as in conjunction with each other.

Microsoft has done a lot of work to ensure that this remains the case, and the various exposed parts of the GUI elements (such as window handles) are well documented. Clearly, this will not be the case for the various data objects that make up the core functionality of the application under test, as well as their associated methods.

The Solution: Part I

Using the release of the SHA-1 algorithm (a secure hashing algorithm often used alongside ciphers such as IDEA and Triple DES) as an example, you can see that this is a problem that has been addressed before, using specific datasets with known inputs and outputs.

The driving force behind the test is that, once the algorithm has been implemented, you can provide it with a test dataset and compare the result with the known outputs. If they match, then the algorithm has been correctly implemented, according to the design, and everyone is happy.
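
For instance, a known-answer check for SHA-1 might look like the following minimal sketch. The sha1_hex() helper is hypothetical here and is assumed to be supplied by the implementation under test; the expected value is the published FIPS 180-1 test vector for the message "abc".

#include <stdio.h>
#include <string.h>

// Hypothetical helper, assumed to be provided by the SHA-1 library under
// test: writes the 40-character hexadecimal digest of msg (plus '\0') to out.
void sha1_hex ( const char * msg, char * out );

int main ( void )
{
  char digest[41];
  sha1_hex( "abc", digest );
  // Published FIPS 180-1 test vector for the message "abc"
  if ( strcmp( digest, "a9993e364706816aba3e25717850c26c9cd0d89d" ) != 0 )
    printf( "SHA-1 known-answer test failed!\n" );
  return 0;
}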

However, there will be cases where the input is unpredictable and the output should either match it or fall within a set of parameters, so no strict dataset can be created to support such a philosophy. Data storage and reproduction is a classic case, and it is this that I address first.

Assume that a class that represents a specific data object needs to be implemented, in this case strings and sequential identification numbers for an employee database. Listing One is a possible C++-style implementation of the object, and it leads nicely into a slight digression. It turns out that the most effective way of testing such examples is to pass the header files, along with precompiled libraries, to a separate team of people who usually haven't had a hand in creating the actual implementation.

From the design and the header file, it should be obvious how the object is supposed to behave and, hence, how it can be tested. In this case, the test team might look at the header file and determine that the most effective way to test the implementation would be to create an instance of the CEmployee object, instantiated with known values, and use the data access methods to verify that the correct values have indeed been stored. Consequently, the test team might come up with a test harness like Listing Two.

If there is no output on the screen once this little test harness has been executed, then you can assume that the implementation is correct and the quality check has been passed. You must also assume that the functions you have chosen to use for data verification, in this case the strcmp function from the standard ANSI C library and the inequality operator, have been tested beforehand.

At this point, if you are actively involved in software testing, you're probably rolling your eyes in disbelief—no, I haven't forgotten exception cases, I just haven't mentioned them yet.

An exception case is one that uses data that is illogical or incorrect. Thus, you might want to introduce checks along the lines of Listing Three. Whether these kinds of test cases make sense depends on the library as well as the target application, but the techniques that we are about to introduce can be augmented by the use of exception cases to provide a more complete test harness. They are simply not my focus here.

The Solution: Part II

The preceding test cases were static, using a single set of values. This might seem like enough, but there will be cases where the test team might want to perform some stress or bounds checking to ensure absolute stability, or even just to see where the limits of the system might lie.

In such cases, it would be tedious to manually create sets of test data to verify these aspects of the system. Imagine trying to test for the largest possible name that the CEmployee class will support, or whether the entire ANSI character set is supported, using a hand-coded test harness.

Instead, you would probably prefer to generate the test data, apply it to the object being tested, and verify the correctness of the implementation that way. This is not actually as daunting as it might at first seem; there is an obvious solution and a less obvious one.

First, the obvious one. For string testing, you need to verify that every printable character supported by the standard ANSI character set can be stored and retrieved. As luck would have it, you do not even need to know what these are on the target platform, since the ctype.h standard header file contains a function (or macro) isprint(int c), which returns nonzero (true) if c is printable. Listing Four shows this technique in action, leaving out the actual test, which is identical to that in Listing Two.

There is a drawback to using this approach, which becomes apparent when you consider the next phase in the testing process—variable-length, variable-content strings. You might be tempted to encapsulate the aforementioned process into a function CreateRandomString, which would take a variable of type char * and fill it with random printable characters. This function might look akin to Listing Five; it would admittedly do the job and could be used as in Listing Six. This seems perfectly adequate, and probably is. Since the standard ANSI random-number generator can be seeded to produce a stream of repeatable values, it can even be described as providing static datasets of arbitrarily long strings filled with printable characters.
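
To make that concrete, here is a minimal sketch (assuming the CreateRandomString function of Listing Five, declared in a RandomString.h header as in Listing Seven) showing that a fixed seed reproduces the same string from run to run, at least on a given platform and C library:

#include <stdlib.h>
#include <string.h>
#include "RandomString.h" // Assumed header declaring CreateRandomString

int main ( void )
{
  char first[33], second[33];

  srand( 12345 );                   // Fixed seed
  CreateRandomString( 32, first );
  first[32] = '\0';

  srand( 12345 );                   // Same seed again
  CreateRandomString( 32, second );
  second[32] = '\0';

  return strcmp( first, second );   // 0 means the two streams are identical
}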

The problem is that it might not be portable—each platform might generate different streams of characters. What is more, the use of malloc to create the memory block introduces a variable that cannot be verified prior to being initialized, which adds to the uncertainty of the test harness.

A Better Way

The second, less obvious solution takes a little more patience to understand. If you assume that the isprint function admits only visible printable characters, leaving out control characters, then you can combine the two methods to create a set of static test data containing elements that can be set to known, verifiable values prior to compiling the test harness.

This intriguing thought needs two separate pieces of software. The first creates the second, which is the test harness proper. Listing Seven shows the complete application to create the test harness for verification of a single set of random values for the CEmployee string handling. A similar approach can be used to test the long integer handling.

Even better than emitting the test_data variable is the slightly more convoluted version, which embeds the random string directly in the generated code:

// Print out the test process
fprintf( hOut, "CEmployee * OEmployee = new CEmployee ( \"%s\", 1 );\n",
         test_data );

Using this version also requires a similar change to the test process itself:

fprintf( hOut, "if ( strcmp (
OEmployee->GetName(),
\"%s\" ) != 0)\n", test_data );

The principles need to be extended, of course, to provide for multiple datasets within the same test harness file; judicious use of the malloc function in conjunction with the two small alterations to Listing Seven can be employed to provide a much larger dataset than we have seen here.
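
As a rough illustration only, the generator loop might look something like this, reusing CreateRandomString from Listing Five and assuming, as before, that the random strings contain no characters that would need escaping inside a C string literal:

#include <stdio.h>
#include <stdlib.h>
#include "RandomString.h"

// Emits nSets independent test blocks into the generated harness
void EmitDatasets ( FILE * hOut, int nSets )
{
  for ( int i = 0; i < nSets; i++ )
  {
    long int lSize = 1 + rand() % 64; // Modest lengths, purely for illustration
    char * test_data = (char *) malloc ( (lSize + 1) * sizeof ( char ) );
    CreateRandomString ( lSize, test_data );
    test_data[lSize] = '\0';
    // Each block gets its own scope so OEmployee can be reused
    fprintf( hOut, "{ CEmployee * OEmployee = new CEmployee ( \"%s\", %d );\n",
             test_data, i + 1 );
    fprintf( hOut, "  if ( strcmp ( OEmployee->GetName(), \"%s\" ) != 0 )\n",
             test_data );
    fprintf( hOut, "    printf(\"String test %d failed!\\n\");\n", i + 1 );
    fprintf( hOut, "  delete OEmployee; }\n" );
    free ( test_data );
  }
}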

It is better than the first, more obvious method because it lets the test team look at the source code and know exactly what values are being used to populate the data contained in the OEmployee object—the test data is more transparent.

For this reason, you can use nonstatic variables (such as a char * whose contents are generated at run time) in the application that creates the test harness and still be able to verify the data passed to the constructor. This is something you could not do with the original solution.

Conclusion

In this article, I have focused on ways in which objects can be tested by creating specific test harnesses with verifiable datasets in a much more efficient manner than hand coding them. Of course, the intermediate stages also need to be tested, but since the result is to be compiled and can be reviewed by programmers, there is much less scope for error.

While it can seem slightly overcomplicated for the simple example class I presented, the power of the technique becomes clearer as soon as that class begins to be extended. You could, for example, simulate data entry of many hundreds of thousands of records, to be held together by a linked list, with little additional coding.
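
As a sketch of that extension only: the node type below is hypothetical, CEmployee and CreateRandomString are taken from Listings One and Five, and CEmployee is assumed to copy the name it is given.

#include <stdlib.h>
#include "CEmployee.h"
#include "RandomString.h"

// Hypothetical node type chaining CEmployee objects into a singly linked list
struct SEmployeeNode
{
  CEmployee * pEmployee;
  SEmployeeNode * pNext;
};

// Builds a list of nRecords employees with random 32-character names
SEmployeeNode * BuildTestList ( long int nRecords )
{
  SEmployeeNode * pHead = 0;
  for ( long int i = 0; i < nRecords; i++ )
  {
    char szName[33];
    CreateRandomString ( 32, szName );
    szName[32] = '\0';
    SEmployeeNode * pNode = new SEmployeeNode;
    pNode->pEmployee = new CEmployee ( szName, i + 1 ); // Assumes the name is copied
    pNode->pNext = pHead; // Push onto the front of the list
    pHead = pNode;
  }
  return pHead;
}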

Although this could also be done using a tool such as WinRunner, these techniques can be used incrementally, before the interface is complete. The advantages are twofold. First, you catch errors earlier in the application creation process and, second, you can test code at the core of the application before it becomes further complicated with other libraries, such as the user interface or file handling.

DDJ



Listing One

class CEmployee
{
  private:
    char * szName;
    long int lID;

  public:
    // Constructor
    CEmployee ( char * szName, long int lID );
    // Destructor
    ~CEmployee ( );
    // Inline Data Access Methods
    char * GetName() { return this->szName; }
    long int GetID() { return this->lID; }
};


Listing Two
#include <stdio.h>  // Standard ANSI C I/O
#include <string.h> // Useful for string comparisons
#include "CEmployee.h" // The object class to be tested int main ( void )
{
  // Instantiate the test data
  char test_name [] = "This is a test name. 123. ABC.";
  long int test_id = 12345;
  // Create an instance of the CEmployee object, using the test data
  CEmployee * OEmployee = new CEmployee ( test_name, test_id );
  // Check that the data has been correctly stored
  if ( strcmp( test_name, OEmployee->GetName() ) != 0)
    printf("Name test failed!\n Expected [%s] but found [%s]",
      test_name, OEmployee->GetName() );
  if ( test_id != OEmployee->GetID() )
    printf("ID test failed!\n Expected [%ld] but found [%lld]",
      test_id, OEmployee->GetID() );
  // Clean up...
  delete OEmployee;
  return 0;
}


Listing Three
CEmployee * OEmployee = new CEmployee( "", 1 );       // Illogical, Empty name
CEmployee * OEmployee = new CEmployee( 1, "" );       // Incorrect parameters
CEmployee * OEmployee = new CEmployee( "Test", -1 );  // Illogical ID
CEmployee * OEmployee = new CEmployee( "Test", 3.5 ); // ID wrong type


Listing Four
    ...
    char test_name[255]; // This array is larger than it needs to be
    // Generate the test_name data, containing every printable ANSI character
    int pos = 0;
    for (int j = 0; j < 255; j++)
    {
      if ( isprint(j) )
      {
        test_name[pos] = j;
        pos++;
      }
    }
    test_name[pos] = '\0'; // Just in case
    CEmployee * OEmployee = new CEmployee ( test_name, 1 ); // For example
    ...


Listing Five
// Fills szText with nLength random printable characters; note that the
// caller is responsible for nul-terminating the buffer (see Listing Six)
void CreateRandomString ( long int nLength, char * szText )
{
  long int j;
  j = 0;
  while ( j < nLength )
  {
    int c = 0;
    while ( !isprint ( c ) )
    {
      c = rand() % 255;
    }
    szText[j] = c;
    j++;
  }
}


Listing Six
char * szText;
long int lSize;

// MAX_LONG_INT is assumed to be defined elsewhere (for example, as LONG_MAX
// from <limits.h>); rand() simply supplies an arbitrary length
lSize = rand() % MAX_LONG_INT;

szText = (char *) malloc ( (lSize + 1) * sizeof(char) );

CreateRandomString ( lSize, szText );
szText[lSize] = '\0';


Listing Seven
#include <stdio.h>
#include <stdlib.h>
#include "RandomString.h" // For the CreateRandomString function

int main ( void )
{
  char szFileName[] = "CEmployeeTest.cpp";
  FILE * hOut = fopen( szFileName, "w" ); // Open file for writing
  // Create the preamble for the test harness
  fprintf( hOut, "#include <stdio.h>\n#include <string.h>\n#include \"CEmployee.h\"\n\n" );
  fprintf( hOut, "int main ( void )\n{\n" );
  // Set up the test_data (MAX_LONG_INT is assumed to be defined elsewhere;
  // the random string is also assumed to contain no characters that would
  // need escaping inside a C string literal)
  char * test_data;
  long int lSize;
  lSize =  rand() % (MAX_LONG_INT / 2);
  lSize += rand() % (MAX_LONG_INT / 2);
  test_data = (char *) malloc ( (lSize + 1) * sizeof ( char ) );
  CreateRandomString ( lSize, test_data );
  test_data[lSize] = '\0'; // CreateRandomString does not terminate the buffer
  // Print out the dataset
  fprintf( hOut, "char test_data[] = \"%s\";\n", test_data );
  // Print out the test process
  fprintf( hOut, "CEmployee * OEmployee = new CEmployee ( test_data, 1 );\n" );
  fprintf( hOut, "if ( strcmp ( OEmployee->GetName(), test_data ) != 0 )\n" );
  fprintf( hOut, "\tprintf(\"String test failed!\\n\");\n" );
  fprintf( hOut, "return 0;\n}\n" );
  fclose ( hOut );
  free ( test_data );
  return 0;
}

