Dr. Dobb's | Change Code Without Fear

Change Code Without Fear

Behavioral regression testing is a technique that provides a fast and easy way to determine if code modifications change or break existing functionality.

February 06, 2008
URL:http://www.drdobbs.com/web-development/change-code-without-fear/206105233

Nada is Product Manager of Java Solutions at Parasoft. She holds a bachelor's degree in computer science from the University of California, Los Angeles and can be contacted at [email protected].

Working on an existing code base that has minimal or no tests is like walking on eggshells: Every move you make has the potential to break something. Yet when you're working on software, the damage is not always immediately obvious—at least, not without extensive, system-wide testing. But such validation is not always possible or easy, considering the frequency of change, limited resources, and looming deadlines that most developers face. As a result, many developers commit their additions and modifications, then cross their fingers and hope for the best.

If your code lacks sufficient tests, Behavioral Regression Testing is a technique that provides a fast and easy way to determine if code modifications change or break existing functionality. What is Behavioral Regression Testing? It's a baseline test that captures the project code's current functionality. To detect changes from this baseline, you run your evolving code base against a test suite on a regular basis. In most cases, such a test suite can be generated overnight using automated unit-testing tools. Once this test suite is in place, you can incrementally improve its intelligence and value by adding more test cases, modifying the automatically generated ones, and keeping it in sync with intentional program changes. The resulting test suite serves as a change-detection safety net, letting you modify code without fear of accidentally changing or breaking the code's intended behavior.

The key to making this practice practical is to automate as many tasks as possible so that you can focus your efforts on the few regression-testing tasks that truly require human intelligence. By leveraging automation for this purpose, you gain a reliable way to determine when and how your code modifications impact the rest of the application—with minimal effort.

In this article, I explain how to build, maintain, and extend Behavioral Regression Test Suites that help you write code faster and change it with confidence.

Understanding the Key Phases

There are two key phases involved in creating and using Behavioral Regression Test Suites:

Phase 1. Automated one-time test generation. A one-time process that involves generating a regression test suite that can be run on a regular basis and will not report any errors unless the captured behavior changes (Figure 1).
Phase 2. Automated test generation for new code + Automated daily execution + extend tests. A daily process that involves automated generation of regression tests for new code, as well as automated execution of the existing test suite. After the execution completes, you review and respond to any functional changes that the test suite detected. The response process involves injecting human intelligence into the test suite.

[Click image to view at full size]

Figure 1: Phase 1.

Once per project, an automated unit test generation tool scans the project code base (accessed from the source control system), then automatically generates and executes unit test cases with assertions that capture the project code's current functionality for a wide range of automatically generated test scenarios. Test execution results should be stored on a central server, where they can be shared team-wide, and the generated tests should be added to source control. At this point, you might want to fine-tune the test suite to minimize noise (by ignoring any assertions that fail on a second run against the exact same code base, ignoring time-sensitive and date-sensitive assertions, and so forth). However, such fine-tuning is completely optional at this phase.

Because your goal is to establish a behavioral baseline rather than verify the application's current functionality, there is no need to actually review—or even glance at—the generated tests or the assertions/observations/outcomes attained. Assuming that your application is currently functioning as expected, just blindly verify all the reported assertions. Or, even better, configure the tool to automatically set the actual outcomes as the expected ones.

No test failures should be reported for the initial run of this test suite. To minimize the amount of noise presented, failures should be reported only when code behavior changes as a result of code-base modifications (Figure 2).

[Click image to view at full size]

Figure 2: Phase 2.

This daily process involves running the evolving code base against the behavioral regression test suite on a regular basis. Running in batch mode, the unit-testing tool scans the modified project code base (accessed from the source control system), automatically executes the existing regression test suite, and also automatically generates regression tests for new code. Test failures are reported only if the current code behavior does not match the behavior captured in the baseline test suite.

This alerts you whenever the previous day's changes caused the tests to fail. If this occurs, you need to review and address the reported failures. At minimum, you need to respond to failures daily to keep the test suite in sync with the app. As resources permit, you may also extend the test suite as needed to cover any critical application functionality not yet represented in the test suite.

Ideally, the results from this automated test execution not only reveal which test case assertions failed as a result of the previous day's code modifications, but also indicate exactly which developer modified the code that broke each assertion. For instant feedback on whether their code changes broke the existing functionality, developers can import information into their IDE about the regression failures caused by their modifications. Because the regression failures are directed to the developers responsible for them, the overall process of fixing them is much more streamlined than it would be if all developers were looking at the same list of regression failures. And since the results are available in their IDEs, resolution is faster and easier.

Establishing a Supporting Process

When you arrive at work each morning, review and respond to any test failures reported for your code. In doing so, you will either address the functional defects in your code or update the test to reflect the correct behavior of the code.

These are the recommended responses to test-case failures :

If the previous behavior is still the correct behavior, fix the code that was broken.
If the functionality was modified intentionally, verify the new behavior by updating the test-case assertions.
If the assertion is checking something that you don't want to check or something that changes over time (if it is checking the day of the month), comment out the assertion.
If the test case doesn't check something you want to assert or if the test case is no longer valid, delete the test case. For instance, this might be appropriate if someone adds validation checks to some methods, and now the inputs that were originally generated are no longer valid.
Additionally, you might want to further enhance the tests with additional logic, assertions, or new test cases.

The end result of this repeated daily process is a regression suite that evolves with the application, which is more robust and more intelligent than it was the day before. By spending a fraction of each day to keep the test suite in sync with the application and by continually enhancing it by building more intelligence into the tests, development teams can increase the lifespan and value of the test cases they already created, and expose most code-regression errors as soon as they are introduced—which is when they are fastest and easiest to diagnose and fix.

Extending the Test Suite

Automated test-case generation is what makes Behavioral Regression Testing practical for teams that haven't been building functional test cases for every piece of code as developed. Automated test cases are invaluable for quickly generating a baseline test suite to help you identify whether code modifications change code behavior. However, they may need to be extended to verify the continued operation of use cases, check scenarios that require a complicated setup, or simply improve test coverage. This requires injecting human intelligence into the automatically generated test suite.

One way to extend the test suite is to enhance the automatically generated test cases (that is, by adding more function calls, using more realistic inputs, ensuring that objects are properly initialized, and so on). Another way is to write your own functional tests using whatever test frameworks are applicable for your application (JUnit, Cactus, CppUnit, NUnit, and the like). If the team has any legacy test cases (for example, unit test cases that developers wrote for one-time verification), those should also be integrated into the test suite. Additionally, leverage any other technologies you have at your disposal to generate test cases that cover vital functionality and/or provide increased coverage of your application. These tests should all be executed in concert with the automatically generated test cases, letting you see the cumulative coverage of the complete test suite.

For example, if you are working on a web application and/or SOA, tests that exercise the application from those perspectives should also be added. Ideally, such tests should be converted to source-code level tests (such as JUnit or HTTPUnit). This correlates front-end and message-layer behavior to the back-end code, which helps you:

Isolate the code responsible for any issues discovered in the Web or SOA test scenarios.
Determine how much of your code is exercised by your complete test suite.

Pattern-matching and data-flow static analysis are also valuable extensions to the regression test suite. Traditionally, people do not consider static analysis as part of the regression test suite, but it can be a very important tool in detecting regressions. Tools that can run pattern-based and data-flow static analysis are good at identifying mechanical errors that would take humans a long time to find. Think of it this way: This morning, no violations were reported for your code. You change the code today, then when you arrive at work tomorrow morning, you see a violation reported for that code, alerting you to a coding problem that—if not corrected—causes resource leaks. This is essentially a regression failure—it indicates that the recent code modifications introduced problems into the code. If static analysis was not performed as part of the regression testing, this problem might have gone unnoticed until load testing later in the development process, when it would be considerably more difficult to diagnose and repair.

Establishing a Supporting Infrastructure

Running automated and regularly scheduled build and testing processes should involve minimal distraction if set up properly and if all required infrastructure is present. At minimum, that infrastructure should include a source-control system, a tool for automated test execution, and a reporting mechanism to track results of the automated build and testing.

Most development teams use a source-control system to store code they are working on. Some teams store their tests in the source-control system, and others leave testing to the discretion of the individual developers. In some cases, developers are testing the code on their own machines, but the tests never make it into the source-control system. In other cases, they do some testing if time permits and don't do any when in a crunch. In either case, the value of any tests that exist on an individual developer's machine is minimal and such tests are typically used only to verify that the code is functioning properly at the time it is written. Once stored into the source-control system, the test can be leveraged over and over to validate that the new code didn't break the existing functionality or introduce problems that could impact reliability or functionality.

The regression suite generation/execution should be scheduled to run regularly (for example, nightly, or several times a day in a continuous integration process) in whatever manner makes the most sense for your specific development environment and team. This could be a combination of an Ant script and Windows scheduler or CruiseControl, or a shell script and a crontab on UNIX. This process then regularly scans the project and generates more test cases for new and updated code, and executes all tests that comprise the regression suite, then reports all failures and cumulative coverage information. This report can be e-mailed to individual developers, or they can import it directly into their IDE—effectively triggering their review of the failures and the new/changed code. This cycle is then repeated every time the regression process is run.

Building a Behavioral Regression Test Suite

To demonstrate this process, I use Parasoft JTest to create a Behavioral Regression Test Suite for the JPetStore 4.0 project from iBatis (sourceforge.net/project/showfiles.php? group_id=60632). This is a simple web application that lets you purchase pets online.

First, automatically generate an initial regression suite for the JPetStore project. This one-time process scans the project and generates a test class for each Java file in the project. In the JPetStore example, 40 new test classes containing 702 test cases were generated. The tool proceeded to execute only these automatically generated test cases because the regression test suite did not include any manually written test cases. It reported coverage of 91 percent of the source code:


[exec] Executed Test Cases: 702
[exec] Runtime Exceptions: 0
[exec] Assertion Failures: 0/0 verified
[exec] Contract Violations: 0
[exec] Profiling Problems: 0
[exec] Unverified Outcomes: 0
[exec] Coverage:
[exec]    Line: 91% [994/1090 executable lines]

Next, set up the automated testing tool to run using Ant. The appropriate Ant properties are configured to point to the tool installation directory and the Eclipse workspace (which is where the JPetStore project is located); for example:


<target name="all" >
 <echo message="Testing 'JPetStore'" />
 <exec dir="." executable="${testtoolcli}" >
    <arg line="-data ${workspace} ${config} -resource 'JPetStore' 
   -report '${workspace}/JPetStore/Report.xml' -publish
   -localsettings '${workspace}/JPetStore/examples.properties'"/>
 </exec>
</target>

Extending the Test Suite

Before extending the test suite, it's helpful to review the automatically generated test cases. For example, Listing One is a sample test that's automatically generated for the com.ibatis.jpetstore.domain.Order class (specifically, its initOrder() method). Notice how the tool automatically generated input data to use for the test. At the same time, it created assertions based on the observed state of the objects at the end of the test. This now becomes a test that you can always rerun against the Order class to confirm that its behavior remains intact as you are extending and modifying it.

Further, you can extend the automatically generated test suite either by manually adding more test cases or by modifying the automatically generated tests to use realistic data, check specific assertions, or test various functional scenarios that you may be interested in. Here is an example of a test for the initOrder() method in the com.ibatis.jpetstore.domain.Order class. Listing Two (available online at http://www.ddj.com/code/) is a test created by manually extending one of the automatically generated tests to add more logical, realistic data—as well as to add more assertions.

/**
* Test for method: initOrder(com.ibatis.jpetstore.domain.Account,com.ibatis.jpetstore.domain.Cart)
*/
public void testInitOrder21() throws Throwable {
   Order testedObject = new Order();
   Account account = new Account();
   Cart cart = new Cart();
   account.setUsername("username1");
   account.setPassword("password0");
   account.setEmail("email0");
   account.setFirstName("firstName0");
   account.setLastName("lastName0");
   account.setStatus("status1");
   account.setAddress1("140 East 45th Street, New York, NY 10017");
   account.setAddress2("1600 Pennsylvania Avenue NW, Washington, DC 20500");
   account.setCity("Tokyo");
   account.setState("New York");
   account.setZip("90011-1234");
   account.setCountry("England");
   account.setPhone("1234567");
   account.setFavouriteCategoryId("favouriteCategoryId0");
   account.setLanguagePreference("languagePreference0");
   account.setListOption(false);
   account.setBannerOption(false);
   account.setBannerName("bannerName0");
   testedObject.initOrder(account, cart);
   assertNotNull(testedObject.getLineItems());
   assertEquals(0, testedObject.getLineItems().size());
   assertNotNull(testedObject.getOrderDate());
   assertEquals(new java.util.Date().toString(), testedObject.getOrderDate().toString());
   assertNotNull(testedObject.getTotalPrice());
   assertEquals("0", testedObject.getTotalPrice().toString()); 
   assertEquals("P", testedObject.getStatus());
   assertEquals(0, testedObject.getOrderId());
   assertEquals("username1", testedObject.getUsername());
   assertEquals("140 East 45th Street, New York, NY 10017", testedObject.getShipAddress1());
   assertEquals("1600 Pennsylvania Avenue NW, Washington, DC 20500",testedObject.getShipAddress2());
   assertNotNull(cart.getAllCartItems());
   assertNotNull(cart.getCartItems());
}

Listing One

Over time, you might want to further improve the intelligence of the regression test suite by using automated tools that record interactions with a running application, then produce functional JUnit tests representing the recorded interactions. Listing Three (available online) is an example of a front-end JUnit test that represents a scenario of purchasing a dog and a cat, then validates that the total for this order is $112.

Responding to Reported Failures

Imagine that this process has been running for about 30 days, and that the development team just received a new requirement to implement a discount policy—25 percent of the discount will be applied to the total price if the weighted average price per individual item is greater than $20.

To implement this requirement, the getSubTotal() method of the com.ibatis.jpetstore.domain.Cart class needs to be changed. Listing Four (available online) implements the new idea. You can change this code without fear of unknowingly introducing problems. The regression test suite alerts you if your code modifications introduced any undesired results.

When you rerun the regression suite against the modified code, you get one assertion failure and one case where the test fails due to an ArithmeticException; see Figure 3.

[Click image to view at full size]

Figure 3: Running the regression suite against modified code.

Taking a closer look at the ArithmeticException failure, you notice that the new code is introducing a defect because it is not properly handling a case when the number of items in the cart is 0. The problematic line is marked with the comment //Division by zero. The code should be modified to check whether the total number of items in the cart is not equal to 0 before it proceeds to perform the division. For example:



if (totalItems !=0) { 
BigDecimal actualAvePrice = 
  subTotal.divide(new BigDecimal     (totalItems));
  if (actualAvePrice.compareTo       (targetAvePrice) > 0) {
     //if avg price is $20 or        more, let's apply        25% discount
     BigDecimal sb =        new BigDecimal(0.75);
     subTotal = subTotal.multiply        (sb).setScale(2);
     }
}

When you rerun the regression test suite, the ArithmeticException failure is no longer reported. In this case, the regression suite prevented you from introducing a defect into the code.

Next look at the reported assertion failure. It indicates that a test case was expecting the order total to be $112, but it was actually $84. This failure occurs in the functional BuyDogandCatTest test case (Listing Three), which was added to the regression suite.

Given the new discount policy in place, this is actually the correct behavior—not a defect. This means that the test needs to be updated to remain in sync with the source code change. In this case, the expected amount in the assertion needs to be changed from $112 to $84. If you rerun the regression suite after this change is made, no failures are reported.