Ensuring Database Quality

There are lots of reasons why you need to develop a comprehensive testing strategy for your databases.


November 01, 2006
URL:http://www.drdobbs.com/tools/ensuring-database-quality/193402922

Scott is a DDJ Senior Contributing Editor and author of numerous IT books. He can be contacted at www.ambysoft.com/scottAmbler.html.


Last month, I summarized a July 2006 survey exploring the current state of data management within IT organizations. The area of greatest concern revealed by this survey is the abysmal level of database testing: 96 percent of organizations considered data to be a corporate asset and 64 percent implemented mission-critical functionality within the database, yet only 40 percent had tests to validate the data and only 46 percent had tests to validate the functionality. Worse yet, the survey revealed a lack of recognition that we need to be doing database testing at all. Of the organizations not testing for data quality, only 32 percent realized that they needed to do so; of those not testing database functionality, only 39 percent did. Clearly something is amiss.

There are three fundamental reasons why you need to develop a comprehensive testing strategy for your RDBMS:

What Is There to Test?

When I describe database testing to people, they're often puzzled—what could there be to test? The answer is that there is quite a bit, and it's all very important. Figure 1 uses threat boundaries (the dashed lines) to indicate that there are two categories of database testing—database interface testing and internal database testing. Database interface testing is focused on ensuring that correct data is being put into the database and taken out of it, whereas internal database testing ensures that the database runs as expected.


Figure 1: Where to test relational databases.

Common database interface tests include validating data values before saving them into the database and validating the data values coming back from the database. SQL code is still code, so you should test it. If your team uses a persistence framework such as Hibernate or Genome, then you'll want to test your mappings as well.
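As a rough illustration of an interface test, the sketch below validates values before they are saved and then checks what comes back from the database. It uses Python's built-in sqlite3 module with an in-memory database purely for convenience; the customer table, its columns, and the helper functions are hypothetical names invented for this example, not part of any particular application or framework.

```python
import sqlite3


def validate_customer(name, credit_limit):
    """Reject values the application should never send to the database."""
    if not name or not name.strip():
        raise ValueError("name must not be blank")
    if credit_limit < 0:
        raise ValueError("credit_limit must not be negative")


def save_customer(conn, name, credit_limit):
    """Validate, then insert; returns the generated primary key."""
    validate_customer(name, credit_limit)
    cur = conn.execute(
        "INSERT INTO customer (name, credit_limit) VALUES (?, ?)",
        (name.strip(), credit_limit),
    )
    return cur.lastrowid


def load_customer(conn, customer_id):
    """Read the row back exactly as the database stored it."""
    return conn.execute(
        "SELECT name, credit_limit FROM customer WHERE id = ?",
        (customer_id,),
    ).fetchone()


def make_test_db():
    """Hypothetical schema, created fresh for every test."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " credit_limit INTEGER NOT NULL)"
    )
    return conn


def test_round_trip():
    # What goes in (after trimming) should be what comes out.
    conn = make_test_db()
    customer_id = save_customer(conn, "  Acme Ltd ", 5000)
    assert load_customer(conn, customer_id) == ("Acme Ltd", 5000)


def test_rejects_bad_values():
    # Invalid values must be stopped before they reach the table.
    conn = make_test_db()
    for name, limit in [("", 100), ("Acme", -1)]:
        try:
            save_customer(conn, name, limit)
        except ValueError:
            continue
        raise AssertionError("invalid value was accepted")
    assert conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0] == 0


if __name__ == "__main__":
    test_round_trip()
    test_rejects_bad_values()
    print("interface tests passed")
```

The same shape should carry over to a persistence framework: save an object through the mapping, read it back, and compare.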

Internal database testing isn't as common as database interface testing, likely due to a current dearth of testing tools, although it is arguably more important. The most obvious need is for unit tests of your stored procedures and functions. Less obvious is the need for tests that validate your referential integrity (RI) rules. Because RI is typically implemented by triggers, and triggers can get updated and/or dropped, you'll want tests in place to ensure that your database is still working properly. Tests that validate your view definitions are also important because views often implement critical calculations and data combinations. Finally, you should also perform data quality tests, such as validating the default value of a column and checking invariants of a single data column, invariants between columns, and invariants between rows.
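The sketch below shows what a few internal database tests might look like: one for a default value, one for an RI rule, one for column invariants, and one for a view calculation. It runs against SQLite through Python's sqlite3 module, and the schema is entirely hypothetical. Note one simplifying assumption: RI is declared here with a foreign key rather than implemented by triggers, so a trigger-based schema would need the same kind of test pointed at the trigger's behavior.

```python
import sqlite3

# Hypothetical schema used only for this illustration.
SCHEMA = """
CREATE TABLE customer (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'active'
);
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    subtotal    INTEGER NOT NULL,
    tax         INTEGER NOT NULL,
    CHECK (subtotal >= 0 AND tax >= 0 AND tax <= subtotal)
);
CREATE VIEW customer_total AS
    SELECT customer_id, SUM(subtotal + tax) AS total
    FROM invoice
    GROUP BY customer_id;
"""

INSERT_INVOICE = "INSERT INTO invoice (customer_id, subtotal, tax) VALUES (?, ?, ?)"


def make_db():
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
    return conn


def rejected(conn, sql, params):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


def test_default_value():
    conn = make_db()
    row = conn.execute("SELECT status FROM customer WHERE id = 1").fetchone()
    assert row[0] == "active"


def test_referential_integrity():
    # Customer 99 does not exist, so the invoice must be refused.
    conn = make_db()
    assert rejected(conn, INSERT_INVOICE, (99, 100, 10))


def test_column_invariants():
    conn = make_db()
    assert rejected(conn, INSERT_INVOICE, (1, -5, 0))     # single-column rule
    assert rejected(conn, INSERT_INVOICE, (1, 100, 200))  # rule between columns


def test_view_calculation():
    conn = make_db()
    conn.executemany(INSERT_INVOICE, [(1, 100, 10), (1, 200, 20)])
    row = conn.execute(
        "SELECT total FROM customer_total WHERE customer_id = 1"
    ).fetchone()
    assert row[0] == 330  # (100 + 10) + (200 + 20)


if __name__ == "__main__":
    test_default_value()
    test_referential_integrity()
    test_column_invariants()
    test_view_calculation()
    print("internal database tests passed")
```

If someone later drops the foreign key or edits the view, the corresponding test fails, which is exactly the safety net described above.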

Databases are shared resources, so ideally there should be a common database test suite that can be invoked by any application team. A single test suite would enable your organization to support consistent database testing between teams and ensure that your testing investment is spent wisely. Do you really want 50 application teams writing the same basic interface tests yet ignoring internal testing?

Writing Database Tests

There's no magic when it comes to writing database tests; you write them just like you would any other type of test. A database test is typically a three-step process:

A common debate amongst developers is where to obtain test data: Should you use production data or create your own test data? The answer is that you need both. For unit testing, I prefer to create sample data to ensure that I can predict the actual results for each test. For other forms of testing, particularly system integration testing and function testing, I will use production data so as to better simulate real-world conditions. For load/stress testing, I will use production data if it is available; otherwise, I will create the requisite test data. Tools such as DBUnit (www.dbunit.org) and DTM Data Generator (www.sqledit.com/dg/) are good options for creating test data.
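To make the point about predictable results concrete, here is a minimal sketch of generated sample data for a unit test. It does not use DBUnit or DTM Data Generator; it is a plain Python stand-in with a fixed random seed, so every run produces the same rows and the expected result can be computed independently of the SQL under test. The orders table and function names are hypothetical.

```python
import random
import sqlite3

REGIONS = ["north", "south", "east", "west"]


def generate_orders(count, seed=42):
    """Build repeatable sample rows: the same seed always yields the same data."""
    rng = random.Random(seed)
    return [
        (order_id, rng.choice(REGIONS), rng.randint(1, 500))
        for order_id in range(1, count + 1)
    ]


def load_orders(conn, rows):
    """Create the hypothetical orders table and fill it with the sample rows."""
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " region TEXT NOT NULL,"
        " amount INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


def test_region_totals():
    rows = generate_orders(200)
    conn = sqlite3.connect(":memory:")
    load_orders(conn, rows)

    # Work out the expected totals in plain Python, without SQL.
    expected = {}
    for _, region, amount in rows:
        expected[region] = expected.get(region, 0) + amount

    # The query under test must agree with the independent calculation.
    actual = dict(
        conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
    )
    assert actual == expected


if __name__ == "__main__":
    test_region_totals()
    print("sample data test passed")
```

Production data would not allow this kind of up-front prediction, which is one reason it suits integration and load testing better than unit testing.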

There are several strategies for managing test data, each of which can be used alone or in combination:

A significant advantage of the second and third strategies is that it is much more likely that the developers of that code will place it under configuration management (CM) control. It is also possible to put test data itself under CM control (worst case, you generate an export file that you check in), but this isn't a common practice and therefore may not occur as frequently as required. My advice is to choose an approach that reflects the culture of your organization.
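For the export-file case, one possible approach is to dump the test data as SQL text, which diffs and merges reasonably well under CM control. The sketch below assumes SQLite and uses the iterdump() method from Python's sqlite3 module; other databases have their own export tools, and the file name and account table here are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def export_test_data(conn, path):
    """Write the whole database as SQL statements, one per line."""
    path.write_text("\n".join(conn.iterdump()) + "\n", encoding="utf-8")


def restore_test_data(path):
    """Rebuild a fresh in-memory database from a checked-in export file."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(path.read_text(encoding="utf-8"))
    return conn


if __name__ == "__main__":
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    source.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 250)])
    source.commit()

    with tempfile.TemporaryDirectory() as tmp:
        dump_file = Path(tmp) / "test_data.sql"  # the file you would check in
        export_test_data(source, dump_file)
        restored = restore_test_data(dump_file)

    rows = restored.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
    assert rows == [(1, 100), (2, 250)]
    print("export and restore round trip succeeded")
```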

Raising the Bar

I believe that the agile software development community is in the process of generally raising the bar within the IT community. We're rethinking software development, adopting forgotten practices from yesteryear such as pair programming and active stakeholder participation, and introducing new practices such as continuous regression testing and refactoring. Due in part to the reduced feedback lifecycle, as well as a greater focus on quality, agile software developers are clearly becoming more productive than their traditional counterparts.

Not only is the bar being raised for application programming, it is also being raised for database development. The greatest challenge concerning modern techniques such as database regression testing and database refactoring is apathy amongst the data management community. Luckily, this may not be an issue much longer. Endeavors such as the Eclipse Data Tools Platform (www.eclipse.org/datatools/) and Microsoft's Visual Studio Team Edition for Database Professionals (msdn.microsoft.com/vstudio/teamsystem/products/dbpro/) are starting to provide the tools that developers require to support agile database techniques. These tools enable developers to perform, in minutes, common data management tasks that may have taken days or weeks in the past. In short, the agile community appears to be reinventing data management, and they're doing so in such a way as to improve both quality and responsiveness to change. My fear is that the apathetic among us are not only being left behind, they may even be risking unemployment.
