In This Issue:
- Database Regression Testing: Isn't It Time to Bring Quality to Data Management?
- Book Review: Producing Open Source Software
- Hot Links
Database Regression Testing: Isn't It Time to Bring Quality to Data Management?
Databases are often a central part of the overall information technology infrastructure within most organizations. Although developers and database administrators (DBAs) may argue over whether it is the best way to build systems, a fact of life is that mission-critical business functionality is often implemented in stored procedures and triggers within the database. The importance of your databases is further underscored by the fact that data is an important corporate asset. Clearly your organization should have a realistic regression testing strategy in place to ensure the quality of both your databases and the data within them. Your organization does have such a strategy, doesn't it?
Unfortunately, few organizations have such a strategy in place, and worse yet, it is extremely rare to find a company blessed with a data management group that has even thought of the idea. The typical data management strategy to ensure database and data quality is to hold model reviews and to enforce onerous procedures for updating database schemas, which I suppose sounds like a good idea in theory. However, when you consider the current data quality challenges that most organizations face, it's clear that these strategies aren't getting the job done and are arguably little more than political power plays on the part of data management professionals.
When it comes to data quality, the current "state of the art" in many organizations is for data professionals to control changes to the database schemas, for developers to visually inspect the database during construction, and for testers to perform some form of formal testing during the test phase at the end of the lifecycle. Unfortunately, none of these approaches proves very effective in practice. Application developers will often go around their organization's data management group because they find the group too difficult to work with or too slow in the way it works, or sometimes because they don't even know they should be working together. The end result is that the teams don't follow the desired data quality procedures and therefore quality suffers. Although visual inspection of query results is a good start, you likely won't do it consistently, nor will you do it often enough. Testing late in the lifecycle is better than nothing, but as Barry Boehm noted in the early '80s, it's incredibly expensive to fix any defects you find at that point. We need a new approach to data quality that actually works.
Throughout this decade the agile community has rediscovered the idea that if you test early and often, the quality of your work goes up because you find problems that much faster. Better yet, we've discovered that taking a test-driven development (TDD) approach, where you write a test before you write production code, is an incredibly effective development technique: Not only do you ensure that you have a 100-percent regression test suite in place, you effectively do detailed design in the form of unit test creation. So, just as the agile community has led the way in introducing comprehensive testing techniques for application code, it appears that we must also do the same for database testing.
My experience is that database testing must be performed at two levels: Internally within the database itself and at the interface level where you put data into the database and retrieve it from the database. Both categories of testing are crucial to your success.
Internally within the database you clearly need to test database methods such as stored procedures, functions, and triggers. Testing database methods is pretty much the same as testing operations and procedures in your application code, so your existing code testing skills will come in handy. You'll also want tests that validate view definitions, referential integrity (RI) rules, default values on a column, and any data invariants for a single column or involving several columns. This is a bit different from traditional code testing, but I think that you can see it's straightforward when you start to think it through.
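As a rough illustration of what such internal tests might look like, here is a minimal sketch using Python's standard sqlite3 and unittest modules. The schema (customer, account, account_audit, the trg_account_audit trigger, and the customer_balance view) is invented purely for this example, and SQLite is assumed only because it runs in memory; it has no stored procedures, so the trigger stands in for a database method. Your own database and test framework will differ.

```python
import sqlite3
import unittest

# Hypothetical schema, invented for illustration only.
SCHEMA = """
CREATE TABLE customer (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'active'
);
CREATE TABLE account (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    balance     INTEGER NOT NULL DEFAULT 0 CHECK (balance >= 0)
);
CREATE TABLE account_audit (
    account_id  INTEGER NOT NULL,
    old_balance INTEGER NOT NULL,
    new_balance INTEGER NOT NULL
);
CREATE TRIGGER trg_account_audit AFTER UPDATE OF balance ON account
BEGIN
    INSERT INTO account_audit VALUES (OLD.id, OLD.balance, NEW.balance);
END;
CREATE VIEW customer_balance AS
    SELECT c.id AS customer_id, c.name, SUM(a.balance) AS total
    FROM customer c JOIN account a ON a.customer_id = c.id
    GROUP BY c.id, c.name;
"""


def new_database():
    """Build a fresh in-memory database so every test starts from a known state."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces RI only when asked
    conn.executescript(SCHEMA)
    return conn


class InternalDatabaseTests(unittest.TestCase):
    def setUp(self):
        self.conn = new_database()
        self.conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ann')")
        self.conn.execute(
            "INSERT INTO account (id, customer_id, balance) VALUES (10, 1, 100)")
        self.conn.execute(
            "INSERT INTO account (id, customer_id, balance) VALUES (11, 1, 50)")

    def tearDown(self):
        self.conn.close()

    def test_default_value(self):
        row = self.conn.execute("SELECT status FROM customer WHERE id = 1").fetchone()
        self.assertEqual(row[0], "active")

    def test_referential_integrity(self):
        # An account must not point at a customer that does not exist.
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute("INSERT INTO account (customer_id) VALUES (999)")

    def test_column_invariant(self):
        # The balance column must never go negative.
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute("UPDATE account SET balance = -1 WHERE id = 10")

    def test_trigger(self):
        self.conn.execute("UPDATE account SET balance = 70 WHERE id = 10")
        rows = self.conn.execute("SELECT * FROM account_audit").fetchall()
        self.assertEqual(rows, [(10, 100, 70)])

    def test_view_definition(self):
        rows = self.conn.execute("SELECT * FROM customer_balance").fetchall()
        self.assertEqual(rows, [(1, "Ann", 150)])


def run_suite():
    """Run the tests programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(InternalDatabaseTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Rebuilding the database in setUp keeps the tests independent of one another, which is what lets the suite serve as a repeatable regression check.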
At the interface level you need to test the incoming data values. This is arguably an application testing responsibility, although it is reasonable to expect data professionals to work with developers to ensure that the testing is both appropriate and sufficient. In some circumstances it is more appropriate to test at the database level, particularly when it comes to batch database loads and/or replication between databases. Outgoing data values from queries, views, and stored procedures must also be validated. Finally, you should have tests to verify any O/R mappings, even those implemented via meta-data approaches common to products such as Hibernate or TopLink.
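To make the interface-level idea concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the customer table, the Customer class, and the hand-written save, find, and load_batch functions are hypothetical stand-ins for whatever mapping layer or batch loader you actually use, and they are not the API of Hibernate, TopLink, or any other product.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Customer:
    """Hypothetical business object mapped to the customer table."""
    id: int
    name: str
    email: Optional[str]


def open_database() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " email TEXT)"
    )
    return conn


def save(conn: sqlite3.Connection, customer: Customer) -> None:
    """Object-to-row half of the mapping."""
    conn.execute(
        "INSERT INTO customer (id, name, email) VALUES (?, ?, ?)",
        (customer.id, customer.name, customer.email),
    )


def find(conn: sqlite3.Connection, customer_id: int) -> Optional[Customer]:
    """Row-to-object half of the mapping."""
    row = conn.execute(
        "SELECT id, name, email FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return Customer(*row) if row else None


def load_batch(conn: sqlite3.Connection, rows: List[tuple]) -> Tuple[int, List[tuple]]:
    """Load rows one at a time; return (number loaded, rows the database rejected)."""
    loaded, rejected = 0, []
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO customer (id, name, email) VALUES (?, ?, ?)", row)
            loaded += 1
        except sqlite3.IntegrityError:
            rejected.append(row)
    return loaded, rejected


def test_mapping_round_trip():
    # What goes in through the mapping should come back out unchanged,
    # including a NULL column mapped to None.
    conn = open_database()
    original = Customer(1, "Ann", None)
    save(conn, original)
    assert find(conn, 1) == original
    assert find(conn, 2) is None


def test_batch_load_rejects_bad_rows():
    # Incoming values: a missing name and a duplicate key must not get in.
    conn = open_database()
    batch = [(1, "Ann", "ann@example.com"), (2, None, None), (1, "Dup", None)]
    loaded, rejected = load_batch(conn, batch)
    assert loaded == 1
    assert rejected == [(2, None, None), (1, "Dup", None)]
    # Outgoing values: the query returns only the valid row.
    names = [r[0] for r in conn.execute("SELECT name FROM customer ORDER BY id")]
    assert names == ["Ann"]


if __name__ == "__main__":
    test_mapping_round_trip()
    test_batch_load_rejects_bad_rows()
```

The round-trip test exercises the mapping in both directions, while the batch test checks both what is allowed in and what comes back out.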
During development iterations the primary people responsible for doing database testing are application developers and DBAs. They will typically pair together, and because they are hopefully taking a TDD approach to development, the implication is that they'll be doing database unit testing on a continuous basis. During the release iteration, also known as the endgame iteration, your testers will be responsible for the final system testing efforts and therefore they will also be doing database testing.
Why is database regression testing important? Not only does it help to improve data quality, it also enables evolutionary database development techniques such as database refactoring. Evolutionary development methods, such as the Unified Process (UP), Extreme Programming (XP), and the Microsoft Solutions Framework (MSF), are swiftly becoming the norm within the IT community. Data professionals need to recognize this fact and begin to retool their skillset to remain relevant within today's IT environment.
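One way to picture how a regression test enables a database refactoring is the small Python sketch below. It assumes a hypothetical legacy table named cust that is renamed to customer, with a view left behind under the old name so that existing read queries keep working. The table names, the sample data, and the choice of refactoring are all invented for illustration, and the view covers reads only.

```python
import sqlite3


def create_legacy_schema() -> sqlite3.Connection:
    """Hypothetical starting point: a poorly named table with some data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cust (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO cust (id, name) VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
    return conn


def legacy_report(conn: sqlite3.Connection):
    """A query that existing applications are assumed to depend on."""
    return conn.execute("SELECT id, name FROM cust ORDER BY id").fetchall()


def rename_table_refactoring(conn: sqlite3.Connection) -> None:
    """Rename the table and keep the old name alive as a read-only view."""
    conn.execute("ALTER TABLE cust RENAME TO customer")
    conn.execute("CREATE VIEW cust AS SELECT id, name FROM customer")


def regression_check(conn: sqlite3.Connection) -> None:
    """The same check runs before and after the schema change."""
    assert legacy_report(conn) == [(1, "Ann"), (2, "Bob")]


def test_refactoring_preserves_behaviour():
    conn = create_legacy_schema()
    regression_check(conn)            # passes against the old schema
    rename_table_refactoring(conn)
    regression_check(conn)            # must still pass against the new schema
    # The new name is usable as well.
    assert conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0] == 2


if __name__ == "__main__":
    test_refactoring_preserves_behaviour()
```

Because the identical check passes on both sides of the change, you have some evidence that the refactoring improved the design without altering the behavior that existing applications rely on.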
Producing Open Source Software: How to Run a Successful Free Software Project
If you're thinking about starting up an open source project, or trying to save an existing one, then Producing Open Source Software is definitely for you. Karl Fogel describes what he's learned in his years of experience working on OSS projects: He not only worked on CVS and Emacs, he managed the Subversion project for CollabNet. The book covers critical project start-up issues, such as which categories of development tools you'll need, how to pick a host site, and how to choose between the various licensing strategies. It also provides advice for managing a project, including strategies for testing, bug fixing, and releasing.
More importantly, this book covers the people side of an OSS project, particularly how to motivate people to contribute, how to promote effective communication between people, and how to distribute responsibilities between individuals. I suspect that Fogel's discussion of "people issues" will prove valuable for both OSS and non-OSS people alike. This is an insightful book: If you're new to OSS development, or at least interested in it, then you will find it to be a fascinating read. If you're trying to manage a distributed team, OSS or not, then this book is truly a "must read".
Producing Open Source Software: How to Run a Successful Free Software Project
Karl Fogel
O'Reilly & Associates, 2005
http://www.amazon.com/exec/obidos/ASIN/0596007590/ambysoftinc/
Hot Links
- A Roadmap for Regression Testing of Relational Databases describes in detail how to go about database regression testing.
- Richard Dallaway's "Unit Testing Database Code" describes how to use DBUnit to successfully unit test a relational database.
- The process of database refactoring.
- The article "Evolutionary/Agile Database Best Practices" summarizes techniques, and provides links to detailed descriptions of them, that data professionals need to adopt to enable them to be effective members of modern development teams.
- The Agile Alliance is the best starting point for anyone interested in learning more about agile software development.
- Agile Models Distilled provides links to overviews of a wide variety of models.
- The principles of Agile Modeling v2.
- The practices of Agile Modeling v2.
- Check out the Agile Modeling mailing list.
- Get agile modeling training resources.