Scott reports on his recent Data Management survey. You'll be surprised by the results. Lots of companies have a data group, but much of the time, they aren't being used effectively.
October 05, 2006
URL:http://www.drdobbs.com/architecture-and-design/whence-data-management/193104893
Scott is a DDJ Senior Contributing Editor and author of numerous IT books. He can be contacted at www.ambysoft.com/scottAmbler.html.
Data management is a critical success factor in all IT organizations, yet as an industry, we really don't have good figures indicating how effective we are at it. In July 2006, DDJ ran a survey that explored the current state of data management within organizations. The results likely aren't a surprise to you: we're clearly facing some serious challenges. Because the first step in addressing a problem is recognizing that you've got it, I believe that this survey reveals some important deficiencies that organizations need to address.
In total, 60 percent of organizations had a data group, although that figure rises to 80 percent for organizations with 50 or more IT professionals. In organizations with a data group, two-thirds of respondents indicate that sometimes developers go around the data group and address data issues on their own. Why is this a problem? Because on their own, developers do a less-than-perfect job of database design. The "command-and-control" solution to this problem is to put processes and organizational structures in place to force developers to work with their data groups. Sounds great in theory, but it doesn't seem to be working well in practice. A more practical approach would be to recognize that if developers are doing database design, then we should help them gain those skills through training, mentoring, and pair development. Sadly, in the organizations with a data group and where developers sometimes choose to go around the data group, only 34.2 percent provide data-oriented training to developers.
Figure 1 summarizes the motivations of development teams that go around their data groups. The good news is that roughly 25 percent of the problem can be easily fixed through education of developers: 8 percent don't know that the data group exists and 17 percent don't know that they're supposed to be working with the data group. The bad news is that the other problems aren't so easy to address: 20 percent of developers find data professionals too difficult to work with (although to be fair, many data professionals find developers difficult to work with as well); 36 percent of developers believe that data groups are too slow to respond; and 19 percent believe that data groups offer little value to them.
One way to address these problems is to promote greater understanding between the groups: If developers understand the basics of data management, then they would likely recognize the value that data professionals have to offer and would more likely be able to find ways to work with them effectively. Similarly, if data professionals understood modern development techniques such as refactoring and agile modeling, and modern methodologies such as Extreme Programming (XP) and Open Unified Process (Open UP), then they would be in a better position to work in a more responsive manner. The survey indicated that many organizations can benefit from these strategies. Currently, only 36.4 percent provide data training to developers and 44.2 percent provide development training to data professionals.
It should be no surprise that 95.7 percent of organizations considered data to be a corporate asset (although it is surprising that 4.3 percent don't). If data is a corporate asset, doesn't it make sense that you have a test suite in place to validate it? Apparently not, because only 40.3 percent of respondents indicated that they do. Worse yet, of those organizations, only 63.3 percent let developers run this test suite whenever they needed to, hampering their ability to detect whether their development efforts would inject defects into the database. Of the organizations that didn't have a test suite, only 31.6 percent had discussed putting one in place, implying that 40.8 percent (59.7 percent * 68.4 percent) of organizations are seriously challenged with respect to ensuring data quality.
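The 40.8 percent figure is a product of two complements, which a few lines of Python can make explicit. This is only a restatement of the arithmetic quoted above; the variable names are my own and the inputs are the percentages from the survey text.

```python
# Figures quoted in the text, expressed as fractions.
has_test_suite = 0.403        # organizations with a data test suite
discussed_adding_one = 0.316  # of those without a suite, share that has discussed one

no_test_suite = 1 - has_test_suite          # 59.7 percent
never_discussed = 1 - discussed_adding_one  # 68.4 percent

# Share of all organizations with no suite and no discussion of getting one.
at_risk = no_test_suite * never_discussed

print(f"{at_risk:.1%}")  # prints 40.8%
```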
The problem just gets worse. 63.7 percent of respondents indicated that their organizations implement mission-critical functionality in the database, yet only 46 percent of those had a regression test suite in place. If functionality is mission critical, or even if it isn't for that matter, shouldn't you test it? Similar to data quality testing, only 66.3 percent of respondents work in organizations where developers can run this test suite whenever they need to, and in organizations without such a test suite, only 38.6 percent had discussed putting one in place.
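For readers who haven't seen one, a database regression test can be quite small. The sketch below is purely illustrative and not taken from the survey: it assumes a hypothetical account table whose rules (a non-negative balance, an audit trail written by a trigger) live in the database itself, and it uses Python's built-in sqlite3 module as a stand-in for a production database.

```python
import sqlite3

# Hypothetical schema: the business rules live in the database itself.
SCHEMA = """
CREATE TABLE account (
    id      INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
CREATE TABLE audit_log (
    account_id  INTEGER NOT NULL,
    old_balance INTEGER NOT NULL,
    new_balance INTEGER NOT NULL
);
CREATE TRIGGER log_balance_change
AFTER UPDATE OF balance ON account
BEGIN
    INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
END;
"""

def fresh_db():
    """Build a throwaway in-memory database with one known account."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
    return conn

def test_update_is_audited():
    conn = fresh_db()
    conn.execute("UPDATE account SET balance = 70 WHERE id = 1")
    rows = conn.execute(
        "SELECT account_id, old_balance, new_balance FROM audit_log"
    ).fetchall()
    assert rows == [(1, 100, 70)]

def test_overdraft_is_rejected():
    conn = fresh_db()
    try:
        conn.execute("UPDATE account SET balance = -1 WHERE id = 1")
    except sqlite3.IntegrityError:
        pass  # expected: the CHECK constraint refuses the update
    else:
        raise AssertionError("overdraft should violate the CHECK constraint")
    # The failed statement must leave the balance and the audit log untouched.
    assert conn.execute("SELECT balance FROM account").fetchone() == (100,)
    assert conn.execute("SELECT COUNT(*) FROM audit_log").fetchone() == (0,)

if __name__ == "__main__":
    test_update_is_audited()
    test_overdraft_is_rejected()
    print("all database regression tests passed")
```

Because each test builds its own in-memory database, developers could run a suite like this whenever they need to, without waiting on access to a shared environment.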
Although these numbers sound bad, and they are, I suspect that they're optimistic. The survey didn't distinguish between traditional regression testing where the majority of testing is done late in the lifecycle and the more agile test-driven development (TDD) approaches where testing is done throughout development on a continuous basis. A survey being run in September addresses this issue, and more, and will be summarized in early 2007.
61.9 percent of respondents indicate that their organizations have problems with their existing production data. Although this number is arguably low, very few data sources are perfect, and we can often live with minor data problems. However, considering that most organizations consider data to be a corporate asset, shouldn't we be doing something to fix it? As Figure 2 reveals, many organizations seem to be struggling with addressing legacy data problems.
Of the respondents working in organizations with data problems, 18 percent report that there is no strategy in place to address the problems and 33 percent have strategies not to make things worse. In my opinion, these two strategies will both eventually lead to failure: With developers commonly going around data groups and often doing a questionable job of database design as a result, and with business users using existing applications to do new things that weren't considered in the original data design, things are bound to get worse. 8 percent of organizations indicate that they intend to rewrite everything at once, a strategy that I suppose could work for smaller organizations. The good news is that 33 percent indicated that their organizations are taking an evolutionary approach to fixing data sources, which in my opinion is the most viable approach.
The most interesting aspect of the survey is that we asked people about the level of service provided by their data groups at both the beginning and the end of the survey. Figure 3 summarizes the results, with 1 being very poor and 5 being excellent. Although there are slight differences between the various positions (data professionals rated themselves slightly higher than everyone else did), the trends are identical: After being asked to think about their organization's approaches to database testing, training, and resolving production data problems, the satisfaction with the level of data group-provided service went down. Why did we ask the same question twice? To make it clear that when people step back and start thinking about some of the data-management challenges their organizations face, they find that they aren't being as well served by their data groups as they originally thought. In short, we all inherently seem to know that there is room for improvement in our approaches to data management. Everyone may not appreciate the results of this survey, but at least now we have a basis from which to start talking about the "data-management elephant" in the room.
The source data, the original questions, and a summary presentation for this survey will be downloadable from www.ambysoft.com/surveys/ the first week of October. As noted, after running this survey, we discovered that there were a few potential problems with the way the questions were worded, problems that were addressed in a second survey in September. That survey also looks into a few more critical issues within data management.
Who Responded
Some interesting statistics about the survey:
The number of IT professionals in the organization:
There are always biases in surveys, including this one. For example, 98 percent of respondents worked in North America and everyone subscribes to a well-respected magazine (DDJ). Roughly half were developers, although we still had significant numbers of non-developer respondents; interestingly, the trends were similar regardless of the position held by the respondent, so this might not be a significant problem. As I point out in the main article, the questions surrounding agile techniques such as database regression testing and database refactoring may not have been clear to the respondents because the topics are new to most people. In short, the numbers might not be academically perfect, but I suspect that the potential challenges revealed by this survey are worthy of serious consideration.
A New Vision for Data Management
The principles and practices of agile software development can and should be applied to data management. Specifically, your organization should:
The Agile Data site, www.agiledata.org, presents detailed descriptions of agile and evolutionary techniques for data professionals. The material is available, but you need to decide whether to take advantage of it.