Whence Data Management?

Scott reports on his recent Data Management survey. You'll be surprised by the results. Lots of companies have a data group, but much of the time, those groups aren't being used effectively.


October 05, 2006
URL:http://www.drdobbs.com/architecture-and-design/whence-data-management/193104893

Scott is a DDJ Senior Contributing Editor and author of numerous IT books. He can be contacted at www.ambysoft.com/scottAmbler.html.


Data management is a critical success factor in all IT organizations, yet as an industry, we really don't have good figures indicating how effective we are at it. In July 2006, DDJ ran a survey that explored the current state of data management within organizations. The results likely aren't a surprise to you—we're clearly facing some serious challenges. Because the first step in addressing a problem is recognizing that you've got it, I believe that this survey reveals some important deficiencies that organizations need to address.

We're Not a Big Happy Family

In total, 60 percent of organizations had a data group, although that figure rises to 80 percent for organizations with 50 or more IT professionals. In organizations with a data group, two-thirds of respondents indicate that sometimes developers go around the data group and address data issues on their own. Why is this a problem? Because on their own, developers do a less-than-perfect job of database design. The "command-and-control" solution to this problem is to put processes and organizational structures in place to force developers to work with their data groups. Sounds great in theory, but it doesn't seem to be working well in practice. A more practical approach would be to recognize that if developers are doing database design, then we should help them gain those skills through training, mentoring, and pair development. Sadly, in the organizations with a data group and where developers sometimes choose to go around the data group, only 34.2 percent provide data-oriented training to developers.

Figure 1 summarizes the motivations of development teams that go around their data groups. The good news is that roughly 25 percent of the problem can be easily fixed through education of developers: 8 percent don't know that the data group exists and 17 percent don't know that they're supposed to be working with the data group. The bad news is that the other problems aren't so easy to address: 20 percent of developers find data professionals too difficult to work with (although to be fair, many data professionals find developers difficult to work with as well); 36 percent of developers believe that data groups are too slow to respond; and 19 percent believe that data groups offer little value to them.

Figure 1: Why do development teams go around the data group?

One way to address these problems is to promote greater understanding between the groups: If developers understood the basics of data management, then they would likely recognize the value that data professionals have to offer and would more likely be able to find ways to work with them effectively. Similarly, if data professionals understood modern development techniques such as refactoring and agile modeling, and modern methodologies such as Extreme Programming (XP) and Open Unified Process (OpenUP), then they would be in a better position to work in a more responsive manner. The survey indicated that many organizations can benefit from these strategies. Currently, only 36.4 percent provide data training to developers and 44.2 percent provide development training to data professionals.

Quality Concerns

It should be no surprise that 95.7 percent of organizations considered data to be a corporate asset (although it is surprising that 4.3 percent don't). If data is a corporate asset, doesn't it make sense that you have a test suite in place to validate it? Apparently not, because only 40.3 percent of respondents indicated that they do. Worse yet, of those organizations, only 63.3 percent let developers run this test suite whenever they needed to, hampering their ability to detect whether their development efforts would inject defects into the database. Of the organizations that didn't have a test suite, only 31.6 percent had discussed putting one in place, implying that 40.8 percent (59.7 percent * 68.4 percent) of organizations are seriously challenged with respect to ensuring data quality.
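
The 40.8 percent figure above is a product of two survey percentages, and the arithmetic can be checked with a short sketch. The function and parameter names here are illustrative assumptions, not part of the survey materials; only the two input percentages come from the text.

```python
def share_without_suite_or_plan(has_suite: float, discussed_one: float) -> float:
    """Fraction of all organizations that have no data test suite
    and have not even discussed putting one in place.

    has_suite     -- fraction of organizations with a test suite
    discussed_one -- among organizations WITHOUT a suite, the fraction
                     that has discussed adding one
    """
    no_suite = 1 - has_suite                 # 59.7 percent in the survey
    not_discussed = 1 - discussed_one        # 68.4 percent of that group
    return no_suite * not_discussed


# Survey figures quoted in the text: 40.3 percent and 31.6 percent.
share = share_without_suite_or_plan(0.403, 0.316)
print(f"{share:.1%}")  # prints 40.8%
```

Note that the second percentage is conditional on the first: it applies only to the organizations without a test suite, which is why the two shares are multiplied rather than added.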

The problem just gets worse. 63.7 percent of respondents indicated that their organizations implement mission-critical functionality in the database, yet only 46 percent of those had a regression test suite in place. If functionality is mission critical, or even if it isn't for that matter, shouldn't you test it? Similar to data quality testing, only 66.3 percent of respondents work in organizations where developers can run this test suite whenever they need to, and in organizations without such a test suite, only 38.6 percent had discussed putting one in place.

Although these numbers sound bad, and they are, I suspect that they're optimistic. The survey didn't distinguish between traditional regression testing where the majority of testing is done late in the lifecycle and the more agile test-driven development (TDD) approaches where testing is done throughout development on a continuous basis. A survey being run in September addresses this issue, and more, and will be summarized in early 2007.

Have We Given Up?

61.9 percent of respondents indicate that their organizations have problems with their existing production data. This number is arguably low, given that very few data sources are perfect, but we can often live with minor data problems. However, considering that most organizations consider data to be a corporate asset, shouldn't we be doing something to fix it? As Figure 2 reveals, many organizations seem to be struggling with addressing legacy data problems.

Figure 2: Strategies for addressing production data problems.

Of the respondents working in organizations with data problems, 18 percent report that there is no strategy in place to address the problems and 33 percent have strategies not to make things worse. In my opinion, these two strategies will both eventually lead to failure: With developers commonly going around data groups and often doing a questionable job of database design as a result, and with business users using existing applications to do new things that weren't considered in the original data design, things are bound to get worse. 8 percent of organizations indicate that they intend to rewrite everything at once, a strategy that I suppose could work for smaller organizations. The good news is that 33 percent indicated that their organizations are taking an evolutionary approach to fixing data sources, which in my opinion is the most viable approach.

We Need to Improve

The most interesting aspect of the survey is that we asked people about the level of service provided by their data groups at both the beginning and the end of the survey. Figure 3 summarizes the results, with 1 being very poor and 5 being excellent. Although there are slight differences between the various positions (data professionals rated themselves slightly higher than everyone else did), the trends are identical: After being asked to think about their organization's approaches to database testing, training, and resolving production data problems, the satisfaction with the level of data group-provided service went down. Why did we ask the same question twice? To make it clear that when people step back and start thinking about some of the data-management challenges their organizations face, they realize that they aren't being as well served by their data groups as they originally thought. In short, we all inherently seem to know that there is room for improvement in our approaches to data management. Everyone may not appreciate the results of this survey, but at least now we have a basis from which to start talking about the "data-management elephant" in the room.

Figure 3: What do you think of your data group?

The source data, the original questions, and a summary presentation for this survey will be downloadable from www.ambysoft.com/surveys/ the first week of October. As noted, after running this survey, we discovered that there were a few potential problems with the way the questions were worded, problems that were addressed in a second survey in September. That survey also looks into a few more critical issues within data management.








Who Responded

Some interesting statistics about the survey:

  • The survey was sent out to 28,149 people on the DDJ mailing list.
  • There were 1176 respondents:
      • 618 developers
      • 188 IT management
      • 133 project managers
      • 98 data professionals
      • 139 others

The number of IT professionals in the organization:

  • 29% had 10 or fewer
  • 22% had 11 to 50
  • 11% had 51 to 100
  • 24% had 101 to 500
  • 23% had 501 or more

78% of respondents indicated that they worked in the private sector.

There are always biases in surveys, including this one. For example, 98 percent of respondents worked in North America and everyone subscribes to a well-respected magazine (DDJ). Roughly half were developers, although we still had significant numbers of non-developer respondents; interestingly, the trends were similar regardless of the position held by the respondent, so this might not be a significant problem. As I point out in the main article, the questions surrounding agile techniques such as database regression testing and database refactoring may not have been clear to the respondents because the topics are new to most people. In short, the numbers might not be academically perfect, but I suspect that the potential challenges revealed by this survey are worthy of serious consideration.









A New Vision for Data Management

The principles and practices of agile software development can and should be applied to data management. Specifically, your organization should:

  • Accept the situation. Does your organization have a comprehensive strategy for database regression testing? Does it have a viable strategy for resolving existing production data problems? Does it have a consistent and effective approach to supporting data management issues on development projects? If the answer to any of these questions is no, then you have some work to do.
  • Prefer collaboration and communication over command and control. The hardest aspect of IT isn't the technology, it's the people. If members of your IT department aren't working together effectively, then it is very unlikely that the department as a whole will be effective. You can't command people to work together effectively, but you can adopt ways to streamline communication and enable collaboration.
  • Improve training, education, and mentoring. People who are highly specialized struggle to interact with others who are highly specialized because there is no common ground between them. The agile community has rediscovered that generalizing specialists (www.agilemodeling.com/essays/generalizingSpecialists.htm) with one or more specialties and a general knowledge of software development and the business domain are much more effective. Smart organizations provide their staff with opportunities to expand their skillsets.
  • Do continuous database regression testing. If data truly is a corporate asset, and if you're implementing mission-critical functionality within the database, you need to adequately test your database. Next month I cover this topic in detail.
  • Adopt evolutionary techniques. Developers work in an evolutionary (iterative and incremental) manner, therefore so must data professionals if they are to be responsive to developers' needs. Techniques such as database refactoring, database regression testing, and agile data modeling exist, which enable data professionals to work in an evolutionary manner.
  • Adopt a viable strategy to address production problems. The best way to address legacy database problems is to refactor your schema safely over time.
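
To make the database regression testing recommendation above concrete, here is a minimal sketch of what such a test suite could look like. It uses Python's standard sqlite3 and unittest modules against an in-memory database. The customer table, its columns, and the "email must contain @" rule are all hypothetical examples chosen for illustration; they do not come from the survey or from any particular organization's schema.

```python
import sqlite3
import unittest


def create_schema(conn):
    """Create a hypothetical customer table with rules enforced in the database."""
    conn.execute(
        """
        CREATE TABLE customer (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL UNIQUE,
            balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0)
        )
        """
    )


def find_data_defects(conn):
    """Return ids of rows that break an example data-quality rule:
    every email value must contain an '@' character."""
    rows = conn.execute(
        "SELECT id FROM customer WHERE email NOT LIKE '%@%' ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]


class CustomerRegressionTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test keeps the tests independent.
        self.conn = sqlite3.connect(":memory:")
        create_schema(self.conn)

    def tearDown(self):
        self.conn.close()

    def insert(self, email, balance_cents):
        self.conn.execute(
            "INSERT INTO customer (email, balance_cents) VALUES (?, ?)",
            (email, balance_cents),
        )

    def test_negative_balance_is_rejected(self):
        # Functionality implemented in the database: the CHECK constraint.
        with self.assertRaises(sqlite3.IntegrityError):
            self.insert("a@example.com", -1)

    def test_duplicate_email_is_rejected(self):
        self.insert("a@example.com", 0)
        with self.assertRaises(sqlite3.IntegrityError):
            self.insert("a@example.com", 100)

    def test_data_quality_rule_flags_bad_rows(self):
        # Data validation: the rule should flag only the malformed row.
        self.insert("good@example.com", 0)
        self.insert("not-an-email", 0)
        self.assertEqual(find_data_defects(self.conn), [2])
```

Saved to a file, the suite can be run with `python -m unittest`, which is cheap enough for developers to do whenever they change the schema. A real suite would run against a copy of the production schema rather than a toy table, but the shape is the same: tests for behavior implemented in the database, plus tests that validate the data itself.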

The Agile Data site, www.agiledata.org, presents detailed descriptions of agile and evolutionary techniques for data professionals. The material is available, but you need to decide whether to take advantage of it.
