One of the realities of building business applications with object-oriented technology is that the only way to persist objects is typically with a relational database. Because of the so-called "object-relational impedance mismatch" (that is, the poor mapping of object technology, which uses memory location for object identity, onto relational databases, which use primary keys for object identity), the two have a rocky working relationship. The technical challenges, difficult enough to overcome on their own, are compounded by the object-data "divide": the politics between the object community and the data community. This month, let's explore the roots of this discord. Next month, I will present proven software processes to successfully marry object and relational technology within your organization.
Defining the Divide
The object-data divide refers specifically to the difficulties object-oriented and data-oriented developers experience when working together, and generally to the dysfunctional politics that occur between the two communities in the industry at large. Symptoms of the object-data divide include object developers who claim relational technology either shouldn't or can't be used to store objects, and data professionals (ranging from data modelers to database administrators) who claim that object/component models should be driven by their data models. As with most prejudices, neither of these beliefs is based on fact: thousands of organizations are successfully using relational databases to store objects, and data models are generally perceived as too narrowly focused to be used as a reliable foundation for object-oriented models.
What are the origins of the object-data divide? Consider the history of object technology. First introduced in the late 1960s and early '70s, with the Swedish simulation language SIMULA and the Xerox Corporation Palo Alto Research Center's Smalltalk-72, object-oriented programming made inroads into business application development about 10 years ago with the availability of object-oriented extensions for popular compiled languages such as C and Pascal.
Of course, spectacular hype surrounded objects at the start: "Everything is an object. Objects are easier to understand and use. Object technology is all you'll ever need." In time, these claims were seen for what they were: wishful thinking. Unfortunately, one claim did serious damage: the idea that object-oriented databases would quickly eclipse relational databases. Several influential studies were behind the trend, stating that object and structured techniques didn't mix well in practice. Some of the research was presented at the 9th Washington Ada Symposium on Empowering Software Users and Developers by Brad Balfour ("Object-Oriented Requirements Analysis vs. Structured Analysis") and Kent A. Johnson ("Structured Analysis as a Front-End for Object-Oriented Design"), and a number of panel presentations were equally persuasive ("Structured Analysis and Object Oriented Analysis," Proceedings of Object-Oriented Programming: Systems, Languages, and Applications (OOPSLA) 1990, ACM Press; and "Can Structured Methods Be Objectified?" OOPSLA 1991 Conference Proceedings).
In addition, a decade ago the data community was coming into its own. Already important players in the traditional mainframe world, data modelers were equally critical in the two-tier client/server world (then the dominant technology for new application development). Development in both of these worlds worked similarly; the data folks would develop the data schema, and the application folks would write their program code. This worked because there wasn't a great deal of conceptual overlap between the two tasks: data models illustrated the data entities and their relationships, while application and process models revealed how the application worked with the data. Data professionals believed very little had changed in their world. Then object technology came along. Some data professionals quickly recognized that the object paradigm was a completely new way to develop software. I was among them and joined the growing object crowd. Unfortunately, many data professionals did not, believing that the object paradigm was either a doomed fad or merely another programming technology in an expanding field.
In the end, both communities got it wrong. To the dismay of object purists, objectbases never proved to be more than a niche technology, whereas relational databases have become the de facto standard for storing data. Furthermore, while the aforementioned studies demonstrated that structured models should not be used for object-oriented languages such as C++ or Smalltalk and that object models should not be used for structured languages such as COBOL or BASIC, these studies didn't address the idea of melding object and structured modeling techniques, a reasonable approach when building a business application such as a customer service information system. In fact, practice has shown that mapping objects to relational databases is reasonably straightforward (see Chapter 10 of my book Building Object Applications That Work, Cambridge University Press, 1998).
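To make the point about mapping concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is an illustration only, not the approach from the book cited above: the Customer class, the customer table, and the save, load and demo helpers are all hypothetical names invented for this example. It shows one object mapped to one row, with the primary key carried on the object so that the row's identity survives even though each load produces a new object in memory.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    # Hypothetical business object used only for this illustration.
    name: str
    customer_id: Optional[int] = None  # primary key, set by the database on first save


def save(conn, customer):
    """Insert a new row, or update the existing row for this customer."""
    if customer.customer_id is None:
        cur = conn.execute("INSERT INTO customer (name) VALUES (?)", (customer.name,))
        customer.customer_id = cur.lastrowid  # relational identity now lives on the object
    else:
        conn.execute(
            "UPDATE customer SET name = ? WHERE customer_id = ?",
            (customer.name, customer.customer_id),
        )
    return customer


def load(conn, customer_id):
    """Build a fresh Customer object from the row with the given primary key."""
    row = conn.execute(
        "SELECT customer_id, name FROM customer WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    if row is None:
        return None
    return Customer(name=row[1], customer_id=row[0])


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    original = save(conn, Customer("Ada"))
    copy = load(conn, original.customer_id)
    # Same row (same primary key), but two distinct objects in memory.
    return original, copy
```

Calling demo() returns two Customer objects that share a primary key yet are different objects in memory, which is the identity mismatch in miniature. A real persistence layer would also need to handle relationships, inheritance and transactions, none of which this sketch attempts.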
To the dismay of data professionals, object modeling techniques, particularly those contained in the Unified Modeling Language, are significantly more robust than data modeling techniques, and are arguably a superset of data modeling (see Robert J. Muller's Database Design for Smarties: Using UML for Data Modeling, Morgan Kaufmann Publishers, 1999). Object-oriented techniques are based on the concept that data and behavior should be encapsulated together and that the issues surrounding both must be taken into consideration in order to model software that is extensible, scalable and flexible. On the other hand, data-oriented techniques are based on the concept that data and process should be considered separately and hence focus on data alone.
The object approach has superseded the data approach. In fact, there was such a significant conceptual overlap that many data modelers mistakenly believed that class diagrams were merely data models with operations added in. What they didn't recognize is that the complexity of modeling behavior requires more than class diagrams (hence the wealth of models defined by the UML), and that their focus on data alone was too narrow for the needs of modern application development. Object techniques worked well in practice; not only is object technology not a fad, it has become the dominant development platform. The status quo has changed so much that most modern development methodologies (to their detriment) devote little more than a few pages to data modeling.
The object-data divide has a number of deleterious effects. First, it adds to the many factors preventing IT departments from producing software on time and on budget. Second, if object modelers and data modelers cannot work together, object schema and data schema may be mismatched. The greater the discord in your schemas, the more code you will need to write, test and maintain; such code is likely to run slower than the simple code needed for coordinated schemas. Third, object models (UML class diagrams, sequence diagrams, collaboration diagrams, and so on) that are driven by your existing data design are effectively hack jobs. To repeat a truism, your requirements drive the development of your software models in software engineering; you don't start at design. Fourth, the political strife associated with the object-data divide typically increases staff turnover.
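The second effect, extra code caused by mismatched schemas, can be illustrated with a small Python sketch. Everything in it is assumed for the sake of the example: the Customer and Address classes, the legacy column names (CUST_FNAME, CUST_LNAME, ADDR_LINE1, ADDR_CITY) and the two translation functions are hypothetical. The object model holds a single full name and a separate address object, while the data schema stores split name columns and flat address columns, so glue code is needed in both directions.

```python
from dataclasses import dataclass


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Customer:
    full_name: str
    address: Address


def row_to_customer(row):
    """Translate a flat row (a dict keyed by column name) into the object model."""
    # The data schema splits the name; the object schema does not.
    full_name = f"{row['CUST_FNAME'].strip()} {row['CUST_LNAME'].strip()}"
    # The data schema flattens the address; the object schema nests it.
    address = Address(street=row["ADDR_LINE1"].strip(), city=row["ADDR_CITY"].strip())
    return Customer(full_name=full_name, address=address)


def customer_to_row(customer):
    """Translate the object model back into the flat row layout."""
    # Assumption: the first space separates first and last name.
    # This is lossy for multi-part first names, a typical cost of mismatched schemas.
    first, _, last = customer.full_name.partition(" ")
    return {
        "CUST_FNAME": first,
        "CUST_LNAME": last,
        "ADDR_LINE1": customer.address.street,
        "ADDR_CITY": customer.address.city,
    }
```

Had the two schemas been designed together, for example with the same name fields on both sides, most of this translation code (and the tests and maintenance that go with it) would not be needed.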
How can your organization bridge this divide? The first step is acknowledgement. Both your object and data groups need to recognize the problem; if one group sees it but the other doesn't (or refuses to), then you've got trouble. My experience is that object professionals are more perceptive of the object-data divide because relational database technology has significantly more market share and mind share than objectbase technology. Data professionals are often reluctant to acknowledge this issue; they have been in a position of power within many organizations for over a generation and are seldom motivated to consider new approaches that might reduce their standing. In fact, it's quite common for senior data professionals to insist that their data models should be used to drive the development of object-oriented models, even though they have little or no experience with object techniques and often cannot even describe what those techniques are or how to apply them. The data community must accept that post-year 2000 techniques and technologies such as modeling, patterns, Enterprise JavaBeans (EJB), Java and enterprise application integration (EAI) require novel approaches to development. The second step is to distinguish between the three flavors of development: new operational development, new reporting development and legacy migration/integration.
New operational development focuses on the creation of the online and transactional aspects of new applications that support the evolving needs of users, and Java, C++, Visual Basic, HTML, EJB and relational databases are common implementation technologies for this purpose. New reporting development, on the other hand, focuses on the creation of the reports that output massaged and/or summarized views of the data produced by operational systems. Data warehousing, data mart and reporting tools are common technologies used to fulfill this need. Finally, legacy migration or integration uses EAI technologies to present a common and consistent view to legacy and commercial, off-the-shelf software.
Each of the three flavors of development, because of its different area of focus, requires a unique software process. Although I will describe each process in detail next month, I'll leave you with this teaser: New operational development (including the definition of the operational data schema) should be the responsibility of your organization's object or component modelers. New reporting development and legacy migration or integration should be the main responsibility of your data professionals, who collaborate with the object modelers to identify user requirements and prioritize migration or integration efforts.
Of course, each community must recognize that the other has an important role to play in the success of their overall software efforts. Teamwork, not politics, will allow you to build a bridge across the object-data divide.