Ken North

Dr. Dobb's Bloggers

Future of the Web and the Cloud: Data Sharing or Data Silos?

December 08, 2009

Missing from many discussions about the Internet, enterprise computing, social networks and cloud computing is the benefit for which the 'data bank' and 'data base' were conceived: to facilitate data sharing, consistency and data integrity. As a result, we find ourselves working through data quality issues and unnecessary integration problems made more complex by data silos. Complicating the situation are those reactionary developers whose enthusiasm for a database sea causes them to turn a blind eye to productivity losses from data silos, including the effort and expense to reconcile and integrate data.

Fifty years ago the typical approach to software development was to build disjoint applications that created ad hoc data stores. Eventually organizations recognized the importance of data sharing by integrated information systems. When the idea of data as an asset gained traction, computer scientists pursued technology for maintaining data hubs and data banks, such as E. F. Codd's "A Relational Model for Large Shared Data Banks". Today we have the Protein Data Bank, National Practitioner Data Bank, Cosmic Data Bank, NEA Data Bank, various national and state DNA data banks and other manifestations of data as a shared asset. But we also have the information soup of the World Wide Web, which Jim Gray called the world's largest database.

Data Base

The early data processing paradigm, marked by much redundancy and duplication and little information sharing, evolved into the notion of building applications over a unified and integrated data base. The data base would serve as a cornucopia of facts which would beget useful applications.

A fundamental reason for operating with a data base was to have one place in an organization's information structure for storing a fact that was accessible by multiple applications. This meant data could be an asset shared by an entire organization instead of being locked up in a data store accessible to a single application. A second benefit was efficiency; a database management system (DBMS) represented common logic that did not have to be rewritten for each new application program.
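The idea of storing a fact once and sharing it across applications can be sketched in a few lines. This is only an illustration, not anything from the systems of that era: it uses Python's standard sqlite3 module, and the customer table and the billing, shipping and relocation functions are hypothetical names invented for the example.

```python
import sqlite3


def open_shared_store() -> sqlite3.Connection:
    """Create the single shared data base (in memory, for illustration)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: each customer fact is stored exactly once.
    conn.execute(
        "CREATE TABLE customer ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO customer (id, name, city) VALUES (1, 'Acme Corp', 'Boston')"
    )
    conn.commit()
    return conn


def billing_address(conn: sqlite3.Connection, customer_id: int) -> str:
    """Hypothetical 'billing application': reads the shared customer fact."""
    row = conn.execute(
        "SELECT name, city FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return f"{row[0]}, {row[1]}"


def shipping_city(conn: sqlite3.Connection, customer_id: int) -> str:
    """Hypothetical 'shipping application': reads the same fact, not a copy."""
    row = conn.execute(
        "SELECT city FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0]


def relocate_customer(conn: sqlite3.Connection, customer_id: int, new_city: str) -> None:
    """Update the fact in its one place; every application sees the change."""
    conn.execute(
        "UPDATE customer SET city = ? WHERE id = ?", (new_city, customer_id)
    )
    conn.commit()
```

Because both hypothetical applications query the same table through the DBMS, a single call to relocate_customer changes what each of them reports. With per-application data stores, the same change would have to be made, and later reconciled, in every copy.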

Organizations such as IBM, GE, MITRE, SDC and TRW pushed the early development of database technology, along with pioneers such as Charles Bachman, Dwight Buetell, Don Nelson, Ted Olle, Dick Pick and John A. Postley. The consortium that published the 1960 COBOL standard, CODASYL, published the first database standard in 1971. The seminal work of David L. Childs and E. F. Codd, respectively, on set-theoretic data structures and the relational model, leveraged Georg Cantor's set theory to provide a foundation for today's SQL databases.

Database

By the 1980s software technology had advanced to the point where there was emphasis, even on small projects, on reusing code and defining formats for shared data. Development with CASE tools, modeling software and data dictionaries, such as Digital's Common Data Dictionary, was commonplace. The database management system (DBMS) had become mainstream technology. By the 1990s, companies such as IBM, Oracle, Sybase, Ingres, Informix and Microsoft were competing in the multi-billion dollar market for database software. SQL databases became a favored solution for enterprise applications, including mission-critical applications. And the original notion of a data base as the single repository of facts for an organization had morphed into the database, a container for data and logic that was managed by a DBMS. Organizations typically had disparate databases at the workgroup, department and enterprise level.

Data Warehouse

Because of competition and other influences, some organizations undertook the creation of data warehouses. This was in some measure a return to the original concept of having a single authoritative source for facts. The size increases for data warehouses have been dramatic. In 1995, Wal-Mart's 7.5 terabyte data warehouse was among the world's largest. In less than five years it grew to 24 terabytes and today there are petabyte-sized data warehouses. eBay's data warehouse is 5 petabytes and Wal-Mart's has grown to 2.5 petabytes. Although a data warehouse provides the base for doing analytical queries and business intelligence processing, it does not represent that 1960s goal of a single unified data base that can sustain an organization's operational and management information systems.

Data Silos

The growth of networks, distributed databases, object databases, and embedded databases has moved us further away from the goal of an integrated, unified data base. Next we'll look at more of the data silo issues, and remediation with enterprise data models, data integration, federated data and linked data.

 
