Discovering Relationships in Context

Joe is a database consultant and author of Trees & Hierarchies in SQL. He can be contacted at [email protected]

Relational databases hold a model of the world that assumes you already know the basic relationships among the entities in your problem space. Their purpose is to maintain business rules and access data in a known format. This is a good assumption in a production environment. Actually, it is a necessary assumption. You want to produce known and well-defined transactions and reports in online transaction processing (OLTP) environments. A constantly changing schema would be a model of a world where elephants can drop out of the sky. Even online analytical processing (OLAP) deals with known relationships, usually summaries taken from transactional systems and outside data sources.

A relational database management system (RDBMS) is a deductive system: it derives sets of entities from the data already in the model. If you find a new relationship, you have to add tables and new constraints to your schema to model it. Likewise, if you drop a table from your schema, you have to see how that changes the whole model. This is the nature of a deductive approach to data.

An RDBMS is very useful, but there are problems in the world that cannot be solved with deductive methods. We need inductive tools that let us add relationships, rather than data, to a model. What kind of data requires inductive reasoning? Imagine you are a cop on the "CSI" television show. All you have is a collection of odd facts that do not fall into nice neat relational tables. These facts tie various data elements together in various ways. You have 60 minutes to find a network of associations to connect the bad guys to the crime in some as-yet-unknown manner. And a new fact introduced after the last commercial break can change the outcome in the last five minutes.

In any of the shows, the detectives go to a marker board and start drawing "fishbone" diagrams (also known as Ishikawa diagrams) and other general directed graphs. Events are tied together until we have a path from the perpetrators to the crime. The whole point of the show is finding that path by hard work, some shooting, and clever thinking. It's good drama, but in the real world, it can be slow and we don't have scriptwriters to guarantee outcomes.

What we need is a tool to manipulate a general directed graph with various relationships between simple entities. Ideally, we would like this tool to be declarative and self-optimizing, like SQL engines.

Graph theory, the branch of mathematics that deals with nodes and edges, is one of the most powerful mathematical tools we have because it is so general. A road map's nodes are cities and the edges are roads. A family tree's nodes are people and the edges are blood relations. A circuit diagram's nodes are electronic components and the edges are the wires between them. Graphs are so general that you've used them all your life without thinking of them in a formal sense.
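To make the node-and-edge idea concrete, here is a minimal Python sketch of a directed graph with labeled relationships and a breadth-first search for a path between two nodes. The facts, the relationship labels, and the names `build_graph` and `find_path` are all hypothetical, invented for illustration; they do not come from the source code that accompanies this article.

```python
from collections import deque

# Hypothetical facts: (source node, relationship label, target node).
EDGES = [
    ("suspect", "owns", "car"),
    ("car", "seen_at", "warehouse"),
    ("warehouse", "scene_of", "crime"),
    ("suspect", "knows", "witness"),
]


def build_graph(edges):
    """Return an adjacency list mapping each node to (label, neighbor) pairs."""
    graph = {}
    for source, label, target in edges:
        graph.setdefault(source, []).append((label, target))
        graph.setdefault(target, [])  # make sure sink nodes appear as keys
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for a path with the fewest edges.

    Returns a list of (source, label, target) steps, or None if the goal
    cannot be reached from the start.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, label, neighbor)]))
    return None


GRAPH = build_graph(EDGES)
PATH = find_path(GRAPH, "suspect", "crime")
```

Adding one more fact, say a hypothetical `("witness", "saw", "crime")` edge, and rebuilding the graph gives a shorter path. That is the "new fact after the last commercial break" effect in miniature: the data did not change, but a new relationship did.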

Because I'm a big fan of SQL, my first approach to any problem is to write a schema and some queries. I have tried to do general graphs in SQL, and my conclusion is that it is possible, but not practical, for any data set of a realistic size.
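For a sense of what a general graph in SQL might look like, the sketch below uses Python's standard `sqlite3` module with an in-memory database. It is only one plausible encoding, not the schema I actually tried: the `Nodes` and `Edges` tables, their columns, the sample rows, and the `reachable` helper are assumptions made for illustration. It needs an SQLite build with recursive common table expressions (3.8.3 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Nodes (node_name TEXT NOT NULL PRIMARY KEY);
CREATE TABLE Edges (
    from_node TEXT NOT NULL REFERENCES Nodes (node_name),
    to_node   TEXT NOT NULL REFERENCES Nodes (node_name),
    relation  TEXT NOT NULL,
    PRIMARY KEY (from_node, to_node, relation)
);
INSERT INTO Nodes VALUES ('suspect'), ('car'), ('warehouse'), ('crime');
INSERT INTO Edges VALUES
    ('suspect', 'car', 'owns'),
    ('car', 'warehouse', 'seen_at'),
    ('warehouse', 'crime', 'scene_of');
""")

# Walk the edges outward from a starting node, one hop per recursion step.
# The depth limit stops the recursion if the graph contains a cycle.
REACHABLE_SQL = """
WITH RECURSIVE Reachable (node_name, depth) AS (
    SELECT ?, 0
    UNION
    SELECT E.to_node, R.depth + 1
      FROM Reachable AS R
      JOIN Edges AS E ON E.from_node = R.node_name
     WHERE R.depth < ?
)
SELECT node_name, MIN(depth) AS hops
  FROM Reachable
 GROUP BY node_name
 ORDER BY hops, node_name;
"""


def reachable(start, max_depth=10):
    """Return (node_name, hops) rows for every node reachable from start."""
    return conn.execute(REACHABLE_SQL, (start, max_depth)).fetchall()


ROWS = reachable("suspect")
```

Even in this toy form, every extra hop is another join against `Edges`, and the query has to carry an explicit depth limit to survive cycles. That gives some feel for why the approach gets expensive as the graph grows.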
