Arnon Rotem-Gal-Oz

Dr. Dobb's Bloggers

DBMS Future?

October 04, 2008

Earlier this week I read a post here called "DBMS Past, Present and Future". I thought it would be appropriate to (re)introduce an alternate future (which is already happening) for RDBMS use. The post below is actually a repost of something I wrote last year in my old DDJ blog, i.e. pre Dobbs Code Talk (with apologies to those of you who already read it back then).

The title I used then was "The RDBMS Is Dead".

Okay, now that I have your attention -- the RDBMS isn't dead yet, but we can see a whole class of applications (maybe a couple of classes) where the importance of the RDBMS as we know it today is greatly diminished.

In an article I posted recently on InfoQ, (which I also mentioned in the post on eBay architecture last week), I discussed the notion of database denormalization on Internet-scale sites (such as Amazon, eBay, Flickr, etc.). One point of denormalization is immutable data where there isn't a lot of gain in normalization to begin with.

The other thing is entity representation vs. speed. The problem is that joins are slow, and sometimes you hit corners where, if you want any kind of decent speed, you need to denormalize. Todd Hoff notes that as well:

The problem is joins are relatively slow, especially over very large data sets, and if they are slow your website is slow. It takes a long time to get all those separate bits of information off disk and put them all together again. Flickr decided to denormalize because it took 13 Selects to each Insert, Delete or Update.
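To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (users, photos, photos_denorm, owner_name) are made up for illustration and are not taken from Flickr or any other site mentioned here. The normalized read needs a join; the denormalized read is a single-table lookup, at the price of duplicating the owner's name into every photo row.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE photos (id INTEGER PRIMARY KEY, owner_id INTEGER, title TEXT);
    -- Denormalized copy: the owner's name is duplicated into each photo row.
    CREATE TABLE photos_denorm (id INTEGER PRIMARY KEY, owner_name TEXT, title TEXT);
""")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO photos VALUES (10, 1, 'sunset')")
conn.execute("INSERT INTO photos_denorm VALUES (10, 'alice', 'sunset')")


def photo_normalized(photo_id):
    """Assemble the view with a join across two tables."""
    return conn.execute(
        "SELECT p.title, u.name FROM photos p "
        "JOIN users u ON u.id = p.owner_id WHERE p.id = ?",
        (photo_id,),
    ).fetchone()


def photo_denormalized(photo_id):
    """Read the same view from a single row, with no join."""
    return conn.execute(
        "SELECT title, owner_name FROM photos_denorm WHERE id = ?",
        (photo_id,),
    ).fetchone()
```

Both functions return the same tuple for photo 10. The cost of the second form shows up on writes: renaming a user now means touching every one of that user's rows in photos_denorm.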

The point is, however, that these "corner cases" are becoming more and more prevalent even in smaller-scale applications -- especially when you have complex entities (as is the case with defense systems, for example). Mats Helander recently wrote a post about saving to a Blob and only adding fields as needed for indexing and identity purposes. Mats also suggests the semi-transparent way of using XML columns, where the database can do something with the otherwise opaque data.
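A rough sketch of the save-to-Blob idea, again with sqlite3. This is my own illustration, not Mats's code: the entities table, the kind column, and the JSON encoding are all assumptions. The whole entity goes into one opaque column, and only the fields needed for identity and lookup are promoted to real, indexed columns.

```python
import json
import sqlite3

# Hypothetical table: one opaque payload plus the few fields we query on.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entities (
        id      TEXT PRIMARY KEY,   -- identity
        kind    TEXT NOT NULL,      -- promoted field, used for lookups
        payload BLOB NOT NULL       -- the whole entity, opaque to the database
    )
""")
conn.execute("CREATE INDEX idx_entities_kind ON entities (kind)")


def save(entity):
    """Serialize the entity and store it next to its indexed fields."""
    payload = json.dumps(entity).encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO entities VALUES (?, ?, ?)",
        (entity["id"], entity["kind"], payload),
    )


def find_by_kind(kind):
    """Look up by the indexed column, then deserialize the payloads."""
    rows = conn.execute(
        "SELECT payload FROM entities WHERE kind = ? ORDER BY id", (kind,)
    )
    return [json.loads(row[0].decode("utf-8")) for row in rows]
```

Adding a new attribute to an entity requires no schema change; only a new query pattern does, because that is when another field has to be promoted to a column.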

This point, in fact, demonstrates that the future of relational data is indeed not totally secure, as we see that the leading databases are beginning to treat XML data (which is hierarchical and not relational) as a native citizen -- to the point that we can even index XML data.

So far we've seen a trend to denormalize more and to handle non-relational data. What else? Ah, transactions.

I've worked on several systems where the data was constantly updated and actually gave the system's representation of the world outside the system; there, the focus was on availability and latency. This is again aligned with the approach taken by the large Internet sites, which emphasize eventual consistency over immediate consistency.
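As one illustration of what eventual consistency can look like, here is a small last-writer-wins sketch. It is an assumption of mine, not a description of how any of the sites above work: the Replica class, its methods, and the use of caller-supplied timestamps are all hypothetical. Each replica accepts writes locally without coordinating, so it stays available, and the replicas converge only when they later exchange state.

```python
class Replica:
    """A toy replica that resolves conflicts by keeping the newest write."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Accept the write locally; keep it only if it is newer than what we have.
        # Assumes timestamps are unique per key; ties are not resolved here.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def merge_from(self, other):
        # Pull the other replica's state; newer timestamps win.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)
```

Between merges, two replicas can give different answers for the same key. That window of disagreement is exactly what is traded away in exchange for availability and low latency.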

In distributed systems, crashes happen. The RDBMS is a show-stopper when it comes to crashes -- if we can't commit, we need to stop and roll back. Only then, maybe, can we start over. Is this acceptable? There are many scenarios where it is not. I've seen it in defense systems, in communications systems, and even in e-commerce systems ("if you are not responsive, I'll just go to the competition").

What do you do in the presence of errors? Joe Armstrong suggests the following as the basis for Erlang in his thesis:

To make a fault-tolerant software system which behaves reasonably in the presence of software errors we proceed as follows:

1. We organize the software into a hierarchy of tasks that the system has to perform. Each task corresponds to the achievement of a number of goals. The software for a given task has to try and achieve the goals associated with the task. Tasks are ordered by complexity. The top level task is the most complex, when all the goals in the top level task can be achieved then the system should function perfectly. Lower level tasks should still allow the system to function in an acceptable manner, though it may offer a reduced level of service. The goals of a lower level task should be easier to achieve than the goals of a higher level task in the system.

2. We try to perform the top level task.

3. If an error is detected when trying to achieve a goal, we make an attempt to correct the error. If we cannot correct the error we immediately abort the current task and start performing a simpler task.
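The three steps above can be sketched roughly as follows. This is a plain Python illustration of the idea, not Erlang and not Armstrong's code; run_with_fallback and the three example tasks are hypothetical names. For brevity the sketch skips the "attempt to correct the error" part of step 3 and goes straight to the simpler task.

```python
def run_with_fallback(tasks):
    """Try tasks from most to least complex.

    `tasks` is a list of (name, callable) pairs, ordered from the top level
    task down to the simplest one. Returns (name, result) for the first
    task that completes without raising.
    """
    errors = []
    for name, task in tasks:
        try:
            return name, task()
        except Exception as exc:
            # Abort this task and fall through to the next, simpler one.
            errors.append((name, exc))
    raise RuntimeError(f"all tasks failed: {errors}")


# Hypothetical task hierarchy, for illustration only.
def full_service():
    raise ConnectionError("primary store unreachable")


def cached_service():
    return "stale but usable results"


def static_notice():
    return "service temporarily degraded"


tasks = [
    ("full", full_service),
    ("cached", cached_service),
    ("static", static_notice),
]
```

With the tasks defined this way, run_with_fallback(tasks) gives up on "full" and returns the result of "cached": a reduced level of service rather than a stop-and-roll-back.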

On top of that we try to keep any update local, i.e. within a task boundary on the hardware where the task occurred -- distributing the transactions is not a good option. I outlined why when I talked about SOA and cross-service transactions, but the reasoning holds here as well.

Well, truth be told, the RDBMS is not dead, and its demise is probably not even around the corner. Nor does this mean that there aren't any uses for a database. But that's true for other architectural choices as well. Whoever said that a single-tier solution is not the right one for very specific types of systems....

RDBMSs succeeded in becoming the de facto standard for building systems because they offer some very compelling attributes -- ACID brings a lot of peace of mind. Large-scale systems, low-latency systems, and fault-tolerant systems opt for another set of compelling attributes (BASE). The point is that when you design your next solution, maybe the conventional database thinking is something you should at least give another thought to, instead of just following dogma.
