Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Database

Inside the db4o Database


Rick is a QA Engineer for Compuware's NuMega Labs, where he has worked on both .NET and Java projects. He can be contacted at [email protected].


In the context of databases—and object databases, in particular—replication is the ability to duplicate an object from one database (call it "database A") into another ("database B"). The duplication is performed in such a way that both object instances can be modified, so that at a later time the duplicated object can be returned to its original database, and any differences between the object in A and the object in B can be resolved. There are variations on this scenario, but that's the basic idea.

Users of mobile devices might recognize this as "synchronization." In such applications, the desktop system maintains a set of "master" databases that are replicated onto the handheld device. In the process of using the handheld, you add phone numbers, delete appointments, create new shopping lists, and so on. And, at some future point, you synchronize the mobile device with the desktop, which records the changes you've made into the master database. In short, replication allows a subset of information to be drawn from one database into another, and permits the user to operate on that subset in a disconnected fashion.

The Need

Suppose you want to implement replication. What capabilities do database systems need to provide for error-free—or, at least, as close to error-free as we could reasonably get—reconciliation of data modified by disconnected users? (In this article, I assume that the databases employed are object databases. So, when I speak of "data," I speak of "objects.")

Certainly, the first requirement would be some way of identifying an object, even as clones of it are replicated from one database to another. In other words, when you put an object into the database, a unique identifier must be attached to it. That identifier must be bound to the object in such a way that, when the object is replicated into another database, the identifier goes with it. (You cannot count on the object's data content as the sole means of identifying that object.) Ideally, this identifier will be invisible and unmodifiable, except under unusual circumstances. After all, there is no reason to make its presence and value known to anything other than whatever framework handles the replication process.

Furthermore, this identifier must be universally unique. Suppose you begin with database A. You replicate a subset of its data into database B. Then you replicate data from database A into database C (some of the objects in C may also be objects in B). Suppose further that, after replication, object X was created in B, and object Y was created in C, and that both objects are of the same class. We must be guaranteed that the unique identifiers of X and Y are universally unique, even though databases B and C (and, in fact, A as well) are disconnected. If, by some chance, object X and object Y received the same IDs, then it would be difficult to impossible to reconcile both databases back to the master database.

Accurate reconciliation of disconnected databases also demands that some sort of version information be attached to an object when the object is modified. This is necessary so that synchronization can determine which of two replicated objects is the most up to date. Ideally, this would be information along the lines of "this object was modified at date and time xxx." At the least, we need a modification flag, such as the one on the Palm OS, which indicates when a record has been made dirty. Otherwise, the synchronization application will have to examine each and every object's content (that is, each object passing through synchronization), comparing it with the original (the object in the master database), to see if any changes have occurred, and attempt to deduce from those changes which object is older. Such resolution code would be unacceptably complex and slow.

Finally, a synchronization system must provide a mechanism that allows the developer to manage conflict resolution. Under normal circumstances, when an object that has been replicated is being reconciled back to the main database, the replication framework code can examine version information (assuming it exists) and determine whether the master database object or the replicated object is the most up to date. Conflicts occur when the version information is such that the winner is not apparent. The developer must be able to provide code that can resolve the confusion.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.