
ODBMS Solutions


Dr. Dobb's Sourcebook September/October 1997: ODBMS Solutions

Chris works for Primus Communications in Seattle, Washington, and can be contacted at [email protected].


One of the benefits of using an object-oriented database system is the seamless marriage of the database and programming language. With Versant Object Technology's Versant Object Database Management System (ODBMS), you can think of the objects in your database simply as C++ objects that are persistent. To use a persistent object, you dereference a Versant link -- the Versant database equivalent of a C++ pointer. You use a link the way you would a pointer. When the link in Example 1 is dereferenced, for instance, it uses the object's unique permanent identifier to retrieve the object from the database. The object is placed in the address space of the database client as a C++ object. If the object is already in the client's cache, dereferencing the link simply returns a C++ pointer to the cached object.

Seems simple, right? So where's the catch? Using persistent objects introduces at least three additional issues -- locking, pinning, and refreshing.

Locks are placed on objects to manage concurrent reading and updating by multiple users. When a Versant object is brought into cache, it's pinned, meaning the object will physically remain in the cache until the pin is released. Finally, if an object is read without a lock, the object may be updated in the database by another user. This causes the version of the object in our cache to grow "stale," and makes it necessary to periodically refresh the object. If the link is simply treated as a smart C++ pointer, these issues are merely hidden.

The Versant ODBMS engine is at the heart of Primus' SolutionBuilder, an application designed to help customer-support analysts capture information and find solutions to customer problems. SolutionBuilder makes extensive use of "dirty reads" (reads with no locks) in its searches, and it also locks and updates objects extensively on large databases. For SolutionBuilder to succeed, it was imperative to explicitly control the locking, pinning, and refreshing of objects in the database. From this need arose a solution that addressed these four questions:

  • How can the state of the object's lock be controlled?
  • How can a consistent view of an object (before the object is dirtied) be guaranteed?
  • How can the point at which the client is no longer interested in an object be determined?
  • How can the size of the cache be managed?

These problems were solved by wrapping the Versant Link class. This isolated the necessary logic for locking, pinning, and refreshing in a single location. In addition, our link class was integrated with a reference counting facility. This integration is key to SolutionBuilder, but before discussing how reference counting addressed each of these questions, let's first discuss the reference counting model.

Reference Counting

A reference counting mechanism was already in place for transient objects. Reference counting provides a clean way of returning objects from function calls. It doesn't incur the expense of returning objects by value. It also avoids the sticky ownership issues that arise from passing C++ pointers, including memory leaks and double deletions.

Listing One is a simple implementation of reference counting to determine when a transient object is no longer needed and, therefore, can be deleted.

Listing Two is an example of how the references are applied. The Solution object is allocated on the heap and passed into a reference that is allocated on the stack. When a reference is constructed, it is added to a set of references held by the referent object in class Refable. Since the reference is allocated on the stack, the reference will automatically be destroyed when it goes out of scope. When a reference is destroyed, it is removed from the set. When the last reference is removed from the set, a virtual function on the referent, called SignalZeroRefs, is invoked, which deletes the object.

This simple idea can be extended to persistent objects as well. However, instead of creating and deleting the object, we pin and unpin the object. The base class for all persistent objects (Object) inherits from the Versant base class PObject (as well as Refable), and overrides SignalZeroRefs to unpin the object instead of deleting it (see Listings Three and Four). A class called RLink was created: It wraps the Versant class Link. Additionally, a function called MakeRef was added to RLink to perform the dereference of the link. To dereference a persistent object, you call MakeRef to get a reference, then use the dereference operator to get a C++ reference. The object is brought into cache by calling refreshobj in MakeRef. This also has the effect of locking and pinning the object. As long as at least one reference exists on the object, the object remains pinned and is safe to dereference. When the last reference is destroyed, the object is unpinned. Unpinning the object makes it available to be released from cache if necessary.
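The pin-on-first-reference, unpin-on-last-reference behavior described above can be sketched without a database. This is a minimal illustration, not the Versant API: Pinnable and PinRef are hypothetical names, and a plain counter stands in for the set of references used in Listing One.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a cached persistent object (not a Versant class).
class Pinnable {
public:
    Pinnable() : refs_(0), pinned_(false) {}
    // The first reference pins the object in the (imaginary) cache.
    void AddRef() { if (refs_++ == 0) pinned_ = true; }
    // The last reference to go away triggers SignalZeroRefs().
    void RemoveRef() { if (--refs_ == 0) SignalZeroRefs(); }
    bool IsPinned() const { return pinned_; }
    std::size_t RefCount() const { return refs_; }
private:
    // Persistent flavor: unpin instead of deleting the object.
    void SignalZeroRefs() { pinned_ = false; }
    std::size_t refs_;
    bool pinned_;
};

// Hypothetical stack-allocated reference that keeps its referent pinned.
class PinRef {
public:
    explicit PinRef(Pinnable& obj) : obj_(&obj) { obj_->AddRef(); }
    PinRef(const PinRef& other) : obj_(other.obj_) { obj_->AddRef(); }
    PinRef& operator=(const PinRef& other) {
        other.obj_->AddRef();   // add first so self-assignment is safe
        obj_->RemoveRef();
        obj_ = other.obj_;
        return *this;
    }
    ~PinRef() { obj_->RemoveRef(); }
    Pinnable& operator()() const { return *obj_; }
private:
    Pinnable* obj_;
};
```

While any PinRef is in scope the object reports itself as pinned. Once the last one is destroyed, the object is unpinned and becomes a candidate for release from cache.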

The key to this solution is integrating our link class with references. References allow objects to be dereferenced repeatedly, without having to worry about locking, pinning, or refreshing. Reference counting also automatically handles downgrading the lock and unpinning the object.

Managing Locks

The first time you dereference a link to an object, the object is retrieved with the default lock. Since the default lock is a global value that can be set at any time by calling set_default_lock(), dereferencing the link can have the unexpected side effect of locking the object with whatever the current default lock happens to be. Our application usually needs to perform a dirty read. Thus, it was critical to explicitly specify the lock mode when retrieving the object to avoid inadvertently locking the object.

All object activation occurs in RLink::MakeRef(). A function was used, instead of an operator, so that it's clear in the code where the object is being retrieved. Also, there is no default for the lock mode parameter because it's important for the caller to explicitly decide what type of lock is called for. Once there is a reference, you can safely dereference the reference without accidentally placing a lock on the object.

Refreshing the Object

To change the state of a persistent object, you mark it as "dirty," telling Versant that you've made a change to an object that needs to be committed at the end of the transaction. Before you dirty an object, it's necessary to make sure you have the latest version of the object (so you don't accidentally overwrite someone else's changes). Versant doesn't automatically refresh the object when you place a lock on the object or when you dirty the object. Instead, these are two distinct and separate concepts, which provides more flexibility in designing your own locking and transaction model.

Fortunately, our link and reference wrappers provide the perfect opportunity to ensure that we always have the latest version of the object when the object is locked. When an object already in cache is upgraded from a NOLOCK to a read or write lock, refreshobj is called with the requested lock mode. This atomically locks the object and refreshes the cached object.
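The activation rule (fetch when the object is not cached, or when a NOLOCK copy is being upgraded to a real lock) can be sketched with a mock cache. Everything here is an assumption made for illustration: MockCache, Activate, and the RLOCK enumerator are hypothetical, and a fetch counter stands in for the round trip that refreshobj would make.

```cpp
#include <cassert>
#include <map>

// Hypothetical lock modes; the article names NOLOCK and WLOCK.
enum LockMode { NOLOCK, RLOCK, WLOCK };

// Mock client cache that records how often an object is (re)fetched.
class MockCache {
public:
    MockCache() : fetches_(0) {}

    // Deliberately no default for lm: the caller must choose the lock mode.
    LockMode Activate(int id, LockMode lm) {
        std::map<int, LockMode>::iterator it = locks_.find(id);
        bool in_cache = (it != locks_.end());
        if (!in_cache || (it->second == NOLOCK && lm != NOLOCK)) {
            // Not cached, or upgrading from NOLOCK: lock and refresh together.
            ++fetches_;
            locks_[id] = lm;
        }
        return locks_[id];   // the lock currently held on the cached copy
    }
    int Fetches() const { return fetches_; }
private:
    std::map<int, LockMode> locks_;
    int fetches_;
};
```

A second dirty read of the same object costs nothing, while the first request for a write lock forces one refresh, so the locked copy is never stale.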

To guard against dirtying the object without refreshing, we made the Versant PObject::dirty() function private in class Object, and added a function called Dirty() that throws an exception if a lock has not been placed on the object before calling Dirty(). As an added bonus, the Dirty() function allows us to trap all calls to Dirty() for a variety of useful purposes, including logging and detecting unexpected calls to Dirty().
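A rough sketch of this guard, using stand-in classes rather than Versant's: RawPersistent plays the role of the vendor base class, GuardedObject plays the role of Object, and SetLock is a hypothetical helper that simulates acquiring a lock. The call counter hints at how a single choke point makes logging possible.

```cpp
#include <cassert>
#include <stdexcept>

enum LockMode { NOLOCK, RLOCK, WLOCK };   // hypothetical lock modes

// Stand-in for the vendor base class that exposes a public dirty().
class RawPersistent {
public:
    RawPersistent() : dirty_(false) {}
    virtual ~RawPersistent() {}
    void dirty() { dirty_ = true; }
    bool is_dirty() const { return dirty_; }
private:
    bool dirty_;
};

class GuardedObject : public RawPersistent {
public:
    GuardedObject() : lock_(NOLOCK), dirty_calls_(0) {}
    void SetLock(LockMode lm) { lock_ = lm; }   // simulates taking a lock

    // The only public way to mark the object dirty.
    void Dirty() {
        ++dirty_calls_;   // single place to log or audit calls
        if (lock_ == NOLOCK)
            throw std::logic_error("Dirty() called without a lock");
        dirty();
    }
    int DirtyCalls() const { return dirty_calls_; }
private:
    // Hides the base-class dirty() so callers cannot bypass the check.
    void dirty() { RawPersistent::dirty(); }
    LockMode lock_;
    int dirty_calls_;
};
```

Calling Dirty() on an unlocked object throws and leaves the object clean. After SetLock(WLOCK), the same call succeeds.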

Client Reference Counting

Sometimes an application needs to hold a lock on an object for a long period of time, spanning multiple transactions. For example, any user can edit a solution, but only one user can edit an individual solution at a time. We enforce this with a write lock on the persistent object PSolution. But how can you tell when the user is done editing the solution, so that the lock can be dropped and another user can edit the solution? We decided to extend our reference counting to the client. Figure 1 shows how references on the client pin and lock the persistent solution object so two users can't edit the solution at the same time.

We acquire a write lock on PSolution and hold it with a reference. The reference is stored as a data member of the transient class Solution. Next, the solution reference is sent from the application server to the client. Our application uses C++ stored procedures (called actions) to transmit data between the client and server. An action can marshal various basic types including integers, strings, and network references. A network reference (class NetRef) is a special type of reference that can be returned to the client.

In Figure 1, a network reference is created and the reference to the Solution object is stored in the NetRef. The network reference is placed in an action. When the action is returned to the client, all of the attributes of the action are marshaled to a copy of the action on the client. When the network reference is marshaled, it is placed in a global set of network references on the server. This keeps the reference count for the network reference greater than zero, so that the Solution object isn't deleted and the PSolution object remains pinned and locked.

The network reference treats the address of the referent as a unique ID that can be reference counted on the client just like on the server. However, since the network reference doesn't hold a C++ pointer to an object on the client, it can't be dereferenced. The client can cache a network reference. For example, you might store it as a data member of a ClientSolution object that maps one-to-one with a Solution object on the application server. When users are done editing the solution, the network reference is cleared from the ClientSolution object. This causes the reference count on the client to go to zero. The client notifies the server that the network reference is no longer held. The network reference is removed from the global set on the server, which causes a cascade of reference decrementing. The reference counts on the Solution and PSolution objects go to zero, which causes the Solution object to be deleted and the persistent PSolution object to be unpinned and unlocked. At this point, another user is free to edit the solution by acquiring a write lock on the PSolution object.
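The handshake can be sketched as follows. This is an in-process illustration under stated assumptions, not the article's NetRef class: Server, Client, and NetId are hypothetical names, a direct function call stands in for the network notification, and membership in the server's set stands in for the object staying pinned and write-locked.

```cpp
#include <cassert>
#include <map>
#include <set>

typedef unsigned long NetId;   // the article uses the referent's address as the ID

// Server side: a global set keeps marshaled references alive.
class Server {
public:
    NetId Marshal(NetId id) { held_.insert(id); return id; }
    // Called when a client reports that it no longer holds the ID.
    void Release(NetId id) { held_.erase(id); }
    bool IsHeld(NetId id) const { return held_.count(id) != 0; }
private:
    std::set<NetId> held_;
};

// Client side: counts references per ID, but can never dereference them.
class Client {
public:
    explicit Client(Server& server) : server_(&server) {}
    void AddRef(NetId id) { ++counts_[id]; }
    void RemoveRef(NetId id) {
        std::map<NetId, int>::iterator it = counts_.find(id);
        if (it == counts_.end()) return;   // unknown ID: nothing to release
        if (--it->second == 0) {
            counts_.erase(it);
            server_->Release(id);          // stands in for the network message
        }
    }
private:
    Server* server_;
    std::map<NetId, int> counts_;
};
```

The server drops its entry only when the client's count for that ID reaches zero, which is the point at which the cascade of reference decrements on the server would begin.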

Managing Cache Growth

Information retrieval is a key component of our application. To speed searching, commonly accessed objects must stay cached in the database client's address space, so flushing every object out of the cache at the end of each transaction is too costly. On the other hand, if we never released any objects, the cache would grow without bound and consume ever more client memory.

Versant provides support for managing the growth of the client cache. A "swap threshold" indicates how large the client cache can grow before unpinned objects are released to make room for the retrieval of additional objects. Only objects that are unpinned can be released. Since objects are unpinned when the reference count reaches zero, the objects are automatically cleaned up, allowing Versant to properly manage the cache size.
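The interaction between pinning and the swap threshold can be illustrated with a toy cache. This is a simplification under stated assumptions: SwapCache is a hypothetical class, the threshold counts objects rather than memory, and the order in which unpinned objects are released is arbitrary here and says nothing about Versant's actual policy.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Toy cache: maps object ID to "is pinned".
class SwapCache {
public:
    explicit SwapCache(std::size_t threshold) : threshold_(threshold) {}

    // Bringing an object into cache pins it; crossing the threshold
    // triggers a sweep of unpinned objects.
    void Load(int id) {
        pinned_[id] = true;
        if (pinned_.size() > threshold_) ReleaseUnpinned();
    }
    // What SignalZeroRefs would do when the last reference goes away.
    void Unpin(int id) {
        std::map<int, bool>::iterator it = pinned_.find(id);
        if (it != pinned_.end()) it->second = false;
    }
    bool Contains(int id) const { return pinned_.count(id) != 0; }
    std::size_t Size() const { return pinned_.size(); }
private:
    // Only unpinned objects may be released; stop once under the threshold.
    void ReleaseUnpinned() {
        std::map<int, bool>::iterator it = pinned_.begin();
        while (it != pinned_.end() && pinned_.size() > threshold_) {
            if (!it->second) pinned_.erase(it++);
            else ++it;
        }
    }
    std::size_t threshold_;
    std::map<int, bool> pinned_;
};
```

If every cached object is still pinned, nothing can be released and the cache grows past the threshold, which is why dropping references promptly matters.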

Moving Objects

As shown in Listing One, our first implementation used the same reference counting mechanism for both persistent and transient objects. The reference class Ref holds a pointer to the persistent C++ object. This should be safe since the object is guaranteed to be pinned while at least one reference is held on the object. Since a pinned object is guaranteed to physically remain in cache, the pointer should be safe to dereference.

Unfortunately, there's a hole in this logic. It's possible for the object to move if the lock is upgraded from a NOLOCK to a read or write lock, thus invalidating the C++ pointer held in any references previously acquired with no lock. Listing Five illustrates the problem. In it, ref1 is acquired with a NOLOCK and ref2 is acquired with a WLOCK. Since someone else may have changed the object, the object must be retrieved from the database again, using refreshobj. If the object has grown and there's insufficient contiguous memory to hold the object at the current address, Versant must move the object. At that point ref1 holds a dangling pointer to invalid memory, and any subsequent use of ref1 leads to unpredictable results.

This problem can be solved by adding the class PRef, which behaves exactly like Ref except it holds a Versant Link instead of a C++ pointer. Listing Six defines this class. Since a Versant Link is guaranteed to be valid even if an object moves in cache, we no longer have a problem with dangling pointers.
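The difference between caching an address and caching a handle can be shown with a mock cache that relocates an object on refresh. The names MovingCache, Handle, and Record are hypothetical, and the mock always moves the object, whereas the real cache moves it only when it must.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct Record { std::string title; };

// Mock cache in which a refresh always relocates the object.
class MovingCache {
public:
    void Put(int id, const std::string& title) {
        // The new copy is allocated before the old one is freed,
        // so its address is guaranteed to differ from the old address.
        std::unique_ptr<Record> fresh(new Record{title});
        slots_[id] = std::move(fresh);
    }
    void Refresh(int id, const std::string& title) { Put(id, title); }
    Record* Resolve(int id) const {
        auto it = slots_.find(id);
        return it == slots_.end() ? nullptr : it->second.get();
    }
private:
    std::map<int, std::unique_ptr<Record>> slots_;
};

// Like PRef: stores an identifier and looks up the address on every use.
class Handle {
public:
    Handle(const MovingCache& cache, int id) : cache_(&cache), id_(id) {}
    Record& operator()() const {
        Record* p = cache_->Resolve(id_);
        if (!p) throw std::runtime_error("object is not in cache");
        return *p;
    }
private:
    const MovingCache* cache_;
    int id_;
};
```

A raw Record* taken before Refresh would dangle afterward. The Handle keeps working because it never remembers an address between uses, at the price of one lookup per dereference.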

Conclusion

Even though it's possible to treat Versant links like pointers, we found that by wrapping the Link dereference operator we gained much better control over locking and refreshing. By integrating with references, we correctly handle pinning as well. Programmers have a simple interface to control locking, refreshing, and pinning without worrying about the required mechanics. This has led to a programming environment less prone to error.

DDJ

Listing One

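class RefAny {
};

template <class T>
class Ref : public RefAny
{
public:
 Ref() : referent_(NULL) {}
 Ref(T& obj) : referent_(NULL) { Set(obj); }
 // Takes a const reference so a Ref returned by value can be copied.
 Ref(const Ref<T>& ref) : referent_(ref.referent_) {
    if (referent_) referent_->AddRef(*this);
 }
 ~Ref() { Clear(); }

 T& operator ()() const { return Deref(); }
 T& operator =(T& obj) { Clear(); return Set(obj); }
 // Copy assignment must register with the referent as well.
 Ref<T>& operator =(const Ref<T>& ref) {
    if (this != &ref) {
       Clear();
       if (ref.referent_) Set(*ref.referent_);
    }
    return *this;
 }

 // Detach from the referent. The pointer is reset first so that a
 // second Clear() (for example, from the destructor) does nothing.
 void Clear () {
    if (referent_) {
       T* old = referent_;
       referent_ = NULL;
       old->RemoveRef(*this);
    }
 }

private:
 T& Deref () const {
    if (!referent_) {
       NullRefException err(__FILE__,__LINE__);
       throw err;
    }
    return *referent_;
 }
 T& Set (T& obj) {
    referent_ = &obj;
    obj.AddRef(*this);
    return obj;
 }
private:
 T* referent_;
};

class Refable
{
public:
 // Virtual so that "delete this" destroys the derived object.
 virtual ~Refable() {}
 void AddRef(RefAny& ref) { refs_.Insert(&ref); }
 void RemoveRef(RefAny& ref) {
    refs_.Delete(&ref);
    if (refs_.Length() == 0) SignalZeroRefs();
 }
 // Transient objects delete themselves when the last reference goes away.
 virtual void SignalZeroRefs() { delete this; }

private:
 Set<RefAny*> refs_;
};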


Listing Two

class Solution : public Refable {
public:
   Solution (const char* id, const char* title) : title_(title), id_(id) {}

   const char* GetTitle() const { return title_; }
   const char* GetID() const { return id_; }

private:
   const char* title_;
   const char* id_;
};

Ref<Solution> App::CreateSolution(const char* id, const char* title)
{
   // Heap-allocate the object; it deletes itself when its last Ref goes away.
   Ref<Solution> solution(*new Solution(id, title));
   return solution;
}

int main()
{
   App app;
   Ref<Solution> solution = app.CreateSolution("1234","mysolution");
   cout << "Created solution " << solution().GetTitle() << endl;
   return 0;
}


Listing Three

template <class T>
class RLink
{
public:
  RLink() : link_() {}
  RLink(const Link<T>& link) : link_(link) {}

  // All object activation happens here; the caller must name the lock mode.
  Ref<T> MakeRef (o_lockmode lm) const {
    Ref<T> ref;
    if (!link_.in_cache() ||
        (GetLock() == NOLOCK && lm != NOLOCK) ) {
      // Not cached, or upgrading from NOLOCK: lock and refresh together.
      ref = ::dom->refreshobj(link_,lm);
    } else {
      ref = *link_;
    }
    return ref;
  }
  // const so that it can be called from MakeRef().
  o_lockmode GetLock () const {
    o_lockmode  current_lock = NOLOCK;
    o_u4b       lock_counter = 0;
    link_.getcachedlockinfo(NULL, &current_lock, &lock_counter);
    return current_lock;
  }
private:
  Link<T> link_;
};

class Object : public PObject, public Refable
{
public:
  // Last reference is gone: drop the lock and unpin instead of deleting.
  void SignalZeroRefs() {
    DowngradeLock(NOLOCK);
    Unpin();
  }
  // Refuses to dirty an object that has not been locked (and refreshed).
  void Dirty() {
    Link<Object> link(this);
    RLink<Object> rlink(link);
    if (rlink.GetLock() == NOLOCK) {
      DirtyWithoutLockException err(__FILE__,__LINE__);
      throw err;
    }
    dirty();
  }
private:
  void Unpin () {
    Link<Object> link(this);
    while (link.is_pinned()) {
      link.unpinobj();
    }
  }
  void DowngradeLock (o_lockmode lm) {
    Link<Object> link(this);
    ::dom->downgradelock(link,lm);
  }
  // Private so that callers must go through Dirty().
  void dirty() { PObject::dirty(); }
};


Listing Four

class PSolution : public Object {
public:
   PSolution (char* id, char* title) : title_(title), id_(id) {}

   char* GetTitle() const { return title_; }
   char* GetID() const { return id_; }

private:
   PString title_;
   PString id_;
};

Ref<PSolution> App::CreateSolution(char* id, char* title)
{
   // Create persistent solution object using overloaded new operator macro.
   // The macro wraps operator new, so its result is dereferenced for Ref.
   Ref<PSolution> solution(*O_NEW_PERSISTENT(PSolution)(id,title));
   return solution;
}

int main()
{
   App app;
   Ref<PSolution> solution = app.CreateSolution("1234","My Solution");
   cout << "Created persistent solution " << solution().GetID()
        << " - " << solution().GetTitle() << endl;
   return 0;
}


Listing Five

void SetTitle (RLink<PSolution>& link, char* title)
{
   // Caches C++ pointer to PSolution object
   Ref<PSolution> ref1 = link.MakeRef(NOLOCK);
   char* old_title = ref1().GetTitle();

   if (strcmp(old_title,title) != 0) {
      // May cause object to move!
      Ref<PSolution> ref2 = link.MakeRef(WLOCK);
      ref2().SetTitle(title);
   }
   // Possible error dereferencing ref1!
   cout << "Changed title of solution ID: " << ref1().GetID() << endl;
}


Listing Six

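template <class T>
class PRef : public RefAny
{
public:
  PRef() : referent_(NULL) {}
  PRef(T& obj) : referent_(NULL) { Set(obj); }
  PRef(const PRef<T>& ref) : referent_(ref.referent_) {
     if (!referent_.is_null()) Deref().AddRef(*this);
  }
  ~PRef() { Clear(); }

  T& operator ()() const { return Deref(); }
  T& operator =(T& obj) { Clear(); return Set(obj); }
  // Copy assignment must register with the referent as well.
  PRef<T>& operator =(const PRef<T>& ref) {
     if (this != &ref) {
        Clear();
        if (!ref.referent_.is_null()) Set(ref.Deref());
     }
     return *this;
  }

  // Detach from the referent. The link is reset first so that a
  // second Clear() (for example, from the destructor) does nothing.
  void Clear () {
     if (!referent_.is_null()) {
        T& obj = Deref();
        referent_ = Link<T>();   // default-constructed link, as in RLink()
        obj.RemoveRef(*this);
     }
  }

private:
  // Resolves the link on every use, so a moved object is still found.
  T& Deref () const {
    if (referent_.is_null()) {
      NullRefException err(__FILE__,__LINE__);
      throw err;
    }
    return *referent_;
  }
  T& Set (T& obj) {
    Link<T> link(&obj);
    referent_ = link;
    obj.AddRef(*this);
    return obj;
  }
private:
  Link<T> referent_;
};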
