The majority of software development today, as it has been for several decades, focuses on multiuser systems. In multiuser systems, there is always a danger that two or more users will attempt to update a common resource, such as shared data or objects, and it's the responsibility of the developers to ensure that updates are performed appropriately. Consider an airline reservation system, for example. A flight has one seat left, and you and I are trying to reserve that seat at the same time. Both of us check the flight status and are told that a seat is still available. We both enter our payment information and click the reservation button at the same time. What should happen? If the system works, only one of us will be given a seat and the other will be told that there is no longer a seat available. An effort called concurrency control makes this happen, and it's just as applicable to the development of Web-based systems such as Amazon.com's shopping cart as it is to legacy applications like your organization's COBOL-based human resources system. This month I examine the issues surrounding concurrency control in object-oriented systems.
Figure 1. Object Concurrency Control Diagram
Why is object concurrency control an issue? The problem stems from the fact that to support several users working simultaneously with the same object, the system must make copies of the object for each user, as indicated in Figure 1. The source object may be a row of data in a relational database and the copies may be entity beans in an Enterprise JavaBean (EJB) application server. Similarly, the source may be a C++ object in an object database, while the copies are Java objects that are edited via HTML pages. Regardless of the technology involved, you need to synchronize changes (updates, deletions and creations) made to the copies, ensuring the transactional integrity of your source. (See Thinking Objectively, "Distributed Object Transactions," July 2000 for related material.)
Object Concurrency Control Strategies
There are three basic object concurrency control strategies: pessimistic, optimistic and truly optimistic. Pessimistic concurrency control locks the source for the entire time that a copy of it exists, not allowing other copies to exist until the copy with the lock has finished its transaction. The copy effectively places a write lock on the appropriate source, performs some work, then applies the appropriate changes to the source and unlocks it. This is a brute force approach to concurrency that is applicable for small-scale systems or systems where concurrent access is very rare: Pessimistic locking doesn't scale well because it blocks simultaneous access to common resources.
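To make the pessimistic strategy concrete, here is a minimal in-memory sketch in Python. The class and method names (PessimisticSource, check_out, check_in) are hypothetical, and a threading.Lock stands in for whatever locking your database or application server actually provides.

```python
import threading


class PessimisticSource:
    """Shared source that allows only one working copy to exist at a time."""

    def __init__(self, data):
        self._data = dict(data)
        self._lock = threading.Lock()

    def check_out(self):
        """Lock the source and return a copy, or None if it is already locked."""
        if not self._lock.acquire(blocking=False):
            return None
        return dict(self._data)

    def check_in(self, copy):
        """Apply the copy's changes to the source, then release the lock."""
        self._data = dict(copy)
        self._lock.release()


seat = PessimisticSource({"seat": "12A", "reserved_by": None})
mine = seat.check_out()    # I hold the lock for my whole transaction
yours = seat.check_out()   # None: you are shut out until I finish
mine["reserved_by"] = "me"
seat.check_in(mine)        # changes applied, source unlocked
```

Notice that the second caller gets nothing at all while the first copy exists, which is exactly why this approach struggles under heavy concurrent access.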
Optimistic concurrency control takes a different approach, one that is more complex but is scalable. With optimistic control, the source is uniquely marked each time it's initially accessed by any given copy. The access is typically a creation of the copy or a refresh of it. The user of the copy then manipulates it as necessary. When the changes are applied, the copy briefly locks the source, validates that the mark it placed on the source has not been updated by another copy, commits its changes and then unlocks it. When a copy discovers that the mark on the source has been updated, indicating that another copy of the source has since accessed it, we say that a collision has occurred. A similar but alternative strategy is to check the mark placed on the object previously, if any, to see that it is unchanged at the point of update. The value of the mark is then updated as part of the overall update of the source. Your software is responsible for handling the collision appropriately, strategies for which are described below. Since it's unlikely that separate users will access the same object simultaneously, it's better to handle the occasional collision than to limit the size of your system. This approach is suitable for large systems or for systems with significant concurrent access.
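The alternative strategy just described, where the mark is checked at the point of update and then changed as part of that update, might be sketched as follows. This is an illustration under assumptions, not a reference implementation: the names OptimisticSource and CollisionError are hypothetical, the mark is a simple counter assigned by the source, and a threading.Lock models the brief lock taken during commit.

```python
import threading


class CollisionError(Exception):
    """Raised when the source was updated after the copy was made."""


class OptimisticSource:
    def __init__(self, data):
        self._data = dict(data)
        self._mark = 0                  # incremental mark assigned by the source
        self._lock = threading.Lock()   # held only briefly, never across user think time

    def read(self):
        """Return a copy of the data plus the mark that was current when it was read."""
        with self._lock:
            return dict(self._data), self._mark

    def commit(self, copy, mark_seen):
        """Apply the copy if the mark is unchanged; otherwise report a collision."""
        with self._lock:
            if self._mark != mark_seen:
                raise CollisionError("source was updated by another copy")
            self._data = dict(copy)
            self._mark += 1             # the mark changes as part of the update
            return self._mark


seat = OptimisticSource({"seat": "12A", "reserved_by": None})
mine, my_mark = seat.read()
yours, your_mark = seat.read()          # both copies exist at once, no blocking

yours["reserved_by"] = "you"
seat.commit(yours, your_mark)           # succeeds and advances the mark

mine["reserved_by"] = "me"
try:
    seat.commit(mine, my_mark)          # my mark is stale
    collided = False
except CollisionError:
    collided = True                     # the collision is detected, not overwritten
```

Both users can work on their copies for as long as they like; the lock is held only for the instant of validation and commit.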
The third approach, truly optimistic concurrency control, is the most simplistic; it's also effectively unusable for most systems. With this approach, no locks are applied to the source and no unique marks are placed on it; your software simply hopes for the best and applies the changes to the source as they occur. This approach means your system doesn't guarantee transactional integrity if it's possible that two or more users can manipulate a single object simultaneously. Truly optimistic concurrency control is only appropriate for systems that have no concurrent update at all, such as information-only Web sites or single user systems.
Optimistic Marking Strategies
So how do you mark the source when you are taking an optimistic approach to object concurrency control? The fundamental principle is that the mark must be a unique value: no two copies can apply the same mark value; otherwise, they won't be able to detect a collision. For example, assume the airline reservation system is Web-based and built using a farm of EJB application servers that connect to a shared relational database. The copies of the seat objects exist as Java objects on the application servers, and the shared source for the objects is a row in the database. If the object copy that I am manipulating assigns the mark "Yabba Dabba Do" to the source and the copy that you're working on assigns the same mark, then we're in trouble. Even though I marked the source first, your copy could still update the source while I am typing in my credit card information; then my copy would overwrite your changes to the source because it can't tell that an update has occurred as the original mark that it made is still there. Now we both have a reservation for the same seat, which is likely bad news for you because I take karate. Had your copy marked the source differently, perhaps with "Hey Boo-Boo," then my copy would have known that the source had already been updated, because it was expecting to see its original mark of "Yabba Dabba Do."
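The scenario above can be replayed in a few lines of Python. In this sketch each copy stamps its own mark on the source when it reads, and commits only if that mark is still there. The MarkedSource class is hypothetical, and using uuid4 to produce unique marks is my assumption for illustration, not a recommendation from this column.

```python
import uuid


class MarkedSource:
    """Source that carries the mark of whichever copy accessed it most recently."""

    def __init__(self, data):
        self.data = dict(data)
        self.mark = None

    def read(self, mark):
        """Stamp the source with the reader's mark and return a copy."""
        self.mark = mark
        return dict(self.data)

    def commit(self, copy, mark):
        """Apply the copy only if the reader's own mark is still on the source."""
        if self.mark != mark:
            return False               # collision detected
        self.data = dict(copy)
        return True


def replay(my_mark, your_mark):
    """I read first, you read and commit, then I try to commit."""
    seat = MarkedSource({"seat": "12A", "reserved_by": None})
    mine = seat.read(my_mark)
    yours = seat.read(your_mark)
    yours["reserved_by"] = "you"
    seat.commit(yours, your_mark)
    mine["reserved_by"] = "me"
    my_commit_ok = seat.commit(mine, my_mark)
    return my_commit_ok, seat.data["reserved_by"]


# Same mark on both copies: my commit goes through and silently overwrites yours.
same = replay("Yabba Dabba Do", "Yabba Dabba Do")

# Unique marks: my commit is refused because your mark is on the source.
unique = replay(uuid.uuid4().hex, uuid.uuid4().hex)
```

With identical marks my commit succeeds and your reservation is lost; with distinct marks my commit is rejected and the seat stays yours.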
There are several ways that you can generate unique values for marks. A common one is to assign a time stamp to the source. This value must be assigned by the server where the source object resides to ensure uniqueness: If the servers where the copies reside generate the time stamp value, it's possible that they can each generate the same value (regardless of whether their internal clocks are synchronized). If you want the copies to assign the mark value, and you want to use time stamps, then you must add a second aspect to make the mark unique such as the user ID of the person working with the copy. A unique ID for the server, such as its serial number, isn't sufficient if it's possible that two copies of the same object exist on the same server. Another approach is simply to use an incremental value instead of a time stamp, with similar issues of whether the source or the copy assigns the value. A simple, brute-force strategy, particularly for object-oriented systems, is to use a persistent object identifier (POID) such as a high or low value (as described in "Enterprise Ready Object IDs," Thinking Objectively, Dec. 1999). Another brute force strategy that you may want to consider when the source is stored in a relational database is including your unique mark as part of the primary key of the table. The advantage in this approach is that your database effectively performs collision detection for you, because you would be attempting to update or delete a record that doesn't exist if another copy has changed the value of your unique mark. This approach, however, increases the complexity of association management within your database and is antithetical to relational theory because the key value changes over time. I don't suggest this approach, but it is possible.
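To see how the database can do the collision detection when the mark is part of the primary key, consider this sketch using Python's built-in sqlite3 module and an in-memory database. The table and column names (seat, seat_id, mark, reserved_by) and the helper functions are hypothetical, and the mark here is an incremental value rather than a time stamp.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one unreserved seat whose mark is 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seat ("
        " seat_id TEXT, mark INTEGER, reserved_by TEXT,"
        " PRIMARY KEY (seat_id, mark))"
    )
    conn.execute("INSERT INTO seat VALUES ('12A', 1, NULL)")
    conn.commit()
    return conn


def reserve(conn, seat_id, mark_seen, user):
    """Update the row identified by (seat_id, mark_seen); True if a row was changed."""
    cur = conn.execute(
        "UPDATE seat SET reserved_by = ?, mark = mark + 1"
        " WHERE seat_id = ? AND mark = ?",
        (user, seat_id, mark_seen),
    )
    conn.commit()
    # Zero rows updated means another copy already changed the mark,
    # so the record we expected no longer exists.
    return cur.rowcount == 1


conn = make_db()
first = reserve(conn, "12A", 1, "me")     # row (12A, 1) exists, so this succeeds
second = reserve(conn, "12A", 1, "you")   # row is now (12A, 2), so nothing matches
```

The second update touches no rows, which is the collision signal. The drawbacks noted above still apply: any table that references the seat must cope with a key that changes on every update.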
Your software can handle a collision several ways. Your first option is to ignore it, basically reverting to truly optimistic locking, which raises the question of why you bothered to detect the collision in the first place. Second, you could inform the user and give her the option to override the previous update with her own, although there are opportunities for transactional integrity problems when a user negates part of someone else's work. For example, in an airline reservation system, one user could reserve two seats for a couple and another user could override one of the seat assignments and give it to someone else, resulting in only one of the two original people getting on the flight. Third, you could roll back (not perform) the update. This approach gets you into all sorts of potential trouble, particularly if the update is a portion of a multistep transaction: at high levels of activity your system could effectively shut down because it never completes any transactions (this is called live locking). Fourth, you could inform all the users who are involved and let them negotiate which changes get applied. This requires a sophisticated communication mechanism, such as publish and subscribe event notification, agents or active objects. This approach only works when your users are online and reasonably sophisticated.
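The first three options can be expressed as a small policy switch, sketched below in Python. Everything here is hypothetical (the Policy enum, the resolve function, the user_confirms callback), and the fourth option, negotiation among users, is left out because it depends on a messaging mechanism that a short example can't reasonably model.

```python
from enum import Enum


class Policy(Enum):
    IGNORE = "ignore"          # apply the update anyway
    ASK_USER = "ask_user"      # let the user decide whether to override
    ROLL_BACK = "roll_back"    # do not perform the update


def resolve(policy, current, proposed, user_confirms=None):
    """Return the value the source should hold after a detected collision.

    current is what another copy already wrote to the source, proposed is the
    colliding update, and user_confirms is a callback used only by ASK_USER.
    """
    if policy is Policy.IGNORE:
        return proposed
    if policy is Policy.ASK_USER:
        confirmed = user_confirms is not None and user_confirms(current, proposed)
        return proposed if confirmed else current
    if policy is Policy.ROLL_BACK:
        return current
    raise ValueError(f"unsupported policy: {policy!r}")


# The seat already belongs to "you"; my colliding update proposes "me".
ignored = resolve(Policy.IGNORE, "you", "me")
declined = resolve(Policy.ASK_USER, "you", "me", lambda cur, new: False)
overridden = resolve(Policy.ASK_USER, "you", "me", lambda cur, new: True)
rolled_back = resolve(Policy.ROLL_BACK, "you", "me")
```

Keeping the choice in one place like this makes it easier to apply a single collision policy consistently, although it does nothing to address the integrity problems of overriding or the live locking risk of rolling back.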
One thing to watch out for is a false collision. Some false collisions are reasonably straightforward, such as two users deleting the same object or making the same update. It gets a little more complex when you take the granularity of your collision detection strategy into account. Do you detect collisions at the scope of entire objects or parts of objects? For example, both of us are working with copies of a person object: I change the first name of the person, whereas you change their phone number. Although we are both updating the same object, our changes don't actually overlap, so it would be allowable for both changes to be applied if your application is sophisticated enough to support this.
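One way to support this, sketched here in Python, is to compare fields rather than whole objects. The merge_changes function is hypothetical; it assumes objects are plain dictionaries with the same set of fields, and that you kept the original values (base) from when your copy was made.

```python
def merge_changes(base, current, mine):
    """Merge my edits into the current source, field by field.

    base is the state when my copy was made, current is the source now, and
    mine is my edited copy. Returns (merged, conflicts), where conflicts lists
    the fields that both of us changed to different values.
    """
    merged = dict(current)
    conflicts = []
    for field, my_value in mine.items():
        if my_value == base.get(field):
            continue                        # I did not touch this field
        if current.get(field) not in (base.get(field), my_value):
            conflicts.append(field)         # real collision: different edits
        else:
            merged[field] = my_value        # untouched by others, or same edit
    return merged, conflicts


base = {"first_name": "Jon", "phone": "555-0100"}
current = {"first_name": "Jon", "phone": "555-0199"}   # you changed the phone
mine = {"first_name": "John", "phone": "555-0100"}     # I changed the first name

merged, conflicts = merge_changes(base, current, mine)
```

In this run both changes survive and no conflicts are reported. Had we both changed the first name to different values, that field would appear in conflicts and you would fall back on one of your collision handling strategies.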
For object concurrency control to be successful, all of your developers must follow the same strategy. All it takes is one person doing truly optimistic concurrency control when everyone else is doing optimistic, and you're out of luck. Your object concurrency control strategy should be implemented in a common application framework or reusable component that everyone uses in their code. Object concurrency control is an important aspect of any multiuser application, be it developed using C++, C#, Visual Basic or Enterprise JavaBeans. The technology may change over time, but the fundamentals stay the same.