Know When to Use an Active Object Instead of a Mutex
Got state? Hide it! And if it's shared and either popular or slow, make it asynchronous... it's just good encapsulation.
Let's say that your program has a shared log file object. The log file is likely to be a popular object; lots of different threads must be able to write to the file; and to avoid corruption, we need to ensure that only one thread may be writing to the file at any given time.
Quick: How would you serialize access to the log file?
Before reading on, please think about the question and pencil in some pseudocode to vet your design. More importantly, especially if you think this is an easy question with an easy answer, try to think of at least two completely different ways to satisfy the problem requirements, and jot down a bullet list of the advantages and disadvantages they trade off.
Ready? Then let's begin.
Option 1 (Easy): Use a Mutex (or Equivalent)
The most obvious answer is to use a mutex. The simplest code might look like that in Example 1(a):
// Example 1(a): Using a mutex (naive implementation)
//
// The log file, and a mutex that protects it
File logFile = …;
mutex_type mLogFile( … );
// Each caller locks the mutex to use the file
lock( mLogFile ) {
  logFile.write( … );
  logFile.write( … );
  logFile.write( … );
} // unlock
Example 1(a): Using a mutex (naive implementation)
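For readers who want to compile something, the pseudocode above might be rendered in standard C++ roughly as follows. This is a minimal sketch under assumptions of our own: std::ostringstream stands in for File, std::mutex and std::lock_guard stand in for mutex_type and the lock block, and runCallers and countLines are hypothetical helpers that are not part of the original example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical harness: starts `threads` callers that each write three lines
// under the mutex, then returns everything that was logged.
std::string runCallers(int threads) {
    assert(threads >= 0);

    std::ostringstream logFile;   // assumption: stands in for File
    std::mutex mLogFile;          // the mutex that protects logFile

    // Each caller locks the mutex to use the file.
    auto caller = [&logFile, &mLogFile](int id) {
        std::lock_guard<std::mutex> hold(mLogFile);   // lock
        logFile << "caller " << id << ": first\n";
        logFile << "caller " << id << ": second\n";
        logFile << "caller " << id << ": third\n";
    };                                                // unlock when hold is destroyed

    std::vector<std::thread> pool;
    for (int id = 0; id < threads; ++id) pool.emplace_back(caller, id);
    for (std::thread& t : pool) t.join();
    return logFile.str();
}

// Counts the lines in a log produced by runCallers.
std::size_t countLines(const std::string& log) {
    return static_cast<std::size_t>(std::count(log.begin(), log.end(), '\n'));
}
```

Because each caller holds the lock across all three writes, its lines stay together in the output; the order in which the callers themselves appear is unspecified.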
If you've been paying attention to earlier installments of this column, you may have written it as shown in Example 1(b) instead, which lets us ensure that the caller doesn't accidentally write a race because he forgot to take a lock on the mutex (see [1] for details):
// Example 1(b): Using a mutex (improved implementation)
//
// Encapsulate the log file with the mutex that protects it
struct LogFile {
  // Hide the file behind a checked accessor
  // (see [1] for details)
  PROTECTED_WITH( mutex_type );
  PROTECTED_MEMBER( File, f );
  // A convenience method to avoid writing "f()" a lot
  void write( string x ) { f().write( x ); }
};
LogFile logFile;
// Each caller locks the entire thing to use the file
lock( logFile ) {
  logFile.f().write( … ); // we can use the f() accessor
                          // explicitly
  logFile.write( … );     // but mostly let's use the
  logFile.write( … );     // convenience method
}
Example 1(b): Using a mutex (improved implementation)
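The PROTECTED_WITH and PROTECTED_MEMBER macros come from [1] and are not reproduced here. As a rough illustration of the same encapsulation idea using a different, simpler technique, the sketch below keeps the file and its mutex private and only hands the file to code that runs while the lock is held. The names locked and contents, and the use of std::ostringstream in place of File, are assumptions made for this sketch.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical simplification; NOT the checked-accessor macros from [1].
class LogFile {
public:
    // Runs `body` while holding the lock, passing it the protected stream.
    // Callers group several writes into one critical section this way.
    template <typename Body>
    void locked(Body&& body) {
        std::lock_guard<std::mutex> hold(m_);
        std::forward<Body>(body)(f_);
    }

    // Convenience method for a single write that takes the lock itself.
    void write(const std::string& x) {
        locked([&x](std::ostream& f) { f << x; });
    }

    // Returns a copy of everything written so far (for inspection only).
    std::string contents() {
        std::lock_guard<std::mutex> hold(m_);
        return f_.str();
    }

private:
    std::mutex m_;            // protects f_
    std::ostringstream f_;    // assumption: stands in for File
};
```

With this shape there is no way to reach the file without going through the mutex, which is the property Example 1(b) is after. One caveat of the sketch: calling write from inside a locked body would try to take the same non-recursive mutex again, so code inside locked should write to the stream it is given instead.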
Examples 1(a) and 1(b) are functionally equivalent; the latter is just more robust. Ignoring that for now, what are the advantages common to both expressions of our Option 1?
The main advantage of Option 1 is that it's correct and thread-safe. Protecting the log file with a mutex serializes callers to ensure that no two threads will be trying to write to the log file at the same time, so clearly we’ve solved the immediate basic requirement.
But is this the best solution? Unfortunately, Option 1 has two performance issues, one of them moderate and the other potentially severe.
The moderate performance problem is loss of concurrency among callers. If two calling threads want to write at the same time, one must block to wait for the other's work to complete before it can acquire the mutex to perform its own work, which loses concurrency and therefore performance.
The more serious issue is that using a mutex doesn't scale, and that becomes noticeable quickly for high-contention resources. Sharing is the root of all contention (see [2]), and there's plenty of potential contention here on this global resource. In particular, consider what happens when the log file is pretty popular, with lots of threads intermittently logging things, but the log file's write function is a slow, high-latency operation: it may be disk- or network-bound, unbuffered, or slow for other reasons. Say that a typical caller is calling logFile.write regularly, and that the calls to logFile.write take about 1/10 of the wall-clock time of the caller's computation. That means that 10% of a typical caller's time is spent inside the lock, which means that at most 10 such threads can be active at once before they start piling up behind the lock and throttling each other. It's not really great to see the scalability of the entire program be limited to at most 10 such threads' worth of work.
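The arithmetic in this scenario can be written down as a small model. The function below is an idealized sketch, not a measurement: it assumes every thread spends the same fraction of its time inside the lock and ignores the cost of the lock itself. The name maxSpeedup is hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Upper bound on the speedup over a single thread when `threads` identical
// callers each spend `lockedFraction` of their time holding one shared lock.
// The locked portions cannot overlap, so total throughput can never exceed
// 1 / lockedFraction threads' worth of work, however many threads we add.
double maxSpeedup(int threads, double lockedFraction) {
    assert(threads > 0);
    assert(lockedFraction > 0.0 && lockedFraction <= 1.0);
    return std::min(static_cast<double>(threads), 1.0 / lockedFraction);
}
```

With lockedFraction set to 0.1, as in the scenario above, the bound grows with the thread count up to about 10 and then stays flat no matter how many more threads are added.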
We can do better. Given that there can be plenty of contention on this resource, the only winning strategy is not to share it…at least, not directly. Let's see how.








