Option 2 (Better): Use Buffering, Preferably Asynchronous
One typical strategy for dealing with high-latency operations is to introduce buffering. The most basic kind of buffering is synchronous buffering: for example, we could do all the work synchronously inside the calls to write, but have most calls to write only add the data to an internal queue, so that write only actually writes anything to the file itself every N-th time, or if more than a second has elapsed (perhaps using a timer event to trigger occasional extra empty calls to write just to ensure flushing occurs), or using some other heuristic.
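That synchronous-buffering heuristic can be sketched roughly as follows. The class name, the flush threshold, and the use of an in-memory vector standing in for the real file are all illustrative assumptions, not part of the column's examples:

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative sketch of synchronous buffering: write() normally just
// appends to an internal queue, and only flushes to the underlying file
// every N entries or when more than a second has passed since the last
// flush, whichever comes first.
class BufferedLog {
public:
    explicit BufferedLog(std::size_t flushEvery = 8)
        : flushEvery_(flushEvery),
          lastFlush_(std::chrono::steady_clock::now()) {}

    void write(const std::string& s) {
        pending_.push_back(s);
        auto now = std::chrono::steady_clock::now();
        if (pending_.size() >= flushEvery_ ||
            now - lastFlush_ >= std::chrono::seconds(1))
            flush(now);
    }

    std::size_t flushedCount() const { return flushed_.size(); }

private:
    void flush(std::chrono::steady_clock::time_point now) {
        for (auto& s : pending_)
            flushed_.push_back(s);   // stands in for the real file write
        pending_.clear();
        lastFlush_ = now;
    }

    std::size_t flushEvery_;
    std::chrono::steady_clock::time_point lastFlush_;
    std::vector<std::string> pending_;   // the internal queue
    std::vector<std::string> flushed_;   // stands in for the file
};
```

Note that all of this still runs on the caller's thread; the flushing call to write simply pays the accumulated cost for its predecessors.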
But this column is about effective concurrency, so let's talk about asynchronous buffering. Besides, it's better in this case because it gets much more of the work off the caller's thread.
A better approach in this case is to use a buffer in the form of a work queue that feeds a dedicated worker thread. The caller writes into the queue, and the worker thread takes items off the queue and actually performs the writing. Example 2 illustrates the technique:
// Example 2: Asynchronous buffering
//
// The log file, and a queue and private worker thread that
// protects it
message_queue<string> bufferQueue;

// Private worker thread mainline
File logFile = …;
while( str = bufferQueue.pop() ) {   // receive (async)
    // If the queue is empty, pop blocks until something is available.
    // Now, just do the actual write (now on the private thread).
    logFile.write( str );
}

// Each caller assembles the data they don't want interleaved
// with other output and just puts it into the buffer/queue
string temp = …;
temp.append( … );
temp.append( … );
bufferQueue.push( temp );   // send (async)
Example 2: Asynchronous buffering
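Example 2 assumes a message_queue type with a thread-safe push and a pop that blocks while the queue is empty. Here is one minimal sketch of such a queue, built from a standard mutex and condition variable; the implementation details are my own illustration, not part of the column's code:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Minimal sketch of the message_queue Example 2 assumes:
// push is safe to call from multiple threads, and pop blocks
// until an item is available.
template <typename T>
class message_queue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(value));
        }
        cv_.notify_one();   // wake a waiting pop, if any
    }

    T pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });   // block while empty
        T value = std::move(q_.front());
        q_.pop();
        return value;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<T> q_;
};
```

The key point is that the lock protects only the brief queue manipulation, never the actual file write, so callers contend only for microseconds.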
Note that in this approach the individual calls to send on multiple threads are thread-safe, but they can interleave with each other. Therefore, a caller who wants to send several items that should stay together can't just get away with making several individual calls to send, but has to assemble them into an indivisible unit and send that all in one go, as shown above. This wasn't a problem in Option 1, because the indivisible unit of work was already explicit in the Example 1(a) and 1(b) calling code: while the lock was held, no other thread could get access to the file, and so no other calls could interleave.
Another minor drawback is that we have to manage an extra thread, including accounting for its termination; somehow, the private thread has to know when to go away, and Example 2 leaves that part as an exercise for the reader. I call this issue "minor" because the extra complexity isn't much, and termination is easy to deal with in a number of ways (note that Option 1 had a similar termination issue, too, to make sure it destroyed the file object), but I mention it for completeness: if you use a strategy like Example 2, don't forget to join with those background helper threads at the end of the program!
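One easy way to handle termination is with a sentinel value: the owner pushes a value that the worker recognizes as "time to exit," then joins the thread. The sketch below uses an empty string as the sentinel and an in-memory vector standing in for the log file; the names and the choice of sentinel are illustrative assumptions, not the column's prescription:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

std::mutex m;
std::condition_variable cv;
std::queue<std::string> q;
std::vector<std::string> fileContents;   // stands in for the log file

// Thread-safe send: enqueue and wake the worker.
void push(std::string s) {
    {
        std::lock_guard<std::mutex> lock(m);
        q.push(std::move(s));
    }
    cv.notify_one();
}

// Blocking receive: wait until an item is available.
std::string pop() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return !q.empty(); });
    std::string s = std::move(q.front());
    q.pop();
    return s;
}

// Private worker thread mainline: drain the queue until the
// sentinel (an empty string) says it's time to go away.
void workerMain() {
    for (;;) {
        std::string s = pop();
        if (s.empty())
            return;                      // sentinel seen: exit cleanly
        fileContents.push_back(s);       // the real logFile.write goes here
    }
}
```

At shutdown, the owner pushes the sentinel and joins the worker, guaranteeing any already-queued writes are flushed before the program ends.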
But enough about minor drawbacks, because Option 2 delivers major advantages in the area of performance. Instead of waiting for an entire write operation to complete, possibly incurring high-latency accesses and all the trimmings, now the caller only has to wait for a simple and fast message_queue.push operation. Because no part of the actual write ever executes on the caller's thread, callers will never have to wait for each other for any significant amount of time, even if two try to write at the same instant. By thus eliminating throttling, we eliminate both performance issues we had with Option 1: we get much better concurrency among callers, and we eliminate the scalability problem inherent in the mutex-based design.
Guideline: Prefer to make high-contention and/or high-latency shared state, notably I/O, be asynchronous and therefore inherently buffered. It's just good encapsulation to hide the private state behind a public interface.
Oh, but wait: don't modern languages have something called "classes" to let us express this kind of encapsulation? Indeed they do, which brings us to Option 3…