I recently wrote up my rules of thumb for concurrent programming. I've received complements on the rules but I stirred up some controversy with my rule #5.
Rule #5 was a rant against locks -- simply put, I tried to say "avoid locks if you can." I said "just say NO to locks."
Why would I lash out against locks? Simple: every day we see programs that lock so much, in an effort to protect share data and maintain a consistent program state, that they do not scale well. Locks turn sections of programs into serial code. Amdahl's Law tells us this is bad. So how could anyone defend locks?
Coming to the Defense of Locks
It's not that hard to defend locks, it turns out. Because there is something much worse than a program not scaling namely: a program that is not working!
Ideally, I suggest you only remove those locks that are safe to remove. We need to redesign our code to need fewer locks. I could sum up the feedback from experienced developers: "It doesn't work that way in practice. Locks are important."
In fact, it's a safe generalization that bad things happen when locks are removed. It's almost to the point of being able to say, "if you remove a lock, you'll have seven days of bad luck (debugging)."
The More Locks the Better: Surely You're Joking?
In fact, I've been told by some, "the more locks the better." While this is a bit extreme, the point is that locks are cheap and easy. They only hurt scaling in the case where data is highly contended. Locks only make code serial (and invoke the downsides of Amdahl's Law) when other threads have to wait for a lock.
Since the goal with designs should be to avoid having highly contended locks -- the obvious observation is that we should use lots of locks and focus only on avoiding contention. Lots of locks will give us the safety and working program we desire, avoiding contention is what will give us scalability.
The New Rule #5
So, I think rule #5 will now be "avoid contention when using mutual exclusion." The subtitle will be "Locks: you can't live with them, you can't live without them."
The Bigger the Lock, the Safer, and the More Contention
I just cannot get over the headaches around scaling when locks are put around big sections of code. Consider the code:
c[i] = c[i] + sqrt(c[i]);
If I want consistent state in a threaded program, I need to put a LOCK around the code. I could have a separate lock for each c[i], but that's a real hassle and not clear it is worth all the set up for the locks. So I write:
LOCK c_lock c[i] = c[i] + sqrt(c[i]); UNLOCK c_lock
Assuming the lock is contended, this section of code can easily become a bottleneck. We could fix by using more locks:
LOCK c_lock[i] c[i] = c[i] + sqrt(c[i]); UNLOCK c_lock[i]
But it is not hard to envision more complex examples where it is not this simple. And those examples turn out to be very common, and the common and reasonable response is to use a single big lock around it all. Why? It is safe, it works, and life goes on. Scaling can wait -- at least until we get a bunch of cores.
Transactional Memory to the Rescue?
What if we could just write:
DO THIS AS IF ALL AT ONCE BEGIN c[i] = c[i] + sqrt(c[i]); DO THIS AS IF ALL AT ONCE END
That's the idea behind transactional memory. The key is that nothing we wrote said "make this run in serial." Critical sections, locked regions, etc. all insist on "make this run in serial."
It also turns out that a compiler can implement transaction memory without changes to the hardware. This is known as Software Transactional Memory, or STM. Intel just released support to extend their C and C++ compiler to support STM on the Whatif.intel.com Web site.
A classic example of when a "GIANT LOCK" would generally be used, and transactional memory "saves the day" is this: assume you have two tables (queues, hash tables, etc.) and at any given time entry "GOLD NUGGET" can be in one and only one of the tables. If you want to move the "GOLD NUGGET" from one table to the other while making sure any other thread will always see it in one table or the other (never both, never neither), you are likely to use a LOCK to prevent access to either table while you do the processing to move the "GOLD NUGGET." This works, and it's safe -- but it is exactly the type of overkill that will stop all threads trying to access either table.
Transactional memory excels at offering a better option.
Rule #5: Locks, Locks, Locks and More Locks and Some STM?
With my rule #5 being "avoid contention when using mutual exclusion," we can expect to see locks far into the future. But let's try to limit their use, and let's figure out when using STM is a better choice.
Unfortunately, STM is not a cure-all replacement for locks. There are things that locks are better suited for -- but we'll have to cover that later. In fact, STM can create deadlocks in some cases if blindly used to replace locks. Explaining that might take a whole column. And the promise of "composability" for STM goes only as far as you can avoid non-reversal operations such as most I/O. Again, we'll have to offer equal time to STM-bashing in the future. For now, I've had to rethink my lock-bashing and offer a little more praise for them. They work well -- do NOT underutilize them. A missing lock is far worse than an extra lock. Far worse.
Herb Sutter's Advice
Herb Sutter recently reminded me of a presentation I saw him give this past spring -- he's since updated it. His recommendation is the same: Greatly reduce locks. (Alas, not "eliminate.")
- Enable transactional programming: Transactional memory is our best hope.
- Abstractions to reduce "shared": Messages. Futures. Private data (e.g., active objects).
- Techniques to reduce "mutable": Immutable objects. Internally versioned objects.
- Some locks will remain. Let the programmer declare: (1) Which shared objects are protected by which locks. (2) Lock hierarchies (caveat: also not composable).
I couldn't agree more.
So, keep on locking. Look for some relief from STM -- but not a cure-all.
James Reinders is a senior engineer and is currently the director of business development and marketing for Intel's Software Development Products and serves as the chief evangelist and spokesperson.