In a previous editorial, I posed the question: Why is it that we do not use task-oriented parallelism more frequently? Several readers were kind enough to write in and point out that, in fact, this form of parallelism is alive and well. I have no doubt about that. The question I want to pursue is why it is not more common, given that it tends to do away with many of the problems created by mutual-exclusion-based parallelism.
Probably the most widely used model of task-based parallelism is implemented via actors. As I have discussed in previous columns, actors are small kernels of computation that are complete unto themselves (generally implemented as a thread with an attached mailbox that serves as a message and task queue). In this model, the actors represent discrete tasks to which data is sent for processing. Because the actors communicate only with other actors and do so only through incoming and outgoing messages, there is no interference between threads and most of the mutual-exclusion problems disappear.
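To make the "thread with an attached mailbox" description concrete, here is a minimal sketch in Python. The names `Actor`, `send`, `stop`, and `run_pipeline` are my own inventions for illustration and do not come from any particular actor library; production actor frameworks add supervision, addressing, and error handling that this omits.

```python
import queue
import threading


class Actor:
    """A thread with an attached mailbox (hypothetical class, illustration only)."""

    _STOP = object()  # sentinel message that tells the actor to shut down

    def __init__(self, handler):
        self._mailbox = queue.Queue()   # message and task queue
        self._handler = handler         # called once per incoming message
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        """The only way to talk to an actor: drop a message in its mailbox."""
        self._mailbox.put(message)

    def stop(self):
        """Ask the actor to finish queued messages, then wait for it to exit."""
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._handler(message)


def run_pipeline(words):
    """Two actors in a chain: one upper-cases words, one collects the results."""
    collected = []
    collector = Actor(collected.append)               # only this actor touches the list
    shouter = Actor(lambda w: collector.send(w.upper()))
    for word in words:
        shouter.send(word)
    shouter.stop()      # drains the first mailbox
    collector.stop()    # drains the second mailbox
    return collected
```

Calling `run_pipeline(["task", "actor"])` passes each word through both actors. No lock appears anywhere in the user code; the only shared structure is the mailbox, and its synchronization is handled inside `queue.Queue`.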
Actors are not a free ride to easy parallelism, however. In most implementations, the data sent to actors must be immutable. If the actor needs to change an incoming immutable object, it clones the object and makes changes to the cloned version. This is where one of the first obstacles to widespread task-based programming becomes salient: Few languages have full support for immutable data. Nearly all languages have some limited support for immutability, but rarely is it robust enough to facilitate parallel data processing without lots of caveats or extra code. Most of the languages that do offer robust immutability are not mainstream (Erlang, Scala, Ada, and Fantom, among others). Some models, such as dataflow programming, use an entirely different kind of lexicon for directing data movement that side-steps the immutability problem.
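The clone-then-change step, and the caveats that come with limited immutability support, can be sketched in Python. The `Order` type and `mark_shipped` function are hypothetical examples, not part of any library; the sketch assumes a frozen dataclass is an acceptable stand-in for an immutable message.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Order:
    """A message type whose fields cannot be reassigned after construction."""
    order_id: int
    items: tuple          # a tuple, because a list field would still be mutable
    status: str = "new"


def mark_shipped(order):
    """An actor that needs a changed order builds a modified copy instead."""
    return replace(order, status="shipped")
```

A sender that keeps its reference to the original `Order` still sees `status == "new"` after `mark_shipped` runs, and assigning to `order.status` raises `FrozenInstanceError`. The caveat is visible in the `items` field: freezing is shallow, so the developer must remember to use a tuple rather than a list. That is exactly the kind of extra care the paragraph above describes.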
Other models, in which the work to be done, rather than just the data, is sent to the task for completion, are hamstrung by the lack of robust support for first-class functions in mainstream languages. First-class functions enable developers to send a function as if it were an object to another function for execution. In C, the function pointer is a weak equivalent.
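As a rough illustration of sending the work itself to a task, the following Python sketch passes functions through a queue as ordinary values. The `worker` and `run_jobs` names are hypothetical, and the single worker thread is an assumption made to keep the example short.

```python
import queue
import threading


def worker(tasks, results):
    """Receive (function, argument) pairs and execute whatever work arrives."""
    while True:
        job = tasks.get()
        if job is None:          # None is the agreed shutdown signal
            break
        func, arg = job
        results.put((arg, func(arg)))


def run_jobs(jobs):
    """Send each (function, argument) pair to the worker and gather the results."""
    tasks, results = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(tasks, results))
    thread.start()
    for job in jobs:
        tasks.put(job)           # the function travels like any other object
    tasks.put(None)
    thread.join()
    out = []
    while not results.empty():   # safe here: the worker has already exited
        out.append(results.get())
    return out
```

The worker knows nothing in advance about `len` or `str.upper`; a call such as `run_jobs([(len, "abc"), (str.upper, "hi")])` hands it both the data and the operation to perform.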
If you think that my mention of first-class functions and immutability is about to tip over into an encomium for functional programming languages, you'll be disappointed. I don't believe that the core issue that makes task-based programming elusive is the failure of functional languages to gain greater popularity. However, having these features in mainstream languages would be of considerable assistance.
Task-based programming today is mostly done the hard way: with third-party libraries and a lot of manual work. The available libraries (OpenMP 3.0, Intel TBB, Intel Cilk, among others) depend on the developer to provide tasks suitable for handling by thread pools. As a result, the primary use of these libraries is oriented toward problems defined by data decomposition. Not all such coding depends on libraries. At the end of this article, I've included a short but very interesting letter from reader John Revill, who describes his task-oriented programming using an entirely different approach.
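The library-driven style described above, in which the developer carves the data into pieces and hands them to a thread pool, might look like the following Python sketch. It uses the standard `concurrent.futures` module rather than any of the libraries named here, and the helper names and default sizes are hypothetical. Note also that in CPython the global interpreter lock keeps threads from speeding up CPU-bound work like this, so the sketch shows the shape of the technique, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, size):
    """Data decomposition: split the input into independent slices."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The task the developer must supply: work on one slice, touch nothing else."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, chunk_size=1000, workers=4):
    """Hand each slice to the pool, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum_of_squares, chunked(data, chunk_size))
        return sum(partials)
```

Everything interesting here is manual: choosing the chunk size, writing a task that is free of shared state, and combining the partial sums. The pool only supplies the threads.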
While it's certainly good news that third-party libraries facilitate data-parallel tasks and recursive functions, these use cases will not lead us to reconsider all programs as series of parallelizable tasks, and crossing that chasm is the goal I have in mind.
Using parallelism for one-off tasks is sometimes referred to as unstructured parallelism. It's traditionally been the kind of work that was done wholly manually, with fork/join operations and other coding techniques about as pleasant as open heart surgery. It involves the kinds of tasks that naturally spin out of functional decomposition. This decomposition asks the question: What tasks can be broken off as separate standalone entities that can be run in parallel, with a minimum of synchronization with other tasks?
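For readers who have not written it by hand, manual fork/join over functionally decomposed tasks looks roughly like this in Python. The `fork_join` helper is hypothetical. The sketch assumes each task is a zero-argument callable that shares no state with the others, and it relies on CPython's behavior that separate threads can safely store results under distinct dictionary keys.

```python
import threading


def fork_join(tasks):
    """Run each named zero-argument callable on its own thread, then wait for all."""
    results = {}

    def run(name, func):
        results[name] = func()   # each thread writes only its own key

    threads = [
        threading.Thread(target=run, args=(name, func))
        for name, func in tasks.items()
    ]
    for thread in threads:       # fork: start every standalone task
        thread.start()
    for thread in threads:       # join: the single synchronization point
        thread.join()
    return results
```

A call such as `fork_join({"total": lambda: sum(range(100)), "longest": lambda: max(["a", "bbb"], key=len)})` runs two unrelated pieces of work side by side. Even in this tiny form, the developer is responsible for thread creation, result collection, and shutdown, and an exception raised inside a task is silently lost.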
My quest, though, is to extend this kind of task encapsulation into architecture and design in the same way that objects have done in the past. Suppose that tasks were to become the new objects from which programs are assembled. How would programming change?