Parallel Map and Reduce in Heron
I've been rather quiet lately while I labor away on Heron in the background. Rather than freezing the feature set and moving as fast as I can toward an official v1.0 release, I've decided to make Heron a bit sexier by parallelizing the map and reduce operators.
Heron has several operators for list processing:
- map - Transforms a list into a new list by applying a unary function to each item in the original list.
- accumulate - Iterates over items in a list, combining each item with an accumulator value using a binary function.
- select - Creates a list by selecting items from another list that satisfy a unary predicate.
- reduce - Combines the items in a list by repeatedly applying a binary function to pairs of items, in a nondeterministic order.
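To make the differences between these four operators concrete, here is a rough sketch of their sequential behavior. It is written in Python rather than Heron, and the function names (heron_map, heron_select, heron_accumulate, heron_reduce) are made up for illustration. In particular, the way reduce picks pairs here (random adjacent pairs) and the fact that it returns a single value are assumptions, not a description of Heron's actual implementation.

```python
import random

def heron_map(xs, f):
    # Apply a unary function to every item, producing a new list.
    return [f(x) for x in xs]

def heron_select(xs, pred):
    # Keep only the items that satisfy a unary predicate.
    return [x for x in xs if pred(x)]

def heron_accumulate(xs, initial, f):
    # Fold the items into an accumulator, strictly left to right.
    acc = initial
    for x in xs:
        acc = f(acc, x)
    return acc

def heron_reduce(xs, f):
    # Combine adjacent pairs in an arbitrary order (an assumption about
    # how "nondeterministic order" might look). The result is only
    # well defined if f is associative.
    items = list(xs)
    if not items:
        raise ValueError("cannot reduce an empty list")
    while len(items) > 1:
        i = random.randrange(len(items) - 1)
        items[i:i + 2] = [f(items[i], items[i + 1])]
    return items[0]
```

The key contrast is between accumulate and reduce: accumulate promises an order, so each step depends on the one before it, while reduce makes no such promise, which is exactly what leaves room for running it on several cores.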
The current implementation of Heron runs these operations sequentially. However, the "map", "select", and "reduce" operators can all be parallelized (read as: run on multiple cores).
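As a rough idea of what parallelizing map and reduce can look like, here is a minimal sketch in Python using a thread pool. This is not Heron's implementation: the names parallel_map and parallel_reduce, the workers parameter, and the chunking strategy are all assumptions chosen for illustration. The reduce version splits the list into contiguous chunks, reduces each chunk on its own thread, and then combines the partial results, which gives the right answer only when the binary function is associative.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def parallel_map(xs, f, workers=4):
    # Each item is independent, so the calls can be handed to a pool.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(f, xs))

def parallel_reduce(xs, f, workers=4):
    # Assumes f is associative; otherwise the grouping changes the result.
    items = list(xs)
    if not items:
        raise ValueError("cannot reduce an empty list")
    # Ceiling division, so we get at most `workers` contiguous chunks.
    chunk_size = max(1, -(-len(items) // workers))
    chunks = [items[i:i + chunk_size]
              for i in range(0, len(items), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Reduce every chunk independently, keeping chunk order.
        partials = list(pool.map(lambda chunk: reduce(f, chunk), chunks))
    # Combine the per-chunk results sequentially.
    return reduce(f, partials)
```

For example, parallel_reduce(range(1, 101), lambda a, b: a + b) sums the numbers 1 through 100 in four chunks. Note that in standard CPython the global interpreter lock keeps pure-Python functions like this from actually running on several cores at once, so treat the sketch as a picture of the structure rather than a benchmark.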
Releasing a language with the potential to be interpreted in parallel, without actually doing so in my default implementation, would be kind of anticlimactic. So I decided that I really should try to get at least the map and reduce operators running in parallel.
If "map and reduce" sounds familiar but you aren't sure from where, it's because MapReduce is the name of a framework for performing distributed computing on large data sets.
So why am I doing all of this? Well, because I am too darn lazy to learn how to do concurrent programming properly using threads. I figured it would be easier to make my language manage concurrency automatically for me. The irony is that I still have to learn to use threads properly to make the implementation multithreaded. Oh well, at least I'm having fun, I think.