Putting It All Together
Now let's put everything together in a semirealistic program that calculates the primes up to, but not including, N. There are almost as many different approaches to parallel prime finders as there are primes. In this one, we use eachElem to create a number of tasks, where each task represents a range of odd integers that should be combed for primes. To test for primality, we'll try to divide each number, n, by all primes less than or equal to sqrt(n), which implies that tasks are dependent on the results of previous tasks.
Listing One is the entire code. The master instantiates a Sleigh, initializes its workers via eachWorker, creates a list of tasks, and then runs the tasks via eachElem. The two general parameters that give the chunk length (which must be even), and N, are bound to a variable in an NWS and are retrieved by the function invoked via eachWorker. Each task is represented by an integer: the beginning of a chunk of integers to check for primality. Each task returns a list of primes that were found in that chunk. Checking a particular number may require candidate factor primes that the worker doesn't already know, in which case they'll get them via find.
import sys from nws.sleigh import Sleigh def initPrimes(): global chunk, chunks, limit limit, chunk = SleighUserNws.find('prime parameters') chunks = {} def findPrimes(first): last = min(first+chunk, limit) # we need to know all the primes up to the smaller of the start of # this chunk or the square root of its last element. need = min(first-2, int(.5+(last-1)**.5)) myPrimes = [] for c in xrange(3, need+1, chunk): if not c in chunks: chunks[c] = SleighUserNws.find('primes%d'%c) myPrimes += chunks[c] newx = len(myPrimes) for x in xrange(first, last, 2): for p in myPrimes: if x%p == 0: break else: myPrimes.append(x) if first**2 < limit: SleighUserNws.store('primes%d'%first, myPrimes[newx:]) return myPrimes[newx:] def master(workers, limit, chunk): s = Sleigh(workerCount=workers) s.userNws.store('prime parameters', (limit, chunk)) s.eachWorker(initPrimes) primes = [2] map(primes.extend, s.eachElem(findPrimes, range(3, limit, chunk))) return primes if __name__=='__main__': primes = master(int(sys.argv[1]), int(sys.argv[2]), int(sys.argv[3])) print len(primes), primes[:3], '...', primes[-3:]
The workers use SleighUserNws as their NWS; this is a workspace object corresponding to a uniquely named NWS that is created by Sleigh. It is a convenient place to store variables related to the Sleigh run. Also, the workers remember which primes they've seen so that they don't have to get them for subsequent tasks.
Conclusion
Python and NetWorkSpaces make it easy to create and experiment with parallel programs without requiring specialized tools or hardware. Communication and synchronization are greatly simplified by reduction to manipulation of variables in a shared workspace.
The programs can be tested on a single CPU using multiple processes (or threads), or for actual speedup, on multicore CPUs or a collection of computers. The state of the parallel computation is captured by the variables stored in NetWorkSpace. These can be visualized via the web interface, which makes it easy to understand and debug the program, and monitor progress. Because NetWorkSpaces is language-neutral, code and idioms can be transferred to different environments, and it can even be used to create ensembles from programs written in different languages.
Acknowledgments
Thanks to Scientific Computing Associates (www.lindaspaces.com) for supporting the development of NetWorkSpaces, and Twisted Matrix Labs for supporting Twisted (twistedmatrix.com). Sleigh was inspired in part by the R Project's SNOW package (short for "Simple Network Of Workstations"), developed by Luke Tierney, A.J. Rossini, Na Li, and H. Sevcikova (cran.r-project.org/doc/packages/snow.pdf).