Game of Life — Avoiding Deserted Areas
I once read a humorous article that detailed how different professions and programmers of different languages would approach the task of finding an elephant in Africa. The basic computer science algorithm was to start at the Cape of Good Hope, walk east until you reach the ocean, move north one step, then walk west, and repeat (moving back and forth across and up the continent) until you find an elephant. Of course, the savvy programmer will place a known elephant in Cairo to ensure that the execution halts. (One online version of such an article can be found here.)
White PapersMore >>
- High Performance Computing in Finance: Best Practices Revealed
- Accelerate Cloud Computing Success with Open Standards
The default version of the Game of Life that I've discussed recently feels a lot like the search for an elephant in Africa. In each generation the application looks at each grid point in the simulation and counts the number of live neighbor cells to determine the state of the current cell for the next generation. Even those cells that have nothing around them to count have to be processed, just like looking at all those acres of African desert that do not contain an elephant or any other kind of animal.
But what if we could focus only on those parts of the grid where we know live cells reside? We could ignore those empty spots that have no live neighbors and compute the next generation much quicker. If I were to equate this with the elephant search metaphor, I would say this is like having satellite coverage of the whole African landmass and being able to identify large gray blobs, which could be potential elephants or herds of elephants. Knowing that there are possible elephants in a given area (or better yet, knowing those places that are elephant-free), I can restrict my searching efforts and find an elephant much quicker than if I have to look at every square foot of the continent. This is an approach to the Game of Life that I discovered in the book Data Structures & Program Design in C (Second Edition) by Robert Kruse, C.L. Tondo, and Bruce Leung (Prentice-Hall, 1997).
Like the original version, this improved version of the Game of Life needs two arrays (with border cells that are not part of the grid). Only one of these arrays will be used to denote which cells are live. In the original version I needed two such grid arrays — one for the previous generation configuration and one to hold the state of the cells in the generation being computed.
In the new version, the first array will still hold the configuration of cells in the current generation, which can be used to print out the state of the simulation at any desired time step. The second array is used to keep a count of the live neighbors for each corresponding grid point of the cell's array. The obvious advantage of this array is that once the initial neighbor counts are stored, finding the number of neighbors for any cell is simply an array look up.
So what? With the second array I've simply taken a function call with a loop over the eight neighbor cells to compute the number of live neighbors and replaced it with a table look-up. Using just these two arrays, you should realize that to find those cells that might be newly alive or need to be removed in the next generation I still have to compute across the whole grid area. I'm sure you'll agree that these two grids won't lead to much of a performance boost and it certainly does not bring a computational focus to only those spots in the grid that might be most volatile as promised above.
The key to the new version of Game of Life is to use lists of grid points that are candidates of points that might birth a new live cell or have a cell that could be killed in the next generation. With these additions, to compute a new generation of cells the grid points in the lists are each examined, the count of the current neighbors (second array) is consulted, and the proper update is made to the grid cell (first array). Also, wherever live cells are added or subtracted from the grid, the neighbor counts of the surrounding cells are updated and those grid points that should be examined for birth or death in the next generation, due to a critical neighbor count being reached, can be added to the lists.
Here is the pseudo-code outline from Data Structures & Program Design in C (Second Edition):
Get the initial configuration of living cells
and use it to calculate an array holding the neighbor counts of all
cells. Determine the cells that will become alive and cells that will
become dead in the first generation; Repeat the following steps as long as desired:
Get the initial configuration of living cells and use it to calculate an array holding the neighbor counts of all cells. Determine the cells that will become alive and cells that will become dead in the first generation;
Repeat the following steps as long as desired:
Since the processing only checks the grid point cells that have been added to one of the lists being kept, only those points of the grid that require computation will be processed. All other cells that are either empty or have a stable configuration of live cells (that don't change from one generation to the next) are not processed. For configurations that "move" across the grid (e.g., gliders), as new cells are added to the structure, the surrounding grid points in the path of the "motion" will be added to the appropriate list and those grid points on the trailing edge (those that die after the structure has passed) will stop being added to lists.
In the next post, I'll give more details and code for the program steps outlined above. Then, once we have an understanding of the serial code and how it works, I'll look at what can be done to make this new version of Game of Life run in parallel.
To whet your appetite for the parallelization, consider the following "off the cuff" observation. In the parallel code of the original version of the Game of Life, all of the processing covering the entire grid makes for a load-balanced solution. Because of the domain decomposition of the grid into equally sized subgrids, each thread has the same number of cells to process. All threads execute roughly the same number of operations and none of them should be waiting too long for all processing of the current generation to complete. If, in the new version, we again divide up processing into equal-sized subgrids, can we guarantee a load-balanced processing for each generation? If not, what is the division of work for assignment to threads?