Game of Life — Distributed Lists
At some point I will want to have the output results of the computed generations displayed. The same initial two methods as the input above also apply to the output. That is, one process can be in charge of all output, printing the live grid cell coordinates received from the other processes, or each process can perform output of its own subgrid. The goal will be to print a coherent picture much like the serial version of the code does. Using a single process to do the output would appear to be easiest to implement. Of course, the output process should not keep a copy of the entire grid in memory. (If it did, why distribute the computations in the first place?) Rather, the process should accept messages containing lists of live grid cells from each other process and use that data to print. This may involve sorting the data depending upon how the grid was divided and how the message(s) are sent/received from each process.
White PapersMore >>
Having each process in charge of the output of its own live grid cells will require some synchronization and coordination between the processes. How complex this synchronization is will depend upon how the original grid is apportioned to the MPI processes. For example, using four processes in the computation, the grid could be quartered by columns or by rows or split in half both horizontally and vertically (like you would cut a rectangular pie into four smaller quadrilaterals).
If I've done a column-wise grid division, but need to print in the natural row-wise fashion, a message must be sent between one process (after each row of the current process was printed) to the next process in the logical order of output before the next process can begin printing data from the same logical row. At the end of each row printed, the last process must inform the first process that it can then print its next row. On the other hand (the hand rotated 90 degrees anti-clockwise, that is), if the grid were divided by rows, each process could print its entire subgrid after receiving a message from the previous process that the previous rows have been printed. This row-major division generates much less message traffic than the column-major division.
Communicating Life and Death on the Edge
This brings me to the heart of this treatise on distributed Game of Life. To determine whether a grid cell will remain alive, die off, or be born in the current generation, the number of live neighbors that a grid cell has must be known. For those grid cells wholly internal to the subgrid, the correct value of
numNeighbors will be in local memory. For grid cells at the interface between subgrids, newly born cells and newly killed cells will affect the
numNeighbors entries in the process across that logical border. To keep the counts correct, some communication of the status of gird cells on the edges of subgrids must be exchanged.
One of the easiest methods to exchange and use grid data from other processes is to declare ghost cells around the subgrid assigned to the process. The ghost cells replicate the data that is held by another process just over the border. (If you remember back to my initial posts about the Game of Life, there was already a hedge around the outside of the full grid to make computation of the number of neighbors easy at the edges.) After determining which new cells are alive and which have died, each process will bundle up the status of the edge row/column on its side of the border into a message. This data is exchanged with the process holding the subgrid across the divide. Data received from that process is placed in the ghost cells that correspond to the edge row/column on the opposite side of the border. The number of neighbors can then be adjusted for these edge cells.
There are two important things to consider for this data exchange. First, what will be the format of the message to be sent (and received)? Will this be a variable length message containing the coordinates of the live cells or a fixed length message that encodes the entire set of edge cells to be either alive or dead? Second, the message exchanges have to be coordinated correctly so that all processes don't attempt to first receive cell data from another process and deadlock the execution. With blocking MPI communication functions, some processes will need to first send while others first receive and then change roles. Alternatively, an asynchronous receive can be posted by each process for each border process that its subgrid abuts and, when ready, send out messages with the needed status updates to those same border processes.
Related to this second point is how many processes will need to exchange data. This is directly related to how the grid is divided. If a row-wise (column-wise) division is used, the inner processes will need to exchange data with two other processes, but the topmost and bottom (leftmost and rightmost) processes would only need to coordinate with a single other process. In the case of four processes, dividing the grid into quadrants will yield more consistent communication since each process would exchange data between two other processes. Other divisions of the grid for more processors (for example, 16 processes making four vertical and four horizontal cuts) can create different number of bordering subgrids between processes. Irregular grids with irregular divisions could have multiple subgrids along a single border, which would require multiple messages be exchanged.
These are the major issues that need to be considered when implementing a domain decomposition version of the list-based Game of Life in a distributed memory environment. After making the appropriate choices from the ideas presented above, you should be able to devise a solution of your own. In my next post, I'll show the code that I've written to implement this variation.