Finding My Summer Vacation's Shortest Path

August 21, 2012

Back when gas prices were below a dollar per gallon, my family would take cross-country driving trips. When we were planning how to get from point A to point B, we consulted a paper map, especially if we hadn't been there before. Besides allowing you to trace possible routes, a map usually had a lower triangular matrix with the precomputed distances between a set of cities found on the map (and some that might be just off the edge of the paper). Looking at the intersection of rows/columns for, say, Albuquerque and Boston, I would find that the shortest route known to the cartographers was 2232 miles. Since there isn't one road that connects these two cities, how did those mapmakers compute the shortest distance between these two places? (Remember that Google Maps wasn't even the glimmer in anyone's eye back then.)

Think of a map as an instance of a graph, the cities are the nodes and the roads between cities are the edges. The length of the road is the weight of the corresponding edge. The All-Pairs Shortest Path problem asks you to compute the minimum length path (shortest distance) between all pairs of nodes within the graph. Thus, besides computing the shortest distance needed to drive between Albuquerque to Boston, you can also find the minimum driving distance between Charlotte and Detroit, Ellensburg and Frankfort, or Greenwich and Hilton Head.

The All-Pairs Shortest Path problem takes a graph of `n` nodes represented by an `n x n` weight matrix, `W`. The result is an `n x n` matrix, `D` (for distance), where the `D[i][j]` entry holds the minimum weight of the path from node `i` to node `j`. Entries in the `W` matrix can be zero, positive (if a direct edge lies between the two nodes), or the infinity value (∞) where there is no direct edge between the nodes. The main diagonal of `W` is filled with zeroes to denote that it takes no time or distance to travel from a node to itself. It is okay to have negative weights in the `W` matrix as long as there is no negative length cycle. This condition will assure that only simple paths are found. (For a graph that models physical distances, you won't have negative values. However, if the edge weights are for something like tolls on the roads, a negative weight might represent a payment you receive for travelling between the two nodes.)

Floyd's Algorithm is a simple solution to this problem. The key point to the algorithm is to find the shortest path between node `i` and node `j` that includes (or excludes) node `k` on the path. For each pair of nodes, each possible node `k` in the graph is tried in turn to find these shortest paths. The algorithm computes a series of successive `Dk` matrices, one for each individual `k`, with the previous `Dk-1` matrix used in the computation. In formal notation, we compute the value of an `[i][j]` entry from the previous matrix (i.e., look for a shorter path that uses node `k`) with the following formula:

```Dk [i][j] = min(Dk-1 [i][j], Dk-1 [i][k] + Dk-1 [k][j])
```

After `k` iterations, the `Dk` matrix will hold the lengths of the shortest paths that use the first `k` nodes as intermediate nodes in the path. Once the algorithm has iterated over all `n` nodes in the graph, the `Dn` graph will contain the shortest path lengths between all pairs of nodes with some combination of other nodes as intermediate nodes. This is the result that we are looking for.

From the above description, I can see that there will be a nested loop pair to compute the minimum path between each `i,j` pair. This computation is going to be the body of the loop that iterates over all the choices for `k`. If the initial `D0` matrix is simply the original weight matrix, `W`, a serial code implementation of Floyd's Algorithm is shown here.

```void Floyds(float **D, int N)
{
int i, j, k;

for (k = 0; k < N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < N ; j++)
D[i][j] = min(D[i][j], D[i][k] + D[k][j]);
}
}
}```

I've elected to use a float array for both the weight and distance matrices. The constant FLT_MAX from the `float.h` include file (not shown, but assumed to be part of the initialization of `W`) is used to play the role of infinity for nodes not directly connected by an edge.

If you're in a hurry to find the distance between your starting point and somewhere exotic that is within your travel time budget, you might want to implement and run a parallel version of Floyd's Algorithm on multiple cores. One point to consider in such a conversion is that the iterations of the outer loop are similar to a time series computation. What I mean is that the algorithm must finish all updates to the `D` matrix entries for each `k` before going to the computation using `k+1`. Thus, the outer loop must be run in serial and the two inner loops are the only possible candidates that can be made to run concurrently.

More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.

First C Compiler Now on Github

The earliest known C compiler by the legendary Dennis Ritchie has been published on the repository.

HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

Building Bare Metal ARM Systems with GNU

All you need to know to get up and running... and programming on ARM

Amazon's Vogels Challenges IT: Rethink App Dev

Amazon Web Services CTO says promised land of cloud computing requires a new generation of applications that follow different principles.

How to Select a PaaS Partner

Eventually, the vast majority of Web applications will run on a platform-as-a-service, or PaaS, vendor's infrastructure. To help sort out the options, we sent out a matrix with more than 70 decision points to a variety of PaaS providers.

More "Best of the Web" >>