Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

# C++ Hash Table Memoization: Simplifying Dynamic Programming

## More Insights

More >>

More >>

### Webcasts

More >>

A true bottoms-up approach is probably impractical for this problem.

First, it's a lot of work to traverse a tree one level at a time, starting at the lowest level. Second, there are problems with the tabular subproblem storage scheme usually favored in dynamic programming. It's not entirely clear where in a table we would store the result from `GET-MAX-WEIGHT( q )`.

An approximation of the bottoms up approach could be achieved by traversing the tree in postorder, visiting all the children of a node first, then visiting the node itself. In this scenario we would visit the nodes from Figure 2 in the following order:

h i d q r j k e b l m f n p g c a

At every non-leaf node, we would have all the information we needed to calculate `GET-MAX-WEIGHT( node )` without actually actually recalculating any results -- the subproblem results will have already been calculated and stored.

Of course, traversing the tree in postorder is actually what we're already doing already in the recursive implementation. Seeing this, my solution to the algorithm was to use a C++ hash table to perform memoization of the subproblems. This has a few advantages:

• My implementation looks just like the recursive definition of the algorithm. It just has a few additional lines needed to check for stored subproblems.
• I don't have to worry about finding a tabular storage method -- I store the results based on a hash of the the node pointer, making lookup and storage nice and simple.
• I don't have to worry about how to do bottoms-up traversal of the org chart. My recursive definition visits the nodes in the order I want without any changes.

I'll start by simply presenting my modified version of `GetMax( node )`, then present the infrastructure that goes with it. Finally, I'll discuss the last step in any dynamic programming problem: using the stored subproblem results to get a solution.

### The Memoized GetMax()

The C++ code to solve this problem is completely defined in a class called `Memo`, which relies on a tree whose nodes are defined by class `Node`. The two class definitions are shown here:

```class Node {
public :
Node( std::istream &input );
const std::string &Name() const;
int Weight() const;
const Node &Child( size_t i ) const; //return ref to child node
size_t Children() const; //return count of child nodes
};
class Memo {
public :
int GetMax( const Node &node );
void MarkIncluded( const Node &node );
void PrintTree( std::ostream &s, const Node &node, bool init=true );
private :
struct Solution {
Solution();
bool mAttending;
bool mIncluded;
int mMaxWeight;
};
std::map<const Node*,Solution> mSolutions;
};
```

To run an instance of this problem, I first construct a tree by having the `Node` constructor read in a definition file. The resulting root node is then passed in to `Memo::GetMax()`, which works its way through all the subproblems and calculates the maximum total weight for this problem.

The code in `GetMax()` should appear nearly identical to that in the pseudocode version of `GET-MAX-WEIGHT()`, with one critical difference: in my implementation of `GetMax()`, I check to see if a hashed value of `GetMax()` has been saved upon entry to the routine, and if it has, I short-circuit execution and return that value immediately.

Likewise, at the end of the routine, I store the newly calculated maximum value in the hash table.

The key to this hash table is a pointer to a node -- it could just as easily have been the name of the node, but using the pointer has the advantage of making the program a little more invulnerable to input error -- duplicate node names won't break the algorithm.

The use of a pointer as a key into a hash_map is a technique that can be used to work around some of the issues I discussed earlier in the article. Because the hash containers offer built-in support for pointers of any type, use of a pointer as a key allows you to avoid having to define your own hash function and comparison function, which is nice.

```int Memo::GetMax( const Node &node )
{
std::map<<const Node*,Solution>::const_iterator ii;
ii = mSolutions.find( &node );
if ( ii != mSolutions.end() )
return ii->second.mMaxWeight;
int weight_not_attending = 0;
for ( size_t i = 0 ; i <node.Children() ; i++ )
weight_not_attending += GetMax( node.Child( i ) );
int weight_attending = node.Weight();
for ( size_t i = 0 ; i <node.Children() ; i++ )
for ( size_t j = 0 ; j <node.Child( i ).Children() ; j++ )
weight_attending += GetMax( node.Child( i ).Child( j ) );
std::cout <<"GetMax(" <<node.Name() <<")\n";
if ( weight_attending>= weight_not_attending ) {
mSolutions[ &node ].mAttending = true;
return mSolutions[ &node ].mMaxWeight = weight_attending;
} else
return mSolutions[ &node ].mMaxWeight = weight_not_attending;
}
```

This routine is conceptually pretty simple. The first four lines check to see if I've cached a previously seen solution to the problem in the `mSolutions hash_map`. If I have, the mMaxWeight member of that solution information is returned immediately, avoiding a duplicated subproblem.

If the solution is not already stored off, I have to do the real work. This means evaluating both possible solutions -- the weight if this node attends, and thus excludes all its immediate children, and the weight if this node does not attend, and thus includes the children. Those two values are calculated and stored in local variables `weight_attending` and `weight_not_attending`.

Given those two values, its a simple matter to decide which of the two is larger, and return that one. However, there are a couple of issues to deal with before returning.

First, I want to make sure that we store the solved subproblem in the solution table. Both return statements do that by writing the value to the `mMaxWeight` member of the correct entry in `mSolutions`.

That would be enough to get the maximum value, but I may also want to create a solution for the problem -- which means not only knowing the maximum weight achieved, but a list of attendees. In order to get that list of attendees, I need to also keep track of whether the maximum value achieved at this node was done so by attending or not attending. I do that by updating a member called `mAttending` in the Solution object. (Note that the constructor sets `mAttending` to false upon creation.)

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.