Channels ▼
RSS

C/C++

User-Defined Hash Functions for Unordered Maps in C++0x


One Point of Light

So what is to be done? The language has structural problems that make it hard to issue good errors. Documentation is never as good as we'd like it to be. Are we stuck at the status quo?

While I'm not likely to change the Visual Studio documentation or the C++ compiler, I have found that one small article, like this one, that includes good SEO terms to describe the problem, can help a lot of people. As an example, my next_permutation() article from 2002 still gets hundreds of readers every week, and I hope most of them walk away understanding how to use this function.

The same thing could end up being true for this article. I'll spend the rest of it showing you four good ways to define a hash function for use in unordered_map under C++0x, and with Google's help, it may end up providing the missing manual for this particular problem.

Method 1 — A Simple Function

You're used to seeing unordered_map declared with two template parameters. But a look at the help page shows that it actually takes five — the last three usually accept default values:

template<class Key,
    class Ty,
    class Hash = std::hash<Key>,
    class Pred = std::equal_to<Key>,
    class Alloc = std::allocator<std::pair<const Key, Ty> > >
    class unordered_map;

The third parameter to the definition is a function object type that is used by the class to convert a key into a hash code. By default, it is set to std::hash. Internally the unordered_map class calls operator() on an object of that type in order to get a hash code for a given key.

Note also that several of the constructors for unordered_map also take a default parameter, which is an instance of this function object type.

So there are two places where we can provide some information about how to hash the key used in the unordered_map, but we normally don't fill in these items. The reason that we often don't is that the C++ standard library already defines instances of std::hash for commonly used types. So I can write a program that contains a line like this:

int main(int argc, char* argv[])
{
    cout << std::hash<const char *>()("Mark Nelson") << endl;
    return 0;
}

which, when run, produces:

134514544

The problem, of course, is that the standard library did not implement a version of hash<pair<string,string>> — or any of the other infinite varieties of user-defined classes. However, since my new Name class is composed of two string objects, and the standard library knows how to hash string objects, I can create a pretty good hash function of my own that looks like this:

size_t name_hash( const Name & name )
{
    return hash<string>()(name.first) ^ hash<string>()(name.second);
}

As a general rule of thumb, if I have two hashes for independent variables, and I combine them using XOR, I can expect that the resulting hash is probably just as good as the input hashes.

Building this function is easy enough, but how do I actually use it with unordered_map? Although the solution is fairly simple, if you find templates confusing already, you aren't liable fumble your way through to the answer without an enormous amount of effort.

Basically, I have to modify my use of the map class in two places. First, I have to pass in a pointer to the hash function in the constructor of the map. The standard defines a constructor that takes an initial number of buckets and a hashing object as inputs. So the first step is to modify the declaration code to look like this:

unordered_map<Name,int> ids(100, name_hash );

This is only half the battle, however. The default implementation of unordered_map expects to be using a function object of type std::hash<key> to calculate hashes, and that is not what I passed in to the constructor. So I also have to add a third template parameter to my declaration — a template parameter that matches the function object I am passing in to the constructor.

Creating the proper template parameter to match this simple hashing function requires more than your usual level of library-fu. The easiest way I know of is to wrap the function type inside the std::function template, which is defined in header <functional>. When you do this, your map will have a hasher object that is an instance of function. Upon initialization of the map, the function object will be assigned a pointer to name_hash, which can then be called via the interface to function.

The resulting declaration will then look like this:

unordered_map<Name,int,function<size_t( const Name & name )>> ids(100, name_hash );

In the final, complete program using this method, shown below, I simplify my definition of the template parameter by using the new decltype feature. This makes the code a bit easier to read by showing explicitly that I am taking the type of function name_hash, as well as removing one more place where a type mismatch could be inadvertently created.

//
// This program uses a simple user-defined function
// to provide a hash function for use in unordered_map
// 
// Compile this example with Visual Studio 2010
// or g++ 4.5 using -std=c++0x
//
#include <iostream>
#include <unordered_map>
#include <string>
#include <functional>

using namespace std;

typedef pair<string,string> Name;

size_t name_hash( const Name & name )
{
    return hash<string>()(name.first) ^ hash<string>()(name.second);
}

int main(int argc, char* argv[])
{
	unordered_map<Name,int,function<decltype(name_hash)>> ids(100, name_hash );
	ids[Name("Mark", "Nelson")] = 40561;
	ids[Name("Andrew","Binstock")] = 40562;
	for ( auto ii = ids.begin() ; ii != ids.end() ; ii++ )
		cout << ii->first.first 
                     << " "
                     << ii->first.second 
                     << " : "
                     << ii->second
                     << endl;
	return 0;
}

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video