Method 2 — A Simple Function Defined Inline
In some cases where your hash function is short and sweet, you might want to take advantage of the new C++0x support for lambda expressions. In this particular case, using a lambda to define your hash function lets you define the hasher right where you use it — which may or may not provide clarity for the program. The finished result looks like this:
//
// This program uses a simple user-defined function
// to provide a hash function for use in unordered_map.
// The function is passed in as a lambda expression to
// the unordered_map constructor.
//
// Compile this example with Visual Studio 2010
// or g++ 4.5 using -std=c++0x
//
#include <iostream>
#include <unordered_map>
#include <string>
#include <functional>
using namespace std;
typedef pair<string,string> Name;
int main(int argc, char* argv[])
{
unordered_map<Name,int,function<size_t ( const Name & name )>>
ids(100, []( const Name & name )
{
return hash<string>()(name.first) ^ hash<string>()(name.second);
} );
ids[Name("Mark", "Nelson")] = 40561;
ids[Name("Andrew","Binstock")] = 40562;
for ( auto ii = ids.begin() ; ii != ids.end() ; ii++ )
cout << ii->first.first
<< " "
<< ii->first.second
<< " : "
<< ii->second
<< endl;
return 0;
}
As far as the compiler is concerned, this program will probably generate nearly identical code to the previous one, so it is unlikely that you should prefer one over the other for reasons of efficiency.
I don't use the lambda expression method for two reasons:
-
When using the lambda expression, I can't use
decltype in the template class definition to get the type of the hash function object. This means I have to manually enter it, which in turn means I'm manually synching one definition between two places in my code — always an opportunity for trouble.
As this is being written in 2011, lambda expressions are still unfamiliar to people, and aren't supported in a lot of compilers currently in use on production systems, so I save their use for places where they provide a marked improvement in either program structure or readability.
Your opinions may well differ.
Method 3 — A Function Object
Function objects package up functions so they can be called in a way that is often convenient to a library writer. So even though you might not have ever dreamed up the idea of creating a function object to provide a hash function to the library, this is the way unordered_map prefers things. My definition of a function object to use with this definition of Name is shown here:
struct hash_name {
size_t operator()(const Name &name ) const
{
return hash<string>()(name.first) ^ hash<string>()(name.second);
}
};
This is simply the hash function wrapped up in a class definition. And while that might not seem like a big deal, it allows a class to use this function when all it has is the class definition. When I define unordered_map using this function object, my code is a bit simpler:
unordered_map<Name,int,hash_name> ids;
You can see that I still have to include a third template class parameter, but I don't have to pass in a reference to the function object in the constructor. This is because unordered_map keeps track of the class definition, and when it comes time to actually get a hash, it can simply construct the object on the fly and pass in the data. A sample (hypothetical) class that does this might have code like:
template <class HashKey,
class HashValue,
class HashObject >
class HashMap {
...
void insert( HashKey &key, HashVal &val )
{
size_t index = HashObject()( key );
...
Putting it together with the sample program I've been using throughout, you get this code:
//
// This program uses a function object to define
// a hash function for use in unordered_map.
//
// Compile this example with Visual Studio 2010
// or g++ 4.5 using -std=c++0x
//
#include <iostream>
#include <unordered_map>
#include <string>
#include <functional>
using namespace std;
typedef pair<string,string> Name;
struct hash_name {
size_t operator()(const Name &name ) const
{
return hash<string>()(name.first) ^ hash<string>()(name.second);
}
};
int main(int argc, char* argv[])
{
unordered_map<Name,int,hash_name> ids;
ids[Name("Mark", "Nelson")] = 40561;
ids[Name("Andrew","Binstock")] = 40562;
for ( auto ii = ids.begin() ; ii != ids.end() ; ii++ )
cout << ii->first.first
<< " "
<< ii->first.second
<< " : "
<< ii->second
<< endl;
return 0;
}
This method clearly has some nice features. In return for having to package your hash function inside a structure definition (and use the unfamiliar operator()), you get the advantage of having an unordered_map declaration that is considerably simpler than the ones used in the previous two examples.
In terms of the generated code, this example is again probably going to generate nearly identical code to the previous examples, meaning that your preference towards using it should be based strictly on ease of coding, readability, and maintenance issues.


