Allocators in the Standard C++ library have a complicated and low-level interface . Unlike new and delete, they decouple memory allocation from object creation. Unlike malloc and free, they require you to keep track of the data type for which you're allocating memory and the number of objects you're allocating.
Ordinarily, this isn't a problem. Allocators have a low-level interface because they're a low-level concept: they're normally buried inside of container classes, and they do not belong in normal user code. There's one time, however, when you may have to worry about allocators: when you're writing a new container class of your own. Allocators are harder to use than new and delete, and, accordingly, they're more error-prone. If you do have to write code that uses allocators, how can you make sure your code is correct? Run-time testing can never prove the absence of errors but only their presence. Still, testing is important the sooner you find a bug, the easier it is to fix.
All of the standard container classes take an allocator class as a template parameter; this parameter has a default so that, for example, std::vector<int> is an abbreviation for std::vector<int, std::allocator<int> >. If we wrote a new allocator class that performed extra checking to make sure it's being used consistently, then we could provide that class, instead of std::allocator<int>, as vector's second template argument. Assuming there are no bugs, vector will behave just the same as before. (Except that the extra checks will make it slower, of course.)
This column presents such a debugging allocator class. It's useful in its own right, and writing it also turns out to be an interesting exercise in working with allocators.
What kinds of errors would we like to check for? The most important are:
- The memory that we pass to deallocate was actually allocated by this allocator.
- When we deallocate a block, we're deallocating the same number of bytes as we allocated. (Unlike malloc and free, allocators don't keep track of that for themselves.)
- We're deallocating memory for the same type as we allocated it for. (Although allocators decouple memory allocation and object creation, memory must still be allocated for a particular data type.)
- When we work within a block of memory, we don't overstep the bounds of that block.
- We never attempt to construct two objects at the same location, and we never attempt to destroy an object twice.
- When we deallocate a block of memory, we make sure to destroy all objects in that block first.
As we shall see, our debugging allocator will not satisfy all of those goals. We'll be able to check for some of those errors, but not all.
The basic idea behind the debugging allocator is simple : whenever we allocate a block of memory, allocate a bit more than the user asked for and store some extra information in the first few bytes. The user never sees this debug area; we pass the user a pointer to the memory that follows it. When we're passed a pointer to a memory block that we've allocated, however, we can decrement the pointer, look in the debug area, and verify that the memory block is being used consistently.
This design has two implications. First, it's impossible to implement it quite portably. We have to preserve alignment: given an address that's properly aligned for the user's data, we want to reserve a few bytes and still have something that's properly aligned. How can we know how much to add? In principle, we can't; the language doesn't provide a mechanism for determining alignment requirements. (Perhaps this will be added to a future version of the Standard.) In practice, it's not a serious portability issue: aligning everything on double-word boundaries tends to be good enough on most existing platforms, and can easily be changed on those few platforms with more stringent requirements.
A second implication is that, with this design, there are some errors we can easily check for and others that we can't. A user gets a memory block with allocate and passes it to deallocate, so this design makes it very easy to check that allocate and deallocate are being used consistently. Unfortunately, it's harder to verify that construct and destroy are being used consistently.
The problem is that in general the argument that's passed to construct or destroy isn't a pointer that was returned from allocate. If you write a.construct(p, x), then p must point into a memory block that was allocated through a but that's into, not to. Perhaps the user allocated enough memory for 1,000 elements, with p1 = a.allocate(1000). In that case, we don't know whether the first argument to construct is p1, p1+5, or p1+178. We can't find the debugging information we need, because we have no way of finding the beginning of the block that the user allocated.
There are two possible solutions to this problem, but neither of them quite works. First, and most obvious, we could give up on the idea that all of the debugging information must go at the beginning of the memory that the user allocates. This doesn't quite work, though, because we'd have to store our information intermingled with the user's. But this would break pointer arithmetic: the user would have no way to step from one element to the next without seeing our hidden debugging information. Alternatively, we could maintain an extra data structure off to the side: for example, we could maintain an std::set of active objects, adding an address to the set whenever the user creates an object with construct, and removing it whenever the user destroys an object with destroy.
That technique is simple and elegant, and the reason it doesn't work is rather subtle. The problem is that the user doesn't necessarily have to use the same allocator to create an object as to destroy it: the user need only use an equal allocator. Consider the following sequence of operations:
- The user is given an allocator, a1, and makes a copy of it, a2.
- The user creates a new object with a1.construct(p, x).
- The user destroys the object with a2.destroy(p).
This sequence is legal, but an allocator that maintained a list of active objects as a set would flag it as an error: we'd be adding the new object to a1's list of active objects, and we wouldn't find it when we destroyed the object with a2.
Could we get around this problem by using a list that's shared between all copies of a given allocator? Perhaps, but we'd start running into definitional problems. If we have:
my_allocator<int> a1; my_allocator<double> a2(a1);
then should a1 and a2 share the same list of active objects? (The answer might seem to be "no," but then what about my_allocator<int>(a2)?) We also run into implementation problems: an object that's shared behind the scenes requires special care, especially in the presence of concurrency.
Because of these problems, I have simply given up on the idea of nontrivial checks in construct and destroy. This version of a debugging allocator does some minimal checks of their arguments, but it doesn't try to make sure that destroy's argument was previously constructed or that an object is only destroyed once.
The main purpose of this debugging allocator is making sure that allocate and deallocate are being used consistently. When we allocate memory, we reserve two words at the beginning of every memory block, and store the number of elements in that block, as well as a hash code that's derived from the type (from the type's name, in particular, as given by typeid(T).name()). Then we reserve another word at the end of the memory block and store another copy of the hash code as a sentinel. Then when we deallocate the block, we check that the number of elements we've stored is the same as the argument passed to deallocate, and that both copies of the hash code are correct. We call assert so that an inconsistency will cause the program to fail.
This doesn't give us all of the checks that we hoped for at the beginning of this section, but it does allow us to make sure that memory passed to deallocate really was allocated by this allocator, that it was allocated for the same type as it's being deallocated for, and that argument counts are consistent. It also gives us some limited protection against overruns: a user who makes an off-by-one error in either direction will overwrite the sentinel, and this error will be detected at deallocation time.
An Allocator Adaptor
So far I haven't shown any code, because I haven't yet said anything about the exact form of the debugging allocator. One simple option would be to build the debugging allocator on top of malloc or on top of std::allocator. In that case, we'd have a class with a single template parameter, the type of the object to be allocated. That's not quite as general as we might want: users wouldn't be able to combine testing with a user-defined allocator. For greater generality, it's better to write the debugging allocator as an allocator adaptor. (Another motive for writing it as an allocator adaptor, I admit, is pedagogical: that way we can explore general features of allocator adaptors.)
Writing an allocator adaptor presents two new problems. The first is that we don't get to make as many assumptions about what we're dealing with as we may be used to. We can't necessarily assume that Allocator::pointer is of type T*, and we can't necessarily assume that we can put objects even objects of built-in types, like char and int in uninitialized memory. We have to use construct and destroy religiously. This is a nuisance, but it's just a matter of being careful.
The second problem is a design issue: what should our debugging allocator's template parameters look like? A first thought might be that we should just have a single template parameter, a template template parameter. However, that's not general enough: a template template parameter is only good for a specific number of arguments, and allocators aren't so constrained. A template template parameter that sufficed for the default allocator, std::allocator<T>, wouldn't work for a user-defined allocator with extra arguments, like my_allocator<T, flags>.
What about a single ordinary template parameter, then? We might imagine writing:
template <class Allocator> class debug_allocator;
That comes closer. A user might write, for example, debug_allocator<std::allocator<int> >. Unfortunately, there's still a problem. An allocator's value type may be void, and sometimes (as I described in my earlier column) this is even useful and important. What happens if a user writes this?
typedef debug_allocator<std::allocator<int> > A; typedef typename A::template rebind<void>::other A2;
The problem is that A2's value type is void, and there are some things in allocator classes that make no sense for void. There's a reference typedef, for example, and void& makes no sense; it will cause a compilation error. The default allocator has a specialization, std::allocator<void>. Somehow, we need a specialization too.
We can't quite say that we want a specialized version of debug_allocator<Allocator> that we'll get whenever Allocator's value type is void, but there's a trick that does the next best thing. We can give debug_allocator a second template parameter, let it default to Allocator::value_type, and then write a partial specialization that applies whenever the second template parameter is void. That second template parameter is purely an implementation detail: users will never write it explicitly, but will refer to this adaptor by writing (for example) debug_allocator<std::allocator<int> >.
Once we have this trick, it's not hard to put all of the pieces together: the full debug allocator is shown in Listing 1. You may find debug_allocator useful if you need to check memory usage in your container classes, but what's more important is that you can use it as a model. The implementation techniques used in debug_allocator are useful for allocator adaptors of your own.
Matt Austern is the author of Generic Programming and the STL and the chair of the C++ standardization committees library working group. He works at AT&T Labs Research and can be contacted at [email protected].