Dr. Dobb's | The Standard Librarian: A Debugging Allocator

The Standard Librarian: A Debugging Allocator

Error-checking is always important, and especially so when dealing with complicated interfaces like STL allocators. This column shows a class, debug_allocator, that performs some run-time checks. The techniques used in writing debug_allocators turn out to be useful for a wide range of allocator adaptors.

December 01, 2001
URL:http://www.drdobbs.com/the-standard-librarian-a-debugging-alloc/184403807

December 2001 C++ Experts Forum/The Standard Librarian

Allocators in the Standard C++ library have a complicated and low-level interface [1]. Unlike new and delete, they decouple memory allocation from object creation. Unlike malloc and free, they require you to keep track of the data type for which you're allocating memory and the number of objects you're allocating.

Ordinarily, this isn't a problem. Allocators have a low-level interface because they're a low-level concept: they're normally buried inside of container classes, and they do not belong in normal user code. There's one time, however, when you may have to worry about allocators: when you're writing a new container class of your own. Allocators are harder to use than new and delete, and, accordingly, they're more error-prone. If you do have to write code that uses allocators, how can you make sure your code is correct? Run-time testing can never prove the absence of errors but only their presence. Still, testing is important — the sooner you find a bug, the easier it is to fix.

All of the standard container classes take an allocator class as a template parameter; this parameter has a default so that, for example, std::vector<int> is an abbreviation for std::vector<int, std::allocator<int> >. If we wrote a new allocator class that performed extra checking to make sure it's being used consistently, then we could provide that class, instead of std::allocator<int>, as vector's second template argument. Assuming there are no bugs, vector will behave just the same as before. (Except that the extra checks will make it slower, of course.)

This column presents such a debugging allocator class. It's useful in its own right, and writing it also turns out to be an interesting exercise in working with allocators.

The Tests

What kinds of errors would we like to check for? The most important are:

The memory that we pass to deallocate was actually allocated by this allocator.
When we deallocate a block, we're deallocating the same number of bytes as we allocated. (Unlike malloc and free, allocators don't keep track of that for themselves.)
We're deallocating memory for the same type as we allocated it for. (Although allocators decouple memory allocation and object creation, memory must still be allocated for a particular data type.)
When we work within a block of memory, we don't overstep the bounds of that block.
We never attempt to construct two objects at the same location, and we never attempt to destroy an object twice.
When we deallocate a block of memory, we make sure to destroy all objects in that block first.

As we shall see, our debugging allocator will not satisfy all of those goals. We'll be able to check for some of those errors, but not all.

The basic idea behind the debugging allocator is simple [2]: whenever we allocate a block of memory, allocate a bit more than the user asked for and store some extra information in the first few bytes. The user never sees this debug area; we pass the user a pointer to the memory that follows it. When we're passed a pointer to a memory block that we've allocated, however, we can decrement the pointer, look in the debug area, and verify that the memory block is being used consistently.

This design has two implications. First, it's impossible to implement it quite portably. We have to preserve alignment: given an address that's properly aligned for the user's data, we want to reserve a few bytes and still have something that's properly aligned. How can we know how much to add? In principle, we can't; the language doesn't provide a mechanism for determining alignment requirements. (Perhaps this will be added to a future version of the Standard.) In practice, it's not a serious portability issue: aligning everything on double-word boundaries tends to be good enough on most existing platforms, and can easily be changed on those few platforms with more stringent requirements.

A second implication is that, with this design, there are some errors we can easily check for and others that we can't. A user gets a memory block with allocate and passes it to deallocate, so this design makes it very easy to check that allocate and deallocate are being used consistently. Unfortunately, it's harder to verify that construct and destroy are being used consistently.

The problem is that in general the argument that's passed to construct or destroy isn't a pointer that was returned from allocate. If you write a.construct(p, x), then p must point into a memory block that was allocated through a — but that's into, not to. Perhaps the user allocated enough memory for 1,000 elements, with p1 = a.allocate(1000). In that case, we don't know whether the first argument to construct is p1, p1+5, or p1+178. We can't find the debugging information we need, because we have no way of finding the beginning of the block that the user allocated.

There are two possible solutions to this problem, but neither of them quite works. First, and most obvious, we could give up on the idea that all of the debugging information must go at the beginning of the memory that the user allocates. This doesn't quite work, though, because we'd have to store our information intermingled with the user's. But this would break pointer arithmetic: the user would have no way to step from one element to the next without seeing our hidden debugging information. Alternatively, we could maintain an extra data structure off to the side: for example, we could maintain an std::set of active objects, adding an address to the set whenever the user creates an object with construct, and removing it whenever the user destroys an object with destroy.

That technique is simple and elegant, and the reason it doesn't work is rather subtle. The problem is that the user doesn't necessarily have to use the same allocator to create an object as to destroy it: the user need only use an equal allocator. Consider the following sequence of operations:

The user is given an allocator, a1, and makes a copy of it, a2.
The user creates a new object with a1.construct(p, x).
The user destroys the object with a2.destroy(p).

This sequence is legal, but an allocator that maintained a list of active objects as a set would flag it as an error: we'd be adding the new object to a1's list of active objects, and we wouldn't find it when we destroyed the object with a2.

Could we get around this problem by using a list that's shared between all copies of a given allocator? Perhaps, but we'd start running into definitional problems. If we have:

my_allocator<int> a1;
my_allocator<double> a2(a1);

then should a1 and a2 share the same list of active objects? (The answer might seem to be "no," but then what about my_allocator<int>(a2)?) We also run into implementation problems: an object that's shared behind the scenes requires special care, especially in the presence of concurrency.

Because of these problems, I have simply given up on the idea of nontrivial checks in construct and destroy. This version of a debugging allocator does some minimal checks of their arguments, but it doesn't try to make sure that destroy's argument was previously constructed or that an object is only destroyed once.

The main purpose of this debugging allocator is making sure that allocate and deallocate are being used consistently. When we allocate memory, we reserve two words at the beginning of every memory block, and store the number of elements in that block, as well as a hash code that's derived from the type (from the type's name, in particular, as given by typeid(T).name()). Then we reserve another word at the end of the memory block and store another copy of the hash code as a sentinel. Then when we deallocate the block, we check that the number of elements we've stored is the same as the argument passed to deallocate, and that both copies of the hash code are correct. We call assert so that an inconsistency will cause the program to fail.

This doesn't give us all of the checks that we hoped for at the beginning of this section, but it does allow us to make sure that memory passed to deallocate really was allocated by this allocator, that it was allocated for the same type as it's being deallocated for, and that argument counts are consistent. It also gives us some limited protection against overruns: a user who makes an off-by-one error in either direction will overwrite the sentinel, and this error will be detected at deallocation time.

An Allocator Adaptor

So far I haven't shown any code, because I haven't yet said anything about the exact form of the debugging allocator. One simple option would be to build the debugging allocator on top of malloc or on top of std::allocator. In that case, we'd have a class with a single template parameter, the type of the object to be allocated. That's not quite as general as we might want: users wouldn't be able to combine testing with a user-defined allocator. For greater generality, it's better to write the debugging allocator as an allocator adaptor. (Another motive for writing it as an allocator adaptor, I admit, is pedagogical: that way we can explore general features of allocator adaptors.)

Writing an allocator adaptor presents two new problems. The first is that we don't get to make as many assumptions about what we're dealing with as we may be used to. We can't necessarily assume that Allocator::pointer is of type T*, and we can't necessarily assume that we can put objects — even objects of built-in types, like char and int — in uninitialized memory. We have to use construct and destroy religiously. This is a nuisance, but it's just a matter of being careful.

The second problem is a design issue: what should our debugging allocator's template parameters look like? A first thought might be that we should just have a single template parameter, a template template parameter. However, that's not general enough: a template template parameter is only good for a specific number of arguments, and allocators aren't so constrained. A template template parameter that sufficed for the default allocator, std::allocator<T>, wouldn't work for a user-defined allocator with extra arguments, like my_allocator<T, flags>.

What about a single ordinary template parameter, then? We might imagine writing:

template <class Allocator>
class debug_allocator;

That comes closer. A user might write, for example, debug_allocator<std::allocator<int> >. Unfortunately, there's still a problem. An allocator's value type may be void, and sometimes (as I described in my earlier column) this is even useful and important. What happens if a user writes this?

typedef debug_allocator<std::allocator<int> > A;
typedef typename A::template rebind<void>::other A2;

The problem is that A2's value type is void, and there are some things in allocator classes that make no sense for void. There's a reference typedef, for example, and void& makes no sense; it will cause a compilation error. The default allocator has a specialization, std::allocator<void>. Somehow, we need a specialization too.

We can't quite say that we want a specialized version of debug_allocator<Allocator> that we'll get whenever Allocator's value type is void, but there's a trick that does the next best thing. We can give debug_allocator a second template parameter, let it default to Allocator::value_type, and then write a partial specialization that applies whenever the second template parameter is void. That second template parameter is purely an implementation detail: users will never write it explicitly, but will refer to this adaptor by writing (for example) debug_allocator<std::allocator<int> >.

Once we have this trick, it's not hard to put all of the pieces together: the full debug allocator is shown in Listing 1. You may find debug_allocator useful if you need to check memory usage in your container classes, but what's more important is that you can use it as a model. The implementation techniques used in debug_allocator are useful for allocator adaptors of your own.

Notes

[1] Matt Austern. "The Standard Librarian: What Are Allocators Good For?" C/C++ Users Journal C++ Experts Forum, December 2000, <www.cuj.com/experts/1812/austern.htm>.

[2] This debugging allocator is based on the one in the SGI library, <www.sgi.com/tech/stl>. The original version was written by Hans-J. Boehm.

Matt Austern is the author of Generic Programming and the STL and the chair of the C++ standardization committee’s library working group. He works at AT&T Labs — Research and can be contacted at [email protected].

December 2001 C++ Experts Forum/The Standard Librarian/Listing 1

Listing 1: Complete implementation of the debugging allocator

template <class Allocator, class T = typename Allocator::value_type>
class debug_allocator {
public:                        // Typedefs from underlying allocator.
  typedef typename Allocator::size_type       size_type;
  typedef typename Allocator::difference_type difference_type;
  typedef typename Allocator::pointer         pointer;
  typedef typename Allocator::const_pointer   const_pointer;
  typedef typename Allocator::reference       reference;
  typedef typename Allocator::const_reference const_reference;
  typedef typename Allocator::value_type      value_type;

  template <class U> struct rebind {
   typedef typename Allocator::template rebind<U>::other A2;
   typedef debug_allocator<A2, typename A2::value_type> other;
  };

public:                        // Constructor, destructor, etc.
  // Default constructor.
  debug_allocator()
   : alloc(), hash_code(0)
   { compute_hash(); }

  // Constructor from an underlying allocator.
  template <class Allocator2>
  debug_allocator(const Allocator2& a)
   : alloc(a), hash_code(0)
   { compute_hash(); }

  // Copy constructor.
  debug_allocator(const debug_allocator& a)
   : alloc(a.alloc), hash_code(0)
   { compute_hash(); }

  // Generalized copy constructor.
  template <class A2, class T2>
  debug_allocator(const debug_allocator<A2, T2>& a)
   : alloc(a.alloc), hash_code(0)
   { compute_hash(); }

  // Destructor.
  ~debug_allocator() {}


public:                        // Member functions. 
                               // The only interesting ones 
                               // are allocate and deallocate.
  pointer allocate(size_type n, const void* = 0);
  void deallocate(pointer p, size_type n);

  pointer address(reference x) const { return a.address(x); }
  const_pointer address(const_reference x) const {
    return a.address(x);
  }

  void construct(pointer p, const value_type& x);
  void destroy(pointer p);

  size_type max_size() const { return a.max_size(); }

  friend bool operator==(const debug_allocator& x,
                         const debug_allocator& y)
   { return x.alloc == y.alloc; }

  friend bool operator!=(const debug_allocator& x, 
                         const debug_allocator& y)
   { return x.alloc != y.alloc; }

private:
  typedef typename Allocator::template rebind<char>::other
    char_alloc;
  typedef typename Allocator::template rebind<std::size_t>::other
    size_alloc;

  // Calculate the hash code, and store it in this->hash_code. 
  // Only used in the constructor.
  void compute_hash();

  const char* hash_code_as_bytes()
   { return reinterpret_cast<const char*>(&hash_code); }

  // Number of bytes required to store n objects of type value_type.
  // Does not include the overhead for debugging.
  size_type data_size(size_type n)
   { return n * sizeof(value_type); }

  // Number of bytes allocated in front of the user-visible memory
  // block. Must be large enough to store two objects of type
  // size_t, and to preserve alignment.
  size_type padding_before(size_type)
   { return 2 * sizeof(std::size_t); }

  // Number of bytes from the beginning of the block we allocate
  // until the beginning of the sentinel.
  size_type sentinel_offset(size_type n)
   { return data_size(n) + padding_before(n); }

  // Number of bytes in the sentinel.
  size_type sentinel_size()
   { return sizeof(std::size_t); }

  // Size of the area we allocate to store n objects,
  // including overhead.
  size_type total_bytes(size_type n)
  { return data_size(n) + padding_before(n) + sentinel_size(); }

  Allocator alloc;
  std::size_t hash_code;
};


// Specialization when the value type is void. We provide typedefs
// (and not even all of those), and we save the underlying allocator
// so we can convert back to some other type.

template <class Allocator>
class debug_allocator<Allocator, void> {
public:
  typedef typename Allocator::size_type       size_type;
  typedef typename Allocator::difference_type difference_type;
  typedef typename Allocator::pointer         pointer;
  typedef typename Allocator::const_pointer   const_pointer;
  typedef typename Allocator::value_type      value_type;

  template <class U> struct rebind {
   typedef typename Allocator::template rebind<U>::other A2;
   typedef debug_allocator<A2, typename A2::value_type> other;
  };

  debug_allocator() : alloc() { }

  template <class A2>
  debug_allocator(const A2& a) : alloc(a) { }

  debug_allocator(const debug_allocator& a) : alloc(a.alloc) { }

  template <class A2, class T2>
  debug_allocator(const debug_allocator<A2, T2>& a) :
    alloc(a.alloc) { }

  ~debug_allocator() {}

private:
  Allocator alloc;
};


// Noninline member functions for debug_allocator. They are not defined
// for the void specialization.

template <class Allocator, class T>
typename debug_allocator<Allocator, T>::pointer
debug_allocator<Allocator, T>::allocate
  (size_type n, const void* = 0) {
  assert(n != 0);

  // Allocate enough space for n objects of type T, plus the debug 
  // info at the beginning, plus a one-byte sentinel at the end.
  typedef typename char_alloc::pointer char_pointer;
  typedef typename size_alloc::pointer size_pointer;
  char_pointer result = char_alloc(alloc).allocate(total_bytes(n));

  // Store the size.
  size_pointer debug_area = size_pointer(result);
  size_alloc(alloc).construct(debug_area + 0, n);

  // Store a hash code based on the type name.
  size_alloc(alloc).construct(debug_area + 1, hash_code);

  // Store the sentinel, which is just the hash code again. 
  // For reasons of alignment, we have to copy it byte by byte.
  typename char_alloc::pointer sentinel_area =
    result + sentinel_offset(n);
  const char* sentinel = hash_code_as_bytes();
  {
   char_alloc ca(alloc);
   int i = 0;
   try {
     for ( ; i < sentinel_size(); ++i)
       ca.construct(sentinel_area + i, sentinel[i]);
   }
   catch(...) {
     for (int j = 0; j < i; ++j)
       ca.destroy(&*(sentinel_area + j));
     throw;
   }
  }

  // Return a pointer to the user-visible portion of the memory.
  pointer data_area = pointer(result + padding_before(n));
  return data_area;
}

template <class Allocator, class T>
void debug_allocator<Allocator, T>::deallocate
  (pointer p, size_type n) {
  assert(n != 0);

  // Get a pointer to the space where we put the debugging information.
  typedef typename char_alloc::pointer char_pointer;
  typedef typename size_alloc::pointer size_pointer;
  char_pointer cp = char_pointer(p);
  size_pointer debug_area = size_pointer(cp - padding_before(n));

  // Get the size request and the hash code, and check for consistency.
  size_t stored_n    = debug_area[0];
  size_t stored_hash = debug_area[1];
  assert(n == stored_n);
  assert(hash_code == stored_hash);

  // Get the sentinel, and check for consistency.
  char_pointer sentinel_area =
    char_pointer(debug_area) + sentinel_offset(n);
  const char* sentinel = hash_code_as_bytes();
  assert(std::equal(sentinel, sentinel + sentinel_size(),
    sentinel_area));

  // Destroy our debugging information.
  size_alloc(alloc).destroy(debug_area + 0);
  size_alloc(alloc).destroy(debug_area + 1);
  for (size_type i = 0; i < sentinel_size(); ++i)
    char_alloc(alloc).destroy(sentinel_area + i);

  // Release the storage.
  char_alloc(alloc).deallocate(cp - padding_before(n), total_bytes(n));
}

template <class Allocator, class T>
void debug_allocator<Allocator, T>::construct(pointer p, const 
value_type& x)
{
  assert(p);
  a.construct(p, x);
}

template <class Allocator, class T>
void debug_allocator<Allocator, T>::destroy(pointer p)
{
  assert(p);
  a.destroy(p);
}

template <class Allocator, class T>
void debug_allocator<Allocator, T>::compute_hash() {
  const char* name = typeid(value_type).name();
  hash_code = 0;
  for ( ; *name != '\0'; ++name)
   hash_code = hash_code * (size_t) 37 + (size_t) *name;
}