Persistent Vector Iterators

Template class vector has some annoying properties. But you can hide them, and still make use of the good stuff, by deriving another template class from vector.


January 01, 1999
URL:http://www.drdobbs.com/persistent-vector-iterators/184403596

January 1999/Persistent Vector Iterators


Template class vector is one of the template container classes from the STL (Standard Template Library). Some of its most important characteristics are:

I find that the last two "features" are a source of bugs, inefficiency, and at least inconvenience. Invalidating the iterators on reallocation must be controlled (by using the member function resize), or it can otherwise create dangling iterators. The lack of deallocation on erase functions can cause memory wastage. (Believe it or not, there is no deallocation in any of the vector functions at all.)

The following two examples illustrate these problems. First is an example of a dangling iterator:

vector<int> v (100);
.....
vector<int>::iterator i = v.begin();
v.push_back (0);
*i = 0;	// BUG

The problem is that push_back might cause reallocation, which invalidates the iterator i. One solution is to reserve enough space (via a call to the member function reserve) at the proper place in the code.

Here is an example of memory wastage:

vector<int> v (10000);
.....
v.clear();
cout << "Memory usage: "  
     << v.capacity() * sizeof (int)
     << " bytes for " << v.size()
     << " elements. " << endl;

The output is:

Memory usage: 40000 bytes for 0 elements.

At first glance, it might appear that these two problems are unrelated. In my opinion, though, the only reason why no reallocations occur on deletion is to avoid invalidating the iterators on these operations. So, if the iterators were valid on reallocation (they are not), it would have been possible to free the memory on deletion, as well as to probably remove the need for such service functions as capacity and reserve.

Both these problems do not exist in the other STL containers, except for deque. This might also be a problem. A program, working with lists, for instance, might not work if you decide to change list to vector.

This article presents a variation of the STL vector container that features iterators that are persistent on insert operations, and that performs memory deallocation on erase operations.

Template Class pvector

Listing 1, pvector.h, contains the source of a template class pvector which is publicly derived from the STL template class std::vector. My intention was to make it functionally compatible to vector, so that it can replace vector without need to change any user code. The only differences are:

The most important change to the parent class is the definition of nested classes iterator and const_iterator (see Listing 1). These definitions override and hide the parent's iterator and const_iterator, respectively. Each of these iterator classes defines two member objects:

An iterator accesses elements of the controlled sequence with expressions of the form (*_itsVect)[_itsInd].

Thus, the iterators that were valid at some moment are still valid even if some reallocation has occurred in the mean time. That solves my first task.

The rest of the member functions make sure that both iterator and const_iterator fulfill the requirements of a random-access iterator.

Member Functions

Template class pvector defines a number of constructors and member functions, to deal with differences in behavior from the base class. Most of these functions do nothing but call the corresponding parent function, adapting the parameters passed and/or the return type as need be. The interesting exceptions, which behave quite differently from the ones defined in vector, are the erase functions. Here is one of them:

iterator erase (iterator it)
    {
    _BaseType::erase(
        it._getBaseIt());
    if (2 * size() < capacity())
        {  
        // deallocate some storage
        pvector copy (*this);
        std::swap (copy);
        }
    return it;
    }

Here I overcome vector's unwillingness to free memory by using its copy constructor, which (hopefully) allocates only the small amount of memory required to accommodate the data being copied to it. I then swap the copied contents with this. This is the solution for my second task.

Note that storage deallocation happens only if the current vector size is less than half as large as the current capacity. This has the effect that the average deallocation count for N erase operation is only log(N). For large N, the average reallocation count per operation approaches zero, which effectively yields amortized constant time per operation. (By the way, a similar scheme exists for reallocations on insert operations, too).

Note also that some other member functions like pop_back and clear also benefit from deallocation, to the extent that they call upon erase.

The rest of the member functions, inherited from vector, do not need redefinitions. Among them are size, capacity, empty, clear, operator[], etc.

Efficiency

Unfortunatelly nothing comes for free. Vector iterators are mere pointers in most STL implementations. This approach is no doubt more efficient (in both time and space) than the more complex iterator classes defined in pvector. Let me try to explain what we lose.

As far as the space is concerned, it is apparent that pvector::iterator is most probably twice as large as a pointer. This of course matters if you keep iterators as data. The space requirement for these will double.

As far as speed degradation is concerned, this can be checked with some simple measurements. A simple program benchPVector.cpp, available on the CUJ ftp site (see p. 3) performs the same test using both vector and pvector — sorting a randomly initialized vector, using the sort algorithm. The result may vary widely across different types of hardware, operating system, compiler, STL implementation, etc. But here are the results I obtained using a Pentium-PRO 200, MSVC 5.0 and NT 4.0:

Times: pvector <int> : 4.562
        vector <int> : 2.875
               ratio : 1.58678

    pvector <double> : 5.766
     vector <double> : 3.875
              ratio  : 1.488

Conclusion

More things can be done to improve pvector. The redundant information stored in the iterators might be used to check their validity. For instance the code can test whether an iterator belongs to the this container in a call to erase, or whether dereferencing the iterator is okay, etc.

You can make a small and dirty optimization in comparing iterators — compare just the indexes. This speeds up comparisons a little, but of course yields erroneous results ifyou try to compare iterators from different containers.

pvector iterators are somewhat slower, compared to vector iterators. On the other hand, the persistence of these iterators, as well as the better memory usage in pvector, might compensate for this slower performance. The decision is up to you which container to use.

Finally, I observe that STL supplies yet another container template class that has non-persistent iterators, template class deque. You can easily derive a container with better properties, the same way as I've shown here for vector.

Radoslav Getov has been working as software engineer for twelve years, developing CAD/CAM software for electronics design and manufacturing. You can reach him at [email protected].

January 1999/Persistent Vector Iterators/Listing 1

Listing 1: Definition of pvector template

//////////////////////////////////////////////////////
//  Copyright (c) 1998 Radoslav Getov               //
//                                                  //
//  Permission to use, copy, modify and distribute  //
//  this software is granted without fee, provided  //
//  that the above copyright notice and this        //
//  permission notice appear in the modified source //
//  and the source files that #include it.          //
//                                                  //
////////////////////////////////////////////////////// 
#ifndef PVECTOR_H
#define PVECTOR_H

#include <vector>

// -------------- template pvector<> -----------
template <class T, class A = std::allocator<T> >
class pvector : public std::vector <T, A> 
{
typedef std::vector<T> _BaseType;
typedef pvector<T>     _ThisType;

public:

// -------------- pvector::const_iterator -------
class const_iterator 
   : public std::iterator < 
               std::random_access_iterator_tag, T >
   {
   protected:
   // -------------- data members ---------------
   size_t           _itsInd;  // Index in its vector.
   const _ThisType* _itsVect; // Ptr to its vector.

   const_iterator (const _ThisType * v, size_t index)
      : _itsVect (v), _itsInd (index) {}

   _BaseType::const_iterator _getBaseConstIt() const
      {return((_BaseType*)_itsVect)->begin()+_itsInd;}

   friend pvector <T>;

   public:
   // -------------- ctors ----------------------
   const_iterator ()
      : _itsInd (0), _itsVect (0) {}

   // -------------- dereferencing --------------
   const_reference operator *() const
      { return (*_itsVect)[_itsInd]; }

   const T* operator ->() const
      { return & operator *(); }

   // -------------- iterator arithmetic --------
   #define DEFINE_ARITHMETIC(It)                     \
      /* binary */                                   \
      It operator + (int s) const                    \
         { return It (_itsVect, _itsInd + s); }      \
      It operator - (int s) const                    \
         { return It (_itsVect, _itsInd - s); }      \
      int operator - (It to) const                   \
         { return _itsInd - to._itsInd; }            \
      /* increment */                                \
      It& operator += (int s)                        \
         { _itsInd += s; return *this; }             \
      It& operator ++ ()                             \
         { _itsInd++; return *this; }                \
      It operator ++ (int)                           \
         { It t (*this); _itsInd++; return t; }      \
      /* decrement */                                \
      It& operator -= (int s)                        \
         { _itsInd -= s; return *this; }             \
      It& operator -- ()                             \
         { _itsInd -- ; return *this; }              \
      It  operator -- (int)                          \
         { It t (*this); _itsInd--; return t; }      \


   DEFINE_ARITHMETIC (const_iterator)

   // -------------- iterator comparisons -----------
   bool operator != (const const_iterator & to) const
      { return _itsInd  != to._itsInd 
            || _itsVect != to._itsVect; }

   bool operator == (const const_iterator & to) const
      { return _itsInd == to._itsInd 
            && _itsVect == to._itsVect; } 

   bool operator < (const const_iterator & to) const
      { return _itsInd  < to._itsInd  
            || _itsInd == to._itsInd 
            && _itsVect < to._itsVect; }

   bool operator <= (const const_iterator & to) const
       { return _itsInd  < to._itsInd 
             || _itsInd == to._itsInd 
             && _itsVect <= to._itsVect; }

   bool operator >= (const const_iterator & to) const

      { return !(*this < to); }

   bool operator > (const const_iterator & to) const
      { return (to < *this); }
   }; // -------------- class const_iterator ----

// ------------- class pvector<>::iterator ----------
class iterator : public const_iterator
   {
   // --------------- private ctor --------------
   iterator (const _ThisType* v, size_t index)
      : const_iterator (v, index) {};

   _BaseType::iterator _getBaseIt() const
     {return ((_BaseType*)_itsVect)->begin()+_itsInd;}

   friend pvector <T>;
   
   public:
   // -------------- ctor ----------------------
   iterator () : const_iterator() {}

   // -------------- dereferencing --------------
   reference operator *() const
      { return (*(_BaseType*)_itsVect)[_itsInd]; }
                               
   T* operator->() const      
      { return &this->operator *();}

   // ------------ arithmetic -------------------
   DEFINE_ARITHMETIC (iterator);
   }; // pvector<>::class const_iterator

#undef DEFINE_ARITHMETIC // cleanup 

// -------------- constructors ------------------
explicit pvector (const A& a = A())
   : std::vector <T> (a) {}

explicit pvector (size_type n, 
                  const T& v = T(), 
                  const A& a = A())
   : std::vector <T> (n, v, a) {}

template <class InIt>   
pvector (InIt first, InIt last, const A& a = A())
   : std::vector <T> (a) 
   { for (; first != last; first++) 
        push_back (*first); }

pvector (const pvector& x)
   : std::vector <T> (x) {}

// -------------- iterator returns --------------
iterator begin() 
   { return iterator (this, 0); }
iterator end() 
   { return iterator (this, size());}

const_iterator begin() const
   { return const_iterator (this, 0);  }
const_iterator end() const
   { return const_iterator (this, size()); }

// -------------- reverse iterators -------------
typedef std::reverse_iterator <iterator, T>
   reverse_iterator;
reverse_iterator rbegin() 
   { return (reverse_iterator (end())); }
reverse_iterator rend() 
   { return (reverse_iterator(begin())); }

typedef std::reverse_iterator <const_iterator, T>
   const_reverse_iterator;
const_reverse_iterator rbegin() const
   { return (const_reverse_iterator (end())); }
const_reverse_iterator rend() const
   { return (const_reverse_iterator (begin())); }

// -------------- assign ------------------------
pvector & operator = (const pvector & from) 
   { (_BaseType &) *this = from; return *this; }

// -------------- erases ------------------------
iterator erase (iterator it)
   { 
   _BaseType::erase (it._getBaseIt());
   if (2 * size() < capacity()) 
      {  // deallocate some storage
      pvector copy (*this);
      swap (copy);
      }
   return it; 
   }

void pop_back ()
   { erase (end() - 1); }

iterator erase (iterator from, iterator to)
   { 
   _BaseType::erase (from._getBaseIt(),
                     to.  _getBaseIt());
   if (2 * size() < capacity())
      {
      pvector copy (*this);
      swap (copy);
      }
   return from; 
   }
                 
void clear ()
   { erase (begin(), end()); }

// -------------- inserts -----------------------
iterator insert (iterator it, const T& x = T())
   { _BaseType::insert (it._getBaseIt(), x);
      return it;  }
   
void insert (iterator it, size_type n, const T& x)
   { _BaseType::insert (it._getBaseIt(), n, x); }

template <class InIt>    
void insert (iterator it, InIt first, InIt last)
   { for (; first != last; it++, first++)
        insert (it, *first);  }

}; // ------------ template pvector -------------

#else
#  error Multiple #included file!
#endif
/* End of File */

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.