An Important Move Optimization Is Nearly Invisible
Last week, we looked at subtle differences between copying and moving a container (a
string in this example) in the context of passing arguments to functions. Curiously, one of the biggest differences between copying and moving happens in code that we don't write.
- The Role of the WAN in Your Hybrid Cloud
- Book Expert: Advanced Analytics with Spark: Patterns for Learning Data at Scale
- Real results: Speeding quality application delivery with DevOps [in financial services]
- Defense Against the Dark Arts
The code in question is executed behind the scenes as part of expanding or contracting a random-access container such as a
deque. For example, if we insert a new element into the middle of a
vector, all of the elements after the insertion point must be moved to make way for the newcomer. Even if there are no such elements — which would happen if the new element is at the end of the
vector — expanding the vector might exceed its capacity. In that case, the library would have to move all of the
vector's elements to new locations in order to accommodate the
vector's new capacity.
Let's start by explaining why it's useful to append elements to a
vector at all rather than using a
list, which does not require any reallocation in order to append elements. The C++ library makes use of a clever algorithmic trick: If you multiply a
vector's capacity by a constant each time you reallocate it, each element is moved no more than a constant number of times. For example, suppose the library doubles a
vector's capacity every time it runs out, and imagine that the capacity is about to be doubled from 1,024 to 2,048. That capacity got to 1,024 by being doubled from a previous value of 512. Therefore, of the 1,024 elements, 512 of them have not been moved before; this reallocation will move them for the first time. Of the remaining 512 elements, 256 of them have been moved once already, 128 have been moved twice already, and so on. If you count up all the moves, you will discover that on average, each element has been moved once already and is about to be moved a second time.
In other words, by doubling the capacity of a
vector each time that capacity is exceeded, the library ensures that each element is moved no more than twice, on average, as a side effect of growing the
vector to an arbitrary size. An analogous argument will show that the library can trade space for time by making the factor of two larger or smaller; the main point is that reallocation doesn't add more than a constant factor of time or space overhead.
list instead of a
vector carries its own overhead. Although there is no need to move elements in order to expand a
list, it is necessary to allocate dynamic memory for each element as part of creating the element. Moreover, because these elements are dynamically allocated, they are unlikely to be in contiguous memory, a fact that makes it less likely that the processor will be able to use its hardware cache to make accesses to those elements more efficient. I once went to a talk in which Jon Bentley said that he had found that inserting and removing elements at the beginning of an integer
vector — which requires moving all of the other elements — are nevertheless faster operations than the corresponding
list operations, which do not require moving any elements — until the
vector grows to more than 50 elements or so. Of course, this measurement was on one particular processor, with one particular program, and I may be misremembering. Nevertheless, I am confident in saying that, at least for integer elements,
vectors are faster compared with
lists than you might suspect.
Of course, if we're dealing with containers that contain integers or other simple values, there is no difference from the computer's viewpoint between moving and copying elements. Where the difference really starts to come into play is when the container elements are containers themselves. The most common such container is
vector<string>, so it is definitely worthwhile thinking about what has to happen in order to reallocate a
The first step in reallocating a
vector<string> is to allocate enough memory for all of the new
strings. Next, the library must construct a copy of each
string in the newly allocated space; finally, it must deallocate all of the old
strings. Constructing the copy of each
string, in turn, requires allocating more memory to contain copies of all the characters, and then copying the characters themselves.
Suppose, for example, that we need to use copy operations to move
N strings, which together contain a total of
M characters. Then we do one dynamic allocation for the new
vector, followed by
N dynamic allocations, one for each
N deallocations for the old
strings, and finally one more deallocation for the old
vector. The total is
N+1 deallocations, and copying
M characters plus the data structures of
Now suppose that we can move those
strings directly instead of copying them. We can get by with one dynamic allocation for the new
vector, one dynamic deallocation for the old
vector, plus copying and resetting the data structures of the
N strings. In other words, the total number of dynamic allocations goes from
N+1 down to 1, as does the number of dynamic deallocations, and the amount of data copied goes from being proportional to
N+M to being proportional to
That's an enormous change, particularly because it means that the time needed to expand a
vector<string> depends only on the number of elements in the
vector<string> , rather than depending also on the total amount of data in those elements.
Important as that change may be, I think that there is an even more important gain from moving
vector elements rather than copying them: It is often possible to design data structures so that moving them cannot throw an exception. Of course, there is always the possibility that there may not be enough memory available for the move destination, but I'm not talking about that. Rather, I'm talking about avoiding the nightmare in which the library allocates memory as part of reallocating a
vector, copies some of the elements successfully, and then encounters an exception in the middle of copying the elements. What should the library do in such a case?
It is generally a desirable property that when a library function throws an exception, it should first put things back as they were when the function was called. In the case of reallocating a vector, that property means that the
vector should have its prior value. In order for that situation to be possible, the library must not delete any of the
vector's old elements until it has finished copying all of them. In effect, in order to be able to do the right thing if memory should happen to run out while it is copying elements, the library must use a memory-allocation strategy that maximizes the chance of running out!
If the library can move elements rather than copying them, then it has to allocate only one block of memory, and that happens before any elements are moved. So long as the element class itself is designed so that moving an element cannot throw an exception, once this initial memory allocation has succeeded, the entire reallocation will succeed.
In short, the ability to move
vector elements rather than copying them makes reallocation easier in two ways: Reallocation can run much faster without any additional effort on the programmer's part; and the reallocation can easily be made more robust in the face of possible exceptions.
As one reader mentioned, moving makes reallocation easier in a third way. There are some classes, such as the library I/O stream classes, which do not allow copying. The reason should not be hard to see: Allowing a stream to be copied would require defining what happens if both copies are used for I/O operations, which in turn would make the stream classes even more complicated than they already are. However, even for classes for which copying is hard to define, moving is often easy. So, for example, although the I/O stream classes are not copyable, they are movable. Because the C++11
vector class uses
move rather than
copy for reallocation, it is actually possible to have
vectors with streams as their elements. In C++03, it would have been necessary to use pointers as the elements instead.
The improvements in robustness and functionality that stem from moving instead of copying are, in my view, at least as important as the optimization. All of these improvements, in turn, come from the notion that reallocating a
vector's elements is conceptually moving the elements, not copying them and deleting the originals. It was not until C++ incorporated a notion of moving data that this concept could be implemented accurately without doing extra work.