Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

Object Swapping, Part 2: Algorithms Can Often Swap Instead of Copy

March 21, 2012

Last week I introduced swapping, and mentioned that is it often faster than assignment — a fact that often allows algorithms that swap to be dramatically faster than they would be if they used assignment. Here's why.

C++ requires all objects of a given class to have the same size, and for the size to be known during compilation. This requirement makes it possible for the compiler to lay out each function's stack memory during compilation, thereby making it much easier to generate efficient machine code.

Of course, this requirement conflicts with the desire for variable-sized data structures such as strings and vectors. The C++ library deals with this conflict by implementing each such data structure as a fixed-size part, which resides in each object, and a variable-size part, which the library allocates and frees dynamically. A string or vector therefore consists typically of a small amount of information that includes a pointer to dynamic memory containing the actual data.

Copying such a data structure requires allocating new dynamic memory and copying the data from the original into it. Destroying such a data structure requires freeing that dynamic memory. Both allocating and freeing dynamic memory are relatively slow operations. Moreover, copying the contents of the dynamic memory will also be slow if there is a lot to copy.

As a result of this implementation strategy, it is expensive to copy or destroy a string, vector, or similar data structure. Swapping two such data structures, on the other hand, is usually much simpler: It suffices to swap the contents of the objects themselves, and the dynamic memory can be left alone. As a result, swapping two such data structures is often much faster than copying or destroying them. If the contents are large, the speed difference can be a factor of 100 or more.

Now let's think about a vector of strings, which we'll name vs. Here is a code fragment to remove element number n from vs. We will assume that n is unsigned:

assert (n < vs.size());

for (size_t i = n+1; i != vs.size(); ++i)
           vs[i-1] = vs[i];

The assert does not need to check that n >= 0 because n is unsigned; so of course it is always >=0.

The for loop copies vs[n+1] into vs[n], then copies vs[n+2] into vs[n+1], and so on. After it has finished, we no longer need the last element of vs, so we shrink vs by one, discarding the last element.

This code is capable of wasting a great deal of time. Copying vs[n+1] into vs[n] creates a copy of vs[n+1], after which the original vs[n+1] is overwritten by a copy of vs[n+2] , and so on. So all but one of the elements following vs[n] will be copied and the copies then overwritten.

Now consider this rewrite:

assert (n < vs.size());

for (size_t i = n+1; i != vs.size(); ++i)
           vs[i].swap(vs[i-1]);   // changed

In this version, instead of copying each element after vs[n] into its predecessor, we swap the elements. When the loop completes, the original value of vs[n] will have been swapped all the way into the last element of vs, so it is this original value that the call to vs.resize will destroy. Moreover, the loop will accomplish this feat without ever making a copy of any element of vs.

This example suggests a general principle:

For types that are faster to swap than to copy, 
a sequence of assignments of the form
a = b; b = c; … will be faster if rewritten as
swap(a, b); swap(b, c); and so on.

If it is necessary to preserve the last value in the sequence, the last assignment can be left alone. So, for example,

             a = b; b = c; c = d; d = e;

can become

             swap(a, b); swap(b, c); swap(c, d); d = e;

and only the value of e is copied.

Of course, this swapping strategy is similar to the strategy for using the move operation that is part of C++11. Next week we will look at the curious relationship between swapping and moving.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.