Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

Is Moving Objects Worth the Hassle?

July 26, 2013

Last week, I discussed how C++ compilers use overloading to decide whether to move or copy an object. This week, I'd like to take a step back and discuss why moving instead of copying is worth doing in the first place.

You might think that this claim needs no justification. After all, copying takes time; so programs that copy a lot of data will be slower than programs that avoid doing so. However, this kind of unthinking optimization can lead to trouble. For example, many programmers tend to rewrite functions such as

 
double norm(double x, double y) { return sqrt(x*x + y*y); }

so that they use references to avoid copying their arguments:

 
double norm(const double& x, const double& y) { return sqrt(x*x + y*y); }

I can think of several arguments against rewriting functions this way unthinkingly:

  • It violates the "say what you mean" principle. If you want to pass a value to a function, saying that the function must not copy the value is overspecification. Moreover, this overspecification is only for speed purposes, and therefore…
  • It violates the "people are more important than machines" principle. I'm probably in a minority in saying this, but I have long believed that machines should be our servants, not our masters. As a result, I dislike the idea of rewriting my code to make life easier for a computer. I suppose I may come to regret that viewpoint after the Singularity.
  • It doesn't make the program all that much faster. At best, the program copies two pointers when it would otherwise have copied two floating-point numbers. That might save a factor of two in the copy operations' execution times; but the program also does two floating-point multiplies, a floating-point add, and a square root; so the difference is apt to get lost in the rest of the program's execution time. At best, the rewrite would speed up the program by a fairly small constant factor.
  • It might even make the program slower. The compiler's optimizer might not be up to realizing that x*x refers to the same object twice when x is a reference, so it might fetch x twice from memory rather than putting it in a register. Or the compiler might put arguments into hardware registers that it does not use for objects referred to by references. At the very least, I would not be willing to make any bets about which of these two functions is faster without measuring them.
  • In more complicated programs, using references instead of call by value can introduce subtle bugs into programs.

An important point in this discussion is that a double occupies a fixed amount of memory. Copying such an object does not require allocating any dynamic memory, and dynamic memory is usually expensive to allocate or deallocate compared with the cost of copying fixed-length objects. As a result, avoiding copying makes a particularly big difference in code that uses dynamically sized objects:

 
double average(vector<double> v) { /* … */ }

In this function, changing the declared type of v from vector<double> to const vector<double>& avoids copying not only the vector<double> argument itself but also the contents of all of that vector's elements. Moreover, the change also avoids having to allocate and deallocate memory to contain copies of those elements.

In other words, moving a vector instead of copying it saves not just the time needed to move the fixed-sized block of memory that the vector itself occupies, but also the time needed to copy all of the vector's elements. More generally, the real savings come about when it is possible to move a container, not just a vector, rather than copying it. Still more generally, the savings can be enormous when each element of a container is itself a container. If you think that C++ programmers rarely define containers of containers, you've probably forgotten about vector<string>.

Copying a vector<string> entails copying

  • The (fixed-length) contents of the vector itself.
  • The dynamically allocated memory that contains the fixed-length part of each of the component strings.
  • The dynamically allocated memory that holds the characters that constitute each individual string.

If we have a vector<string> that contains n strings, then we must copy n+1 dynamically allocated blocks of memory, plus n fixed-size string parts, plus the total number of characters in all of the strings.

If we move this vector<string> instead of copying it, we need to move only the first of these items. As a result, there is no dynamic memory allocation at all, and the execution time is constant rather than being proportional to the total number of characters in the vector<string> plus the number of strings. This is potentially a huge difference in execution time. Moreover, it is a difference that is apt to become more pronounced as the program and its data structures get larger.

To sum up: Programmers often try to optimize their programs by replacing call-by-value with references, and thereby avoid copying function arguments. Such avoidance often makes only a slight difference in programs' execution time. Where the difference really shows up is in dealing with containers — especially containers of containers such as vector<string>.

Next week, I'll give an example in which moving a vector makes more sense than passing it by reference, and another example of why moving vector elements is much faster than copying them even if a programmer neither moves nor copies the elements directly.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Dr. Dobb's TV