Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

What Does It Mean To Change An Object?

September 26, 2013

Some of the reader comments about last week's note suggest a question of definition. For example, one of the comments eventually leads to an example that includes the following code:

 
template< class Item >
inline auto concat( NcArray<Item> a, Item const v )  ->
NcArray<Item>
{
        a.append( movable( v ) );
        return movable( a );
}

This code appears as part of the definition of a class named NcArray, which is intended to represent an array is never copied, but still supports operations, such as the function concat defined here, that appears to create a new object that contains a copy of the original object's elements.

I say appears in the previous sentence because what really happens is the result of concat contains the contents of its argument augmented by concatenating a new element, and has the side effect that the argument is no longer usable. In other words, if I have an NcArray object named foo, and an element named x, and I write

 
               auto bar = concat(foo, x);

then when I'm done, the value of bar is an NcArray whose elements are all of the elements in foo followed by a copy of x, and as a side effect, I must promise that I will never again use the value of foo.

The question, then, is whether this call of concat modifies foo. We might argue that the definition of concat requires its users to write their code in ways that work properly regardless of whether concat modifies foo. If a program never uses a variable again after passing it as the first argument of concat, it meets this requirement.

With this notion in mind, let's look again at the collatz_vec function I showed two weeks ago. This function executes the statement

 
               result.push_back(n);

repeatedly in a loop, where result is a vector<unsigned>. Using the technique that I described in that article, we could define a tail-recursive auxiliary function something like

 
               vector<unsigned> collatz_vec_aux(unsigned n, unsigned count, vector<unsigned> result)

and then replace the recursive calls in collatz_aux by something like

 
               return collatz_vec_aux(n / 2, count + 1, concat(result, n));

Here, concat would be a function written along the lines of the first example in this article, which would give the appearance of creating a new vector containing the elements of result along with a new element at the end, but which would really recycle the vector's previous contents instead of copying them.

This technique would work for this particular example. However, it does not meet what I think is the definition of a data structure that never changes after being created. For such a data structure, I think it is reasonable to insist that after executing

 
               auto bar = concat(foo, x);
 

the value of foo must be whatever it was before, and there should be no prohibition against looking at that value.

I had thought until recently that such a data structure must be impossible if it were based on contiguous data structures such as arrays or vectors — and, in a sense, I still think it is impossible. However, one can almost get there through cheating. Here's how.

Consider the following simple data structure, each instance of which represents an initial segment of a vector:

 
     template<class T> struct Thing {
           typename vector<T>::size_type size;
           shared_ptr<vector<T>> p;
     };
 

We can imagine implementing a nondestructive concat function along these lines:

 
template<class T> Thing<T> concat(const Thing<T>& t, T n)

{
           
           Thing<T> result = t;
           ++result.size;
           result.p->push_back(n);
     }

This code would affect the contents of the vector associated with a Thing, but because its size element would remain unchanged, the change to the vector would be invisible. Accordingly, after

 
               auto bar = concat(foo, x);

it would still be possible to access the value of foo.

The reason I said one could almost get there is that the following example fails:

 
     auto bar = concat(foo, x);
     auto baz = concat(foo, y);

Here, we would like baz to hold the (original) elements of foo followed by y, but bar has left the underlying vector in a state that renders this strategy impossible. The solution is to make concat check whether the Thing's size matches the vector's size; if it doesn't, it is necessary to copy the vector. This strategy allows well-behaved programs such as collatz_vec to run efficiently, while still delivering semantics — at a price — that preserve the value of concat's argument. We can see this price: Defining bar above takes O(1) time, but defining baz takes O(n) time, where n is the number of elements in foo.

I think that these examples show a fundamental limitation of array-based data structures: There is only one place where "the next element" in such a data structure can go. As a result, if you want to be able to augment a data structure in a uniformly efficient way, it seems to be necessary to use a node-based structure rather than an array-based one.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video