Containers That Never Change

September 19, 2013

Last week, I introduced a programming style that never changes the value of a variable once that value has been established. In that article, I made a statement that one reader questioned:

If we try to write this program recursively using the technique we mentioned earlier, we find that each recursive call passes the entire result array as an argument. As a result, the program's runtime will be quadratic in the number of iterations — hardly a happy state of affairs.

Here's why I made this claim.

We start with the example to which I was referring last week:

```
vector<unsigned> collatz_vec(unsigned n)
{
assert(n != 0);
vector<unsigned> result;

while (n != 1) {
result.push_back(n);
if (n % 2 == 0)
n /= 2;
else
n = 3 * n + 1;
}
return result;
}
```

Next, we rewrite it as a tail-recursive function. We do so by observing that the variables that change in the loop are `result` and `n`, so we define an auxiliary function with those values as parameters:

```
vector<unsigned> collatz_aux(unsigned n, vector<unsigned> result)
{
if (n == 1)
return result;
new_result = concat(result, n);
if (n % 2 == 0)
return collatz_aux(n / 2, new_result);
else
return collatz_aux(3 * n + 1, new_result);
}
```

We now rewrite our `collatz_vec` function to call our auxiliary function:

```
vector<unsigned> collatz_vec(unsigned n)
{
assert (n != 0);
return collatz_aux(n, vector<unsigned>());
}
```

The subexpression `vector<unsigned>()` is a convenient way of constructing a `vector` with no elements.

At this point, we have a problem: How do we write `concat`? The obvious technique is something like this:

```
vector<unsigned> concat(vector<unsigned> v, unsigned n)
{
v.push_back(n);
return v;
}
```

This function violates the restriction we wished to impose against changing the value of a variable once such a value was established. Unfortunately, so long as we stick with `vector`s, it's the best we can do. At least, by using this function, we have been able to encapsulate the idea of appending an element to a `vector` to create a new `vector`. However, this function copies all of the elements of `v`, which means that our `collatz_vec` function winds up running in quadratic time — because as result grows, all of the vector's elements must be copied on each iteration.

In short, in order to make the recursive `collatz_vec` function run in a reasonable amount of time, we need some kind of `vector`-like data structure that allows us to append an element in O(1) time, rather than having to copy all of the elements each time we wish to append one. Putting it more simply, the whole notion of appending rests on being able to change the data structure to which we are appending. What we want is to be able to create a new data structure that contains all of the elements of the old one, along with one additional element, and we wish to do so in O(1) time.

There are many data structures of this kind. Lisp used one of the simpler ones. Next week I'll start explaining how it works.

More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.