# The Case Against int, Part 3: The Advantages and Perils of Unsigned Arithmetic

March 07, 2012

Last week, I noted that integers are often used for counting, and that counting is often better suited to unsigned than to signed arithmetic. To that end, C++ defines an unsigned integer type, `size_t`, that is well-suited to storing array indices, or otherwise counting items that can fit in the computer's memory. Moreover, each of the standard-library containers, such as `vector<T>`, defines a type analogous to `vector<T>::size_type`, which is an unsigned integer with an appropriate capacity to store the number of elements in a `vector<T>`.

If you're counting objects, the number of objects you're counting can't be negative. Because unsigned integers cannot express negative values, a whole category of errors becomes impossible. This claim isn’t mere sophistry, either. Suppose, for example, you want to check whether `n` is a suitable index for a vector `v`. You might think you would have to write

```
if (n >= 0 && n < v.size()) { /* n is suitable */ }
```

However, the fact that `v.size()` yields an unsigned value makes it possible to write

```
if (n < v.size()) { /* n is suitable */ }
```

The reason is that, in general, combining signed and unsigned integers in C++ converts the signed quantity to unsigned. Accordingly, if `n` is negative, it is converted to a huge positive unsigned value, which will appear to be larger than `v.size()`.

Such behavior is often desirable, as it is here; but it can also cause trouble. As an example of such trouble, consider a loop to visit all of the elements of `v` in ascending order:

```
typedef decltype(v.size()) vsize_t; for (vsize_t n = 0; n != v.size(); ++n) { /* … */ }
```

Here, I have started by defining `vsize_t` as the type yielded by `v.size()`, and done so in a way that makes it unnecessary for me to know what kind of container `v` is. I have then written a straightforward loop that steps `n` through each index of `v`.

What if I want to make this loop run backward? The obvious way to write it is

```
for (vsize_t n = v.size()-1; n >= 0; --n) { /* Wrong! */ }
```

but a little examination should convince you that this code is a disaster. In fact, a considerate compiler will warn about the problem: `n` is unsigned, so `n >= 0` is always true. Therefore, the loop will never terminate.

This example exposes one of the difficulties in working with unsigned quantities: Loops often terminate after a variable has passed beyond the range of acceptable values. If that range is bounded by 0, and you're using an unsigned variable, it may be impossible to go beyond that bound.

There's a general rule for working with such bounds: When a loop runs backward, decrement at the beginning of the loop; when it runs forward, increment at the end. Decrementing at the beginning simplifies the code in two ways: It makes it possible to start a loop at the upper off-the-end value, rather than one element earlier, and it ensures that the lower bound will never be transgressed.

This last paragraph is a bit of a mouthful, so let me show the two general forms I have in mind, using `while` instead of `for` to reduce the code to its essentials. Let's start with the familiar form of a loop that increases a variable's value each time through:

```             vsize_t n = 0;
while (n != v.size()) {
// Process v[n]
++n;
}
```

If we are moving backward through `v`, the form is less familiar:

```             vsize_t n = v.size();
while (n != 0) {
--n;
// Process v[n]
}
```

Writing the loop in this slightly unusual form allows us to stop the loop before `n` crosses zero (which it will never do, so we had better not try to test for it). It also makes it unnecessary to write a separate calculation to start `n` at `v.size()-1` instead of `v.size()`. Moreover, the same technique works for iterators:

```             auto iter = v.end();      while (iter != v.begin()) {
--iter;            // Process *iter
}
```

The similarity between these last two code examples comes from a similarity between unsigned integers and iterators: Because there is an off-the-end value but no off-the-beginning value, both unsigned integers and iterators require special care when a value might run off the beginning of a range (i.e., might underflow).

#### Summary

• Unsigned integers are particularly well suited to counting, because the number of items that one is counting cannot be negative.
• Mixing signed and unsigned integers usually converts the signed integer to unsigned. This conversion can be useful (i.e., it can simplify bounds checking), but it can also cause trouble, particularly around lower bounds of ranges.
• One rule of thumb for avoiding such trouble is to decrement a control variable at the beginning of a loop. Using this rule requires the variable to start out one step beyond its "real" first value, but that state of affairs is usually easy to achieve.
• Techniques for writing loops that use unsigned integers can carry over to loops that involve iterators.

### More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.

# First C Compiler Now on Github

The earliest known C compiler by the legendary Dennis Ritchie has been published on the repository.

# HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

# Building Bare Metal ARM Systems with GNU

All you need to know to get up and running... and programming on ARM

# Amazon's Vogels Challenges IT: Rethink App Dev

Amazon Web Services CTO says promised land of cloud computing requires a new generation of applications that follow different principles.

# How to Select a PaaS Partner

Eventually, the vast majority of Web applications will run on a platform-as-a-service, or PaaS, vendor's infrastructure. To help sort out the options, we sent out a matrix with more than 70 decision points to a variety of PaaS providers.

More "Best of the Web" >>