Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

Asymmetric Bounds, Part 4: Pragmatic Advantages

June 28, 2012

People who come to C++ (or C) from languages such as Fortran or Basic that normally begin array indices with 1 sometimes argue that the C++ way of doing things is counterintuitive. For example, if a 42-element array has indices 1 through 42, the argument is that it is easier to tell that the array has 42 elements than it is if the array has indices 0 through 41.

This argument makes sense if you accept that "1 through 42" is the usual way of describing the range of permissible indices — and you also accept that 1 is a reasonable value for the smallest possible index. However, as soon as you admit that the first element of an array might not be element number 1, this simplicity evaporates.

For example, suppose you have a bar graph with an x-axis that represents each of the years 1986 through 2008, inclusive. How many bars are there? Subtract 1986 from 2008 and you get 22, so the answer is probably 21, 22, or 23. But without pausing for thought, do you know for sure which of these alternatives is the right one? I didn't think so.

This particular uncertainty comes from using symmetric bounds. If instead I had said that the first year in the interval was 1986 and the first year beyond the interval was 2009, then you could subtract 1986 from 2009 to get 23, and know immediately that the bar graph would have 23 bars. This observation gives us a useful way to think about asymmetric bounds in general: Instead of representing the first and last elements of a range, we can think of the range as being followed immediately by another range, and we represent the first element of each of those ranges.

It is easier to see the utility of representing only the beginnings of ranges if you think about how you would represent several consecutive ranges at once. If the ranges are consecutive, there is no need to store the beginning and end of each range separately: The end of each range except the last is just before the beginning of the one that follows it. The only wrinkle is that representing ranges this way uses asymmetric bounds for all but the last range. As a result, almost anything you might do with such ranges becomes simpler if you represent the last range in the same way.

As an example, suppose you want to convert numeric grades to letter grades. You might say that A is 90-100, B is 80-89, and so on. Or you might just write down the following:

               A begins at 90
               B begins at 80
               C begins at 70
               D begins at 60
               F is anything below that.

This table has the same information as the explicit ranges, but it's much easier to manage than the explicit, inclusive ranges. Moreover, it's more honest. For example, if an A starts at 90 and a B ends at 89, what do you do with a grade of 89.5? You might argue that the grades have to be integers, so that condition is impossible; but that argument fails if someone comes back later and changes the code. If you don't believe that the first definition could lead to a crash when a student gets a grade of 89.5, you haven't been programming long enough.

Once we adopt the habit of representing only beginnings of ranges, and treating what comes immediately after a range as if it were another range, a lot of computations become much easier. For example, it becomes obvious that an empty range is equivalent to using the same value for the beginning and the end. It takes only a little more thought to realize that the number of elements in the range must be the difference between the beginning and the end. Moreover, you no longer have to worry about whether to use < or <= to end a loop; just use != (or ==) and be done with it. And of course the fact that you don't have to think about ordering comparisons makes it possible to use iterators that don't support + or - at all.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.