Andrew Koenig

Dr. Dobb's Bloggers

Even Simple Floating-Point Output Is Complicated

January 23, 2014

We continue the theme from last week and the week before that by discussing an idea that is both fundamental and complicated: floating-point input-output — in particular, the behavior that a programmer should have the right to expect from a program that reads or writes a floating-point number in human-readable form.

Much thought has gone into such behavior over the years, and as a result, an answer has emerged that is easy to define and to understand. Moreover, open-source implementations of this desirable behavior are generally available. Nevertheless, most programming languages still do not specify this aspect of implementations' behavior — because of the very tension between theory and practice that we have been discussing.

The root of the problem is that people usually write numbers in base 10 and computers usually store numbers in base 2. 10 is, of course, 2 times 5; and 2 does not divide 5 evenly. It follows that powers of 2 (other than 1) are never powers of 10, and vice versa. It further follows that numbers that are easy to write exactly in base 10, such as 0.1, are often impossible to write exactly in base 2: a fraction has a finite base-2 expansion only if its denominator, in lowest terms, is a power of 2, and the denominator of 1/10 contains a factor of 5. For example, 0.1 in base 10 is 0.0001100110011… in base 2, with the group 0011 repeating forever; the accuracy of the approximation depends on how many bits one is willing to use to write it.
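
The expansion of 0.1 can be reproduced by long division in base 2. What follows is only a minimal sketch; the function name binary_fraction_bits is made up for illustration, and the code assumes 0 <= num < den with den small enough that doubling the remainder cannot overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the article): returns the first `count` bits
// after the binary point of the fraction num/den, computed by long division.
// Assumes 0 <= num < den and den < 2^63, so that 2 * remainder cannot overflow.
std::string binary_fraction_bits(std::uint64_t num, std::uint64_t den,
                                 unsigned count) {
    assert(den != 0 && num < den && den < (std::uint64_t{1} << 63));
    std::string bits;
    std::uint64_t remainder = num;
    for (unsigned i = 0; i < count; ++i) {
        remainder *= 2;              // shift one binary place to the left
        if (remainder >= den) {      // the next bit is 1
            bits += '1';
            remainder -= den;
        } else {                     // the next bit is 0
            bits += '0';
        }
    }
    return bits;
}
```

Under these assumptions, binary_fraction_bits(1, 10, 13) yields 0001100110011, and the remainder never reaches zero, which is why the expansion never terminates. By contrast, binary_fraction_bits(1, 8, 3) yields 001 and leaves a remainder of zero.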

Interestingly, there is no problem converting numbers in base 2 to their exact equivalents in base 10. For example, 0.1, 0.01, and 0.001 in base 2 are 0.5, 0.25, and 0.125 in base 10, respectively. More generally, it is always possible to represent a number in base 2 with n bits after the binary point exactly as a number in base 10 with n digits after the decimal point. This phenomenon leads to a fundamental conclusion about floating-point output:

It is always possible to represent a binary floating-point number exactly in decimal if you are willing to use as many decimal digits after the decimal point as there are bits after the binary point.

Of course, the computation that produced a particular binary floating-point number may have its own errors; but every binary floating-point number represents a mathematically exact value that always has an exact decimal representation.
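
To make the claim concrete, here is a rough sketch of how such an exact decimal expansion might be computed with integer arithmetic alone. It is not how any particular library does it. The name exact_decimal is invented for this example, and the code assumes an IEEE 754 double with a 53-bit significand, a value strictly between 0 and 1, and no more than 60 bits after the binary point.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the article): returns the exact decimal
// expansion of a double d with 0 < d < 1.
// Assumes a 53-bit binary significand (IEEE 754 double) and a value that
// needs at most 60 bits after the binary point.
std::string exact_decimal(double d) {
    assert(d > 0.0 && d < 1.0);
    int exp2 = 0;
    double frac = std::frexp(d, &exp2);   // d == frac * 2^exp2, 0.5 <= frac < 1
    // Scaling by 2^53 is exact and yields an integer m with d == m / 2^k.
    std::uint64_t m = static_cast<std::uint64_t>(std::ldexp(frac, 53));
    int k = 53 - exp2;
    while ((m & 1) == 0) {                // drop trailing zero bits
        m >>= 1;
        --k;
    }
    assert(k >= 1 && k <= 60);            // keeps 10 * remainder below 2^64
    const std::uint64_t mask = (std::uint64_t{1} << k) - 1;
    std::string out = "0.";
    std::uint64_t remainder = m;          // remainder / 2^k is the part not yet printed
    while (remainder != 0) {
        remainder *= 10;                  // shift one decimal place to the left
        out += static_cast<char>('0' + (remainder >> k));
        remainder &= mask;
    }
    return out;
}
```

For example, exact_decimal(0.125) returns 0.125. Each pass through the loop removes one factor of 2 from the denominator, so the loop produces exactly one decimal digit for every bit after the binary point and then stops.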

Let's see what happens when we try to convince a C++ implementation to show us one of those exact decimal values. We'll start with a simple example:

    double d = 1.0 / 3.0;
    std::cout << d << std::endl;

When I run this on my computer, it prints 0.333333. This behavior suggests that the default is to display either six digits after the decimal point or six significant digits overall; I'll leave it to the reader to devise an experiment to determine which case applies.
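
For readers who would like to compare notes, one possible experiment is to print a value that has digits on both sides of the decimal point. The sketch below uses a helper named default_format, which is an invention for this example; it relies on a default-constructed std::ostringstream having the same formatting defaults as std::cout.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from the article): formats a double using a
// default-constructed stream, which has the same default precision and
// notation as std::cout.
std::string default_format(double value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

If the default were six digits after the decimal point, default_format(100.0 / 3.0) would return 33.333333. A conforming implementation is expected to return 33.3333 instead, which points to six significant digits overall.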

Of course, 0.333333 is not the closest decimal approximation to the floating-point value 1.0 / 3.0. We can see this by running the following program fragment:

    double d = 1.0 / 3.0;
    std::cout << d - 0.333333 << std::endl;

Doing so prints 3.33333e-07, which represents the first six significant digits of the difference between 1.0 / 3.0 and 0.333333. Moreover, if we increase the precision that we use to print our value:

    // std::setprecision is declared in <iomanip>
    double d = 1.0 / 3.0;
    std::cout << std::setprecision(20) << d << std::endl;

we get 0.33333333333333331483, a result that suggests that our double variable d contains about 16 decimal digits' worth of precision, or at least a result that suggests that we can print that many digits. Is this result an exact representation of the value of d? Probably not: On this particular computer, a double variable comprises 64 bits, of which 12 are devoted to the sign and exponent. That leaves 52 bits, to which we add a hidden leading bit that is always 1 to obtain 53 significant bits. Accordingly, the value of 1.0 / 3.0 should have at least 53 bits after the binary point (one more, in fact: the value lies between 1/4 and 1/2, so its first significant bit is the second bit after the binary point), so that representing that value exactly in decimal should require at least 53 digits after the decimal point.
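
The bit counts above can be checked by pulling a double apart. This is a sketch under assumptions rather than portable code: it assumes the IEEE 754 64-bit format and that a double and a 64-bit integer share the same byte order, and the names DoubleFields and split_double are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumptions of this sketch: double is the IEEE 754 64-bit format.
static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 double assumed");
static_assert(sizeof(double) == sizeof(std::uint64_t), "64-bit double assumed");
static_assert(std::numeric_limits<double>::digits == 53, "53 significant bits assumed");

// Hypothetical type (not from the article) holding the three fields of a double.
struct DoubleFields {
    std::uint64_t sign;      // 1 bit
    std::uint64_t exponent;  // 11 bits, stored with a bias of 1023
    std::uint64_t fraction;  // 52 stored bits; the hidden leading 1 is not stored
};

// Copies the object representation of d into an integer and splits it.
DoubleFields split_double(double d) {
    std::uint64_t bits = 0;
    std::memcpy(&bits, &d, sizeof bits);
    DoubleFields fields;
    fields.sign = bits >> 63;
    fields.exponent = (bits >> 52) & 0x7FF;
    fields.fraction = bits & ((std::uint64_t{1} << 52) - 1);
    return fields;
}
```

On such a platform, split_double(1.0 / 3.0) gives a sign of 0, an exponent field of 1021 (that is, 2 to the power -2 once the bias of 1023 is subtracted), and a fraction of 0x5555555555555. The sign and exponent account for 12 bits, and the 52 fraction bits plus the hidden bit account for the 53 significant bits.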

What happens when we try to obtain this 53-digit representation? If we execute

    double d = 1.0 / 3.0;
    std::cout << std::setprecision(53) << d << std::endl;

we get 0.3333333333333333148296162562473909929394722, which has only 43 digits after the decimal point. Without even going to the trouble of verifying these digits, we can be confident that what we have here is not an exact representation.

Why should something as seemingly simple as printing a floating-point number be so hard? The answer is a tangle of technical, pragmatic, and historical reasons, which we shall begin to explore next week.
