Andrew Koenig

Dr. Dobb's Bloggers

When the Simplest Case Is One Of The Hardest To Get Right

March 06, 2014

What have we learned so far about floating-point input and output?

Floating-point input involves a conversion with a known target precision. For example, when we write

double x;
std::cin >> x;

we know the precision of x. Therefore, we can reasonably expect that an implementation will cause x to be the best possible approximation to whatever value we read, where "best possible" means "the result of rounding the infinite-precision input value according to the rounding rules currently in effect."
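One way to see what "best possible approximation" means in practice is to feed several different decimal spellings of the same quantity through stream input and compare the results. The following is only a sketch: `read_double` is a hypothetical helper, not part of any library, and it assumes an implementation that rounds input to the nearest representable value.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: read one double from text using ordinary stream
// input, i.e. the same conversion that std::cin >> x performs.
double read_double(const std::string& text) {
    std::istringstream in(text);
    double x = 0.0;
    in >> x;  // the target precision is fixed by the type of x
    return x;
}
```

On such an implementation, `read_double("0.1")`, `read_double("1e-1")`, and `read_double("0.10000000000000000001")` should all compare equal, because each input rounds to the same double.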

In contrast, floating-point output involves a conversion that might not have a known target precision. In particular, when we write

std::cout << 0.1 << std::endl;

there is no obvious choice about the number of significant digits that should appear in the decimal conversion of the floating-point number that was converted from 0.1. To be sure, it is reasonable to expect that this statement will print 0.1; but if we change it slightly:


std::cout << 1.0 / 3.0 << std::endl;

it is far from clear how many 3s should appear in 0.33333… .
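In practice, the C++ library answers this question with a default: a freshly constructed stream has a precision of 6 significant digits. The sketch below captures what an untouched stream produces; `default_output` is a hypothetical name chosen for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: format v exactly as a stream with default
// settings would (general notation, precision 6).
std::string default_output(double v) {
    std::ostringstream out;  // no manipulators applied
    out << v;
    return out.str();
}
```

With these defaults, `default_output(0.1)` is `0.1` and `default_output(1.0 / 3.0)` is `0.333333`. The choice of six digits is a convention, not a consequence of the value being printed.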

In other words, despite appearances, input and output are far from symmetric.

Whenever we do floating-point input, we know exactly what the input is, and we also know exactly what precision the result should have. Accordingly, it seems to be possible to specify how an ideal implementation should handle input: Store the best possible approximation to the given input in the required precision. The hard problem, then, is how to specify the behavior of floating-point output.

One reason that this problem is hard is that it is trying to meet conflicting goals. For example, on a machine with 53-bit double-precision floating-point fractions (i.e., most of the computers in use today), the closest double-precision value to 0.1, converted back to decimal, is (exactly) 0.1000000000000000055511151231257827021181583404541015625. The first 17 significant digits of this number are a 1 followed by sixteen 0s, but the 18th and 19th digits are 55. As a result, if we convert this value to decimal with 16 significant digits, we will get 0.1, but if we use 17 significant digits, we will get 0.10000000000000001.

This is a nasty state of affairs, because it implies that if we want (the closest floating-point value to) 0.1 to print as 0.1, we must limit our output to 16 significant digits. Unfortunately, 16 significant digits are not always enough to represent accurately the value of a floating-point number with a 53-bit fraction. Fifteen digits are plainly too few, because 2^53 is 9007199254740992, which has 16 digits; but 16 digits fall short as well. For example, 0.3 and 0.1 + 0.2 convert to adjacent but distinct double-precision values, and both of them print as 0.3 with 16 significant digits. It takes 17 digits to tell them apart: 0.29999999999999999 versus 0.30000000000000004.
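Claims about digit counts like these are easy to check by experiment. The sketch below assumes IEEE 754 doubles and a library that rounds output correctly; `with_digits` is a hypothetical helper that formats a value with a chosen number of significant digits.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format v in general notation with the given
// number of significant digits (trailing zeros are dropped).
std::string with_digits(double v, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << v;
    return out.str();
}
```

Under those assumptions, `with_digits(0.1, 15)` gives `0.1`, while `with_digits(9007199254740992.0, 15)` and `with_digits(9007199254740991.0, 15)` both give `9.00719925474099e+15`, so two distinct values collapse into one string. Trying larger digit counts on 0.1 shows where the extra digits start to appear.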

In other words, we have two laudable goals that seem impossible to meet at once:

  • Floating-point output should not automatically lose information — that is, when we convert two distinct floating-point numbers to decimal, the conversion should yield two distinct results.
  • Converting a floating-point number that is equal to an obviously simple value such as 0.1 should yield a similarly simple result.

We can rephrase the first of these goals in terms of idempotence: When we convert a floating-point number to decimal, and then convert the decimal representation back to floating-point, we would like the result of this round-trip conversion to be exactly the same value with which we started.
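The round-trip property can be written down as a small test. This is a sketch rather than a definitive check: `round_trips` is a hypothetical helper, it is intended for ordinary finite values, and it assumes that both conversions in the library round correctly. The default digit count is `std::numeric_limits<double>::max_digits10`, which is 17 for IEEE 754 doubles.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: convert v to decimal with the given number of
// significant digits, convert the text back, and report whether the
// result is exactly the value we started with.
bool round_trips(double v,
                 int digits = std::numeric_limits<double>::max_digits10) {
    std::ostringstream out;
    out << std::setprecision(digits) << v;
    double back = std::stod(out.str());  // decimal text back to double
    return back == v;
}
```

For example, `round_trips(0.1 + 0.2)` should be true, whereas `round_trips(0.1 + 0.2, 16)` should be false, because the 16-digit text reads back as the neighboring value 0.3.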

The first of these goals is easy enough to achieve: When we convert a floating-point number to decimal, the result could simply be the exact decimal representation of the number. However, this suggestion is not really practical, because it would mean printing 0.1 as the exact expansion shown earlier, with more than 50 digits after the decimal point. So what should we do?
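To see what exact output would look like, we can ask the C library for more digits than any double can need. The sketch below rests on an assumption that does not hold everywhere: that `snprintf` produces every requested digit exactly, as glibc does. `exact_decimal` is a hypothetical name.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: the exact decimal expansion of v, assuming the
// C library's %f conversion is exact for any requested precision.
std::string exact_decimal(double v) {
    // No finite double needs more than 1074 digits after the decimal
    // point, and none has more than 309 digits before it.
    char buf[1500];
    std::snprintf(buf, sizeof buf, "%.1074f", v);
    std::string s(buf);
    s.erase(s.find_last_not_of('0') + 1);             // drop trailing zeros
    if (!s.empty() && s.back() == '.') s.pop_back();  // "3." becomes "3"
    return s;
}
```

Values such as 0.5 and 0.25 come back unchanged, because they are exact binary fractions. Calling `exact_decimal(0.1)` shows how unwieldy exact output becomes for a value that looks simple in decimal.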

The first major step toward resolving these problems was Jerome Coonen's observation, around 1980, that it was possible to place specific bounds on how much error could be allowed in input and output while maintaining idempotence. Moreover, it was possible to come up with bounds that could be implemented efficiently. These observations were far from easy to prove; indeed, they were ultimately the core of his 1984 PhD thesis, and also found their way into the IEEE floating-point standard.

However, as we observed last week, loose bounds of this sort are an invitation for implementations to differ from each other — and sometimes even from themselves at different times or in different contexts — and such differences can stand in the way of effective debugging. Next week, we'll take a look at a beautifully elegant way of specifying floating-point input and output that avoids all these problems — albeit at the cost of being harder to implement — and then we'll explore the social factors that make this elegant solution hard to implement in practice.
