Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

Abstractions For Binary Search, Part 5: Getting Down to Details

November 13, 2014

Last week, we wrote this code:

 
       size_t binary_search(T x, const T* array, size_t n)
     
     {
    
              size_t begin = 0, end = n;
 
                    // Invariant: This function will eventually return a value in the range [begin, end]
                    while (begin != end) {
                              size_t mid = (begin + end) / 2;
                              if (vless(x, mid)) {
                                       end = mid;
                             } else {
                                       begin = mid + 1;
                             }
           }
 
           return begin;   // Or return end, because begin == end
     }

Our one remaining coding task is to write vless, noting that we defined vless(x,k) as yielding true if

  • k is the position of an element of the sequence that is not (strictly) less than x, or
  • k is equal to n (i.e., k is the off-the-end position).

and otherwise yielding false. We can think of vless(x,k) as asking "is x less than the element at position k?" with the implicit understanding that x has a value strictly less than that of what we imagine to be the sentinel element that comes after the end of the array.

C++ does not support nested functions in full generality. Nevertheless, we could use a lambda expression to capture the values of array and n, and use those captured values in the lambda. However, useful though this technique might be, we consider it a diversion that we shall leave to another day. Instead, we observe that we call vless from only a single place in the code — so we can simply write our intentions inline without defining a function at all, and also without having to duplicate any code.

In other words, we can replace the line

 
if (vless(x, mid)) {

with

 
if (mid == n || !(array[mid] < x)) {
 

The usage !(array[mid]< x) is ungainly, so it is worth understanding why we did not write x <= array[mid] instead. The reason is that we set out to define a binary-search algorithm that is implemented entirely in terms of the < operation on array elements. We should not assume that we have a <= operation available. For example, suppose that we are working with an array of strings, and that < is a case-insensitive comparison. We might even be using a different name for this operation — it was only as part of defining vless that we assumed that there was an operation named < at all. Are we sure that if we know what < means, we also know what == means? It is probably better to keep our code as clear as possible and use only < for comparisons.

Are we done? Well, we could stop here, but then we would miss some opportunities to simplify the code.

The first opportunity comes from the observation that even though we catered to the possibility that mid would be equal to n, that can never happen. Here's why.

We note first that end<= n, because initially end is equal to n; and we never increase end, we only decrease it. Therefore, if mid < end, it must also be true that mid < n. The question then becomes whether it is ever possible that mid is equal to end.

The answer to that question is no. We observe first that the while condition, together with the knowledge that begin and end never cross, ensures that begin < end. If begin <= end-2, then (begin + end)/2 <= end-1 (before truncation), truncation can never cause mid to be equal to end. So the only case that matters is begin == end-1, in which case (begin + end)/2 == begin + 0.5 before truncation. In that case, however, because begin is unsigned, truncation will yield mid == begin, and never mid == end.

As a result of these observations, we can remove the condition mid == end and simplify the code to

 
if (!(array[mid] < x)) {

Once we have come this far, it should be obvious that we can simplify the code further by removing the ! and reversing the two branches of the if. Doing so leads to

 
       size_t binary_search(T x, const T* array, size_t n)
 
    {
           
              size_t begin = 0, end = n;
 
                     // Invariant: This function will eventually return a value in the range [begin, end]
                    while (begin != end) {
                              size_t mid = (begin + end) / 2;
                              if (array[mid] < x) {
                                         begin = mid + 1;
                              } else {
                                        end = mid;
                              }
           }
 
           return begin;   // Or return end, because begin == end
     }

It's hard to imagine how this code could be much simpler. However, we don't actually know whether it works. In fact, I wrote this entire series of articles so far without doing any testing at all. That's the next order of business, which we shall begin next week.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video