 Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726. # Abstractions For Binary Search, Part 5: Getting Down to Details

November 13, 2014

Last week, we wrote this code:

```
size_t binary_search(T x, const T* array, size_t n)
{
size_t begin = 0, end = n;

// Invariant: This function will eventually return a value in the range [begin, end]
while (begin != end) {
size_t mid = (begin + end) / 2;
if (vless(x, mid)) {
end = mid;
} else {
begin = mid + 1;
}
}

return begin;   // Or return end, because begin == end
}
```

Our one remaining coding task is to write `vless`, noting that we defined `vless(x,k)` as yielding `true` if

• `k` is the position of an element of the sequence that is not (strictly) less than `x`, or
• `k` is equal to `n` (i.e., `k` is the off-the-end position).

and otherwise yielding `false`. We can think of `vless(x,k)` as asking "is `x` less than the element at position `k`?" with the implicit understanding that `x` has a value strictly less than that of what we imagine to be the sentinel element that comes after the end of the array.

C++ does not support nested functions in full generality. Nevertheless, we could use a lambda expression to capture the values of `array` and `n`, and use those captured values in the lambda. However, useful though this technique might be, we consider it a diversion that we shall leave to another day. Instead, we observe that we call `vless` from only a single place in the code — so we can simply write our intentions inline without defining a function at all, and also without having to duplicate any code.

In other words, we can replace the line

```
if (vless(x, mid)) {
```

with

```
if (mid == n || !(array[mid] < x)) {
```

The usage `!(array[mid]< x)` is ungainly, so it is worth understanding why we did not write `x <= array[mid]` instead. The reason is that we set out to define a binary-search algorithm that is implemented entirely in terms of the < operation on array elements. We should not assume that we have a <= operation available. For example, suppose that we are working with an array of strings, and that < is a case-insensitive comparison. We might even be using a different name for this operation — it was only as part of defining `vless` that we assumed that there was an operation named < at all. Are we sure that if we know what < means, we also know what == means? It is probably better to keep our code as clear as possible and use only < for comparisons.

Are we done? Well, we could stop here, but then we would miss some opportunities to simplify the code.

The first opportunity comes from the observation that even though we catered to the possibility that `mid` would be equal to `n`, that can never happen. Here's why.

We note first that `end<= n`, because initially `end` is equal to `n`; and we never increase `end`, we only decrease it. Therefore, if `mid < end`, it must also be true that `mid < n`. The question then becomes whether it is ever possible that `mid` is equal to `end`.

The answer to that question is no. We observe first that the `while` condition, together with the knowledge that `begin` and `end` never cross, ensures that `begin < end`. If `begin <= end-2`, then `(begin + end)/2 <= end-1` (before truncation), truncation can never cause `mid` to be equal to `end`. So the only case that matters is `begin == end-1`, in which case `(begin + end)/2 == begin + 0.5` before truncation. In that case, however, because `begin` is `unsigned`, truncation will yield `mid == begin`, and never `mid == end`.

As a result of these observations, we can remove the condition `mid == end` and simplify the code to

```
if (!(array[mid] < x)) {
```

Once we have come this far, it should be obvious that we can simplify the code further by removing the ! and reversing the two branches of the `if`. Doing so leads to

```
size_t binary_search(T x, const T* array, size_t n)
{
size_t begin = 0, end = n;

// Invariant: This function will eventually return a value in the range [begin, end]
while (begin != end) {
size_t mid = (begin + end) / 2;
if (array[mid] < x) {
begin = mid + 1;
} else {
end = mid;
}
}

return begin;   // Or return end, because begin == end
}
```

It's hard to imagine how this code could be much simpler. However, we don't actually know whether it works. In fact, I wrote this entire series of articles so far without doing any testing at all. That's the next order of business, which we shall begin next week.

### More Insights To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.

## Featured Reports ## Featured Whitepapers 