# Abstractions for Binary Search, Part 8: What Does It Mean To Say "It Works?"

Last week we talked in general terms about how to test a binary-search function. Now let's get specific.

From last week's discussion, we know that we intend to create a series of arrays; for each array, we shall search for every element in the array, for a value between each adjacent pair of distinct elements, and for values less than the first and greater than the last array element. Suppose such an array has *n* distinct elements. It might have more than *n* elements if two or more elements are equal to each other. Then we shall search that array *2n* times: Once for each element, once for a value less than each element, and once for a value greater than the last element.

We intend to do each of these searches twice: Once using the binary-search algorithm and once using the linear-search algorithm that we constructed to yield the same result. On average, each linear search will take *O(k)* time, where *k* is the number of elements in the array. Note that *k>n* is possible when the array has duplicate elements. Each search will therefore take *O(log k)* for the binary search, plus *O(k)* for the linear search. This latter term dominates, so we can use *O(k)* to estimate the time needed to test a single *k*-element array. There are 2^{k} such arrays, so testing a *k*-element array will take time proportional *k* times 2^{k}. Despite this rapid growth, we can plausibly expect to be able to test all arrays exhaustively up to, say, 16 or so. We shall verify this guess empirically in the future.

Suppose array is an appropriate array with *k* elements and *n* distinct elements. If we use our odd-number strategy from last week, we should be able to check this array by searching only for values of *x* in the range [0, 2*n*]. So, for example, if *n* is 2, then the array will contain only the values 1 and 3; we should be able to verify the binary-search algorithm in that array by searching for 0, 1, 2, 3, and 4, and no other values.

We are now beginning to imagine a loop that looks something like this:

for (unsigned x = 0; x <= 2 * n; ++x) { assert (binary_search(x, array, k) == linear_search(x, array, k)); }

This loop would be enclosed in a larger one that would generate appropriate values of *k* and *n*, along with appropriate array elements.

Suppose we run this code with every possible *k*-element array and every value of *k* up to 16 or whatever. Can we now be confident that our code works? Well, no. For one reason, `assert`

is an optional statement: If you compile code that contains an `assert`

with the macro variable `NDEBUG`

turned on, the `assert`

has no effect. So at the very least, we should precede this code (and all test code that relies on `assert`

statements) with something like

#define NDEBUG 0

in order to ensure that `assert`

statements are not quietly ignored.

More seriously, there are some additional sanity checks that we can make on the result:

- It must be in the range [0,
*k*]. If the result refers to an element — that is, the result is not*k*— then that element must not be greater than*x*. - If the result refers to an element that is not the first element, then the element
*before*that one must be strictly less than*x*.

In principle, our `linear_search`

function should ensure that these conditions are met, but there is no harm in verifying them separately. Our loop now looks like this:

for (unsigned x = 0; x <= 2 * n; ++x) { auto r = binary_search(x, array, k); assert (r >= 0 && r <= k); if (r != k) { assert(!(x < array[r]))); if (r > 0) assert(array[r-1] < x); } assert (r == linear_search(x, array, k)); }

In principle, the first part of the first assert, `r >= 0`

, should never be necessary because `r`

should have an `unsigned`

type. Nevertheless, I see no reason not to check it anyway: Perhaps the author of `binary_search`

sloppily wrote a signed return type.

This is about all we can do without looking inside the `binary_search`

function — which isn't really enough. For example, one property of `binary_search`

that we want to verify is that it should never look at array elements that do not exist. Doing so might mean that our code appears to give the right answer by accident. Accordingly, next week we shall see what extra instrumentation is needed in order to ensure from the inside that the `binary_search`

function is behaving as it ought. After that, we shall put all the pieces together.