More Thoughts On Sorting
I have received several comments from readers of this article. The most thoughtful of these was sent to me privately rather than appearing in the comments section, so I will leave it to the author to identify himself if he wishes. I intend the following remarks to address his comments along with the online ones.
First, I failed to make clear an important distinction between the behavior of specific algorithms and the requirements that the C++ Standard imposes on its library. The standard C++ library offers two generic sorting algorithms, sort and stable_sort, and imposes performance requirements on them that suggest, but do not require, particular implementations.
The sort function's requirements are loose: Sorting an N-element sequence requires "approximately N log N" comparisons on the average. For stable_sort, the requirements are both more generous and more rigid: "at most N (logN)2 comparisons; if enough extra memory is available, it is N log N." There is also a footnote that tells readers to use stable_sort if worst-case behavior is important.
These requirements are meant to allow implementations to use Quicksort (which is not stable) for the sort function, and to require an algorithm that is much faster than bubble sort (though usually not as fast as Quicksort), such as an in-place merge sort , for stable_sort. Although it is certainly true that nothing forbids a stable sorting algorithm from running quadratically, such an algorithm is unacceptable in a standard-conforming C++ implementation.
Another reader argued, correctly, that quadratic algorithms are fine for "'last-mile' cases where only a few elements need to be sorted." I completely agree. But we're talking here about what techniques to teach to beginners. When I said that bubble sorts should not appear in a textbook, I intended by context to mean an introductory textbook. Of course bubble sorts should appear in, for example, a textbook that includes a survey of sorting algorithms.
This brings me to the comments that the reader sent me directly. He made what I think are four good points, and left out what I think is a fifth point.
He begins by grouping bubble sort with selection and insertion sort, a grouping with which I agree. All of these algorithms are similar in implementation and have similarly wretched performance. He goes on to argue:
- Quadratic sorts are a good example of algorithmic thinking. In particular, they provide a good example for students who have trouble understanding nested loops.
- Implementing a quadratic sort provides a good occasion to introduce the notion of measuring performance.
- Measuring performance, in turn, makes it possible to drive home the idea of how slow quadratic algorithms really are.
- This dramatically bad performance motivates a search for better algorithms.
This reader argues that teaching quadratic sorting algorithms is a good idea for students who wish to become computer scientists, rather than just computer programmers.
I think that all of these arguments make sense, and may ultimately boil down to a matter of taste. For me, these arguments are outweighed by:
- Students tend to use the first technique they learn long after they should have outgrown it.
For this reason, I cannot bring myself to teach students a bad way of solving a problem until after I have shown them a good way to solve it — not even if I emphasize that what I am teaching is a bad solution. I am too concerned that I will give them bad habits that will be hard to break later.
That said, I am much less unwilling to teach a bad solution followed by a good one than I am to teach a bad solution by itself.
Of course, for most people — especially those who are acting as programmers rather than as computer scientists — the best way to sort a sequence is to use the standard library. All too many textbooks make four mistakes at once:
- They teach readers how to write their own sorting algorithms before using the standard library.
- The first sort algorithm they teach has quadratic performance.
- They never teach how to write a sort algorithm that performs better; and
- They never get around to explaining what a stable sort is or why it is useful.
There is an alternative approach that avoids all four of these problems. I'll explain that strategy in my next article.