# Teaching C++ Badly — How To Sort Slowly

May 25, 2011

This post continues the discussion that I started in April about how not to teach C++. It is inspired by code like the following:

```// Code like this should <i>never</i> appear in a textbook
for (int i = 0; i < arraysize – 1; ++i) {
for (int j = 0; j < arraysize – 1; ++j) {
if (array[j] > array[j+1]) {
temp = array[j];
array[j] = array[j+1];
array[j+1] = temp;
}
}
}
```

I adapted this code from a well-known textbook. I have changed the names of the variables and reformatted it so that you can't figure out what textbook it is, because I don't want to embarrass the author or authors. What I do want to do is explain the comment that appears before the code — why it is such a bad idea to put code like this in a textbook. Ever.

This code is a kind of bubble sort, so called because the small elements float toward the beginning of the array being sorted, and the large elements sink to the end. It works by repeatedly comparing pairs of adjacent elements and swapping them if they are out of sequence. It is one of the simplest sorting algorithms, so textbooks often use it to teach beginners about algorithms.

Sorting, in turn, is one of the most important algorithms in all of computer science. I once saw an estimate that a quarter of all computer time used worldwide went into sorting. (This claim preceded the widespread use of computer graphics; I would guess that today the comparable task is moving bits from one part of a display to another.) So why not show students how to sort data as a fundamental part of their education?

The first argument against bubble sorting is that it is slow. This claim may come as a surprise to those of my readers who know that I think human time is more important than computer time. However, quadratic algorithms are so wretchedly slow that they often outweigh the importance of human time.

If you have a 10-element array, and increase it to a 10,000-element array, a bubble sort will take not 1,000 times as long to run, but 1,000,000 times as long. Bubble sorting is so slow that tasks that ought to be commonplace, such as sorting the words in a dictionary, are too slow to contemplate, even on modern computers. Moreover, making the computer faster makes the problem worse, because more speed is usually accompanied by more memory. If you replace your computer with one that runs twice as fast and has twice as much memory, sorting the biggest array that will fit in memory takes twice as long!

The usual argument for using bubble sort is that it is an easy algorithm to understand. I don't buy it, because my experience is that simplified though it might be, bubble sort is still too complicated for most beginners. If you disagree with me, your homework assignment is to look at the code above and determine whether it is actually correct — and then to explain, in terms beginners can understand, how you convinced yourself that the code is correct.

But even if you buy the argument that bubble sort is easy enough for beginners to understand, what you're essentially saying is that it is a good thing to teach beginners how to write a bad solution to a problem that the standard library solves well:

```std::sort(array, array+arraysize);
```

This call is not only much easier to understand than my original example, but it also runs much faster. Moreover, using it also opens the opportunity to explain what I think is an even better alternative:

```std::stable_sort(array, array+arraysize);
```

I would much rather spend time explaining to students what a stable sort is, and why they should use stable_sort instead of sort unless they have a good reason to do otherwise, than to explain to them how to do their own sorting — in ways they barely understand — by writing code that is too slow for any serious use.

For that matter, comparing sort with stable_sort gives the opportunity to discuss an important engineering trade-off with security implications. The standard requirements for sort are intended to allow implementations to use Tony Hoare's classic Quicksort algorithm , which usually runs in time proportional to n log(n). However, it is possible, if the input to Quicksort happens to be particularly unfortunately ordered, for Quicksort to run in quadratic time.

What is worse is that in 1999, Doug McIlroy discovered a simple way to reverse-engineer an implementation of Quicksort by giving it input that causes it to run in quadratic time. The implication of this discovery is that programs that use Quicksort are vulnerable to what is essentially a denial-of-service attack if it is possible for a malicious user to control the details of what is being sorted.

The standard requirements on stable_sort are intended to rule out Quicksort's occasional quadratic behavior, so that programs that use stable_sort will not run horrendously slowly, even in the face of maliciously chosen input.

Of course, many programs do not need to worry about sorting maliciously chosen data, and the risk of chance quadratic behavior is often acceptable. Be that as it may, I would much rather spend classroom time discussing these engineering tradeoffs than explaining to students how to write code in a style that I would never want them to use in practice.

### More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.

# First C Compiler Now on Github

The earliest known C compiler by the legendary Dennis Ritchie has been published on the repository.

# HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

# Building Bare Metal ARM Systems with GNU

All you need to know to get up and running... and programming on ARM

# Amazon's Vogels Challenges IT: Rethink App Dev

Amazon Web Services CTO says promised land of cloud computing requires a new generation of applications that follow different principles.

# How to Select a PaaS Partner

Eventually, the vast majority of Web applications will run on a platform-as-a-service, or PaaS, vendor's infrastructure. To help sort out the options, we sent out a matrix with more than 70 decision points to a variety of PaaS providers.

More "Best of the Web" >>