Whenever you have a sequence of values, one of the operations you'll most often want to perform is sorting. Sorting data makes it easier for humans to understand, and sorting is the first step in any number of algorithms — even such humble algorithms as calculating the sum of a list of numbers. Every programming system provides some form of sorting; the Standard C++ library provides six! (Or perhaps more, depending on how you count them.) How do they differ, and when should you use one instead of another?

### Sorting with Algorithms

Clause 25 of the C++ Standard has a section called "Sorting and related operations." It includes many operations on sorted ranges, and three generic algorithms for sorting: **sort**, **stable_sort**, and **partial_sort**.

The first two, **sort** and **stable_sort**, have essentially the same interface: as usual with STL algorithms, you pass two Random Access Iterators that define the range to be sorted. You can also, optionally, provide a third parameter that defines how to compare elements: the third parameter is a function object that takes two arguments, **x** and **y**, and returns **true** if **x** should come before **y**. So, for example, if **v** is a vector of **int**:

std::sort(v.begin(), v.end());

sorts it in ascending order. To sort it in descending order instead, you need to supply a different comparison method:

std::sort(v.begin(), v.end(), std::greater<int>());

Note that we're supplying **greater** as the third argument, not **greater_equal**. This is important, and it holds for all of the STL sorting algorithms: the comparison function must return **false** if the two arguments are the same. To some extent, this is arbitrary: it's possible to express a sorting algorithm in terms of a comparison function that looks like **<** or in terms of a comparison function that looks like **<=**. It's necessary to make a definite choice, though, and to stick with it consistently. The Standard C++ library chose the former. If you pass **sort** a function object that returns **true** for equal arguments, you'll get unpredictable and implementation-dependent results; on some systems, it will appear to work, and on others it could cause an infinite loop or a memory protection fault.

For simple cases, you won't see much difference if you use **stable_sort** instead of **sort**. Like **sort**, **stable_sort** sorts a range **[first, last)** into ascending order by default, and again you can supply a comparison function to change the ordering. If you read the C++ Standard, you'll see two differences between **sort** and **stable_sort**:

- As the name suggests,
**stable_sort**is stable. If two elements compare equal,**stable_sort**won't change their relative order. - The performance guarantees are different.

The first point, stability, is straightforward. It's usually irrelevant for **int**s (one "42" is just as good as another), but sometimes very important for other data types. If you're sorting tasks by priority, for example, you probably want high priority tasks to come before low priority ones, but for tasks to appear in their original order when there's a tie. For **sort** there's no such guarantee, but for **stable_sort** there is.

What the Standard says about performance is less straightforward: the guarantees for **sort** and **stable_sort** are both complicated, and in neither case do they give the complete picture. The Standard says that **sort** makes *O(N* log *N)* comparisons "on the average," without saying what the worst case is, and that **stable_sort** makes *O(N (*log *N)*^{2}*)* comparisons, but that "if enough extra memory is available," it, too, is *O(N* log *N)*.

How does one make sense of all of these different cases?

The performance guarantees were written with specific implementation techniques in mind, and the guarantees make more sense if you know what those techniques are. They permit **sort** to be written in terms of *quicksort* (recursively partitioning a range) and **stable_sort** in terms of *merge sort *(recursively merging sorted subranges) [1]. Quicksort is one of the fastest known sorting methods, and it is almost always *O(N* log *N)*, but there are some special sequences for which it is *O(N*^{2}*)*; if the Standard always required **sort** to be *O(N* log *N)*, quicksort would be forbidden. Similarly, merging two subranges is easier if there is an external buffer to copy the results into; merge sort is *O(N* log N) if it can use an auxiliary buffer as large as the original range, and *O(N (*log *N)*^{2}*)* if it can't obtain any auxiliary buffer. If it can only use a smaller auxiliary buffer, then its complexity is intermediate between the two — but, in realistic cases, closer to *O(N* log *N)*.

That's an explanation of what's in the Standard, but it's still not complete. The Standard says that "if the worst case behavior is important," you shouldn't use **sort**. However, that advice is less true than it was when the Standard was first written. Many standard library implementations, including the ones that come with GNU g++ and Microsoft Visual C++, now use a new variation on quicksort, called *introsort* [2]. Introsort is just as fast as quicksort on average, but its complexity is always *O(N* log *N)*. Unless you're worried about older standard library implementations, worst case behavior is no longer a reason to avoid **sort**. And, while **stable_sort** is (usually) *O(N* log *N)* too, that *O* glosses over many details.

In most cases, **stable_sort** is much slower than **sort**, because it has to do more work for each element; that's the price you pay for the guarantee of stability. Use **stable_sort** when it's important for equivalent elements to keep their original ordering (for example, when you're sorting by priority, but keeping items of equal priority in first-come first-served order), and use **sort** otherwise.

The other generic sorting algorithm is **partial_sort** (along with a minor variant, **partial_sort_copy**.) It has slightly different functionality and slightly different syntax: it finds and sorts the first *n* elements of a range. As usual the default is to sort in ascending order as determined by the **<** operator, but you can provide a function object to change that. So, for example, if you're interested only in the top 10 elements of a sequence, you can find them with

partial_sort(first, first+10, last, greater<T>());

Afterwards the largest 10 elements will be contained in **[first, first+10)** (sorted in descending order), and the rest of the elements will be contained in **[first+10, last)**.

There's an obvious limiting case: if you write **partial_sort(first, last, last)**, then you're asking **partial_sort** to sort the entire range **[first, last)**. So, while the syntax may be slightly different, you can still use **partial_sort** the same way you use **sort**. Is there a reason to? Not really. Looking at what the C++ Standard says about complexity, you'll see that sorting a range with **partial_sort** is *O(N* log *N)*, just like **sort**, but, again, asymptotic complexity is an incomplete description. Two *O(N* log *N)* algorithms aren't necessarily the same speed; in this case, **sort** is usually much faster.

The reason **partial_sort** exists is for partial sorts. Suppose that you have a list of a million names and you need to find the first thousand, sorted in alphabetical order. You could get those thousand names by sorting the entire list and ignoring all but the first part of it, but that would be grossly wasteful. It would be much faster to use **partial_sort** or **partial_sort_copy**:

vector<string> result(1000); partial_sort_copy(names.begin(), names.end(), result.begin(), result.end());

Use **partial_sort** when you're only interested in the first part of the sorted range, and use **sort** when you need to sort all of a range.

As with **sort** and **stable_sort**, it's helpful to think about how **partial_sort** is implemented. The usual implementation, and the one that the C++ Standard authors had in mind, is in terms of *heap sort*: the input range is rearranged into a data structure called a *heap*, which is essentially a binary tree flattened into an array in such a way so that it's easy to find the largest element and so that it's easy to remove the largest element and still have a valid heap. If we successively remove elements from a heap, then we'll be left with the smallest *n* elements: just what we need from **partial_sort**. If we remove all of the elements from a heap, then we'll be left with a sorted range.

The Standard C++ library includes generic algorithms for manipulating heaps directly, so, instead of using **partial_sort**, another way to sort a range is by writing:

make_heap(first, last); sort_heap(first, last);

This looks different from

partial_sort(first, last, last);

but it really isn't. In both cases, you're using heap sort; the two forms should have exactly the same speed.

Finally, there's one last "generic" sorting algorithm, inherited from C: **qsort**. I put "generic" in quotes because **qsort** really isn't as generic as **sort**, **stable_sort**, or **partial_sort**. You can't use it to sort objects that have constructors, virtual functions, base classes, or private member variables. C doesn't know about those features; **qsort** assumes it can copy objects bit for bit, which only works with the very simplest of classes. It's also hard to use **qsort** in templates, since you have to pass it a comparison function that takes arguments of type **void***, but that internally knows the exact types of the objects to be sorted.

The C Standard gives no performance guarantees for **qsort**. In cases where **qsort** can be used, however, it usually turns out to be much slower than **sort**. (Largely for the simple reason that **sort**'s interface was designed so that the comparison function could be inlined, and **qsort**'s interface was not.) Except in legacy code, you should avoid **qsort**; **sort** has a simpler and safer interface, it's less restrictive, and it's faster.

### Sorting with Special Containers

I started by discussing generic algorithms, because the basic principle of the Standard C++ library is to decouple things that don't have to be coupled. Algorithms are separate from containers. There's nothing about sorting in the container requirements; you sort a vector by using an algorithm that's separate from **std::vector**:

sort(v.begin(), v.end());

Nevertheless, any complete discussion of sorting in C++ must include containers. Containers in general provide no sorting methods, but some special containers do.

You can't sort a vector by writing **v.sort()**, since **std::vector** has no such member function, but you can sort a list by writing **l.sort()**. As usual, you can also explicitly provide a different comparison function: if **l** is a list of **int**s, then **l.sort(greater<int>())** will sort it in descending order.

In fact, **list::sort** is the only easy way to sort a list: **std::list** only provides Bidirectional Iterators, but the standalone sorting algorithms, **sort**, **stable_sort**, and **partial_sort**, require the more powerful Random Access Iterators [3]. The special member function **list::sort **doesn't use iterators, but instead** **exploits the fact that lists are implemented in terms of linked nodes. It uses a variation of merge sort that works by linking and unlinking nodes, rather than by copying list elements.

Finally, sorting a **set** (or a **multiset**, if you need to have duplicate elements) is easiest of all: it starts out sorted! You can't write **sort(s.begin(), s.end())**, but you also never need to. A fundamental property of a **set** is that its elements are arranged in order. Whenever you insert a new element, it automatically gets put in the appropriate place to maintain the ordering. Internally, **set** uses a data structure (usually some kind of balanced tree) that enables fast (*O(*log* N)*) lookup and fast insertion.

### Timing Tests

Where does this leave us? I've made some assertions about timing, and we can make some more intuitive observations: that inserting elements into a **set** should be slower than sorting a **vector**, for example, because **set** is a complicated data structure that uses dynamic memory allocation, or that sorting a **list** should be roughly as fast as using **stable_sort**, because both are versions of merge sort. However, intuition is no substitute for timing tests. Measurements are difficult (exactly what do you measure, and how?), but they are essential.

Listing 1 is a program that constructs a **vector<double>**,
puts it in random order, and then successively sorts it using each of the methods
we've discussed (except for **qsort**). We deliberately use call-by-value
when passing the vector to each timing test: we don't want any of the tests
to see a vector that has already been sorted!

Compiling the program with Microsoft Visual C++ 7 beta (results with g++ are similar), and running it on a 500 MHz Pentium III, gives the following results:

Sorting a vector of 700000 doubles. sorting method t (sec) sort 0.971 qsort 1.732 stable_sort 1.402 heap sort 1.282 list sort 1.993 set sort 3.194

This is as expected: **sort** is the fastest; **stable_sort**, heap sort, and **qsort** are considerably slower; and sorting with a **list** or a **set**, which use dynamic memory allocation and complicated data structures, is slower still. (Actually, this is an unusually good showing for **qsort**: on g++, and on older versions of VC++, **qsort** is even slower compared to **sort**.)

But this doesn't deserve to be called a sorting benchmark — that's too bold a claim. I tested sorting a **vector<double>**, on one particular system, using a specific (random) initial ordering. Only experiment can determine how far these results may be generalized. At the least, we should try sorting something other than **double**s.

Listing 2 sorts a **vector<string>**: it opens
a file and copies each line into a separate **string**. I'm no longer including
**qsort** in this test, because **qsort** cannot handle elements of types
with constructors. With Project Gutenberg's free e-text of *Anna Karenina
*[4] as input, these are the results:

Sorting a file of 42731 lines. sorting method t (sec) sort 0.431 stable_sort 1.322 heap sort 0.751 list sort 0.25 set sort 0.43

Suddenly, things have changed. We still see that **sort** is much faster than **stable_sort** or heap sort, but the results for **list** and **set** have changed dramatically. Using a **set** is just about the same speed as using **sort**, and **list** is substantially faster. What happened?

What's changed is that **double** is a primitive type, and **std::string** is a complicated class. Copying a **string**, or assigning one string to another, means invoking a function — it might even mean allocating dynamic memory or creating a mutex lock. The tradeoffs have changed; the number of assignments matters much more when you're sorting **string**s than when you're sorting **double**s. Sorting with a **list** involves no assignments at all: the whole reason for defining a special **list::sort** member function is that it works by manipulating pointers to internal list nodes. Relinking a few list node pointers is faster than copying a **string**.

We've rediscovered an old piece of conventional wisdom: if you're dealing with large records, you never want to sort the records themselves, but just pointers to them. Using a list makes this automatic, but you can just as easily do it explicitly: instead of sorting the original **vector<string>**, create an index vector each of whose elements is a **vector<string>::const_iterator**, and sort the index vector. You'll have to tell **sort** how to compare the elements of the index vector (you have to make sure to compare the pointed-to values, not the iterators themselves), but that's only a minor nuisance. We just need to provide an appropriate comparison function object:

template <class Iterator> struct indirect_lt { bool operator()(Iterator x, Iterator y) const { return *x < *y; } };

Listing 3 shows how to use **indirect_lt** and compares
the speed of direct and indirect sorting. The speedup is substantial.

Sorting a file of 42731 lines. sorting method t (sec) indirect sort 0.29 sort 0.401

### Advice

The Standard C++ library provides six sorting methods: **qsort**, **sort**, **stable_sort**, **partial_sort**, **list::sort**, and **set**/**multiset**.

You shouldn't use **qsort** in new code. It's slower than **sort**. Its interface is ugly and, because it requires casting, error-prone. Writing a comparison function to pass to **qsort** can be nontrivial, especially in generic code. You can't use **qsort** to sort objects with constructors or virtual member functions, or to sort any data structure other than a C array. Although **qsort** isn't formally deprecated, its only real use is backward compatibility with C.

Of the other five sorting methods, the first three are generic algorithms and the last two exploit special features of certain containers. Each of these methods compares objects with **operator<** by default, but allows you to specify your own comparison function if necessary. Each provides special features:

sort |
Generally the fastest. |

stable_sort |
Guarantees stability in ordering of equivalent elements. |

partial_sort |
Allows sorting only the first N elements. |

list::sort |
Manipulates pointers, instead of copying elements. |

set |
Allows fast insertion into, and deletion from, a sorted range. |

If you don't need any of the special features of any of the other four methods, your first choice should usually be **sort**.

### Notes

[1] For more details on quicksort and merge sort, and other sorting algorithms, see D. E. Knuth, *The Art of Computer Programming*, vol. 2, 1998.

[2] D. R. Musser. "Introspective sorting and selection algorithms," *Software Practice and Experience*, 27(8):983, 1997.

[3] This restriction is partly artificial: it is possible to implement quicksort and merge sort in terms of Forward Iterators, albeit at the cost of some complexity. Artificial or not, however, it's what the Standard requires.

[4] See <http://sailor.gutenberg.org/by-author/to2.html>.

### About the Author

* Matt Austern is the author of *Generic Programming and the STL

*and the chair of the C++ standardization committee’s library working group. He works at AT&T Labs — Research and can be contacted at*

**[email protected]**.