# Sorting An Immutable List

This article continues last week's discussion by explaining how to sort an immutable list. Doing so runs into trouble right away, because most sort algorithms work by swapping elements of the data structure being sorted. Swapping elements is, of course, an operation that changes the elements being swapped, so we can't use it as part of our sorting operation.

Fortunately, there is a straightforward algorithm for sorting lists that does not rely on the ability to change them. This algorithm is usually called *merge sort*, and we can describe it as follows:

*If the list to be sorted is empty or has one element, we're done. Otherwise:*

- Divide the list into two pieces of roughly equal size.
- (Recursively) sort each of the pieces.
- Merge the two sorted pieces into a single list.

The reason this algorithm works is that the recursive part of it ultimately reduces the list to one-element fragments; merging those fragments eventually puts them all in sequence. Suppose, for example, that we start with 5, 3, 8, 7. We might break that into two fragments, namely 5, 3 and 8, 7. Let’s look at 5, 3 first. To sort 5, 3, we break it into two fragments, one of which is 3 and the other is 5. Recursively sorting these fragments is a null operation, so we now merge 5 and 3 to get 3, 5. Similarly, we break 8, 7 into 8 and 7, and then merge them to get 7, 8. Finally, we merge 3, 5 with 7, 8 to get 3, 5, 7, 8.

We might implement this algorithm as follows:

sort(x) { if (x == nil || second(x) == nil) return x; y = split(x); return merge(sort(first(y)), sort(second(y))); }

This code implements the algorithm exactly as we described it. We start by checking whether `x`

is empty or (if `x`

is not empty) whether `second(x)`

is `nil`

. This latter case applies when `x`

has exactly one element. If either of these two conditions is true, we can just return `x`

as our result. If the tests fail, we continue by calling `split`

, which we must write. We assume that this function returns a pair with half of `x`

's elements in its first part and the rest of the elements in its second part. Finally, we recursively sort these two parts and call `merge`

to combine them into our result.

We must now write `split`

and `merge`

. Because we used split first, we'll write it first.

As with many such functions, it is easiest to think about it recursively. If the input list has two or fewer elements, splitting it is trivial. Otherwise, we can take the first two elements off the front, split the rest of it, and then put the elements we removed on the beginning of each of the halves:

split(x) { if (x == nil || second(x) == nil) return pair(x, nil); y = first(x); z = first(second(x)); s = split(second(second(x))); return pair(pair(y, first(s)), pair(z, second(s))); }

This code needs a little explanation. As we did with `sort`

, we determine whether `x`

has two or more elements; if not, we put all of `x`

into the first of the lists we return and make the other one `nil`

. Once we know that `x`

has at least two elements, we define `y`

as the first of those and `z`

as the second. Note that because of how our lists work, `first(x)`

is the first element of `x`

and `second(x)`

is a list of all of the remaining elements, if any, of `x`

. Accordingly, `second(second(x))`

is a list that contains all but the first two elements of `x`

, and is valid only because we know that `x`

has at least two elements.

We pass `second(second(x))`

, a list that contains all but the first two elements of `x`

, recursively to split. The result is a `pair`

with the two parts of that list in each of its elements. Finally, we create two new lists, each one of which has one of the first two elements of `x`

at the beginning and half of the remaining elements (after the first two) of `x`

at the end. We splice those two lists into a `pair`

, and return it as our result.

What remains is to write `merge`

. Our strategy will be similar: If one of the lists is empty, we're done; otherwise, we create a new list that begins with the smaller of the initial elements of our two lists and recursively merge all of the other elements:

merge(x, y) { if (x == nil) return y; if (y == nil) return x; if (first(x) < first(y)) return pair(first(x), merge(second(x), y)); return pair(first(y), merge(x, second(y))); }

This code may well be easier to understand than the code for `split`

. The first two `if`

statements are trivial, so let's consider the third. If its condition is true, then we return a list whose first element is the first element of `x`

. We obtain the rest of that list's elements by merging all but the first element of `x`

with `y`

.

Similarly, if the condition is false, that means that `first(y)`

must be the first element of the merged result. The rest of the elements are obtained by merging all of `x`

with all but the first element of `y`

.

This code is a little harder to understand than bubble sort or insertion sort, but I don't think it's much harder. Moreover, it is much faster than either: It examines each element of each list at most a number of times proportional to the depth of recursion, which in turn is the log (base 2) of the number of elements. Therefore, this algorithm takes *O(n* log *n*) time. That's a significant advantage over sort algorithms that change their arguments. On the other hand, it also takes *O(n* log *n*) space, which is a disadvantage. Moreover, that space overhead is made worse by the fact that the recursive calls are not tail recursions.

Next week, I'll discuss some pragmatic advantages of programming in this style.