Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

Introduction to Programming with Lists

October 09, 2013

Last week, I described the fundamental data structure in Lisp. This data structure is similar to the C++ pair library template, except that it is dynamically typed. To recap, it entails five operations, which I have given C++like names rather than Lisp-like names:

  • For any values x and y, pair(x, y) is a single value that contains x and y.
  • first(pair(x, y)) is x.
  • second(pair(x, y)) is y.
  • ispair(x) is true if and only if x is a pair.
  • nil is a special value that is not a pair, and is not equal to any other value.

Because we are assuming that we are working in a world in which values never change after being created, we will also assume that each of these operations runs in O(1) time, regardless of the complexity of their arguments. The reason for this latter assumption is that there is never a need to copy an immutable value; rather, we can use a pointer to stand in for the value to which it points. In order for these data structures to be useful, we will also assume that there are values, such as integers, that are neither pairs nor nil.

It should not take too much thought to realize that the data structure I have just described is a binary tree. However, much of the time, Lisp programmers don't think of the data structure that way. Instead, they restrict themselves to a particular subset of binary trees, which they call lists. We can define a list this way:

A list is either nil or a pair whose second element is a list.

So, for example, pair(x, nil) is a list regardless of the value of x, because it is a pair whose second element is the list nil. Similarly, pair(x, pair(y, nil)) is a list, as is pair(x, pair(y, pair(z, nil))), and so on. However, pair(nil, x) is a list only if x is nil. In other words, a list is a tree on which successive calls of second yield a nil-terminated sequence of lists.

These restricted trees appear in so many contexts that Lisp has an abbreviated notion for them: (x y z) is an abbreviation for pair(x, pair(y, pair(z, nil))). If we view a list in this way, we can see that there is a natural way of talking about its length — so we can try our hand at writing a function that computes the length of a list:

 
     int length(x) {
           int n = 0;
           while (x != nil) {
                
                x = second(x);
                ++n;
           }
           return n;
     }
 

This isn't quite a C++ program because of x's dynamic typing, but you should get the idea. If x is nil, we return 0; otherwise, we follow the second elements as far as we can and count how many times we did so.

Oops! This program violates our original assumption that all values are immutable. Let's try again:

 
     int length(x) {
           if (x == nil)
                return 0;
           return length(second(x)) + 1;
     }
 

I find this function easier to understand than the earlier one, despite the recursion, because it is easy to verify that its results are correct. The length of nil is obviously zero, and the length of a non-nil list is equally obviously one more than the length of its second part. However, this simplicity comes at a cost: The function makes a recursive call for each element of the list.

We can remove that cost by rewriting the function to use tail recursion:

 
     int length_aux(x, n) {
           if (x == nil)
                return n;
           return length_aux(second(x), n + 1);
     }
 
     int length(x) { return length_aux(x, 0); }

Readers must decide for themselves whether this rewrite is easier to understand than the original iterative version.

This example should start to give a feel for what it's like to program with immutable lists. Next week, I'm going to show how to implement two algorithms that are part of the C++ standard library: sort and reverse. Between now and then, I invite you to think about how to do so.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video