A Lisp-Style Library for C

Here's an approach to representing and manipulating variable-length, heterogeneous lists, self-identified data objects, and symbolic data in C.


August 01, 1991
URL:http://www.drdobbs.com/cpp/a-lisp-style-library-for-c/184408598

Figure 2


Copyright © 1991, Dr. Dobb's Journal

Figure 4


Copyright © 1991, Dr. Dobb's Journal

Figure 5


Copyright © 1991, Dr. Dobb's Journal

Figure 6


Copyright © 1991, Dr. Dobb's Journal

Figure 5


Copyright © 1991, Dr. Dobb's Journal

Figure 6


Copyright © 1991, Dr. Dobb's Journal

Figure 7


Copyright © 1991, Dr. Dobb's Journal

AUG91: A LISP-STYLE LIBRARY FOR C

A LISP-STYLE LIBRARY FOR C

Adding essential features of Lisp to C simplifies handling of complex data objects

Daniel N. Ozick

Daniel is an independent consultant who designs software tools, builds pattern-matching systems, and does software-engineering research. His book, Structured Assembly Language Programming for the Z80, was published in 1985. Contact him at 1 Jackie Court, Chester, NY 10918.


C has become one of the most popular and widely used programming languages, and for good reasons: You can compile the same C program to run on a wide variety of machines; C is a relatively small and easy-to-learn language; it has a rich set of operators and a concise notation; and it lets you deal directly with primitive machine objects, such as characters, numbers, and addresses.

On the other hand, C provides only limited support for composite data objects, having only fixed-size arrays and structures in its repertoire of built-in composite types. The language lacks general mechanisms for:

Because of these deficiencies, if you want to write a C program that involves complex data objects of varying sizes and types, you have to build your own ad hoc mechanisms for manipulating those objects. You often end up "reinventing the wheel' and obfuscating the real work of the program.

Lisp and similar problem-oriented languages excel where the machine-oriented C is deficient. These languages have powerful features for dealing with symbolic and dynamically created data structures at a high level. Having enjoyed and made productive use of the expressive power of several dialects of Lisp, I set out to save myself from the tar pits of C by creating a package of C macros, functions, and data types--a Lisp-Style Library for C--that could imitate some of Lisp's capabilities in a C environment. These capabilities include:

I've successfully used the Lisp-Style Library in my own work, most recently in the construction of a general-purpose compiler (a kind of interpretive version of Unix's lex and yacc tools). The availability of the Lisp-like language features lets me put more effort into effective design and less into details of implementation. At the same time, the extended capabilities do not require a special language preprocessor or changes in syntax--standard C mechanisms are used--nor do they impose a general sacrifice of C's space and time efficiency.

Lists

Consider a program called process whose command line requires a filename argument that may include wildcard characters. For example, process *.c is a request to process every file in the current directory having the extension ".c." Inside process, you need to have a list of unambiguous filenames. Assuming your operating system doesn't automatically handle wildcard expansion on the command line, you might like also to be able to write something like the code of Example 1, where source_file_names is a variable whose value is a list (of any length) of filename strings; process_file is a function whose argument is a filename string; and for__each is a function that causes process_file to be sequentially applied to each element of the source_file names list.

Example 1: A simple assignment statement and function call create and process an arbitrarily long list of strings.

  source_file_names = expand_wildcards (argv [i]);
  for_each (process_file, source_file_names);

You can achieve the ability to manipulate variable-length lists in this simple, direct way by constructing them as singly linked chains of pointer pairs--what Lisp calls cons cells. In the Lisp-Style Library, a Pair is a two-component struct, as shown in Figure 1. The component names car and cdr(pronounced "car" and "could-er") long ago lost their original meanings, but have been retained by the Lisp-using community to refer to the first and second halves of pairs, respectively. Object is a C pointer to a self-identified data object.

Figure 1: The Lisp-Style Library defines Pair, a two-component struct that is the basis for all linked lists. Object is a C pointer to a self-identified data object.

  /* Pair -- a Lisp 'cons' cell for creating linked lists */
  typedef struct
   {
    Object car;         /* any Object */
    Object cdr;         /* PAIR Object or NULL (usually) */
   } Pair;

To make the discussion of list building easier, let's use Lisp's standard notation for lists. Elements enclosed by parentheses and separated by white space represent a list. For example, ("prog-1.c", "prog-2.c", "prog-3.c") is a list of three items, each of which is a string. This list could be the value of source_file_names. An empty pair of parentheses () represents a sequence containing no elements--the empty list.

Let's also use a common schematic representation for Lisp data called box-and-pointer notation. In this representation, each Object is shown as a pointer to a box, with the box containing a representation of the object's data. The box for a pair is actually a double box, containing a car pointer in its left half and a cdr pointer in its right half.

Figure 2 shows the box-and-pointer notation for the sample source_file_names list of three strings. Notice that the chain of pairs is terminated by having the last pair contain a cdr whose value is NULL, indicated by the slashed box. In addition to its function as a list terminator, NULL (called nil in Lisp) can also be considered equivalent to the empty list: NULL and () represent the same entity.

List Constructors

Functions for building lists are called constructors, the most fundamental of which, called first_put, adds an Object to the front of a list by allocating a new pair and setting its car to point to the new Object and its cdr to point to the rest of the list. Lisp calls this function cons (for "construct").

In Example 2, three nested calls to first_ put build the source_file_names list, back to front, starting with the list terminator, NULL. The calls to make_string are required to convert standard C strings into self-identified Objects of type STRING.

Example 2: Three nested calls to first_put build a three-element list, back to front, starting with the list terminator NULL.

  source_file_names =
      first_put (make_string ("prog-1.c"),
          first_put (make_string ("prog-2.c"),
              first_put (make_string ("prog-3.c"), NULL)));

Alternatively, the code in Example 3 accomplishes the same piece of list building with a single invocation of list. This function uses ANSI C's mechanism for handling a variable number of arguments, with a NULL argument terminating the argument list. In true Lisp systems, functions can determine the number of arguments supplied to them at runtime without resorting to a special terminator argument.

Example 3: A single invocation of the list function builds a three-element list. NULL terminates the function's variable-length argument list.

  source_file_names = list (make_string ("prog-1.c"),
                            make_string ("prog-2.c"),
                            make_string ("prog-3.c"),
                            NULL );

If you build a list one item at a time with successive, nonnested applications of first_put, you end up with the first item added as the last item on the list. To avoid this reversal, you can use last_put, which adds an item to the end of a list. For example, the code fragment in Example 4, which might be part of expand wildcards, constructs a file name list in the same order as filenames are retrieved by get_ first_name and get_next_name. Note that the filename list, names, starts out empty as a result of the first assignment statement, names = NULL;.

Example 4: This code fragment, which might be part of expand_wildcards, constructs the filename list names in order (front to back) using last_put. The list starts out empty as a result of the first assignment statement.

  names = NULL;
  name = get_first_name (filespec);
  while (name != NULL)
   {
    names = last_put (name, names);
    name = get_next_name (filespec);
   }

If you do need to reverse an already constructed list, use reverse. This function returns a new list (comprising newly allocated pairs distinct from those making up the old list) whose elements are listed in reverse order of those in the old list.

The last constructor function is append, which splices together two lists. (In true Lisp, append can take any number of lists as arguments.) Example 5 displays a fragment of code that uses append to construct a list of unambiguous source_file_names from a list of command-line arguments, each of which gets its wildcards expanded.

Example 5: This code fragment uses append to construct a list of unambiguous source_file_names from a list of command-line arguments, each of which gets its wildcards expanded.

  /* collect remainder of args (wildcards expanded) as 'source_file_names'
   */
  source_file_names = NULL;
  for (i = i; i < argc; i++)
  source_file_names = append (source_file_names, expand_wildcards (argv
   [i]));

List Selectors

Functions for taking lists apart or getting access to their elements are called "selectors." The definition of the Lisp-Style Library function for__each, Figure 3 , illustrates the two fundamental selectors, first and but_first.

Figure 3: The Lisp-Style Library mapping function for_each demonstrates the classic method for walking down a list by successive applications of but_first while fetching each element for processing with first. (Function_1 is a pointer to a function of one Object argument, returning an Object result.)

  /* for_each -- apply a function 'f' to each element of a list */
  void for_each (Function_1 f, Object list)
   {
    while (list != NULL)
     {
      (*f) (first (list));
      list = but_first (list);
     }
   }

first (called car in Lisp) returns the first element of its argument list, and but_first (called cdr in Lisp) returns the list that begins with the second element of its argument list--that is, everything but the first element. The while loop in for_each implements the general scheme for traversing a list by successive applications of but_first. You can, by the way, determine the number of elements in a list before attempting to process them all by using the length function.

nth provides nonsequential access to the elements of a list, as if it were an array. Indexing begins at zero, so nth (list, O) is equivalent to first (list).

Obviously, this use of lists is inefficient, since nth must traverse n pointer links before arriving at the desired element. For more efficient nonsequential access to sequential data, the Lisp-Style Library provides an Object of type VECTOR. list_to_vector (list) returns a dynamically allocated VECTOR big enough to hold all the elements of list; the macro invocation vector (vec) [n] returns the nth element of vec; and the macro invocation vector_length (vec) returns the number of elements in vec.

Remember, however, that unlike VECTOR objects (or C arrays for that matter), lists can grow and shrink; they do not require an index value for access; and elements can be inserted or deleted anywhere. With a few simple, indexless access procedures, lists can serve as sets, stacks, queues, and associative arrays (tables). The Lisp-Style Library functions is_member, assoc, and index (Listing Three, page 113), and the macros push and pop (Listing One, page 112) constitute a start in this direction.

Of course you pay for all this flexibility: PAIRs occupy memory space and must be recycled when they become obsolete; access to list elements requires an additional level of indirection (when compared to an array of fixed-size data objects); and the time to access a list element is proportional to its position in the list.

Mapping Functions

Mapping functions transform one list into another, just as programs in a Unix pipeline transform one file into another. They promote a "signal-processing view of programs that can be a powerful organizational method.

Suppose, for example, that the function integers generates the sequential list of integers in the range specified by Its input arguments, that square returns the square of its input, and that sum returns the sum of the integers in its input list. Then sum (integers (1, 1O)) returns the sum of the integers from 1 through 10; map (square, integers (1, 10)) returns a list of the squares of the integers from 1 through 10; and, as shown in the signal-flow diagram of Figure 4, sum (map (square, integers (1, 10))) returns the sum of the squares of the integers from 1 through 10. (You can run these examples, using Lisp's comma-free prefix syntax, in the Tiny Lisp Interpreter demonstration program of Listing Seven, page 123. See the comments in that listing for detailed instructions.)

In other words, map applies a function to each element of an input list, and collects the results of that application in a new output list of the same length, thus transforming the list. map_no_nils works like map, except that it discards NULL results, possibly resulting in an output list that is shorter than the function's input list. This behavior makes map_no_nils useful for filtering a list (by removing some elements) in addition to transforming it.

The list-related functions are implemented in Listing Three.

Objects

Up to this point, we've been assuming that the elements of a list are all of the same type. But suppose we want to create a list to represent the expression (A* (B + 3)).

If we treat the written form of that expression directly as standard list notation, then we have the list whose three elements are A, *, and a nested list also containing three elements (B, +, and 3). Allowing the third element of the list to be itself a list results in the box-and-pointer notation of Figure 5. The nested lists can also be viewed as the tree of Figure 6.

For a C program to interpret correctly the data structure diagrammed in Figure 5 and Figure 6, list-traversal procedures must be able to determine what each pointer points to. Is this a pointer to a pair or to a primitive object? If several types of primitive objects may be encountered--and we've already seen STRINGs and VECTORs--what type of primitive object is this a pointer to?

The need to determine the type of data objects at runtime is the reason for the creation of the Object type, which is defined as a pointer to a self-identified data object. To avoid the awkwardness of excessive precision, I will also informally refer to the object itself as an Object.

Objects have two components: type and value. The type of an Object serves as an identifying "tag" for the value that follows. The type tag is a small integer (or C enum constant) with the possible values UNDEFINED, SYMBOL, STRING, INTEGER, FUNCTION, PAIR, VECTOR, or TOKEN (see Listing One). Following the tag is the value part of the Object, whose contents depend on the Object's type. Figure 7 shows, for each type of Object, the make_function that creates the Object by dynamic allocation from the C heap, the macros that access the components of the Object, and the Object's memory layout.

To see how the make_ functions and access macros of Figure 2 are used, let's look at two short examples. The statement s = make_string ("string 1"); results in the creation of a new dynamically allocated Object of type STRING. Consequently, the macro call type (s) returns STRING, while the macro call string (s) returns a (char *) pointer that can be used in regular C expressions such as strcmp (string (s), "string 2"). Similarly, if i = make_integer (99); then type (i) returns INTEGER and integer (i) returns an int that can be used in regular C expressions such as if (integer (i) < O).

Polymorphic Functions

Because the type of an Object can be determined at runtime, you can write polymorphic functions -- that is, functions that can be applied to different types of input and whose precise action is determined by the type of that input. For example, the Lisp-Style Library's pp_object ("prettyprint object"), Example 6, fetches the type of obj using the type access macro and dispatches according to the fetched value.

Example 6: This code fragment from pp_object ("prettyprint object") demonstrates dispatching according to the type of an Object, determined at runtime.

  switch (type   (obj))
   {
    case SYMBOL:
     write_symbol (obj);
     break;
    case STRING:
     write_string (obj);
     break;
       ...

  }

Sometimes it's simpler to use a type predicate (a macro that returns TRUE if its argument is of a particular type and FALSE otherwise) to determine the type of an Object. Table 1 lists the available predicates, and the eval ("evaluate") function of Listing Seven provides several examples of their use. Incidentally, as in true Lisp, a NULL object or any non-NULL object that is not a list is considered to be primitive and called an atom.

Table 1: Type predicates, implemented as macros, in the Lisp-Style Library for C(Listing One). A type predicate returns TRUE if its argument is of the type specified in its name and FALSE otherwise. As in true Lisp, an atom is the NULL object or any non-NULL object that is not a list.

  is_null
  is_symbol
  is_pair
  is_atom
  is_list
  is_vector
  is_string
  is_integer
  is_function
  is_token

Symbols

So far I've described the use of PAIRs as a general method for creating lists and trees and the advantages of using data objects that carry their own type tags. Now let's look at the Lisp-Style Library's adaptations of two of the symbol-manipulation features of Lisp that make that language so powerfully expressive: SYMBOL objects and symbolic input/output of data.

An Object of type SYMBOL has two essential properties: a unique print_ name string and a unique address. In addition, a SYMBOL can be associated with a value (of type Object).

The read_object function (Listing Five, page 119) efficiently converts character strings -- the external textual representations of SYMBOLs -- into unique internal SYMBOL objects, while write_ object (Listing Three) converts internal SYMBOL objects into their external textual representations as strings.

The properties of symbols make them useful as identifiers of all sorts. Anywhere you might use a #define or enum constant, you can probably use a symbol and gain the advantage of symbolic textual input and output. This is particularly handy for testing and debugging. Listing Two (page 112) contains the declarations of symbols used inside the Lisp-Style Library itself, mainly to represent the character and token types recognized by the Lisp input reader (see Listing Five and "The Reader" section that follows).

I have used symbols as reserved words and identifiers in several "little languages" including one for defining menus, one for defining rules in an expert-system knowledge base, one for defining text-formatter output, and even one for defining the grammar of another little language. In fact, if you're willing to use standard list notation as the syntax of your little language, you can create a program to read an input file in the language and convert the information into your program's internal data structures with very little effort (as benefits a little language). The Tiny Lisp Interpreter (read-evaluate-print loop) of Listing Seven is yet another example of the utility of symbolic I/O.

Listing Four (page 118) shows the simple hashed symbol-table scheme that allows read_object to efficiently look up print_name strings. The symbol table is an array of hash buckets, where each bucket is a list (implemented, of course, with the list constructors I described earlier). lookup finds a symbol in the table by first calculating its hash index (with hash) and then walking down the corresponding hash-bucket list (with but_ first) until it finds a matching string. install adds a new symbol by inserting it (with first_ put) at the front of the calculated hash-bucket list. intern ("internalize") always returns a unique SYMBOL object, either one found with lookup or one just created and added to the symbol table by install.

The Reader

Like the reader in many Lisp systems, the reader (or "lexer") for the Lisp-Style Library uses a general-purpose mechanism based on a read-table. A read-table specifies, for each character that may be encountered in the input, the type of that character. In this case, the allowed types are WHITESPACE, COMMENT_MARKER, SPECIAL, STRING_MARKER, and ENDFILE_MARKER. The core function of the reader, get_token, dispatches on the type of the current character (fetched from read_table via the char_type macro). Several of get_token's auxiliary procedures (get_white-space, get_string, and get_word) also reference read_table.

As implemented in Listing Five, the reader:

(Notice that SYMBOLs, unlike C identifiers, can include characters other than letters, digits, and underscores. This means that a little language based on the Lisp-Style Library could include symbols such as +, pair?, or list->vector.)

On the output side, write_object produces textual representations of Objects that conform to read_ object's input format. In other words, you can write an Object to a file with write_ object and read back an equivalent Object with read_ object. (If the Object in question is a SYMBOL, then the two Objects will be not only equivalent but also identical.)

write_object also produces textual representations for VECTORs and TOKENs using Lisp-style #() and #S() notation, respectively. These cannot be read back by the current version of read_object, however.

Systematic Memory Recycling

The flexibility of variable-length lists and data objects that are created as they're needed comes at a price. Memory for Objects must be allocated dynamically from the C heap, and unless the supply of heap memory is inexhaustible -- the memory for Objects that are no longer needed (and only for those objects) must be freed.

True Lisp systems provide automatic garbage collection, pausing briefly to accomplish this process whenever the program runs out of heap memory. Garbage collection consists of tracing the entire network of active pointers, starting with those in the machine registers (and including values on the stack), and discarding those objects that are unreachable.

Unfortunately, I have not yet found a way to implement automatic garbage collection using standard C mechanisms: It seems to require the creation of an entirely new language (or at least a new type of compiler). Instead, the Lisp-Style Library provides two methods for accomplishing what you might call "planned recycling": free_object and mark / free_to_mark Listing Six, page 121).

Using free_object requires you to keep track of what Objects you have created, although you don't have to worry about Objects that are components of other Objects. For example, if you create a list and hand it as an argument to free_object, the function will free every PAIR in the list and every Object pointed to by a PAIR in the list, including other lists (recursively).

On the other hand, using the mark / free_to_mark technique, which I adapted from the PostScript language, does not require you to keep track of individual Objects. Instead, you discard all Objects created between the invocation of free_to_mark and the most recent mark, as in Example 7. Calls to mark and free_to_mark may be properly nested.

Example 7: The functions mark and free_to_mark provide a form of planned Object recycling free_to_mark discards -- that is, recycles the memory used by -- all Objects created since the most recent mark.

  mark ();
  <create and use Objects, all of which become garbage>
  free_to_mark ();

Objects created in a subregion bounded by calls to mark_persistent and unmark_persistent will not be freed by a subsequent call to free_to_mark. This allows some objects to persist beyo~ the mark/free region, as in Example ~

copy_object duplicates an Object a~ all its components recursively (exce~ for SYMBOLs, which are unique)~ much the same way that free_obje~ frees an Object and all its componen~ recursively.

The function persistent_copy_obje~ allows the statement important_obje~ = persistent_copy_object (importan~ object); to be substituted for the la~ three statements of Example 8. In other words, persistent_copy_object is copy~ object wrapped inside a "persisten~ region.

Example 8: Objects created in a subregion bounded by calls to mark_persistent and unmark_persistent will not be freed by a subsequent call to free_to_mark. This allows some objects_in this case, the copy of important_object-to persist beyond the mark/free region.

  mark ();
  <create and use Objects, most of which become garbage>
  mark_persistent ();
  important_object = copy_object (important_object);
  unmark_persistent ();
  free_to_mark ();
  <use important_object>

To understand how the mark/fre~ scheme works, first note that all of the functions that allocate memory for Objects -- the make_series (Listing Three -- do so through safe_malloc, which in turn calls C's native malloc. As currently implemented, safe_malloc simply aborts the program with an error message (using the error function of Listing Three) if sufficient memory to satisfy the allocation request is not available.

Similarly, the functions that free memory, free_object and free_to_mark (Listing Six), do so through safe_free, which in turn calls C's native free_safe_free sets the first byte of the deal-located memory, which will usually be an Object's type tag, to zero. Because zero is not a legal type value -- in fact, it has the name UNDEFINED -- a function that attempts to reference a freed Object will fail, assuming it does some type checking.

Setting the entire block of deallocated memory to zero in safe_free would further increase the probability of catching references through "stale" pointers. Many Objects have components that are themselves pointers to other data, so a reference to a discarded Object would likely lead to an attempt to dereference a zero-valued (NULL) pointer. That action can be trapped by the compiler's runtime error checking or by the operating system. Of course, the extra zero-setting would come at the expense of some execution time.

At the beginning of Listing Six, several variables tell the next part of the mark/free story:

To summarize the process: safe_malloc maintains a list of marked blocks; mark saves a pointer into the list of marked blocks; and free_to_mark walks down the marked block list freeing blocks until it reaches the block referenced by the saved pointer. mark_ persistent and unmark_ persistent allow the definition of a subregion where memory blocks do not get added to the marked block list within a region where they do.

Further Development

I have been building the Lisp-Style Library for C incrementally, over time, adding new features as I've needed them for particular projects. The macros, functions, and types form a reasonably coherent set, but the library is far from a finished product. Without making the mistake of trying to extend the library to the point where it becomes an inefficient interpreter of true Lisp programs instead of an efficient library for C, here are a few of my ideas for further development:

Conclusion

I have described an effective approach to representing and manipulating variable-length, heterogeneous lists, self-identified data objects, and symbolic data in C. The simple examples in this article only hint at the potential usefulness of the Lisp-Style Library for C: Its real power is revealed in projects involving more complex data.

It often seems to me that almost everything in the world -- no matter how simple or complex -- can be represented as a list of something. Moreover, written language is so fundamental to our ability to represent and talk about the world that it seems essential for a computer language to be able to manipulate word-like symbols as easily as it can manipulate numbers and operators. The techniques I've described here bring these powerful ideas to the practical world of C programming.

References

Abelson, H. and G.J. Sussman, with J. Sussman. Structure and Interpretation of Computer Programs. Cambridge, Mass.: MIT Press, 1985.

Adobe Systems Inc. PostScript Language Reference Manual. Reading, Mass.: Addison-Wesley, 1985.

Aho, A.V., R. Sethi, and J.D. Ullman. Compilers: Principles, Techniques, and Tools. Reading, Mass.: Addison-Wesley, 1986.

Bentley, J. "Programming Pearls: Little Languages." Communications of the ACM (August 1986).

Schimandle, J. "Encapsulating C Memory Allocation." DDJ (August 1990).

Steele, G.L., Jr. Common LISP: The Language. Bedford, Mass.: Digital Press, 1984.

Winston, P.H. and B.K.P. Horn. LISP. Second Edition. Reading, Mass.: Addison-Wesley, 1984.

Woodruff, B. "PostScript as a Programming Language," in Real World PostScript, edited by S.F. Roth. Reading, Mass.: Addison-Wesley, 1988.


_A LISP-STYLE LIBRARY FOR C_
by Daniel N. Ozick


[LISTING ONE]


/* file LISP.H of 6-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */
/* Lisp-Style Library for C (Main Header File) */

/* Constants */
/* Array Sizes */
#define MAXSTRING 128           /* size of standard character array  */
#define MAXLINE 256             /* size of text line character array */
#define HASH_TABLE_SZ 211       /* size of HASH_TABLE -- should be prime */

/* Characters */
#define EOS '\0'                /* end of string */
#define TAB '\t'
#define NEWLINE '\n'
#define FORMFEED '\f'
#define SPACE 32
#define BELL 7
#define BACKSPACE 8
#define RETURN 13
#define LINEFEED 10
#define ESCAPE 27
#define DOT '.'
#define PERIOD '.'
#define DOS_EOF 26
#define BACKSLASH '\\'
#define SINGLE_QUOTE '\''
#define DOUBLE_QUOTE '\"'
#define LEFT_PAREN '('
#define RIGHT_PAREN ')'
#define LINE_SPLICE (-2)

/* Strings */
#define NULLSTR ""
#define NEWLINESTR "\n"
/** Types **/
/* Boolean -- standard truth values */
typedef enum
 {
  FALSE,
  TRUE
 } Boolean;
#if 0
/* Note: The following 'enum' version of Object_Type uses an 'int' (16 bits)
   of storage under Microsoft C 6.0! */
/* Object_Type -- values for first component of 'Object' (self-id tag) */
typedef enum
 {
  /* General Types */
  UNDEFINED,
  SYMBOL,
  STRING,
  INTEGER,
  FUNCTION,
  PAIR,
  VECTOR,
  /* Built-in C Structures */
  TOKEN,
 } Object_Type;
#endif
/* Note: The following version of Object_Type is guaranteed to use only one
'char' of storage. (Contrast with 'enum' version, above.) */
/* Object_Type -- values for first component of 'Object' (self-id tag) */
typedef char Object_Type;
/* General Types */
#define UNDEFINED       0
#define SYMBOL          1
#define STRING          2
#define INTEGER         3
#define FUNCTION        4
#define PAIR            5
#define VECTOR          6

/* Built-in C Structures */
#define TOKEN           7
/* Pointer -- 'Generic *' : what's pointed to is unknown at compile time */
typedef void *Pointer;
/* Object -- pointer to self-identified object (starts with Object_Type) */
typedef Object_Type *Object;
/* Function -- pointer to function of ? arguments returning Object */
typedef Object (*Function)(Object, ...);
/* Function_0 -- pointer to function of 0 arguments returning Object */
typedef Object (*Function_0)(void);
/* Function_1 -- pointer to function of 1 Object returning Object */
typedef Object (*Function_1)(Object);
/* Symbol_Entry -- the attributes of a symbol (entered into Symbol_Table) */
typedef struct
 {
  char *print_name;     /* printed representation and lookup key    */
  Object value;         /* value of global variable named by symbol */
 } Symbol_Entry;
/* Pair -- a Lisp 'cons' cell for creating linked lists */
typedef struct
 {
  Object car;           /* any Object */
  Object cdr;           /* PAIR Object or NULL (usually) */
 } Pair;
/* Token -- structure Object stores token type and lexeme string */
typedef struct
 {
  Object type;          /* SYMBOL */
  char *lexeme;         /* string as it appeared in external file */
 } Token;
/* Hash_Table -- an array of hash-bucket lists used for symbol tables */
typedef Object Hash_Table [HASH_TABLE_SZ];
/** Macros **/
/* Standard Input and Output */
#define ungetchar(c)            ungetc (c, stdin)
#define peekchar()              ungetc (getchar(), stdin)
/** Object Components **/
/* SOT -- size of 'Object_Type' (bytes used by type tag) */
#define SOT sizeof (Object_Type)
/* type -- return the object's self-identification (Object_Type) */
#define type(object)            *((Object_Type *) object)
/* symbol -- return the address of symbol's name and value (Symbol_Entry) */
#define symbol(object)          ((Symbol_Entry *) (object + SOT))
/* symbol_value -- return the value assigned to a symbol */
#define symbol_value(object)    (symbol(object)->value)
/* string -- return the address of (the first char of) standard C string */
#define string(object)          ((char *) (object + SOT))
/* integer -- return an 'int' */
#define integer(object)         *((int *) (object + SOT))
/* function -- return the address of a function that returns Object */
#define function(object)        *((Function *) (object + SOT))
/* pair -- return the address of a Lisp-style CONS cell */
#define pair(object)            ((Pair *) (object + SOT))
/* first -- return first element of a list (Lisp CAR) */
#define first(object)           (pair(object)->car)
/* but_first -- return list less its first element (Lisp CDR) */
#define but_first(object)       (pair(object)->cdr)
/* vector -- return the base address of a 1-dimensional array of Object */
#define vector(object)          ((Object *) (object + SOT + sizeof (int)))
/* vector_length -- return length of a VECTOR Object (also an lvalue) */
#define vector_length(object)   *((int *) (object + SOT))
/* token -- return the address of a Token structure */
#define token(object)           ((Token *) (object + SOT))
/* Type Predicates */
#define is_null(object)         (object == NULL)
#define is_symbol(object)       (type(object) == SYMBOL)
#define is_pair(object)         (type(object) == PAIR)
#define is_atom(object)         (is_null(object) || (type(object) != PAIR))
#define is_list(object)         (is_null(object) || is_pair(object))
#define is_vector(object)       (type(object) == VECTOR)
#define is_string(object)       (type(object) == STRING)
#define is_integer(object)      (type(object) == INTEGER)
#define is_function(object)     (type(object) == FUNCTION)
#define is_token(object)        (type(object) == TOKEN)
/* declare_symbol -- declare extern var with same name as interned sym */
#define declare_symbol(name,type)       extern Object name;
/* List-Based Stacks */
/* push -- push an object on to a (list-based) stack */
#define push(location,object)   \
  location = first_put (object, location)
/* pop -- pop an object off of a (list-based) stack, NULL if stack empty */
#define pop(location)           \
  ( (location != NULL) ?        \
    pop_f (&location) : NULL )
/* Function Prototypes */
void error (char *fstr, ...);
Object first_put (Object item, Object list);
Object last_put (Object item, Object list);
Object list (Object item, ...);
Object append (Object list_1, Object list_2);
Object reverse (Object list);
Object flatten (Object obj);
Object flatten_no_nils (Object obj);
void for_each (Function_1 f, Object list);
Object map (Function_1 f, Object list);
Object map_no_nils (Function_1 f, Object list);
Object nth (Object list, int n);
Object assoc (Object key, Object a_list);
Object pop_f (Object *location);
int length (Object list);
Object is_member (Object obj, Object list);
int index (Object element, Object list);
char *make_c_string (char *str);
Object make_symbol (char *name);
Object make_string (char *s);
Object make_integer (int n);
Object make_function (Function f);
Object make_token (Object type, char *lexeme);
Object make_vector (int length);
Object list_to_vector (Object list);
void write_object (Object obj);
Object read_object (void);
Object lookup (char *str);
Object intern (char *str);
Object install_with_value (char *str, Object val);
Object set_symbol_value (Object sym, Object val);
void install_internal_symbols (void);
void mark (void);
void free_to_mark (void);
void mark_persistent (void);
void unmark_persistent (void);
Pointer safe_malloc (size_t size);
void safe_free (void *p);
void free_object (Object obj);
Object copy_object (Object obj);
Object persistent_copy_object (Object obj);
void init_internal_read_table (void);
void set_internal_reader (void);





[LISTING TWO]


/* File I-SYMS.H of 28-Jan-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Declaration of Symbols in Internal Symbol Table **/
/* Symbol Types */
declare_symbol (SYMBOL_TYPE,    SYMBOL_TYPE);
declare_symbol (RESERVED,       SYMBOL_TYPE);
declare_symbol (CHAR_TYPE,      SYMBOL_TYPE);
declare_symbol (TOKEN_TYPE,     SYMBOL_TYPE);
/* Reserved "Lisp" Symbols */
declare_symbol (_UNDEFINED,     RESERVED);
declare_symbol (NIL,            RESERVED);
declare_symbol (T,              RESERVED);
declare_symbol (EOF_OBJECT,     RESERVED);
/* Character Types */
declare_symbol (ILLEGAL,        CHAR_TYPE);
declare_symbol (WHITESPACE,     CHAR_TYPE);
declare_symbol (STRING_MARKER,  CHAR_TYPE);
declare_symbol (COMMENT_MARKER, CHAR_TYPE);
declare_symbol (SPECIAL,        CHAR_TYPE);
declare_symbol (CONSTITUENT,    CHAR_TYPE);
declare_symbol (ESCAPE_MARKER,  CHAR_TYPE);
declare_symbol (ENDFILE_MARKER, CHAR_TYPE);
/** Token Types **/
/* For Internal Diagnostics */
declare_symbol (T_ERROR,        TOKEN_TYPE);
/* Internal Special Symbols (Lisp IO) */
declare_symbol (T_LPAREN,       TOKEN_TYPE);
declare_symbol (T_RPAREN,       TOKEN_TYPE);
/* Others */
declare_symbol (T_NEWLINE,      TOKEN_TYPE);
declare_symbol (T_WHITESPACE,   TOKEN_TYPE);
declare_symbol (T_WORD,         TOKEN_TYPE);
declare_symbol (T_STRING,       TOKEN_TYPE);
declare_symbol (T_EOF,          TOKEN_TYPE);





[LISTING THREE]


/* File LISP.C of 6-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Lisp-Style Library for C (Main File of User Functions) **/
/* Include Files */
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <stdarg.h>
#include "lisp.h"
#include "i-syms.h"
/** Functions **/
/* error -- write string (args like 'printf') to 'stdout' and exit */
void error (char *fstr, ...)
 {
  va_list ap;
  va_start (ap, fstr);
  vfprintf (stdout, fstr, ap);
  fputc (NEWLINE, stdout);
  /* write DOS_EOF to 'stdout' for compatibility */
  fputc (DOS_EOF, stdout);
  exit (1);
  va_end (ap);
 }
/** List Constructors **/
/* first_put -- add an Object to the front of a list (Lisp CONS) */
Object first_put (Object item, Object list)
 {
  Object new_list;
  new_list = (Object) safe_malloc (sizeof (Object_Type) + sizeof (Pair));
  type (new_list) = PAIR;
  pair (new_list) -> car = item;
  pair (new_list) -> cdr = list;
  return (new_list);
 }
/* last_put -- add an Object to the end of a list (Destructive!) */
Object last_put (Object item, Object list)
 {
  Object old_list, new_list;
  new_list = first_put (item, NULL);
  if (list == NULL)
   return (new_list);
  else
   {
    old_list = list;
    while (but_first (list) != NULL)
     list = but_first (list);
    pair (list) -> cdr = new_list;
    return (old_list);
   }
 }
/* list -- return a new list of given arguments (last arg must be NULL) */
Object list (Object item, ...)
 {
  va_list ap;
  Object result;
  result = NULL;
  va_start (ap, item);
  while (item != NULL)
   {
    result = last_put (item, result);
    item = va_arg (ap, Object);
   }
  va_end (ap);
  return (result);
 }
/* append -- concatenate two lists (destructive (!) Lisp equivalent) */
Object append (Object list_1, Object list_2)
 {
  Object list;
  if (list_1 == NULL)
   return (list_2);
  else
  if (list_2 == NULL)
   return (list_1);
  else
   {
    list = list_1;
    while (but_first (list) != NULL)
     list = but_first (list);
    pair (list) -> cdr = list_2;
    return (list_1);
   }
 }
/** List Modifiers **/
/* reverse -- return a new list in reverse order (Lisp equivalent) */
Object reverse (Object list)
 {
  Object new_list;
  new_list = NULL;
  while (list != NULL)
   {
    new_list = first_put (first (list), new_list);
    list = but_first (list);
   }
  return (new_list);
 }
/* flatten -- return the leaves of a tree (atoms of nested lists) */
Object flatten (Object obj)
 {
  if (is_null (obj))
   return (first_put (NULL, NULL));
  else if (is_atom (obj))
   return (list (obj, NULL));
  else if (is_null (but_first (obj)))
   return (flatten (first (obj)));
  else
   return (append (flatten (first (obj)),
                   flatten (but_first (obj)) ));
 }
/* flatten_no_nils -- 'flatten' a tree, discarding NULL atoms */
Object flatten_no_nils (Object obj)
 {
  if (is_null (obj))
   return (NULL);
  else if (is_atom (obj))
   return (list (obj, NULL));
  else
   return (append (flatten_no_nils (first (obj)),
                                  flatten_no_nils (but_first (obj)) ));
 }
/** Mapping Functions **/
/* for_each -- apply a function 'f' to each element of a list */
void for_each (Function_1 f, Object list)
 {
  while (list != NULL)
   {
    (*f) (first (list));
    list = but_first (list);
   }
 }
/* map -- apply a function 'f' to each element of list, put results in list */
Object map (Function_1 f, Object list)
 {
  Object output;
  output = NULL;
  while (list != NULL)
   {
    output = first_put ((*f) (first (list)), output);
    list = but_first (list);
   }
  return (reverse (output));
 }
/* map_no_nils -- like 'map', but collect only non-NULL results */
Object map_no_nils (Function_1 f, Object list)
 {
  Object result;
  Object output;
  output = NULL;
  while (list != NULL)
   {
    result = (*f) (first (list));
    if (result != NULL)
     output = first_put (result, output);
    list = but_first (list);
   }
  return (reverse (output));
 }
/** List Selectors **/
/* nth -- return nth element of a list or NULL (Lisp equivalent) */
Object nth (Object list, int n)
 {
  while ((list != NULL) && (n > 0))
   {
    list = but_first (list);
    n--;
   }
  if (list != NULL)
   return (first (list));
  else
   return (NULL);
 }
/* assoc -- association-list lookup returns PAIR whose 'first' matches key */
Object assoc (Object key, Object a_list)
 {
  Object pair;
  while (a_list != NULL)
   {
    pair = first (a_list);
    if (first (pair) == key)
     return (pair);
    else
     a_list = but_first (a_list);
   }
  return (NULL);
 }
/* pop_f -- pop an object off of a (list-based) stack: 'pop' macro helper */
Object pop_f (Object *location)
 {
  Object item;
  item = first (*location);
  *location = but_first (*location);
  return (item);
 }

/* List Properties */
/* length -- return the integer length of a list (Lisp equivalent) */
int length (Object list)
 {
  int n;
  n = 0;
  while (list != NULL)
   {
    list = but_first (list);
    n++;
   }
  return (n);
 }
/* is_member -- T if 'obj' is identical to element of 'list', else NULL */
Object is_member (Object obj, Object list)
 {
  while (list != NULL)
   {
    if (first (list) == obj)
     return (T);
    else
     list = but_first (list);
   }
  return (NULL);
 }
/* index -- return index of first occurence of 'element' in 'list' */
int index (Object element, Object list)
 {
  int n;
  n = 0;
  while ((list != NULL) &&
         (first (list) != element) )
   {
    list = but_first (list);
    n++;
   }
  if (list != NULL)
   return (n);
  else
   return (-1);
 }
/** Object Constructors **/
/* make_c_string -- make new copy of argument string in free memory */
char *make_c_string (char *str)
 {
  char *new_string;
  new_string = (char *) safe_malloc (strlen (str) + 1);
  strcpy (new_string, str);
  return (new_string);
 }
/* make_symbol -- return a new symbol of given name (no table lookup) */
Object make_symbol (char *name)
 {
  Object new_symbol;
  new_symbol = (Object) safe_malloc (sizeof (Object_Type) +
                                                    sizeof (Symbol_Entry) );
  type (new_symbol) = SYMBOL;
  symbol (new_symbol) -> print_name = make_c_string (name);
  symbol (new_symbol) -> value = _UNDEFINED;
  return (new_symbol);
 }
/* make_string -- return a new STRING Object with value of given string */
Object make_string (char *s)
 {
  Object new_string;
  new_string = (Object) safe_malloc (sizeof (Object_Type) + strlen (s) + 1 );
  type (new_string) = STRING;
  strcpy (string (new_string), s);
  return (new_string);
 }
/* make_integer -- return a new INTEGER Object of specfied value */
Object make_integer (int n)
 {
  Object new_integer;
  new_integer = (Object) safe_malloc (sizeof (Object_Type) + sizeof (int) );
  type (new_integer) = INTEGER;
  integer (new_integer) = n;
  return (new_integer);
 }
/* make_function -- return a new FUNCTION Object of specfied value */
Object make_function (Function f)
 {
  Object new_function;
  new_function = (Object) safe_malloc (sizeof (Object_Type) +
                                                          sizeof (Function) );
  type (new_function) = FUNCTION;
  function (new_function) = f;
  return (new_function);
 }
/* make_token -- return a new TOKEN Object of specified type and lexeme */
Object make_token (Object type, char *lexeme)
 {
  Object new_token;
  new_token = (Object) safe_malloc (sizeof (Object_Type) + sizeof (Token));
  type (new_token) = TOKEN;
  token (new_token) -> type = type;
  token (new_token) -> lexeme = make_c_string (lexeme);
  return (new_token);
 }
/** Vectors **/
/* make_vector -- return a new VECTOR object of specified 'length' */
Object make_vector (int length)
 {
  Object new_vector;
  int i;
  new_vector = (Object) safe_malloc (sizeof (Object_Type) + sizeof (int) +
                                                   length * sizeof (Object) );
  type (new_vector) = VECTOR;
  vector_length (new_vector) = length;
  for (i = 0; i < length; i++)

   vector(new_vector) [i] = NULL;
  return (new_vector);
 }
/* list_to_vector -- given a (proper) list, return a new VECTOR Object */
Object list_to_vector (Object list)
 {
  Object new_vector;
  Object *element;
  new_vector = make_vector (length (list));
  element = vector(new_vector);
  while (list != NULL)
   {
    *element = first (list);
    list = but_first (list);
    element++;
   }
  return (new_vector);
 }
/** Symbolic Output **/
/* write_spaces -- write 'n' spaces to 'stdout' */
void write_spaces (int n)
 {
  int i;
  for (i = 0; i < n; i++)
   putchar (SPACE);
 }
/* write_c_string -- write standard C string with double-quotes and escapes */
void write_c_string (char *s)
 {
  putchar (DOUBLE_QUOTE);
  while (*s != EOS)
   {
    switch (*s)
     {
      case NEWLINE:
       putchar (BACKSLASH);
       putchar ('n');
       break;
      case TAB:
       putchar (BACKSLASH);
       putchar ('t');
       break;
      case FORMFEED:
       putchar (BACKSLASH);
       putchar ('f');
       break;
      case BACKSLASH:
       putchar (BACKSLASH);
       putchar (BACKSLASH);
       break;
      case DOUBLE_QUOTE:
       putchar (BACKSLASH);
       putchar (DOUBLE_QUOTE);
       break;
      default:
       putchar (*s);
       break;
     }
    s++;
   }
  putchar (DOUBLE_QUOTE);
 }
/* write_symbol -- write printed representation of SYMBOL Object */
void write_symbol (Object obj)
 {
  printf ("%s", symbol(obj)->print_name);
 }
/* write_string -- write printed representation of STRING Object */
void write_string (Object obj)
 {
  write_c_string (string(obj));
 }
/* pp_object -- pretty-print an Object starting at 'col', output at 'hpos' */
void pp_object (Object obj, int col, int hpos)
 {
  int i;
  write_spaces (col - hpos);   hpos = col;
  if (obj == NULL)
   printf ("()");
  else
   switch (type(obj))
    {
     case SYMBOL:
      write_symbol (obj);
      break;
     case STRING:
      write_string (obj);
      break;
     case INTEGER:
      printf ("%d", integer(obj));
      break;
     case PAIR:
      /* for now, assume proper list (ending in NULL 'but_first') */
      putchar (LEFT_PAREN);   hpos++;
      while (obj != NULL)
       {
        if (! is_pair (obj))
         error ("pp_object: not proper list");
        pp_object (first (obj), col+1, hpos);
        obj = but_first (obj);
        if (obj != NULL)
         {
          putchar (NEWLINE);   hpos = 0;
         }
       }
      putchar (RIGHT_PAREN);
      break;
     case VECTOR:
      putchar ('#');          hpos++;
      putchar (LEFT_PAREN);   hpos++;
      for (i = 0; i < vector_length(obj); i++)
       {
        pp_object (vector(obj) [i], col+2, hpos);
        if (i < vector_length(obj)-1)
         {
          putchar (NEWLINE); hpos = 0;
         }
       }
      putchar (RIGHT_PAREN);
      break;
     case FUNCTION:
      printf ("#<function>");
      break;
     case TOKEN:
      printf ("#S(TOKEN ");
      write_symbol (token(obj)->type);
      putchar (SPACE);
      write_c_string (token(obj)->lexeme);
      putchar (RIGHT_PAREN);
      break;
     default:
      error ("pp_object: not standard object");
      break;
    }
 }
/* write_object -- write (re-readable) printed representation of Object */
void write_object (Object obj)
 {
  /* for now (simple version), assume 'hpos' initially 0 */
  pp_object (obj, 0, 0);
 }





[LISTING FOUR]


/* File SYMBOLS.C of 5-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Symbol Tables and Installed Symbols **/
/* Include Files */
#include <stdio.h>
#include <string.h>
#include "lisp.h"
/** Variables **/
/* internal_symbols -- the symbol table for "Lisp" */
Hash_Table internal_symbols;
/* symbol_table -- pointer to the current symbol table */
Object *symbol_table;
/* Predefined Internal Symbols */
#undef declare_symbol
#define declare_symbol(name,type)   \
  Object name
#include "i-syms.h"
/** Functions **/
/* init_hash_table -- set all hash buckets in a table to the empty list */
void init_hash_table (Hash_Table table)
 {
  int i;
  for (i = 0; i < HASH_TABLE_SZ; i++)
   table [i] = NULL;
 }
/* hash -- given a character string, return a hash code (from Aho, p. 436) */
int hash (char *str)
 {
  char *p;
  unsigned long g, h;
  /* from the book "Compilers" by Aho, Sethi, and Ullman, p. 436 */
  h = 0;
  for (p = str; *p != EOS; p++)
   {
    h = (h << 4) + (*p);
    g = h & 0xF0000000;
    if (g)
     {
      h = h ^ (g >> 24);
      h = h ^ g;
     }
   }
  return ( (int) (h % HASH_TABLE_SZ));
 }
/* lookup -- given a string, return symbol from 'symbol_table' or NULL */
Object lookup (char *str)
 {
  Object hash_bucket;           /* list   */
  Object sym;                   /* symbol */
  hash_bucket = symbol_table [hash (str)];
  /* walk linearly down 'hash_bucket' list looking for input string */
  while (hash_bucket != NULL)
   {
    sym = first (hash_bucket);
    if (strcmp (symbol (sym) -> print_name, str) == 0)
     return (sym);
    else
     hash_bucket = but_first (hash_bucket);
   }
  return (NULL);
 }
/* install -- add a new symbol with given print string to 'symbol_table' */
Object install (char *str)
 {
  Object new_sym;
  int hash_index;
  new_sym = make_symbol (str);
  /* insert new symbol object at the front of appropriate hash bucket list */
  hash_index = hash (str);
  symbol_table [hash_index] = first_put (new_sym, symbol_table [hash_index]);
  return (new_sym);
 }
/* intern -- return (possibly new and just installed) symbol of given name */
Object intern (char *str)
 {
  Object sym;           /* symbol */
  sym = lookup (str);
  if (sym == NULL)
   sym = install (str);
  return (sym);
 }
/* set_symbol_value -- set the value of an already installed symbol */
Object set_symbol_value (Object sym, Object val)
 {
  symbol (sym) -> value = val;
  return (val);
 }
/* install_with_value -- add a new symbol and its value to 'symbol_table' */
Object install_with_value (char *str, Object val)
 {
  Object new_sym;
  new_sym = install (str);
  set_symbol_value (new_sym, val);
  return (new_sym);
 }
/* install_internal_symbols -- set internal symbols known at compile time */
void install_internal_symbols (void)
 {
  symbol_table = internal_symbols;
  #undef declare_symbol
  #define declare_symbol(name,type)   \
    name = install_with_value (#name, type)
  #include "i-syms.h"
  install_with_value ("(", T_LPAREN);
  install_with_value (")", T_RPAREN);
 }





[LISTING FIVE]


/* File LEXER.C of 6-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Lexical Analyzer (a.k.a. Lexer, Scanner, or Reader) **/
/* Include Files */
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#include "lisp.h"
#include "i-syms.h"
/* External Variables */
extern Object *symbol_table;
extern Hash_Table internal_symbols;
/* Internal Function Prototypes */
Object read_list (Object first_atom);
/* Constants */
#define CHAR_SET_SZ 256
/** Types **/
/* Read_Table -- array giving CHAR_TYPE SYMBOL for every char and EOF */
typedef Object Read_Table [CHAR_SET_SZ+1];
/** Variables **/
/* internal_read_table -- read table for "Lisp" reader */
Read_Table internal_read_table;
/* read_table -- pointer to the current read table */
Object *read_table;
/* eof_seen -- 'get_token' (EOF) sets TRUE */
Boolean eof_seen = FALSE;
/** Macros **/
/* char_type -- return char type of char or EOF from current read table */
#define char_type(c)    read_table[c+1]
/** Functions **/
/* set_read_table_entries -- set a list of read-table entries to Char_Type */
void set_read_table_entries (char *s, Object t)
 {
  while (*s != EOS)
   char_type (*s++) = t;
 }
/* init_read_table -- initialize 'read_table' with CONSTITUENT and EOF */
void init_read_table (void)
 {
  int c;
  for (c = 0; c < CHAR_SET_SZ; c++)
   char_type (c) = CONSTITUENT;
  char_type (EOF) = ENDFILE_MARKER;
 }
/* init_internal_read_table -- initialize 'internal_read_table' */
void init_internal_read_table (void)
 {
  read_table = internal_read_table;
  init_read_table ();
  set_read_table_entries (" \t\f\n", WHITESPACE);
  set_read_table_entries (";", COMMENT_MARKER);
  set_read_table_entries ("()", SPECIAL);
  char_type (DOUBLE_QUOTE) = STRING_MARKER;
  char_type (BACKSLASH) = ESCAPE_MARKER;
 }
/* set_internal_reader -- set 'read_table' and 'symbol_table' for Lisp I/O */
void set_internal_reader (void)
 {
  read_table = internal_read_table;
  symbol_table = internal_symbols;
 }
/* get_whitespace -- return TOKEN Object of type T_WHITESPACE */
Object get_whitespace (void)
 {
  char lexeme [MAXSTRING];
  int index;
  int current_char;
  /* collect characters up to next non-whitespace */
  index = 0;
  while (current_char = getchar (),
         (char_type (current_char) == WHITESPACE) &&
         (index < MAXSTRING-1) )
   lexeme [index++] = (char) current_char;
  lexeme [index] = EOS;
  ungetchar (current_char);
  return (make_token (T_WHITESPACE, lexeme));
 }
/* get_escaped_char -- return single character value, line splice ==> EOS */
int get_escaped_char (void)
 {
  int c;
  /* discard ESCAPE_MARKER */
  getchar ();
  switch (c = getchar ())
   {
    case 'n':
     return (NEWLINE);
     break;
    case 't':
     return (TAB);
     break;
    case 'f':
     return (FORMFEED);
     break;
    case BACKSLASH:
     return (BACKSLASH);
     break;
    case DOUBLE_QUOTE:
     return (DOUBLE_QUOTE);
     break;
    /* Note: LINE_SPLICE should really be discarded */
    case NEWLINE:
     return (LINE_SPLICE);
     break;
    default:
     return (c);
     break;
   }
 }
/* get_string -- return TOKEN Object of type T_STRING */
Object get_string (void)
 {
  char lexeme [MAXSTRING];
  int index;
  int current_char;
  /* discard starting STRING_MARKER */
  getchar ();
  /* collect characters until next (unescaped) STRING_MARKER */
  index = 0;
  while (current_char = getchar (),
         (char_type (current_char) != STRING_MARKER) &&
         (index < MAXSTRING-1) )
   {
    if (char_type (current_char) != ESCAPE_MARKER)
     lexeme [index++] = (char) current_char;
    else
     {
      ungetchar (current_char);
      lexeme [index++] = (char) get_escaped_char ();
     }
   }
  lexeme [index] = EOS;
  return (make_token (T_STRING, lexeme));
 }
/* skip_comment -- discard characters of a 'get_token' (line) comment */
void skip_comment (void)
 {
  while (getchar () != NEWLINE)
   ;
 }
/* get_special_sym -- return one of the special-symbol TOKEN Objects */
Object get_special_sym (void)
 {
  int current_char;
  char lexeme [3];
  Object sym;
  current_char = getchar ();
  lexeme [0] = (char) current_char;
  /* check for two-character special symbol */
  current_char = getchar ();
  lexeme [1] = (char) current_char;
  lexeme [2] = EOS;
  sym = lookup (lexeme);
  if (sym != NULL)
   return (make_token (symbol_value (sym), lexeme));
  /* check for one-character special symbol */
  else
   {
    ungetchar (current_char);
    lexeme [1] = EOS;
    sym = lookup (lexeme);
    if (sym != NULL)
     return (make_token (symbol_value (sym), lexeme));
    /* else error */
    else
     error ("get_special_sym: no token type for '%s' ", lexeme);
   }
 }
/* get_word -- return TOKEN Object of type T_WORD */
Object get_word (void)
 {
  char lexeme [MAXSTRING];
  int index;
  int current_char;
  /* collect characters up to next non-constituent */
  index = 0;
  while (current_char = getchar (),
         (char_type (current_char) == CONSTITUENT) &&
         (index < MAXSTRING-1) )
   lexeme [index++] = (char) current_char;
  lexeme [index] = EOS;
  ungetchar (current_char);
  return (make_token (T_WORD, lexeme));
 }
/* get_token -- return a single TOKEN Object (raw version) */
Object get_token (void)
 {
  int current_char;
  Object ct;
  if (eof_seen)
   error ("get_token: attempt to read past end of file");
  current_char = peekchar ();
  ct = char_type (current_char);
  if (ct == CONSTITUENT)
   return (get_word ());
  else if (ct == WHITESPACE)
   return (get_whitespace ());
  else if (ct == SPECIAL)
   return (get_special_sym ());
  else if (ct == STRING_MARKER)
   return (get_string ());
  else if (ct == COMMENT_MARKER)
   {
    skip_comment ();
    return (get_token ());
   }
  else if (ct == ESCAPE_MARKER)
   {
    /* discard anything but LINE_SPLICE */
    if (get_escaped_char () == LINE_SPLICE)
     return (make_token (T_WHITESPACE, NEWLINESTR));
    else
     return (get_token ());
   }
  else if (ct == ENDFILE_MARKER)
   {
    /* set end-of-file flag (see 'with_current_files') */
    eof_seen = TRUE;
    return (make_token (T_EOF, NULLSTR));
   }
  else
   error ("get_token: bad char type for '%c' ", current_char);
 }
/* symbol_or_number -- interpret string as SYMBOL or INTEGER Object */
Object symbol_or_number (char *s)
 {
  if (isdigit (*s))
   return (make_integer (atoi (s)));
  else
   return (intern (s));
 }
/* read_atom -- return an atomic Object or list-syntax TOKEN Object */
Object read_atom (void)
 {
  Object t, tt;
  t = get_token ();
  tt = token(t)->type;
  if (tt == T_WHITESPACE)
   return (read_atom ());
  else
  if (tt == T_WORD)
   return (symbol_or_number (token(t)->lexeme));
  else
  if (tt == T_STRING)
   return (make_string (token(t)->lexeme));
  else
  if (tt == T_EOF)
   return (EOF_OBJECT);
  else
  if ((tt == T_LPAREN) || (tt == T_RPAREN))
   return (t);
  else
   error ("read_atom: bad token type on input");
 }
/* read_object_1 -- 'read_object' with first input atom supplied */
Object read_object_1 (Object first_atom)
 {
  Object_Type ot;
  Object tt;
  ot = type(first_atom);
  if (ot == TOKEN)
   tt = token(first_atom)->type;
  if ((ot == TOKEN) && (tt == T_LPAREN))
   return (read_list (read_atom ()));
  else
  if ((ot == TOKEN) && (tt == T_RPAREN))
   error ("read_object_1: right paren without matching left paren");
  else
   return (first_atom);
 }
/* read_list -- read paren-delimited list (helper for 'read_object') */
Object read_list (Object first_atom)
 {
  Object_Type ot;
  Object tt;
  Object first, rest;
  ot = type(first_atom);
  if (ot == TOKEN)
   tt = token(first_atom)->type;
  if ((ot == TOKEN) && (tt == T_RPAREN))
   return (NULL);
  else
  if ((ot == TOKEN) && (tt == T_EOF))
   error ("read_list: EOF encountered before closing right paren");
  else
   {
    first = read_object_1 (first_atom);
    rest = read_list (read_atom ());
    return (first_put (first, rest));
   }
 }
/* read_object -- read complete Object, including paren-delimited list */
Object read_object (void)
 {
  return (read_object_1 (read_atom ()));
 }




[LISTING SIX]


/* File MEMORY.C of 6-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Memory Allocation and Deallocation Functions **/
/* Include Files */
#include <stdio.h>
#include <stdlib.h>
#include "lisp.h"
/* Constants */
#define MAX_MARK_LEVELS 16
/** Types **/
/* Mark_Type */
typedef enum
 {
  TEMPORARY,
  PERSISTENT
 } Mark_Type;
/* Mark -- an element of 'mark_stack' */
typedef struct
 {
  Mark_Type type;
  Pointer index;
 } Mark;
/** Variables **/
/* marked_block_list -- pointer to linked list of marked allocated blocks */
Pointer marked_block_list = NULL;
/* mark_stack -- stack of 'Mark' and stack index */
Mark mark_stack [MAX_MARK_LEVELS];
int mark_stack_index = 0;
/* alloc_persistent -- FALSE means stack pointers to freeable memory blocks */
Boolean alloc_persistent = TRUE;
/** Functions **/
/* push_marked_block -- push pointer to block on 'marked_block_list' */
void push_marked_block (Pointer p)
 {
  * (Pointer *) p = marked_block_list;
  marked_block_list = p;
 }
/* pop_marked_block -- pop pointer to block from 'marked_block_list' */
Pointer pop_marked_block (void)
 {
  Pointer p;
  p = marked_block_list;
  if (p != NULL)
   {
    marked_block_list = * (Pointer *) p;
    return (p);
   }
  else
   error ("pop_marked_block: 'marked_block_list' is empty");
 }
/* push_mark_stack -- push a Mark on top of 'mark_stack' */
void push_mark_stack (Mark m)
 {
  if (mark_stack_index < MAX_MARK_LEVELS)
   mark_stack [mark_stack_index++] = m;
  else
   error ("push_mark_stack: exceeded MAX_MARK_LEVELS");
 }
/* pop_mark_stack -- pop a Mark from 'mark_stack' */
Mark pop_mark_stack (void)
 {
  if (mark_stack_index > 0)
   return (mark_stack [--mark_stack_index]);
  else
   error ("pop_mark_stack: stack empty");
 }
/* top_mark_stack -- return top of 'mark_stack' or PERSISTENT Mark if empty */
Mark top_mark_stack (void)
 {
  Mark m;
  if (mark_stack_index > 0)
   return (mark_stack [mark_stack_index-1]);
  else
   {
    m.type = PERSISTENT;
    m.index = marked_block_list;
    return (m);
   }
 }
/* mark -- push TEMPORARY Mark (with 'marked_block_list') on 'mark_stack' */
void mark (void)
 {
  Mark m;
  m.type = TEMPORARY;
  m.index = marked_block_list;
  push_mark_stack (m);
  alloc_persistent = FALSE;
 }
/* free_to_mark -- 'safe_free' all memory blocks alloc'ed since last 'mark' */
void free_to_mark (void)
 {
  Mark m;
  m = pop_mark_stack ();
  if (m.type == TEMPORARY)
   {
    while (marked_block_list != m.index)
     safe_free ((char *) pop_marked_block () + sizeof (Pointer));
    alloc_persistent = (top_mark_stack().type == PERSISTENT);
   }
  else
   error ("free_to_mark: wrong mark type on 'mark_stack'");
 }
/* mark_persistent -- disable stacking of freeable memory block pointers */
void mark_persistent (void)
 {
  Mark m;
  m.type = PERSISTENT;
  m.index = marked_block_list;
  push_mark_stack (m);
  alloc_persistent = TRUE;
 }
/* unmark_persistent -- pop a PERSISTENT Mark off the 'mark_stack' */
void unmark_persistent (void)
 {
  Mark m;
  m = pop_mark_stack ();
  if (m.type == PERSISTENT)
   alloc_persistent = (top_mark_stack().type == PERSISTENT);
  else
   error ("unmark_persistent: wrong mark type on 'mark_stack'");
 }
/* safe_malloc -- Unix 'malloc' wrapped inside test for sufficient memory */
Pointer safe_malloc (size_t size)
 {
  Pointer memory;
  static long num_blocks = 0;
  static long total_space = 0;
  /* allocate block, including header for link in 'marked_block_list' */
  memory = malloc (size + sizeof (Pointer));
  num_blocks++;
  total_space += size;
  if (memory != NULL)
   {
    if (! alloc_persistent)
     push_marked_block (memory);
    /* return beginning of user data block */
    return ((char *) memory + sizeof (Pointer));
   }
  else
   error ("safe_malloc: out of memory"
          " (num_blocks = %ld, total_space = %ld) \n",
          num_blocks, total_space );
 }
/* safe_free -- Unix 'free' with first byte of block set to zero */
void safe_free (void *p)
 {
  * (char *) p = (char) 0;
  /* free block, including header for link in 'marked_block_list' */
  free ((char* ) p - sizeof (Pointer));
 }
/* free_object -- free memory for Object and recursively for its components */
void free_object (Object obj)
 {
  if (marked_block_list != NULL)
   error ("free_object: can't free if 'marked_block_list' not empty");
  if (obj == NULL)
   return;
  else
   switch (type(obj))
    {
     case SYMBOL:
      return;
      break;
     case STRING:
     case INTEGER:
     case FUNCTION:
      break;
     case PAIR:
      free_object (first(obj));
      free_object (but_first(obj));
      break;
     case VECTOR:
      error ("free_object: VECTOR objects not implemented yet");
      break;
     case TOKEN:
      safe_free (token(obj)->lexeme);
      break;
     default:
      error ("free_object: not standard object");
      break;
    }
  safe_free (obj);
 }
/* copy_object -- copy Object and its components recursively */
Object copy_object (Object obj)
 {
  if (obj == NULL)
   return (NULL);
  switch (type(obj))
   {
    case SYMBOL:
     return (obj);
    case STRING:
     return (make_string (string(obj)));
    case INTEGER:
     return (make_integer (integer(obj)));
    case FUNCTION:
     return (make_function (function(obj)));
    case PAIR:
     return (first_put (copy_object (first(obj)),
                                        copy_object (but_first(obj)) ));
    case VECTOR:
     error ("copy_object: VECTOR objects not implemented yet");
    case TOKEN:
     return (make_token (token(obj)->type, token(obj)->lexeme ));
    default:
     error ("copy_object: not standard object");
   }
 }
/* persistent_copy_object -- 'copy_object' wrapped in 'mark_persistent' */
Object persistent_copy_object (Object obj)
 {
  Object result;
  mark_persistent ();
  result = copy_object (obj);
  unmark_persistent ();
  return (result);
 }






[LISTING SEVEN]


/* File REPL.C of 11-Feb-91 / Copyright (C) 1991 by Daniel N. Ozick */
/* REPL: A Simplified Lisp-Style Read-Evaluate-Print Loop or
A Tiny Lisp Interpreter
REPL is a simple interactive program intended to demonstrate some of the
features of The Lisp-Style Library for C. At the DOS > prompt, it READs user
input and attempts to convert that input into an internal Object. Then it
EVALuates the input Object as a Lisp expression according to the rules below.
Finally, it PRINTs the external representation of the result of evaluating the
input Object, and prompts for more input. This LOOP continues until either an
error occurs or the user interrupts it with control-C or control-Break.
Lisp expressions are evaluated as follows: 1. The empty list evaluates to
itself. 2. A symbol evaluates to its symbol_value. 3. Strings and integers
evaluate to themselves. 4. A list whose first element is the symbol quote
evaluates to the second element of the list. 5. A list whose first element is
a symbol whose symbol_value is a  function evaluates to the result of applying
that function to the  (recursively) evaluated elements of the rest of the list.
"Impure" Lisp-style functions--those that have non-Object inputs or output--
cannot be used in the Tiny Lisp Interpreter. These functions are for_each, map
(for which pmap is the equivalent "pure" version), map_no_nils, nth, length,
and index. In addition, the interpreter cannot handle macros such as first,
but_first, push, pop and the is_ type predicates. To create the REPL
executable file, link the compiled versions of LISP.C, SYMBOLS.C, LEXER.C,
MEMORY.C, and REPL.C. The required header files are LISP.H and I-SYMS.H. The
Lisp-Style Library and this program have been compiled and tested using
Microsoft C 6.0 under PC-DOS 3.3. */

/* Include Files */
#include <stdio.h>
#include "lisp.h"
/** Variables **/
/* quote -- marker SYMBOL for quoted-expression special form in 'eval' */
Object quote;
/** Macros **/
/* declare_function -- set up a SYMBOL whose value is FUNCTION (same name) */
#define declare_function(name)  \
  install_with_value (#name, make_function ((Function) name))
/** Functions **/
/* integers -- return the list of INTEGERs 'n1' through 'n2' inclusive */
Object integers (Object n1, Object n2)
 {
  int i;
  Object result;
  result = NULL;
  for (i = integer (n1); i <= integer (n2); i++)
   result = first_put (make_integer (i), result);
  return (reverse (result));
 }
/* sum -- return (as an INTEGER) the sum of a list of INTEGERs */
Object sum (Object list)
 {
  int sum;
  sum = 0;
  while (list != NULL)
   {
    sum += integer (first (list));
    list = but_first (list);
   }
  return (make_integer (sum));
 }
/* square -- return (as an INTEGER) the square of an INTEGER */
Object square (Object n)
 {
  return (make_integer (integer (n) * integer (n)));
 }
/* The following function is the "purified" version of map. It has a non-Object
input and can't be used in the Tiny Lisp Interpreter. Similar purifications
can be made for other impure functions in The Lisp-Style Library for C. */
/* pmap -- apply a function to each element of a list, put results in list */
Object pmap (Object f, Object list)
 {
  Object output;
  output = NULL;
  while (list != NULL)
   {
    output = first_put ((*function(f)) (first (list)), output);
    list = but_first (list);
   }
  return (reverse (output));
 }
/* install_function_symbols -- set up some symbols for read-eval-print loop */
void install_function_symbols (void)
 {
  /* pure Object functions from LISP.C */
  declare_function (first_put);
  declare_function (last_put);
  declare_function (reverse);
  declare_function (list);
  declare_function (append);
  declare_function (flatten);
  declare_function (flatten_no_nils);
  declare_function (is_member);
  declare_function (assoc);
  /* pure Object functions from REPL.C (examples for Tiny Interpreter) */
  declare_function (integers);
  declare_function (sum);
  declare_function (square);
  declare_function (pmap);
 }
/* apply -- apply a ("pure" Object) FUNCTION to a list of args (max of 8) */
Object apply (Object f, Object args)
 {
  return ((*function (f)) (nth (args, 0), nth (args, 1),
                           nth (args, 2), nth (args, 3),
                           nth (args, 4), nth (args, 5),
                           nth (args, 6), nth (args, 7) ));
 }
/* eval -- evaluate a Lisp-syntax expression (see notes above) */
Object eval (Object expr)
 {
  Object first_element, f;
  /* () is self-evaluating */
  if (is_null (expr))
   return (expr);
  /* symbol ==> symbol's value, other atoms are self-evaluating */
  else if (is_atom (expr))
   {
    if (is_symbol (expr))
     return (symbol_value (expr));
    else
     return (expr);
   }
  /* lists are function applications or quoted expressions */
  else if (is_pair (expr))
   {
    first_element = first (expr);
    if (first_element == quote)
     return (first (but_first (expr)));
    if (is_symbol (first_element))
     f = symbol_value (first_element);
    else
     error ("eval: first element of list is not a symbol");
    if (is_function (f))
     return (apply (f, map (eval, but_first (expr))));
    else
     error ("eval: symbol value is not a function");
   }
 }
/* main (REPL) -- interactive read-eval-print loop (Tiny Lisp Interpreter) */
int main (int argc, char *argv[])
 {
  printf ("A Tiny Lisp Interpreter using the Lisp-Style Library for C \n");
  printf ("Copyright (C) 1991 by Daniel N. Ozick \n\n");
  /* initialize internal symbol tables and read-tables */
  mark_persistent ();
  install_internal_symbols ();
  init_internal_read_table ();
  set_internal_reader ();
  install_function_symbols ();
  quote = intern ("quote");
  unmark_persistent ();
  /* do read-eval-print loop until user interrupt */
  while (TRUE)
   {
    mark ();
    printf ("\n> ");
    write_object (eval (read_object ()));
    free_to_mark ();
   }
  /* return "no errors" */
  return (0);
 }


Copyright © 1991, Dr. Dobb's Journal

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.