Elements of Programming
By Alexander Stepanov and Paul McJones
Alexander Stepanov is no-doubt familiar to Dobb's readers as the designer of the C++Standard Template Library, coauthor (with P.J. Plauger, Meng Lee, and David R. Musser) of The C++ Standard Template Library, noteworthy Al Stevens' interview subject, and as a recipient of Dr. Dobb's Excellence in Programming Award. With coauthor Paul McJones, Stepanov's newest tome is Elements of Programming -- a book written on the premise that practical programming, like other areas of science and engineering, must be based on a solid mathematical foundation. According to C++ designer Bjarne Stroustrup, "The book contains some of the most beautiful code I have ever seen."
Stepanov and McJones described their book as follows: "Our book applies the deductive method to programming by affiliating programs with the abstract mathematical theories that enable them to work. Specification of these theories, algorithms written in terms of these theories, and theorems and lemmas describing their properties are presented together. The implementation of the algorithms in a real programming language is central to the book. While the specifications, which are addressed to human beings, should, and even must, combine rigor with appropriate informality, the code, which is addressed to the computer, must be absolutely precise even while being general."
The following is an excerpt from Chapter One of Elements of Programming.
Starting with a brief taxonomy of ideas, we introduce notions of value, object, type, procedure, and concept that represent different categories of ideas in the computer. A central notion of the book, regularity, is introduced and elaborated. When applied to procedures, regularity means that procedures return equal results for equal arguments. When applied to types, regularity means that types possess the equality operator and equality-preserving copy construction and assignment. Regularity enables us to apply equational reasoning (substituting equals for equals) to transform and optimize programs.
1.1 Categories of Ideas: Entity, Species, Genus
In order to explain what objects, types, and other foundational computer notions are, it is useful to give an overview of some categories of ideas that correspond to these notions.
An abstract entity is an individual thing that is eternal and unchangeable, while a concrete entity is an individual thing that comes into and out of existence in space and time. An attribute—a correspondence between a concrete entity and an abstract entity—describes some property, measurement, or quality of the concrete entity. Identity, a primitive notion of our perception of reality, determines the sameness of a thing changing over time. Attributes of a concrete entity can change without affecting its identity. A snapshot of a concrete entity is a complete collection of its attributes at a particular point in time. Concrete entities are not only physical entities but also legal, financial, or political entities. Blue and 13 are examples of abstract entities. Socrates and the United States of America are examples of concrete entities. The color of Socrates’ eyes and the number of U.S. states are examples of attributes.
An abstract species describes common properties of essentially equivalent abstract entities. Examples of abstract species are natural number and color. A concrete species describes the set of attributes of essentially equivalent concrete entities. Examples of concrete species are man and U.S. state.
A function is a rule that associates one or more abstract entities, called arguments, from corresponding species with an abstract entity, called the result, from another species. Examples of functions are the successor function, which associates each natural number with the one that immediately follows it, and the function that associates with two colors the result of blending them.
An abstract genus describes different abstract species that are similar in some respect. Examples of abstract genera are number and binary operator. A concrete genus describes different concrete species similar in some respect. Examples of concrete genera are mammal and biped.
An entity belongs to a single species, which provides the rules for its construction or existence. An entity can belong to several genera, each of which describes certain properties.
We show later in the chapter that objects and values represent entities, types represent species, and concepts represent genera.
Unless we know the interpretation, the only things we see in a computer are 0s and 1s. A datum is a finite sequence of 0s and 1s.
A value type is a correspondence between a species (abstract or concrete) and a set of datums. A datum corresponding to a particular entity is called a representation of the entity; the entity is called the interpretation of the datum. We refer to a datum together with its interpretation as a value. Examples of values are integers represented in 32-bit two’s complement big-endian format and rational numbers represented as a concatenation of two 32-bit sequences, interpreted as integer numerator and denominator, represented as two’s complement big-endian values.
A datum is well formed with respect to a value type if and only if that datum represents an abstract entity. For example, every sequence of 32 bits is well formed when interpreted as a two’s-complement integer; an IEEE 754 floating-point NaN (Not a Number) is not well formed when interpreted as a real number.
A value type is properly partial if its values represent a proper subset of the abstract entities in the corresponding species; otherwise it is total. For example, the type int is properly partial, while the type bool is total.
A value type is uniquely represented if and only if at most one value corresponds to each abstract entity. For example, a type representing a truth value as a byte that interprets zero as false and nonzero as true is not uniquely represented. A type representing an integer as a sign bit and an unsigned magnitude does not provide a unique representation of zero. A type representing an integer in two’s complement is uniquely represented.
A value type is ambiguous if and only if a value of the type has more than one interpretation. The negation of ambiguous is unambiguous. For example, a type representing a calendar year over a period longer than a single century as two decimal digits is ambiguous.
Two values of a value type are equal if and only if they represent the same abstract entity. They are representationally equal if and only if their datums are identical sequences of 0s and 1s.
- Lemma 1.1 If a value type is uniquely represented, equality implies representational equality.
- Lemma 1.2 If a value type is not ambiguous, representational equality implies equality.
If a value type is uniquely represented, we implement equality by testing that both sequences of 0s and 1s are the same. Otherwise we must implement equality in such a way that preserves its consistency with the interpretations of its arguments. Nonunique representations are chosen when testing equality is done less frequently than operations generating new values and when it is possible to make generating new values faster at the cost of making equality slower. For example, two rational numbers represented as pairs of integers are equal if they reduce to the same lowest terms. Two finite sets represented as unsorted sequences are equal if, after sorting and eliminating duplicates, their corresponding elements are equal.
Sometimes, implementing true behavioral equality is too expensive or even impossible, as in the case for a type of encodings of computable functions. In these cases we must settle for the weaker representational equality: that two values are the same sequence of 0s and 1s.
Computers implement functions on abstract entities as functions on values. While values reside in memory, a properly implemented function on values does not depend on particular memory addresses: It implements a mapping from values to values.
A function defined on a value type is regular if and only if it respects equality: Substituting an equal value for an argument gives an equal result. Most numeric functions are regular. An example of a numeric function that is not regular is the function that returns the numerator of a rational number represented as a pair of integers, since 1/2 = 2/4, but numerator(1/2) does not equal numerator(2/4). Regular functions allow equational reasoning: substituting equals for equals.
A nonregular function depends on the representation, not just the interpretation, of its argument. When designing the representation for a value type, two tasks go hand in hand: implementing equality and deciding which functions will be regular.
A memory is a set of words, each with an address and a content. The addresses are values of a fixed size, called the address length. The contents are values of another fixed size, called the word length. The content of an address is obtained by a load operation. The association of a content with an address is changed by a store operation. Examples of memories are bytes in main memory and blocks on a disk drive.
An object is a representation of a concrete entity as a value in memory. An object has a state that is a value of some value type. The state of an object is changeable. Given an object corresponding to a concrete entity, its state corresponds to a snapshot of that entity. An object owns a set of resources, such as memory words or records in a file, to hold its state.
While the value of an object is a contiguous sequence of 0s and 1s, the resources in which these 0s and 1s are stored are not necessarily contiguous. It is the interpretation that gives unity to an object. For example, two doubles may be interpreted as a single complex number even if they are not adjacent. The resources of an object might even be in different memories. This book, however, deals only with objects residing in a single memory with one address space. Every object has a unique starting address, from which all its resources can be reached.
An object type is a pattern for storing and modifying values in memory. Corresponding to every object type is a value type describing states of objects of that type. Every object belongs to an object type. An example of an object type is integers represented in 32-bit two’s complement little-endian format aligned to a 4-byte address boundary.
Values and objects play complementary roles. Values are unchanging and are independent of any particular implementation in the computer. Objects are changeable and have computer-specific implementations. The state of an object at any point in time can be described by a value; this value could in principle be written down on paper (making a snapshot) or serialized and sent over a communication link. Describing the states of objects in terms of values allows us to abstract from the particular implementations of the objects when discussing equality. Functional programming deals with values; imperative programming deals with objects.
We use values to represent entities. Since values are unchanging, they can represent abstract entities. Sequences of values can also represent sequences of snapshots of concrete entities. Objects hold values representing entities. Since objects are changeable, they can represent concrete entities by taking on a new value to represent a change in the entity. Objects can also represent abstract entities: staying constant or taking on different approximations to the abstract. We use objects in the computer for the following three reasons.
- Objects model changeable concrete entities, such as employee records in a payroll application.
- Objects provide a powerful way to implement functions on values, such as a procedure implementing the square root of a floating-point number using an iterative algorithm.
- Computers with memory constitute the only available realization of a universal computational device.
Some properties of value types carry through to object types. An object is well formed if and only if its state is well formed. An object type is properly partial if and only if its value type is properly partial; otherwise it is total. An object type is uniquely represented if and only if its value type is uniquely represented.
Since concrete entities have identities, objects representing them need a corresponding notion of identity. An identity token is a unique value expressing the identity of an object and is computed from the value of the object and the address of its resources. Examples of identity tokens are the address of the object, an index into an array where the object is stored, and an employee number in a personnel record. Testing equality of identity tokens corresponds to testing identity. During the lifetime of an application, a particular object could use different identity tokens as it moves either within a data structure or from one data structure to another. Two objects of the same type are equal if and only if their states are equal. If two objects are equal, we say that one is a copy of the other. Making a change to an object does not affect any copy of it.
This book uses a programming language that has no way to describe values and value types as separate from objects and object types. So from this point on, when we refer to types without qualification, we mean object types.
For more sample material or purchasing information, go to