Dr. Dobb's | The Why of Y | April 22, 2007

The Why of Y

One way to derive Y.

URL:http://www.drdobbs.com/web-development/the-why-of-y/199200394

Richard P. Gabriel is a Distinguished Engineer at IBM Research, looking into the architecture, design, and implementation of extraordinarily large, self-sustaining systems. He can be contacted at www.dreamsongs.com.

Did you ever wonder how Y works and how anyone could ever have thought of it? In this article, I explain not only how it works, but how someone could have invented it in the first place. I'll use Scheme notation because it makes it easier to see when functions passed as arguments are being applied.

Y's goal is to provide a mechanism to write self-referential programs without any special built-in means. Scheme has several mechanisms for writing such programs, including global function definitions and letrec. One way you can write the factorial function in Scheme is:

(define fact (lambda (n) (if (< n 2) 1 (* n (fact (- n 1))))))

This function works because a global variable, fact, has its value set to the value of the lambda expression. When the variable fact in the body of the function is evaluated to determine which function to invoke, the value is found in the global variable. Using a global variable as a function name can be rather unpleasant because it relies on a global, and thus vulnerable, resource -- that is, the global variable space.

The Scheme self-reference form letrec usually is implemented using a side effect; it is easier to reason about programming languages and programs that have no side effects. Therefore, it is of theoretical interest to establish the ability to write recursive functions without using side effects. A program that uses letrec is:

(letrec ((f (lambda (n) (if (< n 2) 1 (* n (f (- n 1))))))) (f 10))

This program computes 10!. The reference inside the lambda expression is to the binding of f, as established by the letrec.

You can implement letrec using let and set!:

(letrec ((f (lambda ...))) ...)

which is equivalent to:

(let ((f <undefined>)) (set! f (lambda ...)) ...)

All references to f in the lambda expression are to the value of the lambda expression.
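As a concrete instance of this expansion, here is the factorial program written with let and set! instead of letrec. This is a minimal sketch: the name result and the placeholder value #f are assumptions made for illustration. Any initial value would do, because it is overwritten before f is called.

```scheme
;; Factorial via let and set!, mirroring the letrec expansion above.
;; The initial value #f is an arbitrary placeholder (an assumption of
;; this sketch); it is replaced before f is ever called.
(define result
  (let ((f #f))
    (set! f
          (lambda (n)
            (if (< n 2)
                1
                (* n (f (- n 1))))))  ; f is looked up when the call runs
    (f 10)))

result  ; => 3628800
```

The side effect is the set!: it changes the binding of f after the binding has been created, which is exactly what Y lets us avoid.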

Y takes a function describing another recursive or self-referential function and returns another function that implements the recursive function. Y is used to compute 10! with:

(let ((f (Y (lambda (h)
              (lambda (n)
                (if (< n 2)
                    1
                    (* n (h (- n 1)))))))))
  (f 10))

The function passed as an argument to Y takes a function as an argument and returns a function that looks like the factorial function we want to define. That is, the function passed to Y is (lambda (h) ...). The body of this function looks like the factorial function, except that where we would expect a recursive call to the factorial function, h is called instead. Y arranges for an appropriate value to be supplied as the value of h.

People call Y the "applicative-order fixed-point operator for functionals." Let's take a closer look at what this means in the factorial example.

Suppose M is the true mathematical factorial function, possibly in Plato's heaven. Let F denote the function:

F = (lambda (h) (lambda (n) (if (< n 2) 1 (* n (h (- n 1))))))

Then:

((F M) n) = (M n).

That is, M is a fixed point of F: F maps (in some sense) M onto M. Y satisfies the property:

((F (Y F)) X) = ((Y F) X)

This property of Y is very important. Another important property is that the least defined fixed point for functionals is unique; therefore, (Y F) and M are in some sense the same.
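The fixed-point relation between F and M can be checked on a few inputs. In this sketch, M is stood in for by an ordinary recursive definition, and the helper agree? is a hypothetical name. A check on finitely many inputs only illustrates the property; it does not prove it.

```scheme
;; F maps a candidate factorial function h to a new function.
(define F
  (lambda (h)
    (lambda (n)
      (if (< n 2)
          1
          (* n (h (- n 1)))))))

;; Stand-in for the true factorial M, using Scheme's built-in recursion.
(define M
  (lambda (n)
    (if (< n 2) 1 (* n (M (- n 1))))))

;; Hypothetical helper: do (F M) and M agree on 0, 1, ..., limit?
(define agree?
  (lambda (limit)
    (let loop ((n 0))
      (cond ((> n limit) #t)
            ((= ((F M) n) (M n)) (loop (+ n 1)))
            (else #f)))))

(agree? 10)  ; => #t
```

Applying F to a function that is not a fixed point gives a different function: ((F (lambda (n) 0)) 5) is 0, not 120.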

Applicative-order Y is not the same as classical Y, which is a combinator. Some texts refer to applicative-order Y as Z.

To derive Y, I will start with a recursive function example, factorial. In the derivation I will use three techniques:

The first one passes an additional argument to avoid using any self-reference primitives from Scheme.
The second technique converts multiple-parameter functions to nested single-parameter functions to separate manipulations of the self-reference and ordinary parameters.
The third technique introduces functions through abstraction.

All code examples will use the variables n and m to refer to integers, the variable x to refer to an unknown but undistinguished argument, and the variables f, g, h, q, and r to refer to functions.

The basic form of the factorial function is:

(lambda (n) (if (< n 2) 1 (* n (h (- n 1)))))

The h variable should refer to the function we wish to invoke when a recursive call is made, which is the factorial function itself. Since we have no way to make h refer directly to the correct function, let's pass it in as an argument:

(lambda (h n) (if (< n 2) 1 (* n (h h (- n 1)))))

In the recursive call to h, the first argument will also be h because we want to pass on the correct function to use in the recursive situation for later invocations of the function.

Therefore, to compute 10! we would write:

(let ((g (lambda (h n) (if (< n 2) 1 (* n (h h (- n 1))))))) (g g 10))

During the evaluation of the body of g, h's value is the same as the value of g established by let; that is, during execution of g, h refers to the executing function. When the function call (h h (- n 1)) happens, the same value is passed along as an argument to h; h passes itself to itself.
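A hand trace of a smaller call makes the self-passing visible. The sketch below repeats g as a top-level definition (an assumption made so the trace can be run on its own) and follows the call (g g 3).

```scheme
;; g receives itself as its first argument, h.
(define g
  (lambda (h n)
    (if (< n 2)
        1
        (* n (h h (- n 1))))))

;; Trace of (g g 3); inside every call, h is g:
;;   (g g 3)
;;   = (* 3 (g g 2))
;;   = (* 3 (* 2 (g g 1)))
;;   = (* 3 (* 2 1))
;;   = 6
(g g 3)  ; => 6
```

The body of g never mentions the name g, so the top-level definition is only a convenience; the let form in the text behaves the same way.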

We want to split the management of the function's self-reference from the management of other arguments. In this particular case, we want to separate the management of h from that of n. A technique called "currying" is the standard way to handle this separation. Before we curry this example, let's look at another example of currying. Here is a program that also computes 10!, but in a slightly more clever way:

(letrec ((f (lambda (n m) (if (< n 2) m (f (- n 1) (* m n)))))) (f 10 1))

The trick is to use an accumulator, m, to compute the result. Let's curry the definition of f:

(letrec ((f (lambda (n)
              (lambda (m)
                (if (< n 2)
                    m
                    ((f (- n 1)) (* m n)))))))
  ((f 10) 1))

The idea of currying is that every function has one argument. Passing multiple arguments is accomplished with nested function application: the first application returns a function that takes the second argument and completes the computation of the value. In the previous piece of code, the recursive call:

((f (- n 1)) (* m n))

has two steps: the proper function to apply is computed and applied to the right argument.
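Currying can also be seen in isolation on a non-recursive function. The names add, curried-add, and add-three below are hypothetical and chosen only for illustration.

```scheme
;; Two-parameter version: both arguments arrive in one call.
(define add
  (lambda (n m) (+ n m)))

;; Curried version: the first call returns a function awaiting m.
(define curried-add
  (lambda (n)
    (lambda (m) (+ n m))))

(add 3 4)            ; => 7
((curried-add 3) 4)  ; => 7

;; The intermediate function can be named and reused.
(define add-three (curried-add 3))
(add-three 10)       ; => 13
```

The second form is the one that matters for the derivation: once a function takes its arguments one at a time, the first argument can be handled separately from the rest.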

We can use this idea to curry the other factorial program:

(let ((g (lambda (h)
           (lambda (n)
             (if (< n 2)
                 1
                 (* n ((h h) (- n 1))))))))
  ((g g) 10))

In this piece of code, the recursive call also computes and applies the proper function. But that proper function is computed by applying a function to itself.

Applying a function to itself is the process by which we get the basic functionality of a self-reference. The self-application (g g) in the last line of the program calls g with g itself as an argument. This returns a closure in which the variable h is bound to the outside g. This closure takes a number and does the basic factorial computation. If the computation needs to perform a recursive call, it invokes the closed-over h with the closed-over h as an argument, but all the hs are bound to the function g as defined by the let.

To summarize this technique, suppose we have a self-referential function using letrec as in the following code skeleton:

(letrec ((f (lambda (x) ... f ...))) ... f ...)

This skeleton can be turned into a self-referential function that uses let where r is a fresh identifier:

(let ((f (lambda (r) (lambda (x) ... (r r) ...)))) ... (f f) ...)
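As an illustration of the skeleton on a function other than factorial, here is a list-length function rewritten from letrec to let. The names len-letrec and len-let are hypothetical.

```scheme
;; letrec version: the body refers to f directly.
(define len-letrec
  (letrec ((f (lambda (x)
                (if (null? x)
                    0
                    (+ 1 (f (cdr x)))))))
    f))

;; let version: each reference to f becomes a self-application.
(define len-let
  (let ((f (lambda (r)
             (lambda (x)
               (if (null? x)
                   0
                   (+ 1 ((r r) (cdr x))))))))
    (f f)))

(len-letrec '(a b c))  ; => 3
(len-let '(a b c))     ; => 3
```

The rewrite is mechanical: wrap the original lambda in (lambda (r) ...), replace inner uses of f with (r r), and replace outer uses of f with (f f).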

For the next step, let's examine how to separate further the management of h in our factorial function from the management of n. Recall that the factorial program looks like:

(let ((g (lambda (h)
           (lambda (n)
             (if (< n 2)
                 1
                 (* n ((h h) (- n 1))))))))
  ((g g) 10))

Our plan of attack is to abstract the if expression over (h h) and n, which will accomplish two things: the resulting function will become independent of its surrounding bindings and the management of the control argument will become separated from the numeric argument. The result of the abstraction is:

(let ((g (lambda (h)
           (lambda (n)
             (let ((f (lambda (q n)
                        (if (< n 2)
                            1
                            (* n (q (- n 1)))))))
               (f (h h) n))))))
  ((g g) 10))

We can curry the definition of f, which also changes its call:

(let ((g (lambda (h)
           (lambda (n)
             (let ((f (lambda (q)
                        (lambda (n)
                          (if (< n 2)
                              1
                              (* n (q (- n 1))))))))
               ((f (h h)) n))))))
  ((g g) 10))

Notice that the definition of the function f need not be deeply embedded in the function g. Therefore, we can extract the main part of the function -- the part that computes factorial -- from the rest of the code.

(let ((f (lambda (q)
           (lambda (n)
             (if (< n 2)
                 1
                 (* n (q (- n 1))))))))
  (let ((g (lambda (h)
             (lambda (n)
               ((f (h h)) n)))))
    ((g g) 10)))
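Because g no longer mentions anything specific to factorial, the same wrapper works with a different f. The sketch below keeps the inner let unchanged and swaps in a hypothetical f that sums the integers from 1 to n; the name sum-to-10 is also an assumption made for illustration.

```scheme
(define sum-to-10
  (let ((f (lambda (q)
             (lambda (n)
               (if (< n 1)
                   0
                   (+ n (q (- n 1))))))))
    ;; Identical to the g in the factorial program.
    (let ((g (lambda (h)
               (lambda (n)
                 ((f (h h)) n)))))
      ((g g) 10))))

sum-to-10  ; => 55
```

Only the binding of f changed, which suggests treating f as a parameter.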

The form of f is once again the parameterized form of factorial, and we can abstract this expression over f, which produces Y as follows:

(define Y
  (lambda (f)
    (let ((g (lambda (h)
               (lambda (x) ((f (h h)) x)))))
      (g g))))

This is one way to derive Y.
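To see the derived operator at work, the sketch below applies it to two functionals. The definition of Y is repeated from the text so the example runs on its own; fact and fib are hypothetical names.

```scheme
(define Y
  (lambda (f)
    (let ((g (lambda (h)
               (lambda (x) ((f (h h)) x)))))
      (g g))))

;; Factorial, with h standing in for the recursive call.
(define fact
  (Y (lambda (h)
       (lambda (n)
         (if (< n 2)
             1
             (* n (h (- n 1))))))))

;; Fibonacci, using the same Y with a different functional.
(define fib
  (Y (lambda (h)
       (lambda (n)
         (if (< n 2)
             n
             (+ (h (- n 1)) (h (- n 2))))))))

(fact 10)  ; => 3628800
(fib 10)   ; => 55
```

The inner (lambda (x) ((f (h h)) x)) matters in an applicative-order language. Writing (f (h h)) directly would evaluate (h h) before f is called, and the self-application would never terminate.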