JVM Languages

Functional Programming in Java

By Mark Zander, November 01, 2005

Generic Java lets you extend the language to gain some of the advantages of functional programming languages.

November, 2005: Functional Programming in Java

Mark is a technical staff member at Los Alamos National Laboratory. He can be contacted at [email protected].

Although Java supports composition of classes and interfaces through the "extend" and "implements" features, it does not allow composition of functions. Functional programming languages such as Haskell (http://www.haskell.org/), support composition of functions because functions are first-class objects and support currying. When functions are first-class objects, they may be assigned to variables, passed as arguments to other functions, and returned as the result of functions. Currying allows function arguments to be applied one at a time, creating a new function with one less argument.

Java 1.5 has recently been released with generics (see "Generics in the Java Programming Language," by Gilad Bracha, http://java.sun.com/j2se/1.5.0/download-pdf.html). Generic Java (GJ) lets you further specify the typing of classes and objects. This is used to guarantee type safety at compile time, rather than guaranteeing type safety at runtime by using casts. Generics are types that exist at compile time only, and that makes them different than C++ templates that instantiate new types that exist at runtime. Nevertheless, there are similarities in what can be achieved; the main influence for this work came from examples in C++ (see "Making C++ Ready for Algorithmic Skeletons," by Jörg Striegnitz; http://www.fz-juelich.de/zam/docs/autoren2000/striegnitz).

Currying and Function Objects

Currying allows passing one parameter at a time to a function. For example, in Haskell, a function to add two numbers would be:

add :: Int -> (Int -> Int)
add x y = x + y

where the first line is the type of the function, and the second line is the implementation of the function. The function type shows that add takes an Int as its first argument, an Int for its second argument, and returns an Int as a result. But what if only one Int is passed to add? Then a new function is created that takes an Int for its argument and returns an Int as a result. For example, you can define a new function:

inc :: Int -> Int
inc = add 1

so function inc takes an Int for an argument and returns an Int as a result. It increments its argument by one to get its result.

In Effective Java Programming Language Guide (Addison-Wesley, 2001), Joshua Bloch gives a description of using a class containing only one function, then using an instance of the class to represent that function. The term "function object" can be used to describe this concept. Because Java does not allow first-class functions, function objects may be used instead.

What should the syntax of calling one argument at a time look like in Java? In Haskell, function arguments are delimited by spaces, so that adding 1 and 2 would look like add 1 2. C++ has operator overloading for parentheses so that passing arguments one at a time to add could look like add(1)(2). In Java, you invoke a method of a function object to emulate a call of one argument at a time. To minimize syntax, I picked a one-letter method named x for the next argument. Calling the add function object in Java would be add.x(1).x(2). The syntax of the three languages is:

add 1 2 -- Haskell
add(1)(2) // C++
add.x(1).x(2) // Java

Functional Programming in Java

I have implemented classes to support functional programming in Java and give some examples before describing the implementation.

A new function object derives from a function object base class, and implements a call method. There is a different function object base class needed for functions requiring different numbers of arguments. The function object base class for functions requiring two arguments has the name Function.O2. It has generic-type parameters indicating the type of each argument, and a final type parameter indicating the return type of the function. The add function earlier would be written as Example 1(a).

In Haskell, the type of a function is indicated by the syntax A->R, where A is the argument type of the function, and R is the result type of the function. The type of a function with two arguments would look like A1->(A2->R), where A1 is the type of the first argument, A2 the type of the second argument, and R the result type. In GJ, this can be emulated with Fcn<A, R> representing a function object needing one argument, and Fcn<A1, Fcn<A2, R>> representing a function object needing two arguments. The function object type of the Add function object just mentioned would be Fcn<Integer, Fcn<Integer, Integer>>. An increment function object, inc, would then have type Fcn<Integer, Integer>, as in Example 1(b).

A more interesting example might be incrementing every element of a collection. The Transform function object takes a function and applies it to a collection. It takes three arguments: a function object, an input collection to which the function will be applied, and a new collection in which the final result will be placed. Example 1(c) is an example of using the Transform function object to increment all the elements of a collection. Notice that in Example 1(c), there was no need to write new functions. Instead, results were achieved by combining existing function objects.

Implementation

The mechanics for function objects is achieved through five classes and one interface. The interface Fcn is used to type function objects. The abstract class Executioner aids in the mechanics of implementation. The class Nil is used to terminate argument lists. The main implementation is split into three parts: argument lists, partial functions, and function object base classes. These parts are:

Fcn is the interface for all function objects. It is key to typing all function objects by their argument type and return type. It also declares the x method for processing the next argument; see Example 2(a).
Executioner contains one method that will execute the requested function. It has just one argument, which is assumed to be an argument list. It unfolds the argument list into its individual arguments to make the actual function call; see Example 2(b).
Nil is an almost empty class used to terminate an argument list; see Example 2(c).

Three main classes supply the implementation that allows for function objects to exist in Java.

Argument Lists are implemented by class ArgList (see Arglist.java, available electronically; see "Resource Center, page 4). This class keeps track of arguments as they are passed in one at a time. It is a heterogeneous list, as each argument could have a different type.
Partial function objects are implemented in class Partial (see Partial.java, available electronically). This class holds several classes, each representing the number of arguments left in a function call. A partial function object is a function that has some of its arguments applied, but not others.
Function object base classes are implemented in class Function (see Function.java, available electronically). This class holds several classes, each representing the number of arguments in the original function. A function object base class is an abstract class with an abstract call method containing the specified number of arguments. A user creates a new function object by extending from a function object base class and implementing its call method.

Argument Lists

Because arguments are accumulated one at a time, a data structure is needed in which to keep the arguments. The first thing that comes to mind is the Java Collection's class LinkedList. Unfortunately, class LinkedList has the same type for every element of the list, whereas an argument list has a different type for every argument. Inspired by Andrei Alexandrescu's Typelists in Modern C++ Design: Generic Programming and Design Patterns Applied (Addison-Wesley, 2001) using C++ templates, a similar structure can be built using GJ. Class ArgList has two type parameters, the first representing the first element of the list and the second representing the rest of the list. By convention, the rest of the list is either empty (represented by class Nil) or includes another ArgList with two more types. Arguments are always added to an arglist at the front. For example:

new ArgList(an, arglist)

where an is the current argument being added, and arglist is the existing argument list. Because of this, the arguments always end up in reverse order in argument lists. Thus, the type of an argument list with three arguments of types T1, T2, and T3 would be:

ArgList<T3, <ArgList< T2, ArgList<T1, Nil>>>

Static accessor functions are supplied that retrieve the first, second, third, fourth, and fifth elements of an argument list. A static function is used because it forces an object of type ArgList to be passed as an argument. Passing an object as a function parameter allows us to infer the type of the object, and each of the accessor functions infers a different type. Arglist.java (available electronically) shows the ArgList class and its static accessor functions.

Partial Function Objects

Partial function objects are function objects that have received at least one of their arguments and have at least one argument that has not been specified. Each partial function object keeps an argument list of arguments that have already been passed and an Executioner object that knows how to execute the required function when the argument list is complete. There is a different partial function object for each number of arguments left in a calling sequence. Each partial function object needs to implement the x method given by the Fcn interface. A partial function object with one argument left, Partial.O1, as in Partial.java (available electronically), executes the required function when its x method is invoked after prepending the final argument to form the completed argument list. A partial function object with two arguments left, Partial.O2, has an x method that prepends its argument to the argument list and then creates a new partial function object needing one more argument. Partial function objects with three or more arguments left are similar to a partial function object with two arguments left. Each additional argument requires another generic type parameter, and each x method creates a new partial function object with one less argument left.

All of the implemented partial functions are in Partial.java (available electronically) as members of class Partial. Each is named by the letter O for object, which is followed by the number of arguments left before making the function call. Partial function objects O1 through O4 have been implemented.

Partial.java shows that the x method does not create a new partial function object directly, but instead calls a static make function that creates the new partial function object. The make method is able to infer the proper type parameters through its argument list.

Partial function objects extend from the appropriate function object base class, implementing the inherited call method so that a partial function object may have its argument list completed all at once. Another advantage of partial function objects extending from function object base classes is that the function object base classes may be used for typing function objects. Rather than using the more verbose type for a function with three arguments, Fcn<A1, Fcn<A2, Fcn<A3, R>>>>, the simpler form of Function.O3<A1, A2, A3, R> may be used.

Function Object Base Classes

Usually a new function object is derived from one of the function object base classes. There is a different function object base class for each number of arguments in the original function call. Every function object base class has an abstract call method that is the function that will eventually be executed when all parameters are present. The function object base class for functions of one argument, Function.O1, is simpler than function object base classes with more arguments. It implements the x method from its inherited interface Fcn by simply calling its abstract call method.

The function object base class for functions of two arguments, Function.O2, sets the pattern for function object base classes with more than two arguments:

It has three generic arguments, <T1, T2, R>, indicating the type of each function argument, T1 and T2, and the return type, R, of the function.
It has an abstract call method that accepts two arguments of the correct type representing a function that gets executed.
It implements interface Fcn<T1, Fcn<T2, R>> with an implementation of method x that creates a partial function object needing one more argument for a complete argument list.
Because it creates a partial function object, it needs to supply an Executioner. It does this by extending the Executioner and implementing the calls method. Notice that the arguments are in reverse order from the list. The calls method calls the abstract call method.

Higher Order Function Objects

Higher order function objects are function objects that take other function objects as arguments. In doing so, they allow functions to be combined, creating new functionality without writing new functions.

A simple example would be function object Twice. It takes two arguments: a function and an argument for that function. It then applies that function twice to the argument. This is demonstrated by the Haskell code:

twice :: (a->a)->a->a
twice f z = f(f(z))

The type of twice says that its first argument is a function whose argument type and return type are the same. Using this information, the equivalent Java function object could be written as:

public class Twice<A>
extends Function.O2<Fcn<A, A>, A, A> {
public Twice() { }
public A call(Fcn<A, A> f, A z)
{ return f.x(f.x(z)); }
}

For example, you could use this to increment a value twice:

System.out.println("Twice inc 5 = ",
(new Twice<Integer>()).x(inc).x(5));

As a convenience, a static make function can be added to class Twice that will infer the argument type and return type given the original function:

public static <A> Fcn<A, A> make(Fcn<A, A> f) {
return (new Twice<A>()).x(f);
}

Now it is easier to print out the example:

System.out.println("Twice inc 5 =" + Twice.make(inc).x(5));

Another higher order function object is composition, where the composition of two functions is defined as applying first one function and then another. The Haskell code would look like the following:

(.) :: (b->c)->(a->b)->a->c
(f . g) z = f(g(z))

This defines ".", dot, to be the composition operator. It takes two functions and a value. It applies the second function to the argument and then applies the first function to that result. The argument type of the first function must be the same as the return type of the second function. Because Java does not have overloaded operators, I chose the name FoG to stand for F of G. The FoG function object is as follows:

public class FoG<A, B, C>
extends Function.O3<Fcn<B, C>, Fcn<A, B>, A, C> {
public FoG() { }
public C call(Fcn<B, C> f, Fcn<A, B> g, A z) { return f.x(g.x(z)); }
}

As with Twice, a static make function can be written that will take the first two arguments and infer the type of the function object:

public static <A, B, C> Fcn<A, C>
make(Fcn<B, C> f, Fcn<A, B> g)
{ return (new FoG<A, B, C>()).x(f).x(g); }

You may notice that using FoG with the same function twice has the same effect as using the Twice function object:

System.out.println("FoG inc inc 5 = " + FoG.make(inc, inc).x(5));

Transform is the final higher order function object described here. This is similar to the Haskell map function, but because the map name is already used by the Java collections, I borrowed the name "Transform" from the C++ Standard Template Library (STL). The Haskell map function is defined as:

map :: (a->b)->[a]->[b]
map f [] = []
map f x:xs = f(x) : map f xs

The map applies the function passed as its first argument to every element of the list passed as its second argument and returns a new list. In Java, there are many collection types, not just lists, so the Transform function object takes any Java collection as its second argument and takes a third function argument, which will be the output collection. Transform.java (available electronically) shows the implementation of the Transform function object. An example of the Transform function object was presented earlier in Example 1(c).

More Efficient Higher Order Function Objects

Combining function objects through higher order function objects can ease the task of programming, but this abstraction can be expensive in time and space. Arguments are put on argument lists, and partial function objects are created. It is therefore worth some effort to create more efficient function objects. The higher order function objects offer a good example. Each higher order function object has a static make method that builds a partial function object given one or two initial arguments. Instead of creating a partial function object, a specialized function object for the task at hand can be written. This technique is demonstrated in class Transform of Transform.java. There is an internal class FcnObject that has a constructor with one argument, the function to be applied. Its call method takes two arguments, the collection with the inputs and the collection that receives the outputs. The static make method creates one of these objects.

Another advantage for the Transform function object is that it is easier to type the function object. For example, using the Fcn<T, R> function, typing the type of the function object returned from Transform.make would be:

Fcn<Collection<Integer>, Fcn<Collection <Integer>, Collection<Integer>>>
incList = Transform.make(inc);

It might be somewhat easier to give the type using Transform.FcnObject:

Transform.FcnObject<Integer, Integer>
incList = Transform.make(inc);

Conclusion

Functional programming allows for greater expressiveness through higher order functions that allow functions to be combined, creating new functionality without writing new functions. With the aid of GJ, function objects with curried arguments can be created that allow for higher order function objects and extend Java to have some of the features and advantages of a functional programming language.

Acknowledgment

Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the University of California for the U.S. Department of Energy under contract W-7405-ENG-36. By acceptance of this article, the publisher recognizes that the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or to allow others to do so, for U.S. Government purposes. Los Alamos National Laboratory requests that the publisher identify this article as work performed under the auspices of the U.S. Department of Energy. Los Alamos National Laboratory strongly supports academic freedom and a researcher's right to publish; as an institution, however, the Laboratory does not endorse the viewpoint of a publication or guarantee its technical correctness. LANL Notice, LA-UR-05-3300.

DDJ

1 2 3 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

JVM Languages