Channels ▼
RSS

Parallel

The Case for D


Functional Programming

Quick, how do you define a functional-style Fibonacci function?


<b>uint</b> fib(<b>uint</b> int n)
{
    <b>return</b> n < 2 ? n : fib(n - 1) + fib(n - 2);
}

I confess to entertaining fantasies. One of these fantasies has it that I go back in time and somehow eradicate this implementation of Fibonacci such that no Computer Science teacher ever teaches it. (Next on the list are bubble sort and the O(n log n)-space quicksort implementation. But fib outdoes both by a large margin. Also, killing Hitler or Stalin is dicey as it has hard-to-assess consequences, whereas killing fib is just good.) fib takes exponential time to complete and as such promotes nothing but ignorance of complexity and of the costs of computation, a "cute excuses sloppy" attitude, and SUV driving. You know how bad exponential is? fib(10) and fib(20) take negligible time on my machine, whereas fib(50) takes nineteen and a half minutes. In all likelihood, evaluating fib(1000) will outlast humankind, which gives me solace because it's what we deserve if we continue teaching bad programming.

Fine, so then what does a "green" functional Fibonacci implementation look like?


<b>uint</b> fib(<b>uint</b> n)
{
    <b>uint</b> iter(<b>uint</b> i, <b>uint</b> fib_1, <b>uint</b> fib_2)
   {
       <b>return</b> i == n
          ? fib_2
          : iter(i + 1, fib_1 + fib_2, fib_1);
   }
    <b>return</b> iter(0, 1, 0);
}

The revised version takes negligible time to complete fib(50). The implementation now takes O(n) time, and tail call optimization (which D implements) takes care of the space complexity. The problem is that the new fib kind of lost its glory. Essentially the revised implementation maintains two state variables in the disguise of function parameters, so we might as well come clean and write the straight loop that iter made unnecessarily obscure:


<b>uint</b> fib(<b>uint</b> n)
{
   <b>uint</b> fib_1 = 1, fib_2 = 0;
   <b>foreach</b> (i; 0 .. n)
  {
      <b>auto</b> t = fib_1;
      fib_1 += fib_2;
      fib_2 = t;
   }
   <b>return</b> fib_2;
}

but (shriek of horror) this is not functional anymore! Look at all that disgusting mutation going on in the loop! One mistaken step, and we fell all the way from the peaks of mathematical purity down to the unsophisticatedness of the unwashed masses.

But if we sit for a minute and think of it, the iterative fib is not that unwashed. If you think of it as a black box, fib always outputs the same thing for a given input, and after all pure is what pure does. The fact that it uses private state may make it less functional in letter, but not in spirit. Pulling carefully on that thread, we reach a very interesting conclusion: as long as the mutable state in a function is entirely transitory (i.e., allocated on the stack) and private (i.e., not passed along by reference to functions that may taint it), then the function can be considered pure.

And that's how D defines functional purity: you can use mutation in the implementation of a pure function, as long as it's transitory and private. You can then put pure in that function's signature and the compiler will compile it without a hitch:


<b>pure uint</b> fib(uint n)
{
    ... iterative implementation ...
}

The way D relaxes purity is quite useful because you're getting the best of both worlds: iron-clad functional purity guarantees, and comfortable implementation when iteration is the preferred method. If that's not cool, I don't know what is.

Last but not least, functional languages have another way of defining a Fibonacci sequence: as a so-called infinite list. Instead of a function, you define a lazy infinite list that gives you more Fibonacci numbers the more you read from it. D's standard library offers a pretty cool way of defining such lazy lists. For example, the code below outputs the first 50 Fibonacci numbers (you'd need to import std.range):


<b>foreach</b> (f; take(50, recurrence!("a[n-1] + a[n-2]")(0, 1)))
{
    writeln(f);
}

That's not a one-liner, it's a half-liner! The invocation of recurrence means, create an infinite list with the recurrence formula a[n] = a[n-1] + a[n-2] starting with numbers 0 and 1. In all this there is no dynamic memory allocation, no indirect function invocation, and no non-renewable resources used. The code is pretty much equivalent to the loop in the iterative implementation. To see how that can be done, you may want to read through the next section.

Generic Programming

(You know the kind of caution you feel when you want to describe to a friend a movie, a book, or some music you really like but don't want to spoil by overselling? That's the kind of caution I feel as I approach the subject of generic programming in D.) Generic programming has several definitions -- even the neutrality of the Wikipedia article on it is being disputed at the time of this writing. Some people refer to generic programming as "programming with parameterized types a.k.a. templates or generics," whereas others mean "expressing algorithms in the most generic form that preserves their complexity guarantees." I'll discuss a bit of the former in this section, and a bit of the latter in the next section.

D offers parameterized structs, classes, and functions with a simple syntax, for example here's a min function:


<b>auto</b> min(T)(T a, T b) { <b>return</b> b < a ? b : a; }
...
<b>auto</b> x = min(4, 5);

T would be a type parameter and a and b are regular function parameters. The auto return type leaves it to the compiler to figure out what type min returns. Here's the embryo of a list:


<b>class</b> List(T)
{
   T value;
   List next;
   ...
}
...
List!<b>int</b> numbers;

The fun only starts here. There's too much to tell in a short article to do the subject justice, so the next few paragraphs offer "deltas"-- differences from the languages with generics that you might already know.

Parameter Kinds. Not only types are acceptable as generic parameters, but also numbers (integral and floating-point), strings, compile-time values of struct type, and aliases. An alias parameter is any symbolic entity in a program, which can in turn refer to a value, a type, a function, or even a template. (That's how D elegantly sidesteps the infinite regression of template template template. . . parameters; just pass it as an alias.) Alias parameters are also instrumental in defining lambda functions. Variable-length parameter lists are also allowed.

String Manipulation. Passing strings to templates would be next to useless if there was no meaningful compile-time manipulation of strings. D offers full string manipulation capabilities during compilation (concatenation, indexing, selecting a substring, iterating, comparison....)

Code Generation: The Assembler of Generic Programming . Manipulating strings during compilation may be interesting, but is confined to the data flatland. What takes things into space is the ability to convert strings into code (by use of the mixin expression). Remember the recurrence example? It passed the recurrence formula for Fibonacci sequences into a library facility by means of a string. That facility in turn converted the string into code and provided arguments to it. As another example, here's how you sort ranges in D:


// define an array of integers
<b>auto</b> arr = [ 1, 3, 5, 2 ];
// sort increasingly (default)
sort(arr);
// decreasingly, using a lambda
sort!((x, y) { <b>return</b> x > y; })(arr);
// decreasingly, using code generation; comparison is
// a string with conventional parameters a and b
sort!("a >  b")(arr);

Code generation is very powerful because it allows implementing entire features without a need for language-level support. For example, D lacks bitfields, but the standard module std.bitmanip defines a facility implementing them fully and efficiently.

Introspection. In a way, introspection (i.e., the ability to inspect a code entity) is the complement of code generation because it looks at code instead of generating it. It also offers support for code generation -- for example, introspection provides the information for generating a parsing function for some enumerated value. At this time, introspection is only partially supported. A better design has been blueprinted and the implementation is "on the list," so please stay tuned for more about that.

is and static if. Anyone who's written a non-trivial C++ template knows both the necessity and the encumbrances of (a) figuring out whether some code "would compile" and deciding what to do if yes vs. if not, and (b) checking for Boolean conditions statically and compiling in different code on each branch. In D, the Boolean compile-time expression is(typeof(expr)) yields true if expr is a valid expression, and false otherwise (without aborting compilation). Also, static if looks pretty much like if, except it operates during compilation on any valid D compile-time Boolean expression (i.e., #if done right). I can easily say these two features alone slash the complexity of generic code in half, and it filled me with chagrin that C++0x includes neither.

But Wait, There's. . .Well, You Know. Generic programming is a vast play field, and although D covers it with a surprisingly compact conceptual package, it would be hard to discuss matters further without giving more involved information. D has more to offer, such as customized error messages, constrained templates ' la C++0x concepts (just a tad simpler -- what's a couple of orders of magnitude between friends?), tuples, a unique feature called "local instantiation "(crucial for flexible and efficient lambdas), and, if you call within the next five minutes, a knife that can cut through a frozen can of soda.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video