Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Open Source

Code Finessing


Quality Time

Luckily, there's a way we can arrive at a clean algorithm implementation without suffering a performance penalty. This almost-magic road is called "tail recursion elimination." You can apply this technique in places where functions get recursively called from their end (tail); that is, at the point where they return to their caller. In such cases you can replace the recursive call with an assignment of the new argument values to the old ones and a jump (via a goto statement) to the function's entry point. In fact, many optimizing compilers automatically perform this optimization, saving you the trouble of doing it by hand.

As most calls in Prosser's expansion algorithm are indeed tail recursive, my first reaction was to see if the compiler realized this fact and optimized the code accordingly. I compiled the code with GCC's -S switch, which generates the intermediate assembly language file, and peeked at the result. A set of 12 separate lines recursively calling subst inside the function's body confirmed my fear: GCC hadn't realized it could optimize the calls away (at least in Version 3.4.2, which I was using).

I thus set out to perform this operation by hand. I did this in several small steps, each one under version control. The large number of regression tests at hand allowed me to proceed with confidence. If a step failed, I reverted the changes and performed it again, more carefully. This is how an excerpt of my initial implementation of subst looked in its recursive form:


static PtokenSequence
subst(const Macro &m, dequePtoken is, 
  const mapArgval &args, HideSet hs, 
   bool skip_defined, const Macro *caller)
{
   if (is.empty())
     return (hsadd(hs, os));
   for (ti = tail.begin(); ti != tail.end(); ti++)
      if (!ti->is_space() && 
      (ai = find_formal_argument(args, *ti)) != args.end()) {
         tail.erase(tail.begin(), ++ti);
         os.push_back(stringize(ai->second));
          return (subst(m, tail, args,
            hs, os, skip_defined, caller));
   }


The first step involved mechanically changing the tail recursive calls into jumps to the function's beginning. In this step, I also altered the code to directly modify the function's arguments is and os, instead of constructing new ones. The corresponding code in the function's body took this form:


again:
  if (is.empty())
    return (hsadd(hs, os));
  for (ti = is.begin(); ti != is. end(); ti++)
    if (!ti->is_space() &&  (ai =
find_formal_argument(args, *ti)) !=  args.end()) {
     is.erase(is.begin(), ++ti);
     os.push_back(stringize(ai-> second));
     goto again;
  }


Running the offending strcmp() Linux statement with the above code showed a dramatic improvement in performance. The processing time dropped by almost two orders of magnitude from 12.2 seconds to 0.16 seconds—at a price. My code ended up littered with 12 instances of a goto statement. Although not all uses of goto are bad (I can't see a problem if a small percentage of a program's functions contain a single label), I try as a matter of principle to avoid the use of a goto whenever I can use an equally readable alternative.

If my implementation was coded in Java, an alternative would be to place the function's body within a labeled loop and use a continue statement to branch to the loop's beginning. Because C++ doesn't provide an explicit facility to exit through multiple nested loops, my next step involved converting a number of inner loops, which searched for the first nonspace token, into calls to a function I named find_nonspace. This exercise had the added benefit of reducing the code's indentation levels by one. Consequently, I felt that through this change, I repaid some of the style debt I incurred by dropping the recursive algorithm. This is the code's new look:


again:
   if (is.empty())
     return (hsadd(hs, os));
   ti = find_nonspace(is.begin(),  is.end());
   if (ti != is.end() && (ai = 
find_formal_argument(args, *ti)) != args.end()) {
    is.erase(is.begin(), ++ti);
    os.push_back(stringize(ai-> second));
    goto again;
   }


At that point, it became child's play to put the function's body inside a while loop and substitute all instances of goto with continue. In a number of cases, I was also able to factor a couple of continue statements into a single one, further improving the code's readability. This is the final look of the goto-less code:


while (!is.empty()) {
  ti = find_nonspace(is.begin(), is.end());
  if (ti != is.end() && (ai = 
find_formal_argument(args, *ti)) !=  args.end()) {
    is.erase(is.begin(), ++ti);
    os.push_back(stringize(ai-> second));
    continue;
  }
}
return (hsadd(hs, os));


In three steps, I moved from a clean recursive implementation to an equally readable and much more efficient iterative code. The in-between stopovers were somewhat messy, but version control and regression tests allowed me to navigate through them with confidence.

Lessons Learned

I have to admit that there were times during this implementation work when I felt frustrated and doubted that the new code would ever work reliably and efficiently. In retrospect, things went remarkably well. The dictum "first make the code correct, then make it fast" proved itself once more. I think this is the prudent course to take when dealing with tricky algorithms. I admit that the degree to which recursion and the functional programming style burdened my program's performance surprised me. This is the first time I witnessed first hand an unintentional change of the complexity class of an algorithm. I was also disappointed by the fact that GCC was unable to optimize the recursive tail calls. It seems that the state of the art in compiler technology I heard advertised as an undergraduate student isn't really what was claimed it was. I was lucky to have at hand a large set of tests to verify the extensive changes I made to the code. I thus realized why unit tests are such an integral part of agile methodologies, which rely on frequent refactoring exercises.

When I committed the final version of the implementation to the version-control system, I felt confident that the code was correct, readable, and efficient. It was quality code, for which I felt proud. And this is what counts in the end.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.