Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Open Source

Code Finessing


Crunch Time

Being satisfied with the correctness of the code, I set out to measure its performance. Functional code is often criticized for its poor time and space performance, so I was worried about the price I would have to pay for the speed in which I got the algorithm running. Surprisingly, I didn't see a dramatic performance difference in the running of the regression tests. The runtime increased by 21 percent, a difference I subsequently decreased to 17 percent by only calling expand on tokens for which macros were defined. After some inconclusive profiling runs, I reasoned that the remaining speed difference could be an acceptable price to pay for an implementation guaranteed to be correct over one that was demonstrably incorrect, and decided to leave it at that. I thus continued my experiments with CScout and the Linux kernel source code.

At that point, things started going horribly wrong. Files that CScout could process in a few seconds took tens of minutes to complete. I was very surprised, because my regression tests also include the processing of real-world applications, like Brian Kernighan's original implementation of the awk language, and the performance drop in these tests was nothing like the orders of magnitude I was observing.

I reasoned that maybe some particular lines contained macros that were confusing the expansion algorithm, and inserted a debug statement to print the file and line number of every line being processed. Sitting in front of the screen watching the lines fly by, I stared in disbelief when I saw CScout pausing at a line for more than 10 seconds. I noted its number, and that of a couple of other lines that prompted similar pauses. All of them had a call to strcmp() in common. The corresponding GCC header files offered a hint for solving the mystery. In those files, strcmp() is defined through some fairly large and hairy definitions as a macro. Consequently, a simple invocation like strcmp(a, b) expands into more than 900 tokens (there's no point in holding your breath here):


__extension__ ({ size_t __s1_len, __s2_len; 
   (__builtin_constant_p (a) && __builtin_constant_p ( b) && 
   (__s1_len = strlen (a), __s2_len = strlen ( b), 
   (!((size_t)(const void *)((a) + 1) - (size_t)(const void 
   [... 24 similar lines removed ...]

b))[2]); if ( __s2_len > 2 && __result == 0) 
   __result = (__s1[3] - ((__const unsigned char *) 
   (__const char *) (  b))[3]); } } __result; }))) : 
    __builtin_strcmp (a,  b)))); });


This code compiles into fast and compact assembly code, but the expansion that led to it severely strained my new implementation of C macro expansion. The reason is a seldom-appreciated fact behind recursive algorithms. A naïve recursive implementation of an algorithm can change an algorithm's complexity class. You probably know that the most dramatic differences in a program's performance appears when switching between algorithms of different complexity classes. For example, substituting a linear search (complexity class O(N)) with a binary search (complexity O(log N)) cuts the half million operations required on average to search through a million elements down to 19. Conversely, implementing macro expansion (which should be linear in the number of tokens processed) through a quadratic algorithm would skyrocket the number of operations required to process the roughly 900 tokens of our strcmp sequence from 900 to 810,000.

So how did recursion make the complexity of my implementation of the algorithm quadratic? Recall that the two main functions of the algorithm work on sequences of the tokens to be expanded. When processing N tokens, these functions get invoked roughly N times. Now, if each function called constructs a list of N elements, that would make the algorithm's complexity O(N × N), which is quadratic. You may counter here, that I shouldn't be constructing separate data structures from scratch, but merely modifying the existing structures on the fly and passing references to those structures on each recursive call. Unfortunately, following this road in an undisciplined manner is a sure way to introduce subtle bugs. The reason my new code worked without errors was exactly because I implemented it in a pure functional style.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.