From ActiveX to Cargo-Cult Science
Greg is the author of Practical Parallel Programming (MIT Press, 1995), and co-editor with Paul Lu of Parallel Programming Using C++ (MIT Press, 1996). Greg can be reached at [email protected]
Understanding ActiveX and OLE:
A Guide for Developers and Managers
David Chappell
Microsoft Press, 1996
328 pp., $22.95
Garbage Collection: Algorithms
for Automatic Dynamic Memory Management
Richard Jones and Rafael Lins
John Wiley & Sons, 1996
377 pp., $60.00
Modern Compiler Implementation
in Java: Basic Techniques
Andrew W. Appel
Cambridge University Press, 1997
398 pp., $29.95
Growing Artificial Societies:
Social Science from the Bottom Up
Joshua M. Epstein and Robert Axtell
MIT Press, 1996
208 pp., $18.95
Java Design: Building Better Apps and Applets
Peter Coad and Mark Mayfield
Yourdon Press/Prentice Hall, 1997
238 pp., $39.95
Any month in which I read more good books than bad is a good month. Right now, I'm running well in the black, having been absorbed for many hours in three useful, well-written books, while having wasted only an afternoon each on two that were better off as trees.
The most useful book on this month's list is one that I almost passed over because of its subtitle and glossy production. Understanding ActiveX and OLE: A Guide for Developers and Managers, by David Chappell, looks like it was written for people who have never had the time to learn how to program well. The edges of each page are delicately shaded with mauve, while the back cover proclaims that the book is "Easy to browse, with colorful illustrations and 'fast track' margin notes..."
I never did find out what makes a "fast track" margin note different from a normal margin note, but that's one of the few things that Chappell doesn't explain in these 300-odd pages. OLE, and its ActiveX derivative, are the most powerful (and most confusing) technologies to come out of Microsoft. Having started as a mechanism for embedding one document in another, OLE has turned into a general-purpose "software bus," and now supports remote procedure calls, multilingual programming, the World Wide Web, and just about everything else. Frequent changes in direction and terminology mean that some ideas have masqueraded under three or four different names, while some terms have meant different things in even- and odd-numbered years.
Chappell's book organizes this confusion into 11 clear, cohesive chapters. After tracing the technology's evolution, he describes each of its several major uses. The book contains almost no code, but that is actually a sign of strength: It is far harder to write clearly about something than to fill the equivalent space with two or three pages of code, and let the reader sweat. Having read the first six chapters, I was ready to go back and wrestle with Microsoft's own abysmal example programs once more; having done that, I'd advise any programmer who is about to wade into Brockschmidt's Inside OLE, or into someone else's OLE code, to read Chappell's travel guide first.
While garbage collection is almost as old as high-level programming languages -- the first papers on it were published in 1960 -- only a small fraction of programmers have ever worked with garbage-collected languages. In part, this has been because of garbage collection's historic association with interpreted, high-level, and (most importantly) slow languages such as Lisp. Garbage collection has also been ignored because its overhead doesn't pay for itself in small programs. If a program is small enough to be written by a handful of people in a few months, its authors can probably keep track of the memory they allocate and free it when it is no longer needed. In larger programs, or programs with more authors, this becomes increasingly difficult. For proof of this, you need only look at the success of companies like Pure Atria, which produces a tool for finding and fixing memory leaks.
Whatever else Java has accomplished, it has finally brought garbage collection into the mainstream. The efficiency and correctness of garbage collection algorithms are henceforth going to be of concern to hundreds of thousands of programmers; those who really care about them could do no better than to start with Garbage Collection: Algorithms for Automatic Dynamic Memory Management, by Richard Jones and Rafael Lins. After an introductory chapter on the history of the subject, and some motivational material, Jones and Lins present the three classical algorithms: reference counting, mark-sweep, and copying. Chapters 3 to 12 then present successively more-intricate extensions to these algorithms, including pointer reversal, the two-finger algorithm, Cheney's algorithm, and (most importantly) generational garbage collection, which has done as much as advances in compilation techniques to improve the performance of high-level languages such as Scheme and ML. Each chapter ends with a section titled "Issues to Consider," which sums up and analyzes the points made in the preceding pages. Together, these sections transform the book from merely a thorough survey into the sort of comprehensive engineering manual that is so rare in computing.
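For readers who have never met these algorithms, mark-sweep, the second of the three classical ones, can be sketched in a few lines of Python. This is an illustration only, not code from Jones and Lins: the names Obj, mark, sweep, and collect are invented for the example, and a real collector works on raw memory rather than on Python lists.

```python
class Obj:
    """A heap cell: references to other cells plus a mark bit (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Mark phase: flag every cell reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep the marked cells and clear their mark bits."""
    live = [obj for obj in heap if obj.marked]
    for obj in live:
        obj.marked = False
    return live


def collect(heap, roots):
    """One full collection; returns the cells that survive."""
    mark(roots)
    return sweep(heap)
```

If two cells point only at each other and nothing reachable from a root points at either, collect discards both. Plain reference counting would keep them forever, since each still has a nonzero count.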
Had the word "Java" not appeared in the title, I wouldn't have bothered ordering Modern Compiler Implementation in Java, by Andrew Appel. I have been looking for evidence of people reorganizing undergraduate computer science curricula to use Java as a lingua franca, in the way they used Pascal during the 1970s. I was also familiar with Appel's reputation within the high-level language community.
Based on this book, the reputation seems well deserved. For a start, this is one of the first compiler texts I have seen that presents, at an undergraduate level, developments from the mid-1980s onward. There are chapters on lexing, parsing, and abstract syntax, but there are also chapters on basic blocks and traces, instruction selection algorithms, liveness analysis, and register allocation (a topic that wasn't even in the index of the text I used 15 years ago). This breakdown of topics is a good reflection of where a compiler writer's time actually goes. Most books, on the other hand, tend to concentrate on what we already understand (57 flavors of parsing) rather than what we actually do.
The second half of this book covers several advanced topics, including garbage collection, object-oriented languages, functional programming, dataflow analysis, and loop optimization. In the second edition of this book, which is due out within the year, Appel intends to expand these sections; topics that may be covered include how to handle errors during parsing or semantic analysis, how to compile exceptions, and how to deal with concurrency.
Despite its thoroughness, its focus on what really matters, and its clear prose, Modern Compiler Implementation in Java does have two weaknesses. The first is its use of a made-up language called "Tiger" for its examples, even though the exercises are written in Java. While there are good reasons for using a simple-to-compile toy language in an undergraduate course, I think that students would get more out of the book if its starting point were a subset of Java. The second is that Appel uses Java as if it were a mostly functional language, such as Scheme or ML. While this may be a better way to program, I think that many Java programmers will find his style odd (particularly his Scheme-like indentation).
So much for the good; now for the bad. On the first page of Java Design: Building Better Apps and Applets, Peter Coad and Mark Mayfield say, "Java design is profound. It has forever changed how we think about object models and scenarios." Well, no. Java is yet another object-oriented language that happened to be in the right place at the right time. Its major users so far seem to have been computer book authors, for whom it is an excuse to repackage their previous work in the hope of wringing a few more dollars out of the folks who shop by doing keyword searches at Amazon.com. One day, there will be a good book on object-oriented design that happens to use Java for its examples (see http://www.interlog.com/~gvwilson/unwritten.html), but this hurried repackaging of Coad's earlier work is most certainly not it.
The basis of Growing Artificial Societies: Social Science from the Bottom Up, by Joshua M. Epstein and Robert Axtell, is intriguing: The authors built and ran a variety of cellular automata (souped-up versions of the Game of Life) in an attempt to model the development of simple social activities, such as trading. Their thesis has become a familiar one: Simple entities, interacting through simple, local rules, can produce very complicated behavior.
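To show how little machinery "simple, local rules" requires, here is a minimal Python sketch of one generation of Conway's Game of Life. It is not the authors' model, which adds agents, resources, and trading on top of its grid; the function name life_step and the choice to represent the grid as a set of live (x, y) coordinates are assumptions made for this example.

```python
from collections import Counter


def life_step(live):
    """Return the next generation for a set of live (x, y) cells."""
    # Count, for every cell adjacent to a live cell, its live neighbors.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is live next turn if it has exactly three live neighbors,
    # or if it is live now and has exactly two.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }
```

Each cell looks only at its eight neighbors, yet repeated application of this one rule yields oscillators, moving patterns, and much more.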
That thesis is worth investigating. However, such an investigation is only meaningful if it is done scientifically. While this book is written in the footnote-heavy style of science, science it ain't. Cellular automata can indeed generate complex behavior; the problem is, how do you determine what, if anything, that behavior means? A pendulum is billions of simple entities (atoms) interacting through simple rules (electromagnetic forces and gravity); does that mean that the swinging motion of a pendulum somehow tells us something profound about the business cycles of capitalist economies? By changing the parameters in the authors' "Sugarscape" worldlet, you can get its little agents to migrate, to trade, and so on. But what the authors don't report is how many combinations of parameters they tried that didn't produce behavior that could be given an intriguing label, or how they justify their program's mixing of different spatial and temporal scales -- in short, all the things you would need to know to judge for yourself how significant their results really are.
Sadly, Growing Artificial Societies: Social Science from the Bottom Up is an example of what Richard Feynman called "cargo-cult science." Its authors enact the rituals of science without seeming to understand the reasons for those rituals. One sign of this is that, like some of the artificial intelligence researchers I encountered in the mid-1980s, they use words such as "suggest" a lot, knowing that people are very suggestible. A lot of good scientific research is being done on cellular automata (Lattice-Gas Cellular Automata, by Rothman and Zaleski, Cambridge University Press, 1997, for example). I hope that someone will one day do similarly good work in Epstein and Axtell's field.
Copyright © 1997, Dr. Dobb's Journal