Programming Ruby: The Pragmatic Programmer's Guide
David Thomas and Andrew Hunt
Addison-Wesley, 2000
564 pp., $42.95
ISBN 0201710897
Program Development in Java: Abstraction, Specification, and Object-Oriented Design
Barbara Liskov and John Guttag
Addison-Wesley, 2001
433 pp., $49.95
ISBN 0201657686
The Interpretation of Object-Oriented Programming Languages
Iain Craig
Springer Verlag, 1999
254 pp., $79.95
ISBN 1852331593
MMIXware: A RISC Computer for the Third Millennium
Donald E. Knuth
Springer Verlag, 1999
550 pp., $59.00
ISBN 3540669388
Essential XML: Beyond Markup
Don Box, Aaron Skonnard, and John Lam
Addison-Wesley, 2000
368 pp., $34.95
ISBN 0201709147
XML Processing with Python
Sean McGrath
Prentice Hall, 2000
556 pp., $44.95
ISBN 0130211192
Presenting C#
Christoph Wille
SAMS, 2000
204 pp., $25.00
ISBN 0672320371
Women in Computer Sciences: Closing the Gap in Higher Education
Allan Fisher and Jane Margolis
Carnegie-Mellon University
http://www.cs.cmu.edu/~gendergap/index.html
It's been several months since I wrote my last set of book reviews, and a lot has changed in that time. The Software Carpentry design competition has finished, the company I work for in Toronto has been acquired, and I've spent a lot of time wondering why the gender gap among open-source developers is even wider than it is in computing as a whole. This hasn't left me as much time for reading as I would have liked, but the high quality of a few of the books that have come across my desk recently has more than made up for that.
The most popular books on this month's list will probably be Dave Thomas and Andy Hunt's Programming Ruby, and Barbara Liskov and John Guttag's Program Development in Java. The first of these is a comprehensive introduction to a Perl-like scripting language that has become very popular in Japan. As the authors' web site (http://www.pragmaticprogrammer.com/) explains:
You can think of Ruby as a mix of Perl and Smalltalk, or look at it as Python with full object-orientation. It features exception handling, closures, and iterators. Classes and objects can be altered and extended at run time. Everything is an object, including the basic types (such as numbers). Unreferenced objects are freed by a mark-and-sweep garbage collector. Ruby is portable too, and runs on a wide variety of systems. And, to cap it all, it has a simple, regular syntax.
Part I begins with a 12-page overview of the language's major features, then describes classes, blocks, I/O, and other fundamentals. Part II, called "Ruby in its Setting," shows how Ruby can be used for CGI scripting, GUI construction, Windows automation, and similar tasks. The remainder of the book consists of a language reference, a guide to the standard Ruby libraries, and some appendices.
As in their other book, The Pragmatic Programmer (reviewed here in March 2000), the writing is lucid, economical, and illuminating. There are many examples, all of which are easy to follow. While I would have liked more detail in some of the sections on applications, there's certainly enough here to jump-start newcomers. Overall, I give this book full marks, and I think that it deserves to become the standard reference for Ruby programmers.
Program Development in Java, by Barbara Liskov and John Guttag, is intended for use in a second-year course on software development methodology. It assumes that readers are already familiar with Java, and confirms that Java has become the dominant teaching language of the early 21st century. The first part of the book analyzes various forms of abstraction, including the use of procedures, abstract data types, type hierarchies, polymorphism, and iterators. Exceptions are covered early on, as are the differences between checked and unchecked exceptions. This material is the basis of the second part of the book, in which the authors look at specification, testing, requirements analysis, and other aspects of the design process. The last chapter introduces the notion of a design pattern, using flyweights, singletons, composites, and a few other simple patterns as examples.
Overall, the book is well written, well edited, and has a useful index. Its only weakness is that there are a few places where the authors make general points, but do not provide enough specific examples for their intended readers to understand the generalization. For example, while the material on testing against semiformal specifications is good, the sections on testing iterators, data abstractions, and type hierarchies get only a few paragraphs each. Very few of the college sophomores I know would be able to apply the general principles that Liskov and Guttag lay out to their particular problems without such illustrations.
Iain Craig's The Interpretation of Object-Oriented Programming Languages is also a tutorial, but is aimed at a much more advanced audience than either of the previous two books. Craig's aim is to introduce the full breadth of object-oriented programming to programmers who only know it through its relatively narrow incarnations in C++ and Java. In a compare-and-contrast style, Craig describes how systems such as Smalltalk and CLOS have implemented inheritance, delegation, polymorphism, and typing. He shows readers that there are real alternatives: the way most of us are used to doing things isn't the only, or even necessarily the best, way. While some explanations may be difficult to follow if you don't have some previous exposure to these other systems, the effort is definitely worthwhile.
And speaking of "more advanced audiences," the next book on this month's list is Donald Knuth's MMIXware: A RISC Computer for the Third Millennium. Knuth is one of the brightest minds ever to grace our so-called science. As his work on TeX and MetaFont shows, he is also an intensely practical master of the details of implementation.
Knuth's as-yet-unfinished master work, The Art of Computer Programming, is regarded by many as the definitive work on the analysis of algorithms. In MMIXware, Knuth describes a simulator for a virtual processor called "MMIX," whose instruction set will be used as the basis for the complexity analyses in the new edition of The Art of Computer Programming. MMIXware is written using the literate programming style that Knuth himself invented. Mathematics and Algol-like program statements are combined, indexed, and cross-referenced to create a text that is sometimes dauntingly dense, but packed with information. As with The Art of Computer Programming, only a few readers will have the stamina to go through this book from cover to cover, but it is a superb reference for anyone who intends to use the software it describes. If only open-source software were this well documented, it would have taken over the world long ago...
Of course, any mention of taking over the world these days brings XML quickly to mind. As Don Box, Aaron Skonnard, and John Lam explain in the preface to Essential XML, "[it] has replaced Java, Design Patterns, and Object Technology as the industry's solution to world hunger... This is especially ironic given the relatively humble origins of XML, which lie squarely in the world of document management systems."
Of course, XML has outgrown its HTML-plus roots, and is on its way to becoming a universal canvas that every kind of application can read and write. (This notion is explored in more depth in Jon Udell's report for the Software Carpentry project, at http://www.software-carpentry.com/Groupware/report.html.) Most of Essential XML is, therefore, devoted to navigation, transformation, schemas, and other data-exchange aspects of XML, rather than to its more traditional (in web years) use as an extensible successor to HTML.
Unfortunately, this book races through the normal cases too quickly and spends too much time on the darker corners of various standards and standards-in-waiting. While this may be useful for experts who want an in-depth look at the special cases and exceptions that have motivated those standards' more obscure features, most readers (including me) will find that it can take several minutes to work backwards from some passing remark to an understanding of how something specific is supposed to be done. The writing is good and the humor is dry enough to bear rereading, but with nine XML books now on my shelves, I am still waiting for one that I can recommend whole-heartedly.
I also have a hard time recommending Sean McGrath's XML Processing with Python. The first of its problems is that much of its background material (such as the two-chapter introduction to Python) is too rushed to get newcomers up to speed, and too shallow to tell experienced programmers anything they don't already know. I know it's tempting to try to broaden the potential readership of a book by including abbreviated introductions of this sort, but in my experience, they almost always suffer from this double fault.
The second of this book's problems is its focus on processing XML using McGrath's own token-per-line PYX system, rather than the standard SAX and DOM libraries. PYX is certainly interesting, as it converts XML to a format suitable for input to grep and other UNIX command-line utilities. However, given the book's title and the needs of most programmers, I think McGrath would have done better by showing readers how to drive SAX and DOM, particularly as this knowledge could then be applied in other languages as well. (In the last six months, for example, I have used C and Python versions of SAX, and C++ and Python versions of DOM.) While McGrath's book does include one chapter each on these libraries, even here, his main goal is to show how they can be used to generate his token-per-line notation.
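To give a flavor of the token-per-line idea, here is a minimal sketch that drives Python's standard SAX library to emit a PYX-style notation. It is not taken from the book, and the prefix characters ("(" for a start tag, "A" for an attribute, "-" for character data, ")" for an end tag) reflect my understanding of PYX rather than a checked specification. The names PyxLikeHandler and xml_to_pyx_like are hypothetical, chosen only for this illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PyxLikeHandler(ContentHandler):
    """Collect one token per line in a PYX-style notation (assumed format)."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def startElement(self, name, attrs):
        # One line for the start tag, then one line per attribute.
        self.lines.append("(" + name)
        for key in attrs.getNames():
            self.lines.append("A" + key + " " + attrs.getValue(key))

    def endElement(self, name):
        self.lines.append(")" + name)

    def characters(self, content):
        # Escape backslashes and newlines so each token stays on one line.
        # The parser may deliver text in several chunks, hence several lines.
        escaped = content.replace("\\", "\\\\").replace("\n", "\\n")
        self.lines.append("-" + escaped)


def xml_to_pyx_like(text):
    """Parse an XML string and return its tokens as a list of lines."""
    handler = PyxLikeHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.lines


if __name__ == "__main__":
    doc = '<book lang="en"><title>XML Processing with Python</title></book>'
    for line in xml_to_pyx_like(doc):
        print(line)
```

Because every token sits on its own line, a line-oriented filter is enough to pull out, say, all the start tags: keep the lines beginning with "(", which is the job grep would do on the command line.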
The last and least of this month's books is Christoph Wille's Presenting C#. C# and the .NET platform are an interesting, innovative, and well-designed marriage of Java's platform-independent execution and COM's object technology, and I am looking forward to working with them. Funnily enough, though, the word "Java" doesn't even appear in this book's index, and it glosses over the many compromises made in C# in order to allow it to recycle existing COM components. I suspect that Presenting C# is selling rather well, as it is the first bound description of Microsoft's new universal glue language to hit the market. I also suspect that it would have sold just as well if its author and publisher had been honest enough to put "Microsoft Corporation Marketing Department" on the front cover. If you think that independent publishers are supposed to be just that, independent, then I suggest you give this particular book a miss.
Finally, I enjoyed meeting a lot of people at the O'Reilly Open Source conference in July, and was particularly pleased to discover that Tim Peters is not actually a fictional character. However, I was disturbed by how few women were present in a technical capacity, even by our profession's low standards. Computing has always been an unwelcoming environment for women; a quick sample of several dozen SourceForge projects and the mail archives of four large open-source projects showed that open source has somehow managed to make itself even less congenial.
I was therefore grateful when a colleague pointed me to work that's being done to close this gap at Carnegie-Mellon University. Partly as a result of efforts by Jane Margolis, Allan Fisher, and others, the proportion of women entering the undergraduate CS program at CMU has risen from 8 percent in 1995 to 37 percent in 1999. You can find out more about their work at http://www.cs.cmu.edu/~gendergap/index.html; it's well worth the surf.
DDJ