Fifteen Books, Five Months Late
I'd like to start this lengthy set of reviews by apologizing to the authors and editors who've been waiting so patiently for me to tell the world about their work. Buying a house, getting married, selling a house, and taking on a new batch of graduate students is an explanation, but it's not an excuse.
I'd also like to apologize for the fact that these reviews aren't as detailed as they ought to be. The distractions listed above are part of the reason; the real cause is that academic life doesn't leave me time for programming, so I'm not able or qualified to delve into technical detail the way I used to. I miss it a lot, and worry that in another couple of years, I will be too out of touch to be able to guide my students. (And that a couple of years after that, I'll have to wear a tie to work...)
For now, though, I have a pile o' books here that you might want to read, and the one you'll probably like most is Charles Petzold's The Annotated Turing. My first reaction when I heard about it was, "Why didn't I think of that?" and my second was, "I wonder if he can pull it off?" The answer to the latter is definitely "yes", and I expect to see "AT" on shelves beside Gödel, Escher, Bach and other thinkalong books in years to come.
Petzold's idea is simple: take Alan Turing's classic paper "On Computable Numbers, with an Application to the Entscheidungsproblem"---the paper for which he invented the Turing machine---and interpolate enough explanation to make it accessible to a lay reader. The original paper is broken into chunks ranging in size from a line or two to half a page, and typeset on gray. In between, Petzold's commentary explains the background to Turing's work, why his "machine" has the features it does, what the significance of various parts of the proof is, and so on. It would be a great text for a sophomore course on computability, but it's also simply a fun read for anyone who's curious about the intellectual underpinnings of our field. Five out of five, and I hope it inspires some imitators.
I'd like a lot of people to imitate Neal Ford's The Productive Programmer as well. In fact, I'd like people to imitate Mr. Ford, and the whole point of this book is to make that easy. In a little over 200 pages, he describes the things he does that allow him to produce more working software per day than most of his peers. Some are micro-level tricks, like using clipboards that can hold multiple items. Others, like running code through state-of-the-art static analysis tools or practicing test-driven development, are higher level, but no more or less important. I came away from the book feeling like I'd just watched one of those cooking shows where you get to see exactly how a great pastry chef makes a pie crust that tastes so much better than yours. I'll probably never do everything Ford recommends, but I've already switched to a better desktop shortcut tool, and don't plan to switch back.
"Uncle Bob" Martin's Clean Code is similar in spirit, and just as worthwhile. Where Ford touches on everything a developer does in a working day, Martin focuses on what developers produce: code. I expected most of the topics, such as choosing good variable names and information hiding. What pleasantly surprised me was how many new things Martin had to say about them, and how well his examples illustrated his points. I particularly liked Chapter 14, in which he refactors a Java class for handling command-line arguments step by step. It's the clearest explanation of what refactoring is actually for that I've ever read, and I'm already using it in my software engineering classes. Don't let the table of contents fool you: no matter how experienced you are, there's enough in here to make owning a copy worthwhile.
Kent Beck's Implementation Patterns draws on an equal depth of experience, but focuses more on the ideas that go into the code. The author says in the introduction that it's meant to sit between the Gang of Four's classic Design Patterns and a Java language manual. I think he does himself a disservice: what he's actually done is catalog the mental building blocks people use to write sequential imperative software. The chapter on "Methods", for example, includes a few paragraphs on each of 23 micro-patterns, including:
- Composed Method---Compose methods out of calls to other methods.
- Intention-Revealing Name---Name methods after what they are intended to do.
- Conversion Constructor---For most conversions, provide a method on the converted object's class that takes the source object as a parameter.
It's tempting to say, "Well, everyone knows that," but of course everyone doesn't, and even if they did, categorizing and naming the obvious often reveals a lot that isn't. As I read the book, I thought about how cool it would be if the status bar in Eclipse could tell me which of these micro-patterns I was using in real time as I typed. It would be a great teaching tool, and would keep a lot of corner-cutting programmers (myself included) honest.
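To make the three micro-patterns listed above concrete, here is a minimal sketch. Beck's book works in Java; this version uses Python, and every name in it (Celsius, Fahrenheit, from_celsius, is_freezing, describe) is a hypothetical invented for illustration rather than taken from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Celsius:
    degrees: float


@dataclass(frozen=True)
class Fahrenheit:
    degrees: float

    @classmethod
    def from_celsius(cls, source: Celsius) -> "Fahrenheit":
        # Conversion Constructor: the converted object's class provides
        # a method that takes the source object as a parameter.
        return cls(source.degrees * 9 / 5 + 32)


def is_freezing(temperature: Fahrenheit) -> bool:
    # Intention-Revealing Name: the name says what the caller wants to
    # know, not how the answer is computed.
    return temperature.degrees <= 32


def describe(reading: Celsius) -> str:
    # Composed Method: the body is a short sequence of calls to other
    # methods, each doing one small job.
    converted = Fahrenheit.from_celsius(reading)
    label = "freezing" if is_freezing(converted) else "above freezing"
    return f"{converted.degrees:.1f}F ({label})"
```

Calling describe(Celsius(100)) converts the reading, classifies it, and formats the result, and each of those steps can be read and tested on its own.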
Stepping back for a moment, Ford, Martin, and Beck's books are all trying to teach a way of seeing the world. This is much harder than teaching the syntax of Python 3.0 or how to configure Basie, and it's very easy for authors who try it to start preaching. (I know, because that's what I do.) FM&B all have very definite opinions on how you should think when you're programming; what makes all three books worthwhile is that they set these opinions on the dinner table and hand you a knife and fork, rather than trying to force-feed you or persuade you that yes, you really do like pickled beets.
Lindberg's Intellectual Property and Open Source also tries to convey a particular way of thinking. In this case, the "way" is the one embodied in America's legal code, which, like every other legal code, is contradictory, biased, and out of date. As you can guess from the title, Lindberg's target is software developers who know at most a few basic terms (and are probably even confused about some of those). The book is divided into two parts: an eight-chapter introduction to IP law that covers patents, trademarks, copyrights, trade secrets, and their interaction with open source, and six "how to" chapters to help you figure out who owns your idea (and patches that other people submit), apply a license to your code, skirt around the landmines of reverse engineering, and formalize your project.
The writing is clear, and the examples accessible; my only complaints are that some of Lindberg's analogies are a bit of a stretch, and that like most books in this area, his only covers the US. Those quibbles aside, I really enjoyed it, and think it deserves a place beside Karl Fogel's Producing Open Source Software on every open source developer's bookshelf.
Next up are five Pythonic books. Younker's Foundations of Agile Python Development and Ziadé's Expert Python Programming overlap in many ways: both have chapter-long introductions to version control, talk about packaging Python applications for distribution, preach the agile gospel, and so on. The major difference is that Ziadé's book devotes more space to the advanced features of Python itself, while Younker devotes that space to database programming and setting up build farms. It would be worth browsing either a few months after starting your first big Python project, just to make sure you hadn't missed anything, but if you have read The Pragmatic Programmer or any of its kin, you will already have seen half or more of their contents. In addition, Ziadé's book could use a closer proof-reading: some of the examples have been incorrectly indented during typesetting, and if you don't already understand decorators, the description in Chapter 2 isn't going to make a lot of sense.
Copeland's Essential SQLAlchemy and Bennett's Practical Django Projects aren't about Python per se, but rather about two popular programming tools built on top of it. SQLAlchemy is a full-featured object/relational mapping tool that does a very good job of managing persistence, thanks in large part to creative use of Python's metaprogramming features. I've never used more than a small subset of SQLAlchemy's features, but this book laid out the rest (especially inheritance handling) clearly and in a logical order.
Django, on the other hand, is the most popular of several Rails-style web application frameworks for Python (but uses its own ORM, rather than SQLAlchemy, which tells you all you need to know about why Python's various offerings are still eating Rails' dust). While Bennett's book was written before the final Django 1.0 release, the examples all seem to work with 1.0. The writing is clear, and it isn't bedevilled by the typos that made Holovaty and Kaplan-Moss's Definitive Guide to Django so frustrating. Like Essential SQLAlchemy, it is a solid, if somewhat predictable, introduction to its subject: here's how to install, here's a "hello, world" application, here's what you need to know to write something that's actually useful, and so on. I wouldn't have minded a few more screenshots, but on the other hand, not having them did force me to actually run more of the code.
The last book in this batch is Kinser's Python for Bioinformatics. The second part of the title is more important to the author than the first: Kinser's aim is clearly to help scientists do things like analyze gene sequences, and Python is "just" a useful tool for doing that. Thus, there are chapters on dynamic programming and text mining, rather than on generators or building distribution packages. I think his just-in-time approach will work well for his intended audience, and the extensive examples are a good way for programmers to learn a little bioinformatics.
At least, that's what the blurb on the back promises. The actual content was a mixed bag, ranging from unnecessarily-detailed descriptions of the language's syntax (complete with railroad diagrams) and commentary on the APIs of some built-in types to fairly advanced discussion of how closures and objects work, and how best to use them. A month after finishing it, I'm still not sure who the intended audience is: many parts will leave newcomers bewildered, while experienced programmers will frequently be bored. It isn't even really the guide to good practice that the title and blurb suggest, as there is too little discussion of why you would do things one way or another.
Kumaravel et al.'s Professional Windows PowerShell Programming assumes readers already get this, and spends most of its time explaining how to extend PowerShell with new capabilities. You'll need to know a bit about .NET programming to follow the examples, but the payoff is being able to build new power tools with just a few dozen lines of code. It definitely isn't your grandmother's command line any more...
Finally, there is Allemang and Hendler's Semantic Web for the Working Ontologist, a (very) detailed introduction to the semantic web's approach to modeling data. You won't find a lot of code in the traditional sense in this book; instead, the authors present one real-world data management problem after another, and show how to represent and solve it using RDF, SPARQL, and related technologies. A friend of mine with a master's degree in library science littered her copy of this book with sticky notes, some of which had double exclamation marks on them. I wasn't quite as enthusiastic, but that's probably just a reflection of the fact that I don't usually have data complex enough to need this depth of analysis. If I ever get around to rewriting Data Crunching, though, I'll go through this book again very carefully: while the authors occasionally lose the forest in the trees, they are very careful to motivate every new twist and wrinkle they introduce, and their "challenge problems" do a good job of testing the reader's understanding of the material.
And that's all---fifteen books, read over five months, and reviewed in as many days. As I said at the outset, I don't have time to program any more, so I may have missed some crucial details. If so, I apologize in advance; corrections are always welcome. Until then, it's a beautiful day outside, and I'm going to take my daughter to the park to play on the slide. The big slide, mind you, not the little one---it really does make a difference, and I'm very happy to be re-learning it.
- Dean Allemang and James Hendler: Semantic Web for the Working Ontologist. Morgan Kaufmann, 2008, 0123735564, 352 pages.
- Kent Beck: Implementation Patterns. Addison-Wesley, 2007, 0321413091, 176 pages.
- James Bennett: Practical Django Projects. Apress, 2008, 1590599969, 256 pages.
- Rick Copeland: Essential SQLAlchemy. O'Reilly, 2008, 0596516142, 230 pages.
- Neal Ford: The Productive Programmer. O'Reilly, 2008, 0596519788, 222 pages.
- Rawld Gill, Craig Riecke, and Alex Russell: Mastering Dojo. Pragmatic Bookshelf, 2008, 1934356115, 568 pages.
- Jason Kinser: Python for Bioinformatics. Jones & Bartlett, 2008, 0763751863, 417 pages.
- Arul Kumaravel, Jon White, Maixin Li, Scott Happell, Guohui Xie, and Krishna C. Vutukuri: Professional Windows PowerShell Programming. Wrox, 2008, 0470173939, 336 pages.
- Van Lindberg: Intellectual Property and Open Source. O'Reilly, 2008, 0596517963, 390 pages.
- Robert C. Martin: Clean Code: A Handbook of Agile Software Craftsmanship. Prentice Hall PTR, 2008, 0132350882, 464 pages.
- Charles Petzold: The Annotated Turing. Wiley, 2008, 0470229055, 384 pages.
- Jeff Younker: Foundations of Agile Python Development. Apress, 2008, 1590599810, 416 pages.
- Tarek Ziadé: Expert Python Programming. Packt Publishing, 2008, 184719494X, 372 pages.