We are living in an age of unprecedented language creation. Between the explosion of languages on the JVM and the new native languages, we find ourselves with a happy surfeit of very interesting choices. These options are not just the toy creations of comp-sci undergraduates, but sophisticated products with extensive libraries and active communities. Where they tend to be weak, however, is in tooling. And unfortunately, for many language developers, tooling is a metaphor for the coding front end: They strive to create editor plugins to provide basic syntax assistance. The more important support for debugging is often consigned to the use of printf-like statements to dump trace statements and variables' contents to the console.
I have always found this substitution of printf for debugging to be a profoundly wrong conflation of two concepts. Yet, because we've all had the experience of using printf or its equivalents to help chase down bugs, we tend to go along with the proposal. Some well-known developers even proclaim their preference for print statements. Consider this statement from "Uncle Bob" Martin in a post decrying the use of debuggers: "The kinds of bugs I have to troubleshoot are easily isolated by my unit tests, and can be quickly found through inspection and a few judiciously placed print statements." (I'm not singling out Martin here. He's certainly not the only person who holds this opinion.)
There are multiple aspects of printf statements that make them very poor substitutes for a debugger and, in fact, at times dangerous tools.
Location. Martin advises "judiciously placed print statements." Well, if you're in a serious debugging mode, judiciously placing printf is a very difficult thing to do. It implies some strong knowledge of the cause of the defect you're chasing. My experience is that, frequently, you get the first attempt at printf wrong, and then must start to guess where else to place the statements. Sometimes, it's not even guessing: You need to put them at several upstream points to coarsely locate where a variable unexpectedly changes value. Finally, when you get the right location, you must then add new statements to track down why the variable is changing. It's a mess that brings me to the second point.
Time cost. Every print statement means another compilation and link step. Because of the time this consumes, there is considerable motivation to put in many more statements at each pass, so as to trap the defect wherever it might possibly occur. The result is code literally littered with dump commands.
Complexity. While conceptually nothing is simpler than dumping a variable to the console, in fact, it's no trivial matter. This is particularly true of data structures, especially those containing pointers. Now, the print statement must explain what it's dumping and format it correctly. Really subtle bugs might require multiple lines of extracting and formatting code before the dump statement is useful. Debuggers handle this transparently and allow you to walk lists and arrays with no difficulty.
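To make the cost concrete, here is a minimal sketch of the extra code needed just to dump a small linked structure. It is written in Python purely for illustration (the argument above is not tied to any one language), and the names `Node` and `format_list` are hypothetical, not taken from any library or codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical singly linked list node, used only for illustration."""
    value: int
    next: Optional["Node"] = None


def format_list(head: Optional[Node], label: str) -> str:
    """Build the text a trace statement would have to print for a linked list."""
    parts = []
    seen = set()  # ids of visited nodes, so a corrupted (cyclic) list cannot loop forever
    node = head
    while node is not None:
        if id(node) in seen:
            parts.append("<cycle>")
            break
        seen.add(id(node))
        parts.append(str(node.value))
        node = node.next
    if not parts:
        return f"{label}: <empty>"
    return f"{label}: " + " -> ".join(parts)


if __name__ == "__main__":
    head = Node(1, Node(2, Node(3)))
    print(format_list(head, "queue"))  # prints "queue: 1 -> 2 -> 3"
```

Even this toy case needs a traversal loop, a label, and a guard against a cyclic list, all of which is code that exists only to support the dump and that a debugger's variable view would make unnecessary.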
Heisenberg effect. Print statements can occasionally have unintended consequences on the executing code. This is particularly true in parallel programming, where output from print statements in different threads can be written to the console simultaneously, turning the data into unreadable garbage. To avoid this, some kind of mutual exclusion becomes necessary, which immediately changes the execution pattern and performance profile of the code. (Not to mention the complexity of setting this up.)
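The mutual exclusion described above might look like the following sketch, again in Python and again with hypothetical names (`trace`, `run_workers`, `_trace_lock`). It assumes a handful of worker threads that each emit a few trace lines to a shared stream.

```python
import io
import threading

# Hypothetical module-level lock guarding the shared output stream.
_trace_lock = threading.Lock()


def trace(stream, worker_id: int, message: str) -> None:
    """Write one complete trace line while holding the lock.

    The line is deliberately written in several pieces; without the lock,
    pieces from different threads could interleave.
    """
    with _trace_lock:
        stream.write(f"[worker {worker_id}] ")
        stream.write(message)
        stream.write("\n")


def run_workers(stream, worker_count: int = 4, steps: int = 3) -> None:
    """Start worker_count threads that each trace `steps` lines, then wait for them."""
    def work(worker_id: int) -> None:
        for step in range(steps):
            trace(stream, worker_id, f"step={step}")

    threads = [threading.Thread(target=work, args=(i,)) for i in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    out = io.StringIO()
    run_workers(out)
    print(out.getvalue(), end="")
```

The lines come out intact, but only because every thread now queues on `_trace_lock` each time it traces. That serialization is exactly the change in execution pattern the paragraph above warns about: the instrumented program no longer schedules its threads the way the original did.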
Clean up. Congratulations, you found the bug! Now, it's time to clean up your print statements. Unless you were very careful about tracking where you placed them, chances are fair that you'll miss one or two. When one shows up in testing later on, you'll need to dive back into the code for a quick nip and tuck. In practice, however, these statements, especially if they include formatting logic for dumping data structures, aren't actually removed. Rather, they are simply commented out in case they're needed again, thereby leaving unneeded trash strewn throughout the codebase.
In almost every dimension, the dumping of variables to the console is an inferior alternative to using the debugger. It takes just as long, if not longer, to find defects, and the practice inserts detrimental artifacts into the codebase.
When I hear developers say that they're happy debugging with print statements (frequently stated with the inflection of "I'm old school and like doing things with simple tools"), I know that they're not likely to be working on large projects and are definitely not working with parallel code. Likewise, when I hear of a new language that recommends print statements as the way to debug the code, I want to address the suggestion by asking, "Why don't you just come out and say it? You don't support debugging yet."
— Andrew Binstock
Editor in Chief
Twitter: platypusguy