Peer Code Reviews: Two Heads Are Better Than One

By Jonathan Erickson, May 08, 2007

Peer code reviews uncover defects and ensure maintable code.

DDJ: We're pleased to be joined by Jason Cohen, founder of Smart Bear Software. Jason, in your book Best Kept Secrets of Peer Code Review you talk about the "largest-ever case study of peer code reviews." Can you tell us about that project.
JC: Sure. Over a 10 month period at Cisco Systems, 50 developers performed 2500 reviews on 3.2 million lines of code. This was on production code in the LiveMeeting project, not an academic setting. Developers were in San Jose, Boulder, Hyderabad, and Budapest.

This was a study of lightweight code review using our tool, Code Collaborator. This means no meetings, no code print-outs, and no stop-watches -- not formal inspections -- but still we needed to collect metrics, produce reports for the internal Cisco code review system, get defect logs, and otherwise perform the usual mechanics of critical review.

Just using a lightweight process cut 4 hours out of every review compared with formal inspections (by replacing the reading and meeting phases with our collaborative platform); other tool features cut out another 2.5 of drudgery. Defect densities didn't change -- this makes sense because it's still humans looking for bugs -- but the speed of the review meant that more code could be reviewed.

The result was that 100 percent of the code changes during that 10-month period were reviewed. And of course that meant far more defects were removed than if they used formal inspections and were able to review a small fraction of that.

DDJ: At minimum, it would seem that a peer code review can be nothing more than having someone else look at your code. Why do you need special tools for something like this?
JC: At minimum, that is all you need! Certainly an over-the-shoulder or e-mail pass-around review process is much better than nothing, and both are relatively easy to implement.

However there are many draw-backs to the simple methods that a tool-based method addresses. Specifically:

Is code being reviewed? Without a record of the review you cannot know what's been reviewed. A tool can provide this record without developers having to do a lot of paperwork.
Metrics. How many defects are we finding, in which files, how quickly, ...? With ad-hoc methods you'll never get an accurate defect count, defect disposition (severity, type), or measure of how much time people are spending in reviews. A tool can measure all these things fully automatically, so developers aren't burdened but managers and process analysts get the data they need.
Reviewing over an ocean. Simultaneous review is a pain over many time-zones, and email threads can take a week to resolve, by which time all the code-diffs and conversation threads become a mess. A tool can keep conversations in context with specific lines of code so it's easy to remember what's going on.
Collecting materials. Developers have to generate diffs from version control, or extract "previous" and "current" versions and send them around. A tool with version control integration can collect/package/deliver this content with one click or command-line and display the results nicely for reviewers. No more referring to "file /foo/bar/bat on line 402".
Are defects being fixed? Usually with simple methods defects are identified but no one double-checks that indeed problems were fixed and no new defects were opened in the process. Verification can be very fast because the reviewer is usually expecting a specific fix, and a tool can highlight those fixes specially so verification can be fast.
When is the review complete? If you're passing around emails, when is the conversation over? How does the author of the changes know when it's okay to check in the code? A tool can track this with the easy of a bug-tracking system.

DDJ: What kind of useful metrics are most useful?
JC: It's a great question, because metrics can be used for good or evil. The most useful metric we've found is defect density, and the ones to stay away from are people-centric.

"Defect density" measure the number of defects per unit amount of code, typically per 1000 lines (or kLOC). Some people think "line of code" should mean "line of executable code," disregarding whitespace, comments, and stylistic pieces. However for code review we've found that it's important to include all lines in a file. For example, it's common to find a problem in a comment, or require a better comment.

One of the more useful applications of defect density is in looking for risky code. If you look at the defect density per file or module, some items pop up to the top. High defect density means: If you change even a little bit of this code, there's a great chance for a bug. There could be a lot of reasons for this -- the code is complex, or perhaps this is a core module which 80 percent of the code base indirectly depends on, so every little nit gets identified and removed. Doesn't matter why -- all of this points to "risk." Knowing which code is risky is of course useful, even for knowing how many reviewers you want to put on the job.

When you get into personal metrics you get into trouble. Examples would be "How fast is this person reviewing code?" or "How many defects does this person typically inject per 1000 lines changed?" It can be tempting to say that a developer is "bad" because he injects a lot of bugs, but we've found this to be incorrect and dangerous.

For example, say you have a piece of code you know is risky but it needs to be changed. Who do you put on the job? Your best people. And then you are extra careful during review, uncovering lots of "defects" because you're calling out every little thing -- more so than you would with other code.

The metrics of this event is that this developer creates tons of defects, but of course this is a result of the situation and has nothing to do with the developers' ability or contribution to the product.

DDJ: Is there a web site readers can go for more information about code reviews?
JC: There are lots of sites explaining how some particular group does review. For more general sites, there's always Wikipedia. Robert Bogue's article Effective Code Reviews Without the Pain describes how to approach reviews. Macadamian has some ideas about what to do with a checklist. And, of course, our own web site has lots materials including an entire book on the subject (including the results from that Cisco study). Finally, we have many many whitepapers and articles.

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Peer Code Reviews: Two Heads Are Better Than One

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Peer Code Reviews: Two Heads Are Better Than One

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content