Channels ▼
RSS

Web Development

Coding for Readability


Charles is a freelance web programmer and can be contacted at cclarkson@htcomp.net.


While contributing to several Perl beginner e-mail lists, I noticed that many posts are from people who have little or no formal training in computer programming. As a fellow uneducated programmer, I think poor habits can be unlearned.

I know good code when I see it. It doesn't have to conform to my personal programming style—my style is biased by my history of programming. At 43, I come from a Basic background with little formal training as a computer programmer. When I started using Perl, I had to abandon some of the style I found common, but I had previously switched between enough languages to find that relatively easy.

One important concept when writing programs is to keep the reader in mind. Computer programs are not written for machines. They are written for people, and therefore, style matters. We too often concentrate on the needs of the Perl interpreter, rather than the needs of people. If we write programs for people, we'll become better programmers.

Coding is Writing

Good programs tell a story. They have a setting, a plot, and characters. Good stories are built on writing tools: grammar, language, style, and the like. When programming, we use similar tools: logic, reuse, style, and so on. Better programmers are those better able to use existing tools and create missing tools.

One key to writing good stories is developing a writing style. Authors develop a basic writing style they prefer, and then tend to write that way consistently. Reading a few stories by one author tells us a lot about the author as well as the characters in the book.

Good programmers need the same commitment. Develop your style. Refine it with logical rules. Stick to it consistently. When someone reads your program, they can tell how organized, consistent, and concise you are. Readers can learn a lot about you from your programs.

Some stories are just dialogue in that the author doesn't add description; for example, Shakespeare. You have characters acting out parts, but everything descriptive is said through the actions of the players, the scenery, and their words. Other stories are written with description mixed in with dialogue. You know what a sunset looks like because the author describes it in detail.

One key to writing good code is mixing just enough code (dialogue) with comments (description). Some code needs comments—we can't tell what is going on by simply reading the code. Meanwhile, other code has comments that are unneeded or easily sidetrack the reader.

There are some elements of my personal programming style that are my personal bias (avoiding excessive comments, parentheses, and underscores in variable names) and some things that should be a part of everyone's programming style (well-structured subroutines of limited, clearly defined purpose, and error checking).

Whatever you do, you should be consistent. If you use a cuddled } else { in one part of script, use it in all parts. If you prefer $VariablesLikeThese to $variables_like_these, that's fine—just avoid mixing both styles in one script. I like to maintain the same naming conventions for subroutines as I use for variables.

Whitespace and Comments

Another area to consider is whitespace. Too little whitespace makes scripts look disorganized and confusing. Imagine the difficulty in reading this article if it had only one paragraph. You would be distracted and have a tough time finding your place. Because I group my words into sentences, and those sentences into paragraphs in (one hopes) meaningful groups, you can more easily digest my meaning.

Readability doesn't always mean lots of comments. Most people use either too many comments or not enough. Beginners who use too many comments are often making up for poorly defined style. Those who use too few comments either don't comment at all or falsely believe they will understand everything when they come back to it later. When writing a comment, I use formal sentences, correct grammar, and punctuation. I place a space between the # and the first word and avoid unexplained jargon. A good comment can make an especially tough piece of code easy to comprehend, and an inappropriate comment can turn good code to the dark side.

Don't use comments to make up for poor naming conventions. If you can give a subroutine a reasonably concise name that adequately describes its function, you might not need to comment it. Nor should your comments repeat the obvious.

Getting Help

If you're a Perl beginner, style becomes especially important when you (inevitably) go to the Internet for help. A readable program will allow others to parse your code for problems more quickly, and will make it more likely that you will get the help you need. Avoid using long lines if you are preparing to ask for help. When I write posts for USENET or e-mail newsgroups, I try to keep my lines shorter than 66 characters across. This aids anyone replying by not having indented lines wrap. In practice, I usually use 72-character-length lines as they print without line wrapping. I mention this because many people use trailing comments like those common to assembler programming. When asking a question on a list, keep your audience in mind (Perl gurus and programmers). Paying attention to their needs gets questions answered.

Dissecting a Subroutine

Let's look at a few common mistakes and their corrections. Take a look at Example 1, which is a subroutine that was pulled off a larger program on USENET. Pay attention to the style more than the errors in the use of Perl.

Do we really need a subroutine description? More importantly, do we need that subroutine description? Look at the name of the subroutine; checkLogin is pretty descriptive. Most readers will immediately know what the script is doing. Let's peek ahead at the other comments. The next one is a trailing comment (I've removed the code part for brevity—see the full line in Example 1):

# split $line, using : as delimiter

Note that the comment was written for a different revision of the script. These types of comments are a maintenance problem. We can safely assume that others reading the script will know that split() splits something. We can also assume that one of the arguments to the function may be the character(s) to split on. We can safely drop this comment and be assured the program will be as easy to read. We can also safely drop the last comment three lines down. Also, using better names for the fields could lead to more descriptive code with no comment:

my( $user, $password ) = ( split /\t/, $_ )[1, 2];

Notice the author's style in determining variable names. Itlookssortoflikethis. The subroutine name uses a lowercaseFirstMixedCase style. One key to good style is consistency. Although I dislike mixed-case style, let's stick to it. $loginname becomes $loginName and $matchokay becomes $matchOkay.

The author uses a two-space indent inconsistently. We'll stick with the two-space indent, but we'll need to fix the inconsistent if statement structure. Let's separate those statements out with the original formatting left intact; see Example 2(a)

We have two distinct styles here. Notice that the inner if in the example is not indented two spaces. In fact, it is out-dented two spaces. That's inconsistent with indenting in the rest of the subroutine. Notice also that it is sometimes okay to cuddle the result statement while other times it is not. Perhaps that's because of the comment. Let's rewrite these, using a consistent style; see Example 2(b).

My style includes opening all files the same way. Unless I'm using the filehandle somewhere far from its opening, or the name is already taken, FH is just fine as the filehandle name. I have an editor macro just for opening files in the same way each time. To use it in this subroutine, we would define $file as $SETTINGS{userfile}:

my $file = '';
open FH, $file or die qq(Cannot open "$file": $!);

close FH;

I have a consistent variable use policy in my style. I attempt to use the least number of variables that will allow my script to run and remain human readable. One variable type I can often avoid is the flag. A flag is a convenience variable that marks an expression and allows us to later perform an action based on the value of that flag.

Many times, flags are unavoidable. In this subroutine, however, the $matchOkay flag is easily avoided. We set its value in a loop and later act upon that value. In this case, it is needed probably because the author does not understand how to localize a file handle to a subroutine. Without correcting the other errors in this sub, let's look at a localized file handle without the $matchOkay flag (Example 3). FH will close when the sub returns a True value.

Another beginner style mistake involves return values. For functions that return truth, return a result that Perl interprets as True. Typically, this is the number 1. For a false result, return an undefined value. The return function returns undefined by default. So return 1; is preferable to return 'true'; and return; is preferable to return '';.

When you have two if statements stacked inside each other the way they are in our example, it's generally best to combine them. Combining them into a one-statement if block can further aid readability with the use of a statement modifier. I also prefer to vertically align operators if they do not detract from reading horizontally; see Figure 1.

Design Issues

The final rules I would like to apply to this subroutine are not just style rules—they also add to readability, but they touch on larger rules for good overall program design. A subroutine is really just one tiny step down from an object. It should be as self contained as possible. It should be passed the values it processes, and should pass new values on return, except under special circumstances. A well-written subroutine should not be affected by changes in the external environment, and should only affect the external environment in well-defined, predetermined ways.

checkLogin() uses values that were never passed in ($matchokay, $SETTINGS{userfile}, and $password). The original version seemed to affect a variable ($matchokay) which was not in the subroutine scope. It also opened a filehandle (USERFILE) without any checks to find if it was being used somewhere else. Finally, the name seems poorly chosen. Use might dictate something else, but I would prefer something more descriptive such as isValidUser(); see Example 4.

There are still two things that bother me. Those regexes could fail if the password or the user name have special characters in them. Frankly, I would have gone with the eq operator instead. But that's definitely outside the realm of style, so I'll leave that alone.

Basic coding style is about more than just making your work look pretty. It will make your code easier for other programmers to follow, will allow you to get the help you need more quickly, and can even help you discover logical flaws in your own code. If you follow some of these style rules, you'll be well on your way to becoming a better Perl programmer.

TPJ


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 
Dr. Dobb's TV