
import java.*: Control Flow — The Bad, The Good, The Exceptional


May 1999/import java.*

In 1968, industry luminary Edsger Dijkstra wrote a now-famous letter entitled "Go To Statement Considered Harmful" [1] in which he made a case for programming without unrestricted branching. He and many others have commented over the years that you can express algorithms more clearly without goto, and educators and language designers have labored to usher in a goto-less programming world.

Have they succeeded? It depends. Their efforts have certainly raised the bar of structured programming, with goto-filled languages like FORTRAN and BASIC giving way to better structured languages such as Fortran-77, Pascal, Modula, C, and Visual BASIC. More programmers certainly think in structured terms nowadays. When is the last time you saw a goto in a technical article (besides this one, of course)? Yet all popular languages have always had goto as a trap door, just in case you "needed" it.

Until now, that is. Java has no goto and does very well, thank you very much. In this article I'll explain why, as well as look at all the issues pertaining to control of program flow, including use of exceptions.

The Bad

So what's the matter with goto? Like anything else in life, the problem is not in the construct itself, but rather in how it is used or abused. My first language was FORTRAN-IV, which had neither else nor the notion of a compound statement. Here's a sample:

      IF (X .LT. 0) GOTO 10
      IF (X .EQ. 0) GOTO 20
      N = 1
      Y = H(N)
      GOTO 30
   10 N = -1
      Y = F(N)
      GOTO 30
   20 N = 0
      Y = G(N)
   30 CONTINUE

Okay, quick! What does this code do? Can't you see the logic at a glance? If you can, I don't know if that's a good thing or not! Here's how you might write it in C or Java:

if (x < 0)
{
    n = -1;
    y = f(n);
}
else if (x == 0)
{
    n = 0;
    y = g(n);
}
else
{
    n = 1;
    y = h(n);
}

Ah, much better! Of course seasoned C hackers might get carried away and do the following:

n = (x < 0) ?
    (y = f(-1), -1) : (x == 0) ?
        (y = g(0), 0) : (y = h(1), 1);

which is not pretty, I'll admit, but even this atrocity is easier to follow than the FORTRAN version because you don't have to jump all over the place. Don't try the line above in Java, though: it doesn't have a comma operator.

I liken the move from branching to structured logic to the jump from assembly language to a high-level language. You can do anything in assembler, but programming in C is clearer and less error-prone. Likewise, you can express any logic by littering a sequence of statements with gotos, but higher-level constructs make your code more readable and easier to get right the first time.

I realize that in 1999 I might be preaching to the proverbial choir, but let's look at one more example to prove the point. What does the following BASIC program do?

140 lo = 1
150 hi = 100
160 if lo > hi then print "You cheated!" : goto 240
170 g = int((lo + hi) / 2)
180 print "Is it";g;" (L/H/Y)?"
190 input r$
200 if r$ = "L" then lo = g+1 : goto 160
210 if r$ = "H" then hi = g-1 : goto 160
220 if r$ <> "Y" then print "What? Try again..." : goto 190
230 print "What fun!"
240 print "Wanna play again?"
250 input r$
260 if r$ = "Y" then 140

Since I used reasonably named variables, you probably guessed that this program plays the game of "Hi-Lo": it uses binary search to guess a number between 1 and 100. The user responds to each guess by telling whether it is too high or too low. If the variables lo and hi ever cross (i.e., lo > hi), then the user gave erroneous input. But again, it is difficult to infer the logic without careful study. Can you readily see how many loops there are, and where they begin and end?

The Good

It is one thing to say "don't use goto," but quite another to say what you can and should use to express algorithms well. Two years before Dijkstra's paper mentioned above, Bohm and Jacopini [2] proved mathematically that it is possible to express any algorithm in terms of only three constructs, along with an arbitrary number of boolean flags. The three constructs are:

  1) sequences of statements
  2) alternation (e.g., if-else, switch)
  3) repetition (e.g., while, for, do)

We usually call programs that use only these mechanisms structured programs. Loops and if statements work the same in Java as they do in C++. Listing 1 shows a Java version of the Hi-Lo program that obeys the rules of structured programming. I'll explain the expression throws IOException later; for now, notice the logic. There are two loops: the outer loop allows multiple plays, and the inner loop plays a single game. Note also the two boolean loop control variables: done for the outer loop, and found for the inner. When it's time to terminate a loop, I just change the state of its control. This is the type of programming style Bohm and Jacopini had in mind.
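Listing 1 is not reproduced here, so as a stand-in, here is a minimal sketch of the inner loop of a Hi-Lo game written with nothing but sequence, alternation, and repetition, plus a boolean control variable. The class name StructuredHiLo and the method countGuesses are hypothetical, and the user's L/H/Y answers are simulated from a known secret number so the code runs without keyboard input.

```java
// A sketch only; this is not the article's Listing 1.
class StructuredHiLo {
    // Returns how many guesses a binary search needs to find secret
    // in the range 1..100, or -1 if lo and hi cross (secret out of range).
    static int countGuesses(int secret) {
        int lo = 1;
        int hi = 100;
        int guesses = 0;
        boolean found = false;          // loop control variable
        while (!found && lo <= hi) {
            int g = (lo + hi) / 2;
            ++guesses;
            if (g < secret)
                lo = g + 1;             // simulated answer "L": guess too low
            else if (g > secret)
                hi = g - 1;             // simulated answer "H": guess too high
            else
                found = true;           // simulated answer "Y"
        }
        return found ? guesses : -1;
    }

    public static void main(String[] args) {
        System.out.println(countGuesses(50));   // first guess is 50
        System.out.println(countGuesses(100));
    }
}
```

The loop ends only when its condition is re-evaluated at the top; nothing jumps out of the body.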

But what if you need to terminate a loop from within, that is, before the last statement of its body? Somehow you need to skip the statements that follow. Following the rules of structured programming, you'd need to nest the remainder of the loop body in an if statement, like this:

boolean done = false;
while (!done)
{
    // <a bunch of statements here>
    if (<you DON'T need to exit
         the loop now>)
    {
        // <the rest of the loop
        //  body goes here>
    }
    else
        done = true;
}

If you need to exit the loop in more than one spot, you have a whole lot of nesting going on! To reflect the logic more directly, Java, like C, includes the break, continue, and return statements, which are just a restricted form of goto. The break statement exits the immediately enclosing loop or switch, whereas continue skips to the next iteration of the enclosing loop. Using break in the loop above obviates the need for the control variable and makes the logic more self-evident:

for (;;)
{
    // <a bunch of statements here>
    if (<you NEED to exit
         the loop now>)
        break;
    // <the rest of the loop
    //    body goes here>
}
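To make the skeleton above concrete, here is a small runnable sketch. The class BreakDemo, the method sumUntilNegative, and the convention that a negative value means "stop" are all assumptions made up for illustration.

```java
class BreakDemo {
    // Adds up values until the first negative one, which acts as a
    // hypothetical "stop" marker; the marker itself is not added.
    static int sumUntilNegative(int[] values) {
        int sum = 0;
        int i = 0;
        for (;;) {
            if (i == values.length)
                break;          // ran out of input
            int v = values[i++];
            if (v < 0)
                break;          // exit from the middle of the loop body
            sum += v;           // the rest of the loop body
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumUntilNegative(new int[] {3, 4, -1, 5}));  // prints 7
    }
}
```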

So a little bit of goto ain't so bad. This is especially true with nested loops. The structured program in Listing 2 has three loops, nested sequentially, and it wants to break out of all three loops when k == 1 in the innermost loop. To make this happen, this program needs to set all loop control variables false. Java provides a better way via the labeled break, which allows you to say, in effect, "I want to break out of the loop at such and such a level of nesting." As the program in Listing 3 illustrates, you place a label (an identifier followed by a colon, as in C) immediately before the loop(s) you want to directly break out of, and then make that label the target of a break statement. Isn't it nice not to have to use extraneous Boolean flags that have no direct bearing on the meaning of your program? Listing 4 shows a version of Hi-Lo that uses a labeled break to allow the user to quit the game prematurely by typing the letter 'Q'. (In case you're wondering what a BufferedReader is, I'm not going to explain the I/O in this article. Trust me and stay tuned.)

Java also supports a labeled continue, which breaks out of any intermediate loops to iterate on the loop specified by the label. For example, if you replace break with continue in Listing 3, the output is

0,0,0
0,0,1
1,0,0
1,0,1
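Listing 3 is not reproduced here. The following sketch is a reconstruction under assumptions: three nested loops that each run over 0 and 1, a print at the top of the innermost body, and a jump when k == 1. Those bounds are guesses chosen to agree with the output shown above, and the class name LabeledJumps and the useContinue parameter are hypothetical.

```java
class LabeledJumps {
    // Runs three nested loops and returns the text they would print.
    // useContinue selects "continue outer" instead of "break outer".
    static String run(boolean useContinue) {
        StringBuilder out = new StringBuilder();
        outer:
        for (int i = 0; i < 2; ++i) {
            for (int j = 0; j < 2; ++j) {
                for (int k = 0; k < 2; ++k) {
                    out.append(i).append(',').append(j).append(',')
                       .append(k).append('\n');
                    if (k == 1) {
                        if (useContinue)
                            continue outer;   // next i; remaining j values are skipped
                        else
                            break outer;      // leave all three loops at once
                    }
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(run(false));   // labeled break: two lines
        System.out.println("--");
        System.out.print(run(true));    // labeled continue: four lines
    }
}
```

With the labeled break only 0,0,0 and 0,0,1 appear; with the labeled continue the four lines shown above appear. In neither case does j ever reach 1.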

The branching constructs, break, continue, and return, along with labeled break and continue, make an unbridled goto capability unnecessary, so Java doesn't support it.

The Exceptional

In real programs, not only do you need sometimes to exit deeply nested loops; you also need to exit a deeply nested function call stack. If a catastrophic error should occur, you need to return to a safe place, which is usually at an outer level of logical nesting in your program. In traditional programming languages like C you usually just use return values from a function to indicate an error condition. This requires you to generate and check return values through every function in the chain of function calls, which fills your source code with clutter that obscures the logic of what you're trying to accomplish.

As an illustration, suppose you have functions f, g, and h, which execute in a nested fashion (see Listing 5). These functions produce side effects and do not need to return any value. Suppose further that during the execution of h a particular error might occur, in which case you want to return to the main program and start over. The return-value technique requires h to return a code to g, then g to f and f to main. In this case you must alter your functions' signatures to accommodate the error handling, and error handling code is scattered throughout your program (see Listing 6). In this example, I use Java's random number generator in h to return a one to simulate an error, zero otherwise. The static variable seed holds a number you type on the command line to seed the random number generator.
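Listing 6 is not reproduced here. As a rough sketch of the return-value technique, the following code threads a status code from h back to the top level. The class name ReturnCodeDemo, the OK and ERROR constants, and the boolean fail parameter (which stands in for the random error described above) are all assumptions.

```java
class ReturnCodeDemo {
    static final int OK = 0;
    static final int ERROR = 1;

    // fail is a hypothetical switch that simulates the error in h.
    static int h(boolean fail) {
        return fail ? ERROR : OK;
    }

    static int g(boolean fail) {
        int status = h(fail);
        if (status != OK)
            return status;      // error handling clutter in g
        // ... g's real work would go here ...
        return OK;
    }

    static int f(boolean fail) {
        int status = g(fail);
        if (status != OK)
            return status;      // and the same clutter again in f
        // ... f's real work would go here ...
        return OK;
    }

    // Plays the role of main: the only place that cares about the error.
    static String run(boolean fail) {
        return f(fail) == OK ? "done" : "error reported";
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

Every function between the error and its handler has to change its return type and test a status it has no other interest in.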

What a mess! Why not just jump from h to main? With exceptions you can. In Listing 7 I restored f and g to their original form. Now if an error occurs in h, I throw an exception of type MyError, which is caught in main.
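Listing 7 is not reproduced here either. A minimal sketch of the same idea follows; MyError is the exception name used in the text, while the class ExceptionDemo, the boolean fail parameter, and the message string are assumptions made for illustration.

```java
class ExceptionDemo {
    // fail is a hypothetical switch that simulates the error in h.
    static void h(boolean fail) throws MyError {
        if (fail)
            throw new MyError("error in h");
        // ... h's real work would go here ...
    }

    // g and f contain no error handling code, only a throws clause.
    static void g(boolean fail) throws MyError {
        h(fail);
    }

    static void f(boolean fail) throws MyError {
        g(fail);
    }

    // Plays the role of main: control arrives here directly from h.
    static String run(boolean fail) {
        try {
            f(fail);
            return "done";
        }
        catch (MyError x) {
            return "caught: " + x.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}

// A checked exception, because it derives from Exception.
class MyError extends Exception {
    MyError(String message) {
        super(message);
    }
}
```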

As you can see, exception-handling syntax in Java is virtually identical to C++: you wrap code to be exception-tested in a try block at whatever level suits you, followed by one or more exception handlers that catch objects of a specified class. To raise an exception you use the throw keyword. When an exception is thrown, execution retraces its way back up the stack until it finds a handler that takes a parameter of the same type (or of a supertype). The key differences between Java and C++ exception handling are as follows:

  1. All exceptions must be objects of classes derived from java.lang.Throwable, and in practice from its subclass java.lang.Exception. (There are a couple of other classes you could derive from, but those are generally reserved for the Java runtime implementation. You should use Exception.) Since exceptions are objects, you must create them with new; you can't throw primitive types.
  2. Java doesn't have destructors, so unwinding the stack does not automatically clean up local objects when exceptions occur, as it does in C++.
  3. Exception specifications (e.g., throws MyError) are not optional for checked exceptions, that is, for exception objects that derive from Exception but not from RuntimeException. Every function must specify the types of checked exceptions it may throw, and a function without an exception specification may not throw any checked exceptions. In C++, a function without exception specifications can throw any exception.
  4. Java has a finally clause that facilitates program cleanup in the presence of exceptions (see below).

Any exception class derived from Exception, other than RuntimeException and its subclasses, is called a checked exception, because the compiler checks to ensure that the only such exceptions thrown by a function are the ones you advertise in its exception specification. If you call a function with an exception specification, you must either handle that type of exception locally, or you must add it to the exception specification of the enclosing function. I used the latter course in Listing 1. Since BufferedReader.readLine can throw an IOException (that's what its specification says) I can propagate this to the specification of the function I'm in (main in this case). The alternative would be to handle each call to readLine as follows:

char r;
try {        
    r = in.readLine().toUpperCase().charAt(0);
}
catch (IOException x) {
    // Abort after read error:
    System.out.println("read error " + x);
    System.exit(-1);
}
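The two choices can be put side by side in a small sketch. The class CheckedDemo and its methods are hypothetical; a StringReader stands in for the keyboard, and closing the reader first is simply a convenient way to make readLine throw an IOException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class CheckedDemo {
    // Choice 1: don't handle IOException here; advertise it instead.
    static char firstLetter(BufferedReader in) throws IOException {
        return in.readLine().toUpperCase().charAt(0);
    }

    // Choice 2: handle it locally, so no throws clause is needed.
    // closeFirst forces a read error for demonstration purposes.
    static char answer(String text, boolean closeFirst) {
        BufferedReader in = new BufferedReader(new StringReader(text));
        try {
            if (closeFirst)
                in.close();         // reading a closed reader fails
            return firstLetter(in);
        }
        catch (IOException x) {
            return '?';             // recover instead of propagating
        }
    }

    public static void main(String[] args) {
        System.out.println(answer("yes", false));
        System.out.println(answer("yes", true));
    }
}
```

If you delete the throws clause from firstLetter, the compiler rejects the class, because readLine's IOException would then be neither caught nor declared.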

Unchecked exceptions include things that are difficult to detect at compile-time, such as an array index out of bounds. These exceptions can occur almost anywhere, and it would be ridiculous to force the developer to specify all such exceptions in all method specifications. Unchecked exceptions derive from either RuntimeException or Error.
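As a short sketch of the difference, the method elementAt below can throw ArrayIndexOutOfBoundsException yet needs no exception specification, and its caller is free to catch the exception or not. The class UncheckedDemo and its methods are hypothetical names.

```java
class UncheckedDemo {
    // No throws clause: ArrayIndexOutOfBoundsException derives from
    // RuntimeException, so the compiler does not require one.
    static int elementAt(int[] a, int i) {
        return a[i];
    }

    static String tryIndex(int i) {
        int[] a = {10, 20, 30};
        try {
            return "value " + elementAt(a, i);
        }
        catch (ArrayIndexOutOfBoundsException x) {
            return "out of bounds";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryIndex(1));    // prints "value 20"
        System.out.println(tryIndex(3));    // prints "out of bounds"
    }
}
```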

Exceptions and Resource Management

When using exceptions it is important to ensure that you deallocate any local resources in case an exception occurs. Consider, for example, a function that copies a file to standard output. A first attempt might look like this:

static void copy(String file) {
    FileReader r = new FileReader(file);
    int c;
    while ((c = r.read()) != -1)
        System.out.write(c);
    r.close();
}

This won't compile because the FileReader constructor, read, and close all throw checked exceptions. The easy way to make the compiler happy is to add IOException to copy's exception specification:

static void copy(String file)
throws IOException {
    FileReader r = new FileReader(file);
...

Using this technique, any I/O exception will propagate to the caller, and the caller will know something went wrong. So the compiler is happy, but if an exception is thrown while copying, the file doesn't get closed. One solution is to catch the exception and close the file, but you have to rethrow the exception so the caller still gets it, like this:

static void copy(String file)
throws IOException {
    FileReader r = new FileReader(file);
    int c;

    try {
        while ((c = r.read()) != -1)
            System.out.write(c);
    }
    catch (IOException x) {
        r.close();
        throw x;    // rethrow
    }
    r.close();      // normal path: close here too
}

It's a pain to need two calls to close, and in a complicated program where many exceptions can be thrown this technique is too tedious and error-prone to be acceptable. The C++ solution is to wrap the lifetime of the file in an object and have the destructor close the file. Well, Java doesn't have destructors, but it does have the finally clause, which is an even better solution:

static void copy(String file)
throws IOException {
    FileReader r = new FileReader(file);
    int c;

    try {
        while ((c = r.read()) != -1)
            System.out.write(c);
    }
    finally {
        r.close();
    }
}

Any code in a finally clause is executed no matter what, whether an exception occurs or not, or even if a return statement occurs within the try block or any of its handlers. As the example above shows, you don't need to have a catch clause to use finally. Since all I want to do in this case is to close the file and let the exception pass back to the caller, I don't need one. A complete program that uses the copy method to print a file you specify on the command line is in Listing 8. Note that when you print an exception object with System.out.println, it gives the type of the exception with an explanatory message.
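Listing 8 is not reproduced here. To see the order of events for yourself, here is a small sketch that records each step in a log. The class FinallyDemo, its methods, and the use of IllegalStateException as the simulated failure are all assumptions made for illustration.

```java
class FinallyDemo {
    // Appends to log so the order of events is visible afterward.
    static int work(boolean fail, StringBuilder log) {
        try {
            log.append("try;");
            if (fail)
                throw new IllegalStateException("simulated failure");
            return 1;               // finally still runs before the return
        }
        finally {
            log.append("finally;"); // runs on both paths
        }
    }

    static String trace(boolean fail) {
        StringBuilder log = new StringBuilder();
        try {
            work(fail, log);
        }
        catch (IllegalStateException x) {
            log.append("caught;");  // reached only after finally has run
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace(false));   // prints "try;finally;"
        System.out.println(trace(true));    // prints "try;finally;caught;"
    }
}
```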

Under Control

I think Dijkstra would have liked Java. It has done away with the dreaded goto, but gives the programmer enough flexibility to write readable and convenient structured code. Java's enforced exception specifications finesse surprise exceptions, and the finally clause helps guarantee proper resource management with minimum hassle. Nice.

References

1. E. Dijkstra. "Go To Statement Considered Harmful," Communications of the ACM, 11:3, p. 147, March 1968.

2. C. Böhm & G. Jacopini. "Flow Diagrams, Turing Machines, and Languages with Only Two Formation Rules," Communications of the ACM, 9:5, p. 366, May 1966.

Chuck Allison is Consulting Editor and a columnist with CUJ. He is the owner of Fresh Sources, a company specializing in object-oriented software development, training, and mentoring. He has been a contributing member of J16, the C++ Standards Committee, since 1991, and is the author of C and C++ Code Capsules: A Guide for Practitioners, Prentice-Hall, 1998. You can email Chuck at [email protected].

