The Scourge of Error Handling


Our recent five-part tutorial on Google's Go language induced me to dip back into C-style programming. I was impressed with the improvements the Go team has made, particularly in the design of the return value mechanism. Unlike most languages, Go enables you to return multiple values from a function without creating some ad hoc data structure or object to do it. By convention, one of those return values is an error value, which the caller typically captures in a variable named err and checks as soon as the function returns.
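
As a minimal sketch of that convention (the function name divide and its error message are made up for illustration), a Go function can hand back its result and an error value side by side, and the caller inspects err before using the result:

```go
package main

import (
	"errors"
	"fmt"
)

// divide is a hypothetical example function. It returns the quotient and an
// error value as two separate results, so neither has to stand in for the other.
func divide(a, b float64) (float64, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

func main() {
	q, err := divide(10, 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(q) // prints 2.5
}
```

No wrapper struct or output parameter is needed to carry the second value; the two results are simply assigned to two variables at the call site.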

This solution solved a messy problem — C's dual use of return values for data and error codes — which was necessary due to C's lack of an exception mechanism. To be fair, this problem still exists in a different form in languages that have robust exceptions. For example, in Java, the conventional use of a null return both as an indicator of an error condition and as an actual data item laces codebases with endless tests for null. The problem is so ubiquitous in Java that many JVM scripting languages include shorthand to abbreviate the null checks.
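
To make the dual-use problem concrete, here is a small sketch written in Go for consistency with the rest of the column. The function names indexSentinel and find are invented for this example. The first follows the C habit of overloading the return value with a sentinel; the second uses a second return value to keep data and status apart:

```go
package main

import "fmt"

// indexSentinel follows the C convention: the int result is either a valid
// index or the special value -1, so one return value carries two meanings.
func indexSentinel(items []string, target string) int {
	for i, item := range items {
		if item == target {
			return i
		}
	}
	return -1
}

// find keeps the two meanings apart: the int is only data, and the bool
// reports whether that data is valid.
func find(items []string, target string) (int, bool) {
	for i, item := range items {
		if item == target {
			return i, true
		}
	}
	return 0, false
}

func main() {
	langs := []string{"C", "Go", "Java"}
	fmt.Println(indexSentinel(langs, "Rust")) // prints -1
	if i, ok := find(langs, "Go"); ok {
		fmt.Println(i) // prints 1
	}
}
```

With the sentinel version, nothing stops a caller from using -1 as if it were an index; with the two-value version, the status has its own name and type.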

In fact, Go has an exception mechanism as well (panic and recover), but its use is contrary to convention, and convention is a central aspect of Go development. It's also not as elaborate as the exception mechanisms in C++ or Java, and it is supposed to be reserved for truly exceptional circumstances. I believe this is due to issues of Google scale (recalling that Go was designed primarily to address Google development needs, rather than programming problems at large): When you're running thousands of transactions on very large systems, exceptions become a very costly proposition. Not only are they slow, but they permanently fork the execution path, and on large, fast-moving systems, both effects tend to be highly undesirable. Moreover, exceptions have to be handled by future code that depends on current exception-oriented code. Because of these limitations, Google is fairly strict about limiting the use of exceptions in its C++ codebase:

"On their face, the benefits of using exceptions outweigh the costs, especially in new projects. However, for existing code, the introduction of exceptions has implications on all dependent code. If exceptions can be propagated beyond a new project, it also becomes problematic to integrate the new project into existing exception-free code. Because most existing C++ code at Google is not prepared to deal with exceptions, it is comparatively difficult to adopt new code that generates exceptions.

Given that Google's existing code is not exception-tolerant, the costs of using exceptions are somewhat greater than the costs in a new project. The conversion process would be slow and error-prone. We don't believe that the available alternatives to exceptions, such as error codes and assertions, introduce a significant burden.

Our advice against using exceptions is not predicated on philosophical or moral grounds, but practical ones. Because we'd like to use our open-source projects at Google and it's difficult to do so if those projects use exceptions, we need to advise against exceptions in Google open-source projects as well."

The position is unequivocal and the logic is hard to fault. But return values, even in the refined form found in Go, have a drawback that we've become so used to we tend to see past it: Code is cluttered with error-checking routines. Exceptions here provide greater readability: Within a single try block, I can see the various steps clearly, and skip over the various exception remedies in the catch statements. The error-handling clutter is in part moved to the end of the code thread.
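
The clutter is easy to see in even a tiny function. The following sketch (parsePoint is a hypothetical helper, not something from the tutorial) does three things that can fail, and each one is followed immediately by its own check:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePoint turns a string such as "3, 4" into two integers. Three of its
// steps can fail, and each failure is checked right after the call.
func parsePoint(s string) (int, int, error) {
	parts := strings.Split(s, ",")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("expected 2 fields, got %d", len(parts))
	}
	x, err := strconv.Atoi(strings.TrimSpace(parts[0]))
	if err != nil {
		return 0, 0, fmt.Errorf("bad x: %w", err)
	}
	y, err := strconv.Atoi(strings.TrimSpace(parts[1]))
	if err != nil {
		return 0, 0, fmt.Errorf("bad y: %w", err)
	}
	return x, y, nil
}

func main() {
	x, y, err := parsePoint("3, 4")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(x + y) // prints 7
}
```

Roughly half the lines of parsePoint are error handling, and the three lines that do the actual work are spread between them.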

But even in exception-based languages there is still a lot of code that tests returned values to determine whether to carry on or go down some error-handling path. In this regard, I have long felt that language designers have been remarkably unimaginative. How can it be that after 60+ years of language development, errors are handled by only two comparatively verbose and crude options, return values or exceptions? I've long felt we needed a third option.

One such option would put all the error handling code of a method in a separate section at the end, and the programmer would tie each error-handling snippet to the code via a single statement in the snippet. Then, for example, you could write open( filename ) without any error-checking ceremony, knowing that any errors (be they manifest as exceptions or return codes) would be handled in the linked routine in the error-handling section. This might appear to be a super-local exception handler, but in fact, since it occurs at the language level, it could be implemented as something far simpler, such as code injection of the error-handling code back to where it is currently written longhand by developers. Language designers can implement such conveniences far more elegantly, I hope. The result would be that working code is much, much more readable. (If you have other solutions, please post them in the comments section.)
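
No mainstream language offers exactly this, as far as I know. A rough approximation in today's Go, offered only as a sketch and not as the mechanism proposed above, is the "sticky error" style: a small helper type (here the invented fieldParser) remembers the first error it sees and turns every later step into a no-op, so the working code reads straight through and the single check sits at the end. The standard library's bufio.Writer behaves in a similar spirit, refusing further writes once one has failed.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fieldParser is a hypothetical helper that remembers the first error it
// encounters. Once err is set, every later step does nothing.
type fieldParser struct {
	err error
}

// split breaks s into exactly two comma-separated fields.
func (p *fieldParser) split(s string) (string, string) {
	if p.err != nil {
		return "", ""
	}
	parts := strings.Split(s, ",")
	if len(parts) != 2 {
		p.err = fmt.Errorf("expected 2 fields, got %d", len(parts))
		return "", ""
	}
	return parts[0], parts[1]
}

// atoi converts one field to an int, recording any failure in p.err.
func (p *fieldParser) atoi(s string) int {
	if p.err != nil {
		return 0
	}
	n, err := strconv.Atoi(strings.TrimSpace(s))
	if err != nil {
		p.err = err
		return 0
	}
	return n
}

// parsePoint reads as an uninterrupted sequence of steps; the only error
// handling is the block at the end.
func parsePoint(s string) (int, int, error) {
	var p fieldParser
	a, b := p.split(s)
	x := p.atoi(a)
	y := p.atoi(b)

	if p.err != nil {
		return 0, 0, fmt.Errorf("parsePoint(%q): %w", s, p.err)
	}
	return x, y, nil
}

func main() {
	x, y, err := parsePoint("3, 4")
	fmt.Println(x, y, err) // prints 3 4 <nil>
}
```

The cost is that the per-step checks have not disappeared; they have moved into the helper's methods. A language-level feature could presumably generate that plumbing rather than leave it to the programmer.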

Go's innovation is indeed a step in the right direction. And other languages, such as Haskell with its Maybe return value, have shown imagination in this area, but I think far more needs to be done to break out of the legacy mindset that error handling should be done as it always has been.
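
For readers who have not met Maybe, the idea can be approximated in Go with generics (Go 1.18 or later). This is only a sketch: the names Maybe, Just, Nothing, Bind, and OrElse are chosen here to echo Haskell's vocabulary and are not part of Go's standard library.

```go
package main

import (
	"fmt"
	"strconv"
)

// Maybe holds either a value (ok is true) or nothing at all.
type Maybe[T any] struct {
	value T
	ok    bool
}

// Just wraps a value; Nothing represents its absence.
func Just[T any](v T) Maybe[T] { return Maybe[T]{value: v, ok: true} }
func Nothing[T any]() Maybe[T] { return Maybe[T]{} }

// Bind applies f only when m holds a value; otherwise the absence is
// passed along untouched and f is never called.
func Bind[A, B any](m Maybe[A], f func(A) Maybe[B]) Maybe[B] {
	if !m.ok {
		return Nothing[B]()
	}
	return f(m.value)
}

// OrElse unwraps the value, or returns fallback when there is none.
func (m Maybe[T]) OrElse(fallback T) T {
	if m.ok {
		return m.value
	}
	return fallback
}

// parseInt and reciprocal are two example steps that may each fail.
func parseInt(s string) Maybe[int] {
	n, err := strconv.Atoi(s)
	if err != nil {
		return Nothing[int]()
	}
	return Just(n)
}

func reciprocal(n int) Maybe[float64] {
	if n == 0 {
		return Nothing[float64]()
	}
	return Just(1 / float64(n))
}

func main() {
	fmt.Println(Bind(parseInt("4"), reciprocal).OrElse(-1))   // prints 0.25
	fmt.Println(Bind(parseInt("0"), reciprocal).OrElse(-1))   // prints -1
	fmt.Println(Bind(parseInt("abc"), reciprocal).OrElse(-1)) // prints -1
}
```

The failure check happens once, inside Bind, instead of after every call; the chain of steps itself contains no if statements.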

— Andrew Binstock
Editor in Chief
alb@drdobbs.com
Twitter: platypusguy


Comments:

ubm_techweb_disqus_sso_-16d7c50d6c078abc2acbddb01ccf46d4
2013-01-09T13:16:44

Possibly have multiple catch levels that are documented in the code, which the IDE understands. Level 1 are true system-level exceptions. Make Level 3 be algorithmically significant exceptions (i.e. an intelligent compiler keeps compiling by assuming a missing ";" now exists so that it can better report possible subsequent syntactic errors). Level 2 is somewhere between the two (if necessary).

Then apply sproggit's IDE ideas at specific levels.

It's basically the 3 level errors/warnings/info we've used/seen before. Set your IDE to Level 0 and you don't see ANY catch blocks. Level 3 and you see ALL catch blocks (feel free to reverse the numerology if that makes more sense).


ubm_techweb_disqus_sso_-43e165b47cf5d0edfff7378e69ef5220
2012-12-18T03:08:15

Could you say more about the "consumption of runtime resources"? I always thought that monads provided a very elegant and clean way of separating the "abnormal" flow from the main control flow. For most monads it just translates into an extra return value which is processed with an "if".


ubm_techweb_disqus_sso_-c84fa5da9a2a6928505b94efe3baf860
2012-12-11T10:03:06

A very interesting point. I'm not familiar with Google's Go language, but I agree that exceptions and error returns alone do not present an ideal error handling mechanism for robust and readable code. However, 'C' family languages, especially C++, are rapidly evolving at the moment, and there are new ways to both raise and catch exceptions and error return codes that can be used to provide clean code and easily found error handling locations. Examples include augmented types, Flyer based error handlers, and techniques being developed for AOP such as call interception. In short, it's not so much the language that needs to change to solve this problem; that has already happened. Rather, the techniques developers use need to catch up with the tools we now have. Error codes and exceptions will remain the underlying mechanisms because they work, but the way we use them is changing. I don't believe it will be a problem to be stuck with them when the ease of use and readability issues are solved.


ubm_techweb_disqus_sso_-678cac889be1915d41b06ac9548cbcc1
2012-12-09T23:50:32

In most cases there is no point in the code where you know more about an error condition than at the position where the error happens.
I mostly use C and prefer to keep things as simple as possible, so I prefer code that is written straightforwardly, step by step, in the order things have to be done. I handle error conditions the simple way: I immediately log the problem where it happens, jump via a 'goto' to a cleanup position at the end of the procedure for locally allocated resources, and return an error value after the cleanup. Such code is quite easy to maintain because it's easy to locate most of the problems that occur.
It may be a good thing to have procedures always just return true on success and false on error. Then the caller needs only a single if statement and can handle resulting problems immediately after the call, without additional statements to check other types of error return values. Error codes can be stored in a global variable (e.g. errno) or at the location that an optional pointer argument of the function points to.


ubm_techweb_disqus_sso_-4d3451bcc1dc1fe1428d803e28f82bce
2012-12-09T18:09:23

That skips something, though: not all errors are exceptions. Take the open() example: if my program's response to a failure to open the file is to just continue as if the file contained nothing, then exceptions make the code more complex. The cleanest code there is to check the success/failure status of the open() call and write the code as "open(); if success then { ...; close(); }". I can't do that cleanly with exceptions, and I can't think of a way to do that with external error-handler blocks because they'd need to set a local variable back in the block the error occurred in (which they don't have access to since that block's local variables are out-of-scope).

I suspect that's why calls like open() return a special value on failure, because that particular situation's so common. If you need exceptions it's easy to write "open(); if failure then throw exception;". The code to convert an exception into a status code that can be checked in an if statement isn't nearly as concise.

I think it's helpful to remember that we don't have a binary situation, we have a trinary one: success, failure, error. Not all failures are errors, and in fact not all errors are failures (I have more than a bit of code where it's the _success_ of an operation that indicates an error's occurred).


williamlouth
2012-12-09T16:04:34

The adaptation I am talking about happens at runtime not after multiple logged restarts of an actor which involved engineering changing an environment variable or the code itself. This runtime self adaptation does not appear to be adequately addressed by Erlang or many other Actor implementations which seems strange considering the self healing capabilities built into most packet/switch based systems.


williamlouth
2012-12-09T16:01:22

Hi Andrew,

Notification is handled via Signal Rules, which are fired at the entry and exit of a Signal Boundary. This is much more efficient and useful, as it allows decision logic to be applied across multiple signal types and their aggregation. Instead of throwing an exception, rules can fire another signal, which will propagate upwards until one of the boundary rules decides to translate/transform the signal into a runtime exception instead.


ubm_techweb_disqus_sso_-243240e0117270fbd1d5457622919fe6
2012-12-09T09:53:11

There are some fascinating and educational comments here from others - so thanks to all those who have posted replies.

I believe there may be an aspect or dimension to this question that we have not explicitly posited, but which is critical to answering it efficiently - context.

If I am "hacking" - experimentally writing code with the aim of getting a preliminary alpha release to compile and/or execute cleanly, then I suspect that having to flip backwards and forwards between two chunks of code may be distracting, even if I can open another IDE Window to manage that.

Conversely, if my code is stable, polished, locked down, commented and ready for deployment, then I would agree with the idea that *intrusive*, or perhaps extensively intrusive error handling may detract from my ability to read the flow.

Now, what interests me about this approach to the problem is that it relocates the challenge from the "body of code" i.e. the actual source language files, and places it in the IDE's ability to represent it. Let's go back to the original article and its thought that the error handling was intrusive. Suppose I could configure my IDE editing window to "auto-collapse all catch blocks" {the syntactic equivalent}. Conceded, my code would still contain chunks of "try{...}" logic in parenthesis, but in practice this is not a disadvantage simply because it helps me read the code as a series of well defined logical blocks. Other options exist - for example the IDE could colour code the catch{...} logic with a darker, or less prominent hue than the mainline code.

So: the conclusions I draw are:-

1. My requirements vis-a-vis in-line or abstracted error handling will vary depending upon the maturity of the code in question, and depending upon what task I am trying to accomplish. Abstracting error handling is only going to be useful in a narrow set of use cases.

2. As the author states, the problem here is a legibility issue, related to ease-to-read. That's primarily the job of the IDE, not the language, and it has multiple potential solutions.


wtpayne
2012-12-08T23:53:25

I find the fact that control flow is explicit and local in Go to be a very very nice feature. I can well understand the potential for confusion when program control jumps around non-locally (Having worked on large legacy code bases where this happened a lot). On the other hand, I also agree that large amounts of error checking code "clutter" the source, making it hard to discern the original intent of the code. We are at a seeming impasse; how do we reconcile these two requirements? Not by the mechanism the author of the article proposes, I am afraid, which meets the first requirement, but fails the second.

Fortunately, we can see that both of these problems are to do with *reading* the program logic, or rather, in how the program logic is *presented* to the reader. We already have code-folding editors. How about defining and using very regular conventions for ubiquitous "boilerplate" aspects of program logic, such as error handling and logging, then extending our editors with quick & easy filters to hide or display the boilerplate, depending on the reader's needs.

More generally, the notion that textual programming languages can support multiple display representations remains an under-explored area, at least in the mainstream.


ubm_techweb_disqus_sso_-923c977676392cc46a28cf90977c226d
2012-12-08T19:16:46

Not really. The supervisor will restart the system and it will fail again until the error condition (usually external) is resolved. You'll determine what the error is from looking at logs, but you don't need to handle the errors internally to the process; they are handled by the caller. Your program will be tested so it appears to run under normal conditions, so you don't need to handle errors for these tested conditions using this approach.


ubm_techweb_disqus_sso_-923c977676392cc46a28cf90977c226d
2012-12-08T19:08:40

There is a third option - errno. All functions place an error code in this system variable that the caller can inspect after the call completes. No return values, no exceptions.
Now, the caller does need to check this, and I think it could be better mandated. It could also be enhanced with an error message as well as a code, but to be really effective it needs to be *the* error communication mechanism that all functions use.

Or, you could use it in conjunction with the concept of components or modules - you place error guards around calls to a larger component that doesn't need to handle errors quite as robustly internally - as long as they get passed back and always checked at the module boundaries. Then the entire module will be considered to have failed on error, regardless of what the particular error was. E.g., if my brakes fail in my car, I don't really care which bit of the brake component failed; as long as I know it's failed, I will just replace it. OK, poor analogy :) but the point is to treat the system as connected wholes rather than lots and lots of little bits that need continual checking.


ubm_techweb_disqus_sso_-4d3451bcc1dc1fe1428d803e28f82bce
2012-12-08T18:16:42

I see one problem, and that's that not all failures are errors. Take the open() call example. In one occurrence it might be an error, resulting in an error message and aborting the operation. In another occurrence it might be a normal and expected result, and the code just continues as if it'd found an empty file. Worse, both might occur in the same block of code. So not only would you need error blocks, you'd need some way to associate different error blocks with different occurrences of the same potentially-failing function. That of course can be dealt with by language constructs, it merely adds back in some of the verbosity you're trying to remove.

The worse part is one the proposed method can't fix, because the method creates it by design. That's that to figure out *which* case this one is, expected-result or real-error, you can't just look at the open() call. You have to abandon the code you're reading and jump down to the bottom of the function to find somewhere in a plethora of error blocks the one associated with this particular open() call. That breaks an old rule of coding: don't make someone reading the code jump all over the place to see what's happening. Exceptions and return values at least let someone reading the code see right there what happens when the file isn't found. It may not be aesthetically pleasing, but it's *practical*. And as a working developer I find I have less and less patience for misfeatures that, while academically and theoretically elegant, make my life harder.


AndrewBinstock
2012-12-07T22:58:49

Presumably, the error-handling code would be placed somewhere by convention, so that a maintenance programmer would always know where it is.

It's not that different from constructors. New creates an object, but if the constructor does any kind of fancy processing, it's hidden from view until you dig into the class. But you know exactly where to look to find it.


AndrewBinstock
2012-12-07T22:55:38

I like this idea of signals. It could even be refined into something like the signals/slots mechanism as used in Qt, so that failures are flagged but handled elsewhere with code that's not cluttering the main line of the logic.


ubm_techweb_disqus_sso_-16d7c50d6c078abc2acbddb01ccf46d4
2012-12-07T12:49:28

I agree somewhat with what you are saying. Our discussion has been about how the client/caller reacts to the reported error. However, the service/method doesn't know (or need to know) how clientA reacts to an error compared to clientB. Yet, all this discussion is about how the service/method reports errors.

Now what would be cool is a mechanism to easily translate one error reporting mechanism into another, and leave it up to the client to put the proper error-reporting-wrapper around the CALL.

To quote Mr. Binstock, "But we're stuck with them for the foreseeable future — just as we have been for decades"


ubm_techweb_disqus_sso_-71fe90849f198c1debe4b67f14243c3d
2012-12-06T19:48:00

Perhaps we should draw a distinction between different types of error handling. In my experience, most error "handling" is trivial. All it does is abort the current operation. The maintenance programmer doesn't need to see that. It's an unnecessary distraction. This is really the sort of error handling I had in mind when I wrote my post.

On the other hand, there is another type of handling where you try to do something intelligent. Maybe you retry the operation or switch to a different strategy. This sort of handling is more significant and should be inline with the rest of the code. It is, after all, a fundamental part of the task at hand. I've found this sort of thing quite rare in practice, however.


ubm_techweb_disqus_sso_-c18a5e36ec0c81b899e076f3ca6708b9
2012-12-06T18:39:25

Monadic approaches are appealing conceptually and work at the application level, but as a *general* error-handling mechanism, I worry about their consumption of runtime resources. Straight off-the-shelf, the Option and Either monads provide clean strategies for dealing with troublesome returns.


ubm_techweb_disqus_sso_-90ab573afae55bd5bfb68111e38670b6
2012-12-06T17:38:53

The IDE could help out by using a companion window to display the assert/catch/finally routines for any block of code. That way you can view the primary code path side-by-side with its error handling. Synchronizing the two windows would be a challenge, and you would want to be able to scan all error handling for a function, not just for the currently selected line/block of code.


ubm_techweb_disqus_sso_-16d7c50d6c078abc2acbddb01ccf46d4
2012-12-06T17:18:42

That's my point, though. To separate the error handling from "normal" processing, how can you tell by looking at the code if any error handling is even being done? Think what a maintenance programmer is going to see (or not see). That sounds dangerous to me.


ubm_techweb_disqus_sso_-71fe90849f198c1debe4b67f14243c3d
2012-12-06T16:39:31

The idea is to separate, not to hide. No one disagrees that error handling is important. The problem with mixing error handling with the principal code is simply that it makes that code harder to understand. For any given function, chances are that (at some point) a programmer who did not write it will be asked to change it in some way. The easier the code is to understand, the more quickly that change will happen and the more likely it will be correct.


ubm_techweb_disqus_sso_-47c80bd4758eb35df9d0bb3f61ca32e6
2012-12-06T15:58:49

In any language which allows the implementation of monads, there is a completely different route to exception handling possible:

http://lambda-the-ultimate.org...

With these kinds of monadic exceptions, any problems, such as the ones mentioned by Google, can be solved at the compiler level. I.e. the compiler will simply refuse to compile code where there are unhandled exceptions. This is similar to the necessity in Java to declare handled and thrown exceptions, but even somewhat more powerful.


ubm_techweb_disqus_sso_-16d7c50d6c078abc2acbddb01ccf46d4
2012-12-06T13:09:46

I never understood why the aesthetics of an algorithm is more important than proper error handling/detection. The first lesson most people learn when writing programs is how to handle the "anomalies", e.g. prompt for a number, but get an alphabetic character.

To me, mis-handling of "anomalies" (or not handling them at all) is one of the biggest concerns of software implementation. Yet historically I see the push for "hiding error handling logic" or "eliminating error handling logic". Nooooooo!

When I read Steve McConnell's Code Complete, this was one area I disagreed with (agreed with just about everything else). He liked his code to present the main algorithm flow concisely and try to "bury" the error handling "at the bottom" of your if/else constructs. I disagreed then and I disagree now. I'd rather see how I handle the error on the IF part, vs. on the ELSE part.

Handling anomalies (whether gracefully, intelligently, abruptly, adaptively) is HALF of what we do (anybody can write a program that works perfect with guaranteed perfect input from perfect people, right?!?).

To me, the WAY a website handles "error conditions" is just as important as doing the normal processing. Simple example, I use dashes on my telephone number form. But no, it doesn't want dashes, it wants just a 10 digit number. Or I enter an extension, and the form can't handle the extension. BUT, after submitting my INVALID data, the error gets reported, BUT ALL THE FORM FIELDS ARE CLEARED, not just the telephone number.

Yes, algorithms are important. Critical. Maintaining those algorithms is also important. But so is error handling in most software that I write. So why are we trying/pushing to hide/bury an important part of our algorithm!?!


ubm_techweb_disqus_sso_-eb143e6d5f6d149c968e253a5efe5875
2012-12-06T10:06:21

Sounds like a job for "separation of concerns", "aspect oriented programming", you know, with the concern being exception handling.

I wonder if the Specman-e language would have this licked? Hmm?


ubm_techweb_disqus_sso_-5769185644cc5a3d9f8fb9eaf2098132
2012-12-06T09:47:59

Error processing is indeed an interesting topic.

As for me, I never really switched to using exceptions, for a simple reason: I don't understand them. I don't mean syntactically or semantically; you can find satisfactory explanations in manuals. I mean architecturally. I never found a tutorial text on the proper use of exceptions in structured software. So I prefer to abstain rather than misuse.


williamlouth
2012-12-06T09:39:54

With Erlang's approach, in practice there is no adaptation or evolution mechanism in this somewhat flippant approach to fault handling...we learn by our mistakes...the supervisor for the most part simply reinstates the potential (waiting to happen) problem.

