Is goto Still Considered Harmful?


If you're a programmer and you read about the recent controversy over a serious security vulnerability in Apple's Secure Transport (its SSL/TLS implementation), then you may have noticed something odd about the Apple source code where the error occurred. It's full of goto statements.

Apple has open-sourced the Secure Transport code. We don't know whether this played a role in the bug being found, but once the fix was announced, the availability of the source code made it easy to find the specific bug.

The C-language file with the error is pretty long at 1,970 lines, and it contains 47 goto statements. More primitive languages, such as old BASIC implementations, rely on goto because they lack the control structures needed to do without it. In CS101, you are taught that goto is an ugly thing, one that breaks logical control flow. More-capable languages like C include it, if only as an escape hatch for situations where logic gets complex without it. This may be what is happening in the buggy Apple function SSLVerifySignedServerKeyExchange:

	static OSStatus SSLVerifySignedServerKeyExchange(…)
	{
	  …

		if ((err = SSLFreeBuffer(&hashCtx)) != 0)
			goto fail;

		if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
			goto fail;
		if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
			goto fail;
		if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
			goto fail;
		if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
			goto fail;
			goto fail;
		if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
			goto fail;

	  …

	fail:
		SSLFreeBuffer(&signedHashes);
		SSLFreeBuffer(&hashCtx);
		return err;
	}

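I've cut a lot of code out of the function for readability. The error is the pair of consecutive goto fail; lines. Because the if statement above them has no braces, only the first goto belongs to it; the second executes unconditionally, so control jumps to fail with err still 0 and the remaining verification steps never run. I wouldn't say that the use of goto in this program contributed to the error at all. Clearly, the error is a simple, careless editing mistake, one that was still syntactically correct and therefore generated no compiler errors. You can easily make the same mistake without goto.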
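To make the failure mode concrete, here is a minimal sketch of the same control flow. It is not Apple's code: step_ok and check_signature are hypothetical stand-ins for the hash and verification steps, and a return value of 0 means success, as in the excerpt above. The second function shows the same editing slip made with a duplicated return instead of a duplicated goto.

```c
#include <stdio.h>

/* Hypothetical stand-in for a hash step that succeeds; 0 means success. */
int step_ok(void)
{
    return 0;
}

/* Hypothetical signature check: nonzero means the signature is bad. */
int check_signature(int signature_is_valid)
{
    return signature_is_valid ? 0 : -1;
}

/* Mirrors the buggy control flow: the second goto is unconditional. */
int verify_buggy(int signature_is_valid)
{
    int err;

    if ((err = step_ok()) != 0)
        goto fail;
        goto fail;      /* always taken, with err == 0 */
    if ((err = check_signature(signature_is_valid)) != 0)
        goto fail;      /* never reached */

fail:
    return err;
}

/* The same slip without goto: a duplicated return statement. */
int verify_buggy_no_goto(int signature_is_valid)
{
    int err;

    if ((err = step_ok()) != 0)
        return err;
        return err;     /* always taken, with err == 0 */
    if ((err = check_signature(signature_is_valid)) != 0)
        return err;     /* never reached */
    return 0;
}
```

Calling either function with an invalid signature, for example verify_buggy(0), returns 0, which the caller reads as success. Recent GCC releases should flag both functions under -Wmisleading-indentation, which -Wall enables, so expect a warning when compiling the sketch.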

But what's with all the gotos? Do real programmers for serious companies like Apple actually use them so frequently? It seems like they do. A search for "goto fail" on GitHub yields millions of results.

In fairness to these programmers, in C, there's little effective difference between goto and other, less taboo statements, like break inside a switch construct. In that spirit, careful use of goto can actually make code clearer. The other common justification for goto, as a means of hand-optimizing code, is less defensible.

I asked Jeff Law and Jason Merrill, both engineers at Red Hat and members of the steering committee for GCC, what they thought of goto. Merrill quickly shoots down any idea that programmers can or should hand-optimize code for performance:

"As Donald Knuth wrote in his paper Structured Programming with Go To Statements, '…premature optimization is the root of all evil.' Programming for clarity is much more important until you know what the hot spots are in your code, especially given that modern optimizers and profiling tools are much more powerful than anything available to Knuth when he wrote the paper in 1974; most of the examples he gives of useful optimizations using goto are routine transformations for a modern compiler."

Merrill agrees that goto in the Apple code appears to be used for clarity: "In general, micro-optimizing code at the local level is not something we would recommend until after verifying that the code is a hot spot and that micro-optimization would prove to be valuable. Clarity trumps micro-optimization most of the time.

"GCC and, I suspect, most modern compilers internally transform the source into a series of basic blocks (maximal set of consecutive insns [instructions] that is ended by a change in flow control or a label). All transfer of control between blocks is represented by the control flow graph for most of the optimization phases. What this means is that there's no difference to the compiler between well-structured code and spaghetti code created by gotos. In the end, all those changes in control flow are turned into edges in the control flow graph (CFG), and the compiler optimizes them to the best extent possible. Relatively late in the compiler, edges in the CFG are explicitly represented as control flow instructions again."

Law also agrees that Apple's use of gotos is "…likely being more about cleaning up properly than optimization. You would see similar idioms in lots of code where cleanups (particularly releasing memory as is the case in this code fragment) are needed before returning from a function. C++ has significantly better support for this kind of cleanup idiom. GCC's C compiler also supports the 'cleanup' attribute, which brings similar capabilities to C code."
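For readers who haven't met the cleanup attribute Law mentions, here is a minimal sketch of how it can replace a goto-to-cleanup label. It relies on a GCC extension (Clang accepts it too) and is not standard C. The names free_buffer, process, and buffers_freed are hypothetical, and the counter exists only so the handler's effect can be observed.

```c
#include <stdlib.h>

/* Counts handler invocations; present only to make the effect observable. */
int buffers_freed = 0;

/* Cleanup handler: the compiler passes a pointer to the annotated variable. */
void free_buffer(char **buf)
{
    free(*buf);         /* free(NULL) is a harmless no-op */
    *buf = NULL;
    buffers_freed++;
}

/* Hypothetical routine with several exits and no explicit cleanup code. */
int process(int fail_early)
{
    /* free_buffer(&scratch) runs whenever scratch goes out of scope. */
    char *scratch __attribute__((cleanup(free_buffer))) = malloc(64);

    if (scratch == NULL)
        return -1;
    if (fail_early)
        return -2;      /* handler runs on this early exit */

    scratch[0] = '\0';
    return 0;           /* and on the normal exit */
}
```

Every return path releases the buffer without a label or a jump, which is roughly what a C++ destructor would do. The trade-off is portability: code written this way builds only with compilers that implement the extension.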

So the taboo against goto isn't entirely rational. Look at the SSLVerifySignedServerKeyExchange function — how would you structure it without goto? Would it actually be clearer? In that routine, goto is used to escape only out of linear flow logic, not loops. It's perhaps another thing to goto out of the middle of a while or for loop, as you may be making it more difficult to confirm the correctness of those loop structures.
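One way to answer the question is to put the two shapes side by side. The sketch below is an illustration under stated assumptions, not a rewrite of Apple's function: step_fn, step_pass, step_fail, release_buffers, and cleanups_run are all hypothetical names, with each step returning 0 on success as in the original.

```c
/* Hypothetical step type: returns 0 on success, nonzero on error. */
typedef int (*step_fn)(void);

int step_pass(void) { return 0; }
int step_fail(void) { return -1; }

/* Stand-in for the SSLFreeBuffer calls; counts how often cleanup ran. */
int cleanups_run = 0;
void release_buffers(void) { cleanups_run++; }

/* The idiom from the article: bail out to a single cleanup label. */
int run_with_goto(step_fn a, step_fn b, step_fn c)
{
    int err;

    if ((err = a()) != 0)
        goto fail;
    if ((err = b()) != 0)
        goto fail;
    if ((err = c()) != 0)
        goto fail;

fail:
    release_buffers();
    return err;
}

/* A goto-free shape: each step runs only if all earlier ones succeeded. */
int run_without_goto(step_fn a, step_fn b, step_fn c)
{
    int err = a();

    if (err == 0)
        err = b();
    if (err == 0)
        err = c();

    release_buffers();
    return err;
}
```

Both functions stop at the first failing step, run the cleanup exactly once, and return the first error. Whether the second is clearer is a judgment call, and it is no safer against editing slips: a stray duplicated err = c(); line would compile just as quietly.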

Even if it makes sense to use goto at times, the more important lesson to learn is about how much better languages have gotten in the last couple of decades. [It seems to me the lesson to learn here is to static check your code before checking it in. —Ed.] The advances in compiler optimization, paired with Moore's law, allow us to use languages that manage cleanup and compartmentalize logic far better than C. Not that Apple should rewrite Secure Transport in Python, but everyone should take the inherent dangers of C more seriously now that the performance imperative for it is not what it used to be.


Larry Seltzer is an independent writer, who previously was editor-in-chief of Byte.com.

