Andrew Koenig

Dr. Dobb's Bloggers

Social Processes and Heartbleed, Part 1

April 16, 2014

Well, it finally happened: We have seen a severe, widely publicized buffer-overrun bug, with wide-reaching effects, that will be difficult, slow, and expensive to fix. I think it is safe to say that all over the world, software managers are saying to developers: "Why didn't you warn me?" and many of those developers are answering: "We did; why didn't you listen?"

There is a constant tension in software development between doing it right and doing it quickly, and this tension is particularly visible in the area of security. So for the next few weeks, I am going to explore this tension. Two aspects of this tension are particularly important:

  • It doesn't matter how good a solution is to a problem if no one uses it.
  • It doesn't matter if someone fails to use a good solution for a good reason or for a bad one.

Let's begin with a seemingly trivial example: the gets function in the C standard library. This function takes a pointer as its argument; it reads characters from the standard input stream into consecutive memory addresses starting where the pointer points. It stops reading when it encounters end of file or a newline character, whichever comes first. If it encounters a newline character, it does not place that character in memory. When it is done, it appends a null character. As a result, what the programmer sees is whatever came from the standard input as a null-terminated C string, minus the newline character, if any, that ends the input line.

This function is both limited and convenient. C programmers often use it to read from the standard input in situations such as:

 
     char input[100];
     printf("Yes or no?\n");
     gets(input);
     /* and so on… */

For at least 30 years, many members of the C programming community have known that gets is unsafe and cannot be made safe. The reason, of course, is that its (only) parameter is a pointer to memory in which it is to place its result, and there is no way for gets to find out how much memory is available for its use. As a result, if there are enough characters in the standard input before the next newline, gets is guaranteed to write past the end of the memory that it was given; no action on the programmer's part can prevent this overrun.
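To make the problem concrete, here is a rough sketch of what a function with gets' interface has to do. The name unbounded_gets is hypothetical, and the stream parameter is an assumption added only so that the sketch can be run on something other than the standard input; it is not how any particular library implements gets. The point is the loop: it has nothing to compare the pointer against.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for gets, written only to show the flaw.
   It reads from `in` rather than stdin so it can be tried on any stream.
   Do not use it: like gets, it cannot know how large `dst` is. */
char *unbounded_gets(char *dst, FILE *in)
{
    assert(dst != NULL && in != NULL);
    char *p = dst;
    int c;
    while ((c = getc(in)) != EOF && c != '\n')
        *p++ = (char)c;   /* no bound to test against: p may run past the array */
    if (c == EOF && p == dst)
        return NULL;      /* end of file before any character was read */
    *p = '\0';            /* the newline, if there was one, is dropped */
    return dst;
}
```

However large the caller makes the array, a sufficiently long input line pushes p past its end, and nothing inside the function can detect that.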

Partly because of gets' lack of safety, it has a companion named fgets. This function takes three arguments: a pointer to memory into which to store data, an integer that gives the size of that memory, and a stream from which to read. One might think, therefore, that the previous code fragment could be made safe by rewriting it this way:

 
     char input[100];
     printf("Yes or no?\n");
     fgets(input, 100, stdin);
     /* and so on… */
 

Unfortunately, there is one more difference between gets and fgets: If fgets stops by reaching a newline character, it includes that newline character as part of the input, whereas gets excludes the newline. Therefore, this rewrite doesn't work: In order to achieve the same result, it is necessary to delete the newline if it is there. One might imagine doing so this way:

 
      /* This code doesn't work! */
     char input[100];
     printf("Yes or no?\n");
     fgets(input, 100, stdin);
     char *last = input + strlen(input) - 1;
     if (*last == '\n')
           *last = '\0';
     /* and so on… */
 

This code fails in obscure edge cases. If fgets reaches end of file before reading any characters, it returns a null pointer and leaves input unchanged, so strlen examines an array that was never initialized. If the first character it reads is a null character, strlen(input) is zero, so last will point to the character immediately before input. In either case, the result of this code fragment would be undefined; fixing the problem is left as an exercise for the reader.
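One possible answer to that exercise, offered as a sketch rather than a definitive fix: check the value that fgets returns before touching the buffer, and remove the newline with strcspn, which never looks before the start of the array. The helper name read_line and its parameters are hypothetical; nothing like it is part of the C library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the C library.
   Reads one line from `in` into buf (capacity `size`), dropping the newline.
   Returns buf, or NULL if nothing could be read. */
char *read_line(char *buf, int size, FILE *in)
{
    assert(buf != NULL && in != NULL && size > 0);
    if (fgets(buf, size, in) == NULL)
        return NULL;                 /* end of file or error: buf holds nothing usable */
    buf[strcspn(buf, "\n")] = '\0';  /* safe even when strlen(buf) is zero */
    return buf;
}
```

A caller would write something like read_line(input, sizeof input, stdin) and test the result against NULL. Note what the sketch does not handle: if the line is longer than the buffer, the remainder stays in the stream and is returned by the next call, so this is still not the drop-in replacement for gets that one might wish for.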

Once upon a time, I worked in an organization in which one of the managers was security-conscious enough to demand that gets be removed from the local C library. The result of doing so was constantly having to rewrite code that we got from elsewhere. It is not hard to imagine the resulting conversations:

--You know that code you sent me? We had to rewrite part of it so that it didn't use gets.
--What do you have against gets?
--<long explanation>
--Oh, that's interesting.
--We'll be happy to send you the revised code if you like.
--Sure, go ahead — but I can tell you right now that we're not going to be able to do anything about it; we can change code like that only if a customer complains.

Despite its known insecurity, gets was part of the C89 and C99 standards. It was finally removed from the C2011 standard; but when I checked my local implementation, it was still there. Even more interesting to me is that to my knowledge, there is still no function in the C library that is a safe, convenient alternative to gets.

I'd like to invite the C developers reading this to start a discussion: Did you know that gets was unsafe before you read about it here? Does your shop have a policy on the use of gets? Have you ever rewritten code to avoid using it? Anything else you want to tell us? I’ll continue the discussion next week.
