Social Processes and Heartbleed, Part 1
Why is gets still with us?
Well, it finally happened: We have seen a severe, widely publicized buffer-overrun bug, with wide-reaching effects, that will be difficult, slow, and expensive to fix. I think it is safe to say that all over the world, software managers are saying to developers: "Why didn't you warn me?" and many of those developers are answering: "We did; why didn't you listen?"
There is a constant tension in software development between doing it right and doing it quickly, and this tension is particularly visible in the area of security. So for the next few weeks, I am going to explore this tension. Two aspects of this tension are particularly important:
- It doesn't matter how good a solution is to a problem if no one uses it.
- It doesn't matter if someone fails to use a good solution for a good reason or for a bad one.
Let's begin with a seemingly trivial example: the gets function in the C standard library. This function takes a pointer as its argument; it reads characters from the standard input stream into consecutive memory addresses starting where the pointer points. It stops reading when it encounters end of file or a newline character, whichever comes first. If it encounters a newline character, it does not place that character in memory. When it is done, it appends a null character. As a result, what the programmer sees is whatever came from the standard input as a null-terminated C string, minus the newline character, if any, that ends the input line.
This function is both limited and convenient. C programmers often use it to read from the standard input in situations such as:
    char input[100];
    printf("Yes or no?\n");
    gets(input);
    /* and so on… */
For at least 30 years, many members of the C programming community have known that gets is unsafe and cannot be made safe. The reason, of course, is that its (only) parameter is a pointer to memory in which it is to place its result, and there is no way for gets to find out how much memory is available for its use. As a result, if there are enough characters in the standard input before the next newline, gets is guaranteed to write past the end of the memory that it was given; no action on the programmer's part can prevent this overrun.
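The behavior described above can be sketched in a few lines. This is a hypothetical re-implementation for illustration only, not the source of any real C library; the name sketch_gets is invented, and it takes a stream argument (which the real gets does not have) only so that it can be exercised on a file. The point to notice is that the loop has nothing to compare its write position against.

```c
#include <stdio.h>

/* Hypothetical re-implementation of the behavior described in the text.
   Not the library's actual code; `in` is an added parameter for testing. */
char *sketch_gets(char *dst, FILE *in)
{
    char *p = dst;
    int c;

    /* Copy characters until newline or end of file. No bound check is
       possible here, because the size of dst's memory is unknown. */
    while ((c = getc(in)) != EOF && c != '\n')
        *p++ = (char)c;

    if (c == EOF && p == dst)
        return NULL;    /* end of file before any character was read */

    *p = '\0';          /* the newline, if any, has been dropped */
    return dst;
}
```

Whatever size the caller chooses for the destination, an input line longer than that size makes the loop run past the end of it.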
Partly because of gets' lack of safety, it has a companion named fgets. This function takes three arguments: a pointer to memory into which to store data, an integer that gives the size of that memory, and a stream from which to read. One might think, therefore, that the previous code fragment could be made safe by rewriting it this way:
    char input[100];
    printf("Yes or no?\n");
    fgets(input, 100, stdin);
    /* and so on… */
Unfortunately, there is one more difference between gets and fgets: When fgets stops by reaching a newline character, it includes that newline character as part of the input, whereas gets excludes the newline. Therefore, this rewrite doesn't work: In order to achieve the same result, it is necessary to delete the newline if it is there. One might imagine doing so this way:
    /* This code doesn't work! */
    char input[100];
    printf("Yes or no?\n");
    fgets(input, 100, stdin);
    char *last = input + strlen(input) - 1;
    if (*last == '\n')
        *last = '\0';
    /* and so on… */
This code fails in an obscure edge case: If it is executed when the standard input stream has consumed all available characters but has not yet reached end of file, then fgets will effectively return a null string by making the first character of input a null character. If that happens, strlen(input) is zero, so last will point to the character immediately before input. The result of this code fragment would therefore be undefined; fixing the problem is left as an exercise for the reader.
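One possible answer to that exercise, offered as a sketch rather than as the intended solution: the helper below is hypothetical (read_line is not a C library function), and it assumes that silently leaving the rest of an overlong line in the stream is acceptable. It checks the value fgets returns and uses the standard strcspn function to find the newline, so it never forms a pointer to the character before the buffer, even when the string is empty.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the C library: reads at most size-1
   characters of one line from `in` and removes the trailing newline,
   if there is one. Returns buf, or NULL if nothing could be read. */
char *read_line(char *buf, int size, FILE *in)
{
    if (size < 1 || fgets(buf, size, in) == NULL)
        return NULL;

    /* strcspn yields the index of the first '\n', or the string's length
       if there is none; either way the index lies inside the buffer. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

With this helper, the earlier fragment would become a call such as read_line(input, sizeof input, stdin), followed by a test of the result against a null pointer. A line too long for the buffer is split across successive calls, which a real program would have to decide how to handle.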
Once upon a time, I worked in an organization in which one of the managers was security-conscious enough to demand that gets be removed from the local C library. As a result, we constantly had to rewrite code that we got from elsewhere. It is not hard to imagine the resulting conversations:
--You know that code you sent me? We had to rewrite part of it so that it didn't use gets.
--What do you have against gets?
--Oh, that's interesting.
--We'll be happy to send you the revised code if you like.
--Sure, go ahead — but I can tell you right now that we're not going to be able to do anything about it; we can change code like that only if a customer complains.
Despite its known insecurity, gets was part of the C89 and C99 standards. It was finally removed from the C2011 standard; but when I checked my local implementation, it was still there. Even more interesting to me is that to my knowledge, there is still no function in the C library that is a safe, convenient alternative to gets.
I'd like to invite the C developers reading this to start a discussion: Did you know that gets was unsafe before you read about it here? Does your shop have a policy on the use of gets? Have you ever rewritten code to avoid using it? Anything else you want to tell us? I’ll continue the discussion next week.