
Preventing Buffer Overruns in C++


Jan04: Preventing Buffer Overruns in C++

Richard is the author of Programming with Managed Extensions for Microsoft Visual C++ .NET (Microsoft Press, 2003). He can be contacted at [email protected].


It's rare for a day to pass without a software security advisory. Such updates are costly, at least in terms of obtaining the patches and the downtime needed to apply them. However, these costs are minuscule compared to the potential costs incurred if data is exposed to unauthorized access or if network bandwidth is consumed by denial-of-service attacks. In today's computing world, security is the biggest issue, and preventing security breaches is the most important work you can do.

None of the current operating systems can be considered safe, regardless of vendor assurances. Equally, you should not accept blithe statements that one particular system is inherently less secure than another; UNIX and Linux, for instance, are just as vulnerable to attacks as Windows. Attackers want to take over your machine to steal your data or to launch attacks on another machine; electronic vandals want to bring down your machine, deny you access to the network, or deny authorized users access to your machine. Any or all of this is possible if your machine has a buffer overrun vulnerability. As a developer, it's your responsibility to make sure that your code does not expose one. In this article, I describe how to use the Visual C++.NET compiler and libraries to rid your code of buffer overruns.

What are Buffer Overruns?

Consider the function in Example 1, which allocates a buffer on the stack and formats a string based on the parameter passed to the function. The formatting procedure is simply to copy the string hello into the temporary buffer, then append the string pointed to by the parameter.

On the surface, this function looks fine. However, the stack-allocated buffer can only hold nine characters (10 if you count the terminating NUL character). Since the hello string, with its trailing space, takes up six characters, there is only space for names of three characters or fewer! If the parameter points to a string that has more than three characters, this string is still appended to the string already in the buf variable, meaning that the extra characters will be copied into memory that is not assigned to the buffer. The stack is used to hold various information. Figure 1 shows the contents of the stack after strcpy has been called, but before strcat is called. As you can see, the memory above the stack pointer contains the address of the instruction that executes after PrintHello returns and, since the function is __cdecl, the return address is followed by the parameters of the function (in this case, a pointer to a string).
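Example 1 is not reproduced on this page. The following is a minimal sketch of what such a function might look like, assuming the name PrintHello and the 10-byte buf from the text. It is a reconstruction, not the original listing, and the FitsInHelloBuffer helper is a hypothetical addition that only makes the size arithmetic explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Sketch of the vulnerable pattern described in the text (not the original listing).
// buf holds 10 bytes and "hello " uses 6, so only 3 characters plus the NUL remain.
void PrintHello(const char* name) {
    char buf[10];
    std::strcpy(buf, "hello ");  // writes 7 bytes, including the NUL
    std::strcat(buf, name);      // UNSAFE: no check that name fits in what is left
    std::puts(buf);
}

// Hypothetical helper: true if "hello " + name + NUL fits in bufSize bytes.
bool FitsInHelloBuffer(const char* name, std::size_t bufSize) {
    return std::strlen("hello ") + std::strlen(name) + 1 <= bufSize;
}
```

With a 10-byte buffer, FitsInHelloBuffer("Bob", 10) holds but FitsInHelloBuffer("Jenny", 10) does not: "hello Jenny" needs 12 bytes including the terminator.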

The function's automatic variables are allocated on the stack before the stack pointer. The alignment of variables means that, although the size of the buffer buf is 10 characters, the start of the buffer is 12 bytes from the return address. Thus, alignment gives us an extra two characters. If the string passed to PrintHello is Jenny, then the stack looks like Figure 2. As you can see, the NUL character at the end of the new string is just before the return address; if the string were any longer, it would overwrite the return address.

A buffer overrun attack exploits this by passing data that is large enough to overwrite the return address, but is carefully constructed so that the return address is replaced with a valid address. This "valid" address actually refers to rogue code that the intruders have supplied, typically within the same data.

What Can I Do About Buffer Overruns?

If you create a Visual C++.NET Win32 console project to test this code, you will find that the stack looks different from what I have shown. There are several reasons for this. First, to simplify things, I've shown the stack for an optimized release build that does not use exception handling. Other build configurations will have other values on the stack between the local variables and the return address (see http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vstechart/html/vctchcompilersecuritychecksindepth.asp).

The second reason why the stack may appear different is that, by default, the projects generated by the Visual C++.NET wizards turn on stack checking code, while I've presented code without stack checking. By default, for debug builds, the wizard uses /RTCs, which turns on runtime checks of the state of the stack. This setting tests for buffer overruns and underruns and tells you when either happens. The compiler places guard blocks of a known value around buffers and adds code that runs when the function returns to check whether the guard blocks have changed. The /RTCs option is valuable during the development phase and helps you identify code that is vulnerable to buffer overruns. However, detecting overruns with /RTCs depends on your test data, so it is possible that some bugs will not be detected.

Microsoft recognizes this and provides the /GS switch, which adds code that detects when the return address has been overwritten. It does this by placing a cookie on the stack. This value is located between the supposed end of the buffer and the return address so that if a buffer overrun occurs, the cookie is overwritten, too. The program knows the cookie's value, a random number generated for the module at runtime, so rogue code cannot guess what it will be. When the function finishes, a check is performed on the cookie; if the cookie has changed, the return address on the stack is not used (because it may be compromised), the condition is reported, and the application is shut down. The /GS switch is not confined to debug code; it works in release code, and you should use it for all of your code. Again, the format of the stack in your code may not be the same as in Figure 1 because, by default, the Visual C++.NET project wizards turn on /GS.
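The cookie idea can be illustrated with a toy model. This is only a sketch under stated assumptions: a plain byte array stands in for a stack frame laid out as a 12-byte buffer followed by a cookie, and the names kBufSize, kCookieSize, and CopyLeavesCookieIntact are hypothetical. The check the compiler actually generates for /GS differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Toy model only: one byte array stands in for a stack frame laid out as
// [ 12-byte buffer | cookie ]. This is NOT the code the compiler emits.
const std::size_t kBufSize = 12;
const std::size_t kCookieSize = sizeof(unsigned);

// Stores the cookie after the buffer, copies len bytes into the buffer
// region without checking kBufSize, then reports whether the cookie survived.
bool CopyLeavesCookieIntact(const char* data, std::size_t len, unsigned cookie) {
    unsigned char frame[kBufSize + kCookieSize];
    std::memset(frame, 0, sizeof(frame));
    std::memcpy(frame + kBufSize, &cookie, kCookieSize);

    // Clamp to the toy frame so the demonstration itself stays well defined.
    if (len > sizeof(frame)) len = sizeof(frame);
    std::memcpy(frame, data, len);  // may run past the 12-byte buffer

    unsigned after = 0;
    std::memcpy(&after, frame + kBufSize, kCookieSize);
    return after == cookie;  // a changed cookie means the buffer was overrun
}
```

A 12-byte copy leaves the cookie untouched, while a 16-byte copy overwrites it, which is the condition the function epilogue would report.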

/RTCs and /GS are attempts to prevent bad code from compromising the security of your machine. If a buffer overrun is detected, your application shuts down, so it is clearly in your interest to make sure that buffer overruns do not occur.

Preventing Buffer Overruns

The code I've presented here is typical of the kind that lets buffer overruns occur. If you use the C Runtime Library, you should carefully check for vulnerable functions that could cause overruns. In Appendix A of their book, Writing Secure Code (Microsoft Press, 2002), Michael Howard and David LeBlanc list the CRT and Win32 functions that expose your code to security vulnerabilities, including buffer overruns. In essence, these are functions that fill a user-allocated buffer. If your code uses any of these functions, carefully check to ensure that you are allocating sufficiently large buffers. For example, if your code uses strcat, a potential buffer overrun is waiting to happen, so you should replace this with a call to strncat and provide the maximum number of characters that should be copied. Be careful when using strncat: pass the space remaining in the buffer, not the buffer's total size, because once you are joining several strings together the two are no longer the same. The count also excludes the terminating NUL that strncat always appends, so leave one byte for it.
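The strncat advice can be sketched as follows. FormatHello is a hypothetical helper, not code from the article; it assumes the same "hello " prefix and computes the remaining space before appending.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: builds "hello <name>" in dest, truncating name if needed.
// Returns true only if the whole name fitted.
bool FormatHello(char* dest, std::size_t destSize, const char* name) {
    static const char prefix[] = "hello ";
    if (dest == nullptr || name == nullptr || destSize < sizeof(prefix)) {
        return false;
    }
    std::strcpy(dest, prefix);  // safe: checked above that prefix plus NUL fits

    // Space left for characters, keeping one byte for the NUL strncat appends.
    std::size_t remaining = destSize - std::strlen(dest) - 1;
    std::strncat(dest, name, remaining);
    return std::strlen(name) <= remaining;
}
```

Passing destSize itself as the count would be the mistake the text warns about: with a 10-byte buffer that already holds "hello ", only three more characters fit.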

Safe String Functions

During February and March 2002, all application development in Microsoft stopped and developers took part in the Security Push initiative. The goal was to check all code for possible security vulnerabilities and fix those problems. One of the outcomes of the Security Push was a library of safe string functions called "strsafe.lib" with an associated header called "strsafe.h." This library is available through the Platform SDK that can be downloaded from the MSDN web site and is automatically installed as part of Visual C++.NET 2003.

These functions prevent many security vulnerabilities. Each function has a parameter for the size of the destination buffer so that data is not written beyond the end of the buffer. Each function also guarantees that, when it completes, the destination buffer is NUL terminated. This second property is important because many of the CRT string functions assume that a string is NUL terminated and they will move through the string until a NUL character is found. If a string is not NUL terminated, such functions could access memory beyond the end of the string buffer. If you are wondering why this is such an issue, take a look at the documentation for strncpy, the safer form of strcpy. strncpy is passed the maximum number of characters that can be copied into the destination buffer, and if the length of the source string is greater than or equal to this size, the destination buffer is filled to its capacity, but no NUL terminator is written.
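The strncpy behavior is easy to confirm. In this sketch the function names are hypothetical; the first function reports whether strncpy wrote a terminator into a 4-byte buffer, and the second shows one common way to terminate by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if strncpy left a NUL somewhere inside a 4-byte destination.
bool StrncpyLeavesTerminator(const char* src) {
    char dest[4];
    std::memset(dest, 'X', sizeof(dest));   // make a missing NUL detectable
    std::strncpy(dest, src, sizeof(dest));  // fills dest, may omit the NUL
    return std::memchr(dest, '\0', sizeof(dest)) != nullptr;
}

// Copies at most destSize - 1 characters and always terminates the result.
void CopyAndTerminate(char* dest, std::size_t destSize, const char* src) {
    if (dest == nullptr || src == nullptr || destSize == 0) return;
    std::strncpy(dest, src, destSize - 1);
    dest[destSize - 1] = '\0';
}
```

A three-character source is terminated, but a source of four or more characters leaves the 4-byte buffer without a NUL, so a later strlen on it would read past the end.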

Using these functions is straightforward. The first choice you have to make is whether you want the functions to be defined inline in your source file or to use the version in the static library. If you choose the latter, you should define the symbol STRSAFE_LIB before you include strsafe.h. If you use this header in an existing file, you'll find that the file no longer compiles, and you'll get some strange errors. For example, the code in Example 2(a) is valid, although the compiler complains with Example 2(b).

Digging into strsafe.h indicates what is happening. This header file does two things depending on the symbols that have been defined. If the symbol DEPRECATE_SUPPORTED is defined, the header deprecates CRT and Win32 string functions using the deprecated() pragma, which causes the warning C4995 to be issued. If DEPRECATE_SUPPORTED is not defined, you get the error C2065 because, without this symbol, the header file uses the C preprocessor to redefine the names of the replaced functions. In our case, the header defines Example 2(c). This means that the C++ compiler no longer sees strcpy, but instead sees a symbol that does not exist; the name of this symbol is used to tell you how to fix the code.
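The renaming trick can be sketched with hypothetical names (unsafe_copy and BoundedCopy are illustrative, not identifiers from strsafe.h). Stringizing the macro shows the replacement name that the compiler would report as undeclared if the old function were called.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical names: after this define, every use of unsafe_copy in the
// translation unit is rewritten to an identifier that is never declared,
// so the resulting "undeclared identifier" error carries the advice.
#define unsafe_copy unsafe_copy_instead_use_BoundedCopy

// Two-level stringizing so the macro is expanded before being quoted.
#define STRINGIZE_RAW(x) #x
#define STRINGIZE(x) STRINGIZE_RAW(x)

// Returns the name the compiler actually sees in place of unsafe_copy.
const char* NameSeenByCompiler() {
    return STRINGIZE(unsafe_copy);
}
```

Calling unsafe_copy(...) anywhere after the define would fail to compile, and the error would name unsafe_copy_instead_use_BoundedCopy.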

This error indicates that I should use StringCbCopyA or StringCchCopyA instead of strcpy. This is typical of all the string functions, where you have the choice of a version that counts the characters (StringCchCopyA) or one that counts the bytes (StringCbCopyA) that will be copied. Example 3 is the safer version of the string copy. The sizeof operator returns the size of the buffer in bytes, so I have used the StringCbCopy function. The function returns an HRESULT to indicate whether the call was successful. It's important to check this return value before using a buffer. The strsafe.h header lists the possible failure values. One of these values, STRSAFE_E_INVALID_PARAMETER, indicates that the size you provided for the destination buffer is zero or larger than the maximum allowed, STRSAFE_MAX_CCH. STRSAFE_MAX_CCH has a value of 2,147,483,647, although you are unlikely to use strings this large!
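Example 3 is not reproduced on this page. The following sketch shows the general shape of a checked byte-counted copy. On Windows it calls StringCbCopyA from strsafe.h; elsewhere it falls back to a stand-in written for this sketch, which is an assumption: it mimics only the truncate-and-terminate behavior and uses placeholder failure codes, not the real STRSAFE_E_ values. CopyChecked is a hypothetical wrapper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#ifdef _WIN32
#include <windows.h>
#include <strsafe.h>
#else
// ASSUMPTION: minimal stand-in so the sketch builds off Windows. It is not
// the real library and its failure codes are placeholders.
typedef long HRESULT;
#define SUCCEEDED(hr) ((hr) >= 0)
static HRESULT StringCbCopyA(char* dest, std::size_t cbDest, const char* src) {
    if (dest == nullptr || src == nullptr || cbDest == 0) return -1;
    std::size_t len = std::strlen(src);
    std::size_t n = len < cbDest ? len : cbDest - 1;  // leave room for the NUL
    std::memcpy(dest, src, n);
    dest[n] = '\0';                                   // always terminated
    return len < cbDest ? 0 : -2;                     // failure if truncated
}
#endif

// Hypothetical wrapper: copies text into buf (cbBuf bytes) and returns true
// only if the whole string was copied; check this before using buf.
bool CopyChecked(char* buf, std::size_t cbBuf, const char* text) {
    HRESULT hr = StringCbCopyA(buf, cbBuf, text);
    return SUCCEEDED(hr);
}
```

A typical call passes sizeof for the byte count, for example CopyChecked(buf, sizeof(buf), "hello ") with char buf[10]. When the source is too long the call fails, but the buffer still holds a terminated, truncated string.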

In addition to providing the basic functionality to replace unsafe string functions, strsafe.h also provides extended functions, such as StringCbCopyEx (Example 4), that offer more control. The additional parameters let you get more information about the operation and alter how the copy occurs. If you pass a non-NULL ppszDestEnd, then when StringCbCopyEx returns it points to the end of the copied string in the destination buffer, that is, to the terminating NUL. If pcbRemaining is not NULL, then when the function returns, it contains the number of bytes (or characters for StringCchCopyEx) left in the destination buffer. The dwFlags parameter is used to specify details about the copy; for example, you can specify the value that will be used to fill the uninitialized part of the buffer and you can indicate how the function will treat NULL strings.
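The end pointer and remaining count are useful for chaining copies. The sketch below uses a hypothetical portable function, BoundedCopyEx, modeled loosely on the extended copy described above; it is not the strsafe.h implementation and omits the flags parameter. It assumes that the end pointer addresses the terminating NUL and that the remaining count includes the terminator's byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in. Copies src into dest (cbDest bytes), always NUL
// terminating. On return *destEnd points at the terminating NUL and
// *cbRemaining is the number of bytes from there to the end of the buffer.
// Returns false if src had to be truncated or an argument was invalid.
bool BoundedCopyEx(char* dest, std::size_t cbDest, const char* src,
                   char** destEnd, std::size_t* cbRemaining) {
    if (dest == nullptr || src == nullptr || cbDest == 0) return false;
    std::size_t len = std::strlen(src);
    std::size_t n = len < cbDest ? len : cbDest - 1;
    std::memcpy(dest, src, n);
    dest[n] = '\0';
    if (destEnd != nullptr) *destEnd = dest + n;
    if (cbRemaining != nullptr) *cbRemaining = cbDest - n;
    return len < cbDest;
}

// Builds "hello <name>" by feeding the end pointer and remaining count of
// the first copy into the second, so no length is recomputed by hand.
bool HelloChained(char* buf, std::size_t cbBuf, const char* name) {
    char* end = nullptr;
    std::size_t remaining = 0;
    if (!BoundedCopyEx(buf, cbBuf, "hello ", &end, &remaining)) return false;
    return BoundedCopyEx(end, remaining, name, nullptr, nullptr);
}
```

Because each call reports where it stopped and how much room is left, the second copy cannot run past the buffer even when the name is too long; it is truncated and the function reports failure.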

The safe string library has functions to copy, append, and format strings. It also has functions to return the length of a string and to obtain a string from stdin. It is important to note that these functions do not use the Win32 API, so using them does not bring any other library dependency into your project. It is worth your time to review your existing code and to use the safe string methods in place of the CRT and Win32 string functions.

Conclusion

It's not a safe world and there will always be someone who tries to exploit the smallest vulnerability. Visual C++.NET lets you protect yourself from such exploits by shutting down your process if the stack becomes compromised, so you should always compile your code with /GS. However, protecting yourself from a vulnerability after it has been exploited is far from being satisfactory, so you should carefully review your code and ensure that there are no vulnerabilities. The Visual C++.NET compiler provides runtime checking switches so that you can pick up stack overruns in your debug builds. The best solution, of course, is to never allow buffer overruns to occur and, to this end, you should use the strsafe.h library in the Platform SDK in place of the inherently unsafe C Runtime and Win32 string libraries.

DDJ

