
StringPrintf: A Typesafe printf Family for C++


August, 2005

Stefan Wörthmüller is a software developer in Berlin who specializes in cross-platform development, user interfaces, and reengineering. He can be contacted at [email protected].




As every experienced C/C++ programmer knows, printf can crash a program when "%s" is paired with an argument that is not a NULL-terminated C string (or with no argument at all). For example, executing:

printf("%s");
might print garbage or raise an access violation, depending on what values are on the stack. Using "%s" in general opens opportunities for exploitation when output is not limited to the size of the target buffer.

This is bad and anachronistic in times of STL, std::string, and typesafe programming, and it has led project managers to ban C strings and printf altogether. Then there is C++'s iostreams, which was meant to be printf's successor in C++. However, it has drawbacks of its own [1].

With this in mind, I present in this article StringPrintf, a typesafe version of the printf function family. StringPrintf can be used with std::string and with C strings. In addition, it can do everything that sprintf does and is compatible with all standard printf format strings. In fact, internally, it uses snprintf itself. The big difference is that it checks (and mostly ignores) the type character of the format string, choosing instead the type appropriate for the argument supplied. The type character is used for legal conversions (such as supplying an int for a %c) and compatibility. If the argument is missing, the part of the format string remains unchanged in the resulting string.

I implemented StringPrintf in multiple versions that can replace printf, sprintf, snprintf, and fprintf. Using a couple of #defines (supplied in StringPrintf.h, available at http://www.cuj.com/code/), StringPrintf can replace these calls in existing code. This resolves all program faults caused by a mismatch between the format string and the arguments passed. When compiling in debug mode, StringPrintf prints warning messages for type mismatches and for missing or surplus arguments. However, it cannot prevent buffer overflows of C strings, because a function that receives only a char pointer cannot know the size of the buffer behind it. Overflows can be prevented only by adding a buffer-length argument to the existing sprintf calls (as snprintf does) or by switching to std::string, which is fully supported. std::string objects can be passed instead of C strings both as input parameters (using %s) and as the output buffer.

StringPrintf is implemented using techniques most current C++ compilers support:

  • Supply different functions for different numbers of arguments (instead of using a variable argument list vararg, which is one main source of problems).
  • Use unnamed temporary objects of a helper class CPrintfArg as arguments (see the sidebar entitled "Unnamed Temporary C++ Objects" for a brief description). This helper class keeps track of what type was supplied. It also makes it possible to implement conversion functions for user-defined types.
  • Check and/or ignore the type character part of the format string.

The result is a printf-like function that is type and error safe. For instance, writing:

std::string s1;
StringPrintf(s1, "int = %d; float = %3.1f", -12, 3572.24);

results (exactly as using sprintf) in s1 containing:

"int = -12; float = 3572.2"

Unlike with sprintf, though, supplying incorrect or too few arguments does no harm to the running program, although the resulting string might not have exactly the expected content.

Using wrong types, as for example:

StringPrintf(s1, "int = %s; float = %3f", 3572.24, -12);

results in s1 containing:

"int = 3572.247070; float = -12"

This is due to the fact that the data type of the arguments determines the conversion done, rather than the type character of the format string. StringPrintf can also handle missing arguments:

StringPrintf(s1, "int = %s; float = %3d, %s, %d, %s", 3572.24, -12);

results in s1 containing:

"int = 3572.247070; float = -12, %s, %d, %s"

This is done (in contrast to printf) not with a vararg list but with a separate implementation of StringPrintf for each number of arguments. Consequently, format specifications are left as they are if the corresponding argument is missing. See Listing 1 for more examples.

Implementation

The implementation of StringPrintf is quite straightforward. I started by creating a small helper class named CPrintfArg, which can hold one value of any built-in data types. For every different data type, it has a constructor that copies the value (or holds a pointer to it, whatever is less expensive). Each constructor also keeps track of the type used (see Listing 2).
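Listing 2 is not reproduced here. As a rough illustration of the idea only, a holder of this kind might look like the following sketch. The names ArgSketch, Type, typeOf, and the accessor functions are hypothetical and are not taken from the StringPrintf sources; only four argument types are covered.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of a CPrintfArg-like holder (not the article's code).
class ArgSketch {
public:
    enum Type { INT, DOUBLE, CSTRING, STDSTRING };

    // One constructor per supported type; each records the type it was given.
    ArgSketch(int v)                : type_(INT),       i_(v), d_(0.0), s_(0), str_(0) {}
    ArgSketch(double v)             : type_(DOUBLE),    i_(0), d_(v),   s_(0), str_(0) {}
    ArgSketch(const char *v)        : type_(CSTRING),   i_(0), d_(0.0), s_(v), str_(0) {}
    // A std::string is not copied; only a pointer to the caller's object is kept.
    ArgSketch(const std::string &v) : type_(STDSTRING), i_(0), d_(0.0), s_(0), str_(&v) {}

    Type type() const { return type_; }
    int asInt() const { assert(type_ == INT); return i_; }
    double asDouble() const { assert(type_ == DOUBLE); return d_; }
    const char *asText() const {
        assert(type_ == CSTRING || type_ == STDSTRING);
        return type_ == CSTRING ? s_ : str_->c_str();
    }

private:
    Type type_;
    int i_;
    double d_;
    const char *s_;
    const std::string *str_;
};

// Because the parameter type is always ArgSketch, the compiler builds an
// unnamed temporary from whatever the caller passes.
ArgSketch::Type typeOf(const ArgSketch &arg) { return arg.type(); }
```

Calling typeOf(-12) or typeOf("text") compiles without any cast; the temporary lives until the end of the full expression, which is long enough for a formatting call to read it.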

Having created CPrintfArg, it is possible to write a family of StringPrintf functions, one for each number of arguments:

size_t StringPrintf(std::string &out, 
 const std::string &fmt, const CPrintfArg&);
size_t StringPrintf(std::string &out, 
 const std::string &fmt, 
  const CPrintfArg&, const CPrintfArg&);

I provided 12 functions with up to 12 arguments. (Of course, it would be easy to add overloads with more parameters, but a construction using modern template techniques might be a way to support any number of arguments generically; see Andrei Alexandrescu's Modern C++ Design.)
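One way such a generic construction could look today is a variadic template, a language feature from C++11 that postdates this article and is not what StringPrintf uses. The sketch below is an assumption for illustration: TaggedArg, collectArgs, and typeAt are hypothetical names, and the stand-in class records only the type of each argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CPrintfArg: records only the argument's type.
struct TaggedArg {
    enum Type { INT, DOUBLE, CSTRING };
    Type type;
    TaggedArg(int)          : type(INT) {}
    TaggedArg(double)       : type(DOUBLE) {}
    TaggedArg(const char *) : type(CSTRING) {}
};

// One variadic template replaces the fixed set of overloads: every argument
// is wrapped in a TaggedArg and collected in call order.
template <typename... Args>
std::vector<TaggedArg> collectArgs(const Args &... args) {
    return std::vector<TaggedArg>{ TaggedArg(args)... };
}

// Checked access to the type recorded for argument number `index`.
TaggedArg::Type typeAt(const std::vector<TaggedArg> &args, std::size_t index) {
    assert(index < args.size());
    return args[index].type;
}
```

A formatting back end would then receive the vector instead of an array built by hand in each overload.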

Having done this, you can write:

StringPrintf(s1, "Int = %d; Float = %3.1f", -12, 3572.24);

which is expanded by the compiler to:

StringPrintf(s1, "Int = %d; Float = %3.1f",
             CPrintfArg(-12), CPrintfArg(3572.24));

The 12 StringPrintf functions are only wrappers for one background function, xStringPrintf (see Listing 3 at http://www.cuj.com/code/), which does the work for any number of arguments, supplied as an array of pointers to CPrintfArg. xStringPrintf scans the string for "%", the starting character of every printf format specification. After finding the end of the specification, it passes the format specification for that single value to the function CPrintfArg::Out of the corresponding argument object, which does the conversion. The result is appended to the output string.
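The scanning step can be sketched roughly as follows. This is not Listing 3: StubArg, specEnd, xExpand, and expand are hypothetical names, the stand-in argument simply returns fixed text instead of doing a real conversion, and only the one- and two-argument wrappers are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for CPrintfArg: ignores the specification it is given.
struct StubArg {
    std::string text;
    StubArg(const char *t) : text(t) {}
    std::string out(const std::string & /*spec*/) const { return text; }
};

// Index one past the conversion character of the specification starting at
// fmt[pos], or npos if the specification is never terminated.
std::size_t specEnd(const std::string &fmt, std::size_t pos) {
    assert(pos < fmt.size() && fmt[pos] == '%');
    const std::size_t end = fmt.find_first_of("diouxXeEfFgGaAcspn%", pos + 1);
    return end == std::string::npos ? end : end + 1;
}

// Common back end: works on an array of pointers, whatever its length.
std::string xExpand(const std::string &fmt, const StubArg *const args[],
                    std::size_t count) {
    std::string result;
    std::size_t next = 0;  // index of the next unused argument
    std::size_t pos = 0;
    while (pos < fmt.size()) {
        if (fmt[pos] != '%') { result += fmt[pos++]; continue; }
        const std::size_t end = specEnd(fmt, pos);
        if (end == std::string::npos) {  // unterminated: copy the rest verbatim
            result.append(fmt, pos, std::string::npos);
            break;
        }
        const std::string spec = fmt.substr(pos, end - pos);
        if (spec == "%%") result += '%';
        else if (next < count) result += args[next++]->out(spec);
        else result += spec;  // missing argument: leave the text unchanged
        pos = end;
    }
    return result;
}

// Thin wrappers, one per argument count, that build the pointer array.
std::string expand(const std::string &fmt, const StubArg &a1) {
    const StubArg *args[] = { &a1 };
    return xExpand(fmt, args, 1);
}
std::string expand(const std::string &fmt, const StubArg &a1, const StubArg &a2) {
    const StubArg *args[] = { &a1, &a2 };
    return xExpand(fmt, args, 2);
}
```

With two arguments and five specifications, the last three specifications are copied to the result as they stand, which mirrors the missing-argument behavior described earlier.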

CPrintfArg::Out itself uses snprintf, but it always uses the appropriate type specification for the printf format string. It checks the type character passed in the format string and leaves it unchanged if it matches the argument data type or when conversion is legal. Otherwise, it uses the type character that matches the data type of the argument supplied. Also, buffer overflows are prevented by using snprintf instead of sprintf.
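A minimal sketch of this check-and-replace step, assuming only int and double arguments, might look like this. fixSpec, outInt, and outDouble are hypothetical names; the sets of "legal" type characters and the decision to drop length modifiers and "*" are simplifying assumptions, not the rules CPrintfArg::Out actually applies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Rebuild one specification (for example "%3f") so that its type character
// suits the argument: keep it if it appears in `legal`, else use `fallback`.
std::string fixSpec(const std::string &spec, const char *legal, char fallback) {
    assert(spec.size() >= 2 && spec[0] == '%');
    std::string fixed;
    for (std::size_t i = 0; i + 1 < spec.size(); ++i) {
        // Keep '%', flags, width, and precision. Length modifiers and '*'
        // are dropped because they would change what snprintf reads.
        if (std::strchr("hlLjzt*", spec[i]) == 0)
            fixed += spec[i];
    }
    const char conv = spec[spec.size() - 1];
    fixed += (std::strchr(legal, conv) != 0) ? conv : fallback;
    return fixed;
}

// Format an int; %d, %i, and %c are treated as legal, anything else becomes %d.
std::string outInt(const std::string &spec, int value) {
    char buf[512];
    std::snprintf(buf, sizeof buf, fixSpec(spec, "dic", 'd').c_str(), value);
    return std::string(buf);
}

// Format a double; the floating-point conversions are kept, others become %f.
std::string outDouble(const std::string &spec, double value) {
    char buf[512];
    std::snprintf(buf, sizeof buf, fixSpec(spec, "feEgG", 'f').c_str(), value);
    return std::string(buf);
}
```

Here outInt("%3f", -12) formats with "%3d", and outDouble("%s", 0.5) formats with "%f", so snprintf never sees a type character that disagrees with the value it is handed.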

I've tested the source code (see Listings 3 and 4 at http://www.cuj.com/code/) under both Windows (Visual C++ 6.0) and Linux (gcc 3.3). When the preprocessor symbol DEBUG is defined, StringPrintf generates warning messages if format strings do not match the arguments supplied. This is done by calling the function:

void StringPrintfErrorMessage(const string &s);

It usually calls cout << s. Compiling under Windows with _CONSOLE not defined displays a dialog box instead, by calling the Win32 function MessageBox. Figure 1 shows the warning message that results from passing a mismatched argument type. Figure 2 results from passing too few arguments to a StringPrintf call.

Using User-Defined Types

Because all StringPrintf functions use the same argument type, it is easy to declare custom conversions for user-defined types. All that's needed is a function that converts a user-defined object to an object of type CPrintfArg. Given a user-defined type myUserClass that should have a certain representation as a string, all you need is a member function operator CPrintfArg() (see Listings 5 and 6 at http://www.cuj.com/code/).

The object User1 is converted by the compiler to a temporary object of type CPrintfArg by calling the member function operator CPrintfArg(), which returns an object that contains a string. Because the object returned by operator CPrintfArg() must have a lifetime exceeding the body of the function, the object passed to the CPrintfArg constructor may not be a local stack object. Therefore, the additional parameter CPrintfArg::DELETE_PARM is passed to the CPrintfArg constructor, which tells CPrintfArg to delete the object passed on destruction. This, of course, is a pitfall. Omitting CPrintfArg::DELETE_PARM would lead to an error, as an already destructed string would be referenced.
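Listings 5 and 6 are not reproduced here. The ownership issue can be sketched as follows, under stated assumptions: ArgSketch, Point, and render are hypothetical names, and although the flag is called DELETE_PARM after the article, the constructor signature and copy behavior shown are guesses, not the real CPrintfArg interface.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical holder that either borrows a string or takes ownership of it.
class ArgSketch {
public:
    enum Ownership { BORROW, DELETE_PARM };

    explicit ArgSketch(const std::string *s, Ownership own = BORROW)
        : s_(s), owns_(own == DELETE_PARM) { assert(s != 0); }

    // A copy of an owning object gets its own string, so each is deleted once.
    ArgSketch(const ArgSketch &o)
        : s_(o.owns_ ? new std::string(*o.s_) : o.s_), owns_(o.owns_) {}

    ~ArgSketch() { if (owns_) delete s_; }

    const std::string &text() const { return *s_; }

private:
    ArgSketch &operator=(const ArgSketch &);  // assignment is not needed here
    const std::string *s_;
    bool owns_;
};

// Hypothetical user-defined type with a conversion to the argument class.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    operator ArgSketch() const {
        std::ostringstream os;
        os << "(" << x_ << ", " << y_ << ")";
        // The string must outlive this function, so it lives on the heap and
        // the returned object is told to delete it.
        return ArgSketch(new std::string(os.str()), ArgSketch::DELETE_PARM);
    }
private:
    int x_, y_;
};

// Stand-in for a formatting function that takes the common argument type.
std::string render(const ArgSketch &arg) { return arg.text(); }
```

Passing a Point to render() triggers the conversion operator implicitly. Building the string as a local variable inside the operator and passing its address without DELETE_PARM would leave the returned object pointing at a destroyed string.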

Limitations

In this implementation, CPrintfArg::Out has certain limitations. For simplicity, everything is passed to snprintf, so fixed-size buffers are used in most cases. Also, erroneous format strings are handled differently from sprintf: StringPrintf rejects wrong type characters, and it probably cannot handle the (more or less) arbitrary string and format lengths that sprintf can.

StringPrintf uses a local C string buffer of ARG_BUFFER_SIZE bytes, which is set to 512, for every argument. This is sufficient for all possible 64-bit double numbers, but not for 80-bit long doubles, which might become more than 4000 digits long. The ARG_BUFFER_SIZE limit (per argument) includes the leading and trailing blanks or zeros that sprintf might insert. All output exceeding this limit is truncated by snprintf. The handling is different for strings (C strings and std::string), though, where StringPrintf always allocates a temporary buffer of the size strlen(string) + ARG_BUFFER_SIZE.
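The effect of the per-argument limit can be illustrated with a small sketch. The constant name and value follow the text above; formatBounded and its truncated flag are hypothetical and assume a C99-conforming snprintf, which returns the length the complete output would have had.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

const std::size_t ARG_BUFFER_SIZE = 512;

// Format one double into a fixed local buffer and report whether the output
// had to be cut off at ARG_BUFFER_SIZE - 1 characters.
std::string formatBounded(double value, int width, bool *truncated) {
    assert(truncated != 0);
    char buf[ARG_BUFFER_SIZE];
    buf[0] = '\0';  // stays empty if snprintf reports an error
    // snprintf never writes past the buffer and always terminates the string.
    const int needed = std::snprintf(buf, sizeof buf, "%*f", width, value);
    *truncated = needed < 0 || static_cast<std::size_t>(needed) >= sizeof buf;
    return std::string(buf);
}
```

A field width of 600 asks for more characters than the buffer holds, so the result is cut to 511 characters and the flag is set; the padding counts against the limit just like the digits do.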

Also for simplicity, I did not include all possible built-in types—only those that otherwise cannot be converted. For example, I left out all short types in CPrintfArg. Nevertheless, StringPrintf certainly works with short arguments—the compiler promotes them to integer types. In addition, StringPrintf certainly produces some overhead when creating temporary objects and allocating memory for strings. But after all, StringPrintf has been used for quite a while in large applications with thousands of sprintf calls without any additional problems.

Conclusion

StringPrintf shows that typesafe versions of the printf family can be implemented using common C++ compilers. Measured against the printf defects that Walter Bright lists [1], StringPrintf resolves most of them: it is typesafe, supports generic programming and user-defined types, and cannot corrupt programs through invalid format strings. These features can be used in existing code just by including StringPrintf.h and linking the sources. By switching to std::string, StringPrintf can also prevent buffer overflows without abandoning printf-style formatting.

Acknowledgments

Thanks to Olaf Drümmer and Stefan Haack for their valuable support creating StringPrintf and this article.

Notes

[1] For further discussion, see "printf Revisited," by Walter Bright, Dr. Dobb's Journal, January 2005.

