StringPrintf: A Typesafe printf Family for C++

You can use StringPrintf, a typesafe version of the printf function family, with std::string and C strings.


August 01, 2005
URL:http://www.drdobbs.com/stringprintf-a-typesafe-printf-family-fo/184401999

August, 2005: StringPrintf: A Typesafe printf Family for C++

Stefan Wörthmüller is a software developer in Berlin who specializes in cross-platform development, user interfaces, and reengineering. He can be contacted at [email protected].




As every experienced C/C++ programmer knows, printf can crash a program when "%s" is paired with an argument that is not a null-terminated C string (or with no argument at all). For example, executing:

printf("%s");
might print garbage or raise an access violation, depending on what values are on the stack. More generally, "%s" opens opportunities for exploitation whenever the output is not limited to the size of the target buffer, as with sprintf.

This is bad and anachronistic in times of the STL, std::string, and typesafe programming, and it has led some project managers to ban C strings and printf altogether. Then there is C++'s iostreams library, which was meant to be printf's successor in C++. However, it has drawbacks of its own [1].

With this in mind, I present in this article StringPrintf, a typesafe version of the printf function family. StringPrintf can be used with std::string and with C strings. In addition, it can do everything that sprintf does and is compatible with all standard printf format strings. In fact, internally, it uses snprintf itself. The big difference is that it checks (and mostly ignores) the type character of the format string, choosing instead the conversion appropriate for the type of the argument supplied. The type character is honored for legal conversions (such as supplying an int for a %c) and for compatibility. If an argument is missing, the corresponding part of the format string remains unchanged in the resulting string.

I implemented StringPrintf in multiple versions that can replace printf, sprintf, snprintf, and fprintf. Using a couple of #defines (supplied in StringPrintf.h, available at http://www.cuj.com/code/), StringPrintf can replace these calls in existing code. This resolves all program faults caused by mismatches between the format string and the arguments passed. When compiled in debug mode, StringPrintf prints warning messages for type mismatches and for missing or surplus arguments. However, it cannot prevent overflows of C-string buffers whose size it is not told (which is technically impossible). Those can only be avoided either by replacing the existing calls to sprintf with calls that take the buffer length as an additional argument (as snprintf does), or by switching to std::string, which is fully supported. std::string objects can be passed instead of C strings as input parameters (using %s), as well as for the output buffer.

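StringPrintf is implemented using techniques most current C++ compilers support, chiefly function overloading and implicit conversion through unnamed temporary objects (see the sidebar "Unnamed Temporary C++ Objects").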

The result is a printf-like function that is type and error safe. For instance, writing:

std::string s1;
StringPrintf(s1, "int = %d; float = %3.1f", -12, 3572.24);

results (exactly as using sprintf) in s1 containing:

"int = -12; float = 3572.2"

Unlike with sprintf, though, supplying incorrect or too few arguments does no harm to the running program, although the resulting string might not have exactly the expected content.

Using the wrong types, for example:

StringPrintf(s1, "int = %s; float = %3f", 3572.24, -12);

results in s1 containing:

"int = 3572.240000; float = -12"

This is because the data type of each argument, rather than the type character in the format string, determines the conversion performed. StringPrintf can also handle missing arguments:

StringPrintf(s1, "int = %s; float = %3d, %s, %d, %s", 3572.24, -12);

results in s1 containing:

"int = 3572.240000; float = -12, %s, %d, %s"

This is done (in contrast to printf) not by using a vararg list, but by providing a separate overload of StringPrintf for each number of arguments. Consequently, a format specification is left in place if the corresponding argument is missing. See Listing 1 for more examples.

Implementation

The implementation of StringPrintf is quite straightforward. I started by creating a small helper class named CPrintfArg, which can hold one value of any built-in data type. For every data type, it has a constructor that copies the value (or holds a pointer to it, whichever is less expensive). Each constructor also records the type used (see Listing 2).

Having created CPrintfArg, it is possible to write a set of StringPrintf functions, one for each number of arguments:

size_t StringPrintf(std::string &out, 
 const std::string &fmt, const CPrintfArg&);
size_t StringPrintf(std::string &out, 
 const std::string &fmt, 
  const CPrintfArg&, const CPrintfArg&);

I provided 12 functions with up to 12 arguments. (Of course, it would be easy to add functions with more parameters in the same way. A construction using modern template techniques might be a way to support any number of arguments generically; see Andrei Alexandrescu's Modern C++ Design.)
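
As one illustration of the generic approach mentioned above, the following is a minimal sketch using C++11 variadic templates, a language feature that postdates this article. The names Arg, FormatArray, and Format are hypothetical stand-ins and not part of StringPrintf. The worker is deliberately simplified: it treats a specification as '%' plus a single character, ignores widths and precisions, and Arg supports only four argument types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for CPrintfArg: one implicit constructor per type.
// It stores only the printable text of the value.
class Arg {
public:
    Arg(int v) : text_(std::to_string(v)) {}
    Arg(double v) : text_(std::to_string(v)) {}
    Arg(const char *s) : text_(s ? s : "(null)") {}
    Arg(const std::string &s) : text_(s) {}
    const std::string &text() const { return text_; }
private:
    std::string text_;
};

// Simplified worker: receives the arguments as an array.
// A missing argument leaves its specification in the output unchanged.
std::size_t FormatArray(std::string &out, const std::string &fmt,
                        const Arg *args, std::size_t count) {
    assert(args != nullptr || count == 0);
    out.clear();
    std::size_t next = 0;
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] != '%' || i + 1 == fmt.size()) {
            out += fmt[i];                  // ordinary character
        } else if (fmt[++i] == '%') {
            out += '%';                     // "%%" is a literal percent sign
        } else if (next < count) {
            out += args[next++].text();     // argument type decides the text
        } else {
            out += '%';                     // no argument left: keep the spec
            out += fmt[i];
        }
    }
    return out.size();
}

// One template takes the place of the family of fixed-arity overloads.
template <typename... Ts>
std::size_t Format(std::string &out, const std::string &fmt,
                   const Ts &... values) {
    // The leading dummy keeps the array non-empty when no values are passed.
    const Arg args[] = { Arg(0), Arg(values)... };
    return FormatArray(out, fmt, args + 1, sizeof...(Ts));
}
```

With these definitions, Format(s, "a = %s, b = %d", std::string("x")) leaves "a = x, b = %d" in s, which mirrors how StringPrintf is described as treating a missing argument.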

Having done this, you can write:

StringPrintf(s1, "Int = %d; Float = %3.1f", -12, 3572.24);

which is expanded by the compiler to:

StringPrintf(s1, "Int = %d; Float = %3.1f",
             CPrintfArg(-12), CPrintfArg(3572.24));

The 12 StringPrintf functions are only wrappers for one background function, xStringPrintf (see Listing 3 at http://www.cuj.com/code/), which does the work for any number of arguments, supplied as an array of pointers to CPrintfArg. xStringPrintf scans the format string for "%", the character that starts every printf format specification. After locating the end of the specification, it passes the specification for that single value to the function CPrintfArg::Out of the corresponding argument object, which does the conversion. The result is appended to the output string.
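
The scanning step can be pictured with the following minimal sketch. It is not the code of xStringPrintf (Listing 3 is not reproduced here); Piece, SpecEnd, and SplitFormat are hypothetical names, and the sets of flag and conversion characters are an assumption that covers only the common cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// One piece of a format string: literal text or a single specification.
struct Piece {
    bool isSpec;
    std::string text;
    Piece(bool spec, const std::string &t) : isSpec(spec), text(t) {}
};

// Returns the index one past the conversion character of the specification
// that starts at fmt[start], or std::string::npos if it is malformed.
std::size_t SpecEnd(const std::string &fmt, std::size_t start) {
    assert(start < fmt.size() && fmt[start] == '%');
    static const char kConversions[] = "diouxXeEfgGcspn%";
    static const char kModifiers[] = "-+ #0123456789.*hlLjzt";
    for (std::size_t i = start + 1; i < fmt.size(); ++i) {
        const char c = fmt[i];
        if (c == '\0') break;                        // strchr would match it
        if (std::strchr(kConversions, c)) return i + 1;
        if (!std::strchr(kModifiers, c)) break;      // not part of a spec
    }
    return std::string::npos;
}

// Splits fmt into literal runs and specifications, in order.
std::vector<Piece> SplitFormat(const std::string &fmt) {
    std::vector<Piece> pieces;
    std::string literal;
    std::size_t i = 0;
    while (i < fmt.size()) {
        const std::size_t end =
            (fmt[i] == '%') ? SpecEnd(fmt, i) : std::string::npos;
        if (end == std::string::npos) {
            literal += fmt[i++];                     // plain text or stray '%'
            continue;
        }
        if (!literal.empty()) {
            pieces.push_back(Piece(false, literal));
            literal.clear();
        }
        pieces.push_back(Piece(true, fmt.substr(i, end - i)));
        i = end;
    }
    if (!literal.empty()) pieces.push_back(Piece(false, literal));
    return pieces;
}
```

A caller would then walk the pieces, append literal text as is, and hand each specification other than "%%" to the next argument object, or copy it through unchanged when no argument is left.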

CPrintfArg::Out itself uses snprintf, but it always uses the appropriate type specification for the printf format string. It checks the type character passed in the format string and leaves it unchanged if it matches the argument data type or if the conversion is legal. Otherwise, it substitutes the type character that matches the data type of the argument supplied. Also, buffer overflows are prevented by using snprintf instead of sprintf.
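
The substitution of the type character might look roughly like the sketch below. It is an illustration under stated assumptions, not the code of CPrintfArg::Out: the names ArgType, FixSpec, OutDouble, and OutInt are hypothetical, the sets of "legal" conversions per type are a guess, and specifications containing '*' or length modifiers are excluded for brevity.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

enum ArgType { ARG_INT, ARG_FLOAT, ARG_STRING };

// Returns spec with its conversion character replaced if it does not suit
// the argument type. Assumes no '*' and no length modifiers in spec.
std::string FixSpec(const std::string &spec, ArgType type) {
    assert(spec.size() >= 2 && spec[0] == '%');
    assert(spec.find_first_of("*hlLjzt") == std::string::npos);
    const char *legal = "s";
    char fallback = 's';
    if (type == ARG_INT)   { legal = "dicouxX"; fallback = 'd'; }
    if (type == ARG_FLOAT) { legal = "feEgG";   fallback = 'f'; }
    std::string fixed = spec;
    char &conv = fixed[fixed.size() - 1];
    if (std::string(legal).find(conv) == std::string::npos) conv = fallback;
    return fixed;
}

// Formats a double; output longer than the buffer is truncated by snprintf.
std::string OutDouble(const std::string &spec, double value) {
    const std::string fixed = FixSpec(spec, ARG_FLOAT);
    char buf[512] = "";
    std::snprintf(buf, sizeof buf, fixed.c_str(), value);
    return buf;
}

// Formats an integer that is held as long, as in Listing 2.
std::string OutInt(const std::string &spec, long value) {
    std::string fixed = FixSpec(spec, ARG_INT);
    const char conv = fixed[fixed.size() - 1];
    char buf[512] = "";
    if (conv == 'c') {
        // %c takes an int argument.
        std::snprintf(buf, sizeof buf, fixed.c_str(), static_cast<int>(value));
    } else {
        fixed.insert(fixed.size() - 1, "l");   // argument is (unsigned) long
        if (conv == 'd' || conv == 'i')
            std::snprintf(buf, sizeof buf, fixed.c_str(), value);
        else
            std::snprintf(buf, sizeof buf, fixed.c_str(),
                          static_cast<unsigned long>(value));
    }
    return buf;
}
```

Here OutDouble("%s", 3572.24) formats with "%f", and OutInt("%3f", -12) formats with "%3ld", so a mismatched type character never reaches snprintf.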

I've tested the source code (see Listings 3 and 4 at http://www.cuj.com/code/) under both Windows (Visual C++ 6.0) and Linux (gcc 3.3). When the preprocessor symbol DEBUG is defined, StringPrintf generates warning messages if format strings do not match the arguments supplied. This is done by calling the function:

void StringPrintfErrorMessage(const string &s);

By default, this function calls cout << s. When compiled under Windows with _CONSOLE not defined, it instead displays a dialog box by calling the Win32 function MessageBox. Figure 1 shows a warning message that results from passing a mismatched argument type. Figure 2 results from passing too few arguments to a StringPrintf call.

Using User-Defined Types

Because all StringPrintf functions use the same argument type, it is easy to declare custom conversions for user-defined types. All that's needed is a function that converts a user-defined object to an object of type CPrintfArg. Given a user-defined type myUserClass that should have a certain representation as a string, all you need is a member function operator CPrintfArg() (see Listings 5 and 6 at http://www.cuj.com/code/).

The object User1 is converted by the compiler to a temporary object of type CPrintfArg by calling the member function operator CPrintfArg(), which returns an object that contains a string. Because the CPrintfArg object returned by operator CPrintfArg() outlives the body of that function, the string passed to the CPrintfArg constructor must not be a local stack object; it has to be allocated on the heap. Therefore, the additional parameter CPrintfArg::DELETE_PARM is passed to the CPrintfArg constructor, which tells CPrintfArg to delete the string on destruction. This, of course, is a pitfall. Passing a local string instead would lead to an error, as an already destructed string would be referenced, and omitting CPrintfArg::DELETE_PARM for a heap-allocated string would leak it.
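
Since Listings 5 and 6 are not reproduced here, the following minimal sketch illustrates the idea with hypothetical names: MiniArg is a much-reduced stand-in for CPrintfArg, and Temperature plays the role of the user-defined class. The move constructor is an addition of this sketch (C++11) so that ownership is transferred safely; it says nothing about how CPrintfArg itself handles copies.

```cpp
#include <cassert>
#include <string>

// Hypothetical, much-reduced stand-in for CPrintfArg: it holds only a
// std::string pointer and a flag saying whether it owns the string.
class MiniArg {
public:
    enum DeleteMode { DELETE_PARM, NO_DELETE };
    MiniArg(std::string *s, DeleteMode mode) : str_(s), mode_(mode) {
        assert(s != nullptr);
    }
    // Moving hands over ownership so the string is deleted exactly once.
    MiniArg(MiniArg &&other) : str_(other.str_), mode_(other.mode_) {
        other.str_ = nullptr;
        other.mode_ = NO_DELETE;
    }
    MiniArg(const MiniArg &) = delete;
    MiniArg &operator=(const MiniArg &) = delete;
    ~MiniArg() { if (mode_ == DELETE_PARM) delete str_; }
    const std::string &text() const { return *str_; }
private:
    std::string *str_;
    DeleteMode mode_;
};

// Any function taking const MiniArg& accepts types convertible to MiniArg.
std::string Describe(const MiniArg &arg) { return "value = " + arg.text(); }

class Temperature {   // hypothetical user-defined type
public:
    explicit Temperature(int degrees) : degrees_(degrees) {}
    operator MiniArg() const {
        // Heap-allocated so that it outlives this function; DELETE_PARM
        // makes the returned MiniArg responsible for deleting it.
        std::string *text = new std::string(std::to_string(degrees_) + " C");
        return MiniArg(text, MiniArg::DELETE_PARM);
    }
private:
    int degrees_;
};
```

Calling Describe(Temperature(21)) converts the Temperature to a temporary MiniArg, uses it, and deletes the heap-allocated string when the temporary is destroyed at the end of the full expression.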

Limitations

In this implementation, CPrintfArg::Out has certain limitations. For simplicity, everything is passed to snprintf, so fixed-size buffers are used in most cases. Also, the handling of erroneous format strings differs from sprintf: StringPrintf rejects wrong type characters, and sprintf can handle (more or less) arbitrary string and format lengths that StringPrintf probably cannot.

StringPrintf uses a local C string buffer of ARG_BUFFER_SIZE bytes for every argument; ARG_BUFFER_SIZE is set to 512. This is sufficient for all possible 64-bit double numbers, but not for 80-bit long doubles, which might become 4000 digits long. The ARG_BUFFER_SIZE limit (per argument) includes the leading and trailing blanks or zeros that sprintf might insert. All output exceeding this limit is truncated by snprintf. The handling is different for strings (C strings and std::string), though, where StringPrintf always allocates a temporary buffer of size strlen(string) + ARG_BUFFER_SIZE.
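
The effect of the per-argument limit can be reproduced with the short sketch below. FormatDouble and kArgBufferSize are hypothetical names, and the truncation check assumes the C99/C++11 behavior of snprintf, which returns the length the complete output would have had. Older runtime functions such as _snprintf in Visual C++ 6.0 report truncation differently and may not null-terminate the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

const std::size_t kArgBufferSize = 512;   // same value as ARG_BUFFER_SIZE

// Formats one double into a fixed-size buffer. *truncated is set to true
// when the complete text would have needed more than the buffer holds.
std::string FormatDouble(const char *spec, double value, bool *truncated) {
    assert(spec != 0 && truncated != 0);
    char buf[kArgBufferSize] = "";
    const int needed = std::snprintf(buf, sizeof buf, spec, value);
    *truncated = needed < 0 || static_cast<std::size_t>(needed) >= sizeof buf;
    return buf;
}
```

Formatting 1e300 with "%f" needs 308 characters and fits, whereas a padded specification such as "%600.1f" is cut off after 511 characters plus the terminating null.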

Also for simplicity, I did not include all possible built-in types, only those that cannot otherwise be converted. For example, I left out all short types in CPrintfArg. Nevertheless, StringPrintf works with short arguments because the compiler promotes them to integer types. In addition, StringPrintf produces some overhead when creating temporary objects and allocating memory for strings. Even so, StringPrintf has been used for quite a while in large applications with thousands of sprintf calls without any additional problems.

Conclusion

StringPrintf shows that typesafe versions of the printf family can be implemented using common C++ compilers. Measured against the printf defects that Walter Bright lists [1], StringPrintf resolves most of them: It is typesafe, supports generic programming and user-defined types, and cannot corrupt programs through invalid format strings. These features can be used in existing code just by including StringPrintf.h and linking the sources. Switching to std::string also prevents buffer overflows without abandoning printf-style formatting.

Acknowledgments

Thanks to Olaf Drümmer and Stefan Haack for their valuable support creating StringPrintf and this article.

Notes

[1] For further discussion, see "printf Revisited," by Walter Bright, Dr. Dobb's Journal, January 2005.


Figure 1: A warning message that results from passing a mismatched argument type.


Figure 2: Passing too few arguments to a StringPrintf call.


Listing 1

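#include <cstdio>
#include <iostream>
#include <string>
#include "StringPrintf.h"
using namespace std;

// myUserClass is the user-defined type from Listings 5 and 6.

int main()
{
    string s1;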
    const short          i1 = -12;
    const unsigned short u1 =  23;
    const float          f1 = 3572.24f;
    const double         d1 = -237E-12;
    const char *        cp1 = "submarine";

    // printf-style function
    // correct string and output
    StringPrintfCout("int = %d%%; float = %3.1f\n\n", i1, f1);
    // Outputs: "int = -12%; float = 3572.2"

    // Use Wrong Types
    StringPrintfCout("string = %s; int = %3d\n\n", f1, i1);
    // Outputs: "string = 3572.239990; int = -12"

    // sprintf-style function
    // Too few Arguments
    StringPrintf(s1, "string = %s; int = %3d, %s, %d, %s", f1, d1);
    cout << s1 << "\n\n";
    // Outputs: "string = 3572.239990; int = -0.000000, %s, %d, %s"

    // Too many Arguments
    StringPrintf(s1, "string = %s; int = %3d", cp1, f1, d1, u1);
    cout << s1 << "\n\n";
    // Outputs: "string = submarine; int = 3572.239990"

    // User defined Type
    myUserClass User1;
    StringPrintf(s1, "User1 = %s", User1);
    cout << s1 << "\n\n";
    // Outputs: "User1 = userClassValue"

    // fprintf-style function
    FILE *file = fopen("test.txt", "w");
    if (file != NULL)
    {
        StringPrintf(file, "string = %f; int = %3d\n", f1, i1);
        // Outputs to file: "string = 3572.239990; int = -12"
        fclose(file);
    }

    // Dangerous! sprintf-style buffer overflow possible
    char buff[100];
    StringPrintf(buff, "int = %d%%; float = %3.1f\n\n", i1, f1);
    cout << buff;
    // Outputs: "int = -12%; float = 3572.2"

    // snprintf-style function using buffer length parameter 
#define BUFFLEN 24
    char tx[BUFFLEN];
    StringPrintf(tx,BUFFLEN,"String = %s; float = %f,double = %f",cp1,f1,d1);
    tx[BUFFLEN - 1] = 0;
    cout << tx << "\n\n";
    // Outputs: "String = submarine; flo"

    return 0;
}


Listing 2

class CPrintfArg
{
public:
    typedef enum
    {
        TYP_INT, TYP_UINT, TYP_FLOAT, TYP_CHAR, TYP_C_STRING,
        TYP_STD_STRING, TYP_C_STRING_NONCONST, TYP_STD_STRING_NONCONST,
        TYP_VOIDP
    }DATA_TYPES;
    typedef enum
    {
        DELETE_PARM,
        NO_DELETE
    }DELETE_MODE;
    CPrintfArg(const int  x)  : mTyp(TYP_INT) , mDeleteMode(NO_DELETE)
    {                                                        
        v.miVal = x;                                         
    };                                                       
    CPrintfArg(const unsigned int  x) : mTyp(TYP_UINT), mDeleteMode(NO_DELETE)
    {                                                        
        v.muVal = x;                                         
    };                                                       
    CPrintfArg(const void *x) : mTyp(TYP_UINT), mDeleteMode(NO_DELETE)
    {
        // Stored as an integer; assumes a pointer fits in unsigned long.
        v.muVal = (unsigned long)x;
    };
    CPrintfArg(const long x) : mTyp(TYP_INT) , mDeleteMode(NO_DELETE)
    {                                                        
        v.miVal = x;                                         
    };                                                       
    CPrintfArg(const unsigned long x) : mTyp(TYP_UINT), mDeleteMode(NO_DELETE)
    {                                                        
        v.muVal = x;                                         
    };                                                       
    CPrintfArg(const float x) : mTyp(TYP_FLOAT) , mDeleteMode(NO_DELETE)
    {                                                        
        v.mfVal = (double)x;                                 
    };                                                       
    CPrintfArg(const double x) : mTyp(TYP_FLOAT) , mDeleteMode(NO_DELETE)
    {                                                        
        v.mfVal = x;                                         
    };                                                       
    CPrintfArg(const char c) : mTyp(TYP_CHAR) , mDeleteMode(NO_DELETE)
    {
        v.mChar = c;
    };
    CPrintfArg(const char *p) : mTyp(TYP_C_STRING) , mDeleteMode(NO_DELETE)
    {
        v.mConstCharp = p;
    };
    CPrintfArg(const unsigned char *p) : mTyp(TYP_C_STRING), 
                                              mDeleteMode(NO_DELETE)
    { 
        v.mConstCharp = (const char*)p;
    };
    CPrintfArg(const std::string &s) : mTyp(TYP_STD_STRING), 
                                              mDeleteMode(NO_DELETE)
    {
        v.mConstString = &s;
    };
    // Non-const constructors for objects to delete on destruction
    CPrintfArg(char *p, DELETE_MODE mode = NO_DELETE) : 
                   mTyp(TYP_C_STRING_NONCONST) , mDeleteMode(mode)
    {
        v.mCharp = p;
    };
    CPrintfArg(unsigned char *p, DELETE_MODE mode = NO_DELETE) : 
                   mTyp(TYP_C_STRING_NONCONST) , mDeleteMode(mode)
    {
        v.mCharp = (char*)p;
    };
    CPrintfArg(std::string &s, DELETE_MODE mode = NO_DELETE) : 
                   mTyp(TYP_STD_STRING_NONCONST), mDeleteMode(mode)
    {
        v.mString = &s;
    };
    ~CPrintfArg();
    const std::string Out(const std::string &fmt) const ;
private:
             DATA_TYPES  mTyp;          
             DELETE_MODE mDeleteMode;
             union
             {
                                long   miVal;
                       unsigned long   muVal;
                                double mfVal;
                                char   mChar;
                                char  *mCharp;
                         std::string  *mString;
                 const          char  *mConstCharp;
                 const   std::string  *mConstString;
             }v;
};


Unnamed Temporary C++ Objects

The main technique that makes StringPrintf possible is the use of unnamed temporary objects. Most C++ programmers use this technique very often, although most of the time not explicitly. For example, given a set of std::strings, the set member function insert accepts only const references to objects of type std::string. Nevertheless, you can correctly write:

std::set<std::string> stringSet;
stringSet.insert("ABC");

This is possible because the compiler, when finding the C string "ABC", looks for a way to convert it to a const reference to a std::string object. It then finds that basic_string (the class template of which std::string is an instantiation) has a constructor basic_string(const char *s). The compiler uses this constructor to create an unnamed temporary std::string object out of "ABC", and passes it to std::set::insert(). The line

stringSet.insert("ABC");

is expanded to

stringSet.insert(std::string("ABC"));

An unnamed temporary std::string object is created and a reference to it is passed to the insert function. When the insert function returns, the temporary object is destroyed.
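
The lifetime of such a temporary can be made visible with a small sketch. Label, Length, Demo, and g_log are hypothetical names introduced only for this illustration; the class records its construction and destruction so that the order of events can be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

std::vector<std::string> g_log;   // records events in the order they occur

class Label {
public:
    // Implicit constructor: lets a C string convert to a Label.
    Label(const char *text) : text_(text) {
        g_log.push_back("construct " + text_);
    }
    ~Label() { g_log.push_back("destroy " + text_); }
    const std::string &text() const { return text_; }
private:
    std::string text_;
};

// Accepts only a const reference to Label, like insert accepts std::string.
std::size_t Length(const Label &label) {
    g_log.push_back("use " + label.text());
    return label.text().size();
}

void Demo() {
    // The compiler expands this call to Length(Label("ABC")).
    const std::size_t n = Length("ABC");
    assert(n == 3);
    (void)n;
    g_log.push_back("after call");
}
```

After Demo() runs, g_log holds "construct ABC", "use ABC", "destroy ABC", and "after call", in that order: the temporary lives exactly as long as the call expression that needed it.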
