Channels ▼

Matthew Wilson

Dr. Dobb's Bloggers

Conversion Constructors and Subtle Dangers

January 24, 2011

Pretty much every old C++ war-horse knows to use the explicit keyword when defining constructors with one only (non-defaulted) parameter. Even so, the subtleties of what can go wrong with the best laid plans of library designers can still surprise.

Recently, a user of the Pantheios diagnostic logging library reported a subtle vulnerability involving conversion constructors in a particular set of circumstances.

Background

For reasons of robustness, Pantheios does not accept arguments of fundamental types - int, char, double, void*' and so on. All arguments must be of a string type, or one which the Pantheios Application Layer knows how to interpret as a string. Thus, a statement such as the following will be rejected by the compiler:

[Note: all the log statements here would usually be expressed on one line. Limitations of this medium require I split them.]

 #include <pantheios/pantheios.hpp>

 pantheios::log_NOTICE(
   "secret of life, universe, and everything: "
 , <strong>42</strong>);

To incorporate a fundamental type instance into a log statement a user must first convert it to a string. One (bad) way would be to convert it to a string yourself as follows:

 #include <pantheios/pantheios.hpp>

 <strong>char num[21];</strong>
 <strong>snprintf(num, 21, "%d", 42);</strong>
 pantheios::log_NOTICE(
   "secret of life, universe, and everything: "
 , <strong>num</strong>);

There are several reasons why this is bad: it is verbose; it is not practically portable, because it uses snprintf(), which is "deprecated" by Microsoft's later compilers in favour of so-called safe equivalents (in this case the famous _snprintf_s()); it is potentially inefficient, because the conversion from int->string is made even when statements at log level NOTICE are disabled; it requires you to use magic numbers or some kind of dimensionof() macro.

A better way is to use one of the stock inserted classes/functions provided by Pantheios, in this case the pantheios::integer inserter class, which may be used to insert all integer types (and with different width, radix, etc.), as in:

 <strong>#include <pantheios/inserters/integer.hpp></strong>
 #include <pantheios/pantheios.hpp>

 pantheios::log_NOTICE(
   "secret of life, universe, and everything: "
 , <strong>pantheios::integer</strong>(42));

Incidentally, if you (understandably) think that's getting a tad verbose, you can use some of the provided namespace/class/function aliases to shorten the log statements, as in:

 <strong>#include <pantheios/inserters/i.hpp></strong>
 <strong>#include <pantheios/pan.hpp></strong>

 <strong>pan</strong>::log_NOTICE(
   "secret of life, universe, and everything: "
 , <strong>pan::i</strong>(42));

Problem

Now you know how insertion of an integer should look, let's consider how things should go if you try to insert it directly. With Visual C++ 9, I get a set of compilation errors along the following lines:

 h:\freelibs\pantheios\main\1.0\include\pantheios\internal\generated\log_sev_functions.hpp(9998) : error C2665: 'stlsoft::winstl_project::c_str_len_a' : none of the 17 overloads could convert all the argument types
  h:\stlsoft\releases\1.9\stlsoft\include\winstl\shims\access\string\time.hpp(672): could be 'stlsoft::winstl_project::ws_size_t stlsoft::winstl_project::c_str_len_a(const SYSTEMTIME &)'
  h:\stlsoft\releases\1.9\stlsoft\include\winstl\shims\access\string\time.hpp(720): or 'stlsoft::winstl_project::ws_size_t stlsoft::winstl_project::c_str_len_a(const FILETIME &)'
  h:\stlsoft\releases\1.9\stlsoft\include\winstl\shims\access\string\time.hpp(753): or 'stlsoft::winstl_project::ws_size_t stlsoft::winstl_project::c_str_len_a(const UDATE &)'
  h:\stlsoft\releases\1.9\stlsoft\include\winstl\shims\access\string\hwnd.hpp(561): or 'stlsoft::winstl_project::ws_size_t stlsoft::winstl_project::c_str_len_a(HWND)'
  h:\stlsoft\releases\1.9\stlsoft\include\comstl\shims\access\string\variant.hpp(556): or 'stlsoft::comstl_project::cs_size_t stlsoft::comstl_project::c_str_len_a(const VARIANT &)'
  h:\stlsoft\releases\1.9\stlsoft\include\comstl\shims\access\string\guid.hpp(298): or 'stlsoft::comstl_project::cs_size_t stlsoft::comstl_project::c_str_len_a(const GUID &)'
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\std\time.hpp(254): or 'stlsoft::ss_size_t stlsoft::c_str_len_a(const tm &)'
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\std\time.hpp(143): or 'stlsoft::ss_size_t stlsoft::c_str_len_a(const tm *)'
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\std\basic_string.hpp(392): or 'stlsoft::ss_size_t stlsoft::c_str_len_a(const std::string &)'
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\std\exception.hpp(299): or 'stlsoft::ss_size_t stlsoft::c_str_len_a(const std::exception &)'
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\fwd.h(100): or 'stlsoft::ss_size_t stlsoft::c_str_len_a(const stlsoft::ss_char_a_t *)'
  h:\freelibs\pantheios\main\1.0\include\pantheios\pantheios.h(1459): or 'size_t pantheios::shims::c_str_len_a(const pantheios::pan_slice_t &)'
  h:\freelibs\pantheios\main\1.0\include\pantheios\pantheios.h(1549): or 'size_t pantheios::shims::c_str_len_a(const pantheios::pan_slice_t *)'
  h:\freelibs\pantheios\main\1.0\include\pantheios\pantheios.h(1592): or 'size_t pantheios::shims::c_str_len_a(pantheios::pan_severity_t)'
  while trying to match the argument list '(const int)'
  h:\publishing\blogs\ddj\code\conversion_constructors\example1\example1.cpp(8) : see reference to function template instantiation 'int pantheios::log_NOTICE<const char[43],int>(T0 (&),const T1 &)' being compiled
  with
  [
   T0=const char [43],
   T1=int
  ]
h:\freelibs\pantheios\main\1.0\include\pantheios\internal\generated\log_sev_functions.hpp(9998) : error C2665: 'stlsoft::winstl_project::c_str_data_a' : none of the 17 overloads could convert all the argument types
  h:\stlsoft\releases\1.9\stlsoft\include\winstl\shims\access\string\time.hpp(448): could be 'stlsoft::basic_shim_string<C> stlsoft::winstl_project::c_str_data_a(const SYSTEMTIME &)'
  with
  [
   C=stlsoft::ss_char_a_t
  ]
  h:\stlsoft\releases\1.9\stlsoft\include\winstl\shims\access\string\time.hpp(482): or 'stlsoft::basic_shim_string<C> stlsoft::winstl_project::c_str_data_a(const FILETIME &)'
  with
  [
   C=stlsoft::ss_char_a_t
  ]
  h:\stlsoft\releases\1.9\stlsoft\include\winstl\shims\access\string\time.hpp(514): or 'stlsoft::basic_shim_string<C> stlsoft::winstl_project::c_str_data_a(const UDATE &)'
  with
  [
   C=stlsoft::ss_char_a_t
  ]
  h:\stlsoft\releases\1.9\stlsoft\include\winstl\shims\access\string\hwnd.hpp(529): or 'stlsoft::winstl_project::c_str_ptr_HWND_proxy<C> stlsoft::winstl_project::c_str_data_a(HWND)'
  with
  [
   C=stlsoft::winstl_project::ws_char_a_t
  ]
  h:\stlsoft\releases\1.9\stlsoft\include\comstl\shims\access\string\variant.hpp(489): or 'stlsoft::comstl_project::c_str_VARIANT_proxy_a stlsoft::comstl_project::c_str_data_a(const VARIANT &)'
  h:\stlsoft\releases\1.9\stlsoft\include\comstl\shims\access\string\guid.hpp(256): or 'stlsoft::comstl_project::c_str_ptr_GUID_proxy<C> stlsoft::comstl_project::c_str_data_a(const GUID &)'
  with
  [
   C=stlsoft::comstl_project::cs_char_a_t
  ]
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\std\time.hpp(241): or 'stlsoft::basic_shim_string<C> stlsoft::c_str_data_a(const tm &)'
  with
  [
   C=stlsoft::ss_char_a_t
  ]
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\std\time.hpp(100): or 'stlsoft::basic_shim_string<C> stlsoft::c_str_data_a(const tm *)'
  with
  [
   C=stlsoft::ss_char_a_t
  ]
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\std\basic_string.hpp(226): or 'const stlsoft::ss_char_a_t *stlsoft::c_str_data_a(const std::string &)'
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\std\exception.hpp(223): or 'const stlsoft::ss_char_a_t *stlsoft::c_str_data_a(const std::exception &)'
  h:\stlsoft\releases\1.9\stlsoft\include\stlsoft\shims\access\string\fwd.h(93): or 'const stlsoft::ss_char_a_t *stlsoft::c_str_data_a(const stlsoft::ss_char_a_t *)'
  h:\freelibs\pantheios\main\1.0\include\pantheios\pantheios.h(1437): or 'const char *pantheios::shims::c_str_data_a(const pantheios::pan_slice_t &)'
  h:\freelibs\pantheios\main\1.0\include\pantheios\pantheios.h(1527): or 'const char *pantheios::shims::c_str_data_a(const pantheios::pan_slice_t *)'
  h:\freelibs\pantheios\main\1.0\include\pantheios\pantheios.h(1574): or 'const char *pantheios::shims::c_str_data_a(pantheios::pan_severity_t)'
  while trying to match the argument list '(const int)'

Although this is scary at first, it actually holds the essential truth of the problem, something often not the case with C++ compile errors (particularly with moderate-heavy use of templates). The problem is precisely as described (all 17 times!): there are no overloads of c_str_len_a() and c_str_data_a() for "the argument list '(const int)'". These two function overload suites are string access shims, which are how Pantheios meets its design parameters of expressiveness, flexibility, and performance, something I'll get into in more detail in an upcoming article series on diagnostics I will be writing for Dr Dobb's this year. (The articles will also discuss why the library must reject fundamental type arguments, via the absence of string access shims defined for numeric types; for now you'll just have to take my word.) Many string access shim overloads are provided by the STLSoft libraries, on which Pantheios depends, as well as other open-source libraries (and several of my closed-source commercial developments for clients, allowing their types to be succinctly logged).

So far, so good. This behaviour is all by-design, and is actually of benefit to the programmer, being as how it brings potential runtime problems into firm compile-time problems. In all the years I and others have been using Pantheios, this mechanism had never been subverted. Or, at least that was the case until late last year, when a user reported that, after passing an integer directly in a log statement, it had not been displayed correctly in the log. Alarm bells still ringing, we investigated and, lo and behold, it transpired that the integer was indeed being passed to a pair of string access shims, via the conversion constructor of the ATL type CComBSTR, for which string access shims are defined (although only in wide-string builds, hence c_str_len_w() and c_str_data_w()). This only occurred when building when using ATL on Windows and building a wide-string version of an application, in which case the pre-processor symbol UNICODE is defined. What actually happens is that the integer value is passed by the compiler to the (non-explicit) single-parameter CComBSTR(int nSize) constructor, as part of the compiler attempting to find match for the argument. That constructor is used to pre-allocate a buffer of the given size for later use. So what happens in the case of our log statement is that the statement written out is:

  "secret of life, universe, and everything: " <<- we are left to ponder ...

What should have been "42" is actually an empty string, since the secretive CComBSTR instance is empty of content (although it has a buffer of 42-character ready to receive some content). As you can imagine, had the integer had a very high value we may also have had an unwanted memory allocation failure to go along with the malformed log statement.

Naturally enough, when we'd got to this point, I wondered with some heat why on earth the ATL designers had not seen fit to declare that constructor explicit. But what to do about it? I cannot proscribe the use of ATL to Pantheios' users. I cannot remove the requisite wide-form string access shim overloads - c_str_len_w(CComBSTR const&) and c_str_data_w(CComBSTR const&) - from STLSoft, since they are established and used in other things. I can recommend that users compile a multibyte-string version of their code, as well as the wide-string one they want, to act as a guard, but that is often not possible. One thing I could do would be to remove the implicit inclusion of the requisite STLSoft libraries from Pantheios when compiling for wide-string in the presence of ATL. But anyone using Pantheios under those conditions who wanted to be able to insert CComBSTR instances into a log statement - a pretty common requirement - would be disadvantaged.

I confess I thought long and hard about this, and came up with several partially-forme, baroque, and long-winded solutions, until the light finally shone on me. I'd even written about such things in the first chapter of my first book, Imperfect C++, back in 2004: the answer is constraints.

So, all that's been required to allow Pantheios once more to claim 100% type-safety in all circumstances is the insertion of constraints along the lines of that shown in the following 3-parameter application layer log_NOTICE() statement function template:

template<typename T0, typename T1>
int log_NOTICE(
  pan_sev_t severity
, T0 const& v0
, T1 const& v1
)
{
  <strong>STLSOFT_STATIC_ASSERT(0 == stlsoft::<a href="http://www.stlsoft.org/doc-1.9/structstlsoft_1_1is__fundamental__type.html">is_fundamental_type</a><T0>::value);</strong>
  <strong>STLSOFT_STATIC_ASSERT(0 == stlsoft::is_fundamental_type<T1>::value);</strong>
  . . .

The constraint is pretty canonical for these things: a static assertion is used to enforce a compile-time characteristic elicited via a meta-programming type. Specifically, is_fundamental_type is used to verify that the types T0 and T1 are not fundamental types. If either of them is, as in the original case, the compiler will fail on that line.

So far, all the compilers that used to support Pantheios are happy with the additional compile-time load - remember, these are compile-time constraints, and make absolutely no different to runtime speed/behaviour - except for GCC 3.x and Digital Mars C/C++, for which the constraints are diluted/elided.

Conclusion

1. When you're designing a library involving C++ classes, be sure that you mark explicit every one-parameter (or one-non-default-parameter) constructor, unless you explicitly want to allow implicit conversion construction. Yes, I am aware that that's a tongue-twister. And aware too of the irony. But a language as long-lived and successful as C++ has a lot of backwards compatibility to maintain. Such wrong-way-round situations extend to more than just explicit/implicit, and maybe I'll have a carp about them on another day.

2. When you're designing a library involving arbitrary heterogeneous types, remember the utility of constraints (particularly compile-time ones) for controlling how far the envelope of accepted types may be stretched.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video