An Enhanced ostream_iterator

Have you ever wanted a smarter std::ostream_iterator? Matthew Wilson shows you how.


July 20, 2007
URL:http://www.drdobbs.com/cpp/an-enhanced-ostreamiterator/201200278

I have benefited greatly from criticism, and at no time have I suffered a lack thereof.
—Winston Churchill

I'm so gorgeous, there's a six-month waiting list for birds to suddenly appear, every time I am near!
—Cat, Red Dwarf

Introduction

Do you ever find yourself wanting to output the contents of a sequence where the elements are to be indented, perhaps by a tab space ('\t'), as in the following example?

Header Files:
  H:\freelibs\b64\current\include\b64\b64.h
  H:\freelibs\b64\current\include\b64\cpp\b64.hpp
Implementation Files:
  H:\freelibs\b64\current\src\b64.c
  H:\freelibs\b64\current\test\C\C.c
  H:\freelibs\b64\current\test\Cpp\Cpp.cpp

Using std::ostream_iterator, this is disproportionately difficult and inelegant to achieve. Consider that we're searching for source files under the current directory, using the recls/STL library (itself an adaptation of a collection in what should now be, after reading Part II, characteristic STL extension style). recls/STL provides the recls::stl::search_sequence collection (a typedef of recls::stl::basic_search_sequence<char>), to which we pass the search directory, pattern, and flags. We can use this in combination with std::copy() and std::ostream_iterator, as shown in Listing One, to achieve the desired output.

Listing One: Formatting Output Using std::ostream_iterator

typedef recls::stl::search_sequence srchseq_t;
using recls::RECLS_F_RECURSIVE;

srchseq_t headers(".", "*.h|*.hpp", RECLS_F_RECURSIVE);
srchseq_t impls(".", "*.c|*.cpp", RECLS_F_RECURSIVE);

std::cout << "Header Files:" << std::endl << "\t";
std::copy(headers.begin(), headers.end()
    , std::ostream_iterator<srchseq_t::value_type>(std::cout, "\n\t"));
std::cout << "\r";

std::cout << "Implementation Files:" << std::endl << "\t";
std::copy(impls.begin(), impls.end()
    , std::ostream_iterator<srchseq_t::value_type>(std::cout, "\n\t"));
std::cout << "\n";

Obviously, there's a degree of mess here in that the formatting we seek to apply on lines 8–9 and 13–14 leaks out into lines 7, 10, 12, and 15. I'm certain you can imagine how, in more complex cases, this can lead to convoluted and fragile code, something that would just not happen were std::ostream_iterator a tiny bit smarter. This chapter describes how std::ostream_iterator can be enhanced in a simple but crucial way, in the form of stlsoft::ostream_iterator.
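The leak is easy to reproduce without recls. The following is only a sketch: it assumes a plain std::vector<std::string> in place of the search sequence, writes to a string stream rather than std::cout so that the stray characters can be inspected, and uses two hypothetical names of my own choosing, list_with_std_iterator and example_listing_one_output.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats a heading and an indented list in the manner
// of Listing One, but from an in-memory list of names.
inline std::string list_with_std_iterator(std::string const& heading,
                                          std::vector<std::string> const& names)
{
    std::ostringstream out;

    // The indent for the first element has to be written by hand ...
    out << heading << "\n" << "\t";
    std::copy(names.begin(), names.end(),
              std::ostream_iterator<std::string>(out, "\n\t"));
    // ... and the delimiter written after the last element leaves a stray
    // tab, which Listing One hides with a carriage return.
    out << "\r";

    return out.str();
}

// Usage: the stray tab and the carriage return are part of the output.
inline void example_listing_one_output()
{
    std::vector<std::string> names;
    names.push_back("b64.h");
    names.push_back("b64.hpp");

    assert(list_with_std_iterator("Header Files:", names)
           == "Header Files:\n\tb64.h\n\tb64.hpp\n\t\r");
}
```

On a terminal the trailing tab and carriage return are invisible, which is precisely why this kind of formatting leak tends to survive until the output is redirected to a file.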

Before we look in depth at the problem and the simple solution, let's see that solution in action (Listing Two).

Listing Two: Formatting Output Using stlsoft::ostream_iterator

typedef recls::stl::search_sequence srchseq_t;
using recls::RECLS_F_RECURSIVE;

srchseq_t headers(".", "*.h|*.hpp", RECLS_F_RECURSIVE);
srchseq_t impls(".", "*.c|*.cpp", RECLS_F_RECURSIVE);

std::cout << "Header Files:" << std::endl;
std::copy(headers.begin(), headers.end()
    , stlsoft::ostream_iterator<srchseq_t::value_type>(std::cout
                          , "\t", "\n"));

std::cout << "Implementation Files:" << std::endl;
std::copy(impls.begin(), impls.end()
    , stlsoft::ostream_iterator<srchseq_t::value_type>(std::cout
                          , "\t", "\n"));

Now the formatting is entirely located where it should be, in the invocation of the iterator's constructor. Naturally, this component can be used easily in formatted output that employs different levels of indentation.

(If we wanted to be especially pious with regard to DRY SPOT, we might declare a single instance of stlsoft::ostream_iterator with the required prefix and suffix and pass it to the two invocations of std::copy(). But that's not as clear-cut as you might think, as we'll shortly see.)

std::ostream_iterator

The standard (C++-03: 24.5.2) defines the interface of the std::ostream_iterator class template, as shown in Listing Three.

Listing Three: Definition of std::ostream_iterator

// In namespace std
template< typename V            // The type to be inserted
    , typename C = char         // Char encoding of stream
    , typename T = std::char_traits<C> // Traits type
    >
class ostream_iterator
 : public iterator<output_iterator_tag, void, void, void, void>
{ 
public: // Member Types
 typedef C                      char_type;
 typedef T                      traits_type;
 typedef std::basic_ostream<char_type, traits_type> ostream_type;
 typedef ostream_iterator<V, C, T>          class_type;
public: // Construction
 explicit ostream_iterator(ostream_type& os);
 ostream_iterator(ostream_type& os, char_type const* delim);
 ostream_iterator(class_type const& rhs);
 ~ostream_iterator() throw();
public: // Assignment
 class_type& operator =(V const& value);
public: // Output Iterator Methods
 class_type& operator *();
 class_type& operator ++();
 class_type operator ++(int);
private:
 . . .
};

This is a classic output iterator, whereby each instance remembers the stream and the (optional) delimiter from which it is constructed and uses them to effect formatted output when a value is assigned to it. Implementations typically maintain a pointer to the stream, which we'll call m_stm, and a copy of the delimiter pointer (i.e., of type char_type const*), which we'll call m_delim. Listing Four shows the implementation of the assignment operator.

Listing Four: Definition of the Assignment Operator

template<typename V, typename C, typename T>
ostream_iterator<V, C, T>&
 ostream_iterator<V, C, T>::operator =(V const& value)
{ 
 *m_stm << value;
 if(NULL != m_delim)
 { 
  *m_stm << m_delim;
 } 
 return *this;
} 

It's a really simple, clever idea with just one flaw: the lack of ambition demonstrated in the previous section.
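To make the behavior of Listing Four concrete, here is a small sketch that compares std::ostream_iterator against a hand-written loop doing what the assignment operator does for each element. The function names (join_with_iterator, join_by_hand, example_trailing_delimiter) are hypothetical, and the code assumes C++11 or later for nullptr.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Writes the values through std::ostream_iterator.
inline std::string join_with_iterator(std::vector<int> const& values,
                                      char const* delim)
{
    std::ostringstream out;
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<int>(out, delim));
    return out.str();
}

// Hand-written equivalent: value first, then the delimiter (if any),
// exactly once per element, including the last one.
inline std::string join_by_hand(std::vector<int> const& values,
                                char const* delim)
{
    std::ostringstream out;
    for (std::vector<int>::const_iterator i = values.begin();
         i != values.end(); ++i)
    {
        out << *i;
        if (nullptr != delim)
        {
            out << delim;
        }
    }
    return out.str();
}

// Usage: both forms leave a delimiter after the final element.
inline void example_trailing_delimiter()
{
    std::vector<int> values;
    values.push_back(1);
    values.push_back(2);
    values.push_back(3);

    assert(join_with_iterator(values, ", ") == "1, 2, 3, ");
    assert(join_by_hand(values, ", ") == "1, 2, 3, ");
}
```

Because the delimiter can only ever follow a value, anything that needs to precede a value has to be written outside the algorithm call.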

void Difference Type

Note in Listing Three that ostream_iterator uses the standard manner of instantiating the std::iterator type generator (Section 12.2) class template, by which all types except the iterator category are void. These are defined as void to help prevent the use of output iterators in contexts where their behavior would be undefined. For example, the standard (C++-03: 24.3.4) defines the std::distance() algorithm's return type in terms of the given iterator's difference_type, as in the following:

template <typename I>
typename std::iterator_traits<I>::difference_type
 distance(I from, I to);

Since the evaluation of such distance for an output iterator is not meaningful, output iterators should define their difference_type (usually via std::iterator, as shown in Listing Three) to be void, which will precipitate a compilation error if a user tries to apply std::distance() to such types.
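As a sketch of the technique, the following hypothetical output iterator (counting_iterator, in a demo namespace of my own invention) declares its member types directly rather than through std::iterator, which has been deprecated since C++17. I use a home-grown iterator here because, as far as I'm aware, C++20 changed the difference_type of the standard's own std::ostream_iterator to std::ptrdiff_t, so a check written against that class would depend on the language version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

namespace demo {

// Hypothetical output iterator that merely counts the values assigned to it.
class counting_iterator
{
public: // Member types: only the category is meaningful
    typedef std::output_iterator_tag iterator_category;
    typedef void                     value_type;
    typedef void                     difference_type;
    typedef void                     pointer;
    typedef void                     reference;

public: // Construction
    explicit counting_iterator(std::size_t& count)
        : m_count(&count)
    {}

public: // Assignment
    counting_iterator& operator =(int const&)
    {
        ++*m_count;
        return *this;
    }

public: // Output iterator methods
    counting_iterator& operator *()     { return *this; }
    counting_iterator& operator ++()    { return *this; }
    counting_iterator  operator ++(int) { return *this; }

private:
    std::size_t* m_count;
};

// The traits report void, so std::distance() has no usable return type;
// a call such as std::distance(it, it) would be rejected by the compiler.
static_assert(
    std::is_void<std::iterator_traits<counting_iterator>::difference_type>::value,
    "difference_type should be void for this output iterator");

// Usage: algorithms that only write through the iterator are unaffected.
inline void example_counting()
{
    int const   values[] = { 1, 2, 3 };
    std::size_t count = 0;

    std::copy(values, values + 3, counting_iterator(count));

    assert(3 == count);
}

} // namespace demo
```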

Tip - Define member types as void to (help) proscribe unsupported operations.

stlsoft::ostream_iterator

The STLSoft libraries contain a very modest enhancement to std::ostream_iterator, imaginatively called stlsoft::ostream_iterator. Its definition is shown in Listing Five.

Listing Five: Definition of stlsoft::ostream_iterator

// In namespace stlsoft
template< typename V
    , typename C = char
    , typename T = std::char_traits<C>
    , typename S = std::basic_string<C, T>
    >
class ostream_iterator
 : public std::iterator<std::output_iterator_tag
            , void, void, void, void>
{ 
public: // Member Types
 typedef V                      assigned_type;
 typedef C                      char_type;
 typedef T                      traits_type;
 typedef S                      string_type;
 typedef std::basic_ostream<char_type, traits_type> ostream_type;
 typedef ostream_iterator<V, C, T, S>        class_type;
public: // Construction
 explicit ostream_iterator(ostream_type& os)
  : m_stm(&os)
  , m_prefix()
  , m_suffix()
 { } 
 template <typename S1>
 explicit ostream_iterator(ostream_type& os, S1 const& suffix)
  : m_stm(&os)
  , m_prefix()
  , m_suffix(stlsoft::c_str_ptr(suffix))
 { } 
 template< typename S1
     , typename S2
     >
 ostream_iterator(ostream_type& os, S1 const& prefix, S2 const& suffix)
  : m_stm(&os)
  , m_prefix(stlsoft::c_str_ptr(prefix))
  , m_suffix(stlsoft::c_str_ptr(suffix))
 { } 
 ostream_iterator(class_type const& rhs)
  : m_stm(rhs.m_stm)
  , m_prefix(rhs.m_prefix)
  , m_suffix(rhs.m_suffix)
 { } 
 ~ostream_iterator() throw()
 { } 
public: // Assignment
 class_type& operator =(assigned_type const& value)
 { 
  *m_stm << m_prefix << value << m_suffix;
  return *this;
 } 
public: // Output Iterator Methods
 class_type& operator *()
 { 
  return *this;
 } 
 class_type& operator ++()
 { 
  return *this;
 } 
 class_type operator ++(int)
 { 
  return *this;
 } 
private: // Member Variables
 ostream_type* m_stm;
 string_type  m_prefix;
 string_type  m_suffix;
};

The main difference is the sole functional enhancement: separation of the delimiter into a prefix and a suffix. For full compatibility with std::ostream_iterator semantics, the second, two-parameter constructor specifies the suffix, not the prefix, since std::ostream_iterator inserts the delimiter into the stream after the value.

That's it for the interface. We will now examine the several implementation differences.

Shims, Naturally

The most obvious implementation difference is the use of string access shims. These afford all the usual flexibility to work with a wide variety of string or string-representable types. Indeed, all of the following parameterizations of the iterator are well formed:

std::string           prefix("prefix");
stlsoft::simple_string     suffix("suffix");
  
stlsoft::ostream_iterator<int> osi1(std::cout);
stlsoft::ostream_iterator<int> osi2(std::cout, "suffix");
stlsoft::ostream_iterator<int> osi3(std::cout, "prefix", "suffix");
stlsoft::ostream_iterator<int> osi4(std::cout, suffix);
stlsoft::ostream_iterator<int> osi5(std::cout, prefix, "");
stlsoft::ostream_iterator<int> osi6(std::cout, prefix, "suffix");
stlsoft::ostream_iterator<int> osi7(std::cout, "prefix", suffix);
stlsoft::ostream_iterator<int> osi8(std::cout, prefix, suffix);
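If you have not met shims before, the mechanism can be sketched in a few lines. What follows is not the STLSoft implementation: the demo namespace, the label type, and text_length are all hypothetical, and the overloads merely imitate the spirit of stlsoft::c_str_ptr. The point is that a function template written against the shim name works with every type for which an overload exists.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

namespace demo {

// Hypothetical shim overloads for two familiar string types.
inline char const* c_str_ptr(char const* s)
{
    return (nullptr != s) ? s : "";
}
inline char const* c_str_ptr(std::string const& s)
{
    return s.c_str();
}

// A hypothetical user-defined type joins in by providing its own overload.
struct label
{
    std::string text;
};
inline char const* c_str_ptr(label const& l)
{
    return l.text.c_str();
}

// A function template that accepts anything the shim understands.
template <typename S>
std::size_t text_length(S const& s)
{
    return std::strlen(c_str_ptr(s));
}

// Usage: a literal, a std::string, and the user-defined type all work.
inline void example_shims()
{
    label l = { "indent" };

    assert(6 == text_length("suffix"));
    assert(6 == text_length(std::string("prefix")));
    assert(6 == text_length(l));
}

} // namespace demo
```

Note that every overload in this sketch returns a pointer into storage owned by its argument. Shims for types that cannot do so must return a temporary object instead, which matters a great deal for what follows.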

Safe Semantics

The standard does not prescribe how std::ostream_iterator is to be implemented, which is usually perfectly reasonable. However, in this case there is no stipulation as to whether the iterator instance should take a copy of the delimiter contents or merely, as most implementations do, copy the pointer. As long as ostream_iterator is used in its familiar context, as a temporary passed to an algorithm, this is irrelevant. However, it's not hard to make a mess:

std::string         delim("\n");
std::ostream_iterator<int> osi(std::cout, delim.c_str());
std::vector<int>      ints(10);
  
delim = "something else";
  
std::copy(ints.begin(), ints.end(), osi); // Undefined behavior!!

The issue can be compounded when using string access shims in the constructor to make it more generic. Consider what would happen if m_prefix and m_suffix were of type char_type const*, rather than of type string_type. It would be possible to use the iterator class template with any type whose shim function returns a temporary, as in the following:

std::vector<int> ints(10);
VARIANT      pre = . . . 
VARIANT      suf = . . . 
  
std::copy(ints.begin(), ints.end()
    , stlsoft::ostream_iterator<int>(std::cout, pre, suf)); // Boom!

This is not something obvious even to the trained eye. The problem is that the language requires only that temporaries exist until the end of their enclosing full expression (C++-03: 12.2;3), and each member initializer is a full expression in its own right. Let's look again at the requisite lines from the implementation, highlighting the shim invocations:

 template<typename S1, typename S2>
 ostream_iterator(ostream_type& os, S1 const& prefix, S2 const& suffix)
  : m_stm(&os)
  , m_prefix(stlsoft::c_str_ptr(prefix))
  , m_suffix(stlsoft::c_str_ptr(suffix))
 { } 

By the time the constructor completes, the temporary instances of the conversion class returned by the c_str_ptr(VARIANT const&) overload invocations have been created, had their implicit conversion operator called, and been destroyed. The m_prefix and m_suffix pointers would be left holding onto garbage. When these pointers are used within the std::copy() statement, it's all over, Red Rover!
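The lifetime issue can be observed safely, without ever touching the dangling pointer, by counting how many conversion objects are alive when the constructor body runs. Everything in this sketch is hypothetical: int_as_c_str stands in for the conversion class returned by a shim for a type such as VARIANT, and holds_pointer and holds_copy are illustrative holders, not STLSoft components. It assumes C++11 or later for std::to_string.

```cpp
#include <cassert>
#include <string>

namespace demo {

int live_conversions = 0; // conversion objects currently alive

// Hypothetical conversion class: owns the text it exposes as a C-style string.
class int_as_c_str
{
public:
    explicit int_as_c_str(int n)
        : m_text(std::to_string(n))
    {
        ++live_conversions;
    }
    int_as_c_str(int_as_c_str const& rhs)
        : m_text(rhs.m_text)
    {
        ++live_conversions;
    }
    ~int_as_c_str()
    {
        --live_conversions;
    }
    operator char const*() const
    {
        return m_text.c_str();
    }
private:
    std::string m_text;
};

// Hypothetical shim overload: returns a temporary, not a pointer.
inline int_as_c_str c_str_ptr(int n)
{
    return int_as_c_str(n);
}

// Unsafe: keeps the pointer obtained from the temporary.
struct holds_pointer
{
    explicit holds_pointer(int n)
        : text(c_str_ptr(n)) // the temporary dies when this initializer ends
        , live_in_body(-1)
    {
        live_in_body = live_conversions; // 0 here, so text already dangles
    }
    char const* text;         // dangling; must never be dereferenced
    int         live_in_body;
};

// Safe: copies the characters while the temporary is still alive.
struct holds_copy
{
    explicit holds_copy(int n)
        : text(c_str_ptr(n))
    {}
    std::string text;
};

// Usage: the pointer version has lost its temporary before the body runs.
inline void example_lifetimes()
{
    holds_pointer p(42);
    holds_copy    c(42);

    assert(0 == p.live_in_body);
    assert("42" == c.text);
}

} // namespace demo
```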

This is a problem with wide potential, but thankfully we can avoid it by adhering to one simple rule.

Rule - If your constructor uses conversion (or access) shims, you must not hold onto the results of the shim functions in pointer (or reference) member variables.

This rule offers only three options. First, we could use (but not hold) the result pointer within the constructor body, as we did in the case of unixstl::glob_sequence (Section 17.3.5). Second, we could copy the result into a string member variable, as we did in the case of unixstl::readdir_sequence (Section 19.3.2).

The final option is to eschew the use of string access shims entirely and stick with C-style string pointers as constructor parameters. But this pushes all responsibility for conversion out to the application code, thereby violating the Principle of Composition (as far too many C++ libraries are wont to do).

Given that the IOStreams are anything but lightning-quick themselves, the prudent choice here is to err on the side of flexibility and safety. Thus stlsoft::ostream_iterator stores copies of the prefix and suffix arguments in member variables of string_type. The nice side effect of this is that the assignment operator implementation becomes extremely simple, just a single statement:

 *m_stm << m_prefix << value << m_suffix;
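For readers who want to experiment without installing STLSoft, here is a cut-down sketch of the same idea. It is char-only, takes std::string arguments rather than using shims, and declares its member types directly instead of deriving from std::iterator; the names demo::affix_ostream_iterator, indent_names, and example_copies are hypothetical. It replays the earlier "make a mess" scenario to show that an iterator holding its own copies is unaffected by later changes to the caller's string.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

namespace demo {

// Hypothetical, simplified stand-in for stlsoft::ostream_iterator.
template <typename V>
class affix_ostream_iterator
{
public: // Member types
    typedef std::output_iterator_tag iterator_category;
    typedef void                     value_type;
    typedef void                     difference_type;
    typedef void                     pointer;
    typedef void                     reference;

public: // Construction
    affix_ostream_iterator(std::ostream& os,
                           std::string const& prefix,
                           std::string const& suffix)
        : m_stm(&os)
        , m_prefix(prefix) // copies, not pointers
        , m_suffix(suffix)
    {}

public: // Assignment
    affix_ostream_iterator& operator =(V const& value)
    {
        *m_stm << m_prefix << value << m_suffix;
        return *this;
    }

public: // Output iterator methods
    affix_ostream_iterator& operator *()     { return *this; }
    affix_ostream_iterator& operator ++()    { return *this; }
    affix_ostream_iterator  operator ++(int) { return *this; }

private:
    std::ostream* m_stm;
    std::string   m_prefix;
    std::string   m_suffix;
};

// Indents each name with a tab, changing the caller's prefix string after
// the iterator has been constructed.
inline std::string indent_names(std::vector<std::string> const& names)
{
    std::ostringstream out;
    std::string        prefix("\t");

    affix_ostream_iterator<std::string> it(out, prefix, "\n");

    prefix = "something else"; // the iterator holds its own copy

    std::copy(names.begin(), names.end(), it);
    return out.str();
}

// Usage: the output reflects the prefix as it was at construction time.
inline void example_copies()
{
    std::vector<std::string> names;
    names.push_back("b64.h");
    names.push_back("b64.hpp");

    assert(indent_names(names) == "\tb64.h\n\tb64.hpp\n");
}

} // namespace demo
```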

Tip - Don't emulate undefined vulnerabilities in the standard without careful consideration.

std::ostream_iterator Compatibility

The first two constructors provide full compatibility with std::ostream_iterator. The third, three-parameter constructor provides the additional functionality. Defining a prefix-only iterator is straightforward: just specify the empty string ("") as the third parameter of the three-parameter constructor:

std::copy(impls.begin(), impls.end()
    , stlsoft::ostream_iterator<srchseq_t::value_type>(std::cout
                             , "\t", ""));

A Clash of Design Principles?

The constructors of stlsoft::ostream_iterator break an important guideline of consistency between overloads and default arguments. The guideline requires that overloads should behave as if they were implemented as one constructor with a number of default arguments. Additional arguments, which refine the behavior/state requested of the function, are stacked on the end.

Guideline - An overload that provides additional parameters should not change the sequence of parameters with respect to the overloaded method.

However, the third overloaded constructor of stlsoft::ostream_iterator specifies its refinement in the middle of the arguments. This is a consequence of applying the Principle of Least Surprise: A user will expect to specify a prefix before a suffix. Doing it the other way would result in client code such as the following:

std::cout << "Header Files:" << std::endl;
std::copy(headers.begin(), headers.end()
    , stlsoft::ostream_iterator<srchseq_t::value_type>(std::cout
                          , "\n","\t"));

This is actually a conflict between levels of discoverability. When viewing the class (template) in isolation, a second overload of (..., suffix, prefix) is probably more discoverable. But when viewed in action, the overload of (..., prefix, suffix) is definitely more discoverable. Given that, I feel that the result is worth transgressing the guideline and have opted for the latter. But it's an equivocal point, to be sure. You may see it differently.

Defining Stream Insertion Operators

You may wonder why this code works as well as it does, specifically why recls::stl::basic_search_sequence<>::value_type is compatible with ostream_iterator. The reason is that ostream_iterator can work with any type for which a stream insertion operator is defined.

One way we could have implemented this would be as shown in Listing Six. Note that recls::stl::basic_search_sequence<>::value_type is actually the class template recls::stl::basic_search_sequence_value_type<>, whose name is another win for succinctness. (Thankfully, you never need to specify it in client code.)

Listing Six: Stream Insertion Operators

namespace recls::stl
{
 std::basic_ostream<char>&
  operator <<(std::basic_ostream<char>&           stm
       , basic_search_sequence_value_type<char, . . .> const& v)
 {
  return stm << v.get_path();
 }
 std::basic_ostream<wchar_t>&
  operator <<(std::basic_ostream<wchar_t>&           stm
       , basic_search_sequence_value_type<wchar_t, . . .> const& v)
 {
  return stm << v.get_path();
 }
} // namespace recls::stl

However, there are three problems with this code. First, we've got two functions doing much the same thing. Second, we've had to explicitly choose the traits type that will be supported with basic_search_sequence_value_type. Third, we've assumed that the stream will be derived from std::basic_ostream. A better alternative, which addresses all three problems, is to define the operator as a function template, as shown next. (I separated the return from the insertion, just in case someone wrote an inserter and forgot to provide the expected return type: a mutable [non-const] reference to the stream.)

template<typename S, typename C, typename T>
S& operator <<(S& s, basic_search_sequence_value_type<C, T> const& v)
{ 
 s << v.get_path();
 return s;
} 

This can handle any traits type with which basic_search_sequence may be specialized and works with any stream type that can insert specializations of std::basic_string (the default string type of the recls/STL mapping). And with one additional little flourish—a string access shim, of course—we can make it compatible with any stream type that understands C-style strings.

template<typename S, typename C, typename T>
S& operator <<(S& s, basic_search_sequence_value_type<C, T> const& v)
{ 
 s << stlsoft::c_str_ptr(v.get_path());
 return s;
} 

Now you can stream to std::cout or even, should you so wish, to an instance of MFC's CArchive!
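The generic inserter can be tried out with standard components alone. In this sketch, demo::basic_entry is a hypothetical stand-in for basic_search_sequence_value_type, and demo::line_counter is a hypothetical sink that is deliberately not derived from std::basic_ostream (playing the role that CArchive plays above). The shim flourish is omitted, since both kinds of sink used here accept std::basic_string directly.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace demo {

// Hypothetical value type exposing get_path(), as the recls one does.
template <typename C, typename T = std::char_traits<C> >
class basic_entry
{
public:
    explicit basic_entry(std::basic_string<C, T> const& path)
        : m_path(path)
    {}
    std::basic_string<C, T> const& get_path() const
    {
        return m_path;
    }
private:
    std::basic_string<C, T> m_path;
};

// Hypothetical sink unrelated to the IOStreams: it just counts insertions.
struct line_counter
{
    line_counter()
        : lines(0)
    {}
    line_counter& operator <<(std::string const&)
    {
        ++lines;
        return *this;
    }
    int lines;
};

// One function template serves every character type, traits type, and
// stream type; the return is kept separate from the insertion.
template <typename S, typename C, typename T>
S& operator <<(S& s, basic_entry<C, T> const& v)
{
    s << v.get_path();
    return s;
}

// Usage: narrow stream, wide stream, and the non-IOStreams sink.
inline void example_inserters()
{
    std::ostringstream  narrow;
    std::wostringstream wide;
    line_counter        counter;

    narrow << basic_entry<char>("src/b64.c");
    wide << basic_entry<wchar_t>(L"src/b64.c");
    counter << basic_entry<char>("a") << basic_entry<char>("b");

    assert(narrow.str() == "src/b64.c");
    assert(wide.str() == L"src/b64.c");
    assert(2 == counter.lines);
}

} // namespace demo
```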

Tip - Implement generic insertion operators.

Summary

We've examined the std::ostream_iterator component and shown how, with relatively little effort, we can enhance its design to support the principles of Composition, Diversity, and Modularity. And, although we've willingly (but advisedly) transgressed an important C++ design principle, the component also supports the Principle of Least Surprise.


Matthew Wilson is a software development consultant for Synesis Software, and creator of the STLSoft libraries. Matthew can be contacted via http://www.synesis.com.au/. This article was excerpted from his book Extended STL, Volume 1 (ISBN 0321305507). Copyright (c) 2007 Addison Wesley Professional. All rights reserved.
