Pete offers up two brief but useful tutorials on DLLs and iostreams.
To ask Pete a question about C or C++, send e-mail to [email protected], use subject line: Questions and Answers, or write to Pete Becker, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.
Q
I ran up against an interesting problem today and I need more information to solve it. I will attempt to be as brief and direct as possible. I'm in an architectural environment where the application consists of multiple DLLs. I have written a system of objects that will manage a specific record set returned from a database query. To assist with that purpose I wrote a small template class. It lives only in the "transaction DLL."
We use a pipeline in the application to pass data to and from the user interface DLL and transaction DLL. We did manage this by multiple inheritance. But, my template will not allow the user interface DLL to build. I get unresolved symbols from the linker. All symbols in the error point to the template class in the transaction DLL. I have tried many different combinations of __export, including the newer __declspec symbols. The template methods are not used or called from the UI DLL. Because of the inheritance the header file for the base class (lives in the transaction DLL) is referenced in the derived class (lives in the UI DLL). So the question comes down to: How do I prepare instance member template classes for export in a DLL ?
Dave Knox
A
Despite the fact that you've used system-specific dirty words like "DLL" and "__export", you've asked a fairly common question, one that is encountered both by UNIX programmers when they create shared libraries, and by Windows programmers when they create DLLs. Since I have very little experience with UNIX, I won't get into the details of doing this sort of thing under UNIX. I will try to keep my comments general enough to be useful to UNIX programmers as well as to Windows programmers.
You must do two things when you export a template from a dynamically linked library[1] :
- ensure that the template is actually instantiated in the library, and,
- configure your tools to make the template's members available from outside the library.
The first step is easy in Standard C++. The second requires implementation-specific techniques. I'll show how to do these things with Borland C++, since that's the compiler I'm most familiar with. But I will also talk about why we're doing what we're doing, so that you can figure out how to do the same thing with your compiler.
Chances are, you've never had to force the compiler to instantiate templates for you. A template is implicitly instantiated when it is used; the compiler is then responsible for generating the necessary code. For example:
#include <vector> std::vector<int> vect; int main() { return 0; }
The #include directive pulls in the definition of the template vector, which is defined in the namespace std. The second line creates a vector of ints. In many implementations this will force the instantiation of all the members of vector<int>, making all those members available for use both within this translation unit and in other translation units in the same program. The current working paper, however, imposes a more restrictive requirement: it requires that the implementation instantiate only those members that are needed. In this case, the compiler should instantiate the default constructor, since that is clearly needed to construct the object vect; it should instantiate the destructor, which is needed when the program terminates; and it should instantiate any other members needed by the constructor or the destructor.
This limited instantiation poses no problem when you build an ordinary application: all the members used get instantiated, so everything you need is present in one or another of the translation units that make up your application. A DLL, however, is not an ordinary application. The problem you typically run into is that you don't use the template instance anywhere within the library, so the compiler doesn't instantiate it. Even if you create an object such as vect above, the compiler isn't supposed to instantiate any members that aren't needed, so you usually don't get all the members in your DLL. That's easily solved, though, through explicit instantiation:
#include <vector> template vector<int>;
The last line tells the compiler to instantiate all the members of vector<int> in the current translation unit. Do this in one of the source files for your DLL [2] and the compiler will build all the code needed for vector<int> into your library.
Getting the template instantiated within a DLL is only half of the problem, though. You must also tell your tools that the members of this template instantiation must be made available to users of the library. Here is where we depart from Standard C++, which has no notion of DLLs. Before talking about how to do it we need to look a bit more carefully at what DLLs are and how they are accessed from an application.
When most programmers use the word "library" they are referring to a statically linked library. Such a library is just a collection of functions and data packaged together for convenience when building programs. When you link with a static library the linker copies code from the library into your executable program, creating a single executable file that contains all the code that it needs to run on your operating system. If you create several executable files that all use code from a single library, each executable file will contain its own copy of the library code. In many cases using mulitple copies is a sensible thing to do, but this duplication of code is often unnecessary. That's where DLLs come in: you can put code into a DLL and share that library among several applications without having to duplicate the code for each application they all use the one copy of the code in the DLL. DLLs can save you disk space, since you don't have all those extra copies lying around. More important, DLLs can simplify your maintenance task, since upgrading the library involves nothing more than replacing the single library file. If you were using a statically linked library you'd have to relink every one of the applications that used it in order to benefit from the upgraded library.
Building an application that uses a DLL is a bit more complicated than building an application that uses a statically linked library. That's because of the separation between the application and the library; you must tell your compiler and linker what functions and data are supplied by which library.
In Windows the mechanism for doing this is called an import library. Using an import library consists of two steps: identifying the functions and data in the DLL to be made available to users, and creating an import library that contains these identifiers. You do the first with the keywords __export [3] and __import. In the DLL, any identifiers that should be made available to users should be marked with __export, as in:
void __export f() { } int __export usage_counter;
When you compile this code into a DLL the compiler and linker will team up to generate the appropriate information.
Once you've created the DLL you carry out the second step with a utility named IMPLIB, which generates an import library for this DLL:
implib example.lib example.dll
Now when you build your application you link it with EXAMPLE.LIB (the import library) so the linker knows which DLL to use when it's looking for these identifiers.
When you use an identifier whose definition is in a DLL you often must tell the compiler to generate extra code. That's where you use the __import keyword. In an application that uses our EXAMPLE.DLL, we would declare the identifiers like this:
void __import f(); extern int __import usage_counter;
Now the compiler can generate the proper code wherever these two identifiers are used.
In practice, these two keywords are usually encompassed in a macro so that you can use a single header file to provide all the declarations of the identifiers that the DLL exports[4] :
// EXAMPLE.H #if defined( BUILD_EXAMPLE_DLL ) #define EXAMPLE_DLL __export #else #define EXAMPLE_DLL __import #endif void EXAMPLE_DLL f(); extern int EXAMPLE_DLL usage_counter;
When you build the DLL you #define BUILD_EXAMPLE_DLL, which puts the __export keyword in the right places. When you build an executable that uses this DLL BUILD_EXAMPLE_DLL is not #defined, so the __import keyword is used instead.
When you're implementing a class the same general principle applies: you must export the member definitions from the DLL, and mark their identifiers as imported wherever they are used. This is easy to do, too: a class definition that uses the __export keyword will mark all of its members as exported, and a class definition that uses the __import keyword will mark all of its members as imported. The same macro technique introduced earlier handles the bookkeeping:
// CLASS.H #if defined( BUILD_CLASS_DLL ) #define CLASS_DLL __export #else #define CLASS_DLL __import #endif class CLASS_DLL C { public: C(); ~C(); void f(); private: static int i; };
When you compile this source code, the preprocessor will use #define CLASS_DLL to tell the compiler to generate the appropriate code to export C::C(), C::~C(), C::f(), and C::i. When you use this class, omit the #define BUILD_CLASS_DLL and the compiler will generate appropriate code to import these member functions.
To implement a template class in a DLL, combine the two techniques previously discussed. That is, explicitly instantiate the template in the dynamically linked library, and tell the compiler to export all of its members. Like this:
#include <vector> template __export vector<int>;
To use this template instantiation, tell the compiler that the implementation is in a DLL:
#include <vector> template __import vector<int>;
Of course, as before, you should put these declarations in a header, and switch between __import and __export with a suitable macro.
That's really all there is to it. DLLs aren't significantly more complicated than ordinary libraries, but they require an extension to the C and C++ notion of linkage. It's not enough to tell the compiler that an identifier is available only in the current translation unit or that it's available to all translation units; you must also tell the compiler that the identifier is to be made available to the other binary files making up an application. That's where the keywords __export and __import come in. With a bit of discipline their use becomes almost transparent, and you can create DLLs with almost no extra effort.
Q
I have often wanted to use operator<< on a stream and have the output go to two streams typically cerr and a logfile stream I have opened. I have tried deriving from ostream and then sent output to *this and my own private stream, but I always get fouled up with endl. I can override operator<< for all the other types, but endl, being a special animal, always chokes the compiler because I can't override anything to catch it. (I can use mystream << "\n" but it's not the same, and none of the other manipulators will work anyway.)
Phil Bowman
A
I'm sure most of us have tried this trick at one time or another and run into the sort of problem you're describing. It's actually fairly straightforward to produce the behavior you're looking for, but you have to know a bit more about the iostreams architecture to understand how. I'll begin with a brief tour of iostreams, then we'll talk about how to modify their behavior to do what you're looking for. Once you've seen how to do this you'll be in a much better position to create new types of streams.
Iostreams do two things: they translate to and from formatted text, and they talk to various sources and sinks of data. The classes that deal with formatting are named ios, istream, ostream, and iostream [5] . The abstract class that talks to data sources and sinks is named streambuf. The iostream library provides several families of classes derived from these five classes to enable your program to talk to disk files, strings, and so on. For example, to provide file I/O, the class fstreambase is derived from ios, the class ifstream is derived from istream, the class ofstream is derived from ostream, and the class fstream is derived from fstreambase and iostream. In addition, filebuf is derived from streambuf; it handles reading and writing data to a file. This may sound complicated, but it really isn't. Here's how it works.
The class ios keeps track of formatting options. So when you use the manipulator setw(4), the width field in your stream's ios object is set to the value 4. All the other formatting options work the same way: the various member functions and manipulators that set these options actually set fields in ios. In addition, ios holds a pointer to a streambuf, so when your code actually writes data to the stream it can figure out where to find the buffer.
The class ostream is derived from ios. ostream doesn't do much on its own, but it provides a hook for all the stream inserters. That is, when you write code like this:
ofstream out( "foo.txt" ); out << "Hello, world!\n";
the last line actually calls the function
ostream& operator << ( ostream&, const char * );
This works because out is an object of a type ofstream, which is derived from ostream. The compiler sees an overloaded function named operator << that takes a reference to an ostream and a const char *, and it calls that function.
The class istream is a lot like ostream: it is derived from ios, and provides a hook for the stream extractors, which are all of the form
istream& operator >> ( istream&, type& );
Finally, the class iostream combines the capabilities of istream and ostream. Any class derived from iostream can be used for both input and output.
Now, these four ios, istream, ostream, and iostream tend to get the most attention from programmers, writers, and teachers. They're important, as we've seen, because they provide support for most of the user-visible operations that are done on streams. In particular, when you write stream inserters and extractors for your own classes, you must know how these four classes interact in order to make your inserters and extractors work correctly. Unfortunately, the prominence of these four classes leads many people to overlook the role of streambuf. streambuf really provides much of the capability of iostreams, and should not be as neglected as it is.
Whenever you insert data into an ostream you are actually inserting it into an object of a type derived from streambuf. This insertion occurs via a call to streambuf's member function sputc, which inserts a single char into the streambuf, and sputn, which inserts the contents of an array of char into the streambuf. Similarly, reading data from a streambuf occurs through its member functions sgetc and sgetn[6] .
Down underneath, streambuf provides buffering and a set of virtual functions which are overridden to talk to a particular data source or sink. When a stream attempts to write data to a streambuf and the buffer is full, streambuf calls either overflow for a single character or xsputn for multiple characters. Similarly, when a stream attempts to read data from a streambuf and the buffer is empty, streambuf calls underflow for a single character or xsgetn for multiple characters.
In the case of a file stream, as we've already seen, the class filebuf handles communications with the file system. filebuf is derived from streambuf, and it overrides overflow, xsputn, underflow, and xsgetn. The code that reads and writes data to files lives in these four functions. In addition, filebuf has a member function, open, that takes the name of a file, so that we can tell it which file to talk to. So whenever we need file I/O we need to create a filebuf.
One way to create a filebuf is to create one explicitly, and set up an ostream to talk to that filebuf. Like this:
filebuf fbuf; fbuf.open( "foo.txt", ios::out ); ostream out(&fbuf);
The first line creates the filebuf object. The second opens a file for output, and the third creates an ostream that will talk to that file. Now you can use inserters to write data to the file named foo.txt.
But that's a lot of work when all you want to do is write data to a file. There's an easier way: just create an ofstream.
ofstream out("foo.txt");
This does basically the same thing as the previous code snippet. The constructor for ofstream creates a filebuf and tells the underlying ios object where to find this filebuf. A typical implementation of ofstream will simply contain a data object whose type is filebuf, and it will pass the address of this filebuf down to ios.
Now that we've completed our tour of the iostream architecture, let's look at how to modify the behavior of iostreams. As you've probably guessed from my emphasis on streambufs, that's where the answer lies. So let's begin by creating our own class, derived from streambuf, that writes to two streambufs. I said earlier that there are four virtual functions we must override to make this work. However, here we're only concerned with output, so we can just override the two functions that handle output; xsputn [7] and overflow. We also need pointers to the two streambufs that we'll be writing to. So our initial class definition looks like this[8] :
class teebuf : public streambuf { public: teebuf(streambuf *s1, streambuf *s2) : buf1(s1), buf2(s2) {} protected: streamsize xsputn (const char_type* s, streamsize n); int_type overflow( int_type c ); private: streambuf *buf1; streambuf *buf2; };
Implementing the two virtual functions is simple. Each one just calls the appropriate output function in each of the two streambufs, and returns the proper value.
streamsize teebuf::xsputn( const char *s, streamsize n ) { int res1, res2; res1 = buf1->sputn( s, n ); res2 = buf2->sputn( s, n ); return (res1 < res2) ? res1 : res2; } int teebuf::overflow( int c ) { buf1->sputc( c ); buf2->sputc( c ); return c; }
Now we have enough code that we can create a test program. Something like this:
int main() { filebuf f; f.open( "foo.txt", ios::out | ios::trunc ); teebuf t( &f, cout.rdbuf() ); ostream out( &t ); out << "Hello, world!\n"; return 0; }
The one thing in this code I haven't mentioned is the use of member function rdbuf on cout. The class ios provides this function, which returns a pointer to the streambuf to which that stream is connected. Since all our stream types are derived from ios, they all provide this member function.
Now, that's still a lot of code to have to write in your application. As we saw when we looked at filebuf, though, it's easy enough to encapsulate the code for building a streambuf in a class derived from ostream. Like this:
class teestream : public ostream { public: teestream( ostream& o1, ostream& o2 ); private: teebuf buf; }; teestream::teestream( ostream& o1, ostream& o2 ) : ostream(&buf), buf( o1.rdbuf(), o2.rdbuf() ) { }
That's really all there is to it. The constructor for teestream tells its ostream base where the streambuf is, and initializes the streambuf. Now we can use teestream in our code:
int main() { ofstream fout( "foo.txt" ); teestream out( fout, cout ); out << "Hello, world!\n"; return 0; }
C++'s iostreams provide a great deal of flexibility. Programmers are often unaware of what iostreams are actually capable of, and confine themselves to providing overloaded inserters and extractors for their newly defined types. That's certainly important, but iostreams are capable of a great deal more. Sometimes examining a part of the class hierarchy you haven't spent much time with will lead to the best solution. Take the time to understand the role of streambuf in the iostream architecture. You may be surprised at what it can do.
Notes
[1] Microsoft seems to like to drop suffixes in their technical terms, so they call a dynamically linked library a "dynamic link library." This affectation also gives us such grating terms as "protect mode" instead of "protected mode."
[2] This works in a regular library too, of course.
[3] Don't confuse __export with export, a new keyword in the working paper that's used to control template compilation.
[4] You must use a separate macro for each DLL.
[5] In reality these are all typedef names for template instantiations. For example, ios is actually a typedef for basic_ios<char>. (I don't have enough space here to talk about these templates, but if you're doing basic char I/O you don't need to worry about the underlying template details.)
[6] I've omitted the input functions sbumpc and snextc, as well as a double handful of other members that deal with input and output, in order to keep this description reasonably short.
[7] If you're using Borland C++, the name of this function is actually do_sputn. That's the name that was used in the original AT&T implementation of iostreams. In general, when you're deriving from streambuf, you need to check the actual names used in your implementation. Over the next few years this problem should go away.
[8] Those peculiar-looking type names int_type and char_type are actually typedefs defined in streambuf. These typedefs provide flexibility when we're instantiating a template with a new character-like type. If you find them confusing, think of them as int and char for purposes of this discussion.
Pete Becker is a consultant who specializes in teaching C and C++ to experienced programmers. He spent eight years in the C++ group at Borland international, both as a developer and manager. He is a member of the ANSI/ISO C++ standardization committee. He can be reached by e-mail at [email protected].