You can sometimes speed up a large rebuild by stockpiling a few template specializations.
A common frustration with the use of templates in C++ is having to put the template class implementations into header files. This practice plays havoc with dependency checking during builds, where the separation of implementation from declaration normally helps to isolate the rebuild ripple caused by making changes to the sources. This article discusses a technique called explicit instantiation, which can be used to maintain the normal separation between headers and implementation. Although the technique is not well documented, it has a solid basis in the Standard. It helps prevent excessive rebuilds, and it also helps to manage unnecessary code bloat due to use of templates.
First, a quick review of the compile and link process. In the simplest case, each C++ implementation file ("compilation unit") is compiled into an intermediate object file. If this compilation unit references functions or variables that are defined in another compilation unit, then these references are marked as unresolved. When the object files are linked together, the linker is responsible for resolving all references to create a final executable.
The use of template classes confuses this situation. Because the compiler writes template functions as it needs them, the same template function can be put into many object files. In the worst case, every object file will contain an identical copy of every template function. Some developers won't use templates because they believe that these copies will appear in the final executable. In reality, today's modern linkers weed out the duplicates and leave only a single final copy of each function in the executable.
The changes to your source code to move template functions out of the headers are minimal. For illustration purposes I have written a simple Set collection class. It is shown in Listings 1, 2, and 3. Listing 1 shows a typical declaration of a template for class Set. Listing 2 shows the implementation of Set, and Listing 3 shows Set being used. Normally the declaration and implementation of Set would be in a single file, such as Set.h, and the class would be used in one or more other files, such as Main.cpp.
The first change is to put the source code for the member functions into their own source file, just like a non-template class would be set up. For the example class, you would take the contents of Listing 2 out of Set.h and put it into Set.cpp. (Note that inline member functions of template classes must remain in the header file, just like they do for regular classes.) If you try building an application after doing this, you will end up with unresolved externals for all the functions in Set that were referenced by Main.cpp. This is the point at which most people give up and infer that template implementations simply aren't meant to go into implementation files.
The key to making it work is to force the compiler to write all the template functions in Set.cpp. Based on your use of templates the rest of the time, you might think that you would have to explicitly call each function to convince the compiler to write the associated code, but this is not true. You can force the compiler to write all of the functions with a single statement:
This is called explicit instantiation. It is described in the draft C++ Standard dated "2 December 1996" in section 14.7.2 and is briefly discussed in the Annotated Reference Manual (ARM) as a recent modification to the language.
This statement must go in the same file with the implementation of the template class. For the example, that would be Set.cpp. There must be one such statement for each variable type used elsewhere in the program to instantiate the template class. Therefore, you might end with something like this in Set.cpp:
template Set<float>; template Set<int>; template Set<complex>; template Set<string>;
Putting all the definitions in one place like this seems to incite a religious fervor in many people. I will talk about some of the issues involved shortly.
One worry that you can usually forget about with most compilers is code bloat. With the "normal" use of templates, the compiler writes only the functions that it actually needs, so your final executable will not be bloated by template functions that are never used. The technique I am describing forces every function to be written regardless of whether it is actually used, but many modern linkers will detect functions that are never called and automatically remove them. For example, Microsoft's Visual C++ calls this "function-level linking" or "packaged functions." It always performs this analysis during Release builds.
Why It Works
The first step in understanding why this technique works is to realize that the compiler handles a reference to a function in a template class almost exactly like a reference to a function in a normal class. First, the compiler generates a decorated name for the function. If possible, the name is resolved immediately; otherwise, the compiler generates an unresolved reference. Finally, at the end of each compilation unit, the compiler "writes" the final implementation for all the functions that were based on templates whose definitions are available. This is why the instantiation statements can be placed anywhere in the implementation file.
For those functions whose definitions were not available, the compiler puts an unresolved reference into the object file in the hope that the linker will be able to resolve it. The unresolved external is defined by a decorated name. This name includes the type(s) for the template, so the linker can disambiguate calls to various instantiations of the template class.
As with most programming techniques, there are tradeoffs. The advantages of this technique are largely related to the logistics of managing a base of code, and these benefits are most apparent in larger projects. The first and biggest advantage is that the implementation is separated from the definition, and changes to the implementation no longer cause massive rebuilds because of dependency checking. This issue is particularly noticeable when it's necessary to do further development on a template class that is already widely used in the project.
The second advantage is a little more obscure. In my experience developers often forget that each new template/type combination can cause the compiler to write pages of object code. One line of source code that uses a new template type can easily generate 10K of object code. When this happens, an executable will end up with, for
example, instantiations of a template array for int, long, unsigned int, and unsigned long. It would be better if developers could look and see if any existing template combinations could be reused. In practice, it can be very difficult to track down what combinations already exist because the declarations are scattered throughout the code. Using explicit instantiation, however, all the definitions become grouped into a single location. It then becomes easy to examine the list to see if any can be reused.
The practical disadvantage is that it can be difficult to identify when a template instantiation is no longer needed. This could happen as a code base evolves and code is removed or rewritten. Ideally, the compiler can remove the functions, but this is not always guaranteed. It may be necessary to periodically cull the template instantiation declarations to remove unused instantiations.
The theoretical disadvantage, and this is the one that bothers the purists, is that knowledge of what types are being used is not tightly encapsulated. You cannot just add a member variable to a class, you may also have to add a related definition to the template implementation file if the member variable uses a new variation on the template. It is my belief that in particularly large projects this break with theoretical purity is easily outweighed by the number of programmer-hours saved. Team members do not have to wait through long rebuilds because a template's implementation was changed.
It is possible to have the best of both worlds by including Set.cpp in Set.h during release builds and including Set.h in Set.cpp in debug builds. Using this method, debug builds put the member functions in the implementation file and release builds put the member functions in the header file. Don't forget to put an #ifdef around the explicit instantiation statements so that they will not be used in a release build.
In this article, I have talked about how to use explicit instantiation to move the implementation of a template class out of a header file and into a standalone implementation file. I also talked about why it works and how it affects the maintenance of a base of source code. As the rate of product development continues to increase, this technique can be a dramatic time-saver in projects with heavy dependencies on template classes. o
Jim Beveridge is the author of Multithreading Applications in Win32 from Addison-Wesley. He is a the Chief Technical Leader at Turning Point Software, where he specializes in high-quality C++/MFC development for the shrink-wrap consumer market. Jim holds a BS in Computer Science from Rochester Institute of Technology. He can be reached at firstname.lastname@example.org.