Improving C++ Program Performance

By Stanley B. Lippman, October 01, 1999

Stanley examines the three most common strategies for C++ program speedup, then points out that it is often enough to simply review the code for inappropriate C++ programming idioms.

Stan is a consultant specializing in C++ and object-oriented programming, design, and performance improvement. He can be contacted at [email protected] or http://www.objectwrite.com/.

It is as important for C++ programmers to recognize when not to define copy constructors, copy-assignment operators, and destructors, as it is to recognize when these special member functions are necessary. Consider, for example, the timings in Table 1.

The baseline times represent an execution of the program with the full set of special member functions defined: a copy constructor, copy-assignment operator, and destructor. In addition, none of the class member functions or supporting friend functions is declared inline. The absence of inlining and the presence of the three special member functions follow a moderately widespread coding standard for C++ (for example, see Taligent's Guide to Designing Programs, Addison-Wesley, 1994, ISBN 0-201-40888-0).

For argument's sake, say that the performance of an application is too slow and you have been given the task of speeding it up. What should you do?

Three common strategies for program speedup are:

1. Substitution of a more efficient algorithm for one currently in use (the algorithm is generally more complex as well; for example, moving from a bubble sort to a quicksort sorting algorithm).

2. Substitution of a more efficient data structure for one currently in use (the data structure is generally more complex as well; for example, moving from a binary tree data structure to a red-black tree data structure).

3. Substitution of a more efficient code sequence for one currently in use (the performance increase in this case is often accompanied by a decrease in the clarity of the control flow and the understandability of the code sequence).

In many cases, resorting to any of these three strategies is unnecessary. Often, it is enough to simply review the code for inappropriate C++ programming idioms. One such idiom is the unnecessary definition of a class's copy constructor, copy-assignment operator, and destructor. Listing One represents the baseline implementation of the Vector class and program. Modification #1 simply adds an inline specification to each friend and member function. Modification #2 removes the definitions of the copy constructor, copy-assignment operator, and destructor. Modification #3 is a combination of Modifications #1 and #2. The program versions were compiled under the 7.2 SGI compiler; timings were generated using the timex command.

Modification #1: Inlining

The first modification is to inline the member and friend functions of the Vector class. It's a simple enough change: You add the keyword inline to each candidate function definition; otherwise, you leave the code alone. Not surprisingly, you gain a performance increase of about 20 percent. (In general, an application moving from no inlining to appropriate inlining can expect a 20 to 25 percent performance increase.)
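As a minimal sketch of what this modification looks like, the fragment below uses a hypothetical Vec3 class (a cut-down stand-in for the Vector class, not the listing itself). A member function defined inside the class body is implicitly inline; one defined outside it needs the keyword. Keep in mind that inline is a request, and the compiler may still decline to expand a call in place.

```cpp
#include <cassert>
#include <cmath>

// Vec3 is a hypothetical, cut-down stand-in for the article's Vector class.
class Vec3 {
public:
    Vec3(double x = 0.0, double y = 0.0, double z = 0.0)
        : _x(x), _y(y), _z(z) {}

    // Defined inside the class body: implicitly inline.
    double x() const { return _x; }

    double mag() const;
    double dot(const Vec3& rhs) const;

private:
    double _x, _y, _z;
};

// Defined outside the class body: the inline keyword is written explicitly.
// Such definitions belong in the header so every caller can see the body.
inline double Vec3::mag() const
    { return std::sqrt(_x*_x + _y*_y + _z*_z); }

inline double Vec3::dot(const Vec3& rhs) const
    { return _x*rhs._x + _y*rhs._y + _z*rhs._z; }

// Small self-check: a 3-4-0 vector has magnitude 5.
inline void check_vec3()
{
    Vec3 v(3.0, 4.0, 0.0);
    assert(v.mag() == 5.0);
    assert(v.dot(Vec3(1.0, 0.0, 0.0)) == 3.0);
}
```

Because the function bodies now live in the header, any change to them forces every including file to be recompiled, which is the cost discussed next.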

If inlining provides an improvement in program performance without significant effort on your part, then why do many coding standards recommend against the use of inlining? Its primary disadvantage is the need to recompile each file that includes a modified inline function. For a class library used in a large project, the recompilation costs can be significant.

The same problem exists in C++ when you add or remove a data member of a widely used class, or modify the type of an existing data member. For a complete solution to the issue of recompilation, you really need to turn to an alternative object model, such as that supported by COM (see, for example, Essential COM, by Don Box, Addison-Wesley Longman, 1998, ISBN 0-201-63446-5; and Inside the C++ Object Model, by Stanley Lippman, Addison-Wesley, 1996, ISBN 0-201-83454-5).

The prohibition on the use of inlining in project coding standards is, at best, misguided. On the other hand, not every function within a C++ program should be made inline. Another potential disadvantage of inlining is program memory bloat. Candidate functions for inlining, in general, should be small (such as the Vector member functions). In practice, effective inlining requires judgment on your part.

Modification #2: Removal of Unnecessary Member Functions

A second possible modification is to remove the unnecessary copy constructor, copy-assignment operator, and destructor from the Vector class. (For this modification, I've left the member functions as noninline. In modification #3, I've combined the two.) It's an even more straightforward change than adding an inline specification; you simply delete the declaration and definition of the three functions. This yields a performance improvement of approximately 40 percent for this program.

While this result is often a surprise to programmers, it shouldn't be if you consider the underlying compiler behavior.

When Do You Need Explicit Copy Operators?

By default, one class object is initialized or assigned to another by memberwise copy. Conceptually, in the Vector class, first the x-, then y-, and then z-coordinate members are copied in turn. That, in fact, represents your explicit implementation; see Listing One.

In practice, however, the compiler does not copy objects that way:

It does not generate an explicit copy constructor and copy-assignment operator.
It does not copy each coordinate data member in turn. Rather, it does a direct bitwise copy of the entire data content of the class in one operation, the same as is done in C.

No default memberwise functions are generated by the compiler. The difference between our explicit implementation and the implicit handling by the compiler, then, is significant. This is reflected in performance.
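The difference can be checked at compile time in a modern compiler. The sketch below assumes C++11's <type_traits> header, which postdates this article, and uses two hypothetical structs in place of the Vector class. It shows that a class with no user-written copy operations may legally be copied as raw bytes, whereas writing out the copy constructor by hand takes that option away.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Both structs are hypothetical stand-ins for the article's Vector class.

// No user-written copy operations: the compiler handles copying itself.
struct ImplicitCopy {
    double x, y, z;
};

// Same data, but with the member-by-member copy constructor written out.
struct ExplicitCopy {
    double x, y, z;
    ExplicitCopy(double a, double b, double c) : x(a), y(b), z(c) {}
    ExplicitCopy(const ExplicitCopy& rhs) : x(rhs.x), y(rhs.y), z(rhs.z) {}
};

static_assert(std::is_trivially_copyable<ImplicitCopy>::value,
              "may be copied as raw bytes");
static_assert(!std::is_trivially_copyable<ExplicitCopy>::value,
              "the user-provided copy constructor must be called");

// For a trivially copyable type, copying the bytes is a valid copy.
inline ImplicitCopy bitwise_copy(const ImplicitCopy& src)
{
    ImplicitCopy dst;
    std::memcpy(&dst, &src, sizeof dst);
    return dst;
}

// Small self-check: the byte copy reproduces all three members.
inline void check_bitwise_copy()
{
    ImplicitCopy a = { 1.0, 2.0, 3.0 };
    ImplicitCopy b = bitwise_copy(a);
    assert(b.x == 1.0 && b.y == 2.0 && b.z == 3.0);
}
```

Whether a given optimizer also reduces the hand-written constructor to a block copy depends on the compiler and its settings; the type traits only report what the language permits.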

Is this true in all cases? No. In general, a compiler synthesizes a copy constructor and copy-assignment operator in four cases:

When the class contains a member class object that has an associated copy constructor and/or copy-assignment operator.
When the class is derived from one or more base classes that have an associated copy constructor and/or copy-assignment operator.
When a class declares or inherits a virtual function.
When a class is either directly or indirectly derived from a virtual base class.

The C++ Standard refers to the copy constructor and copy-assignment operator of such classes as nontrivial. Those of the Vector class, on the other hand, are trivial; that is, none of these four conditions holds. (For a more detailed discussion, see my book Inside the C++ Object Model.)
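The four conditions can be illustrated one class at a time. The following is a sketch only: the class names are hypothetical, and the checks rely on C++11's std::is_trivially_copy_constructible, which was not available when this article was written.

```cpp
#include <string>
#include <type_traits>

// All class names here are hypothetical, chosen to match the four cases.

struct Plain { double x, y, z; };            // none of the four conditions

struct HasMember { std::string name; };      // 1. member with a copy constructor

struct Base {
    Base() {}
    Base(const Base&) {}                     // user-written copy constructor
};
struct Derived : Base { double x; };         // 2. base with a copy constructor

struct HasVirtual {                          // 3. declares a virtual function
    virtual void f() {}
    double x;
};

struct VBase { double x; };
struct VDerived : virtual VBase { double y; };   // 4. virtual base class

static_assert(std::is_trivially_copy_constructible<Plain>::value,
              "Plain: bitwise copy suffices");
static_assert(!std::is_trivially_copy_constructible<HasMember>::value,
              "case 1: the member's copy constructor must run");
static_assert(!std::is_trivially_copy_constructible<Derived>::value,
              "case 2: the base's copy constructor must run");
static_assert(!std::is_trivially_copy_constructible<HasVirtual>::value,
              "case 3: the virtual table pointer must be set");
static_assert(!std::is_trivially_copy_constructible<VDerived>::value,
              "case 4: the virtual base must be located and copied");
```

If any of the assertions did not hold, the file would fail to compile, so a clean build is the whole test.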

Should you never define a copy constructor and copy-assignment operator, then? Of course not. In some cases, it is necessary to provide explicit instances. The primary condition requiring you to provide explicit instances is when your class contains one or more pointer members that, during the lifetime of the class object, are either initialized or assigned the address of heap memory. That memory is subsequently deleted within the class destructor. Recognizing this condition is an essential aspect of C++ programming. (For a more leisurely and example-driven discussion, see C++ Primer, Third Edition, by Stanley Lippman and Josée Lajoie, Addison-Wesley Longman, 1998, ISBN 0-201-82470-1.)
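By way of contrast with Vector, here is a minimal sketch of a class that does need all three functions. Samples is a hypothetical class, not part of the article's program; it owns a heap array through a pointer member, so the default memberwise copy would leave two objects deleting the same memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Samples is a hypothetical class that owns heap memory through a pointer.
class Samples {
public:
    explicit Samples(std::size_t n)
        : _size(n), _data(new double[n]()) {}      // zero-initialized array

    // Copy constructor: allocate a separate array and copy the elements.
    Samples(const Samples& rhs)
        : _size(rhs._size), _data(new double[rhs._size])
        { std::copy(rhs._data, rhs._data + rhs._size, _data); }

    // Copy assignment: build the new array first, then release the old one.
    Samples& operator=(const Samples& rhs)
    {
        if (this != &rhs) {
            double* fresh = new double[rhs._size];
            std::copy(rhs._data, rhs._data + rhs._size, fresh);
            delete[] _data;
            _data = fresh;
            _size = rhs._size;
        }
        return *this;
    }

    // Destructor: release the memory acquired by the constructors.
    ~Samples() { delete[] _data; }

    double& operator[](std::size_t i) { return _data[i]; }
    const double& operator[](std::size_t i) const { return _data[i]; }
    std::size_t size() const { return _size; }

private:
    std::size_t _size;
    double*     _data;
};

// Small self-check: a copy owns its own storage.
inline void check_samples()
{
    Samples a(3);
    a[0] = 1.5;
    Samples b(a);
    b[0] = 2.5;
    assert(a[0] == 1.5 && b[0] == 2.5);
}
```

Without the explicit copy constructor, a and b in check_samples() would share one array, the write through b would be visible through a, and the array would be deleted twice.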

For a class with data members contained by value (such as the Vector class), default memberwise initialization and copy are more efficiently carried out by the compiler. It is less work for both the program and programmer. The work for you, in this case, is to recognize the condition that makes an explicit copy constructor and copy-assignment operator unnecessary.

When Do You Need an Explicit Destructor?

Similarly, if the data members of a class are contained by value, as are the three coordinate Vector members, no destructor is necessary. Not every class requires a destructor, even if you have provided one or more constructors for that class. Destructors serve primarily to relinquish resources acquired either within the constructor or during the lifetime of the class object, such as freeing a mutual exclusion lock or deleting memory allocated through operator new; see C++ Primer.
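The two situations can be sketched side by side. Both types below are hypothetical, and the example assumes C++11 (std::atomic_flag and <type_traits>), neither of which existed when this article was written. Point3 holds its members by value and needs no destructor; FlagGuard acquires a simple spin lock in its constructor and so must release it in its destructor.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>

// Members held by value: nothing to release, so no destructor is written.
struct Point3 { double x, y, z; };

static_assert(std::is_trivially_destructible<Point3>::value,
              "destroying a Point3 requires no code at all");

// FlagGuard is a hypothetical guard over a std::atomic_flag spin lock.
class FlagGuard {
public:
    // Acquire: spin until the flag was previously clear.
    explicit FlagGuard(std::atomic_flag& flag) : _flag(flag)
    {
        while (_flag.test_and_set(std::memory_order_acquire)) { /* spin */ }
    }

    // Release the resource acquired in the constructor.
    ~FlagGuard() { _flag.clear(std::memory_order_release); }

    // A guard represents one acquisition, so copying is disabled.
    FlagGuard(const FlagGuard&) = delete;
    FlagGuard& operator=(const FlagGuard&) = delete;

private:
    std::atomic_flag& _flag;
};

static_assert(!std::is_trivially_destructible<FlagGuard>::value,
              "the destructor has real work to do");

// Small self-check: the flag is clear again once the guard leaves scope.
inline void check_flag_guard()
{
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
    { FlagGuard guard(flag); }
    assert(!flag.test_and_set());
}
```

In production code the standard library's own lock types would normally be preferred; the guard is written out here only to show a destructor that earns its place.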

Why Not a Recommended Practice?

If not providing an explicit copy constructor and copy-assignment operator (and destructor) can improve the performance of your program without significant effort on your part, then why do many coding standards recommend always providing explicit instances of each? The issue, I believe, involves an underlying presumption about C++ programmers: can you be trusted to exercise judgment? (To a lesser extent, the same might apply to rules about inlining as well.) Not providing the instances when they are necessary usually results in serious program failures.

What are the choices? One approach is to allow you the freedom to choose, but ensure that the information necessary to make an informed choice is available. A second approach is simply to prescribe rules. At best, these rules prove appropriate in, say, 80-90 percent of the cases. (The choice between these two approaches is not just limited to how you manage a software project.)

In the Vector class, providing unnecessary, explicit instances of the copy constructor, copy-assignment operator, and destructor results not in an incorrect implementation but simply an underperforming implementation. Recognizing that lets you fine-tune your implementation without reengineering it.

Acknowledgment

Josée Lajoie reviewed an early draft of this article, and, as usual, provided many insightful comments and suggestions.

DDJ

Listing One

class Vector {    
    friend Vector 
           operator+( const Vector&, const Vector& );
    friend Vector 
           operator-( const Vector&, const Vector& );
public:
    Vector( double x=0.0, 
             double y=0.0, double z=0.0 );
    Vector( const Vector& );
    Vector& operator=( const Vector& );
    ~Vector();
    bool operator==( const Vector& );
    Vector operator/( double );

    double mag();
    double dot( const Vector& );
    Vector cross( const Vector& );
    void normalize();

    double x() const;
    double y() const;
    double z() const;

    void x( double newx );
    void y( double newy );
    void z( double newz );

private:
    double _x, _y, _z;
};
Vector::Vector( double x, double y, double z )
    { _x = x; _y = y; _z = z; }
Vector::Vector( const Vector &rhs )
    { _x = rhs._x; _y = rhs._y; _z = rhs._z; }
Vector::~Vector()
    {_x = 0.0; _y = 0.0; _z = 0.0; }
Vector& Vector::operator=( const Vector &rhs )
{
    _x = rhs._x; _y = rhs._y; _z = rhs._z;
    return *this;
}
Vector
operator+( const Vector &lhs, const Vector &rhs )
{
    Vector result;

    result._x = lhs._x + rhs._x;
    result._y = lhs._y + rhs._y;
    result._z = lhs._z + rhs._z;

    return result;
}
Vector
operator-( const Vector &lhs, const Vector &rhs )
{
    Vector result;

    result._x = lhs._x - rhs._x;
    result._y = lhs._y - rhs._y;
    result._z = lhs._z - rhs._z;

    return result;
}
#include <math.h>
double Vector::mag()
    { return sqrt( _x*_x + _y*_y +_z*_z ); }
Vector Vector::cross( const Vector &rhs )
{
    Vector result;

    result._x = _y * rhs._z - rhs._y * _z;
    result._y = _z * rhs._x - rhs._z * _x;
    result._z = _x * rhs._y - rhs._x * _y;

    return result;
}
void Vector::normalize()
{
    double d = mag();
    _x /= d; _y /= d; _z /= d;
}
double Vector::dot( const Vector &rhs )
    { return _x*rhs._x + _y*rhs._y + _z*rhs._z; }
bool Vector::operator==( const Vector &rhs )
{
    return _x == rhs._x && 
            _y == rhs._y && _z == rhs._z;
}
Vector
Vector::
operator /( double val )
{
    Vector result;

    if ( val != 0 ) {
        result._x = _x / val;
        result._y = _y / val;
        result._z = _z / val;
    }
    return  result;
}
double Vector::x() const { return _x; } 
double Vector::y() const { return _y; } 
double Vector::z() const { return _z; } 

void Vector::x( double newx ) {  _x = newx; } 
void Vector::y( double newy ) {  _y = newy; } 
void Vector::z( double newz ) {  _z = newz; } 

#include <vector>
int main()
{
    Vector a( 0.231, 2.4745, 0.023 ),
           b( 1.475, 4.8916, -1.23 );
    std::vector< Vector > vv;
    for ( int ix = 0; ix < 2000000; ++ix )
    {
        Vector c( a.mag(), b.dot(a), ix );

        vv.push_back( a.cross( c ));    
        vv.push_back( a + c );
        vv.push_back( a - c );
        vv.push_back( a / c.mag() );
        vv.push_back( a.cross( c ));

        if ( c == a )
             vv.push_back( b );
        else vv.push_back( a );

        c.normalize();
        vv.push_back( c );
    }
}
