I'll start with the punchline: If you're writing a function that can be implemented as either a member or as a non-friend non-member, you should prefer to implement it as a non-member function. That decision increases class encapsulation. When you think encapsulation, you should think non-member functions.
Surprised? Read on.
Background
When I wrote the first edition of Effective C++ in 1991, I examined the problem of determining where to declare a function that was related to a class. Given a class C and a function f related to C, I developed the following algorithm:
if (f needs to be virtual)
    make f a member function of C;
else if (f is operator>> or operator<<)
{
    make f a non-member function;
    if (f needs access to non-public members of C)
        make f a friend of C;
}
else if (f needs type conversions on its left-most argument)
{
    make f a non-member function;
    if (f needs access to non-public members of C)
        make f a friend of C;
}
else
    make f a member function of C;
This algorithm served me well through the years, and when I revised Effective C++ for its second edition in 1997, I made no changes to this part of the book.
In 1998, however, I gave a presentation at Actel, where Arun Kundu observed that my algorithm dictated that functions should be member functions even when they could be implemented as non-members that used only C's public interface. Is that really what I meant, he asked me? In other words, if f could be implemented as a member function or a non-friend non-member function, did I really advocate making it a member function? I thought about it for a moment, and I decided that that was not what I meant. I therefore modified the algorithm to look like this:
if (f needs to be virtual)
    make f a member function of C;
else if (f is operator>> or operator<<)
{
    make f a non-member function;
    if (f needs access to non-public members of C)
        make f a friend of C;
}
else if (f needs type conversions on its left-most argument)
{
    make f a non-member function;
    if (f needs access to non-public members of C)
        make f a friend of C;
}
else if (f can be implemented via C's public interface)
    make f a non-member function;
else
    make f a member function of C;
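To make two of the algorithm's branches concrete, here is a minimal sketch. The Rational class, its accessors, and the toString helper are hypothetical names invented for illustration; they are not part of the algorithm itself. The sketch shows an operator<< and an operator that needs conversions on its left-most argument, both written as non-members, and neither declared a friend because the public interface is enough.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical class, used only to illustrate the algorithm's branches.
class Rational {
public:
    // Deliberately not explicit, so an int converts implicitly to Rational.
    Rational(int numerator = 0, int denominator = 1)
        : n_(numerator), d_(denominator) {}

    int numerator() const { return n_; }
    int denominator() const { return d_; }

private:
    int n_;
    int d_;
};

// Branch: "f needs type conversions on its left-most argument".
// As a non-member, both 2 * r and r * 2 compile. It needs only the
// public interface, so it is not a friend. (No reduction to lowest
// terms is attempted in this sketch.)
Rational operator*(const Rational& lhs, const Rational& rhs) {
    return Rational(lhs.numerator() * rhs.numerator(),
                    lhs.denominator() * rhs.denominator());
}

// Branch: "f is operator>> or operator<<".
// The stream is the left operand, so this must be a non-member.
std::ostream& operator<<(std::ostream& os, const Rational& r) {
    return os << r.numerator() << '/' << r.denominator();
}

// Hypothetical helper that formats a Rational through operator<<.
std::string toString(const Rational& r) {
    std::ostringstream out;
    out << r;
    return out.str();
}
```

Had operator* been a member, an expression such as 2 * r would not compile, because implicit conversions are not applied to the object on which a member function is invoked.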
Since then, I've been battling programmers who've taken to heart the lesson that being object-oriented means putting functions inside the classes containing the data on which the functions operate. After all, they tell me, that's what encapsulation is all about.
They are mistaken.
Encapsulation
Encapsulation is a means, not an end. There's nothing inherently desirable about encapsulation. Encapsulation is useful only because it yields other things in our software that we care about. In particular, it yields flexibility and robustness. Consider this struct, whose implementation I think we'll all agree is unencapsulated:
struct Point { int x, y; };
The weakness of this struct is that it's not flexible in the face of change. Once clients started using this struct, it would, practically speaking, be very hard to change it; too much client code would be broken. If we later decided we wanted to compute x and y instead of storing those values, we'd probably be out of luck. We'd be similarly thwarted if we decided a superior design would be to look x and y up in a database. This is the real problem with poor encapsulation: it precludes future implementation changes. Unencapsulated software is inflexible, and as a result, it's not very robust. When the world changes, the software is unable to gracefully change with it. (Remember that we're talking here about what is practical, not what is possible. It's clearly possible to change struct Point, but if enough code is dependent on it in its current form, it's not practical.)
Now consider a class with an interface that offers clients capabilities similar to those afforded by the struct above, but with an encapsulated implementation:
class Point {
public:
    int getXValue() const;
    int getYValue() const;
    void setXValue(int newXValue);
    void setYValue(int newYValue);

private:
    ...                 // whatever...
};
This interface supports the implementation used by the struct (storing x and y as ints), but it also affords alternative implementations, such as those based on computation or database lookup. This is a more flexible design, and the flexibility makes the resulting software more robust. If the class's implementation is found lacking, it can be changed without requiring changes to client code. Assuming the declarations of the public member functions remain unchanged, client source code is unaffected. (If a suitable implementation has been adopted, clients need not even recompile.)
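As a rough illustration of that flexibility, the sketch below gives the same public interface two different private representations. The namespaces stored and packed, the array-based representation, and the client function sumOfCoordinates are all assumptions made for this example. The function template merely stands in for client source code that is written once against the public interface.

```cpp
#include <array>

namespace stored {
// Same representation as the struct: two separate ints.
class Point {
public:
    int getXValue() const { return x_; }
    int getYValue() const { return y_; }
    void setXValue(int newXValue) { x_ = newXValue; }
    void setYValue(int newYValue) { y_ = newYValue; }

private:
    int x_ = 0;
    int y_ = 0;
};
}  // namespace stored

namespace packed {
// Hypothetical alternative: both coordinates live in one array.
class Point {
public:
    int getXValue() const { return coords_[0]; }
    int getYValue() const { return coords_[1]; }
    void setXValue(int newXValue) { coords_[0] = newXValue; }
    void setYValue(int newYValue) { coords_[1] = newYValue; }

private:
    std::array<int, 2> coords_ = {{0, 0}};
};
}  // namespace packed

// Hypothetical client code. It touches only the public interface, so
// the same source text works with either representation.
template <typename PointT>
int sumOfCoordinates(const PointT& p) {
    return p.getXValue() + p.getYValue();
}
```

Only the eight member functions had to be written with knowledge of the data members; sumOfCoordinates is untouched by the change in representation.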
Encapsulated software is more flexible than unencapsulated software, and, all other things being equal, that flexibility makes it the superior design choice.
Degrees of Encapsulation
The class above doesn't fully encapsulate its implementation. If the implementation changes, there's still code that might break. In particular, the member functions of the class might break. In all likelihood, they are dependent on the particulars of the data members of the class. Still, it seems clear that the class is more encapsulated than the struct, and we'd like to have a way to state this more formally.
It's easily done. The reason the class is more encapsulated than the struct is that more code might be broken if the (public) data members in the struct change than if the (private) data members of the class change. This leads to a reasonable approach to evaluating the relative encapsulations of two implementations: if changing one might lead to more broken code than would the corresponding change to the other, the former is less encapsulated than the latter. This definition is consistent with our intuition that if making a change is likely to break a lot of code, we're less likely to make that change than we would be to make a different change that affected less code. There is a direct relationship between encapsulation (how much code might be broken if something changes) and practical flexibility (the likelihood that we'll make a particular change).
An easy way to measure how much code might be broken is to count the functions that might be affected. That is, if changing one implementation leads to more potentially broken functions than does changing another implementation, the first implementation is less encapsulated than the second. If we apply this reasoning to the struct above, we see that changing its data members may break an unknowably large number of functions: every function that uses the struct. In general, we can't count how many functions this is, because there's no way to locate all the code that uses a particular struct. This is especially true for library code. However, the number of functions that might be broken if the class's data members change is easy to determine: it's all the functions that have access to the private part of the class. That's just four functions (assuming none are declared in the private part of the class), and we know that because they're all conveniently listed in the class definition. Since they're the only functions that have access to the private parts of the class, they're the only functions that can be affected if those parts change.
Encapsulation and Non-Member Functions
We've now seen that a reasonable way to gauge the amount of encapsulation in a class is to count the number of functions that might be broken if the class's implementation changes. That being the case, it becomes clear that a class with n member functions is more encapsulated than a class with n+1 member functions. And that observation is what justifies my argument for preferring non-member non-friend functions to member functions: if a function f could be implemented as a member function or as a non-friend non-member function, making it a member would decrease encapsulation, while making it a non-member wouldn't. Since functionality is not at issue here (the functionality of f is available to class clients regardless of where f is located), we naturally prefer the more encapsulated design.
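A small sketch of that choice, using the Point class from earlier. The private data members and the functions translate and isOrigin are hypothetical, added only for illustration. Both functions could have been members, but because they are written purely in terms of the public interface, the number of functions with access to Point's private parts stays at four.

```cpp
class Point {
public:
    int getXValue() const { return x_; }
    int getYValue() const { return y_; }
    void setXValue(int newXValue) { x_ = newXValue; }
    void setYValue(int newYValue) { y_ = newYValue; }

private:
    // Assumed representation; the article leaves this part unspecified.
    int x_ = 0;
    int y_ = 0;
};

// Non-member, non-friend: cannot be broken by a change to x_ and y_,
// because it never sees them.
void translate(Point& p, int dx, int dy) {
    p.setXValue(p.getXValue() + dx);
    p.setYValue(p.getYValue() + dy);
}

// Also a non-member non-friend, for the same reason.
bool isOrigin(const Point& p) {
    return p.getXValue() == 0 && p.getYValue() == 0;
}
```

Clients call translate(p, 2, -3) rather than p.translate(2, -3). The functionality is the same, but if Point's data members change, only the four member functions need to be re-examined.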
It's important that we're trying to choose between member functions and non-friend non-member functions. Just like member functions, friend functions may be broken when a class's implementation changes, so the choice between member functions and friend functions is properly made on behavioral grounds. Furthermore, we now see that the common claim that "friend functions violate encapsulation" is not quite true. Friends don't violate encapsulation; they just decrease it, in exactly the same manner as member functions.
This analysis applies to any kind of member functions, including static ones. Adding a static member function to a class when its functionality could be implemented as a non-friend non-member decreases encapsulation by exactly the same amount as does adding a non-static member function. One implication of this is that it's generally a bad idea to move a free function into a class as a static member just to show that it's related to the class. For example, if I have an abstract base class for Widgets and then use a factory function to make it possible for clients to create Widgets, the following is a common but inferior way to organize things:
// a design less encapsulated than it could be
class Widget {
    ...                 // all the Widget stuff; may be
                        // public, private, or protected
public:
    // could also be a non-friend non-member
    static Widget* make(/* params */);
};
A better design is to move make out of Widget, thus increasing the overall encapsulation of the system. To show that Widget and make are related, the proper tool is a namespace:
// a more encapsulated design
namespace WidgetStuff {
    class Widget { ... };
    Widget* make( /* params */ );
}
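Filled in just enough to compile, the namespace-based layout might look like the sketch below. The name() function and the concrete Button class are hypothetical, and make takes no parameters here only to keep the example short. I have also assumed a std::unique_ptr return type in place of the raw Widget* above, so that ownership of the new object is explicit; that is a substitution for this sketch, not part of the design being discussed.

```cpp
#include <memory>
#include <string>

namespace WidgetStuff {

// Abstract base class; clients program against this interface.
class Widget {
public:
    virtual ~Widget() = default;
    virtual std::string name() const = 0;
};

// Hypothetical concrete Widget, present only so make has something
// to create.
class Button : public Widget {
public:
    std::string name() const override { return "button"; }
};

// Non-member, non-friend factory. It lives in the same namespace as
// Widget, which is what documents that the two are related.
std::unique_ptr<Widget> make() {
    return std::unique_ptr<Widget>(new Button());
}

}  // namespace WidgetStuff
```

Clients write WidgetStuff::make() instead of Widget::make(), and make has no access to anything non-public in Widget.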
Alas, there is a weakness to this design when templates enter the picture.