Channels ▼
RSS

C/C++

How Non-Member Functions Improve Encapsulation


I'll start with the punchline: If you're writing a function that can be implemented as either a member or as a non-friend non-member, you should prefer to implement it as a non-member function. That decision increases class encapsulation. When you think encapsulation, you should think non-member functions.

Surprised? Read on.

Background

When I wrote the first edition of Effective C++ in 1991, I examined the problem of determining where to declare a function that was related to a class. Given a class C and a function f related to C, I developed the following algorithm:

if (f needs to be virtual)
   make f a member function of C;
else if (f is operator>> or
         operator<<)
   {
   make f a non-member function;
   if (f needs access to non-public
       members of C)
      make f a friend of C;
   }
else if (f needs type conversions
         on its left-most argument)
   {
   make f a non-member function;
   if (f needs access to
       non-public members of C)
      make f a friend of C;
   }
else
   make f a member function of C;

This algorithm served me well through the years, and when I revised Effective C++ for its second edition in 1997, I made no changes to this part of the book.

In 1998, however, I gave a presentation at Actel, where Arun Kundu observed that my algorithm dictated that functions should be member functions even when they could be implemented as non-members that used only C's public interface. Is that really what I meant, he asked me? In other words, if f could be implemented as a member function or a non-friend non-member function, did I really advocate making it a member function? I thought about it for a moment, and I decided that that was not what I meant. I therefore modified the algorithm to look like this:

if (f needs to be virtual)
   make f a member function of C;
else if (f is operator>> or
         operator<<)
   {
   make f a non-member function;
   if (f needs access to non-public
       members of C)
      make f a friend of C;
   }
else if (f needs type conversions
         on its left-most argument)
   {
   make f a non-member function;
   if (f needs access to non-public
       members of C)
      make f a friend of C;
   }
else if (f can be implemented via C's
         public interface)
   make f a non-member function;
else
   make f a member function of C;

Since then, I've been battling programmers who've taken to heart the lesson that being object-oriented means putting functions inside the classes containing the data on which the functions operate. After all, they tell me, that's what encapsulation is all about.

They are mistaken.

Encapsulation

Encapsulation is a means, not an end. There's nothing inherently desirable about encapsulation. Encapsulation is useful only because it yields other things in our software that we care about. In particular, it yields flexibility and robustness. Consider this struct, whose implementation I think we'll all agree is unencapsulated:

struct Point {
   int x, y;
};

The weakness of this struct is that it's not flexible in the face of change. Once clients started using this struct, it would, practically speaking, be very hard to change it; too much client code would be broken. If we later decided we wanted to compute x and y instead of storing those values, we'd probably be out of luck. We'd be similarly thwarted if we decided a superior design would be to look x and y up in a database. This is the real problem with poor encapsulation: it precludes future implementation changes. Unencapsulated software is inflexible, and as a result, it's not very robust. When the world changes, the software is unable to gracefully change with it. (Remember that we're talking here about what is practical, not what is possible. It's clearly possible to change struct Point, but if enough code is dependent on it in its current form, it's not practical.)

Now consider a class with an interface that offers clients capabilities similar to those afforded by the struct above, but with an encapsulated implementation:

class Point {
public:
   int getXValue() const; 
   int getYValue() const; 
   void setXValue(int newXValue);
   void setYValue(int newYValue);

private:
  ...                 // whatever...
};

This interface supports the implementation used by the struct (storing x and y as ints), but it also affords alternative implementations, such as those based on computation or database lookup. This is a more flexible design, and the flexibility makes the resulting software more robust. If the class's implementation is found lacking, it can be changed without requiring changes to client code. Assuming the declarations of the public member functions remain unchanged, client source code is unaffected. (If a suitable implementation has been adopted, clients need not even recompile.)

Encapsulated software is more flexible than unencapsulated software, and, all other things being equal, that flexibility makes it the superior design choice.

Degrees of Encapsulation

The class above doesn't fully encapsulate its implementation. If the implementation changes, there's still code that might break. In particular, the member functions of the class might break. In all likelihood, they are dependent on the particulars of the data members of the class. Still, it seems clear that the class is more encapsulated than the struct, and we'd like to have a way to state this more formally.

It's easily done. The reason the class is more encapsulated than the struct is that more code might be broken if the (public) data members in the struct change than if the (private) data members of the class change. This leads to a reasonable approach to evaluating the relative encapsulations of two implementations: if changing one might lead to more broken code than would the corresponding change to the other, the former is less encapsulated than the latter. This definition is consistent with our intuition that if making a change is likely to break a lot of code, we're less likely to make that change than we would be to make a different change that affected less code. There is a direct relationship between encapsulation (how much code might be broken if something changes) and practical flexibility (the likelihood that we'll make a particular change).

An easy way to measure how much code might be broken is to count the functions that might be affected. That is, if changing one implementation leads to more potentially broken functions than does changing another implementation, the first implementation is less encapsulated than the second. If we apply this reasoning to the struct above, we see that changing its data members may break an unknowably large number of functions — every function that uses the struct. In general, we can't count how many functions this is, because there's no way to locate all the code that uses a particular struct. This is especially true for library code. However, the number of functions that might be broken if the class's data members change is easy to determine: it's all the functions that have access to the private part of the class. That's just four functions (assuming none are declared in the private part of the class), and we know that because they're all conveniently listed in the class definition. Since they're the only functions that have access to the private parts of the class, they're the only functions that can be affected if those parts change.

Encapsulation and Non-Member Functions

We've now seen that a reasonable way to gauge the amount of encapsulation in a class is to count the number of functions that might be broken if the class's implementation changes. That being the case, it becomes clear that a class with n member functions is more encapsulated than a class with n+1 member functions. And that observation is what justifies my argument for preferring non-member non-friend functions to member functions: if a function f could be implemented as a member function or as a non-friend non-member function, making it a member would decrease encapsulation, while making it a non-member wouldn't. Since functionality is not at issue here (the functionality of f is available to class clients regardless of where f is located), we naturally prefer the more encapsulated design.

It's important that we're trying to choose between member functions and non-friend non-member functions. Just like member functions, friend functions may be broken when a class's implementation changes, so the choice between member functions and friend functions is properly made on behavioral grounds. Furthermore, we now see that the common claim that "friend functions violate encapsulation" is not quite true. Friends don't violate encapsulation, they just decrease it — in exactly the same manner as member functions.

This analysis applies to any kind of member functions, including static ones. Adding a static member function to a class when its functionality could be implemented as a non-friend non-member decreases encapsulation by exactly the same amount as does adding a non-static member function. One implication of this is that it's generally a bad idea to move a free function into a class as a static member just to show that it's related to the class. For example, if I have an abstract base class for Widgets and then use a factory function to make it possible for clients to create Widgets, the following is a common, but inferior way to organize things:


// a design less encapsulated than it could be
class Widget {
   ...          // all the Widget stuff; may be
                // public, private, or protected

public:

   // could also be a non-friend non-member
   static Widget* make(/* params */);
};

A better design is to move make out of Widget, thus increasing the overall encapsulation of the system. To show that Widget and make are related, the proper tool is a namespace:

// a more encapsulated design
namespace WidgetStuff {
   class Widget { ... };
   Widget* make( /* params */ );
};

Alas, there is a weakness to this design when templates enter the picture.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Comments:

ubm_techweb_disqus_sso_-13eff35a200baa381ecd593ff47cf5cf
2014-01-13T14:21:04

Two things I don't buy so easily:

As I understand it, the article argues about functions that are candidates to 1) member functions; and to 2) non-member non-friend functions (NMNFF); that they should be implemented following the option 2). The reason is that changing the implementation won't affect them. This doesn't sound right to me, because if they are candidates to both, they don't use the implementation anyway and are therefore not broken by implementation changes regardless of where they are placed. In other terms, the fact that a public member doesn't necessarily rely on the implementation means that the first sentence of section "Encapsulation and Non-Member Functions" doesn't justify the second sentence of the same section as it intends to.

Also, the first sentence in the section "Interfaces and Packages" claims that a class's interface includes non-members. Later on, the effect of reducing a class's interface is indicated as an advantage of option 2) above because, among other things, it reduces the class's comprehensibility and maintainability. I fail to see how a class whose interface includes X public member functions and Y NMNFFs is more maintenable or comprehensible than the same class with X+Y member functions.

I would love to see a version of this argument that addressed these points more clearly.


Permalink
ubm_techweb_disqus_sso_-65fc933a0f69b070e4bc916f441088df
2012-08-16T22:25:28

As compiler writers know, at the runtime level, methods are merely functions with a hidden this argument. When designing D4, we solved the calling inconsistency problem by making any function post-fix invokable by the first agrument. So if you define a function like this:
Invert(Point point)
you can invoke it like this:
Invert(p)
or
p.Invoke()
I wonder if the syntax of C++ would allow this change to the language.

Great article BTW.


Permalink
ubm_techweb_disqus_sso_-68394d1fe18c90522b96620df114ac44
2011-11-08T12:00:26

Maybe the reasoning of a better grade of concurrency is questionable, If a function can be non-member non-friend class, if there is a change in the implementation this function wouldn't be affected because it only uses public members.

On the other hand, I think maintaining a minimal interface is the best reason to define a function as non-member.


Permalink

Video