How to make nice encapsulated code

For C++, but the concepts go beyond this language


The example

Let’s say you have two classes, each in its own module, plus a main file. That makes five files: A.h, A.cpp, B.h, B.cpp and main.cpp.

Your B.h file contains a very simple class definition:

class B
{
public:
    B() : val(1) {}
private:
    int val;
};

Your A.h is also quite simple:

#include "B.h"
class A
{
public:
    A() {}
private:
    B b;
};

There is nothing more in these classes so your B.cpp and A.cpp files are empty for the moment.

Now your main.cpp is:

#include "A.h"
int main()
{
    A a;
}

The issue

There is a fundamental issue in the example above: to compile the code, you had to include B.h in A.h because A contains B. This issue is twofold:

  • First, there is a practical problem: when you compile main.cpp, it includes A.h, which in turn includes B.h. This nested inclusion increases compilation time, and any change to B.h forces every file that includes A.h to be recompiled. In the example above the cost is negligible, but in a large project with many files and many nested inclusions within each module, it can become quite time-consuming.
  • Second, there is a conceptual problem: in the example above, nothing prevents you or another developer from instantiating B in main.cpp:
#include "A.h"
int main()
{
    A a;
    B b; // mouhaha!
}

This is bad. Unless B is genuinely useful in main.cpp, you probably don’t want anybody to instantiate B there, because it is an implementation detail of the A class and nothing more. Exposing it breaks the modular structure of your code.

On a more conceptual level, this is the same principle that is used everywhere in programming, computer science and even beyond, whether it is about buttons in a car, processes in an operating system, or functions in a C++ module: encapsulation. If you break encapsulation, your program will quickly become a mess.

There are two possibilities:

  • Either you are part of a team of programmers and, as the maintainer of modules A and B, you personally don’t want anybody else to use your B class. Maybe it doesn’t work as-is, maybe you will change its interface later; whatever the reason, it’s your class and it has to stay that way. The only thing the other programmers need to know about is the interface of the A class, because that is the one used at a higher level.
  • Or you are the only programmer on this program, and even if you are as intelligent as Einstein, you still won’t be able to remember what every module does and accesses across the whole program if each of them accesses everything everywhere. Trust me. You can grasp the program above, but you can’t maintain more than 3 or 4 modules this way.

So what’s the solution?

There are several, actually. Here are two of them:

First solution: use a pointer to B and a forward declaration

The file A.h changes to:

class B; // forward declaration: enough to declare a pointer to B
class A
{
public:
    A();
    ~A();
private:
    B* b;
};

The file A.cpp now contains something:

#include "B.h"
#include "A.h"
A::A() : b(new B())
{
}
A::~A()
{
    delete b;
}

This mostly fixes the problem, at least the compilation-time part: A.h no longer includes B.h, so main.cpp is not compiled against B’s definition. People who read A.h can still see that A is implemented with a B, but that may not be a very big deal.
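One thing the raw pointer above leaves open is copying: if an A is ever copied, both copies will delete the same B. A possible variation, sketched here in a single file for convenience, replaces B* with std::unique_ptr<B>. This is only one way to do it, not the only one. The get() and value() member functions and usage_sketch() are hypothetical names added for illustration; they are not part of the original classes.

```cpp
#include <cassert>
#include <memory>

// ---- A.h (sketch): B is only forward-declared ----
class B;

class A
{
public:
    A();
    ~A();               // must be defined where B is a complete type (A.cpp)
    int value() const;  // hypothetical accessor, for illustration only
private:
    std::unique_ptr<B> b;  // owns the B; also makes A non-copyable
};

// ---- B.h (as in the example, plus a hypothetical getter) ----
class B
{
public:
    B() : val(1) {}
    int get() const { return val; }  // hypothetical, not in the original
private:
    int val;
};

// ---- A.cpp (sketch): B is complete here ----
A::A() : b(new B()) {}
A::~A() = default;  // unique_ptr deletes the B for us

int A::value() const { return b->get(); }

// ---- stand-in for main.cpp ----
void usage_sketch()
{
    A a;
    assert(a.value() == 1);  // B's constructor sets val to 1
}
```

The destructor is still declared in A.h and defined in A.cpp: std::unique_ptr needs the complete definition of B at the point where it destroys the object, and only A.cpp has it.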


Second solution: use an opaque pointer

The file A.h changes to:

class A
{
public:
    A();
    ~A();
private:
    struct PrivateA; // defined only in A.cpp
    PrivateA* privateA;
};

The file A.cpp now contains:

#include "B.h"
#include "A.h"
struct A::PrivateA
{
    B b;
};
A::A() : privateA(new PrivateA())
{
}
A::~A()
{
    delete privateA;
}

There is no longer any mention of the B class in A.h. The whole implementation is hidden. In the member functions of A, you can still access b with privateA->b (privateA is a pointer, hence the arrow).
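To make the access through the opaque pointer concrete, here is a minimal single-file sketch. The get(), set(), value() and setValue() member functions and usage_sketch() are hypothetical additions for illustration. Deleting the copy operations is an assumption on my part, made so that two A objects can never delete the same PrivateA.

```cpp
#include <cassert>

// ---- B.h (as in the example, plus hypothetical accessors) ----
class B
{
public:
    B() : val(1) {}
    int get() const { return val; }  // hypothetical, not in the original
    void set(int v) { val = v; }     // hypothetical, not in the original
private:
    int val;
};

// ---- A.h (sketch): no mention of B at all ----
class A
{
public:
    A();
    ~A();
    A(const A&) = delete;             // assumption: A is not meant to be copied,
    A& operator=(const A&) = delete;  // otherwise privateA would be deleted twice
    int value() const;                // hypothetical member functions
    void setValue(int v);
private:
    struct PrivateA;                  // defined only in A.cpp
    PrivateA* privateA;
};

// ---- A.cpp (sketch) ----
struct A::PrivateA
{
    B b;
};

A::A() : privateA(new PrivateA()) {}
A::~A() { delete privateA; }

// Member functions reach b through the pointer, with ->
int A::value() const { return privateA->b.get(); }
void A::setValue(int v) { privateA->b.set(v); }

// ---- stand-in for main.cpp ----
void usage_sketch()
{
    A a;
    assert(a.value() == 1);  // B starts with val == 1
    a.setValue(42);
    assert(a.value() == 42);
}
```

Since B only appears in A.cpp, you can add members to PrivateA or swap B for something else without touching A.h, so files that include A.h do not need to be recompiled.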