On the Origin of smartref: Proxy Pattern

Erik Valkering
Plaxis

--

Last year at CppCon 2017, I was awarded the jury’s first prize for my poster about my smartref library, because of the high quality and innovation of the presented approach B-). A question I often got, both during the conference and on the Internet, was when such a library would be useful.

In this series of blog posts, I will try to explain where this innovation might be useful. To start with, I will discuss a design problem we faced at work, namely how to cope with very large data sets.

Disclaimer: all the examples are written in a mixture of C++11/14/17, but they should also work with plain C++11 if properly rewritten. A pre-C++11 version would probably be possible as well, although the smartref library itself is written using C++11/14/17 features.

TL;DR

Imagine we have one million vectors of 1 GB each. Keeping everything in memory is not possible. However, using a Proxy class, we can virtually keep them in memory:

https://gist.github.com/erikvalkering/b53226d7901823d7945adb6eb08161e4

By inheriting from using_<T>, the Proxy class ‘uses’ exactly the same interface as the underlying type. Because of this, we can reuse existing code without making it more complex. Furthermore, this interface is generated automatically, without having to write any boilerplate code.

Now, as soon as any member function is accessed, the data will be lazily loaded through the user-defined conversion function:

https://gist.github.com/erikvalkering/e633cd6e4be599989f869f33e2a2fc3c

The problem

In order to illustrate the problem, as well as the solution, I’ve simplified it a bit and used a CarDealership in the following code fragments.

Consider the following data model:

https://gist.github.com/erikvalkering/1892ff460a442c160920d7572515d41e

In the above example, CarDealership::showroom() can be used to determine what cars are currently present in the showroom and therefore available for purchase. In case a customer visits the car dealership's website, it's good to know that showroom_ is already in memory, such that the customer can quickly get an overview of which cars are available. In practice, the number of cars is quite small, about 50 of them, so it should also easily fit in memory:

https://gist.github.com/erikvalkering/7db3dcabb50f389e331684213c50c749
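For readers who cannot open the gists, here is a minimal sketch of what such a data model might look like. The Car fields and the countAffordable() helper are assumptions made for this illustration; only CarDealership, showroom() and showroom_ come from the text above.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical Car type; the fields are illustrative only.
struct Car {
    std::string brand;
    int price;
};

class CarDealership {
public:
    explicit CarDealership(std::vector<Car> showroom)
        : showroom_(std::move(showroom)) {}

    // Cars currently on display; small enough to keep in memory.
    std::vector<Car> &showroom() { return showroom_; }
    const std::vector<Car> &showroom() const { return showroom_; }

private:
    std::vector<Car> showroom_;
};

// Hypothetical consumer: counts the cars a customer can afford.
inline std::size_t countAffordable(const std::vector<Car> &cars, int budget) {
    std::size_t count = 0;
    for (const auto &car : cars)
        if (car.price <= budget) ++count;
    return count;
}
```

Any function written against vector<Car>, such as countAffordable(), can be handed dealership.showroom() directly.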

However, now consider that the CarDealership wants to keep track of all the cars that have ever been sold:

https://gist.github.com/erikvalkering/4192f9c374843c22d985018e43e5c2cc

Where previously we had only small showrooms, we are now dealing with something that will only grow bigger and bigger. This will of course have a major impact on the customer’s experience, as they now need to wait until all the sold cars have been loaded into memory, even though they are not interested in them at all. Furthermore, it is a waste of memory and, depending on the number of cars sold, the data might not even fit in memory.

The Proxy Pattern

One way to deal with this issue is to move the large data out of the CarDealership data model and handle it separately. The drawback of this approach is that it would lead to a decrease in reusability and an increase in complexity, since the developer now always has to think about where the data should come from and use the corresponding functions to process it.

Instead, we adopted a variant of the Proxy Pattern, more specifically one that allows for lazily loading the data from disk. This way, the small and large data sets are treated in exactly the same way, which results in much simpler and reusable code (compared to the previous approach).

Also, fully in line with C++’s motto “Don’t pay for what you don’t use”, we wanted a solution that would not require us to change the types of the small data sets that we were already using. That would not be possible with the inheritance-based implementation of the Proxy Pattern that you often see.

Leaving out many details, the Proxy class we came up with looked something like this:

https://gist.github.com/erikvalkering/07c24049b431e5335078192efe925dab
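As a rough idea of what such a class could look like, the following is a minimal C++14 sketch and not the code we used in production. The loader callback is an assumption that stands in for "read from disk", and loaded() and countLoads() exist only to make the lazy behaviour observable.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Sketch of a lazy-loading proxy. The loader callback is a stand-in
// for reading the data from disk (an assumption for this example).
template <typename T>
class Proxy {
public:
    explicit Proxy(std::function<T()> loader) : loader_(std::move(loader)) {}

    // User-defined conversion: loads the data on first use only.
    operator T &() {
        if (!data_)
            data_ = std::make_unique<T>(loader_());
        return *data_;
    }

    bool loaded() const { return data_ != nullptr; }

private:
    std::function<T()> loader_;
    std::unique_ptr<T> data_;
};

// Returns how many times the loader ran after two conversions.
inline int countLoads() {
    int loads = 0;
    Proxy<std::vector<int>> proxy([&loads] {
        ++loads;
        return std::vector<int>{1, 2, 3};
    });
    assert(!proxy.loaded());            // nothing has been loaded yet

    std::vector<int> &first = proxy;    // triggers the load
    std::vector<int> &second = proxy;   // reuses the cached data
    assert(&first == &second);
    return loads;
}
```

Holding the data behind a unique_ptr keeps an unloaded Proxy cheap; it also makes the Proxy move-only, which may or may not be what you want.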

Simply by changing the type of the soldCars_ data member, we have added support for lazily loading them:

https://gist.github.com/erikvalkering/ab0598aecc471884fdd7aaa26211e23c

Now we can use the same code to visit the showroom, as well as check the cars that have been sold:

https://gist.github.com/erikvalkering/7fb088b256417fca02f8996b4e557f19

As soon as we pass the result from CarDealership::soldCars() to some function that expects a vector<Car> &, the implicit conversion operator is invoked, and the data is lazily loaded from disk.
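The following self-contained sketch shows that conversion in action. The Car type, the hard-coded loader and the countCars() function are hypothetical; countCars() plays the role of existing code that knows nothing about Proxy.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Car { std::string brand; };   // hypothetical, for illustration

// Condensed lazy-loading proxy (a sketch, not production code).
template <typename T>
class Proxy {
public:
    explicit Proxy(std::function<T()> loader) : loader_(std::move(loader)) {}
    operator T &() {
        if (!data_) data_ = std::make_unique<T>(loader_());
        return *data_;
    }
    bool loaded() const { return data_ != nullptr; }
private:
    std::function<T()> loader_;
    std::unique_ptr<T> data_;
};

class CarDealership {
public:
    std::vector<Car> &showroom() { return showroom_; }
    Proxy<std::vector<Car>> &soldCars() { return soldCars_; }
private:
    std::vector<Car> showroom_{{"Volvo"}, {"Fiat"}};
    // Stand-in for reading the sales history from disk.
    Proxy<std::vector<Car>> soldCars_{[] {
        return std::vector<Car>{{"Saab"}, {"Opel"}, {"Audi"}};
    }};
};

// Existing code that only knows about vector<Car>.
inline std::size_t countCars(std::vector<Car> &cars) { return cars.size(); }
```

Both countCars(dealership.showroom()) and countCars(dealership.soldCars()) compile; only the second call goes through the conversion operator, and the sold cars are loaded at that point.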

Supporting generic code

The above example works because we are passing the Proxy object to a function that already expects the underlying type.

Now consider the following piece of code:

https://gist.github.com/erikvalkering/b168902fcd3793fb4efd1d54b326735e

This will not work, because a range-based for loop expects the type of the dealership.soldCars() expression to provide begin() and end() functions, which the Proxy class doesn't have. The implicit conversion to vector<Car> & does not kick in here.

Luckily, this is very easy to solve by adding two forwarding functions to the interface of the Proxy class:

https://gist.github.com/erikvalkering/942859199037c50d38dc46089a32808c
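A sketch of what these two forwarding functions might look like, assuming the same loader-based Proxy as a stand-in for the real class (C++14, because of the deduced return types). The sumAll() function is a hypothetical caller.

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

template <typename T>
class Proxy {
public:
    explicit Proxy(std::function<T()> loader) : loader_(std::move(loader)) {}

    operator T &() {
        if (!data_) data_ = std::make_unique<T>(loader_());
        return *data_;
    }

    // Forwarding functions: convert to T & first (loading if needed),
    // then delegate to the underlying container.
    auto begin() { return static_cast<T &>(*this).begin(); }
    auto end() { return static_cast<T &>(*this).end(); }

private:
    std::function<T()> loader_;
    std::unique_ptr<T> data_;
};

// Hypothetical caller: iterates over the proxy directly.
inline int sumAll(Proxy<std::vector<int>> &numbers) {
    int total = 0;
    for (int n : numbers)   // compiles thanks to begin()/end()
        total += n;
    return total;
}
```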

Not only have we fixed the range-based for loop, we have also added support for a major part of the STL's algorithms library.

We can now do for example:

https://gist.github.com/erikvalkering/eac84690fc03324b75d8e67fda99b373
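To illustrate, here is a small sketch that feeds the iterators of a Proxy to two standard algorithms. The Proxy is again a loader-based stand-in, and mostExpensive() and countAbove() are hypothetical helpers operating on a list of prices.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

template <typename T>
class Proxy {
public:
    explicit Proxy(std::function<T()> loader) : loader_(std::move(loader)) {}
    operator T &() {
        if (!data_) data_ = std::make_unique<T>(loader_());
        return *data_;
    }
    auto begin() { return static_cast<T &>(*this).begin(); }
    auto end() { return static_cast<T &>(*this).end(); }
private:
    std::function<T()> loader_;
    std::unique_ptr<T> data_;
};

// Highest price in a non-empty list.
inline int mostExpensive(Proxy<std::vector<int>> &prices) {
    return *std::max_element(prices.begin(), prices.end());
}

// Number of prices strictly above the given limit.
inline std::ptrdiff_t countAbove(Proxy<std::vector<int>> &prices, int limit) {
    return std::count_if(prices.begin(), prices.end(),
                         [limit](int price) { return price > limit; });
}
```

Every algorithm that only needs an iterator pair works this way; anything that calls other members, such as size() or push_back(), still needs its own forwarding function.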

Getting rid of the forwarding boilerplate

In the example above, the amount of work we had to do to implement the begin() and end() functions seems manageable. However, when you look at the full interface of vector, you'll see that there's quite a lot of work left to do before we can say that we support all the member functions, member types, operators, and free functions.

On top of that, implementing them correctly is non-trivial and error-prone, due to the const-correctness issues, SFINAE-friendliness requirements, etc.
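To give an impression of the const-correctness part alone, here is a sketch of what begin() and end() might need to look like once a const Proxy has to be supported as well. The mutable cache and the private load() helper are design choices made for this example, not something prescribed by the pattern.

```cpp
#include <functional>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

template <typename T>
class Proxy {
public:
    explicit Proxy(std::function<T()> loader) : loader_(std::move(loader)) {}

    operator T &() { return load(); }
    operator const T &() const { return load(); }

    // Every forwarded member now needs a const and a non-const overload.
    auto begin() { return load().begin(); }
    auto end() { return load().end(); }
    auto begin() const { return static_cast<const T &>(load()).begin(); }
    auto end() const { return static_cast<const T &>(load()).end(); }

private:
    // Loading is logically const, so the cache is declared mutable.
    T &load() const {
        if (!data_) data_ = std::make_unique<T>(loader_());
        return *data_;
    }

    std::function<T()> loader_;
    mutable std::unique_ptr<T> data_;
};

// A const proxy must hand out const iterators only.
static_assert(
    std::is_same<
        decltype(std::declval<const Proxy<std::vector<int>> &>().begin()),
        std::vector<int>::const_iterator>::value,
    "const Proxy must yield const_iterator");

inline int sumConst(const Proxy<std::vector<int>> &numbers) {
    int total = 0;
    for (int n : numbers) total += n;
    return total;
}
```

That is four functions for just two members, and it does not yet cover cbegin(), rbegin(), operator[], or making the overloads SFINAE-friendly.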

One of the reasons that drove me to create the smartref library was to remove the need to write them by hand. In this way, developers can focus on the problem that the class is supposed to solve, instead of providing the boilerplate.

Using the smartref library, the Proxy class can now be written without all the forwarding functions:

https://gist.github.com/erikvalkering/d3d3a1bd88e0df6cb3fee80b2fca8add

The Proxy class 'uses' the public interface of the underlying class, by inheriting from the using_<T> class, which in turn obtains the underlying object through the user-defined conversion operator.

Now, we truly get the full interface of vector:

https://gist.github.com/erikvalkering/3448bd63529f3a42e7dff5e1f64cac68
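To make the mechanism of a base class reaching the underlying object through the derived class's conversion operator a bit more concrete, here is a hand-written sketch. It is explicitly not smartref and not how smartref is implemented: forwarding_base is a hypothetical name, it takes the derived class as an extra template parameter, and it forwards only three members, whereas the library generates the interface for you.

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// NOT smartref: a hand-written stand-in that forwards three members.
template <typename T, typename Derived>
class forwarding_base {
public:
    auto begin() { return underlying().begin(); }
    auto end() { return underlying().end(); }
    auto size() { return underlying().size(); }

private:
    // Reaches the wrapped object through Derived's conversion operator.
    T &underlying() {
        return static_cast<T &>(static_cast<Derived &>(*this));
    }
};

// The Proxy itself only contains the lazy-loading logic.
template <typename T>
class Proxy : public forwarding_base<T, Proxy<T>> {
public:
    explicit Proxy(std::function<T()> loader) : loader_(std::move(loader)) {}

    operator T &() {
        if (!data_) data_ = std::make_unique<T>(loader_());
        return *data_;
    }

private:
    std::function<T()> loader_;
    std::unique_ptr<T> data_;
};
```

The point of the sketch is the separation of concerns: the Proxy class describes how to obtain the object, and the base class decides which members are exposed.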

The Proxy class is only one of the many use cases for the smartref library. Stay tuned for more blog posts, in which I will give more compelling examples, go into the technical details, and show some interesting and unexpected use cases.
