Designing Layered Architecture (2/3)
Isolating data stores and structuring the data transfer process
This is the second article in a series on how developers can build layered applications, and how they can be structured to easily support change. If you haven’t already, you can read the first part of this series here.
Building it the MVC way
So let’s imagine that we’re building an application that manages products — to that end, we will be exposing an endpoint that simply returns a list of products in HTML. So our context diagram looks a bit like this:
Implemented in a traditional MVC framework, its structure would generally look something like this:
While the diagram above gives us some indication of the links between various components, there are two direct couplings that I want to talk about right away:
- Model to DB/SQL: Whether it’s accessed through an ORM or not, the model contains a representation of the data store’s structure (table columns), as well as the methods for interacting with said data store.
- Views to Models: views are tied directly to the models (generally via data binding). The views have direct access to the model and, in some unfortunate scenarios, can even interact directly with the data store, creating flows that are hard to understand.
Supporting multiple data stores
So let’s assume that we implement and deploy the solution above — everything works fine, our HTML is rendered and the app is maintainable.
Coming back to the idea of supporting change, let’s imagine that our project is successful and that our company has just completed the acquisition of a competitor, Company X. As developers, we are now tasked with integrating Company X’s data into our existing ecosystem, to help provide an overview of the entire business.
The workflows for Company X should not be affected (it should carry on operating as it has so far for its existing customers), but our system should be able to directly retrieve, manipulate and generate reports on its data.
We have a problem though — Company X has two data stores: one is a NoSQL database that we can connect to directly, and the other is a DB2 instance that we cannot connect to directly, but which exposes REST endpoints.
If we look at our original design, we had a strong coupling where the Product Model was responsible for holding the structure of the data, but also provided methods to interact with the SQL datastore:
So now, our system should behave a bit like in the diagram below:
It’s very important to resist the temptation to refactor the Product Model to support the design above, and instead to recall one of the SOLID principles, namely the S: the Single Responsibility Principle — a class or module should have a single responsibility and should encapsulate that responsibility entirely.
In the diagram above, our Product Model does the following:
- Connect to a SQL database, retrieve and manipulate data.
- Connect to a NoSQL database, retrieve and manipulate data.
- Connect to a REST endpoint via cURL, retrieve and manipulate data.
- Define the structure of the Product object, which is then also immediately coupled with the View.
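Condensed into code, that overloaded model might look something like the sketch below. All of the client parameter types and method names are hypothetical stand-ins, purely to illustrate the four responsibilities piling up in one class:

```typescript
// A sketch of the overloaded Product Model described above.
// The client parameter shapes are hypothetical stand-ins.
class ProductModel {
  constructor(
    public id: number,
    public name: string,
    public price: number
  ) {}

  // Responsibility 1: connect to a SQL database and fetch rows.
  loadFromSql(sql: { query(statement: string): unknown }): void {
    sql.query("SELECT id, name, price FROM products");
  }

  // Responsibility 2: connect to a NoSQL database.
  loadFromNoSql(store: { find(criteria: object): unknown }): void {
    store.find({ id: this.id });
  }

  // Responsibility 3: call a REST endpoint.
  loadFromRest(fetchJson: (url: string) => unknown): void {
    fetchJson("/products/" + this.id);
  }

  // Responsibility 4: define the structure the View binds to.
  toViewData(): { id: number; name: string; price: number } {
    return { id: this.id, name: this.name, price: this.price };
  }
}
```

Any change to the SQL schema, the NoSQL store, the REST API or the view's data shape forces a change to this one class.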
Enter the Data Access Layer
Obviously, our design is in breach of the Single Responsibility Principle, so let’s introduce our first two actors: Entities and Repositories.
- Entities (aka. Data Transfer Objects): structural representation of an object that can be transferred to multiple endpoints. This is just a class containing the representation of the data, as well as mutators and accessors. In practice, this would be a simple PHP, Ruby, Java or JS class, as you can see in the example below:
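A minimal sketch of such an entity in TypeScript (field names are illustrative):

```typescript
// A plain data-transfer object: structure, accessors and mutators only.
// It knows nothing about where the data comes from or goes to.
class ProductEntity {
  constructor(
    private id: number,
    private name: string,
    private price: number
  ) {}

  getId(): number {
    return this.id;
  }

  getName(): string {
    return this.name;
  }

  setName(name: string): void {
    this.name = name;
  }

  getPrice(): number {
    return this.price;
  }

  setPrice(price: number): void {
    this.price = price;
  }
}
```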
- Repositories (aka. Data Access Layer/Data Access Objects): class encapsulating logic for connecting to an endpoint or datastore and accessing its methods. You can read an example below:
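A SQL-backed repository could be sketched like this (the `SqlConnection` shape, the table name and the query are assumptions for illustration):

```typescript
// The entity the repository returns: structure only, no persistence logic.
interface Product {
  id: number;
  name: string;
  price: number;
}

// Hypothetical minimal connection abstraction.
interface SqlConnection {
  query(sql: string, params: unknown[]): Array<Record<string, unknown>>;
}

// Encapsulates everything about talking to the SQL data store.
class SqlProductRepository {
  constructor(private connection: SqlConnection) {}

  findById(id: number): Product | null {
    const rows = this.connection.query(
      "SELECT id, name, price FROM products WHERE id = ?",
      [id]
    );
    if (rows.length === 0) return null;
    const row = rows[0];
    return {
      id: Number(row.id),
      name: String(row.name),
      price: Number(row.price),
    };
  }
}
```

Note that the repository's return type is the entity, never a raw database row, so callers stay ignorant of the store behind it.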
So in order to solve our Product Model coupling, we can instead manage the connections to SQL, NoSQL, and REST via three separate Repositories, which will always return an Entity object to be used by the rest of our design:
The black arrows above indicate that the controller chooses which repository to extract data from, and always gets a Product Entity object as the result — the role of the repository is to connect to the data store and generate said entity (via the grey arrows).
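That selection step can be sketched as follows. The repository internals are stubbed out with canned data, and all names are illustrative:

```typescript
interface Product {
  id: number;
  name: string;
  price: number;
}

// One repository per data store; each returns the same entity shape.
class SqlProductRepository {
  findAll(): Product[] {
    // Would run a SQL query here; stubbed for illustration.
    return [{ id: 1, name: "Widget", price: 9.99 }];
  }
}

class NoSqlProductRepository {
  findAll(): Product[] {
    // Would query the NoSQL store here; stubbed for illustration.
    return [{ id: 2, name: "Gadget", price: 4.5 }];
  }
}

class RestProductRepository {
  findAll(): Product[] {
    // Would call the REST endpoint here; stubbed for illustration.
    return [{ id: 3, name: "Gizmo", price: 2.25 }];
  }
}

// The controller decides which repository to use, but the view
// only ever sees Product entities.
class ProductController {
  listProducts(source: "sql" | "nosql" | "rest"): Product[] {
    switch (source) {
      case "sql":
        return new SqlProductRepository().findAll();
      case "nosql":
        return new NoSqlProductRepository().findAll();
      case "rest":
        return new RestProductRepository().findAll();
    }
  }
}
```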
If we now look at the coupling issues that we identified at the beginning, we can notice that:
- Our Model is no longer tightly coupled to a single data store.
- Our View now works with an abstraction of the data, and can no longer interact directly with the data store (aka. cannot run queries from the View).
If we wanted to add a new data store in the future, this would be a simple case of creating a new Repository and adding it to our business logic, which is currently stored in the Controller.
In line with another SOLID principle, the Liskov Substitution Principle (LSP), we might want to also make sure that all of our repositories implement a common interface, a Product Repository Interface. This would ensure that no matter the datastore, we can always implement a common protocol when retrieving or manipulating data, regardless of the backing service.
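Sketched in TypeScript (the interface and class names are assumptions), that shared contract and one substitutable implementation might look like:

```typescript
interface Product {
  id: number;
  name: string;
  price: number;
}

// The common protocol every data-store-specific repository implements.
interface ProductRepositoryInterface {
  findById(id: number): Product | null;
  findAll(): Product[];
}

// Any class implementing the interface can be substituted for another
// without the controller noticing which data store sits behind it.
class InMemoryProductRepository implements ProductRepositoryInterface {
  constructor(private products: Product[]) {}

  findById(id: number): Product | null {
    return this.products.find((p) => p.id === id) ?? null;
  }

  findAll(): Product[] {
    return [...this.products];
  }
}
```

An in-memory implementation like this is also handy for testing the controller without any real data store.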
Have we entirely removed the problem?
I know what you’re thinking — if we were breaching the Single Responsibility Principle by having our initial Product Model connect to more than one data store, why is it OK for our Product Controller to orchestrate where and when each repository is used, alongside its usual responsibilities? Should the Controller not just be used for analysing the input data and generating the output data?
We’re going to explore this topic, as well as how we can isolate our views, in the following chapter of the series, Reusing business logic and exposing our application to a larger ecosystem.
We hope that you’ve enjoyed this article — if you have any questions or comments, please leave them below. If you want to support us, feel free to hit the heart icon below and share with your friends.
If you’ve enjoyed this read, make sure to follow our Medium publication for more articles.