Inside Out: Repository Pattern for Data Layer

A perfect place to put your Domain logic for Data Models outside the Entity definition

Progyan 👨🏻‍💻 | #TheProDev
The Startup
7 min read · Jun 1, 2020

Repository

Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects. — Edward Hieatt and Rob Mee in Patterns of Enterprise Application Architecture by Martin Fowler

Before We Begin

The Object-Relational Impedance Mismatch*

In industry, relational databases (e.g., OracleDB, MySQL, PostgreSQL) are the most widely used persistent data stores for applications. On the other hand, Object-oriented Programming is the dominant programming paradigm. So it is often the case that your application is written in a language that supports Object-oriented Programming, whereas your database is relational in nature. The Object-oriented paradigm evolved out of programming techniques within the Software Engineering domain, whereas the relational model is rooted in proven mathematical theory that found its application in data storage and, from there, its way into Software Engineering. As expected, the two don't fit together cleanly: there are multiple places where subtle differences leave a gap between them. This is termed the Object-Relational Impedance Mismatch. Luckily for us, years of software development have produced plenty of solutions to this problem.

Database Connectors and Transaction Script Pattern*

Every enterprise application has to find a way to Query and Mutate the data stored in its database according to application behaviour. The basic, naïve technique is to open a TCP connection to the database and operate on it directly. Database vendors offer SDK libraries for most popular programming languages to connect to the database port exposed over the network (e.g., JDBC drivers for Java). Every SQL database also has to support the standard query language defined by the SQL specification. Note that although vendors have to abide by the SQL specification, they are free to extend it to make the best use of their database and gain an edge in product comparisons. So, to perform any application behaviour that requires either kind of operation, the application has to open one or more connections to the database, write SQL according to the requirements, and execute it inside the database. The SDK typically offers a cursor into the current execution context inside the database, which the application can use to perform core database operations (e.g., COMMIT, ROLLBACK). This approach comes with its own set of problems. Because the SQL lives inside the application, it is strongly coupled with the application code. Moreover, any change in requirements forces the engineer to rethink the implementation on the database side as well.
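To make the coupling concrete, here is a minimal TypeScript sketch of the Transaction Script style. `DbConnection` and its `execute` method are hypothetical stand-ins for whatever a vendor driver exposes, not a real library API, and the `accounts` table and its columns are assumed for illustration. The fake connection only records statements, so the sketch runs without a database.

```typescript
// Hypothetical stand-in for a vendor driver connection (not a real library API).
interface DbConnection {
  execute(sql: string, params?: unknown[]): void;
}

// Fake connection that records statements so the sketch runs without a database.
// If `failOn` is given, any statement starting with it throws, to simulate an error.
class RecordingConnection implements DbConnection {
  readonly log: string[] = [];
  constructor(private readonly failOn?: string) {}

  execute(sql: string, _params: unknown[] = []): void {
    this.log.push(sql);
    if (this.failOn !== undefined && sql.startsWith(this.failOn)) {
      throw new Error(`simulated failure on: ${sql}`);
    }
  }
}

// Transaction Script: the request handler itself owns the SQL and the transaction.
function transferFunds(conn: DbConnection, from: number, to: number, amount: number): boolean {
  conn.execute("BEGIN");
  try {
    conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", [amount, from]);
    conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", [amount, to]);
    conn.execute("COMMIT");
    return true;
  } catch {
    conn.execute("ROLLBACK");
    return false;
  }
}
```

Note how the table name, the column names and the transaction boundaries all sit inside application code: renaming a column means editing every handler that mentions it.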

Stored Procedures and Triggers*

Extending the above discussion a bit further, Stored Procedures and Triggers are common, largely vendor-specific features that SQL databases offer. The purpose of a Stored Procedure is to define the set of instructions required to perform one single operation expected by the application. For example, it could add a foreign-key reference to an entity in a different table in response to an INSERT on the original table. The complexity has now moved out of the application, which simply invokes the procedure through a SQL prepared statement, but the domain-specific implementation now lives inside the database itself. This increases coupling at a considerable cost, which should be avoided in any system that is expected to change. Triggers can also be helpful tools, but they too should be reserved for database-specific operations rather than domain-specific ones.

Domain Model Pattern*

The first layer of abstraction we can use is the Domain Model pattern. A Domain Model is an object model of the domain that incorporates both behaviour and data, so it can be thought of as a set of states (attributes) and methods (behaviour). In the Object-oriented paradigm it is natural to represent a model as a Class with a One-to-One mapping to a database table. The columns of the table are represented by attributes that get initialised in the constructor and can be set by anyone holding the Class instance (i.e., through public accessors). Each method of the Class can then represent one database operation, Query or Mutate, related to that model. Compared to the Transaction Script pattern, where all the logic resides in the Application Controller layer that handles the application's service requests, the database-specific operations now move inside the model implementation, providing a fine layer of abstraction between business logic and database logic.
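A minimal sketch of such a model class in TypeScript follows. The `Account` class, its methods and the `accountsTable` map (an in-memory stand-in for a database table) are all assumptions made for illustration; a real implementation would issue queries instead of touching a map.

```typescript
// In-memory stand-in for an `accounts` table, keyed by primary key (assumed for this sketch).
const accountsTable = new Map<number, { id: number; balance: number }>();

// Domain Model: one class holding both the state and the behaviour of an account.
class Account {
  constructor(public id: number, public balance: number) {}

  // Behaviour: a business rule that lives next to the data it guards.
  withdraw(amount: number): void {
    if (amount <= 0) throw new Error("amount must be positive");
    if (amount > this.balance) throw new Error("insufficient funds");
    this.balance -= amount;
  }

  // Mutate: persist the current state as one row.
  save(): void {
    accountsTable.set(this.id, { id: this.id, balance: this.balance });
  }

  // Query: load one row and rebuild the object from it.
  static findById(id: number): Account | undefined {
    const row = accountsTable.get(id);
    return row ? new Account(row.id, row.balance) : undefined;
  }
}
```

A controller can now call `account.withdraw(30)` followed by `account.save()` without knowing how the row is stored, although the class itself still mixes business rules with persistence.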

Data Mapper*

With the Domain Model we separate Model behaviour from Controller behaviour, but that does not solve the Object-Relational Impedance Mismatch mentioned at the beginning. For example, the relational model has no notion of inheritance, whereas inheritance is common practice in Object-oriented programming. To resolve such inconsistencies, another layer is introduced between the Domain Model (in-memory Entity) and the Database Table row (persistent Entity). This layer has two responsibilities: keeping in-memory references in sync with the Database Table row, and rippling any change made to an in-memory reference through to the database. In particular, the Domain Model Class and the Database Table are not aware of each other or of the Data Mapper; only the mapper knows how to map between them and transfer information from one to the other.
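The idea can be sketched in TypeScript as follows. The names `AccountRow`, `AccountEntity` and `AccountMapper`, the snake_case column names and the cents-based storage are hypothetical choices for this illustration, and a private map stands in for the real table.

```typescript
// Shape of a persisted row (snake_case columns are assumed for illustration).
interface AccountRow {
  account_id: number;
  balance_cents: number;
  owner_name: string;
}

// Plain domain object: it knows nothing about tables, columns or the mapper.
class AccountEntity {
  constructor(
    public readonly id: number,
    public balance: number,
    public owner: string,
  ) {}
}

// Data Mapper: the only place that knows how the two representations correspond.
class AccountMapper {
  private readonly rows = new Map<number, AccountRow>(); // stand-in for the table

  toRow(account: AccountEntity): AccountRow {
    return {
      account_id: account.id,
      balance_cents: Math.round(account.balance * 100),
      owner_name: account.owner,
    };
  }

  toEntity(row: AccountRow): AccountEntity {
    return new AccountEntity(row.account_id, row.balance_cents / 100, row.owner_name);
  }

  // Ripple an in-memory change through to the persistent row.
  save(account: AccountEntity): void {
    this.rows.set(account.id, this.toRow(account));
  }

  // Rebuild the in-memory object from the persistent row, if one exists.
  find(id: number): AccountEntity | undefined {
    const row = this.rows.get(id);
    return row ? this.toEntity(row) : undefined;
  }
}
```

Because the translation lives in one class, a renamed column or a changed storage unit touches only the mapper, not the domain object or its callers.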

Object Relational Mapper/Entity Relational Mapper

Extending the Data Mapper by one more level gives the Object Relational Mapper. A Data Mapper only maps between a Domain Class instance and the corresponding row in a Database Table; it is not aware of the relations that exist among entities, which are often present. The Object Relational Mapper is introduced to close this gap. It can not only map between a Domain Model and a Database Entity, but also define, query, and mutate the other Domain Models and Database Entities related to it. This abstracts all database operations away from the Model implementation by offering a standard API to Query and Mutate, which the Model Class then uses to perform specific persistence operations without having to write any raw SQL. Multiple patterns are involved in achieving this: Identity Field, Association Table Mapping, Foreign Key Mapping and Dependent Mapping.
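As a rough illustration of one of these patterns, the sketch below hand-rolls a Foreign Key Mapping in TypeScript: the table stores an `artist_id` column, while the domain object holds a direct reference to the related object. The schema, the sample data and the `loadAlbum`/`saveAlbum` helpers are assumptions for this example; a real ORM generates this plumbing from declared relations.

```typescript
// Rows as they sit in two related tables (assumed schema for illustration).
interface ArtistRow { id: number; name: string; }
interface AlbumRow { id: number; title: string; artist_id: number; }

// Domain objects hold object references, not foreign keys.
interface Artist { id: number; name: string; }
interface Album { id: number; title: string; artist: Artist; }

// Arrays standing in for the two tables.
const artistTable: ArtistRow[] = [{ id: 1, name: "Artist A" }];
const albumTable: AlbumRow[] = [{ id: 10, title: "First Album", artist_id: 1 }];

// Follow the foreign key when loading, so callers receive a connected object graph.
function loadAlbum(id: number): Album | undefined {
  const row = albumTable.find((r) => r.id === id);
  if (!row) return undefined;
  const artistRow = artistTable.find((r) => r.id === row.artist_id);
  if (!artistRow) throw new Error(`dangling artist_id ${row.artist_id}`);
  return { id: row.id, title: row.title, artist: { id: artistRow.id, name: artistRow.name } };
}

// Turn the object reference back into a foreign key when saving.
function saveAlbum(album: Album): void {
  const row: AlbumRow = { id: album.id, title: album.title, artist_id: album.artist.id };
  const index = albumTable.findIndex((r) => r.id === album.id);
  if (index >= 0) albumTable[index] = row;
  else albumTable.push(row);
}
```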

Repository Pattern

Entity

An Entity is a single instance of a Domain Model Class that represents one entry in a database table, i.e., an Entity has a One-to-One mapping with a Database Table row. So for a committed Entity, the data attributes of that instance can safely be assumed to be persisted in the database. Every Entity Relational Framework, or simply Entity Framework, offers a set of functionality to perform standard Query and Mutate operations. But often, if not always, enterprise application models require more than these primitive operations, and that usually means some amount of business logic ends up coupled with the Model Class implementation. For a small-scale application, that might be enough and even preferable. But as requirements keep piling up or changing, maintaining that logic inside a Model object becomes tedious. Also, the more business logic depends on the Entity Framework, the more tightly the application is coupled to it, which in turn makes the framework hard to replace or modify.

Repository

A Repository is a collection of a particular Entity type. Each Repository corresponds to exactly one Entity type; in other words, a Repository has a composition relation with its Entity. A Repository can be thought of as a Table or Entity Set in a relational database, with each Entity being a row in that Table that has its own set of attributes and is identified by a key. Having separated the Domain Model implementation from the specific business rules, we can now put those rules inside the Repository. In the Repository pattern, a Model is allowed to have only data attributes, which are pushed to the database via the Data Mapper, and static/hook methods that pre-process those attributes during the object lifecycle. The rest of the functionality expected from the Data Layer, i.e., the business logic around the Model, can safely be implemented inside the Repository Class.
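A minimal TypeScript sketch of this split is shown below. The `Invoice` entity, the `InvoiceRepository` interface and its method names are hypothetical, and the in-memory implementation merely stands in for one backed by an ORM.

```typescript
// Entity: data attributes only, identified by a key.
interface Invoice {
  id: number;
  customerId: number;
  amount: number;
  paid: boolean;
}

// Collection-like interface that the upper layers depend on.
interface InvoiceRepository {
  add(invoice: Invoice): void;
  findById(id: number): Invoice | undefined;
  remove(id: number): boolean;
  // Domain-specific queries live here rather than on the entity.
  findUnpaidByCustomer(customerId: number): Invoice[];
  outstandingBalance(customerId: number): number;
}

// In-memory implementation; an ORM-backed class could implement the same interface.
class InMemoryInvoiceRepository implements InvoiceRepository {
  private readonly items = new Map<number, Invoice>();

  add(invoice: Invoice): void {
    this.items.set(invoice.id, { ...invoice }); // store a copy, not the caller's object
  }

  findById(id: number): Invoice | undefined {
    const found = this.items.get(id);
    return found ? { ...found } : undefined;
  }

  remove(id: number): boolean {
    return this.items.delete(id);
  }

  findUnpaidByCustomer(customerId: number): Invoice[] {
    return Array.from(this.items.values())
      .filter((invoice) => invoice.customerId === customerId && !invoice.paid)
      .map((invoice) => ({ ...invoice }));
  }

  outstandingBalance(customerId: number): number {
    return this.findUnpaidByCustomer(customerId)
      .reduce((sum, invoice) => sum + invoice.amount, 0);
  }
}
```

Callers treat the repository like a collection (`add`, `findById`, `remove`), while business questions such as "what does this customer still owe?" get a named method instead of a query scattered across controllers.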

Recap

  1. A Domain Model is an Object-oriented representation of a Database Table Entity.
  2. A Data Mapper takes an in-memory Domain Model instance, maps it to a particular Database Table Entity, and is responsible for keeping both in sync.
  3. An Object-relational Mapper extends the behaviour of the Data Mapper and associates related Database Table Entities with the Domain Model object.
  4. A Repository is a collection of Entities and hence can be thought of as a representation of a Database Table; it is where all business logic related to the Model resides.

Advantages

  1. By separating Business Logic from the Domain implementation, the application no longer has a hard dependency on the Entity Framework. Hence, depending on the use case, the Entity Framework can be changed, modified or upgraded.
  2. By encapsulating Domain-specific logic inside the Repository, we can reuse queries, following the DRY principle.
  3. We can also apply Object-relational behavioural patterns (e.g., Unit of Work, Identity Map, Lazy Load) within the scope of the Repository to reduce the number of database operations and hence improve overall application performance.
  4. By providing a solid separation between the Application Controller (or Web Controller) and the Domain Model, tight coupling between modules can be removed, and all Queries and Mutations at the Controller layer can go through the interfaces exposed by the Repository only.
  5. Encapsulating the Domain Model behind the Repository helps avoid bugs that expose sensitive information to the outside world.
  6. It also helps maintain stable relations among Domain Objects and keeps JOIN operations abstracted away from the upper layers.
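Two of the points above, the Identity Map inside a Repository and controllers that depend only on the Repository interface, can be sketched together in TypeScript. All names here (`Customer`, `CustomerRepository`, `greet` and both implementations) are hypothetical, and the counting repository merely simulates database round trips.

```typescript
// Minimal entity and repository contract (hypothetical names for this sketch).
interface Customer { id: number; name: string; }
interface CustomerRepository { findById(id: number): Customer | undefined; }

// Stand-in for an ORM-backed repository; it counts simulated database hits.
class CountingCustomerRepository implements CustomerRepository {
  hits = 0;
  constructor(private readonly rows: Customer[]) {}

  findById(id: number): Customer | undefined {
    this.hits += 1;
    return this.rows.find((row) => row.id === id);
  }
}

// Identity Map kept inside a repository: each found entity is loaded at most once.
class IdentityMapCustomerRepository implements CustomerRepository {
  private readonly loaded = new Map<number, Customer>();
  constructor(private readonly inner: CustomerRepository) {}

  findById(id: number): Customer | undefined {
    const cached = this.loaded.get(id);
    if (cached) return cached;
    const fresh = this.inner.findById(id);
    if (fresh) this.loaded.set(id, fresh); // misses are not cached
    return fresh;
  }
}

// Controller-level code depends only on the interface, so either implementation fits.
function greet(repo: CustomerRepository, id: number): string {
  const customer = repo.findById(id);
  return customer ? `Hello, ${customer.name}` : "Unknown customer";
}
```

Wrapping one repository in another leaves `greet` unchanged, which is the practical payoff of having the Controller talk to an interface rather than to the Entity Framework.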

Bibliography

  • Patterns of Enterprise Application Architecture — Martin Fowler, David Rice, Matthew Foemmel, Edward Hieatt, Robert Mee, Randy Stafford
  • Domain-Driven Design: Tackling Complexity in the Heart of Software — Eric Evans, Foreword by Martin Fowler
  • Agile Database Techniques: Effective Strategies for the Agile Developer — Scott W. Ambler

I found this repository a great starting point for understanding the Repository Pattern, with a very simple relational example: https://github.com/w3tecch/express-typescript-boilerplate

Progyan 👨🏻‍💻 | #TheProDev
The Startup

Software Engineer turned Architect | Upcoming Data Engineer | Start-up Advisor | Open Source | Philanthropist