Designing a Data Access Layer — Part 1

Greg Pabian
The Startup
Published in
4 min readJan 25, 2021

The Data Access Layer is a layer that establishes communication between a database (SQL or NoSQL) and a backend service (factoring in multiple instances if applicable). Even though the term itself does not speak much about its intrinsic properties, there are certain architectural patterns commonly leveraged when designing DALs. Some frameworks use different terminology, calling it the Data Access Object but for the purpose of this article I find it best to agree that these names are synonymous.

Before I start delving into this topic, please note that the article is highly opinionated, taking into account variety of my experiences designing and using Data Access Layers in Java, Python and JavaScript/TypeScript (including a lot of different frameworks and libraries to achieve that). All my findings are precious to me from my professional standpoint but might not be necessarily adequate to the reader.

Before an IDE is opened

If the DAL is being planned, as the writer of this article I assume that all the data structures residing in the database have been already carefully written down, regarless whether it is a relational database or not. In spite of the fact that it might be possible to create a database schema using migration files generated by some ORM systems (for instance Django ORM or Sequelize), my strong belief is to find such an approach questionable. In the end, the data reside persisted on disk in databases and not in-memory of some backend services.

Furthermore, starting with defining the schema on paper first enables the development team to see the obvious limitations imposed by the chosen database engine. The engine can be still replaced at this stage as there is not a single line of code written, which leaves the technical stack choices still open. The very juxtaposition of the planned database schema and the high-level plan of backend interfaces might result in significant alteration to both aforementioned components of the designed system.

Schema Restrictions

As the goal of any development process is to deliver a product, taking into account the product perspective at all times is crucial to implement a proper DAL that serves its purpose. This should be true for the schema as well, as certain product requirements might drive some schema planners to define:

  • foreign keys, (unique) indices,
  • nullable and non-nullable columns, default column values, column-size limits,
  • cascades, triggers,
  • procedures.

What is important to realise is that certain restrictions are definable on the database level (leveraging the engine that powers the server) but some will simply not be — especially when different database systems are considered. Some RDBMS examples include:

  • ensuring an arbitrary format of text written into a TEXT field (can be enforced on the backend level),
  • ensuring only valid JSON documents are written into a TEXT field (can be enforced on the backend level),
  • there are up to 3 rows in a table that satisfy an arbitrary condition (I invite the reader to think when and how this very problem is solvable).

Undeliverable Features

Spotting features that cannot be reasonably delivered before starting the very implementation process saves a lot of time and frustration. There might be multiple reasons for qualifying a feature as troublesome in the context of database operations:

  • neither the database nor the backend service are able to enforce all the necessary constraints,
  • the necessary set of queries planned for the feature in question does not scale (e.g. locks, execution time),
  • there is no way of leveraging the database engine (with the current schema) to calculate an effect and doing it on the backend side threatens the stability of the system or requires a lot of implementation work.

Even though some of the problems outlined here should reveal themselves at the schema planning phase, a few of them actually might be discovered at a later stage, e.g. during DAL planning, by an experienced backend developer. Usually, finding such an issue results in a discussion over a feature (including product managers) and a decision-making process which ends up in either relaxing the requirements or dropping the feature altogether.

Stability of the DAL

As the DAL is a core component of a backend service with limited feasibility of a major refactoring, the sooner potential problems are discovered, the better. To put this thought into a perspective, the Data Access Layer is built with a single specific purpose — to be used by other components of the backend service. Since those components have a direct dependency on the DAL, every change there (be it something as small as local file refactoring or as big as a change of behaviour affecting multiple functions) might impact them in an unpredictable way.

Summary

It is of crucial importance for any backend developer to be aware of explicit and implicit restrictions imposed on the database level and their impact on the DAL and the set of rules that need to be applied on the backend-level. Ideally, the database and backend systems should be understood as tandem working together to achieve one goal — providing as faithful implementation of the product specification as possible.

--

--