Data-driven system design

Designing a flexible, data-driven skip delivery/collection system

Published in AnyJunk · 6 min read · Dec 13, 2021

The Problem:

We needed to build a system for managing the booking and charging of skip delivery, exchange and collection. In the future this system should be able to handle many different services, with different pricing rules for different sorts of events.

Our goal was to build a system that is unaware it is working with skips: a scalable design that could handle skips today and kitchen installations tomorrow without input from a developer. We also wanted to give users visibility over the rules of the system by making them configurable and viewable through an admin portal.

Approach:

Embrace SQL. We isolated the key concepts likely to vary as the system expanded and created a set of “type structure” tables in the database which define these concepts and the relationships between the types. Users can then insert into these type tables and link tables to set up the system to their needs.

Small Example Problem:

Create a system to calculate journey time for a person to walk a route based on distance. We know in the future that we’d like to include other information such as elevation gain and may include calculations for travel methods besides walking.

Simple approach:

case class Walker(id: Walker.Id, pace: Int)
case class Route(id: Route.Id, distance: Int)

This is the quickest solution to the problem: to calculate journey time you take a Walker and a Route and multiply distance by pace. However, this system needs developer intervention before it can add extra time for elevation gained, something like the following.

case class Walker(id: Walker.Id, pace: Int, climbingSpeed: Int)
case class Route(id: Route.Id, distance: Int, elevationGain: Int)
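As a rough sketch of the calculation this model implies: the article does not state units, so the code below assumes pace is time per unit of distance and climbingSpeed is time per unit of elevation gained. IDs are simplified to plain Int, and the `SimpleJourneyTime` object and `journeyTime` function are illustrative names, not part of the original system.

```scala
// Simplified: plain Int IDs stand in for Walker.Id and Route.Id.
case class Walker(id: Int, pace: Int, climbingSpeed: Int)
case class Route(id: Int, distance: Int, elevationGain: Int)

object SimpleJourneyTime {
  // Assumed units: pace = seconds per metre walked,
  // climbingSpeed = extra seconds per metre climbed.
  def journeyTime(walker: Walker, route: Route): Int =
    route.distance * walker.pace + route.elevationGain * walker.climbingSpeed
}
```

Every new input to the calculation means another field on the case classes and another term in this function, which is the developer cost the rest of the article tries to avoid.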

This is still relatively simple, but we also know there will be other travel methods in the future. If we add electric scooters, which don’t care about elevation, a developer is again required to come in and add something like the following.

case class Scooter(id: Scooter.Id, pace: Int)

This approach works well for a small system or for elements of a system that won’t vary much during its lifecycle. However, we know the future of this system is to have many methods of travel, each of which could have different metrics for journey time. We don’t want to pollute our system design with repetitive business logic that causes compounding costs in developer time, code maintainability and testing as more of the expected features are requested.

We wanted something more flexible (and fun for a developer to build).

Type approach:

First we’ll define the basic concepts of the system: a mode of transport (be it walking or scooter) and a journey timer that can be associated with a transport type.

case class TransportType(id: TransportType.Id, name: String)
case class JourneyTimer(id: JourneyTimer.Id, transportTypeId: TransportType.Id)

Then we need a field containing an amount (be it distance or elevation) that time can be added against.

case class JourneyMetricType(id: JourneyMetricType.Id, name: String)

Next, a user will need to be able to tell the system which transport types care about which metrics, and how a timer should use those fields.

case class TransportJourneyMetricType(id: TransportJourneyMetricType.Id, transportTypeId: TransportType.Id, journeyMetricTypeId: JourneyMetricType.Id)

case class JourneyTimerMetric(id: JourneyTimerMetric.Id, journeyTimerId: JourneyTimer.Id, journeyMetricTypeId: JourneyMetricType.Id, amount: Int)

Lastly, a user should be able to create instances of routes and their relevant metrics. These can then be used in combination with the journey timers to give the users the functionality they need.

case class Route(id: Route.Id)

case class RouteMetric(id: RouteMetric.Id, routeId: Route.Id, journeyMetricTypeId: JourneyMetricType.Id, amount: Int)

This removes the concept of walking / scooters from the system and replaces them with the generic concepts of a TransportType, JourneyTimer and Route. The Scala code is now purely working with generic rules and implementation details are left to the users.
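To illustrate how the generic tables could drive the calculation, here is a minimal sketch. It assumes that a timer metric’s amount means “time per unit of that metric” and that journey time is the sum across all metrics the timer is configured with; the article does not spell out either rule. IDs are simplified to plain Int, and `GenericJourneyTime` and `journeyTime` are hypothetical names.

```scala
object GenericJourneyTime {
  // Simplified: plain Int IDs stand in for the typed IDs used in the article.
  case class JourneyTimerMetric(id: Int, journeyTimerId: Int, journeyMetricTypeId: Int, amount: Int)
  case class RouteMetric(id: Int, routeId: Int, journeyMetricTypeId: Int, amount: Int)

  // Assumption: the caller passes the metrics for one timer and one route.
  // Each timer amount is treated as "time per unit" of its metric type.
  def journeyTime(timerMetrics: List[JourneyTimerMetric], routeMetrics: List[RouteMetric]): Int = {
    val routeAmounts = routeMetrics.map(m => m.journeyMetricTypeId -> m.amount).toMap
    timerMetrics.map { tm =>
      // A metric the route does not record contributes no time.
      tm.amount * routeAmounts.getOrElse(tm.journeyMetricTypeId, 0)
    }.sum
  }

  // Example data: metric type 1 = distance, metric type 2 = elevation gain.
  val walkingTimer = List(JourneyTimerMetric(1, 1, 1, 2), JourneyTimerMetric(2, 1, 2, 3))
  val scooterTimer = List(JourneyTimerMetric(3, 2, 1, 1))
  val hillyRoute   = List(RouteMetric(1, 1, 1, 100), RouteMetric(2, 1, 2, 10))
}
```

With this shape, adding scooters is a matter of inserting rows (a timer that only has a distance metric) rather than writing a new case class and a new function.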

The final system looks like this:

Scala

Type safe IDs

With the type structure approach, the resulting system relies heavily on foreign keys, and thus on IDs, in order to function. This can become a challenge during development and a source of subtle bugs if the wrong type of ID is ever used.

By using the concept of a KeyedEntity and KeyedEntityCompanion, our database model case classes can define IDs as their own types, allowing the compiler to enforce correct ID usage throughout the Scala codebase.

case class TransportType(
  id: TransportType.Id,
  ...
) extends KeyedEntity[TransportType.Id] {
  def key = id
}

object TransportType extends KeyedEntityCompanion[Int]
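The article does not show how KeyedEntity and KeyedEntityCompanion are defined. One way such traits could work, offered only as an assumption, is for the companion base class to declare a nested Id case class, so that each companion object gets its own distinct Id type. The definitions below are hypothetical stand-ins, and `TypedIdExample` exists purely for illustration.

```scala
// Hypothetical stand-ins for the article's traits, not their real definitions.
trait KeyedEntity[K] { def key: K }

abstract class KeyedEntityCompanion[V] {
  // Nested in the companion, so TransportType.Id and JourneyMetricType.Id
  // are different types even though both wrap an Int.
  case class Id(value: V)
}

case class TransportType(id: TransportType.Id, name: String)
    extends KeyedEntity[TransportType.Id] {
  def key: TransportType.Id = id
}
object TransportType extends KeyedEntityCompanion[Int]

case class JourneyMetricType(id: JourneyMetricType.Id, name: String)
    extends KeyedEntity[JourneyMetricType.Id] {
  def key: JourneyMetricType.Id = id
}
object JourneyMetricType extends KeyedEntityCompanion[Int]

object TypedIdExample {
  def describe(id: TransportType.Id): String = s"transport type ${id.value}"

  val ok: String = describe(TransportType.Id(1))
  // describe(JourneyMetricType.Id(1)) // rejected by the compiler: wrong Id type
}
```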

We occasionally found bugs still sneaking in through the API layer as encodings / decodings couldn’t enforce this type correctness. To resolve this we created the concept of PrefixedKeyedEntityCompanion which also provides encodings / decodings with a given prefix to allow the API to reject invalid IDs.

object TransportType extends PrefixedKeyedEntityCompanion[Int, KeyPrefix.TT.type]
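The wire format of these prefixed IDs is not given in the article. As a simplified sketch, assume an ID is rendered as the prefix, a hyphen, then the numeric value (for example `TT-12`), and that the prefix is passed as a plain String rather than as a singleton type like KeyPrefix.TT.type. `PrefixedIdCompanion`, `encode` and `decode` are hypothetical names.

```scala
// Hypothetical, simplified stand-in for PrefixedKeyedEntityCompanion.
abstract class PrefixedIdCompanion(prefix: String) {
  case class Id(value: Int)

  // Assumed wire format: "<prefix>-<number>".
  def encode(id: Id): String = s"$prefix-${id.value}"

  // Rejects strings with the wrong prefix or a non-numeric value.
  def decode(raw: String): Either[String, Id] =
    raw.split("-", 2) match {
      case Array(p, digits) if p == prefix =>
        digits.toIntOption.map(n => Id(n)).toRight(s"'$raw' has a non-numeric value")
      case _ =>
        Left(s"'$raw' does not start with '$prefix-'")
    }
}

object TransportTypeId extends PrefixedIdCompanion("TT")
object JourneyMetricTypeId extends PrefixedIdCompanion("JMT")
```

With this in place, a journey metric type ID sent where a transport type ID is expected fails at decoding time instead of reaching the service layer as a bare integer.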

DataIntegrity

Alongside the IDs being of the correct type and structure, we also needed a convenient way to ensure IDs exist and are active where required. Whilst it’s impossible to have the compiler check whether an ID will exist in the database, we can force developers to consider this through the concepts of Evidence and Having.

type Having[+T, +E <: EvidenceConstraint] = T with E

case class ThingWithForeignKey(id: Thing.Id, fkId: FK.Id Having DataIntegrity)

Now ThingWithForeignKey requires the type of the foreign key to be combined with DataIntegrity before it will compile. This forces the developer to wrap the foreign key in the required Evidence, and thus we have greater confidence that it is correct and won’t upset the database when we attempt to insert or update.
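The idea of a value that can only be obtained by passing a check can be sketched as follows. Note that this swaps in a different technique from the article’s `T with E` alias: a small wrapper class with a private constructor, chosen here only because it is easy to show end to end. The `checked` function, the `exists` predicate (a stand-in for a database lookup) and `HavingExample` are all hypothetical.

```scala
sealed trait EvidenceConstraint
sealed trait DataIntegrity extends EvidenceConstraint

// Swapped-in technique: a wrapper with a private constructor, rather than
// the intersection type alias used in the article.
final class Having[+T, +E <: EvidenceConstraint] private (val value: T)

object Having {
  // `exists` is a hypothetical stand-in for a repository/database lookup.
  def checked[T](value: T)(exists: T => Boolean): Option[Having[T, DataIntegrity]] =
    if (exists(value)) Some(new Having[T, DataIntegrity](value)) else None
}

// Simplified: plain Int IDs stand in for Thing.Id and FK.Id.
case class ThingWithForeignKey(id: Int, fkId: Having[Int, DataIntegrity])

object HavingExample {
  private val knownForeignKeys = Set(1, 2, 3)

  // The only way to build a ThingWithForeignKey is to validate the key first.
  def build(id: Int, fk: Int): Option[ThingWithForeignKey] =
    Having.checked(fk)(knownForeignKeys.contains).map(evidence => ThingWithForeignKey(id, evidence))
}
```

The check still happens at runtime, but the type signature makes it impossible to forget: code that has only a raw Int cannot construct the case class.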

Repositories

To make developers’ lives easier when checking ID validity, our Repository classes have an ActiveKeyedRepository trait which provides utility functions to validate single or multiple IDs, wrapping them with DataIntegrity if all exist and are active. Below is a standard usage pattern within a Service which accepts unvalidated IDs from the API but is required to validate them before passing them to create / update DB queries.

class TransportTypesService(
  repository: TransportTypesRepository,
  journeyMetricTypesRepository: JourneyMetricTypesRepository,
) {

  // Validate the raw IDs from the API before they reach the create query.
  def create(
    name: String,
    journeyMetricTypeIds: List[JourneyMetricType.Id]
  ) = for {
    validJourneyMetricTypeIds <- journeyMetricTypesRepository.validateKeys(journeyMetricTypeIds)
    transportType <- repository.create(name, validJourneyMetricTypeIds)
  } yield transportType
}
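The body of validateKeys is not shown in the article. A minimal in-memory sketch of what such a function might do is below, assuming it succeeds only when every ID exists and is active and otherwise reports the offending IDs. The real repository presumably runs a database query and returns some effect type; here an Either and a List of rows stand in for both. `Valid`, `InMemoryJourneyMetricTypesRepository` and the `active` flag are hypothetical.

```scala
object ValidateKeysSketch {
  // Hypothetical evidence wrapper; it can only be created inside this object.
  final class Valid[+T] private[ValidateKeysSketch] (val value: T)

  // Simplified: plain Int IDs and an explicit `active` flag.
  case class JourneyMetricType(id: Int, name: String, active: Boolean)

  class InMemoryJourneyMetricTypesRepository(rows: List[JourneyMetricType]) {
    private val activeIds: Set[Int] = rows.filter(_.active).map(_.id).toSet

    // Right with wrapped IDs if all exist and are active;
    // Left with the missing or inactive IDs otherwise.
    def validateKeys(ids: List[Int]): Either[List[Int], List[Valid[Int]]] = {
      val invalid = ids.filterNot(activeIds.contains)
      if (invalid.isEmpty) Right(ids.map(id => new Valid(id))) else Left(invalid)
    }
  }
}
```

Because the result is all-or-nothing, a for-comprehension like the one in the service short-circuits before the create query runs if any ID fails validation.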

Takeaway

Advantages

The key advantage of this design approach is the future flexibility your system gains. By abstracting over the concepts we knew were likely to change, we gave our users the ability to maintain and manage a part of the system normally locked behind the scenes.

Developers’ time can also be spent on the algorithmic parts of the system, such as creating the concept of an ability to charge for something, rather than on business-specific rules about how a skip should be charged.

It’s also much more fun to build!

Caveats

The solution does introduce more boilerplate, since you have to create the models that define the type structure and validate the IDs. It also increases the complexity of testing and data generation, because you want to ensure your tests are running against valid data.

Also, if you ever need specific business logic within the code itself that must interact with the type structure, this can be fiddly. We solved this issue through the use of Enums and a table which contains those Enums, which isn’t bulletproof.

Thoughts

We at AnyJunk are in the privileged position of being able to design and build another new system which not only solves our immediate requirements but also sets us up to solve future use cases almost free of charge. Designing a more generic system that gives users the power to customise core logic to their needs makes the system far more scalable, in terms of both current and potential customer usage. Whilst this won’t always be the case, we believe the increased code complexity is well worth the future rewards.

Also as functional developers, who doesn’t prefer to design abstract solutions and leave the implementation details to users?