Parent-child entities through Repositories
An approach to creating repositories over parent-child entities in PHP
My colleagues and I have worked out an approach to building repositories over parent-child entity relationships. In this post I'd like to introduce it to you through a simplified example.
Let’s have a Project object which will be our aggregate root. It can have many Tasks (0, 1, 2 …).
If we want to be strict, we should reach Tasks only through Project. Tasks are child entities, so they should not have their own repositories. A read operation would load all the child entities, so we would read the project as a whole, the entire object graph. Write operations would go through the Project repository as well.
In the real world this can cause problems: performance issues from reading the whole object graph, for example, or the need to reach a child entity directly rather than only through the aggregate root. I think it would be good if we could create all the repositories separately, but with a little twist.
So one approach could be to have only the ProjectRepository in our domain model and to create the TaskRepository in our infrastructure layer. This way we can be strict at least in our domain model.
For Tasks we could introduce an additional model named TaskCollection (we should really call it Tasks, but for now let's call it TaskCollection for the sake of clarity). With this new layer between Project and Tasks we can control the availability of the Tasks. If we don't load the Tasks, we can't mutate our TaskCollection (it will be null in the Project object), because changing tasks we never loaded would cause consistency issues.
For example, say we load our Project entity without the Tasks and then save it through the ProjectRepository. What should we do? Should we delete all the Tasks because we didn't load them and the Project now appears to have an empty collection of tasks? I don't think so. That is why the Tasks cannot be changed if we don't load them.
Now let's look at our two entities, Project and Task.
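A minimal sketch of how these two entities might look, written for PHP 7.4 or later. The property names, the constructor signatures, the hasTasksLoaded() helper and the choice of LogicException are assumptions made for illustration. TaskCollection is only referenced as a type here.

```php
<?php
declare(strict_types=1);

class Task
{
    // Stays null until the task has been persisted; there is no public setter.
    protected ?int $id = null;
    private string $title;

    public function __construct(string $title)
    {
        $this->title = $title;
    }

    public function id(): ?int { return $this->id; }
    public function title(): string { return $this->title; }
}

class Project
{
    // Stays null until the project has been persisted; there is no public setter.
    protected ?int $id = null;
    // null means "the tasks were not loaded", which is different from "no tasks".
    protected ?TaskCollection $tasks;
    private string $name;

    public function __construct(string $name, ?TaskCollection $tasks = null)
    {
        $this->name = $name;
        $this->tasks = $tasks;
    }

    public function id(): ?int { return $this->id; }
    public function name(): string { return $this->name; }

    // Hypothetical helper: lets callers check before reaching for the tasks.
    public function hasTasksLoaded(): bool { return $this->tasks !== null; }

    public function tasks(): TaskCollection
    {
        if ($this->tasks === null) {
            throw new LogicException('Tasks were not loaded for this project.');
        }
        return $this->tasks;
    }
}
```

The important detail is that tasks() refuses to hand out anything when the collection is null, so code that works with a partially loaded Project cannot change tasks by accident.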
This is what our TaskCollection looks like.
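As a rough sketch, a TaskCollection could be a thin wrapper around an array of Task objects. The add() and remove() methods and the use of IteratorAggregate and Countable are assumptions, not necessarily what the real class does. A minimal Task stand-in is included so the snippet runs on its own.

```php
<?php
declare(strict_types=1);

// Minimal stand-in for the Task entity so the sketch is self-contained.
class Task
{
    private string $title;

    public function __construct(string $title) { $this->title = $title; }
    public function title(): string { return $this->title; }
}

final class TaskCollection implements IteratorAggregate, Countable
{
    /** @var Task[] */
    private array $tasks;

    public function __construct(Task ...$tasks)
    {
        $this->tasks = $tasks;
    }

    public function add(Task $task): void
    {
        $this->tasks[] = $task;
    }

    // Removes the given instance (identity comparison) and reindexes the list.
    public function remove(Task $task): void
    {
        $this->tasks = array_values(array_filter(
            $this->tasks,
            static fn (Task $candidate): bool => $candidate !== $task
        ));
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->tasks);
    }

    public function count(): int
    {
        return count($this->tasks);
    }
}
```

Because the collection is its own object, a Project can hold either a TaskCollection (possibly empty) or null, and the two cases stay clearly distinguishable.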
Now comes the interesting part, where we create the DbProject and DbTask classes. Why do we need these classes? For two reasons:
- If you take a look at the Project entity, you will realize that you can't set its identity. But through DbProject you can. If you create a new Project object somewhere outside the repositories, its identity will be null, so it will be a new entity in the data source. If you read the Project entity from any kind of data source, then it is an existing Project, so you have to set its identity. That's one of the reasons why DbProject exists.
- The other reason relates to consistency between the Task entities in the object and in the data source. Our solution is that if we don't load the Tasks, we can't get or edit them (we've already talked about this).
So far we have created our aggregate root (Project), its child entity (Task), the link between the two (TaskCollection) and the Db-related entities (DbProject, DbTask).
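The first reason can be sketched as follows, assuming that DbProject and DbTask are subclasses living in the infrastructure layer and that the identity is set through a method called setId(). Both the inheritance and the method name are assumptions for illustration. The domain classes are reduced to the parts that matter here.

```php
<?php
declare(strict_types=1);

// Domain entity, trimmed down: the id can be read but not set from outside.
class Project
{
    protected ?int $id = null;
    private string $name;

    public function __construct(string $name) { $this->name = $name; }
    public function id(): ?int { return $this->id; }
    public function name(): string { return $this->name; }
}

class Task
{
    protected ?int $id = null;
    private string $title;

    public function __construct(string $title) { $this->title = $title; }
    public function id(): ?int { return $this->id; }
    public function title(): string { return $this->title; }
}

// Infrastructure-only subclasses: only repository code is meant to use these.
final class DbProject extends Project
{
    public function setId(int $id): void { $this->id = $id; }
}

final class DbTask extends Task
{
    public function setId(int $id): void { $this->id = $id; }
}

// A project created in application code has no identity yet.
$fresh = new Project('Website');

// A project read from the data source gets its identity inside the repository.
$loaded = new DbProject('Website');
$loaded->setId(42);
```

Code outside the repositories keeps type-hinting against Project and Task, so it never sees setId(). Under the same assumption, the Db classes would also be a natural place to attach the loaded TaskCollection.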
We have arrived at our last class, ProjectMysqlRepository. It has two jobs: assembling a Project object from the data source and saving it back.
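To illustrate the assemble and save rules without needing a database, here is a sketch that swaps MySQL for plain arrays. InMemoryProjectRepository is a hypothetical stand-in, not the ProjectMysqlRepository itself, and the find() and save() signatures, rename(), all() and setId() are assumptions. The point it tries to show is that save() only rewrites task rows when the tasks were loaded.

```php
<?php
declare(strict_types=1);

// Compact stand-ins for the classes discussed in this post.
class Task
{
    private string $title;
    public function __construct(string $title) { $this->title = $title; }
    public function title(): string { return $this->title; }
}

final class TaskCollection
{
    /** @var Task[] */
    private array $tasks;
    public function __construct(Task ...$tasks) { $this->tasks = $tasks; }
    public function add(Task $task): void { $this->tasks[] = $task; }
    /** @return Task[] */
    public function all(): array { return $this->tasks; }
}

class Project
{
    protected ?int $id = null;
    protected ?TaskCollection $tasks; // null means "tasks were not loaded"
    private string $name;

    public function __construct(string $name, ?TaskCollection $tasks = null)
    {
        $this->name = $name;
        $this->tasks = $tasks;
    }

    public function id(): ?int { return $this->id; }
    public function name(): string { return $this->name; }
    public function rename(string $name): void { $this->name = $name; }
    public function hasTasksLoaded(): bool { return $this->tasks !== null; }

    public function tasks(): TaskCollection
    {
        if ($this->tasks === null) {
            throw new LogicException('Tasks were not loaded for this project.');
        }
        return $this->tasks;
    }
}

final class DbProject extends Project
{
    public function setId(int $id): void { $this->id = $id; }
}

// Hypothetical array-backed stand-in for ProjectMysqlRepository.
final class InMemoryProjectRepository
{
    private array $projectRows = []; // project id => name
    private array $taskRows = [];    // project id => list of task titles
    private int $nextId = 1;

    public function save(Project $project): int
    {
        $id = $project->id() ?? $this->nextId++;
        $this->projectRows[$id] = $project->name();

        // Task rows are rewritten only when the tasks were loaded;
        // an unloaded collection leaves the stored tasks untouched.
        if ($project->hasTasksLoaded()) {
            $this->taskRows[$id] = array_map(
                static fn (Task $task): string => $task->title(),
                $project->tasks()->all()
            );
        }
        return $id;
    }

    public function find(int $id, bool $withTasks): ?Project
    {
        if (!isset($this->projectRows[$id])) {
            return null;
        }
        $tasks = null;
        if ($withTasks) {
            $titles = $this->taskRows[$id] ?? [];
            $tasks = new TaskCollection(...array_map(
                static fn (string $title): Task => new Task($title),
                $titles
            ));
        }
        $project = new DbProject($this->projectRows[$id], $tasks);
        $project->setId($id);
        return $project;
    }
}
```

Loading a project with find($id, false), renaming it and saving it again changes only the project row. Loading it with find($id, true) is required before tasks can be added or removed.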
If you load a Project entity without Tasks through the ProjectRepository and then try to reach the Tasks, you'll get an exception (we could also introduce a method that tells us whether the Task entities can be reached). So, as we saw, we have to load a Project with its Tasks if we want to work with (add, edit, remove) Tasks.
Hopefully you now have the basic idea. I'm not saying it is a silver bullet. You should not use this approach every time, but I think it can have its place in a domain model.