Architecture Kata: Agile Dead Trees

Aleksey Boltava
9 min read · Jul 15, 2020

One of the natural ways for a software engineer to make progress is to move towards the software architect position. System design is a complicated matter, and building a complex system from the ground up requires both domain knowledge and tech expertise, but gaining those takes a lot of time and effort. While reading the book “Software Architecture Fundamentals” I’ve come across the notion of “architecture kata”, which is basically a software architecture problem, and decided to try it out to improve my system design skills.

Roadmap

In this article you’ll find:

  1. Problem statement in a nutshell
  2. My approach to designing the solution
  3. My solution to the problem

The problem

The task is to design a book publishing platform for a company, which would allow:

  1. its clients to search for books and buy them
  2. book authors to upload their books for review
  3. company employees to review books, manage them and connect with clients

The full text of the kata with specific requirements can be found here.

“Additional Context” from the kata is an important block, from which one can derive additional constraints and make assumptions about what’s expected from the system being designed. Let’s break them down one by one:

The business is driven to this decision because competitors have a similar offering
This might mean our platform should have at least the same core features and quality to be attractive to book authors and customers. It should probably have most features from the start (or new features could be shipped iteratively, but extremely fast), since if the initial offering is not satisfactory, the company isn’t going to gain a user base. It might also mean our platform should be able to integrate with the existing ones so that authors and readers can migrate easily.

Competition for authors is tight
This hints that all features made for authors should be high-quality, and that the service should be straightforward, durable and available as close to 24/7 as possible.

This is part of a long-term strategy to modernize the publishing aspects of the business
From this point we can conclude the system should be scalable and extensible, but the initial solution might have to integrate with existing old-fashioned tech or model the clumsy business processes that are currently followed.

Information needed to publish a book (distribution, royalties, marketing) comes from several disparate systems
This is again all about extensibility and integration. Ideally, one day all sources might be hidden behind one pretty UI or eliminated altogether.

Approaching the problem

There are many tools that aid preliminary design and help outline the system’s shape. The C4 model helps visualise the system and suggests a top-down approach: start with a very high-level overview and decompose components step by step until you get down to the code level (object-oriented design). Domain-Driven Design states the whole system should be split along its cohesive inner domains for high isolation and adaptability. When tackling such a problem, I believe it’s important to keep the users in mind, so I usually follow this approach:

  1. Figure out core users and their roles
  2. List functional requirements (using user stories, use cases or something alike)
  3. Delineate core domains (according to DDD)
  4. Design core components to meet functional requirements within and across the domains, while satisfying the nonfunctional requirements of each component
  5. Choose tech for implementation

So, following this approach, let’s tackle the kata!

Core Users

The problem literally states the platform must support at least 3 roles:

  • clients
  • authors
  • editors/reviewers

However, it’s clear from the description that we’d also have:

  • support engineers (to help customers solve their issues)
  • order and billing managers (to manage and keep track of the book sales, delivery and payments)

Functional Requirements

Even now it’s easy to see several core domains the system is going to be split into. For the sake of clarity, let’s group the user stories by domain. I’ve listed a bunch of straightforward use case names and added some hypothetical ones, which seemed relevant and complemented the others well.

Catalog

  1. User can view a list of books
  2. User can view details of a single book
  3. User can paginate through the list of books
  4. User can search for books

Ordering

  1. User can add/remove a book to/from the cart
  2. User can view the cart
  3. User can place an order based on the cart’s state
  4. User can choose book format (e-book, paper book)
  5. User can choose delivery address, time and delivery company for a paper book
  6. User can view their orders (details and status)
  7. User can cancel orders
  8. User can receive email notifications about their order status
  9. Manager can view the order list and order details
  10. Manager can set order status
  11. Manager can contact the customer who placed the order (phone/email/chat)
  12. Manager can send notifications to users via email
  13. Manager can search for orders
  14. Manager can assign order deliveries to a courier (or just pass order info to a delivery company)

Editorial

  1. Managers can edit book metadata
  2. Managers can publish books to the catalog
  3. Managers can set price for a book
  4. Managers can approve/reject chapters or entire books
  5. Managers can comment on chapter/book reviews
  6. Managers can notify users about new chapters/books
  7. Authors can submit a chapter/book for a review
  8. Authors can comment on review
  9. Authors can approve/reject changes to the book during review
  10. Authors can upload chapters in files
  11. Authors can upload entire books as a file
  12. Authors can edit book content and metadata online
  13. Authors can view stats about their books
  14. Authors can manage payment settings
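
To make the submit/approve/reject stories above a bit more concrete, here is a minimal sketch of a review lifecycle as a small state machine. The status names and allowed transitions are assumptions made for illustration; the kata doesn’t prescribe them.

```python
# Hypothetical review statuses and transitions; not taken from the kata itself.
ALLOWED_TRANSITIONS = {
    "draft": {"in_review"},                 # author submits a chapter/book
    "in_review": {"approved", "rejected"},  # manager approves or rejects
    "rejected": {"in_review"},              # author resubmits after changes
    "approved": {"published"},              # manager publishes to the catalog
    "published": set(),                     # terminal state in this sketch
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move from {current!r} to {target!r}")
    return target
```

Keeping the transitions in one table makes it easy to see (and later change) who can move a chapter where.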

Given these requirements, let’s have a closer look at the domains they concern.

Domains

I use actors, entities and communication patterns to distinguish between domains of a system. Understanding these points helps estimate communication, load and data access patterns and storage requirements. Here, I’ve derived 5 domains, which I believe will be sufficient for the initial system design.

Catalog

Actors: readers
Entities: book descriptions, cover images, lists, reviews, comments
Functions: everything from use cases above

Load
Since the catalog is accessed by all users (active clients and random visitors), who number in the millions, the service is going to be under high load. The catalog is about getting information about books, so a read-intensive load is expected.

Tech Requirements

  1. Availability
    We want our users to be able to access the catalog at any point in time. We can sacrifice consistency here, since updating a book’s title or even addition of a new book doesn’t have to be visible instantly in the catalog (even a couple of seconds of delay is definitely tolerable).
  2. Scalability
    As the user base grows (even though it’s already huge), the system should be easy to scale, with the required effort growing no faster than the number of new users.
  3. Elasticity
    If traffic to the catalog spikes suddenly, the system must be able to handle this quickly and transparently for users and continue operation in a normal way
  4. Recoverability And Resilience
    Catalog is probably the most important part of what the platform offers to its users. It’s the page new visitors see, and it’s the page where the main purpose of the platform is fulfilled. If the system serving the catalog crashes, it should be able to recover very quickly with all useful data in place. Ideally, it should never crash and should be resilient to exceptional states or errors in network, data or hardware.

Business Requirements

No specific business requirements have been stated, but here are what some of them might be:

  1. Content personalisation
  2. A/B testing support

Cart

Actors: readers, order managers
Entities: cart items, images, forms (address, card credentials, identity)
Functions: everything from use cases above

Load
Cart is going to be accessed by currently active (buying) customers and by employees who have access to orders and historical data. Heavy load is not expected, since the number of buyers is far smaller than the total number of visitors. However, once the total historical number of orders gets big, analysis and manipulation might get tough, which is why it’s worth separating online and offline order processing.

Tech Requirements

  1. Consistency
    Users most likely want changes in their shopping cart to appear instantly and predictably, so consistency is the top priority here.
  2. Resilience
    Just as with the catalog, the shopping cart is a mission-critical component from the user’s perspective. If the user’s actions constantly end in failure, the user is gone, so the cart should handle internal and external errors gracefully and do its best not to lose the user’s data.
  3. Recoverability
    If the service does go down, new instances backing the cart should spawn and get filled with data quickly so the user can get back to work. Some fallback services may be used to tell the user there is a problem and show their latest cached cart state, just to reassure them we’ve got it safe.
  4. Security
    The shopping cart includes the checkout page, where payment credentials are entered. Needless to say, the service should handle them properly.
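
The recoverability fallback from the list above (show the latest cached cart state when the cart service is down) could look roughly like this. It’s a minimal sketch: `CartReader`, `CartUnavailable` and the `load_cart` callable are hypothetical names, and the in-process dictionary stands in for whatever cache would really be used.

```python
from typing import Callable, Dict, List, Tuple


class CartUnavailable(Exception):
    """Raised by the (hypothetical) primary cart store when it is down."""


class CartReader:
    """Reads a cart and falls back to the last known state on failure."""

    def __init__(self, load_cart: Callable[[str], List[str]]) -> None:
        self._load_cart = load_cart
        self._last_known: Dict[str, List[str]] = {}  # stand-in for a real cache

    def read(self, user_id: str) -> Tuple[List[str], bool]:
        """Return (items, is_stale); is_stale is True when the fallback was used."""
        try:
            items = self._load_cart(user_id)
        except CartUnavailable:
            return list(self._last_known.get(user_id, [])), True
        # Remember a copy so later failures can still show something useful.
        self._last_known[user_id] = list(items)
        return items, False
```

The `is_stale` flag lets the UI tell the user there is a problem while still showing the cart they had.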

Business Requirements
No requirements have been specified but the possible ones are:

  1. preserving cart’s state between visits
  2. goods recommendations on the checkout page

Workbench

This component is in charge of the authors’ side of the business.

Actors: book authors
Entities: documents, text
Functions: uploading and editing chapters, passing them to review

Load
Given that the number of authors is quite small, that their work is easy to isolate (and thus to distribute), and that concurrency is low, the overall load is expected to be relatively low.

Tech Requirements

  1. Consistency And Resilience
    When editing or uploading data, authors would very likely want their work to be saved robustly.
  2. Scalability
    The number of books and their volume may cause performance issues, but since each author only interacts with their own work and is unlikely to have hundreds of huge chapters, scalability is not a big deal here.

Business Requirements

  1. Notifications about review status change
  2. Support for “beta” chapters

Editorial CRM

Actors: book editors and content managers
Entities: review comments, chapters
Functions: everything from related use cases

Load
Editorial is similar to author’s workbench in that it’s operated by a small number of users, but the data volume is comparable to the catalog’s, so searching and filtering will require processing time.

Tech Requirements

  1. Interoperability
    As stated in the kata, the platform must support integrations with various data sources and combine them into a single [editable] “view”. Those sources come and go, so the service should be open for extension and the addition of new integrations.
  2. Consistency
    As book editors work on reviews and access books content, outdated data is going to mess things up and thus should be avoided.
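
One way to keep the interoperability requirement above open for extension is to hide every external system behind the same small interface and merge the results into one view. The sketch below assumes hypothetical names (`PublishingSource`, `StaticSource`, `build_view`); the stub adapter returns canned data where a real one would call the external system.

```python
from typing import Dict, Iterable, Protocol


class PublishingSource(Protocol):
    """Anything that can return its slice of data for a book."""

    name: str

    def fetch(self, book_id: str) -> Dict[str, object]: ...


class StaticSource:
    """Stub adapter for illustration; a real one would query an external system."""

    def __init__(self, name: str, data: Dict[str, Dict[str, object]]) -> None:
        self.name = name
        self._data = data

    def fetch(self, book_id: str) -> Dict[str, object]:
        return dict(self._data.get(book_id, {}))


def build_view(
    book_id: str, sources: Iterable[PublishingSource]
) -> Dict[str, Dict[str, object]]:
    """Combine all sources into a single view, keyed by source name."""
    return {source.name: source.fetch(book_id) for source in sources}
```

Adding or retiring a source then means adding or removing one adapter, without touching `build_view`.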

Business Requirements
From the task definition we should be able to:

  1. notify book authors about review status changes
  2. notify clients about chapter releases

Store CRM

Actors: employees (“admins”)
Entities: orders, payments, deliveries
Functions: everything from related use cases

Load
Store is rarely accessed directly and doesn’t operate with huge volumes of data at once, so no significant load is expected.

Tech Requirements

  1. High Consistency
    Store service is going to be the source of truth for all data related to orders and books, so data integrity is of high priority.
  2. Security
    The module should support data masking, access restriction and other features to protect users’ data from unauthorized access.
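
As a rough illustration of the data masking and access restriction mentioned above, here is a small sketch. The field names (`email`, `card_number`) and the `billing_manager` role are assumptions for the example, not something the kata defines.

```python
from typing import Dict


def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"
    return f"{local[0]}***@{domain}"


def mask_card_number(number: str) -> str:
    """Show only the last four digits of a card number."""
    digits = [ch for ch in number if ch.isdigit()]
    if len(digits) <= 4:
        return "*" * len(digits)
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


def order_view(order: Dict[str, str], role: str) -> Dict[str, str]:
    """Return a copy of the order as the given (hypothetical) role may see it."""
    view = dict(order)
    # Card numbers are masked for everyone in this sketch.
    view["card_number"] = mask_card_number(order["card_number"])
    if role != "billing_manager":
        view["email"] = mask_email(order["email"])
    return view
```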

Service Design

There are tons of diagram versions in the wild with their advantages and drawbacks, but for this article we’re going to use Application Diagram from this article on Medium. Here’s mine:

Application Diagram For Possible Kata Solution

Let’s go over the components on the image.

Overall Design

Among the many architecture styles available, I’ve chosen microservices for simple scalability and natural separation of concerns. All services are derived from the domains I’ve listed before and can be deployed as Docker containers on Kubernetes (k8s), Nomad, or a cloud platform such as Google Cloud or AWS. The services are the following:

Catalog
This service is going to cache the minimal data about all books needed to display the catalog, plus the data specific to each book. It could also contain cached data from ranking or recommendation services if we had such. It uses Couchbase as its database for in-memory efficiency, scalability and eventual consistency. Book data gets updated via notifications from Kafka (Books topic) once the book is updated in the Books Service. All in all, it’s designed to withstand read-intensive load.
Another important detail is that web traffic to the service is routed via a separate load balancer and subnetwork for security reasons (we don’t want clients to access services they’re not supposed to).
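
The update flow described for the Catalog service (events from the Books topic refreshing a read-optimized cache) can be sketched without any Kafka or Couchbase code. The event shape `BookUpdated` and the `version` field are assumptions; the point is only that applying events in any order, or twice, still leaves the catalog in the latest known state.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class BookUpdated:
    """Hypothetical event shape for the Books topic."""

    book_id: str
    version: int
    title: str
    price_cents: int


class CatalogProjection:
    """Read-optimized view of books, kept eventually consistent via events."""

    def __init__(self) -> None:
        self._books: Dict[str, BookUpdated] = {}  # stand-in for the catalog cache

    def apply(self, event: BookUpdated) -> bool:
        """Apply an event; return False for stale or duplicate deliveries."""
        current = self._books.get(event.book_id)
        if current is not None and current.version >= event.version:
            return False
        self._books[event.book_id] = event
        return True

    def get(self, book_id: str) -> Optional[BookUpdated]:
        return self._books.get(book_id)
```

Ignoring older versions is what makes a couple of seconds of delay, or a redelivered message, harmless for readers of the catalog.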

Orders
This service is designed to hold all relevant data about customer orders. Since consistency is our top priority, PostgreSQL is chosen as a backing DB. Once an order is in place, its details are pushed to the Orders topic in Kafka.
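
A minimal sketch of “store the order, then push it to the Orders topic” might look like the following. The dictionary stands in for PostgreSQL and the injected `publish` callable stands in for a Kafka producer; `Order`, `OrdersService` and the `"orders"` topic name are assumptions for the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Order:
    """Hypothetical order record."""

    order_id: str
    user_id: str
    book_ids: List[str]
    status: str = "placed"


class OrdersService:
    def __init__(self, publish: Callable[[str, bytes], None]) -> None:
        self._orders: Dict[str, Order] = {}  # stand-in for the PostgreSQL table
        self._publish = publish              # stand-in for a Kafka producer

    def place(self, order: Order) -> bool:
        """Store the order and publish its details; repeated calls are no-ops."""
        if order.order_id in self._orders:
            return False
        self._orders[order.order_id] = order
        payload = json.dumps(asdict(order)).encode("utf-8")
        self._publish("orders", payload)
        return True
```

In a real service the store and the publish would need to succeed or fail together; this sketch deliberately leaves that problem out.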

Books
This service is in charge of keeping records of book content, availability, deliveries and distribution. When a book’s data changes, interested parties are notified via the Books topic in Kafka, and availability info gets pushed to the Books-Lock topic.

Users
This service holds relevant information about users (customers, authors, employees) shared by different services.

Notification Center
It’s responsible for sending notifications to all kinds of users (customers, authors, employees) via target channels (email, SMS, push or something else).
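
The idea of one service fanning a message out to several channels can be sketched as a small registry of senders. Everything here is hypothetical (`NotificationCenter`, the `Sender` signature, the channel names); real senders would wrap an email gateway, an SMS provider and so on.

```python
from typing import Callable, Dict, Iterable, List

# A sender takes (recipient, message); real ones would call an external provider.
Sender = Callable[[str, str], None]


class NotificationCenter:
    def __init__(self) -> None:
        self._channels: Dict[str, Sender] = {}

    def register(self, channel: str, sender: Sender) -> None:
        """Plug in a delivery channel such as "email" or "sms"."""
        self._channels[channel] = sender

    def notify(self, recipient: str, message: str, channels: Iterable[str]) -> List[str]:
        """Send via every requested channel that is registered; return those used."""
        delivered: List[str] = []
        for channel in channels:
            sender = self._channels.get(channel)
            if sender is None:
                continue  # unknown channel: skipped in this sketch
            sender(recipient, message)
            delivered.append(channel)
        return delivered
```

New channels can then be added by registering another sender, without changing the services that ask for a notification.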

Authors’ CMS
This is the Front-API for writers that implements document manipulation, going through review, looking at their profile and other logic as well as communicating with other internal services. Couchbase is supposed to cache review and documents data until a chapter/book gets approved by the editorial.

Employees Admin
This is the Front-API for all kinds of employees who manage the system: editors, content managers, support engineers and others.

Thanks for reading this far! Leave your comments and tips on how to improve/break this system.

Aleksey Boltava

Engineering manager, Golang and Scala Developer, UX advocate