The unfinished guide to Event-Sourcing, CQRS and DDD [Unreleased, RAW] + notes about Uber

Vladimir Metnew
Oct 7, 2018


Intro

CAUTION: This post is a raw unformatted long list of notes, links to original posts and the most interesting things I’ve extracted from a bunch of sources about CQRS/DDD/Event sourcing.

First of all, I’m really sorry that I haven’t finished this article.
This post was meant to be a great one-hour longread explaining everything about DDD, Event Sourcing, CQRS and related domains. However, my interests changed dramatically, and I no longer want to spend time on research, building PoC apps and writing yet another CQRS library.

Anyway, I think it would be better to share this list with others.

https://stackoverflow.com/questions/13489829/cqrs-sagas-did-i-understand-them-right

https://www.slideshare.net/jeppec/agile-ddd-cqrs?next_slideshow=1

https://stackoverflow.com/a/8823383/5206919 - when to use the CQRS architecture? TL;DR: when you think the ES and CQRS patterns are suitable

https://stackoverflow.com/a/1958722/5206919 — what aggregate roots are

https://martinfowler.com/articles/201701-event-driven.html — definitions of the event sourcing, CQRS and event notification patterns

https://stackoverflow.com/questions/34161673/keeping-a-snapshot-of-the-most-current-version-of-each-aggregate-in-an-event-sto — what a “snapshot” is in ES. Make a snapshot every 50–100 events. Do we need this feature if we use a separate table for materialized views?
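
To make the snapshot idea concrete, here is a minimal sketch, assuming a toy `Counter` aggregate and an in-memory store (both names are made up for illustration, as is the `SNAPSHOT_EVERY` constant). Loading starts from the latest snapshot and replays only the events recorded after it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SNAPSHOT_EVERY = 50  # assumption: the lower end of the 50-100 range from the note


@dataclass
class Counter:
    """Toy aggregate: its state is the sum of all event amounts."""
    total: int = 0
    version: int = 0  # number of events applied so far

    def apply(self, event: dict) -> None:
        self.total += event["amount"]
        self.version += 1


class InMemoryStore:
    """Hypothetical store: an append-only event list plus the latest snapshot."""

    def __init__(self) -> None:
        self.events = []
        self.snapshot: Optional[Tuple[int, int]] = None  # (version, total)

    def append(self, event: dict) -> None:
        self.events.append(event)
        if len(self.events) % SNAPSHOT_EVERY == 0:
            state, _ = self.load()
            self.snapshot = (state.version, state.total)

    def load(self):
        """Return the aggregate and the number of events that were replayed."""
        state = Counter()
        if self.snapshot is not None:
            state.version, state.total = self.snapshot
        tail = self.events[state.version:]  # only events after the snapshot
        for event in tail:
            state.apply(event)
        return state, len(tail)


store = InMemoryStore()
for _ in range(120):
    store.append({"type": "Added", "amount": 1})

counter, replayed = store.load()
```

With 120 events and a snapshot at version 100, only the last 20 events are replayed on load. A materialized view answers queries, but the write side still has to rebuild the aggregate somehow, which is where a snapshot may still help.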

https://martinfowler.com/eaaCatalog/transactionScript.html — What’s the difference between a Transaction Script and a saga?

https://github.com/ravendb/docs/blob/master/Articles/Raven.Documentation.Articles/articles/cqrs-and-event-sourcing-made-easy-with-ravendb.markdown — quick overview of CQRS + some specific notes about CQRS in RavenDB(?)

https://github.com/heynickc/awesome-ddd — Awesome list about DDD

http://cqrs.wikidot.com/doc:projection — projection definition, some notes about “rebuilding/replaying/reverse process”

> Projecting is process of converting (or aggregating) a stream of events into a structural representation. This structural representation (which is being updated as we traverse the stream) can be called many names: persistent read model, view or a state.

https://abdullin.com/post/event-sourcing-projections/ — what is “projection” in ES

> Projection is about deriving current state from the stream of events.

Process of projecting is executed by a set of event handlers, which essentially are just methods executed whenever a specific type of event comes in. These methods perform CRUD operation upon the persistent read model.
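
A projection as described above can be sketched roughly like this. The event names, the `read_model` dict and the `HANDLERS` table are assumptions made for the example; a real read model would usually live in a database table.

```python
# Hypothetical persistent read model: one row per user, keyed by user id.
read_model = {}


def on_user_registered(event: dict) -> None:
    read_model[event["user_id"]] = {"name": event["name"], "orders": 0}  # create


def on_user_renamed(event: dict) -> None:
    read_model[event["user_id"]]["name"] = event["name"]  # update


def on_order_placed(event: dict) -> None:
    read_model[event["user_id"]]["orders"] += 1  # update


def on_user_deleted(event: dict) -> None:
    read_model.pop(event["user_id"], None)  # delete


# One handler per event type, as in the quote above.
HANDLERS = {
    "UserRegistered": on_user_registered,
    "UserRenamed": on_user_renamed,
    "OrderPlaced": on_order_placed,
    "UserDeleted": on_user_deleted,
}


def project(events) -> None:
    """Traverse the stream and update the read model event by event."""
    for event in events:
        handler = HANDLERS.get(event["type"])
        if handler is not None:  # event types without a handler are skipped
            handler(event)


project([
    {"type": "UserRegistered", "user_id": 1, "name": "Ann"},
    {"type": "UserRegistered", "user_id": 2, "name": "Bob"},
    {"type": "OrderPlaced", "user_id": 1},
    {"type": "UserRenamed", "user_id": 1, "name": "Anna"},
    {"type": "UserDeleted", "user_id": 2},
])
```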

http://crw-riviere.github.io/notes/cqrs-event-sourcing-awesome — nice CQRS overview with pros/cons

https://codurance.com/2015/07/18/cqrs-and-event-sourcing-for-dummies/ — Quick overview + pros/cons+ comparison with CRUD + a good image of CQRS architecture

https://stackoverflow.com/questions/15934490/read-side-implementation-approaches-using-cqrs?rq=1 — good cqrs arch image and typical workflow

https://docs.microsoft.com/en-us/azure/architecture/guide/architecture-styles/cqrs — Azure arch center docs are cool, CQRS topic isn’t an exception. Good diagrams + pros/cons/rules/recommendations

https://docs.microsoft.com/en-us/azure/architecture/patterns/cqrs — azure docs, full description of CQRS and ES with examples, not just a short guide like above

https://habrahabr.ru/company/naumen/blog/257477/ — in Russian: an example reactive app with CQRS/ES in Java, cool info in the comments

https://cqrs.wordpress.com/documents/building-event-storage/ — designing event storage (mostly for relational DBs, explaining snapshots, some notes about event structure in database, event store as queue)
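
A rough idea of what an event table in a relational DB could look like, using SQLite from the Python standard library. The column names and the `append`/`load` helpers are assumptions for illustration, not the schema from the linked post. The composite primary key doubles as an optimistic concurrency check: two writers cannot both store the same version of an aggregate.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        aggregate_id TEXT    NOT NULL,
        version      INTEGER NOT NULL,
        type         TEXT    NOT NULL,
        payload      TEXT    NOT NULL,
        PRIMARY KEY (aggregate_id, version)
    )
    """
)


def append(aggregate_id: str, expected_version: int, event_type: str, payload: dict) -> bool:
    """Append one event; return False if another writer already took this version."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events VALUES (?, ?, ?, ?)",
                (aggregate_id, expected_version + 1, event_type, json.dumps(payload)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def load(aggregate_id: str) -> list:
    """Return the aggregate's events in the order they were stored."""
    rows = conn.execute(
        "SELECT version, type, payload FROM events "
        "WHERE aggregate_id = ? ORDER BY version",
        (aggregate_id,),
    )
    return [(version, kind, json.loads(payload)) for version, kind, payload in rows]


first = append("trip-1", 0, "TripRequested", {"rider": "r-1"})
conflict = append("trip-1", 0, "TripCancelled", {})  # stale expected_version
second = append("trip-1", 1, "TripStarted", {})
history = load("trip-1")
```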

http://www.vertabelo.com/blog/technical-articles/denormalization-when-why-and-how — some good notes about data denormalization in applications

https://eventstore.org/docs/ — EventStore DB docs (blog posts about CQRS/ES, not the DB’s docs)

> Events as a functional DB, business value of event log

When to avoid CQRS — Here’s the strongest indication I can give you to know that you’re doing CQRS correctly: Your aggregate roots are sagas.

http://udidahan.com/2010/08/31/race-conditions-dont-exist/ = no race conditions in business (that means we’ve got a graph of actions, FSM, step-by-step logic, etc). Also: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type

https://docs.microsoft.com/en-us/azure/architecture/guide/architecture-styles/event-driven — MS notes about event-arch, types of event processing

https://github.com/liangzeng/cqrs — JS library for CQRS; lets you write actors as classes and manipulate data; Mongo-only

https://stackoverflow.com/questions/33279680/what-are-the-disadvantages-of-using-event-sourcing-and-cqrs/33305980 — disadvantages of CQRS/ES: high CPU usage, more disk space, a steep learning curve (complexity), eventual consistency, high memory usage. TL;DR: those aren’t cons.

> I’ve been told that Facebook does use ES with Eventual Consistency, which is why you can sometimes see a post disappear and reappear after you’ve posted it.

https://github.com/eventstore/eventstore/wiki/Event-Sourcing-Basics — EventStore DB docs: domain event definition, ES/CQRS basics

http://udidahan.com/2009/12/09/clarified-cqrs/ — another epic must-read post about CQRS. Most interesting topic is “Keeping the query store in sync”.

> After the command-processing autonomous component has decided to accept a command, modifying its persistent store as needed, it publishes an event notifying the world about it.

The publishing of the event is done transactionally together with the processing of the command and the changes to its database. That way, any kind of failure on commit will result in the event not being sent. This is something that should be handled by default by your message bus, and if you’re using MSMQ as your underlying transport, requires the use of transactional queues.

The autonomous component which processes those events and updates the query data store is fairly simple, translating from the event structure to the persistent view model structure. I suggest having an event handler per view model class (AKA per table).
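
The quote relies on the message bus (for example, MSMQ transactional queues) to make the state change and the event atomic. A different way to get a similar guarantee is an "outbox" table written in the same database transaction. The sketch below shows that swapped-in technique with SQLite; the table names and the `withdraw` command are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute(
    "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, event TEXT NOT NULL)"
)
conn.execute("INSERT INTO accounts VALUES ('acc-1', 100)")
conn.commit()


def withdraw(account_id: str, amount: int) -> bool:
    """Change state and record the event in one transaction, or do neither."""
    try:
        with conn:  # commits on success, rolls back both statements on error
            conn.execute(
                "INSERT INTO outbox (event) VALUES (?)",
                (f"Withdrawn:{account_id}:{amount}",),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:  # the CHECK constraint rejected the update
        return False


ok = withdraw("acc-1", 30)
rejected = withdraw("acc-1", 500)  # would make the balance negative

balance = conn.execute("SELECT balance FROM accounts WHERE id = 'acc-1'").fetchone()[0]
pending = [row[0] for row in conn.execute("SELECT event FROM outbox ORDER BY seq")]
```

A separate relay process would then read `outbox` and publish the rows to the bus. The rejected command leaves no event behind, which matches "any kind of failure on commit will result in the event not being sent".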

https://stackoverflow.com/questions/32216408/cqrs-commands-and-queries-do-they-belong-in-the-domain - should queries and commands be executed by the domain? TL;DR: mostly yes, but always consider your requirements before making a decision.

https://stackoverflow.com/questions/9455305/uniqueness-validation-when-using-cqrs-and-event-sourcing — how to handle validation with CQRS/ES. IMO: we could run a query and check uniqueness before command execution.
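
The "query first, then run the command" idea could look like the sketch below. The `registered_usernames` set stands in for a read model, and all names are hypothetical.

```python
class UsernameTaken(Exception):
    """Raised when the uniqueness check fails before the command runs."""


registered_usernames = set()  # hypothetical read model used only for the check
event_log = []                # hypothetical write side


def handle_register_user(username: str) -> None:
    key = username.lower()
    # Step 1: query the read model.
    if key in registered_usernames:
        raise UsernameTaken(username)
    # Step 2: execute the command by recording the event.
    event_log.append({"type": "UserRegistered", "username": username})
    # In a real system a projection would update the read model, possibly later.
    registered_usernames.add(key)


handle_register_user("Vladimir")
try:
    handle_register_user("vladimir")
    duplicate_rejected = False
except UsernameTaken:
    duplicate_rejected = True
```

Keep in mind that with an eventually consistent read model there is a window between the check and the update in which two identical registrations can both pass. This sketch does not close that window; a unique constraint on the write side is one way to do it.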

https://stackoverflow.com/questions/8820748/when-to-use-the-cqrs-design-pattern — when to use CQRS. TL;DR: when you think it’s suitable: perf matters, business rules, you need to store events anyway, etc.

http://enterprisecraftsmanship.com/2016/01/11/entity-vs-value-object-the-ultimate-list-of-differences/ — DDD: how entities differ from value objects. Some spoilers:

  • Entities have their own intrinsic identity, value objects don’t.
  • The notion of identity equality refers to entities; the notion of structural equality refers to value objects; the notion of reference equality refers to both.
  • Entities have a history; value objects have a zero lifespan.
  • A value object should always belong to one or several entities, it can’t live by its own.
  • Value objects should be immutable; entities are almost always mutable.
  • To recognize a value object in your domain model, mentally replace it with an integer.
  • Value objects shouldn’t have their own tables in the database.
  • Always prefer value objects over entities in your domain model.
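
A small sketch of the first few points, assuming a `Money` value object and an `Account` entity (both invented for the example). The value object is immutable and compared by its fields; the entity is compared by its identity only and can change over time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Value object: immutable, structural equality."""
    amount: int
    currency: str


class Account:
    """Entity: mutable, identity equality."""

    def __init__(self, account_id: str, balance: Money) -> None:
        self.id = account_id
        self.balance = balance

    def deposit(self, money: Money) -> None:
        # The entity changes by swapping in a new value object.
        self.balance = Money(self.balance.amount + money.amount, money.currency)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Account) and self.id == other.id

    def __hash__(self) -> int:
        return hash(self.id)


five = Money(5, "USD")
account = Account("acc-1", five)
same_account_later = Account("acc-1", Money(500, "USD"))
account.deposit(Money(10, "USD"))
```

Here `Money(5, "USD") == five` holds because the fields match, while `account == same_account_later` holds despite different balances because the ids match.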

https://stackoverflow.com/questions/5384782/is-cqrs-a-good-approach-for-implementing-a-social-application-on-google-app-engi

> CQRS is not overly complex or difficult, but it does take time to adjust your thinking away from the traditional request/response and client/server interactions that have been pounded into our heads over the years.

Don’t rely on the framework. It’d be better to understand CQRS/ES from the inside.

https://stackoverflow.com/questions/10199082/cqrs-event-sourcing-and-nosql-database?rq=1 — the answer contains some links about how to use/avoid 2PC and distributed transactions

https://stackoverflow.com/questions/29916468/what-should-be-returned-from-the-api-for-cqrs-commands — originally, commands don’t return anything. But keep in mind that this CQRS convention applies inside your app, not to the app server: the server should still return the required HTTP headers/status/body, while the command inside your app may return nothing (or a DB result, or a success/error message). Fire-and-forget is very dangerous.
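
One way to read this note: the command handler returns nothing, and the HTTP layer around it decides what the client sees. A minimal sketch, assuming a hypothetical `post_order` endpoint function and an in-memory `orders` dict; the status codes chosen here are one reasonable option, not a rule.

```python
from http import HTTPStatus


class CommandRejected(Exception):
    """Raised by the command handler when a business rule is violated."""


orders = {}  # hypothetical write-side state


def handle_place_order(order_id: str, items: list) -> None:
    """The command itself returns nothing; it either succeeds or raises."""
    if order_id in orders:
        raise CommandRejected(f"order {order_id} already exists")
    orders[order_id] = list(items)


def post_order(order_id: str, items: list):
    """HTTP layer: translate the command outcome into a status and a body."""
    try:
        handle_place_order(order_id, items)
    except CommandRejected as exc:
        return int(HTTPStatus.CONFLICT), {"error": str(exc)}
    return int(HTTPStatus.ACCEPTED), {"location": f"/orders/{order_id}"}


created = post_order("o-1", ["pizza"])
duplicate = post_order("o-1", ["pizza"])
```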

https://stackoverflow.com/questions/2559096/cqrs-how-to-handle-new-report-tables-or-how-to-import-all-history-from-the-e — replay your event log to rebuild the app “state” you need
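
For a report table that is added long after the events were recorded, the replay could look like this sketch. The `RevenueByDay` class, its `position` checkpoint and the event shapes are assumptions for the example.

```python
from collections import Counter

# Hypothetical history that already exists when the new report is introduced.
event_log = [
    {"type": "OrderPaid", "day": "2018-10-01", "amount": 20},
    {"type": "OrderShipped", "day": "2018-10-01"},
    {"type": "OrderPaid", "day": "2018-10-01", "amount": 5},
    {"type": "OrderPaid", "day": "2018-10-02", "amount": 12},
]


class RevenueByDay:
    """New report table, built by replaying the log from the beginning."""

    def __init__(self) -> None:
        self.rows = Counter()
        self.position = 0  # index of the next event to read

    def catch_up(self, log: list) -> None:
        for event in log[self.position:]:
            if event["type"] == "OrderPaid":
                self.rows[event["day"]] += event["amount"]
            self.position += 1


report = RevenueByDay()
report.catch_up(event_log)  # full replay of the history

event_log.append({"type": "OrderPaid", "day": "2018-10-02", "amount": 8})
report.catch_up(event_log)  # later runs only read the new events
```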

https://github.com/reimagined/resolve — a framework/starter kit (like CRA) that I accidentally found on GitHub. reSolve relies on the CQRS architecture and additionally introduces its own concepts. Interesting project, because apps with state management are a perfect fit for ES.

https://stackoverflow.com/questions/7312540/event-sourcing-events-that-trigger-others-rebuilding-state — how to work with side-effects in CQRS. TLDR: use sagas.
> From my experience: Uber moved side effects to DB triggers and doesn’t handle them as events.
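
A saga in this sense can be sketched as a small state machine that listens to events and answers with commands. Rebuilding state by replaying events then never touches it, so side effects are not repeated. The `OrderSaga` class and the event/command names below are hypothetical.

```python
class OrderSaga:
    """Reacts to events and decides which commands to issue next."""

    def __init__(self) -> None:
        self.state = {}  # order_id -> current step

    def handle(self, event: dict) -> list:
        order_id = event["order_id"]
        kind = event["type"]
        step = self.state.get(order_id)

        if kind == "OrderPlaced" and step is None:
            self.state[order_id] = "awaiting_payment"
            return [{"command": "ChargeCard", "order_id": order_id}]
        if kind == "PaymentSucceeded" and step == "awaiting_payment":
            self.state[order_id] = "shipping"
            return [{"command": "ShipOrder", "order_id": order_id}]
        if kind == "PaymentFailed" and step == "awaiting_payment":
            self.state[order_id] = "cancelled"
            return [{"command": "CancelOrder", "order_id": order_id}]
        return []  # duplicates and out-of-order events trigger nothing


saga = OrderSaga()
commands = []
for event in [
    {"type": "OrderPlaced", "order_id": "o-1"},
    {"type": "PaymentSucceeded", "order_id": "o-1"},
    {"type": "PaymentSucceeded", "order_id": "o-1"},  # duplicate delivery
    {"type": "OrderPlaced", "order_id": "o-2"},
    {"type": "PaymentFailed", "order_id": "o-2"},
]:
    commands.extend(saga.handle(event))
```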

https://stackoverflow.com/questions/28667367/best-event-sourcing-db-strategy — how to design EventStore.

https://abdullin.com/post/scalable-and-simple-cqrs-views-in-the-cloud/ — cool article about CQRS views(read model): how/what/where, etc. Concurrency and read model updates.

https://abdullin.com/post/cqrs-architecture-and-definitions/ — CQRS principle doesn’t equal CQRS arch/pattern.

What we know about Uber from their engineering blog

How Uber was rewriting their rider app (iOS/Android)!

EventDrivenArch:

  • Triggers are decorators
  • Uber uses pub-sub and EDA
  • MVCS = MVC + service layer
Image from Uber Eng https://eng.uber.com/wp-content/uploads/2016/01/billrider_flow-1024x504.png

How Uber was rewriting their payment system in India: side effects (SMS, email, push) are written to a queue and triggered asynchronously

Side effects configured. https://eng.uber.com/wp-content/uploads/2017/06/image3-2.png
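
The general "write side effects to a queue, run them asynchronously" idea can be sketched with the standard library alone. This is not Uber's implementation; the `complete_payment` function, the task shape and the single worker thread are assumptions for the example.

```python
import queue
import threading

side_effects = queue.Queue()
delivered = []  # stands in for real SMS/email/push gateways


def worker() -> None:
    """Consume queued side effects until a None sentinel arrives."""
    while True:
        task = side_effects.get()
        if task is None:
            side_effects.task_done()
            break
        delivered.append(f"{task['channel']}:{task['user']}")
        side_effects.task_done()


def complete_payment(user: str, channels=("sms", "email", "push")) -> str:
    """Do the core work synchronously and only enqueue the notifications."""
    for channel in channels:
        side_effects.put({"channel": channel, "user": user})
    return "paid"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

result = complete_payment("rider-42")  # returns without waiting for delivery
side_effects.join()                    # wait here only to make the demo deterministic
side_effects.put(None)
thread.join()
```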

DB:

Uber uses Schemaless as a core db for trips. Schemaless is powered by a cluster of MySQL dbs. That means:

  • Uber uses NoSQL under the hood for saving trips.
  • Trip schema is flexible.
  • Schemaless (NoSQL) replaced relational DBs at Uber (at least as the core DB)
  • Cassandra replaced Riak (another NoSQL DB)
  • Hadoop as a warehouse
  • Schemaless is similar to Cassandra and Dynamo
  • DB triggers are essential because they trigger side effects (which can be configured)
  • Uber stores arrays in cells and connects data changes to services using triggers. Example: a change triggers the Billing service.
  • Side-effects are defined per-region and could be configured
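
To illustrate the last three points, here is a toy version of "a write to a cell fires triggers, and side effects are configured per region". Everything in it is hypothetical: the `trigger` decorator, the `STATUS` column, the `REGION_SIDE_EFFECTS` table and the region values are made up and are not the Schemaless API.

```python
TRIGGERS = {}  # column name -> list of handlers


def trigger(column: str):
    """Decorator that registers a handler for writes to the given column."""
    def register(handler):
        TRIGGERS.setdefault(column, []).append(handler)
        return handler
    return register


# Made-up per-region configuration of notification channels.
REGION_SIDE_EFFECTS = {"IN": ["sms"], "US": ["email", "push"]}

sent = []  # stands in for calls to other services


@trigger(column="STATUS")
def notify_rider(row_key: str, cell: dict) -> None:
    if cell["status"] != "completed":
        return
    for channel in REGION_SIDE_EFFECTS.get(cell["region"], []):
        sent.append((channel, row_key))


@trigger(column="STATUS")
def bill_rider(row_key: str, cell: dict) -> None:
    if cell["status"] == "completed":
        sent.append(("billing", row_key))


def write_cell(table: dict, row_key: str, column: str, cell: dict) -> None:
    """Store the cell, then fire every trigger registered for its column."""
    table.setdefault(row_key, {})[column] = cell
    for handler in TRIGGERS.get(column, []):
        handler(row_key, cell)


trips = {}
write_cell(trips, "trip-1", "STATUS", {"status": "completed", "region": "IN"})
write_cell(trips, "trip-2", "STATUS", {"status": "requested", "region": "US"})
```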

NOTE: Cassandra and Dynamo share the same “hashing” concept, according to the wiki. Uber uses LevelDB (which is also used by Chromium).

Machine Learning:

ML in Uber, some notes about UberEATS

Machine learning infra + some recommendations

UberEats = React Native + Redux

Tech Stack

Languages, Observability, infra, logging, testing, reliability, data visualization in Uber.

Interesting OSS

http://uber.github.io/react-map-gl/#/Documentation/introduction/introduction

https://github.com/uber/horovod

https://github.com/uber/react-digraph

https://eng.uber.com/chameleon/

CMS that powers Uber and allows high customization => Microsite concept (i.e. /es-MX/driver is a microsite)

https://eng.uber.com/argos/

> Furthermore, at Uber, any service can in principle call any other service

> Uber has an advanced outlier detection algorithm that might be used elsewhere

https://eng.uber.com/tech-stack-part-two/

Uber business logic features

  • Uber computes price during the ride
  • Uber has a “marketplace” with many-to-many connections (drivers to customers)
