Building Microservices (Part 2)
Reading notes on Building Microservices by Sam Newman
Chapter 4 Integration
As mentioned, how the black boxes communicate with each other is the essence of building microservices. Let’s talk about it now.
Which approaches/protocols to choose
We are looking for the following criteria:
- Adding new fields won’t break the communication
- Not bounded to use a certain language / tech stack
- Allow each box to hide its implementation details (so that its changes won’t affect its consumers)
- Simple to use. Consumers shouldn’t need to install too many things
Existing solutions — Shared Database
Each box is allowed to read/write the same database. This is bad because
- Each service’s logic is exposed to others through its schema design
- Changing the business logic/CRUD rules in one service will affect the others (i.e. consumers need to follow)
Avoid at (nearly) all costs.
So how do we go about it?
For example, when a customer enrols for something, the following tasks need to be done, each by a different service:
- Create loyalty card
- Dispatch welcome pack by post
- Send welcome email
There are 2 ways to coordinate the above services to work together.
Orchestrator (Sync) vs Pub-Sub (Async) Coordinations
Orchestrator Approach — One service acts as “the brain” which calls the other services to accomplish the tasks. In this example, say the customer service calls the membership service, post service, and email service in turn. It’s like a “push” approach: each service is pushed to do something on demand.
Pub-Sub Approach — One service emits an event; other services listen for it and act upon it. In this example, the customer service emits a “Customer created” event. The other 3 services act accordingly once they see this event. It’s like a “pull” approach: each service pulls up things to do.
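The two coordination styles can be sketched side by side (a minimal Python sketch, not from the book; all service functions and event names are hypothetical stand-ins):

```python
# Hypothetical sketch contrasting the two coordination styles.
from collections import defaultdict

# Stand-ins for the three downstream services.
def create_loyalty_card(customer): return f"card for {customer}"
def dispatch_welcome_pack(customer): return f"pack for {customer}"
def send_welcome_email(customer): return f"email for {customer}"

# --- Orchestrator (sync): "the brain" pushes work to each service in turn ---
def enrol_customer_orchestrated(customer):
    return [create_loyalty_card(customer),
            dispatch_welcome_pack(customer),
            send_welcome_email(customer)]

# --- Pub-Sub (async): the customer service only emits an event ---
subscribers = defaultdict(list)

def subscribe(event_name, handler):
    subscribers[event_name].append(handler)

def publish(event_name, payload):
    # Each interested service pulls the work it cares about.
    return [handler(payload) for handler in subscribers[event_name]]

subscribe("customer_created", create_loyalty_card)
subscribe("customer_created", dispatch_welcome_pack)
subscribe("customer_created", send_welcome_email)

def enrol_customer_pubsub(customer):
    return publish("customer_created", customer)
```

Note how the customer service in the pub-sub version knows nothing about who listens: adding a fourth task means adding a subscriber, not changing the emitter.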
Orchestrator pros & cons — Easy to reason about, one place to check the business logic explicitly. But experience shows that this approach could easily evolve into having a “god service” calling a bunch of dumb CRUD services, and the former becomes too large to manage/change.
Pub-Sub pros & cons — No “god service”, each service handles its part so it remains small and easy to manage/change. BUT the whole business flow becomes implicit, extra work is needed to monitor the event queue and track that each series of events happens as expected. It can be tricky to debug. Possible bugs:
- If a poison message (one that always fails) is put back on the queue, workers will die one by one as they pick it up
- Forgetting to set a max-retry limit for each task
- No way to replay/retry failed tasks manually/on-demand
- Mis-handling (or forgetting to handle) versioning of each message
For better debugging, use a correlation ID to trace requests across service boundaries.
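The correlation ID idea can be sketched as follows (hypothetical header name and handler; the point is that the same ID is reused or minted once, then attached to every log line and downstream call):

```python
# Hypothetical sketch: propagate one correlation ID across service boundaries.
import uuid

def log(correlation_id, message):
    # Every log line carries the ID, so logs from many services can be joined.
    print(f"[{correlation_id}] {message}")

def handle_incoming_request(headers):
    # Reuse the caller's correlation ID if present, otherwise mint a new one.
    correlation_id = headers.get("X-Correlation-ID") or str(uuid.uuid4())
    log(correlation_id, "order received")
    # These headers are attached to every downstream call/message we make.
    return {"X-Correlation-ID": correlation_id}
```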
Specific technologies
RPC — the syntax makes it easy for developers to forget that each call actually crosses a boundary, rather than being a plain class-method call. Developers tend to expose all the fields and “methods” in a “model”, instead of paying attention to encapsulation.
REST — the concept of a “resource” is useful: a consumer only needs to know which service owns which resources, then makes requests to the desired service. That service then decides whether to perform the corresponding lifecycle event. Be mindful not to bind “how a resource is represented to outside consumers” to “how things are stored in a certain schema inside”, so as to retain the flexibility to change the inner implementation without affecting the outside interface.
The above works in both a sync and an async manner. In the sync manner it works by request/response; in the async manner it works by registering a “callback”.
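The point about keeping the external representation separate from the internal schema can be sketched like this (all field names here are hypothetical):

```python
# Hypothetical sketch: the internal storage schema and the external REST
# representation are kept deliberately separate, so renaming an internal
# column does not break consumers.

# Internal storage detail (e.g. a database row) -- free to change.
INTERNAL_ROW = {"cust_id": 42, "fname": "Ada", "lname": "Lovelace"}

def to_resource(row):
    # The public representation is a stable, explicit contract.
    return {"id": row["cust_id"], "name": f"{row['fname']} {row['lname']}"}
```

If the team later splits the table or renames `fname`, only `to_resource` changes; consumers still see the same `id`/`name` shape.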
Message broker — for async communication. It handles
- subscriptions
- tracking which messages have been sent to which listeners
- ensuring only 1 of the workers inside the same service receives each message (competing consumers)
Be mindful not to put too much logic in the message broker, as vendors tend to keep adding functionality to it. Eventually the message broker becomes a “god service” where changes are hard to make. Keep the middleware dumb and leave all the smarts in the services.
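The “competing consumers” behaviour above can be sketched with an in-process queue as a toy stand-in for a real broker (names are hypothetical; a real broker such as RabbitMQ does this across processes):

```python
# Hypothetical sketch: several workers in the same service share one queue,
# so each message is processed exactly once, by whichever worker grabs it.
import queue
import threading

task_queue = queue.Queue()
results = []
results_lock = threading.Lock()

def worker(name):
    while True:
        try:
            msg = task_queue.get_nowait()
        except queue.Empty:
            return  # queue drained, worker exits
        with results_lock:
            results.append((name, msg))

# Publish 5 messages, then let 3 competing workers drain the queue.
for i in range(5):
    task_queue.put(f"msg-{i}")

threads = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# All 5 messages processed, none duplicated, spread across the 3 workers.
```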
Side note on “DRY”
Shared library — for example, create a shared library containing domain objects that are used by multiple services. BUT this couples the services too tightly. Changes in one domain object will affect the codebase of every service that depends on it.
It is OK to duplicate code — Although that looks dumb, this way all lifecycle events of a domain object are encapsulated inside only ONE service.
But for common utilities like logging, it is okay to have a shared library, as it does not contain any logic related to the lifecycle events of any domain object.
Client library — Although it provides ease of use to consumers, don’t leak any lifecycle logic into it. Having another team/open-source community that is not working on the service create that library can help.
Passing data around services
Once we have figured out how services are pushed to do jobs (or pull jobs themselves), we need to decide how to share the data required for each job. Take an e-commerce app as an example: when a customer completes an order, we need to send a confirmation email.
Include the data in the call / message — the order service packs information about the order, the customer name/email address, the title/content, etc. into a payload, and includes it in the call to the email service (orchestrator approach) or in the “Order created” event (pub-sub approach).
The above method does not consider freshness of data. What if the confirmation email task is queued, and in the meantime the customer name (or worse, the email address) changes?
Passing a reference — instead of passing the resource content, we pass its reference (e.g. a URI). So when the email service is ready to do the task, it unpacks the payload, finds the order detail URL and the customer detail URL, then fetches the information from the relevant services.
The above method could put a lot of load on the customer service/order service. To mitigate that, we could include the resource content, its reference, and a timestamp in the payload, and let the email service decide when to re-fetch.
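The content + reference + timestamp compromise can be sketched as follows (the payload shape, field names, and staleness threshold are all hypothetical):

```python
# Hypothetical sketch: the event carries a snapshot, a reference, and a
# timestamp; the consumer re-fetches only when the snapshot is too old.
import time

def build_event(order_id, customer):
    return {
        "order_id": order_id,
        "customer_snapshot": customer,                   # data as of emission
        "customer_ref": f"/customers/{customer['id']}",  # where to re-fetch
        "emitted_at": time.time(),
    }

def resolve_customer(event, fetch, max_age_seconds=300):
    age = time.time() - event["emitted_at"]
    if age > max_age_seconds:
        # Snapshot may be stale; pay the cost of one fetch to the owning service.
        return fetch(event["customer_ref"])
    return event["customer_snapshot"]
```

A fresh event is served from the snapshot (no extra load on the customer service); a long-queued event triggers a re-fetch.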
There’s no hard-and-fast rule here. Weigh the pros & cons and make the tradeoff.
Handle changes smoothly — Versioning
By keeping the “fields” flexible
- the communication protocol should not break after adding/removing fields (REST can handle this)
- readers can parse the response dynamically (XPath, GraphQL; nice to have)
Use semantic versioning
- MAJOR.MINOR.PATCH: bump MAJOR when a change breaks old consumers, MINOR for backward-compatible new functionality, PATCH for backward-compatible fixes
Coexisting versions
- maintain multiple endpoints (e.g. /v1/customers, /v2/customers)
- beware of the extra effort (i.e. changes need to be made on both versions’ codebase)
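Coexisting versions can be sketched as a routing table with one handler per version (handlers, paths, and response shapes are hypothetical):

```python
# Hypothetical sketch: /v1 and /v2 routed to separate handlers, so both
# versions coexist while consumers migrate.

def get_customer_v1(customer_id):
    return {"id": customer_id, "name": "Ada Lovelace"}

def get_customer_v2(customer_id):
    # v2 splits "name" into structured fields -- a breaking change for
    # v1 clients, hence the separate endpoint.
    return {"id": customer_id, "first_name": "Ada", "last_name": "Lovelace"}

ROUTES = {
    "/v1/customers": get_customer_v1,
    "/v2/customers": get_customer_v2,
}

def dispatch(path, customer_id):
    return ROUTES[path](customer_id)
```

The extra-effort warning above shows up here concretely: a bug fix to customer lookup may have to be applied to both handlers until /v1 is retired.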
Integrate with UI
- This layer changes quite frequently (Desktop app, web app, mobile app, who knows what’s next in future)
- the first boundary is between the “UI side” and the “service side” (which has all the services)
- beware of tight coupling due to chatty API calls between the UI side and each service
- can be de-coupled by allowing consumers to specify the required fields
- “backend-for-frontend” approach — can try this when there are too many services in server-side
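Letting consumers specify the required fields (in the spirit of GraphQL or sparse fieldsets) can be sketched as follows; the data and field names are hypothetical:

```python
# Hypothetical sketch: the consumer names the fields it needs, so the UI is
# not coupled to the full response shape, and the payload stays small.

CUSTOMER = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "address": "1 Example St",
}

def get_customer(fields):
    # Return only the requested subset of the resource.
    return {k: v for k, v in CUSTOMER.items() if k in fields}
```

A mobile screen that only shows a greeting can ask for `{"name"}` and never notices when `address` is restructured.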
backend-for-frontend
- For each UI (say we have a mobile app and a web app), create a “mobile app service” and a “web app service” on the server side
- each BFF is responsible for handling calls from its UI, then making the relevant calls to the other microservices. Kinda like one endpoint for each data-needing user interaction
- Beware of putting too much business logic in this layer — lifecycle event logic for each domain resource should remain in its own microservice
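A BFF endpoint might look like this sketch (the downstream services and response shapes are hypothetical; note the BFF only aggregates and trims, it holds no lifecycle logic):

```python
# Hypothetical sketch of a backend-for-frontend: one endpoint per user
# interaction, fanning out to downstream microservices and shaping the
# result for that specific UI.

# Stand-ins for calls to the real microservices.
def customer_service(customer_id):
    return {"id": customer_id, "name": "Ada"}

def order_service(customer_id):
    return [{"order_id": 9, "total": 42}]

def mobile_home_screen(customer_id):
    # The BFF aggregates and reshapes; it never creates/updates domain
    # resources itself -- that logic stays in the owning services.
    customer = customer_service(customer_id)
    orders = order_service(customer_id)
    return {"greeting": f"Hi {customer['name']}", "recent_orders": orders[:3]}
```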
Integrate with 3rd party SaaS
- one common example is CRM
- these CRMs usually have poor APIs (or none at all), weird communication protocols, strange endpoint logic, etc.
- facade them behind separate service(s), so the rest of our system only interacts with those facades
- try not to build the facade as one big scary “CRM service”. Instead, create facade services by identifying domain concepts (e.g. project service, employee service, etc.)
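A domain-shaped facade over a quirky vendor API could be sketched like this (everything here is hypothetical, including the vendor call and its field names):

```python
# Hypothetical sketch: a domain-shaped facade translates a clean interface
# into the vendor's awkward API, so the rest of the system never sees it.

def crm_vendor_call(endpoint, params):
    # Stand-in for the third-party CRM's quirky legacy API.
    return {"EMP_REC": {"EMP_NM": "Ada", "EMP_ID": params["eid"]}}

class EmployeeService:
    """Facade built around a domain concept (employees), not the whole CRM."""

    def get_employee(self, employee_id):
        raw = crm_vendor_call("/legacy/emp_lookup", {"eid": employee_id})
        rec = raw["EMP_REC"]
        # Translate vendor field names into our own domain representation.
        return {"id": rec["EMP_ID"], "name": rec["EMP_NM"]}
```

If the CRM is ever replaced, only the facades change; their consumers keep working against the same domain-shaped interface.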
Chapter 5 Splitting the Monolith
To be continued here