Conquering the Microservices Dependency Hell at Postman, with Postman — (Part 2)
In the first article in this series on Microservices at Postman, I talked about the situation we found ourselves in in early 2018. I highlighted the challenges from both a technology and an organizational perspective. In this article, I will discuss the organizational practices we put in place that helped us overcome microservice dependency hell.
In the book Linux Annoyances for Geeks, author Michael Jang writes:
“Dependency hell is a colloquial term for the frustration of some software users who have installed software packages which have dependencies on specific versions of other software packages.”
Sharing commonly used code in the form of libraries is one of the key aspects of software development. We see it everywhere, in all popular programming languages. Ease of sharing code libraries, combined with easy discovery, builds ecosystems out of programming language communities. NodeJS’s npm and Python’s PyPI are perfect examples of such ecosystems.
However, as a package, module or library increases in adoption, it gets bundled into more and more packages. While this is a nice thing to have, the flip side is that multiple levels of dependencies build up as these packages are in turn used by others, and those by others still. We soon end up with a spaghetti of dependencies if we are not careful.
A similar situation manifests in the world of microservices. If you build or maintain a sufficiently large list of services, be it as microservices or any other flavor of service-oriented architecture, chances are you are familiar with such multi-level dependency chains across services in your product.
If you thought npm dependencies looked difficult, consider that microservice dependencies bring in all the challenges of distributed computing. Abstracted from the software package dependency world, the idea of dependency hell can be catastrophic when coupled with the costs of distributed computing.
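To make the idea of multi-level dependency chains concrete, here is a minimal Python sketch that walks a service dependency graph and reports everything a given service relies on, directly or transitively. The service names and the `DEPENDENCIES` map are hypothetical and made up for illustration; they are not Postman's actual services.

```python
from collections import deque

# Hypothetical service graph: each key calls the services in its list.
DEPENDENCIES = {
    "web-app": ["sync", "identity"],
    "sync": ["identity", "storage"],
    "identity": ["storage"],
    "storage": [],
}


def transitive_dependencies(service, graph):
    """Return {dependency: depth} for everything `service` relies on.

    Depth 1 is a direct dependency, depth 2 is a dependency of a
    dependency, and so on. Breadth-first search yields the shortest depth.
    """
    depths = {}
    queue = deque((dep, 1) for dep in graph.get(service, []))
    while queue:
        name, depth = queue.popleft()
        if name in depths or name == service:
            continue  # already recorded, or a cycle back to the start
        depths[name] = depth
        queue.extend((dep, depth + 1) for dep in graph.get(name, []))
    return depths
```

In this toy graph, `web-app` never calls `storage` directly, yet an outage in `storage` still reaches it through both `sync` and `identity`. That hidden reach is what makes such chains painful once network calls are involved.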
Not stepping on the traps
While I was reading up on building a reliable and scalable microservices stack at Postman, I came across the term “dependency hell” in a great talk by Michael Bryzek, co-founder and ex-CTO of Gilt, titled “Microservices and the Art of Taming the Dependency Hell Monster”. Bryzek talks about how they built processes to ensure that Gilt could be free of unwanted costs while accommodating operations at scale.
Since then I have talked extensively about microservice practices at our internal tech-talks, at different conference panels and in customer conversations.
A recurring theme I have seen among companies new to microservices is the propensity for strong coupling among services. We faced this as well, as I mentioned in the first part of the series. The coupling can take the form of services sharing common data stores, services depending strongly on some core or central service, or, the more elusive variant, domain logic spilling over service boundaries.
Ben Christensen aptly termed it a “distributed monolith”. When built without careful considerations about bounded contexts, clearly defined service boundaries or isolated data stores, a microservice-based system emulates all the characteristics of a monolith; except now it also has added costs of distributed computing.
There is no industry consensus on what microservices are and, as Sam Newman puts it, they are just an opinionated way of building a service-oriented architecture. Given all the challenges that come with building a reasonably sized microservices stack, I am often asked why we chose to take this route with Postman at all and how we managed to do it.
Microservices at Postman
We started building out our backend services back in 2014. There were four of us at that time — 3 of the co-founders of Postman and our VP of Engineering. Postman was already popular enough at that time. We had 650K users, across almost every country in the world.
Microservices were the blockchain of 2014. Everyone was doing it. We wanted to build out a stack that could support the growing demand of this growing community. Despite the pitfalls that we understood at that point, we decided to adopt the microservices model. There was enough literature and there were enough case studies available for us to base our decision on.
Benefits and cost
The benefits of microservices appealed to the engineers in us who loved abstractions and forward thinking architectures. My understanding of the benefits of choosing microservices was strongly influenced by Martin Fowler’s document on “Microservice Trade-Offs”. The three benefits that the document talks about resonated strongly with our situation:
- Strong modular boundaries: The practice of grouping logic by function into services meant we could build software where modules were decoupled from each other. The service boundaries would also ensure that engineers did not need a detailed picture of the whole system. They could focus only on the parts of the system they had expertise in. We saw this as essential to ensuring our teams could grow and still work in cohesion.
- Independent deployments: We wanted to implement high-quality continuous delivery practices. Compared to complex monoliths, simple and autonomous services make it easier to build delivery automation pipelines. These services, if architected properly, can be deployed independently of each other. This helps in reducing the time taken to fix critical bugs.
- Technological diversity: Independent services allow us to mix and match and use programming languages, tools, and frameworks that best suit each service, leading to a richer, polyglot technology stack.
As we started building with these benefits in mind, we soon saw the costs creep in; the same costs that are mentioned in Microservice Trade-Offs:
- Distribution: Orchestration of other services was now being done by certain “core” services. They pretty much became “core” because they existed before the other services, not because orchestration should have been their responsibility.
- Eventual Consistency: Microservices introduced eventual consistency issues because of the insistence on decentralized data management. This showed up when multiple resources needed to update for a single business action to take place and everyone had to be consistently aware of resource transactions across services. This fed into the problem of context and hiring: only people with more context (which could be attained literally by having been there) were able to further build the systems and debug problems.
- Operational Complexity: Over time, we had built 20+ services with 5 primary developers and no developer operations. Each developer had to be aware of the existence of other services and be very careful of deployments at a particular time in the day or the week. The complexity of system design shifted to the complexity of managing interconnection between services.
The Decision Quagmire
Post Workspaces, as I thought more about this quagmire of dependency hell we found ourselves in, I realized that part of the problem boiled down to the many decisions we needed to make in a microservices architecture.
Here is the thing about humans when it comes to making decisions: we are simply not good at it without a proper guide to drive those decisions. In the absence of a consistent framework, decisions made even by the same people in the same situation over time may be slightly, if not distinctly, different.
These challenges are less prominent when building a monolithic system. You would usually make one or two major choices in a year. You would have one technology stack with, maybe, one type of persistence store.
But when it comes to microservices, we have to make every decision from scratch for each microservice, which can be very exhausting for a team, as we soon realized. Over time, we ended up with different decisions in similar situations, which brought in a whole lot of inconsistencies.
To handle the decision-making process, I created a (rather long) list of all the key decisions we were making for each microservice and kept it on my desk. The following chart is a subset of that decision tree, in no particular order:
Our developer advocate, Paul, has previously written about Conway’s Law and Inverse Conway Maneuver. That article draws direct influence from how the Postman team dealt with the organization challenges of building the microservices stack.
Looking through that lens, we soon realized that, given the way our organization was structured back then, we could not achieve a sustainable technical architecture without isomorphically considering and fixing the organizational structure around it. So, we sat down to rethink how we could do things better.
After much deliberation, keeping in mind our current constraints and how we got there, we came up with the following:
Product — Service — Platform
- Products, by definition, are what our end-consumers directly interact with.
- Products are built on Services.
- Services interface with the Products and with each other.
- Services are built on Platform components that are created and administered by the Platforms team.
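One way to read the list above is as a set of allowed dependency directions between layers. The following is a minimal sketch of that reading, not a description of any tooling Postman uses. The component names are hypothetical, and the rules that Products may not reach Platform components directly and that Platform components depend on nothing above them are assumptions added for the example.

```python
# Allowed dependency directions, read off the list above.
# Assumption: products do not call platform components directly,
# and platform components do not depend on services or products.
ALLOWED = {
    "product": {"service"},
    "service": {"service", "platform"},
    "platform": set(),
}


def layering_violations(layers, calls):
    """Return the (caller, callee) pairs that break the layering rules.

    `layers` maps a component name to its layer; `calls` is a list of
    (caller, callee) pairs describing who depends on whom.
    """
    return [
        (caller, callee)
        for caller, callee in calls
        if layers[callee] not in ALLOWED[layers[caller]]
    ]


# Hypothetical components, for illustration only.
LAYERS = {
    "desktop-app": "product",
    "sync": "service",
    "identity": "service",
    "queue": "platform",
}
CALLS = [
    ("desktop-app", "sync"),
    ("sync", "identity"),
    ("identity", "queue"),
    ("desktop-app", "queue"),  # product skipping the service layer
]
```

Here `layering_violations(LAYERS, CALLS)` flags only the last pair, where a product bypasses the service layer and talks to a platform component directly.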
Modified organizational structure
- Core: Improving the performance, stability and long-term roadmap of the app — core request sending flow.
- Cloud: Focused on executing Cloud features.
- Enterprise: Focused on features for Enterprise users.
- Growth: Focused on Growth, designed for quick iteration, A/B testing, fail fast, learn and iterate.
- Ops: Focused on building systems that help the rest of the product team build and iterate quickly.
- Core Services
- Cloud Services
Leveling Up as an Organization
With these in place, we identified the following goals for the Engineering function to level up:
- Define the processes that allow products and services to become efficient, autonomous and of a high quality.
- Through Platforms, create a shared infrastructure that supports microservice developers and outlines clear guidelines for introducing new services.
- Use tools like SLAs (more on that later) to be explicit about the expectations of the producers and consumers of these services, which will allow each team to move forward independently.
- Align individual growth and company goals through clear ownership and accountability.
With the Inverse Conway Maneuver in our heads and thoughts of an organization design brewing, I dusted off my old copy of “Building Microservices”. The Principles of Microservices that Sam Newman lays down in the book have served as guiding principles for all the decisions we made to get out of the aforementioned dependency hell.
Under this new work-in-progress org structure, we decided to bucket responsibilities for each of the following principles under the Services and Platforms teams:
- Modeled around Business Domain
- Hide Implementation Details
- Deploying Independently
- Culture of Automation
- Decentralize all the Things
- Isolate Failure
- Highly Observable
Consumer Driven Contracts at Postman
As part of implementing everything that the Principles of Microservices talk about, we found that consumer-driven contracts were one of the biggest drivers in conquering our dependency hell. I recommend reading the article on consumer-driven contract testing using Postman for more detail on that topic.
I think of Consumer-driven contracts as the representation of the consumer’s understanding of your API and a mapping of their consumption behavior. Once the designed API is shared with the consumers, the consumers define their contracts. These contracts are reviewed by the API designer and upon approval added to the list of contracts to be executed.
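As a rough illustration of the idea, the sketch below models each consumer's contract as the set of response fields it reads and the type it expects for each, then checks a provider response against every approved contract. This is a plain-Python stand-in and not how the contracts are written at Postman. The consumer names, field names and the `broken_contracts` helper are all hypothetical.

```python
# Hypothetical contracts: each consumer lists the response fields it
# reads and the type it expects for each one.
CONTRACTS = {
    "billing": {"id": str, "plan": str},
    "dashboard": {"id": str, "team_size": int},
}


def broken_contracts(response, contracts):
    """Return {consumer: [problems]} for every contract the response breaks."""
    failures = {}
    for consumer, expected in contracts.items():
        problems = []
        for field, field_type in expected.items():
            if field not in response:
                problems.append(f"missing field '{field}'")
            elif not isinstance(response[field], field_type):
                problems.append(f"'{field}' is not {field_type.__name__}")
        if problems:
            failures[consumer] = problems
    return failures
```

If the provider changes `team_size` from a number to a string, only the `dashboard` contract fails, which tells the API designer exactly which consumer the change would break before it ships. Fields that no consumer has declared can change freely.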
We execute contracts in 3 scenarios:
- Execution of a CI/CD pipeline: Newman and Docker Container
- Deployment event: On beta and staging environments. This includes serverless setups and Postman Monitors along with cloud APIs
- Periodic monitors: Running contract tests on monitors once a day to test for contract breakage over time.
This is how we built our whole strategy for conquering microservice dependency hell. I talked about the decision components that we put into action. In the next and final installment of this series, I will discuss how we work as a team right now, how we are implementing a growth framework, and how we are climbing the ladder of the service maturity model.
In the end, conquering dependency hell is as much about the people as it is about the software they build.
- We use NodeJS heavily at Postman, in our services and in our apps. The way our engineering team manages package dependencies and ensures no vulnerabilities creep in deserves a long technical piece of its own.