Managing Architectural Evolution While in Hypergrowth

Published in SSENSE-TECH · 6 min read · May 23, 2019

Introduction

When I joined SSENSE five years ago, we were a company of fewer than 150 employees in total. Today, we are more than 600 strong. I witnessed our technology department grow from a small team of 25 to the current team of over 150. This phase of hypergrowth has felt like a paradigm shift within the technology department. Among other things, the newfound intellectual firepower added by fresh software engineering recruits gave us the ability to accomplish something we had been hoping to do for years: migrate our monolithic system to a microservices architecture.

These developments, however, also introduced growing pains that we did not initially anticipate. Implementing a full architectural overhaul while managing and structuring teams in a state of hypergrowth proved to be both incredibly promising and incredibly challenging. This article reflects on the difficult decisions we faced and on how we managed our migration to microservices while ensuring that the pace of growth did not undermine the department’s structural integrity.

Balancing Technological Diversity and Maintainability

One key advantage of a microservices architecture is that when properly decoupled, each service can be implemented with different programming languages and deployed on different infrastructure. As long as all services conform to a standardized communication protocol, in our case REST, and provide mutually intelligible contracts, services with completely different back-ends should have no issues functioning collectively. While this introduces room for infinite technological diversity, such that each service can use a tech stack that is perfectly optimized for its function, we have to consider the impact on maintainability.
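As a rough illustration of what a "mutually intelligible contract" can mean in practice, the sketch below shows a consumer validating a JSON response body against an agreed shape, regardless of the language the producing service is written in. The `ProductSummary` type, its field names, and the `parse_product_summary` helper are hypothetical and chosen only for the example; they are not taken from SSENSE's services.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductSummary:
    """Hypothetical contract for a REST response body (illustrative fields only)."""
    sku: str
    price_cents: int
    currency: str


REQUIRED_FIELDS = {"sku", "price_cents", "currency"}


def parse_product_summary(body: str) -> ProductSummary:
    """Parse a JSON body and reject it if it does not match the contract."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("response body must be a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"response violates contract, missing: {sorted(missing)}")
    if not isinstance(data["price_cents"], int):
        raise ValueError("price_cents must be an integer")
    return ProductSummary(str(data["sku"]), data["price_cents"], str(data["currency"]))


# The producer could be written in PHP, Java, or anything else;
# the consumer only depends on the shape of the payload.
summary = parse_product_summary('{"sku": "A-1", "price_cents": 12500, "currency": "CAD"}')
```

The point of the sketch is that the consumer depends only on the payload shape, which is what leaves each team free to pick its own stack behind the interface.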

Considerations around this trade-off have been a key factor in the evolution of the technology department’s long-term strategy. Initially, we used only PHP and JavaScript; we eventually introduced four new languages, each specialized for a particular service’s niche: TypeScript, Python, Java, and Swift. Our experience in doing so taught us to carefully evaluate the business case for adopting a new language, along with its long-term consequences. Languages adopted by small pockets of developers tend to isolate the adopters from the rest of the department. This leads to attrition, higher maintenance costs, and low morale. To avoid such situations, we have become much more conservative in supporting new languages and technologies in general. Moreover, these learnings led us to put in place an Architectural Approval Process, which takes place before any major decision is greenlit.

The Architectural Approval Process

During the Architectural Approval Process, the Tech Lead of the team proposing a major change is asked to explain their decision and present its pros and cons. If a Senior Developer is acting as the Tech Lead, they take on the role of presenting the project.

The goal of the process is not only to audit such decisions, but also to reinforce the notion that every project is owned by its Tech Lead, who in turn is responsible for its long-term impact. Having Tech Leads take ownership of their projects has proven to be a major asset in supporting our pace of growth. However, the responsibility for ensuring that every team’s architectural decisions align with the company’s technological initiatives at large rests on the shoulders of the reviewers.

The process is divided into three sessions. The objective and procedure of each session are defined and documented internally to ensure consistency.

During each session, the Tech Lead presents their proposal to a small group composed of members of the Architecture Team, the product or team manager if needed, and at least one Staff Developer. Staff Developers are currently the most senior developers in our organization. This professional diversity allows multiple opinions and experiences to inform the review. The three sessions are as follows:

The Whiteboard Session

The main objective of this session is to develop a comprehensive understanding of the business case driving the changes that the Tech Lead is proposing to implement. During this session, the Tech Lead must be able to explain the business case, outline the architectural overview of the proposal, and identify dependencies of the service that may be affected.

This session also presents an opportunity to reinforce the technical direction of the department, and ensure that the Tech Lead, Product Manager, and Architecture Team remain aligned.

After the whiteboard session, the Tech Lead has the option to create a Request For Comments (RFC) and submit it to the rest of the department. Once approved, the RFC is used to create the documentation required for the next session. The RFC is not mandatory, but it is usually recommended, as it fosters transparency within the organization and gives peers across the department an opportunity to share their input.

The Design Approval Session

Prior to this second session, the Tech Lead must provide the aforementioned document to the review committee, giving them a chance to read the proposal and submit their comments and questions. We do this before the session to avoid unnecessary confrontation. It also gives us a chance to postpone a session if the document is incomplete. While reviewing the document, the committee deliberately treats it as their first exposure to the proposal and does not let insights from the previous whiteboard session affect their analysis. This ensures that the document is sufficient on its own to provide a comprehensive overview of the proposed project.

This session tends to be the most complex and thorough. This is where the architecture, the technical choices, and the details of important design decisions are challenged. We are aware that this session can be very demanding for Tech Leads. They are tasked with defending disruptive technical decisions in front of a committee that is deeply familiar with the systems being disrupted, and that is usually composed of people who have worked on those systems for several years. Being cognisant of this, we emphasize that the committee should treat the session, first and foremost, as an opportunity to educate themselves about the proposed changes. The emphasis is not on confronting the presenter, but on understanding the rationale behind the proposal and its implementation.

Unless the committee finds fundamental flaws, such as the proposed solution creating further long-term complications, proposals are almost always accepted. Any remaining questions and action items are documented so that participants can follow up on them. If this session goes smoothly, the Tech Lead and the team in question are given the green light to begin implementing their updates.

The Pre-Release Approval Session

This last session allows the committee to revisit the prior discussions and review what has been built in that context. Unless there has been a major change in direction, we use this session as an opportunity to re-align with the department’s overall strategy and validate the service’s integration. On rare occasions, depending on the complexity of the code, the committee might choose to review the code itself.

The focus of this session is on the planned integration with various systems that are external to the service, and the composition of messages it is expected to produce for our events stream. The documentation for this session includes a list of checkpoints to cover all such external integrations. If the service in question is public-facing, we also validate its ability to support peak traffic.
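The three sessions described above can be read as a simple linear workflow. The sketch below is one way to model that sequence; the `Session` enum and the `next_session` helper are hypothetical names, and the rule that an unapproved session is held again is an assumption made for the example (the article only states that a session may be postponed).

```python
from enum import Enum
from typing import Optional


class Session(Enum):
    WHITEBOARD = "whiteboard"
    DESIGN_APPROVAL = "design approval"
    PRE_RELEASE_APPROVAL = "pre-release approval"


# Sessions in the order the article presents them.
ORDER = [Session.WHITEBOARD, Session.DESIGN_APPROVAL, Session.PRE_RELEASE_APPROVAL]


def next_session(current: Session, approved: bool) -> Optional[Session]:
    """Return the session to hold next, or None once the process is complete.

    Assumption: a session that does not end in approval is held again.
    """
    if not approved:
        return current
    index = ORDER.index(current)
    return ORDER[index + 1] if index + 1 < len(ORDER) else None


# Walk a proposal through the process when every session is approved.
step: Optional[Session] = Session.WHITEBOARD
visited = []
while step is not None:
    visited.append(step.value)
    step = next_session(step, approved=True)
```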

Conclusion

If you plan on using this article as a point of reference to set up similar processes in your organization, be mindful of the nuances that your organization’s structure, culture, and goals might introduce. There is no one-size-fits-all solution to the problem of managing architectural evolution in a rapidly scaling organization. Having said that, our approach for the technology department at SSENSE has improved both the technical quality of our microservices, and our ability to clearly outline and conform to a long-term strategy. This process helps bridge the knowledge gap between developers, architects, and managers. Moreover, the inherent transparency of the process encourages our Tech Leads to produce high-quality documentation and deliver well-built software. The main challenge with this process is the bureaucratic overhead it adds, which can slow down the development process.

To counteract this, we classify proposed updates into four tiers: T0, T1, T2, and T3, with T0 being the most significant and T3 the least. We only mandate the review process for T0-level updates, and leave the rest to the discretion of our managers. Surprisingly enough, managers have tended to nominate a large majority of updates to undergo the review process. So far this year, the review process has covered 90% of all new projects. We interpret this as a sign that we must be doing something right.
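The tiering rule can be expressed in a few lines. This is a minimal sketch, assuming the decision depends only on the tier and on whether a manager has nominated the update; the `Tier` enum and the `requires_review` function are hypothetical names, not part of any internal tooling described here.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Update tiers: T0 is the most significant, T3 the least."""
    T0 = 0
    T1 = 1
    T2 = 2
    T3 = 3


def requires_review(tier: Tier, manager_nominated: bool = False) -> bool:
    """T0 updates are always reviewed; other tiers only when a manager opts in."""
    return tier == Tier.T0 or manager_nominated


# A T0 update is reviewed regardless of nomination;
# a T2 update is reviewed only if its manager nominates it.
mandatory = requires_review(Tier.T0)
optional_skipped = requires_review(Tier.T2)
optional_nominated = requires_review(Tier.T2, manager_nominated=True)
```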

Editorial reviews by Deanna Chow, Liela Touré & Prateek Sanyal
