An Engineering Manager’s Case Study in Org Design

Brendan Cone
9 min read · Sep 10, 2019

Anarchist organizational chart
Source: 2010, Tribune Media Services

When I first became a director, I understood that I needed to change my mindset and adapt as the gaps between my past experience and my new role became apparent.

What I quickly found was that one of the most obvious differences between my old role and my new one was, ironically, the self-directed nature of the role itself. There was no longer someone to ask for answers to many of my questions, because each director’s problem set appeared, on the surface, to be entirely different from the next. I started to understand why people in leadership positions tend to read a lot: everybody is looking for answers to unanswerable questions.

I also found that, with this role more than any other in my past, experiential learning was exceedingly valuable. There is no Stack Overflow that will tell you how you should form your teams. I couldn’t copy-paste code into my IDE and change some variable names anymore. Decisions made inside of one director’s organization did not necessarily make sense within another’s. Understanding that I had a lot to learn, I gleaned as much as I could from more experienced leadership, but many of the questions that I had could only be answered with “it depends on the situation”.

Org Design as an Unanswerable Question

One of my responsibilities that illustrates just how situational director-level problem sets can be is org design.

In grossly simplistic terms, org design can be summarized as the act of organizing people and defining roles to optimize their productivity and improve their adaptability to changing business requirements.

To be honest, until I was responsible for it, I hadn’t really thought all that much about org design as an interesting problem. However, I quickly came to realize that it is one of the more important aspects of leadership because it is often both the cause of and the solution to long-term dissatisfaction among team members.

Naturally, in trying to understand org design better, I read a lot. I had heard about the famous Spotify Model and read up on the history of organizational structure to understand how the currently accepted norms came to be. I had also read that it would be foolish to try to lift these models and apply them to our situation without paying mind to the real problems that caused their formation in the first place. These companies arrived at their existing structures organically, by identifying issues and adapting, and by feeling the strain in their previous designs as the company and its problem set grew. If they had tried to adopt their current organizational structure when they were 50 people, it likely would have introduced more problems than it solved (and they’d likely have had more defined roles than people).

Not every company is at the size and scale of Spotify, and a smaller engineering organization certainly should not be matrixed ten times over when it only has ten engineers to begin with. This is what made me realize that org design is a craft not all that different from programming: there are many different solutions to the same problem. Some org designs are more elegant than others, some are more performant and optimized than others, some are over-engineered, and every single one requires continued iteration to be successful.

A Hypothetical (But Very Real) Example

I don’t like this puzzle piece analogy because it implies that there is only one solution to the problem.

To articulate this point further, let’s imagine a hypothetical scenario where we are running a business of 25 people and trying to achieve the common goal of producing and iterating upon a site-builder product that allows non-technical people to build websites easily.

Of the 25 people, many have specialized skill sets — some are great at full-stack development, some UX design, some quality assurance, some managing infrastructure, some project management and still others are experienced at product management. We also have a senior leadership team that has visibility into the market factors that might impact the company in the long-term, and a view of the competitive landscape.

We could organize our teams in this hypothetical example in many different ways.

Iteration 1: Brute Force

The “brute force” method might be to have everyone be part of the same team. The benefit here is that everybody will have full context on all of the goings-on in the market and will understand the product well, so the team will likely be very well-aligned. The downside is that everyone will likely be overwhelmed by the amount of information they have access to, and they probably won’t be able to perform their specialized, primary function for the majority of their working hours, which makes this design sub-optimal.

Iteration 2: Organize by Function

We could have another potential design that organizes people around function: product managers on one team, quality assurance on another team, developers on another team, and so on.

One benefit here is that people have clear ownership and understanding of their role (and the boundaries around their role) in the delivery pipeline. The other major benefit is that the teams aren’t bombarded with information that is largely irrelevant to them, and they’re able to focus on their core competency for more of their working hours.

One downside is that development in this way can make for slow and clunky delivery cycles. It also lends itself to waterfall-style planning, which may not be so great for a small company such as ours, especially if we haven’t yet found our product-market fit.

It’s also likely that we will have many simultaneous projects, bug fixes and feature improvements on the go since there is only one development team, which could lead to problems of coordination, scheduling and context switching.

Finally, the teams will likely communicate mostly in completed artifacts (a sprint’s worth of code, a product requirements document), depriving teams of the why and other important context as work moves along the delivery pipeline, and robbing them of the ability to contribute cross-functionally. While there are also benefits to communicating in artifacts (namely, that the work is self-documenting), there seems to be proof time and again that getting the engineering team closer to the problem being solved pays dividends, so let’s explore that as an option.

Iteration 3: A Bunch of KPI-Driven Startup-inside-a-Startups

We could have our teams organized by “mission” — each team will be cross-functional and have product owners, full-stack developers, QA, and scrum masters.

One team might be focused solely on marketing & user acquisition and another might focus on user engagement based on customer feedback, as examples of what missions might look like.

Each team has good alignment with their goals and success metrics, and the product vision for their mission should be clear to everyone in this structure. They also have a very efficient pipeline to get things delivered since everyone necessary to do so is working on the same problem at the same time.

We’ve also brought the engineering teams closer to the problem being solved, allowing for quick ideation and iteration, and likely more successful product deliveries.

However, ownership of systems becomes less clear — who fixes a Sev1 on a system that both teams work on? Who makes sure that a technical roadmap exists for our largest and most complex systems so they don’t become a tangled mess of spaghetti code? How do we handle integration between both teams’ code into the same codebase from a quality perspective?

As a consequence of this structure, we end up with less resourcing flexibility, because each team now likely has individuals with a specialized skill set that no one else possesses. As team members go on vacation or leave the company, we could end up in a tricky situation pretty quickly and halt progress on missions entirely. There is also a whole slew of other challenges with this design, as I’ve highlighted in another three-part article.

Iteration 4: A Hybrid Model

Let’s try to solve the system ownership problem from the previous iteration. Perhaps we keep the mission-based teams, but we add a couple of specialized teams to work on larger, more central systems and packages that benefit the production capacity of many teams at once. Any technical work that must be done on a large, complex system is routed through the system-based team that owns it.

This org structure seems to form organically in many cases. The concept of a “platform team”, often called just that, seems to end up existing out of necessity when a company reaches a certain size (or so I’ve seen, anecdotally).

This could land us on a potential “happy medium” between communication overhead and delivery speed, but releases will now need to be coordinated in some way, so we’ll probably need to introduce disciplined cross-team project management (and potentially another role at the company if that does not yet exist).

Iteration 5: Skillset Diversity

We might be able to solve some of the problems we’ve identified with the above designs by deliberately expanding the skillsets of all of our developers, QAs and SREs in an attempt to make them less specialized and able to take on a wider variety of tasks (“T-Shaped Roles” in industry speak). This requires quite a bit of upfront investment, and will certainly slow progress to begin with, which may not be ideal if our company is small.

However, over a longer time horizon, it should help to create better engineers, give more resourcing options, speed up delivery, improve broader understanding of systems, and reduce the bus factor at the company. We must also consider that this may directly conflict with people’s career development goals, so we may end up introducing risk on that front.
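Since I keep comparing org design to programming, here is a minimal Python sketch of the bus factor idea. Everything in it is an assumption made up for illustration: the roster, the names, the skill labels, and the helper functions are all hypothetical, and counting how many people hold each skill is only a crude proxy for real bus factor risk.

```python
from collections import defaultdict

# Hypothetical roster: the names and skills are invented for illustration.
ROSTER = {
    "Ana": {"full-stack"},
    "Ben": {"full-stack"},
    "Chloe": {"qa"},
    "Dev": {"infrastructure"},
    "Elif": {"ux"},
}


def skill_coverage(roster):
    """Map each skill to the sorted list of people who hold it."""
    coverage = defaultdict(list)
    for person, skills in roster.items():
        for skill in skills:
            coverage[skill].append(person)
    return {skill: sorted(people) for skill, people in coverage.items()}


def single_points_of_failure(roster):
    """Return {skill: person} for skills held by exactly one person."""
    return {
        skill: people[0]
        for skill, people in skill_coverage(roster).items()
        if len(people) == 1
    }


def cross_train(roster, person, skill):
    """Return a new roster in which `person` has also learned `skill`."""
    updated = {name: set(skills) for name, skills in roster.items()}
    updated[person].add(skill)
    return updated


if __name__ == "__main__":
    print("Before:", sorted(single_points_of_failure(ROSTER)))
    after = cross_train(ROSTER, "Ana", "qa")
    print("After: ", sorted(single_points_of_failure(after)))
```

In this made-up roster, teaching one full-stack developer some QA removes QA from the list of skills that only one person holds, which is the kind of trade an investment in T-shaped roles is trying to make.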

The Real Answer — Everyone is Right

So, as you can see, there are many ways to organize those 25 people. But what if we had a company of 6 people? What about 100 people? What about 2000 people?

In some cases, we may even revert to another org structure organically when in crisis mode. For example, a war room is essentially a re-organization of our team back into the first “brute force” iteration to attack a serious and time-sensitive issue because it gives us the benefit of full context and information at the cost of short-term chaos.

It becomes clear that each of these org designs is suited to a different scale and problem set, depending on the pros and cons of the design and the resources at hand.

The REAL, Real Answer: Empathy and Execution

Sympathy vs. Empathy
Source: Grammarly

The last point that I would like to make, and probably the most important thing that I’ve learned on this subject, is that everything I’ve written here up to this point is theoretical. While the above hypothetical example is based on real experience, I’ve made assumptions on how things should work, and what problems we might come across.

We haven’t talked at all about the people actually living through the change.

As soon as we distort what someone’s role is, what the company expects of them, or who they work with, it takes time and iteration to get it right, and it is inherently frustrating and uncomfortable for them (as I’m sure we’ve all experienced at one point or another in our careers). In working through the change management process, we have to be compassionate and listen. We must be transparent, and always provide the rationale behind the org change up-front. Show the teams that we are actually solving a problem, and not just making a change for the hell of it.

We also have to be humble as leaders, and accept that sometimes an idea that should work in theory, or worked out great in our minds, is a total catastrophe in practice. We need to have an established channel through which we can frequently gather, distill and act upon feedback. We also have to be willing to change our assumptions and iterate on our initial idea based on real-world experience — it’s simply not enough to regurgitate something we’ve read, expect it to work, and then be staunch in the face of conflicting evidence.

Also important is that organizational change is a natural part of software engineering, and of business in general. In our hypothetical example, each of the org designs seemed to make sense at some stage in a company’s growth to help it get to the next stage. For this reason, it is critically important that we build company cultures that welcome change with an open mind and with open arms. Even so, if we attempt to make broad sweeping changes too frequently, we’ll quickly end up with dissatisfied team members who don’t understand their place in the organization.

The best thing we can do is empathize with our team members and understand their pain (and if possible, try to truly live it) so that we can arrive over time at a solution that satisfies both the people and the business.

Brendan Cone

I’m a generally optimistic engineering manager who loves talking about engineering, management, music of all sorts, and a whole lot of other stuff.