Your Platform Org needs a vision. Here’s how to get started.

Michael Galloway
Nov 27, 2022

I had just joined a platform org that was knee-deep in a backlog of tickets. Every week, we would huddle in a room, dim the lights, and review the latest issues that had come in from customers. The mood was always muted and somber.

As the leader of the org moved through the list, he would ask the standard question: “Who can take a look at this?” There was an inevitable sigh, but through a combination of a sense of duty and a true desire to help others, someone would always raise their hand. They would get assigned the ticket and would commit to finding the time, along with all the other tickets they were already working on, to investigate this new problem.

Week in and week out it was the same routine. We fixed issues, did our best to unblock customers, and slowly chipped away at problems. But no matter how hard we tried to get ahead, most weeks we felt like we were just keeping our heads above the water. It was like we were seagulls flying against the wind. We exerted a ton of effort, but we were not getting very far.

The support organization quagmire

Since the requests we got were usually tactical fixes, narrowly focused on small and immediate problems, there were no clear signals for how to get ahead. We tried grouping tickets by themes, product features, functionality, and so on, but we had limited success in identifying whether there was a deeper investment to be made.

Years later, I reflected on my time in that org and came to the following realization: Addressing support requests alone will help you optimize for a local maximum, but it won’t help you reach that next level of productivity.

Optimizing for a local maximum, from Metrics-Driven Design by Joshua Porter

The reason is simple. Like all humans, engineers quickly adapt to the conditions of their environment, even if that environment is inefficient, complicated, or cumbersome. They’ve long since gotten used to the pebbles in their shoes that limit how fast they can run. Sure, it bothered them when they first onboarded, but now their feet have adjusted, and they’ve accepted the speed limitations. Any support requests they send at that point will stay largely within the conditions of the environment. They will no longer question the status quo.

Support is an important part of a platform organization, but it’s the pebbles that keep us from scaling.

Pebbles are unexamined inefficiencies that hold back the productivity of not just one team, but of entire organizations. They were baked into the assumptions we made, when we first pieced together the platform, about how engineers work and therefore how we should support them. However, as the business scales, matures, and evolves, assumptions that were once valid can become pebbles that limit your ability to grow.

How we defined Platform Engineering, part 2

Continuing from my previous article on finding your purpose, this article dives into how we found some pebbles at Doma and how we developed a vision of a world without those impediments.

Developing a vision provides the “what” for your organization, whereas your purpose, defined earlier, answers the “why”.

Part 2: Finding pebbles

Start with this question: What do we need to achieve?

First, we needed to understand what matters most to our stakeholders. While we may have a lot of ideas on projects or initiatives to work on, it is vital that we start with what matters most to the business. Otherwise we will run the risk of spending lots of cycles on communicating why it matters, or worse, wasting effort on work that is not valued by leadership or our customers.

At Doma, when we dove into the research we did for defining our purpose, we came across the following business challenge:

We have a new vertical that must find product-market fit with real-estate agents and homebuyers.

Based on this, we knew there was going to be a strong emphasis on iterating quickly on features and service improvements, and less on alternatives like burning down tech debt or scaling out. We also knew that our business environment is a highly regulated one (proptech / fintech), so compliance and security concerns needed to be factored in.

We then confirmed both of those assumptions with leadership, product management, and engineering.

Leveraging our purpose

One of the many reasons a purpose is so valuable to define is that it gives us a lens for viewing challenges and seeing the role we play in addressing them.

We defined our purpose: make it fast and easy to build great products. Connecting this to the challenge above, we saw our role as the following:

  • We needed to enable faster feature delivery workflows.
  • We needed to make it easy to ensure services are secure and reliable.

Gathering context: How are we doing today?

We started with our recent customer interviews, where we found the following relevant challenges:

  • Code is merged into our main branches before it gets into staging.
  • Engineers told us that stakeholders would give them feedback only after changes made it to staging or, in some cases, production.
  • Engineers had limited experience with security topics.

We then dove into the data to try to piece together the journey with actual timestamps and flows. We walked through multiple examples, from GitHub (git branches, commits, PRs) to CircleCI (CI/CD pipeline runs), and finally out to our cloud environment at the time (Heroku logs).
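To make the idea of piecing together a journey from timestamps concrete, here is a minimal Python sketch. It is not the tooling we used at Doma: the stage names, the timestamps, and the `stage_durations` helper are all invented for illustration. The assumption is simply that you can export one timestamp per stage from your source control, CI/CD, and deployment logs.

```python
from datetime import datetime

# Hypothetical events for a single change. The stage names and timestamps
# are made up for illustration; they are not real data.
events = [
    ("branch_created", "2022-03-01T09:00:00"),        # e.g. from git history
    ("pr_merged_to_main", "2022-03-04T15:30:00"),     # e.g. from the pull request
    ("ci_pipeline_finished", "2022-03-04T16:00:00"),  # e.g. from the CI/CD run
    ("deployed_to_staging", "2022-03-04T16:20:00"),   # e.g. from deploy logs
]


def stage_durations(events):
    """Return (from_stage, to_stage, hours) for each consecutive pair of events."""
    # Sort by timestamp so the input order does not matter.
    parsed = sorted((datetime.fromisoformat(ts), name) for name, ts in events)
    durations = []
    for (start, start_name), (end, end_name) in zip(parsed, parsed[1:]):
        hours = (end - start).total_seconds() / 3600
        durations.append((start_name, end_name, round(hours, 2)))
    return durations


for start_name, end_name, hours in stage_durations(events):
    print(f"{start_name} -> {end_name}: {hours} h")
```

Running this over many changes, rather than one, is what shows where time actually goes in the journey.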

We then pieced together the following high-level journey.

Feature development — journey map

Looking for the pebbles

Within the platform org, we dove into the journey details and started asking a bunch of questions.

  1. If Jill merges changes into main before getting feedback, does that mean that any issues will result in new tickets that go into the backlog and have to then go through the full cycle again?
  2. If there are no integration tests done before merging to main and pushing to staging, would this potentially make the main branch unstable and result in other work streams being blocked?
  3. What about security scans or testing?

We visited multiple engineering and product teams to validate this high-level journey, and get more context on these questions. The responses we received were very revealing:

Yes, it takes about 8 days before getting feedback. The only place to view the changes is on staging, which, based on our current workflow, is only updated after a merge to main. If there are issues, we put a new ticket in the backlog and have to prioritize it in a future sprint. This can delay feature releases and make planning less predictable.

We don’t have a great way to know until staging if services work as expected with each other. Sometimes this results in tests failing in staging, which can delay promotion to production.

We try to follow best practices for security concerns but we don’t know what we don’t know. We also don’t really know if the libraries we are relying on meet our standards.

In addition to the feedback above, the general sentiment was that these were real problems that impacted them. Moreover, none of these problems were directly raised in support channels or even in our customer interviews. Had we stayed only focused on those signals, we may never have uncovered these deeper problems.

With our understanding of some of the key problems confirmed, we were now confident that we had identified some real pebbles.

Creating a vision: What would a better experience look like?

Stephen Covey famously wrote that we should “Begin with the end in mind”. Doing so frees us from thinking within the constraints of our solutions today; in other words, from optimizing for the local maximum I mentioned earlier.

At Doma, with a clear understanding of what the pebbles were, we needed to imagine a better feature delivery workflow. We asked the following question:

What if it were possible for engineers to know during the development phase that their changes met product, quality, and security requirements?

As many of you already know, such workflows exist today at many other companies. Shift-left strategies are not new, and preview or development environments are fairly common. As with finding our purpose statement, I fully endorse stealing great ideas. But that doesn’t mean you can skip steps, since the process of getting there matters as much as the idea itself.

Here is the future feature delivery journey we envisioned.

Feature development — future journey map

In the future, Jill will be able to get feedback on her work quickly. Tying back to the business need of finding product-market fit, this feedback will help us roll out complete changes sooner. Assuming our timeline is reasonably accurate, complete features would ship in about 27% less time than before (8 days instead of 11).
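The 27% figure is the reduction in elapsed time relative to the original 11 days. A short Python sketch of the arithmetic follows; the function names are mine, chosen for illustration. It also shows the same change expressed as a throughput gain, since "faster" can be read either way.

```python
def time_saved_percent(before_days, after_days):
    """Reduction in elapsed time, as a percentage of the original duration."""
    return (before_days - after_days) / before_days * 100


def throughput_gain_percent(before_days, after_days):
    """Increase in features delivered per unit of time, as a percentage."""
    return (before_days / after_days - 1) * 100


print(round(time_saved_percent(11, 8)))   # 27
print(throughput_gain_percent(11, 8))     # 37.5
```

Whichever framing you pick, state it explicitly when presenting the vision so stakeholders compare like with like.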

We also shifted quality, security, and reliability checks to the left and brought them into the development environment, before changes are merged to main. This keeps the main branch as reliable as possible, which is critical for monorepo development, where failing tests in one part of the repo can block peer teams.
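As a rough illustration of what running checks before the merge means, here is a small Python sketch of a pre-merge gate. Everything in it is hypothetical: the check functions are stand-ins that return fixed values, and `pre_merge_gate` is a name invented for this example, not a description of how our pipeline was built. In a real setup each check would call whatever test runner or scanner your team uses.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for real checks; each returns True on success.
def run_integration_tests() -> bool:
    return True


def run_security_scan() -> bool:
    return True


def run_dependency_audit() -> bool:
    return False  # pretend one library failed the audit


def pre_merge_gate(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Run every check and return (ok_to_merge, names_of_failed_checks)."""
    failed = [name for name, check in checks.items() if not check()]
    return (not failed, failed)


checks = {
    "integration_tests": run_integration_tests,
    "security_scan": run_security_scan,
    "dependency_audit": run_dependency_audit,
}

ok, failed = pre_merge_gate(checks)
print("OK to merge" if ok else f"Blocked by: {', '.join(failed)}")
```

The gate runs every check instead of stopping at the first failure, so the engineer sees all of the problems in one pass while still in the development phase.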

Ready for the next step

We were now armed with a vision for the future. We took the journey map and confirmed it with some of our customers one more time. We asked questions like:

  • Does this new journey address the problems we identified earlier?
  • How impactful would these changes be for you?

Once again, the feedback was overwhelmingly positive. So much so, in fact, that members within our own platform organization were wondering if we should be even more ambitious! (Next time, maybe).

In the next article I will dive into how we defined a plan to execute on this vision.

And the journey continues

Finding the right pebbles to focus on is hard, and we certainly didn’t do everything right. Here are some lessons we learned along the way:

  • You need a lot of information: We drew heavily on the research we did when defining our purpose. We also leveraged insights from the DORA Capabilities catalog. Those helped us compare our own environment to industry best practices.
  • Don’t solve everything: Also known as “don’t boil the ocean”. Evolving a platform is as much a social and educational challenge as it is a technical one. The more you change, the longer it will take to “get back to normal”. Pick clear, impactful wins that align with the business needs.
  • Your vision should resonate immediately with your audience: Your customers and leadership should be excited when you show your vision to them. If they aren’t, that is a signal that you may not be connecting to meaningful pain.

Next steps

In my next article I’ll dive into how we took this vision and created a plan for executing on it. The combination of this article (vision) and the next (plan) is often referred to as a Technical Strategy.

Having a compelling strategy not only helps platform teams focus, it also sets expectations for customers and leadership. Finally, it enables platform teams to operate both as a support organization and a product organization, which will help the business scale effectively.

Credits
A special thanks to David Holbrook, Michael Lin, and Aeris Stewart for editorial feedback and support on this article.
