The role of DevEx in microservices

Lately, I’ve been involved in many conversations about the realities of developing and operating services in the cloud. As systems grow in complexity and scale, you need to stay nimble and ahead of the competition, but keeping existing customers happy is equally important. This tension manifests itself in different ways, e.g. velocity vs. stability, or incremental improvements vs. fundamental architectural shifts.

These are interesting debates, with great arguments on both sides. But if you take a step back, software exists in order to provide business value. It is most effective when developers spend most of their energy working on the product. So how do you accomplish that in a DevOps world where developers are responsible for the entire software development life cycle? The key is to invest in simplifying the process of getting things done. This gives you speed and consistency in repetitive, complex, mission-critical tasks while conserving energy for higher-order endeavors that require creativity.

As someone whose job is to simplify the process of getting things done, which in this case equates to developing and operating software at scale, I spend a fair bit of time thinking about this topic. The obvious answer is tools and automation. But simply building the right tools isn’t enough. These tools need to earn their place in their users’ day-to-day workflows: useful when needed and invisible otherwise. They need to make difficult tasks easy while making it hard for users to do the wrong thing. All of this has to happen without compromising user experience, or developer experience in this case, because the users are developers.

Developer Experience (DX) as a domain has gained considerable traction in the recent past. It has strong recognition in the API space, where it directly impacts API adoption and usage rates. PaaS offerings like Heroku are also excellent examples of DX done right. More often than not, though, DX discussions tend to center around the process of delivering software and deploying it to production. But in a world of microservices and DevOps, a developer’s responsibility doesn’t end there; it includes support and operation of production software. As such, I believe the scope of DX should be extended to include tools that help debug functional issues and performance bottlenecks, and that provide insight into the operational health of systems.

Discussing this caused me to reflect upon the opportunities and challenges unique to this space, which I decided to capture in this post.

It starts with empathy

The first step towards success is empathy. Going back to tools earning their place in developer workflows, it is critical to understand why those workflows are structured the way they are. If a tool adds too much friction to a workflow, users will just find ways around it. Tools done right can fundamentally change the way people work (think git and SCM flows), but that is not the typical case. For optimal DX, tools need to fit seamlessly into existing workflows.

Test or Prod?

I learned this the hard way: with developer tools, you cannot afford to optimize solely for production use cases. Pre-production environments rarely mirror production, yet pre-production is where users will start testing your tools. Environments can diverge in many ways: configuration, data sets, data volumes and traffic patterns. If you think about build and deployment tools, for instance, the scale in pre-production can be an order of magnitude higher than in production. Likewise, lack of data poses different types of challenges for insights tools. Techniques like caching aren’t always effective in pre-production, but the performance of your tools is evaluated there. In order to gain traction, these tools need to prove their worth in potentially sub-optimal conditions.

Save them from themselves

With all the focus on reducing friction, developers also want you to make it hard for them to do the wrong thing, especially if the consequences can be severe. Therein lies a challenge. If deployments are too easy, how do you prevent accidental deployments? If you make them jump through hoops, are you compromising DX? In other words, everything should be simple and frictionless, except when someone is about to make a bad move. And if they still manage to do so, recovery should be easy.
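One way to picture that balance is a minimal sketch, not a description of any real tool. Everything in it is hypothetical: the `DeployTool` class, the `confirm` parameter, and the idea that only an environment named "prod" is guarded are all assumptions made for illustration. Deployments to ordinary environments need no ceremony, a guarded environment requires the caller to retype its name, and recovery is a single call.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DeployTool:
    """Hypothetical deploy helper: frictionless by default, guarded for risky targets."""

    # Assumption: only these environments need an explicit confirmation.
    guarded_envs: frozenset = frozenset({"prod"})
    # Maps each environment to the ordered list of versions deployed to it.
    history: Dict[str, List[str]] = field(default_factory=dict)

    def deploy(self, env: str, version: str, confirm: Optional[str] = None) -> str:
        # Friction is added only where a mistake is costly: the caller must
        # retype the environment name to deploy to a guarded environment.
        if env in self.guarded_envs and confirm != env:
            raise PermissionError(f"Deploying to {env!r} requires confirm={env!r}")
        self.history.setdefault(env, []).append(version)
        return version

    def rollback(self, env: str) -> str:
        # Recovery is one call: drop the latest version and return the previous one.
        versions = self.history.get(env, [])
        if len(versions) < 2:
            raise RuntimeError(f"No earlier version to roll back to in {env!r}")
        versions.pop()
        return versions[-1]
```

In this sketch, `DeployTool().deploy("test", "v1")` just works, while `deploy("prod", "v1")` fails until the caller passes `confirm="prod"`. Whether retyping a name is the right amount of friction is exactly the kind of judgment call the questions above are about.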

Don’t let them bring you down

If you are successful, your tools will become an integral part of the daily lives of many developers, who will rely on them for mission-critical work. As such, reliability and availability expectations are high even if you aren’t operating mission-critical software directly. If you develop internal tools, you have an additional challenge here. For instance, you cannot deal with deviant users or rogue clients using standard techniques like throttling or forced upgrades, because those users are your own colleagues. After all, you can’t afford to cut off your nose to spite your face! Often, doing the work of migration or adoption on their behalf is an effective way to move forward and build partnerships.

Closing thoughts

I’ve barely scratched the surface of the considerations in the DX domain. There are many other interesting topics to explore: how heavily do you favor simplicity over the needs of the power user? How prescriptive should a tool be? Is there such a thing as too much empathy, i.e. when should you challenge use cases and use tools to effect changes for the better? And my favorite: how do you measure DX? Is there an emotive component to it? If these topics are of interest to you, I’d love to hear your thoughts. For those local to the SF Bay Area, you can also join us at the DevExSV meetup to share your perspectives. Help us build a community around DevEx and microservices!