Building Software to Scale Virta

Tony Tam
Virta Health
6 min read · Sep 25, 2018

“The Miracle,” according to Virta CEO Sami Inkinen, “is that we’ve proven that the treatment works. Now all we have to do is scale it.”

After working at companies struggling to prove the efficacy of their products and enduring the painful pivoting process, this is a welcome message to me. Scaling is hard, but finding a killer product can be a once-in-a-lifetime success (it would be twice in a lifetime for Sami, who co-founded real-estate company Trulia some ten years ago).

Getting to the point where Virta actually needed to scale wasn’t easy. The early Virta tech team had to quickly build a system to handle clinical trial patients in a HIPAA-compliant environment, with an ever-changing list of requirements. Building the runway as the plane is taking off forces trade-offs, and scalability is often the first thing to be sacrificed, as was the case at Virta. Monoliths were born, and rightfully so. At low volume and with small teams, they can be the most effective way to get an MVP out the door. Monoliths have their limits, but at this stage getting 1.0 out the door usually matters more than the benefits of a distributed system, which comes with real overhead.

Enabling Virta to scale also meant substantially growing the engineering team. The transition from three engineers to ten, for example, requires huge changes in process, style, code ownership, and efficiency. Early on there was a goal to move to microservices, and some efforts had been made in that direction. Luckily, the efforts were really starting to hit full speed when I joined, allowing me to offer my experience with Swagger, OpenAPI, and distributed systems to Virta.

I’ve seen a lot of projects transition from monolith to microservices, and witnessed (and shed) plenty of tears along the way. Pattern recognition is one of the most important skills in software, so in that spirit I’ve come up with a short list of dos and don’ts.

1. Don’t do the Version 2.0 Rewrite. Jumping from a complex monolith v1 to an over-engineered v2 almost never works, often because too many new, non-critical changes are brought into the mix. As your team scales, there will almost always be new opinions, and hopefully these are good overall. But if you’re at the point where you need to scale your solution, you probably have something that sort of works and needs to be carefully disassembled, not replaced wholesale.

Instead, start out by adding some instrumentation to your system. There are a zillion ways to do this, but at Virta we’ve had good luck with pyformance, a Python implementation of the very popular Dropwizard Metrics library by Coda Hale. In our fork, we’ve added some convenient decorators that make it easy to collect timings around blocks of code and send them to the InfluxDB time-series database. From there, you can build a solid understanding of which code is slow, which is called often, and which is therefore a candidate for refactoring into a microservice component.
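To make the idea concrete, here is a minimal sketch of a timing decorator using only the Python standard library. It is not the pyformance API or the decorators from our fork; the names `timed`, `TIMINGS`, `summarize`, and `build_care_plan` are hypothetical, and the in-memory dictionary stands in for whatever metrics backend you actually report to.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store; a real setup would ship samples to a metrics backend.
TIMINGS = defaultdict(list)


def timed(name):
    """Record the wall-clock duration of every call under the metric `name`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the wrapped function raises.
                TIMINGS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


def summarize(timings):
    """Return (name, call_count, total_seconds) rows, most expensive first."""
    rows = [(name, len(samples), sum(samples)) for name, samples in timings.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


@timed("care_plan.build")  # hypothetical metric name
def build_care_plan(patient_id):
    time.sleep(0.01)  # stand-in for real work
    return {"patient_id": patient_id, "steps": []}


if __name__ == "__main__":
    for pid in range(3):
        build_care_plan(pid)
    for name, calls, total in summarize(TIMINGS):
        print(f"{name}: {calls} calls, {total:.3f}s total")
```

Sorting by total time surfaces code that is either slow or called often, which are the two signals that matter when picking a first candidate to extract.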

2. Try to Migrate Equivalent Functionality to the Microservice. It’s ideal if you can isolate a block of code and plan around making a call to a microservice in place of some in-application logic. This isn’t always easy to do, but it’s almost always worth it.

While migrating equivalent functionality sounds conceptually simple, it often carries overhead. For example, let’s say you used an ORM to get version one out the door and you have “fat” objects containing tons of logic. When migrating to a distributed system, you will typically be receiving “dumb” objects back from a microservice. This can have a huge impact across your application (especially if you’re using a dynamic language like Python).
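The difference between “fat” and “dumb” objects can be sketched as follows. This is an illustration only: `PatientRecord`, `PatientDTO`, `parse_patient`, and the JSON field names are hypothetical, and no real ORM is involved.

```python
import json
from dataclasses import dataclass
from datetime import date


class PatientRecord:
    """Hypothetical 'fat' ORM-style object: data and business logic live together."""

    def __init__(self, name, birth_date):
        self.name = name
        self.birth_date = birth_date

    def age_on(self, today):
        return _years_between(self.birth_date, today)


@dataclass(frozen=True)
class PatientDTO:
    """'Dumb' object rebuilt from a service response: fields only, no behavior."""
    name: str
    birth_date: date


def _years_between(birth_date, today):
    years = today.year - birth_date.year
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return years if had_birthday else years - 1


def parse_patient(payload):
    """Turn the JSON text returned by a (hypothetical) service into a typed DTO."""
    data = json.loads(payload)
    return PatientDTO(
        name=data["name"],
        birth_date=date.fromisoformat(data["birth_date"]),
    )


def age_on(patient, today):
    """Logic that used to be a method now lives in a plain function."""
    return _years_between(patient.birth_date, today)
```

Every call site that used to write `patient.age_on(today)` has to change to `age_on(patient, today)`, and in a dynamic language nothing flags the ones you miss until they run. Parsing the response into a typed object at the boundary at least makes missing fields fail early.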

The main point is this: try to first start migrating functionality to the microservice, move the microservice into production, and then and only then add new features to the service. That way, if and when issues pop up, you’ll spend less time wondering if you’re experiencing architectural issues with the microservice architecture or logic issues within the service itself.

3. Put new Microservice-based code behind a Feature Flag. Consider the scenario where you build out a block of business logic that grows in complexity far beyond what you ever envisioned. You may decide it makes sense to migrate this code into a microservice for maintainability and scale reasons.

When tackling this architectural shift, try to migrate the existing logic as-is into the microservice, and use a feature flag to enable or disable this code path. A feature flag is effectively an intelligent configuration that allows different code paths to be enabled or disabled. With the goal of reducing organizational risk, the ability to enable or disable your microservice at runtime is hugely valuable.
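A minimal sketch of such a flag, assuming it is read from an environment variable: `legacy_score`, `remote_score`, and the `FLAG_SCORE_SERVICE` variable are hypothetical names, and `remote_score` is a stand-in for a real network call to the new service.

```python
import os


def legacy_score(readings):
    """Existing in-application logic (hypothetical)."""
    return sum(readings) / len(readings)


def remote_score(readings):
    """Stand-in for a call to the new microservice.

    A real version would make a network request; this one computes the same
    result locally so the sketch stays runnable.
    """
    return sum(readings) / len(readings)


def flag_enabled(name, env=None):
    """Read a boolean flag such as FLAG_SCORE_SERVICE=on; default is off."""
    env = os.environ if env is None else env
    return env.get(f"FLAG_{name.upper()}", "off").lower() in {"1", "on", "true"}


def score(readings, env=None, remote=remote_score):
    """Route to the new service only while the flag is on."""
    if flag_enabled("score_service", env):
        return remote(readings)
    return legacy_score(readings)


if __name__ == "__main__":
    print(score([90, 110, 100], env={}))
    print(score([90, 110, 100], env={"FLAG_SCORE_SERVICE": "on"}))
```

Because the flag is checked on every call, flipping it wherever the configuration lives switches the code path without a redeploy, provided the process re-reads that configuration.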

More advanced feature flag systems can be user-based, and allow specific cohorts of users to experience the refactored code path. Some more advanced routing logic can make sophisticated decisions based on a number of parameters. Regardless of the level of sophistication, you should always be able to “turn off” your new service if you start seeing something unexpected.
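One common way to build a user-based flag is to hash the user ID into a stable bucket and enable the new path for a percentage of buckets. The sketch below assumes a flag is a plain dictionary with hypothetical keys `name`, `enabled`, `percent`, and `allow_users`; it does not describe any particular feature flag product.

```python
import hashlib


def bucket(user_id, flag_name):
    """Map a user to a stable bucket in the range 0..99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(flag, user_id):
    """Decide whether this user takes the new code path."""
    if not flag.get("enabled", False):
        # Kill switch: overrides the allow-list and the percentage rollout.
        return False
    if user_id in flag.get("allow_users", ()):
        return True
    return bucket(user_id, flag["name"]) < flag.get("percent", 0)


if __name__ == "__main__":
    flag = {
        "name": "score_service",
        "enabled": True,
        "percent": 10,
        "allow_users": {"internal-tester"},
    }
    for user in ("internal-tester", "user-1", "user-2"):
        print(user, is_enabled(flag, user))
```

Hashing keeps each user on the same side of the flag between requests, and the `enabled` check comes first so a single setting turns the new service off for everyone.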

It should go without saying that when your old code path is still available behind a feature flag, you can’t enjoy the satisfaction of deleting it yet.

4. Don’t forget Devops. Any modern system needs some amount of deployment engineering. If you can’t deploy your software, it doesn’t do any good. Too often engineering teams don’t bring the devops group into the conversation early enough, or they make unrealistic demands. Everyone should be extraordinarily busy at a startup, and the operations group should be no exception. It’s best to give devops a deployable, reasonably well thought-through prototype to review instead of posing a broad question like “What is your ideal software like?”

Being in the healthtech space, Virta has a strict need for HIPAA compliance and an additional emphasis on information security, and the infosec team is always very interested to understand what’s happening on the application development side to ensure patient safety. Ops, devops, and infosec all need to be part of the plan, not an afterthought.

5. Shiny New Objects. It’s fun to bring new technologies into a company. Microservices enable a huge amount of flexibility for development, because they are decoupled by design. If your services speak JSON, it shouldn’t matter if you’re writing in Python, Go, Java, or even PHP. Heck, you can go into almost any language.

Problems arise with deployability and maintainability if this route is taken. If your service developer goes all in on Haskell, then decides to take a vacation or jump ship, you’re going to be in a lot of trouble unless you have some redundancy on the team in that language or specific framework. Remember what you’re trying to do at this point. For Virta, it’s scaling a business, so save the shiny new objects for later. If you succeed at this stage, there will be a chance to play with them. The opposite is not true.

For Virta, following this philosophy meant standardizing on a number of technologies, techniques, and conventions. Remember that once you have microservices, you’re going to be dealing with a lot more boilerplate code. That’s not fun! But you should at least ensure that it’s consistent.

Putting It into Practice

Luckily for us software developers, there is an ecosystem of robust, free or open-source software out there to make microservices a lot easier to tackle. And of course there are paid products to give additional support and peace of mind.

The gigantic landscape of tooling can feel a bit like a multicultural, Las Vegas-style buffet at first, but don’t fear: once you make a few key decisions, the options become easier to tackle. In my next post I’ll talk through how we’re driving our migration to microservices with the OpenAPI specification and open-source tooling.

Tony Tam
Virta Health

Strongly opinionated generalist, Swagger committer, and architect at Virta Health