The Ultimate Way of Doing OPS?

It’s simple. You should do less OPS, use more managed services, apply more standard technology, scale down CHEF – and focus on what you do best. Or should you?

Published in The Railslove Blog, Feb 6, 2019

For every software company and especially for agencies handling numerous customer projects, having a deliberate OPS strategy is crucial. At Railslove, we are doing DevOPS. This means every developer needs to be involved. Or put differently:

You can’t start a new project and expect somebody to take care of setting up infrastructure.

Well, that is now. We have come a long way to this decision…

The old Railslove way of doing OPS

Being a software agency with both complex, long-running projects and short-term jobs for smaller clients, we used to run pretty advanced self-managed infrastructure for our clients and our own projects, using CHEF and a number of different hosting providers. We even had our CHEF-powered way of setting up and running Rails apps “with love” (see here and also here).

We also experimented with other new infrastructure technology such as centralized logging with Graylog, virtualization on bare-metal hardware with Proxmox and, of course, Docker container orchestration (e.g. with Rancher).

We were able to ship awesome products and we learned a lot about server infrastructure, but over time we also experienced a couple of issues with this approach:

the more apps we have running on a single piece of infrastructure (e.g. CHEF), the more difficult it becomes to maintain that piece (upgrading it for one project could mean downtime for all)
the very specialized infrastructure knowledge was concentrated with a small number of colleagues with intrinsic interest in infrastructure topics
it was a daunting prospect that (new) developers would not only have to be full-stack devs, but also experienced admins to run a full client project. This led to a sort of “the infrastructure colleagues will hopefully take care of it” attitude
handing over a project to other developers (e.g. when the customer built up their own in-house development team) was a very difficult task because it was tightly coupled to our infrastructure
more advanced topics like high availability and scaling were expensive to realize with our own tools, and we were not very flexible

Should you get rid of Chef? (image source: EduRidden, brik.co)

Things we learned from dumping infrastructure duties

For architecting web apps we’ve always tried to rely on as much existing and available technology as possible — which is the reason why we’re such big fans of the Ruby on Rails framework. By choosing such tools for base requirements that every web app needs (e.g. user authentication), we have more resources to innovate in the areas that make our clients really unique and successful (e.g. a beautiful checkout experience customized to their particular product).

Based on the above insights, we decided in 2018 to start applying this approach to infrastructure as well, wherever we could.

One of the most delicate aspects of running software infrastructure is persistence and backups. That’s why we are migrating all projects that are at least somewhat critical to a hosted/managed database service such as Heroku Postgres or AWS RDS. This way we do not have to worry about whether our own backup process always works flawlessly, and we have some easy options in case we need to scale.
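As a rough illustration of what this looks like from the application side: Heroku Postgres hands the connection details to the app as a single DATABASE_URL config var, so the app does not need to know where the database lives. The sketch below, in plain Ruby, only shows how such a URL breaks down into its parts. The helper name database_settings and the example URL are made up for illustration, and a Rails app would normally let the framework read the variable for you.

```ruby
require "uri"

# Hypothetical helper: split a connection URL of the kind a managed
# database service provides into its individual settings.
def database_settings(url)
  uri = URI.parse(url)
  {
    adapter:  uri.scheme,                  # e.g. "postgres"
    host:     uri.host,
    port:     uri.port,                    # nil if the URL has no explicit port
    database: uri.path.sub(%r{\A/}, ""),   # strip the leading slash
    username: uri.user,
    password: uri.password,
  }
end

# The URL is invented; in production the value would come from ENV["DATABASE_URL"].
settings = database_settings("postgres://app_user:s3cret@db.example.com:5432/shop_production")
puts settings[:host]
puts settings[:database]
```

Because the whole connection is described by one value, moving a project to a different database plan or provider becomes a configuration change rather than a code change.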

The second measure we undertook was trying to deploy all applications to a PaaS (Platform as a Service), which abstracts away as many infrastructure details as possible. Specifically, we are using Heroku (for most clients and bigger projects) and its open source sibling Dokku (for smaller projects). This gives us a common way for applications to be deployed and configured (git-based deployment, twelve-factor configuration, etc.), while still leaving us a lot of different options in terms of cost, scalability, security, etc.
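To make the configuration part a bit more concrete, here is a minimal sketch of twelve-factor style settings in plain Ruby: everything that differs between environments is read from environment variables. The function name load_settings and the particular variables chosen are assumptions for illustration, not a description of our actual setup.

```ruby
# Hypothetical settings loader in the spirit of twelve-factor configuration:
# everything that differs between environments comes from environment
# variables, never from files checked into the repository.
def load_settings(env = ENV)
  {
    # Required: fetch without a default raises KeyError when the variable
    # is missing, so a misconfigured app fails at boot.
    secret_key_base: env.fetch("SECRET_KEY_BASE"),
    # Optional values with defaults suitable for local development.
    port:      Integer(env.fetch("PORT", "3000")),
    log_level: env.fetch("LOG_LEVEL", "info"),
  }
end

# Passing a plain Hash instead of ENV keeps the example easy to try out.
settings = load_settings({ "SECRET_KEY_BASE" => "dev-only-secret", "PORT" => "5000" })
puts settings[:port]
puts settings[:log_level]
```

The same code runs unchanged on Heroku and Dokku, since both let you set such variables per app (heroku config:set and dokku config:set respectively). Only the values differ per environment.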

Certain smaller aspects of an application can be handled with solutions even simpler than a PaaS, e.g. Netlify for pure frontends or serverless platforms for small, self-contained microservices.

In addition to that, we are also trying to use more managed services for additional infrastructure concerns such as exception tracking, logging and monitoring. Again, you are outsourcing a large number of concerns this way.

“The infrastructure colleagues will take care of it”
— everyone but the infrastructure colleagues

The case against

Of course our approach (a widely popular one today) isn’t for everyone, and probably shouldn’t be. There are some more or less convincing reasons to refrain from using managed services for your own or your clients’ infrastructure.

Firstly, it is not only managed services that have become far easier to use than they used to be. Setting up and running your own infrastructure isn’t extremely complicated anymore either, e.g. using Docker and its friends.

Secondly, for many it’s about trust and control. You need to trust a provider to handle your (customers’) data securely and according to your country’s legislation, to keep their processes running, and to still support you in case of an incident. If AWS has a major outage, as has happened a couple of times over the past years, there’s virtually nothing you can do about it.

Thirdly, you might just want to be able to sell your OPS skills to your customers, giving you an edge over your competitors in case a specific customer feels hosting with you will be more secure — regardless of this being true or not. It might be a deal breaker or winner.

The very essence of why we still chose to shed as much OPS obligation as possible is skillfully summarized in this quote from a truly awesome blog post by Rich Archbold at Intercom:

Offloading as much responsibility as possible to third parties is a useful technique for minimizing the number of things the team is directly responsible for (and therefore need expertise in) and allowing the team to operate at a higher level of abstraction (and therefore deliver customer value more rapidly).

That being said, go talk to your team and your customers about the options and decide which path to follow. What matters most is that your decision is supported by your staff and the project’s stakeholders.

And don’t forget: you’ll always need to have some DevOPS knowledge spread across your whole team. Ideally, every single team member should be able to set up infrastructure with managed services on their own. What sounds obvious might be a challenge for some, but teaching every developer how to do it will pay off well if you follow this strategy consistently.

Written by Railslove