The Rise and Fall (and Resurrection) of DevOps at upday

Published in upday devs · 7 min read · Jun 4, 2019

Objective

This blog post deals with the evolution of DevOps involvement we went through at upday from our very beginning in January 2015 until the present day (mid-2019). It is a story about a little bit of success but mostly of failure, transition and how we dealt with it. At its core is the question of how ordinary backend developers can be motivated and trained to take over DevOps duties, how the company can and should support those activities, and how it worked out for us.

Motivation and how to Motivate

When we started upday we quickly made sure we had a dedicated DevOps team to set up and take care of our infrastructure. They worked closely together with the backend team(s). This seemed to be state of the art in 2015. Pretty rad back then, I would even say in retrospect. Over time we experienced natural fluctuation in each discipline, but never specifically replaced any DevOps people who left the company. As a result, the DevOps team shrank to a size at which it no longer made sense to have it at all. Consequently we distributed the remaining members among the other teams and finally ended up without a single trained DevOps person in our company. Circumstances more or less forced this on us, but it also meant we had to convince our developers to take over operational tasks on top of their other work (it’s always useful to do this before the last DevOps guy has left the building, just to be sure…).

“NOTHING WILL WORK UNLESS YOU DO.”
– MAYA ANGELOU

But why should a developer develop an interest in system operations anyway (apart from being forced to)? There are multiple reasons for a backend person to dig deeper into this topic.

Common Sense:

Gain a general understanding of what operations are: every developer needs basic knowledge of administration and operations.
As a developer you would like to understand UNIX and command line better: even though modern IDEs shield a lot of the low level tasks and a developer could survive without ever touching the command line or working closer to the operating system, having this knowledge is extremely beneficial when it comes to finding bugs and problems in a running system.
As a developer you would like to be able to understand, plan, estimate and set up infrastructure components: writing code and throwing it over the fence to an operations team is not exactly how a developer’s life is supposed to be these days (at least in a smaller company). When working on a feature, a component or a micro-service, it is beneficial to have an idea about how and where it will run when it goes live. Therefore, each developer requires a basic understanding of deployment, scaling and infrastructure in general.
Pass a cloud infrastructure certification to advance your own career: being forced to work on cloud infrastructure certainly helps in gaining the knowledge required to pass those tests. It’s very likely a developer can make use of these certificates at a later point in their career. Having completed those certifications has a positive effect on the CV and on the salary side as well.

upday Specific:

Join the DevOps or Security Circle: at upday we have the concept of circles. Circles are loosely coupled teams working together on certain topics, e.g. keeping an eye on the security aspects of our organisation. To join one of these circles, a general understanding of the topic is required. Having an interest in a certain discipline and the will to drive it further and participate in the decision-making process could serve as a motivator to get a general understanding of operations and security aspects.
Be able to optimise our (cloud) infrastructure with regard to cost and performance: our infrastructure is one of the most costly assets of our company. Optimising its costs and tweaking its performance is a constant concern and a rewarding task.
Be able to fulfil on-call duty: at upday we follow the simple slogan “You build it, you run it”. Being on call has been an integral part of a developer’s duty from the beginning (even when we had a DevOps team). To be on call, fix problems as they occur and feel confident doing so, one needs to understand our infrastructure and its concepts. On top of that, after a few interrupted nights, the quality and resilience of our components grew significantly.
It is part of our developer qualification levels, and in order to reach the next level you also have to gain knowledge in operations: what makes a person a junior, intermediate, senior or principal developer? We defined those levels quite thoroughly for several reasons. They clearly define what a person has to do to reach the next level. They make it clear to us which category a new hire belongs in. They are transparent to all employees and to us, which avoids discussions and bad blood in this regard. Being able to work in operations is an important factor in those definitions, and stepping up always requires gaining knowledge in this topic.

Learning

“YOU DON’T HAVE TO BE GREAT TO START, BUT YOU HAVE TO START TO BE GREAT.”
– ZIG ZIGLAR

You cannot become an operations person overnight. Training and experience are needed to make sound decisions and to survive the daily maintenance jungle. It also requires a small shift in a developer’s mindset.

upday has always considered itself a company that supports its employees in learning and personal growth. So how did upday actually live up to this expectation when growing a team of backend developers trained in operations? What kind of training is needed to achieve one, a few or even all of these goals?

But What?

Most of our infrastructure is cloud-based, hosted on AWS. Understanding its concepts and learning about its services and its dos and don’ts is essential. Basic and intermediate knowledge of Unix administration is required as well. Not to mention CI/CD practices using CI tools like Jenkins and how to automate them, including learning what infrastructure as code (IaC) means and how it could be implemented. In short: there are a lot of areas to cover. So, how should one start without getting overwhelmed by all these various aspects of the wondrous world of operations?

And How?

“TELL ME AND I FORGET. TEACH ME AND I REMEMBER. INVOLVE ME AND I LEARN.”
– BENJAMIN FRANKLIN

Becoming familiar with Unix administration is a good start if you don’t have any knowledge in this regard. A good entry point into this realm is Linux Academy. Mastering the command line, understanding how an operating system (OS) works at a low level and being familiar with the most important command line tools and commands is extremely helpful when trying to get rid of glitches on a failing server.

To get a general understanding of what you have agreed to learn, try OpsSchool. Everybody is talking about DevOps, but is there a common understanding of what it means? What is included in this profession and what is not? What needs to be mastered?

We set up a special DevOps training where we used our test system (basically the same setup as our production system, but with fewer instances) to create real-life problems and let the trainees try to solve them. These simulations were inspired by problems that actually occurred during on-call duty (OCD) shifts. Going through them again helps to understand what could go wrong and how to tackle it. It increases trust and confidence in one’s ability to really fix a production issue and also supports teamwork.

In a kind of extended Slack Time (this is what we call the dedicated learning time of our employees, about 10% at the moment) we allowed our engineers to focus on and prepare for an AWS certification by watching the related videos and doing workshops. For one week two colleagues were freed from any sprint duties and were able to digest the knowledge needed to pass the exam.

Other ways to learn:

Mentoring would require an experienced mentor to guide and educate. We didn’t have this kind of DevOps person in our ranks, so we couldn’t go for this option. But it is a valuable direction to take.
Operations training could be part of the on-boarding process for engineers (e.g. pass an AWS certification).
Dedicate your own Slack Time to learning operations.
In-house training is useful for educating a bigger group at once, but the attendees’ current level of knowledge needs to be considered, so as not to overwhelm some and bore others.
Let engineers take over tasks that require operational knowledge and let them grow on them.

Alternative Approaches

We struggled with our way of handling DevOps topics after a while. Loading those extra duties onto the shoulders of our backend engineers didn’t seem right. Their performance suffered. On top of that, Product Management became unhappy, as it affected their commitments as well.

Without going into more detail, here is a list of the alternative approaches we thought about:

NoOps
Outsource Ops work completely
Have external help (freelancers, specialised companies) in-house
Hire dedicated DevOps people again

And now?

We hired a DevOps person again and are looking to expand this function. Being responsible for running web-scale infrastructure and continuously re-evaluating its architecture, costs and optimisations is a full-time job beyond a certain company size. And setting up and maintaining the infrastructure is not the sole responsibility of DevOps work. Security and automation are also worth noting.

But our backend engineers are not freed from DevOps duties completely. We strive for collaboration between operations and development. And having infrastructure-aware developers is a great plus in terms of reliability and costs of our system.

Conclusion

It is possible to train developers and to grow an operational mindset and an interest in this discipline as well. But this requires giving them every opportunity to deep-dive into this field. Support and continuous motivation are key. And dedicate enough time, of course. And pay the bills. But also be aware that operational duties take time, need to be planned properly alongside product features and take away resources from product work. Product Management might be unhappy about this, as it would affect their commitments.

Going through all these transitions at upday was quite a trip, but a worthwhile one. We grew as a company, value each discipline more and finally know exactly how we want to be set up to feel comfortable and create our best output possible.