Are you ready? New starters' journey to production.

Alex Potter Dixon
Glasswall Engineering
3 min read · Dec 1, 2019
Photo by Marc-Olivier Paquin on Unsplash

One of the main parts of being an SRE, if not the most important, is running the production system. We have to protect our customers' data, make sure production is healthy, and perform regular updates and maintenance. It's incredibly important that anyone who has the keys to the castle fully knows what they are doing; otherwise things can go wrong very quickly.

So how do you make sure they are ready?

Training

As with any new starter in any job, it is incredibly important to provide adequate training, but in the SRE world this is especially true.

Glasswall’s SRE training plan.

At Glasswall all of our new SRE team members have to go through a specialised and personally curated training book. It’s broken up into several key areas:

  • SaaS, our Email FileTrust service
  • Kubernetes
  • Azure DevOps
  • Prometheus/Monitoring
  • Incident Management

Each area is broken up into specific learning areas with references to training resources such as Linux Academy, training videos or talks given by other SRE members. On top of the learning resources, we have questions and activities that the new starter must complete such as building a new Kubernetes cluster or writing up a set of Prometheus queries. Finally, after each task or question has been completed it must be signed off by another member of the SRE team.
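The sign-off rule above can be sketched in a few lines of Python. This is only an illustration, not how Glasswall actually tracks its training book: the `Task` and `TrainingBook` classes, the field names, and the example people are all hypothetical. The one rule it encodes comes from the text: a task counts as done only once another member of the team has signed it off.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """One question or activity in the training book (hypothetical model)."""
    area: str
    description: str
    signed_off_by: Optional[str] = None

    @property
    def done(self) -> bool:
        return self.signed_off_by is not None


@dataclass
class TrainingBook:
    trainee: str
    tasks: list = field(default_factory=list)

    def sign_off(self, task: Task, reviewer: str) -> None:
        # Sign-off must come from another SRE team member, never the trainee.
        if reviewer == self.trainee:
            raise ValueError("a task cannot be signed off by the trainee")
        task.signed_off_by = reviewer

    def outstanding(self) -> list:
        # Tasks that still need a sign-off.
        return [t for t in self.tasks if not t.done]

    def complete(self) -> bool:
        # An empty book does not count as a completed one.
        return bool(self.tasks) and not self.outstanding()


# Example names are made up for illustration.
book = TrainingBook(
    trainee="new.starter",
    tasks=[
        Task("Kubernetes", "Build a new Kubernetes cluster"),
        Task("Prometheus/Monitoring", "Write up a set of Prometheus queries"),
    ],
)
book.sign_off(book.tasks[0], reviewer="sre.colleague")
print([t.description for t in book.outstanding()])
```

Keeping the reviewer's name on each task makes it easy to see who vouched for what, which fits the idea that the training is owned by the whole team.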

This training is a shared responsibility within the team. We all run the production system together, and we all need to feel confident that the new starter will have our back!

Shadowing

On top of the training, we need to make sure the new starter knows how to apply the new skills they have learned to the 50% operational work that is part of the SRE role.

SRE training timeline.

As you can see, the new starter shadows an SRE team member from day one. This allows the new team member to get their feet wet as quickly as possible. The job of the on-call person is to show the new team member how they perform BAU (Business as Usual), explaining all the steps in detail and showing them how to handle any incidents that may happen. The new team member is expected to ask questions and to ask the on-call team member to repeat anything they aren’t sure about.

Photo by Jason Dent on Unsplash

Sign off

Once the new starter has completed the training booklet, shadowed for a month, and everyone feels they are ready, it’s time to sign them off!
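As a rough sketch, the three sign-off conditions could be written as a single check. The function name, its parameters, and the choice of 30 days to stand in for "a month" are assumptions made for this example; in practice the team's judgement is a conversation, not a boolean.

```python
from datetime import date, timedelta


def ready_for_sign_off(booklet_complete, shadow_start, team_agrees, today,
                       min_shadow=timedelta(days=30)):
    """Return True when all three conditions from the text hold.

    team_agrees maps each team member's name to True or False.
    min_shadow defaults to 30 days, an assumed reading of "a month".
    """
    shadowed_long_enough = today - shadow_start >= min_shadow
    # Everyone has to agree, and an empty team does not count as agreement.
    everyone_agrees = bool(team_agrees) and all(team_agrees.values())
    return booklet_complete and shadowed_long_enough and everyone_agrees


# Illustrative values only.
votes = {"sre.one": True, "sre.two": True}
print(ready_for_sign_off(True, date(2019, 11, 1), votes, today=date(2019, 12, 1)))
```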

We want to make sure this is celebrated. They have put the work in and are ready to be a fully fledged SRE!

Final thoughts

It’s important for us that the new starter succeeds and can handle their new role as an SRE. Their success is our success, and ultimately it is the team's responsibility to make this happen. We hold a feedback session once the new SRE has been initiated so that we can fill in any gaps and improve the training for the next member, as one of our core values is Always Learning.

Alex Potter Dixon
Glasswall Engineering

SRE Manager at Glasswall. Always looking to innovate and push in cloud infrastructure and SRE.