CM10: Site Reliability Engineering — Helping CIOs not get fired w/ Andrew Turner
Welcome to Custom Made — our new weekly podcast that explores the many traits of successful product development.
We are on to episode #010 of Custom Made, and today I am talking with Andrew Turner, Head of Software Engineering here at Dialexa, about Site Reliability Engineering. Why this topic, and why now? Well, in today’s landscape CIOs won’t get promoted if everything works, but they will get fired if anything doesn’t. As a result, a 1000-watt spotlight is now shining on an organization’s technology reliability and security.
Andrew is an engineer, speaker, and technical leader with experience building scalable software solutions for clients ranging from venture-funded startups to Fortune 500 companies. He specializes in applying modern development workflows and technologies to create data-driven applications with a need for performance at scale.
At Dialexa, a significant proportion of our projects focus on designing and engineering custom software products and platforms. As our Head of Software Engineering, Andrew works with various teams to architect and engineer solutions that solve our clients’ hardest problems.
He passionately explores new tools and technologies to improve the developer workflow and craft creative solutions to common problems.
With extensive experience in all aspects of software development, Andrew provides end-to-end technical leadership on projects while assisting Dialexa’s executive team in scaling engineering processes and practices.
In today’s technology landscape, end-user expectations for application performance are at an all-time high. With all companies now becoming technology companies, it is critical to ensure that your technology performs as well as it possibly can.
Whether you are a growing start-up or an enterprise organization, every time your site, applications, or technology are down, hacked, or not working correctly, you risk your customer base and your reputation. This reminds me of the following scene from The Social Network:
Security threats grow more sophisticated on a daily basis. As a result, CIOs are under more pressure than ever to maintain the reliability and stability of business systems. This pressure has sparked greater adoption of site reliability engineering as a formal practice in the developer community.
Site Reliability Engineering is defined as the focus on effectively building, running, and growing systems in production. This includes ensuring the stability and resilience of a production system in addition to continually improving performance while building features.
With Site Reliability Engineering, you balance the need for site reliability with the need to ship new features. And it’s not just building with reliability in mind. It’s questioning how you can make a stable system run even better. Adopting a Site Reliability Engineering practice is all about having the right monitoring, tooling, and processes in place so that you have confidence when deploying a release that it will add value and also meet availability requirements.
The ultimate goal is to build a valuable system quickly and effectively without sacrificing production reliability, security, and scalability.
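One common way to make this balance concrete is an error budget, an idea described in Google's Site Reliability Engineering book: an availability target implies a fixed amount of allowed downtime, and feature releases continue only while some of that allowance remains. The sketch below is purely illustrative. The function names, the 99.9% target, and the 30-day window are assumptions made for this example, not figures from the episode.

```python
def error_budget_minutes(target: float, window_days: int = 30) -> float:
    """Return the minutes of downtime allowed in the window for an availability target.

    `target` is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be a fraction in (0, 1]")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - target)


def can_ship(target: float, downtime_minutes: float, window_days: int = 30) -> bool:
    """Return True while observed downtime is still below the error budget."""
    return downtime_minutes < error_budget_minutes(target, window_days)


if __name__ == "__main__":
    budget = error_budget_minutes(0.999)
    print(f"Allowed downtime over 30 days: {budget:.1f} minutes")
    print("Safe to ship after 10 minutes of downtime:", can_ship(0.999, 10.0))
    print("Safe to ship after 50 minutes of downtime:", can_ship(0.999, 50.0))
```

In practice the downtime figure would come from your monitoring system rather than a hard-coded number, and a team might slow releases gradually as the budget shrinks instead of using a single on/off check.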
Throughout our conversation, Andrew mentions a number of great resources. Here are some quick links to help you check them out:
- Site Reliability Engineering — Google
- Medium Posts — SRE
- Medium Posts — DevOps
- GitHub — Awesome Lists (SRE)
You can catch Andrew’s full Custom Made episode here:
Here are some of the memorable moments from Andrew’s episode of Custom Made:
- What is Site Reliability Engineering?
- Checks and balances to Site Reliability Engineering
- How to integrate Site Reliability Engineering
- Techniques Site Reliability Engineering needs to have
- How engineering roles are changing
- What technology leaders now need to look for in a partner
- Advice & recommended resources
And, if you prefer to listen on the go, you can get all episodes of Custom Made on these platforms and many more. Do subscribe on your favorite platform to catch each episode as it is released, and send me any feedback, questions, and recommendations on Twitter @dougplatts.