The Five Pillars of Performance Engineering at Insider

Published in Inside Business Insider, Nov 17, 2021

At Insider, web performance is a key tenet of our product and engineering organization. We value user experience and developer experience, and we optimize for SEO. Over the last few years, we have developed a performance engineering strategy that empowers the product and technology team. We group this strategy into five pillars: Content Delivery, Observability, In-House Consultancy, Performance Evangelism, and Search Engine Optimization.

Content Delivery

Site reliability engineering provides the foundation for everything we build, distributing content to over ten million users every day. We leverage Fastly, custom VCL, logging services, and caching. We employ Kubernetes horizontal and vertical pod autoscaling and run load tests to correctly size Kubernetes workloads. These strategies help prevent performance issues before they happen.
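To make the horizontal autoscaling idea concrete, here is a minimal sketch of the core scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count scaled by the ratio of the observed metric to its target, rounded up. The function name, the default bounds, and the sample numbers are illustrative assumptions, not our production configuration, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,    # assumed lower bound, for illustration only
    max_replicas: int = 10,   # assumed upper bound, for illustration only
) -> int:
    """Return ceil(current_replicas * current_metric / target_metric), clamped to the bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

For example, four pods averaging 90% CPU against a 60% target would scale to `desired_replicas(4, 90.0, 60.0)`, which is six pods. Load tests are what tell you whether a target like 60% is a sensible number in the first place.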

Our content delivery API contracts are defined using Protocol Buffers (Protobuf), a binary serialization format. The majority of our services are developed in Go, use gRPC, and source data from MongoDB. gRPC transfers data in binary over HTTP/2, which can be seven to ten times faster than traditional REST APIs. Most database queries are indexed and, thanks to denormalization, API requests require only a single trip to the database. We also expose REST endpoints behind Varnish, so many requests never reach the back end. Go is a high-performance language, and combining it with the benefits of gRPC and MongoDB yields very fast services. Clients get the speed benefit of either Protobuf/gRPC or Varnish caching.
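The reason a cache in front of the REST endpoints keeps requests away from the back end can be shown with a toy time-to-live cache. This is only a sketch of the idea in Python, not how Varnish is implemented or configured; the class name, the `fetch` callback, and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Toy read-through cache: serve a stored response until its TTL expires."""

    def __init__(
        self,
        fetch: Callable[[str], str],            # hypothetical back-end call
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # path -> (expiry, body)
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> str:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            # Fresh entry: answer from the cache, the back end is never called.
            self.hits += 1
            return entry[1]
        # Missing or expired: go to the back end once and remember the result.
        self.misses += 1
        body = self._fetch(path)
        self._store[path] = (now + self._ttl, body)
        return body
```

With a 30 second TTL, a popular article requested thousands of times in that window costs the back end a single call; every other request is a cache hit.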

We are also in discovery on, and excited to leverage, edge computing, WebAssembly, CI/CD load testing for our content delivery services, and HTTP/3.

Observability

Our goal is to capture, diagnose and fix performance issues as they happen, and to improve our performance baselines. Insider’s strategy can be split into application monitoring, real user performance monitoring, and load testing.

CI/CD synthetic performance tests run on build and empower engineers to catch issues before they are deployed to production.
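One common way to turn a synthetic test into a build gate is a performance budget: the build fails when a measured metric exceeds its agreed limit. The sketch below assumes hypothetical metric names and budget values; it is not our actual pipeline, and the measurements would come from whatever synthetic tool the build runs.

```python
from typing import Dict, List

# Hypothetical budgets for illustration; real values depend on the page and team.
BUDGETS: Dict[str, float] = {"lcp_ms": 2500.0, "tbt_ms": 300.0, "js_kb": 350.0}


def budget_violations(measured: Dict[str, float], budgets: Dict[str, float]) -> List[str]:
    """Return one message per budgeted metric that is missing or over its limit."""
    violations: List[str] = []
    for metric, limit in sorted(budgets.items()):
        value = measured.get(metric)
        if value is None:
            # A missing measurement is treated as a failure rather than a pass.
            violations.append(f"{metric}: no measurement")
        elif value > limit:
            violations.append(f"{metric}: {value:g} exceeds budget {limit:g}")
    return violations
```

A CI step would call `budget_violations(results, BUDGETS)`, print the messages, and exit non-zero if the list is not empty, so the regression is caught in the pull request instead of in production.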

Alerts are handled through PagerDuty, email, and Slack channel notifications. We also provide dashboards that highlight the status of many key leading indicators. We use Datadog for application monitoring, logging, and tracing, and SpeedCurve for real user monitoring (RUM) and synthetic testing.

In-House Consultancy

Over the last few years we have been working to build a performance culture that empowers the entire team, celebrates wins, and offers help when it is needed.

We have a performance point person who is up to date with the latest research and processes. We empower our product and engineering team with documentation on critical processes that we rely on every day. These processes include synthetic feature flag testing and real user A/B testing. Once a test has completed, the performance KPIs are integrated with the business KPIs and evaluated in tandem. This empowers product owners to make smart decisions about the features that are deployed to production.

Along with consultancy comes training. Topics that are covered include synthetic and real user testing processes, engineering best practices, and industry initiatives such as Core Web Vitals. These discussions help our team improve Insider’s products and are great for user and developer experience.

Performance Evangelism

We celebrate performance and accessibility wins during sprint demos at the end of each sprint and highlight successes in Slack with a #perfhero tag integration. This rewards the mentioned user with an “in awe” emoji, encouraging celebration within the team.

We network through performance meetups and conferences, write blog posts about our experiences, and assess business impacts with key stakeholders and leadership.

Being a performance advocate is contagious. Everyone loves a good success story, so share your wins within your own team and organization and encourage others to do the same!

Search Engine Optimization

Leveraging Fastly logging services, we continuously upload bot traffic data to BigQuery. With this data, we created a bot traffic dashboard in Looker. This dashboard allows our SEO team to see the details of crawled pages. Example key metrics include crawl volume, device type, and elapsed time.
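The kind of roll-up behind such a dashboard can be sketched in a few lines. The record fields below (`url`, `bot`, `device`, `elapsed_ms`) are hypothetical stand-ins for whatever the log format actually contains, and in practice this aggregation would run as a query in the warehouse, not in application code.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable


@dataclass(frozen=True)
class CrawlHit:
    """One bot request; field names are assumptions for illustration."""
    url: str
    bot: str
    device: str
    elapsed_ms: float


def summarize(hits: Iterable[CrawlHit]) -> Dict[str, object]:
    """Aggregate crawl volume, device and bot breakdowns, and mean elapsed time."""
    records = list(hits)
    return {
        "crawl_volume": len(records),
        "by_device": dict(Counter(h.device for h in records)),
        "by_bot": dict(Counter(h.bot for h in records)),
        # None when there is no traffic, since a mean of nothing is undefined.
        "mean_elapsed_ms": mean(h.elapsed_ms for h in records) if records else None,
    }
```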

Over the last few years we have been specifically optimizing for Google’s Core Web Vitals. Leveraging Chrome User Experience Report data, we can easily create dashboards for Web Vitals and competitive analysis. This covers real Chrome users on any device for standard and AMP pages.
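For readers new to Core Web Vitals, Google rates each metric as good, needs improvement, or poor, and assesses a page at the 75th percentile of real user visits. The sketch below uses the published thresholds for LCP, FID, and CLS; the function names and the nearest-rank percentile method are our own assumptions for illustration, and the Chrome User Experience Report computes its percentiles for you.

```python
import math
from typing import Dict, Sequence, Tuple

# (good upper bound, poor lower bound) per metric, from Google's published guidance.
# FID was the responsiveness metric when this was written; INP replaced it in 2024.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "lcp_ms": (2500.0, 4000.0),
    "fid_ms": (100.0, 300.0),
    "cls": (0.1, 0.25),
}


def p75(samples: Sequence[float]) -> float:
    """75th percentile using the nearest-rank method (an assumption of this sketch)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def rate(metric: str, value: float) -> str:
    """Classify a metric value as 'good', 'needs improvement', or 'poor'."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"
```

A dashboard tile then reduces to something like `rate("lcp_ms", p75(lcp_samples))`, which makes it easy to compare your own pages against competitors on the same scale.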

Leveraging the SpeedCurve API, we extract, transform, and upload web performance data to BigQuery. This data is available in Looker where we have dashboards that are integrated with other data sources. This covers real user data on any device for standard pages.

Building a performance culture does not happen overnight, but keeping these pillars in mind may help you on your journey. Please feel free to comment or reach out, and thank you for reading!

I would like to thank fellow team members Bryant Durrell, Mahmoud Dolah and Ryan Pardey for their contributions to this post.
