How Shopify’s custom Slack integrations help manage their platform
Custom Slack apps can give the right people the right information at the right time
Shopify is an ecommerce platform that handles $40 billion in sales every year. They’ve grown to over 2000 employees with five physical offices and remote employees all over the world. They serve customers in 175 countries and their infrastructure consists of thousands of servers located in data centers across the globe.
Such a crucial part of global internet commerce can’t afford downtime. As Shopify continues to grow and evolve as a company, they’ve created tools that build on their culture of openness and transparency to manage their most critical tasks.
Here’s a look at how Shopify uses Slack to handle their entire deployment workflow, manage operational incidents, and transform their engineering culture.
Scaling the deployment process
Like many large, cloud-based software companies today, Shopify continuously deploys new code across their global infrastructure, sometimes more than 50 times a day. Slack is a critical part of that deploy process, increasing transparency by letting the entire team know who is deploying what, and when.
To manage and scale this process, they built an internal ChatOps app for Slack. Developers can tag their pull requests as “ready for deploy” and the app, which they called Spy, takes over to do the final verification and schedule it for deployment — all from Slack.
The team found that the most valuable way to handle this was through Slack notifications. The Spy app sends a DM to the engineer who committed the code with relevant links for actions to take, like viewing logs or aborting the deploy if something is wrong at the last minute.
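As a rough illustration of the notification described above, here is a minimal sketch of how such a DM could be assembled as a Slack Block Kit payload. Spy's actual implementation isn't public in this article, so the function name, the URLs, and the `action_id` values are all hypothetical; only the block structure (a `section` block plus an `actions` block of link buttons) follows Slack's documented message format.

```python
from typing import Any, Dict


def build_deploy_dm(user_id: str, commit_title: str,
                    logs_url: str, abort_url: str) -> Dict[str, Any]:
    """Build a message payload for a deploy notification DM.

    Hypothetical helper: the names and URLs are illustrative, not Spy's.
    """
    return {
        # Posting to a user ID delivers the message as a direct message.
        "channel": user_id,
        # Plain-text fallback shown in notifications.
        "text": f"Your change '{commit_title}' is being deployed.",
        "blocks": [
            {
                "type": "section",
                "text": {
                    "type": "mrkdwn",
                    "text": (
                        f"*Deploying:* {commit_title}\n"
                        "Keep an eye on #operations while this rolls out."
                    ),
                },
            },
            {
                "type": "actions",
                "elements": [
                    {
                        "type": "button",
                        "text": {"type": "plain_text", "text": "View logs"},
                        "url": logs_url,
                        "action_id": "view_logs",  # assumed identifier
                    },
                    {
                        "type": "button",
                        "text": {"type": "plain_text", "text": "Abort deploy"},
                        "url": abort_url,
                        "style": "danger",
                        "action_id": "abort_deploy",  # assumed identifier
                    },
                ],
            },
        ],
    }
```

A payload like this could then be handed to Slack's `chat.postMessage` Web API method by whatever client the app already uses.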
The app also helpfully reminds engineers to keep an eye on the deploy while it’s in process by checking in on their #operations channel. This way, engineers can more easily speak up if it looks like something is wrong.
Once the deploy is complete, the engineer gets another Slack message letting them know everything has shipped, and reminding them to check their work in production. There’s a handy link to roll back changes if it turns out something isn’t right.
Creating a virtual incident management room
As Shopify grew in size and spread across physical locations, they realized they needed to scale their incident management strategy as well.
Spy was already a great ChatOps app for monitoring the deploy process from Slack when code was being shipped, and it made sense to extend this to handling incident response as well.
Managing an incident is, by definition, a stressful process. Spy helps Shopify alleviate some of that stress by setting up clear lines of communication within Slack, automatically notifying people who need to know what’s happening, and reminding key participants of actions they should be taking during the incident.
Their custom app can bring in additional support and take notes as key information is discovered. They added a “TL;DR” command that prints out a summary of the latest updates to anyone who needs it. This lets the on-duty engineers stay focused on resolving the incident, without having to constantly worry about communicating out the status. The TL;DR command can be used in a DM or to keep an entire channel informed.
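The note-taking and “TL;DR” behavior described above can be sketched in a few lines. This is an assumption-laden illustration, not Shopify's code: the `IncidentLog` class, its method names, and the summary format are all invented here to show one simple way to keep a running log and print only the latest updates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IncidentLog:
    """Hypothetical in-memory log of notes taken during an incident."""
    title: str
    notes: List[str] = field(default_factory=list)

    def add_note(self, author: str, text: str) -> None:
        # Record who reported the update alongside the update itself.
        self.notes.append(f"{author}: {text}")

    def tldr(self, limit: int = 3) -> str:
        """Summarize the most recent `limit` notes, oldest of them first."""
        limit = max(limit, 1)  # always show at least one note if any exist
        latest = self.notes[-limit:]
        if not latest:
            return f"TL;DR for {self.title}: no updates yet."
        lines = "\n".join(f"- {note}" for note in latest)
        return f"TL;DR for {self.title}:\n{lines}"
```

The same summary string works whether the app replies in a DM or posts to a channel, which is what lets responders keep working while others catch up on their own.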
By automating the incident response processes, Shopify empowers people across their organization to get to work, pitch in, and stay informed — all from within Slack.
Internal integrations had a substantial impact on the team’s efficiency, and also brought a practice of transparency and openness to deploys and incident management. The Shopify team saw benefits from that openness and wanted to expand Spy’s functionality to keep evolving their engineering culture.
For example, Shopify’s backend is broken up into systems they call “pods”; one physical location is able to take over from another when moving pieces of infrastructure across different data centers. These failover operations are quite high impact, and require a level of oversight and authentication that goes beyond merely deploying an update to the codebase.
Shopify felt it was important to build a failover command right into Slack, setting the expectation that this command should be available to run at any time. This mindset prompted the team to build the feature so that it could be relied upon whenever it was needed.
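To make the idea of a guarded failover command concrete, here is a minimal sketch of a command handler that checks authorization before acting. Everything in it is an assumption for illustration: the `/failover` command name, its `<pod> <target-datacenter>` arguments, the allow-list of user IDs, and the reply wording are hypothetical, and the article does not describe how Spy actually authenticates these requests.

```python
from typing import Set

# Hypothetical allow-list; a real app would likely consult an on-call
# schedule or identity provider instead of a hard-coded set.
AUTHORIZED_USERS: Set[str] = {"U_ONCALL_1", "U_ONCALL_2"}


def handle_failover_command(user_id: str, text: str) -> str:
    """Validate a failover request and return the reply to post in Slack."""
    parts = text.split()
    if len(parts) != 2:
        return "Usage: /failover <pod> <target-datacenter>"
    pod, target = parts
    if user_id not in AUTHORIZED_USERS:
        return f"Sorry, <@{user_id}> is not authorized to fail over {pod}."
    # This sketch only reports the request; triggering the actual
    # failover is out of scope here.
    return f"<@{user_id}> requested failover of {pod} to {target}."
```

Replying in the channel with who requested what keeps the operation visible to everyone, which fits the transparency the team was aiming for.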
Since building it into the Spy app, failovers have gone from a rare event to something that’s run almost daily, and the team knows they can rely on it when they need it. The work to do this in Slack didn’t necessarily offer a technical advantage, but it did offer a cultural one.
All systems go
Whether you’re just getting started with Slack’s Platform or building out a full deploy or incident management workflow, custom apps inside Slack can be a tool to improve the way your team moves work forward.
When looking to build your own ChatOps integrations, think of the bottlenecks in the process that can benefit from being pulled into Slack. By making these bottlenecks public and actionable, searchable and visible across a workspace, you increase transparency and empower more of your team to help out.
It’s also helpful to think of effective Slack apps as having a conversational context — meaning, look for the parts of your workflow that make sense to bring into a conversation. Requesting a code review is a great example of making an app conversational.
And finally, Slack is already a change agent within your organization. Don’t be afraid to build apps that continue building on this momentum. An app or a feature that helps improve your team’s culture is just as important as one that saves a few steps.