GumGum Tech Blog

How I hacked Scrum to work for DevOps teams

By Corey Gale, Engineering Manager, DevOps

For the DevOps team at GumGum, process has been constantly evolving. Two years ago we shifted towards an Agile methodology for managing all of our work, including interruptive support tasks. Our implementation of Agile borrowed some elements from Scrum (like daily stand-ups and sprint demos) and combined them with the flexibility of Kanban (like continuous delivery).

While this bump in process rigor fixed a lot of problems, such as clarifying project priorities and encouraging early discussions about scope, it also introduced a lot of new overhead. It created new expectations for our DevOps Engineers that were counterproductive for a team that sometimes spends all of its time on unplanned work.

In this post, I will discuss how we adapted our process along the way to fit the needs of our DevOps team, and the benefits we gained from doing so.

Our DevOps team

The DevOps team at GumGum (we’re hiring!) is currently composed of three engineers, a combination of remote and on-site personnel. We support three internal engineering teams at GumGum: backend, frontend and big data, which together comprise 40 engineers and over two dozen different systems. Some of these systems serve over 25M RPM, and our main AWS account regularly runs more than 1000 EC2 instances. So as you can guess, we have our work cut out for us, and anything we can do to be more efficient can make a huge difference.

Of course, maintaining this many systems can lead to a lot of unexpected surges in support work, which our workflow methodology needed to accommodate. This was our first issue with our Agile process: unplanned work. So much of DevOps’ work is unplanned, and Scrum methodology suggests time-consuming exercises like ticket replacement and ticket-splitting for scope changes.

Hack #1: expect the unexpected

After trying out these methods (like “one ticket in, one ticket out”), I realized we were spending too much time in Jira, which was a sign we needed to tweak something. At this point we had completed four two-week sprints, and all of our tasks were tracked in Jira. I analyzed the data for these completed sprints and categorized each ticket as “support” (interrupted) or “sprint” (planned) work. I then added up the story points for each category and realized, for the first time, the true load of support on the DevOps team: 5 story points per engineer per sprint.

From here, I took this number into consideration during sprint planning, in particular when calculating expected capacity per engineer for planned work. Here’s the formula I used:

Capacity_next_sprint = Capacity_avg_last_3_sprints - Support_avg
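
As a rough sketch, the formula above could be computed like this. It assumes both terms are measured in story points per engineer per sprint and that the three-sprint average counts all completed work (planned plus support). The function name and the sample numbers are hypothetical; only the 5-point support figure comes from the analysis above.

```python
from statistics import mean


def planned_capacity(last_three_sprints, support_avg):
    """Story points left for planned work in the next sprint.

    last_three_sprints: total points completed in each of the last
    three sprints (assumed to include support work).
    support_avg: average points spent on support per sprint.
    """
    if len(last_three_sprints) != 3:
        raise ValueError("expected exactly three sprints of history")
    # Average overall capacity minus the average support load,
    # floored at zero so a heavy support period never goes negative.
    return max(mean(last_three_sprints) - support_avg, 0)


# Hypothetical per-engineer totals for the last three sprints.
print(planned_capacity([12, 14, 13], support_avg=5))
```

Flooring the result at zero is a design choice of this sketch, not part of the original formula: it simply signals that no planned work should be committed when support has historically consumed all capacity.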

Protip: categorize your team’s tickets according to the type of work requested. If you pick your categories right, it should be easy to calculate the average time/effort spent on support.
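
To illustrate the categorization step, here is a minimal sketch that totals story points by category and derives the average support load per engineer per sprint. The ticket data, the tuple layout and the function name are all hypothetical; in practice the input would come from whatever export your ticketing tool provides.

```python
# Hypothetical export of completed tickets: (sprint, category, story_points).
TICKETS = [
    ("sprint-1", "support", 8),
    ("sprint-1", "support", 7),
    ("sprint-1", "sprint", 13),
    ("sprint-2", "support", 10),
    ("sprint-2", "support", 5),
    ("sprint-2", "sprint", 11),
]


def average_load(tickets, category, engineers):
    """Average story points per engineer per sprint for one category."""
    if engineers <= 0:
        raise ValueError("engineers must be positive")
    # Count every sprint, even one with no tickets in this category.
    sprints = {sprint for sprint, _, _ in tickets}
    if not sprints:
        raise ValueError("no tickets to analyze")
    total = sum(points for _, cat, points in tickets if cat == category)
    return total / len(sprints) / engineers


# Sample data: two sprints, a team of three engineers.
print(average_load(TICKETS, "support", engineers=3))
```

The resulting figure is what would be plugged into the support term when estimating capacity for planned work.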

Hack #2: adjust expectations

At this point we were planning for the unexpected and, for the most part, everyone was meeting their sprint commitments on time. That’s when I learned about a new problem: process pressure. During a routine 1:1, one of my senior contributors told me that he felt increased stress towards the end of sprints due to looming incomplete tickets. This engineer happened to be spending a lot of time unblocking other engineers, a task that has great ROI for the company. But because he wasn’t working on the sprint tickets he had previously committed to, he felt pressured to work longer/harder to meet those commitments.

This wasn’t fair! To make sure this never happened again, I got his permission to discuss the issue as a team in the next sprint retrospective meeting, where I set the expectation that incomplete sprint tickets can slide from one sprint to the next. Up until this point, our sprint retrospectives had scrutinized every ticket that didn’t get completed. I realized that this practice was doing more harm than good and decided to drop it. I also made it clear that what matters is not what didn’t get done, but rather what did get done.

These small changes in expectations had a very positive impact on team morale. In fact, two of our three engineers said this tweak significantly reduced their stress levels. And I sincerely believe that less stressed engineers are more productive and write fewer RCAs.

Hack #3: cut the meetings

Now that expectations were clarified, the next consistent complaint I received was about excessive meeting overhead. For every two-week sprint, the entire DevOps team spent over 5 hours in meetings:

  1. Sync with internal customers (30 minutes)

In addition to a daily stand-up, this was a lot of interruptions. To fix this, I tweaked a few things:

  1. I made our sync with internal customers an asynchronous process. New work is now requested entirely via tickets or Slack conversations.

These small tweaks saved DevOps team members as much as 6 hours of meetings a month!

Benefits

At this point, things are running smoothly! My team is happy with the overhead, and because we’re using the same tools (Jira, Slack) and process as other teams, there have been some added benefits:

  1. Reporting: I can now speak the same language when communicating DevOps efforts to upper management. Sprint metrics can also make things like asking for new hires much easier!

Summary & key takeaways

  1. Agile processes like Scrum can be adapted for DevOps teams.
