Technology migrations: unglamorous obligation or an impactful opportunity?

Coursera
Mar 3, 2017 · 7 min read

By Nikhil Garg

Coursera decided to start using GraphQL-supported APIs to make our platform faster and better for our learners. To facilitate that, our engineers extended existing technologies used within Coursera to be compatible with GraphQL. As a result, any system that used courier (an open-source data interchange system built by Coursera) and naptime (an open-source REST API library built by Coursera) could easily be extended to use GraphQL, and would gain a number of other benefits as well.

We therefore initiated the process of migrating legacy systems to ‘courier’ and ‘naptime’. We started with our peer review assignments system, one of the earlier and more complex assignment systems built here at Coursera, in which learners are graded by their peers.

The overall task was to perform the following for the peer review assignments system:

  • Migrate existing Scala models to courier schema models, which provide a type-safe, schema-driven way of sharing JSON data between backends and web and mobile clients.
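
To make "type-safe, schema-driven" a little more concrete, here is a minimal sketch in TypeScript. It does not use courier itself, and the `PeerReviewSubmission` shape, its fields, and `parseSubmission` are hypothetical names chosen purely for illustration. The idea is that one declared shape acts as the single source of truth, and JSON that does not match it is rejected at the boundary rather than deep inside application code.

```typescript
// Hypothetical model; the field names are illustrative, not Coursera's schema.
interface PeerReviewSubmission {
  submissionId: string;
  authorId: number;
  reviewCount: number;
}

// Validate untyped JSON against the declared shape at the service boundary.
function parseSubmission(json: string): PeerReviewSubmission {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("expected a JSON object");
  }
  const rec = raw as Record<string, unknown>;
  const { submissionId, authorId, reviewCount } = rec;
  if (typeof submissionId !== "string") throw new Error("submissionId must be a string");
  if (typeof authorId !== "number") throw new Error("authorId must be a number");
  if (typeof reviewCount !== "number") throw new Error("reviewCount must be a number");
  // From here on, the compiler knows the exact type of every field.
  return { submissionId, authorId, reviewCount };
}
```

A schema-driven system would typically generate both the type and the validation from one schema file, so that backends and clients cannot drift apart; the hand-written validator above only stands in for that generated code.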

It was a three-month project involving both backend migrations and a huge frontend effort to support the newer APIs and models.

This is a blog post about how we leveraged this technology migration opportunity to increase impact while staying excited, and emerged feeling proud of what we accomplished. It also references how Coursera’s core values a) Betterment b) Boldness c) Solidarity and d) Deep honesty guided us throughout.

What went differently?

An engineer’s first reaction to the prospect of working on technology migrations might well be: ‘not challenging’, ‘unglamorous’, ‘no growth’, ‘no accomplishment worth being proud of, or sharing with others’, ‘just migrations, eh.’

I remember something that my manager hinted at several times during our 1:1s, ‘Although getting a meaty project with an anticipated high impact is desirable, how one approaches any given project and how much we push ourselves in the process is equally important for growth, impact, and technical excellence.’

This is our reflection on how the project went: we leveraged this migration opportunity to have a bigger impact on the organization. We learnt and grew a lot during the process. We felt motivated throughout the project, and would categorize it as one of our more technically challenging projects. We’re proud of what we accomplished, and excited to share our work with others.

Here is a summary of what helped us through: 1) Owning the final product, rather than ‘just migrations’ 2) Leveraging this opportunity for bigger impact 3) Upfront planning and having a timeline which we held ourselves accountable to.

Owning the final product, not just the migrations

Owning a system typically means its future maintenance and feature iterations fall under your team’s goals. We understood that apart from migrating the system with its existing weaknesses and strengths, we had an opportunity to dive in and do much more — own the whole system end-to-end rather than ‘just migrations’.

We decided to be bold and empower ourselves to question legacy architecture decisions, be creative and improve/redesign the architecture with newer ideas wherever we could. Within it, we found the motivation to learn deeply about the system, to find opportunities to simplify, and improve the system for easier future extensibility, maintainability, and for significantly reducing the onboarding cost for future developers working on the system.

We decided to take pride in the final product delivered, rather than restricting ourselves to ‘just migrations’. This gave a much needed boost to the scope of learning and challenge in the project, as well as to its impact and the team’s excitement.

The next two sections talk about how we actually leveraged and prioritized opportunities to deliver a bigger impact, while keeping the timeline in check.

Leveraging for more impact

When making in-depth changes to a system, such as technology migrations, it’s relatively easy to migrate the system as is, with its existing strengths and weaknesses, without putting in additional effort to ‘improve’ the overall architecture. IMO, treating migrations as a mundane task is a missed opportunity.

We felt that when engineers take on in-depth technology migrations for a system, a lot of time is spent thinking about model/architecture redesigns, acquiring context from original authors, getting in the zone of a particular system, reading and understanding complex legacy code, code reviews, regression testing, etc. We decided to leverage these basic building blocks, and go the extra mile by also including design improvements and code refactors in the project scope.

Some benefits we ended up with: 1) less time to onboard new engineers 2) faster iteration cycles for future feature development (we’re already seeing this benefit while iterating on staff graded open-ended assignments, another system that shares a lot of features with peer review assignments) 3) a cleaner and more organized code base 4) standardized tooling for other devs working on similar migrations.

Here is a summary of some tasks that we ended up adding on top of migrations:

  1. Build and standardize tools for problems we solved for our migrations: In the spirit of solidarity, one of our senior team members ended up shipping a data migration tool that allowed for much safer data migrations for our peer review system, and would also serve as a great tool for other devs who will take on data migrations for their systems in the near future.
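
The post does not describe how that data migration tool works, so the following is only a sketch of one common way to make a data migration safer: run it as a dry run first, convert each legacy record in memory, and collect the records that fail instead of writing anything. All names here (`LegacyReview`, `MigratedReview`, `migrateReview`, `dryRun`) are hypothetical and are not taken from Coursera's tool.

```typescript
// Hypothetical legacy and target shapes, for illustration only.
interface LegacyReview {
  id: string;
  score: string; // stored as a string in this made-up legacy format
}

interface MigratedReview {
  reviewId: string;
  score: number;
}

interface DryRunReport {
  migrated: MigratedReview[];
  failedIds: string[];
}

// Convert one record, returning null when it cannot be migrated safely.
function migrateReview(legacy: LegacyReview): MigratedReview | null {
  const score = Number(legacy.score);
  // Number("") is 0, so empty strings are rejected explicitly.
  if (legacy.score.trim() === "" || !isFinite(score)) return null;
  return { reviewId: legacy.id, score };
}

// Dry run: convert everything in memory and report failures; nothing is written.
function dryRun(records: LegacyReview[]): DryRunReport {
  const report: DryRunReport = { migrated: [], failedIds: [] };
  for (const legacy of records) {
    const result = migrateReview(legacy);
    if (result === null) {
      report.failedIds.push(legacy.id);
    } else {
      report.migrated.push(result);
    }
  }
  return report;
}
```

Under this approach, the real write would only proceed once `failedIds` comes back empty, which turns surprises in legacy data into a report to review rather than a partially migrated database.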

Upfront planning and timelines

Adding more scope to an already big migration effort can become risky, and hard to sell. Here are a few things that we did to navigate this challenge: 1) Convincing ourselves and product leaders that the additional improvements are worth the cost 2) Having a detailed upfront plan 3) Setting milestones and timelines that we hold ourselves accountable to.

Is it worth the cost?

Here is my argument: 1) these opportunities have a really high ROI, as discussed in the earlier section 2) they directly impact the motivation and excitement within the team and are likely to give great productivity gains.

When we communicated the increased scope, our product and eng leaders boldly supported us, fostering our values of engineering excellence and a culture of betterment.

There is one big risk here though — bloated scope.

We encountered some real challenges during the migration process, both on the frontend and on the backend side. A few examples include 1) Safe and tested data migrations 2) Switching the frontend to the new APIs safely by employing Flow types and extensive unit tests. Combined with the additional improvements mentioned in the previous section, it can quickly become challenging to strike a good balance between the two and to keep the scope from bloating to an unacceptable level.

Something that helped us through this was detailed upfront scoping and deep honesty in our estimation and communication. We tried to scope out as much as we could before diving into the implementation with detailed design docs, tasks broken down at a very granular level, time estimations and calculating possible risks for the timeline.

Milestones within the timeline helped us detect early on if we were drifting, and helped us strategize to get back on track. We used Jira extensively to manage the project and monitor our progress. Jira epics, upfront time estimates for granular tasks, and weekly reflections based on Jira sprint reports and epic reports were some of the tools that really helped us stay on track, remain accountable, and continuously strengthen our estimation muscle.

Overall, working with such a great team and mentors, getting great support from our leaders, owning the final product rather than ‘just migrations’, challenging ourselves to do work we could be proud of, and leveraging opportunities to deliver an even larger impact made for a deeply satisfying project.

Major thanks to Amory, Holly, David and Priyank for making the peer review technology migrations successful. Nikhil Garg is a backend developer on Coursera’s Learning Experience Team.


Originally published at building.coursera.org on March 3, 2017.

Coursera Engineering

We're changing the way the world learns! Posts from @Coursera engineers and data scientists!

Written by Coursera

Providing universal access to the world’s best education. | www.coursera.org
