Migrating to Python/Django: A Step-by-Step Guide

Our Experience and Best Practices for a Smooth Transition at Udemy

Cansin Yildiz
Synthetica Magazine
5 min read · Apr 24, 2023

A colony of ants moving to a futuristic, neon-lit anthill.
Photo by Midjourney

As a software engineer, you may have come across a situation where you have to migrate from one programming language to another. Migrating from PHP to Python/Django is a daunting task that requires careful planning, strategy, and execution. In this blog post, I will share my personal experience leading the migration effort at Udemy back in 2014, detailing the challenges we faced, and the solutions we implemented.

The Pros and Cons of Code Migration

Why is Migration a Bad Idea?

Joel Spolsky, a renowned software developer, once called rewriting code from scratch the single worst strategic mistake a software company can make. His reasoning is that old code has been used and tested, and that it only looks worse than it is because code is harder to read than to write. The many tiny bug fixes accumulated in the old codebase are lost and become significant issues again in the new one. Throwing away all the knowledge and experience embedded in the old codebase is therefore a significant risk. Refactoring the existing codebase is often a better option than a full migration.

Why is Migration NOT a Bad Idea?

Despite the challenges, migration can be a real solution. Old code can lack documentation and automated tests, and adding them after the fact can be harder than reading the code itself. One-off hacks and abandoned experiments can leave a tangled mess, so shedding outdated knowledge can be helpful. Incrementally rewriting code can lead to a better, more maintainable codebase.

What We Initially Had

At Udemy, we had an in-house PHP framework with no tests or documentation. The codebase had numerous bugs, and only a small group of developers trained by the founders knew how to work with it. This made it difficult for others to onboard.

What We Wanted to Have

We wanted to adopt a real framework, write tests and testable code, adopt a better development culture, have better code reviews, and run automated tests on pull requests. We tried refactoring but realized it was taking too much time.

What We Ended Up Having

After two years, with 30% of the engineering team working on the migration, we achieved our goals. Onboarding became easier, and we made more than 80 new hires and counting. We reached 88% test coverage and shipped many more features with 36% less code and a more consistent codebase.

Key Decisions

Research Phase

During the research phase, we evaluated several options such as PHP/Laravel, Node.js/Express, Ruby/Rails, Python/Django, and Scala/Play before deciding to migrate to Python3/Django.

Incremental Migration

We decided that migration must be performed incrementally, with changes shipped frequently. We ensured that code was used in the live production environment and not left to gather dust or remain broken. We also ensured that new work could be done with the new Python/Django codebase.

What to Migrate First

We decided to migrate REST APIs first, which included authentication (OAuth2) but no sessions, since REST is stateless. This meant we did not have to worry about the UI, footer, header, or JS and CSS files at that stage.

Migrating APIs

We decided to create new APIs (api-2.0) using Django Rest Framework. We had two options: make the migration opaque or transparent to clients. Opaque meant keeping the same API signatures and data structures, and carrying over the mistakes made when designing the old PHP API. Transparent meant fixing the problems we had by using a library such as Django Rest Framework and adopting its best practices.

Enabling API 2.0 — Authentication

We had a custom OAuth2 implementation in PHP, but we wanted to use a library for Django. We adopted Django OAuth Toolkit and refactored the existing PHP code to work with it, so that clients could use the same access_token against both codebases.
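One way to let two codebases honor the same access_token is to have both read from a single token table. The sketch below is only an illustration of that idea: it uses an in-memory SQLite database, and the table and column names (`oauth_access_token`, `token`, `user_id`, `expires_at`) are hypothetical, not the schema used at Udemy or by Django OAuth Toolkit.

```python
import sqlite3
import time


def create_token_store() -> sqlite3.Connection:
    """Create a stand-in for the token table that both codebases would share."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE oauth_access_token ("
        "token TEXT PRIMARY KEY, "
        "user_id INTEGER NOT NULL, "
        "expires_at REAL NOT NULL)"
    )
    return conn


def issue_token(conn, token, user_id, ttl_seconds, now=None):
    """Store a token that expires ttl_seconds after 'now'."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT INTO oauth_access_token VALUES (?, ?, ?)",
        (token, user_id, now + ttl_seconds),
    )
    conn.commit()


def authenticate(conn, token, now=None):
    """Return the user id for a known, unexpired token, otherwise None."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT user_id, expires_at FROM oauth_access_token WHERE token = ?",
        (token,),
    ).fetchone()
    if row is None or row[1] <= now:
        return None
    return row[0]
```

Because the lookup depends only on the shared table, it does not matter which codebase issued the token.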

Migration is Happening — Creating API 2.0

For each endpoint, we followed the same routine. We identified an API endpoint the JS client used, decided how its signature and data format should be fixed, and implemented the better version under api-2.0. We then updated the JS web client to use api-2.0, but did not delete the old API endpoint yet, as mobile might still be using it. We deleted the old implementation once mobile had also migrated.
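The rule "delete the old endpoint only after every client has moved" can be sketched as a small bookkeeping structure. This is a hypothetical illustration, not tooling described in the talk; the class name and the example paths are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class EndpointMigration:
    """Tracks which clients still call the old version of one endpoint."""

    old_path: str
    new_path: str
    clients_on_old: set = field(default_factory=set)

    def migrate_client(self, client: str) -> None:
        # Record that a client (for example "web" or "mobile") now uses new_path.
        self.clients_on_old.discard(client)

    def can_delete_old(self) -> bool:
        # The old implementation is safe to remove only when nobody calls it.
        return not self.clients_on_old


# Hypothetical endpoint: both clients start on the old PHP version.
migration = EndpointMigration(
    old_path="/api/courses",
    new_path="/api-2.0/courses/",
    clients_on_old={"web", "mobile"},
)
migration.migrate_client("web")  # the JS web client is updated first
```

At this point `migration.can_delete_old()` is still false, because mobile has not migrated yet.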

Problem — Copy/Pasted API Logic

Since mobile and web were using different API versions, the same logic lived in two places, and we had to mitigate that. We first migrated API endpoints that mobile was not using, enabled mobile to use the newer version directly when necessary, and applied important fixes and business logic changes to both versions. Mobile clients were more fragile during the migration, so we tried our best to prioritize mobile-related bug fixes.

Problem — Cache Invalidation

Django and PHP shared the same Memcached instance but serialized data and generated keys differently, so a change made on one side did not invalidate the entries cached by the other. To solve the invalidation problem, we used explicit communication between PHP and Django.
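The post does not say what form the explicit communication took. One possible form, sketched below purely as an assumption, is for the side that changes an object to delete the cached entry under both key schemes. Both key formats are invented for illustration, and a plain dict stands in for the shared cache.

```python
import hashlib


def django_cache_key(model: str, pk: int) -> str:
    # Hypothetical key scheme on the Django side.
    return f"dj:{model}:{pk}"


def php_cache_key(model: str, pk: int) -> str:
    # Hypothetical key scheme on the PHP side (a hashed identifier).
    digest = hashlib.md5(f"{model}/{pk}".encode("utf-8")).hexdigest()
    return "php_" + digest


def invalidate_everywhere(cache: dict, model: str, pk: int) -> None:
    """Remove the cached object under both key schemes, if present."""
    for key in (django_cache_key(model, pk), php_cache_key(model, pk)):
        cache.pop(key, None)
```

With a real cache client, the `cache.pop(key, None)` call would be replaced by that client's delete operation.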

Migrating Static Files

While API migration was ongoing, we started to think about web page migration. We decided to migrate JS/CSS files to the new codebase before migrating any web pages. We copied our static files under a static/ folder within the Django codebase, configured Django, and ran manage.py collectstatic during release. We let NGINX serve static_collected/ at udemy.com/staticx/*, changed the code to fetch udemy.com/staticx/* instead of udemy.com/static/*, and then deleted the old static files from the PHP codebase.
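The Django side of that setup might look roughly like the settings fragment below. `STATIC_URL`, `STATICFILES_DIRS`, and `STATIC_ROOT` are standard Django settings, and the folder names come from the description above, but the project root path is a placeholder and the actual Udemy configuration is not shown in the source.

```python
from pathlib import Path

# Placeholder project root; a real settings.py usually derives it from __file__.
BASE_DIR = Path("/srv/app")

# URL prefix under which NGINX exposes the collected files.
STATIC_URL = "/staticx/"

# Where the copied JS/CSS sources live inside the Django codebase.
STATICFILES_DIRS = [BASE_DIR / "static"]

# Where `manage.py collectstatic` gathers files for NGINX to serve.
STATIC_ROOT = BASE_DIR / "static_collected"
```

Serving the new files under a separate prefix allows the old static/ and the new staticx/ paths to coexist until the PHP copies are deleted.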

Migrating Web Pages

We identified a web page to migrate, migrated the template and View logic to Django, and updated nginx.conf to map the URL to the new Django View. We released the change so that the page was served by Django, then went back to the PHP code and deleted the old implementation.

Sharing Session and Authentication

We had a session-based authentication approach, but apart from that, there was not much in the session. We realized we had two options: unify sessions between PHP and Django, or ignore sessions and focus on consistent authentication. We decided not to share sessions and to rely on access_token for authentication.

Sharing Certain UI Components and Templates

We decided to copy the header/footer and main template to Django shamelessly.

URL Routing Problem

We served certain URLs from Django and others from PHP by keeping an nginx.conf file within the Django codebase, which NGINX imported and used as part of its configuration. It had entries that routed certain URL patterns to Django, while everything else continued to go to PHP.
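The routing decision itself is simple enough to model in a few lines. The sketch below expresses it in Python for illustration only; in production this logic lived in nginx.conf entries, and the patterns listed here are hypothetical.

```python
import re

# Hypothetical URL patterns whose pages or endpoints have already been migrated.
DJANGO_PATTERNS = [
    re.compile(r"^/api-2\.0/"),
    re.compile(r"^/about/?$"),
]


def pick_backend(path: str) -> str:
    """Return 'django' for migrated URL patterns and 'php' for everything else."""
    if any(pattern.search(path) for pattern in DJANGO_PATTERNS):
        return "django"
    return "php"
```

Migrating one more page then amounts to adding one pattern to the list, which mirrors adding one entry to nginx.conf in the same release as the new Django View.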

Moral of the Story

Migrating from PHP to Python/Django is a daunting task, but with careful planning, strategy, and execution, it can be achieved incrementally. We found it helpful to keep things simple and focus on consistency: migrate REST APIs first, copy UI components and templates shamelessly, and rely on access_token for authentication. With persistence and hard work, we achieved all our goals: easier onboarding, more than 80 new hires and counting, 88% test coverage, and many more features with 36% less code.

Original Content: https://docs.google.com/presentation/d/1hNk4QK-9x6SDTwmzuKT9d3mWfbHbpNVCF1JeHjNur5U

Disclaimer: This piece has been written in collaboration with Poe — Sage.

Cansin Yildiz
Synthetica Magazine

Software Architect, and Engineering Advisor. Was a Founding Engineer #3 @Udemy.