Migration to Python3: The SquadStack Experience

Yoshita Bajaj

Published in

SquadStack Engineering

7 min read · Sep 30, 2022

Written by Kunal Mahajan

Introduction

Python is the primary programming language at SquadStack. Our monolith is 8 years old and has more than 500k lines of Python code. A few months back, we migrated our codebase from Python 2.7 to Python 3. In this article, we’ll go through the steps we took and the challenges we faced during that migration.

Why do you need to migrate your codebase from Python2.7 to Python 3?

This is probably the first question that comes to mind when you plan a project like this. Here are some of the reasons:

Python 2.7 is no longer supported by the Python core developers. This means that if you face any issue with Python 2.7, no fixes will be made and no security patches will be applied.
New libraries, tools, modules, and frameworks are getting written in Python 3 only.
An outdated Python version can show up as a red flag during a security audit.
Improved threading and multiprocessing libraries in Python 3, including sharing data between processes, along with improvements to the GIL.
Python 3 is usually faster than Python 2.7.

There are many more differences between Python 3 and Python 2.7. For the whole list, refer to the official docs here.

How did we get started with the migration process at SquadStack?

We first removed all the extra/redundant code from our codebase, so that we did not waste time upgrading code that was never going to be executed. We learned from this that we could have been more regular with such clean-ups. 😅

Some cleanup screenshots

After the above step, we made sure that we had a good test coverage of our codebase. Without good test coverage, it would be a humongous task to upgrade our codebase to Python3 in a bug-free way.
We started running two CI pipelines for both Python 2.7 and Python 3. By running two CI pipelines we ensured that the code is working in both environments.
After the above steps, the real migration started. Before migrating to Python 3, we wanted to make our code compatible with both Python 2.7 and 3. Here is a cheat sheet that you can refer to when making your code Python 2/3 compatible. We used some tools to ensure that compatibility did not break, such as:
- 2to3: Automated Python 2 to 3 code translation.
- futurize: It allows you to use a single, clean Python 3-compatible codebase to support both Python 2 and Python 3 with minimal overhead.
- six: A compatibility library that lets you write code that works on both Python 2 and 3 without modifications.
To ensure that no one on the team pushed incompatible code, we added some custom GitHub Actions that ran on every commit. For example, if a required __future__ import was missing at the start of a file, the GitHub Action would throw an error.
We also integrated futurize into the GitHub Actions. This helped us automate Python 2/3 compatibility with every new commit.
Once we made our code Python2/3 compatible we finally started the last step of migration which was to migrate completely to Python3.
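A check for missing __future__ imports, like the one described above, could look roughly like the sketch below. This is not the exact GitHub Action from our pipeline: the function name and the set of required imports are assumptions made for illustration.

```python
import ast

# Assumed set of required imports; the real list depends on team policy.
REQUIRED_FUTURE_IMPORTS = frozenset(
    ["absolute_import", "division", "print_function", "unicode_literals"]
)


def missing_future_imports(source, required=REQUIRED_FUTURE_IMPORTS):
    """Return the sorted names of required __future__ imports absent from source."""
    tree = ast.parse(source)
    found = set()
    # __future__ imports must sit at module level, so only the top-level
    # statements need to be inspected.
    for node in tree.body:
        if isinstance(node, ast.ImportFrom) and node.module == "__future__":
            found.update(alias.name for alias in node.names)
    return sorted(set(required) - found)
```

A CI step could call this function on every changed .py file and fail the build whenever the returned list is not empty.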

All the steps were straightforward except the last step. Let’s discuss some of those challenges in the next section.

What challenges did we face in the migration process?

One of the biggest issues we faced was pickling and unpickling objects in Python. Let’s discuss this issue in detail.

At SquadStack, we follow the practice of caching querysets. Before caching a queryset, we pickle it and then store it in Redis.
What problems occurred because of pickling the querysets?
Unpickling in Python 3 a queryset that had been pickled in Python 2.7 resulted in a lot of UnicodeDecodeError exceptions. This happened because Python 3 uses a different default protocol version to pickle objects than Python 2.7, and by default it tries to decode Python 2 byte strings as ASCII when unpickling.
How did we fix this?
We took the following approach:
- Invalidated the cache entry whenever a UnicodeDecodeError occurred.
- Pickled the queryset using a protocol version that is compatible with both Python 2.7 and Python 3. Doing this meant that the load on our database would increase, so to ensure that it did not hamper any of our applications, we increased the size of our database.
- Once the querysets were pickled using a compatible protocol version (protocol 2 is supported by both Python 2.7 and 3), this problem was mitigated. :)

I know increasing the load on your database is not something you can afford every time. Please suggest alternate approaches in the comments below. :)

CACHES = {
    "default": {
        "BACKEND": "redis_cache.RedisCache",
        "LOCATION": "dummy",
        "TIMEOUT": 500,
        "BINARY": True,
        "OPTIONS": {
            # pickle version set to 2 to make it compatible with Python 2 and 3
            "PICKLE_VERSION": 2,
        },
    }
}
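The two ideas behind the fix, pickling with protocol 2 and treating an undecodable entry as a cache miss, can be sketched in plain Python as below. This is a simplified illustration rather than our production code: the helper names are hypothetical, and a plain dict stands in for the Redis cache.

```python
import pickle

# Protocol 2 is the highest pickle protocol that Python 2.7 can read.
COMPATIBLE_PROTOCOL = 2


def dump_for_cache(value):
    """Pickle a value so that both Python 2.7 and Python 3 can load it."""
    return pickle.dumps(value, protocol=COMPATIBLE_PROTOCOL)


def load_or_invalidate(cache, key):
    """Unpickle a cached entry, dropping it if it cannot be decoded.

    `cache` is any dict-like object (an assumption for this sketch).
    Returning None signals a cache miss, so the caller falls back to
    the database and re-caches the fresh result.
    """
    raw = cache.get(key)
    if raw is None:
        return None
    try:
        return pickle.loads(raw)
    except UnicodeDecodeError:
        # Typically a Python 2 byte string with non-ASCII content.
        cache.pop(key, None)
        return None
```

Returning None on failure keeps the calling code unchanged: it already has to handle cache misses, so a stale Python 2 entry simply costs one extra database query.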

Text vs. binary issues. In Python 2.7, there was no strict distinction between text and binary strings. In Python 3, there is a clear difference between text (str) and binary data (bytes). This was the most common issue we faced. It is easy to fix but the most time-consuming. You can look into the cheat sheet linked above.
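A common way to handle this is to convert explicitly at the boundaries of the system (network, files, cache) and keep text as str everywhere else. The helpers below are a minimal Python 3 sketch; their names and the UTF-8 default are assumptions, not code from our codebase.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text (str), decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it first if it is text (str)."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


def join_text(left, right):
    """Concatenate two values as text, whatever their original type.

    In Python 3, "a" + b"b" raises TypeError, so both sides are
    normalised to str before joining.
    """
    return to_text(left) + to_text(right)
```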

There were some issues related to Django like:-

Creation of new migration files: On running makemigrations on Python 3, we saw the creation of some new migration files.

1. Running makemigrations on Python 3 resulted in files without any b prefix for string values. These string values can be anything like verbose_name, help_text, or the choices of a choice field. This happens because Python 3 clearly separates text (Unicode) from bytes, whereas Python 2.7 does not, so migration files generated on Python 2 carry a b prefix for non-Unicode values.

Compatible Solution: The easiest way to get the same migration files in both Python 2 and Python 3 environments is to add from __future__ import unicode_literals to all the models files. For existing migration files, we can either run makemigrations once to generate the new files, or remove the b prefix from the existing migration files.
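Removing the b prefix from existing migration files can be scripted. The regex below is a rough sketch under the assumption that migration files contain only simple literals: it also rewrites any b'...' sequence that happens to appear inside another string, and it would convert values that are genuinely meant to be bytes, so the resulting diff should be reviewed by hand. The function name is hypothetical.

```python
import re

# A bytes-literal prefix (b or B) at a word boundary, directly before a quote.
_BYTES_PREFIX = re.compile(r"\b[bB](?=['\"])")


def strip_bytes_prefix(migration_source):
    """Return the migration source with b'...' literals turned into '...'."""
    return _BYTES_PREFIX.sub("", migration_source)
```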

2. Object representation on Admin/Shell: The __unicode__() method is not used by Python 3, but we were relying on it in a lot of places to show a Unicode representation of an object.

Compatible Solution: To support both Python 2.7 and Python 3, we added a __str__() method and made sure it returns unicode under Python 2.7 by using the @python_2_unicode_compatible decorator.
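To show what the decorator does, here is a simplified stand-in written in plain Python. In a real project you would import python_2_unicode_compatible from six (or from django.utils.encoding on older Django versions) instead of defining it yourself; the Lead class is a hypothetical example.

```python
import sys


def python_2_unicode_compatible(cls):
    """Simplified stand-in for the decorator provided by six and older Django.

    On Python 2, the text-returning __str__ becomes __unicode__, and a new
    __str__ returns UTF-8 encoded bytes. On Python 3 the class is unchanged.
    """
    if sys.version_info[0] == 2:
        cls.__unicode__ = cls.__str__
        cls.__str__ = lambda self: self.__unicode__().encode("utf-8")
    return cls


@python_2_unicode_compatible
class Lead(object):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Always return text; the decorator handles the Python 2 case.
        return u"Lead: {}".format(self.name)
```

With this in place, the same class definition gives a readable representation in the admin and the shell on both interpreters.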

Once we had migrated the whole codebase to Python 3, we had to roll out the changes to production. Let’s dive into that as well.

How did we deploy this migration to production?

Since we have a monolith architecture, deployment became a big challenge for us. We have a huge codebase, and a single issue could have affected all the verticals, be it our supply team, demand team, or internal team. So we did not do a 100% rollout at once. Instead, we rolled out in phases. Currently, our system has 4 main components:

app — This component handles the android application and the customer dashboard.
api — This component handles all the API requests.
celery — This component handles all the asynchronous processes.
admin — This component handles all internal workflows/operations.

We have dedicated servers for these components. So we rolled it out in the following manner:-

We created new Python3 instances for every component.
Rolled out a small share of traffic for a single component so that our business didn’t get affected. We did weighted routing with only 5–10% of traffic on the Python 3 servers.
Rolled out full traffic for a short period for a single component and fixed the bugs. The bugs were related to string formatting, comparisons, text vs. binary, etc.
Rolled out traffic for other components. (We started with sending traffic to the less busy servers first).
Once the confidence got built we rolled out 100% traffic for all our Python3 instances.
Once everything got stable in the Python 3 environment, we stopped all the Python2.7 instances.
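Weighted routing is normally configured at the load balancer rather than in application code, but the idea can be illustrated with a small sketch. The function and backend names are hypothetical, and hashing the request ID is just one possible way to split traffic.

```python
import hashlib


def pick_backend(request_id, py3_percent):
    """Route a request to the Python 3 or Python 2 fleet.

    The request ID is hashed into one of 100 buckets, so the same ID
    always lands on the same fleet, and roughly `py3_percent` percent
    of all IDs are sent to the Python 3 servers.
    """
    digest = hashlib.md5(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "python3" if bucket < py3_percent else "python2"
```

Raising py3_percent step by step (for example 10, then 50, then 100) mirrors the phased rollout described above, and setting it back to 0 acts as a quick rollback.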

Learnings from this whole migration exercise

Planning is the most important thing. If you plan well, you’ll always end up finishing any project/exercise/task efficiently.
Writing good and efficient test cases is a must if you want to deploy your code with minimal errors.
Set aside some bandwidth for tech debt. Letting tech debt drag on for a very long time can lead to nightmares.

Conclusion

Finally, we transitioned to Python3 smoothly without having any downtime along with continuous development. Next up, we have started our Django migration process. So, stay tuned for our next migration blog on Django 😉

References

https://python-future.org/compatible_idioms.html#
https://www.stxnext.com/blog/why-migrate-from-python-2-to-python-3/
https://medium.datadriveninvestor.com/why-companies-are-moving-from-python-2-to-python-3-86d948e529c0
How the other companies migrated like Instagram, Facebook, etc.

Special Notes

We are Hiring! 😀 It’s an exciting time to join the SquadStack engineering team. Please have a look at the SquadStack careers page. We have a lot of exciting problems to solve in many domains like Scalability, DevOps, Backend, System design, etc.