Adopt Python 3
Python 3.6.0 came out the day before yesterday, and it was like a Christmas present for many of us. But in the midst of all the celebration, many of you were still asking whether it is safe to drop Python 2 and move over to Python 3. There still seems to be a fear of missing out on useful third-party libraries that lack Python 3 support.
So in this post, I will try to settle this issue once and for all by presenting the relevant data. After you have seen all the data, you will be able to come to your own conclusion (I have already expressed my conclusion in the title).
But first things first. Not everyone is interested in all Python packages. A web developer needs Django, a Natural Language Processing enthusiast needs NLTK and SpaCy, and an astrophysicist needs Astropy. So which packages should we talk about?
Let’s start with popular packages, i.e., packages that have the highest download counts on PyPI. Python 3 Wall of Superpowers and Python 3 Readiness are two websites that maintain lists of the 200 and 360 most popular packages respectively. If you open these sites today, you will see that 187/200 and 341/360 packages support Python 3. That is 93.5 % and 94.7 % coverage respectively, so almost all popular packages have been ported to Python 3.
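The coverage figures are easy to check. Here is a small sketch that uses only the counts quoted above; the `coverage` helper is just an illustrative name.

```python
# Counts quoted above: (packages supporting Python 3, packages listed).
lists = {
    "Python 3 Wall of Superpowers": (187, 200),
    "Python 3 Readiness": (341, 360),
}


def coverage(supported, total):
    """Return Python 3 coverage as a percentage."""
    return 100.0 * supported / total


for name, (supported, total) in lists.items():
    print(f"{name}: {coverage(supported, total):.1f} %")
```

The f-strings used here are themselves new in Python 3.6.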
Next, let’s go beyond the populist approach and consider a broader range of packages. Why should we do this? Because populist lists have a problem. They do not include packages used in niches like Astrophysics and Natural Language Processing. Thus SpaCy, Astropy and many other important packages are missing from these lists. How can we include them?
This can be done by looking at the set of all stable and actively developed packages. To be classified as such, these packages:
- should have their Trove Classifier for Development Status set to Production/Stable or Mature (stable)
- must have released at least one new version in 2016 (active)
These packages are useful enough to receive constant attention from maintainers and developed enough to be considered stable. So we had better pay some attention to them.
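The two criteria translate into a simple filter. The sketch below is only an illustration and not the code behind this post: it assumes each package is a dict with a `classifiers` list and a `release_dates` list, which is a made-up layout, and the sample records are invented. The two Development Status strings are the actual Trove classifiers.

```python
from datetime import date

# Trove classifier strings for the two statuses named above.
STABLE_CLASSIFIERS = {
    "Development Status :: 5 - Production/Stable",
    "Development Status :: 6 - Mature",
}


def is_stable(package):
    """True if any classifier marks the package as Production/Stable or Mature."""
    return any(c in STABLE_CLASSIFIERS for c in package["classifiers"])


def is_active(package, year=2016):
    """True if at least one release was uploaded during `year`."""
    return any(d.year == year for d in package["release_dates"])


def stable_and_active(packages):
    """Keep only the packages that satisfy both criteria."""
    return [p for p in packages if is_stable(p) and is_active(p)]


# Hypothetical records, made up purely for illustration.
sample = [
    {
        "name": "alpha",
        "classifiers": ["Development Status :: 5 - Production/Stable"],
        "release_dates": [date(2015, 3, 1), date(2016, 7, 9)],
    },
    {
        "name": "beta",
        "classifiers": ["Development Status :: 6 - Mature"],
        "release_dates": [date(2014, 1, 20)],
    },
    {
        "name": "gamma",
        "classifiers": ["Development Status :: 4 - Beta"],
        "release_dates": [date(2016, 11, 2)],
    },
]

selected = [p["name"] for p in stable_and_active(sample)]
print(selected)
```

Only `alpha` passes: `beta` is mature but had no release in 2016, and `gamma` is active but still in beta.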
There are roughly 6000 such packages in PyPI. Here is what Python 3 support looks like in this set.
A picture is worth a thousand words. There are several remarkable things about this simple pie chart:
- 28 % of the packages lack Python 3 support. But at the same time, 14 % of the packages lack Python 2 support. Therefore, there is risk associated with choosing either of the two Python versions. Python 2 is not risk-free.
- Total Python 3 coverage is at 72 %. That’s not so bad given that Python 3 came out in 2008 and 2020 is the official EOL of Python 2.7. Since 72 > (2016 - 2008)/(2020 - 2008)*100 = 66.67, porting is happening faster than expected by a linear law. By 2020, there is a good possibility that this percentage will be 100.
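The "linear law" comparison can be spelled out in a few lines. This is a sketch of the arithmetic only, under the assumption that coverage grows linearly from 0 % in 2008 to 100 % in 2020; the function name is illustrative.

```python
def linear_expected_coverage(year, start=2008, end=2020):
    """Coverage (in %) expected in `year` if porting progressed linearly
    from 0 % at `start` to 100 % at `end`."""
    return 100.0 * (year - start) / (end - start)


expected = linear_expected_coverage(2016)
observed = 72.0  # Python 3 coverage read off the pie chart

print(f"expected {expected:.2f} %, observed {observed:.0f} %")
print("ahead of the linear law:", observed > expected)
```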
Here’s what I think about the 28 % of packages in this set that are Python 2-only. The small packages (< 100 kB) shouldn’t hold anyone back from Python 3, because if you desperately need them at some point, you should be able to port them yourself without much overhead. A whopping 75 % of the Python 2-only packages are small and easy to port.
The remaining 25 % are the ones you need to worry about. They are huge (> 100 kB). If you need them and there isn’t a Python 3 compatible alternative, then you are stuck with Python 2. That’s why I call them “too sticky packages” (pun intended).
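As a rough sanity check, the percentages can be turned into approximate package counts. The sketch takes the "roughly 6000" figure at face value, so the results are ballpark numbers, not exact counts.

```python
total_packages = 6000     # "roughly 6000" stable and active packages
py2_only_percent = 28     # share lacking Python 3 support
large_percent = 25        # share of the Python 2-only packages above 100 kB

# Integer arithmetic keeps the ballpark figures free of float noise.
py2_only = total_packages * py2_only_percent // 100
sticky = py2_only * large_percent // 100
small = py2_only - sticky

print(f"Python 2-only: ~{py2_only}, small: ~{small}, sticky: ~{sticky}")
```

So the sticky set comes to a few hundred packages, a small fraction of the roughly 6000 considered.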
If you are starting a project in 2017, the safe thing to do would be to go through this list of sticky packages. The list is kinda long (about 400 packages) and is a pain to read right now. I have plans of classifying it later, so keep an eye out on this publication.
If that list has a package which you absolutely require, then bad luck, mate. You are stuck with 2. At least for a while.
But chances are good that you will never need any of the packages in that list. If this is so, you have exactly zero reason to stick to Python 2. You can adopt Python 3 and enjoy all the goodies that come along with it!
- The code for producing the pie charts is in this IPython notebook.
- I classified packages with a size of less than 10 kB as “easily portable”, with a size between 10 and 100 kB as “medium difficulty” and with a size greater than 100 kB as “difficult to port”. Of course, such a classification is purely heuristic.
- The list of sticky packages has two empty columns called “Python 3 compatible alternatives” and “comments”. Feel free to add Python 3 alternatives (if you know them) and your comments in the respective columns.
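For reference, the size heuristic from the notes above fits in one small function. This is a sketch, not the code from the notebook: the function name is made up, and treating exactly 10 kB and exactly 100 kB as "medium difficulty" is my assumption about the boundaries.

```python
def porting_difficulty(size_kb):
    """Classify a package by size in kB, following the heuristic above."""
    if size_kb < 10:
        return "easily portable"
    if size_kb <= 100:  # assumption: both boundary values count as medium
        return "medium difficulty"
    return "difficult to port"


for size in (4, 10, 55, 100, 350):
    print(f"{size:>4} kB -> {porting_difficulty(size)}")
```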
Also, if you have any statistics-related questions about packages on PyPI, don’t hesitate to let me know in the comments. I have the entire metadata of PyPI, ready to analyze, and your answer is just one computation away.
If you liked reading this article, please recommend it by hitting the ❤ button. This will help other people discover it.