Thoughts on Microsoft’s New Python in Excel Feature [August 2023]
Last week, Microsoft announced its new Python in Excel feature, which will let users write Python code directly within Excel.
As a big fan of both Python and Excel, I have five initial thoughts on this integration:
1. Is this a good idea?
In my opinion, yes.
As a data scientist, I’ve been using both Excel and Python in my data science workflow for over a decade.
Each tool has its advantages. When I’m doing data analysis, I typically use Excel for quick, visual analysis (conditional formatting, creating plots, etc.) and Python for all the heavy lifting (data manipulation, applying algorithms, etc.).
With this integration, instead of switching back and forth between Excel and Python, everything can be done in one place. For me, the fewer tools I need to toggle between, the better.
2. Advanced machine learning in Excel
At the moment, machine learning is largely reserved for people who know how to work with Python, R, etc. By bringing Python into Excel, Microsoft is giving Excel users access to the world of machine learning.
For example, a data scientist can train a model in Python, and data analysts will be able to use that model to make predictions within Excel without any additional technical skills. 🤯
On the flip side, I’ve been a data science instructor for many years, and one thing I find is that students often interpret models incorrectly when they’re first starting out.
So my advice to Excel users who end up using Python in Excel is this: it’s very important to learn how machine learning algorithms work so that you can interpret their results correctly. Don’t fall into the danger zone!
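To make that hand-off a bit more concrete, here is a minimal sketch of the "train once, predict later" workflow in plain Python. I'm assuming scikit-learn is available, and the ad spend and sales numbers are made up for illustration. This is not Excel-specific code, just the general pattern a data scientist and an analyst might share.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: ad spend (input) and sales (target).
# The numbers are made up and follow sales = 2 * spend + 10 exactly.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([12.0, 14.0, 16.0, 18.0])

# Step 1 (data scientist): train the model once.
model = LinearRegression().fit(X_train, y_train)

# Step 2 (analyst): feed in new values and read off the predictions.
new_spend = np.array([[5.0], [6.0]])
predictions = model.predict(new_spend)
print(predictions)
```

Even in a toy example like this, the predictions only mean something if you know what the model assumes (here, a straight-line relationship), which is exactly why interpretation matters.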
3. Python’s pandas library in Excel
pandas is a popular Python library for data manipulation, and it’s extremely powerful.
There are a few things that would take me a number of steps to do in Excel, but that I could easily accomplish using a single line of code in Python using pandas. For example:
- Grouping data and returning the first row of each group →
df.groupby("column").head(1)
- Unpivoting data from a wide to a long format →
pd.melt()
As a Python user, these are things I’ve been wishing I could do with my data in Excel. It’s pretty cool that it might finally be possible. I think that bringing pandas into Excel is one of the biggest advantages of this integration.
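Here is a small sketch of both pandas one-liners mentioned above, run on a hypothetical sales table. The column names and values are made up for illustration, and this is ordinary pandas code rather than anything specific to Excel.

```python
import pandas as pd

# Hypothetical wide-format sales data (made up for illustration).
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "rep": ["Ana", "Ben", "Cal", "Dee"],
    "q1": [100, 80, 120, 90],
    "q2": [110, 85, 115, 95],
})

# Grouping and returning the first row of each group.
# head(1) keeps one row per region; head() with no argument keeps up to five.
first_rows = sales.groupby("region").head(1)

# Unpivoting from wide to long: one row per (region, rep, quarter).
long_sales = pd.melt(
    sales,
    id_vars=["region", "rep"],
    value_vars=["q1", "q2"],
    var_name="quarter",
    value_name="amount",
)

print(first_rows)
print(long_sales)
```

Note that head(1) returns the original rows with their original index, which is handy when you want to keep every column as-is.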
4. Python visualizations in Excel
In the Python in Excel announcement video that Microsoft released last week, they mention the power of Python data visualizations in Excel multiple times.
Here’s the thing — I really dislike creating data visualizations within Python. The code is lengthy and difficult to tweak. It’s hard to do simple things like change the position of a label, vary the colors, etc.
I actually like to export my data from Python and import it into Excel or Tableau to create visualizations! Personally, I think the plots are easier to create and look nicer coming from these other tools.
My bet is that Excel users will think the same when they try to create complex Python visualizations within Excel.
5. Python calculations in the cloud
Another thing I noticed in the announcement video was a brief mention that all Python in Excel calculations are going to run in the cloud.
Typically, when you write Python code, it runs locally, meaning on your own computer, and it runs pretty quickly.
In the past, when I’ve run Python code in the cloud, it’s taken a bit longer. I used to teach students how to write Python code in Jupyter Notebook locally and Google Colab in the cloud. With Colab, usually things were fine, but if there were network connectivity issues or a larger data set, the code would take longer to run.
So when I heard that Python calculations in Excel are going to be done in the cloud, it made me hesitate. This is something I’d really like to test out to see if there are actually going to be speed issues.
Final Thoughts
Right now, Python in Excel isn’t available to everyone. A subset of users in Microsoft’s Beta Channel can test it out, and it’ll eventually be available to Windows users.
As a Mac user, I’ll have to wait a bit longer. But for the union of two of my favorite data tools, I’m pretty sure it’ll be worth the wait. 😎