
Python for Product Managers

Product Management can be a challenging role, with demands pulling you in many different directions. So anything we can do to minimise time spent on repetitive tasks, or to maximise the value we get from data, is worth capitalising on to the full.

Python is a popular open-source programming language which is relatively easy to learn. It can help you automate the boring stuff!

In this series of posts, I’ll show a couple of examples of how mastering Python can free up valuable time and provide deeper insights to you as a Product Manager.

Step one — Install Python

Installing Python depends on your Operating System and your requirements.

Recommended — use the Anaconda distribution

Anaconda is a distribution which contains Python and many of the most common and popular additional packages that you might need. It also gives you a clear and sane way of adding new packages and keeping stuff up to date. It’s probably the simplest way to get started, and it’s widely used, so there’s lots of help out there.

Installation instructions can be found online and are easy to follow.

If you use Anaconda, you’ll want to familiarise yourself with the conda command for package management. Again, there are good instructions available online.

You can search for packages:

conda search <package>

You can install a package:

conda install <package>

and you can update a package:

conda update <package>

Full details here…

Also good but maybe a bit more complicated — install from Python.org

Python.org is the home of the Python programming language. There are downloads available for most Operating Systems and the installation process is still relatively straightforward.

If you’re using Python without the Anaconda distribution, then pip is the package manager you’ll use.

Step two — Jupyter Notebook

There are several ways to run a Python program. Most simply, a text file with a .py extension can be saved and then run with the Python interpreter. Python is an interpreted language, so rather than compiling your code to an executable file (e.g. MyApp.exe) the scripts are interpreted at run time. This makes it good for experimenting!
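As a minimal sketch of that first approach, the snippet below could be saved as a text file and run from a terminal with the Python interpreter. The file name hello.py and the greeting function are made up for illustration.

```python
# hello.py - a hypothetical file name used for illustration


def greeting(audience):
    """Build a greeting string for the given audience."""
    return f"Hello {audience}"


if __name__ == "__main__":
    # This part only runs when the file is executed as a script,
    # for example with: python hello.py
    print(greeting("Product Managers"))
```

There is no compile step here: edit the file, run it again, and the change takes effect straight away.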

To make the whole process even more immediate, a web-based interactive “notebook” is available through the Jupyter project. This allows you to write and execute Python scripts via your browser in an intuitive way which lends itself to nicely documented, repeatable experiments.

If you’ve used Anaconda, you should have Jupyter installed already. If not, installation instructions are available.

To start the browser based notebook, run the following:

jupyter notebook

All being well, a web browser will open up showing you a file / directory browser.

Image for post

Add a new notebook for one of the Python versions you have available (e.g. Python 3 for me in the above screenshot).

Just to prove it’s all working, let’s run some Python code. Type or paste the following into the cell:

print("Hello Product Managers")

Now, whilst the cell is selected, press ctrl-enter and you should see the output of the code below the cell.

Notebook keyboard shortcuts

There is, of course, a lovely toolbar and menu structure. But for speed, there are a number of keyboard shortcuts available which make life easier. Here are a few to get you started:

  • ctrl-enter - Run current cell

That’s probably enough to get started! A few shortcuts go a long way! There are many more shortcuts to learn though, once you’re up and running.

Step three — learn Python

Now, I’m not best placed to teach you Python and its basic syntax. Many people are better equipped to do that, and there is a wealth of information, courses and tutorials out there which will give you a good grounding.

Some pointers to great resources for learning Python:

Or, you can grab a book!

Step four — Python modules for data analysis

So, I kind of shirked my responsibilities in the last step, didn’t I! Well, to make up for it, assuming you know some basic Python, here’s some stuff that will make it all seem worthwhile. And if you haven’t learned yet, hopefully this will inspire you to get stuck in!

Pandas is a great module that helps you load, transform and analyse data quickly in Python. It’s definitely a module to explore if, like me, you spend a lot of time looking at data and asking questions.

Specifically, Pandas is a useful tool in your workflow for the following common data analysis tasks:

  • Loading data from CSV files or directly from a database

We’ll work through examples of each of these in the next instalment, but to make sure you’re hooked, here’s a little example. Imagine we’ve obtained a tab-delimited text file with data about sessions for our SaaS application. Each row represents one session, with columns for:

  • A unique ID

My sample file has several years’ worth of data and is about 100 MB in size. Let’s have a look at what we can do with that!
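If you don’t have a file like this to hand, a rough stand-in can be generated with Python’s standard library. This is only a sketch with made-up data: the three column names (id, start_datetime, end_datetime) are taken from the code in this post, while the function name, the date range and the session lengths are assumptions chosen for illustration.

```python
import csv
import io
import random
from datetime import datetime, timedelta


def write_sample_sessions(out, n_rows=1000, seed=42):
    """Write n_rows of made-up session data to a file-like object as tab-delimited text."""
    rng = random.Random(seed)  # fixed seed so the same data is produced every time
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "start_datetime", "end_datetime"])
    first_day = datetime(2017, 1, 1)  # arbitrary starting point for the fake data
    for session_id in range(1, n_rows + 1):
        # Random start within roughly two years, lasting between 1 and 120 minutes
        start = first_day + timedelta(minutes=rng.randrange(2 * 365 * 24 * 60))
        end = start + timedelta(minutes=rng.randint(1, 120))
        writer.writerow([session_id, start.isoformat(sep=" "), end.isoformat(sep=" ")])


# Write three rows to an in-memory buffer and print them to check the format
buffer = io.StringIO()
write_sample_sessions(buffer, n_rows=3)
print(buffer.getvalue())
```

To create a file on disk instead, pass in a file opened for writing, for example open('monthly_sessions.csv', 'w', newline='').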

# Show graphs and charts inline in the notebook
%matplotlib inline
# Import some libraries
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
# Set some default plot styles to make things look nice
matplotlib.rcParams['figure.figsize'] = (20.0, 10.0)
plt.style.use('bmh')
# Import our session data from a tab delimited text file, making sure the start and end dates are loaded as dates
df = pd.read_csv('monthly_sessions.csv', sep='\t', parse_dates=['end_datetime', 'start_datetime'])
# Show the top few rows of data
df.head()
Image for post
# Let's see how many sessions we've loaded from the file
len(df.index)
1293267
# Let's count sessions per month and plot to see the trend
df.groupby([df.start_datetime.dt.year, df.start_datetime.dt.month]).agg('count').id.plot()
plt.title('Sessions per month')
Image for post
# Let's calculate the duration of each session
df['session_duration'] = df.end_datetime - df.start_datetime
# And then look at some stats for session_duration
df.describe()
Image for post
# Let's look at the number of sessions by day of week
df.start_datetime.dt.weekday.hist(bins=[0,1,2,3,4,5,6,7])
<matplotlib.axes._subplots.AxesSubplot at 0x7feea73a93c8>
Image for post
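The weekday numbers on that histogram run from 0 (Monday) to 6 (Sunday), which isn’t obvious at a glance. As an alternative sketch, not the approach used above, Pandas can label each session with the day’s name and count from there. The handful of start times below are made up for illustration.

```python
import pandas as pd

# Made-up start times standing in for the real session data
starts = pd.Series(pd.to_datetime([
    "2019-02-11 09:00",  # a Monday
    "2019-02-11 15:30",  # the same Monday
    "2019-02-14 10:00",  # a Thursday
    "2019-02-16 20:15",  # a Saturday
]))

day_order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Count sessions per named day, then put the days in calendar order
# (days with no sessions are filled in as 0)
sessions_by_day = (
    starts.dt.day_name()
    .value_counts()
    .reindex(day_order, fill_value=0)
)
print(sessions_by_day)
```

With the real data, df.start_datetime would take the place of the starts series, and calling .plot(kind='bar') on the result should give a labelled bar chart.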

Hopefully that gives you a glimpse into how you can load, manipulate and visualise data.

But I can do that in Excel!

Somebody, somewhere

Now, you could do that in Excel. Where Python comes in super useful is that (depending on your PC specs) it can handle larger files more quickly, and that it’s super easy to repeat your analysis and even automate it. Imagine a slightly more complex version of the above analysis, pulling data from several databases or files: you can re-run the whole thing at the click of a button, obtaining new data and presenting new outputs.
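One way to make an analysis repeatable is to wrap it in a function that can be pointed at any new export. The sketch below is one possible shape for that, under a few assumptions: the function name sessions_per_month is made up, the input has the same three columns as the sample file, and it groups by monthly period with dt.to_period('M') rather than by separate year and month as in the earlier example.

```python
import io

import pandas as pd


def sessions_per_month(source):
    """Load tab-delimited session data and return the number of sessions per calendar month."""
    df = pd.read_csv(source, sep="\t", parse_dates=["start_datetime", "end_datetime"])
    # Convert each start time to a monthly period (e.g. 2019-02) and count the rows in each
    months = df["start_datetime"].dt.to_period("M")
    return df.groupby(months)["id"].count()


# Tiny made-up data set standing in for a real export
sample = io.StringIO(
    "id\tstart_datetime\tend_datetime\n"
    "1\t2019-01-05 09:00:00\t2019-01-05 09:30:00\n"
    "2\t2019-01-20 14:00:00\t2019-01-20 14:10:00\n"
    "3\t2019-02-01 08:00:00\t2019-02-01 08:45:00\n"
)
counts = sessions_per_month(sample)
print(counts)
```

Because pd.read_csv accepts a file path as well as a file-like object, the same function can be called with the path of next month’s export to refresh the numbers.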

In the next instalment, we’ll look in more detail at some common tasks and how to achieve them with Python & Pandas.

If you’ve got any comments, suggestions or requests then please let me know in the comments.

Read the rest of the series

Follow the full series of posts to master Python!

  • Part 1 : Installing and setting up Python, Pandas and Jupyter

Originally published at productmetrics.net on February 14, 2019.

Written by

Hey, I’m Joshua, a Product Manager and data fan. This is my blog and a place to post random musings and tutorials.
