Build a PyPI analytics web app with 75 lines of Python

Ashish Singal
Pycob
3 min read · Mar 1, 2023


Every Python developer uses PyPI whenever they do pip install xyz. Here, we explore how Python engineers can use PyPI analytics to discover the right packages for their work. Using Pycob, we’ll create a fully functional multi-page PyPI analytics web app in just 75 lines of Python code.

Links: Live App | GitHub | YouTube | Pycob | Google Slides

Use cases

Every Python programmer knows PyPI as the package index behind “pip install.” PyPI makes its download data available through Google BigQuery’s Public Dataset Program, and we can use this data to build analytics.

  1. Engineers can discover new trending Python packages to check out
  2. Engineers can choose between competing Python packages by checking trends
  3. Builders can monitor the performance of their own packages
  4. Investors can identify breakout Python packages and the companies behind them

App demo

When you land on PyPI Analytics, you see a search box where you can look up any Python library, along with a ranking of the 50 most popular Python packages.

If you click through to any package, you’ll see that package’s downloads over the last year. For example, here’s the page for boto3.

Architecture

Let’s dive into how we build this app from scratch. First, we need to grab the data and turn it into something that Pycob can consume. We do this in this Jupyter Notebook.

  1. Raw data in BigQuery: The PyPI data exists in BigQuery as a public dataset. The schema is below. However, it’s really unwieldy to use by itself.
  2. Data eng in Jupyter: We’ve spun up a Jupyter Notebook to do some basic data engineering.
  3. Save to pickle: We’ll save the resulting analysis, which collates all packages and daily downloads over the last month, to a pickle that we store in the Pycob cloud.
  4. Load into Pycob: Finally, the Pycob web app picks up the pickle and displays it using custom cuts.
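As a rough illustration of steps 2 through 4, here is a minimal pandas sketch that collapses raw download events into daily counts per package, writes the result to a pickle, and reads it back. The sample rows, the column names (project, timestamp, downloads), and the local temp file are assumptions made for the example; the real notebook reads from BigQuery and stores the pickle in the Pycob cloud rather than on local disk.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for rows pulled from the BigQuery public dataset:
# one row per download event.
raw = pd.DataFrame({
    "project": ["boto3", "boto3", "requests", "boto3", "requests"],
    "timestamp": pd.to_datetime([
        "2023-02-01 08:00", "2023-02-01 17:30", "2023-02-01 09:15",
        "2023-02-02 11:00", "2023-02-02 12:45",
    ]),
})


def daily_downloads(events: pd.DataFrame) -> pd.DataFrame:
    """Collapse download events into one row per (project, date)."""
    return (
        events.assign(date=events["timestamp"].dt.normalize())
        .groupby(["project", "date"])
        .size()
        .reset_index(name="downloads")
    )


daily = daily_downloads(raw)

# Save the aggregated table to a pickle, then load it back the way the
# web app would (a local temp file is used here purely for illustration).
pickle_path = os.path.join(tempfile.mkdtemp(), "pypi_daily.pkl")
daily.to_pickle(pickle_path)
loaded = pd.read_pickle(pickle_path)
```

Aggregating before pickling keeps the file small: the app only ever needs counts per package per day, not individual download events.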

Now that we have the data in Pycob, let’s see how our Pycob application is structured. The entire app (as of this writing) is only 75 lines of code — and all in 100% pure Python!

We have two main pages — home and project details.

On the home page, we set the header, read in the pandas DataFrame and aggregate it, and display the results in a Pycob table with a link from each package to its detail page.
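The aggregation behind that ranking could look something like the sketch below, which uses plain pandas only. The column names, the sample numbers, and the detail_url path format are hypothetical, and the Pycob calls that render the table are left out because they are not shown here.

```python
import pandas as pd

# Hypothetical daily download counts, as loaded from the pickle.
daily = pd.DataFrame({
    "project": ["boto3", "requests", "boto3", "numpy", "requests"],
    "downloads": [120, 80, 150, 60, 90],
})


def top_packages(df: pd.DataFrame, n: int = 50) -> pd.DataFrame:
    """Total downloads per project, highest first, limited to the top n."""
    ranked = (
        df.groupby("project", as_index=False)["downloads"].sum()
        .sort_values("downloads", ascending=False)
        .head(n)
        .reset_index(drop=True)
    )
    # Assumed URL scheme for the detail page; the real app's route may differ.
    ranked["detail_url"] = "/project_detail?project_name=" + ranked["project"]
    return ranked


top = top_packages(daily)
```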

On the project details page, we show the number of downloads per month by selecting that project’s rows from the DataFrame. The page also accepts two form inputs, subtitle and compare_to, so users can add a custom title and compare the project against another one.
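The row selection and the optional comparison can be sketched in pandas as follows. The function name compare_downloads, the column names, and the sample data are assumptions for illustration; only the compare_to parameter name comes from the app itself, and the charting step is omitted.

```python
from typing import Optional

import pandas as pd

# Hypothetical daily download counts, as loaded from the pickle.
daily = pd.DataFrame({
    "project": ["boto3", "boto3", "requests", "requests", "numpy"],
    "date": pd.to_datetime([
        "2023-02-01", "2023-02-02", "2023-02-01", "2023-02-02", "2023-02-01",
    ]),
    "downloads": [120, 150, 80, 90, 60],
})


def compare_downloads(
    df: pd.DataFrame, project_name: str, compare_to: Optional[str] = None
) -> pd.DataFrame:
    """Return downloads by date, one column per selected project."""
    names = [project_name] + ([compare_to] if compare_to else [])
    selected = df[df["project"].isin(names)]
    wide = selected.pivot_table(
        index="date", columns="project", values="downloads", aggfunc="sum"
    )
    # Keep the requested project first, then the comparison project.
    return wide.reindex(columns=names)


single = compare_downloads(daily, "boto3")
both = compare_downloads(daily, "boto3", compare_to="requests")
```

Returning one column per project keeps the result in a shape that most charting tools can plot directly as one line per package.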

That’s it! Woohoo!
