Blazing fast Python

Profiling Python applications using py-spy

Dane Hillard
Dec 19, 2018 · 5 min read

Perhaps you’ve faced the fortunate challenge of scaling up a Python application to accommodate a steadily increasing user base. Though most cloud hosting providers make it easier than ever to throw more hardware at a problem, there comes a point when the cost outweighs convenience.

Around the time scaling horizontally starts looking less attractive, developers turn to performance tuning to make applications more efficient. In the Python community there are a number of tools to help in this arena; from the built-in timeit module to profiling tools like cProfile, there are quick ways to test the difference between a particular line of code and any of its alternatives.

Although profiling tools help you see important information about which calls in your application are time consuming, it’s difficult to exercise an application during local development the same way your users exercise it in real life. The solution to bridging this gap? Profile in production!

py-spy

After running py-spy and a bit of post-processing you’ll be able to generate visualizations like this:

The y-axis is the call stack; the x-axis represents the proportion of CPU time spent in a particular call

But how?

$ pip install py-spy

Now that you’ve installed the tool, it should be ready to analyze your Python application. To do so, start by snagging the process ID (PID) of the application you’re interested in. If you’re running a Django application, for instance, this would probably be the PID of one of the WSGI workers running the Django app. After you’ve got a PID handy, you’ll run py-spy with options for:

  1. How long to sample the process, in seconds
  2. How often to sample the stack of the process, in seconds
  3. The name of the output file
  4. The PID of the process
  5. Whether to block the measured process while sampling

This will look something like

$ py-spy record \
--duration 300 \
--rate 100 \
-p <PID> \
--output my-app-20210214T1530.svg \
--nonblocking

py-spy will sample the process with PID <PID> one hundred times per second for five minutes, producing an SVG file containing a browsable FlameGraph.

FlameGraph

Pink segments match the search query

By exploring a flame graph, you can more easily find slow and common calls in your application that aren’t always obvious from poring over loads of raw text. And by identifying these calls, you can start to understand how you might effectively pare things down into a leaner application.

The pink segments above represent instances of a particular method call within our application; this call isn’t particularly expensive, but occurs many times on nearly every request. We realized that while we had code that seemed like it was caching these calls, the TTL was only set to one second. By updating the TTL to ten seconds, we were able to reduce how often this call occurred, which was confirmed in the “after” graph:

Less pink means less overall CPU time spent on matching calls

So what?

Requests/minute in yellow, CPU load in green (the big dip at 10:40 is an artifact between deployments)

This drop in CPU load translated to increased throughput and a significant drop in response time:

Before this change, our application was largely CPU bound. We tune Apache to throttle connections in tandem with this behavior so that instances aren’t overwhelmed. After this change, there was enough headroom to reduce the application’s footprint by two server instances. We’re now connection bound and can look into opening that throttling up a bit, which could allow us to shed another instance or two. Across all our applications, especially those requiring significant computing resources, these kinds of optimizations can lead to thousands in savings annually.

py-spy is not just helpful for identifying opportunities, but also helpful for confirming hypotheses. Optimizing calls that appear frequently/loudly in the flame graph can result in tangible benefits, which can ultimately reduce cost. I recommend adding this tool to your arsenal for performance testing and monitoring.

To see more about flame graphs, py-spy, and profiling tools, check out these resources:

Build Smarter

Where developers, designers, and product people at ITHAKA…

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store