Measuring the performance of your Python program 🐱

Introduction to Profiling

The Research Nest


All images created using DALL·E 3

In super simple terms, there are three important aspects to consider when discussing performance.

  1. Memory/Resource usage
  2. Time taken
  3. Bottlenecks

Performing this analysis is called “Profiling”.

There are several tools and libraries for profiling Python code. Some popular options:

  1. cProfile: Built-in Python module that can measure the time spent in each function and how many times each function is called.
  2. timeit: Used for measuring the execution time of small code snippets.
  3. line_profiler: Third-party package that provides more fine-grained information about which lines consume the most time.
  4. memory_profiler: Another third-party package that helps you identify memory leaks and memory-intensive parts of your code.

Let’s explore the intricacies through a fun tutorial featuring Kitty, a tech-savvy kitten on a mission to optimize her magical spell-casting code.

Chapter 1: Using cProfile

Kitty wants to cast a magic spell, and it involves several steps. Let’s assume that each step is a function that takes some random amount of time to execute.

import time
import random

def prepare_ingredients():
    # Simulate gathering magical ingredients
    time.sleep(random.uniform(0.5, 1.5))

def cast_incantation():
    # Simulate casting an incantation
    time.sleep(random.uniform(0.1, 0.5))

def perform_calculation():
    # Simulate a complex magical calculation
    result = sum([i for i in range(100000)])
    return result

def cast_spell():
    prepare_ingredients()
    cast_incantation()
    result = perform_calculation()
    return result

The cast_spell function involves three function calls as described above.

Here’s how we can use cProfile to measure this function.

import cProfile"cast_spell()")

# Output below on running in Google Colab
         13 function calls in 1.245 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 0.009 0.009 <ipython-input-9-b58a49c8abaf>:12(perform_calculation)
1 0.008 0.008 0.008 0.008 <ipython-input-9-b58a49c8abaf>:14(<listcomp>)
1 0.000 0.000 1.245 1.245 <ipython-input-9-b58a49c8abaf>:17(cast_spell)
1 0.000 0.000 1.001 1.001 <ipython-input-9-b58a49c8abaf>:4(prepare_ingredients)
1 0.000 0.000 0.234 0.234 <ipython-input-9-b58a49c8abaf>:8(cast_incantation)
1 0.000 0.000 1.245 1.245 <string>:1(<module>)
1 0.000 0.000 0.000 0.000
1 0.000 0.000 1.245 1.245 {built-in method builtins.exec}
1 0.001 0.001 0.001 0.001 {built-in method builtins.sum}
2 1.235 0.618 1.235 0.618 {built-in method time.sleep}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 {method 'random' of '_random.Random' objects}

cProfile output gives us the following information:

  1. 13 function calls in 1.245 seconds: The profiler recorded 13 function calls in total (not 13 distinct functions — some functions are called more than once), and the whole run took 1.245 seconds.
  2. ncalls: This shows how many times each step was called. Think of it as how many times you repeated a step.
  3. tottime: This is the total time spent inside each step, just for that step alone.
  4. percall: This is the average time spent per call (per repetition) for each step. If you did a step multiple times, this tells you how long, on average, each repetition took.
  5. cumtime: This is the total time spent in a step, including the time spent in the steps it called.
  6. filename:lineno(function): This shows where the step is located in the program.

For example, let’s look at the first row:

  • perform_calculation was called once.
  • It spent 0.001 seconds doing its own work.
  • On average, each call took 0.001 seconds.
  • In total, including the steps it called, it took 0.009 seconds.
  • This step is defined in the program at line 12 in a function called perform_calculation.

Here are a few more examples:

  • prepare_ingredients took 1.001 seconds in total, including the steps it called. So, it's the most time-consuming step.
  • cast_incantation spent 0.234 seconds in total, including the steps it called.
  • time.sleep is a built-in function for pausing the program, and it took a total of 1.235 seconds.

This report helps a programmer like Kitty identify which parts of her program take the most time and might need optimization 🍳🕒!

In a similar fashion, you can profile any large function or program and identify parts of the code that require improvements.
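For larger programs, the raw table quickly becomes unwieldy. The standard-library pstats module lets you sort and truncate a cProfile report programmatically. Here is a minimal sketch (the function slow_step is a placeholder standing in for any real workload):

```python
import cProfile
import io
import pstats

def slow_step():
    # Placeholder workload standing in for a real spell-casting step
    return sum(i * i for i in range(200000))

profiler = cProfile.Profile()
profiler.enable()
slow_step()
profiler.disable()

# Sort the report by cumulative time and keep only the top 5 rows
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Sorting by "cumulative" surfaces the functions at the top of expensive call chains, which is usually what you want to optimize first.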

Chapter 2: Using timeit

We can use timeit to measure the execution time of individual functions or lines of code.

import timeit

# Example of using timeit
execution_time = timeit.timeit("cast_spell()", setup="from __main__ import cast_spell", number=100)
print(f"Average execution time: {execution_time / 100}")

# Output
Average execution time: 1.32108445158

This code snippet measures the average execution time of cast_spell over 100 runs. The setup parameter imports the function into timeit's scope. Doing this lets us determine if a particular code snippet takes too long and needs optimization.
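A single timing can be noisy, so timeit also provides repeat, which runs the whole measurement several times so you can take the fastest (least noisy) trial. A small sketch, using a stand-in function tiny_spell:

```python
import timeit

def tiny_spell():
    # A small stand-in function; replace with your own code to measure
    return sum(range(1000))

# Run the snippet 1000 times per trial, for 5 trials
times = timeit.repeat(tiny_spell, number=1000, repeat=5)

# The minimum is the best estimate; slower trials reflect system noise
best = min(times) / 1000  # average seconds per call in the fastest trial
print(f"Best average per call: {best:.2e} seconds")
```

Taking the minimum rather than the mean is the convention recommended in the timeit documentation, since slower runs are usually caused by other processes, not by your code.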

Chapter 3: Using Line Profiler

Assume that Kitty is dealing with a huge codebase and wants to understand where time is spent within a single function, perform_calculation(). We can use line_profiler for this.

# Install line_profiler: pip install line_profiler

from line_profiler import LineProfiler

def perform_calculation():
    # Simulate a complex magical calculation
    result = sum([i for i in range(100000)])
    return result

lp = LineProfiler()
lp_wrapper = lp(perform_calculation)

# Output
Timer unit: 1e-09 s

Total time: 0.0262776 s
File: <ipython-input-15-d5f84c8ead05>
Function: perform_calculation at line 14

Line # Hits Time Per Hit % Time Line Contents
14 def perform_calculation():
15 # Simulate a complex magical calculation
16 1 26277179.0 3e+07 100.0 result = sum([i for i in range(100000)])
17 1 408.0 408.0 0.0 return result

Here, you can see the time taken by each line of the given function. Since the function body has only two lines, the report is correspondingly short.

Below is the result for cast_spell function.

lp = LineProfiler()
lp_wrapper = lp(cast_spell)

# Output
Timer unit: 1e-09 s

Total time: 1.30488 s
File: <ipython-input-16-23df80fcd21b>
Function: cast_spell at line 19

Line # Hits Time Per Hit % Time Line Contents
19 def cast_spell():
20 1 1001099708.0 1e+09 76.7 prepare_ingredients()
21 1 286475374.0 3e+08 22.0 cast_incantation()
22 1 17307760.0 2e+07 1.3 result = perform_calculation()
23 1 672.0 672.0 0.0 return result

You can see the % time for each line inside the function. prepare_ingredients() takes more than 75% of the total time.

Per Hit implies the average time spent per hit (execution) of each line in nanoseconds. Since each line was executed only once, the ‘Time’ and ‘Per Hit’ values are the same.

Chapter 4: Using Memory Profiler

Memory profiling is used to identify memory leaks and heavy memory usage in our code. It is crucial when working with large data sets or long-running applications, where inefficient memory usage can lead to crashes or slowdowns.

We can use the memory_profiler package, which provides insights into memory consumption. Here's how we can use it in the spell-casting code in a Jupyter Notebook (like Google Colab):

# !pip install memory_profiler

%load_ext memory_profiler

def cast_spell():
    return "Spell cast successfully!"

%memit cast_spell()
peak memory: 111.12 MiB, increment: 0.02 MiB

Here’s what these numbers mean.

Peak Memory (111.12 MiB):

  • This is the maximum amount of memory your code used at any point during its execution.
  • It is measured in Mebibytes (MiB), where 1 MiB is approximately 1.049 Megabytes (MB).
  • The “peak memory” is a vital metric as it indicates the highest memory load that your system had to handle while running the code. It can help you with resource planning.
  • If the peak memory usage is too high, it could lead to memory exhaustion or swapping, causing performance issues.

Increment (0.02 MiB):

  • This value represents the increase in memory consumption that occurred due to the execution of your code.
  • An increment of 0.02 MiB means that when compared to the memory usage before the code execution started, the code used an additional 0.02 MiB at its peak.
  • This incremental measurement helps understand the additional memory burden the specific code block imposes.
  • Low increment values are generally desirable, as they indicate that the code is not excessively increasing memory usage, which is particularly important in memory-critical applications.

To access more features of the memory profiler, you will need to run it directly on Python files rather than inside the IPython kernel of a notebook.


Whether it’s identifying time-consuming functions, drilling down to specific lines, or pinpointing memory issues, profiling is valuable for any wizard or programmer.

I hope you learned something new. Until next time!



