Measuring the performance of your Python program
Introduction to Profiling
In super simple terms, there are three important aspects to consider when discussing performance.
- Memory/Resource usage
- Time taken
- Bottlenecks
Performing this analysis is called "profiling".
There are several tools and libraries for profiling Python code. Some popular options:
- cProfile: Built-in Python module that can measure the time spent in each function and how many times each function is called.
- timeit: Used for measuring the execution time of small code snippets.
- line_profiler: Third-party package that provides more fine-grained information about which lines consume the most time.
- memory_profiler: Another third-party package that helps you identify memory leaks and memory-intensive parts of your code.
Let's explore the intricacies through a fun tutorial featuring Kitty, a tech-savvy kitten on a mission to optimize her magical spell-casting code.
Chapter 1: Using cProfile
Kitty wants to cast a magic spell, and it involves several steps. Let's assume that each step is a function that takes some random amount of time to execute.
import time
import random

def prepare_ingredients():
    # Simulate gathering magical ingredients
    time.sleep(1)

def cast_incantation():
    # Simulate casting an incantation
    time.sleep(random.uniform(0.1, 0.5))

def perform_calculation():
    # Simulate a complex magical calculation
    result = sum([i for i in range(100000)])
    return result

def cast_spell():
    prepare_ingredients()
    cast_incantation()
    result = perform_calculation()
    return result
The cast_spell function involves three function calls as described above.
Here's how we can use cProfile to measure this function.
import cProfile
cProfile.run("cast_spell()")
# Output from running in Google Colab
         13 function calls in 1.245 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
        1    0.001    0.001    0.009    0.009  <ipython-input-9-b58a49c8abaf>:12(perform_calculation)
        1    0.008    0.008    0.008    0.008  <ipython-input-9-b58a49c8abaf>:14(<listcomp>)
        1    0.000    0.000    1.245    1.245  <ipython-input-9-b58a49c8abaf>:17(cast_spell)
        1    0.000    0.000    1.001    1.001  <ipython-input-9-b58a49c8abaf>:4(prepare_ingredients)
        1    0.000    0.000    0.234    0.234  <ipython-input-9-b58a49c8abaf>:8(cast_incantation)
        1    0.000    0.000    1.245    1.245  <string>:1(<module>)
        1    0.000    0.000    0.000    0.000  random.py:546(uniform)
        1    0.000    0.000    1.245    1.245  {built-in method builtins.exec}
        1    0.001    0.001    0.001    0.001  {built-in method builtins.sum}
        2    1.235    0.618    1.235    0.618  {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000  {method 'random' of '_random.Random' objects}
cProfile output gives us the following information:
- 13 function calls in 1.245 seconds: The profiler recorded 13 function calls in total (the same function can account for several calls, e.g. time.sleep was called twice), and the whole run took 1.245 seconds.
- ncalls: This shows how many times each step was called. Think of it as how many times you repeated a step.
- tottime: This is the total time spent inside each step, just for that step alone.
- percall: This is the average time spent per call (per repetition) for each step. It appears twice: the first column is tottime divided by ncalls, and the second is cumtime divided by ncalls.
- cumtime: This is the total time spent in a step, including the time spent in the steps it called.
- filename:lineno(function): This shows where the step is located in the program.
For example, let's look at the first row:
- perform_calculation was called once.
- It spent 0.001 seconds doing its own work.
- On average, each call took 0.001 seconds.
- In total, including the steps it called, it took 0.009 seconds.
- This step is defined at line 12 of the program, in a function called perform_calculation.
Here are a few more examples:
- prepare_ingredients took 1.001 seconds in total, including the steps it called, so it's the most time-consuming step.
- cast_incantation spent 0.234 seconds in total, including the steps it called.
- time.sleep is a built-in function for pausing the program, and it took a total of 1.235 seconds across its two calls.
This report helps a programmer like Kitty identify which parts of her program take the most time and might need optimization!
In a similar fashion, you can profile any large function or program and identify parts of the code that require improvements.
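By default, cProfile.run prints rows in alphabetical order, which buries the hot spots in a large program. The standard library's pstats module can sort and trim the report instead. Below is a minimal sketch using simplified stand-ins for the spell functions (shorter sleeps, so it runs quickly):

```python
import cProfile
import io
import pstats
import time

def prepare_ingredients():
    # Stand-in for the slow step, with a shorter sleep
    time.sleep(0.05)

def cast_spell():
    prepare_ingredients()
    return sum(i for i in range(100000))

# Collect stats into a Profile object instead of printing immediately
profiler = cProfile.Profile()
profiler.enable()
cast_spell()
profiler.disable()

# Sort by cumulative time and show only the top 5 rows
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Sorting by "cumulative" puts the entry points at the top, which is usually the fastest way to see where a big program spends its time.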
Chapter 2: Using timeit
We can use timeit to measure the execution time of individual functions or lines of code.
import timeit

# Time 100 runs of cast_spell and report the average
execution_time = timeit.timeit("cast_spell()", setup="from __main__ import cast_spell", number=100)
print(f"Average execution time: {execution_time / 100}")

# Output
Average execution time: 1.32108445158
This code snippet measures the average execution time of cast_spell over 100 runs. The setup parameter imports the function into timeit's namespace. Doing this lets us determine whether any particular code snippet takes too long and needs optimization.
Chapter 3: Using Line Profiler
Assume that Kitty is dealing with a huge codebase and wants to understand how much time one function takes. Kitty wants to know where time is spent within perform_calculation(). We can use line_profiler for this.
# Install line_profiler first: pip install line_profiler
from line_profiler import LineProfiler

def perform_calculation():
    result = sum([i for i in range(100000)])
    return result

lp = LineProfiler()
lp_wrapper = lp(perform_calculation)
lp_wrapper()
lp.print_stats()
Timer unit: 1e-09 s

Total time: 0.0262776 s
File: <ipython-input-15-d5f84c8ead05>
Function: perform_calculation at line 14

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    14                                           def perform_calculation():
    15                                               # Simulate a complex magical calculation
    16         1   26277179.0    3e+07    100.0      result = sum([i for i in range(100000)])
    17         1        408.0    408.0      0.0      return result
Here, you can see the time taken by each line of the function. Since our function has just two executable lines, only those two show timings.
Below is the result for the cast_spell function.

lp = LineProfiler()
lp_wrapper = lp(cast_spell)
lp_wrapper()
lp.print_stats()
Timer unit: 1e-09 s

Total time: 1.30488 s
File: <ipython-input-16-23df80fcd21b>
Function: cast_spell at line 19

Line #      Hits          Time  Per Hit   % Time  Line Contents
==============================================================
    19                                            def cast_spell():
    20         1  1001099708.0    1e+09     76.7      prepare_ingredients()
    21         1   286475374.0    3e+08     22.0      cast_incantation()
    22         1    17307760.0    2e+07      1.3      result = perform_calculation()
    23         1         672.0    672.0      0.0      return result
You can see the % Time for each line inside the function: prepare_ingredients() takes more than 75% of the total time.
Per Hit is the average time spent per hit (execution) of each line, in nanoseconds. Since each line was executed only once, the "Time" and "Per Hit" values are the same.
Chapter 4: Using Memory Profiler
Memory profiling is used to identify memory leaks and heavy memory usage in our code. It is crucial when working with large data sets or long-running applications, where inefficient memory usage can lead to crashes or slowdowns.
We can use the memory_profiler package, which provides insights into memory consumption. Here's how we can use it with the spell-casting code in a Jupyter notebook (such as Google Colab):
# !pip install memory_profiler
%load_ext memory_profiler

def cast_spell():
    prepare_ingredients()
    cast_incantation()
    perform_calculation()
    return "Spell Casted Successfully!"

%memit cast_spell()

# Output
peak memory: 111.12 MiB, increment: 0.02 MiB
Here's what these numbers mean.
Peak Memory (111.12 MiB):
- This is the maximum amount of memory your code used at any point during its execution.
- It is measured in Mebibytes (MiB), where 1 MiB is approximately 1.049 Megabytes (MB).
- The "peak memory" is a vital metric as it indicates the highest memory load that your system had to handle while running the code. It can help you with resource planning.
- If the peak memory usage is too high, it could lead to memory exhaustion or swapping, causing performance issues.
Increment (0.02 MiB):
- This value represents the increase in memory consumption that occurred due to the execution of your code.
- An increment of 0.02 MiB means that when compared to the memory usage before the code execution started, the code used an additional 0.02 MiB at its peak.
- This incremental measurement helps understand the additional memory burden the specific code block imposes.
- Low increment values are generally desirable, as they indicate that the code is not excessively increasing memory usage, which is particularly important in memory-critical applications.
To access more features of the memory profiler, you will need to run it directly on Python files rather than from the IPython kernel of a notebook.
Whether it's identifying time-consuming functions, drilling down to specific lines, or pinpointing memory issues, profiling is valuable for any wizard or programmer.
I hope you learned something new. Until next time!