Testing your code for speed and efficiency is a crucial aspect of software development. When code takes too long or consumes too much of a resource like memory or CPU, you can quickly run into a wide range of issues: the machines your code runs on can become unstable, your code can produce unintended side effects, and in some cases you can even lose data. Examining glaring performance issues as they arise is helpful, but it is equally important to establish performance baselines and profiles.
Code should be tested for functionality from start to finish during development, but it is just as important to test for performance. Building the habit of checking your code's speed and resource utilization while you write it will save you headaches down the road.
In this article we’re going to explore ways that you can benchmark and baseline your Python code. The libraries we’ll look at are freely available and provide flexible ways to do things like performance timing, resource consumption measurement and more. Let’s get started.
First up is a Python utility that’s been around for a while and is widely popular for quick performance tests. Let’s set up a simple test script and use timeit.Timer to run a simple time test:

```python
# test.py
import timeit
import time


def long_function():
    print('function start')
    time.sleep(5)
    print('function end')


print(timeit.Timer(long_function).timeit(number=2))
```
Inside long_function we introduce some delay using time.sleep in order to simulate a long-running task. Next, to actually test our function, we pass it to the timeit.Timer class, which measures the function’s overall execution time. The number argument specifies how many times the test should be repeated. This is useful for functions whose execution time varies even slightly: repeating the test builds a better picture of speed because you have more data to work with.
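As a quick sketch of that idea, timeit also provides a repeat function that runs the whole timing experiment several times and returns each total. The statement below is a stand-in example for speed, not the article’s function:

```python
import timeit

# Run the timing experiment 3 separate times; each run times 10,000
# executions of the statement and returns the total in seconds.
results = timeit.repeat("sum(range(1000))", repeat=3, number=10_000)

# The minimum is usually the most stable figure to compare against,
# since higher values mostly reflect background noise on the machine.
print(min(results))
```

Comparing the three totals gives you a quick sense of how much run-to-run variance your environment introduces.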
After running the code above, timeit reports the total time it took across both runs: you should see long_function’s print statements twice, followed by a figure of roughly ten seconds. Since we’re just sleeping, the variance won’t be much.
The timeit library is great for performing quick, isolated tests on snippets of code, but it can also be run from the command line as a standalone tool and supports even more useful parameters. Check out the detailed documentation for more info.
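As a sketch of that command-line usage (the math.sqrt statement here is a stand-in, not the article’s function):

```shell
# -n sets the number of loops per run, -s runs setup code once
# before timing; the final argument is the statement being timed.
python -m timeit -n 10000 -s "import math" "math.sqrt(144)"
```

This prints a summary like “10000 loops, best of 5: … per loop” without you having to write any harness code.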
The next library we’ll explore is called line_profiler, and its usage is a bit different from other solutions. The line_profiler library gives you the execution time of each individual line in a file. This is incredibly useful when you’re having trouble narrowing down slow functions or third-party calls in a larger file: seeing the time spent on each line lets you quickly pinpoint issues instead of digging through line after line of dense code.
The standard usage for line_profiler can seem a bit confusing at first, but it becomes easier after a few uses. To profile your code, you add a @profile decorator to each function you want measured. Let’s re-use our example from before and tweak it to see how this works:

```python
# test.py
import time


@profile
def long_function():
    print('function start')
    time.sleep(5)
    print('function end')


long_function()
```
Looks pretty simple, right? That’s because with line_profiler you don’t have to import anything or change your code much; you just add the decorator. With the decorator set up on our slow function, the only thing left is to actually test the code. Running line_profiler takes two commands outside of your code:

```
kernprof -l test.py
python -m line_profiler test.py.lprof
```
The first command actually runs line_profiler on your file and generates a separate .lprof file in the same directory. This .lprof file contains the results, from which the second command produces a report using the module itself. Let’s take a look at the output of the second command:
```
Timer unit: 1e-06 s

Total time: 5.00472 s
Function: long_function at line 6

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     7                                           def long_function():
     8         1         15.0     15.0      0.0      print('function start')
     9         1    5004679.0 5004679.0    100.0      time.sleep(5)
    10         1         21.0     21.0      0.0      print('function end')
```
Each line of the profiled function is listed with its detailed statistics. Because long_function spends so much of its time sleeping, that line takes up almost 100% of the file’s execution time. Using line_profiler to generate a breakdown of where all of your execution time goes lets you quickly determine where you may need to focus on refactoring or speeding up slow tasks.
Check out the documentation for more info.
For the next two libraries we’re going to shift gears and focus on the underlying resource implications of running Python code. Being able to tell how much CPU and memory your code uses is practically mandatory in a modern environment; letting unbounded processes chew through cycles unchecked can land you in a world of pain. That’s where resource comes in. This library lets you measure resource usage from within your code and even set limits on how much of a particular resource can be consumed.
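As a sketch of the limiting side, resource exposes getrlimit and setrlimit; RLIMIT_CPU below is just one of several limit constants, and the 60-second cap is an arbitrary example value (the module is Unix-only):

```python
import resource

# Each limit is a (soft, hard) pair; an unprivileged process can raise
# its soft limit only up to the hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
print(soft, hard)  # often RLIM_INFINITY, i.e. unlimited

# Cap this process at 60 CPU-seconds. Exceeding the soft limit delivers
# SIGXCPU; exceeding the hard limit kills the process outright.
cap = 60 if hard == resource.RLIM_INFINITY else min(60, hard)
resource.setrlimit(resource.RLIMIT_CPU, (cap, hard))
```

Setting a CPU cap like this is a simple guard against a runaway loop consuming a machine indefinitely.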
Let’s look at an example of how to inspect CPU usage from within a script:

```python
# test.py
from resource import getrusage, RUSAGE_SELF


def long_function():
    for i in range(10 ** 10):
        2 + 2


long_function()
print(getrusage(RUSAGE_SELF))
```
In this example we’ve again modified our existing script, this time to pull in the resource package. To put a larger load on the CPU during long_function, we loop over a large range of numbers and force the CPU to perform some calculations. This produces a much higher load than simply sleeping.
Once our test has completed, we should see the following usage output:

```
resource.struct_rusage(ru_utime=152.395004, ru_stime=0.035994, ru_maxrss=8536, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=1092, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=0, ru_nivcsw=1604)
```
We can see in this output that we spent quite a lot of user CPU cycles: the ru_utime (user time) figure shows a total of about 152 seconds. Beyond CPU time, some of the other metrics give you insight into things like block IO and stack memory usage.
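For reference, a quick sketch of pulling individual fields out of that struct (field meanings per the Python docs; note that ru_maxrss is reported in kilobytes on Linux but in bytes on macOS):

```python
from resource import getrusage, RUSAGE_SELF

usage = getrusage(RUSAGE_SELF)

# ru_utime: seconds spent executing in user mode
# ru_stime: seconds spent in the kernel on this process's behalf
# ru_maxrss: peak resident set size (KB on Linux, bytes on macOS)
print(f"user CPU:   {usage.ru_utime:.3f}s")
print(f"system CPU: {usage.ru_stime:.3f}s")
print(f"peak RSS:   {usage.ru_maxrss}")
```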
For more detailed information, check out the documentation.
The memory_profiler library is similar to line_profiler but focuses on producing stats directly related to memory usage. When you run memory_profiler on your code, you still get a line-by-line breakdown, but with a focus on overall and incremental memory usage per line.
Just like with line_profiler, we’re going to use the same decorator structure to test our code. Here is the modified example code:

```python
# test.py

@profile
def long_function():
    data = []
    for i in range(100000):
        data.append(i)
    return data


long_function()
```
In the above example we create a test list and push a large range of integers into it. This slowly balloons the list so we can watch the memory usage grow over time. To view the memory_profiler report, you can simply run:

```
python -m memory_profiler test.py
```
This should produce the following report containing line-by-line memory statistics:
```
Filename: test.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     3   38.207 MiB   38.207 MiB           1   @profile
     4                                         def long_function():
     5   38.207 MiB    0.000 MiB           1       data = []
     6   41.934 MiB    2.695 MiB      100001       for i in range(100000):
     7   41.934 MiB    1.031 MiB      100000           data.append(i)
     8   41.934 MiB    0.000 MiB           1       return data
```
As you can see, our function starts out at about 38 MB of memory usage and grows to 41.9 MB once the list is filled. Although you can obtain memory usage information from the resource library, it does not produce a detailed line-by-line breakdown like memory_profiler does. If you’re hunting down a memory leak or dealing with a particularly bloated application, this is the way to go.
Check out the memory_profiler GitHub for more details.
Thank you for reading! I hope you have enjoyed digging into these great Python performance libraries. Reach out on Twitter with some of your own favorite ways to benchmark and fine tune Python code!