How using a “Code Profiler” can be a Powerful Skill!!

Kartik Mittal
Published in Analytics Vidhya · 4 min read · May 23, 2020

Fortunately for me, during my first few weeks of working as a computer science professional, I had the opportunity to sit with a very senior developer on a quest to debug a performance issue in an application, and there I witnessed “magic”: the simple craft of code profiling.

Rationale behind using code profilers

“Premature optimization is the root of all evil”

Everyone from the software industry is (hopefully) aware of this quote from Donald Knuth. It sums up why heuristics and gut feeling alone shouldn’t drive round after round of optimization: don’t jump into refactoring before investigating what the actual cause is. Something that seems trivial could be hurting performance, and vice versa.

This is obvious, and most people are aware of it. But because of inherent biases in how we go about optimizing code, looking only for the few things we assume are at fault, we don’t use the tools that would give a clearer picture, and the small things actually at fault slide under the radar.

When to go for optimization — things to keep in mind

  1. A hunch. When you get that “this could be faster” feeling, that resources might be getting wasted unnecessarily, use a profiler to find out.
  2. The trade-off between how much time you want to spend writing and re-iterating over a piece of code vs. how performant you need it to be.
  3. Even for small scripts. We often don’t look closely at scripts written for menial tasks such as reporting or scraping, but these can be optimized too, and doing so helps improve your craft as a computer science professional.

TL;DR

Before looking at a very simple example, I’d suggest watching the PyCon talk by Jake VanderPlas, Seven Strategies for Optimizing Your Numerical Code. It will give you a feel for how powerful this stuff can be and for some of the right strategies and practices everyone should at least know of.

>>> import line_profiler

Code profiling is fortunately fairly easy in Python, as there are a bunch of tools. The two most popular:

  1. cProfile — a built-in profiler that ships with Python’s standard library and reports time per function call.
  2. line_profiler — a third-party utility that reports time spent on each individual line; I personally found it very handy, and the following read covers a simple example of it in action.

The following script simply writes formatted dates to a file. To use line_profiler, import it (after installing it first, e.g. pip install line_profiler) and apply the @profile decorator to the methods you want to look into.
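The code screenshots from the original post don’t survive here, so below is a minimal reconstruction of the kind of script being profiled, based on the calls named later (main(), formatted_dates(), and out.append(date.strftime(...))); the start date and day count are my own placeholders:

```python
from datetime import date, timedelta

try:
    profile  # injected into builtins by kernprof when run with -l
except NameError:
    def profile(func):  # no-op fallback so the script also runs standalone
        return func

@profile
def formatted_dates(start, days=100_000):
    out = []
    for i in range(days):
        d = start + timedelta(days=i)
        out.append(d.strftime("%Y-%m-%d"))
    return out

@profile
def main():
    dates = formatted_dates(date(2020, 1, 1))
    with open("dates.txt", "w") as f:
        for d in dates:
            f.write(d + "\n")

if __name__ == "__main__":
    main()
```

The try/except guard isn’t strictly needed under kernprof, which defines @profile for you; it just lets the same file run unmodified with a plain python invocation.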

I usually start with the main method and then step into each call that accounts for the bulk of the time consumed.

To run this there are a few options; first, we will use the kernprof utility provided by line_profiler:

$kernprof -v -l temp/main.py

This runs the script and displays the results in the terminal itself. To save the output to a file and visualize/analyze it later, you can use the -o option as well.
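For example (the output file name here is illustrative; by default kernprof names it after the script):

```shell
# run and print line-by-line results in the terminal
$ kernprof -l -v temp/main.py

# or save results to a .lprof file and inspect it later
$ kernprof -l -o main.py.lprof temp/main.py
$ python -m line_profiler main.py.lprof
```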

As you can see from the above results, nothing seems too odd about the main() method, but in the formatted_dates() method, at L:10, we are spending most of the time, about 70% of it, in out.append(date.strftime(“%Y-%m-%d”)).

Now, can we improve this? A quick search for “fastest way to format a date to string” suggests we can; let’s see.

Just by dropping the strftime method call, we see a drop from 145.0 to 33.0!
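The post doesn’t show the replacement code; one common answer to that search is date.isoformat(), which produces the same YYYY-MM-DD string without the format-string machinery. A quick sketch of the comparison (absolute timings will vary by machine):

```python
from datetime import date
import timeit

d = date(2020, 5, 23)

# identical output, cheaper call
assert d.strftime("%Y-%m-%d") == d.isoformat()

t_strftime = timeit.timeit(lambda: d.strftime("%Y-%m-%d"), number=100_000)
t_isoformat = timeit.timeit(lambda: d.isoformat(), number=100_000)
print(f"strftime:  {t_strftime:.3f}s")
print(f"isoformat: {t_isoformat:.3f}s")
```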

[NOTE: The numbers can vary every time you run this, as there is some overhead in using @profile, so it’s best to average over multiple runs and then make an inference.]

Now let’s try to see if we can further improve the main function.

By switching away from strftime and by appending ‘\n’ while adding items to the list, so that writelines can write everything at once, the

dates = formatted_dates(start)

call drops from 169.0 to 74.0.
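The optimized version isn’t shown in this extract either; a reconstruction consistent with the changes described (isoformat instead of strftime, newline appended during list building, one writelines call) might look like this:

```python
from datetime import date, timedelta

def formatted_dates(start, days=100_000):
    # build the final line (date string + '\n') in one pass, so the caller
    # can hand the list straight to writelines() instead of writing per line
    return [(start + timedelta(days=i)).isoformat() + "\n" for i in range(days)]

def main():
    dates = formatted_dates(date(2020, 1, 1))
    with open("dates.txt", "w") as f:
        f.writelines(dates)  # writelines adds no newlines itself; one bulk call

main()
```

Note that writelines does not append newlines for you, which is exactly why building them into the strings up front lets the whole list be written in a single call.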

Summary —

  1. Know your tools and make use of them instead of relying on just your gut.
  2. Start with the root method, then dive into each function individually to see where it can actually be improved.
  3. Always remember the trade-off between the effort required to optimize a segment and the performance improvement it will give you across the entire script.
  4. line_profiler has more features for you to explore, such as ignoring the time it takes to load a module; the output file it generates can also be visualized in a few different ways to make better inferences.
  5. You can also profile inside Jupyter notebooks using the magic commands provided by the line_profiler module.
  6. Also, you can use atexit to register a function with a LineProfiler instance and profile the script without having to use kernprof; more on that here: https://lothiraldan.github.io/2018-02-18-python-line-profiler-without-magic/
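For the notebook workflow mentioned above, the cells look roughly like this (a session sketch, assuming line_profiler is installed and formatted_dates is defined; the -f flag names the function whose lines you want timed):

```
%load_ext line_profiler
%lprun -f formatted_dates formatted_dates(date(2020, 1, 1))
```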


A software engineer, passionate about learning new things and growing along the way!