Documenting Your Machine Learning Projects Using Advanced Python Techniques (Part 1: Decorators + Bonus: Context Managers)

Daria Zhukova
Published in The Startup
3 min read · Sep 20, 2020

Taken from https://www.123rf.com/photo_16681458_snake-with-a-forked-tongue-and-new-year-decorations-symbolizing-the-year-of-snake.html

Python decorators can be extremely useful for reducing repetitive code in your projects. Even though the syntax can be quite confusing at first, the payoff is worth it: decorators give you a strong and robust boilerplate for your code.

Here is a quick and simple example of a decorator:
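A minimal sketch of such a decorator, written to match the output described just below. The names sound_maker, animal and wrapped_object come from the text; the inner name wrapper and the function bodies are assumptions.

```python
def sound_maker(func):
    """Decorator: run `func`, then print the sound after it."""
    def wrapper():
        func()               # prints "Cat " without a trailing newline
        print("Says Meow")   # the "decoration" added after func runs
    return wrapper


def animal():
    # end=" " keeps the cursor on the same line for the decoration
    print("Cat", end=" ")


# Decorating by hand: wrap the function, then call the wrapped version
wrapped_object = sound_maker(animal)
wrapped_object()  # prints: Cat Says Meow
```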

The output of the above code would read “Cat Says Meow”. The sound_maker function encloses the animal function, lets the animal function execute and decorates it with the print statement “Says Meow”.

Now, to avoid writing the line below every time we decorate a function,

wrapped_object = sound_maker(animal)

one can turn to Python’s syntactic sugar: the “@” sign. The earlier example can then be rewritten as:
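Here is a sketch of the same example using the “@” syntax. As before, the wrapper name and the function bodies are assumptions; only sound_maker and animal are named in the text.

```python
def sound_maker(func):
    def wrapper():
        func()
        print("Says Meow")
    return wrapper


@sound_maker          # equivalent to: animal = sound_maker(animal)
def animal():
    print("Cat", end=" ")


# No manual wrapping needed: the name `animal` now refers to the wrapper
animal()  # prints: Cat Says Meow
```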

This code produces the same output as the previous version, but with less code overhead!

Let us now step back and see how we can take advantage of decorators to streamline our machine learning workflows.

The scikit-learn package contains a collection of tools for building machine learning projects. It includes a huge variety of classifiers, regressors, clustering algorithms, sample datasets and more! The package is also well known for its consistent and easy-to-use fit/predict methods. If you’ve worked with scikit-learn before, chances are that you have called fit/predict on multiple models inside your projects to find the right model for your data. Let’s take a look at an example of doing this using the “iris dataset”.

Here’s a preview of x:
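A minimal sketch of loading the data and previewing it. The text does not say how x was built, so using the NumPy arrays returned by load_iris (x for the features, y for the labels) is an assumption.

```python
from sklearn.datasets import load_iris

# return_X_y=True returns the feature matrix and the label vector directly
x, y = load_iris(return_X_y=True)

print(x.shape)  # (150, 4): 150 flowers, 4 measurements each
print(x[:5])    # first five rows of the feature matrix
```

The four columns are sepal length, sepal width, petal length and petal width, all in centimetres.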

For all of the scikit-learn users out there, how many of us have been guilty of the below?
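Something along these lines, perhaps. This is an illustrative sketch of the copy-paste pattern, not the original snippet: the three models, the 70/30 split and the random_state values are all assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42  # arbitrary seed, for repeatability
)

# The same three steps, copy-pasted once per model
lr = LogisticRegression(max_iter=1000)
lr.fit(x_train, y_train)
lr_score = lr.score(x_test, y_test)
print(lr_score)

tree = DecisionTreeClassifier(random_state=42)
tree.fit(x_train, y_train)
tree_score = tree.score(x_test, y_test)
print(tree_score)

knn = KNeighborsClassifier()
knn.fit(x_train, y_train)
knn_score = knn.score(x_test, y_test)
print(knn_score)  # three bare numbers: which one belonged to which model?
```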

Obviously, the above code involves a lot of repetition, and it gets really confusing to tell which score came from which model (especially if you are fitting more models for comparison or if you are using Jupyter Notebooks!). We could really use some custom functions and decorators to log this activity ;)

First let us create a custom function that will do the meat of fitting/predicting on our data:
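One way such a function might look. The name fitter appears later in the text; its signature and the shape of the returned dictionary are assumptions made for this sketch.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def fitter(model, x_train, y_train, x_test, y_test):
    """Fit `model`, predict on the test split, and return the results."""
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    return {
        "model": type(model).__name__,          # e.g. "LogisticRegression"
        "score": model.score(x_test, y_test),   # mean accuracy for classifiers
        "y_true": y_test,
        "y_pred": y_pred,
    }


x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42
)

result = fitter(LogisticRegression(max_iter=1000),
                x_train, y_train, x_test, y_test)
print(result["model"], result["score"])
```

Returning the predictions alongside the score keeps the door open for computing other metrics later without refitting.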

Now let’s create a model registry decorator!

Let’s break this down a bit… This decorator takes in a function and passes it into a wrapper (notice that the wrapper is itself wrapped in a special decorator!) that accepts the args and kwargs of the function being decorated. Those args and kwargs are used to run the function, and the function’s output is then logged into a dictionary called “registry”.
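The breakdown above can be sketched roughly as follows. This assumes the “special decorator” on the wrapper is functools.wraps, and that the decorated function returns a dictionary with "model" and "score" keys; fake_fitter and its values are hypothetical stand-ins used only to show the mechanics.

```python
from functools import wraps

registry = {}  # maps a model label to the logged score


def model_registry(func):
    @wraps(func)  # keeps func's name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)               # run the decorated function
        registry[result["model"]] = result["score"]  # log its output
        return result
    return wrapper


@model_registry
def fake_fitter(name, score):
    # Hypothetical stand-in for a real fitting function
    return {"model": name, "score": score}


fake_fitter("model_a", 0.9)        # placeholder values, not real results
fake_fitter("model_b", score=0.8)
print(registry)  # {'model_a': 0.9, 'model_b': 0.8}
```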

Now that we’ve defined this model_registry, we can use it to decorate our fitter function:

And when we run the below code,

we get the below output when printing the registry dictionary!
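Putting the last three steps together, here is a minimal sketch. The names model_registry, fitter and registry come from the text; the dictionary returned by fitter, the choice of models and the split parameters are assumptions, so the exact scores you see will depend on them.

```python
from functools import wraps

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

registry = {}


def model_registry(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        registry[result["model"]] = result["score"]  # one entry per model class
        return result
    return wrapper


@model_registry
def fitter(model, x_train, y_train, x_test, y_test):
    model.fit(x_train, y_train)
    return {"model": type(model).__name__,
            "score": model.score(x_test, y_test)}


x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42
)

# One call per model; the decorator does the bookkeeping
for model in (LogisticRegression(max_iter=1000),
              DecisionTreeClassifier(random_state=42),
              KNeighborsClassifier()):
    fitter(model, x_train, y_train, x_test, y_test)

print(registry)  # every score is now labelled with its model's class name
```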

Voila! Keeping track of fitted models is now easier!

However, we can take this a step further. This time, let’s modify our decorator definition to log the metrics of our models into files. To do this, we’ll first define a supplementary factory function that will act like a Python context manager and will write our files for us. We’ll do this using the contextmanager decorator from Python’s contextlib module!
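A minimal sketch of such a factory function. The name file_writer and the temporary output path are hypothetical; contextmanager itself is the standard decorator from contextlib.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def file_writer(path):
    """Open `path` for writing and always close it afterwards."""
    handle = open(path, "w")
    try:
        yield handle     # the body of the `with` block runs here
    finally:
        handle.close()   # runs even if the body raises


# Usage: write one line to a file in a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "example.txt")
with file_writer(out_path) as f:
    f.write("hello from the context manager\n")
```

Everything before the yield plays the role of setup, and everything in the finally block plays the role of cleanup, so callers never have to remember to close the file.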

Then we’ll update our model_registry decorator to include this factory function:

This decorator will now register our models into a registry dictionary, then create a classification table for the model’s predicted outputs vs. true outputs and will write that table to a text file. All you have to do is run the below!
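Here is one way the whole thing might fit together. Several pieces are assumptions: that the “classification table” is scikit-learn’s classification_report, the helper name file_writer, the dictionary returned by fitter, and writing one text file per model into a temporary directory.

```python
import os
import tempfile
from contextlib import contextmanager
from functools import wraps

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

registry = {}
out_dir = tempfile.mkdtemp()  # assumption: reports go to a temp directory


@contextmanager
def file_writer(path):
    handle = open(path, "w")
    try:
        yield handle
    finally:
        handle.close()


def model_registry(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        registry[result["model"]] = result["score"]
        # Per-class precision/recall/F1 for true vs. predicted labels
        report = classification_report(result["y_true"], result["y_pred"])
        with file_writer(os.path.join(out_dir, result["model"] + ".txt")) as f:
            f.write(report)
        return result
    return wrapper


@model_registry
def fitter(model, x_train, y_train, x_test, y_test):
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    return {"model": type(model).__name__,
            "score": model.score(x_test, y_test),
            "y_true": y_test, "y_pred": y_pred}


x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42
)

for model in (LogisticRegression(max_iter=1000),
              DecisionTreeClassifier(random_state=42)):
    fitter(model, x_train, y_train, x_test, y_test)

print(registry)
print(sorted(os.listdir(out_dir)))  # one report file per model
```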

This can of course be taken further by adding other metrics and information regarding the models to the files. But…that’s a story for another article!

Hope you all enjoyed learning about decorators and using them to streamline your machine learning experiments!

Daria Zhukova

Currently a Senior Data Scientist with Lockheed Martin. I am, ultimately, seeking to advance and revolutionize data exploration, discovery and analytics.