Ever find yourself copying and pasting a bunch of code all over your Python file or notebook? Maybe you have a design pattern that requires a similar set of actions but with enough small variations that a normal function isn’t sufficient. If so, then a decorator might be the right solution for you.
I’ve always thought decorators seemed really cool, but I never had a good use case for them. So I never bothered to learn how they work… Until now! Consider me a decorator convert!
In short, a decorator is a function that wraps another function, adding behavior that runs before or after the wrapped function is called.
A Practical Example
Imagine you’re performing some work for a retail client. They’ve provided historical transaction data, and you need to build a collaborative filtering solution.
As with any project, your first step will be to explore and clean the data. You’ll probably look for transactions that appear anomalous, and let’s say you want to remove those transactions from the data. You might want to keep track of all the transaction IDs you drop, so you can bring them up to the customer for discussion.
You could write some code that lists all the unique transaction IDs present in your dataset, then performs a filtering operation, then again lists all the unique transaction IDs, then finds the difference between those two lists.
That could look like this:
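A minimal sketch of that manual approach, assuming pandas. The column names (`transaction_id`, `amount`), the sample values, and the anomaly threshold are all hypothetical and only serve to make the snippet runnable:

```python
import pandas as pd

# Hypothetical transaction data: one row per transaction.
df = pd.DataFrame({
    "transaction_id": [101, 102, 103, 104],
    "amount": [20.0, 35.0, 9999.0, 15.0],
})

# 1. Record the unique transaction IDs before filtering.
ids_before = set(df["transaction_id"].tolist())

# 2. Apply the filtering step (hypothetical rule: very large amounts are anomalous).
df = df[df["amount"] < 1000]

# 3. Record the unique transaction IDs after filtering.
ids_after = set(df["transaction_id"].tolist())

# 4. The difference is the set of transactions the filter removed.
dropped_transactions = sorted(ids_before - ids_after)
print(dropped_transactions)  # [103]
```

Steps 1, 3, and 4 are the tracking logic; only step 2 changes from one filter to the next.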
That would work just fine, but every time you want to perform another filtering function, you’ll need to copy and paste several lines of code. Not only is that a hassle, but if you decide to somehow change that process or add functionality (see here for functionality you might want to add), you’ll need to make the update all over your code.
You could just make a function for each filtering step, but then you’d have to rewrite that tracking code each time. Ideally you would turn the tracking functionality into a template which could handle different filtering options.
Decorators to the Rescue!
This is exactly the use case for decorators! A decorator wraps a function, modifying its behavior however you wish.
In our case, the function we’d like to wrap is filtering_func(). We can apply a decorator to the definition of that function to apply the tracking behavior we described above.
A decorator is defined just like a function, and it is then applied to other functions by placing @decorator_name on the line above their definition. A simple implementation of the code above using a decorator would look like this:
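A minimal sketch of such a decorator, assuming pandas. The names `tracking_decorator` and `filtering_func` come from the text; the column names, the filtering rule, and the convention that the dataframe is passed as a keyword argument called `data` are assumptions made for illustration:

```python
import functools

import pandas as pd


def tracking_decorator(func):
    """Wrap a filtering function so it also returns the transaction IDs it dropped."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Assumption: the dataframe is passed as the keyword argument `data`.
        ids_before = set(kwargs["data"]["transaction_id"].tolist())

        # Call the wrapped filtering function, whatever it happens to be.
        df = func(*args, **kwargs)

        ids_after = set(df["transaction_id"].tolist())
        dropped = sorted(ids_before - ids_after)
        return df, dropped

    return wrapper


@tracking_decorator
def filtering_func(data):
    # Hypothetical rule: treat very large amounts as anomalous.
    return data[data["amount"] < 1000]
```

Because of the decorator, a call such as `filtering_func(data=df)` returns two values: the filtered dataframe and the list of dropped transaction IDs.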
Now whenever we use filtering_func() that function will automatically have the tracking functionality baked in!
We could create a filtering function to remove null values, or remove outliers, or remove invalid values, or filter by date (whatever functionality we want), and @tracking_decorator will make sure we keep track of any changes at each filtering step.
More Detail
That was a pretty brief explanation, so let’s dive a little deeper into the code to make sure we can make it work.
In the example above there are a few important points:
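@functools.wraps(func) copies the wrapped function’s metadata (its name, docstring, and so on) onto the wrapper, so a decorated filtering_func() still identifies itself as filtering_func in tracebacks and in help() rather than as the inner wrapper. It is not what passes arguments through; that is the job of *args and **kwargs. For example, if you call filtering_func() with the name of the transaction ID column as a keyword argument, the decorator can read it from kwargs whether or not @functools.wraps(func) is present.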
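*args and **kwargs are both very powerful concepts in Python. In a function definition, *args collects any extra positional arguments into a tuple and **kwargs collects any extra keyword arguments into a dictionary; in a function call, the same syntax unpacks them again. That is how the wrapper forwards whatever it receives to the wrapped function.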
The line df = func(*args, **kwargs) will remain unchanged throughout just about any implementation of any decorator functionality. You might change what the assignment looks like, but the entire purpose of a decorator is to leave the func call generic enough to accept any function.
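To see what each piece contributes, it can help to build the same pass-through decorator twice, once with `functools.wraps` and once without. The sketch below uses only the standard library; the decorator and function names (`without_wraps`, `with_wraps`, `filter_a`, `filter_b`) and the `id_col` parameter are hypothetical:

```python
import functools


def without_wraps(func):
    def wrapper(*args, **kwargs):
        # Forward every positional and keyword argument unchanged.
        return func(*args, **kwargs)
    return wrapper


def with_wraps(func):
    @functools.wraps(func)  # copy func's name, docstring, etc. onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@without_wraps
def filter_a(data, id_col="transaction_id"):
    """Hypothetical filter that just reports which ID column it was given."""
    return id_col


@with_wraps
def filter_b(data, id_col="transaction_id"):
    """Hypothetical filter that just reports which ID column it was given."""
    return id_col


# Arguments reach the wrapped function in both cases.
print(filter_a([], id_col="order_id"))  # order_id
print(filter_b([], id_col="order_id"))  # order_id

# Only the metadata differs.
print(filter_a.__name__)  # wrapper
print(filter_b.__name__)  # filter_b
```

In both cases the arguments travel through `*args` and `**kwargs`; `functools.wraps` only affects how the decorated function describes itself.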
A Full Example
Let’s look at a proper example with some example data.
First let’s define our example data:
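A small illustrative dataset, assuming pandas. The column names and values are made up for this walkthrough; the only property that matters is that exactly one row contains a null:

```python
import pandas as pd

# Hypothetical retail transactions, one row per transaction.
# The `amount` for transaction 3 is deliberately missing.
df = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4],
    "item_id": ["A", "B", "C", "D"],
    "amount": [10.0, 25.5, None, 8.0],
})

print(df)
```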
We’ve purposefully introduced a null value in our data to illustrate some functionality.
Now let’s define our decorator and our filtering function:
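One possible implementation, sketched under a few assumptions: the dataframe is passed as the keyword argument `data`, the ID column defaults to `transaction_id`, and an optional `id_col` keyword (a hypothetical addition) lets the caller name a different ID column. The wrapper consumes `id_col` itself, so the filtering function never has to know about it:

```python
import functools

import pandas as pd


def tracking_decorator(func):
    """Wrap a filtering function so it also reports the transaction IDs it dropped."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Assumption: the dataframe arrives as the keyword argument `data`.
        data = kwargs["data"]
        # Hypothetical optional keyword, removed here so `func` never sees it.
        id_col = kwargs.pop("id_col", "transaction_id")

        ids_before = set(data[id_col].tolist())
        df = func(*args, **kwargs)
        ids_after = set(df[id_col].tolist())

        dropped = sorted(ids_before - ids_after)
        return df, dropped

    return wrapper


@tracking_decorator
def remove_nulls(data):
    """Drop every row that contains at least one null value."""
    return data.dropna()
```

Any other filter (outliers, invalid values, date ranges) can reuse the same decorator as long as it accepts `data` and returns a dataframe that still contains the ID column.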
Now when we execute our remove_nulls function, we’ll automatically invoke our tracking logic and get back both a filtered dataframe and a list of dropped transaction IDs!
df, dropped_transactions = remove_nulls(data=df)
Conclusion
Now that you understand decorators and have seen an example of their functionality, you can play around with adapting the concept to your use case!
There are plenty more concepts to discuss branching from this idea. Stay tuned for future articles on partial functions and using global variables!