A Copilot-generated image, from the prompt: “Python facing a Type Error issue”
picture by MS Copilot

Enhancing Python Functions: Decorated Type Validation

MechAI
5 min read · Mar 27, 2024

Introduction

Why should I even verify the types in Python?

Dynamic typing in Python gives us extraordinary flexibility but, at the same time, can make life more difficult. Especially in data pipelines, we would like to control the types of data that come in and go out. We can, of course, offer the user type hints (see PEP 484 and PEP 526), but hints are not enforced at runtime. They are a nice way of saying “If I were you, I would use the suggested types”. Sometimes, however, we would like to be exact. The main reason is to crash early and save a lot of time, computational resources, and eventually money. The best practice is then to decide which types we want to check, create the corresponding exception classes, and check the values with if-else statements. The other, less recommended, way is to use assertions inside the function. Whichever way we go, code clarity drops immediately. When it comes to automating type checks, several tools are available to help with the process, such as mypy, pytype, and more. You can find the details in this article.

Example:

I have a super simple function:

from typing import Union

def rectangle_area(x: Union[int, float], y: Union[int, float]) -> float:
    return float(x * y)

So basically, what we want as inputs to the function is either an int or a float, and we always get a float at the output. Admittedly, this is not a complicated function that might fail after a few hours of computation just because someone passed a string, a complex number, or anything else that makes no sense for a rectangle area computation. Nevertheless, it is a good test case.
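To make the motivation concrete, here is a small sketch showing that the annotations on the function above are only metadata: Python stores them but never checks them when the function is called. The variable name `silent_result` is introduced here purely for illustration.

```python
from typing import Union


def rectangle_area(x: Union[int, float], y: Union[int, float]) -> float:
    return float(x * y)


# Annotations are stored on the function but are not checked at call time.
print(rectangle_area.__annotations__)

# A string slips through: "3" * 2 is "33", and float("33") is 33.0.
silent_result = rectangle_area("3", 2)
print(silent_result)  # 33.0, a plausible-looking but wrong area

# Other bad inputs do fail, but only inside the computation itself,
# with a message that says nothing about the offending parameter.
try:
    rectangle_area(2.5, "3.0")
except TypeError as error:
    print(f"TypeError: {error}")
```

The first bad call is the more worrying one: nothing crashes, and the wrong number simply travels further down the pipeline.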

Let’s implement the type check then. It can be done like this (the fast way):

def rectangle_area(x: Union[int, float], y: Union[int, float]) -> float:
    assert isinstance(
        x, (int, float)
    ), f"x must be an int or a float type, but instead got {type(x)}"
    assert isinstance(
        y, (int, float)
    ), f"y must be an int or a float type, but instead got {type(y)}"
    return float(x * y)

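or like this, with explicit exceptions that, unlike assertions, are not skipped when Python runs with the -O flag:

def rectangle_area(x: Union[int, float], y: Union[int, float]) -> float:
    # Raise when the argument is NOT one of the accepted types
    if not isinstance(x, (int, float)):
        raise TypeError(f"x must be an int or a float type, but instead got {type(x)}")
    if not isinstance(y, (int, float)):
        raise TypeError(f"y must be an int or a float type, but instead got {type(y)}")
    return float(x * y)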

Problem fixed! There is nothing to complain about. Now, I just need to fix 100 similar functions, and I am done … 😒

Let's decorate it!

Decorators are very useful functions, and if you work with modern Python (e.g. dataclasses), you have surely seen and used them. But if you haven't, you can start here:

or maybe here: https://realpython.com/primer-on-python-decorators/

What I wanted to achieve was a single decorator that checks the types for any function. Admittedly, it does not cover all of the edge cases, such as specific nested typing, but it is a good base for understanding the idea and for seeing how typing itself can be used for data type validation. The entire code, with a few helper functions, can be found here. I hope that more interesting stories will be found there in the future as well.

What we will use now is the following decorator function:

from typing import Callable

def validate_by_typing(func: Callable) -> Callable:
    """Decorator that raises an error when argument types are inconsistent with the type hints"""

    def wrapper(*args, **kwargs):
        # Map positional arguments to their parameter names
        kwargs.update(zip(func.__code__.co_varnames, args))
        # Turn the annotations into types that isinstance() accepts
        _types = update_typing_objects(get_types(func))
        for key, value in kwargs.items():
            if not isinstance(value, _types[key]):
                raise TypingBasedTypeError(
                    key=key,
                    func=func,
                    expected_type=_types[key],
                    received_type=type(value),
                )

        return func(**kwargs)

    return wrapper
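The wrapper maps positional arguments to parameter names by zipping `func.__code__.co_varnames` with `args`. That is enough for plain positional parameters, but it does not see default values and does not fit positional-only or variadic parameters. As a possible alternative (not what the repo does, as far as this post shows), the standard `inspect.signature` machinery can do the binding. The names `bind_arguments` and `scale` below are hypothetical and exist only for this sketch.

```python
import inspect


def bind_arguments(func, *args, **kwargs):
    """Map every argument of a call to its parameter name (hypothetical helper)."""
    # bind() raises TypeError itself when the call does not match the signature.
    bound = inspect.signature(func).bind(*args, **kwargs)
    # Fill in defaults so that they can be type-checked as well.
    bound.apply_defaults()
    return dict(bound.arguments)


def scale(value: float, factor: float = 2.0, *, offset: float = 0.0) -> float:
    return value * factor + offset


print(bind_arguments(scale, 1.5, offset=1.0))
# {'value': 1.5, 'factor': 2.0, 'offset': 1.0}
```

With such a mapping in hand, the type-checking loop in the wrapper could iterate over it instead of over `kwargs`.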

The decorator relies on two helper functions, get_types and update_typing_objects, which can be found in the repo. For clarity of the story, I do not explain them in detail.

  1. get_types prepares the dictionary with types of data and secures the cases where typing is not available (by giving the ‘object’ type).
  2. update_typing_objects converts Union and Any to something useful for the isinstance function. Actually, the key to any potential upgrade of the code is in this one (namely, covering all the edge cases and working out how to solve the nested typing problem).
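The two helpers are not listed in this post, so the following is only a guess at what they might look like, not the code from the repo. It is a minimal sketch, assuming Python 3.8+ for `typing.get_origin` and `typing.get_args`. The exception class is a hypothetical stand-in whose message imitates the console output shown in this article, and `to_isinstance_target` and `UNION_ORIGINS` are names invented here. Nested generics are reduced to their outer container, so `List[int]` is checked only as `list`.

```python
import inspect
import types
from typing import Any, Union, get_args, get_origin, get_type_hints

# typing.Union plus the `int | float` form available on newer Python versions.
UNION_ORIGINS = (Union, getattr(types, "UnionType", Union))


class TypingBasedTypeError(TypeError):
    """Hypothetical stand-in for the exception class used in the post."""

    def __init__(self, key, func, expected_type, received_type):
        super().__init__(
            f'Parameter "{key}" in the function {func.__name__} must be of type '
            f"{expected_type}, but got {received_type} instead."
        )


def get_types(func):
    """Return {parameter name: annotation}; unannotated parameters get `object`."""
    hints = get_type_hints(func)
    return {name: hints.get(name, object) for name in inspect.signature(func).parameters}


def to_isinstance_target(annotation):
    """Convert one annotation into a class or tuple of classes for isinstance()."""
    if annotation is Any:
        return object
    origin = get_origin(annotation)
    if origin in UNION_ORIGINS:
        flat = []
        for member in get_args(annotation):
            target = to_isinstance_target(member)
            flat.extend(target if isinstance(target, tuple) else (target,))
        return tuple(flat)
    if origin is not None:
        return origin  # e.g. List[int] -> list; the element type is not checked
    return annotation


def update_typing_objects(type_map):
    return {name: to_isinstance_target(hint) for name, hint in type_map.items()}


def validate_by_typing(func):
    def wrapper(*args, **kwargs):
        kwargs.update(zip(func.__code__.co_varnames, args))
        _types = update_typing_objects(get_types(func))
        for key, value in kwargs.items():
            if not isinstance(value, _types[key]):
                raise TypingBasedTypeError(
                    key=key,
                    func=func,
                    expected_type=_types[key],
                    received_type=type(value),
                )
        return func(**kwargs)

    return wrapper


@validate_by_typing
def rectangle_area(x: Union[int, float], y: Union[int, float]) -> float:
    return float(x * y)


print(rectangle_area(2.5, 3))  # 7.5
try:
    rectangle_area(2.5, "3.0")
except TypingBasedTypeError as error:
    print(error)
```

Annotations that isinstance cannot handle even after this conversion, such as `Literal` or a `TypeVar`, would still need dedicated branches in `to_isinstance_target`.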

Let's check out how to use it!

The only thing we must do now is to ‘dress up’ the function with a decorator and use it!

@validate_by_typing
def rectangle_area(x: Union[int, float], y: Union[int, float]) -> float:
    return float(x * y)

In the example, I run the function twice:

def rectangle_test():
    x = 2.5
    y = 3
    print(f"Correct types: x: {type(x)}, y: {type(y)}")
    print(rectangle_area(x, y))

    x = 2.5
    y = "3.0"
    print("Now incorrect values:")
    print(f"correct type: x: {type(x)}, incorrect type y: {type(y)}")
    print(rectangle_area(x, y))


if __name__ == "__main__":
    try:
        rectangle_test()
    except Exception as e:
        print("!! ERROR !!")
        print(e)

and have the following console output:

Correct types: x: <class 'float'>, y: <class 'int'>
7.5
Now incorrect values:
correct type: x: <class 'float'>, incorrect type y: <class 'str'>
!! ERROR !!
Parameter "y" in the function rectangle_area must be of type (<class 'int'>, <class 'float'>), but got <class 'str'> instead.

Non-standard objects typing

The question is whether it can be used anywhere. Imagine I need to validate the data inputs for a Pandas DataFrame, or maybe any dataclass object: let’s name it MyDataClass. It surely can be done! The only thing you must do is annotate the function correctly, which should be done anyway whenever you code in Python.

from pandas import DataFrame

@validate_by_typing
def do_something(df: DataFrame) -> DataFrame:
    return df

The story is exactly the same for any other object type we would like to validate against. To complement this, below is an example for a dataclass object.

from dataclasses import dataclass
from datetime import date

@dataclass
class MyDataClass:
    first_name: str
    second_name: str
    year_of_birth: int

@validate_by_typing
def check_age(user: MyDataClass) -> bool:
    # True when the user is at least 19 years old, counted by calendar year
    return date.today().year - user.year_of_birth >= 19

user_1 = MyDataClass('Adam', 'K.', 1983)
user_2 = dict(first_name='Adam', second_name='K.', year_of_birth=1983)

Running check_age for user_1 returns True; for user_2, which is a plain dict rather than a MyDataClass instance, the error is raised.

---------------------------------------------------------------------------
TypingBasedTypeError Traceback (most recent call last)
/tmp/ipykernel_12616/1130473327.py in <module>
----> 1 check_age(user_2)

~/project/typing_checker/typing_checker.py in wrapper(*args, **kwargs)
59 for key, value in kwargs.items():
60 if not isinstance(value, _types[key]):
---> 61 raise TypingBasedTypeError(
62 key=key,
63 func=func,

TypingBasedTypeError: Parameter "user" in the function check_age must be of type <class '__main__.MyDataClass'>, but got <class 'dict'> instead.

Summary

Decorators are proven tools for simplifying code, offering a streamlined approach to implementing common functionality. One does not need to wade through the meanders of the code when decorators are provided. The decorator presented here offers a simple yet effective solution for validating data against typing requirements, thereby reducing code complexity. Additionally, the same function can serve testing purposes, providing a versatile tool for maintaining code integrity. I hope you find this utility useful. Let me know your thoughts!

See you soon!

MechAI

Senior Data Scientist, former aviation engines engineer, with a PhD in Physics. My mission is to fuse my multidisciplinary background into impactful solutions.