Upskill tutorial for type annotations

Hud Wahab
4 min read · Jun 30, 2023


Day 2: List. Iterable. Callable. Sized. TypeVar.

Hi 👋 I am Hud, a postdoc for engineering data science at the AI Manufacturing Center in Laramie, Wyoming. My funding is running out (AaAaaA!), so I am actively looking for a new job. Instead of doing the 205th coding certificate to prove my worthiness, I thought I’d do design challenges and document how I spend my time upskilling so other engineers can do the same.

Nowadays, certificates are everywhere. Documenting small upskill projects that you can later show off is the best way to get recognition as a professional engineer.

This is day 2 of a 30-day design challenge. Follow along and let me know if you get stuck!

Steps for the challenge:

  1. Understand the importance of type annotations in improving code clarity, catching bugs, and enhancing IDE support.
  2. Download the given before.py file from here and ensure you have the Python language extension installed in your VS Code editor.
  3. Set the type checking mode to “strict” in your editor’s settings.json file:
     "python.analysis.typeCheckingMode": "strict"
  4. Open the before.py file and observe the type errors flagged by the IDE.
  5. Write type annotations for variables, function parameters, and return values in the code without changing its functionality.
  6. Ensure that all type errors identified by the IDE are resolved after adding the annotations.
  7. Keep the type annotations as generic as possible, avoiding extra limitations on function usage.
  8. Do not use the “Any” type in this challenge.
  9. Verify that the code runs without errors after adding the type annotations.
  10. Submit your solution and verify that all type errors have been eliminated, maintaining the code’s original functionality.

Why not just lists?

Let’s start with the “easy” ones. Just use list[int] — What’s the big deal?

Well, firstly, we see that the function only iterates through its argument. Annotating the parameter with an iterable type, such as Iterable[int], lets the function accept a broader range of objects than just lists. This includes tuples, sets, generators, and even custom objects that implement the iterable protocol (an __iter__ method).
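To make this concrete, here is a minimal sketch. The function name total is my own placeholder, not necessarily what before.py uses; the point is only that a body which just loops over its argument needs nothing more specific than Iterable.

```python
from collections.abc import Iterable


def total(numbers: Iterable[int]) -> int:
    """Add up the numbers. The body only iterates, so Iterable is enough."""
    result = 0
    for number in numbers:
        result += number
    return result


# Hypothetical calls: each argument is a different kind of iterable.
print(total([1, 2, 3]))                # list
print(total((1, 2, 3)))                # tuple
print(total({1, 2, 3}))                # set
print(total(n for n in range(1, 4)))   # generator
```

With list[int] as the annotation, a strict type checker would flag the last three calls even though they run fine.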

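And secondly, if we consider the other functions, there’s a downside to limiting the elements to a single type: a function annotated for int alone rejects float values (as far as the type checker is concerned), even though the arithmetic works for both.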

Lastly, if we look at the count_chars function, we’re only concerned with the length of its argument. It doesn’t need to know whether it’s a string (a sequence of characters) or a list of numbers.

Broad inputs, narrow outputs

By using Iterable and Sized we get more flexibility in our input parameters. The int | float notation (available from Python 3.10) is cleaner and clearer than Union[int, float]. The function also becomes more generic: we can rename count_chars() to count(), since Sized covers any type that defines the __len__ dunder method.

Why don’t we apply the same to the output? It’s generally much better for the caller of a function to know exactly what is being returned. This prevents errors, improves readability, and keeps things consistent for easier integration.
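Here is a small sketch of "broad in, narrow out". unique_sorted is a made-up example of mine, not part of the challenge file: it accepts any iterable of strings but promises the caller a concrete list.

```python
from collections.abc import Iterable


def unique_sorted(words: Iterable[str]) -> list[str]:
    """Accept any iterable of strings, return a plain sorted list."""
    return sorted(set(words))


result = unique_sorted(("pear", "apple", "pear"))
# Because the return type is list[str], the checker knows list
# methods such as append are available on the result.
result.append("plum")
print(result)  # ['apple', 'pear', 'plum']
```

Had the return type been Iterable[str], the append call would be a type error, and the caller would have to convert the result first.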

Correct typing for functions

When injecting functions into a parent function, it’s worth looking at how the parent function uses them.

Looking at process_data, we see that it takes in two functions, filter_func and process_func. The difference between the two is that filter_func returns the same type it receives, while process_func returns a different type, i.e. list of chars -> list of integers.
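To illustrate the difference, here are two stand-in functions. Both names are hypothetical and not taken from before.py: the first keeps its input type, the second changes it.

```python
def remove_vowels(chars: list[str]) -> list[str]:
    """Filter-style: list of chars in, list of chars out."""
    return [c for c in chars if c.lower() not in "aeiou"]


def to_code_points(chars: list[str]) -> list[int]:
    """Process-style: list of chars in, list of integers out."""
    return [ord(c) for c in chars]


letters = list("hello")
kept = remove_vowels(letters)
codes = to_code_points(kept)
print(kept)   # ['h', 'l', 'l']
print(codes)  # [104, 108, 108]
```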

TypeVar, Callable

With the differences above in mind, we can define two type variables, T and U, and two callable types, FilterFunc and ProcessFunc. T and U are generic type variables that stand for the type of the input and output data respectively. FilterFunc is a callable type that takes a single argument of type T and returns a value of the same type T. ProcessFunc is a callable type that takes a single argument of type T and returns a value of type U.
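The definitions described above could look roughly like this. The alias names follow the text; drop_negatives and to_labels are placeholder functions I made up to show what fits each alias.

```python
from typing import Callable, TypeVar

T = TypeVar("T")  # type of the input data
U = TypeVar("U")  # type of the output data

# Generic aliases: same type in and out vs. one type in, another out.
FilterFunc = Callable[[T], T]
ProcessFunc = Callable[[T], U]


def drop_negatives(values: list[int]) -> list[int]:
    return [v for v in values if v >= 0]


def to_labels(values: list[int]) -> list[str]:
    return [str(v) for v in values]


# Filling in the type variables gives concrete callable types.
keep: FilterFunc[list[int]] = drop_negatives
convert: ProcessFunc[list[int], list[str]] = to_labels

print(convert(keep([3, -1, 2])))  # ['3', '2']
```

Swapping the two assignments should make the type checker complain, because to_labels does not return the type it receives.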

Applying these types to the original process_data function makes the code more concise and easier to read than the unannotated version. The type annotations and optional parameters make it more self-documenting and easier to understand.

Conclusion

Congratulations! You finished Day 2 from the 30-day design challenge.

If you have reached this far, you know how to:

  • Use generic, broader types for inputs and specific types for return values
  • Create type variables and use them to tell callable types apart

Check out day 3 challenge!

Also, you can access the full 30-day GitHub repository here.

💡 My goal here is to help engineering data scientists upskill in design. I’d like to hear from you! Was this helpful? Anything I can improve? Connect with me on LinkedIn | Medium
