Advantages of Type Annotations

Published in Python Supply, May 30, 2020

Suppose you are working on a project in Python that takes advantage of Python’s support for higher-order functions. Perhaps you are allowing users to specify their own hooks or event handlers within a web application framework, or you are creating a unit testing library that generates test cases for a user-supplied function. In many such use cases, you might find yourself dealing with two related scenarios:

user-supplied functions can return results of different types and you do not know in advance what the result type of any given user-supplied function will be, or
your code needs to generate inputs to user-supplied functions that have the type the user-supplied function expects.

The first of these cases is more straightforward to handle at runtime: inspect the type of the output and branch accordingly. However, if you need to determine this information in advance, you may not always be free to invoke the function (e.g., if the function relies on some resource that is not yet available or running the function has a very high cost). The second case is more evidently difficult: you need to give the user a way to specify the input type. How can you organize your API to handle both of these issues in a way that keeps your API clean and allows users of your API to leverage features already built into Python?

Type Information and the Python Syntax

While there exist within the Python community standards for documenting information about the inputs and outputs of functions and methods within docstrings (such as those in the Google Python Style Guide), it is explicitly recommended that the information inside docstrings should not consist of a signature. This is in contrast with documentation conventions maintained by communities for other programming languages (such as JSDoc).

So how should programmers document the exact type signature of a function if they wish to do so? The Python 3 syntax supports type annotations: the ability to specify the types of variables and functions at the time they are defined. Annotations on function parameters and return values have been part of the syntax since Python 3.0, the typing library that standardizes their use for types arrived in Python 3.5, and annotations on variables were added in Python 3.6. The documentation calls these type hints because, as far as the interpreter is concerned, they are a purely syntactic feature. The interpreter does not check type annotations statically (i.e., at the time the code is parsed and transformed into bytecode) or dynamically (i.e., when the code is actually running).

The cosmetic quality of this built-in feature does not limit its utility, however. In addition to providing a more standard and formally endorsed representation for information that might otherwise be relegated to documentation strings or formatted comments, the Python community is free to write its own static and dynamic analysis tools that use the native syntax for type annotations.

More information on the built-in type specification library and the type annotation concrete syntax can be found on its documentation page. The corresponding new additions to the Python abstract syntax (a topic covered in more detail in another article) are actually quite few in number. A comparison of the Python 2.7 grammar and the Python 3.6 grammar shows that new parameters appear in only a few places:

function argument types are specified using the annotation parameter in the arg parameter of the FunctionDef and AsyncFunctionDef cases,
function output types are specified using the returns parameter in the FunctionDef and AsyncFunctionDef cases, and
the type of the variable being assigned is specified in the annotation parameter of the AnnAssign case.

All that is really happening here is that the Python syntax now has a few extra contexts, delimited using the tokens : and ->, in which programmers can add expressions that will make it into the abstract syntax tree as type annotations.
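As a rough illustration of these extra contexts, the sketch below parses a small snippet with the built-in ast module and reads the annotations back out of the resulting tree. The snippet and the variable names are made up for this example, and the attribute accesses assume that each annotation is a plain name such as int.

```python
import ast

# A made-up snippet containing one annotated assignment and one annotated function.
source = """
n: int = 123

def f(x: int) -> str:
    return str(x)
"""

tree = ast.parse(source)
(assign, func) = tree.body

# The annotated assignment becomes an AnnAssign node with an annotation field.
print(type(assign).__name__, assign.annotation.id)  # AnnAssign int

# The function becomes a FunctionDef node; each argument carries its own
# annotation field, and the output type is stored in the returns field.
print(type(func).__name__, func.args.args[0].annotation.id, func.returns.id)  # FunctionDef int str
```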

Annotations with Built-in Types

By convention, type annotations that refer to built-in types simply use the type constructors (e.g., int, float, and str). The example below demonstrates how type annotations can be included in assignment statements.

n: int = 123
s: str = "abc"

The example below demonstrates how type annotations can be included in function definitions.

def f(x: int) -> int:
    return x + x

Note that no static or dynamic type checking takes place; the annotations are ignored by the interpreter.

>>> (f(123), f("abc"))
(246, 'abcabc')

Annotations with More Complex and Custom Types

The built-in typing library provides a number of useful functions for building up more complex and also user-defined types. The example below illustrates how the Tuple constructor can be used to specify a tuple type. Note the use of overloading to repurpose the bracket notation that is usually used for indexing.

from typing import Tuple

def repeat(si: Tuple[str, int]) -> str:
    (s, i) = si
    return s * i

It is also possible to specify user-defined types.

from typing import NewType

UserName = NewType("UserName", str)

def confirm(s: UserName) -> bool:
    return s == "Alice"

As before, note that no type checking occurs.

>>> (confirm("Alice"), confirm("Bob"), confirm(123))
(True, False, False)
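One way to see why nothing is checked here is to look at what NewType produces at runtime. In the sketch below (which repeats the UserName definition from above so that it runs on its own), calling UserName simply hands back its argument, so the result is an ordinary string.

```python
from typing import NewType

UserName = NewType("UserName", str)

# At runtime, UserName returns its argument unchanged.
alice = UserName("Alice")

print(type(alice) is str)  # True
print(alice == "Alice")    # True
```

The distinction between UserName and str is therefore only visible to tools that read the annotations, not to the running program.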

It is also possible to introduce type variables. This is particularly useful for specifying types for functions that are examples of parametric polymorphism. In the example below, the function is an example of parametric polymorphism in that it can operate on any sequence (such as a list), regardless of the types of the items in that sequence.

from typing import Sequence, TypeVar

T = TypeVar("T")

def first(xs: Sequence[T]) -> T:
    return xs[0]

This annotation would be an indication from whoever implemented it that the function first can be applied to a list of any type as long as all the elements in that list are of the same type.

>>> (first([1,2,3]), first(["a", "b", "c"]))
(1, 'a')

The annotation in the above example is different from the annotation below, which indicates that the types of the items in the input list can be mixed (e.g., [123, "abc"]) as long as they are each either an integer or a string.

from typing import Sequence, Union

def first(xs: Sequence[Union[int, str]]) -> Union[int, str]:
    return xs[0]

Determining Input Types

To return to the motivating example introduced in the first paragraph of this article, suppose you are creating a unit testing framework that generates random inputs for functions in order to check that (1) they always return an output of the specified type for every input and (2) they do not raise any exceptions. You can allow users of your library to specify the input types of the functions they are trying to test via Python type annotations. This ensures you are not reinventing the wheel and that your users are not cluttering their code more than necessary with decorators or other additional information that is useful only for your framework and nothing else.

There are two distinct ways to extract the type annotations associated with a function. One approach (using concepts and techniques covered in detail in another article) is to inspect the source code of the function, parse it into an abstract syntax tree, and then extract the annotations from that abstract syntax tree.

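import ast
import inspect
import textwrap

def signature(f):
    # Parse the function's source and extract types from the AST.
    # This assumes a single input parameter and annotations that
    # are plain names (e.g., int); dedent handles indented definitions.
    a = ast.parse(textwrap.dedent(inspect.getsource(f)))
    type_in = a.body[0].args.args[0].annotation.id
    type_out = a.body[0].returns.id
    return (type_in, type_out)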

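One benefit of this approach is that you are extracting the original text found in the definition as a string (rather than the value or object to which it evaluates). Keep in mind that inspect.getsource must be able to locate the source text of the function, which is generally the case for functions defined in a file but may not be for functions entered at an interactive prompt.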

>>> Number = int
>>> def double(x: Number) -> Number:
...     return x + x
...
>>> signature(double)
('Number', 'Number')
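The .id attribute used above exists only when an annotation is a plain name. For an annotation such as Tuple[str, int], one possible workaround (assuming Python 3.9 or later, where ast.unparse is available) is to turn the annotation node back into text. The sketch below parses a source string directly rather than calling inspect.getsource, purely to keep the example self-contained.

```python
import ast

# Source text of the repeat function from earlier in this article.
source = (
    "def repeat(si: Tuple[str, int]) -> str:\n"
    "    (s, i) = si\n"
    "    return s * i\n"
)

func = ast.parse(source).body[0]

# ast.unparse converts an AST node back into source text, so it also
# works for annotations that are not plain names.
type_in = ast.unparse(func.args.args[0].annotation)
type_out = ast.unparse(func.returns)

print((type_in, type_out))  # ('Tuple[str, int]', 'str')
```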

Another approach is to use the __annotations__ attribute of a function.

>>> double.__annotations__
{'x': int, 'return': int}

Note that because return is a reserved word in the Python concrete syntax, no input parameter can ever be named return. It is therefore safe for 'return' to appear as a key in a dictionary in which all other keys are names of input parameters. The variant of signature below assumes that there is only one input parameter in the function definition.

def signature(f):
    a = f.__annotations__
    type_in = [a[k] for k in a if k != 'return'][0]
    type_out = a['return']
    return (type_in, type_out)

When using this approach, you receive the evaluated result of the expression that appeared within the annotation context. If the original synonym used for a type (as in the first example above) is important to obtain for your application or scenario, that information may be lost after evaluation. If all you care about is the actual type and not the name of the user-defined synonym, this approach is a more direct way to obtain the annotation information.

>>> signature(double)
(int, int)
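Both variants of signature above assume a single input parameter. If your framework needs to handle functions with several parameters, one option is the built-in inspect.signature function, which exposes the same evaluated annotations for each parameter. The helper name signature_all and the scale function below are hypothetical and appear only for illustration.

```python
import inspect

def signature_all(f):
    # Hypothetical variant of signature that supports any number of parameters.
    # A parameter without an annotation is reported as inspect.Parameter.empty.
    sig = inspect.signature(f)
    types_in = {name: p.annotation for (name, p) in sig.parameters.items()}
    return (types_in, sig.return_annotation)

def scale(x: float, n: int) -> float:
    return x * n

(types_in, type_out) = signature_all(scale)
print(types_in["x"] is float, types_in["n"] is int, type_out is float)  # True True True
```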

One important point to consider once you have chosen one of the two techniques above is how you might check whether an object or value is of the type you obtained from the annotation. If the type annotation information is in the form of a string and you are checking a value of a built-in type or an object of a user-defined class, you can perform the check by extracting the name of the type and doing a string comparison.

>>> type(123).__name__ == "int"
True

On the other hand, if the type annotation information is in the form of a value or object that represents a type (such as int), you can check that a value is of that type in the following way.

>>> isinstance(123, int)
True
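The isinstance check above works when the annotation is a class such as int, but it is not designed for every annotation (for example, it rejects parameterized generics such as Sequence[int]). As a sketch of one way to go a little further, the hypothetical helper below (assuming Python 3.8 or later for get_origin and get_args) handles a Union of plain classes by unpacking its members. It is not a general-purpose type checker.

```python
from typing import Union, get_args, get_origin

def matches(value, annotation):
    # Hypothetical helper: supports plain classes and Unions of plain classes.
    if get_origin(annotation) is Union:
        # get_args returns the member types, e.g., (int, str).
        return isinstance(value, get_args(annotation))
    return isinstance(value, annotation)

print(matches(123, Union[int, str]))    # True
print(matches("abc", Union[int, str]))  # True
print(matches(1.5, Union[int, str]))    # False
```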

A concrete example of a component in your unit testing framework is presented below. It can ingest a function that takes a single input and produces a single output. This component will generate random inputs of the appropriate type, depending on whether the input type annotation of the supplied function indicates that the input must be an integer or a floating point number. It will then check that the output type matches the function’s output type annotation and that no exceptions are raised.

import random

def safe(f):
    (type_in, type_out) = signature(f)

    for i in range(10000): # Run 10,000 trials.

        # Generate a random input of the appropriate type.
        if type_in is int:
            value_in = random.randint(-2**16, (2**16)-1)
        elif type_in is float:
            value_in = random.uniform(-2**16, (2**16)-1)
        else:
            return False # Only int and float inputs are supported.

        # Check that no exception is raised and that the output has the correct type.
        try:
            value_out = f(value_in)
            if not isinstance(value_out, type_out):
                return False
        except Exception:
            return False

    return True # All trials succeeded.

The safe function is applied to some example inputs below. The triple function correctly returns an integer in all cases. However, the floor function incorrectly (at least, according to its type specification) returns floating point numbers when its input is not positive.

>>> def triple(x: int) -> int:
...     return x + x + x
...
>>> def floor(x: float) -> int:
...     return int(x) if x > 0 else x
...
>>> (safe(triple), safe(floor))
(True, False)

Notice that this approach allows you to create a testing component that can handle failures during the testing process without raising an exception, making it possible to build long chains of tests for many functions without worrying about unexpected termination of the Python interpreter.
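When a test fails, it is often helpful to know which input caused the failure. The sketch below is a hypothetical variant of safe (the name counterexample and the trials parameter are assumptions made for this example) that returns a failing input instead of False. It reads the annotations directly from the __annotations__ attribute and, like safe, assumes exactly one annotated input parameter of type int or float.

```python
import random

def counterexample(f, trials=1000):
    # Hypothetical variant of safe: return a failing input, or None if none is found.
    hints = dict(f.__annotations__)
    type_out = hints.pop("return")
    (type_in,) = hints.values()  # Assumes exactly one annotated parameter.

    for _ in range(trials):
        # Generate a random input of the appropriate type.
        if type_in is int:
            value_in = random.randint(-2**16, (2**16)-1)
        elif type_in is float:
            value_in = random.uniform(-2**16, (2**16)-1)
        else:
            raise TypeError("unsupported input type: " + repr(type_in))

        # An exception or an output of the wrong type counts as a failure.
        try:
            if not isinstance(f(value_in), type_out):
                return value_in
        except Exception:
            return value_in

    return None  # No failing input was found.

def triple(x: int) -> int:
    return x + x + x

def floor(x: float) -> int:
    return int(x) if x > 0 else x

print(counterexample(triple))  # None
print(counterexample(floor))   # Some non-positive float (varies between runs).
```

Because the inputs are random, a result of None only means that no failure was observed in the trials that were run.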

Written by Andrei Lapets