Pydantic for Experts: Discriminated Unions in Pydantic V2
Differentiate model selection with Pydantic V2’s discriminated union.
Congratulations 🎉
If you’re reading this, you probably want to improve your Python skills and learn some advanced Pydantic functionality.
⚠️ Disclaimer: I’m a contributor to Pydantic.
Introduction
Pydantic is the go-to data validation library for Python. With about 20 million downloads per week, it is among the top 100 Python libraries.
Pydantic V2 supports discriminated unions (also known as tagged unions), an advanced data type (i.e. annotation, or in other languages, data structure) for performing sophisticated unions.
💡 Performance: Logic for discriminated unions in Pydantic V2 is implemented in Rust → which means that they’re very fast.
💡 Coming soon in Pydantic V2.5: Discriminated unions are about to get even more powerful, with callable discriminators being introduced in Pydantic 2.5.
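To give a rough idea of what a callable discriminator looks like, here is a minimal sketch. It assumes Pydantic 2.5 or later, where `Discriminator` and `Tag` are available. The `EmailTarget` and `PhoneTarget` models and the `pick_tag` function are hypothetical names made up for illustration. They stand in for models that share no common tag field.

```python
from typing import Annotated, Any, Union

from pydantic import BaseModel, Discriminator, Tag, TypeAdapter


# Hypothetical models with no shared field to discriminate on.
class EmailTarget(BaseModel):
    email: str


class PhoneTarget(BaseModel):
    phone: str


def pick_tag(value: Any) -> str:
    """Return the tag of the union member that should validate `value`."""
    # The input may be a raw dict or an already-built model instance.
    if isinstance(value, dict):
        return "email" if "email" in value else "phone"
    return "email" if hasattr(value, "email") else "phone"


Target = Annotated[
    Union[
        Annotated[EmailTarget, Tag("email")],
        Annotated[PhoneTarget, Tag("phone")],
    ],
    Discriminator(pick_tag),
]

adapter = TypeAdapter(Target)
email_target = adapter.validate_python({"email": "a@example.com"})
phone_target = adapter.validate_python({"phone": "555-0100"})

print(type(email_target).__name__)  # EmailTarget
print(type(phone_target).__name__)  # PhoneTarget
```

The function decides which member model to use, so the discriminator no longer has to be a literal field on every model.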
Problem Statement
Let’s use the AWS AppFlow TriggerConfig as an example. The model has the following constraints:
- TriggerType: required
- TriggerProperties: required only if TriggerType="Scheduled"
How should we validate these properties?
Solution 1: Use a single pydantic model with a field validator
Perform assertions with a field_validator:
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationInfo, field_validator


class TriggerConfig(BaseModel):
    """
    Represents Appflow TriggerConfig object
    Documentation:
    1. https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-appflow-flow-triggerconfig.html
    """

    TriggerType: Literal["OnDemand", "Event", "Scheduled"]
    # validate_default=True runs the validator even when the field is omitted
    # (this replaces V1's `always=True`)
    TriggerProperties: Optional[dict] = Field(default=None, validate_default=True)

    @field_validator("TriggerProperties")
    @classmethod
    def validate_trigger_properties(cls, v: Optional[dict], info: ValidationInfo) -> Optional[dict]:
        """
        If trigger type is Scheduled, then this should not be empty.
        Otherwise, the value should be empty. (Empty dicts are converted to `None`)
        """
        # Convert empty dict to None
        if not v:
            v = None
        # info.data holds the fields validated so far;
        # TriggerType is absent from it if its own validation failed
        if info.data.get("TriggerType") == "Scheduled":
            if v is None:
                raise ValueError("TriggerProperties must not be empty for a scheduled flow")
        elif v is not None:
            raise ValueError("TriggerProperties must be empty for a non-scheduled flow")
        return v
Pros:
- Field validators are straightforward
- Consistent with V1
Cons:
- The validation logic is tucked inside a validator method, which makes it harder to test in isolation
- Field validators run in Python, which adds overhead
- The solution isn’t very extensible
How a single model with a field validator works
There is a single model type for instances of Scheduled and OnDemand. When the model is instantiated, the field_validator ensures that the expectation of TriggerProperties is met.
Because Pydantic V2’s core validation runs in Rust, performing this check in a Python function (via a validator) adds overhead: every validation has to call back into Python.
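If you want to measure that overhead yourself, a small timing harness along these lines is one option. This is only a sketch: `WithoutValidator` and `WithValidator` are hypothetical stripped-down models, not the full TriggerConfig, and the numbers depend on your machine and Pydantic version, so no results are claimed here.

```python
import timeit
from typing import Optional

from pydantic import BaseModel, field_validator


# Hypothetical baseline: same field, no Python-side validator.
class WithoutValidator(BaseModel):
    TriggerProperties: Optional[dict] = None


# Hypothetical variant: same field, plus a validator written in Python.
class WithValidator(BaseModel):
    TriggerProperties: Optional[dict] = None

    @field_validator("TriggerProperties")
    @classmethod
    def empty_to_none(cls, v: Optional[dict]) -> Optional[dict]:
        # Same normalization step as in Solution 1: empty dict becomes None.
        return v or None


payload = {"TriggerProperties": {"foo": "bar"}}
runs = 20_000

baseline = timeit.timeit(lambda: WithoutValidator.model_validate(payload), number=runs)
with_python = timeit.timeit(lambda: WithValidator.model_validate(payload), number=runs)

print(f"no validator:     {baseline:.4f}s for {runs} validations")
print(f"python validator: {with_python:.4f}s for {runs} validations")
```

Run it a few times before drawing conclusions, since a single run of a micro-benchmark can be noisy.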
Solution 2: Use Multiple Pydantic Models with a Discriminated Union
Using a discriminated union is a lot neater.
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field


# Trigger types that must not carry TriggerProperties
class TriggerConfig_1(BaseModel, extra="forbid"):
    TriggerType: Literal["OnDemand", "Event"]


# Scheduled triggers must carry TriggerProperties
class TriggerConfig_2(BaseModel, extra="forbid"):
    TriggerType: Literal["Scheduled"]
    TriggerProperties: dict


# Create a custom type (and validate using a type adapter)
TriggerConfig = Annotated[
    Union[TriggerConfig_1, TriggerConfig_2],
    Field(discriminator="TriggerType"),
]
Pros:
- Neater code
- Extendable
- Native pydantic support — higher performance
- Is declarative
Cons:
- Not compatible with Pydantic V1 as written (TypeAdapter is a V2 API)
How Discriminated Union works:
There are 2 model types, one for Scheduled and one for OnDemand|Event. They are “stitched together” via an annotated type, which defines the discriminated union.
When data is validated against TriggerConfig, the discriminated union checks the value of TriggerType and, depending on that value, validates the data against the respective model type.
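The dispatch described above can be seen in a short, self-contained sketch. It repeats the two models from Solution 2 so that it runs on its own. The trigger type "Hourly" is a made-up value used only to show what happens when the tag matches no model.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


class TriggerConfig_1(BaseModel, extra="forbid"):
    TriggerType: Literal["OnDemand", "Event"]


class TriggerConfig_2(BaseModel, extra="forbid"):
    TriggerType: Literal["Scheduled"]
    TriggerProperties: dict


TriggerConfig = Annotated[
    Union[TriggerConfig_1, TriggerConfig_2],
    Field(discriminator="TriggerType"),
]

adapter = TypeAdapter(TriggerConfig)

# The value of TriggerType decides which model validates the data.
on_demand = adapter.validate_python({"TriggerType": "OnDemand"})
scheduled = adapter.validate_python(
    {"TriggerType": "Scheduled", "TriggerProperties": {"foo": "bar"}}
)
print(type(on_demand).__name__)  # TriggerConfig_1
print(type(scheduled).__name__)  # TriggerConfig_2

# A tag that matches no member is rejected without trying any model.
try:
    adapter.validate_python({"TriggerType": "Hourly"})
except ValidationError as exc:
    unknown_tag_errors = exc.errors()
    print(unknown_tag_errors[0]["type"])  # union_tag_invalid
```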
Validation is also faster, since the discriminated union logic runs in Pydantic’s Rust backend. Further, the input is validated against a single model, selected by the value of TriggerType (as opposed to a regular union, which may have to try several member models before finding a match).
An added benefit: validation errors are raised only for the selected model, so catching errors and writing tests becomes a lot easier and narrower.
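To make the difference in error reporting concrete, the sketch below validates the same invalid payload against a regular union and against the discriminated union. The `collect_errors` helper and the `PlainUnion` / `TaggedUnion` names are introduced here for illustration only.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


class TriggerConfig_1(BaseModel, extra="forbid"):
    TriggerType: Literal["OnDemand", "Event"]


class TriggerConfig_2(BaseModel, extra="forbid"):
    TriggerType: Literal["Scheduled"]
    TriggerProperties: dict


PlainUnion = Union[TriggerConfig_1, TriggerConfig_2]
TaggedUnion = Annotated[PlainUnion, Field(discriminator="TriggerType")]


def collect_errors(annotation, data: dict) -> list:
    """Validate `data` against `annotation` and return the reported errors."""
    try:
        TypeAdapter(annotation).validate_python(data)
    except ValidationError as exc:
        return exc.errors()
    return []


bad_payload = {"TriggerType": "Scheduled"}  # TriggerProperties is missing

plain_errors = collect_errors(PlainUnion, bad_payload)
tagged_errors = collect_errors(TaggedUnion, bad_payload)

# Print (location, error type) pairs for each variant.
for label, errors in (("plain", plain_errors), ("tagged", tagged_errors)):
    print(label, [(e["loc"], e["type"]) for e in errors])
```

The regular union reports failures from every member it tried, while the discriminated union reports only the missing TriggerProperties on the Scheduled model.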
Testing our code:
See for yourself how it works → Here are some unit tests that check the expected behavior.
💡 The tests work for both solutions. Give it a try.
import pytest
from pydantic import TypeAdapter, ValidationError


@pytest.mark.parametrize(
    "data",
    [
        {'TriggerType': 'Scheduled', 'TriggerProperties': {'foo': 'bar'}},
        {'TriggerType': 'OnDemand'},
        {'TriggerType': 'Event'}
    ]
)
def test_trigger_config_valid(data: dict):
    """
    Ensures TriggerConfig can instantiate valid objects
    """
    ta = TypeAdapter(TriggerConfig)
    _ = ta.validate_python(data)


@pytest.mark.parametrize(
    "data",
    [
        {'TriggerType': 'Scheduled'},
        {'TriggerType': 'OnDemand', 'TriggerProperties': {'foo': 'bar'}},
        {'TriggerType': 'Event', 'TriggerProperties': {'foo': 'bar'}}
    ]
)
def test_trigger_config_invalid(data: dict):
    """
    Ensures TriggerConfig raises error when instantiating invalid objects
    """
    ta = TypeAdapter(TriggerConfig)
    with pytest.raises(ValidationError):
        _ = ta.validate_python(data)
Further Reading
- Pydantic’s Documentation: https://docs.pydantic.dev/latest/api/standard_library_types/#discriminated-unions-aka-tagged-unions
- An article I wrote about V2’s new features: Don’t Write Another Line of Code Until You See These Pydantic V2 Breakthrough Features
- This PR (on the pydantic codebase) can give you more info on how things actually work under the hood: https://github.com/pydantic/pydantic/pull/6570
- Discriminated Unions in TypeScript: https://www.typescriptlang.org/docs/handbook/unions-and-intersections.html#discriminating-unions
Summary
Discriminated unions are an advanced feature of the Pydantic V2 toolkit. The Pydantic way is good, especially when compared to implementations in other languages such as TypeScript or C++.
Now you’re using advanced Pydantic features. Go you!