Django REST framework: Serialization: Deeper Look. Part 1
Django REST framework (DRF) is a powerful and flexible toolkit for building Web APIs. In DRF, serializers play a pivotal role, acting as the bridge between your Django models and API clients. They handle serialization (converting model instances into primitive Python data that can be rendered as JSON or another format) and deserialization (validating incoming data and converting it back into Python objects or model instances for storage). But DRF serializers offer much more than basic data conversion. This article takes a deeper look at DRF serialization, exploring core concepts such as fields and their arguments that help you build robust and efficient APIs.
First of all, it is essential to know what serialization and deserialization do under the hood.
Serialization
DRF provides serializers, which are classes that define how data is converted into a format suitable for API responses (often JSON, XML, or others). Serializers map model instances, querysets, or Python data structures to representations that can be easily transmitted over the network.
The following example clarifies the situation:
from rest_framework import serializers
from .models import Book

class BookSerializer(serializers.ModelSerializer):
    class Meta:
        model = Book
        fields = ('title', 'author', 'publication_date')

# Book instance
book = Book.objects.get(pk=1)

# Serialization to primitive Python data, ready to be rendered as JSON
serializer = BookSerializer(book)
json_data = serializer.data
print(json_data)
# Output: {'title': 'The Great Gatsby', 'author': 'F. Scott Fitzgerald', 'publication_date': '1925-04-10'}
- BookSerializer is a serializer class that defines exactly which fields of the corresponding Django model should be serialized. Here, our model is Book.
- We fetched a single instance of Book with id=1. Note that Book.objects.get() returns one model instance, not a QuerySet. To serialize a QuerySet (or any list of instances), pass many=True to the serializer. The Django documentation covers QuerySet in detail.
- BookSerializer(book) converts this complex object into primitive Python data types. Using .data we get that data as a dictionary, which a renderer can then turn into JSON.
Deserialization
Deserialization allows converting data received through API requests (often JSON) back into Python objects. The following example shows the idea:
# Sample JSON data from a request
request_data = {'title': 'Pride and Prejudice', 'author': 'Jane Austen', 'publication_date': '1813-01-28'}

# Deserialization using BookSerializer
serializer = BookSerializer(data=request_data)

# Check if data is valid
if serializer.is_valid():
    serializer.save()  # Saves the data as a new Book instance
    print("Book created successfully!")
else:
    print("Error:", serializer.errors)
- Let's assume a client sent a request to some endpoint and we received the data shown above, already parsed from JSON into a Python dictionary.
- The same class, BookSerializer, can be used to deserialize the data by passing it through the data= parameter.
- But we don't know whether the incoming data is valid. For instance, some required fields may be missing, or some values may not pass validation. For this reason, we have to call the .is_valid() method before accessing the validated data or calling .save(). If validation fails, serializer.errors describes what went wrong.
Well, I hope you got the basic idea behind serialization and deserialization in Django REST framework. Now, in Part 1, let's take a thorough look at the serializer fields, with practical examples for the ones that are hardest to understand.
Serialization Fields
Serializer fields handle converting between primitive values and internal datatypes. They also deal with validating input values, as well as retrieving and setting the values from their parent objects.
This section focuses on practical usage rather than reproducing the whole documentation. It helps to keep the official docs open alongside it:
🔗 https://www.django-rest-framework.org/api-guide/fields/
Core Arguments
Serializer fields can take these arguments:
👉read_only - Read-only fields are included in the API output, but should not be included in the input during create or update operations. Defaults to False.
👉write_only - Set this to True to ensure that the field may be used when updating or creating an instance, but is not included when serializing the representation. Defaults to False. The example below shows the idea:
from django.contrib.auth.models import User
from rest_framework import serializers

class UserSerializer(serializers.ModelSerializer):
    password = serializers.CharField(write_only=True)

    class Meta:
        model = User
        fields = ('username', 'email', 'password')  # password is accepted as input but never returned

    def create(self, validated_data):
        # create_user() hashes the password itself, so no separate set_password() call is needed
        return User.objects.create_user(**validated_data)
This is useful for sensitive data like passwords, internal calculations, or temporary values that you don’t want to expose publicly.
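👉required - Normally an error is raised if the field is not supplied during deserialization. Set this to False if the field may be omitted. Defaults to True. For fields that ModelSerializer generates automatically, the value is derived from the model field, for example from blank=True or a model-level default.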
👉default - If set, this gives the default value that will be used for the field if no input value is supplied. If not set, the default behavior is to not populate the attribute at all. Setting a default implies that the field is not required, and the default is not applied during partial updates. It may be a plain value or a callable, which is evaluated each time the default is needed. The default mainly matters on input, when new instances are created through the serializer's save() method; on output it is used only if the attribute is missing from the object being serialized. Here are the main usage scenarios:
Pre-Populating Fields. Use a default field to provide a value for a model field that might not always be included in the request data.
from rest_framework import serializers
from .models import User

class UserSerializer(serializers.ModelSerializer):
    is_active = serializers.BooleanField(default=True)

    class Meta:
        model = User
        fields = ('username', 'email', 'is_active')

# Request data (might not always include is_active)
request_data = {'username': 'john', 'email': 'john@example.com'}

# Deserialization and creation with default
serializer = UserSerializer(data=request_data)
if serializer.is_valid():
    user = serializer.save()
    print(user.is_active)  # Output: True (default value used)
Conditional Defaults. A callable passed to default is called without arguments (or with the field itself when the callable sets requires_context = True), so it cannot read validated_data. When a fallback value depends on other fields in the request, compute it in the serializer's validate() method instead:
from rest_framework import serializers
from .models import Product

class ProductSerializer(serializers.ModelSerializer):
    is_featured = serializers.BooleanField(required=False)

    class Meta:
        model = Product
        fields = ('name', 'price', 'is_featured')

    def validate(self, attrs):
        # Fill in is_featured only when the client did not send it
        if 'is_featured' not in attrs:
            attrs['is_featured'] = attrs.get('price', 0) > 100
        return attrs

# Request data with variable price
request_data1 = {'name': 'T-Shirt', 'price': 80}
request_data2 = {'name': 'Laptop', 'price': 1200}

# Deserialization and creation with the conditional fallback
serializer1 = ProductSerializer(data=request_data1)
if serializer1.is_valid():
    product1 = serializer1.save()
    print(product1.is_featured)  # Output: False (price is not above 100)

serializer2 = ProductSerializer(data=request_data2)
if serializer2.is_valid():
    product2 = serializer2.save()
    print(product2.is_featured)  # Output: True (price is above 100)
Here, the fallback for is_featured depends on the price of the product: when the client does not send is_featured, it becomes True if price > 100 and False otherwise.
👉allow_null - Normally an error will be raised if None is passed to a serializer field. Set this keyword argument to True if None should be considered a valid value. Defaults to False.
👉source - this argument controls which attribute a field reads from during serialization and writes to during deserialization. It allows you to:
Access Nested Data. Use source to map a serializer field to a different attribute or method on the model instance, including dotted paths that traverse relationships:
from rest_framework import serializers
from .models import Author, Book

class AuthorSerializer(serializers.ModelSerializer):
    class Meta:
        model = Author
        fields = ('name',)

class BookSerializer(serializers.ModelSerializer):
    # source is a dotted attribute path, not Python code: write 'author_set.first' without parentheses
    author = AuthorSerializer(source='author_set.first', read_only=True)  # Get first author

    class Meta:
        model = Book
        fields = ('title', 'author')

# Book instance with multiple authors
book = Book.objects.get(pk=1)

# Serialization with source for nested data
serializer = BookSerializer(book)
print(serializer.data)  # Output: {'title': 'The Catcher in the Rye', 'author': {'name': 'J. D. Salinger'}}
Explanation:
- BookSerializer has an author field with source='author_set.first'. DRF walks the dotted path one attribute at a time and calls any method it finds that needs no arguments, so it retrieves the first author from the author_set related manager (this assumes Author points at Book through a ForeignKey or ManyToManyField without a custom related_name) and serializes that object.
- AuthorSerializer is used as the field itself, which specifies how the nested author data should be serialized. The field is marked read_only=True because a value computed through a method cannot be written back.
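How a dotted source path gets resolved can be pictured with a few lines of plain Python. This is a simplified sketch of the idea (walk each name, call it if it is callable), not DRF's actual implementation; resolve_source, FakeManager and FakeBook are hypothetical names invented for the illustration.

```python
def resolve_source(instance, source):
    """Walk a dotted path, calling any callable found along the way with no arguments."""
    value = instance
    for name in source.split('.'):
        value = getattr(value, name)
        if callable(value):
            value = value()
    return value


class FakeManager:
    """Hypothetical stand-in for a related manager such as book.author_set."""

    def __init__(self, items):
        self._items = items

    def first(self):
        return self._items[0] if self._items else None


class FakeBook:
    def __init__(self, title, authors):
        self.title = title
        self.author_set = FakeManager(authors)


book = FakeBook('The Catcher in the Rye', ['J. D. Salinger'])
print(resolve_source(book, 'title'))
print(resolve_source(book, 'author_set.first'))
```

The path contains attribute and method names only. Anything that needs arguments, filtering, or real Python expressions belongs in a model method or a SerializerMethodField rather than in source.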
Use Custom Methods. Point source at a method defined on your model to perform specific logic for data retrieval (for logic that lives on the serializer, use SerializerMethodField instead):
from decimal import Decimal
from django.db import models
from rest_framework import serializers

class Product(models.Model):
    name = models.CharField(max_length=100)
    price = models.DecimalField(max_digits=10, decimal_places=2)

    def get_discount_price(self):
        # Decimal cannot be multiplied by a float, so the rate is a Decimal too
        return self.price * Decimal('0.9')  # Example discount calculation

class ProductSerializer(serializers.ModelSerializer):
    # DecimalField requires max_digits and decimal_places
    discount_price = serializers.DecimalField(
        source='get_discount_price', max_digits=10, decimal_places=2, read_only=True
    )

    class Meta:
        model = Product
        fields = ('name', 'price', 'discount_price')

# Product instance
product = Product.objects.get(pk=1)

# Serialization with source for custom method
serializer = ProductSerializer(product)
print(serializer.data)
# Output (decimals are rendered as strings by default): {'name': 'T-Shirt', 'price': '10.00', 'discount_price': '9.00'}
Explanation:
- ProductSerializer has a discount_price field with source='get_discount_price'. This tells DRF to call the get_discount_price method on the product instance to get the value.
- The read_only=True argument means discount_price appears in the output only; if a client sends it in request data, it is ignored during deserialization.
Modify Data During Deserialization. source itself does not transform values; it only decides which attribute the validated value is written to. To transform or reject a value during deserialization, add a field-level validate_<field_name> method to the serializer:
from rest_framework import serializers
from .models import Tag

class TagSerializer(serializers.ModelSerializer):
    name = serializers.CharField()

    class Meta:
        model = Tag
        fields = ('name',)

    def validate_name(self, value):
        if value.lower() == 'admin':
            raise serializers.ValidationError("Tag name cannot be 'admin'")
        return value.upper()  # Transform to uppercase

# Request data with lowercase tag name
request_data = {'name': 'new_tag'}

# Deserialization with field-level validation
serializer = TagSerializer(data=request_data)
if serializer.is_valid():
    serializer.save()  # Stored as 'NEW_TAG'
else:
    print("Error:", serializer.errors)
Explanation:
- TagSerializer declares a name field and a validate_name method. DRF looks for methods named validate_<field_name> and calls them after the field's own validation has passed.
- validate_name rejects the reserved name 'admin' and returns the uppercased value, which is what ends up in validated_data and is assigned to the name field of the model.
👉error_messages - A dictionary of error codes to error messages.
👉label - A short text string that may be used as the name of the field in HTML form fields or other descriptive elements.
👉help_text - A text string that may be used as a description of the field in HTML form fields or other descriptive elements.
👉initial - A value that should be used for pre-populating the value of HTML form fields. You may pass a callable to it, just as you may do with any regular Django Field:
import datetime
from rest_framework import serializers

class ExampleSerializer(serializers.Serializer):
    day = serializers.DateField(initial=datetime.date.today)
👉style - A dictionary of key-value pairs that can be used to control how renderers should render the field. Two examples here are input_type and base_template:
# Use <input type="password"> for the input.
password = serializers.CharField(
    style={'input_type': 'password'}
)

# Use a radio input instead of a select input.
color_channel = serializers.ChoiceField(
    choices=['red', 'green', 'blue'],
    style={'base_template': 'radio.html'}
)
We skipped validators, as we will consider them in the next part.
Boolean Fields
👉BooleanField - A boolean representation. For instance:
is_published = serializers.BooleanField(default=False)
String Fields
👉CharField - A text representation. Optionally validates the text to be shorter than max_length and longer than min_length.
👉EmailField - A text representation, validates the text to be a valid e-mail address.
👉RegexField - A text representation that validates that the given value matches a certain regular expression.
👉SlugField - A RegexField that validates the input against the pattern [a-zA-Z0-9_-]+.
👉URLField - A RegexField that validates the input against a URL matching pattern. Expects fully qualified URLs of the form http://<host>/<path>.
Consider an example for the above 5 fields at once:
class ProductSerializer(serializers.Serializer):
    name = serializers.CharField(max_length=100)  # Textual field with max length
    description = serializers.CharField(required=False)  # Optional text field
    email = serializers.EmailField()  # Field for email addresses
    sku = serializers.RegexField(regex=r'^[A-Z]{3}-\d{4}$')  # Hypothetical SKU format such as ABC-1234
    category_slug = serializers.SlugField()  # Field for slugs (letters, digits, underscores, hyphens)
    website_url = serializers.URLField()  # Field for valid URLs
👉UUIDField - A field that ensures the input is a valid UUID string. The to_internal_value method will return a uuid.UUID instance. On output the field will return a string in the canonical hyphenated format, for example: de305d54-75b4-431b-adb2-eb6b9e546013.
👉FilePathField - A field whose choices are limited to the filenames in a certain directory on the filesystem.
👉IPAddressField - A field that ensures the input is a valid IPv4 or IPv6 string.
Consider an example for the above 3 fields at once:
class UploadSerializer(serializers.Serializer):
    upload_id = serializers.UUIDField(read_only=True)  # Unique identifier (generated)
    # Choices are the files found in /uploads/; the directory must exist when the class is defined
    file_path = serializers.FilePathField(path='/uploads/', allow_files=True)
    ip_address = serializers.IPAddressField(protocol='both')  # Allows IPv4 and IPv6
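What UUIDField and IPAddressField check can be approximated with the standard library. The sketch below is only an analogy: DRF relies on its own field logic and Django's validators, and the is_ip_address helper is a hypothetical name.

```python
import ipaddress
import uuid

raw = 'de305d54-75b4-431b-adb2-eb6b9e546013'
parsed = uuid.UUID(raw)   # the kind of object UUIDField places in validated_data
canonical = str(parsed)   # canonical hyphenated form used on output

# The uuid module also accepts other spellings and normalizes them
normalized = str(uuid.UUID('DE305D5475B4431BADB2EB6B9E546013'))


def is_ip_address(value):
    """Return True for a valid IPv4 or IPv6 address string."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


print(canonical)
print(is_ip_address('192.168.0.1'), is_ip_address('::1'), is_ip_address('999.1.1.1'))
```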
Numeric Fields
👉IntegerField - An integer representation.
👉FloatField - A floating point representation.
👉DecimalField - A decimal representation, represented in Python by a Decimal instance.
Consider an example for the above 3 fields at once:
class ProductPriceSerializer(serializers.Serializer):
    price = serializers.IntegerField(min_value=0)  # Non-negative integer price
    discount = serializers.FloatField(min_value=0.0, max_value=1.0)  # Discount between 0 and 1 (inclusive)
    sale_price = serializers.DecimalField(max_digits=5, decimal_places=2)  # Up to 5 digits, 2 of them decimal places
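Why money-like values usually go through DecimalField rather than FloatField can be shown with the standard library alone. This is a small illustrative sketch, independent of DRF; the variable names are made up for the example.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly
float_sum = 0.1 + 0.2

# Decimal arithmetic on string-initialized values stays exact
decimal_sum = Decimal('0.1') + Decimal('0.2')

# Python refuses to mix Decimal and float in arithmetic
try:
    Decimal('10.00') * 0.9
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(float_sum == 0.3)
print(decimal_sum == Decimal('0.3'))
print(mixing_allowed)
```

The last point matters in practice: values that come out of a DecimalField are Decimal instances, so any rate or factor applied to them should be a Decimal as well.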
Date and Time Fields
👉DateTimeField - A date and time representation.
👉DateField - A date representation.
👉TimeField - A time representation.
👉DurationField - A duration representation. The validated_data for this field will contain a datetime.timedelta instance. The representation is a string following this format: '[DD] [HH:[MM:]]ss[.uuuuuu]'.
Consider an example for the above 4 fields at once:
class EventSerializer(serializers.Serializer):
    start_datetime = serializers.DateTimeField()  # Full date and time
    event_date = serializers.DateField()  # Just date information
    duration = serializers.DurationField()  # Time duration (e.g., hours, minutes)
    end_time = serializers.TimeField(required=False)  # Optional time field
Choice Selection Fields
👉ChoiceField - A field that can accept a value out of a limited set of choices. Used by ModelSerializer to automatically generate fields if the corresponding model field includes a choices=… argument.
👉MultipleChoiceField - A field that can accept a set of zero, one or many values, chosen from a limited set of choices.
Consider an example for the above 2 fields at once:
class ProductCategorySerializer(serializers.Serializer):
    STATUS_CHOICES = (
        ('available', 'Available'),
        ('out_of_stock', 'Out of Stock'),
        ('discontinued', 'Discontinued'),
    )
    category_name = serializers.CharField(max_length=50)
    status = serializers.ChoiceField(choices=STATUS_CHOICES)
    # Tags can have multiple values from a defined set
    tags = serializers.MultipleChoiceField(choices=(
        ('electronics', 'Electronics'),
        ('clothing', 'Clothing'),
        ('home_goods', 'Home Goods'),
    ))
File Upload Fields
👉FileField - A file representation. Performs Django's standard FileField validation.
👉ImageField - An image representation. Validates the uploaded file content as matching a known image format.
Consider an example for the above 2 fields at once:
class UploadSerializer(serializers.Serializer):
    document = serializers.FileField(use_url=True)  # Any file type
    profile_picture = serializers.ImageField(max_length=100, allow_empty_file=True)  # Image specific

    def validate_document(self, value):
        if value.content_type not in ['application/pdf', 'text/plain']:
            raise serializers.ValidationError('Only PDF and text files allowed for documents.')
        return value
Composite Fields
👉ListField - A field class that validates a list of objects. For example, to validate a list of integers you might use something like the following:
scores = serializers.ListField(
    child=serializers.IntegerField(min_value=0, max_value=100)
)
👉DictField - A field class that validates a dictionary of objects. The keys in DictField are always assumed to be string values. For example, to create a field that validates a mapping of strings to strings, you would write something like this:
document = serializers.DictField(child=serializers.CharField())
or:
class UserDataSerializer(serializers.Serializer):
    # List of strings
    skills = serializers.ListField(child=serializers.CharField())
    # Dictionary of key-value pairs
    profile_data = serializers.DictField(child=serializers.CharField())
👉HStoreField - A preconfigured DictField that is compatible with Django's PostgreSQL HStoreField. HStore keeps key-value pairs in a single column instead of a separate column per key, which makes it well suited to scenarios where you have a dynamic set of keys with string values.
👉JSONField - A field class that validates that the incoming data structure consists of valid JSON primitives. In its alternate binary mode, it will represent and validate JSON-encoded binary strings.
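What "valid JSON primitives" means in practice can be sketched with the standard json module. This is only an approximation of the idea, not JSONField's implementation, and is_json_compatible is a hypothetical helper.

```python
import json


def is_json_compatible(value):
    """Return True if the structure contains only JSON-encodable primitives."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


# Dicts, lists, strings, numbers, booleans and None are fine
ok = is_json_compatible({'tags': ['a', 'b'], 'meta': {'pages': 3, 'draft': None}})

# A set has no JSON equivalent
bad = is_json_compatible({'tags': {'a', 'b'}})

# Binary mode deals with JSON-encoded bytes rather than Python structures
decoded = json.loads(b'{"pages": 3}'.decode('utf-8'))

print(ok, bad, decoded)
```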
Miscellaneous Fields
👉ReadOnlyField - A field class that simply returns the value of the field without modification. This field is used by default with ModelSerializer when including field names that relate to an attribute rather than a model field.
👉HiddenField - A field class that does not take a value based on user input, but instead takes its value from a default value or callable. For example, to include a field that always provides the current time as part of the serializer validated data, you would use the following:
modified = serializers.HiddenField(default=timezone.now)
👉ModelField - A generic field that can be tied to any arbitrary model field. It is used to create serializer fields for custom model fields, without having to create a new custom serializer field.
👉SerializerMethodField - This is a read-only field. It gets its value by calling a method on the serializer class it is attached to. It can be used to add any sort of data to the serialized representation of your object.
Consider an example that combines ReadOnlyField, ModelField and SerializerMethodField (HiddenField was shown above):
from django.db.models import Avg
from rest_framework import serializers
from .models import Book

class BookSerializer(serializers.ModelSerializer):
    url = serializers.ReadOnlyField(source='get_absolute_url')  # Derived value from a model method
    # ModelField wraps an actual model field instance (here the category field of Book)
    category = serializers.ModelField(model_field=Book._meta.get_field('category'))
    average_rating = serializers.SerializerMethodField()  # Calculated field

    def get_average_rating(self, obj):
        # Logic to calculate and return the average rating for this book
        # (assumes a related 'reviews' set whose items have a 'rating' field)
        average_rating = obj.reviews.aggregate(avg_rating=Avg('rating'))['avg_rating']
        return average_rating or 0.0  # Handle cases where no reviews exist

    class Meta:
        model = Book  # Meta.model takes the model class and only applies to ModelSerializer
        fields = ('title', 'author', 'publication_date', 'url', 'category', 'average_rating')
        read_only_fields = ('publication_date',)  # Read-only field
❗️ We skipped the validators argument for now, as we will consider it more practically in the upcoming part of the article.
🏁 To work with serializers with confidence and a broad understanding, it is worth trying out each field and core argument at least once and analyzing how it behaves. That's it for today, thanks for your attention!