How to use DRF serializers effectively during write operations

Akshar Raaj
5 min read · Aug 8, 2019



Agenda

This post assumes a basic familiarity with Django REST Framework.

We will discuss the following:

  • How to add custom field validation
  • How to add cross field validation
  • When and how to override to_internal_value()
  • When and how to override create()

Basic serializer

Let’s write a serializer which will allow creating User instances.

Let’s validate some data and create a user.

In [1]: from accounts.serializers import UserSerializer
In [2]: data = {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123'}
In [3]: serializer = UserSerializer(data=data)
In [4]: serializer.is_valid()
Out[4]: True
In [5]: serializer.save()
Out[5]: <User: john>

Custom field validation

Custom field validation can be accomplished by defining a method named validate_<field_name> on the serializer.

Let’s enforce that the password has at least one non-alphanumeric character. We need to define a validate_password() method to achieve this.

A custom validation method must either raise a ValidationError or return the validated value, original or modified. That is why validate_password() returns value.

Let’s validate data where password doesn’t contain any special character.

In [3]: data = {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123'}
In [4]: serializer = UserSerializer(data=data)
In [5]: serializer.is_valid()
Out[5]: False
In [6]: serializer.errors
Out[6]: {'password': [ErrorDetail(string='password must have atleast one special character.', code='invalid')]}

Let’s validate data where password contains a special character.

In [6]: data = {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123#'}
In [7]: serializer = UserSerializer(data=data)
In [8]: serializer.is_valid()
Out[8]: True

Similarly, if we wanted to validate first_name, the method would be named validate_first_name().

Cross field validation

If a validation rule needs to access multiple fields at once, the correct serializer hook point is the validate() method.

Let’s enforce that the first_name and last_name be different. We will have to override serializer’s validate() to achieve this.

Let’s validate data where first_name and last_name are the same.

In [10]: data = {'first_name': 'john', 'last_name': 'john', 'username': 'john', 'password': 'abc123#'}
In [12]: serializer = UserSerializer(data=data)
In [13]: serializer.is_valid()
Out[13]: False
In [14]: serializer.errors
Out[14]: {'non_field_errors': [ErrorDetail(string="first_name and last_name shouldn't be same.", code='invalid')]}

Let’s validate data where first_name and last_name are different.

In [1]: data = {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123#'}
In [3]: serializer = UserSerializer(data=data)
In [4]: serializer.is_valid()
Out[4]: True

When and how to override to_internal_value()

DRF has a hookpoint called to_internal_value(). It can be used to do some pre-processing before validation code is executed.

Suppose your frontend or mobile app sends the user information wrapped in another dictionary under the key user. Assume the POSTed data looks like:

data = {'user': {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123#'}}

Let’s try to validate this data and see what the serializer says.

In [5]: data = {'user': {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123#'}}
In [6]: serializer = UserSerializer(data=data)
In [7]: serializer.is_valid()
Out[7]: False

In such a case, the user info needs to be extracted from the outer dictionary before the fields are validated. We can achieve this by overriding to_internal_value().

Full UserSerializer looks like:

Let’s validate the same data again and see what the serializer says:

In [2]: data = {'user': {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123#'}}
In [3]: serializer = UserSerializer(data=data)
In [4]: serializer.is_valid()
Out[4]: True

Our serializer doesn’t complain anymore.

Let’s discuss one more scenario where to_internal_value() would be ideal.

By default, DRF expects any POSTed date or datetime in ISO 8601 format, for example YYYY-MM-DD for a date and YYYY-MM-DDThh:mm[:ss] for a datetime. Let’s verify that we aren’t able to post in any other format and then see how we can change this behavior to allow other formats.

User has a datetime field called date_joined. Let’s add this field on serializer.

Let’s use the format MM/DD/YYYY for date_joined and validate the data.

In [9]: data = {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123#', 'date_joined': '06/12/2019'}
In [10]: serializer = UserSerializer(data=data)
In [11]: serializer.is_valid()
Out[11]: False
In [12]: serializer.errors
Out[12]: {'date_joined': [ErrorDetail(string='Datetime has wrong format. Use one of these formats instead: YYYY-MM-DDThh:mm[:ss[.uuuuuu]][+HH:MM|-HH:MM|Z].', code='invalid')]}

Notice the error message: DRF complains that date_joined must use the ISO 8601 format YYYY-MM-DDThh:mm[:ss[.uuuuuu]] with an optional timezone offset.

Let’s override to_internal_value(), use dateutil.parser.parse() to parse the date, and then call the parent class’s to_internal_value() for further validation.

Let’s validate the same data again.

In [2]: data = {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123#', 'date_joined': '06/12/2019'}
In [3]: serializer = UserSerializer(data=data)
In [4]: serializer.is_valid()
Out[4]: True
In [5]: serializer.validated_data
Out[5]:
OrderedDict([('username', 'john'),
('first_name', 'john'),
('last_name', 'doe'),
('password', 'abc123#'),
('date_joined',
datetime.datetime(2019, 6, 12, 0, 0, tzinfo=<UTC>))])

When and how to override create()

Serializer has a method called create(). create() is executed when serializer.save() is called on a serializer that was not given an existing instance. The default behavior of a ModelSerializer’s create() is to call the model manager’s create().

create() should be overridden when we want to do something different from this default behavior. While creating user instances, we would want to call User.objects.create_user() instead of User.objects.create() so that password gets hashed. This is an ideal scenario to override create.

Let’s try it out on UserSerializer. We will check how many users exist in the database and then call serializer.save() to create another one.

In [8]: from django.contrib.auth.models import User
In [9]: User.objects.count()
Out[9]: 1

Let’s validate some data and create a user.

In [10]: data = {'first_name': 'john', 'last_name': 'doe', 'username': 'john', 'password': 'abc123#', 'date_joined': '06/12/2019'}
In [11]: serializer = UserSerializer(data=data)
In [12]: serializer.is_valid()
Out[12]: True
In [13]: serializer.save()
Out[13]: <User: john>
In [14]: User.objects.count()
Out[14]: 2

Let’s check if password was hashed or not.

In [15]: user = User.objects.latest('pk')
In [16]: user.password
Out[16]: 'abc123#'

The password wasn’t hashed because the serializer used the default behaviour: serializer.save() internally called User.objects.create(), which stores the password exactly as given.

Let’s remedy this by overriding create().

Full serializer at this point looks like:

Let’s validate and save data again.

In [9]: data = {'first_name': 'john', 'last_name': 'doe', 'email': 'john@doe.com', 'username': 'john', 'password': 'abc123#', 'date_joined': '06/12/2019'}
In [10]: serializer = UserSerializer(data=data)
In [12]: serializer.is_valid()
Out[12]: True
In [13]: serializer.save()
Out[13]: <User: john>
In [14]: user = User.objects.latest('pk')
In [15]: user.password
Out[15]: 'pbkdf2_sha256$100000$08HZhymedEOB$2O+WXqizLBZgXVFVc7iw0r8I2Z0gAu0GulJNUY4Dj2s=' # hashed password
In [17]: user.check_password('abc123#')
Out[17]: True

The serializer’s create() is also the correct hook point if we want to call the manager’s get_or_create() instead of the manager’s create().

Conclusion

This post covered serializer’s write behavior. If you want to understand how to effectively use serializers during read operations, see this.
