Exploring Django-Taggit’s Data Model
This post illustrates how the different customization options of
django-taggit are represented in the database.
Database Schema After Installation
Following the setup instructions will add two tables to the database:
- taggit_tag is the default table to store the tags
- taggit_taggeditem acts as a lookup table relating tags to model instances.
By default, django-taggit uses Django’s Contenttypes Framework to allow for generic relations, thus making each model in our Django app taggable.
Tagging a Model Instance
Let’s enter the Django shell and add a tagged instance, based on the README on GitHub:
$ ./manage.py shell
>>> from <app_name>.models import ExpVanillaTags
>>> demo = ExpVanillaTags.objects.create(title="Demo 1")
>>> demo.tags.add("summer", "autumn", "winter", "spring")
>>> demo.tags.all()
<QuerySet [<Tag: autumn>, <Tag: spring>, <Tag: winter>, <Tag: summer>]>
>>> ExpVanillaTags.objects.filter(tags__name__in=["summer"])
<QuerySet [<ExpVanillaTags: Demo 1>]>
taggit_tag in the database shows a few rows now:
id  name    slug
1   summer  summer
2   winter  winter
3   spring  spring
4   autumn  autumn
The lookup table taggit_taggeditem looks like this:
id  object_id  content_type_id  tag_id
1   1          9                1
2   1          9                2
3   1          9                3
4   1          9                4
- object_id is the primary key of the instance that we created with demo = ExpVanillaTags.objects.create(title="Demo 1")
- content_type_id is the id of the content type that Django automatically created when migrating our demo model (see the django_content_type table for its entries)
- tag_id is the primary key of the tag instance
The lookup table states that the instance with id 1 of the content type 9 is tagged with tags 1, 2, 3, and 4.
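To make that lookup concrete, here is a minimal sketch that rebuilds the two tables in an in-memory SQLite database with Python's standard sqlite3 module and resolves the tags of one instance. The column definitions are simplified assumptions (the tables that django-taggit's migrations create carry more constraints and indexes), and the helper name tags_for is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-ins for the two tables django-taggit adds.
conn.executescript("""
    CREATE TABLE taggit_tag (
        id INTEGER PRIMARY KEY,
        name TEXT UNIQUE,
        slug TEXT UNIQUE
    );
    CREATE TABLE taggit_taggeditem (
        id INTEGER PRIMARY KEY,
        object_id INTEGER,
        content_type_id INTEGER,
        tag_id INTEGER REFERENCES taggit_tag (id)
    );
""")

# Same rows as in the tables shown above.
conn.executemany(
    "INSERT INTO taggit_tag (id, name, slug) VALUES (?, ?, ?)",
    [(1, "summer", "summer"), (2, "winter", "winter"),
     (3, "spring", "spring"), (4, "autumn", "autumn")],
)
conn.executemany(
    "INSERT INTO taggit_taggeditem (object_id, content_type_id, tag_id) "
    "VALUES (?, ?, ?)",
    [(1, 9, 1), (1, 9, 2), (1, 9, 3), (1, 9, 4)],
)


def tags_for(content_type_id, object_id):
    """Return the tag names attached to one instance, ordered by tag id."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM taggit_taggeditem AS ti
        JOIN taggit_tag AS t ON t.id = ti.tag_id
        WHERE ti.content_type_id = ? AND ti.object_id = ?
        ORDER BY t.id
        """,
        (content_type_id, object_id),
    )
    return [name for (name,) in rows]


print(tags_for(9, 1))  # ['summer', 'winter', 'spring', 'autumn']
```

Note that both content_type_id and object_id are needed to identify the tagged instance, because object_id alone is only unique within one model's table.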
Customizing the Default Behavior
So far we have seen what we get out-of-the-box when following django-taggit’s Getting Started guide.
As the Customizing Taggit section shows, we can switch the underlying models to achieve different behavior. Why we might want to do this is stated in the docs:
By default django-taggit uses a “through model” with a GenericForeignKey on it, that has another ForeignKey to an included Tag model. However, there are some cases where this isn’t desirable, for example
1. if you want the speed and referential guarantees of a real ForeignKey,
2. if you have a model with a non-integer primary key, or
3. if you want to store additional data about a tag, such as whether it is official.
In these cases django-taggit makes it easy to substitute your own through model, or Tag model.
A variation of the third argument is using a custom tag-model to store tags in dedicated models rather than a site-wide tag-repository. Imagine we are tagging semantically different entities like blog categories, image keywords, and employee skills. Chances are that we have forms somewhere that let users enter these tags, providing suggestions while they are typing. The auto-completion should probably only suggest relevant tags, which are simpler to provide when querying a dedicated table rather than a site-wide tag repository.
1. Speed And Referential Guarantees of a Real Foreign Key
Using generic foreign keys comes at the cost of an additional join, as resolving the related entity requires querying the content type table. As mentioned in the docs, we can modify our through model and specify the foreign key to make it more explicit.
If we migrate the app with these changes and inspect the database, we will see two new tables: one for the Food model that Django generates for us, and one for a through model based on django-taggit’s TaggedItemBase.
Comparing this schema with the default one above shows the two changes already mentioned.
- Specifying a custom through model means that we do not use taggit_taggeditem but our custom through table.
- content_object_id in the lookup table references the related entity directly, without using Django’s generic relations.
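The "referential guarantees" part can be illustrated with another small sqlite3 sketch. The table names demo_food and demo_taggedfood are hypothetical stand-ins for whatever Django derives from the app and model names, and the constraints are simplified compared to what Django's migrations would emit. The point is only that a real foreign key lets the database reject a tag entry for a row that does not exist, which a plain object_id column in a generic relation cannot do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE taggit_tag (
        id INTEGER PRIMARY KEY,
        name TEXT UNIQUE,
        slug TEXT UNIQUE
    );
    -- Hypothetical table for the Food model.
    CREATE TABLE demo_food (
        id INTEGER PRIMARY KEY,
        name TEXT
    );
    -- Hypothetical through table: a direct foreign key replaces the
    -- (content_type_id, object_id) pair of the generic relation.
    CREATE TABLE demo_taggedfood (
        id INTEGER PRIMARY KEY,
        tag_id INTEGER NOT NULL REFERENCES taggit_tag (id),
        content_object_id INTEGER NOT NULL REFERENCES demo_food (id)
    );
""")

conn.execute("INSERT INTO taggit_tag (id, name, slug) VALUES (1, 'summer', 'summer')")
conn.execute("INSERT INTO demo_food (id, name) VALUES (1, 'apple')")
conn.execute("INSERT INTO demo_taggedfood (tag_id, content_object_id) VALUES (1, 1)")


def tag_missing_food():
    """Try to tag a food row that does not exist.

    Returns True if the database refuses the insert.
    """
    try:
        conn.execute(
            "INSERT INTO demo_taggedfood (tag_id, content_object_id) VALUES (1, 999)"
        )
    except sqlite3.IntegrityError:
        return True
    return False


print(tag_missing_food())  # True
```

Resolving the tagged entity now needs a single join from demo_taggedfood to demo_food, with no detour through the content type table.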
2. Supporting Non-Integer Primary Keys
We will not go into details here, as the use case and the resulting database structure are pretty clear. Looking at the data types in taggit_taggeditem reveals that object_id is assumed to be an integer. If that is not the case for us, extending from CommonGenericTaggedItemBase allows us to specify the data type of our primary key. After migration, the lookup table will look the same as in the Getting Started setup, except that object_id has a different data type.
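As a rough illustration of the resulting table, the sketch below stores a UUID string in the column that holds the instance's primary key (object_id in django-taggit's generic through models). The table name demo_stringtaggeditem is hypothetical, and the schema is a simplified assumption rather than the exact output of a migration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taggit_tag (
        id INTEGER PRIMARY KEY,
        name TEXT UNIQUE,
        slug TEXT UNIQUE
    );
    -- Hypothetical through table: same columns as taggit_taggeditem,
    -- but object_id is text instead of an integer.
    CREATE TABLE demo_stringtaggeditem (
        id INTEGER PRIMARY KEY,
        object_id TEXT NOT NULL,
        content_type_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL REFERENCES taggit_tag (id)
    );
""")

# A model instance whose primary key is a UUID rather than an integer.
instance_pk = str(uuid.uuid4())

conn.execute("INSERT INTO taggit_tag (id, name, slug) VALUES (1, 'summer', 'summer')")
conn.execute(
    "INSERT INTO demo_stringtaggeditem (object_id, content_type_id, tag_id) "
    "VALUES (?, 9, 1)",
    (instance_pk,),
)

stored_pk, stored_type = conn.execute(
    "SELECT object_id, typeof(object_id) FROM demo_stringtaggeditem"
).fetchone()

print(stored_type)  # text
```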
3. Storing Additional Data in Tags
Let’s follow the docs again and add a custom tag model derived from TagBase, together with a generic through model derived from GenericTaggedItemBase that points to it.
This time four new tables are added to our database: two belong to the usual Django models, and the other two are generated based on the django-taggit classes that we extend from.
We use our custom lookup table <app_name>_taggedwhatever, while the tags attached to Plane instances are stored in the table of the custom tag model.
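The following sqlite3 sketch shows what such a dedicated tag table buys us. The table names demo_customtag and demo_taggedwhatever, the official column (borrowed from the example in the docs quote above), the sample tags, and the helper suggest are all assumptions made for illustration. Because the tags live in their own table, an auto-completion query only ever sees tags of this kind and can also filter on the extra column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical custom tag table with one additional column.
    CREATE TABLE demo_customtag (
        id INTEGER PRIMARY KEY,
        name TEXT UNIQUE,
        slug TEXT UNIQUE,
        official INTEGER NOT NULL DEFAULT 0
    );
    -- Hypothetical custom lookup table; tag_id points to the custom tag table.
    CREATE TABLE demo_taggedwhatever (
        id INTEGER PRIMARY KEY,
        object_id INTEGER NOT NULL,
        content_type_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL REFERENCES demo_customtag (id)
    );
""")

conn.executemany(
    "INSERT INTO demo_customtag (name, slug, official) VALUES (?, ?, ?)",
    [("jet", "jet", 1), ("jumbo", "jumbo", 0), ("glider", "glider", 1)],
)


def suggest(prefix, official_only=False):
    """Return tag names starting with prefix, optionally only official ones."""
    sql = "SELECT name FROM demo_customtag WHERE name LIKE ?"
    if official_only:
        sql += " AND official = 1"
    sql += " ORDER BY name"
    return [name for (name,) in conn.execute(sql, (prefix + "%",))]


print(suggest("j"))                      # ['jet', 'jumbo']
print(suggest("j", official_only=True))  # ['jet']
```

With the site-wide taggit_tag table, the same suggestion query would first have to join through the lookup table and filter by content type to exclude unrelated tags.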
Of course we could also drop support for generic foreign keys and, as shown in section 1, extend from TaggedItemBase instead. That way we get a dedicated tag repository for one single model type.
With its customization options, django-taggit is a versatile drop-in solution for tagging entities in Django. Visualizing the database structure helps to gain a clearer understanding of when to use which option.