Exploring Django-Taggit’s Data Model

Stephan Herzog
Dec 26, 2016 · 4 min read

This post illustrates how the different customization options of django-taggit are represented in the database.

Database Schema After Installation

Following the setup instructions will add two tables to the database:

  1. taggit_tag
  2. taggit_taggeditem
  • taggit_tag is the default table that stores the tags.
  • taggit_taggeditem acts as a lookup table relating tags to model instances.

By default, django-taggit uses Django’s Contenttypes Framework to allow for generic relations, thus making each model in our Django app taggable.

Tagging a Model Instance

Let’s enter the Django shell and add a tagged instance, following the README on GitHub.

Inspecting taggit_tag in the database shows a few rows now:

The lookup table taggit_taggeditem looks like this:

  • object_id is the primary key of the instance that we created with demo = ExpVanillaTags.objects.create(title="Demo 1")
  • content_type_id is the id of the content type that Django automatically created when migrating our demo model (see the django_content_type table for its entries)
  • tag_id is the primary key of the tag instance

The lookup table states that the instance with id 1 of the content type 9 is tagged with tags 1, 2, 3, and 4.
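
To make the role of the three columns concrete, here is a minimal sketch that rebuilds a simplified version of these tables in an in-memory SQLite database with Python’s sqlite3 module. It is not what django-taggit executes internally: the column list is trimmed to what this post discusses, the app label demo and the four tag names are made up for illustration, and the real tables carry additional indexes and constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE django_content_type (id INTEGER PRIMARY KEY, app_label TEXT, model TEXT);
CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE, slug TEXT UNIQUE);
CREATE TABLE taggit_taggeditem (
    id INTEGER PRIMARY KEY,
    object_id INTEGER NOT NULL,  -- pk of the tagged instance (no FK constraint)
    content_type_id INTEGER NOT NULL REFERENCES django_content_type (id),
    tag_id INTEGER NOT NULL REFERENCES taggit_tag (id)
);
""")

# Content type 9 stands for the demo model; "demo" is a hypothetical app label.
conn.execute("INSERT INTO django_content_type VALUES (9, 'demo', 'expvanillatags')")

# Four made-up tags that receive the ids 1 to 4.
tag_names = ["red", "green", "sweet", "fresh"]
conn.executemany(
    "INSERT INTO taggit_tag (id, name, slug) VALUES (?, ?, ?)",
    [(i, name, name) for i, name in enumerate(tag_names, start=1)],
)

# Instance 1 of content type 9 is tagged with tags 1, 2, 3, and 4.
conn.executemany(
    "INSERT INTO taggit_taggeditem (object_id, content_type_id, tag_id) VALUES (1, 9, ?)",
    [(tag_id,) for tag_id in range(1, 5)],
)


def tags_for(conn, app_label, model, object_id):
    """Return the tag names attached to one instance of one model."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM taggit_taggeditem AS ti
        JOIN taggit_tag AS t ON t.id = ti.tag_id
        JOIN django_content_type AS ct ON ct.id = ti.content_type_id
        WHERE ct.app_label = ? AND ct.model = ? AND ti.object_id = ?
        ORDER BY t.id
        """,
        (app_label, model, object_id),
    )
    return [name for (name,) in rows]


print(tags_for(conn, "demo", "expvanillatags", 1))
```

Note that object_id alone does not identify a row in our app. Only the pair of content_type_id and object_id does, which is why the query has to involve the content type.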

Customizing the Default Behavior

So far we have seen what we get out-of-the-box when following django-taggit’s Getting Started guide.

Why Customize?

As the Customizing Taggit section shows, we can switch the underlying models to achieve different behavior. Why we might want to do this is stated in the docs:

By default django-taggit uses a “through model” with a GenericForeignKey on it, that has another ForeignKey to an included Tag model. However, there are some cases where this isn’t desirable, for example

1. if you want the speed and referential guarantees of a real ForeignKey,
2. if you have a model with a non-integer primary key, or
3. if you want to store additional data about a tag, such as whether it is official.

In these cases django-taggit makes it easy to substitute your own through model, or Tag model.

A variation of the third argument is using a custom tag-model to store tags in dedicated models rather than a site-wide tag-repository. Imagine we are tagging semantically different entities like blog categories, image keywords, and employee skills. Chances are that we have forms somewhere that let users enter these tags, providing suggestions while they are typing. The auto-completion should probably only suggest relevant tags, which are simpler to provide when querying a dedicated table rather than a site-wide tag repository.
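
The auto-completion argument can be sketched with two tag tables in an in-memory SQLite database. This is only an illustration under assumptions: demo_skilltag is a hypothetical dedicated table for employee skills, the tag names are invented, and a real suggestion endpoint would go through the Django ORM rather than raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Site-wide repository: every kind of tag ends up in one table.
CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
-- Dedicated repository (hypothetical table name) that only holds skills.
CREATE TABLE demo_skilltag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
""")
conn.executemany(
    "INSERT INTO taggit_tag (name) VALUES (?)",
    [("python",), ("portrait",), ("postgres",), ("politics",)],
)
conn.executemany(
    "INSERT INTO demo_skilltag (name) VALUES (?)",
    [("python",), ("postgres",)],
)

# Table names cannot be bound as SQL parameters, so we whitelist them.
KNOWN_TABLES = {"taggit_tag", "demo_skilltag"}


def suggest(conn, table, prefix):
    """Return tag names in `table` that start with `prefix`, sorted by name."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"unknown tag table: {table}")
    rows = conn.execute(
        f"SELECT name FROM {table} WHERE name LIKE ? ORDER BY name",
        (prefix + "%",),
    )
    return [name for (name,) in rows]


print(suggest(conn, "taggit_tag", "po"))     # mixes categories, keywords, skills
print(suggest(conn, "demo_skilltag", "po"))  # skills only
```

With the site-wide table we would additionally have to join through the lookup table and filter by content type to narrow the suggestions down to skills. The dedicated table needs no such filter.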

1. Speed And Referential Guarantees of a Real Foreign Key

Using generic foreign keys comes at the cost of an additional join, because resolving the related entity requires querying the content type table. As mentioned in the docs, we can define our own through model and point its foreign key directly at the tagged model.

If we migrate the app with these changes and inspect the database, we will see two new tables:

  1. <app_name>_food
  2. <app_name>_taggedfood

The first is the table that Django generates for the Food model. The second is the table for a model based on django-taggit’s TaggedItemBase:

Comparing this visualization with the one above shows the two changes already mentioned.

  1. Specifying a custom through model means that we do not use taggit_taggeditem but our custom <app_name>_taggedfood
  2. content_object_id in the lookup table references the related entity directly, without using Django’s generic relations.
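
The referential guarantee can be tried out with a simplified SQLite version of the two tables. Again, this is a sketch under assumptions rather than the schema Django emits: demo is a hypothetical app name, the name column on the food table is invented, and SQLite only enforces foreign keys after the PRAGMA shown below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE, slug TEXT UNIQUE);
CREATE TABLE demo_food (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE demo_taggedfood (
    id INTEGER PRIMARY KEY,
    tag_id INTEGER NOT NULL REFERENCES taggit_tag (id),
    content_object_id INTEGER NOT NULL REFERENCES demo_food (id)  -- real FK
);
""")
conn.execute("INSERT INTO taggit_tag VALUES (1, 'red', 'red')")
conn.execute("INSERT INTO demo_food VALUES (1, 'apple')")


def tag_food(conn, food_id, tag_id):
    """Link a tag to a food row; return False if the database rejects the link."""
    try:
        conn.execute(
            "INSERT INTO demo_taggedfood (tag_id, content_object_id) VALUES (?, ?)",
            (tag_id, food_id),
        )
    except sqlite3.IntegrityError:
        return False
    return True


def tags_of_food(conn, food_id):
    """One join is enough: no detour through the content type table."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM demo_taggedfood AS tf
        JOIN taggit_tag AS t ON t.id = tf.tag_id
        WHERE tf.content_object_id = ?
        ORDER BY t.id
        """,
        (food_id,),
    )
    return [name for (name,) in rows]


accepted = tag_food(conn, food_id=1, tag_id=1)    # food 1 exists
rejected = tag_food(conn, food_id=999, tag_id=1)  # no such food row
print(accepted, rejected, tags_of_food(conn, 1))
```

With the generic lookup table, nothing at the database level would stop the second insert, because object_id there is a plain integer column without a constraint.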

2. Supporting Non-Integer Primary Keys

We will not go into details here, as the use case and the resulting database structure are pretty clear. Looking at the data types in taggit_taggeditem reveals that object_id is assumed to be an integer. If that is not the case for us, extending from CommonGenericTaggedItemBase allows us to specify the data type of our primary key. After migration, the table will look the same as shown in Getting Started, except that object_id has a different data type.

3. Storing Additional Data in Tags

Let’s follow the docs again and add these models to the app:

This time four new tables are added to our database:

  1. <app_name>_car
  2. <app_name>_plane
  3. <app_name>_mycustomtag
  4. <app_name>_taggedwhatever

Tables 1 and 2 belong to the usual Django models, while 3 and 4 are generated from the django-taggit classes that we extend, leading to the following schema:

We use our custom lookup table <app_name>_taggedwhatever while storing the tags on Car and Plane instances in the custom tag model <app_name>_mycustomtag.
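
A simplified SQLite sketch of this schema shows how the pieces fit together. The app name demo, the official column (borrowed from the docs’ example of extra tag data), the content type ids, and all rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE django_content_type (id INTEGER PRIMARY KEY, app_label TEXT, model TEXT);
CREATE TABLE demo_car (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE demo_plane (id INTEGER PRIMARY KEY, name TEXT);
-- Custom tag table; "official" is a hypothetical extra column.
CREATE TABLE demo_mycustomtag (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE,
    slug TEXT UNIQUE,
    official INTEGER NOT NULL DEFAULT 0
);
-- Custom lookup table: still generic, but pointing at the custom tag table.
CREATE TABLE demo_taggedwhatever (
    id INTEGER PRIMARY KEY,
    object_id INTEGER NOT NULL,
    content_type_id INTEGER NOT NULL REFERENCES django_content_type (id),
    tag_id INTEGER NOT NULL REFERENCES demo_mycustomtag (id)
);
INSERT INTO django_content_type VALUES (10, 'demo', 'car'), (11, 'demo', 'plane');
INSERT INTO demo_car VALUES (1, 'roadster');
INSERT INTO demo_plane VALUES (1, 'glider');
INSERT INTO demo_mycustomtag VALUES (1, 'fast', 'fast', 1), (2, 'quiet', 'quiet', 0);
INSERT INTO demo_taggedwhatever (object_id, content_type_id, tag_id)
VALUES (1, 10, 1), (1, 11, 1), (1, 11, 2);
""")


def tags_for(conn, model, object_id):
    """Return (name, official) pairs for one instance of the given model."""
    rows = conn.execute(
        """
        SELECT t.name, t.official
        FROM demo_taggedwhatever AS tw
        JOIN demo_mycustomtag AS t ON t.id = tw.tag_id
        JOIN django_content_type AS ct ON ct.id = tw.content_type_id
        WHERE ct.model = ? AND tw.object_id = ?
        ORDER BY t.id
        """,
        (model, object_id),
    )
    return rows.fetchall()


print(tags_for(conn, "car", 1))
print(tags_for(conn, "plane", 1))
```

The car and the plane both have the primary key 1. The lookup table keeps them apart through content_type_id, and the extra column travels with each tag.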

Of course, we could also drop support for generic foreign keys and, as shown in section 1, extend from TaggedItemBase instead. That way we get a dedicated tag repository for a single model type.

With its customization options, django-taggit is a versatile drop-in solution for tagging entities in Django. Visualizing the database structure helps to build a clearer understanding of when to use which option.




