Managing multiple geographies and time zones in Django — part 2
Also available at https://fractalideas.com, with syntax highlighting.
Last month, I demonstrated how to handle geographies in a Django app, so that:
- users can only interact with objects attached to their geography;
- users see datetimes in the local time zone of their geography.
This is an example of multi-tenancy: each geography is a tenant.
Then I had an uncommon requirement: let a user account interact with several geographies. It’s interesting because it doesn’t fit into the data model at all!
Here’s how I handled this requirement and what I learnt in the process.
There’s no obvious way to meet this requirement with my system where every object is attached to a single geography:
- Superusers can interact with every geography but this isn’t what I need. A user may need access to several small geographies but not to the larger ones.
- I cannot create one account per geography for a given user because user accounts are tied to our global user directory. Besides, switching accounts constantly would be annoying.
Modeling reality
After mulling over the issue for a few days, it became clear that I had no choice but to change the data model to match the new reality.
This means converting the foreign key that attaches users to geographies to a many-to-many relationship.
Like most schema changes that refactor existing data, this is done in three steps:
1. Adding the many-to-many field and generating the schema migration:

# models.py

from django.contrib.auth.models import AbstractUser

DEFAULT_GEOGRAPHY_NAME = "France"


class User(AbstractUser, Geographical):
    geographies = models.ManyToManyField(
        to=Geography,
        related_name="users",
        verbose_name="geographies",
    )
2. Copying the data over with a data migration:

# migrations/xxxx_copy_user_geography.py

from django.db import migrations


def initialize_user_geographies(apps, schema_editor):
    User = apps.get_model("...", "User")
    for user in User.objects.all():
        user.geographies.add(user.geography)


def clear_user_geographies(apps, schema_editor):
    User = apps.get_model("...", "User")
    # User.geographies is a descriptor on the class and has no clear() method;
    # empty the relation by deleting every row of the join table instead.
    User.geographies.through.objects.all().delete()


class Migration(migrations.Migration):

    dependencies = [
        ("...", "..."),
    ]

    operations = [
        migrations.RunPython(
            initialize_user_geographies,
            clear_user_geographies,
        )
    ]
3. Removing the foreign key and generating the schema migration:

# models.py

from django.contrib.auth.models import AbstractUser


class User(AbstractUser):  # <-- removed Geographical mixin here
    geographies = models.ManyToManyField(
        to=Geography,
        related_name="users",
        verbose_name="geographies",
    )

    def save(self, *args, **kwargs):
        creating = self.id is None
        super().save(*args, **kwargs)
        if creating:
            self.geographies.add(Geography.objects.get(
                name=DEFAULT_GEOGRAPHY_NAME,
            ))
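To make the first two steps concrete, here is a minimal sketch of what they amount to at the database level, using an in-memory SQLite database. The table and column names (app_user, app_user_geographies, and so on) and the sample rows are assumptions for illustration, not the names Django generates for this project. The third step, dropping the old column, is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Starting point: every user points at exactly one geography.
conn.executescript("""
    CREATE TABLE app_geography (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE app_user (
        id INTEGER PRIMARY KEY,
        username TEXT UNIQUE,
        geography_id INTEGER REFERENCES app_geography (id)
    );
    INSERT INTO app_geography (id, name) VALUES (1, 'France'), (2, 'Belgium');
    INSERT INTO app_user (id, username, geography_id)
        VALUES (1, 'alice', 1), (2, 'bob', 2);
""")

# Step 1: add the join table that backs the many-to-many relationship.
conn.execute("""
    CREATE TABLE app_user_geographies (
        user_id INTEGER NOT NULL REFERENCES app_user (id),
        geography_id INTEGER NOT NULL REFERENCES app_geography (id),
        UNIQUE (user_id, geography_id)
    )
""")

# Step 2: copy each user's single geography into the join table.
conn.execute("""
    INSERT INTO app_user_geographies (user_id, geography_id)
    SELECT id, geography_id FROM app_user
""")

rows = conn.execute(
    "SELECT user_id, geography_id FROM app_user_geographies ORDER BY user_id"
).fetchall()
print(rows)  # [(1, 1), (2, 2)]
```

Keeping the old column around until the copy is verified is what makes the data migration reversible.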
I’m also adapting the User model admin. I'm adding the geographies field unconditionally because only superusers have access to the admin interface for users¹. They manage users across all geographies.
# admin.py

@admin.register(models.User)
class User(UserAdmin):  # <-- removed Geographical mixin here

    def get_fieldsets(self, request, obj=None):
        # Add the geographies field as the last field of the first fieldset.
        fieldsets = super().get_fieldsets(request, obj)
        return [
            (fieldsets[0][0], {
                **fieldsets[0][1],
                'fields': list(fieldsets[0][1]['fields']) + ['geographies'],
            }),
        ] + list(fieldsets[1:])

    list_filter = list(UserAdmin.list_filter) + ['geographies']

    filter_horizontal = ['geographies'] + list(UserAdmin.filter_horizontal)
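The fieldsets manipulation is easier to follow in isolation. Here is a small sketch in plain Python, with no Django involved: with_extra_field is a hypothetical helper, and the sample fieldsets only imitate the shape of admin fieldsets. It builds a new list instead of modifying the tuples and dicts it receives.

```python
def with_extra_field(fieldsets, field_name):
    """Return new fieldsets with field_name appended to the first fieldset."""
    first_title, first_options = fieldsets[0]
    new_first = (
        first_title,
        {**first_options, "fields": list(first_options["fields"]) + [field_name]},
    )
    return [new_first] + list(fieldsets[1:])


# Sample data shaped like admin fieldsets; the field names are illustrative.
original = (
    (None, {"fields": ("username", "password")}),
    ("Personal info", {"fields": ("first_name", "last_name", "email")}),
)
updated = with_extra_field(original, "geographies")

print(updated[0][1]["fields"])   # ['username', 'password', 'geographies']
print(original[0][1]["fields"])  # ('username', 'password')
```

Leaving the input untouched matters because fieldsets are often declared as class attributes, which are shared by every request.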
No free lunch!
At this point, I can’t escape refactoring every access to User.geography.
That’s a big, painful refactoring because every model has a geography attribute and because I don't have a static type checker² to tell me which ones relate to User and need updating.
Whenever I see self.geography or qs.filter(geography=geography), I must determine if the code is manipulating User and, if it is, update it.
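A script can't decide which references involve User, but it can at least list the candidates to review by hand. Here is a minimal sketch using the standard library's ast module. The function name and the sample source are made up for illustration, and this isn't the process I actually followed.

```python
import ast


def find_geography_references(source, name="geography"):
    """Return the sorted line numbers that use `.name` or pass `name=` to a call."""
    lines = set()
    for node in ast.walk(ast.parse(source)):
        # Attribute access such as self.geography or request.user.geography.
        if isinstance(node, ast.Attribute) and node.attr == name:
            lines.add(node.lineno)
        # Keyword argument such as qs.filter(geography=...).
        elif isinstance(node, ast.Call):
            if any(keyword.arg == name for keyword in node.keywords):
                lines.add(node.lineno)
    return sorted(lines)


SAMPLE = '''\
def visible_orders(request, qs):
    geography = request.user.geography
    return qs.filter(geography=geography)

def label(order):
    return order.name
'''

print(find_geography_references(SAMPLE))  # [2, 3]
```

Each reported line still needs a human decision: the script finds every model's geography attribute, not only the one on User.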
Attaching a geography to requests
In several locations, I’m accessing the current geography with request.user.geography. In order to support switching geographies, the current geography must become an attribute of the request instead of the user.
I add a new middleware that sets request.geography:
# middleware.py

def set_geography(get_response):

    def middleware(request):
        if request.user.is_authenticated:
            assert request.user.geographies.count() == 1
            request.geography = request.user.geographies.get()
        else:
            request.geography = None
        return get_response(request)

    return middleware
Unauthenticated users don’t have a geography. Even though the app is only available to authenticated users, Django still needs to process a couple of anonymous requests to perform the login process, regardless of whether it’s single sign-on with ADFS or the standard admin login form.
Temporarily, the middleware takes a shortcut and assumes there’s only one geography. I’ll lift this limitation when I implement the UI for switching geographies.
I update the time zone middleware to rely on request.geography:

# middleware.py

def set_timezone(get_response):

    def middleware(request):
        if request.geography:
            timezone.activate(pytz.timezone(request.geography.timezone))
        else:
            timezone.deactivate()
        return get_response(request)

    return middleware
At this point, everything works again with the new data model, as long as users have a single geography. Testing that no regressions have crept in gives me confidence before adding new functionality.
Geography selection
Now everything’s ready. It’s time to allow users to interact with several geographies.
In order to support changing the current geography, I need to store it in the session instead of the database.
This is a good point for thinking through the new requirement:
- Most users manage a single geography. These users mustn’t be exposed to multiple geographies.
- Some users are allowed to manage multiple geographies. After they log in, they should choose a geography. At any point, they should be able to switch to another geography. The current geography should be clearly visible.
These rules combine with technical constraints to produce a much more complex middleware than I’d like.
Unauthenticated users don’t have a geography, as explained above.
Authenticated users must have a geography:
- If their account is configured to allow them to manage a single geography, then that’s their geography.
- If a geography that they’re allowed to manage is currently active, then that’s their geography.
- Otherwise, they must pick a geography. To keep things simple, I’ll redirect them to a geography selection form.
Implementing this reveals one more constraint: the geography selection form mustn’t require an active geography, or else that will trigger an infinite loop.
Here’s the result, after multiple iterations to wrestle the logic into a readable and robust flow:
# middleware.py

def set_geography(get_response):

    def get_geography(request):
        # Anonymous users don't have a geography.
        if request.user.is_anonymous:
            return

        # If a geography is defined in the session and valid for the
        # user, then it's the current geography.
        # This is the default case for users with multiple geographies.
        try:
            geography_id = request.session['geography_id']
        except KeyError:
            pass
        else:
            try:
                return request.user.geographies.get(id=geography_id)
            except ObjectDoesNotExist:
                del request.session['geography_id']

        # If no geography is defined in the session but there's only one
        # valid geography for the user, then it's the current geography.
        # This is the default case for users with a single geography.
        try:
            return request.user.geographies.get()
        except (ObjectDoesNotExist, MultipleObjectsReturned):
            pass

    def middleware(request):
        geography = get_geography(request)
        redirect_url = reverse('select_geography')
        # Force geography selection after authentication.
        # Avoid redirect loop and let autocomplete requests go through.
        if (
            request.user.is_authenticated
            and geography is None
            and request.path != redirect_url
            and not request.is_ajax()
        ):
            return redirect_to_login(
                next=request.get_full_path(),
                login_url=redirect_url,
            )
        request.geography = geography
        return get_response(request)

    return middleware
Submitting a valid form stores the geography in the session with request.session['geography_id'] = geography.id and redirects.
To a large extent, the middleware grew more complex because I have to ensure consistency between two data stores: the session and the database.
At any point, geographies can be added to or removed from the list of geographies that a user is allowed to manage. Since I can’t update active sessions accordingly³, I have to check on every request.
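The decision logic is easier to check once it's separated from Django. Here is a minimal sketch in plain Python, assuming a dict stands in for request.session and a set of IDs stands in for the user's allowed geographies. The function names and URL paths are hypothetical, and the exception for autocomplete requests is left out.

```python
def resolve_geography(session, allowed_ids):
    """Return the current geography ID, or None if the user must pick one."""
    # A geography stored in the session wins, as long as it's still allowed.
    geography_id = session.get("geography_id")
    if geography_id is not None:
        if geography_id in allowed_ids:
            return geography_id
        # The permission was revoked after the session was written.
        del session["geography_id"]

    # Without a session value, a single allowed geography is unambiguous.
    if len(allowed_ids) == 1:
        return next(iter(allowed_ids))

    # Zero or several allowed geographies: the user has to choose.
    return None


def must_redirect(is_authenticated, geography_id, path, selection_path):
    """Send authenticated users without a geography to the selection form."""
    # Skipping the form's own path is what prevents the redirect loop.
    return is_authenticated and geography_id is None and path != selection_path


stale_session = {"geography_id": 3}
print(resolve_geography(stale_session, {1, 2}))         # None
print(stale_session)                                    # {}
print(resolve_geography({}, {7}))                       # 7
print(resolve_geography({"geography_id": 2}, {1, 2}))   # 2
print(must_redirect(True, None, "/orders/", "/select-geography/"))            # True
print(must_redirect(True, None, "/select-geography/", "/select-geography/"))  # False
```

The first call shows the consistency check at work: the session still names geography 3, the database no longer allows it, so the stale value is dropped and the user has to choose again.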
Simplifying the admin
You might remember from the previous episode that:
- Regular users only see objects from their current geography.
- Superusers see all objects from all geographies.
This design results from two choices:
- Geography is handled identically in all models. The implementation is factored out in a model mixin and a model admin mixin.
- Superusers are able to edit all user accounts, if only to attach the correct geography to each user after they create their account with SSO.
That’s how the logic for allowing superusers to edit user accounts from any geography extended to all other model admins in the system.
Unfortunately, there are unintended consequences:
- Several times, I built features that worked well for superusers but didn’t for regular users. I also introduced regressions for regular users that I didn’t notice with my superuser account. Overall, past the initial implementation, maintaining two sets of behaviors was a constant drag. The project was too small to justify writing enough tests to prevent regressions.
- The admin UI didn’t prevent superusers from attempting invalid actions like creating a relationship between objects from different geographies. In that case, assertions in save() prevent writing invalid data to the database, at the cost of a poor user experience: the admin crashes! I attempted increasingly elaborate hacks, for example to restrict autocomplete results to the geography of the object being edited, but I was clearly digging myself into a hole.
After making the changes described in this blog post, I realized there was an obvious solution!
The User model admin no longer includes the Geographical mixin. Therefore, I can filter by geography in Geographical for superusers and regular users alike. Superusers will still be able to view and edit all user accounts.
Besides, superusers can edit their own account to give themselves access to any geography they need.
This simplifies the Geographical model admin mixin drastically:
# admin.py

class Geographical(admin.ModelAdmin):
    """
    Mixin for objects associated to a geography.
    """

    def get_queryset(self, request):
        """
        Only show objects in the user's geography.
        """
        qs = super().get_queryset(request)
        qs = qs.filter(geography=request.geography)
        return qs

    def save_model(self, request, obj, form, change):
        """
        Create objects in the user's geography.
        """
        if not change:
            assert obj.geography_id is None
            obj.geography = request.geography
        return super().save_model(request, obj, form, change)
(The previous version was much longer.)
This solves my issues fully and makes the application much easier to maintain. If something works with my superuser account, it’s extremely likely to work for regular users. Also, my superuser experience improves vastly, as the UI now prevents me from mixing geographies incorrectly.
Too good to be true
There’s still an annoying bug with this design. When user input fails a uniqueness check, the admin crashes instead of reporting the error to the user.
This turns out to be a side effect of not showing the geography field in the admin and setting its value in save_model(), which runs after form validation.
When a field isn’t included in a ModelForm (here, the geography field), Django skips checking uniqueness constraints involving that field.
All my uniqueness constraints involve the geography field because they are scoped to geographies. As a consequence, Django never validates them.
PostgreSQL notices, though, and rejects writes that would break a unique constraint. This triggers an exception and crashes the admin.
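Here is a toy model of that behavior in plain Python. It's a simplified sketch of the rule described above, not Django's actual code, and the name and code fields are hypothetical.

```python
def checks_run_by_form(unique_checks, form_fields):
    """Keep only the uniqueness checks whose fields are all present on the form."""
    return [
        check for check in unique_checks
        if all(field in form_fields for field in check)
    ]


# Every constraint is scoped to a geography, as in my models.
unique_checks = [("geography", "name"), ("geography", "code")]

# Form without the geography field: no check survives.
print(checks_run_by_form(unique_checks, {"name", "code"}))
# []

# Form with the geography field: every check is validated.
print(checks_run_by_form(unique_checks, {"geography", "name", "code"}))
# [('geography', 'name'), ('geography', 'code')]
```

Skipping makes sense from the form's point of view: it can't validate a combination of values when it doesn't know one of them yet.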
To resolve this, I add the geography to the form as a hidden field and I enforce its value:
# admin.py

class Geographical(admin.ModelAdmin):

    def get_queryset(self, request):
        """
        Only show objects from the user's geography.
        """
        queryset = super().get_queryset(request)
        queryset = queryset.filter(geography=request.geography)
        return queryset

    def get_fieldsets(self, request, obj=None):
        """
        Add the geography field.
        """
        fieldsets = super().get_fieldsets(request, obj)
        assert not any(
            'geography' in fieldset[1]['fields'] for fieldset in fieldsets
        )
        # Insert "geography" as the first field of the first fieldset.
        # Avoid mutating the fieldsets class variable,
        # in case get_fieldsets returns it directly.
        fieldsets = [
            (fieldsets[0][0], {
                **fieldsets[0][1],
                'fields': ['geography'] + list(fieldsets[0][1]['fields']),
            }),
        ] + list(fieldsets[1:])
        return fieldsets

    def get_form(self, request, obj=None, **kwargs):
        """
        Enforce the value of the geography field and hide it.
        """
        Form = super().get_form(request, obj, **kwargs)

        class EnforceGeographyMixin(forms.Form):

            geography = forms.ModelChoiceField(
                initial=request.geography,
                queryset=models.Geography.objects.filter(
                    id=request.geography.id,
                ),
                widget=forms.HiddenInput,
            )

        return type(Form.__name__, (Form, EnforceGeographyMixin), {})
Here’s how get_form() works:
- Generating a ModelForm class dynamically is the only way to get the form I need, depending on the current value of request.geography, with public APIs.
- The HiddenInput widget makes the field invisible in the admin.
- The initial value sets the value of the hidden input to the current geography.
- The queryset contains exactly one element, the current geography. It validates that the value of the hidden input wasn't tampered with.
Finally, type(Form.__name__, (Form, EnforceGeographyMixin), {}) creates a class that:
- has the same name as the class the Form variable points to,
- inherits Form and EnforceGeographyMixin,
- doesn’t have any other attributes.
Look at the documentation of the type() builtin for details.
I discovered this trick in Django’s implementation of ModelAdmin.get_form() and I thought you'd like it :-)
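If you haven't met the three-argument form of type() before, here is a minimal sketch with plain stand-in classes instead of Django forms. The class bodies are invented for the example; real form classes also go through Django's form metaclass, which this sketch ignores.

```python
class Form:
    def clean(self):
        return "cleaned by Form"


class EnforceGeographyMixin:
    geography = "enforced"


# Arguments: class name, tuple of base classes, dict of extra attributes.
Combined = type(Form.__name__, (Form, EnforceGeographyMixin), {})

print(Combined.__name__)     # Form
print(Combined is Form)      # False
print([cls.__name__ for cls in Combined.__mro__])
# ['Form', 'Form', 'EnforceGeographyMixin', 'object']
print(Combined().geography)  # enforced
print(Combined().clean())    # cleaned by Form
```

The new class shares its name with its first base class but is a distinct object, so the class returned by the parent get_form() is left untouched.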
As you can see, even though I have some experience with Django, I went through multiple iterations of the design, until I stopped seeing problems I cared about.
Surely the implementation could be refined further. For example, if a user edits an object, opens a new tab, switches geographies in the new tab, and comes back to the initial object, they won’t be able to save the object because the hidden geography field will fail validation. Time will tell if users hit this bug in practice. If they do, I’ll have to mangle request.POST to insert the current geography instead of relying on a hidden field.
I see three positive outcomes here:
- I had enough flexibility to adapt quickly to a fundamental change in the data model. It could have been much worse.
- Django had my back when I needed to store data in the session. In the middle of this refactoring, I didn’t want to think about how cookies work.
- Refactoring the code revealed opportunities for improvements. I made the project more maintainable and fixed several bugs, even though that wasn’t my initial goal.
Perhaps the end result feels slightly less elegant. There’s some ad-hoc code that could use more comments.
However, it handles more requirements, has fewer crashing bugs, and isn’t too complex, so I’m still happy with it!
This is a good example of what happens when you try to customize the Django admin beyond simple, declarative parameters.
The Django admin has a lot of public APIs. Probably this started by identifying reasonable extension points and documenting them as public APIs. Of course, there are never enough extension points for everyone, so Django users asked for more public APIs. They also started relying on private APIs, which created pressure to stabilize them and make them public.
Now the Django admin has so many public APIs that it’s pretty much frozen in its current form. I suspect, and hopefully I won’t offend anyone by saying this, that public APIs reflect the implementation much more than a consistently designed set of extension points.
To me, customizing the admin is akin to witchcraft ;-) You can achieve great results with very little code but you’re never sure it’s completely right. You can’t anticipate every side effect, especially as you combine multiple features.
Lastly, I’m glad my data was safeguarded by the strong guarantees of a relational database. When I had a bug in the application, I would have corrupted data if it weren’t for PostgreSQL. Go relational databases!
Footnotes
1. If a non-superuser has access to the admin interface for users, they can give themselves the superuser status. For this reason, only superusers should be able to edit users in the admin.
2. I’m aware of mypy and similar type checkers but I don’t think they’re the right trade-off for such a small Django project.
3. Django doesn’t provide an API to access all active sessions of a given user.
Originally published at https://fractalideas.com on October 14, 2019.