Conditional Expressions in Django
Our product allows users to create and run Jupyter Notebooks, among other things. Each user notebook instance is modeled in our Django application as a Server
object. Of course, we must track how our users use these notebooks in order to bill them correctly, see what settings are the most popular, etc. To accomplish this, each time a notebook is started, a corresponding ServerRun
record is created in the database. When a notebook is stopped for any reason, the record is updated accordingly. Here is the model definition for the ServerRun
class:
We’ll focus on the scenario where we use these ServerRun
records for billing purposes, as it provides a more interesting example. First, we must get all open invoices:
Before we delve into the query used to calculate usage for a given invoice, an explanation of what the query must find: We need the sum of all server run time for a given user during their currently active billing period. Note that each user has their own billing period, dependent on when they signed up. Let’s consider the possible states of a ServerRun
object, which directly correspond to the scenarios we have to account for in our query. In general order of complexity:
- Started and stopped during the current billing period
- Started before the current billing period began, but stopped during
- Started during the current billing period, and still running at query time
- Started before the current billing period, and still running
If we only had to concern ourselves with the first scenario, things would be pretty simple. Just calculate the run’s duration by subtracting the start timestamp from the stop timestamp, and multiply that by the server’s memory size (with some unit conversions of course). Then bam, you have the usage generated by that particular customer. Of course, the real world is never that simple. Here’s the query used:
This is best explained by examining the query at a high level, and zooming in on the details as we go.
From this perspective, things are simple. We are annotating a Sum
with the key usage
. However, the value being used in the sum is ultimately the combination of a couple fields. The values in question, run duration and the amount of memory the user has configured their notebook with, are of types DurationField
and IntegerField
respectively. When combining fields of different types like this, you must provide the output_field
argument so Django knows what to return. Hence the output_field=DurationField()
at the end of the Sum
statement. Note that you must provide an instance, not a class.
This step is probably not too surprising. Here we’re telling Django to sum whatever the output of this Case
statement is. The Case
object is simple: Give it an arbitrary number of When
objects (which we’ll discuss in a second), and perhaps a default value. The first When
object whose condition is met is returned, or if none match, the default
value is returned. The Value
object, or DurationValue
in this case, is simply a way to tell Django how to cast arbitrary python values to SQL. The Value
class accepts output_field
as an argument at creation time as well.
As you can see, the first argument to the When
constructor is the boolean condition that will be used for comparison. This condition is allowed to be any (or almost any) valid field lookup, including Q
objects. Of course, the value provided to then
is used if the condition is matched.
There is a lot to unpack from the previous step to this one. First, note that both conditions for the When
statement are very similar. The first handles any ServerRun
that has been stopped, while the second handles those that are still running. Let’s break down each then
statement individually.
The Greatest
object does exactly what you would think: it accepts an arbitrary number of expressions, and returns the greatest of their values. These expressions can be field names or python expressions. If you use the latter, you may have to wrap it in a Value
object. Greatest
is used here so that users are only billed for the usage in the current billing period. In other words, if the run in question began before the current billing period we are only interested in the run time that has occurred since the start of the current billing period. We subtract the start time from the stop time (note that this When
only handles runs where stop__isnull=False
so this is guaranteed to be a valid operation) and multiply it by the run’s server_size_memory
.
Recall that when multiple expressions of varying types are combined in an annotate
or aggregate
statement, we must tell Django what the resulting type will be. This means that we must use output_field
, however F
expressions do not support output_field
. If you think about it for a moment, it makes sense that this would be the case: If F
supported output_field
, which should be respected? The first F
which defines output_field
, or the last? Order of operations shenanigans could muddle what “first” and “last” even mean, and those are just a few considerations. To handle this exact scenario, Django helpfully provides the ExpressionWrapper
. Simply wrap an expression in an ExpressionWrapper
, and you can provide output_field
, allowing Django to correctly cast the value in SQL.
The second When
expression is essentially identical to the first one, except that instead of using the run’s stop
value (since it’s null), timezone.now()
is used.
For some context, here is the query in its entirety once again:
You may be wondering why we multiply a run’s duration (a timedelta
object) with the notebook’s memory size (an int
). After we convert the timedelta
object to hours, and server_size_memory
to gigabytes, this value represents gigabyte hours which is one of the metrics we bill based on. The stoichiometry, while simple, is left as en exercise for the reader. 😉
Hopefully you find this real world example of conditional expressions in Django to be useful. If you have any comments or questions don’t hesitate to reach out!
Interested? Learn more by dropping us a note. Sign up today for a free account. No credit card required!