Recently, while building a feature for Task Badger, I wanted to run a fairly simple SQL query to get a list of rows from a table that required a dynamic time interval. Unfortunately Django doesn’t have built in support for ‘intervals’ and though it does provide mechanisms extend its query capabilities, it can take a while to figure out exactly how to do that.
Using PostgreSQL intervals in Django
For the rest of this article we’ll use a simple Django model for storing tasks. Each task has a configurable staleness timeout value which represents the number of days the task can go without updates before being considered stale.
For example, a task with stale_timeout_days = 2
will be considered stale if it was last updated
more than 2 days ago.
Here is the Django model class:
class Task(models.Model):
title = models.CharField(max_length=255)
updated_at = models.DateTimeField(auto_now=True)
stale_timeout_days = models.PostiveSmallIntegerField(default=1)
Notice that we are using auto_now
on the updated_at
field to allow Django to automatically set
this value whenever the model is saved. This decision has some impact on our testing which we’ll see
later.
The query to find stale tasks requires comparing the value of updated_at
with the current time,
taking into account the staleness timeout value. Here is the raw SQL query which utilizes the
make_interval
function to dynamically create the time interval from the data:
SELECT * FROM tasks WHERE updated_at + make_interval(days => stale_timeout_days) < now();
-- alternately we could express it as
SELECT * FROM tasks WHERE updated_at < now() - make_interval(days => stale_timeout_days);
When it comes to executing a query like this in Django, there are a few options. We could run a raw query using
QuerySet.raw
or we could pass in an additional where clause as a raw SQL string to a query using
the QuerySet.extra
function.
We could also write some code to extend the Django query expression API which would allow us to run the query without having to write any raw SQL.
Let’s have a look at some of these options.
1. A raw query
Using a raw query is the simplest approach since it does not require any additional code. Here you
can see how we would do it by using the QuerySet.extra
method to pass in an additional WHERE clause
to the Django query.
def get_stale_tasks():
stale_clause = "updated_at + make_interval(days => stale_timeout_days) < now()"
return Task.objects.extra(where=[stale_clause])
To test this function we need to update task.updated_at
so that we can set its value to some point in the
past either inside or outside the staleness window. Since we are using the now()
database function we can not
easily manipulate the current time in the database which is the reference point for the query.
Recall that we are using Django’s auto_now
functionality, so we can’t just update the model field and save the model,
we have to update it directly in the DB. Here’s an example test case:
@pytest.mark.django_db
def test_get_stale_tasks():
task = create_task(stale_timeout_days=1)
assert not get_stale_tasks()
# set `updated_at` to a time in the past
# use DB update here to bypass Django's 'auto_now' functionality
Task.objects.filter(id=task.id).update(
updated_at=timezone.now() - relativedelta(hours=25)
)
assert len(get_stale_tasks()) == 1
2. Extend Django’s expression API
Our second approach to executing our query is to extend the Django query expression API. This will allow us to run the query without writing any raw SQL. An added advantage is that it makes if there are multiple places that need to run queries like this we can re-use the code.
Here’s what we want our Python code to look like:
def get_stale_tasks():
filter_expression = functions.Now() - MakeInterval("days", F("stale_timeout_days"))
return Task.objects.filter(updated_at__lt=filter_expression)
We use the F()
function to reference the stale_timeout_days
field of the Task model and construct
the staleness window cutoff as functions.Now() - MakeInterval('days', F('stale_timeout_days'))
.
Using this expression we compare that to updated_at
using the updated_at__lt
keyword argument.
The missing piece is the MakeInterval
class which we will need to create. We want this class
to output the make_interval(days => stale_timeout_days)
part of the query.
(For more info on Django’s expression API checkout the Django docs)
Here’s the class definition for the MakeInterval
function class:
from django.db import models
class MakeInterval(models.Func):
"""Django Func class for the 'make_interval' database function
See https://www.postgresql.org/docs/15/functions-datetime.html
"""
function = "make_interval"
arity = 2
arg_joiner = "=>"
def __init__(self, unit, value):
assert unit in ("years", "months", "days", "hours", "mins", "secs")
expressions = (UnquotedValue(unit), value)
super().__init__(*expressions, output_field=models.DateTimeField())
class UnquotedValue(models.Value):
"""Raw value with no formatting (not even quotes).
This used for database keywords."""
def as_sql(self, compiler, connection):
return self.value, []
Here, we create a custom function class MakeInterval
that extends Django’s Func
class. This function
class takes two arguments: unit
and value
, which correspond to the arguments passed to the
make_interval
PostgreSQL function. The __init__
method sets the output_field
attribute to DateTimeField()
to ensure that the result of the function is cast to the correct type. This is important if you want to
use the function in a SELECT clause.
As it stands this still has the same testability issue as the raw query where you need to manually
set the updated_at
value in the DB.
Ideally I’d like to be able to use freezegun
to manipulate the
Python timestamp. Then I could have a test like this:
with freeze_time("2023-03-11 10:00:00") as frozen_time:
task = create_task(stale_timeout_days=1)
assert not get_stale_tasks()
# shift the time forward by 25 hours. The tasks should now be considered stale.
frozen_time.tick(timedelta(hours=25))
assert len(get_stale_tasks()) == 1
To make this work we either need to stop using functions.Now
, or we need to mock it out somehow.
We could do that that database level or at the Python level.
Let’s take a look at these approaches.
Testing approach 1: Stop using functions.Now
With this approach we are going to rely on Python to pass in the current timestamp to our query so that we end up with a where clause that looks like this:
WHERE updated_at < '2023-03-11 10:23:01'::timestamp - make_interval(days => stale_timeout_days)
Notice how we have replaced the now()
function with a static timestamp. This timestamp is passed
in to the query from Python. To do this we update our query so that it passed in the timestamp in the query
filter instead of using functions.Now
:
Task.objects.filter(
updated_at__lt=timezone.now() - MakeInterval("days", "stale_timeout_days")
)
Testing approach 2: Mock functions.Now
The second approach to testing is to mock functions.Now
. This is going to result in the same SQL query
as above where we are passing the timestamp to the query from Python. We’d only do this in test code and
in production code we would still use the now()
database function.
We’ll need a new Now
class to replace the existing one in tests. This can just be a normal Django Func
class as follows:
class MockNow(django.db.models.Func):
"""Output the current timestamp"""
template = "%(expressions)s"
output_field = DateTimeField()
def __init__(self):
expressions = (datetime.now(tz=ZoneInfo("UTC")),)
super().__init__(expressions)
When we use this in a query it will output the current timestamp (according to Python) into the query as a static value.
To make this easier to use let’s create a context manager which patches the django.db.models.functions
module:
@contextmanager
def mock_db_now():
"""Mock the django.db.models.functions.Now class to use the current
timestamp from Python instead of the database time.
This allows using freezegun with DB operations that use functions.Now."""
with patch.object(django.db.models.functions, "Now", new=MockNow):
yield
And finally our test with freezegun works without having to do anything extra except using the
mock_db_now()
context manager.
@pytest.mark.django_db
def test_get_stale_tasks():
with freeze_time("2023-01-01 10:00:00") as frozen_time, mock_db_now():
create_task(stale_timeout_days=1)
assert not get_stale_tasks()
frozen_time.tick(timedelta(hours=25))
assert len(get_stale_tasks()) == 1
Testing approach 3: Mock now()
at the DB level
The final approach that we could take is to mock the now()
function in the database itself.
I’m not doing to go into a the details of this approach, but this gist has some example code which you could try out: thehesiod/freeze_time.sql
You would likely still need a Python context manager to make it more usable in a test.
Wrap up
As you can see we have a few options when it comes to using database functions that aren’t directly supported by Django. You could use the same approaches here for other functions too.