Configure Postgres statement_timeout from within Django

Apr 02, 2018

*“A close-up of white dials on a music mixer” by Alexey Ruban on Unsplash*

In a bid to prepare ourselves for projected growth, we are at the moment trying to figure out what part of our system will break at what scale, and how. One step towards this was to also define strict timeouts for our database queries, and eliminate/fix bad queries in the process.

The problem

Our requirements were:

Be able to define different timeout values for different types of servers (app servers, analytics etc.)
The different limits should be well represented in the code so that they’re easy to discover, even by people who join our team in the future
It should be easy and quick to modify these limits

We identified multiple sources of our queries. Each of these might need a different query timeout. These sources are:

App servers: queries that run for our frontend facing APIs, like APIs that our Android app or clients use
Celery servers: queries made by our celery tasks that run asynchronously
Cron servers: queries made as part of crons
Alerts: we have a system that runs SQL queries at configured time intervals, and pushes the data (results of the queries) to relevant people (over Slack)
Analytics: queries that run as a part of our ETL (v0.1) system

We planned to incrementally reduce the timeouts because at every step/iteration, there will be queries which will not be able to run properly within the planned timeout. We will have to fix all those queries before we reduce the timeout. The incremental limits we defined for each iterations were:

This is a no-brainer if your requirements are simple. You can simply create roles in the database and set different timeouts for them.

Our backend is built using Django, and to accomplish this we would have to

Write a raw SQL migration to create the roles (if needed), and
Alter them to set the appropriate timeout
Set the database dictionary differently for different server in Django settings with the correct role and passwords

Why not just directly log in to the shell and do this? Because then this change isn’t represented in the code and creates gaps in knowledge over time. But, even though migrations are part of the code, they are just for change management, and rarely does someone go back to migrations to look for “logic” affecting your app’s behaviour.

Since we were planning on having multiple iterations, and there would be a lot of back and forth between the timeout limits while we are experimenting, it would become a hassle to write migrations and apply them every time something had to be changed. This solution was not for us.

The solution

We started thinking of a better way to accomplish what we had in mind.

We knew the “ease of configuration” would only come if we can set the timeouts from within Django somehow. Thinking more in this direction and connecting little tidbits we were aware about Django and Postgres, we realized that:

One can set a timeout using SET inside a Postgres session which is then adhered to until the end of the current session using: SET statement_timeout=<x>;.
Django publishes a connection_created signal every time a new database connection is created. This connection is then put in the Connection Pool from where it can be reused (governed by configuration parameters like CONN_MAX_AGE)

Aha! Can’t we just catch the connection as soon as it is created, and set the timeout to whatever we desire for the session? Yes we can :)

This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters

Show hidden characters

	from django.conf import settings

	# NOTE
	# The timeout values here only restrict the roles from application code. The actual timeout set in the DB could
	# be a different value.
	# Timeout set in DB on 27/03/2018 is 50s.

	DEFAULT_DB_TIMEOUT_IN_MS = 50000

	default_conn = settings.DJANGO_DEFAULT_DB_CONNECTION_NAME
	explorer_conn = settings.EXPLORER_CONNECTION_NAME
	slave_conn = settings.SLAVE_NAME


	DB_IDENTIFIER_AND_CONNECTION_TO_TIMEOUT_MAP = {
	settings.PROD_APP_DB_TIMEOUT_IDENTIFIER: {
	default_conn: 20000,
	explorer_conn: 50000,
	slave_conn: 20000
	},
	settings.PROD_CELERY_DB_TIMEOUT_IDENTIFIER: {
	default_conn: 50000,
	explorer_conn: 50000,
	slave_conn: 50000
	},
	settings.DEFAULT_DB_TIMEOUT_IDENTIFIER: {
	default_conn: 50000,
	explorer_conn: 50000,
	slave_conn: 50000
	},
	}

view raw db_timeouts.py hosted with ❤ by GitHub

Show hidden characters

	from django.db.backends.signals import connection_created

	# ...

	def set_timeout_on_new_conn(sender, connection, **kwargs):
	"""
	Rig django to set statement timeout for each new connection based on the config
	"""
	try:
	timeout_to_set = DB_IDENTIFIER_AND_CONNECTION_TO_TIMEOUT_MAP[settings.DB_TIMEOUT_IDENTIFIER][connection.alias]
	except KeyError as e:
	logger.error("KeyError while setting DB Timeout: {0}".format(e))
	timeout_to_set = DEFAULT_DB_TIMEOUT_IN_MS

	with connection.cursor() as cursor:
	cursor.execute("set statement_timeout={0}".format(timeout_to_set))


	connection_created.connect(set_timeout_on_new_conn)

view raw models.py hosted with ❤ by GitHub

Show hidden characters

	# ...

	PROD_APP_DB_TIMEOUT_IDENTIFIER = 'db_timeout:production:app'
	PROD_CELERY_DB_TIMEOUT_IDENTIFIER = 'db_timeout:production:celery'
	PROD_CRON_DB_TIMEOUT_IDENTIFIER = 'db_timeout:production:cron'
	PROD_BIZ_API_DB_TIMEOUT_IDENTIFIER = 'db_timeout:production:biz'
	PROD_ADMIN_DB_TIMEOUT_IDENTIFIER = 'db_timeout:production:admin'
	PROD_EXPLORER_DB_TIMEOUT_IDENTIFIER = 'db_timeout:production:explorer'
	STAGING_DB_TIMEOUT_IDENTIFIER = 'db_timeout:staging:*'
	DEFAULT_DB_TIMEOUT_IDENTIFIER = 'db_timeout::'

	# ...

	DB_TIMEOUT_IDENTIFIER = PROD_CELERY_DB_TIMEOUT_IDENTFIER

	# ...

view raw settings.py hosted with ❤ by GitHub

We also went ahead and set the timeout separately for each connection.alias. That gives us even more flexibility, we can now set multiple separate timeout values for the connections from the same server as well (for example: set a timeout of 5s for queries made for our Android app facing APIs, except for login API, for which we set the timeout to 1s. And then use <queryset>.using('<alias>') to use the timeout you want).

The benefit of this approach is:

Everything is in the code, you can just read and figure out what is happening
Easy to modify the timeouts
Since the logic is now in the application code, we can do more stuff with this, like setting a certain timeout only for some percentage of the connections.
Further, we can set different timeouts for different queries made from the same server

Caveats

Since we are running an additional query every time a connection is made, it has some implications. Even though Django’s documentation says that the effect is minor, it is worth checking out if it’s okay for your case
Since the timeout is set using Django’s signals, it means that wherever Django does not publish a signal, this will not work. One such case is when you are directly logging in to the Postgres shell (or by doing python manage.py dbshell).

The changes mentioned here gives us more control over our queries and we can selectively restrict our systems in case there is a 🔥 that needs strict actions to be taken to keep the more important parts of the app alive (not the best solution, but sometimes they don’t have to be 😃) .

(Originally posted on my Medium account)

UPDATE: Jun 2020

Haki Benita was kind enough to provide feedback on this article. Here is a link to what he suggested: Haki Benita’s comment on Medium. If you are learnt from or liked this article, you should definitely read Haki’s feedback as well.

Ketan's Newsletter

Discussion about this post