Don't Let Search Take You Down
Twice in the last few months we’ve had to deal with Django production site issues caused by inefficient search queries.
You might guess that inefficient queries would cause high CPU usage and 504 gateway timeouts for users — and yes, that happened. But we also saw a less obvious side effect: disks filling up with database temporary files.
What These Sites Had in Common
Both projects shared several characteristics:
- Code bases more than a dozen years old.
- Regularly updated to run on supported Django releases, but search had never been modernized to use:
- Full text search in the database, or
- A dedicated search engine such as Elasticsearch.
- Search queries are broken into words and matched across multiple fields in Django models (including related models).
- A user interface that provides full pagination of search results.
- Results that are fully ordered to support correct pagination.
- Users can search large sets of data (not “big data” scale, but still significant).
Key Differences Between the Sites
While the symptoms were similar, there were differences:
- One used MariaDB
- The other used PostgreSQL.
- One site’s search was public-facing, while the other was restricted to admin access.
The Problem
Despite these differences, both sites experienced the same issue:
Search queries generated huge result sets that needed full ordering.
That led the databases to write gigabytes of temporary files to disk. Eventually, disks filled to (or near) capacity.
Quick Fixes
Long-term, both sites need improved search efficiency.
In the short term, we added query timeouts to prevent runaway resource usage:
- For MariaDB: set
max_statement_time
- For PostgreSQL: set
statement_timeout
-- MariaDB
SET GLOBAL max_statement_time = 5000; -- in milliseconds
-- PostgreSQL
SET statement_timeout = '5s';
With these in place, individual requests that consumed excessive resources were cut off before they could cause serious harm.
Conclusion
- Inefficient search queries can silently fill disks with temporary files, not just spike CPU or slow responses.
- Query timeouts are a useful safety net, but not a permanent solution.
- The long-term fix is rethinking search:
- Adopt full text search in your database, or
- Integrate a dedicated search engine like Elasticsearch or OpenSearch.
- Don’t let search take your site down.