Optimising database queries - 03/08/2025
Optimising database queries is both an art and a science.
I was asked about database optimisation recently in a job interview, so I wanted to write up a guide for myself and others. Since I’m learning Rails at the moment, I will use some Rails-specific gems, but equivalents are available in other languages as well.
When it comes to query optimisation, it’s really about a step-by-step approach. First, you want to establish a baseline and figure out which queries are actually slow.
1. Establish a Baseline & Identify Hotspots
Enable Slow-Query Logging
- PostgreSQL: set log_min_duration_statement = 200 (milliseconds)
- MySQL: enable the slow query log with long_query_time = 0.2
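A minimal sketch of turning these on at runtime (assuming you have the privileges to change server settings; the 200 ms threshold is just a starting point):

-- PostgreSQL: log any statement slower than 200 ms
ALTER SYSTEM SET log_min_duration_statement = 200;
SELECT pg_reload_conf();

-- MySQL: enable the slow query log with a 0.2 s threshold
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0.2;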
Use Application-Level Instrumentation
- In Rails, turn on detailed SQL logging in development/staging (config.log_level = :debug).
- Integrate an APM (New Relic, Datadog) or a gem like rack-mini-profiler to surface N+1s and long-running queries.
Gather Metrics
- Collect 95th/99th-percentile query times.
- Track queries per minute and error rates.
Goal: Know which queries are the biggest offenders.
2. Analyse the Query Plan
Once you know your hotspots, the next step is analysing the query plan. You run an EXPLAIN ANALYZE, see where the sequential scans or expensive joins are happening, and then look at how you can add indexes: maybe a single-column index, maybe a composite index if you’re filtering on multiple columns.
EXPLAIN / EXPLAIN ANALYZE
- Run EXPLAIN ANALYZE SELECT … in your database console to see actual vs. estimated costs.
- Look at sequential scans vs. index scans, join strategies, sort operations, and row estimates.
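For instance, in Postgres you might run something like this (the posts table and the filter are hypothetical; BUFFERS adds I/O detail on top of the timings):

-- Show the actual plan, row counts, timings, and buffer usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, title
FROM posts
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 10;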
Check for Misestimates
- If the planner’s row estimates are wildly off, consider updating statistics (ANALYZE in Postgres, ANALYZE TABLE in MySQL; a small example follows).
- Large tables may need more frequent analysis or an increased statistics target.
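A minimal sketch, using an events table purely as an illustration:

-- PostgreSQL: optionally raise the sampling target for a skewed column,
-- then refresh the planner statistics for the table
ALTER TABLE events ALTER COLUMN user_id SET STATISTICS 500;
ANALYZE events;

-- MySQL: rebuild key distribution statistics
ANALYZE TABLE events;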
Goal: Understand where time is being spent (scans, sorts, joins).
3. Tactical Indexing
Add Missing Indexes
- Foreign keys, columns used in WHERE, JOIN, ORDER BY, and GROUP BY.
- Example: if you see WHERE user_id = ? many times, CREATE INDEX idx_posts_user_id ON posts(user_id) speeds up lookups.
Composite Indexes
- For queries filtering on multiple columns, e.g.
SELECT * FROM events
WHERE user_id = ? AND created_at > ?
ORDER BY created_at DESC;
→ CREATE INDEX idx_events_user_date ON events(user_id, created_at DESC);
Covering Indexes (Index-Only Scans)
- Include all columns needed by the query in the index so the table itself never needs to be read.
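For example, on Postgres 11+ a covering index might look like this (the posts columns are assumptions to illustrate the idea):

-- INCLUDE stores title in the index leaf pages without using it for ordering,
-- so SELECT title FROM posts WHERE user_id = ? can be answered by an index-only scan
CREATE INDEX idx_posts_user_id_covering ON posts(user_id) INCLUDE (title);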
Partial Indexes (Postgres only)
- Index only the “hot” subset, e.g.
CREATE INDEX idx_active_users ON users(email)
WHERE active = true;
Goal: Minimise full-table scans and let the database do the least work possible.
4. Refactor & Simplify Queries
Avoid SELECT *
- Only fetch the columns you need: SELECT id, title FROM articles vs. SELECT *.
Break Up Complex Joins
- Sometimes splitting a multi-join query into two smaller queries (with temporary tables or caching) is faster.
Denormalise When Appropriate
- If you routinely need a user’s name on every post row, consider copying it into posts to avoid a join, accepting the write-time overhead.
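A rough sketch of what that could look like in Postgres (the author_name column and the backfill are illustrative; you would still need to keep the copy in sync on writes, e.g. via a callback or trigger):

-- Add the denormalised column and backfill it from users
ALTER TABLE posts ADD COLUMN author_name text;

UPDATE posts
SET author_name = users.name
FROM users
WHERE users.id = posts.user_id;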
Use Window Functions
- For “top-N per group” queries, window functions (ROW_NUMBER() OVER (PARTITION BY …)) can be more efficient than correlated subqueries.
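As an illustration, here is a “latest three posts per user” query (the posts table and the limit of three are assumptions):

-- Rank each user's posts by recency, then keep the top three per user
SELECT id, user_id, title, created_at
FROM (
  SELECT id, user_id, title, created_at,
         ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS rn
  FROM posts
) ranked
WHERE rn <= 3;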
Goal: Make each query as lean and focused as possible.
5. Caching Strategies
Caching is another big piece: maybe you cache at the application level, or use materialised views in the database. And then you keep monitoring, set alerts if something slows down again, and keep that feedback loop going (more on that in step 7).
Application-Level Cache
- Use Rails’ cache.fetch around expensive queries.
- Invalidate the cache on writes to that table or via TTL.
Database-Level Cache
- Materialised views (Postgres): refresh on a schedule or via triggers.
- Summary tables: pre-aggregate counts, sums, etc., so you’re not running COUNT(*) on millions of rows.
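A minimal Postgres sketch of the materialised-view approach (the daily_event_counts view and its definition are assumptions for illustration):

-- Pre-aggregate daily event counts so dashboards don't scan the raw events table
CREATE MATERIALIZED VIEW daily_event_counts AS
SELECT date_trunc('day', created_at) AS day, COUNT(*) AS events
FROM events
GROUP BY 1;

-- CONCURRENTLY avoids blocking readers during refresh, but needs a unique index on the view
CREATE UNIQUE INDEX idx_daily_event_counts_day ON daily_event_counts(day);
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_event_counts;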
HTTP-Layer Cache
- For read-heavy REST endpoints, set proper Cache-Control headers or use a CDN layer.
Goal: Serve repeat requests from memory instead of hitting disk or CPU.
6. Scaling & Partitioning
Table Partitioning
- Time-based partitions (e.g., monthly) can limit scans to only relevant partitions.
- Postgres:
CREATE TABLE events_2025_08 PARTITION OF events ...
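For context, here is roughly what the full setup could look like if events were created as a range-partitioned table from the start (the column list and the August 2025 bounds are illustrative):

-- Parent table, partitioned on the timestamp column
CREATE TABLE events (
  id bigint NOT NULL,
  user_id bigint NOT NULL,
  created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- One partition per month; queries filtered on created_at only touch the relevant partitions
CREATE TABLE events_2025_08 PARTITION OF events
  FOR VALUES FROM ('2025-08-01') TO ('2025-09-01');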
Sharding / Read-Replicas
- Offload reads to replicas.
- Be mindful of replication lag when designing for consistency.
Connection Pool Tuning
- Ensure your pool size (pool in database.yml) fits within your DB’s max connections.
- Avoid connection storms: implement backoff or circuit breakers in high-load scenarios.
Goal: Spread load and isolate hot data to minimise contention.
7. Continuous Monitoring & Feedback
Alert on Query Regression
- Track average/peak execution time per query fingerprint.
- Alert if they exceed thresholds.
Automated Explain Reports
- Tools like pgBadger or EXPLAIN.depesz.com help visualise trends in query plans.
Post-Mortems & Knowledge Sharing
- Document when and why an index was added or a query restructured.
- Rotate learnings back into team best practices.
Goal: Prevent performance debt from creeping back in.
Putting It All Together
Scenario: Your user list page (/admin/users) is suddenly taking 10s to load because it joins orders and profiles for each user.
Identify: the slow query log shows
SELECT u.*, p.bio, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN profiles p ON p.user_id = u.id
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, p.bio
ORDER BY u.created_at DESC
LIMIT 50;
EXPLAIN ANALYZE reveals two full scans on orders and sorting 10M rows.
Indexing:
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_users_created_at ON users(created_at DESC);
Refactor:
- Pre-aggregate order_count in a summary table or materialised view refreshed every minute (sketched below).
- Denormalise profiles.bio into users.bio if acceptable.
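One way that pre-aggregation could look (the user_order_counts name is an assumption; the tables come from the query above):

-- Order counts per user, refreshed every minute by a scheduled job
CREATE MATERIALIZED VIEW user_order_counts AS
SELECT user_id, COUNT(*) AS order_count
FROM orders
GROUP BY user_id;

-- Unique index lets the refresh run without blocking readers
CREATE UNIQUE INDEX idx_user_order_counts_user_id ON user_order_counts(user_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY user_order_counts;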
Cache:
- Cache the first page of admin users for 30s, since it changes infrequently.
Result: Page load drops from ~10s to ~300ms.
By following these steps, you can systematically drive down query times and keep your application performing well even as data grows.