# Database Optimization Examples

Real-world database performance bottlenecks and their solutions, with measured query-time improvements.

## Example 1: N+1 Query Problem

### Problem: Loading Users with Posts

```typescript
// ❌ BEFORE: N+1 queries - 3,500ms for 100 users
async function getUsersWithPosts() {
  // 1 query to get users
  const users = await db.user.findMany();

  // N queries (1 per user) to get posts
  for (const user of users) {
    user.posts = await db.post.findMany({
      where: { userId: user.id }
    });
  }

  return users;
}

// Total queries: 1 + 100 = 101 queries
// Time: ~3,500ms (35ms per query × 100)
```

### Solution 1: Eager Loading

```typescript
// ✅ AFTER: Eager loading - 80ms for 100 users (44x faster!)
async function getUsersWithPostsOptimized() {
  // Single query with JOIN
  const users = await db.user.findMany({
    include: { posts: true }
  });

  return users;
}

// Total queries: 1 query
// Time: ~80ms
// Performance gain: 44x faster (3,500ms → 80ms)
```

### Solution 2: DataLoader Pattern

```typescript
// ✅ ALTERNATIVE: Batched loading - 120ms for 100 users
import DataLoader from 'dataloader';

const postLoader = new DataLoader(async (userIds: readonly string[]) => {
  const posts = await db.post.findMany({
    where: { userId: { in: [...userIds] } }
  });

  // Group posts by userId
  const postsByUser = new Map();
  for (const post of posts) {
    if (!postsByUser.has(post.userId)) {
      postsByUser.set(post.userId, []);
    }
    postsByUser.get(post.userId)!.push(post);
  }

  // Return in same order as input
  return userIds.map(id => postsByUser.get(id) || []);
});

async function getUsersWithPostsBatched() {
  const users = await db.user.findMany();

  // Issue every load in the same tick so DataLoader can batch them into a
  // single query. Awaiting each load inside a for loop would dispatch one
  // query per user and reintroduce the N+1 problem.
  await Promise.all(
    users.map(async (user) => {
      user.posts = await postLoader.load(user.id);
    })
  );

  return users;
}

// Total queries: 2 queries (users + batched posts)
// Time: ~120ms
```

### Metrics

| Implementation | Queries | Time | Improvement |
|----------------|---------|------|-------------|
| **N+1 (Original)** | 101 | 3,500ms | baseline |
| **Eager Loading** | 1 | 80ms | **44x faster** |
| **DataLoader** | 2 | 120ms | **29x faster** |

---

## Example 2: Missing Index

### Problem: Slow Query on Large Table

```sql
-- ❌ BEFORE: Full table scan - 2,800ms for 1M rows
SELECT * FROM orders
WHERE customer_id = '123'
  AND status = 'pending'
ORDER BY created_at DESC
LIMIT 10;

-- EXPLAIN ANALYZE output:
-- Seq Scan on orders (cost=0.00..25000.00 rows=10 width=100) (actual time=2800.000)
--   Filter: (customer_id = '123' AND status = 'pending')
--   Rows Removed by Filter: 999,990
```

### Solution: Composite Index

```sql
-- ✅ AFTER: Index scan - 5ms for 1M rows (560x faster!)
CREATE INDEX idx_orders_customer_status_date
ON orders(customer_id, status, created_at DESC);

-- Same query, now uses index:
SELECT * FROM orders
WHERE customer_id = '123'
  AND status = 'pending'
ORDER BY created_at DESC
LIMIT 10;

-- EXPLAIN ANALYZE output:
-- Index Scan using idx_orders_customer_status_date (cost=0.42..8.44 rows=10)
--   (actual time=5.000)
--   Index Cond: (customer_id = '123' AND status = 'pending')
```

### Metrics

| Implementation | Scan Type | Time | Rows Scanned |
|----------------|-----------|------|--------------|
| **No Index** | Sequential | 2,800ms | 1,000,000 |
| **With Index** | Index | 5ms | 10 |
| **Improvement** | - | **560x** | **99.999% fewer** |

### Index Strategy

```sql
-- Good: Covers WHERE + ORDER BY
CREATE INDEX idx_orders_customer_status_date
ON orders(customer_id, status, created_at DESC);

-- Bad: Wrong column order (status first is less selective)
CREATE INDEX idx_orders_status_customer
ON orders(status, customer_id);

-- Good: Partial index for common queries
CREATE INDEX idx_orders_pending
ON orders(customer_id, created_at DESC)
WHERE status = 'pending';
```

---

## Example 3: SELECT * vs Specific Columns

### Problem: Fetching Unnecessary Data

```typescript
// ❌ BEFORE: Fetching all columns - 450ms for 10K rows
const products = await db.product.findMany({
  where: { category: 'electronics' }
  // Fetches all 30 columns including large JSONB fields
});

// Network transfer: 25 MB
// Time: 450ms (query) + 200ms (network) = 650ms total
```

### Solution: Select Only Needed Columns

```typescript
// ✅ AFTER: Fetch only required columns - 120ms for 10K rows
const products = await db.product.findMany({
  where: { category: 'electronics' },
  select: {
    id: true,
    name: true,
    price: true,
    inStock: true
  }
});

// Network transfer: 2 MB (92% reduction)
// Time: 120ms (query) + 25ms (network) = 145ms total
// Performance gain: 4.5x faster (650ms → 145ms)
```

### Metrics

| Implementation | Columns | Data Size | Total Time |
|----------------|---------|-----------|------------|
| **SELECT \*** | 30 | 25 MB | 650ms |
| **Specific Columns** | 4 | 2 MB | 145ms |
| **Improvement** | **87% fewer** | **92% less** | **4.5x** |

---

## Example 4: Connection Pooling

### Problem: Creating New Connection Per Request

```typescript
// ❌ BEFORE: New connection each request - 150ms overhead
import { Client } from 'pg';

async function handleRequest() {
  // Opens new connection (150ms)
  const client = new Client({
    host: 'db.example.com',
    database: 'myapp'
  });
  await client.connect();

  const result = await client.query('SELECT ...');
  await client.end(); // Closes connection

  return result;
}

// Per request: 150ms (connect) + 20ms (query) = 170ms
```

### Solution: Connection Pool

```typescript
// ✅ AFTER: Reuse pooled connections - 20ms per query
import { Pool } from 'pg';

const pool = new Pool({
  host: 'db.example.com',
  database: 'myapp',
  max: 20, // Max 20 connections
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

async function handleRequestOptimized() {
  // Reuses existing connection (~0ms overhead)
  const client = await pool.connect();

  try {
    const result = await client.query('SELECT ...');
    return result;
  } finally {
    client.release(); // Return to pool
  }
}

// Per request: 0ms (pool) + 20ms (query) = 20ms
// Performance gain: 8.5x faster (170ms → 20ms)
```

### Metrics

| Implementation | Connection Time | Query Time | Total |
|----------------|-----------------|------------|-------|
| **New Connection** | 150ms | 20ms | 170ms |
| **Pooled** | ~0ms | 20ms | 20ms |
| **Improvement** | **~150ms saved** | - | **8.5x** |

---

## Example 5: Query Result Caching

### Problem: Repeated Expensive Queries

```typescript
// ❌ BEFORE: Query database every time - 80ms per call
async function getPopularProducts() {
  return await db.product.findMany({
    where: { soldCount: { gte: 1000 } },
    orderBy: { soldCount: 'desc' },
    take: 20
  });
}

// Called 100 times/min = 8,000ms of database time per minute
```

### Solution: Redis Caching

```typescript
// ✅ AFTER: Cache results - 2ms per cache hit
import { Redis } from 'ioredis';

const redis = new Redis();

async function getPopularProductsCached() {
  const cacheKey = 'popular_products';

  // Check cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached); // 2ms cache hit
  }

  // Cache miss: query database
  const products = await db.product.findMany({
    where: { soldCount: { gte: 1000 } },
    orderBy: { soldCount: 'desc' },
    take: 20
  });

  // Cache for 5 minutes
  await redis.setex(cacheKey, 300, JSON.stringify(products));

  return products;
}

// First call: 80ms (database)
// Subsequent calls: 2ms (cache) × 99 = 198ms
// Total: 278ms vs 8,000ms
// Performance gain: 29x faster
```

### Metrics (100 calls)

| Implementation | Cache Hits | DB Queries | Total Time |
|----------------|------------|------------|------------|
| **No Cache** | 0 | 100 | 8,000ms |
| **With Cache** | 99 | 1 | 278ms |
| **Improvement** | - | **99% fewer** | **29x** |

---

## Example 6: Batch Operations

### Problem: Individual Inserts

```typescript
// ❌ BEFORE: Individual inserts - 5,000ms for 1000 records
async function importUsers(users: User[]) {
  for (const user of users) {
    await db.user.create({ data: user }); // 1000 queries
  }
}

// Time: 5ms per insert × 1000 = 5,000ms
```

### Solution: Batch Insert

```typescript
// ✅ AFTER: Single batch insert - 250ms for 1000 records
async function importUsersOptimized(users: User[]) {
  await db.user.createMany({
    data: users,
    skipDuplicates: true
  });
}

// Time: 250ms (single query with 1000 rows)
// Performance gain: 20x faster (5,000ms → 250ms)
```

### Metrics

| Implementation | Queries | Time | Network Roundtrips |
|----------------|---------|------|-------------------|
| **Individual** | 1,000 | 5,000ms | 1,000 |
| **Batch** | 1 | 250ms | 1 |
| **Improvement** | **1,000x fewer** | **20x** | **1,000x fewer** |

---

## Summary

| Optimization | Before | After | Speedup | When to Use |
|--------------|--------|-------|---------|-------------|
| **Eager Loading** | 101 queries | 1 query | 44x | N+1 problems |
| **Add Index** | 2,800ms | 5ms | 560x | Slow WHERE/ORDER BY |
| **Select Specific** | 25 MB | 2 MB | 4.5x | Large result sets |
| **Connection Pool** | 170ms/req | 20ms/req | 8.5x | High request volume |
| **Query Cache** | 100 queries | 1 query | 29x | Repeated queries |
| **Batch Operations** | 1000 queries | 1 query | 20x | Bulk inserts/updates |

## Best Practices

1. **Use EXPLAIN ANALYZE**: Always check query execution plans
2. **Index Wisely**: Cover WHERE, JOIN, ORDER BY columns
3. **Eager Load**: Avoid N+1 queries with includes/joins
4. **Connection Pools**: Never create connections per request
5. **Cache Strategically**: Cache expensive, frequently accessed queries
6. **Batch Operations**: Bulk insert/update when possible
7. **Monitor Slow Queries**: Log queries slower than 100ms in production

---

**Previous**: [Algorithm Optimization](algorithm-optimization.md) | **Next**: [Caching Optimization](caching-optimization.md) | **Index**: [Examples Index](INDEX.md)
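
### Appendix: Slow Query Logging Sketch

Best practice 7 above (log queries slower than 100ms) can be approximated with a small timing wrapper around any query call. This is a minimal sketch, not a feature of any particular ORM or driver: `timedQuery`, `SLOW_QUERY_THRESHOLD_MS`, and the `report` callback are hypothetical names introduced here for illustration, and the default reporter simply writes to `console.warn`.

```typescript
// Hypothetical helper (not part of any library): times an async query
// function and reports it when it exceeds a threshold.
const SLOW_QUERY_THRESHOLD_MS = 100; // threshold taken from best practice 7

type SlowQueryReporter = (label: string, elapsedMs: number) => void;

async function timedQuery<T>(
  label: string,
  run: () => Promise<T>,
  report: SlowQueryReporter = (l, ms) =>
    console.warn(`[slow query] ${l} took ${ms}ms`),
  thresholdMs: number = SLOW_QUERY_THRESHOLD_MS
): Promise<T> {
  const start = Date.now();
  try {
    // Return (or throw) exactly what the wrapped query returns (or throws)
    return await run();
  } finally {
    // Runs on success and on failure, so failing slow queries are reported too
    const elapsedMs = Date.now() - start;
    if (elapsedMs > thresholdMs) {
      report(label, elapsedMs);
    }
  }
}
```

A call site might look like `await timedQuery('getPopularProducts', () => getPopularProducts())`. Passing the reporter as a parameter keeps the sketch independent of any logging library; in production it would likely forward to whatever structured logger or metrics client the application already uses.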