300 lines
7.1 KiB
Markdown
300 lines
7.1 KiB
Markdown
---
|
|
name: sql-optimizer
|
|
description: |
|
|
WHEN: SQL query review, query optimization, index usage, N+1 detection, performance analysis
|
|
WHAT: Query plan analysis + Index recommendations + N+1 detection + Join optimization + Performance tuning
|
|
WHEN NOT: Schema design → schema-reviewer, ORM code → orm-reviewer
|
|
---
|
|
|
|
# SQL Optimizer Skill
|
|
|
|
## Purpose
|
|
Analyzes and optimizes SQL queries for performance, index usage, and best practices.
|
|
|
|
## When to Use
|
|
- SQL query optimization
|
|
- Query performance review
|
|
- Index usage analysis
|
|
- N+1 query detection
|
|
- Slow query troubleshooting
|
|
|
|
## Project Detection
|
|
- `.sql` files
|
|
- Query strings in code
|
|
- Database migration files
|
|
- ORM query logs
|
|
|
|
## Workflow
|
|
|
|
### Step 1: Analyze Query
|
|
```
|
|
**Database**: PostgreSQL/MySQL/SQLite
|
|
**Tables**: users, orders, products
|
|
**Query Type**: SELECT with JOINs
|
|
**Estimated Rows**: 100K+
|
|
```
|
|
|
|
### Step 2: Select Review Areas
|
|
**AskUserQuestion:**
|
|
```
|
|
"Which areas to review?"
|
|
Options:
|
|
- Full query optimization (recommended)
|
|
- Index usage analysis
|
|
- Join optimization
|
|
- Subquery refactoring
|
|
- N+1 detection
|
|
multiSelect: true
|
|
```
|
|
|
|
## Detection Rules
|
|
|
|
### Index Usage
|
|
| Check | Recommendation | Severity |
|
|
|-------|----------------|----------|
|
|
| Full table scan | Add appropriate index | CRITICAL |
|
|
| Index not used | Check column order | HIGH |
|
|
| Too many indexes | Consolidate indexes | MEDIUM |
|
|
| Missing composite index | Add multi-column index | HIGH |
|
|
|
|
```sql
|
|
-- BAD: No index on filter columns
|
|
SELECT * FROM orders
|
|
WHERE created_at > '2024-01-01'
|
|
AND status = 'pending';
|
|
-- Full table scan!
|
|
|
|
-- GOOD: Add composite index
|
|
CREATE INDEX idx_orders_status_created
|
|
ON orders(status, created_at);
|
|
|
|
-- Index order matters!
|
|
-- For WHERE status = ? AND created_at > ?
|
|
-- Index(status, created_at) ✓
|
|
-- Index(created_at, status) ✗ (less effective)
|
|
```
|
|
|
|
### SELECT Optimization
|
|
| Check | Recommendation | Severity |
|
|
|-------|----------------|----------|
|
|
| SELECT * | Select specific columns | HIGH |
|
|
| Unnecessary columns | Remove unused columns | MEDIUM |
|
|
| No LIMIT | Add LIMIT for large results | HIGH |
|
|
|
|
```sql
|
|
-- BAD: SELECT * with large result
|
|
SELECT * FROM orders
|
|
WHERE user_id = 123;
|
|
-- Returns all columns, no limit
|
|
|
|
-- GOOD: Specific columns, limited results
|
|
SELECT id, status, total, created_at
|
|
FROM orders
|
|
WHERE user_id = 123
|
|
ORDER BY created_at DESC
|
|
LIMIT 20;
|
|
```
|
|
|
|
### JOIN Optimization
|
|
| Check | Recommendation | Severity |
|
|
|-------|----------------|----------|
|
|
| Cartesian product | Add join condition | CRITICAL |
|
|
| Join on non-indexed column | Add index | HIGH |
|
|
| Too many joins | Consider denormalization | MEDIUM |
|
|
| Implicit join | Use explicit JOIN syntax | LOW |
|
|
|
|
```sql
|
|
-- BAD: Implicit join (harder to read, error-prone)
|
|
SELECT o.*, u.name
|
|
FROM orders o, users u
|
|
WHERE o.user_id = u.id;
|
|
|
|
-- GOOD: Explicit JOIN
|
|
SELECT o.id, o.total, u.name
|
|
FROM orders o
|
|
INNER JOIN users u ON o.user_id = u.id;
|
|
|
|
-- BAD: Join on non-indexed column
|
|
SELECT o.*, p.name
|
|
FROM orders o
|
|
JOIN products p ON o.product_code = p.code;
|
|
-- If products.code has no index → slow!
|
|
|
|
-- FIX: Add index
|
|
CREATE INDEX idx_products_code ON products(code);
|
|
```
|
|
|
|
### Subquery Optimization
|
|
| Check | Recommendation | Severity |
|
|
|-------|----------------|----------|
|
|
| Correlated subquery | Convert to JOIN | HIGH |
|
|
| IN with subquery | Use EXISTS or JOIN | MEDIUM |
|
|
| Subquery in SELECT | Move to JOIN | HIGH |
|
|
|
|
```sql
|
|
-- BAD: Correlated subquery (runs for each row)
|
|
SELECT *
|
|
FROM orders o
|
|
WHERE total > (
|
|
SELECT AVG(total)
|
|
FROM orders
|
|
WHERE user_id = o.user_id
|
|
);
|
|
|
|
-- GOOD: Use window function
|
|
SELECT *
|
|
FROM (
|
|
SELECT *,
|
|
AVG(total) OVER (PARTITION BY user_id) as avg_total
|
|
FROM orders
|
|
) sub
|
|
WHERE total > avg_total;
|
|
|
|
-- BAD: IN with large subquery
|
|
SELECT * FROM users
|
|
WHERE id IN (SELECT user_id FROM orders WHERE status = 'vip');
|
|
|
|
-- GOOD: Use EXISTS or JOIN
|
|
SELECT u.* FROM users u
|
|
WHERE EXISTS (
|
|
SELECT 1 FROM orders o
|
|
WHERE o.user_id = u.id AND o.status = 'vip'
|
|
);
|
|
|
|
-- Or with JOIN
|
|
SELECT DISTINCT u.*
|
|
FROM users u
|
|
INNER JOIN orders o ON o.user_id = u.id
|
|
WHERE o.status = 'vip';
|
|
```
|
|
|
|
### N+1 Query Detection
|
|
| Check | Recommendation | Severity |
|
|
|-------|----------------|----------|
|
|
| Loop with query | Batch fetch | CRITICAL |
|
|
| Lazy load in loop | Eager load | CRITICAL |
|
|
|
|
```sql
|
|
-- N+1 Pattern (application code)
|
|
-- Query 1: Get all users
|
|
SELECT * FROM users;
|
|
|
|
-- Then for each user (N queries):
|
|
SELECT * FROM orders WHERE user_id = 1;
|
|
SELECT * FROM orders WHERE user_id = 2;
|
|
SELECT * FROM orders WHERE user_id = 3;
|
|
-- ... N more queries
|
|
|
|
-- SOLUTION 1: JOIN
|
|
SELECT u.*, o.*
|
|
FROM users u
|
|
LEFT JOIN orders o ON o.user_id = u.id;
|
|
|
|
-- SOLUTION 2: IN query (for separate queries)
|
|
SELECT * FROM orders WHERE user_id IN (1, 2, 3, ...);
|
|
```
|
|
|
|
### Aggregation Optimization
|
|
| Check | Recommendation | Severity |
|
|
|-------|----------------|----------|
|
|
| COUNT(*) on large table | Use approximate count | MEDIUM |
|
|
| GROUP BY without index | Add index | HIGH |
|
|
| HAVING vs WHERE | Filter early with WHERE | MEDIUM |
|
|
|
|
```sql
|
|
-- BAD: COUNT on entire table
|
|
SELECT COUNT(*) FROM orders;
|
|
-- Scans entire table
|
|
|
|
-- GOOD: Approximate count (PostgreSQL)
|
|
SELECT reltuples::bigint AS estimate
|
|
FROM pg_class
|
|
WHERE relname = 'orders';
|
|
|
|
-- BAD: WHERE in HAVING
|
|
SELECT user_id, COUNT(*)
|
|
FROM orders
|
|
GROUP BY user_id
|
|
HAVING status = 'completed'; -- Wrong place!
|
|
|
|
-- GOOD: Filter before grouping
|
|
SELECT user_id, COUNT(*)
|
|
FROM orders
|
|
WHERE status = 'completed' -- Filter first
|
|
GROUP BY user_id;
|
|
|
|
-- Index for GROUP BY
|
|
CREATE INDEX idx_orders_user_status
|
|
ON orders(user_id, status);
|
|
```
|
|
|
|
### EXPLAIN Analysis
|
|
```sql
|
|
-- PostgreSQL EXPLAIN
|
|
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
|
|
SELECT u.name, COUNT(o.id)
|
|
FROM users u
|
|
LEFT JOIN orders o ON o.user_id = u.id
|
|
GROUP BY u.id;
|
|
|
|
-- Look for:
|
|
-- ✗ Seq Scan (full table scan)
|
|
-- ✗ Nested Loop with high rows
|
|
-- ✗ Hash Join with large hash
|
|
-- ✓ Index Scan
|
|
-- ✓ Index Only Scan
|
|
-- ✓ Bitmap Index Scan
|
|
```
|
|
|
|
## Response Template
|
|
```
|
|
## SQL Query Optimization Results
|
|
|
|
**Database**: PostgreSQL 15
|
|
**Query Type**: SELECT with JOIN
|
|
**Estimated Impact**: ~10x improvement
|
|
|
|
### Index Usage
|
|
| Status | Issue | Recommendation |
|
|
|--------|-------|----------------|
|
|
| CRITICAL | Full table scan on orders | Add index on (status, created_at) |
|
|
|
|
### Join Analysis
|
|
| Status | Issue | Recommendation |
|
|
|--------|-------|----------------|
|
|
| HIGH | Non-indexed join column | Add index on products.code |
|
|
|
|
### Query Structure
|
|
| Status | Issue | Recommendation |
|
|
|--------|-------|----------------|
|
|
| HIGH | SELECT * with no LIMIT | Select specific columns, add LIMIT |
|
|
|
|
### Recommended Indexes
|
|
```sql
|
|
CREATE INDEX idx_orders_status_created ON orders(status, created_at);
|
|
CREATE INDEX idx_products_code ON products(code);
|
|
```
|
|
|
|
### Optimized Query
|
|
```sql
|
|
SELECT o.id, o.total, p.name
|
|
FROM orders o
|
|
INNER JOIN products p ON o.product_id = p.id
|
|
WHERE o.status = 'pending'
|
|
ORDER BY o.created_at DESC
|
|
LIMIT 100;
|
|
```
|
|
```
|
|
|
|
## Best Practices
|
|
1. **Indexes**: Add for WHERE, JOIN, ORDER BY columns
|
|
2. **SELECT**: Only needed columns, with LIMIT
|
|
3. **JOINs**: Explicit syntax, indexed columns
|
|
4. **Subqueries**: Prefer JOINs or CTEs
|
|
5. **EXPLAIN**: Always analyze query plans
|
|
|
|
## Integration
|
|
- `schema-reviewer`: Database design
|
|
- `orm-reviewer`: ORM query patterns
|
|
- `perf-analyzer`: Application performance
|