815 lines
21 KiB
Markdown
815 lines
21 KiB
Markdown
# Backend Architecture
|
||
|
||
**Comprehensive patterns for APIs, databases, and server-side systems**
|
||
|
||
Consolidated from:
|
||
- backend-architect skills
|
||
- api-developer skills
|
||
- database-architect skills
|
||
|
||
---
|
||
|
||
|
||
|
||
|
||
|
||
# Schema Design Skill
|
||
|
||
Expert patterns for relational database schema design, normalization, and constraint management.
|
||
|
||
## Core Principles
|
||
|
||
### 1. Normalization Levels
|
||
|
||
**1NF (First Normal Form)**:
|
||
- Atomic values only (no arrays, no comma-separated lists)
|
||
- Each column contains single value
|
||
- No repeating groups
|
||
|
||
**2NF (Second Normal Form)**:
|
||
- Must be in 1NF
|
||
- No partial dependencies on composite primary keys
|
||
- Every non-key column depends on the entire primary key
|
||
|
||
**3NF (Third Normal Form)**:
|
||
- Must be in 2NF
|
||
- No transitive dependencies
|
||
- Every non-key column depends only on the primary key
|
||
|
||
**BCNF (Boyce-Codd Normal Form)**:
|
||
- Must be in 3NF
|
||
- Every determinant is a candidate key
|
||
|
||
**Strategic Denormalization**:
|
||
- Only denormalize with performance data justification
|
||
- Document the trade-off
|
||
- Consider materialized views instead
|
||
- Plan for data consistency maintenance
|
||
|
||
### 2. Primary Key Selection
|
||
|
||
**UUID (Recommended for distributed systems)**:
|
||
```sql
|
||
id UUID PRIMARY KEY DEFAULT uuid_generate_v4()
|
||
```
|
||
- Pros: Globally unique, no coordination needed, harder to enumerate
|
||
- Cons: Larger storage (16 bytes), random order (index fragmentation)
|
||
|
||
**Auto-increment Integer**:
|
||
```sql
|
||
id SERIAL PRIMARY KEY -- PostgreSQL
|
||
id INT AUTO_INCREMENT PRIMARY KEY -- MySQL
|
||
id INTEGER PRIMARY KEY AUTOINCREMENT -- SQLite
|
||
```
|
||
- Pros: Small storage (4-8 bytes), sequential (better index performance)
|
||
- Cons: Coordination needed, easy to enumerate, not globally unique
|
||
|
||
**Composite Keys** (for junction tables):
|
||
```sql
|
||
PRIMARY KEY (user_id, role_id)
|
||
```
|
||
|
||
### 3. Foreign Key Constraints
|
||
|
||
**Always define foreign keys** for referential integrity:
|
||
|
||
```sql
|
||
CONSTRAINT fk_orders_customer
|
||
FOREIGN KEY (customer_id)
|
||
REFERENCES customers(id)
|
||
ON DELETE CASCADE -- or RESTRICT, SET NULL
|
||
ON UPDATE CASCADE
|
||
```
|
||
|
||
**ON DELETE options**:
|
||
- `CASCADE`: Delete child rows when parent deleted
|
||
- `RESTRICT`: Prevent delete if children exist
|
||
- `SET NULL`: Set foreign key to NULL
|
||
- `NO ACTION`: Similar to RESTRICT (database-specific)
|
||
|
||
### 4. Check Constraints
|
||
|
||
**Use check constraints for business rules**:
|
||
|
||
```sql
|
||
-- Email format validation
|
||
CONSTRAINT email_format CHECK (
|
||
email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}$'
|
||
)
|
||
|
||
-- Positive values
|
||
CONSTRAINT total_positive CHECK (total >= 0)
|
||
|
||
-- Enum-like values
|
||
CONSTRAINT valid_status CHECK (
|
||
status IN ('pending', 'processing', 'completed', 'cancelled')
|
||
)
|
||
|
||
-- Date ranges
|
||
CONSTRAINT valid_date_range CHECK (end_date > start_date)
|
||
```
|
||
|
||
### 5. Index Strategy
|
||
|
||
**Index Types and When to Use**:
|
||
|
||
**B-Tree (Default)**:
|
||
- WHERE clauses: `WHERE status = 'active'`
|
||
- ORDER BY: `ORDER BY created_at DESC`
|
||
- Range queries: `WHERE price BETWEEN 10 AND 100`
|
||
- Joins: Foreign key columns
|
||
|
||
**GIN (PostgreSQL - Generalized Inverted Index)**:
|
||
- JSONB columns: `WHERE data @> '{"key": "value"}'`
|
||
- Arrays: `WHERE tags @> ARRAY['postgresql']`
|
||
- Full-text search: `WHERE to_tsvector(text) @@ to_tsquery('search')`
|
||
|
||
**GiST (PostgreSQL - Generalized Search Tree)**:
|
||
- Geometric data: `WHERE location && box '((0,0),(1,1))'`
|
||
- Full-text search: Alternative to GIN
|
||
- Range types: `WHERE daterange && '[2025-01-01, 2025-12-31]'`
|
||
|
||
**Hash (Limited use)**:
|
||
- Equality only: `WHERE id = 123`
|
||
- Not recommended (B-tree usually better)
|
||
|
||
**Composite Index Column Order**:
|
||
```sql
|
||
-- Rule: Most selective column first, or most commonly filtered
|
||
CREATE INDEX idx_orders_status_created ON orders(status, created_at DESC);
|
||
|
||
-- Works for:
|
||
-- WHERE status = 'pending' ✅
|
||
-- WHERE status = 'pending' AND created_at > NOW() - INTERVAL '7 days' ✅
|
||
-- WHERE status = 'pending' ORDER BY created_at DESC ✅
|
||
|
||
-- Does NOT work for:
|
||
-- WHERE created_at > NOW() - INTERVAL '7 days' ❌ (doesn't start with status)
|
||
```
|
||
|
||
## Schema Patterns
|
||
|
||
### Pattern 1: Soft Delete
|
||
|
||
```sql
|
||
CREATE TABLE users (
|
||
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
|
||
email VARCHAR(255) NOT NULL,
|
||
name VARCHAR(255) NOT NULL,
|
||
deleted_at TIMESTAMP NULL,
|
||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
||
);
|
||
|
||
-- Partial unique index (only for non-deleted rows)
|
||
CREATE UNIQUE INDEX idx_users_email_active
|
||
ON users(email)
|
||
WHERE deleted_at IS NULL;
|
||
|
||
-- Query pattern: Always filter deleted
|
||
SELECT * FROM users WHERE deleted_at IS NULL;
|
||
```
|
||
|
||
### Pattern 2: Audit Trail
|
||
|
||
```sql
|
||
CREATE TABLE users (
|
||
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
|
||
email VARCHAR(255) NOT NULL,
|
||
name VARCHAR(255) NOT NULL,
|
||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||
created_by UUID REFERENCES users(id),
|
||
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||
updated_by UUID REFERENCES users(id)
|
||
);
|
||
|
||
-- Separate audit log table for full history
|
||
CREATE TABLE users_audit (
|
||
audit_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
|
||
user_id UUID NOT NULL,
|
||
operation VARCHAR(10) NOT NULL, -- INSERT, UPDATE, DELETE
|
||
old_values JSONB,
|
||
new_values JSONB,
|
||
changed_by UUID REFERENCES users(id),
|
||
changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
||
);
|
||
```
|
||
|
||
### Pattern 3: Many-to-Many with Metadata
|
||
|
||
```sql
|
||
-- Junction table with additional attributes
|
||
CREATE TABLE user_roles (
|
||
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
|
||
role_id UUID REFERENCES roles(id) ON DELETE CASCADE,
|
||
granted_by UUID REFERENCES users(id),
|
||
granted_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||
expires_at TIMESTAMP,
|
||
|
||
PRIMARY KEY (user_id, role_id)
|
||
);
|
||
|
||
CREATE INDEX idx_user_roles_user ON user_roles(user_id);
|
||
CREATE INDEX idx_user_roles_role ON user_roles(role_id);
|
||
CREATE INDEX idx_user_roles_expires ON user_roles(expires_at)
|
||
WHERE expires_at IS NOT NULL;
|
||
```
|
||
|
||
### Pattern 4: Hierarchical Data (Adjacency List)
|
||
|
||
```sql
|
||
CREATE TABLE categories (
|
||
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
|
||
name VARCHAR(255) NOT NULL,
|
||
parent_id UUID REFERENCES categories(id),
|
||
path TEXT, -- Materialized path: /electronics/computers/laptops
|
||
level INT, -- Denormalized for performance
|
||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
||
);
|
||
|
||
CREATE INDEX idx_categories_parent ON categories(parent_id);
|
||
CREATE INDEX idx_categories_path ON categories(path);
|
||
```
|
||
|
||
### Pattern 5: Polymorphic Associations (Avoid if Possible)
|
||
|
||
**❌ Problematic Approach**:
|
||
```sql
|
||
-- Weak referential integrity
|
||
CREATE TABLE comments (
|
||
id UUID PRIMARY KEY,
|
||
content TEXT NOT NULL,
|
||
commentable_type VARCHAR(50), -- 'Post' or 'Photo'
|
||
commentable_id UUID, -- No real foreign key!
|
||
created_at TIMESTAMP
|
||
);
|
||
```
|
||
|
||
**✅ Better Approach (Exclusive Arcs)**:
|
||
```sql
|
||
CREATE TABLE comments (
|
||
id UUID PRIMARY KEY,
|
||
content TEXT NOT NULL,
|
||
post_id UUID REFERENCES posts(id) ON DELETE CASCADE,
|
||
photo_id UUID REFERENCES photos(id) ON DELETE CASCADE,
|
||
created_at TIMESTAMP,
|
||
|
||
-- Exactly one must be set
|
||
CONSTRAINT one_commentable CHECK (
|
||
(post_id IS NOT NULL AND photo_id IS NULL) OR
|
||
(post_id IS NULL AND photo_id IS NOT NULL)
|
||
)
|
||
);
|
||
|
||
CREATE INDEX idx_comments_post ON comments(post_id);
|
||
CREATE INDEX idx_comments_photo ON comments(photo_id);
|
||
```
|
||
|
||
## Naming Conventions
|
||
|
||
**Tables**: Plural nouns, lowercase, underscores
|
||
```
|
||
users, orders, order_items, user_preferences
|
||
```
|
||
|
||
**Columns**: Singular nouns, lowercase, underscores
|
||
```
|
||
id, email, first_name, created_at, customer_id
|
||
```
|
||
|
||
**Primary Keys**: Always `id`
|
||
```
|
||
id UUID PRIMARY KEY
|
||
```
|
||
|
||
**Foreign Keys**: `{referenced_table_singular}_id`
|
||
```
|
||
customer_id, product_id, user_id
|
||
```
|
||
|
||
**Indexes**: `idx_{table}_{column(s)}[_{condition}]`
|
||
```
|
||
idx_users_email
|
||
idx_orders_customer_id
|
||
idx_orders_status_created
|
||
idx_users_email_active (partial index)
|
||
```
|
||
|
||
**Constraints**: `{type}_{table}_{description}`
|
||
```
|
||
pk_users (primary key)
|
||
fk_orders_customer (foreign key)
|
||
uq_users_email (unique)
|
||
ck_orders_total_positive (check)
|
||
```
|
||
|
||
## Common Anti-Patterns to Avoid
|
||
|
||
**❌ Generic JSON Columns (EAV Pattern)**:
|
||
```sql
|
||
-- Bad: No schema, no constraints, no indexes
|
||
CREATE TABLE entities (
|
||
id UUID PRIMARY KEY,
|
||
type VARCHAR(50),
|
||
attributes JSONB
|
||
);
|
||
```
|
||
|
||
**❌ Comma-Separated Lists**:
|
||
```sql
|
||
-- Bad: Violates 1NF, can't join efficiently
|
||
CREATE TABLE users (
|
||
id UUID PRIMARY KEY,
|
||
tags TEXT -- 'javascript,python,sql'
|
||
);
|
||
```
|
||
|
||
**✅ Use junction table instead**:
|
||
```sql
|
||
CREATE TABLE user_tags (
|
||
user_id UUID REFERENCES users(id),
|
||
tag_id UUID REFERENCES tags(id),
|
||
PRIMARY KEY (user_id, tag_id)
|
||
);
|
||
```
|
||
|
||
**❌ Nullable Boolean Columns**:
|
||
```sql
|
||
-- Bad: Three states (true, false, null) - ambiguous
|
||
is_active BOOLEAN NULL
|
||
```
|
||
|
||
**✅ Be explicit**:
|
||
```sql
|
||
-- Good: Two clear states
|
||
is_active BOOLEAN NOT NULL DEFAULT true
|
||
```
|
||
|
||
## ER Diagram Notation (Mermaid)
|
||
|
||
```mermaid
|
||
erDiagram
|
||
CUSTOMER ||--o{ ORDER : places
|
||
CUSTOMER {
|
||
uuid id PK "Primary Key"
|
||
string email UK "Unique Key"
|
||
string name
|
||
timestamp created_at
|
||
}
|
||
|
||
ORDER ||--|{ LINE_ITEM : contains
|
||
ORDER {
|
||
uuid id PK
|
||
uuid customer_id FK
|
||
decimal total
|
||
string status
|
||
}
|
||
|
||
PRODUCT ||--o{ LINE_ITEM : "ordered in"
|
||
LINE_ITEM {
|
||
uuid id PK
|
||
uuid order_id FK
|
||
uuid product_id FK
|
||
int quantity
|
||
decimal unit_price
|
||
}
|
||
```
|
||
|
||
**Cardinality Symbols**:
|
||
- `||--||` : One to exactly one
|
||
- `||--o|` : One to zero or one
|
||
- `||--o{` : One to zero or more
|
||
- `}|--|{` : One or more to one or more
|
||
|
||
## Database-Specific Best Practices
|
||
|
||
### PostgreSQL
|
||
|
||
```sql
|
||
-- Enable UUID extension
|
||
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
|
||
|
||
-- Use JSONB (not JSON) for better performance
|
||
metadata JSONB
|
||
|
||
-- Use array types when appropriate
|
||
tags TEXT[]
|
||
|
||
-- Use full-text search
|
||
CREATE INDEX idx_products_search ON products
|
||
USING GIN (to_tsvector('english', name || ' ' || description));
|
||
|
||
-- Use enums for fixed sets
|
||
CREATE TYPE order_status AS ENUM ('pending', 'processing', 'completed', 'cancelled');
|
||
```
|
||
|
||
### MySQL
|
||
|
||
```sql
|
||
-- Use InnoDB engine (default in 8.0+)
|
||
ENGINE=InnoDB
|
||
|
||
-- Use UTF8MB4 for full Unicode support (including emoji)
|
||
DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
|
||
|
||
-- Use generated columns for computed values
|
||
price_with_tax DECIMAL(10,2) GENERATED ALWAYS AS (price * 1.20) STORED
|
||
|
||
-- Partition large tables
|
||
PARTITION BY RANGE (YEAR(created_at)) (
|
||
PARTITION p2023 VALUES LESS THAN (2024),
|
||
PARTITION p2024 VALUES LESS THAN (2025),
|
||
PARTITION p2025 VALUES LESS THAN (2026)
|
||
);
|
||
```
|
||
|
||
### SQLite
|
||
|
||
```sql
|
||
-- Use STRICT tables for type enforcement (3.37+)
|
||
CREATE TABLE users (
|
||
id INTEGER PRIMARY KEY,
|
||
email TEXT NOT NULL,
|
||
age INTEGER NOT NULL
|
||
) STRICT;
|
||
|
||
-- Use WITHOUT ROWID for space efficiency
|
||
CREATE TABLE user_settings (
|
||
user_id INTEGER PRIMARY KEY,
|
||
theme TEXT NOT NULL,
|
||
locale TEXT NOT NULL
|
||
) WITHOUT ROWID;
|
||
|
||
-- Use triggers for complex constraints
|
||
CREATE TRIGGER check_age_before_insert
|
||
BEFORE INSERT ON users
|
||
FOR EACH ROW
|
||
WHEN NEW.age < 18
|
||
BEGIN
|
||
SELECT RAISE(ABORT, 'Users must be 18 or older');
|
||
END;
|
||
```
|
||
|
||
## Quality Checklist
|
||
|
||
**Schema Completeness**:
|
||
- [ ] All tables have primary keys
|
||
- [ ] All relationships have foreign keys
|
||
- [ ] Appropriate NOT NULL constraints
|
||
- [ ] Check constraints for business rules
|
||
- [ ] Default values where appropriate
|
||
- [ ] Created_at/updated_at timestamps
|
||
|
||
**Normalization**:
|
||
- [ ] Schema is at least 3NF
|
||
- [ ] No repeating groups
|
||
- [ ] No partial dependencies
|
||
- [ ] No transitive dependencies
|
||
- [ ] Denormalization justified and documented
|
||
|
||
**Performance**:
|
||
- [ ] Indexes on all foreign keys
|
||
- [ ] Indexes on commonly filtered columns
|
||
- [ ] Composite indexes for multi-column queries
|
||
- [ ] Covering indexes for frequent queries
|
||
- [ ] Partial indexes where appropriate
|
||
|
||
**Maintainability**:
|
||
- [ ] Consistent naming conventions
|
||
- [ ] Clear table and column names
|
||
- [ ] Comments on complex structures
|
||
- [ ] ER diagram provided
|
||
- [ ] Design decisions documented
|
||
|
||
---
|
||
|
||
## MCP-Enhanced Schema Design
|
||
|
||
### PostgreSQL MCP for Schema Validation
|
||
|
||
When PostgreSQL MCP is available, validate schema designs directly against production databases:
|
||
|
||
```typescript
|
||
// Runtime detection - no configuration needed
|
||
const hasPostgres = typeof mcp__postgres__query !== 'undefined';
|
||
|
||
if (hasPostgres) {
|
||
console.log("✓ Using PostgreSQL MCP for schema design validation");
|
||
|
||
// Validate schema against existing database
|
||
const schemaCheck = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
table_name,
|
||
column_name,
|
||
data_type,
|
||
is_nullable,
|
||
column_default,
|
||
character_maximum_length
|
||
FROM information_schema.columns
|
||
WHERE table_schema = 'public'
|
||
AND table_name IN ('users', 'orders', 'products')
|
||
ORDER BY table_name, ordinal_position
|
||
`
|
||
});
|
||
|
||
console.log(`✓ Retrieved schema for ${schemaCheck.rows.length} columns`);
|
||
|
||
// Check constraints
|
||
const constraints = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
tc.table_name,
|
||
tc.constraint_name,
|
||
tc.constraint_type,
|
||
kcu.column_name,
|
||
ccu.table_name AS foreign_table_name,
|
||
ccu.column_name AS foreign_column_name
|
||
FROM information_schema.table_constraints tc
|
||
LEFT JOIN information_schema.key_column_usage kcu
|
||
ON tc.constraint_name = kcu.constraint_name
|
||
LEFT JOIN information_schema.constraint_column_usage ccu
|
||
ON tc.constraint_name = ccu.constraint_name
|
||
WHERE tc.table_schema = 'public'
|
||
ORDER BY tc.table_name, tc.constraint_type
|
||
`
|
||
});
|
||
|
||
console.log(`✓ Found ${constraints.rows.length} constraints`);
|
||
|
||
// Validate foreign key relationships
|
||
const orphanedRecords = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
'orders' as table_name,
|
||
COUNT(*) as orphaned_count
|
||
FROM orders o
|
||
LEFT JOIN users u ON o.user_id = u.id
|
||
WHERE u.id IS NULL
|
||
`
|
||
});
|
||
|
||
if (orphanedRecords.rows[0].orphaned_count > 0) {
|
||
console.log(`⚠️ Found ${orphanedRecords.rows[0].orphaned_count} orphaned records`);
|
||
} else {
|
||
console.log("✓ All foreign key relationships valid");
|
||
}
|
||
|
||
// Test DDL before executing
|
||
const ddlTest = await mcp__postgres__query({
|
||
sql: `
|
||
BEGIN;
|
||
-- Test adding new column
|
||
ALTER TABLE users ADD COLUMN test_column VARCHAR(100);
|
||
-- Check table size after change
|
||
SELECT pg_size_pretty(pg_relation_size('users')) as table_size;
|
||
ROLLBACK;
|
||
`
|
||
});
|
||
|
||
console.log(`✓ DDL validated (would not break existing data)`);
|
||
|
||
} else {
|
||
console.log("ℹ️ PostgreSQL MCP not available");
|
||
console.log(" Install for schema validation:");
|
||
console.log(" npm install -g @modelcontextprotocol/server-postgres");
|
||
}
|
||
```
|
||
|
||
### Benefits Comparison
|
||
|
||
| Aspect | With PostgreSQL MCP | Without MCP (Traditional) |
|
||
|--------|-------------------|--------------------------|
|
||
| **Schema Exploration** | Query information_schema instantly | Request schema dump → wait |
|
||
| **Constraint Validation** | Check FK relationships on real data | Assume constraints work |
|
||
| **DDL Testing** | Test ALTER statements with ROLLBACK | Deploy and hope |
|
||
| **Data Distribution** | Analyze with pg_stats | Guess cardinality |
|
||
| **Impact Analysis** | Query actual table sizes | Estimate impact |
|
||
| **Normalization Check** | Find duplicates in production | Theoretical analysis |
|
||
| **Migration Safety** | Test on production replica | Cross fingers |
|
||
|
||
**When to use PostgreSQL MCP:**
|
||
- Designing schema for existing database
|
||
- Validating normalization against real data
|
||
- Testing DDL changes before deployment
|
||
- Analyzing data distribution for index design
|
||
- Finding schema anomalies
|
||
- Planning migrations
|
||
- Reverse engineering existing schemas
|
||
|
||
**When traditional approach needed:**
|
||
- Greenfield database design
|
||
- Designing for future data
|
||
- Theoretical schema modeling
|
||
- No database access
|
||
|
||
### Real-World Example: Adding User Preferences
|
||
|
||
**With PostgreSQL MCP (15 minutes):**
|
||
|
||
```typescript
|
||
// 1. Analyze current users table
|
||
const currentSchema = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
column_name,
|
||
data_type,
|
||
is_nullable
|
||
FROM information_schema.columns
|
||
WHERE table_name = 'users'
|
||
ORDER BY ordinal_position
|
||
`
|
||
});
|
||
|
||
console.log(`✓ Users table has ${currentSchema.rows.length} columns`);
|
||
|
||
// 2. Check for existing preference data
|
||
const preferencesCheck = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
COUNT(DISTINCT user_id) as users_with_prefs,
|
||
COUNT(*) as total_prefs,
|
||
AVG(array_length(string_to_array(preferences, ','), 1)) as avg_prefs_per_user
|
||
FROM user_metadata
|
||
WHERE preferences IS NOT NULL
|
||
`
|
||
});
|
||
|
||
console.log(`✓ ${preferencesCheck.rows[0].users_with_prefs} users have preferences`);
|
||
|
||
// 3. Design decision: Separate table vs JSONB column
|
||
const tableSize = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
pg_size_pretty(pg_relation_size('users')) as current_size,
|
||
pg_size_pretty(pg_relation_size('users') * 1.2) as estimated_with_jsonb
|
||
`
|
||
});
|
||
|
||
console.log(`✓ Adding JSONB column would increase size to ${tableSize.rows[0].estimated_with_jsonb}`);
|
||
|
||
// 4. Test the migration (with ROLLBACK)
|
||
const migrationTest = await mcp__postgres__query({
|
||
sql: `
|
||
BEGIN;
|
||
|
||
-- Add preferences column
|
||
ALTER TABLE users ADD COLUMN preferences JSONB DEFAULT '{}'::jsonb;
|
||
|
||
-- Add GIN index for JSONB queries
|
||
CREATE INDEX idx_users_preferences ON users USING GIN (preferences);
|
||
|
||
-- Test query performance
|
||
EXPLAIN ANALYZE
|
||
SELECT * FROM users
|
||
WHERE preferences @> '{"theme": "dark"}'::jsonb;
|
||
|
||
ROLLBACK;
|
||
`
|
||
});
|
||
|
||
console.log("✓ Migration tested successfully");
|
||
|
||
// Decision: Use JSONB column (flexible, good performance with GIN index)
|
||
```
|
||
|
||
**Without MCP (2 hours):**
|
||
|
||
1. Request schema documentation (15 min wait)
|
||
2. Analyze schema manually (20 min)
|
||
3. Make design decision based on assumptions (15 min)
|
||
4. Write migration script (15 min)
|
||
5. Deploy to test database (10 min)
|
||
6. Load test data (20 min)
|
||
7. Test queries (10 min)
|
||
8. Find issues (15 min)
|
||
9. Revise and redeploy (15 min)
|
||
|
||
### Schema Validation Patterns
|
||
|
||
```typescript
|
||
// Comprehensive schema validation
|
||
async function validateSchema() {
|
||
const hasPostgres = typeof mcp__postgres__query !== 'undefined';
|
||
|
||
if (hasPostgres) {
|
||
// 1. Check for missing indexes on foreign keys
|
||
const missingIndexes = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
tc.table_name,
|
||
kcu.column_name
|
||
FROM information_schema.table_constraints tc
|
||
JOIN information_schema.key_column_usage kcu
|
||
ON tc.constraint_name = kcu.constraint_name
|
||
WHERE tc.constraint_type = 'FOREIGN KEY'
|
||
AND NOT EXISTS (
|
||
SELECT 1
|
||
FROM pg_indexes
|
||
WHERE tablename = tc.table_name
|
||
AND indexdef LIKE '%' || kcu.column_name || '%'
|
||
)
|
||
`
|
||
});
|
||
|
||
if (missingIndexes.rows.length > 0) {
|
||
console.log("⚠️ Missing indexes on foreign keys:");
|
||
missingIndexes.rows.forEach(row => {
|
||
console.log(` ${row.table_name}.${row.column_name}`);
|
||
});
|
||
}
|
||
|
||
// 2. Check for columns that should be NOT NULL
|
||
const nullableColumns = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
table_name,
|
||
column_name,
|
||
COUNT(*) FILTER (WHERE value IS NULL) as null_count,
|
||
COUNT(*) as total_count
|
||
FROM (
|
||
SELECT 'users' as table_name, 'email' as column_name, email as value FROM users
|
||
UNION ALL
|
||
SELECT 'orders' as table_name, 'user_id' as column_name, user_id::text as value FROM orders
|
||
) data
|
||
GROUP BY table_name, column_name
|
||
HAVING COUNT(*) FILTER (WHERE value IS NULL) = 0
|
||
AND EXISTS (
|
||
SELECT 1
|
||
FROM information_schema.columns
|
||
WHERE table_name = data.table_name
|
||
AND column_name = data.column_name
|
||
AND is_nullable = 'YES'
|
||
)
|
||
`
|
||
});
|
||
|
||
if (nullableColumns.rows.length > 0) {
|
||
console.log("⚠️ Columns that could be NOT NULL:");
|
||
nullableColumns.rows.forEach(row => {
|
||
console.log(` ${row.table_name}.${row.column_name} (0 NULLs in ${row.total_count} rows)`);
|
||
});
|
||
}
|
||
|
||
// 3. Check for denormalization opportunities
|
||
const duplicateData = await mcp__postgres__query({
|
||
sql: `
|
||
SELECT
|
||
user_id,
|
||
email,
|
||
COUNT(*) as duplicate_count
|
||
FROM users
|
||
GROUP BY user_id, email
|
||
HAVING COUNT(*) > 1
|
||
`
|
||
});
|
||
|
||
return {
|
||
missingIndexes: missingIndexes.rows,
|
||
nullableColumns: nullableColumns.rows,
|
||
duplicates: duplicateData.rows
|
||
};
|
||
}
|
||
}
|
||
```
|
||
|
||
### PostgreSQL MCP Installation
|
||
|
||
```bash
|
||
# Install PostgreSQL MCP
|
||
npm install -g @modelcontextprotocol/server-postgres
|
||
|
||
# Configure for schema design
|
||
{
|
||
"mcpServers": {
|
||
"postgres": {
|
||
"command": "npx",
|
||
"args": ["-y", "@modelcontextprotocol/server-postgres"],
|
||
"env": {
|
||
"POSTGRES_CONNECTION_STRING": "postgresql://schema_designer:pass@db.company.com:5432/production"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
Once installed, all agents reading this skill automatically validate schemas against live databases.
|
||
|
||
### Schema Design Workflow with MCP
|
||
|
||
1. **Explore Existing Schema**: Query information_schema
|
||
2. **Analyze Data Distribution**: Use pg_stats
|
||
3. **Check Constraints**: Validate FK relationships
|
||
4. **Test DDL Changes**: Use BEGIN...ROLLBACK
|
||
5. **Estimate Impact**: Query table/index sizes
|
||
6. **Validate Normalization**: Find duplicates
|
||
7. **Plan Indexes**: Analyze query patterns
|
||
8. **Generate Migration**: Create safe DDL scripts
|
||
|
||
---
|
||
|
||
**Version**: 1.0
|
||
**Last Updated**: January 2025
|
||
**MCP Enhancement**: PostgreSQL for data-driven schema design
|
||
**Best Practices**: Industry-proven schema design patterns
|