Initial commit

.claude-plugin/plugin.json  (new file, 15 lines)
@@ -0,0 +1,15 @@
{
  "name": "database-partition-manager",
  "description": "Database plugin for database-partition-manager",
  "version": "1.0.0",
  "author": {
    "name": "Claude Code Plugins",
    "email": "[email protected]"
  },
  "skills": [
    "./skills"
  ],
  "commands": [
    "./commands"
  ]
}

README.md  (new file, 3 lines)
@@ -0,0 +1,3 @@
# database-partition-manager

Database plugin for database-partition-manager

commands/partitioning.md  (new file, 708 lines)
@@ -0,0 +1,708 @@
---
description: Design and implement table partitioning strategies for massive datasets
shortcut: partition
---

# Database Partition Manager

Design, implement, and manage table partitioning strategies for massive datasets, with automated partition maintenance, query optimization, and data lifecycle management.

## When to Use This Command

Use `/partition` when you need to:
- Manage tables exceeding 100GB with slow query performance
- Implement time-series data archival strategies (IoT, logs, metrics)
- Optimize queries that filter by date ranges or specific values
- Shorten maintenance windows for VACUUM, REINDEX, and ANALYZE operations
- Implement efficient data retention policies (drop old partitions instead of bulk DELETE)
- Improve parallel query performance across multiple partitions

DON'T use this when:
- Tables are small (<10GB) and perform well
- Queries don't filter by the partition key (partition pruning can't apply)
- The application can't be updated to issue partition-aware queries
- The database doesn't support native partitioning (use application-level sharding instead)
## Design Decisions

This command implements **declarative partitioning** because:
- Native database support provides optimal query performance
- Automatic partition pruning can cut query execution time by 90%+ when queries filter on the partition key
- Only the relevant partitions are scanned, so I/O stays proportional to the data actually requested
- Partition-wise joins improve multi-table query performance
- Automated partition management reduces operational overhead

**Alternative considered: application-level sharding**
- Full control over data distribution
- Requires application code changes
- No automatic query optimization
- Recommended for multi-tenant applications with tenant-based isolation

**Alternative considered: inheritance-based partitioning (legacy)**
- The only option in PostgreSQL versions before 10
- Requires manually maintained triggers to route inserts
- No automatic partition pruning (relies on constraint exclusion)
- Recommended only for legacy systems
## Prerequisites

Before running this command:
1. Identify the partition key (typically a timestamp or category column)
2. Analyze query patterns to confirm they filter by the partition key
3. Estimate partition size (target: 10-50GB per partition)
4. Plan a partition retention policy (e.g., keep 90 days online, archive the rest)
5. Test the partition migration on a development database
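The retention arithmetic behind step 4 is easy to get wrong around year boundaries. A minimal sketch of the cutoff computation — the function name and its inputs are illustrative, not part of the plugin:

```python
from datetime import date

def partitions_to_drop(existing_suffixes, today, retention_months):
    """Return the YYYY_MM suffixes older than the retention cutoff.

    existing_suffixes: iterable of 'YYYY_MM' strings parsed from partition names.
    """
    # Walk back retention_months using month arithmetic, so December→January
    # transitions are handled correctly
    total = today.year * 12 + (today.month - 1) - retention_months
    cutoff = date(total // 12, total % 12 + 1, 1)
    drop = []
    for suffix in existing_suffixes:
        year, month = suffix.split("_")
        if date(int(year), int(month), 1) < cutoff:
            drop.append(suffix)
    return sorted(drop)
```

A partition whose first day equals the cutoff month is retained; only strictly older months are dropped.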

## Implementation Process

### Step 1: Analyze Table and Query Patterns
Review table size and query patterns, and identify the optimal partition strategy.
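The analysis can start from the statistics PostgreSQL already collects; a sketch (the table name is a placeholder):

```sql
-- Find the largest tables: partitioning candidates usually top this list
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       seq_scan,   -- many sequential scans on a huge table hint at pruning potential
       idx_scan
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;

-- Confirm that candidate queries actually filter on the intended partition key
EXPLAIN SELECT * FROM sensor_readings
WHERE reading_time >= now() - interval '7 days';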

### Step 2: Design Partition Schema
Choose a partitioning method (range, list, or hash) and a partition key based on access patterns.

### Step 3: Create Partitioned Table
Convert the existing table to a partitioned table with minimal downtime, using pg_partman or a manual migration.
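If pg_partman is available, much of the manual DDL below can be replaced by its setup call. A sketch assuming pg_partman is installed in a `partman` schema — note that `create_parent`'s exact signature varies between major versions, so check the version you have installed:

```sql
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;

-- Register the parent table; pg_partman then pre-creates and maintains
-- monthly partitions on its own schedule
SELECT partman.create_parent(
    p_parent_table := 'public.sensor_readings',
    p_control      := 'reading_time',
    p_interval     := '1 month'
);
```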

### Step 4: Implement Automated Partition Maintenance
Set up automated partition creation, archival, and cleanup processes.
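Scheduling usually comes down to a single cron entry that invokes the maintenance script listed under Output Format; a sketch (the path and schedule are examples):

```cron
# Run partition maintenance weekly, Sunday at 02:00
0 2 * * 0  /opt/db/scripts/partition_maintenance.sh
```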

### Step 5: Optimize Queries for Partition Pruning
Ensure queries reference the partition key in their WHERE clauses so the planner can prune partitions automatically.
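Pruning is defeated whenever the planner cannot compare the partition key directly to a constant or stable expression. A sketch of the difference, against the `sensor_readings` table defined in Example 1 below:

```sql
-- Prunable: the partition key is compared directly to constants
EXPLAIN SELECT count(*) FROM sensor_readings
WHERE reading_time >= '2024-02-01' AND reading_time < '2024-03-01';

-- Not prunable: wrapping the key in a function hides it from the planner,
-- so every partition is scanned
EXPLAIN SELECT count(*) FROM sensor_readings
WHERE date_trunc('month', reading_time) = '2024-02-01';
```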

## Output Format

The command generates:
- `schema/partitioned_table.sql` - Partitioned table definition
- `maintenance/partition_manager.sql` - Automated partition management functions
- `scripts/partition_maintenance.sh` - Cron job for partition operations
- `migration/convert_to_partitioned.sql` - Zero-downtime migration script
- `monitoring/partition_health.sql` - Partition size and performance monitoring

## Code Examples

### Example 1: PostgreSQL Range Partitioning for Time-Series Data
```sql
-- Create partitioned table for time-series sensor data
CREATE TABLE sensor_readings (
    id BIGSERIAL,
    sensor_id INTEGER NOT NULL,
    reading_value NUMERIC(10,2) NOT NULL,
    reading_time TIMESTAMP NOT NULL,
    metadata JSONB,
    PRIMARY KEY (id, reading_time)
) PARTITION BY RANGE (reading_time);

-- Create indexes on the partitioned table (inherited by all partitions)
CREATE INDEX idx_sensor_readings_sensor_id ON sensor_readings (sensor_id);
CREATE INDEX idx_sensor_readings_time ON sensor_readings (reading_time);
CREATE INDEX idx_sensor_readings_metadata ON sensor_readings USING GIN (metadata);

-- Create initial partitions (monthly strategy)
CREATE TABLE sensor_readings_2024_01 PARTITION OF sensor_readings
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE sensor_readings_2024_02 PARTITION OF sensor_readings
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

CREATE TABLE sensor_readings_2024_03 PARTITION OF sensor_readings
    FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');

-- Create default partition for out-of-range data
CREATE TABLE sensor_readings_default PARTITION OF sensor_readings DEFAULT;

-- Automated partition management function
CREATE OR REPLACE FUNCTION create_monthly_partitions(
    p_table_name TEXT,
    p_months_ahead INTEGER DEFAULT 3
)
RETURNS VOID AS $$
DECLARE
    v_start_date DATE;
    v_end_date DATE;
    v_partition_name TEXT;
    v_sql TEXT;
    v_month INTEGER;
BEGIN
    -- Create partitions for the next N months
    FOR v_month IN 1..p_months_ahead LOOP
        v_start_date := DATE_TRUNC('month', CURRENT_DATE + (v_month || ' months')::INTERVAL);
        v_end_date := v_start_date + INTERVAL '1 month';
        v_partition_name := p_table_name || '_' || TO_CHAR(v_start_date, 'YYYY_MM');

        -- Skip if the partition already exists
        IF NOT EXISTS (
            SELECT 1 FROM pg_class
            WHERE relname = v_partition_name
        ) THEN
            v_sql := FORMAT(
                'CREATE TABLE %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
                v_partition_name,
                p_table_name,
                v_start_date,
                v_end_date
            );

            RAISE NOTICE 'Creating partition: %', v_partition_name;
            EXECUTE v_sql;

            -- Analyze the new partition so the planner has fresh statistics
            EXECUTE FORMAT('ANALYZE %I', v_partition_name);
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- Automated partition archival and cleanup
CREATE OR REPLACE FUNCTION archive_old_partitions(
    p_table_name TEXT,
    p_retention_months INTEGER DEFAULT 12,
    p_archive_table TEXT DEFAULT NULL
)
RETURNS VOID AS $$
DECLARE
    v_partition RECORD;
    v_cutoff_date DATE;
    v_sql TEXT;
BEGIN
    v_cutoff_date := DATE_TRUNC('month', CURRENT_DATE - (p_retention_months || ' months')::INTERVAL);

    FOR v_partition IN
        SELECT
            c.relname AS partition_name,
            pg_get_expr(c.relpartbound, c.oid) AS partition_bounds
        FROM pg_class c
        JOIN pg_inherits i ON i.inhrelid = c.oid
        JOIN pg_class p ON p.oid = i.inhparent
        WHERE p.relname = p_table_name
          AND c.relname LIKE p_table_name || '_%'
          AND c.relname != p_table_name || '_default'
        ORDER BY c.relname
    LOOP
        -- Extract the partition start date from its name (suffix _YYYY_MM)
        IF v_partition.partition_name ~ '_\d{4}_\d{2}$' THEN
            DECLARE
                v_partition_date DATE;
            BEGIN
                v_partition_date := TO_DATE(
                    SUBSTRING(v_partition.partition_name FROM '\d{4}_\d{2}$'),
                    'YYYY_MM'
                );

                IF v_partition_date < v_cutoff_date THEN
                    RAISE NOTICE 'Archiving partition: %', v_partition.partition_name;

                    IF p_archive_table IS NOT NULL THEN
                        -- Move data to the archive table
                        v_sql := FORMAT(
                            'INSERT INTO %I SELECT * FROM %I',
                            p_archive_table,
                            v_partition.partition_name
                        );
                        EXECUTE v_sql;
                    END IF;

                    -- Detach and drop the old partition
                    v_sql := FORMAT(
                        'ALTER TABLE %I DETACH PARTITION %I',
                        p_table_name,
                        v_partition.partition_name
                    );
                    EXECUTE v_sql;

                    v_sql := FORMAT('DROP TABLE %I', v_partition.partition_name);
                    EXECUTE v_sql;

                    RAISE NOTICE 'Dropped partition: %', v_partition.partition_name;
                END IF;
            END;
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- Partition health monitoring
-- (pg_stat_user_tables exposes the table name as relname, not tablename)
CREATE OR REPLACE VIEW partition_health AS
SELECT
    schemaname,
    relname AS partition_name,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    pg_size_pretty(pg_total_relation_size(relid) -
                   pg_relation_size(relid)) AS index_size,
    n_live_tup AS row_count,
    n_dead_tup AS dead_rows,
    ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_row_percent,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE relname LIKE '%_202%'  -- crude filter for YYYY_MM-suffixed partitions
ORDER BY schemaname, relname;

-- Query to show partition pruning effectiveness
CREATE OR REPLACE FUNCTION explain_partition_pruning(p_query TEXT)
RETURNS TABLE (plan_line TEXT) AS $$
BEGIN
    RETURN QUERY EXECUTE 'EXPLAIN (ANALYZE, BUFFERS) ' || p_query;
END;
$$ LANGUAGE plpgsql;
```

```bash
#!/bin/bash
# scripts/partition_maintenance.sh - Automated Partition Management

set -euo pipefail

# Configuration
DB_NAME="mydb"
DB_USER="postgres"
DB_HOST="localhost"
RETENTION_MONTHS=12
CREATE_AHEAD_MONTHS=3
LOG_FILE="/var/log/partition_maintenance.log"

log() {
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

# Create future partitions
create_partitions() {
    log "Creating partitions for next $CREATE_AHEAD_MONTHS months..."

    psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -v ON_ERROR_STOP=1 <<EOF
SELECT create_monthly_partitions('sensor_readings', $CREATE_AHEAD_MONTHS);
SELECT create_monthly_partitions('audit_logs', $CREATE_AHEAD_MONTHS);
SELECT create_monthly_partitions('user_events', $CREATE_AHEAD_MONTHS);
EOF

    log "Partition creation completed"
}

# Archive and clean up old partitions
cleanup_partitions() {
    log "Archiving partitions older than $RETENTION_MONTHS months..."

    psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -v ON_ERROR_STOP=1 <<EOF
SELECT archive_old_partitions('sensor_readings', $RETENTION_MONTHS, 'sensor_readings_archive');
SELECT archive_old_partitions('audit_logs', $RETENTION_MONTHS, 'audit_logs_archive');
SELECT archive_old_partitions('user_events', $RETENTION_MONTHS, NULL); -- No archival, just drop
EOF

    log "Partition cleanup completed"
}

# Analyze partitions for query optimization
analyze_partitions() {
    log "Analyzing partitions..."

    psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -v ON_ERROR_STOP=1 <<EOF
SELECT tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE tablename LIKE '%_202%'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

-- Analyze all partitions
DO \$\$
DECLARE
    r RECORD;
BEGIN
    FOR r IN
        SELECT schemaname, tablename
        FROM pg_tables
        WHERE tablename LIKE '%_202%'
    LOOP
        EXECUTE FORMAT('ANALYZE %I.%I', r.schemaname, r.tablename);
    END LOOP;
END;
\$\$;
EOF

    log "Partition analysis completed"
}

# Generate health report
health_report() {
    log "Generating partition health report..."

    psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -v ON_ERROR_STOP=1 <<EOF | tee -a "$LOG_FILE"
\echo '=== Partition Health Report ==='
\echo ''

SELECT * FROM partition_health
WHERE total_size != '0 bytes'
ORDER BY partition_name DESC
LIMIT 20;

\echo ''
\echo '=== Partitions Needing VACUUM ==='
SELECT partition_name, dead_row_percent, row_count, dead_rows
FROM partition_health
WHERE dead_row_percent > 10
ORDER BY dead_row_percent DESC;
EOF

    log "Health report generated"
}

# Main execution
main() {
    log "=== Starting Partition Maintenance ==="

    create_partitions
    cleanup_partitions
    analyze_partitions
    health_report

    log "=== Partition Maintenance Completed ==="
}

main "$@"
```

### Example 2: List Partitioning by Category with Hash Sub-Partitioning

```sql
-- Multi-level partitioning: LIST (by region) → HASH (by customer_id)
CREATE TABLE orders (
    order_id BIGSERIAL,
    customer_id INTEGER NOT NULL,
    region VARCHAR(10) NOT NULL,
    order_date TIMESTAMP NOT NULL,
    total_amount NUMERIC(10,2),
    PRIMARY KEY (order_id, region, customer_id)
) PARTITION BY LIST (region);

-- Create regional partitions
CREATE TABLE orders_us PARTITION OF orders
    FOR VALUES IN ('US', 'CA', 'MX')
    PARTITION BY HASH (customer_id);

CREATE TABLE orders_eu PARTITION OF orders
    FOR VALUES IN ('UK', 'FR', 'DE', 'ES', 'IT')
    PARTITION BY HASH (customer_id);

CREATE TABLE orders_asia PARTITION OF orders
    FOR VALUES IN ('JP', 'CN', 'IN', 'SG')
    PARTITION BY HASH (customer_id);

-- Create hash sub-partitions (4 buckets per region for parallel processing)
CREATE TABLE orders_us_0 PARTITION OF orders_us FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_us_1 PARTITION OF orders_us FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_us_2 PARTITION OF orders_us FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_us_3 PARTITION OF orders_us FOR VALUES WITH (MODULUS 4, REMAINDER 3);

CREATE TABLE orders_eu_0 PARTITION OF orders_eu FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_eu_1 PARTITION OF orders_eu FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_eu_2 PARTITION OF orders_eu FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_eu_3 PARTITION OF orders_eu FOR VALUES WITH (MODULUS 4, REMAINDER 3);

CREATE TABLE orders_asia_0 PARTITION OF orders_asia FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_asia_1 PARTITION OF orders_asia FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_asia_2 PARTITION OF orders_asia FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_asia_3 PARTITION OF orders_asia FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- Query optimization with partition-wise joins
SET enable_partitionwise_join = on;
SET enable_partitionwise_aggregate = on;

-- Demonstrate partition pruning
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, SUM(total_amount) AS total_spent
FROM orders
WHERE region = 'US'
  AND order_date >= '2024-01-01'
  AND order_date < '2024-02-01'
GROUP BY customer_id
ORDER BY total_spent DESC
LIMIT 100;

-- The plan shows only orders_us sub-partitions are scanned (not EU or Asia)
```

```python
# scripts/partition_migration.py - Minimal-downtime partition migration
import logging
from datetime import timedelta

import psycopg2
from psycopg2 import sql

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class PartitionMigrator:
    """Migrate an existing table to a partitioned table with minimal downtime."""

    def __init__(self, connection_string: str):
        self.conn_string = connection_string

    def connect(self):
        return psycopg2.connect(self.conn_string)

    def migrate_to_partitioned(
        self,
        table_name: str,
        partition_column: str,
        partition_type: str = 'RANGE',
        partition_interval: str = 'MONTHLY'
    ):
        """
        Migrate a table to a partitioned table with minimal downtime.

        Strategy:
        1. Create new partitioned table
        2. Copy existing data in batches
        3. Rename tables atomically
        4. Update application to use new table

        Note: this assumes writes to the source table are paused during the
        copy; rows inserted mid-migration would otherwise be missed.
        """
        conn = self.connect()
        conn.autocommit = False

        try:
            with conn.cursor() as cur:
                # Step 1: Create new partitioned table
                logger.info(f"Creating partitioned table {table_name}_new...")

                cur.execute(f"""
                    CREATE TABLE {table_name}_new (
                        LIKE {table_name} INCLUDING ALL
                    ) PARTITION BY {partition_type} ({partition_column});
                """)

                # Step 2: Create initial partitions based on existing data
                logger.info("Creating initial partitions...")

                if partition_interval == 'MONTHLY':
                    cur.execute(f"""
                        SELECT
                            DATE_TRUNC('month', MIN({partition_column})) AS min_date,
                            DATE_TRUNC('month', MAX({partition_column})) AS max_date
                        FROM {table_name};
                    """)

                    min_date, max_date = cur.fetchone()
                    logger.info(f"Data range: {min_date} to {max_date}")

                    current_date = min_date
                    while current_date <= max_date:
                        # First day of the next month
                        next_date = (current_date + timedelta(days=32)).replace(day=1)

                        partition_name = f"{table_name}_{current_date.strftime('%Y_%m')}"

                        cur.execute(sql.SQL(
                            "CREATE TABLE {} PARTITION OF {} FOR VALUES FROM (%s) TO (%s);"
                        ).format(
                            sql.Identifier(partition_name),
                            sql.Identifier(f"{table_name}_new")
                        ), (current_date, next_date))

                        logger.info(f"Created partition: {partition_name}")
                        current_date = next_date

                # Step 3: Copy data in batches. LIMIT/OFFSET keeps each
                # transaction short, at the cost of rescanning; keyset
                # pagination on the partition column scales better for
                # very large tables.
                logger.info("Copying data in batches...")

                batch_size = 10000
                offset = 0

                while True:
                    cur.execute(f"""
                        INSERT INTO {table_name}_new
                        SELECT * FROM {table_name}
                        ORDER BY {partition_column}
                        LIMIT {batch_size} OFFSET {offset};
                    """)

                    rows_copied = cur.rowcount
                    if rows_copied == 0:
                        break

                    offset += batch_size
                    logger.info(f"Copied {offset} rows...")

                    # Commit each batch to avoid long-running transactions
                    conn.commit()

                # Step 4: Verify row counts
                logger.info("Verifying row counts...")

                cur.execute(f"SELECT COUNT(*) FROM {table_name};")
                original_count = cur.fetchone()[0]

                cur.execute(f"SELECT COUNT(*) FROM {table_name}_new;")
                new_count = cur.fetchone()[0]

                if original_count != new_count:
                    raise Exception(
                        f"Row count mismatch! Original: {original_count}, New: {new_count}"
                    )

                logger.info(f"Row count verified: {original_count} rows")

                # Step 5: Rename tables in a single transaction (psycopg2
                # already has one open; committed below)
                logger.info("Renaming tables...")

                cur.execute(f"""
                    ALTER TABLE {table_name} RENAME TO {table_name}_old;
                    ALTER TABLE {table_name}_new RENAME TO {table_name};
                """)

                logger.info("Migration completed successfully!")
                logger.info(f"Old table preserved as {table_name}_old")

                conn.commit()

        except Exception as e:
            conn.rollback()
            logger.error(f"Migration failed: {e}")
            raise
        finally:
            conn.close()

    def verify_partition_pruning(self, query: str):
        """Test whether a query benefits from partition pruning."""
        conn = self.connect()

        try:
            with conn.cursor() as cur:
                cur.execute(f"EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) {query}")
                plan = cur.fetchone()[0][0]

                # Extract partition pruning info
                pruned = self._count_pruned_partitions(plan)

                logger.info("Partition pruning analysis:")
                logger.info(f"  Total partitions: {pruned['total']}")
                logger.info(f"  Scanned partitions: {pruned['scanned']}")
                logger.info(f"  Pruned partitions: {pruned['pruned']}")
                logger.info(f"  Pruning effectiveness: {pruned['effectiveness']:.1f}%")

                return pruned

        finally:
            conn.close()

    def _count_pruned_partitions(self, plan: dict) -> dict:
        """Recursively count partitions in an EXPLAIN plan.

        Heuristic: relations whose names contain '_202' are assumed to be
        YYYY_MM-suffixed partitions.
        """
        total = 0
        scanned = 0

        def traverse(node):
            nonlocal total, scanned

            if 'Relation Name' in node and '_202' in node['Relation Name']:
                total += 1
                if 'Plans' in node or node.get('Actual Rows', 0) > 0:
                    scanned += 1

            if 'Plans' in node:
                for child in node['Plans']:
                    traverse(child)

        traverse(plan['Plan'])

        pruned = total - scanned
        effectiveness = (pruned / total * 100) if total > 0 else 0

        return {
            'total': total,
            'scanned': scanned,
            'pruned': pruned,
            'effectiveness': effectiveness
        }


# Usage
if __name__ == "__main__":
    migrator = PartitionMigrator(
        "postgresql://user:password@localhost/mydb"
    )

    # Migrate sensor_readings table
    migrator.migrate_to_partitioned(
        table_name='sensor_readings',
        partition_column='reading_time',
        partition_type='RANGE',
        partition_interval='MONTHLY'
    )

    # Verify partition pruning works
    test_query = """
        SELECT * FROM sensor_readings
        WHERE reading_time >= '2024-10-01'
          AND reading_time < '2024-11-01'
        LIMIT 1000;
    """

    migrator.verify_partition_pruning(test_query)
```

## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| "No partition of relation ... found for row" | Data falls outside all partition ranges | Create a default partition or extend the partition range |
| "Partition constraint violated" | Row violates the bounds of its target partition | Fix the data or adjust the partition bounds |
| "Cannot create partition of temporary table" | Partitioning temp tables is unsupported | Use regular tables or application-level sharding |
| Too many partitions (>1000) | Excessive partition count slows planning | Increase the partition interval (daily → weekly → monthly) |
| Constraint exclusion not working | Query doesn't filter by the partition key | Rewrite the query to include the partition key in the WHERE clause |
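For the first error in the table, a DEFAULT partition is the usual fix; later, stray rows can be re-routed into a proper range partition. A sketch against the `sensor_readings` table from Example 1 (PostgreSQL refuses to create a new range partition while the default holds conflicting rows, hence the detach/re-attach dance):

```sql
-- Catch-all for rows outside every defined range
CREATE TABLE sensor_readings_default PARTITION OF sensor_readings DEFAULT;

-- Later: carve the stray rows out into a real partition
ALTER TABLE sensor_readings DETACH PARTITION sensor_readings_default;

CREATE TABLE sensor_readings_2024_04 PARTITION OF sensor_readings
    FOR VALUES FROM ('2024-04-01') TO ('2024-05-01');

INSERT INTO sensor_readings
SELECT * FROM sensor_readings_default
WHERE reading_time >= '2024-04-01' AND reading_time < '2024-05-01';

DELETE FROM sensor_readings_default
WHERE reading_time >= '2024-04-01' AND reading_time < '2024-05-01';

ALTER TABLE sensor_readings ATTACH PARTITION sensor_readings_default DEFAULT;
```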

## Configuration Options

**Partition Planning**
- `partition_type`: RANGE (dates), LIST (categories), HASH (distribution)
- `partition_interval`: DAILY, WEEKLY, MONTHLY, YEARLY
- `retention_policy`: How long to keep old partitions
- `partition_size_target`: Target 10-50GB per partition

**Query Optimization**
- `enable_partition_pruning = on`: Enable automatic partition elimination
- `constraint_exclusion = partition`: Constraint-based pruning (relevant for legacy inheritance setups)
- `enable_partitionwise_join = on`: Join matching partitions directly
- `enable_partitionwise_aggregate = on`: Aggregate per partition, then combine

## Best Practices

DO:
- Always include the partition key in WHERE clauses so pruning can apply
- Target 10-50GB per partition (not too large, not too small)
- Use RANGE partitioning for time-series data
- Use LIST partitioning for categorical data (regions, types)
- Use HASH partitioning for even distribution when there is no natural key
- Automate partition creation 3+ months ahead
- Monitor partition sizes and adjust the strategy if needed

DON'T:
- Create thousands of tiny partitions (overhead outweighs the benefit)
- Partition tables under 10GB (the overhead isn't justified)
- Use a partition key whose value changes over time
- Query without a partition key filter (scans all partitions)
- Forget to analyze partitions after bulk inserts
- Mix partition strategies without a clear reason

## Performance Considerations

- Partition pruning can reduce query time by 90%+ on large tables
- Each partition adds roughly 8KB of overhead in the PostgreSQL catalogs
- INSERT performance is essentially unchanged for single-row inserts
- Bulk INSERT benefits from partition-wise parallelism
- VACUUM and ANALYZE run faster on smaller partitions
- Index creation can be parallelized across partitions

## Related Commands

- `/database-migration-manager` - Schema migrations with partition support
- `/database-backup-automator` - Per-partition backup strategies
- `/database-index-advisor` - Optimize indexes for partitioned tables
- `/sql-query-optimizer` - Ensure queries leverage partition pruning

## Version History

- v1.0.0 (2024-10): Initial implementation with PostgreSQL declarative partitioning
- Planned v1.1.0: Add MySQL partitioning support and automated partition rebalancing

plugin.lock.json  (new file, 61 lines)
@@ -0,0 +1,61 @@
{
  "$schema": "internal://schemas/plugin.lock.v1.json",
  "pluginId": "gh:jeremylongshore/claude-code-plugins-plus:plugins/database/database-partition-manager",
  "normalized": {
    "repo": null,
    "ref": "refs/tags/v20251128.0",
    "commit": "2a8382c27b273b5f4979184ba11e568d6fd090a9",
    "treeHash": "3f23186dcea74e9d0ae9697d38d74a194932423609d843102db314265e79d527",
    "generatedAt": "2025-11-28T10:18:20.576169Z",
    "toolVersion": "publish_plugins.py@0.2.0"
  },
  "origin": {
    "remote": "git@github.com:zhongweili/42plugin-data.git",
    "branch": "master",
    "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
    "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
  },
  "manifest": {
    "name": "database-partition-manager",
    "description": "Database plugin for database-partition-manager",
    "version": "1.0.0"
  },
  "content": {
    "files": [
      {
        "path": "README.md",
        "sha256": "351ff86ccbf92248b616727002247bf41748b259834d6a9f3d9d44e3f607860a"
      },
      {
        "path": ".claude-plugin/plugin.json",
        "sha256": "884bb45ecafa3fa3c7c090fa2b2b5885358d33064a647abd7271fcf72d49c1fd"
      },
      {
        "path": "commands/partitioning.md",
        "sha256": "40b7317e9691a071f3d9de6194365e89a89e385a5bbb5ffaf939491f032cff29"
      },
      {
        "path": "skills/database-partition-manager/SKILL.md",
        "sha256": "4b60defabc444faffbe4b76c32a529d0e8302d8770759c54eb01de748d9f868b"
      },
      {
        "path": "skills/database-partition-manager/references/README.md",
        "sha256": "fb50b5bdac1570049fe2d5c97d14f51e06bed9a173c5917c379e9a00381618a7"
      },
      {
        "path": "skills/database-partition-manager/scripts/README.md",
        "sha256": "11d6d8c722ff734d7c4051167761151e90b48a734e3781ab859f02c884f17d59"
      },
      {
        "path": "skills/database-partition-manager/assets/README.md",
        "sha256": "6c87da015d41eac1c6626545ba8a2c7b0c8bbf5430e258ea10b0894ba5bd75cf"
      }
    ],
    "dirSha256": "3f23186dcea74e9d0ae9697d38d74a194932423609d843102db314265e79d527"
  },
  "security": {
    "scannedAt": null,
    "scannerVersion": null,
    "flags": []
  }
}

skills/database-partition-manager/SKILL.md  (new file, 56 lines)
@@ -0,0 +1,56 @@
---
name: managing-database-partitions
description: |
  This skill enables Claude to design, implement, and manage table partitioning strategies for large databases. It is triggered when the user needs to optimize query performance, manage time-series data, or reduce maintenance windows for tables exceeding 100GB. Use this skill when asked to "create database partitions", "optimize database queries with partitioning", "manage large database tables", or when the user mentions "partitioning strategy", "data archival", or uses the command `/partition`. The skill helps automate partition maintenance and data lifecycle management. It focuses on database best practices and production-ready implementations.
allowed-tools: Read, Write, Edit, Grep, Glob, Bash
version: 1.0.0
---

## Overview

This skill automates the design, implementation, and management of database table partitioning strategies. It helps optimize query performance, manage time-series data, and reduce maintenance windows for massive datasets.

## How It Works

1. **Analyze Requirements**: Claude analyzes the user's request to understand the specific partitioning needs, including data size, query patterns, and maintenance requirements.
2. **Design Partitioning Strategy**: Based on the analysis, Claude designs an appropriate partitioning strategy (e.g., range, list, hash) and determines the optimal partition key.
3. **Implement Partitioning**: Claude generates the necessary SQL scripts or configuration files to implement the partitioning strategy on the target database.
4. **Optimize Queries**: Claude provides guidance on optimizing queries to take advantage of the partitioning scheme, including suggestions for partition pruning and index creation.

## When to Use This Skill

This skill activates when you need to:
- Manage tables exceeding 100GB with slow query performance.
- Implement time-series data archival strategies (IoT, logs, metrics).
- Optimize queries that filter by date ranges or specific values.
- Reduce database maintenance windows.

## Examples

### Example 1: Optimizing Time-Series Data

User request: "Create database partitions for my IoT sensor data to improve query performance."

The skill will:
1. Analyze the data schema and query patterns for the IoT sensor data.
2. Design a range-based partitioning strategy using the timestamp column as the partition key.
3. Generate SQL scripts to create partitioned tables and indexes.

### Example 2: Managing a Large Order History Table

User request: "Implement table partitioning for my order history table to reduce the maintenance window."

The skill will:
1. Analyze the size and growth rate of the order history table.
2. Design a list-based partitioning strategy based on order status or region.
3. Generate SQL scripts to create partitioned tables and migrate existing data.

## Best Practices

- **Partition Key Selection**: Choose a partition key that is frequently used in queries and evenly distributes data across partitions.
- **Partition Size**: Determine the optimal partition size based on query patterns and storage capacity.
- **Maintenance**: Automate partition maintenance tasks, such as creating new partitions and archiving old ones.

## Integration

This skill can be integrated with other database management tools to monitor partition performance and manage data lifecycle. It can also work with data migration tools to move data efficiently between partitions.

skills/database-partition-manager/assets/README.md  (new file, 7 lines)
@@ -0,0 +1,7 @@
# Assets

Bundled resources for the database-partition-manager skill

- [ ] partition_template.sql - SQL template for creating partitions
- [ ] monitoring_dashboard.json - Sample dashboard configuration for monitoring partition performance (e.g., Grafana)
- [ ] example_data.csv - Example data set for testing partitioning strategies

skills/database-partition-manager/references/README.md  (new file, 8 lines)
@@ -0,0 +1,8 @@
# References

Bundled resources for the database-partition-manager skill

- [ ] partitioning_strategies.md - Detailed documentation on the partitioning strategies (range, list, hash), with pros and cons
- [ ] database_specific_syntax.md - Database-specific syntax and commands for implementing partitioning (e.g., PostgreSQL, MySQL, SQL Server)
- [ ] performance_tuning.md - Best practices for performance tuning after implementing partitioning
- [ ] monitoring_and_alerting.md - Guidelines for setting up monitoring and alerting on partition health and performance

skills/database-partition-manager/scripts/README.md  (new file, 7 lines)
@@ -0,0 +1,7 @@
# Scripts

Bundled resources for the database-partition-manager skill

- [ ] partition_design.py - Generates a partition design from table size, data types, and query patterns
- [ ] partition_implementation.py - Implements the partition design on the database
- [ ] partition_maintenance.py - Automates partition maintenance tasks such as merging, splitting, and archiving