--- description: Generate comprehensive database documentation from existing schemas shortcut: dbdocs --- # Database Documentation Generator Automatically generate comprehensive database documentation including ERD diagrams, table relationships, column descriptions, indexes, triggers, stored procedures, and interactive HTML documentation from your existing database schemas across multiple database engines. ## When to Use This Command Use `/db-docs` when you need to: - Document existing database schemas for team onboarding - Generate ERD diagrams for architectural reviews - Create data dictionaries for compliance requirements - Build interactive database documentation websites - Document relationships and foreign key constraints - Generate migration guides between schema versions - Create developer reference documentation - Export schema documentation for audits - Build database catalogs for data governance - Generate test data documentation DON'T use this when: - Database is still in early design phase (use `/design-schema` instead) - You only need query optimization docs (use `/optimize-query`) - Working with NoSQL databases without fixed schemas - Database has no meaningful structure to document ## Design Decisions This command implements **SchemaSpy with PlantUML** as the primary approach because: - Supports 30+ database types out of the box - Generates interactive HTML with JavaScript navigation - Creates high-quality vector ERD diagrams - Analyzes implicit relationships via naming conventions - Produces detailed table and column documentation - Open-source with active community support **Alternative considered: dbdocs.io** - Cloud-based with collaboration features - DBML (Database Markup Language) based - Requires internet connection - Recommended for team collaboration needs **Alternative considered: SQL Server Data Tools** - Excellent for SQL Server specifically - Visual Studio integration - Platform-specific limitation - Recommended for pure Microsoft stacks ## Prerequisites Before running this command: 1. Database connection credentials with schema read access 2. JDBC driver for your database type 3. Java Runtime Environment (JRE) 8+ 4. GraphViz installed for diagram generation 5. Sufficient disk space for generated documentation ## Implementation Process ### Step 1: Database Introspection Connect to database and extract complete schema metadata including tables, views, procedures, and relationships. ### Step 2: Relationship Analysis Identify explicit foreign keys and implicit relationships based on naming patterns and data types. ### Step 3: ERD Generation Create entity-relationship diagrams at multiple levels: full schema, per-module, and neighborhood views. ### Step 4: Documentation Generation Generate HTML documentation with search, filtering, and interactive navigation features. ### Step 5: Export and Publishing Package documentation for distribution via static hosting, PDF export, or integration with existing docs. ## Output Format The command generates: - `index.html` - Main documentation entry point - `diagrams/` - ERD diagrams in multiple formats - `schema.svg` - Complete database diagram - `relationships.png` - Relationship overview - `tables/*.svg` - Individual table diagrams - `tables/` - Individual table documentation - `columns.html` - Searchable column index - `constraints.html` - All constraints documentation - `routines.html` - Stored procedures/functions - `orphans.html` - Tables without relationships - `anomalies.html` - Schema issues and warnings - `metrics.json` - Database metrics and statistics - `README.md` - Quick reference guide ## Code Examples ### Example 1: Comprehensive Database Documentation Generator ```javascript // database-doc-generator.js const { exec } = require('child_process'); const fs = require('fs').promises; const path = require('path'); const mysql = require('mysql2/promise'); const { Client: PgClient } = require('pg'); const oracledb = require('oracledb'); const sql = require('mssql'); const ejs = require('ejs'); const marked = require('marked'); const plantuml = require('node-plantuml'); class DatabaseDocumentationGenerator { constructor(config) { this.config = { type: config.type || 'mysql', // mysql, postgresql, mssql, oracle host: config.host || 'localhost', port: config.port, database: config.database, username: config.username, password: config.password, outputDir: config.outputDir || './db-docs', includeViews: config.includeViews !== false, includeProcedures: config.includeProcedures !== false, includeIndexes: config.includeIndexes !== false, includeTriggers: config.includeTriggers !== false, generateERD: config.generateERD !== false, theme: config.theme || 'default', ...config }; this.tables = []; this.views = []; this.procedures = []; this.relationships = []; this.indexes = []; this.triggers = []; this.connection = null; } async connect() { switch (this.config.type) { case 'mysql': this.connection = await mysql.createConnection({ host: this.config.host, port: this.config.port || 3306, user: this.config.username, password: this.config.password, database: this.config.database }); break; case 'postgresql': this.connection = new PgClient({ host: this.config.host, port: this.config.port || 5432, user: this.config.username, password: this.config.password, database: this.config.database }); await this.connection.connect(); break; case 'mssql': this.connection = await sql.connect({ server: this.config.host, port: this.config.port || 1433, user: this.config.username, password: this.config.password, database: this.config.database, options: { encrypt: true, trustServerCertificate: true } }); break; case 'oracle': this.connection = await oracledb.getConnection({ user: this.config.username, password: this.config.password, connectString: `${this.config.host}:${this.config.port || 1521}/${this.config.database}` }); break; default: throw new Error(`Unsupported database type: ${this.config.type}`); } } async disconnect() { if (this.connection) { switch (this.config.type) { case 'mysql': await this.connection.end(); break; case 'postgresql': await this.connection.end(); break; case 'mssql': await this.connection.close(); break; case 'oracle': await this.connection.close(); break; } } } async extractSchema() { console.log('Extracting database schema...'); // Extract tables this.tables = await this.getTables(); // Extract columns for each table for (const table of this.tables) { table.columns = await this.getColumns(table.name); table.primaryKey = await this.getPrimaryKey(table.name); table.foreignKeys = await this.getForeignKeys(table.name); table.indexes = await this.getIndexes(table.name); table.triggers = await this.getTriggers(table.name); table.rowCount = await this.getRowCount(table.name); table.sizeInBytes = await this.getTableSize(table.name); // Add foreign keys to relationships this.relationships.push(...table.foreignKeys); } // Extract views if (this.config.includeViews) { this.views = await this.getViews(); for (const view of this.views) { view.columns = await this.getColumns(view.name, true); view.definition = await this.getViewDefinition(view.name); } } // Extract stored procedures if (this.config.includeProcedures) { this.procedures = await this.getProcedures(); for (const proc of this.procedures) { proc.parameters = await this.getProcedureParameters(proc.name); proc.definition = await this.getProcedureDefinition(proc.name); } } // Analyze implicit relationships this.analyzeImplicitRelationships(); // Calculate metrics this.metrics = this.calculateMetrics(); } async getTables() { let query; switch (this.config.type) { case 'mysql': query = ` SELECT TABLE_NAME as name, TABLE_COMMENT as comment, ENGINE as engine, TABLE_ROWS as estimated_rows, CREATE_TIME as created_at, UPDATE_TIME as updated_at FROM information_schema.TABLES WHERE TABLE_SCHEMA = ? AND TABLE_TYPE = 'BASE TABLE' ORDER BY TABLE_NAME `; break; case 'postgresql': query = ` SELECT tablename as name, obj_description(c.oid) as comment, pg_size_pretty(pg_total_relation_size(c.oid)) as size, n_live_tup as estimated_rows FROM pg_tables t JOIN pg_class c ON c.relname = t.tablename JOIN pg_stat_user_tables s ON s.relname = t.tablename WHERE schemaname = 'public' ORDER BY tablename `; break; case 'mssql': query = ` SELECT t.name, ep.value as comment, p.rows as estimated_rows, t.create_date as created_at, t.modify_date as updated_at FROM sys.tables t LEFT JOIN sys.extended_properties ep ON ep.major_id = t.object_id AND ep.minor_id = 0 LEFT JOIN sys.partitions p ON p.object_id = t.object_id AND p.index_id IN (0,1) WHERE t.is_ms_shipped = 0 ORDER BY t.name `; break; } const [rows] = await this.connection.execute(query, this.config.type === 'mysql' ? [this.config.database] : [] ); return rows; } async getColumns(tableName, isView = false) { let query; switch (this.config.type) { case 'mysql': query = ` SELECT COLUMN_NAME as name, DATA_TYPE as type, CHARACTER_MAXIMUM_LENGTH as max_length, IS_NULLABLE as nullable, COLUMN_DEFAULT as default_value, COLUMN_COMMENT as comment, EXTRA as extra, COLUMN_KEY as key_type FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ? ORDER BY ORDINAL_POSITION `; break; case 'postgresql': query = ` SELECT column_name as name, data_type as type, character_maximum_length as max_length, is_nullable as nullable, column_default as default_value, col_description(pgc.oid, a.attnum) as comment FROM information_schema.columns c JOIN pg_class pgc ON pgc.relname = c.table_name JOIN pg_attribute a ON a.attrelid = pgc.oid AND a.attname = c.column_name WHERE table_schema = 'public' AND table_name = $1 ORDER BY ordinal_position `; break; } const params = this.config.type === 'mysql' ? [this.config.database, tableName] : [tableName]; const [rows] = await this.connection.execute(query, params); return rows.map(row => ({ name: row.name, type: this.formatDataType(row), nullable: row.nullable === 'YES', default: row.default_value, comment: row.comment || '', isPrimaryKey: row.key_type === 'PRI', isForeignKey: row.key_type === 'MUL', isUnique: row.key_type === 'UNI', extra: row.extra || '' })); } formatDataType(column) { let type = column.type.toUpperCase(); if (column.max_length) { type += `(${column.max_length})`; } return type; } async getForeignKeys(tableName) { let query; switch (this.config.type) { case 'mysql': query = ` SELECT kcu.CONSTRAINT_NAME as name, kcu.COLUMN_NAME as column_name, kcu.REFERENCED_TABLE_NAME as referenced_table, kcu.REFERENCED_COLUMN_NAME as referenced_column, rc.UPDATE_RULE as update_rule, rc.DELETE_RULE as delete_rule FROM information_schema.KEY_COLUMN_USAGE kcu JOIN information_schema.REFERENTIAL_CONSTRAINTS rc ON kcu.CONSTRAINT_NAME = rc.CONSTRAINT_NAME WHERE kcu.TABLE_SCHEMA = ? AND kcu.TABLE_NAME = ? AND kcu.REFERENCED_TABLE_NAME IS NOT NULL `; break; case 'postgresql': query = ` SELECT tc.constraint_name as name, kcu.column_name, ccu.table_name AS referenced_table, ccu.column_name AS referenced_column, rc.update_rule, rc.delete_rule FROM information_schema.table_constraints AS tc JOIN information_schema.key_column_usage AS kcu ON tc.constraint_name = kcu.constraint_name JOIN information_schema.constraint_column_usage AS ccu ON ccu.constraint_name = tc.constraint_name JOIN information_schema.referential_constraints rc ON rc.constraint_name = tc.constraint_name WHERE tc.table_name = $1 AND tc.constraint_type = 'FOREIGN KEY' `; break; } const params = this.config.type === 'mysql' ? [this.config.database, tableName] : [tableName]; const [rows] = await this.connection.execute(query, params); return rows.map(row => ({ name: row.name, sourceTable: tableName, sourceColumn: row.column_name, targetTable: row.referenced_table, targetColumn: row.referenced_column, onUpdate: row.update_rule, onDelete: row.delete_rule })); } analyzeImplicitRelationships() { // Find implicit relationships based on naming conventions console.log('Analyzing implicit relationships...'); const implicitRelationships = []; for (const table of this.tables) { for (const column of table.columns) { // Check for common foreign key naming patterns const patterns = [ /^(.+)_id$/i, /^id_(.+)$/i, /^fk_(.+)$/i, /^ref_(.+)$/i ]; for (const pattern of patterns) { const match = column.name.match(pattern); if (match) { const potentialTable = match[1]; // Check if referenced table exists const referencedTable = this.tables.find(t => t.name.toLowerCase() === potentialTable.toLowerCase() || t.name.toLowerCase() === potentialTable.toLowerCase() + 's' || t.name.toLowerCase() === potentialTable.toLowerCase().replace(/_/g, '') ); if (referencedTable && !this.isExistingRelationship(table.name, column.name)) { implicitRelationships.push({ name: `implicit_${table.name}_${column.name}`, sourceTable: table.name, sourceColumn: column.name, targetTable: referencedTable.name, targetColumn: 'id', type: 'implicit', confidence: 0.8 }); } } } } } this.relationships.push(...implicitRelationships); } isExistingRelationship(tableName, columnName) { return this.relationships.some(r => r.sourceTable === tableName && r.sourceColumn === columnName ); } async generateERD() { console.log('Generating Entity Relationship Diagram...'); const plantUmlCode = this.generatePlantUMLCode(); // Save PlantUML source await fs.writeFile( path.join(this.config.outputDir, 'diagrams', 'schema.puml'), plantUmlCode ); // Generate SVG diagram return new Promise((resolve, reject) => { const gen = plantuml.generate(plantUmlCode, { format: 'svg' }); const chunks = []; gen.out.on('data', chunk => chunks.push(chunk)); gen.out.on('end', async () => { const svg = Buffer.concat(chunks).toString(); await fs.writeFile( path.join(this.config.outputDir, 'diagrams', 'schema.svg'), svg ); resolve(); }); gen.out.on('error', reject); }); } generatePlantUMLCode() { let puml = '@startuml Database Schema\n'; puml += '!define Table(name,desc) class name as "desc" << (T,#FFAAAA) >>\n'; puml += '!define View(name,desc) class name as "desc" << (V,#AAFFAA) >>\n'; puml += '!define primary_key(x) PK: x\n'; puml += '!define foreign_key(x) FK: x\n'; puml += '!define unique(x) x\n'; puml += '\n'; // Add tables for (const table of this.tables) { puml += `Table(${table.name}, "${table.name}") {\n`; // Add columns for (const column of table.columns) { let columnDef = ` ${column.name}: ${column.type}`; if (column.isPrimaryKey) { columnDef = ` primary_key(${column.name}): ${column.type}`; } else if (column.isForeignKey) { columnDef = ` foreign_key(${column.name}): ${column.type}`; } else if (column.isUnique) { columnDef = ` unique(${column.name}): ${column.type}`; } if (!column.nullable) { columnDef += ' NOT NULL'; } puml += columnDef + '\n'; } puml += '}\n\n'; } // Add views for (const view of this.views) { puml += `View(${view.name}, "${view.name}") {\n`; for (const column of view.columns) { puml += ` ${column.name}: ${column.type}\n`; } puml += '}\n\n'; } // Add relationships for (const rel of this.relationships) { const relType = rel.type === 'implicit' ? '..' : '--'; const label = rel.type === 'implicit' ? 'implicit' : ''; puml += `${rel.sourceTable} "${rel.sourceColumn}" ${relType}> "${rel.targetColumn}" ${rel.targetTable} : ${label}\n`; } puml += '@enduml\n'; return puml; } async generateHTML() { console.log('Generating HTML documentation...'); const template = await fs.readFile( path.join(__dirname, 'templates', 'index.ejs'), 'utf-8' ); const html = ejs.render(template, { database: this.config.database, tables: this.tables, views: this.views, procedures: this.procedures, relationships: this.relationships, metrics: this.metrics, generatedAt: new Date().toISOString() }); await fs.writeFile( path.join(this.config.outputDir, 'index.html'), html ); // Generate individual table pages for (const table of this.tables) { await this.generateTablePage(table); } } async generateTablePage(table) { const template = ` ${table.name} - Database Documentation

${table.name}

${table.comment || 'No description available'}

Rows: ${table.rowCount || 'Unknown'} Size: ${table.sizeInBytes ? this.formatBytes(table.sizeInBytes) : 'Unknown'} Engine: ${table.engine || 'N/A'}

Columns

${table.columns.map(col => ` `).join('')}
Name Type Nullable Default Key Comment
${col.name} ${col.type} ${col.nullable ? 'YES' : 'NO'} ${col.default || 'NULL'} ${col.isPrimaryKey ? 'PK' : ''} ${col.isForeignKey ? 'FK' : ''} ${col.isUnique ? 'UQ' : ''} ${col.comment}
${table.foreignKeys.length > 0 ? `

Foreign Keys

${table.foreignKeys.map(fk => ` `).join('')}
Name Column References On Update On Delete
${fk.name} ${fk.sourceColumn} ${fk.targetTable}.${fk.targetColumn} ${fk.onUpdate} ${fk.onDelete}
` : ''} ${table.indexes.length > 0 ? `

Indexes

${table.indexes.map(idx => ` `).join('')}
Name Columns Type Unique
${idx.name} ${idx.columns.join(', ')} ${idx.type} ${idx.unique ? 'YES' : 'NO'}
` : ''}

Sample Queries

-- Select all records
SELECT * FROM ${table.name};

-- Select with joins
${this.generateSampleJoinQuery(table)}

-- Insert record
INSERT INTO ${table.name} (${table.columns.filter(c => !c.extra.includes('auto_increment')).map(c => c.name).join(', ')})
VALUES (${table.columns.filter(c => !c.extra.includes('auto_increment')).map(c => this.getSampleValue(c)).join(', ')});

-- Update record
UPDATE ${table.name}
SET ${table.columns.filter(c => !c.isPrimaryKey).slice(0, 2).map(c => `${c.name} = ${this.getSampleValue(c)}`).join(',\n    ')}
WHERE ${table.primaryKey ? `${table.primaryKey} = 1` : 'condition'};
`; await fs.writeFile( path.join(this.config.outputDir, 'tables', `${table.name}.html`), template ); } generateSampleJoinQuery(table) { if (table.foreignKeys.length === 0) { return `SELECT * FROM ${table.name} LIMIT 10;`; } const joins = table.foreignKeys.map(fk => `LEFT JOIN ${fk.targetTable} ON ${table.name}.${fk.sourceColumn} = ${fk.targetTable}.${fk.targetColumn}` ).join('\n'); return `SELECT ${table.name}.*, ${table.foreignKeys.map(fk => `${fk.targetTable}.*`).join(',\n ')} FROM ${table.name} ${joins} LIMIT 10;`; } getSampleValue(column) { if (column.default) return column.default; const type = column.type.toLowerCase(); if (type.includes('int')) return '1'; if (type.includes('decimal') || type.includes('float')) return '0.00'; if (type.includes('date')) return "'2024-01-01'"; if (type.includes('time')) return "'2024-01-01 00:00:00'"; if (type.includes('bool')) return 'TRUE'; if (type.includes('json')) return "'{}'"; return "'value'"; } calculateMetrics() { return { totalTables: this.tables.length, totalViews: this.views.length, totalColumns: this.tables.reduce((sum, t) => sum + t.columns.length, 0), totalRelationships: this.relationships.length, totalIndexes: this.tables.reduce((sum, t) => sum + (t.indexes?.length || 0), 0), totalProcedures: this.procedures.length, totalRows: this.tables.reduce((sum, t) => sum + (t.rowCount || 0), 0), orphanTables: this.tables.filter(t => !this.relationships.some(r => r.sourceTable === t.name || r.targetTable === t.name) ).length, tablesWithoutPK: this.tables.filter(t => !t.primaryKey).length, avgColumnsPerTable: Math.round( this.tables.reduce((sum, t) => sum + t.columns.length, 0) / this.tables.length ), avgRelationshipsPerTable: Math.round( this.relationships.length / this.tables.length * 10 ) / 10 }; } formatBytes(bytes) { const sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB']; if (bytes === 0) return '0 Bytes'; const i = parseInt(Math.floor(Math.log(bytes) / Math.log(1024))); return Math.round(bytes / Math.pow(1024, i) * 100) / 100 + ' ' + sizes[i]; } async generateMarkdown() { console.log('Generating Markdown documentation...'); let markdown = `# Database Documentation: ${this.config.database}\n\n`; markdown += `Generated: ${new Date().toISOString()}\n\n`; // Add metrics markdown += '## Database Metrics\n\n'; markdown += `- **Total Tables**: ${this.metrics.totalTables}\n`; markdown += `- **Total Views**: ${this.metrics.totalViews}\n`; markdown += `- **Total Columns**: ${this.metrics.totalColumns}\n`; markdown += `- **Total Relationships**: ${this.metrics.totalRelationships}\n`; markdown += `- **Total Indexes**: ${this.metrics.totalIndexes}\n`; markdown += `- **Orphan Tables**: ${this.metrics.orphanTables}\n`; markdown += `- **Tables without Primary Key**: ${this.metrics.tablesWithoutPK}\n\n`; // Add table documentation markdown += '## Tables\n\n'; for (const table of this.tables) { markdown += `### ${table.name}\n\n`; if (table.comment) { markdown += `${table.comment}\n\n`; } markdown += '#### Columns\n\n'; markdown += '| Name | Type | Nullable | Default | Key | Comment |\n'; markdown += '|------|------|----------|---------|-----|----------|\n'; for (const column of table.columns) { const keys = []; if (column.isPrimaryKey) keys.push('PK'); if (column.isForeignKey) keys.push('FK'); if (column.isUnique) keys.push('UQ'); markdown += `| ${column.name} | ${column.type} | ${column.nullable ? 'YES' : 'NO'} | ${column.default || 'NULL'} | ${keys.join(', ')} | ${column.comment} |\n`; } markdown += '\n'; if (table.foreignKeys.length > 0) { markdown += '#### Foreign Keys\n\n'; markdown += '| Constraint | Column | References | On Update | On Delete |\n'; markdown += '|------------|--------|------------|-----------|----------|\n'; for (const fk of table.foreignKeys) { markdown += `| ${fk.name} | ${fk.sourceColumn} | ${fk.targetTable}.${fk.targetColumn} | ${fk.onUpdate} | ${fk.onDelete} |\n`; } markdown += '\n'; } } await fs.writeFile( path.join(this.config.outputDir, 'README.md'), markdown ); } async generateDataDictionary() { console.log('Generating data dictionary...'); const dictionary = { database: this.config.database, generated: new Date().toISOString(), metrics: this.metrics, tables: this.tables.map(table => ({ name: table.name, comment: table.comment, columns: table.columns.map(col => ({ name: col.name, type: col.type, nullable: col.nullable, default: col.default, comment: col.comment, constraints: { primaryKey: col.isPrimaryKey, foreignKey: col.isForeignKey, unique: col.isUnique } })), relationships: { parents: this.relationships.filter(r => r.sourceTable === table.name), children: this.relationships.filter(r => r.targetTable === table.name) } })) }; await fs.writeFile( path.join(this.config.outputDir, 'data-dictionary.json'), JSON.stringify(dictionary, null, 2) ); } async generate() { try { // Create output directories await fs.mkdir(this.config.outputDir, { recursive: true }); await fs.mkdir(path.join(this.config.outputDir, 'diagrams'), { recursive: true }); await fs.mkdir(path.join(this.config.outputDir, 'tables'), { recursive: true }); await fs.mkdir(path.join(this.config.outputDir, 'assets'), { recursive: true }); // Connect to database await this.connect(); // Extract schema information await this.extractSchema(); // Generate outputs if (this.config.generateERD) { await this.generateERD(); } await this.generateHTML(); await this.generateMarkdown(); await this.generateDataDictionary(); await this.generateAssets(); console.log(`Documentation generated successfully in ${this.config.outputDir}`); } catch (error) { console.error('Error generating documentation:', error); throw error; } finally { await this.disconnect(); } } async generateAssets() { // Generate CSS const css = ` body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif; line-height: 1.6; color: #333; max-width: 1200px; margin: 0 auto; padding: 20px; } h1, h2, h3 { color: #2c3e50; border-bottom: 2px solid #ecf0f1; padding-bottom: 10px; } .container { background: white; padding: 20px; border-radius: 8px; box-shadow: 0 2px 4px rgba(0,0,0,0.1); } .metadata { background: #f8f9fa; padding: 10px; border-radius: 4px; margin: 20px 0; } .metadata span { margin-right: 20px; font-size: 14px; color: #666; } .data-table { width: 100%; border-collapse: collapse; margin: 20px 0; } .data-table th { background: #3498db; color: white; text-align: left; padding: 10px; } .data-table td { padding: 8px; border-bottom: 1px solid #ddd; } .data-table tr:hover { background: #f5f5f5; } .column-name { font-weight: bold; color: #2c3e50; } .data-type { color: #e74c3c; font-family: 'Courier New', monospace; } .key { display: inline-block; padding: 2px 6px; border-radius: 3px; font-size: 11px; font-weight: bold; margin-right: 4px; } .key.pk { background: #f39c12; color: white; } .key.fk { background: #3498db; color: white; } .key.uk { background: #9b59b6; color: white; } pre { background: #f4f4f4; border: 1px solid #ddd; border-radius: 4px; padding: 15px; overflow-x: auto; } code.sql { color: #2c3e50; font-family: 'Courier New', monospace; } a { color: #3498db; text-decoration: none; } a:hover { text-decoration: underline; } `; await fs.writeFile( path.join(this.config.outputDir, 'assets', 'style.css'), css ); } } // Usage const generator = new DatabaseDocumentationGenerator({ type: 'mysql', host: 'localhost', database: 'ecommerce', username: 'root', password: 'password', outputDir: './database-docs', generateERD: true, includeViews: true, includeProcedures: true }); generator.generate() .then(() => console.log('Documentation generation complete')) .catch(error => console.error('Documentation generation failed:', error)); module.exports = DatabaseDocumentationGenerator; ``` ### Example 2: Python Database Documentation Generator with SchemaSpy Integration ```python # database_doc_generator.py import os import json import subprocess import sqlite3 import psycopg2 import mysql.connector import pymongo from pathlib import Path from typing import Dict, List, Any, Optional from datetime import datetime from jinja2 import Template import graphviz import yaml from dataclasses import dataclass, asdict import pandas as pd @dataclass class Column: name: str data_type: str nullable: bool default: Optional[str] is_primary_key: bool = False is_foreign_key: bool = False is_unique: bool = False comment: Optional[str] = None @dataclass class ForeignKey: name: str source_table: str source_column: str target_table: str target_column: str on_update: str on_delete: str @dataclass class Table: name: str columns: List[Column] primary_keys: List[str] foreign_keys: List[ForeignKey] indexes: List[Dict[str, Any]] comment: Optional[str] = None row_count: Optional[int] = None size_bytes: Optional[int] = None class DatabaseDocGenerator: """Generate comprehensive database documentation""" def __init__(self, config: Dict[str, Any]): self.config = config self.connection = None self.tables: List[Table] = [] self.views: List[Dict[str, Any]] = [] self.procedures: List[Dict[str, Any]] = [] self.relationships: List[ForeignKey] = [] self.metrics: Dict[str, Any] = {} def connect(self): """Connect to database based on type""" db_type = self.config['type'] if db_type == 'postgresql': self.connection = psycopg2.connect( host=self.config['host'], port=self.config.get('port', 5432), database=self.config['database'], user=self.config['username'], password=self.config['password'] ) elif db_type == 'mysql': self.connection = mysql.connector.connect( host=self.config['host'], port=self.config.get('port', 3306), database=self.config['database'], user=self.config['username'], password=self.config['password'] ) elif db_type == 'sqlite': self.connection = sqlite3.connect(self.config['database']) elif db_type == 'mongodb': from pymongo import MongoClient self.connection = MongoClient( host=self.config['host'], port=self.config.get('port', 27017) )[self.config['database']] else: raise ValueError(f"Unsupported database type: {db_type}") def extract_schema(self): """Extract complete database schema""" print("Extracting database schema...") if self.config['type'] == 'mongodb': self._extract_mongodb_schema() else: self._extract_sql_schema() self._analyze_implicit_relationships() self._calculate_metrics() def _extract_sql_schema(self): """Extract schema for SQL databases""" cursor = self.connection.cursor() # Get tables if self.config['type'] == 'postgresql': cursor.execute(""" SELECT tablename, obj_description(c.oid) FROM pg_tables t JOIN pg_class c ON c.relname = t.tablename WHERE schemaname = 'public' ORDER BY tablename """) elif self.config['type'] == 'mysql': cursor.execute(""" SELECT TABLE_NAME, TABLE_COMMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA = %s AND TABLE_TYPE = 'BASE TABLE' ORDER BY TABLE_NAME """, (self.config['database'],)) elif self.config['type'] == 'sqlite': cursor.execute(""" SELECT name, sql FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name """) tables_data = cursor.fetchall() for table_name, comment in tables_data: columns = self._get_columns(cursor, table_name) foreign_keys = self._get_foreign_keys(cursor, table_name) indexes = self._get_indexes(cursor, table_name) primary_keys = [col.name for col in columns if col.is_primary_key] table = Table( name=table_name, columns=columns, primary_keys=primary_keys, foreign_keys=foreign_keys, indexes=indexes, comment=comment ) # Get table statistics try: cursor.execute(f"SELECT COUNT(*) FROM {table_name}") table.row_count = cursor.fetchone()[0] except: pass self.tables.append(table) self.relationships.extend(foreign_keys) def _get_columns(self, cursor, table_name: str) -> List[Column]: """Get columns for a table""" columns = [] if self.config['type'] == 'postgresql': cursor.execute(""" SELECT column_name, data_type, is_nullable, column_default, col_description(pgc.oid, a.attnum) as comment FROM information_schema.columns c JOIN pg_class pgc ON pgc.relname = c.table_name JOIN pg_attribute a ON a.attrelid = pgc.oid AND a.attname = c.column_name WHERE table_schema = 'public' AND table_name = %s ORDER BY ordinal_position """, (table_name,)) elif self.config['type'] == 'mysql': cursor.execute(""" SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE, COLUMN_DEFAULT, COLUMN_COMMENT, COLUMN_KEY FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s ORDER BY ORDINAL_POSITION """, (self.config['database'], table_name)) elif self.config['type'] == 'sqlite': cursor.execute(f"PRAGMA table_info({table_name})") for row in cursor.fetchall(): if self.config['type'] == 'sqlite': column = Column( name=row[1], data_type=row[2], nullable=not row[3], default=row[4], is_primary_key=bool(row[5]) ) else: column = Column( name=row[0], data_type=row[1], nullable=row[2] == 'YES', default=row[3], comment=row[4] if len(row) > 4 else None ) if self.config['type'] == 'mysql' and len(row) > 5: column.is_primary_key = row[5] == 'PRI' column.is_foreign_key = row[5] == 'MUL' column.is_unique = row[5] == 'UNI' columns.append(column) return columns def _get_foreign_keys(self, cursor, table_name: str) -> List[ForeignKey]: """Get foreign keys for a table""" foreign_keys = [] if self.config['type'] == 'postgresql': cursor.execute(""" SELECT tc.constraint_name, kcu.column_name, ccu.table_name AS referenced_table, ccu.column_name AS referenced_column, rc.update_rule, rc.delete_rule FROM information_schema.table_constraints AS tc JOIN information_schema.key_column_usage AS kcu ON tc.constraint_name = kcu.constraint_name JOIN information_schema.constraint_column_usage AS ccu ON ccu.constraint_name = tc.constraint_name JOIN information_schema.referential_constraints rc ON rc.constraint_name = tc.constraint_name WHERE tc.table_name = %s AND tc.constraint_type = 'FOREIGN KEY' """, (table_name,)) elif self.config['type'] == 'mysql': cursor.execute(""" SELECT kcu.CONSTRAINT_NAME, kcu.COLUMN_NAME, kcu.REFERENCED_TABLE_NAME, kcu.REFERENCED_COLUMN_NAME, rc.UPDATE_RULE, rc.DELETE_RULE FROM information_schema.KEY_COLUMN_USAGE kcu JOIN information_schema.REFERENTIAL_CONSTRAINTS rc ON kcu.CONSTRAINT_NAME = rc.CONSTRAINT_NAME WHERE kcu.TABLE_SCHEMA = %s AND kcu.TABLE_NAME = %s AND kcu.REFERENCED_TABLE_NAME IS NOT NULL """, (self.config['database'], table_name)) elif self.config['type'] == 'sqlite': cursor.execute(f"PRAGMA foreign_key_list({table_name})") for row in cursor.fetchall(): if self.config['type'] == 'sqlite': fk = ForeignKey( name=f"fk_{table_name}_{row[0]}", source_table=table_name, source_column=row[3], target_table=row[2], target_column=row[4], on_update=row[5], on_delete=row[6] ) else: fk = ForeignKey( name=row[0], source_table=table_name, source_column=row[1], target_table=row[2], target_column=row[3], on_update=row[4], on_delete=row[5] ) foreign_keys.append(fk) return foreign_keys def _get_indexes(self, cursor, table_name: str) -> List[Dict[str, Any]]: """Get indexes for a table""" indexes = [] if self.config['type'] == 'postgresql': cursor.execute(""" SELECT indexname, indexdef FROM pg_indexes WHERE tablename = %s """, (table_name,)) elif self.config['type'] == 'mysql': cursor.execute(""" SELECT INDEX_NAME, GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) as columns, INDEX_TYPE, NON_UNIQUE FROM information_schema.STATISTICS WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s GROUP BY INDEX_NAME, INDEX_TYPE, NON_UNIQUE """, (self.config['database'], table_name)) elif self.config['type'] == 'sqlite': cursor.execute(f"PRAGMA index_list({table_name})") for row in cursor.fetchall(): if self.config['type'] == 'sqlite': index = { 'name': row[1], 'unique': bool(row[2]) } else: index = { 'name': row[0], 'definition': row[1] if self.config['type'] == 'postgresql' else None, 'columns': row[1].split(',') if self.config['type'] == 'mysql' else [], 'type': row[2] if self.config['type'] == 'mysql' else None, 'unique': not row[3] if self.config['type'] == 'mysql' else False } indexes.append(index) return indexes def _analyze_implicit_relationships(self): """Find implicit relationships based on naming conventions""" print("Analyzing implicit relationships...") patterns = [ r'(.+)_id$', r'id_(.+)$', r'fk_(.+)$', r'(.+)_fk$' ] for table in self.tables: for column in table.columns: if column.is_foreign_key: continue for pattern in patterns: import re match = re.match(pattern, column.name, re.IGNORECASE) if match: potential_table = match.group(1) # Check variations variations = [ potential_table, potential_table + 's', potential_table[:-1] if potential_table.endswith('s') else potential_table, potential_table.replace('_', '') ] for var in variations: target_table = next( (t for t in self.tables if t.name.lower() == var.lower()), None ) if target_table: # Create implicit relationship fk = ForeignKey( name=f"implicit_{table.name}_{column.name}", source_table=table.name, source_column=column.name, target_table=target_table.name, target_column='id', on_update='NO ACTION', on_delete='NO ACTION' ) self.relationships.append(fk) break def generate_erd(self): """Generate Entity Relationship Diagram""" print("Generating ERD...") dot = graphviz.Digraph(comment='Database Schema', format='svg') dot.attr(rankdir='TB', splines='ortho', nodesep='0.5') # Add tables as nodes for table in self.tables: label = f"<\n" label += f"\n" for column in table.columns[:10]: # Limit to first 10 columns bgcolor = '' if column.is_primary_key: bgcolor = " bgcolor='yellow'" elif column.is_foreign_key: bgcolor = " bgcolor='lightgreen'" label += f"{column.name}" label += f"\n" if len(table.columns) > 10: label += f"\n" label += "
{table.name}
{column.data_type}
... {len(table.columns) - 10} more columns
>" dot.node(table.name, label=label, shape='plaintext') # Add relationships as edges for rel in self.relationships: style = 'solid' if not rel.name.startswith('implicit') else 'dashed' dot.edge( rel.source_table, rel.target_table, label=f"{rel.source_column}→{rel.target_column}", style=style ) # Save diagram output_dir = Path(self.config['output_dir']) output_dir.mkdir(parents=True, exist_ok=True) dot.render(output_dir / 'diagrams' / 'schema', cleanup=True) def generate_html(self): """Generate HTML documentation""" print("Generating HTML documentation...") template = Template(""" {{ database }} - Database Documentation

Database Documentation: {{ database }}

Generated: {{ generated_at }}

{{ metrics.total_tables }}
Tables
{{ metrics.total_columns }}
Columns
{{ metrics.total_relationships }}
Relationships
{{ metrics.total_indexes }}
Indexes

Table of Contents

{% for table in tables %}

{{ table.name }}

{% if table.comment %}

{{ table.comment }}

{% endif %}

Columns

{% for column in table.columns %} {% endfor %}
Name Type Nullable Default Key Comment
{{ column.name }} {{ column.data_type }} {{ 'YES' if column.nullable else 'NO' }} {{ column.default or 'NULL' }} {% if column.is_primary_key %}PK{% endif %} {% if column.is_foreign_key %}FK{% endif %} {% if column.is_unique %}UQ{% endif %} {{ column.comment or '' }}
{% if table.foreign_keys %}

Foreign Keys

{% for fk in table.foreign_keys %} {% endfor %}
Name Column References On Update On Delete
{{ fk.name }} {{ fk.source_column }} {{ fk.target_table }}.{{ fk.target_column }} {{ fk.on_update }} {{ fk.on_delete }}
{% endif %} {% if table.indexes %}

Indexes

{% for idx in table.indexes %} {% endfor %}
Name Unique Definition
{{ idx.name }} {{ 'YES' if idx.unique else 'NO' }} {{ idx.definition or idx.columns|join(', ') if idx.columns else '' }}
{% endif %} {% endfor %} """) html = template.render( database=self.config['database'], generated_at=datetime.now().isoformat(), metrics=self.metrics, tables=self.tables ) output_path = Path(self.config['output_dir']) / 'index.html' output_path.parent.mkdir(parents=True, exist_ok=True) with open(output_path, 'w') as f: f.write(html) def _calculate_metrics(self): """Calculate database metrics""" self.metrics = { 'total_tables': len(self.tables), 'total_columns': sum(len(t.columns) for t in self.tables), 'total_relationships': len(self.relationships), 'total_indexes': sum(len(t.indexes) for t in self.tables), 'orphan_tables': len([ t for t in self.tables if not any( r.source_table == t.name or r.target_table == t.name for r in self.relationships ) ]), 'tables_without_pk': len([ t for t in self.tables if not t.primary_keys ]) } def run_schemaspy(self): """Run SchemaSpy for additional documentation""" print("Running SchemaSpy...") output_dir = Path(self.config['output_dir']) / 'schemaspy' output_dir.mkdir(parents=True, exist_ok=True) cmd = [ 'java', '-jar', 'schemaspy.jar', '-t', self.config['type'], '-host', self.config['host'], '-db', self.config['database'], '-u', self.config['username'], '-p', self.config['password'], '-o', str(output_dir), '-dp', self.config.get('driver_path', '.'), '-vizjs' ] try: subprocess.run(cmd, check=True) print(f"SchemaSpy documentation generated in {output_dir}") except subprocess.CalledProcessError as e: print(f"SchemaSpy failed: {e}") except FileNotFoundError: print("SchemaSpy not found. Download from http://schemaspy.org") def generate(self): """Generate complete documentation""" try: self.connect() self.extract_schema() self.generate_erd() self.generate_html() if self.config.get('use_schemaspy', False): self.run_schemaspy() print(f"Documentation generated in {self.config['output_dir']}") finally: if self.connection: self.connection.close() # Usage if __name__ == "__main__": config = { 'type': 'postgresql', 'host': 'localhost', 'port': 5432, 'database': 'myapp', 'username': 'user', 'password': 'password', 'output_dir': './db-docs', 'use_schemaspy': True } generator = DatabaseDocGenerator(config) generator.generate() ``` ## Error Handling | Error | Cause | Solution | |-------|-------|----------| | "Access denied to schema" | Insufficient database permissions | Grant SELECT permission on information_schema | | "GraphViz not found" | Missing diagram generator | Install GraphViz: `apt-get install graphviz` | | "Out of memory" | Large database schema | Increase Java heap size for SchemaSpy | | "Connection timeout" | Database unreachable | Check network and firewall settings | | "Unknown column type" | Custom database types | Add type mapping in configuration | ## Configuration Options **Basic Usage:** ```bash /db-docs \ --type=postgresql \ --host=localhost \ --database=myapp \ --output=./docs ``` **Available Options:** `--type ` - Database type - `mysql` - MySQL/MariaDB - `postgresql` - PostgreSQL - `mssql` - SQL Server - `oracle` - Oracle Database - `sqlite` - SQLite - `mongodb` - MongoDB `--include-views` - Document database views - Default: `true` `--include-procedures` - Document stored procedures - Default: `true` `--generate-erd` - Create entity relationship diagrams - Default: `true` `--analyze-implicit` - Find implicit relationships - Default: `true` `--output-format ` - Documentation format - `html` - Interactive HTML (default) - `markdown` - Markdown files - `pdf` - PDF document - `json` - JSON data dictionary - `all` - All formats `--theme ` - HTML documentation theme - `default` - Standard theme - `dark` - Dark mode - `minimal` - Minimal design - `corporate` - Professional style ## Best Practices DO: - Run documentation generation in CI/CD pipeline - Version control your database documentation - Include business context in table/column comments - Document implicit relationships and business rules - Generate documentation after each schema migration - Review orphan tables and missing relationships DON'T: - Expose production credentials in documentation - Include sensitive data in examples - Skip documenting views and procedures - Ignore schema anomalies and warnings - Generate documentation from development databases only ## Performance Considerations - Large schemas (1000+ tables) may require increased memory - ERD generation can be slow for complex relationships - Consider splitting documentation by schema/module - Use caching for repeated documentation generations - Limit diagram complexity for better readability ## Security Considerations - Use read-only database credentials - Sanitize any example data in documentation - Don't include connection strings in generated docs - Review documentation for accidentally exposed sensitive column names - Implement access control for documentation hosting ## Related Commands - `/design-schema` - Design new database schemas - `/optimize-query` - Optimize database queries - `/migration` - Manage database migrations - `/backup` - Backup database - `/seed-data` - Generate test data ## Version History - v1.0.0 (2024-10): Initial implementation with multi-database support - Planned v1.1.0: Add MongoDB and graph database support