Initial commit

Zhongwei Li
2025-11-30 08:38:26 +08:00
commit 41d9f6b189
304 changed files with 98322 additions and 0 deletions

{
"criteria": [
{
"name": "Entity Identification & Completeness",
"description": "Are all domain entities identified, each with a clear purpose, distinct identity, and no redundancy?",
"scoring": {
"1": "Missing critical entities. Entities poorly defined or overlapping. No clear distinction between entities and attributes.",
"2": "Some entities identified but gaps in coverage. Some entity purposes unclear. Minor redundancy.",
"3": "Most entities identified with clear purposes. Reasonable coverage. Entities generally distinct.",
"4": "All required entities identified with clear, documented purposes. Good examples provided. No redundancy.",
"5": "Complete entity coverage validated against all use cases. Each entity has purpose, examples, lifecycle documented. Entity vs value object distinction clear. No overlap or redundancy."
}
},
{
"name": "Attribute Definition Quality",
"description": "Are attributes complete with appropriate data types, nullability, and constraints?",
"scoring": {
"1": "Attributes missing or poorly typed. Wrong data types (e.g., money as VARCHAR). Nullability ignored.",
"2": "Basic attributes present but some types questionable. Nullability inconsistent. Some constraints missing.",
"3": "Attributes defined with reasonable types. Nullability specified. Core constraints present.",
"4": "All attributes well-typed (DECIMAL for money, proper VARCHAR lengths). Nullability correctly specified. Constraints documented.",
"5": "Comprehensive attribute definitions with justification for types, nullability, defaults, and constraints. Audit fields (createdAt, updatedAt) included where appropriate. No technical debt."
}
},
{
"name": "Relationship Modeling Accuracy",
"description": "Are relationships correctly identified with proper cardinality, optionality, and implementation?",
"scoring": {
"1": "Relationships missing or incorrect. Cardinality wrong. M:N modeled without junction table.",
"2": "Some relationships identified but cardinality questionable. Missing junction tables or unclear optionality.",
"3": "Most relationships mapped with cardinality. Junction tables for M:N. Optionality specified.",
"4": "All relationships correctly modeled. Proper cardinality (1:1, 1:N, M:N). Junction tables where needed. Clear optionality.",
"5": "Comprehensive relationship documentation with bidirectional naming, implementation details (FKs, ON DELETE actions), and validation that relationships support all use cases. Complex patterns (polymorphic, hierarchical) correctly handled."
}
},
{
"name": "Constraint & Invariant Specification",
"description": "Are business rules enforced via constraints? Are domain invariants documented and validated?",
"scoring": {
"1": "No constraints beyond primary keys. Business rules not documented. Invariants missing.",
"2": "Basic constraints (NOT NULL, UNIQUE) present but business rules not enforced. Invariants mentioned but not validated.",
"3": "Good constraint coverage (PK, FK, UNIQUE, NOT NULL). Some business rules enforced. Invariants documented.",
"4": "Comprehensive constraints including CHECK constraints for business rules. All invariants documented with enforcement strategy.",
"5": "All constraints documented with rationale. Domain invariants clearly stated and enforced via DB constraints where possible, application logic where not. Validation strategy for complex multi-table invariants. Examples of enforcement code provided."
}
},
{
"name": "Normalization & Data Integrity",
"description": "Is schema properly normalized (or deliberately denormalized with rationale)?",
"scoring": {
"1": "Severe normalization violations. Redundant data. Update anomalies likely.",
"2": "Some normalization but violations present (partial or transitive dependencies). Some redundancy.",
"3": "Generally normalized to 2NF-3NF. Minimal redundancy. Rationale for exceptions provided.",
"4": "Proper normalization to 3NF. Any denormalization documented with performance justification. No update anomalies.",
"5": "Exemplary normalization with clear explanation of level achieved (1NF/2NF/3NF/BCNF). Strategic denormalization only where measured performance gains justify it. Trade-offs explicitly documented. No data integrity risks."
}
},
{
"name": "Use Case Coverage & Validation",
"description": "Does schema support all required use cases? Can all queries be answered?",
"scoring": {
"1": "Schema doesn't support core use cases. Critical queries impossible or require workarounds.",
"2": "Supports some use cases but gaps exist. Some queries difficult or inefficient.",
"3": "Supports most use cases. Required queries possible though some may be complex.",
"4": "All use cases supported. Validation checklist shows each use case can be satisfied. Query paths identified.",
"5": "Comprehensive validation against all use cases with example queries. Indexes planned for performance. Edge cases considered. Future use cases accommodated by design."
}
},
{
"name": "Technology Appropriateness",
"description": "Is the schema type (relational, document, graph) appropriate for the domain?",
"scoring": {
"1": "Wrong technology choice (e.g., relational for graph problem, or vice versa). Implementation doesn't match paradigm.",
"2": "Technology choice questionable. Implementation awkward for chosen paradigm.",
"3": "Reasonable technology choice. Implementation follows paradigm conventions.",
"4": "Good technology choice with justification. Implementation leverages paradigm strengths.",
"5": "Optimal technology choice with clear rationale comparing alternatives. Implementation exemplifies paradigm best practices. Schema leverages technology-specific features appropriately (e.g., JSONB in PostgreSQL, graph traversal in Neo4j)."
}
},
{
"name": "Documentation Quality & Clarity",
"description": "Is schema well-documented with ERD, implementation code, and clear explanations?",
"scoring": {
"1": "Minimal documentation. No diagram. Entity definitions incomplete.",
"2": "Basic documentation present but gaps. Diagram missing or unclear. Some entities poorly explained.",
"3": "Good documentation with most sections complete. Diagram present. Entities explained.",
"4": "Comprehensive documentation following template. ERD clear. All entities, relationships, constraints documented. Implementation code provided.",
"5": "Exemplary documentation that could serve as reference. ERD/diagram clear and complete. All sections filled thoroughly. Implementation code (SQL DDL / JSON Schema / Cypher) executable. Examples aid understanding. Could be handed to developer for immediate implementation."
}
},
{
"name": "Evolution & Migration Strategy",
"description": "Is there a plan for schema changes? Migration path from existing systems considered?",
"scoring": {
"1": "No evolution strategy. If migrating from an existing system, no plan for existing data.",
"2": "Evolution mentioned but no concrete strategy. Migration path vague.",
"3": "Basic evolution strategy (versioning or backward-compat approach). Migration considered if applicable.",
"4": "Clear evolution strategy documented. Migration path defined with phases if migrating from legacy.",
"5": "Comprehensive evolution strategy with versioning, backward-compatibility approach, and detailed migration plan if applicable. Rollback strategy considered. Zero-downtime deployment approach specified. Future extensibility designed in."
}
},
{
"name": "Advanced Pattern Application",
"description": "Are advanced patterns (temporal, hierarchies, polymorphic) correctly applied when needed?",
"scoring": {
"1": "Complex patterns needed but missing or incorrectly implemented.",
"2": "Attempted advanced patterns but implementation flawed or overly complex.",
"3": "Advanced patterns applied where needed with reasonable implementation.",
"4": "Advanced patterns correctly implemented with good trade-off decisions (e.g., hierarchy approach chosen based on use case).",
"5": "Sophisticated pattern usage with clear rationale for choices. Temporal modeling, hierarchies, polymorphic associations, or graph patterns implemented optimally for domain. Trade-offs explicit and justified."
}
}
],
"schema_type_guidance": {
"Relational (SQL)": {
"target_score": 3.5,
"focus_criteria": [
"Normalization & Data Integrity",
"Constraint & Invariant Specification",
"Relationship Modeling Accuracy"
],
"key_patterns": [
"Proper normalization (3NF typical)",
"Foreign key relationships with CASCADE/RESTRICT",
"CHECK constraints for business rules",
"Junction tables for M:N relationships"
]
},
"Document/NoSQL": {
"target_score": 3.5,
"focus_criteria": [
"Entity Identification & Completeness",
"Use Case Coverage & Validation",
"Technology Appropriateness"
],
"key_patterns": [
"Embed vs reference decision documented",
"Denormalization for read performance justified",
"Document structure matches query patterns",
"JSON schema validation if available"
]
},
"Graph Database": {
"target_score": 4.0,
"focus_criteria": [
"Relationship Modeling Accuracy",
"Technology Appropriateness",
"Advanced Pattern Application"
],
"key_patterns": [
"Nodes for entities, edges for relationships",
"Properties on edges for context",
"Traversal patterns optimized (< 3 hops typical)",
"Index on frequently filtered properties"
]
},
"Data Warehouse (OLAP)": {
"target_score": 3.5,
"focus_criteria": [
"Use Case Coverage & Validation",
"Normalization & Data Integrity",
"Technology Appropriateness"
],
"key_patterns": [
"Star or snowflake schema",
"Fact tables with foreign keys to dimensions",
"Dimensional attributes denormalized",
"Slowly changing dimensions handled"
]
}
},
"domain_complexity_guidance": {
"Simple Domain (< 10 entities, straightforward relationships)": {
"target_score": 3.0,
"acceptable_shortcuts": [
"ERD can be simple text diagram",
"Fewer implementation details needed",
"Basic constraints sufficient"
],
"key_quality_gates": [
"All entities identified",
"Relationships correct",
"Supports use cases"
]
},
"Standard Domain (10-30 entities, moderate complexity)": {
"target_score": 3.5,
"required_elements": [
"Complete entity definitions",
"ERD diagram",
"All relationships mapped",
"Constraints documented",
"Implementation code (DDL/schema)"
],
"key_quality_gates": [
"All 10 criteria evaluated",
"Minimum score of 3 on each",
"Average ≥ 3.5"
]
},
"Complex Domain (30+ entities, hierarchies, temporal, polymorphic)": {
"target_score": 4.0,
"required_elements": [
"Comprehensive documentation",
"Multiple diagrams (ERD + detail views)",
"Advanced pattern usage documented",
"Migration strategy if applicable",
"Performance considerations",
"Example queries for complex patterns"
],
"key_quality_gates": [
"All 10 criteria evaluated",
"Minimum score of 3.5 on each",
"Average ≥ 4.0",
"Score 5 on Advanced Pattern Application"
]
}
},
"common_failure_modes": {
"1. God Entities": {
"symptom": "User table with 50+ attributes, or single entity handling multiple concerns",
"why_it_fails": "Violates single responsibility, hard to query, update anomalies",
"fix": "Extract related concerns into separate entities (UserProfile, UserPreferences, UserAddress)",
"related_criteria": ["Entity Identification & Completeness", "Normalization & Data Integrity"]
},
"2. Missing Junction Tables": {
"symptom": "Attempting M:N relationship with direct foreign keys or comma-separated IDs",
"why_it_fails": "Can't properly model M:N, violates 1NF, query complexity",
"fix": "Always use junction table with composite primary key for M:N relationships",
"related_criteria": ["Relationship Modeling Accuracy", "Normalization & Data Integrity"]
},
"3. Wrong Data Types": {
"symptom": "Money as FLOAT, dates as VARCHAR, booleans as CHAR(1)",
"why_it_fails": "Precision loss (money), format inconsistency (dates), unclear semantics (booleans)",
"fix": "Use DECIMAL for money, DATE/TIMESTAMP for dates, BOOLEAN for flags",
"related_criteria": ["Attribute Definition Quality"]
},
"4. No Constraints": {
"symptom": "Business rules in documentation but not enforced in schema",
"why_it_fails": "Application bugs can corrupt data, no database-level guarantees",
"fix": "Use CHECK constraints, NOT NULL, UNIQUE, FK constraints to enforce rules",
"related_criteria": ["Constraint & Invariant Specification"]
},
"5. Premature Denormalization": {
"symptom": "Duplicating data for \"performance\" without measuring",
"why_it_fails": "Update anomalies, data inconsistency, wasted effort if not bottleneck",
"fix": "Normalize first (3NF), denormalize only after profiling shows actual bottleneck",
"related_criteria": ["Normalization & Data Integrity", "Use Case Coverage & Validation"]
},
"6. Ignoring Use Cases": {
"symptom": "Schema designed in isolation, doesn't support required queries",
"why_it_fails": "Schema can't answer business questions, requires redesign",
"fix": "Validate schema against ALL use cases. Write example queries to verify.",
"related_criteria": ["Use Case Coverage & Validation"]
},
"7. Modeling Implementation": {
"symptom": "Entities like \"UserSession\", \"Cache\", \"Queue\" in domain model",
"why_it_fails": "Confuses domain concepts with technical infrastructure",
"fix": "Model real-world domain entities only. Infrastructure is separate concern.",
"related_criteria": ["Entity Identification & Completeness", "Technology Appropriateness"]
},
"8. No Evolution Strategy": {
"symptom": "Can't change schema without breaking production",
"why_it_fails": "Schema ossifies, can't adapt to business changes",
"fix": "Plan for evolution: versioning, backward-compat changes, or expand-contract migrations",
"related_criteria": ["Evolution & Migration Strategy"]
}
},
"scale": {
"description": "Each criterion scored 1-5",
"min_score": 1,
"max_score": 5,
"passing_threshold": 3.5,
"excellence_threshold": 4.5
},
"usage_notes": {
"when_to_score": "After completing schema design, before delivering to user",
"minimum_standard": "Average score ≥ 3.5 across all criteria (standard domain). Simple domains: ≥ 3.0. Complex domains: ≥ 4.0.",
"how_to_improve": "If scoring < threshold, identify lowest-scoring criteria and iterate. Common fixes: add missing entities, specify constraints, validate against use cases, improve documentation.",
"self_assessment": "Score honestly. Schema flaws are expensive to fix in production. Better to iterate now."
}
}

# Data Schema & Knowledge Modeling: Advanced Methodology
## Workflow
```
Advanced Schema Modeling:
- [ ] Step 1: Analyze complex domain patterns
- [ ] Step 2: Design advanced relationship structures
- [ ] Step 3: Apply normalization or strategic denormalization
- [ ] Step 4: Model temporal/historical aspects
- [ ] Step 5: Plan schema evolution strategy
```
**Steps:** (1) Identify patterns in [Advanced Relationships](#1-advanced-relationship-patterns), (2) Apply [Hierarchy](#2-hierarchy-modeling) and [Polymorphic](#3-polymorphic-associations) patterns, (3) Use [Normalization](#5-normalization-levels) then [Denormalization](#6-strategic-denormalization), (4) Add [Temporal](#7-temporal--historical-modeling) if needed, (5) Plan [Evolution](#8-schema-evolution).
---
## 1. Advanced Relationship Patterns
### Self-Referential
Entity relates to itself (org charts, categories, social networks).
```sql
CREATE TABLE Employee (
id BIGINT PRIMARY KEY,
managerId BIGINT NULL REFERENCES Employee(id),
CONSTRAINT no_self_ref CHECK (id != managerId)
);
```
Query with recursive CTE for full hierarchy.
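A sketch of that query, assuming the `Employee` table above (the root `id = 1` is illustrative):
```sql
-- All reports of employee 1, direct and indirect
WITH RECURSIVE subordinates AS (
    SELECT id FROM Employee WHERE managerId = 1
    UNION ALL
    SELECT e.id
    FROM Employee e
    JOIN subordinates s ON e.managerId = s.id
)
SELECT id FROM subordinates;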
### Conditional
Relationship exists only under conditions.
```sql
CREATE TABLE Order (
id BIGINT PRIMARY KEY,
status VARCHAR(20),
paymentId BIGINT NULL REFERENCES Payment(id),
CONSTRAINT payment_when_paid CHECK (
(status IN ('paid','completed') AND paymentId IS NOT NULL) OR
(status NOT IN ('paid','completed'))
)
);
```
### Multi-Parent
Entity has multiple parents (document in folders).
```sql
CREATE TABLE DocumentFolder (
documentId BIGINT REFERENCES Document(id),
folderId BIGINT REFERENCES Folder(id),
PRIMARY KEY (documentId, folderId)
);
```
---
## 2. Hierarchy Modeling
Four approaches with trade-offs:
| Approach | Implementation | Read | Write | Best For |
|----------|---------------|------|-------|----------|
| **Adjacency List** | `parentId` column | Slow (recursive) | Fast | Shallow trees, frequent updates |
| **Path Enumeration** | `path VARCHAR` ('/1/5/12/') | Fast | Medium | Read-heavy, moderate depth |
| **Nested Sets** | `lft, rgt INT` | Fastest | Slow | Read-heavy, rare writes |
| **Closure Table** | Separate ancestor/descendant table | Fastest | Medium | Complex queries, any depth |
**Adjacency List:**
```sql
CREATE TABLE Category (
id BIGINT PRIMARY KEY,
parentId BIGINT NULL REFERENCES Category(id)
);
```
**Closure Table:**
```sql
CREATE TABLE CategoryClosure (
ancestor BIGINT,
descendant BIGINT,
depth INT, -- 0=self, 1=child, 2+=deeper
PRIMARY KEY (ancestor, descendant)
);
```
**Recommendation:** Adjacency for < 5 levels, Closure for complex queries.
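Closure-table queries stay flat at any depth. A sketch against the `CategoryClosure` table above (ids are illustrative):
```sql
-- All descendants of category 5, with distance from it
SELECT descendant, depth
FROM CategoryClosure
WHERE ancestor = 5 AND depth > 0;

-- Insert node 12 under parent 5: copy the parent's ancestor rows, plus a self-row
INSERT INTO CategoryClosure (ancestor, descendant, depth)
SELECT ancestor, 12, depth + 1 FROM CategoryClosure WHERE descendant = 5
UNION ALL
SELECT 12, 12, 0;
```
Writes touch O(depth) rows, which is the "Medium" write cost in the table above.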
---
## 3. Polymorphic Associations
Entity relates to multiple types (Comment on Post/Photo/Video).
### Approach 1: Separate FKs (Recommended for SQL)
```sql
CREATE TABLE Comment (
id BIGINT PRIMARY KEY,
postId BIGINT NULL REFERENCES Post(id),
photoId BIGINT NULL REFERENCES Photo(id),
videoId BIGINT NULL REFERENCES Video(id),
CONSTRAINT one_parent CHECK (
(postId IS NOT NULL)::int +
(photoId IS NOT NULL)::int +
(videoId IS NOT NULL)::int = 1
)
);
```
**Pros:** Type-safe, referential integrity
**Cons:** Schema grows with types
### Approach 2: Supertype/Subtype
```sql
CREATE TABLE Commentable (id BIGINT PRIMARY KEY, type VARCHAR(50));
CREATE TABLE Post (id BIGINT PRIMARY KEY REFERENCES Commentable(id), ...);
CREATE TABLE Photo (id BIGINT PRIMARY KEY REFERENCES Commentable(id), ...);
CREATE TABLE Comment (commentableId BIGINT REFERENCES Commentable(id));
```
**Use when:** Shared attributes across types.
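Queries can then join through the supertype regardless of concrete type (a sketch against the tables above):
```sql
-- Each comment with the concrete type of its parent
SELECT c.commentableId, ct.type
FROM Comment c
JOIN Commentable ct ON ct.id = c.commentableId;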
---
## 4. Graph & Ontology Design
### Property Graph
**Nodes** = entities, **Edges** = relationships, both have properties.
```cypher
CREATE (u:User {id: 1, name: 'Alice'})
CREATE (p:Product {id: 100, name: 'Widget'})
CREATE (u)-[:PURCHASED {date: '2024-01-15', quantity: 2}]->(p)
```
**Schema:**
```
Nodes: User, Product, Category
Edges: PURCHASED (User→Product, {date, quantity})
REVIEWED (User→Product, {rating, comment})
BELONGS_TO (Product→Category)
```
**Design principles:**
- Nodes for entities with identity
- Edges for relationships
- Properties on edges for context
- Avoid deep traversals (< 3 hops)
### RDF Triples (Semantic Web)
Subject-Predicate-Object:
```turtle
ex:Alice rdf:type ex:User .
ex:Alice ex:purchased ex:Widget .
```
**Use RDF when:** Standards compliance, semantic reasoning, linked data
**Use Property Graph when:** Performance, complex traversals
---
## 5. Normalization Levels
### 1NF: Atomic Values
**Violation:** Multiple phones in one column
**Fix:** Separate UserPhone table
### 2NF: No Partial Dependencies
**Violation:** In OrderItem(orderId, productId, productName), productName depends only on productId
**Fix:** productName lives in Product table
### 3NF: No Transitive Dependencies
**Violation:** In Address(id, zipCode, city, state), city/state depend on zipCode
**Fix:** Separate ZipCode table
**When to normalize to 3NF:** OLTP, frequent updates, consistency required
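The 2NF fix above as DDL (a sketch; column sizes are illustrative):
```sql
-- productName lives only in Product; OrderItem references it
CREATE TABLE Product (
    id BIGINT PRIMARY KEY,
    productName VARCHAR(255) NOT NULL
);
CREATE TABLE OrderItem (
    orderId BIGINT NOT NULL REFERENCES Order(id),
    productId BIGINT NOT NULL REFERENCES Product(id),
    quantity INT NOT NULL,
    PRIMARY KEY (orderId, productId)
);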
---
## 6. Strategic Denormalization
**Only after profiling shows bottleneck.**
### Pattern 1: Computed Aggregates
Store `Order.total` instead of summing OrderItems on every query.
**Trade-off:** Faster reads, slower writes, consistency risk (use triggers/app logic)
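One way to contain the consistency risk is a trigger that recomputes the stored total. A PostgreSQL-flavored sketch (assumes an `OrderItem(orderId, quantity, price)` table; syntax varies by vendor):
```sql
CREATE FUNCTION refresh_order_total() RETURNS trigger AS $$
BEGIN
    -- Recompute the denormalized total whenever items change
    UPDATE "Order" SET total = (
        SELECT COALESCE(SUM(quantity * price), 0)
        FROM OrderItem
        WHERE orderId = COALESCE(NEW.orderId, OLD.orderId)
    )
    WHERE id = COALESCE(NEW.orderId, OLD.orderId);
    RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orderitem_changed
AFTER INSERT OR UPDATE OR DELETE ON OrderItem
FOR EACH ROW EXECUTE FUNCTION refresh_order_total();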
### Pattern 2: Frequent Joins
Embed address fields in User table to avoid join.
**Trade-off:** No join, but updates must maintain both
### Pattern 3: Historical Snapshots
```sql
CREATE TABLE OrderSnapshot (
orderId BIGINT,
snapshotDate DATE,
userName VARCHAR(255), -- denormalized from User
userEmail VARCHAR(255),
PRIMARY KEY (orderId, snapshotDate)
);
```
**Use when:** Need point-in-time data (e.g., user's name at time of order)
---
## 7. Temporal & Historical Modeling
### Pattern 1: Effective Dating
```sql
CREATE TABLE Price (
productId BIGINT,
price DECIMAL(10,2),
effectiveFrom DATE NOT NULL,
effectiveTo DATE NULL, -- NULL = current
PRIMARY KEY (productId, effectiveFrom)
);
```
**Query current:** `WHERE effectiveFrom <= CURRENT_DATE AND (effectiveTo IS NULL OR effectiveTo > CURRENT_DATE)`
### Pattern 2: History Table
```sql
CREATE TABLE UserHistory (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
userId BIGINT,
email VARCHAR(255),
name VARCHAR(255),
validFrom TIMESTAMP DEFAULT NOW(),
validTo TIMESTAMP NULL,
changeType VARCHAR(20) -- 'INSERT', 'UPDATE', 'DELETE'
);
```
Trigger on User table inserts into UserHistory on changes.
### Pattern 3: Event Sourcing
```sql
CREATE TABLE OrderEvent (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
orderId BIGINT,
eventType VARCHAR(50), -- 'CREATED', 'ITEM_ADDED', 'SHIPPED'
eventData JSON,
occurredAt TIMESTAMP DEFAULT NOW()
);
```
Reconstruct state by replaying events.
**Trade-offs:**
**Pros:** Complete audit, time travel
**Cons:** Query complexity, storage
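Reconstruction is a fold over the ordered stream, applied in application code. Fetching the stream for one order (the id is illustrative):
```sql
-- Replay in occurrence order; id breaks timestamp ties
SELECT eventType, eventData
FROM OrderEvent
WHERE orderId = 42
ORDER BY occurredAt, id;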
---
## 8. Schema Evolution
### Strategy 1: Backward-Compatible
Safe changes (no app changes):
- Add nullable column
- Add table (not referenced)
- Add index
- Widen column (VARCHAR(100) → VARCHAR(255))
```sql
ALTER TABLE User ADD COLUMN phoneNumber VARCHAR(20) NULL;
```
### Strategy 2: Expand-Contract
For breaking changes:
1. **Expand:** Add new alongside old
```sql
ALTER TABLE User ADD COLUMN newEmail VARCHAR(255) NULL;
```
2. **Migrate:** Copy data
```sql
UPDATE User SET newEmail = email WHERE newEmail IS NULL;
```
3. **Contract:** Remove old
```sql
ALTER TABLE User DROP COLUMN email;
ALTER TABLE User RENAME COLUMN newEmail TO email;
```
### Strategy 3: Versioned Schemas (NoSQL)
```json
{"_schemaVersion": "2.0", "email": "alice@example.com"}
```
App handles multiple versions.
### Strategy 4: Blue-Green
Run old and new schemas simultaneously, dual-write, migrate, switch reads, remove old.
**Best for:** Major redesigns, zero downtime
---
## 9. Multi-Tenancy
### Pattern 1: Separate Databases
```
tenant1_db, tenant2_db, tenant3_db
```
**Pros:** Strong isolation
**Cons:** High overhead
### Pattern 2: Separate Schemas
```sql
CREATE SCHEMA tenant1;
CREATE TABLE tenant1.User (...);
```
**Pros:** Better than separate DBs
**Cons:** Still some overhead
### Pattern 3: Shared Schema + Tenant ID
```sql
CREATE TABLE User (
id BIGINT PRIMARY KEY,
tenantId BIGINT NOT NULL,
email VARCHAR(255),
UNIQUE (tenantId, email)
);
```
**Pros:** Most efficient
**Cons:** Must filter ALL queries by tenantId
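If the database supports it, row-level security can enforce the tenant filter centrally instead of trusting every query. A PostgreSQL sketch (`app.tenant_id` is an assumed per-session setting):
```sql
ALTER TABLE "User" ENABLE ROW LEVEL SECURITY;

-- Every query on "User" is implicitly filtered to the current tenant
CREATE POLICY tenant_isolation ON "User"
    USING (tenantId = current_setting('app.tenant_id')::BIGINT);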
**Recommendation:** Pattern 3 for SaaS, Pattern 1 for regulated industries
---
## 10. Performance
### Indexes
**Covering index** (includes all query columns):
```sql
CREATE INDEX idx_user_status ON User(status) INCLUDE (name, email); -- PostgreSQL 11+/SQL Server
```
**Composite index** (order matters):
```sql
-- Good for: WHERE tenantId = X AND createdAt > Y
CREATE INDEX idx_tenant_date ON Order(tenantId, createdAt);
```
**Partial index** (reduce size):
```sql
CREATE INDEX idx_active ON User(email) WHERE deletedAt IS NULL;
```
### Partitioning
**Horizontal (partitioning; sharding when split across servers):**
```sql
CREATE TABLE Order (...) PARTITION BY RANGE (createdAt);
CREATE TABLE Order_2024_Q1 PARTITION OF Order
FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
```
**Vertical:** Split hot/cold data into separate tables.
---
## 11. Common Advanced Patterns
### Soft Deletes
```sql
ALTER TABLE User ADD COLUMN deletedAt TIMESTAMP NULL;
-- Query: WHERE deletedAt IS NULL
```
### Audit Columns
```sql
createdAt TIMESTAMP DEFAULT NOW()
updatedAt TIMESTAMP DEFAULT NOW() ON UPDATE NOW() -- MySQL syntax; use a trigger elsewhere
createdBy BIGINT REFERENCES User(id)
updatedBy BIGINT REFERENCES User(id)
```
### State Machines
```sql
CREATE TABLE OrderState (
orderId BIGINT REFERENCES Order(id),
state VARCHAR(20),
transitionedAt TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (orderId, transitionedAt)
);
-- Track: draft → pending → confirmed → shipped → delivered
```
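The current state is simply the latest transition (the id is illustrative):
```sql
-- Latest transition wins
SELECT state
FROM OrderState
WHERE orderId = 42
ORDER BY transitionedAt DESC
LIMIT 1;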
### Idempotency Keys
```sql
CREATE TABLE Request (
idempotencyKey UUID PRIMARY KEY,
payload JSON,
result JSON,
processedAt TIMESTAMP
);
-- Prevents duplicate processing
```
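A typical usage sketch (PostgreSQL `ON CONFLICT`; key and payload are illustrative):
```sql
-- First request claims the key; retries with the same key insert nothing
INSERT INTO Request (idempotencyKey, payload)
VALUES ('a1b2c3d4-0000-4000-8000-000000000001', '{"action": "charge"}')
ON CONFLICT (idempotencyKey) DO NOTHING;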

# Data Schema & Knowledge Modeling Template
## Workflow
Copy this checklist and track your progress:
```
Data Schema & Knowledge Modeling Progress:
- [ ] Step 1: Gather domain requirements and scope
- [ ] Step 2: Identify entities and attributes systematically
- [ ] Step 3: Define relationships and cardinality
- [ ] Step 4: Specify constraints and invariants
- [ ] Step 5: Validate against use cases and document
```
**Step 1: Gather domain requirements and scope**
Ask user for domain description, core use cases, existing data sources, scale requirements, and technology stack. Use [Input Questions](#input-questions).
**Step 2: Identify entities and attributes systematically**
Extract entities from requirements using [Entity Identification](#entity-identification). Define attributes with types and nullability using [Attribute Guide](#attribute-guide).
**Step 3: Define relationships and cardinality**
Map entity connections using [Relationship Mapping](#relationship-mapping). Specify cardinality (1:1, 1:N, M:N) and optionality.
**Step 4: Specify constraints and invariants**
Define business rules and constraints using [Constraint Specification](#constraint-specification). Document domain invariants.
**Step 5: Validate against use cases and document**
Create `data-schema-knowledge-modeling.md` using [Template](#schema-documentation-template). Verify using [Validation Checklist](#validation-checklist).
---
## Input Questions
**Domain & Scope:**
- What domain? (e-commerce, healthcare, social network)
- Boundaries? In/out of scope?
- Existing schemas to integrate/migrate from?
**Core Use Cases:**
- Primary operations? (CRUD for which entities?)
- Required queries/reports?
- Access patterns? (read-heavy, write-heavy, mixed)
**Scale & Performance:**
- Data volume? (rows per table, storage)
- Growth rate? (daily/monthly)
- Performance SLAs?
**Technology:**
- Database? (PostgreSQL, MongoDB, Neo4j, etc.)
- Compliance? (GDPR, HIPAA, SOC2)
- Evolution needs? (schema versioning, migrations)
---
## Entity Identification
**Step 1: Extract nouns**
List nouns from requirements = candidate entities.
**Step 2: Validate**
For each, check:
- [ ] Distinct identity? (can point to "this specific X")
- [ ] Independent lifecycle?
- [ ] Multiple attributes beyond name?
- [ ] Track multiple instances?
**Keep** if yes to most. **Reject** if just an attribute.
**Step 3: Entity vs Value Object**
- **Entity**: Has ID, mutable (User, Order)
- **Value Object**: No ID, immutable (Address, Money)
**Step 4: Document**
```markdown
### Entity: [Name]
**Purpose:** [What it represents]
**Examples:** [2-3 concrete cases]
**Lifecycle:** [Creation → deletion]
**Invariants:** [Rules that must hold]
```
---
## Attribute Guide
**Template:**
```
attributeName: DataType [NULL|NOT NULL] [DEFAULT value]
- Description: [What it represents]
- Validation: [Constraints]
- Examples: [Sample values]
```
**Standard attributes:**
- `id`: Primary key (UUID/BIGINT)
- `createdAt`: TIMESTAMP NOT NULL
- `updatedAt`: TIMESTAMP NOT NULL
- `deletedAt`: TIMESTAMP NULL (soft deletes)
**Data types:**
| Data | SQL | NoSQL | Notes |
|------|-----|-------|-------|
| Short text | VARCHAR(N) | String | Specify max |
| Long text | TEXT | String | No limit |
| Integer | INT/BIGINT | Number | Choose size |
| Decimal | DECIMAL(P,S) | Number | Fixed precision |
| Money | DECIMAL(19,4) | {amount,currency} | Never FLOAT |
| Boolean | BOOLEAN | Boolean | Not nullable |
| Date/Time | TIMESTAMP | ISODate | With timezone |
| UUID | UUID/CHAR(36) | String | Distributed IDs |
| JSON | JSON/JSONB | Object | Flexible |
| Enum | ENUM/VARCHAR | String | Fixed values |
**Nullability:**
- NOT NULL if required
- NULL if optional/unknown at creation
- Avoid NULL for booleans
---
## Relationship Mapping
**Cardinality:**
**1:1** - User has one Profile
- SQL: `Profile.userId UNIQUE NOT NULL REFERENCES User(id)`
**1:N** - User has many Orders
- SQL: `Order.userId NOT NULL REFERENCES User(id)`
**M:N** - Order contains Products
- Junction table:
```sql
CREATE TABLE OrderItem (
  orderId BIGINT REFERENCES Order(id),
  productId BIGINT REFERENCES Product(id),
  quantity INT NOT NULL,
  PRIMARY KEY (orderId, productId)
);
```
**Optionality:**
- Required: NOT NULL
- Optional: NULL
**Naming:**
Use verbs: User **owns** Order, Product **belongs to** Category
---
## Constraint Specification
**Primary Keys:**
```sql
id BIGINT PRIMARY KEY AUTO_INCREMENT
-- or --
id UUID PRIMARY KEY DEFAULT gen_random_uuid()
```
**Unique:**
```sql
email VARCHAR(255) UNIQUE NOT NULL
UNIQUE (userId, productId) -- composite
```
**Foreign Keys:**
```sql
userId BIGINT NOT NULL REFERENCES User(id) ON DELETE CASCADE
-- Options: CASCADE, SET NULL, RESTRICT
```
**Check Constraints:**
```sql
price DECIMAL(10,2) CHECK (price >= 0)
status VARCHAR(20) CHECK (status IN ('draft','pending','completed'))
```
**Domain Invariants:**
Document business rules:
```markdown
### Invariant: Order total = sum of items
Order.total = SUM(OrderItem.quantity * OrderItem.price)
### Invariant: Unique email
No duplicate emails (case-insensitive)
```
Enforce via: DB constraints (preferred), application logic, or triggers.
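For example, the unique-email invariant above can be pushed into the database with a functional index (a PostgreSQL sketch):
```sql
-- Case-insensitive uniqueness: 'Alice@x.com' and 'alice@x.com' collide
CREATE UNIQUE INDEX uq_user_email_ci ON "User" (LOWER(email));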
---
## Schema Documentation Template
Create: `data-schema-knowledge-modeling.md`
**Required sections:**
1. **Domain Overview** - Purpose, scope, technology
2. **Use Cases** - Primary operations, query patterns
3. **Entity Definitions** - For each entity:
- Purpose, examples, lifecycle
- Attributes table (name, type, null, default, constraints, description)
- Relationships (cardinality, FK, optionality)
- Invariants
4. **ERD** - Visual/text diagram showing relationships
5. **Constraints** - DB constraints, domain invariants
6. **Normalization** - Level, denormalization decisions
7. **Implementation** - SQL DDL / JSON Schema / Graph schema as appropriate
8. **Validation** - Check each use case is supported
9. **Open Questions** - Unresolved decisions
**Example entity definition:**
```markdown
### Entity: Order
**Purpose:** Represents customer purchase transaction
**Examples:** Amazon order #123, Shopify order #456
**Lifecycle:** Created on checkout → Updated during fulfillment → Completed on delivery
#### Attributes
| Attribute | Type | Null? | Default | Constraints | Description |
|---|---|---|---|---|---|
| id | BIGINT | NO | auto | PK | Unique identifier |
| userId | BIGINT | NO | - | FK→User | Customer who placed order |
| status | VARCHAR(20) | NO | 'pending' | CHECK IN(...) | Order status |
| total | DECIMAL(10,2) | NO | - | CHECK >= 0 | Order total |
#### Relationships
- **belongs to:** 1:N with User (Order.userId → User.id)
- **contains:** 1:N with OrderItem junction table
#### Invariants
- total = SUM(OrderItem.quantity * OrderItem.price)
- status transitions: pending → confirmed → shipped → delivered
```
---
## Validation Checklist
**Completeness:**
- [ ] All entities identified
- [ ] All attributes defined (types, nullability)
- [ ] All relationships mapped (cardinality)
- [ ] All constraints specified
- [ ] All invariants documented
**Correctness:**
- [ ] Each entity distinct purpose
- [ ] No redundant entities
- [ ] Attributes in correct entities
- [ ] Cardinality reflects reality
- [ ] Constraints enforce rules
**Use Case Coverage:**
- [ ] Supports all CRUD operations
- [ ] All queries answerable
- [ ] Indexes planned
- [ ] No missing joins
**Normalization:**
- [ ] No partial dependencies (2NF)
- [ ] No transitive dependencies (3NF)
- [ ] Denormalization documented
- [ ] No update anomalies
**Technical Quality:**
- [ ] Consistent naming
- [ ] Appropriate data types
- [ ] Primary keys defined
- [ ] Foreign keys maintain integrity
- [ ] Soft delete strategy (if needed)
- [ ] Audit fields (if needed)
**Future-Proofing:**
- [ ] Schema extensible
- [ ] Migration path (if applicable)
- [ ] Versioning strategy
- [ ] No technical debt
---
## Common Pitfalls
**1. Modeling implementation, not domain**
- Symptom: Entities like "UserSession", "Cache"
- Fix: Model real-world concepts only
**2. God entities**
- Symptom: User with 50+ attributes
- Fix: Extract to separate entities
**3. Missing junction tables**
- Symptom: M:N with FKs
- Fix: Always use junction table
**4. Nullable FKs without reason**
- Symptom: All relationships optional
- Fix: NOT NULL unless truly optional
**5. Not enforcing invariants**
- Symptom: Rules in docs only
- Fix: CHECK constraints, triggers, app validation
**6. Premature denormalization**
- Symptom: Duplicating without measurement
- Fix: Normalize first, denormalize after profiling
**7. Wrong data types**
- Symptom: Money as VARCHAR
- Fix: DECIMAL for money, proper types for all
**8. No migration strategy**
- Symptom: Can't change schema
- Fix: Versioning, backward-compat changes