Initial commit
This commit is contained in:
330
skills/data-schema-knowledge-modeling/resources/template.md
Normal file
330
skills/data-schema-knowledge-modeling/resources/template.md
Normal file
@@ -0,0 +1,330 @@
|
||||
# Data Schema & Knowledge Modeling Template
|
||||
|
||||
## Workflow
|
||||
|
||||
Copy this checklist and track your progress:
|
||||
|
||||
```
|
||||
Data Schema & Knowledge Modeling Progress:
|
||||
- [ ] Step 1: Gather domain requirements and scope
|
||||
- [ ] Step 2: Identify entities and attributes systematically
|
||||
- [ ] Step 3: Define relationships and cardinality
|
||||
- [ ] Step 4: Specify constraints and invariants
|
||||
- [ ] Step 5: Validate against use cases and document
|
||||
```
|
||||
|
||||
**Step 1: Gather domain requirements and scope**
|
||||
|
||||
Ask user for domain description, core use cases, existing data sources, scale requirements, and technology stack. Use [Input Questions](#input-questions).
|
||||
|
||||
**Step 2: Identify entities and attributes systematically**
|
||||
|
||||
Extract entities from requirements using [Entity Identification](#entity-identification). Define attributes with types and nullability using [Attribute Guide](#attribute-guide).
|
||||
|
||||
**Step 3: Define relationships and cardinality**
|
||||
|
||||
Map entity connections using [Relationship Mapping](#relationship-mapping). Specify cardinality (1:1, 1:N, M:N) and optionality.
|
||||
|
||||
**Step 4: Specify constraints and invariants**
|
||||
|
||||
Define business rules and constraints using [Constraint Specification](#constraint-specification). Document domain invariants.
|
||||
|
||||
**Step 5: Validate against use cases and document**
|
||||
|
||||
Create `data-schema-knowledge-modeling.md` using [Template](#schema-documentation-template). Verify using [Validation Checklist](#validation-checklist).
|
||||
|
||||
---
|
||||
|
||||
## Input Questions
|
||||
|
||||
**Domain & Scope:**
|
||||
- What domain? (e-commerce, healthcare, social network)
|
||||
- Boundaries? In/out of scope?
|
||||
- Existing schemas to integrate/migrate from?
|
||||
|
||||
**Core Use Cases:**
|
||||
- Primary operations? (CRUD for which entities?)
|
||||
- Required queries/reports?
|
||||
- Access patterns? (read-heavy, write-heavy, mixed)
|
||||
|
||||
**Scale & Performance:**
|
||||
- Data volume? (rows per table, storage)
|
||||
- Growth rate? (daily/monthly)
|
||||
- Performance SLAs?
|
||||
|
||||
**Technology:**
|
||||
- Database? (PostgreSQL, MongoDB, Neo4j, etc.)
|
||||
- Compliance? (GDPR, HIPAA, SOC2)
|
||||
- Evolution needs? (schema versioning, migrations)
|
||||
|
||||
---
|
||||
|
||||
## Entity Identification
|
||||
|
||||
**Step 1: Extract nouns**
|
||||
|
||||
List nouns from requirements = candidate entities.
|
||||
|
||||
**Step 2: Validate**
|
||||
|
||||
For each, check:
|
||||
- [ ] Distinct identity? (can point to "this specific X")
|
||||
- [ ] Independent lifecycle?
|
||||
- [ ] Multiple attributes beyond name?
|
||||
- [ ] Track multiple instances?
|
||||
|
||||
**Keep** if yes to most. **Reject** if just an attribute.
|
||||
|
||||
**Step 3: Entity vs Value Object**
|
||||
|
||||
- **Entity**: Has ID, mutable (User, Order)
|
||||
- **Value Object**: No ID, immutable (Address, Money)
|
||||
|
||||
**Step 4: Document**
|
||||
|
||||
```markdown
|
||||
### Entity: [Name]
|
||||
**Purpose:** [What it represents]
|
||||
**Examples:** [2-3 concrete cases]
|
||||
**Lifecycle:** [Creation → deletion]
|
||||
**Invariants:** [Rules that must hold]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Attribute Guide
|
||||
|
||||
**Template:**
|
||||
```
|
||||
attributeName: DataType [NULL|NOT NULL] [DEFAULT value]
|
||||
- Description: [What it represents]
|
||||
- Validation: [Constraints]
|
||||
- Examples: [Sample values]
|
||||
```
|
||||
|
||||
**Standard attributes:**
|
||||
- `id`: Primary key (UUID/BIGINT)
|
||||
- `createdAt`: TIMESTAMP NOT NULL
|
||||
- `updatedAt`: TIMESTAMP NOT NULL
|
||||
- `deletedAt`: TIMESTAMP NULL (soft deletes)
|
||||
|
||||
**Data types:**
|
||||
|
||||
| Data | SQL | NoSQL | Notes |
|
||||
|------|-----|-------|-------|
|
||||
| Short text | VARCHAR(N) | String | Specify max |
|
||||
| Long text | TEXT | String | No limit |
|
||||
| Integer | INT/BIGINT | Number | Choose size |
|
||||
| Decimal | DECIMAL(P,S) | Number | Fixed precision |
|
||||
| Money | DECIMAL(19,4) | {amount,currency} | Never FLOAT |
|
||||
| Boolean | BOOLEAN | Boolean | Not nullable |
|
||||
| Date/Time | TIMESTAMP | ISODate | With timezone |
|
||||
| UUID | UUID/CHAR(36) | String | Distributed IDs |
|
||||
| JSON | JSON/JSONB | Object | Flexible |
|
||||
| Enum | ENUM/VARCHAR | String | Fixed values |
|
||||
|
||||
**Nullability:**
|
||||
- NOT NULL if required
|
||||
- NULL if optional/unknown at creation
|
||||
- Avoid NULL for booleans
|
||||
|
||||
---
|
||||
|
||||
## Relationship Mapping
|
||||
|
||||
**Cardinality:**
|
||||
|
||||
**1:1** - User has one Profile
|
||||
- SQL: `Profile.userId UNIQUE NOT NULL REFERENCES User(id)`
|
||||
|
||||
**1:N** - User has many Orders
|
||||
- SQL: `Order.userId NOT NULL REFERENCES User(id)`
|
||||
|
||||
**M:N** - Order contains Products
|
||||
- Junction table:
|
||||
```sql
|
||||
OrderItem (
|
||||
orderId REFERENCES Order(id),
|
||||
productId REFERENCES Product(id),
|
||||
quantity INT NOT NULL,
|
||||
PRIMARY KEY (orderId, productId)
|
||||
)
|
||||
```
|
||||
|
||||
**Optionality:**
|
||||
- Required: NOT NULL
|
||||
- Optional: NULL
|
||||
|
||||
**Naming:**
|
||||
Use verbs: User **owns** Order, Product **belongs to** Category
|
||||
|
||||
---
|
||||
|
||||
## Constraint Specification
|
||||
|
||||
**Primary Keys:**
|
||||
```sql
|
||||
id BIGINT PRIMARY KEY AUTO_INCREMENT
|
||||
-- or --
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid()
|
||||
```
|
||||
|
||||
**Unique:**
|
||||
```sql
|
||||
email VARCHAR(255) UNIQUE NOT NULL
|
||||
UNIQUE (userId, productId) -- composite
|
||||
```
|
||||
|
||||
**Foreign Keys:**
|
||||
```sql
|
||||
userId BIGINT NOT NULL REFERENCES User(id) ON DELETE CASCADE
|
||||
-- Options: CASCADE, SET NULL, RESTRICT
|
||||
```
|
||||
|
||||
**Check Constraints:**
|
||||
```sql
|
||||
price DECIMAL(10,2) CHECK (price >= 0)
|
||||
status VARCHAR(20) CHECK (status IN ('draft','pending','completed'))
|
||||
```
|
||||
|
||||
**Domain Invariants:**
|
||||
|
||||
Document business rules:
|
||||
```markdown
|
||||
### Invariant: Order total = sum of items
|
||||
Order.total = SUM(OrderItem.quantity * OrderItem.price)
|
||||
|
||||
### Invariant: Unique email
|
||||
No duplicate emails (case-insensitive)
|
||||
```
|
||||
|
||||
Enforce via: DB constraints (preferred), application logic, or triggers.
|
||||
|
||||
---
|
||||
|
||||
## Schema Documentation Template
|
||||
|
||||
Create: `data-schema-knowledge-modeling.md`
|
||||
|
||||
**Required sections:**
|
||||
|
||||
1. **Domain Overview** - Purpose, scope, technology
|
||||
2. **Use Cases** - Primary operations, query patterns
|
||||
3. **Entity Definitions** - For each entity:
|
||||
- Purpose, examples, lifecycle
|
||||
- Attributes table (name, type, null, default, constraints, description)
|
||||
- Relationships (cardinality, FK, optionality)
|
||||
- Invariants
|
||||
4. **ERD** - Visual/text diagram showing relationships
|
||||
5. **Constraints** - DB constraints, domain invariants
|
||||
6. **Normalization** - Level, denormalization decisions
|
||||
7. **Implementation** - SQL DDL / JSON Schema / Graph schema as appropriate
|
||||
8. **Validation** - Check each use case is supported
|
||||
9. **Open Questions** - Unresolved decisions
|
||||
|
||||
**Example entity definition:**
|
||||
|
||||
```markdown
|
||||
### Entity: Order
|
||||
|
||||
**Purpose:** Represents customer purchase transaction
|
||||
**Examples:** Amazon order #123, Shopify order #456
|
||||
**Lifecycle:** Created on checkout → Updated during fulfillment → Completed on delivery
|
||||
|
||||
#### Attributes
|
||||
|
||||
| Attribute | Type | Null? | Default | Constraints | Description |
|
||||
|---|---|---|---|---|---|
|
||||
| id | BIGINT | NO | auto | PK | Unique identifier |
|
||||
| userId | BIGINT | NO | - | FK→User | Customer who placed order |
|
||||
| status | VARCHAR(20) | NO | 'pending' | CHECK IN(...) | Order status |
|
||||
| total | DECIMAL(10,2) | NO | - | CHECK >= 0 | Order total |
|
||||
|
||||
#### Relationships
|
||||
- **belongs to:** 1:N with User (Order.userId → User.id)
|
||||
- **contains:** 1:N with OrderItem junction table
|
||||
|
||||
#### Invariants
|
||||
- total = SUM(OrderItem.quantity * OrderItem.price)
|
||||
- status transitions: pending → confirmed → shipped → delivered
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
**Completeness:**
|
||||
- [ ] All entities identified
|
||||
- [ ] All attributes defined (types, nullability)
|
||||
- [ ] All relationships mapped (cardinality)
|
||||
- [ ] All constraints specified
|
||||
- [ ] All invariants documented
|
||||
|
||||
**Correctness:**
|
||||
- [ ] Each entity distinct purpose
|
||||
- [ ] No redundant entities
|
||||
- [ ] Attributes in correct entities
|
||||
- [ ] Cardinality reflects reality
|
||||
- [ ] Constraints enforce rules
|
||||
|
||||
**Use Case Coverage:**
|
||||
- [ ] Supports all CRUD operations
|
||||
- [ ] All queries answerable
|
||||
- [ ] Indexes planned
|
||||
- [ ] No missing joins
|
||||
|
||||
**Normalization:**
|
||||
- [ ] No partial dependencies (2NF)
|
||||
- [ ] No transitive dependencies (3NF)
|
||||
- [ ] Denormalization documented
|
||||
- [ ] No update anomalies
|
||||
|
||||
**Technical Quality:**
|
||||
- [ ] Consistent naming
|
||||
- [ ] Appropriate data types
|
||||
- [ ] Primary keys defined
|
||||
- [ ] Foreign keys maintain integrity
|
||||
- [ ] Soft delete strategy (if needed)
|
||||
- [ ] Audit fields (if needed)
|
||||
|
||||
**Future-Proofing:**
|
||||
- [ ] Schema extensible
|
||||
- [ ] Migration path (if applicable)
|
||||
- [ ] Versioning strategy
|
||||
- [ ] No technical debt
|
||||
|
||||
---
|
||||
|
||||
## Common Pitfalls
|
||||
|
||||
**1. Modeling implementation, not domain**
|
||||
- Symptom: Entities like "UserSession", "Cache"
|
||||
- Fix: Model real-world concepts only
|
||||
|
||||
**2. God entities**
|
||||
- Symptom: User with 50+ attributes
|
||||
- Fix: Extract to separate entities
|
||||
|
||||
**3. Missing junction tables**
|
||||
- Symptom: M:N with FKs
|
||||
- Fix: Always use junction table
|
||||
|
||||
**4. Nullable FKs without reason**
|
||||
- Symptom: All relationships optional
|
||||
- Fix: NOT NULL unless truly optional
|
||||
|
||||
**5. Not enforcing invariants**
|
||||
- Symptom: Rules in docs only
|
||||
- Fix: CHECK constraints, triggers, app validation
|
||||
|
||||
**6. Premature denormalization**
|
||||
- Symptom: Duplicating without measurement
|
||||
- Fix: Normalize first, denormalize after profiling
|
||||
|
||||
**7. Wrong data types**
|
||||
- Symptom: Money as VARCHAR
|
||||
- Fix: DECIMAL for money, proper types for all
|
||||
|
||||
**8. No migration strategy**
|
||||
- Symptom: Can't change schema
|
||||
- Fix: Versioning, backward-compat changes
|
||||
Reference in New Issue
Block a user