Initial commit
skills/pre-dev-data-model/SKILL.md
@@ -0,0 +1,715 @@
---
name: pre-dev-data-model
description: |
  Gate 5: Data structures document - defines entities, relationships, and ownership
  before database technology selection. Large Track only.

trigger: |
  - API Design passed Gate 4 validation
  - System stores persistent data
  - Multiple entities with relationships
  - Large Track workflow (2+ day features)

skip_when: |
  - Small Track workflow → skip to Task Breakdown
  - No persistent data → skip to Dependency Map
  - API Design not validated → complete Gate 4 first

sequence:
  after: [pre-dev-api-design]
  before: [pre-dev-dependency-map]
---

# Data Modeling - Defining Data Structures

## Foundational Principle

**Data structures, relationships, and ownership must be defined before database technology selection.**

Jumping to database-specific schemas without modeling creates:
- Inconsistent data structures across services
- Unclear data ownership and authority
- Schema conflicts discovered during development
- Migration nightmares when requirements change
- Performance issues from poor data design

**The Data Model answers**: WHAT data exists, HOW entities relate, WHO owns what data?
**The Data Model never answers**: WHICH database technology or HOW to implement storage.

## When to Use This Skill

Use this skill when:
- API Design has passed Gate 4 validation
- TRD has passed Gate 3 validation
- System stores persistent data
- Multiple entities with relationships
- Need clear data ownership boundaries
- Building data-intensive applications

## Mandatory Workflow

### Phase 1: Data Analysis (Inputs Required)
1. **Approved API Design** (Gate 4 passed) - contracts define data flows
2. **Approved TRD** (Gate 3 passed) - architecture components identified
3. **Approved Feature Map** (Gate 2 passed) - domains defined
4. **Approved PRD** (Gate 1 passed) - business requirements locked
5. **Extract entities** from PRD, Feature Map, and API contracts
6. **Identify relationships** between entities

### Phase 2: Data Modeling
1. **Define entities** (what data objects exist)
2. **Specify attributes** (what properties each entity has)
3. **Model relationships** (how entities connect)
4. **Assign ownership** (which component owns which data)
5. **Define constraints** (uniqueness, required fields, ranges)
6. **Plan data lifecycle** (creation, updates, deletion, archival)
7. **Design access patterns** (how data will be queried)
8. **Consider data quality** (validation, normalization)

### Phase 3: Gate 5 Validation
**MANDATORY CHECKPOINT** - Must pass before proceeding to Dependency Map:
- [ ] All entities are identified and defined
- [ ] Entity attributes are complete with types and constraints
- [ ] Relationships between entities are modeled
- [ ] Data ownership is explicitly assigned to components
- [ ] Primary identifiers are defined
- [ ] Unique constraints are specified
- [ ] Required vs. optional fields are clear
- [ ] Data lifecycle is documented
- [ ] Access patterns are identified
- [ ] No database-specific details (tables, indexes, SQL)
- [ ] No ORM or storage technology specifics

## Explicit Rules

### ✅ DO Include in Data Model
- Entity definitions (conceptual data objects)
- Attributes with types (string, number, boolean, date, etc.)
- Constraints (required, unique, ranges, patterns)
- Relationships (one-to-one, one-to-many, many-to-many)
- Data ownership (which component is authoritative)
- Primary identifiers (how entities are uniquely identified)
- Lifecycle rules (soft delete, archival, retention)
- Access patterns (how data will be queried)
- Data quality rules (validation, normalization)
- Referential integrity requirements

### ❌ NEVER Include in Data Model
- Database product names (PostgreSQL, MongoDB, Redis)
- Table names or collection names
- Index definitions
- SQL or query language specifics
- ORM frameworks (Prisma, TypeORM, SQLAlchemy)
- Storage engine specifics (InnoDB, MyISAM)
- Partitioning or sharding strategies (implementation detail)
- Replication or backup strategies
- Database-specific data types (JSONB, UUID, BIGSERIAL)

### Abstraction Rules
1. **Entity**: Say "User" not "users table"
2. **Attribute**: Say "emailAddress: String (email format)" not "email VARCHAR(255)"
3. **Relationship**: Say "User has many Orders" not "foreign key user_id"
4. **Identifier**: Say "Unique identifier" not "UUID primary key"
5. **Constraint**: Say "Must be unique" not "UNIQUE INDEX"

## Rationalization Table

| Excuse | Reality |
|--------|---------|
| "We know it's PostgreSQL, just use PG types" | Database choice comes later. Model abstractly now. |
| "Table design is data modeling" | Tables are implementation. Entities are concepts. Stay conceptual. |
| "We need indexes for performance" | Indexes are optimization. Model data first, optimize later. |
| "ORMs require specific schemas" | ORMs adapt to models. Don't let tooling drive design. |
| "Foreign keys define relationships" | Relationships exist conceptually. FKs are implementation. |
| "SQL examples help clarity" | Abstract models are clearer. SQL is implementation detail. |
| "NoSQL doesn't need relationships" | All systems have data relationships. Model them regardless of DB type. |
| "This is just an ERD" | An ERD is a visualization tool. The data model is broader (ownership, lifecycle, etc.). |
| "We can skip this for simple CRUD" | Even CRUD needs clear entity design. Don't skip. |
| "Microservices mean no relationships" | Services interact via data. Model entities per service. |

## Red Flags - STOP

If you catch yourself writing any of these in Data Model, **STOP**:

- Database product names (Postgres, MySQL, Mongo, Redis)
- SQL keywords (CREATE TABLE, ALTER TABLE, SELECT, JOIN)
- Database-specific types (SERIAL, JSONB, VARCHAR, TEXT)
- Index commands (CREATE INDEX, UNIQUE INDEX)
- ORM code (Prisma schema, TypeORM decorators)
- Storage details (partitioning, sharding, replication)
- Query optimization (EXPLAIN plans, index hints)
- Backup/recovery strategies

**When you catch yourself**: Replace DB detail with abstract concept. "users table" → "User entity"
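
The red-flag list above lends itself to a rough automated scan of a draft data model. The following is a minimal sketch, not part of the skill itself: the `RED_FLAGS` patterns and the `find_red_flags` helper are hypothetical names, and the patterns cover only a few of the listed items. SQL keywords and type names are matched case-sensitively here to reduce false positives on ordinary prose.

```python
import re

# Hypothetical patterns derived from the red-flag list; extend as needed.
RED_FLAGS = [
    ("database product", re.compile(r"\b(postgres(ql)?|mysql|mongo(db)?|redis)\b", re.IGNORECASE)),
    ("SQL keyword", re.compile(r"\b(CREATE TABLE|ALTER TABLE|SELECT|JOIN)\b")),
    ("database-specific type", re.compile(r"\b(SERIAL|JSONB|VARCHAR|TEXT)\b")),
    ("index command", re.compile(r"\b(CREATE INDEX|UNIQUE INDEX)\b")),
]


def find_red_flags(text):
    """Return (line_number, label, matched_text) for each red flag found."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RED_FLAGS:
            for match in pattern.finditer(line):
                findings.append((line_no, label, match.group(0)))
    return findings


if __name__ == "__main__":
    draft = "### Entity: User\nemail VARCHAR(255)\nStored in the users table in Postgres\n"
    for line_no, label, text in find_red_flags(draft):
        print(f"line {line_no}: {label}: {text}")
```

A scan like this only catches vocabulary; a reviewer still has to judge whether the model is conceptually abstract.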

## Gate 5 Validation Checklist

Before proceeding to Dependency Map, verify:

**Entity Completeness**:
- [ ] All entities from PRD/Feature Map are modeled
- [ ] Entity names are clear and consistent
- [ ] Each entity has defined purpose
- [ ] Entity boundaries align with component ownership (from TRD)

**Attribute Specification**:
- [ ] All attributes have types specified
- [ ] Required vs. optional is explicit
- [ ] Constraints are documented (unique, range, format)
- [ ] Default values are specified where relevant
- [ ] Computed/derived fields are identified

**Relationship Modeling**:
- [ ] All relationships between entities are documented
- [ ] Cardinality is specified (one-to-one, one-to-many, many-to-many)
- [ ] Optional vs. required relationships are clear
- [ ] Referential integrity needs are documented
- [ ] Circular dependencies are identified and resolved
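
One way to check the circular-dependency item is to treat required references between entities as a directed graph and look for a cycle. The sketch below assumes a hypothetical `references` mapping (entity name to the entities it requires) and a hypothetical `find_cycle` helper; it uses a plain depth-first search and returns the first cycle it meets.

```python
def find_cycle(references):
    """Return one cycle as a list of entity names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for target in references.get(node, []):
            if state.get(target, WHITE) == GREY:
                # Target is already on the current path: cycle found.
                return path[path.index(target):] + [target]
            if state.get(target, WHITE) == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for node in references:
        if state.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    # Hypothetical model in which Order and Payment require each other.
    print(find_cycle({"Order": ["Payment"], "Payment": ["Order"], "User": []}))
```

How a reported cycle is resolved (for example, by making one side of the relationship optional) remains a modeling decision.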

**Data Ownership**:
- [ ] Each entity is owned by exactly one component
- [ ] Read/write permissions are documented
- [ ] Cross-component data access is via APIs (from Gate 4)
- [ ] No shared database anti-pattern
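
The "exactly one owner" rule can be checked mechanically once the ownership map is available as rows. This is a small sketch under assumptions: `ownership_violations` is a hypothetical helper, and the rows are assumed to be `(entity, owning component)` pairs transcribed from the ownership map.

```python
from collections import defaultdict


def ownership_violations(entities, ownership_rows):
    """Map each entity that has zero or several owners to its sorted owner list."""
    owners = defaultdict(set)
    for entity, component in ownership_rows:
        owners[entity].add(component)
    return {
        entity: sorted(owners.get(entity, set()))
        for entity in entities
        if len(owners.get(entity, set())) != 1
    }


if __name__ == "__main__":
    rows = [
        ("User", "User Service"),
        ("Order", "Order Service"),
        ("Order", "Billing Service"),  # hypothetical second owner: a violation
    ]
    print(ownership_violations(["User", "Order", "Payment"], rows))
```

An empty result means every listed entity has a single authoritative owner; it says nothing about whether the chosen owner is the right one.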

**Data Quality**:
- [ ] Validation rules are specified
- [ ] Normalization level is appropriate
- [ ] Denormalization decisions are justified
- [ ] Data consistency strategy is defined (eventual vs. strong)

**Lifecycle Management**:
- [ ] Creation rules are documented
- [ ] Update patterns are defined
- [ ] Deletion strategy is specified (hard vs. soft delete)
- [ ] Archival and retention policies exist
- [ ] Audit trail needs are identified

**Access Patterns**:
- [ ] Primary access patterns are documented
- [ ] Query needs are identified (lookups, searches, aggregations)
- [ ] Write patterns are documented (create, update, delete frequencies)
- [ ] Consistency requirements are specified

**Technology Agnostic**:
- [ ] No database product names
- [ ] No SQL or NoSQL specifics
- [ ] No table/collection/index definitions
- [ ] Can implement in any database technology

**Gate Result**:
- ✅ **PASS**: All checkboxes checked → Proceed to Dependency Map
- ⚠️ **CONDITIONAL**: Remove DB specifics or add missing entities → Re-validate
- ❌ **FAIL**: Incomplete model or poor ownership → Rework
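
Because PASS requires every checkbox to be checked, the checklist state can be read straight from the Markdown. The sketch below assumes the `- [ ]` / `- [x]` task-list syntax used above; `checklist_status` and `gate_passes` are hypothetical helper names. It deliberately does not try to distinguish CONDITIONAL from FAIL, since that depends on reviewer judgment.

```python
import re

# Matches Markdown task-list items such as "- [ ] text" or "- [x] text".
CHECKBOX = re.compile(r"^\s*-\s\[([ xX])\]\s+(.*)$")


def checklist_status(markdown):
    """Return (number_checked, list_of_unchecked_item_texts)."""
    checked = 0
    unchecked = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if not match:
            continue
        if match.group(1) == " ":
            unchecked.append(match.group(2).strip())
        else:
            checked += 1
    return checked, unchecked


def gate_passes(markdown):
    """PASS only when at least one box exists and none is unchecked."""
    checked, unchecked = checklist_status(markdown)
    return checked > 0 and not unchecked


if __name__ == "__main__":
    doc = "- [x] No database product names\n- [ ] No SQL or NoSQL specifics\n"
    print(checklist_status(doc), gate_passes(doc))
```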

## Data Model Template

```markdown
# Data Model: [Project/Feature Name]

## Overview
- **API Design Reference**: [Link to Gate 4 API contracts]
- **TRD Reference**: [Link to Gate 3 TRD]
- **Feature Map Reference**: [Link to Gate 2 Feature Map]
- **Last Updated**: [Date]
- **Status**: Draft / Under Review / Approved

## Data Ownership Map

| Entity | Owning Component (from TRD) | Read Access | Write Access |
|--------|------------------------------|-------------|--------------|
| User | User Service | All components | User Service only |
| Order | Order Service | User, Payment, Fulfillment | Order Service only |
| Payment | Payment Service | Order, Billing | Payment Service only |
| Product | Catalog Service | All components | Catalog Service only |

**Principle**: Each entity has exactly ONE authoritative owner. Cross-component access via APIs only.

---

## Entity Definitions

### Entity: User

**Purpose**: Represents a system user account

**Owned By**: User Service (from TRD)

**Primary Identifier**: userId (Unique identifier, immutable)

**Attributes**:
| Attribute | Type | Required | Unique | Constraints | Description |
|-----------|------|----------|--------|-------------|-------------|
| userId | Identifier | Yes | Yes | Immutable, UUID format | Unique user identifier |
| email | EmailAddress | Yes | Yes | Valid email format, max 254 chars | Primary email |
| displayName | String | No | No | 3-50 chars, alphanumeric + spaces | Public name |
| passwordHash | String | Yes | No | Hashed value only, never store plain text | Authentication credential |
| accountStatus | UserStatus | Yes | No | One of: active, suspended, deleted, pending | Current status |
| emailVerified | Boolean | Yes | No | Default: false | Email verification status |
| createdAt | Timestamp | Yes | No | Immutable, ISO8601 | Account creation time |
| updatedAt | Timestamp | Yes | No | Auto-updated on changes | Last modification time |
| lastLoginAt | Timestamp | No | No | ISO8601 | Most recent login time |

**Constraints**:
- `email` must be unique across all users
- `displayName` is optional but recommended (prompt user during registration)
- `passwordHash` must never be returned in API responses
- `accountStatus` transitions: pending → active, active → suspended → active, active → deleted (final)

**Lifecycle**:
- **Creation**: Via `CreateUser` API operation (Gate 4 contract)
- **Updates**: Via `UpdateUserProfile`, `ChangePassword`, `UpdateStatus` operations
- **Deletion**: Soft delete (set `accountStatus = deleted`, retain data for 90 days)
- **Archival**: Hard delete after 90 days of soft delete

**Access Patterns**:
- Lookup by `userId` (primary pattern, most frequent)
- Lookup by `email` (login flow, unique constraint)
- List users by `accountStatus` (admin operations)
- Search by `displayName` (user search feature)

**Data Quality**:
- `email` is normalized to lowercase before storage
- `displayName` is trimmed of leading/trailing whitespace
- `passwordHash` uses industry-standard hashing (algorithm TBD in Dependency Map)

---

### Entity: Order

**Purpose**: Represents a customer order

**Owned By**: Order Service (from TRD)

**Primary Identifier**: orderId (Unique identifier, immutable)

**Attributes**:
| Attribute | Type | Required | Unique | Constraints | Description |
|-----------|------|----------|--------|-------------|-------------|
| orderId | Identifier | Yes | Yes | Immutable, UUID format | Unique order identifier |
| userId | Identifier | Yes | No | Must reference existing User | Customer who placed order |
| orderStatus | OrderStatus | Yes | No | One of: pending, confirmed, shipped, delivered, cancelled | Current status |
| totalAmount | MonetaryAmount | Yes | No | Non-negative, in smallest currency unit | Total order value |
| currency | CurrencyCode | Yes | No | ISO 4217 code | Currency for totalAmount |
| shippingAddress | Address | Yes | No | Valid address structure | Delivery destination |
| orderItems | List<OrderItem> | Yes | No | Min 1 item, max 100 items | Products in order |
| createdAt | Timestamp | Yes | No | Immutable, ISO8601 | Order creation time |
| updatedAt | Timestamp | Yes | No | Auto-updated on changes | Last modification time |

**Nested Types**:

#### OrderItem (embedded within Order)
| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| productId | Identifier | Yes | References Product entity |
| quantity | Integer | Yes | Min 1, max 999 |
| unitPrice | MonetaryAmount | Yes | Price per item at time of order |
| subtotal | MonetaryAmount | Yes | quantity × unitPrice |

#### Address (value object)
| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| street1 | String | Yes | Primary street address |
| street2 | String | No | Secondary address (apt, suite) |
| city | String | Yes | City name |
| state | String | Yes | State/province code |
| postalCode | String | Yes | Postal/ZIP code |
| country | String | Yes | ISO 3166-1 alpha-2 code |

**Relationships**:
- **User** (one-to-many): One User can have many Orders
- **Product** (via OrderItem): Order references Products (read-only, owned by Catalog Service)

**Constraints**:
- `totalAmount` must equal sum of all `orderItems[].subtotal` + shipping + tax
- `orderStatus` transitions: pending → confirmed → shipped → delivered
- `orderStatus` can go to cancelled from pending or confirmed only
- `orderItems` must contain at least 1 item
- All `productId` references must be valid at order creation time (validate via API)

**Lifecycle**:
- **Creation**: Via `CreateOrder` API operation
- **Updates**: Via `UpdateOrderStatus`, `CancelOrder` operations
- **Deletion**: Soft delete after 7 years (regulatory compliance)
- **Archival**: Orders never hard deleted (permanent record)

**Access Patterns**:
- Lookup by `orderId` (most frequent)
- List by `userId` (user order history)
- List by `orderStatus` (fulfillment workflows)
- Query by `createdAt` range (reporting, analytics)

**Data Quality**:
- `orderItems[].unitPrice` is snapshot at order time (price changes don't affect existing orders)
- `totalAmount` is computed and validated on creation
- `shippingAddress` is validated against address API before order creation

---

### Entity: [Another Entity]
[Same structure as above]

---

## Relationship Diagram

~~~
User (1) ──< has many >── (*) Order
                               │
                               │ contains
                               ↓
                          OrderItem (embedded)
                               │
                               │ references
                               ↓
                          Product (1)
                               │
                               │ belongs to
                               ↓
                          Category (1)

Payment (*) ──< processes >── (1) Order
~~~

**Legend**:
- `(1)`: One
- `(*)`: Many
- `──<`: One-to-many relationship
- `──`: One-to-one relationship
- Embedded: Data stored within parent entity

---

## Cross-Component Data Access

### User Service → Order Service

**Scenario**: User wants to view their order history

**Data Flow**:
1. User Service authenticates user (owns User entity)
2. User Service calls Order Service API: `GetOrdersByUserId(userId)`
3. Order Service returns order data (owns Order entity)
4. User Service enriches with user display name if needed

**Rules**:
- User Service does NOT access Order Service's data store directly
- All access via APIs (from Gate 4 contracts)
- Order Service is authoritative for Order data

---

### Order Service → Catalog Service

**Scenario**: Order creation needs product info

**Data Flow**:
1. Order Service receives `CreateOrder` request
2. Order Service calls Catalog Service API: `GetProduct(productId)`
3. Catalog Service returns product data (price, availability)
4. Order Service creates order with snapshot of product data

**Rules**:
- Order stores `productId` reference and price snapshot
- Catalog Service is authoritative for current Product data
- Order's product snapshot is immutable (historical record)

---

### [Another Cross-Component Access]
[Same structure]

---

## Data Consistency Strategy

### Strong Consistency (Immediate)
- User authentication (must be immediately consistent)
- Payment transactions (cannot tolerate eventual consistency)
- Inventory deductions (prevent overselling)

### Eventual Consistency (Acceptable Delay)
- User profile updates reflecting in analytics (delay OK)
- Order history in user dashboard (few seconds lag acceptable)
- Search indexes (brief staleness tolerable)

### Consistency Implementation
- Strong: Synchronous API calls with transactional guarantees
- Eventual: Async events with idempotent handlers

---

## Data Validation Rules

### User Entity Validation
- `email`: Must match RFC 5322 format
- `displayName`: No profanity (filter list: [reference])
- Password (validated before it is hashed into `passwordHash`): Must meet complexity requirements (min 12 chars, mixed case, numbers, symbols)

### Order Entity Validation
- `totalAmount`: Must match sum of items + fees
- `orderItems`: Each `productId` must exist in Catalog Service at order time
- `shippingAddress`: Must pass address validation API

### Cross-Entity Validation
- User must have `accountStatus = active` to create orders
- Products must have `availability > 0` to be added to orders

---

## Data Lifecycle Policies

### Retention Periods
| Entity | Active Period | Archive After | Delete After |
|--------|---------------|---------------|--------------|
| User | Until deleted | 90 days (soft delete) | 90 days after soft delete |
| Order | Permanent | N/A | Never (regulatory) |
| Payment | 7 years | 7 years | 10 years (compliance) |
| AuditLog | 1 year | 1 year | 5 years |

### Soft Delete Strategy
- **User**: Set `accountStatus = deleted`, retain data for GDPR compliance window
- **Order**: Never deleted (permanent financial record)
- **Payment**: Anonymize after 7 years (retain transaction, remove PII)

### Audit Trail Requirements
- **User**: Log all status changes, login attempts
- **Order**: Log all status transitions, modifications
- **Payment**: Log all transaction attempts, results

---

## Data Privacy & Compliance

### Personally Identifiable Information (PII)
| Entity | PII Fields | Handling |
|--------|-----------|----------|
| User | email, displayName | Encrypted at rest, GDPR right to deletion |
| Order | shippingAddress | Encrypted at rest, retention per policy |
| Payment | card details | Never stored (use tokenization) |

### GDPR Compliance
- Users can request data export (via `ExportUserData` API)
- Users can request deletion (soft delete with 90-day grace period)
- Consent tracking for marketing communications

### Data Encryption
- Sensitive fields encrypted at rest (algorithm TBD in Dependency Map)
- Encryption keys managed externally (key management TBD in Dependency Map)

---

## Access Pattern Analysis

### High-Frequency Patterns (Optimize for these)
1. **User lookup by ID**: `GetUser(userId)` - 1000 req/sec
2. **Order lookup by ID**: `GetOrder(orderId)` - 500 req/sec
3. **User's orders**: `GetOrdersByUserId(userId)` - 200 req/sec

### Medium-Frequency Patterns
4. **User lookup by email**: `GetUserByEmail(email)` - 50 req/sec (login)
5. **Orders by status**: `GetOrdersByStatus(status)` - 30 req/sec (fulfillment)

### Low-Frequency Patterns
6. **User search**: `SearchUsers(query)` - 5 req/sec (admin)
7. **Order reports**: `GetOrdersByDateRange(start, end)` - 1 req/hour (reporting)

**Optimization Notes** (for later, not now):
- High-frequency patterns need fast lookups (indexes, caching)
- Medium-frequency patterns need balanced design
- Low-frequency patterns can tolerate slower queries

---

## Data Quality Standards

### Normalization
- **User email**: Stored in lowercase, normalized form
- **Addresses**: Standardized format via address validation service
- **Phone numbers**: Stored in E.164 format

### Validation
- All inputs validated before storage
- Validation rules documented per attribute
- Failed validation returns clear error messages

### Data Integrity
- Referential integrity enforced (userId in Order must exist)
- Constraints enforced at model level (not just database)
- Data consistency checks in automated tests

---

## Migration & Evolution Strategy

### Schema Evolution
- **Additive changes**: Add optional fields (backward compatible)
- **Non-breaking**: Default values for new required fields
- **Breaking changes**: Versioned approach (v1 → v2 with migration)

### Data Migration
- Plan for zero-downtime migrations
- Backward and forward compatibility during migration
- Rollback strategy for failed migrations

### Versioning
- Entities can evolve over time
- Migration scripts documented (but not written here - that's implementation)
- Compatibility maintained during transitions

---

## Gate 5 Validation

**Validation Date**: [Date]
**Validated By**: [Person/team]

- [ ] All entities defined with complete attributes
- [ ] Relationships documented and valid
- [ ] Data ownership assigned to components
- [ ] Constraints and validation rules specified
- [ ] Lifecycle policies documented
- [ ] Access patterns identified
- [ ] No database-specific details included
- [ ] Ready for Dependency Map (Gate 6)

**Approval**: ☐ Approved | ☐ Needs Revision | ☐ Rejected
**Next Step**: Proceed to Dependency Map (`pre-dev-dependency-map`)
```
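
The status-transition constraints in the template's User and Order examples can be written down as lookup tables and checked without any storage technology. This is a minimal sketch: `USER_TRANSITIONS`, `ORDER_TRANSITIONS`, and `can_transition` are hypothetical names, and treating `delivered` and `cancelled` as terminal states is an assumption, since the template only marks `deleted` as final.

```python
# Allowed next states, transcribed from the template's example constraints.
USER_TRANSITIONS = {
    "pending": {"active"},
    "active": {"suspended", "deleted"},
    "suspended": {"active"},
    "deleted": set(),  # final, per the template
}

ORDER_TRANSITIONS = {
    "pending": {"confirmed", "cancelled"},
    "confirmed": {"shipped", "cancelled"},
    "shipped": {"delivered"},
    "delivered": set(),  # assumed terminal
    "cancelled": set(),  # assumed terminal
}


def can_transition(table, current, new):
    """Return True if moving from `current` to `new` is allowed by `table`."""
    return new in table.get(current, set())


if __name__ == "__main__":
    print(can_transition(ORDER_TRANSITIONS, "confirmed", "cancelled"))
    print(can_transition(ORDER_TRANSITIONS, "shipped", "cancelled"))
    print(can_transition(USER_TRANSITIONS, "deleted", "active"))
```

Keeping the rules as plain data keeps them conceptual: the same table can later back whatever validation mechanism the implementation phase selects.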

## Common Violations and Fixes

### Violation 1: Database-Specific Schema
❌ **Wrong**:
```sql
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT NOW()
);
```

✅ **Correct**:
```markdown
### Entity: User
**Attributes**:
| Attribute | Type | Required | Unique | Description |
|-----------|------|----------|--------|-------------|
| userId | Identifier | Yes | Yes | Unique user identifier |
| email | EmailAddress | Yes | Yes | Primary email address |
| createdAt | Timestamp | Yes | No | Account creation time |
```

### Violation 2: ORM-Specific Code
❌ **Wrong**:
```typescript
@Entity()
class User {
  @PrimaryGeneratedColumn('uuid')
  id: string;

  @Column({ unique: true })
  email: string;
}
```

✅ **Correct**:
```markdown
### Entity: User
**Primary Identifier**: userId (Unique identifier)
**Attributes**: userId, email, ...
**Constraints**: email must be unique
```

### Violation 3: Technology in Relationships
❌ **Wrong**:
```markdown
**Relationships**:
- Foreign key `user_id` references `users.id`
- Join table `user_roles` for many-to-many
```

✅ **Correct**:
```markdown
**Relationships**:
- User (one-to-many) Order: One user can have many orders
- User (many-to-many) Role: Users can have multiple roles, roles can be assigned to multiple users
```

## Confidence Scoring

Use this to adjust your interaction with the user:

```yaml
Confidence Factors:
  Entity Coverage: [0-30]
    - All entities modeled: 30
    - Most entities covered: 20
    - Significant gaps: 10

  Relationship Clarity: [0-25]
    - All relationships documented: 25
    - Most relationships clear: 15
    - Ambiguous connections: 5

  Data Ownership: [0-25]
    - Clear ownership boundaries: 25
    - Mostly clear with minor overlaps: 15
    - Unclear or contested: 5

  Constraint Completeness: [0-20]
    - All validation rules specified: 20
    - Common cases covered: 12
    - Minimal specification: 5

  Total: [0-100]

  Action:
    80+: Generate complete data model autonomously
    50-79: Present options for normalization/relationships
    <50: Ask clarifying questions about entity boundaries
```
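
As a worked example of the scoring above, the four factor scores add up to a total that selects one of the three actions. The sketch below is illustrative only: `MAX_POINTS` and `confidence_action` are hypothetical names, and the factor keys are snake_case renderings of the factor names in the table.

```python
# Maximum points per factor, taken from the scoring ranges above.
MAX_POINTS = {
    "entity_coverage": 30,
    "relationship_clarity": 25,
    "data_ownership": 25,
    "constraint_completeness": 20,
}


def confidence_action(scores):
    """Return (total, action) for a dict of factor scores."""
    for factor, maximum in MAX_POINTS.items():
        value = scores[factor]
        if not 0 <= value <= maximum:
            raise ValueError(f"{factor} must be between 0 and {maximum}, got {value}")
    total = sum(scores[factor] for factor in MAX_POINTS)
    if total >= 80:
        action = "Generate complete data model autonomously"
    elif total >= 50:
        action = "Present options for normalization/relationships"
    else:
        action = "Ask clarifying questions about entity boundaries"
    return total, action


if __name__ == "__main__":
    example = {
        "entity_coverage": 20,
        "relationship_clarity": 15,
        "data_ownership": 15,
        "constraint_completeness": 12,
    }
    print(confidence_action(example))
```

With the example scores (20 + 15 + 15 + 12 = 62), the total falls in the 50-79 band, so the action is to present options rather than generate the model autonomously.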

## Output Location

**Always output to**: `docs/pre-development/data-model/data-model-[feature-name].md`

## After Data Model Approval

1. ✅ Lock the data model - entity structure is now the reference
2. 🎯 Use data model as input for Dependency Map (next phase: `pre-dev-dependency-map`)
3. 🚫 Never add database specifics to data model retroactively
4. 📋 Keep data model technology-agnostic until Dependency Map

## Quality Self-Check

Before declaring Data Model complete, verify:
- [ ] All entities are defined with complete attributes
- [ ] Attribute types and constraints are specified
- [ ] Relationships are modeled with correct cardinality
- [ ] Data ownership is explicitly assigned
- [ ] Primary identifiers are defined
- [ ] Lifecycle policies are documented
- [ ] Access patterns are identified
- [ ] Validation rules are comprehensive
- [ ] Privacy/compliance needs are addressed
- [ ] Consistency strategy is defined
- [ ] Zero database-specific details (tables, SQL, indexes)
- [ ] Zero ORM or framework specifics
- [ ] Gate 5 validation checklist 100% complete

## The Bottom Line

**If you wrote SQL schemas or ORM code, delete it and model abstractly.**

Data modeling is conceptual. Period. No database products. No SQL. No ORMs.

Database technology goes in Dependency Map. That's the next phase. Wait for it.

Violating this separation means:
- You're locked into a database before evaluating alternatives
- Data model can't be reused across different database types
- You can't objectively compare relational vs. document vs. key-value
- Poor separation of concerns (conceptual vs. physical)

**Model the data. Stay abstract. Choose database later.**