Initial commit
This commit is contained in:
490
skills/spec-author/guides/data-model.md
Normal file
490
skills/spec-author/guides/data-model.md
Normal file
@@ -0,0 +1,490 @@
|
||||
# How to Create a Data Model Specification
|
||||
|
||||
Data Model specifications document the entities, fields, relationships, and constraints for your application's data. They define the "shape" of data your system works with.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# 1. Create a new data model
|
||||
scripts/generate-spec.sh data-model data-001-descriptive-slug
|
||||
|
||||
# 2. Open and fill in the file
|
||||
# (The file will be created at: docs/specs/data-model/data-001-descriptive-slug.md)
|
||||
|
||||
# 3. Fill in entities and relationships, then validate:
|
||||
scripts/validate-spec.sh docs/specs/data-model/data-001-descriptive-slug.md
|
||||
|
||||
# 4. Fix issues and check completeness:
|
||||
scripts/check-completeness.sh docs/specs/data-model/data-001-descriptive-slug.md
|
||||
```
|
||||
|
||||
## When to Write a Data Model
|
||||
|
||||
Use a Data Model when you need to:
|
||||
- Define database schema for new features
|
||||
- Document entity relationships and constraints
|
||||
- Establish consistent naming conventions
|
||||
- Enable API/UI teams to understand data structure
|
||||
- Plan data migrations or refactoring
|
||||
- Document complex data relationships
|
||||
|
||||
## Research Phase
|
||||
|
||||
### 1. Research Related Specifications
|
||||
Find what this data model supports:
|
||||
|
||||
```bash
|
||||
# Find technical requirements this fulfills
|
||||
grep -r "prd\|technical" docs/specs/ --include="*.md"
|
||||
|
||||
# Find existing data models that might be related
|
||||
grep -r "data\|model" docs/specs/ --include="*.md"
|
||||
|
||||
# Find API contracts that expose this data
|
||||
grep -r "api\|endpoint" docs/specs/ --include="*.md"
|
||||
```
|
||||
|
||||
### 2. Review Existing Data Models
|
||||
- What data modeling patterns does your codebase use?
|
||||
- What database are you using (PostgreSQL, MongoDB, etc.)?
|
||||
- How are relationships currently modeled?
|
||||
- Naming conventions for fields and entities?
|
||||
- Any legacy schema patterns to respect or migrate from?
|
||||
|
||||
### 3. Research Domain Models
|
||||
- How do industry-standard models structure similar data?
|
||||
- Are there existing standards (e.g., ISO, RFC) you should follow?
|
||||
- What are best practices in this domain?
|
||||
|
||||
### 4. Understand Business Rules
|
||||
- What constraints must the data satisfy?
|
||||
- What are the cardinality rules (one-to-many, many-to-many)?
|
||||
- What data must be unique or required?
|
||||
- What's the expected scale/volume?
|
||||
|
||||
## Structure & Content Guide
|
||||
|
||||
### Title & Metadata
|
||||
- **Title**: "User Data Model", "Transaction Model", etc.
|
||||
- **Scope**: What entities does this model cover?
|
||||
- **Version**: 1.0 for new models
|
||||
|
||||
### Overview Section
|
||||
Provide context:
|
||||
|
||||
```markdown
|
||||
# User & Profile Data Model
|
||||
|
||||
This data model defines the core entities for user management and profile
|
||||
information. Covers user accounts, authentication data, and user preferences.
|
||||
|
||||
**Entities**: User, UserProfile, UserPreference
|
||||
**Relationships**: User → UserProfile (1:1), User → UserPreference (1:many)
|
||||
**Primary Database**: PostgreSQL
|
||||
```
|
||||
|
||||
### Entity Definitions Section
|
||||
|
||||
Document each entity/table:
|
||||
|
||||
#### Entity: User
|
||||
|
||||
```markdown
|
||||
### User
|
||||
|
||||
Core user account entity. Every user must have exactly one User record.
|
||||
|
||||
**Purpose**: Represents a user account in the system.
|
||||
|
||||
**Fields**
|
||||
|
||||
| Field | Type | Required | Unique | Default | Description |
|
||||
|-------|------|----------|--------|---------|-------------|
|
||||
| id | UUID | Yes | Yes | auto | Primary key, auto-generated |
|
||||
| email | String(255) | Yes | Yes | - | User's email address, used for login |
|
||||
| password_hash | String(255) | Yes | No | - | Bcrypt hash of password (cost=12) |
|
||||
| first_name | String(100) | No | No | - | User's first name |
|
||||
| last_name | String(100) | No | No | - | User's last name |
|
||||
| status | Enum | Yes | No | active | Account status: active, inactive, suspended |
|
||||
| created_at | Timestamp | Yes | No | now() | Account creation time (UTC) |
|
||||
| updated_at | Timestamp | Yes | No | now() | Last update time (UTC) |
|
||||
| deleted_at | Timestamp | No | No | NULL | Soft-delete timestamp, NULL if active |
|
||||
|
||||
**Indexes**
|
||||
- Primary: `email` (unique for quick lookups)
|
||||
- Secondary: `created_at` (for user listing/pagination)
|
||||
- Secondary: `status` (for filtering active users)
|
||||
|
||||
**Constraints**
|
||||
- Email format must be valid (enforced in application)
|
||||
- Password must be at least 8 characters (enforced in application)
|
||||
- Email must be globally unique
|
||||
- Status can only be: active, inactive, suspended
|
||||
|
||||
**Data Volume**
|
||||
- Expected growth: 100 new users/day
|
||||
- Estimated year 1: ~36k users
|
||||
- Estimated year 3: ~150k users
|
||||
|
||||
**Archival Strategy**
|
||||
- Deleted users (deleted_at != NULL) moved to archive after 1 year
|
||||
- Soft deletes used for data recovery capability
|
||||
```
|
||||
|
||||
#### Entity: UserProfile
|
||||
|
||||
```markdown
|
||||
### UserProfile
|
||||
|
||||
Extended user profile information. One-to-one relationship with User.
|
||||
|
||||
**Purpose**: Stores optional user profile information separate from core account.
|
||||
|
||||
**Fields**
|
||||
|
||||
| Field | Type | Required | Unique | Description |
|
||||
|-------|------|----------|--------|-------------|
|
||||
| id | UUID | Yes | Yes | Primary key |
|
||||
| user_id | UUID (FK) | Yes | Yes | Foreign key to User.id |
|
||||
| avatar_url | String(500) | No | No | URL to user's avatar image |
|
||||
| bio | String(500) | No | No | User bio/description |
|
||||
| phone | String(20) | No | Yes | User phone number |
|
||||
| timezone | String(50) | No | No | User's timezone (e.g., America/New_York) |
|
||||
| language | String(5) | No | No | Preferred language (ISO 639-1, e.g., en, fr) |
|
||||
| theme | Enum | No | No | UI theme preference: light, dark, auto |
|
||||
| created_at | Timestamp | Yes | No | Creation time |
|
||||
| updated_at | Timestamp | Yes | No | Last update time |
|
||||
|
||||
**Indexes**
|
||||
- Primary: `user_id` (unique for 1:1 relationship)
|
||||
|
||||
**Constraints**
|
||||
- Foreign key: user_id references User(id) ON DELETE CASCADE
|
||||
- Phone must be valid format (if provided)
|
||||
- Timezone must be valid (e.g., from IANA timezone database)
|
||||
- Language must be valid ISO 639-1 code
|
||||
- Theme must be one of: light, dark, auto
|
||||
|
||||
**Notes**
|
||||
- Soft-deleted with parent User (CASCADE delete)
|
||||
- Profile is optional - some users may not have profile data
|
||||
```
|
||||
|
||||
#### Entity: UserPreference
|
||||
|
||||
```markdown
|
||||
### UserPreference
|
||||
|
||||
Key-value preferences for users. Flexible schema for future preference types.
|
||||
|
||||
**Purpose**: Stores user preferences without requiring schema changes.
|
||||
|
||||
**Fields**
|
||||
|
||||
| Field | Type | Required | Unique | Description |
|
||||
|-------|------|----------|--------|-------------|
|
||||
| id | UUID | Yes | Yes | Primary key |
|
||||
| user_id | UUID (FK) | Yes | No | Foreign key to User.id |
|
||||
| preference_key | String(100) | Yes | No | Preference identifier (e.g., notifications_email) |
|
||||
| preference_value | String(1000) | Yes | No | Preference value as string |
|
||||
| created_at | Timestamp | Yes | No | Creation time |
|
||||
| updated_at | Timestamp | Yes | No | Last update time |
|
||||
|
||||
**Indexes**
|
||||
- Composite: `(user_id, preference_key)` - For efficient preference lookup
|
||||
- Primary: `user_id` - For finding all preferences for a user
|
||||
|
||||
**Constraints**
|
||||
- Foreign key: user_id references User(id) ON DELETE CASCADE
|
||||
- Composite unique: `(user_id, preference_key)` - One preference per key per user
|
||||
- preference_key must match pattern: `[a-z_]+` (lowercase letters and underscores only)
|
||||
- preference_value must be valid JSON or simple string
|
||||
|
||||
**Valid Preferences**
|
||||
Examples of preference_key values:
|
||||
- `notifications_email` → "true"/"false"
|
||||
- `notifications_sms` → "true"/"false"
|
||||
- `export_format` → "csv"/"json"
|
||||
- `ui_columns_per_page` → "20"/"50"/"100"
|
||||
|
||||
**Notes**
|
||||
- Flexible key-value design allows adding preferences without schema changes
|
||||
- Values stored as strings for flexibility, parsed by application layer
|
||||
```
|
||||
|
||||
### Relationships Section
|
||||
|
||||
Document how entities relate:
|
||||
|
||||
```markdown
|
||||
## Entity Relationships
|
||||
|
||||
```
|
||||
┌───────────┐ ┌──────────────┐ ┌─────────────┐
|
||||
│ User │ │ UserProfile │ │ UserPref │
|
||||
├───────────┤ ├──────────────┤ ├─────────────┤
|
||||
│ id (PK) │ │ id (PK) │ │ id (PK) │
|
||||
│ email │◄───1:1──│ user_id (FK) │ │ user_id(FK) │
|
||||
│ ... │ │ avatar_url │ │ pref_key │
|
||||
└───────────┘ │ ... │ │ pref_value │
|
||||
└──────────────┘ └─────────────┘
|
||||
▲
|
||||
│
|
||||
1:many
|
||||
```
|
||||
|
||||
### Relationship: User → UserProfile (1:1)
|
||||
- **Type**: One-to-One
|
||||
- **Foreign Key**: UserProfile.user_id → User.id
|
||||
- **Cardinality**: A User has exactly one UserProfile; a UserProfile belongs to exactly one User
|
||||
- **Delete Behavior**: CASCADE - Deleting User deletes UserProfile
|
||||
- **Optional**: UserProfile is optional (some users may not have detailed profile)
|
||||
|
||||
### Relationship: User → UserPreference (1:many)
|
||||
- **Type**: One-to-Many
|
||||
- **Foreign Key**: UserPreference.user_id → User.id
|
||||
- **Cardinality**: A User can have many UserPreferences; each UserPreference belongs to one User
|
||||
- **Delete Behavior**: CASCADE - Deleting User deletes all preferences
|
||||
- **Optional**: A User can have zero preferences
|
||||
```
|
||||
|
||||
### Constraints & Validation Section
|
||||
|
||||
```markdown
|
||||
## Data Constraints & Validation
|
||||
|
||||
### Business Logic Constraints
|
||||
- Users cannot have duplicate emails (enforced at database + application)
|
||||
- User phone numbers must be unique if provided
|
||||
- Email and phone cannot both be deleted/NULL in UserProfile
|
||||
|
||||
### Data Integrity Rules
|
||||
- password_hash must never be exposed in API responses
|
||||
- deleted_at cannot be set retroactively (only forward through time)
|
||||
- updated_at must be >= created_at
|
||||
|
||||
### Referential Integrity
|
||||
- Foreign key constraints enforced at database level
|
||||
- Cascade deletes on User deletion
|
||||
- No orphaned UserProfile or UserPreference records
|
||||
|
||||
### Enumeration Values
|
||||
|
||||
**User.status**
|
||||
- `active` - Account is active
|
||||
- `inactive` - Account temporarily inactive
|
||||
- `suspended` - Account suspended (admin action)
|
||||
|
||||
**UserProfile.theme**
|
||||
- `light` - Light theme
|
||||
- `dark` - Dark theme
|
||||
- `auto` - Follow system settings
|
||||
|
||||
**UserPreference.preference_key**
|
||||
- Must match pattern: `[a-z_]+`
|
||||
- Examples: `notifications_email`, `export_format`, `ui_language`
|
||||
```
|
||||
|
||||
### Scaling Considerations Section
|
||||
|
||||
```markdown
|
||||
## Scaling & Performance
|
||||
|
||||
### Expected Data Volume
|
||||
- Users: 100-1000 per day growth
|
||||
- Preferences: ~5-10 per user on average
|
||||
- Year 1 estimate: 36k users, ~180k preference records
|
||||
|
||||
### Table Sizes
|
||||
- User table: ~36MB (estimated year 1)
|
||||
- UserProfile table: ~28MB
|
||||
- UserPreference table: ~22MB
|
||||
|
||||
### Query Patterns & Indexes
|
||||
- Find user by email: Indexed (UNIQUE index on email)
|
||||
- Find all preferences for user: Indexed (composite on user_id, pref_key)
|
||||
- List users by creation date: Indexed (on created_at)
|
||||
- Filter users by status: Indexed (on status)
|
||||
|
||||
### Optimization Notes
|
||||
- Composite index `(user_id, preference_key)` enables efficient preference lookups
|
||||
- Email index enables fast login queries
|
||||
- Consider partitioning UserPreference by user_id for very large scale (100M+ records)
|
||||
```
|
||||
|
||||
### Migration & Change Management Section
|
||||
|
||||
```markdown
|
||||
## Schema Evolution
|
||||
|
||||
### Creating These Tables
|
||||
```sql
|
||||
CREATE TABLE user (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
email VARCHAR(255) UNIQUE NOT NULL,
|
||||
password_hash VARCHAR(255) NOT NULL,
|
||||
first_name VARCHAR(100),
|
||||
last_name VARCHAR(100),
|
||||
status VARCHAR(50) DEFAULT 'active' NOT NULL,
|
||||
created_at TIMESTAMP DEFAULT now() NOT NULL,
|
||||
updated_at TIMESTAMP DEFAULT now() NOT NULL,
|
||||
deleted_at TIMESTAMP
|
||||
);
|
||||
|
||||
CREATE TABLE user_profile (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
user_id UUID UNIQUE NOT NULL REFERENCES user(id) ON DELETE CASCADE,
|
||||
avatar_url VARCHAR(500),
|
||||
bio VARCHAR(500),
|
||||
phone VARCHAR(20) UNIQUE,
|
||||
timezone VARCHAR(50),
|
||||
language VARCHAR(5),
|
||||
theme VARCHAR(20),
|
||||
created_at TIMESTAMP DEFAULT now() NOT NULL,
|
||||
updated_at TIMESTAMP DEFAULT now() NOT NULL
|
||||
);
|
||||
|
||||
CREATE TABLE user_preference (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
user_id UUID NOT NULL REFERENCES user(id) ON DELETE CASCADE,
|
||||
preference_key VARCHAR(100) NOT NULL,
|
||||
preference_value VARCHAR(1000) NOT NULL,
|
||||
created_at TIMESTAMP DEFAULT now() NOT NULL,
|
||||
updated_at TIMESTAMP DEFAULT now() NOT NULL,
|
||||
UNIQUE(user_id, preference_key)
|
||||
);
|
||||
|
||||
CREATE INDEX idx_user_email ON user(email);
|
||||
CREATE INDEX idx_user_created_at ON user(created_at);
|
||||
CREATE INDEX idx_user_status ON user(status);
|
||||
CREATE INDEX idx_preference_lookup ON user_preference(user_id, preference_key);
|
||||
```
|
||||
|
||||
### Future Migrations
|
||||
- Q2 2024: Add `last_login_at` to User (nullable, new index)
|
||||
- Q3 2024: Implement user archival (age > 1 year, no activity)
|
||||
```
|
||||
|
||||
### Documentation & Examples Section
|
||||
|
||||
```markdown
|
||||
## Example Queries
|
||||
|
||||
### Find user by email
|
||||
```sql
|
||||
SELECT * FROM user WHERE email = 'user@example.com';
|
||||
```
|
||||
|
||||
### Get user with profile
|
||||
```sql
|
||||
SELECT u.*, p.*
|
||||
FROM user u
|
||||
LEFT JOIN user_profile p ON u.id = p.user_id
|
||||
WHERE u.id = $1;
|
||||
```
|
||||
|
||||
### Get user's preferences
|
||||
```sql
|
||||
SELECT preference_key, preference_value
|
||||
FROM user_preference
|
||||
WHERE user_id = $1
|
||||
ORDER BY created_at DESC;
|
||||
```
|
||||
|
||||
### Archive old inactive users
|
||||
```sql
|
||||
UPDATE user
|
||||
SET deleted_at = now()
|
||||
WHERE status = 'inactive' AND updated_at < now() - interval '1 year'
|
||||
AND deleted_at IS NULL;
|
||||
```
|
||||
```
|
||||
|
||||
## Writing Tips
|
||||
|
||||
### Document Constraints Clearly
|
||||
- Why does each field have the constraints it does?
|
||||
- What validation rules apply?
|
||||
- What happens on constraint violations?
|
||||
|
||||
### Think About Scale
|
||||
- How much data will this table store?
|
||||
- What are the growth projections?
|
||||
- What indexing strategy is needed?
|
||||
- Will partitioning be needed in the future?
|
||||
|
||||
### Link to Related Specs
|
||||
- Reference technical requirements: `[PRD-001]`
|
||||
- Reference API contracts: `[API-001]` (what data is exposed)
|
||||
- Reference design documents: `[DES-001]`
|
||||
|
||||
### Include Examples
|
||||
- Sample SQL for common queries
|
||||
- Sample JSON representations
|
||||
- Example migration scripts
|
||||
|
||||
### Document Change Constraints
|
||||
- What fields can't change after creation?
|
||||
- What fields are immutable?
|
||||
- How do we handle schema evolution?
|
||||
|
||||
## Validation & Fixing Issues
|
||||
|
||||
### Run the Validator
|
||||
```bash
|
||||
scripts/validate-spec.sh docs/specs/data-model/data-001-your-spec.md
|
||||
```
|
||||
|
||||
### Common Issues & Fixes
|
||||
|
||||
**Issue**: "Missing entity field specifications"
|
||||
- **Fix**: Complete the fields table for each entity with types, constraints, descriptions
|
||||
|
||||
**Issue**: "No relationships documented"
|
||||
- **Fix**: Add a relationships section showing foreign keys and cardinality
|
||||
|
||||
**Issue**: "TODO items in Constraints (3 items)"
|
||||
- **Fix**: Complete constraint definitions, validation rules, and enumeration values
|
||||
|
||||
**Issue**: "No scaling or performance information"
|
||||
- **Fix**: Add data volume estimates, indexing strategy, and optimization notes
|
||||
|
||||
## Decision-Making Framework
|
||||
|
||||
As you write the data model, consider:
|
||||
|
||||
1. **Entity Design**: What entities do we need?
|
||||
- What are distinct concepts?
|
||||
- What are attributes vs. relationships?
|
||||
- Should data be normalized or denormalized?
|
||||
|
||||
2. **Relationships**: How do entities relate?
|
||||
- One-to-one, one-to-many, many-to-many?
|
||||
- Should relationships be required or optional?
|
||||
- How should deletions cascade?
|
||||
|
||||
3. **Constraints**: What rules must data satisfy?
|
||||
- Uniqueness constraints?
|
||||
- Required fields?
|
||||
- Data type restrictions?
|
||||
- Enumeration values?
|
||||
|
||||
4. **Performance**: How will data be queried?
|
||||
- What indexes are needed?
|
||||
- What's the expected scale?
|
||||
- Are there bottlenecks?
|
||||
|
||||
5. **Evolution**: How will this model change?
|
||||
- Can we add fields without migrations?
|
||||
- Can we add entities without breaking things?
|
||||
- How do we handle data migrations?
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Create the spec**: `scripts/generate-spec.sh data-model data-XXX-slug`
|
||||
2. **Define entities**: What are the main entities/tables?
|
||||
3. **Specify fields** with types, constraints, descriptions
|
||||
4. **Document relationships** between entities
|
||||
5. **Plan indexes** for performance
|
||||
6. **Validate**: `scripts/validate-spec.sh docs/specs/data-model/data-XXX-slug.md`
|
||||
7. **Share with team** before implementation
|
||||
Reference in New Issue
Block a user