gh-tordks-claude-workflow-p…/skills/claude-workflow/references/plan-spec.md
2025-11-30 09:02:26 +08:00


# Plan Document Specification
Specification for creating conformant plan documents in the CWF workflow.
---
## What is a Plan Document?
Plan documents capture **architectural context and design rationale**. They preserve WHY decisions were made and WHAT the solution is, enabling implementation across sessions after context has been cleared.
**Plan = WHY/WHAT** | Tasklist = WHEN/HOW
---
## Conformance
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
> **Note:** See `SKILL.md` for the conformance levels (1-3) that tailor documentation depth.
---
## Core Plan Sections
Plan documents MUST include three core sections: Overview, Solution Design, and Implementation Strategy.
### Section 1: Overview
Provides high-level summary of problem and solution.
**MUST include:**
- Problem statement (current pain point or gap)
- Feature purpose (solution being built)
- Scope (what is IN and what is OUT of scope)
**SHOULD include:**
- Success criteria (quantifiable completion validation)
**Example (Informative):**
```markdown
## Overview
### Problem
Users currently search documentation by manually scanning files or using basic text search. This is slow (10+ minutes per search) and misses relevant documents that use different terminology. Support tickets show 40% of questions are about "how to find X in the docs."
### Purpose
Add keyword-based document search with relevance ranking. Users enter search terms and receive ranked results within 1 second, improving discoverability and reducing support load.
### Scope
**IN scope:**
- Keyword search with boolean AND/OR operators
- TF-IDF relevance ranking
- Result filtering by document type
- Search result caching
**OUT of scope:**
- Natural language queries ("find me information about...")
- Semantic/embedding-based search
- Advanced operators (NEAR, wildcards, regex)
### Success Criteria
- Users can search by keywords and receive ranked results
- Search completes in <100ms for 10,000 documents
- Results include documents even with terminology variations
- Test coverage >80% for core search logic
- Zero regressions in existing functionality
```
---
### Section 2: Solution Design
Documents the complete solution architecture and technical approach.
#### 2.1 System Architecture
**MUST include:**
- Component overview (logical pieces and their responsibilities)
- Project structure (file tree with operation markers)
**SHOULD include:**
- Component relationships (dependencies and communication patterns)
- Relationship to existing codebase (where feature fits, what it extends/uses)
**File Tree Format:**
File trees MUST use operation markers:
- `[CREATE]` for new files
- `[MODIFY]` for modified files
- `[REMOVE]` for removed files
- No marker for existing unchanged files
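**Marker check (Informative):** The marker rules above lend themselves to a simple automated check. The following is a minimal sketch, not part of the specification: the helper names `parse_tree_line` and `find_invalid_markers` are hypothetical, and it assumes each marker appears as an uppercase bracketed token at the end of a tree line.

```python
import re
from typing import List, Optional, Tuple

# The three markers defined above; any other bracketed token is reported.
VALID_MARKERS = {"CREATE", "MODIFY", "REMOVE"}

# Assumption: a marker is a bracketed word at the very end of the line.
MARKER_RE = re.compile(r"\[([A-Za-z]+)\]\s*$")

# Box-drawing characters and whitespace used to draw the tree.
TREE_CHARS = "│├└─ \t"


def parse_tree_line(line: str) -> Tuple[str, Optional[str]]:
    """Return (entry_name, marker); marker is None for unchanged files."""
    match = MARKER_RE.search(line)
    marker = match.group(1) if match else None
    body = line[: match.start()] if match else line
    return body.strip(TREE_CHARS), marker


def find_invalid_markers(tree: str) -> List[str]:
    """Return the tree lines whose marker is not CREATE, MODIFY, or REMOVE."""
    invalid = []
    for line in tree.splitlines():
        _, marker = parse_tree_line(line)
        if marker is not None and marker not in VALID_MARKERS:
            invalid.append(line.strip())
    return invalid
```

Lines without a marker are treated as existing unchanged files, so they are never reported. A lowercase token such as `[create]` is reported as invalid, which is a stricter reading than the rules above state explicitly.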
**Example (Informative):**
````markdown
### System Architecture
**Core Components:**
- **QueryParser:** Parses user search strings into structured queries (operators, quoted phrases)
- **DocumentIndexer:** Builds and maintains TF-IDF index from document corpus
- **QueryRanker:** Ranks documents against query using cosine similarity
- **SearchCache:** LRU cache for frequent queries
- **SearchAPI:** HTTP endpoint exposing search functionality
**Project Structure:**
```
src/
├── search/
│   ├── __init__.py          [CREATE]
│   ├── parser.py            [CREATE]
│   ├── indexer.py           [CREATE]
│   ├── ranker.py            [CREATE]
│   └── cache.py             [CREATE]
├── api/
│   └── search.py            [CREATE]
├── models/
│   └── document.py          [MODIFY]
└── tests/
    └── search/
        ├── test_parser.py   [CREATE]
        └── test_ranker.py   [CREATE]
```
**Component Relationships:**
- SearchAPI depends on QueryParser, SearchCache
- QueryRanker depends on DocumentIndexer
- SearchCache depends on QueryRanker
- All components use shared Document model
**Relationship to Existing Codebase:**
- Architectural layer: Service layer (alongside existing `src/api/` endpoints)
- Domain: Search functionality (new domain area)
- Extends: `BaseAPIHandler` pattern used throughout repository
- Uses: Existing `AuthMiddleware` for authentication
- Uses: Application `CacheManager` for result caching
- Follows: Repository's service-oriented architecture and dependency injection patterns
````
---
#### 2.2 Design Rationale
Documents reasoning behind structural and technical choices.
**MUST include:**
- Rationale for key design choices
**SHOULD include:**
- Alternatives considered and why not chosen
- Trade-offs accepted
**MAY include:**
- Constraints influencing decisions
- Principles or patterns applied
**Tip (Informative):** Format flexibly - inline rationale, comparison tables, or structured decision records all work. Focus on capturing WHY, not following a template.
**Example (Informative):**
```markdown
### Design Rationale
**Use TF-IDF with cosine similarity for ranking**
Well-understood algorithm with predictable behavior. No training data or ML infrastructure required.
Alternatives considered:
- BM25: Marginal improvement for our corpus size, added complexity not justified
- Neural/embedding-based: Requires GPU, training data, model management - overkill for current needs
Trade-offs accepted:
- Pro: Fast to implement, predictable results, no infrastructure dependencies
- Con: Doesn't understand semantic similarity, sensitive to exact keyword matches
```
---
#### 2.3 Technical Specification
Describes runtime behavior and operational requirements.
**MUST include:**
- Dependencies (libraries, external systems)
- Runtime behavior (algorithms, execution flow, state management)
**MAY include:**
- Error handling (failure detection and recovery)
- Configuration needs (runtime or deployment settings)
**Example (Informative):**
````markdown
### Technical Specification
**Dependencies:**
Required libraries (new):
- scikit-learn 1.3+ (TF-IDF vectorization, cosine similarity)
- nltk 3.8+ (text preprocessing, stopword removal)
Required systems:
- PostgreSQL (stores `documents` table)
- Redis (event stream for `document_updated` events)
- InfluxDB (search metrics and monitoring)
Existing (from project):
- FastAPI 0.100+ (API framework)
- SQLAlchemy 2.0+ (database ORM)
- pytest 7.4+ (testing framework)
**Runtime Behavior:**
1. Parse query → structured query (operators, phrases)
2. Check cache (LRU, 1000 entries)
3. On cache miss: vectorize query, compute cosine similarity, rank results
4. Return paginated results (25 per page)
**Error Handling:**
Invalid Input:
- Empty query → 400 "Query cannot be empty"
- Invalid operators → 400 "Invalid syntax: [specific error]"
- Query too long (>500 chars) → 400 "Query exceeds maximum length"
Runtime Errors:
- Index not ready → 503 "Search index is building, retry in [X] seconds"
- Timeout (>5s) → 408 "Query timeout, try simplifying search terms"
- No results found → 200 with empty list (not an error)
System Errors:
- Database unavailable → 500, log error, alert on-call
- Index corruption → Rebuild from database, log incident
**Configuration:**
```python
SEARCH_INDEX_PATH = "/data/search-index.pkl"
SEARCH_CACHE_SIZE = 1000
SEARCH_TIMEOUT_MS = 5000
```
````
---
### Section 3: Implementation Strategy
Describes high-level approach guiding phase and task structure.
**MUST include:**
- Development approach (incremental, outside-in, vertical slice, bottom-up, etc.)
**SHOULD include:**
- Testing approach (test-driven, integration-focused, comprehensive, etc.)
- Risk mitigation strategy (tackle unknowns first, safe increments, prototype early, etc.)
- Checkpoint strategy (quality and validation operations at phase boundaries)
The strategy SHOULD explain WHY the tasklist is structured as it is.
**MUST NOT include:**
- Step-by-step execution instructions or task checklists
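**Checklist detection (Informative):** One rough way to catch task checklists that have leaked into the strategy is to look for Markdown task list items. This is a heuristic sketch only: the function name `find_task_items` is hypothetical, and it assumes checklists are written with `- [ ]` or `- [x]` style checkboxes. It will not detect step-by-step instructions written as ordinary prose or numbered lists.

```python
import re
from typing import List, Tuple

# Task list item: a "-", "*" or "+" bullet followed by "[ ]" or "[x]" and text.
# File tree markers such as "[CREATE]" do not match this pattern.
TASK_ITEM_RE = re.compile(r"^\s*[-*+]\s+\[[ xX]\]\s+\S")


def find_task_items(strategy_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like checkboxes."""
    hits = []
    for number, line in enumerate(strategy_text.splitlines(), start=1):
        if TASK_ITEM_RE.match(line):
            hits.append((number, line.strip()))
    return hits
```

An empty result does not prove the section is conformant; it only rules out the most obvious form of checklist.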
**Example (Informative):**
```markdown
## Implementation Strategy
### Development Approach
**Incremental with Safe Checkpoints**
Build bottom-up with validation at each layer:
1. **Foundation First:** Core search components (indexer, ranker) before API
2. **Runnable Increments:** Each phase produces working, testable code
3. **Early Validation:** Algorithm performance validated early before building around it
### Testing Approach
Integration-focused with targeted unit tests:
- Unit tests for complex logic (parsing, scoring)
- Integration tests for component interactions
- E2E tests for critical user flows
### Checkpoint Strategy
Each phase ends with mandatory validation before proceeding:
- Self-review: Agent reviews implementation against phase deliverable
- Code quality: Linting and formatting with ruff
- Code complexity: Complexity check with Radon
These checkpoints ensure AI-generated code meets project standards before continuing to next phase.
```
**Note (Informative):** Checkpoint types are project-specific. Use only tools your project already has. If the project doesn't use linting or complexity analysis, omit those checkpoints.
---
## Context Independence
Plans MUST be self-contained. Implementation may occur in fresh sessions after context has been cleared, so all architectural decisions and their rationale MUST be recorded in the plan document itself.
---
## Validation
Plans are conformant when they:
- Include all three core sections with required content
- Contain all three Solution Design subsections
- Use file tree markers correctly
- Document WHY for design decisions
- Are self-contained (no assumed conversation context)
- Contain no step-by-step execution instructions
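
**Structural check (Informative):** The first two criteria can be approximated mechanically by looking for the expected headings. The sketch below is an assumption-laden illustration, not a normative validator: the function name `missing_headings` is hypothetical, and it assumes plans use the exact heading titles shown in this document's examples. Headings inside fenced code blocks are ignored so that embedded examples do not count.

```python
import re
from typing import List, Optional

# Assumed heading titles: three core sections plus three Solution Design
# subsections. A plan that words its headings differently will be flagged.
REQUIRED_HEADINGS = [
    "Overview",
    "Solution Design",
    "System Architecture",
    "Design Rationale",
    "Technical Specification",
    "Implementation Strategy",
]

HEADING_RE = re.compile(r"^#{1,6}\s+(.+?)\s*$")
FENCE_RE = re.compile(r"^\s*(`{3,}|~{3,})(.*)$")


def missing_headings(plan_text: str) -> List[str]:
    """Return required heading titles not found outside fenced code blocks."""
    found = set()
    open_fence: Optional[str] = None  # fence string that opened the current block
    for line in plan_text.splitlines():
        fence = FENCE_RE.match(line)
        if fence:
            marker, rest = fence.group(1), fence.group(2)
            if open_fence is None:
                open_fence = marker
            elif (
                marker[0] == open_fence[0]
                and len(marker) >= len(open_fence)
                and not rest.strip()
            ):
                # A closing fence uses the same character, is at least as
                # long as the opener, and carries no info string.
                open_fence = None
            continue
        if open_fence is None:
            heading = HEADING_RE.match(line)
            if heading:
                found.add(heading.group(1))
    return [title for title in REQUIRED_HEADINGS if title not in found]
```

An empty list means only that the headings exist. Whether each section contains its required content, documents WHY, and is self-contained still needs review by a reader.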