Initial commit

Zhongwei Li
2025-11-30 08:59:22 +08:00
commit b2731247f4
13 changed files with 3454 additions and 0 deletions

.claude-plugin/plugin.json Normal file

@@ -0,0 +1,12 @@
{
"name": "axiom-system-archaeologist",
"description": "Deep architectural analysis of existing codebases through autonomous subagent-driven exploration - produces comprehensive documentation, C4 diagrams, subsystem catalogs, code quality assessments, refactoring priorities, and architect handover reports",
"version": "1.1.0",
"author": {
"name": "tachyon-beep",
"url": "https://github.com/tachyon-beep"
},
"skills": [
"./skills"
]
}

README.md Normal file

@@ -0,0 +1,3 @@
# axiom-system-archaeologist
Deep architectural analysis of existing codebases through autonomous subagent-driven exploration - produces comprehensive documentation, C4 diagrams, subsystem catalogs, code quality assessments, refactoring priorities, and architect handover reports

plugin.lock.json Normal file

@@ -0,0 +1,81 @@
{
"$schema": "internal://schemas/plugin.lock.v1.json",
"pluginId": "gh:tachyon-beep/skillpacks:plugins/axiom-system-archaeologist",
"normalized": {
"repo": null,
"ref": "refs/tags/v20251128.0",
"commit": "9b73880639989805a21f1d2d51ea7159bb6e4685",
"treeHash": "a0b025306866f14daa93bc823087f5e65612e1d897ffc9e193b3d463944dc4bc",
"generatedAt": "2025-11-28T10:28:31.152300Z",
"toolVersion": "publish_plugins.py@0.2.0"
},
"origin": {
"remote": "git@github.com:zhongweili/42plugin-data.git",
"branch": "master",
"commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
"repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
},
"manifest": {
"name": "axiom-system-archaeologist",
"description": "Deep architectural analysis of existing codebases through autonomous subagent-driven exploration - produces comprehensive documentation, C4 diagrams, subsystem catalogs, code quality assessments, refactoring priorities, and architect handover reports",
"version": "1.1.0"
},
"content": {
"files": [
{
"path": "README.md",
"sha256": "ada1fa9bf30914a3c122e03697acbe7a8e0dd63449a714951c6fd56685d7fee6"
},
{
"path": ".claude-plugin/plugin.json",
"sha256": "5152b8e739539e59369e22d5ed8bb0e8fdd784a89d462a0a025f2c0f753bd9e2"
},
{
"path": "skills/using-system-archaeologist/generating-architecture-diagrams.md",
"sha256": "423d02bee4ff4264627cf4d759d7f3c33e2f3182cc891fc8ca349252133e4273"
},
{
"path": "skills/using-system-archaeologist/assessing-code-quality.md",
"sha256": "c4c6e0be3e959e4e7b99122af7526a8357ee14ef6dd43d93915080ba27a15665"
},
{
"path": "skills/using-system-archaeologist/analyzing-unknown-codebases.md",
"sha256": "ac04cff9d0cc8e039874e29cb2c80ae6d34379442e6108382b1ecc1556e0ab20"
},
{
"path": "skills/using-system-archaeologist/creating-architect-handover.md",
"sha256": "49971ac205519096889b5dfbfc33c6eafd3acab4ffc46760c097417c2e499232"
},
{
"path": "skills/using-system-archaeologist/documenting-system-architecture.md",
"sha256": "d14029178e964dccac97ea700b0c15b9fc1272cce8260e41d83348a185aeed6e"
},
{
"path": "skills/using-system-archaeologist/validating-architecture-analysis.md",
"sha256": "b6b10f9c6e7aaa4a5154bbef727a5bcc645cf342d5b452e73453455bdbe95e3a"
},
{
"path": "skills/using-system-archaeologist/SKILL.md",
"sha256": "3bc877559c237a5012526b1e7abb28b68b80efc4893a81bd80609d05494e7e0a"
},
{
"path": "skills/using-system-archaeologist/.archive/README.md",
"sha256": "5395a43116c9843f0a59da4a7efc124f41050bc846b29cb86c71b3e881d077bc"
},
{
"path": "skills/using-system-archaeologist/.archive/untested/creating-architect-handover-v1-untested.md",
"sha256": "053ce73971205d9358caf022c6dee73b6737cd898c340f6c7835b4c587e6406f"
},
{
"path": "skills/using-system-archaeologist/.archive/untested/assessing-code-quality-v1-untested.md",
"sha256": "fd3b351dff4d31346313ebaae229a5762c36b1972c22ca3e1e0cb15bea70d1d4"
}
],
"dirSha256": "a0b025306866f14daa93bc823087f5e65612e1d897ffc9e193b3d463944dc4bc"
},
"security": {
"scannedAt": null,
"scannerVersion": null,
"flags": []
}
}

skills/using-system-archaeologist/.archive/README.md Normal file

@@ -0,0 +1,40 @@
# Archive of Untested Briefings
This directory contains briefings that were created without following the proper RED-GREEN-REFACTOR TDD cycle.
## Untested Briefings (v1)
Created: 2025-11-19
Created by: Maintenance workflow without behavioral testing
Violation: Writing-skills Iron Law - "NO SKILL WITHOUT A FAILING TEST FIRST"
### Files
1. **assessing-code-quality-v1-untested.md** (~400 lines)
- Content coverage: Complexity, duplication, code smells, maintainability, dependencies
- Problem: No baseline testing to verify agents follow guidance
- Use: Reference for content areas to cover in tested version
2. **creating-architect-handover-v1-untested.md** (~400 lines)
- Content coverage: Handover report generation, consultation patterns, architect integration
- Problem: No baseline testing to verify agents follow guidance
- Use: Reference for content areas to cover in tested version
## Purpose
These files are archived (not deleted) to:
- Track what content areas should be covered
- Compare tested vs. untested versions
- Document the improvement from proper TDD methodology
- Serve as reference when designing pressure scenarios
## Tested Versions
Properly tested versions (RED-GREEN-REFACTOR) will be created in the parent directory following:
1. **RED:** Baseline scenarios WITHOUT skill - document exact failures
2. **GREEN:** Write minimal skill addressing observed rationalizations
3. **REFACTOR:** Find loopholes, plug them, re-test until bulletproof
## Do Not Use
These untested files should NOT be used in production. They have not been validated through behavioral testing with subagents and may contain gaps, rationalizations, or ineffective guidance.

skills/using-system-archaeologist/assessing-code-quality.md Normal file

@@ -0,0 +1,411 @@
# Assessing Code Quality
## Purpose
Analyze code quality indicators beyond architecture to identify maintainability issues, code smells, and technical debt - produces a quality scorecard with actionable improvement recommendations.
## When to Use
- Coordinator delegates quality assessment after subsystem catalog completion
- Task specifies analyzing code quality in addition to architecture
- Need to identify refactoring priorities beyond structural concerns
- Output feeds into architect handover reports or improvement planning
## Core Principle: Evidence-Based Quality Assessment
**Good quality analysis identifies specific, actionable issues. Poor quality analysis makes vague claims about "bad code."**
Your goal: Provide evidence-based quality metrics with concrete examples and remediation guidance.
## Quality Analysis Dimensions
### 1. Code Complexity
**What to assess:**
- Function/method length (lines of code)
- Cyclomatic complexity (decision points)
- Nesting depth (indentation levels)
- Parameter count
**Evidence to collect:**
- Longest functions with line counts
- Functions with highest decision complexity
- Deeply nested structures (> 4 levels)
- Functions with >5 parameters
**Thresholds (guidelines, not rules):**
- Functions > 50 lines: Flag for review
- Cyclomatic complexity > 10: Consider refactoring
- Nesting > 4 levels: Simplification candidate
- Parameters > 5: Consider parameter object
**Example documentation:**
```markdown
### Complexity Concerns
**High complexity functions:**
- `src/api/order_processing.py:process_order()` - 127 lines, complexity ~15
- 8 nested if statements
- Handles validation, pricing, inventory, shipping in single function
- **Recommendation:** Extract validation, pricing, inventory, shipping into separate functions
- `src/utils/data_transform.py:transform_dataset()` - 89 lines, 7 parameters
- **Recommendation:** Create DatasetConfig object to replace parameter list
```
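These thresholds can be spot-checked without dedicated tooling. A minimal sketch, assuming Python sources and using only the standard library; `scan_complexity` and its thresholds are illustrative, not a required part of the workflow:
```python
import ast
import sys

def max_nesting(node, depth=0):
    """Return the deepest control-flow nesting level under node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        bump = isinstance(child, (ast.If, ast.For, ast.While, ast.With, ast.Try))
        deepest = max(deepest, max_nesting(child, depth + (1 if bump else 0)))
    return deepest

def scan_complexity(path, max_lines=50, max_params=5, max_depth=4):
    """Flag functions exceeding the length, parameter, or nesting thresholds above."""
    tree = ast.parse(open(path, encoding="utf-8").read())
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = (node.end_lineno or node.lineno) - node.lineno + 1
            params = len(node.args.args) + len(node.args.kwonlyargs)
            depth = max_nesting(node)
            if length > max_lines or params > max_params or depth > max_depth:
                print(f"{path}:{node.lineno} {node.name}() - "
                      f"{length} lines, {params} params, nesting depth {depth}")

if __name__ == "__main__":
    for source_path in sys.argv[1:]:
        scan_complexity(source_path)
```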
### 2. Code Duplication
**What to assess:**
- Repeated code blocks (copy-paste patterns)
- Similar functions with slight variations
- Duplicated logic across subsystems
**Evidence to collect:**
- Quote repeated code blocks (5+ lines)
- List functions with similar structure
- Note duplication percentage (if tool available)
**Analysis approach:**
1. Read representative files from each subsystem
2. Look for similar patterns, function structures
3. Note copy-paste indicators (similar variable names, comment duplication)
4. Assess if duplication is deliberate or accidental
**Example documentation:**
```markdown
### Duplication Concerns
**Copy-paste pattern in validation:**
- `src/api/users.py:validate_user()` (lines 45-67)
- `src/api/orders.py:validate_order()` (lines 89-111)
- `src/api/products.py:validate_product()` (lines 23-45)
All three functions follow identical structure:
1. Check required fields
2. Validate format with regex
3. Check database constraints
4. Return validation result
**Recommendation:** Extract common validation framework to `src/utils/validation.py`
```
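When no duplication tool is available, hashing fixed-size windows of normalized lines is a rough way to surface copy-paste candidates. A minimal sketch, assuming Python files and a 6-line window; it flags candidates for manual review rather than measuring duplication precisely:
```python
import sys
from collections import defaultdict

WINDOW = 6  # smallest block size worth reporting as possible duplication

def find_duplicates(paths):
    """Map each normalized WINDOW-line block to every (file, line) where it appears."""
    seen = defaultdict(list)
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            lines = [line.strip() for line in handle]
        for i in range(len(lines) - WINDOW + 1):
            block = "\n".join(lines[i:i + WINDOW])
            if block.strip():  # skip windows that are mostly blank
                seen[block].append((path, i + 1))
    return {block: locations for block, locations in seen.items() if len(locations) > 1}

if __name__ == "__main__":
    for block, locations in find_duplicates(sys.argv[1:]).items():
        places = ", ".join(f"{path}:{line}" for path, line in locations)
        print(f"Possible duplication ({WINDOW} lines): {places}")
```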
### 3. Code Smells
**Common smells to identify:**
**Long parameter lists:**
- Functions with >5 parameters
- Recommendation: Parameter object or builder pattern
**God objects/functions:**
- Classes with >10 methods
- Functions doing multiple unrelated things
- Recommendation: Single Responsibility Principle refactoring
**Magic numbers:**
- Hardcoded values without named constants
- Recommendation: Extract to configuration or named constants
**Dead code:**
- Commented-out code blocks
- Unused functions/classes (no references found)
- Recommendation: Remove or document why kept
**Shotgun surgery indicators:**
- Single feature change requires edits in 5+ files
- Indicates high coupling
- Recommendation: Improve encapsulation
**Example documentation:**
```markdown
### Code Smell Observations
**Magic numbers:**
- `src/services/cache.py`: Hardcoded timeout values (300, 3600, 86400)
- **Recommendation:** Extract to CacheConfig with named durations
**Dead code:**
- `src/legacy/` directory contains 15 files, no imports found in active code
- Last modified: 2023-06-15
- **Recommendation:** Archive or remove if truly unused
**Shotgun surgery:**
- Adding new payment method requires changes in:
- `src/api/payment.py`
- `src/models/transaction.py`
- `src/utils/validators.py`
- `src/services/notification.py`
- `config/payment_providers.json`
- **Recommendation:** Introduce payment provider abstraction layer
```
### 4. Maintainability Indicators
**What to assess:**
- Documentation coverage (docstrings, comments)
- Test coverage (if test files visible)
- Error handling patterns
- Logging consistency
**Evidence to collect:**
- Percentage of functions with docstrings
- Test file presence per module
- Error handling approaches (try/except, error codes, etc.)
- Logging statements (presence, consistency)
**Example documentation:**
```markdown
### Maintainability Assessment
**Documentation:**
- 12/45 functions (27%) have docstrings
- Public API modules better documented than internal utilities
- **Recommendation:** Add docstrings to all public functions, focus on "why" not "what"
**Error handling inconsistency:**
- `src/api/` uses exception raising
- `src/services/` uses error code returns
- `src/utils/` mixes both approaches
- **Recommendation:** Standardize on exceptions with custom exception hierarchy
**Logging:**
- Inconsistent log levels (some files use DEBUG for errors)
- No structured logging (difficult to parse)
- **Recommendation:** Adopt structured logging library, establish level conventions
```
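The docstring percentage can be computed mechanically rather than estimated. A minimal sketch, assuming Python sources; it counts functions and methods with docstrings via the standard `ast` module:
```python
import ast
import sys

def docstring_coverage(paths):
    """Return (documented, total) counts for functions and methods in paths."""
    documented = total = 0
    for path in paths:
        tree = ast.parse(open(path, encoding="utf-8").read())
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                total += 1
                if ast.get_docstring(node):
                    documented += 1
    return documented, total

if __name__ == "__main__":
    documented, total = docstring_coverage(sys.argv[1:])
    percent = 100 * documented / total if total else 0
    print(f"{documented}/{total} functions ({percent:.0f}%) have docstrings")
```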
### 5. Dependency Quality
**What to assess:**
- Coupling between subsystems
- Circular dependencies
- External dependency management
**Evidence from subsystem catalog:**
- Review "Dependencies - Outbound" sections
- Count dependencies per subsystem
- Identify bidirectional dependencies (A→B and B→A)
**Analysis approach:**
1. Use subsystem catalog dependency data
2. Count inbound/outbound dependencies per subsystem
3. Identify highly coupled subsystems (>5 dependencies)
4. Note circular dependency patterns
**Example documentation:**
```markdown
### Dependency Concerns
**High coupling:**
- `API Gateway` subsystem: 8 outbound dependencies (most in system)
- Depends on: Auth, User, Product, Order, Payment, Notification, Logging, Cache
- **Observation:** Acts as orchestrator, coupling may be appropriate
- **Recommendation:** Monitor for API Gateway becoming bloated
**Circular dependencies:**
- `User Service` ↔ `Notification Service`
- User triggers notifications, Notification updates user preferences
- **Recommendation:** Introduce event bus to break circular dependency
**Dependency concentration:**
- 6/10 subsystems depend on `Database Layer`
- Database Layer has no abstraction (direct SQL queries)
- **Recommendation:** Consider repository pattern to isolate database logic
```
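Circular dependencies can be confirmed from the catalog's outbound lists with a short depth-first search. A minimal sketch; the `deps` mapping is hypothetical data shaped like the catalog's Dependencies sections:
```python
def find_cycles(deps):
    """Return dependency cycles (lists of subsystem names) in an adjacency map."""
    cycles, stack, visited = [], [], set()

    def visit(node):
        if node in stack:
            cycles.append(stack[stack.index(node):] + [node])
            return
        if node in visited:
            return
        visited.add(node)
        stack.append(node)
        for target in deps.get(node, []):
            visit(target)
        stack.pop()

    for node in deps:
        visit(node)
    return cycles

# Hypothetical data shaped like the catalog's "Dependencies - Outbound" entries
deps = {
    "User Service": ["Notification Service", "Database Layer"],
    "Notification Service": ["User Service"],
    "API Gateway": ["User Service", "Notification Service"],
}
for cycle in find_cycles(deps):
    print(" -> ".join(cycle))
```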
## Output Contract
Write findings to workspace as `05-quality-assessment.md`:
```markdown
# Code Quality Assessment
**Analysis Date:** YYYY-MM-DD
**Scope:** [Subsystems analyzed]
**Methodology:** Static code review, pattern analysis
## Quality Scorecard
| Dimension | Rating | Issues by Severity | Evidence Count |
|-----------|--------|----------|----------------|
| Complexity | Medium | 3 High, 5 Medium | 8 functions flagged |
| Duplication | High | 2 Critical, 4 Medium | 6 patterns identified |
| Code Smells | Medium | 0 Critical, 7 Medium | 7 smells documented |
| Maintainability | Medium-Low | 1 Critical, 3 Medium | 4 concerns noted |
| Dependencies | Low | 1 Medium | 2 concerns noted |
**Overall Rating:** Medium - Several actionable improvements identified, no critical blockers
## Detailed Findings
### 1. Complexity Concerns
[List from analysis above]
### 2. Duplication Concerns
[List from analysis above]
### 3. Code Smell Observations
[List from analysis above]
### 4. Maintainability Assessment
[List from analysis above]
### 5. Dependency Concerns
[List from analysis above]
## Prioritized Recommendations
### Critical (Address Immediately)
1. [Issue with highest impact]
### High (Next Sprint)
2. [Important issues]
3. [Important issues]
### Medium (Next Quarter)
4. [Moderate issues]
5. [Moderate issues]
### Low (Backlog)
6. [Nice-to-have improvements]
## Methodology Notes
**Analysis approach:**
- Sampled [N] representative files across [M] subsystems
- Focused on [specific areas of concern]
- Did NOT use automated tools (manual review only)
**Limitations:**
- Sample-based (not exhaustive)
- No runtime analysis (static review only)
- Test coverage estimates based on file presence
- No quantitative complexity metrics (manual assessment)
**For comprehensive analysis, consider:**
- Running static analysis tools (ruff, pylint, mypy for Python)
- Measuring actual test coverage
- Profiling runtime behavior
- Security-focused code review
```
## Severity Rating Guidelines
**Critical:**
- Blocks core functionality or deployment
- Security vulnerability present
- Data corruption risk
- Examples: SQL injection, hardcoded credentials, unhandled exceptions in critical path
**High:**
- Significant maintainability impact
- High effort to modify or extend
- Frequent source of bugs
- Examples: God objects, extreme duplication, shotgun surgery patterns
**Medium:**
- Moderate maintainability concern
- Refactoring beneficial but not urgent
- Examples: Long functions, missing documentation, inconsistent error handling
**Low:**
- Minor quality improvement
- Cosmetic or style issues
- Examples: Magic numbers, verbose naming, minor duplication
## Integration with Architect Handover
Quality assessment feeds directly into `creating-architect-handover.md`:
1. Quality scorecard provides severity ratings
2. Prioritized recommendations become architect's action items
3. Code smells inform refactoring strategy
4. Dependency concerns guide architectural improvements
The architect handover briefing will synthesize architecture + quality into a comprehensive improvement plan.
## When to Skip Quality Assessment
**Optional scenarios:**
- User requested architecture-only analysis
- Extremely tight time constraints (< 2 hours total)
- Codebase is very small (< 1000 lines)
- Quality issues not relevant to stakeholder needs
**Document if skipped:**
```markdown
## Quality Assessment: SKIPPED
**Reason:** [Time constraints / Not requested / etc.]
**Recommendation:** Run focused quality review post-stakeholder presentation
```
## Systematic Analysis Checklist
```
[ ] Read subsystem catalog to understand structure
[ ] Sample 3-5 representative files per subsystem
[ ] Document complexity concerns (functions >50 lines, high nesting)
[ ] Identify duplication patterns (repeated code blocks)
[ ] Note code smells (god objects, magic numbers, dead code)
[ ] Assess maintainability (docs, tests, error handling)
[ ] Review dependencies from catalog (coupling, circular deps)
[ ] Rate severity for each finding (Critical/High/Medium/Low)
[ ] Prioritize recommendations by impact
[ ] Write to 05-quality-assessment.md following contract
[ ] Document methodology and limitations
```
## Success Criteria
**You succeeded when:**
- Quality assessment covers all 5 dimensions
- Each finding has concrete evidence (file paths, line numbers, examples)
- Severity ratings are justified
- Recommendations are specific and actionable
- Methodology and limitations documented
- Output written to 05-quality-assessment.md
**You failed when:**
- Vague claims without evidence ("code is messy")
- No severity ratings or priorities
- Recommendations are generic ("improve code quality")
- Missing methodology notes
- Skipped dimensions without documentation
## Common Mistakes
**❌ Analysis paralysis**
"Need to read every file" → Sample strategically, 20% coverage reveals patterns
**❌ Vague findings**
"Functions are too complex" → "process_order() is 127 lines with complexity ~15"
**❌ No prioritization**
Flat list of 50 issues → Prioritize by severity/impact, focus on Critical/High
**❌ Tool-dependent**
"Can't assess without linting tools" → Manual review reveals patterns, note as limitation
**❌ Perfectionism**
"Everything needs fixing" → Focus on high-impact issues, accept some technical debt
## Integration with Workflow
This briefing is typically invoked as:
1. **Coordinator** completes subsystem catalog (02-subsystem-catalog.md)
2. **Coordinator** (optionally) validates catalog
3. **Coordinator** writes task specification for quality assessment
4. **YOU** read subsystem catalog to understand structure
5. **YOU** perform systematic quality analysis (5 dimensions)
6. **YOU** write to 05-quality-assessment.md following contract
7. **Coordinator** proceeds to diagram generation or architect handover
**Your role:** Complement architectural analysis with code quality insights, providing evidence-based improvement recommendations.

skills/using-system-archaeologist/creating-architect-handover.md Normal file

@@ -0,0 +1,385 @@
# Creating Architect Handover
## Purpose
Generate handover reports for the axiom-system-architect plugin, enabling a seamless transition from analysis (archaeologist) to improvement planning (architect) - synthesizes architecture + quality findings into actionable assessment inputs.
## When to Use
- Coordinator completes architecture analysis and quality assessment
- User requests improvement recommendations or refactoring guidance
- Need to transition from "what exists" (archaeologist) to "what should change" (architect)
- Task specifies creating architect-ready outputs
## Core Principle: Analysis → Assessment Pipeline
**Archaeologist documents neutrally. Architect assesses critically. Handover bridges the two.**
```
Archaeologist  →  Handover   →  Architect  →  Improvements
(neutral docs)    (synthesis)   (critical)    (execution)
```
Your goal: Package archaeologist findings into architect-consumable format for assessment and prioritization.
## The Division of Labor
### Archaeologist (This Plugin)
**What archaeologist DOES:**
- Documents existing architecture (subsystems, diagrams, dependencies)
- Identifies quality concerns (complexity, duplication, smells)
- Marks confidence levels (High/Medium/Low)
- Stays neutral ("Here's what you have")
**What archaeologist does NOT do:**
- Critical assessment ("this is bad")
- Refactoring recommendations ("you should fix X first")
- Priority decisions ("security is more important than performance")
### Architect (axiom-system-architect Plugin)
**What architect DOES:**
- Critical quality assessment (direct, no diplomatic softening)
- Technical debt cataloging (structured, prioritized)
- Improvement roadmaps (risk-based, security-first)
- Refactoring strategy recommendations
**What architect does NOT do:**
- Neutral documentation (that's archaeologist's job)
- Implementation execution (future: project manager plugin)
### Handover (This Briefing)
**What handover DOES:**
- Synthesizes archaeologist outputs (architecture + quality)
- Formats findings for architect consumption
- Enables architect consultation (spawn as subagent)
- Bridges neutral documentation → critical assessment
## Output: Architect Handover Report
Create `06-architect-handover.md` in workspace:
```markdown
# Architect Handover Report
**Project:** [System name]
**Analysis Date:** YYYY-MM-DD
**Archaeologist Version:** [axiom-system-archaeologist version]
**Handover Purpose:** Enable architect assessment and improvement prioritization
---
## Executive Summary
**System scale:**
- [N] subsystems identified
- [M] subsystem dependencies mapped
- [X] architectural patterns observed
- [Y] quality concerns flagged
**Assessment readiness:**
- Architecture: [Fully documented / Partial coverage / etc.]
- Quality: [Comprehensive analysis / Sample-based / Not performed]
- Confidence: [Overall High/Medium/Low]
**Recommended architect workflow:**
1. Use `axiom-system-architect:assessing-architecture-quality` for critical assessment
2. Use `axiom-system-architect:identifying-technical-debt` to catalog debt items
3. Use `axiom-system-architect:prioritizing-improvements` for roadmap creation
---
## Archaeologist Deliverables
### Available Documents
- [x] `01-discovery-findings.md` - Holistic scan results
- [x] `02-subsystem-catalog.md` - Detailed subsystem documentation
- [x] `03-diagrams.md` - C4 diagrams (Context, Container, Component)
- [x] `04-final-report.md` - Multi-audience synthesis
- [x] `05-quality-assessment.md` - Code quality analysis (if performed)
- [ ] Additional views: [List if created]
### Key Findings Summary
**Architectural patterns identified:**
1. [Pattern 1] - Observed in: [Subsystems]
2. [Pattern 2] - Observed in: [Subsystems]
3. [Pattern 3] - Observed in: [Subsystems]
**Concerns flagged (from subsystem catalog):**
1. [Concern 1] - Subsystem: [Name], Severity: [Level]
2. [Concern 2] - Subsystem: [Name], Severity: [Level]
3. [Concern 3] - Subsystem: [Name], Severity: [Level]
**Quality issues identified (from quality assessment):**
1. [Issue 1] - Category: [Complexity/Duplication/etc.], Severity: [Critical/High/Medium/Low]
2. [Issue 2] - Category: [Category], Severity: [Level]
3. [Issue 3] - Category: [Category], Severity: [Level]
---
## Architect Input Package
### 1. Architecture Documentation
**Location:** `02-subsystem-catalog.md`, `03-diagrams.md`, `04-final-report.md`
**Usage:** Architect reads these to understand system structure before assessment
**Highlights for architect attention:**
- [Subsystem X]: [Why architect should review - complexity, coupling, etc.]
- [Subsystem Y]: [Specific concern flagged]
- [Dependency pattern]: [Circular dependencies, high coupling, etc.]
### 2. Quality Assessment
**Location:** `05-quality-assessment.md` (if performed)
**Usage:** Architect incorporates quality metrics into technical debt catalog
**Priority issues for architect:**
- **Critical:** [Issue requiring immediate attention]
- **High:** [Important issues from quality assessment]
- **Medium:** [Moderate concerns]
### 3. Confidence Levels
**Usage:** Architect knows which assessments are well-validated vs. tentative
| Subsystem | Confidence | Rationale |
|-----------|------------|-----------|
| [Subsystem A] | High | Well-documented, sampled 5 files |
| [Subsystem B] | Medium | Router provided list, sampled 2 files |
| [Subsystem C] | Low | Missing documentation, inferred structure |
**Guidance for architect:**
- High confidence areas: Proceed with detailed assessment
- Medium confidence areas: Consider deeper analysis before major recommendations
- Low confidence areas: Flag for additional investigation
### 4. Scope and Limitations
**What was analyzed:**
- [Scope description]
**What was NOT analyzed:**
- Runtime behavior (static analysis only)
- Security vulnerabilities (not performed)
- Performance profiling (not available)
- [Other limitations]
**Guidance for architect:**
- Recommendations should acknowledge analysis limitations
- Security assessment may require dedicated review
- Performance concerns should be validated with profiling
---
## Architect Consultation Pattern
### Option A: Handover Only (Document-Based)
**When to use:**
- User will engage architect separately
- Archaeologist completes, then user decides next steps
- Asynchronous workflow preferred
**What to do:**
1. Create this handover report (06-architect-handover.md)
2. Inform user: "Handover report ready for axiom-system-architect"
3. User decides when/how to engage architect
### Option B: Integrated Consultation (Subagent-Based)
**When to use:**
- User requests immediate improvement recommendations
- Integrated archaeologist + architect workflow
- User says: "What should we fix?" or "Assess the architecture"
**What to do:**
1. **Complete handover report first** (this document)
2. **Spawn architect as consultant subagent:**
```
I'll consult with the system architect to assess the architecture and provide improvement recommendations.
[Use Task tool with subagent_type='general-purpose']
Task: "Use the axiom-system-architect plugin to assess the architecture documented in [workspace-path].
Context:
- Architecture analysis is complete
- Handover report available at [workspace-path]/06-architect-handover.md
- Key deliverables: [list deliverables]
- Primary concerns: [top 3-5 concerns from analysis]
Your task:
1. Read the handover report and referenced documents
2. Use axiom-system-architect:assessing-architecture-quality for critical assessment
3. Use axiom-system-architect:identifying-technical-debt to catalog debt
4. Use axiom-system-architect:prioritizing-improvements for roadmap
Deliverables:
- Architecture quality assessment
- Technical debt catalog
- Prioritized improvement roadmap
IMPORTANT: Follow architect skills rigorously - maintain professional discipline, no diplomatic softening, security-first prioritization."
```
3. **Synthesize architect outputs** when subagent returns
4. **Present to user:**
- Architecture assessment (from architect)
- Technical debt catalog (from architect)
- Prioritized roadmap (from architect)
- Combined context (archaeologist + architect)
### Option C: Architect Recommendation (User Choice)
**When to use:**
- User didn't explicitly request architect engagement
- Archaeologist found concerns warranting architect review
- Offer as next step
**What to say:**
> "I've completed the architecture analysis and documented [N] concerns requiring attention.
>
> **Next step options:**
>
> A) **Immediate assessment** - I can consult the system architect (axiom-system-architect) right now to provide:
> - Critical architecture quality assessment
> - Technical debt catalog
> - Prioritized improvement roadmap
>
> B) **Handover for later** - I've created a handover report (`06-architect-handover.md`) that you can use to engage the architect when ready
>
> C) **Complete current analysis** - Finish with archaeologist deliverables only
>
> Which approach fits your needs?"
## Handover Report Synthesis Approach
### Extract from Subsystem Catalog
**Read `02-subsystem-catalog.md`:**
- Count subsystems (total documented)
- Collect all "Concerns" entries (aggregate findings)
- Note confidence levels (High/Medium/Low distribution)
- Identify dependency patterns (high coupling, circular deps)
**Synthesize into handover:**
- List total subsystems
- Aggregate concerns by category (complexity, coupling, technical debt, etc.)
- Highlight low-confidence areas needing deeper analysis
- Note architectural patterns observed
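This aggregation can be semi-automated because the catalog uses fixed section headings. A minimal sketch, assuming the catalog follows the contract in analyzing-unknown-codebases.md; treat the output as a starting point for synthesis, not the synthesis itself:
```python
import re
from collections import Counter

def summarize_catalog(path="02-subsystem-catalog.md"):
    """Count subsystems, collect Concerns bullets, and tally confidence levels."""
    text = open(path, encoding="utf-8").read()
    subsystems = re.findall(r"^## (.+)$", text, flags=re.MULTILINE)
    concern_blocks = re.findall(r"\*\*Concerns:\*\*\n((?:- .*\n?)+)", text)
    confidence = Counter(re.findall(r"\*\*Confidence:\*\* (High|Medium|Low)", text))
    concerns = [line[2:] for block in concern_blocks for line in block.splitlines()
                if line.startswith("- ") and line != "- None observed"]
    return subsystems, concerns, confidence

if __name__ == "__main__":
    subsystems, concerns, confidence = summarize_catalog()
    print(f"{len(subsystems)} subsystems documented, {len(concerns)} concerns flagged")
    print("Confidence distribution:", dict(confidence))
```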
### Extract from Quality Assessment
**Read `05-quality-assessment.md` (if exists):**
- Extract quality scorecard ratings
- Collect Critical/High severity issues
- Note methodology limitations
**Synthesize into handover:**
- Summarize quality dimensions assessed
- List priority issues by severity
- Note analysis limitations for architect awareness
### Extract from Final Report
**Read `04-final-report.md`:**
- Executive summary (system overview)
- Key findings (synthesized patterns)
- Recommendations (if any)
**Synthesize into handover:**
- Use executive summary for handover summary
- Reference key findings for architect attention
- Note any recommendations already made
## Success Criteria
**You succeeded when:**
- Handover report comprehensively synthesizes archaeologist outputs
- Architect input package clearly structured with locations and usage guidance
- Confidence levels documented for architect awareness
- Scope and limitations explicitly stated
- Consultation pattern matches user's workflow needs
- Written to 06-architect-handover.md following format
**You failed when:**
- Handover is just concatenation of source documents (no synthesis)
- No guidance on which documents architect should read
- Missing confidence level context
- Limitations not documented
- Spawned architect without completing handover first
- No option for user choice (forced integrated consultation)
## Integration with Workflow
This briefing is typically invoked as:
1. **Coordinator** completes final report (04-final-report.md)
2. **Coordinator** (optionally) completes quality assessment (05-quality-assessment.md)
3. **Coordinator** writes task specification for handover creation
4. **YOU** read all archaeologist deliverables
5. **YOU** synthesize into handover report
6. **YOU** write to 06-architect-handover.md
7. **YOU** offer consultation options to user (A/B/C)
8. **(Optional)** Spawn architect subagent if user chooses integrated workflow
9. **Coordinator** proceeds to cleanup or next steps
**Your role:** Bridge archaeologist's neutral documentation and architect's critical assessment through structured handover synthesis.
## Common Mistakes
**❌ Skipping handover report**
"Architect can just read subsystem catalog" → Architect needs synthesized input, not raw docs
**❌ Spawning architect without handover**
"I'll just task architect directly" → Architect works best with structured handover package
**❌ Making architect decisions**
"I'll do the assessment myself" → That's architect's job, not archaeologist's
**❌ Forcing integrated workflow**
"I'll spawn architect automatically" → Offer choice (A/B/C), let user decide
**❌ No synthesis**
"Handover is just copy-paste" → Synthesize, don't concatenate
**❌ Missing limitations**
"I'll hide what we didn't analyze" → Architect needs to know limitations for accurate assessment
## Anti-Patterns
**Overstepping into architect role:**
"This architecture is bad" → "This architecture has [N] concerns documented"
**Incomplete handover:**
Missing confidence levels, no limitations section → Architect can't calibrate recommendations
**Forced workflow:**
Always spawning architect subagent → Offer user choice
**Raw data dump:**
Handover is just file paths → Synthesize key findings for architect
**No consultation pattern:**
Just write report, no next-step guidance → Offer explicit A/B/C options
## The Bottom Line
**Archaeologist documents neutrally. Architect assesses critically. Handover bridges professionally.**
Synthesize findings. Package inputs. Offer consultation. Enable next phase.
The pipeline works when each role stays in its lane and handovers are clean.

skills/using-system-archaeologist/SKILL.md Normal file

@@ -0,0 +1,465 @@
---
name: using-system-archaeologist
description: Use when analyzing existing codebases to generate architecture documentation - coordinates subagent-driven exploration with mandatory workspace structure, validation gates, and pressure-resistant workflows
mode: true
---
# System Archaeologist - Codebase Architecture Analysis
## Overview
Analyze existing codebases through coordinated subagent exploration to produce comprehensive architecture documentation with C4 diagrams, subsystem catalogs, and architectural assessments.
**Core principle:** Systematic archaeological process with quality gates prevents rushed, incomplete analysis.
## When to Use
- User requests architecture documentation for existing codebase
- Need to understand unfamiliar system architecture
- Creating design docs for legacy systems
- Analyzing codebases of any size (small to large)
- User mentions: "analyze codebase", "architecture documentation", "system design", "generate diagrams"
## Mandatory Workflow
### Step 1: Create Workspace (NON-NEGOTIABLE)
**Before any analysis:**
```bash
mkdir -p docs/arch-analysis-$(date +%Y-%m-%d-%H%M)/temp
```
**Why this is mandatory:**
- Organizes all analysis artifacts in one location
- Enables subagent handoffs via shared documents
- Provides audit trail of decisions
- Prevents file scatter across project
**Common rationalization:** "This feels like overhead when I'm pressured"
**Reality:** 10 seconds to create workspace saves hours of file hunting and context loss.
### Step 1.5: Offer Deliverable Menu (MANDATORY)
**After workspace creation, offer user choice of deliverables:**
**Why this is mandatory:**
- Users may need subset of analysis (quick overview vs. comprehensive)
- Time-constrained scenarios require focused scope
- Different stakeholder needs (exec summary vs. full technical docs)
- Architect-ready outputs have different requirements than documentation-only
Present menu using **AskUserQuestion tool:**
**Question:** "What deliverables do you need from this architecture analysis?"
**Options:**
**A) Full Analysis (Comprehensive)** - Recommended for complete understanding
- All standard documents (discovery, catalog, diagrams, report)
- Optional: Code quality assessment
- Optional: Architect handover report
- Timeline: 2-6 hours depending on codebase size
- Best for: New codebases, major refactoring planning, complete documentation needs
**B) Quick Overview (Essential)** - Fast turnaround for stakeholder presentations
- Discovery findings + high-level diagrams only (Context + Container)
- Executive summary with key findings
- Documented limitations (partial analysis)
- Timeline: 30 minutes - 2 hours
- Best for: Initial assessment, stakeholder presentations, time-constrained reviews
**C) Architect-Ready (Analysis + Improvement Planning)** - Complete analysis with improvement focus
- Full analysis (discovery, catalog, diagrams, report)
- Code quality assessment (mandatory for architect)
- Architect handover report with improvement recommendations
- Optional: Integrated architect consultation
- Timeline: 3-8 hours depending on codebase size
- Best for: Planning refactoring, technical debt assessment, improvement roadmaps
**D) Custom Selection** - Choose specific documents
- User selects from: Discovery, Catalog, Diagrams (which levels?), Report, Quality, Handover
- Timeline: Varies by selection
- Best for: Updating existing documentation, focused analysis
**Document user's choice in coordination plan:**
```markdown
## Deliverables Selected: [Option A/B/C/D]
[If Option D, list specific selections]
**Rationale:** [Why user chose this option]
**Timeline target:** [If time-constrained]
**Stakeholder needs:** [If presentation-driven]
```
**Common rationalization:** "User didn't specify, so I'll default to full analysis"
**Reality:** Always offer choice explicitly. Different needs require different outputs. Assuming full analysis wastes time if user needs quick overview.
### Step 2: Write Coordination Plan
**After documenting deliverable choice, write `00-coordination.md`:**
```markdown
## Analysis Plan
- Scope: [directories to analyze]
- Strategy: [Sequential/Parallel with reasoning]
- Time constraint: [if any, with scoping plan]
- Complexity estimate: [Low/Medium/High]
## Execution Log
- [timestamp] Created workspace
- [timestamp] [Next action]
```
**Why coordination logging is mandatory:**
- Documents strategy decisions (why parallel vs sequential?)
- Tracks what's been done vs what remains
- Enables resumption if work is interrupted
- Shows reasoning for future review
**Common rationalization:** "I'll just do the work, documentation is overhead"
**Reality:** Undocumented work is unreviewable and non-reproducible.
### Step 3: Holistic Assessment First
**Before diving into details, perform systematic scan:**
1. **Directory structure** - Map organization (feature? layer? domain?)
2. **Entry points** - Find main files, API definitions, config
3. **Technology stack** - Languages, frameworks, dependencies
4. **Subsystem identification** - Identify 4-12 major cohesive groups
Write findings to `01-discovery-findings.md`
**Why holistic before detailed:**
- Prevents getting lost in implementation details
- Identifies parallelization opportunities
- Establishes architectural boundaries
- Informs orchestration strategy
**Common rationalization:** "I can see the structure, no need to document it formally"
**Reality:** What's obvious to you now is forgotten in 30 minutes.
### Step 4: Subagent Orchestration Strategy
**Decision point:** Sequential vs Parallel
**Use SEQUENTIAL when:**
- Project < 5 subsystems
- Subsystems have tight interdependencies
- Quick analysis needed (< 1 hour)
**Use PARALLEL when:**
- Project ≥ 5 independent subsystems
- Large codebase (20K+ LOC, 10+ plugins/services)
- Subsystems are loosely coupled
**Document decision in `00-coordination.md`:**
```markdown
## Decision: Parallel Analysis
- Reasoning: 14 independent plugins, loosely coupled
- Strategy: Spawn 14 parallel subagents, one per plugin
- Estimated time savings: 2 hours → 30 minutes
```
**Common rationalization:** "Solo work is faster than coordination overhead"
**Reality:** For large systems, orchestration overhead (5 min) saves hours of sequential work.
### Step 5: Subagent Delegation Pattern
**When spawning subagents for analysis:**
Create task specification in `temp/task-[subagent-name].md`:
```markdown
## Task: Analyze [specific scope]
## Context
- Workspace: docs/arch-analysis-YYYY-MM-DD-HHMM/
- Read: 01-discovery-findings.md
- Write to: 02-subsystem-catalog.md (append your section)
## Expected Output
Follow contract in documentation-contracts.md:
- Subsystem name, location, responsibility
- Key components (3-5 files/classes)
- Dependencies (inbound/outbound)
- Patterns observed
- Confidence level
## Validation Criteria
- [ ] All contract sections complete
- [ ] Confidence level marked
- [ ] Dependencies bidirectional (if A depends on B, B shows A as inbound)
```
**Why formal task specs:**
- Subagents know exactly what to produce
- Reduces back-and-forth clarification
- Ensures contract compliance
- Enables parallel work without conflicts
### Step 6: Validation Gates (MANDATORY)
**After EVERY major document is produced, validate before proceeding.**
**What "validation gate" means:**
- Systematic check against contract requirements
- Cross-document consistency verification
- Quality gate before proceeding to next phase
- NOT just "read it again" - use a checklist
**Two validation approaches:**
**A) Separate Validation Subagent (PREFERRED)**
- Spawn dedicated validation subagent
- Agent reads document + contract, produces validation report
- Provides "fresh eyes" review
- Use when: Time allows (5-10 min overhead), complex analysis, multiple subsystems
**B) Systematic Self-Validation (ACCEPTABLE)**
- You validate against contract checklist systematically
- Document your validation in coordination log
- Use when: Tight time constraints (< 1 hour), simple analysis, or you are already working solo
- **MUST still be systematic** (not "looks good")
**Validation checklist (either approach):**
- [ ] Contract compliance (all required sections present)
- [ ] Cross-document consistency (subsystems in catalog match diagrams)
- [ ] Confidence levels marked
- [ ] No placeholder text ("[TODO]", "[Fill in]")
- [ ] Dependencies bidirectional (A→B means B shows A as inbound)
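Parts of this checklist can be scripted as a pre-pass before either validation approach. A minimal sketch, assuming the standard workspace layout; the workspace path is hypothetical, and the script only catches placeholder text and missing confidence markers, not semantic consistency:
```python
import glob
import re

PLACEHOLDERS = ["[TODO]", "[Fill in]"]

def precheck(workspace):
    """List placeholder text and missing confidence markers in workspace documents."""
    problems = []
    for path in sorted(glob.glob(f"{workspace}/0*.md")):
        text = open(path, encoding="utf-8").read()
        for marker in PLACEHOLDERS:
            if marker in text:
                problems.append(f"{path}: contains placeholder {marker!r}")
        if "subsystem-catalog" in path and not re.search(r"\*\*Confidence:\*\*", text):
            problems.append(f"{path}: no confidence levels marked")
    return problems

# Hypothetical workspace path - substitute the actual analysis directory
for problem in precheck("docs/arch-analysis-2025-11-30-0900"):
    print("NEEDS_REVISION:", problem)
```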
**When using self-validation, document in coordination log:**
```markdown
## Validation Decision - [timestamp]
- Approach: Self-validation (time constraint: 1 hour deadline)
- Documents validated: 02-subsystem-catalog.md
- Checklist: Contract ✓, Consistency ✓, Confidence ✓, No placeholders ✓
- Result: APPROVED for diagram generation
```
**Validation status meanings:**
- **APPROVED** → Proceed to next phase
- **NEEDS_REVISION** (warnings) → Fix non-critical issues, document as tech debt, proceed
- **NEEDS_REVISION** (critical) → BLOCK. Fix issues, re-validate. Max 2 retries, then escalate to user.
**Common rationalization:** "Validation slows me down"
**Reality:** Validation catches errors before they cascade. 2 minutes validating saves 20 minutes debugging diagrams generated from bad data.
**Common rationalization:** "I already checked it, validation is redundant"
**Reality:** "Checked it" ≠ "validated systematically against contract". Use the checklist.
### Step 7: Handle Validation Failures
**When validator returns NEEDS_REVISION with CRITICAL issues:**
1. **Read validation report** (temp/validation-*.md)
2. **Identify specific issues** (not general "improve quality")
3. **Spawn original subagent again** with fix instructions
4. **Re-validate** after fix
5. **Maximum 2 retries** - if still failing, escalate: "Having trouble with [X], need your input"
**DO NOT:**
- Proceed to next phase despite BLOCK status
- Make fixes yourself without re-spawning subagent
- Rationalize "it's good enough"
- Question validator authority ("validation is too strict")
**From baseline testing:** Agents WILL respect validation when it's clear and authoritative. Make validation clear and authoritative.
## Working Under Pressure
### Time Constraints Are Not Excuses to Skip Process
**Common scenario:** "I need this in 3 hours for a stakeholder meeting"
**WRONG response:** Skip workspace, skip validation, rush deliverables
**RIGHT response:** Scope appropriately while maintaining process
**Example scoping for 3-hour deadline:**
```markdown
## Coordination Plan
- Time constraint: 3 hours until stakeholder presentation
- Strategy: SCOPED ANALYSIS with quality gates maintained
- Timeline:
- 0:00-0:05: Create workspace, write coordination plan (this)
- 0:05-0:35: Holistic scan, identify all subsystems
- 0:35-2:05: Focus on 3 highest-value subsystems (parallel analysis)
- 2:05-2:35: Generate minimal viable diagrams (Context + Component only)
- 2:35-2:50: Validate outputs
- 2:50-3:00: Write executive summary with EXPLICIT limitations section
## Limitations Acknowledged
- Only 3/14 subsystems analyzed in depth
- No module-level dependency diagrams
- Confidence: Medium (time-constrained analysis)
- Recommend: Full analysis post-presentation
```
**Key principle:** Scoped analysis with documented limitations > complete analysis done wrong.
### Handling Sunk Cost (Incomplete Prior Work)
**Common scenario:** "We started this analysis last week, finish it"
**Checklist:**
1. **Find existing workspace** - Look in docs/arch-analysis-*/
2. **Read coordination log** - Understand what was done and why stopped
3. **Assess quality** - Is prior work correct or flawed?
4. **Make explicit decision:**
- **Prior work is good** → Continue from where it left off, update coordination log
- **Prior work is flawed** → Archive old workspace, start fresh, document why
- **Prior work is mixed** → Salvage good parts, redo bad parts, document decisions
**DO NOT assume prior work is correct just because it exists.**
**Update coordination log:**
```markdown
## Incremental Work - [date]
- Detected existing workspace from [prior date]
- Assessment: [quality evaluation]
- Decision: [continue/archive/salvage]
- Reasoning: [why]
```
## Common Rationalizations (RED FLAGS)
If you catch yourself thinking ANY of these, STOP:
| Excuse | Reality |
|--------|---------|
| "Time pressure makes trade-offs appropriate" | Process prevents rework. Skipping process costs MORE time. |
| "This feels like overhead" | 5 minutes of structure saves hours of chaos. |
| "Working solo is faster" | Solo works for small tasks. Orchestration scales for large systems. |
| "I'll just write outputs directly" | Uncoordinated work creates inconsistent artifacts. |
| "Validation slows me down" | Validation catches errors before they cascade. |
| "I already checked it" | Self-review misses what fresh eyes catch. |
| "I can't do this properly in [short time]" | You can do SCOPED analysis properly. Document limitations. |
| "Rather than duplicate, I'll synthesize" | Existing docs ≠ systematic analysis. Do the work. |
| "Architecture analysis doesn't need exhaustive review" | True. But it DOES need systematic method. |
| "Meeting-ready outputs" justify shortcuts | Stakeholders deserve accurate info, not rushed guesses. |
**All of these mean:** Follow the process. It exists because these rationalizations lead to bad outcomes.
## Extreme Pressure Handling
**If user requests something genuinely impossible:**
- "Complete 15-plugin analysis with full diagrams in 1 hour"
**Provide scoped alternative:**
> "I can't do complete analysis of 15 plugins in 1 hour while maintaining quality. Here are realistic options:
>
> A) **Quick overview** (1 hour): Holistic scan, plugin inventory, high-level architecture diagram, documented limitations
>
> B) **Focused deep-dive** (1 hour): Pick 2-3 critical plugins, full analysis of those, others documented as "not analyzed"
>
> C) **Use existing docs** (15 min): Synthesize existing README.md, CLAUDE.md with quick verification
>
> D) **Reschedule** (recommended): Full systematic analysis takes 4-6 hours for this scale
>
> Which approach fits your needs?"
**DO NOT:** Refuse the task entirely. Provide realistic scoped alternatives.
## Documentation Contracts
See individual skill files for detailed contracts:
- `01-discovery-findings.md` contract → [analyzing-unknown-codebases.md](analyzing-unknown-codebases.md)
- `02-subsystem-catalog.md` contract → [analyzing-unknown-codebases.md](analyzing-unknown-codebases.md)
- `03-diagrams.md` contract → [generating-architecture-diagrams.md](generating-architecture-diagrams.md)
- `04-final-report.md` contract → [documenting-system-architecture.md](documenting-system-architecture.md)
- `05-quality-assessment.md` contract → [assessing-code-quality.md](assessing-code-quality.md)
- `06-architect-handover.md` contract → [creating-architect-handover.md](creating-architect-handover.md)
- Validation protocol → [validating-architecture-analysis.md](validating-architecture-analysis.md)
## Workflow Summary
```
1. Create workspace (docs/arch-analysis-YYYY-MM-DD-HHMM/)
1.5. Offer deliverable menu (A/B/C/D) - user chooses scope
2. Write coordination plan (00-coordination.md) with deliverable choice
3. Holistic assessment → 01-discovery-findings.md
4. Decide: Sequential or Parallel? (document reasoning)
5. Spawn subagents for analysis → 02-subsystem-catalog.md
6. VALIDATE subsystem catalog (mandatory gate)
6.5. (Optional) Code quality assessment → 05-quality-assessment.md
7. Spawn diagram generation → 03-diagrams.md
8. VALIDATE diagrams (mandatory gate)
9. Synthesize final report → 04-final-report.md
10. VALIDATE final report (mandatory gate)
11. (Optional) Generate architect handover → 06-architect-handover.md
12. Provide cleanup recommendations for temp/
```
**Every step is mandatory except those marked optional (6.5 and 11). No exceptions for time pressure, complexity, or stakeholder demands.**
**Optional steps triggered by deliverable choice:**
- Step 6.5: Required for "Architect-Ready" (Option C), Optional for "Full Analysis" (Option A)
- Step 11: Required for "Architect-Ready" (Option C), Not included in "Quick Overview" (Option B)
## Success Criteria
**You have succeeded when:**
- Workspace structure exists with all numbered documents
- Coordination log documents all major decisions
- All outputs passed validation gates
- Subagent orchestration used appropriately for scale
- Limitations explicitly documented if time-constrained
- User receives navigable, validated architecture documentation
**You have failed when:**
- Files scattered outside workspace
- No coordination log showing decisions
- Validation skipped "to save time"
- Worked solo despite clear parallelization opportunity
- Produced rushed outputs without limitation documentation
- Rationalized shortcuts as "appropriate trade-offs"
## Anti-Patterns
**❌ Skip workspace creation**
"I'll just write files to project root"
**❌ No coordination logging**
"I'll just do the work without documenting strategy"
**❌ Work solo despite scale**
"Orchestration overhead isn't worth it"
**❌ Skip validation**
"I already reviewed it myself"
**❌ Bypass BLOCK status**
"The validation is too strict, I'll proceed anyway"
**❌ Complete refusal under pressure**
"I can't do this properly in 3 hours, so I won't do it" (Should: Provide scoped alternative)
---
## System Archaeologist Specialist Skills
After routing, load the appropriate specialist skill for detailed guidance:
1. [analyzing-unknown-codebases.md](analyzing-unknown-codebases.md) - Systematic codebase exploration, subsystem identification, confidence-based analysis
2. [generating-architecture-diagrams.md](generating-architecture-diagrams.md) - C4 diagrams, abstraction strategies, notation conventions
3. [documenting-system-architecture.md](documenting-system-architecture.md) - Synthesis of catalogs and diagrams into comprehensive reports
4. [validating-architecture-analysis.md](validating-architecture-analysis.md) - Contract validation, consistency checks, quality gates
5. [assessing-code-quality.md](assessing-code-quality.md) - Code quality analysis beyond architecture - complexity, duplication, smells, technical debt assessment
6. [creating-architect-handover.md](creating-architect-handover.md) - Handover reports for axiom-system-architect - enables transition from analysis to improvement planning

skills/using-system-archaeologist/analyzing-unknown-codebases.md Normal file

@@ -0,0 +1,327 @@
# Analyzing Unknown Codebases
## Purpose
Systematically analyze unfamiliar code to identify subsystems, components, dependencies, and architectural patterns. Produce catalog entries that follow EXACT output contracts.
## When to Use
- Coordinator delegates subsystem analysis task
- Task specifies reading from workspace and appending to `02-subsystem-catalog.md`
- You need to analyze code you haven't seen before
- Output must integrate with downstream tooling (validation, diagram generation)
## Critical Principle: Contract Compliance
**Your analysis quality doesn't matter if you violate the output contract.**
**Common rationalization:** "I'll add helpful extra sections to improve clarity"
**Reality:** Extra sections break downstream tools. The coordinator expects EXACT format for parsing and validation. Your job is to follow the specification, not improve it.
## Output Contract (MANDATORY)
When writing to `02-subsystem-catalog.md`, append EXACTLY this format:
```markdown
## [Subsystem Name]
**Location:** `path/to/subsystem/`
**Responsibility:** [One sentence describing what this subsystem does]
**Key Components:**
- `file1.ext` - [Brief description]
- `file2.ext` - [Brief description]
- `file3.ext` - [Brief description]
**Dependencies:**
- Inbound: [Subsystems that depend on this one]
- Outbound: [Subsystems this one depends on]
**Patterns Observed:**
- [Pattern 1]
- [Pattern 2]
**Concerns:**
- [Any issues, gaps, or technical debt observed]
**Confidence:** [High/Medium/Low] - [Brief reasoning]
```
**If no concerns exist, write:**
```markdown
**Concerns:**
- None observed
```
**CRITICAL COMPLIANCE RULES:**
- ❌ Add extra sections ("Integration Points", "Recommendations", "Files", etc.)
- ❌ Change section names or reorder them
- ❌ Write to separate file (must append to `02-subsystem-catalog.md`)
- ❌ Skip sections (include ALL sections - use "None observed" if empty)
- ✅ Copy the template structure EXACTLY
- ✅ Keep section order: Location → Responsibility → Key Components → Dependencies → Patterns → Concerns → Confidence
**Contract is specification, not minimum. Extra sections break downstream validation.**
### Example: Complete Compliant Entry
Here's what a correctly formatted entry looks like:
```markdown
## Authentication Service
**Location:** `/src/services/auth/`
**Responsibility:** Handles user authentication, session management, and JWT token generation for API access.
**Key Components:**
- `auth_handler.py` - Main authentication logic with login/logout endpoints (342 lines)
- `token_manager.py` - JWT token generation and validation (156 lines)
- `session_store.py` - Redis-backed session storage (98 lines)
**Dependencies:**
- Inbound: API Gateway, User Service
- Outbound: Database Layer, Cache Service, Logging Service
**Patterns Observed:**
- Dependency injection for testability (all external services injected)
- Token refresh pattern with sliding expiration
- Audit logging for all authentication events
**Concerns:**
- None observed
**Confidence:** High - Clear entry points, documented API, test coverage validates behavior
```
**This is EXACTLY what your output should look like.** No more, no less.
## Systematic Analysis Approach
### Step 1: Read Task Specification
Your task file (`temp/task-[name].md`) specifies:
- What to analyze (scope: directories, plugins, services)
- Where to read context (`01-discovery-findings.md`)
- Where to write output (`02-subsystem-catalog.md` - append)
- Expected format (the contract above)
**Read these files FIRST before analyzing code.**
### Step 2: Layered Exploration
Use this proven approach from baseline testing:
1. **Metadata layer** - Read plugin.json, package.json, setup.py
2. **Structure layer** - Examine directory organization
3. **Router layer** - Find and read router/index files (often named "using-X")
4. **Sampling layer** - Read 3-5 representative files
5. **Quantitative layer** - Use line counts as depth indicators
**Why this order works:**
- Metadata gives overview without code diving
- Structure reveals organization philosophy
- Routers often catalog all components
- Sampling verifies patterns
- Quantitative data supports claims
### Step 3: Mark Confidence Explicitly
**Every output MUST include confidence level with reasoning.**
**High confidence** - Router skill provided catalog + verified with sampling
```markdown
**Confidence:** High - Router skill listed all 10 components, sampling 4 confirmed patterns
```
**Medium confidence** - No router, but clear structure + sampling
```markdown
**Confidence:** Medium - No router catalog, inferred from directory structure + 5 file samples
```
**Low confidence** - Incomplete, placeholders, or unclear organization
```markdown
**Confidence:** Low - Several SKILL.md files missing, test artifacts suggest work-in-progress
```
### Step 4: Distinguish States Clearly
When analyzing codebases with mixed completion:
**Complete** - Skill file exists, has content, passes basic read test
```markdown
- `skill-name/SKILL.md` - Complete skill (1,234 lines)
```
**Placeholder** - Skill file exists but is stub/template
```markdown
- `skill-name/SKILL.md` - Placeholder (12 lines, template only)
```
**Planned** - Referenced in router but no file exists
```markdown
- `skill-name` - Planned (referenced in router, not implemented)
```
**TDD artifacts** - Test scenarios, baseline results (these ARE documentation)
```markdown
- `test-scenarios.md` - TDD test scenarios (RED phase)
- `baseline-results.md` - Baseline behavior documentation
```
### Step 5: Write Output (Contract Compliance)
**Before writing:**
1. Prepare your entry in EXACT contract format from the template above
2. Copy the structure - don't paraphrase or reorganize
3. Triple-check you have ALL sections in correct order
**When writing:**
1. **Target file:** `02-subsystem-catalog.md` in workspace directory
2. **Operation:** Append your entry (create file if first entry, append if file exists)
3. **Method:**
- If file exists: Read current content, then Write with original + your entry
- If file doesn't exist: Write your entry directly
4. **Format:** Follow contract sections in exact order
5. **Completeness:** Include ALL sections - use "None observed" for empty Concerns
**DO NOT create separate files** (e.g., `subsystem-X-analysis.md`). The coordinator expects all entries in `02-subsystem-catalog.md`.
**After writing:**
1. Re-read `02-subsystem-catalog.md` to verify your entry was added correctly
2. Validate format matches contract exactly using this checklist:
**Self-Validation Checklist:**
```
[ ] Section 1: Subsystem name as H2 heading (## Name)
[ ] Section 2: Location with backticks and absolute path
[ ] Section 3: Responsibility as single sentence
[ ] Section 4: Key Components as bulleted list with descriptions
[ ] Section 5: Dependencies with "Inbound:" and "Outbound:" labels
[ ] Section 6: Patterns Observed as bulleted list
[ ] Section 7: Concerns present (with issues OR "None observed")
[ ] Section 8: Confidence level (High/Medium/Low) with reasoning
[ ] Separator: "---" line after confidence
[ ] NO extra sections added
[ ] Sections in correct order
[ ] Entry in file: 02-subsystem-catalog.md (not separate file)
```
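The checklist can also be run mechanically before re-reading the file. A minimal sketch; the label strings below are assumptions about how the contract template spells each section, so adjust them to match the template above:
```python
import re

# Assumed section markers, in contract order - edit to match the exact contract template.
REQUIRED_IN_ORDER = [
    r"^## ",                          # Section 1: subsystem name as H2 heading
    r"^\*\*Location:\*\*",            # Section 2
    r"^\*\*Responsibility:\*\*",      # Section 3
    r"^\*\*Key Components:\*\*",      # Section 4
    r"^\*\*Dependencies:\*\*",        # Section 5
    r"^\*\*Patterns Observed:\*\*",   # Section 6
    r"^\*\*Concerns:\*\*",            # Section 7
    r"^\*\*Confidence:\*\*",          # Section 8
    r"^---$",                         # Separator after the entry
]

def check_entry_order(entry_text: str) -> list[str]:
    """Return problems found; an empty list means every marker appears, in order."""
    problems: list[str] = []
    position = 0
    for pattern in REQUIRED_IN_ORDER:
        match = re.search(pattern, entry_text[position:], flags=re.MULTILINE)
        if match is None:
            problems.append(f"Missing or out of order: {pattern}")
        else:
            position += match.end()
    return problems
```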
## Handling Uncertainty
**When architecture is unclear:**
1. **State what you observe** - Don't guess at intent
```markdown
**Patterns Observed:**
- 3 files with similar structure (analysis.py, parsing.py, validation.py)
- Unclear if this is deliberate pattern or coincidence
```
2. **Mark confidence appropriately** - Low confidence is valid
```markdown
**Confidence:** Low - Directory structure suggests microservices, but no service definitions found
```
3. **Use "Concerns" section** - Document gaps
```markdown
**Concerns:**
- No clear entry point identified
- Dependencies inferred from imports, not explicit manifest
```
**DO NOT:**
- Invent relationships you didn't verify
- Assume "obvious" architecture without evidence
- Skip confidence marking because you're uncertain
## Positive Behaviors to Maintain
From baseline testing, these approaches WORK:
✅ **Read actual files** - Don't infer from names alone
✅ **Use router skills** - They often provide complete catalogs
✅ **Sample strategically** - 3-5 files verifies patterns without exhaustive reading
✅ **Cross-reference** - Verify claims (imports match listed dependencies)
✅ **Document assumptions** - Make reasoning explicit
✅ **Line counts indicate depth** - 1,500-line skill vs 50-line stub matters
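The cross-reference behavior can be made concrete for Python subsystems by listing what a subsystem actually imports and comparing that with the dependencies you intend to claim. A minimal sketch; the path and the claimed set are hypothetical, and mapping import names to subsystem names is left to you:
```python
import ast
from pathlib import Path

def top_level_imports(subsystem_dir: str) -> set[str]:
    """Collect top-level imported package names from every .py file under a subsystem."""
    packages: set[str] = set()
    for path in Path(subsystem_dir).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8", errors="replace"))
        except SyntaxError:
            continue  # note the file as unparsed rather than guessing
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                packages.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                packages.add(node.module.split(".")[0])
    return packages

claimed = {"auth", "database"}                      # hypothetical outbound dependencies
observed = top_level_imports("src/services/users")  # hypothetical subsystem path
print("Claimed but never imported:", claimed - observed)
print("Imported but not claimed:  ", observed - claimed)
```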
## Common Rationalizations (STOP SIGNALS)
If you catch yourself thinking these, STOP:
| Rationalization | Reality |
|-----------------|---------|
| "I'll add Integration Points section for clarity" | Extra sections break downstream parsing |
| "I'll write to separate file for organization" | Coordinator expects append to specified file |
| "I'll improve the contract format" | Contract is specification from coordinator |
| "More information is always helpful" | Your job: follow spec. Coordinator's job: decide what's included |
| "This comprehensive format is better" | "Better" violates contract. Compliance is mandatory. |
## Validation Criteria
Your output will be validated against:
1. **Contract compliance** - All sections present, no extras
2. **File operation** - Appended to `02-subsystem-catalog.md`, not separate file
3. **Confidence marking** - High/Medium/Low with reasoning
4. **Evidence-based claims** - Components you actually read
5. **Bidirectional dependencies** - If A→B, then B must show A as inbound
**If validation returns NEEDS_REVISION:**
- Read the validation report
- Fix specific issues identified
- Re-submit following contract
## Success Criteria
**You succeeded when:**
- Entry appended to `02-subsystem-catalog.md` in exact contract format
- All sections included (none skipped, none added)
- Confidence level marked with reasoning
- Claims supported by files you read
- Validation returns APPROVED
**You failed when:**
- Added "helpful" extra sections
- Wrote to separate file
- Changed contract format
- Skipped sections
- No confidence marking
- Validation returns BLOCK status
## Anti-Patterns
❌ **Add extra sections**
"I'll add Recommendations section" → Violates contract
❌ **Write to new file**
"I'll create subsystem-X-analysis.md" → Should append to `02-subsystem-catalog.md`
❌ **Skip required sections**
"No concerns, so I'll omit that section" → Include section with "None observed"
❌ **Change format**
"I'll use numbered lists instead of bullet points" → Follow contract exactly
❌ **Work without reading task spec**
"I know what to do" → Read `temp/task-*.md` first
## Integration with Workflow
This skill is typically invoked as:
1. **Coordinator** creates workspace and holistic assessment
2. **Coordinator** writes task specification in `temp/task-[yourname].md`
3. **YOU** read task spec + `01-discovery-findings.md`
4. **YOU** analyze assigned subsystem systematically
5. **YOU** append entry to `02-subsystem-catalog.md` following contract
6. **Validator** checks your output against contract
7. **Coordinator** proceeds to next phase if validation passes
**Your role:** Analyze systematically, follow contract exactly, mark confidence explicitly.

View File

@@ -0,0 +1,249 @@
# Assessing Code Quality
## Purpose
Analyze code quality indicators beyond architecture with EVIDENCE-BASED findings - produces quality scorecard with concrete examples, file/line references, and actionable recommendations.
## When to Use
- Coordinator delegates quality assessment after subsystem catalog completion
- Need to identify specific, actionable quality issues beyond architectural patterns
- Output feeds into architect handover or improvement planning
## Core Principle: Evidence Over Speculation
**Good quality analysis provides concrete evidence. Poor quality analysis makes educated guesses.**
```
❌ BAD: "Payment service likely lacks error handling"
✅ GOOD: "Payment service process_payment() (line 127) raises generic Exception, loses error context"
❌ BAD: "Functions may be too long"
✅ GOOD: "src/api/orders.py:process_order() is 145 lines (lines 89-234)"
❌ BAD: "Error handling unclear"
✅ GOOD: "3 different error patterns: orders.py uses exceptions, payment.py returns codes, user.py mixes both"
```
**Your goal:** Document WHAT YOU OBSERVED, not what you guess might be true.
## The Speculation Trap (from baseline testing)
**Common rationalization:** "I don't have full context, so I'll make educated guesses to provide value."
**Reality:** Speculation masquerading as analysis is worse than saying "insufficient information."
**Baseline failure mode:**
- Document says "Error handling likely mixes patterns"
- Based on: Filename alone, not actual code review
- Should say: "Reviewed file structure only - implementation details not analyzed"
## Weasel Words - BANNED
If you catch yourself writing these, STOP:
| Banned Phrase | Why Banned | Replace With |
|---------------|------------|--------------|
| "Likely" | Speculation | "Observed in file.py:line" OR "Not analyzed" |
| "May" | Hedge | "Found X" OR "Did not review X" |
| "Unclear if" | Evasion | "Code shows X" OR "Insufficient information to assess X" |
| "Appears to" | Guessing | "Lines 12-45 implement X" OR "Not examined" |
| "Probably" | Assumption | Concrete observation OR admit limitation |
| "Should" | Inference | "Currently does X" (observation) |
| "Suggests" | Reading between lines | Quote actual code OR acknowledge gap |
**From baseline testing:** Agents will use these to fill gaps. Skill must ban them explicitly.
## Evidence Requirements
**Every finding MUST include:**
1. **File path** - Exact location
- ✅ `src/services/payment.py`
- ❌ "payment service"
2. **Line numbers** - Specific range
- ✅ "Lines 127-189 (63 lines)"
- ❌ "Long function"
3. **Code example or description** - What you saw
- ✅ "Function has 8 nested if statements"
- ✅ "Quoted: `except Exception as e: pass`"
- ❌ "Complex logic"
4. **Severity with reasoning** - Why this level
- ✅ "Critical: Swallows exceptions in payment processing"
- ❌ "High priority issue"
**If you didn't examine implementation:**
- ✅ "Reviewed imports and structure only - implementation not analyzed"
- ❌ "Function likely does X"
## Severity Framework (REQUIRED)
**Define criteria BEFORE rating anything:**
**Critical:**
- Security vulnerability (injection, exposed secrets, auth bypass)
- Data loss/corruption risk
- Blocks deployment
- Example: Hardcoded credentials, SQL injection, unhandled exceptions in critical path
**High:**
- Frequent source of bugs
- High maintenance burden
- Performance degradation
- Example: 200-line functions, 15% code duplication, N+1 queries
**Medium:**
- Moderate maintainability concern
- Technical debt accumulation
- Example: Missing docstrings, inconsistent error handling, magic numbers
**Low:**
- Minor improvement
- Style/cosmetic
- Example: Verbose naming, minor duplication (< 5%)
**From baseline testing:** Agents will use severity labels without defining them. Enforce framework first.
## Observation vs. Inference
**Distinguish clearly:**
**Observation** (what you saw):
- "Function process_order() is 145 lines"
- "3 files contain identical validation logic (lines quoted)"
- "No docstrings found in src/api/ (0/12 functions)"
**Inference** (what it might mean):
- "145-line function suggests complexity - recommend review"
- "Identical validation in 3 files indicates duplication - recommend extraction"
- "Missing docstrings may hinder onboarding - recommend adding"
**Always lead with observation, inference optional:**
```markdown
**Observed:** `src/api/orders.py:process_order()` is 145 lines (lines 89-233) with 12 decision points
**Assessment:** High complexity - recommend splitting into smaller functions
```
## "Insufficient Information" is Valid
**When you haven't examined something, say so:**
**Honest limitation:**
```markdown
## Testing Coverage
**Not analyzed** - Test files not reviewed during subsystem cataloging
**Recommendation:** Review test/ directory to assess coverage
```
**Speculation:**
```markdown
## Testing Coverage
Testing likely exists but coverage unclear. May have integration tests. Probably needs more unit tests.
```
**From baseline testing:** Agents filled every section with speculation. Skill must make "not analyzed" acceptable.
## Minimal Viable Quality Assessment
**If time/sample constraints prevent comprehensive analysis:**
1. **Acknowledge scope explicitly**
- "Reviewed 5 files from 8 subsystems"
- "Implementation details not examined (imports/structure only)"
2. **Document what you CAN observe**
- File sizes, function counts (from grep/wc; see the Python sketch after this list)
- Imports (coupling indicators)
- Structure (directory organization)
3. **List gaps explicitly**
- "Not analyzed: Test coverage, runtime behavior, security patterns"
4. **Mark confidence appropriately**
- "Low confidence: Small sample, structural review only"
**Better to be honest about limitations than to guess.**
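A minimal sketch of what such a structural pass can report without reading implementations, assuming a `src/` layout (the path is illustrative):
```python
from collections import Counter
from pathlib import Path

def structural_metrics(root: str = "src") -> dict[str, tuple[int, int]]:
    """Per top-level directory: (.py file count, total lines) - structure only, no quality claims."""
    files: Counter = Counter()
    lines: Counter = Counter()
    for path in Path(root).rglob("*.py"):
        top = path.relative_to(root).parts[0]
        files[top] += 1
        with path.open(encoding="utf-8", errors="replace") as handle:
            lines[top] += sum(1 for _ in handle)
    return {name: (files[name], lines[name]) for name in files}
```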
## Output Contract
Write to `05-quality-assessment.md`:
```markdown
# Code Quality Assessment
**Analysis Date:** YYYY-MM-DD
**Scope:** [What you reviewed - be specific]
**Methodology:** [How you analyzed - tools, sampling strategy]
**Confidence:** [High/Medium/Low with reasoning]
## Severity Framework
[Define Critical/High/Medium/Low with examples]
## Findings
### [Category: Complexity/Duplication/etc.]
**Observed:**
- [File path:line numbers] [Concrete observation]
- [Quote code or describe specifically]
**Severity:** [Level] - [Reasoning based on framework]
**Recommendation:** [Specific action]
### [Repeat for each finding]
## What Was NOT Analyzed
- [Explicitly list gaps]
- [No speculation about these areas]
## Recommendations
[Prioritized by severity]
## Limitations
- Sample size: [N files from M subsystems]
- Methodology constraints: [No tools, time pressure, etc.]
- Confidence assessment: [Why Medium/Low]
```
## Success Criteria
**You succeeded when:**
- Every finding has file:line evidence
- No weasel words ("likely", "may", "unclear if")
- Severity framework defined before use
- Observations distinguished from inferences
- Gaps acknowledged as "not analyzed"
- Output follows contract
**You failed when:**
- Speculation instead of evidence
- Vague findings ("functions may be long")
- Undefined severity ratings
- "Unclear if" everywhere
- Professional-looking guesswork
## Common Rationalizations (STOP SIGNALS)
| Excuse | Reality |
|--------|---------|
| "I'll provide value with educated guesses" | Speculation is not analysis. Be honest about gaps. |
| "I can infer from file names" | File names ≠ implementation. Acknowledge limitation. |
| "Stakeholders expect complete analysis" | They expect accurate analysis. Partial truth > full speculation. |
| "I reviewed it during cataloging" | Catalog focus ≠ quality focus. If you didn't examine for quality, say so. |
| "Professional reports don't say 'I don't know'" | Professional reports distinguish facts from limitations. |
## Integration with Workflow
Typically invoked after subsystem catalog completion, before or alongside diagram generation. Feeds findings into final report and architect handover.

View File

@@ -0,0 +1,236 @@
# Creating Architect Handover
## Purpose
Generate handover reports for axiom-system-architect plugin - enables transition from neutral documentation (archaeologist) to critical assessment (architect).
## When to Use
- User requests prioritization: "What should we fix first?"
- User asks for improvement recommendations
- Analysis complete, user needs actionable next steps
- Quality issues identified that need criticality assessment
## Core Principle: Role Discipline
**Archaeologist documents. Architect assesses. Never blur the line.**
```
❌ ARCHAEOLOGIST: "Fix payment errors first, then database config, then documentation"
✅ ARCHAEOLOGIST: "I'll consult the system architect to provide prioritized recommendations"
❌ ARCHAEOLOGIST: "This is critical because it affects payments"
✅ ARCHAEOLOGIST: "Payment error handling issue documented (see quality assessment)"
✅ ARCHITECT: "Critical - payment integrity risk, priority P1"
```
**Your role:** Bridge neutral analysis to critical assessment, don't do the assessment yourself.
## The Role Confusion Trap (from baseline testing)
**User asks:** "What should we fix first?"
**Baseline failure:** Archaeologist provides prioritized recommendations directly
- "Fix payment errors first" (made criticality call)
- "Security is priority" (made security judgment)
- "Sprint 1: X, Sprint 2: Y" (made timeline decisions)
**Why this fails:**
- Archaeologist analyzed with neutral lens
- Architect brings critical assessment framework
- Architect has security-first mandate, pressure-resistant prioritization
- Blending roles produces inconsistent rigor
**Correct response:** Recognize question belongs to architect, facilitate handover
## Division of Labor (MANDATORY)
### Archaeologist (This Plugin)
**What you DO:**
- Document architecture (subsystems, diagrams, dependencies)
- Identify quality concerns (complexity, duplication, smells)
- Mark confidence levels (High/Medium/Low)
- **Stay neutral:** "Here's what exists, here's what I observed"
**What you do NOT do:**
- Assess criticality ("this is critical")
- Prioritize fixes ("fix X before Y")
- Make security judgments ("security risk trumps performance")
- Create sprint roadmaps
### Architect (axiom-system-architect Plugin)
**What they DO:**
- Critical assessment ("this is bad, here's severity")
- Technical debt prioritization (risk-based, security-first)
- Refactoring strategy recommendations
- Improvement roadmaps with timelines
**What they do NOT do:**
- Neutral documentation (that's your job)
### Handover (Your Job Now)
**What you DO:**
- Synthesize archaeologist outputs (catalog + quality + diagrams + report)
- Package for architect consumption
- Offer architect consultation (3 patterns below)
- **Bridge the roles, don't blend them**
## Handover Output
Create `06-architect-handover.md`:
```markdown
# Architect Handover Report
**Project:** [Name]
**Analysis Date:** YYYY-MM-DD
**Purpose:** Enable architect assessment and improvement prioritization
## Archaeologist Deliverables
**Available documents:**
- 01-discovery-findings.md
- 02-subsystem-catalog.md
- 03-diagrams.md (Context, Container, Component)
- 04-final-report.md
- 05-quality-assessment.md (if performed)
## Key Findings Summary
**Architectural concerns from catalog:**
1. [Concern from subsystem X] - Confidence: [High/Medium/Low]
2. [Concern from subsystem Y] - Confidence: [High/Medium/Low]
**Quality issues from assessment (if performed):**
- Critical: [Count] issues
- High: [Count] issues
- Medium: [Count] issues
- Low: [Count] issues
[List top 3-5 issues with file:line references]
## Architect Input Package
**For critical assessment:**
- Read: 02-subsystem-catalog.md (architectural concerns)
- Read: 05-quality-assessment.md (code quality evidence)
- Read: 04-final-report.md (synthesis and patterns)
**For prioritization:**
- Use: axiom-system-architect:prioritizing-improvements
- Note: Archaeologist findings are neutral observations
- Apply: Security-first framework, risk-based priority
## Confidence Context
[Confidence levels per subsystem - helps architect calibrate]
## Limitations
**What was analyzed:** [Scope]
**What was NOT analyzed:** [Gaps]
**Recommendations:** [What architect should consider for deeper analysis]
## Next Steps
[Offer consultation patterns - see below]
```
## Three Consultation Patterns
### Pattern A: Offer Architect Consultation (RECOMMENDED)
**When user asks prioritization questions:**
> "This question requires critical assessment and prioritization, which is the system architect's domain.
>
> **Options:**
>
> **A) Architect consultation now** - I'll consult axiom-system-architect to provide:
> - Critical architecture quality assessment
> - Risk-based technical debt prioritization
> - Security-first improvement roadmap
> - Sprint-ready recommendations
>
> **B) Handover report only** - I'll create handover package, you engage architect when ready
>
> Which approach fits your needs?"
**Rationalization to counter:** "I analyzed it, so I should just answer the question"
**Reality:** Analysis ≠ Assessment. Architect brings critical framework you don't have.
### Pattern B: Integrated Consultation (If User Chooses)
**If user selects Option A:**
1. Create handover report first (`06-architect-handover.md`)
2. Spawn architect as subagent using Task tool
3. Provide context: handover report location, key concerns, deliverables
4. Synthesize architect outputs when returned
5. Present to user
**Prompt structure:**
```
Use axiom-system-architect to assess architecture and prioritize improvements.
Context:
- Handover report: [workspace]/06-architect-handover.md
- Key concerns: [top 3-5 from analysis]
- User needs: Prioritized improvement recommendations
Tasks:
1. Use assessing-architecture-quality for critical assessment
2. Use identifying-technical-debt to catalog debt
3. Use prioritizing-improvements for security-first roadmap
IMPORTANT: Follow architect discipline - no diplomatic softening, security-first prioritization
```
### Pattern C: Handover Only (If User Wants Async)
**If user wants handover without immediate consultation:**
1. Create handover report (`06-architect-handover.md`)
2. Inform user: "Handover ready for architect when you're ready to engage"
3. Provide brief guide on using architect plugin
## Common Rationalizations (STOP SIGNALS)
| Excuse | Reality |
|--------|---------|
| "I know the codebase, I can prioritize" | Knowledge ≠ Critical assessment. Architect has framework. |
| "Just helping them prioritize for tomorrow" | Helping by overstepping = inconsistent rigor. Offer architect. |
| "Architect would say same thing" | Then let architect say it with their framework. |
| "User wants MY recommendation" | User wants CORRECT recommendation. That requires architect. |
| "It's obvious payment errors are critical" | Obvious to you ≠ risk-based security-first framework. |
| "I'll save time by combining roles" | Blending roles loses discipline. Handover takes 5 min. |
**From baseline testing:** Agents will try to "help" by doing architect's job. Skill must enforce boundary.
## Success Criteria
**You succeeded when:**
- Recognized prioritization question belongs to architect
- Created handover synthesis (not raw file dump)
- Offered architect consultation (3 patterns)
- Maintained role boundary (documented concerns, didn't assess criticality)
- Handover written to 06-architect-handover.md
**You failed when:**
- Directly provided prioritized recommendations
- Made criticality assessments ("this is critical")
- Created sprint roadmaps yourself
- Skipped handover, worked from raw files
- Blended archaeologist and architect roles
## Integration with Workflow
Typically invoked after final report completion. User requests improvement recommendations. You recognize this requires architect, create handover, offer consultation.
**Pipeline:** Archaeologist → Handover → Architect → Recommendations

View File

@@ -0,0 +1,569 @@
# Documenting System Architecture
## Purpose
Synthesize subsystem catalogs and architecture diagrams into final, stakeholder-ready architecture reports that serve multiple audiences through clear structure, comprehensive navigation, and actionable findings.
## When to Use
- Coordinator delegates final report generation from validated artifacts
- Have `02-subsystem-catalog.md` and `03-diagrams.md` as inputs
- Task specifies writing to `04-final-report.md`
- Need to produce executive-readable architecture documentation
- Output represents deliverable for stakeholders
## Core Principle: Synthesis Over Concatenation
**Good reports synthesize information into insights. Poor reports concatenate source documents.**
Your goal: Create a coherent narrative with extracted patterns, concerns, and recommendations - not a copy-paste of inputs.
## Document Structure
### Required Sections
**1. Front Matter**
- Document title
- Version number
- Analysis date
- Classification (if needed)
**2. Table of Contents**
- Multi-level hierarchy (H2, H3, H4)
- Anchor links to all major sections
- Quick navigation for readers
**3. Executive Summary (2-3 paragraphs)**
- High-level system overview
- Key architectural patterns
- Major concerns and confidence assessment
- Should be readable standalone by leadership
**4. System Overview**
- Purpose and scope
- Technology stack
- System context (external dependencies)
**5. Architecture Diagrams**
- Embed all diagrams from `03-diagrams.md`
- Add contextual analysis after each diagram
- Cross-reference to subsystem catalog
**6. Subsystem Catalog**
- One detailed entry per subsystem
- Synthesize from `02-subsystem-catalog.md` (don't just copy)
- Add cross-references to diagrams and findings
**7. Key Findings**
- **Architectural Patterns**: Identified across subsystems
- **Technical Concerns**: Extracted from catalog concerns
- **Recommendations**: Actionable next steps with priorities
**8. Appendices**
- **Methodology**: How analysis was performed
- **Confidence Levels**: Rationale for confidence ratings
- **Assumptions & Limitations**: What you inferred, what's missing
## Synthesis Strategies
### Pattern Identification
**Look across subsystems for recurring patterns:**
From catalog observations:
- Subsystem A: "Dependency injection for testability"
- Subsystem B: "All external services injected"
- Subsystem C: "Injected dependencies for testing"
**Synthesize into pattern:**
```markdown
### Dependency Injection Pattern
**Observed in**: Authentication Service, API Gateway, User Service
**Description**: External dependencies are injected rather than directly instantiated, enabling test isolation and loose coupling.
**Benefits**:
- Testability: Mock dependencies in unit tests
- Flexibility: Swap implementations without code changes
- Loose coupling: Services depend on interfaces, not concrete implementations
**Trade-offs**:
- Initial complexity: Requires dependency wiring infrastructure
- Runtime overhead: Minimal (dependency resolution at startup)
```
### Concern Extraction
**Find concerns buried in catalog entries:**
Catalog entries:
- API Gateway: "Rate limiter uses in-memory storage (doesn't scale horizontally)"
- Database Layer: "Connection pool max size hardcoded (should be configurable)"
- Data Service: "Large analytics queries can cause database load spikes"
**Synthesize into findings:**
```markdown
## Technical Concerns
### 1. Rate Limiter Scalability Issue
**Severity**: Medium
**Affected Subsystem**: [API Gateway](#api-gateway)
**Issue**: In-memory rate limiting prevents horizontal scaling. If multiple gateway instances run, each maintains separate counters, allowing clients to exceed intended limits by distributing requests across instances.
**Impact**:
- Cannot scale gateway horizontally without distributed rate limiting
- Potential for rate limit bypass under load balancing
- Inconsistent rate limit enforcement
**Remediation**:
1. **Immediate** (next sprint): Document limitation, add monitoring alerts
2. **Short-term** (next quarter): Migrate to Redis-backed rate limiter
3. **Validation**: Test rate limiting with multiple gateway instances
**Priority**: High (blocks horizontal scaling)
```
### Recommendation Prioritization
**Prioritize recommendations using severity scoring + impact assessment + timeline buckets:**
#### Severity Scoring (for each concern/recommendation)
**Critical:**
- Blocks deployment or core functionality
- Security vulnerability (data exposure, injection, auth bypass)
- Data corruption or loss risk
- Service outage potential
- Examples: SQL injection, hardcoded credentials, unhandled critical exceptions
**High:**
- Significant maintainability impact
- High effort to modify or extend
- Frequent source of bugs
- Performance degradation under load
- Examples: God objects, extreme duplication, shotgun surgery, N+1 queries
**Medium:**
- Moderate maintainability concern
- Refactoring beneficial but not urgent
- Technical debt accumulation
- Examples: Long functions, missing documentation, inconsistent error handling
**Low:**
- Minor quality improvement
- Cosmetic or style issues
- Nice-to-have enhancements
- Examples: Magic numbers, verbose naming, minor duplication
#### Impact Assessment Matrix
Use 2-dimensional scoring: **Severity × Frequency**
| Severity | High Frequency | Medium Frequency | Low Frequency |
|----------|----------------|------------------|---------------|
| **Critical** | **P1** - Fix immediately | **P1** - Fix immediately | **P2** - Fix ASAP |
| **High** | **P2** - Fix ASAP | **P2** - Fix ASAP | **P3** - Plan for sprint |
| **Medium** | **P3** - Plan for sprint | **P4** - Backlog | **P4** - Backlog |
| **Low** | **P4** - Backlog | **P4** - Backlog | **P5** - Optional |
**Frequency assessment:**
- **High:** Affects core user workflows, used constantly, blocking development
- **Medium:** Affects some workflows, occasional impact, periodic friction
- **Low:** Edge case, rarely encountered, minimal operational impact
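The matrix can be applied mechanically once severity and frequency are decided. A minimal sketch that encodes the table above (the dictionary is just one way to write it down):
```python
PRIORITY_MATRIX = {
    ("Critical", "High"): "P1", ("Critical", "Medium"): "P1", ("Critical", "Low"): "P2",
    ("High", "High"): "P2",     ("High", "Medium"): "P2",     ("High", "Low"): "P3",
    ("Medium", "High"): "P3",   ("Medium", "Medium"): "P4",   ("Medium", "Low"): "P4",
    ("Low", "High"): "P4",      ("Low", "Medium"): "P4",      ("Low", "Low"): "P5",
}

def priority(severity: str, frequency: str) -> str:
    """Look up the priority bucket for a (severity, frequency) pair from the matrix above."""
    return PRIORITY_MATRIX[(severity, frequency)]

assert priority("High", "Low") == "P3"
assert priority("Critical", "Low") == "P2"
```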
#### Timeline Buckets
**Immediate (This Week / Next Sprint):**
- P1 priorities (Critical issues regardless of frequency)
- Security vulnerabilities
- Blocking deployment or development
- Quick wins (high impact, low effort)
**Short-Term (1-3 Months / Next Quarter):**
- P2 priorities (High severity at high/medium frequency, or Critical severity at low frequency)
- Significant maintainability improvements
- Performance optimizations
- Breaking circular dependencies
**Medium-Term (3-6 Months):**
- P3 priorities (Medium severity at high frequency, or High severity at low frequency)
- Architectural refactoring
- Technical debt paydown
- System-wide improvements
**Long-Term (6-12+ Months):**
- P4-P5 priorities (Low severity, backlog items)
- Nice-to-have improvements
- Experimental optimizations
- Deferred enhancements
#### Prioritized Recommendation Format
```markdown
## Recommendations
### Immediate (This Week / Next Sprint) - P1
**1. Fix Rate Limiter Scalability Vulnerability**
- **Severity:** Critical (blocks horizontal scaling)
- **Frequency:** High (affects all gateway scaling attempts)
- **Priority:** P1
- **Impact:** Cannot scale API gateway, potential rate limit bypass
- **Effort:** Medium (2-3 days migration to Redis)
- **Action:**
1. Document current limitation in ops runbook (Day 1)
2. Add monitoring for rate limit violations (Day 1)
3. Migrate to Redis-backed rate limiter (Days 2-3)
4. Validate with load testing (Day 3)
**2. Remove Hardcoded Database Credentials**
- **Severity:** Critical (security vulnerability)
- **Frequency:** Low (only affects DB config rotation)
- **Priority:** P1
- **Impact:** Credentials exposed in source control, rotation requires code deployment
- **Effort:** Low (< 1 day)
- **Action:**
1. Move credentials to environment variables
2. Update deployment configs
3. Rotate compromised credentials
### Short-Term (1-3 Months / Next Quarter) - P2
**3. Extract Common Validation Framework**
- **Severity:** High (high duplication, shotgun surgery for validation changes)
- **Frequency:** High (every new API endpoint)
- **Priority:** P2
- **Impact:** 3 duplicate validation implementations, 15% code duplication
- **Effort:** Medium (1 week to extract + migrate)
- **Action:**
1. Design validation framework API (2 days)
2. Implement core framework (2 days)
3. Migrate existing validators (2 days)
4. Document validation patterns (1 day)
**4. Externalize Database Pool Configuration**
- **Severity:** High (hardcoded limits cause connection exhaustion)
- **Frequency:** Medium (impacts under load spikes)
- **Priority:** P2
- **Impact:** Connection pool exhaustion during traffic spikes
- **Effort:** Low (2 days)
- **Action:**
1. Move pool config to environment variables
2. Add runtime pool size adjustment
3. Document tuning guidelines
### Medium-Term (3-6 Months) - P3
**5. Break User ↔ Notification Circular Dependency**
- **Severity:** Medium (architectural coupling)
- **Frequency:** Medium (affects both subsystem modifications)
- **Priority:** P3
- **Impact:** Difficult to modify either service independently
- **Effort:** High (2-3 weeks, requires event bus introduction)
- **Action:**
1. Design event bus architecture (1 week)
2. Implement notification via events (1 week)
3. Migrate user service to publish events (3 days)
4. Remove direct dependency (2 days)
**6. Add Docstrings to Public API (27% → 90% coverage)**
- **Severity:** Medium (maintainability concern)
- **Frequency:** Medium (affects onboarding, API understanding)
- **Priority:** P3
- **Impact:** Poor API discoverability, onboarding friction
- **Effort:** Medium (2-3 weeks distributed work)
- **Action:**
1. Establish docstring standard (1 day)
2. Document public APIs in batches (2 weeks)
3. Add pre-commit hook to enforce (1 day)
### Long-Term (6-12+ Months) - P4-P5
**7. Evaluate Circuit Breaker Effectiveness**
- **Severity:** Low (optimization opportunity)
- **Frequency:** Low (affects only failure scenarios)
- **Priority:** P4
- **Impact:** Potential false positives, could improve resilience
- **Effort:** Medium (1 week testing + analysis)
- **Action:** Load testing + monitoring analysis when capacity allows
**8. Extract Magic Numbers to Configuration**
- **Severity:** Low (code quality improvement)
- **Frequency:** Low (rarely needs changing)
- **Priority:** P5
- **Impact:** Minor maintainability improvement
- **Effort:** Low (2-3 days)
- **Action:** Backlog item, tackle during related refactoring
```
#### Priority Summary Table
Include summary table for quick scanning:
```markdown
## Priority Summary
| Priority | Count | Severity Distribution | Total Effort |
|----------|-------|----------------------|--------------|
| **P1** (Immediate) | 2 | Critical: 2 | 4 days |
| **P2** (Short-term) | 2 | High: 2 | 2.5 weeks |
| **P3** (Medium-term) | 2 | Medium: 2 | 5-6 weeks |
| **P4-P5** (Long-term) | 2 | Low: 2 | 2 weeks |
| **Total** | 8 | - | ~10 weeks |
**Recommended sprint allocation:**
- Sprint 1: P1 items (4 days) + start P2.3 validation framework
- Sprint 2: Complete P2.3 + P2.4 database pool config
- Quarter 2: P3 items (architectural improvements)
- Backlog: P4-P5 items (opportunistic improvements)
```
## Cross-Referencing Strategy
### Bidirectional Links
**Subsystem → Diagram:**
```markdown
## Authentication Service
[...subsystem details...]
**Component Architecture**: See [Authentication Service Components](#authentication-service-components) diagram
**Dependencies**: [API Gateway](#api-gateway), [Database Layer](#database-layer)
```
**Diagram → Subsystem:**
```markdown
### Authentication Service Components
[...diagram...]
**Description**: This component diagram shows internal structure of the Authentication Service. For additional operational details, see [Authentication Service](#authentication-service) in the subsystem catalog.
```
**Finding → Subsystem:**
```markdown
### Rate Limiter Scalability Issue
**Affected Subsystem**: [API Gateway](#api-gateway)
[...concern details...]
```
### Navigation Patterns
**Table of contents with anchor links:**
```markdown
## Table of Contents
1. [Executive Summary](#executive-summary)
2. [System Overview](#system-overview)
- [Purpose and Scope](#purpose-and-scope)
- [Technology Stack](#technology-stack)
3. [Architecture Diagrams](#architecture-diagrams)
- [Level 1: Context](#level-1-context)
- [Level 2: Container](#level-2-container)
```
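Anchor links only resolve if the slug matches what the renderer derives from the heading text. A minimal sketch of the common GitHub-style rule (lowercase, punctuation dropped, spaces to hyphens); verify against whatever renderer will display the report:
```python
import re

def anchor_slug(heading: str) -> str:
    """Approximate GitHub-style anchor: lowercase, strip punctuation, spaces become hyphens."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # drop punctuation such as ':' or '&'
    return re.sub(r"\s+", "-", slug)

print(anchor_slug("Appendix C: Assumptions and Limitations"))
# -> appendix-c-assumptions-and-limitations
```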
## Multi-Audience Considerations
### Executive Audience
**What they need:**
- Executive summary ONLY (should be self-contained)
- High-level patterns and risks
- Business impact of concerns
- Clear recommendations with timelines
**Document design:**
- Put executive summary first
- Make it readable standalone (no forward references)
- Focus on "why this matters" over "how it works"
### Architect Audience
**What they need:**
- System overview + architecture diagrams + key findings
- Pattern analysis with trade-offs
- Dependency relationships
- Design decisions and rationale
**Document design:**
- System overview explains context
- Diagrams show structure at multiple levels
- Findings synthesize patterns and concerns
- Cross-references enable non-linear reading
### Engineer Audience
**What they need:**
- Subsystem catalog with technical details
- Component diagrams showing internal structure
- Technology stack specifics
- File references and entry points
**Document design:**
- Detailed subsystem catalog
- Component-level diagrams
- Technology stack section with versions/frameworks
- Code/file references where available
### Operations Audience
**What they need:**
- Technical concerns with remediation
- Dependency mapping
- Confidence levels (what's validated vs assumed)
- Recommendations with priorities
**Document design:**
- Technical concerns section up front
- Clear remediation steps
- Appendix with assumptions/limitations
- Prioritized recommendations
## Optional Enhancements
### Visual Aids
**Subsystem Quick Reference Table:**
```markdown
## Appendix D: Subsystem Quick Reference
| Subsystem | Location | Confidence | Key Concerns | Dependencies |
|-----------|----------|------------|--------------|--------------|
| API Gateway | /src/gateway/ | High | Rate limiter scalability | Auth, User, Data, Logging |
| Auth Service | /src/services/auth/ | High | None | Database, Cache, Logging |
| User Service | /src/services/users/ | High | None | Database, Cache, Notification |
```
**Pattern Summary Matrix:**
```markdown
## Architectural Patterns Summary
| Pattern | Subsystems Using | Benefits | Trade-offs |
|---------|------------------|----------|------------|
| Dependency Injection | Auth, Gateway, User | Testability, flexibility | Initial complexity |
| Repository Pattern | User, Data | Data access abstraction | Extra layer |
| Circuit Breaker | Gateway | Fault isolation | False positives |
```
### Reading Guide
```markdown
## How to Read This Document
**For Executives** (5 minutes):
- Read [Executive Summary](#executive-summary) only
- Optionally skim [Recommendations](#recommendations)
**For Architects** (30 minutes):
- Read [Executive Summary](#executive-summary)
- Read [System Overview](#system-overview)
- Review [Architecture Diagrams](#architecture-diagrams)
- Read [Key Findings](#key-findings)
**For Engineers** (1 hour):
- Read [System Overview](#system-overview)
- Study [Architecture Diagrams](#architecture-diagrams) (all levels)
- Read [Subsystem Catalog](#subsystem-catalog) for relevant services
- Review [Technical Concerns](#technical-concerns)
**For Operations** (45 minutes):
- Read [Executive Summary](#executive-summary)
- Study [Technical Concerns](#technical-concerns)
- Review [Recommendations](#recommendations)
- Read [Appendix C: Assumptions and Limitations](#appendix-c-assumptions-and-limitations)
```
### Glossary
```markdown
## Appendix E: Glossary
**Circuit Breaker**: Fault tolerance pattern that prevents cascading failures by temporarily blocking requests to failing services.
**Dependency Injection**: Design pattern where dependencies are provided to components rather than constructed internally, enabling testability and loose coupling.
**Repository Pattern**: Data access abstraction that separates business logic from data persistence concerns.
**Optimistic Locking**: Concurrency control technique assuming conflicts are rare, using version checks rather than locks.
```
## Success Criteria
**You succeeded when:**
- Executive summary (2-3 paragraphs) distills key information
- Table of contents provides multi-level navigation
- Cross-references (30+) enable non-linear reading
- Patterns synthesized (not just listed from catalog)
- Concerns extracted and prioritized
- Recommendations actionable with timelines
- Diagrams integrated with contextual analysis
- Appendices document methodology, confidence, assumptions
- Professional structure (document metadata, clear hierarchy)
- Written to 04-final-report.md
**You failed when:**
- Simple concatenation of source documents
- No executive summary or it requires reading full document
- Missing table of contents
- No cross-references between sections
- Patterns just copied from catalog (not synthesized)
- Concerns buried without extraction
- Recommendations vague or unprioritized
- Diagrams pasted without context
- Missing appendices
## Best Practices from Baseline Testing
### What Works
**Comprehensive synthesis** - Identify patterns, extract concerns, create narrative
**Professional structure** - Document metadata, TOC, clear hierarchy, appendices
**Multi-level navigation** - 20+ TOC entries, 40+ cross-references
**Executive summary** - Self-contained 2-3 paragraph distillation
**Actionable findings** - Concerns with severity/impact/remediation, recommendations with timelines
**Transparency** - Confidence levels, assumptions, limitations documented
**Diagram integration** - Embedded with contextual analysis and cross-refs
**Multi-audience** - Executive summary + technical depth + appendices
### Synthesis Patterns
**Pattern identification:**
- Look across multiple subsystems for recurring themes
- Group by pattern name (e.g., "Repository Pattern")
- Document which subsystems use it
- Explain benefits and trade-offs
**Concern extraction:**
- Find concerns in subsystem catalog entries
- Elevate to Key Findings section
- Add severity, impact, remediation
- Prioritize by timeline (immediate/short/long)
**Recommendation structure:**
- Group by timeline
- Specific actions (not vague suggestions)
- Validation steps
- Priority indicators
## Integration with Workflow
This skill is typically invoked as:
1. **Coordinator** completes and validates subsystem catalog
2. **Coordinator** completes and validates architecture diagrams
3. **Coordinator** writes task specification for final report
4. **YOU** read both source documents systematically
5. **YOU** synthesize patterns, extract concerns, create recommendations
6. **YOU** build professional report structure with navigation
7. **YOU** write to 04-final-report.md
8. **Validator** (optional) checks for synthesis quality, navigation, completeness
**Your role:** Transform analysis artifacts into stakeholder-ready documentation through synthesis, organization, and professional presentation.

View File

@@ -0,0 +1,306 @@
# Generating Architecture Diagrams
## Purpose
Generate C4 architecture diagrams (Context, Container, Component levels) from subsystem catalogs, producing readable visualizations that communicate architecture without overwhelming readers.
## When to Use
- Coordinator delegates diagram generation from `02-subsystem-catalog.md`
- Task specifies writing to `03-diagrams.md`
- Need to visualize system architecture at multiple abstraction levels
- Output integrates with validation and final reporting phases
## Core Principle: Abstraction Over Completeness
**Readable diagrams communicate architecture. Overwhelming diagrams obscure it.**
Your goal: Help readers understand the system, not document every detail.
## Output Contract
When writing to `03-diagrams.md`, include:
**Required sections:**
1. **Context Diagram (C4 Level 1)**: System boundary, external actors, external systems
2. **Container Diagram (C4 Level 2)**: Major subsystems with dependencies
3. **Component Diagrams (C4 Level 3)**: Internal structure for 2-3 representative subsystems
4. **Assumptions and Limitations**: What you inferred, what's missing, diagram constraints
**For each diagram:**
- Title (describes what the diagram shows)
- Mermaid or PlantUML code block (as requested)
- Description (narrative explanation after diagram)
- Legend (notation explained)
## C4 Level Selection
### Level 1: Context Diagram
**Purpose:** System boundary and external interactions
**Show:**
- The system as single box
- External actors (users, administrators)
- External systems (databases, services, repositories)
- High-level relationships
**Don't show:**
- Internal subsystems (that's Level 2)
- Implementation details
**Example scope:** "User Data Platform and its external dependencies"
### Level 2: Container Diagram
**Purpose:** Major subsystems and their relationships
**Show:**
- Internal subsystems/services/plugins
- Dependencies between them
- External systems they connect to
**Abstraction strategies:**
- **Simple systems (≤8 subsystems)**: Show all subsystems individually
- **Complex systems (>8 subsystems)**: Use grouping strategies:
- Group by category/domain (e.g., faction, layer, purpose)
- Add metadata to convey scale (e.g., "13 skills", "9 services")
- Reduce visual elements while preserving fidelity
**Don't show:**
- Internal components within subsystems (that's Level 3)
- Every file or class
**Example scope:** "15 plugins organized into 6 domain categories"
### Level 3: Component Diagrams
**Purpose:** Internal architecture of selected subsystems
**Selection criteria (choose 2-3 subsystems that):**
1. **Architectural diversity** - Show different patterns (router vs orchestrator, sync vs async)
2. **Scale representation** - Include largest/most complex if relevant
3. **Critical path** - Entry points, security-critical, data flow bottlenecks
4. **Avoid redundancy** - Don't show 5 examples of same pattern
**Show:**
- Internal components/modules/classes
- Relationships between components
- External dependencies for context
**Document selection rationale:**
```markdown
**Selection Rationale**:
- Plugin A: Largest (13 skills), shows router pattern
- Plugin B: Different organization (platform-based vs algorithm-based)
- Plugin C: Process orchestration (vs knowledge routing)
**Why Not Others**: 8 plugins follow similar pattern to A (redundant)
```
## Abstraction Strategies for Complexity
When facing many subsystems (10+):
### Strategy 1: Natural Grouping
**Look for existing structure:**
- Categories in metadata (AI/ML, Security, UX)
- Layers (presentation, business, data)
- Domains (user management, analytics, reporting)
**Example:**
```mermaid
subgraph "AI/ML Domain"
YzmirRouter[Router: 1 skill]
YzmirRL[Deep RL: 13 skills]
YzmirLLM[LLM: 8 skills]
end
```
**Benefit:** Aligns with how users think about the system
### Strategy 2: Metadata Enrichment
**Add context without detail:**
- Skill counts: "Deep RL: 13 skills"
- Line counts: "342 lines"
- Status: "Complete" vs "WIP"
**Benefit:** Conveys scale without visual clutter
### Strategy 3: Strategic Sampling
**For Component diagrams, sample ~20%:**
- Choose diverse examples (not all similar)
- Document "Why these, not others"
- Prefer breadth over depth
**Benefit:** Readers see architectural variety without information overload
## Notation Conventions
### Relationship Types
Use different line styles for different semantics:
- **Solid lines** (`-->`) - Data dependencies, function calls, HTTP requests
- **Dotted lines** (`-.->`) - Routing relationships, optional dependencies, logical grouping
- **Bold lines** - Critical path, high-frequency interactions (if tooling supports)
**Example:**
```mermaid
%% Logical routing
Router -.->|"Routes to"| SpecializedSkill
%% Data flow
Gateway -->|"Calls"| AuthService
```
### Color Coding
Use color to create visual hierarchy:
- **Factions/domains** - Different color per group
- **Status** - Green (complete), yellow (WIP), gray (external)
- **Importance** - Highlight critical paths
**Document in legend:** Explain what colors mean
### Component Annotation
Add metadata in labels:
```mermaid
AuthService[Authentication Service<br/>Python<br/>342 lines]
```
## Handling Incomplete Information
### When Catalog Has Gaps
**Inferred components (reasonable):**
- Catalog references "Cache Service" repeatedly → Include in diagram
- **MUST document:** "Cache Service inferred from dependencies (not in catalog)"
- **Consider notation:** Dotted border or lighter color for inferred components
**Missing dependencies (don't guess):**
- Catalog says "Outbound: Unknown" → Document limitation
- **Don't invent:** Leave out rather than guess
### When Patterns Don't Map Directly
**Catalog says "Patterns Observed: Circuit breaker"**
**Reasonable:** Add circuit breaker component to diagram (it's architectural)
**Document:** "Circuit breaker shown based on pattern observation (not explicit component)"
## Documentation Template
After diagrams, include:
```markdown
## Assumptions and Limitations
### Assumptions
1. **Component X**: Inferred from Y references in catalog
2. **Protocol**: Assumed HTTP/REST based on API Gateway pattern
3. **Grouping**: Used faction categories from metadata
### Limitations
1. **Incomplete Catalog**: Only 5/10 subsystems documented
2. **Missing Details**: Database schema not available
3. **Deployment**: Scaling/replication not shown
### Diagram Constraints
- **Format**: Mermaid syntax (may not render in all viewers)
- **Abstraction**: Component diagrams for 3/15 subsystems only
- **Trade-offs**: Visual clarity prioritized over completeness
### Confidence Levels
- **High**: Subsystems A, B, C (well-documented)
- **Medium**: Subsystem D (some gaps in dependencies)
- **Low**: Subsystem E (minimal catalog entry)
```
## Mermaid vs PlantUML
**Default to Mermaid unless task specifies otherwise.**
**Mermaid advantages:**
- Native GitHub rendering
- Simpler syntax
- Better IDE support
**PlantUML when requested:**
```plantuml
@startuml
!include <C4/C4_Context>
Person(user, "User")
System(platform, "Platform")
Rel(user, platform, "Uses")
@enduml
```
## Success Criteria
**You succeeded when:**
- All 3 C4 levels generated (Context, Container, Component for 2-3 subsystems)
- Diagrams are readable (not overwhelming)
- Selection rationale documented
- Assumptions and limitations section present
- Syntax valid (Mermaid or PlantUML)
- Titles, descriptions, legends included
- Written to 03-diagrams.md
**You failed when:**
- Skipped diagram levels
- Created overwhelming diagrams (15 flat boxes instead of grouped)
- No selection rationale for Component diagrams
- Invalid syntax
- Missing documentation sections
- Invented relationships without noting as inferred
## Best Practices from Baseline Testing
### What Works
**Faction-based grouping** - Reduce visual complexity (15 → 6 groups)
**Metadata enrichment** - Skill counts, line counts convey scale
**Strategic sampling** - 20% Component diagrams showing diversity
**Clear rationale** - Document why you chose these examples
**Notation for relationships** - Dotted (routing) vs solid (data)
**Color hierarchy** - Visual grouping by domain
**Trade-off documentation** - Explicit "what's visible vs abstracted"
### Common Patterns
**Router pattern visualization:**
- Show router as distinct component
- Use dotted lines for routing relationships
- Group routed-to components
**Layered architecture:**
- Use subgraphs for layers
- Show dependencies flowing between layers
- Don't duplicate components across layers
**Microservices:**
- Group related services by domain
- Show API gateway as entry point
- External systems distinct from internal services
## Integration with Workflow
This skill is typically invoked as:
1. **Coordinator** completes subsystem catalog (02-subsystem-catalog.md)
2. **Coordinator** validates catalog (optional validation gate)
3. **Coordinator** writes task specification for diagram generation
4. **YOU** read catalog systematically
5. **YOU** generate diagrams following abstraction strategies
6. **YOU** document assumptions, limitations, selection rationale
7. **YOU** write to 03-diagrams.md
8. **Validator** checks diagrams for syntax, completeness, readability
**Your role:** Translate catalog into readable visual architecture using abstraction and selection strategies.

View File

@@ -0,0 +1,370 @@
# Validating Architecture Analysis
## Purpose
Validate architecture analysis artifacts (subsystem catalogs, diagrams, reports) against contract requirements and cross-document consistency standards, producing actionable validation reports with clear approval/revision status.
## When to Use
- Coordinator delegates validation after document production
- Task specifies validating `02-subsystem-catalog.md`, `03-diagrams.md`, or `04-final-report.md`
- Validation gate required before proceeding to next phase
- Need independent quality check with fresh eyes
- Output determines whether work progresses or requires revision
## Core Principle: Systematic Verification
**Good validation finds all issues systematically. Poor validation misses violations or invents false positives.**
Your goal: Thorough, objective, evidence-based validation with specific, actionable feedback.
## Validation Types
### Type 1: Contract Compliance
**Validate single document against its contract:**
**Example contracts:**
- **Subsystem Catalog** (`02-subsystem-catalog.md`): 8 required sections per entry (name as H2 heading, Location, Responsibility, Key Components, Dependencies [Inbound/Outbound format], Patterns Observed, Concerns, Confidence) plus a `---` separator
- **Architecture Diagrams** (`03-diagrams.md`): Context + Container + 2-3 Component diagrams, titles/descriptions/legends, assumptions section
- **Final Report** (`04-final-report.md`): Executive summary, TOC, diagrams integrated, key findings, appendices
**Validation approach:**
1. Read contract specification from task or skill documentation
2. Check document systematically against each requirement
3. Flag missing sections, extra sections, wrong formats
4. Distinguish CRITICAL (contract violations) vs WARNING (quality issues)
### Type 2: Cross-Document Consistency
**Validate that multiple documents align:**
**Common checks:**
- Catalog dependencies match diagram arrows
- Diagram subsystems listed in catalog
- Final report references match source documents
- Confidence levels consistent across documents
**Validation approach:**
1. Extract key elements from each document
2. Cross-reference systematically
3. Flag inconsistencies with specific citations
4. Provide fixes that maintain consistency
## Output: Validation Report
### File Path (CRITICAL)
**Write to workspace temp/ directory:**
```
<workspace>/temp/validation-<document-name>.md
```
**Examples:**
- Workspace: `docs/arch-analysis-2025-11-12-1234/`
- Catalog validation: `docs/arch-analysis-2025-11-12-1234/temp/validation-catalog.md`
- Diagram validation: `docs/arch-analysis-2025-11-12-1234/temp/validation-diagrams.md`
- Consistency validation: `docs/arch-analysis-2025-11-12-1234/temp/validation-consistency.md`
**DO NOT use absolute paths like `/home/user/skillpacks/temp/`** - write to workspace temp/.
### Report Structure (Template)
```markdown
# Validation Report: [Document Name]
**Document:** `<path to validated document>`
**Validation Date:** YYYY-MM-DD
**Overall Status:** APPROVED | NEEDS_REVISION (CRITICAL) | NEEDS_REVISION (WARNING)
## Contract Requirements
[List the contract requirements being validated against]
## Validation Results
### [Entry/Section 1]
**CRITICAL VIOLATIONS:**
1. [Specific issue with line numbers]
2. [Specific issue with line numbers]
**WARNINGS:**
1. [Quality issue, not blocking]
**Passes:**
- ✓ [What's correct]
- ✓ [What's correct]
**Summary:** X CRITICAL, Y WARNING
### [Entry/Section 2]
...
## Overall Assessment
**Total [Entries/Sections] Analyzed:** N
**[Entries/Sections] with CRITICAL:** X
**Total CRITICAL Violations:** Y
**Total WARNINGS:** Z
### Violations by Type:
1. **[Type]:** Count
2. **[Type]:** Count
## Recommended Actions
### For [Entry/Section]:
[Specific fix with code block]
## Validation Approach
**Methodology:**
[How you validated]
**Checklist:**
[Systematic verification steps]
## Self-Assessment
**Did I find all violations?**
[YES/NO with reasoning]
**Coverage:**
[What was checked]
**Confidence:** [High/Medium/Low]
## Summary
**Status:** [APPROVED or NEEDS_REVISION]
**Critical Issues:** [Count]
**Warnings:** [Count]
[Final disposition]
```
## Validation Status Levels
### APPROVED
**When to use:**
- All contract requirements met
- No CRITICAL violations
- Minor quality issues acceptable (or none)
**Report should:**
- Confirm compliance
- List what was checked
- Note any minor observations
### NEEDS_REVISION (WARNING)
**When to use:**
- Contract compliant
- Quality issues present (vague descriptions, weak reasoning)
- NOT blocking progression
**Report should:**
- Confirm contract compliance
- List quality improvements suggested
- Note: "Not blocking, but recommended"
- Distinguish from CRITICAL
### NEEDS_REVISION (CRITICAL)
**When to use:**
- Contract violations (missing/extra sections, wrong format)
- Cross-document inconsistencies
- BLOCKS progression to next phase
**Report should:**
- List all CRITICAL violations
- Provide specific fixes for each
- Be clear this blocks progression
## Systematic Validation Checklist
### For Subsystem Catalog
**Per entry:**
```
[ ] Section 1: Subsystem name as H2 heading (## Name)?
[ ] Section 2: Location with absolute path in backticks?
[ ] Section 3: Responsibility as single sentence?
[ ] Section 4: Key Components as bulleted list?
[ ] Section 5: Dependencies in "Inbound: X / Outbound: Y" format?
[ ] Section 6: Patterns Observed as bulleted list?
[ ] Section 7: Concerns present (or "None observed")?
[ ] Section 8: Confidence (High/Medium/Low) with reasoning?
[ ] Separator "---" after entry?
[ ] No extra sections beyond these 8?
[ ] Sections in correct order?
```
**Whole document:**
```
[ ] All subsystems have entries?
[ ] No placeholder text ("[TODO]", "[Fill in]")?
[ ] File named "02-subsystem-catalog.md"?
```
### For Architecture Diagrams
**Diagram levels:**
```
[ ] Context diagram (C4 Level 1) present?
[ ] Container diagram (C4 Level 2) present?
[ ] Component diagrams (C4 Level 3) present? (2-3 required)
```
**Per diagram:**
```
[ ] Title present and descriptive?
[ ] Description present after diagram?
[ ] Legend explaining notation?
[ ] Valid syntax (Mermaid or PlantUML)?
```
**Supporting sections:**
```
[ ] Assumptions and Limitations section present?
[ ] Confidence levels documented?
```
### For Cross-Document Consistency
**Catalog ↔ Diagrams:**
```
[ ] Each catalog subsystem shown in Container diagram?
[ ] Each catalog "Outbound" dependency shown as diagram arrow?
[ ] Each diagram arrow corresponds to catalog dependency?
[ ] Bidirectional: If A→B in catalog, B shows A as Inbound?
```
**Diagrams ↔ Final Report:**
```
[ ] All diagrams from 03-diagrams.md embedded in report?
[ ] Subsystem descriptions in report match catalog?
[ ] Key findings reference actual concerns from catalog?
```
## Cross-Document Validation Pattern
**Step-by-step approach:**
1. **Extract from Catalog:**
- List all subsystems
- For each, extract "Outbound" dependencies
2. **Extract from Diagram:**
- Find Container diagram
- List all relationship arrows (`-->`, `-.->`) for Mermaid, or `Rel()` calls for PlantUML
- Map source → target for each relationship
3. **Cross-Reference:**
- For each catalog dependency, check if diagram shows arrow
- For each diagram arrow, check if catalog lists dependency
- Flag mismatches
4. **Report Inconsistencies:**
- Use summary table showing what matches and what doesn't
- Provide line numbers from both documents
- Suggest specific fixes (add arrow, update catalog)
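A minimal sketch of steps 1-3 for a Mermaid container diagram, assuming catalog dependencies appear on lines containing `Outbound: A, B` and diagram relationships use `-->` arrows (both formats are assumptions; normalizing names between the two documents is left to you):
```python
import re

def catalog_outbound(catalog_text: str) -> set[tuple[str, str]]:
    """Extract (subsystem, outbound dependency) pairs from the catalog markdown."""
    pairs: set[tuple[str, str]] = set()
    current = None
    for line in catalog_text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
        match = re.search(r"Outbound:\*{0,2}\s*(.+)", line)
        if current and match and match.group(1).strip().lower() not in {"none", "unknown"}:
            pairs.update((current, dep.strip()) for dep in match.group(1).split(","))
    return pairs

def diagram_arrows(diagram_text: str) -> set[tuple[str, str]]:
    """Extract (source, target) pairs from Mermaid '-->' relationships."""
    return {
        (m.group(1), m.group(2))
        for m in re.finditer(r"(\w+)\s*-->\s*(?:\|[^|]*\|\s*)?(\w+)", diagram_text)
    }

def report_mismatches(catalog_text: str, diagram_text: str) -> None:
    """Flag dependencies present in one document but missing from the other."""
    cat, dia = catalog_outbound(catalog_text), diagram_arrows(diagram_text)
    print("In catalog but not in diagram:", sorted(cat - dia))
    print("In diagram but not in catalog:", sorted(dia - cat))
```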
## Best Practices from Baseline Testing
### What Works
**Thorough checking** - Find ALL violations, not just first one
**Specific feedback** - Line numbers, exact quotes, actionable fixes
**Professional reports** - Metadata, methodology, self-assessment
**Systematic checklists** - Document what was verified
**Clear status** - APPROVED / NEEDS_REVISION with severity
**Summary visualizations** - Tables showing passed vs failed
**Impact analysis** - Explain why issues matter
**Self-assessment** - Verify own completeness
### Validation Excellence
**Thoroughness patterns:**
- Check every entry/section (100% coverage)
- Find both missing AND extra sections
- Distinguish format violations from quality issues
**Specificity patterns:**
- Provide line numbers for all findings
- Quote exact text showing violation
- Show what correct format should be
**Actionability patterns:**
- Provide code blocks with fixes
- Suggest alternatives when applicable
- Prioritize fixes (CRITICAL first)
## Common Pitfalls to Avoid
**Stopping after first violation** - Find ALL issues
**Vague feedback** ("improve quality" vs "add Concerns section")
**Wrong status level** (marking quality issues as CRITICAL)
**False positives** (inventing issues that don't exist)
**Too lenient** (approving despite violations)
**Too strict** (marking everything CRITICAL)
**Wrong file path** (absolute path vs workspace temp/)
**Skipping self-assessment** (verify your own completeness)
## Objectivity Under Pressure
**If coordinator says "looks fine to me":**
- Validate independently anyway
- Evidence-based judgment (cite specific contract)
- Don't soften CRITICAL to WARNING due to authority
- Stand firm: validation is independent quality gate
**If time pressure exists:**
- Still validate systematically (don't skip checks)
- Document what was validated and what wasn't
- If truly insufficient time, report that honestly
## Success Criteria
**You succeeded when:**
- Found all contract violations (100% detection)
- Specific feedback with line numbers
- Actionable fixes provided
- Clear status (APPROVED/NEEDS_REVISION with severity)
- Professional report structure
- Wrote to workspace temp/ directory
- Self-assessment confirms completeness
**You failed when:**
- Missed violations
- Vague feedback ("improve this")
- Wrong status level (quality issue marked CRITICAL)
- No actionable fixes
- Wrote to wrong path
- Approved despite violations
## Integration with Workflow
This skill is typically invoked as:
1. **Coordinator** produces document (catalog, diagrams, or report)
2. **Coordinator** spawns validation subagent (YOU)
3. **YOU** read document(s) and contract requirements
4. **YOU** validate systematically using checklists
5. **YOU** write validation report to workspace temp/
6. **Coordinator** reads validation report
7. **If APPROVED**: Coordinator proceeds to next phase
8. **If NEEDS_REVISION**: Coordinator fixes issues, re-validates (max 2 retries)
**Your role:** Independent quality gate ensuring artifacts meet standards before progression.