Initial commit

2025-11-30 08:59:22 +08:00
commit b2731247f4
13 changed files with 3454 additions and 0 deletions
--- a/skills/using-system-archaeologist/analyzing-unknown-codebases.md
+++ b/skills/using-system-archaeologist/analyzing-unknown-codebases.md
@@ -0,0 +1,327 @@
+
+# Analyzing Unknown Codebases
+
+## Purpose
+
+Systematically analyze unfamiliar code to identify subsystems, components, dependencies, and architectural patterns. Produce catalog entries that follow EXACT output contracts.
+
+## When to Use
+
+- Coordinator delegates subsystem analysis task
+- Task specifies reading from workspace and appending to `02-subsystem-catalog.md`
+- You need to analyze code you haven't seen before
+- Output must integrate with downstream tooling (validation, diagram generation)
+
+## Critical Principle: Contract Compliance
+
+**Your analysis quality doesn't matter if you violate the output contract.**
+
+**Common rationalization:** "I'll add helpful extra sections to improve clarity"
+
+**Reality:** Extra sections break downstream tools. The coordinator expects EXACT format for parsing and validation. Your job is to follow the specification, not improve it.
+
+## Output Contract (MANDATORY)
+
+When writing to `02-subsystem-catalog.md`, append EXACTLY this format:
+
+```markdown
+## [Subsystem Name]
+
+**Location:** `path/to/subsystem/`
+
+**Responsibility:** [One sentence describing what this subsystem does]
+
+**Key Components:**
+- `file1.ext` - [Brief description]
+- `file2.ext` - [Brief description]
+- `file3.ext` - [Brief description]
+
+**Dependencies:**
+- Inbound: [Subsystems that depend on this one]
+- Outbound: [Subsystems this one depends on]
+
+**Patterns Observed:**
+- [Pattern 1]
+- [Pattern 2]
+
+**Concerns:**
+- [Any issues, gaps, or technical debt observed]
+
+**Confidence:** [High/Medium/Low] - [Brief reasoning]
+
+```
+
+**If no concerns exist, write:**
+```markdown
+**Concerns:**
+- None observed
+```
+
+**CRITICAL COMPLIANCE RULES:**
+- ❌ Add extra sections ("Integration Points", "Recommendations", "Files", etc.)
+- ❌ Change section names or reorder them
+- ❌ Write to separate file (must append to `02-subsystem-catalog.md`)
+- ❌ Skip sections (include ALL sections - use "None observed" if empty)
+- ✅ Copy the template structure EXACTLY
+- ✅ Keep section order: Location → Responsibility → Key Components → Dependencies → Patterns → Concerns → Confidence
+
+**Contract is specification, not minimum. Extra sections break downstream validation.**
+
+### Example: Complete Compliant Entry
+
+Here's what a correctly formatted entry looks like:
+
+```markdown
+## Authentication Service
+
+**Location:** `/src/services/auth/`
+
+**Responsibility:** Handles user authentication, session management, and JWT token generation for API access.
+
+**Key Components:**
+- `auth_handler.py` - Main authentication logic with login/logout endpoints (342 lines)
+- `token_manager.py` - JWT token generation and validation (156 lines)
+- `session_store.py` - Redis-backed session storage (98 lines)
+
+**Dependencies:**
+- Inbound: API Gateway, User Service
+- Outbound: Database Layer, Cache Service, Logging Service
+
+**Patterns Observed:**
+- Dependency injection for testability (all external services injected)
+- Token refresh pattern with sliding expiration
+- Audit logging for all authentication events
+
+**Concerns:**
+- None observed
+
+**Confidence:** High - Clear entry points, documented API, test coverage validates behavior
+
+```
+
+**This is EXACTLY what your output should look like.** No more, no less.
+
+## Systematic Analysis Approach
+
+### Step 1: Read Task Specification
+
+Your task file (`temp/task-[name].md`) specifies:
+- What to analyze (scope: directories, plugins, services)
+- Where to read context (`01-discovery-findings.md`)
+- Where to write output (`02-subsystem-catalog.md` - append)
+- Expected format (the contract above)
+
+**Read these files FIRST before analyzing code.**
+
+### Step 2: Layered Exploration
+
+Use this proven approach from baseline testing:
+
+1. **Metadata layer** - Read plugin.json, package.json, setup.py
+2. **Structure layer** - Examine directory organization
+3. **Router layer** - Find and read router/index files (often named "using-X")
+4. **Sampling layer** - Read 3-5 representative files
+5. **Quantitative layer** - Use line counts as depth indicators
+
+**Why this order works:**
+- Metadata gives overview without code diving
+- Structure reveals organization philosophy
+- Routers often catalog all components
+- Sampling verifies patterns
+- Quantitative data supports claims
+
+### Step 3: Mark Confidence Explicitly
+
+**Every output MUST include confidence level with reasoning.**
+
+**High confidence** - Router skill provided catalog + verified with sampling
+```markdown
+**Confidence:** High - Router skill listed all 10 components, sampling 4 confirmed patterns
+```
+
+**Medium confidence** - No router, but clear structure + sampling
+```markdown
+**Confidence:** Medium - No router catalog, inferred from directory structure + 5 file samples
+```
+
+**Low confidence** - Incomplete, placeholders, or unclear organization
+```markdown
+**Confidence:** Low - Several SKILL.md files missing, test artifacts suggest work-in-progress
+```
+
+### Step 4: Distinguish States Clearly
+
+When analyzing codebases with mixed completion:
+
+**Complete** - Skill file exists, has content, passes basic read test
+```markdown
+- `skill-name/SKILL.md` - Complete skill (1,234 lines)
+```
+
+**Placeholder** - Skill file exists but is stub/template
+```markdown
+- `skill-name/SKILL.md` - Placeholder (12 lines, template only)
+```
+
+**Planned** - Referenced in router but no file exists
+```markdown
+- `skill-name` - Planned (referenced in router, not implemented)
+```
+
+**TDD artifacts** - Test scenarios, baseline results (these ARE documentation)
+```markdown
+- `test-scenarios.md` - TDD test scenarios (RED phase)
+- `baseline-results.md` - Baseline behavior documentation
+```
+
+### Step 5: Write Output (Contract Compliance)
+
+**Before writing:**
+1. Prepare your entry in EXACT contract format from the template above
+2. Copy the structure - don't paraphrase or reorganize
+3. Triple-check you have ALL sections in correct order
+
+**When writing:**
+1. **Target file:** `02-subsystem-catalog.md` in workspace directory
+2. **Operation:** Append your entry (create file if first entry, append if file exists)
+3. **Method:**
+   - If file exists: Read current content, then Write with original + your entry
+   - If file doesn't exist: Write your entry directly
+4. **Format:** Follow contract sections in exact order
+5. **Completeness:** Include ALL sections - use "None observed" for empty Concerns
+
+**DO NOT create separate files** (e.g., `subsystem-X-analysis.md`). The coordinator expects all entries in `02-subsystem-catalog.md`.
+
+**After writing:**
+1. Re-read `02-subsystem-catalog.md` to verify your entry was added correctly
+2. Validate format matches contract exactly using this checklist:
+
+**Self-Validation Checklist:**
+```
+[ ] Section 1: Subsystem name as H2 heading (## Name)
+[ ] Section 2: Location with backticks and absolute path
+[ ] Section 3: Responsibility as single sentence
+[ ] Section 4: Key Components as bulleted list with descriptions
+[ ] Section 5: Dependencies with "Inbound:" and "Outbound:" labels
+[ ] Section 6: Patterns Observed as bulleted list
+[ ] Section 7: Concerns present (with issues OR "None observed")
+[ ] Section 8: Confidence level (High/Medium/Low) with reasoning
+[ ] Separator: "---" line after confidence
+[ ] NO extra sections added
+[ ] Sections in correct order
+[ ] Entry in file: 02-subsystem-catalog.md (not separate file)
+```
+
+## Handling Uncertainty
+
+**When architecture is unclear:**
+
+1. **State what you observe** - Don't guess at intent
+   ```markdown
+   **Patterns Observed:**
+   - 3 files with similar structure (analysis.py, parsing.py, validation.py)
+   - Unclear if this is deliberate pattern or coincidence
+   ```
+
+2. **Mark confidence appropriately** - Low confidence is valid
+   ```markdown
+   **Confidence:** Low - Directory structure suggests microservices, but no service definitions found
+   ```
+
+3. **Use "Concerns" section** - Document gaps
+   ```markdown
+   **Concerns:**
+   - No clear entry point identified
+   - Dependencies inferred from imports, not explicit manifest
+   ```
+
+**DO NOT:**
+- Invent relationships you didn't verify
+- Assume "obvious" architecture without evidence
+- Skip confidence marking because you're uncertain
+
+## Positive Behaviors to Maintain
+
+From baseline testing, these approaches WORK:
+
+✅ **Read actual files** - Don't infer from names alone
+✅ **Use router skills** - They often provide complete catalogs
+✅ **Sample strategically** - 3-5 files verifies patterns without exhaustive reading
+✅ **Cross-reference** - Verify claims (imports match listed dependencies)
+✅ **Document assumptions** - Make reasoning explicit
+✅ **Line counts indicate depth** - 1,500-line skill vs 50-line stub matters
+
+## Common Rationalizations (STOP SIGNALS)
+
+If you catch yourself thinking these, STOP:
+
+| Rationalization | Reality |
+|-----------------|---------|
+| "I'll add Integration Points section for clarity" | Extra sections break downstream parsing |
+| "I'll write to separate file for organization" | Coordinator expects append to specified file |
+| "I'll improve the contract format" | Contract is specification from coordinator |
+| "More information is always helpful" | Your job: follow spec. Coordinator's job: decide what's included |
+| "This comprehensive format is better" | "Better" violates contract. Compliance is mandatory. |
+
+## Validation Criteria
+
+Your output will be validated against:
+
+1. **Contract compliance** - All sections present, no extras
+2. **File operation** - Appended to `02-subsystem-catalog.md`, not separate file
+3. **Confidence marking** - High/Medium/Low with reasoning
+4. **Evidence-based claims** - Components you actually read
+5. **Bidirectional dependencies** - If A→B, then B must show A as inbound
+
+**If validation returns NEEDS_REVISION:**
+- Read the validation report
+- Fix specific issues identified
+- Re-submit following contract
+
+## Success Criteria
+
+**You succeeded when:**
+- Entry appended to `02-subsystem-catalog.md` in exact contract format
+- All sections included (none skipped, none added)
+- Confidence level marked with reasoning
+- Claims supported by files you read
+- Validation returns APPROVED
+
+**You failed when:**
+- Added "helpful" extra sections
+- Wrote to separate file
+- Changed contract format
+- Skipped sections
+- No confidence marking
+- Validation returns BLOCK status
+
+## Anti-Patterns
+
+❌ **Add extra sections**
+"I'll add Recommendations section" → Violates contract
+
+❌ **Write to new file**
+"I'll create subsystem-X-analysis.md" → Should append to `02-subsystem-catalog.md`
+
+❌ **Skip required sections**
+"No concerns, so I'll omit that section" → Include section with "None observed"
+
+❌ **Change format**
+"I'll use numbered lists instead of bullet points" → Follow contract exactly
+
+❌ **Work without reading task spec**
+"I know what to do" → Read `temp/task-*.md` first
+
+## Integration with Workflow
+
+This skill is typically invoked as:
+
+1. **Coordinator** creates workspace and holistic assessment
+2. **Coordinator** writes task specification in `temp/task-[yourname].md`
+3. **YOU** read task spec + `01-discovery-findings.md`
+4. **YOU** analyze assigned subsystem systematically
+5. **YOU** append entry to `02-subsystem-catalog.md` following contract
+6. **Validator** checks your output against contract
+7. **Coordinator** proceeds to next phase if validation passes
+
+**Your role:** Analyze systematically, follow contract exactly, mark confidence explicitly.