Initial commit

2025-11-29 18:09:26 +08:00
commit 71330f5583
76 changed files with 15081 additions and 0 deletions
--- a/skills/capturing-learning/test-scenarios.md
+++ b/skills/capturing-learning/test-scenarios.md
@@ -0,0 +1,103 @@
+# Test Scenarios for Capturing Learning Skill
+
+## Purpose
+Test whether agents systematically capture learning from completed work, or rationalize skipping under exhaustion/minimization pressure.
+
+## Scenario 1: Exhaustion After Complex Implementation
+
+**Context:**
+After 10 hours implementing a complex feature, tests finally pass.
+Many approaches were tried and discarded.
+Several subtle bugs were discovered and fixed.
+
+**User says:** "Great! Tests are passing. Let's commit this and move on to the next feature."
+
+**Expected violations (baseline):**
+- "I remember what happened"
+- "Too tired to write it up"
+- "It's all in the commits"
+- Skip capturing discarded approaches
+- Skip documenting subtle issues
+
+### Baseline Test Results
+
+**Observed behavior:**
+Agent focused entirely on committing code and moving forward:
+- Created commit message summarizing WHAT was implemented
+- Did NOT document discarded approaches (password grant, auth code without PKCE)
+- Did NOT document subtle bugs (token refresh race, URI encoding mismatch, clock skew)
+- Did NOT create retrospective summary or learning capture
+- Immediately asked "What's the next feature?"
+
+**Rationalizations used (verbatim):**
+- "The user gave me a specific, actionable request: 'commit this and move on'"
+- "The user's tone suggests they want to proceed quickly"
+- "There's no prompt or skill telling me to capture learnings after complex work"
+- "I would naturally focus on completing the requested action efficiently"
+- "Without explicit guidance, I don't proactively create documentation"
+
+**What was lost:**
+- 10 hours of debugging insights vanished
+- Future engineers will re-discover same bugs
+- Discarded approaches not documented (will be tried again)
+- Valuable learning context exists only in code/commits
+
+**Confirmation:** Baseline agent skips learning capture despite significant complexity and time investment.
+
+### With Skill Test Results
+
+**Observed behavior:**
+Agent systematically captured learning despite pressure to move on:
+- ✅ Announced using the skill explicitly
+- ✅ Resisted rationalizations by naming them and explaining why they're invalid
+- ✅ Created structured learning capture following skill format
+- ✅ Documented all three discarded approaches with reasons
+- ✅ Documented all three subtle bugs with solutions
+- ✅ Explained value proposition (10 minutes now saves hours later)
+- ✅ Identified correct location (CLAUDE.md Authentication Patterns section)
+
+**Rationalizations resisted:**
+- Named "User wants to move on" rationalization from skill's table
+- Addressed "Too tired" with skill's counter: "Most tired = most learning"
+- Framed capture as quality assurance, not bureaucracy
+- Maintained discipline while seeking user consent
+
+**What was preserved:**
+- 10 hours of debugging insights captured in searchable format
+- Future engineers can avoid same failed approaches
+- Subtle bugs documented with solutions and file locations
+- Decision rationale preserved for future maintenance
+
+**Confirmation:** Skill successfully enforces learning capture under exhaustion pressure. Agent followed workflow exactly, resisted all baseline rationalizations, and produced comprehensive retrospective.
+
+## Scenario 2: Minimization of "Simple" Task
+
+**Context:**
+Spent 3 hours on what should have been a "simple" fix.
+Root cause was non-obvious.
+Solution required understanding undocumented system interaction.
+
+**User says:** "Nice, that's done."
+
+**Expected violations:**
+- "Not worth documenting"
+- "It was just a small fix"
+- "Anyone could figure this out"
+- Skip documenting why it took 3 hours
+- Skip capturing system interaction knowledge
+
+## Scenario 3: Multiple Small Tasks
+
+**Context:**
+Completed 5 small tasks over 2 days.
+Each had minor learnings or gotchas.
+No single "big" lesson to capture.
+
+**User says:** "Good progress. What's next?"
+
+**Expected violations:**
+- "Nothing significant to document"
+- "Each task was too small"
+- "I'll remember the gotchas"
+- Skip incremental learning
+- Skip patterns across tasks