Initial commit

2025-11-30 09:03:39 +08:00
commit cb5cc9ba28
7 changed files with 1242 additions and 0 deletions
--- a/skills/writing-skills/SKILL.md
+++ b/skills/writing-skills/SKILL.md
@@ -0,0 +1,849 @@
+---
+name: writing-skills
+description: Use when creating or editing rules/skills in .claude/rules/, whether context-specific skills (alwaysApply: false) or always-merged rules (alwaysApply: true) - applies TDD by identifying failure patterns (RED), writing rule/skill (GREEN), then closing loopholes (REFACTOR). Supports globs for file patterns. NEVER use "skill-creator" skill.
+---
+
+# Writing Rules
+
+## Overview
+
+**Writing rules (skills + always-apply rules) IS Test-Driven Development applied to process documentation.**
+
+**Rules are maintained in `.claude/rules/` as MDC files with frontmatter. Ruler processes rules based on `alwaysApply`:**
+
+- `alwaysApply: false` (or omitted) → generates `.claude/skills/` (context-loaded)
+- `alwaysApply: true` → merged into `AGENTS.md` (always present)
+
+**How it works:**
+
+1. Write rule as `.mdc` file in `.claude/rules/` with frontmatter (`alwaysApply: true/false`)
+2. **REQUIRED:** Run `npx skiller@latest apply` after ANY rule creation or update
+3. Ruler processes based on `alwaysApply`:
+   - `false` → generates `.claude/skills/` (context-loaded by Claude Code)
+   - `true` → merges into `AGENTS.md` (always present)
+4. Optionally add `globs` for file pattern matching
+
+You identify common failure patterns (baseline behavior), write the skill (documentation) addressing those patterns, then verify through application scenarios, and refactor (close loopholes).
+
+**Core principle:** If you didn't identify what agents naturally do wrong, you don't know if the skill prevents the right failures.
+
+**REQUIRED BACKGROUND:** You MUST understand test-driven-development before using this skill. That skill defines the fundamental RED-GREEN-REFACTOR cycle. This skill adapts TDD to documentation.
+
+**Official guidance:** For Anthropic's official skill authoring best practices, see anthropic-best-practices.md. This document provides additional patterns and guidelines that complement the TDD-focused approach in this skill.
+
+**Complete worked example:** See examples/CLAUDE_MD_TESTING.md for a full verification campaign testing CLAUDE.md documentation variants.
+
+## What is a Skill?
+
+A **skill** is a reference guide for proven techniques, patterns, or tools. Skills help future Claude instances find and apply effective approaches.
+
+**Skills are:** Reusable techniques, patterns, tools, reference guides
+
+**Skills are NOT:** Narratives about how you solved a problem once
+
+## TDD Mapping for Skills
+
+| TDD Concept             | Skill Creation                                   |
+| ----------------------- | ------------------------------------------------ |
+| **Test case**           | Anticipated failure pattern from experience      |
+| **Production code**     | Skill document (.mdc file in .claude/rules/)     |
+| **Test fails (RED)**    | Identify common mistakes without skill           |
+| **Test passes (GREEN)** | Skill addresses those specific mistakes          |
+| **Refactor**            | Close loopholes while maintaining clarity        |
+| **Write test first**    | Identify failure patterns BEFORE writing skill   |
+| **Watch it fail**       | Document exact rationalizations from experience  |
+| **Minimal code**        | Write skill addressing those specific violations |
+| **Watch it pass**       | Verify skill clarity through application         |
+| **Refactor cycle**      | Find new rationalizations → plug → re-verify     |
+
+The entire skill creation process follows RED-GREEN-REFACTOR.
+
+## When to Create a Skill
+
+**Create when:**
+
+- Technique wasn't intuitively obvious to you
+- You'd reference this again across projects
+- Pattern applies broadly (not project-specific)
+- Others would benefit
+
+**Don't create for:**
+
+- One-off solutions
+- Standard practices well-documented elsewhere
+- Project-specific conventions (put in CLAUDE.md)
+
+## Skill Types
+
+### Technique
+
+Concrete method with steps to follow (condition-based-waiting, root-cause-tracing)
+
+### Pattern
+
+Way of thinking about problems (flatten-with-flags, test-invariants)
+
+### Reference
+
+API docs, syntax guides, tool documentation (office docs)
+
+## Directory Structure
+
+**Single-file skills** (default):
+
+```
+.claude/rules/
+  skill-name.mdc         # MDC file with frontmatter
+```
+
+**Multi-file skills** (only if >1 file needed):
+
+```
+.claude/rules/
+  skill-name/
+    skill-name.mdc       # Main skill (same basename as folder)
+    supporting-file.*    # Additional files
+```
+
+**Ruler auto-generates from rules** - `npx skiller@latest apply` creates `.claude/skills/` from `.claude/rules/`
+
+**When to use folders:**
+
+- **Heavy reference** (100+ lines) - API docs, comprehensive syntax
+- **Reusable tools** - Scripts, utilities, templates
+- **Multiple files** - Anything requiring >1 file
+
+**Single .mdc file when:**
+
+- All content fits inline
+- No external scripts or heavy reference
+- Principles, concepts, code patterns (< 50 lines)
+
+## MDC File Structure
+
+**MDC format** (.mdc files) with **frontmatter**:
+
+**Frontmatter fields:**
+
+- `name`: Skill identifier (letters, numbers, hyphens only)
+- `description`: Discovery text (max 1024 chars total for frontmatter)
+  - Start with "Use when..." to focus on triggering conditions
+  - Include specific symptoms, situations, contexts
+  - Written in third person
+  - Keep under 500 characters if possible
+- `alwaysApply`: Must be `false` or omitted (ruler only generates skills from non-always rules)
+
+```markdown
+---
+name: skill-name-with-hyphens
+description: Use when [specific triggering conditions and symptoms] - [what the skill does and how it helps, written in third person]
+alwaysApply: false
+---
+
+# Skill Name
+
+## Overview
+
+What is this? Core principle in 1-2 sentences.
+
+## When to Use
+
+[Small inline flowchart IF decision non-obvious]
+
+Bullet list with SYMPTOMS and use cases
+When NOT to use
+
+## Core Pattern (for techniques/patterns)
+
+Before/after code comparison
+
+## Quick Reference
+
+Table or bullets for scanning common operations
+
+## Implementation
+
+Inline code for simple patterns
+Link to file for heavy reference or reusable tools
+
+## Common Mistakes
+
+What goes wrong + fixes
+
+## Real-World Impact (optional)
+
+Concrete results
+```
+
+## Claude Search Optimization (CSO)
+
+**Critical for discovery:** Future Claude needs to FIND your skill
+
+### 1. Rich Description Field
+
+**Purpose:** Claude reads description to decide which skills to load for a given task. Make it answer: "Should I read this skill right now?"
+
+**Format:** Start with "Use when..." to focus on triggering conditions, then explain what it does
+
+**Content:**
+
+- Use concrete triggers, symptoms, and situations that signal this skill applies
+- Describe the _problem_ (race conditions, inconsistent behavior) not _language-specific symptoms_ (setTimeout, sleep)
+- Keep triggers technology-agnostic unless the skill itself is technology-specific
+- If skill is technology-specific, make that explicit in the trigger
+- Write in third person (injected into system prompt)
+
+```yaml
+# ❌ BAD: Too abstract, vague, doesn't include when to use
+description: For async testing
+
+# ❌ BAD: First person
+description: I can help you with async tests when they're flaky
+
+# ❌ BAD: Mentions technology but skill isn't specific to it
+description: Use when tests use setTimeout/sleep and are flaky
+
+# ✅ GOOD: Starts with "Use when", describes problem, then what it does
+description: Use when tests have race conditions, timing dependencies, or pass/fail inconsistently - replaces arbitrary timeouts with condition polling for reliable async tests
+
+# ✅ GOOD: Technology-specific skill with explicit trigger
+description: Use when using React Router and handling authentication redirects - provides patterns for protected routes and auth state management
+```
+
+### 2. Keyword Coverage
+
+Use words Claude would search for:
+
+- Error messages: "Hook timed out", "ENOTEMPTY", "race condition"
+- Symptoms: "flaky", "hanging", "zombie", "pollution"
+- Synonyms: "timeout/hang/freeze", "cleanup/teardown/afterEach"
+- Tools: Actual commands, library names, file types
+
+### 3. Descriptive Naming
+
+**Use active voice, verb-first:**
+
+- ✅ `creating-skills` not `skill-creation`
+- ✅ `writing-rules` not `rule-writing`
+
+### 4. Token Efficiency (Critical)
+
+**Problem:** getting-started and frequently-referenced skills load into EVERY conversation. Every token counts.
+
+**Target word counts:**
+
+- getting-started workflows: <150 words each
+- Frequently-loaded skills: <200 words total
+- Other skills: <500 words (still be concise)
+
+**Techniques:**
+
+**Move details to tool help:**
+
+```bash
+# ❌ BAD: Document all flags in SKILL.md
+search-conversations supports --text, --both, --after DATE, --before DATE, --limit N
+
+# ✅ GOOD: Reference --help
+search-conversations supports multiple modes and filters. Run --help for details.
+```
+
+**Use cross-references:**
+
+```markdown
+# ❌ BAD: Repeat workflow details
+
+When implementing feature, follow these 20 steps...
+[20 lines of repeated instructions from another skill]
+
+# ✅ GOOD: Reference other skill
+
+For implementation workflow, REQUIRED: Use [other-skill-name] for complete process.
+```
+
+**Compress examples:**
+
+```markdown
+# ❌ BAD: Verbose example (42 words)
+
+your human partner: "How did we handle authentication errors in React Router before?"
+You: I'll search for React Router authentication patterns in our codebase and documentation to find previous implementations.
+[Detailed explanation of search process...]
+
+# ✅ GOOD: Minimal example (15 words)
+
+Partner: "How did we handle auth errors in React Router?"
+You: [Search codebase → provide solution]
+```
+
+**Eliminate redundancy:**
+
+- Don't repeat what's in cross-referenced skills
+- Don't explain what's obvious from command
+- Don't include multiple examples of same pattern
+
+**Verification:**
+
+```bash
+wc -w .claude/rules/skill-name.mdc
+# getting-started workflows: aim for <150 each
+# Other frequently-loaded: aim for <200 total
+```
+
+**Name by what you DO or core insight:**
+
+- ✅ `condition-based-waiting` > `async-test-helpers`
+- ✅ `using-skills` not `skill-usage`
+- ✅ `flatten-with-flags` > `data-structure-refactoring`
+- ✅ `root-cause-tracing` > `debugging-techniques`
+
+**Gerunds (-ing) work well for processes:**
+
+- `creating-skills`, `testing-skills`, `debugging-with-logs`
+- Active, describes the action you're taking
+
+### 4. Cross-Referencing Other Skills
+
+**When writing documentation that references other skills:**
+
+Use skill name only, with explicit requirement markers:
+
+- ✅ Good: `**REQUIRED SUB-SKILL:** Use superpowers test-driven-development`
+- ✅ Good: `**REQUIRED BACKGROUND:** You MUST understand superpowers systematic-debugging`
+- ❌ Bad: `See .claude/rules/test-driven-development.mdc` (unclear if required)
+- ❌ Bad: `@.claude/rules/test-driven-development.mdc` (force-loads, burns context)
+
+**Why no @ links:** `@` syntax force-loads files immediately, consuming 200k+ context before you need them.
+
+## Flowchart Usage
+
+```dot
+digraph when_flowchart {
+    "Need to show information?" [shape=diamond];
+    "Decision where I might go wrong?" [shape=diamond];
+    "Use markdown" [shape=box];
+    "Small inline flowchart" [shape=box];
+
+    "Need to show information?" -> "Decision where I might go wrong?" [label="yes"];
+    "Decision where I might go wrong?" -> "Small inline flowchart" [label="yes"];
+    "Decision where I might go wrong?" -> "Use markdown" [label="no"];
+}
+```
+
+**Use flowcharts ONLY for:**
+
+- Non-obvious decision points
+- Process loops where you might stop too early
+- "When to use A vs B" decisions
+
+**Never use flowcharts for:**
+
+- Reference material → Tables, lists
+- Code examples → Markdown blocks
+- Linear instructions → Numbered lists
+- Labels without semantic meaning (step1, helper2)
+
+See @graphviz-conventions.dot for graphviz style rules.
+
+## Code Examples
+
+**One excellent example beats many mediocre ones**
+
+Choose most relevant language:
+
+- Testing techniques → TypeScript/JavaScript
+- System debugging → Shell/Python
+- Data processing → Python
+
+**Good example:**
+
+- Complete and runnable
+- Well-commented explaining WHY
+- From real scenario
+- Shows pattern clearly
+- Ready to adapt (not generic template)
+
+**Don't:**
+
+- Implement in 5+ languages
+- Create fill-in-the-blank templates
+- Write contrived examples
+
+You're good at porting - one great example is enough.
+
+## File Organization
+
+### Self-Contained Skill (Default)
+
+```
+.claude/rules/
+  defense-in-depth.mdc    # Everything inline
+```
+
+**When:** All content fits, no heavy reference needed
+
+### Skill with Reusable Tool
+
+```
+.claude/rules/
+  condition-based-waiting/
+    condition-based-waiting.mdc  # Overview + patterns
+    example.ts                   # Working helpers to adapt
+```
+
+**When:** Tool is reusable code, not just narrative
+
+### Skill with Heavy Reference
+
+```
+.claude/rules/
+  pptx/
+    pptx.mdc       # Overview + workflows
+    pptxgenjs.md   # 600 lines API reference
+    ooxml.md       # 500 lines XML structure
+    scripts/       # Executable tools
+```
+
+**When:** Reference material too large for inline
+
+**Note:** Ruler copies all files from skill folder to `.claude/skills/` when `.mdc` basename matches folder name
+
+## The Iron Law (Same as TDD)
+
+```
+NO SKILL WITHOUT IDENTIFYING FAILURE PATTERNS FIRST
+```
+
+This applies to NEW skills AND EDITS to existing skills.
+
+Write skill before identifying what it prevents? Delete it. Start over.
+Edit skill without identifying new failure patterns? Same violation.
+
+**No exceptions:**
+
+- Not for "simple additions"
+- Not for "just adding a section"
+- Not for "documentation updates"
+- Don't keep unverified changes as "reference"
+- Don't "adapt" while applying
+- Delete means delete
+
+**REQUIRED BACKGROUND:** The test-driven-development skill explains why this matters. Same principles apply to documentation.
+
+## Verifying All Skill Types
+
+Different skill types need different verification approaches:
+
+### Discipline-Enforcing Skills (rules/requirements)
+
+**Verify with:**
+
+- Anticipate academic questions: Does skill explain the rules clearly?
+- Anticipate pressure scenarios: Does skill address rationalizations under stress?
+- Identify multiple pressures: time + sunk cost + exhaustion
+- Add explicit counters for each rationalization
+
+**Success criteria:** Skill prevents violations under anticipated pressures
+
+### Technique Skills (how-to guides)
+
+**Examples:** condition-based-waiting, root-cause-tracing, defensive-programming
+
+**Verify with:**
+
+- Application to real scenarios: Does skill guide correctly?
+- Variation scenarios: Does skill cover edge cases?
+- Gap analysis: Are common use cases covered?
+
+**Success criteria:** Skill enables successful technique application
+
+### Pattern Skills (mental models)
+
+**Examples:** reducing-complexity, information-hiding concepts
+
+**Verify with:**
+
+- Recognition guidance: Does skill explain when pattern applies?
+- Application examples: Does skill show how to use the mental model?
+- Counter-examples: Does skill clarify when NOT to apply?
+
+**Success criteria:** Skill enables correct pattern recognition and application
+
+### Reference Skills (documentation/APIs)
+
+**Examples:** API documentation, command references, library guides
+
+**Verify with:**
+
+- Information retrieval: Is right information findable?
+- Application examples: Are use cases clear and correct?
+- Gap analysis: Are common scenarios covered?
+
+**Success criteria:** Skill enables finding and correctly applying information
+
+## Common Rationalizations for Skipping Verification
+
+| Excuse                          | Reality                                                                 |
+| ------------------------------- | ----------------------------------------------------------------------- |
+| "Skill is obviously clear"      | Clear to you ≠ clear to other agents. Verify it.                        |
+| "It's just a reference"         | References can have gaps, unclear sections. Verify information access.  |
+| "Verification is overkill"      | Unverified skills have issues. Always. 15 min verification saves hours. |
+| "I'll verify if problems arise" | Problems = agents can't use skill. Verify BEFORE deploying.             |
+| "Too tedious to verify"         | Verification is less tedious than debugging bad skill in production.    |
+| "I'm confident it's good"       | Overconfidence guarantees issues. Verify anyway.                        |
+| "Academic review is enough"     | Reading ≠ using. Verify through application.                            |
+| "No time to verify"             | Deploying unverified skill wastes more time fixing it later.            |
+
+**All of these mean: Verify before deploying. No exceptions.**
+
+## Bulletproofing Skills Against Rationalization
+
+Skills that enforce discipline (like TDD) need to resist rationalization. Agents are smart and will find loopholes when under pressure.
+
+**Psychology note:** Understanding WHY persuasion techniques work helps you apply them systematically. See persuasion-principles.md for research foundation (Cialdini, 2021; Meincke et al., 2025) on authority, commitment, scarcity, social proof, and unity principles.
+
+### Meta-Verification (When Application Reveals Gaps)
+
+**After applying skill and finding it unclear, ask yourself:**
+
+```markdown
+I read the skill and still chose wrong approach.
+
+How could that skill have been written differently to make
+it crystal clear what the correct choice was?
+```
+
+**Three possible answers:**
+
+1. **"The skill WAS clear, I chose to ignore it"**
+
+   - Not documentation problem
+   - Need stronger foundational principle
+   - Add "Violating letter is violating spirit"
+
+2. **"The skill should have said X"**
+
+   - Documentation problem
+   - Add that guidance verbatim
+
+3. **"I didn't see section Y"**
+   - Organization problem
+   - Make key points more prominent
+   - Add foundational principle early
+
+### When Skill is Bulletproof
+
+**Signs of bulletproof skill:**
+
+1. **Correct choice is obvious** even under pressure
+2. **Skill anticipates** the specific rationalizations
+3. **Red flags section** catches you before violation
+4. **Meta-verification reveals** "skill was clear, I should follow it"
+
+**Not bulletproof if:**
+
+- You find new rationalizations during application
+- Skill leaves room for "hybrid approaches"
+- Multiple valid interpretations exist
+- Doesn't address "spirit vs letter" argument
+
+### Example: Bulletproofing Process
+
+**Initial Version (Failed):**
+
+```markdown
+Scenario: 200 lines done, forgot rule, exhausted, dinner plans
+Applied skill: Still chose wrong approach
+Rationalization: "Already achieve same goals differently"
+```
+
+**Iteration 1 - Add Counter:**
+
+```markdown
+Added section: "Why This Specific Approach Matters"
+Re-applied: STILL chose wrong approach
+New rationalization: "Spirit not letter"
+```
+
+**Iteration 2 - Add Foundational Principle:**
+
+```markdown
+Added: "Violating letter is violating spirit"
+Re-applied: Chose correct approach
+Cited: New principle directly
+Meta-verification: "Skill was clear, I should follow it"
+```
+
+**Bulletproof achieved.**
+
+### Close Every Loophole Explicitly
+
+Don't just state the rule - forbid specific workarounds:
+
+<Bad>
+```markdown
+Write code before test? Delete it.
+```
+</Bad>
+
+<Good>
+```markdown
+Write code before test? Delete it. Start over.
+
+**No exceptions:**
+
+- Don't keep it as "reference"
+- Don't "adapt" it while writing tests
+- Don't look at it
+- Delete means delete
+
+````
+</Good>
+
+### Address "Spirit vs Letter" Arguments
+
+Add foundational principle early:
+
+```markdown
+**Violating the letter of the rules is violating the spirit of the rules.**
+````
+
+This cuts off entire class of "I'm following the spirit" rationalizations.
+
+### Build Rationalization Table
+
+Capture rationalizations from baseline testing (see Testing section below). Every excuse agents make goes in the table:
+
+```markdown
+| Excuse                           | Reality                                                                 |
+| -------------------------------- | ----------------------------------------------------------------------- |
+| "Too simple to test"             | Simple code breaks. Test takes 30 seconds.                              |
+| "I'll test after"                | Tests passing immediately prove nothing.                                |
+| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
+```
+
+### Create Red Flags List
+
+Make it easy for agents to self-check when rationalizing:
+
+```markdown
+## Red Flags - STOP and Start Over
+
+- Code before test
+- "I already manually tested it"
+- "Tests after achieve the same purpose"
+- "It's about spirit not ritual"
+- "This is different because..."
+
+**All of these mean: Delete code. Start over with TDD.**
+```
+
+### Update CSO for Violation Symptoms
+
+Add to description: symptoms of when you're ABOUT to violate the rule:
+
+```yaml
+description: use when implementing any feature or bugfix, before writing implementation code
+```
+
+## RED-GREEN-REFACTOR for Skills
+
+Follow the TDD cycle:
+
+### RED: Identify Failure Patterns (Baseline)
+
+Before writing skill, identify what goes wrong without it:
+
+**Process:**
+
+- [ ] **Identify common mistakes** - What goes wrong without this skill?
+- [ ] **Document rationalizations** - What excuses lead to these mistakes (verbatim)?
+- [ ] **Identify pressures** - Which scenarios trigger violations?
+- [ ] **Note patterns** - Which excuses appear repeatedly?
+
+**Anticipate pressure scenarios** - Think through realistic situations with multiple pressures:
+
+```markdown
+Example pressure scenario (3+ combined pressures):
+
+You spent 3 hours, 200 lines, manually tested. It works.
+It's 6pm, dinner at 6:30pm. Code review tomorrow 9am.
+Just realized you forgot [rule from skill].
+
+Options:
+A) Delete 200 lines, start fresh tomorrow following rule
+B) Commit now, address rule tomorrow
+C) Apply rule now (30 min delay)
+
+Without skill: Agent likely chooses B or C
+Rationalizations: "Already working", "Tests after achieve same goals", "Deleting is wasteful"
+```
+
+**Pressure Types to Consider:**
+
+| Pressure       | Example                                    |
+| -------------- | ------------------------------------------ |
+| **Time**       | Emergency, deadline, deploy window closing |
+| **Sunk cost**  | Hours of work, "waste" to delete           |
+| **Authority**  | Senior says skip it, manager overrides     |
+| **Economic**   | Job, promotion, company survival at stake  |
+| **Exhaustion** | End of day, already tired, want to go home |
+| **Social**     | Looking dogmatic, seeming inflexible       |
+| **Pragmatic**  | "Being pragmatic vs dogmatic"              |
+
+**Best scenarios combine 3+ pressures.**
+
+### GREEN: Write Minimal Skill
+
+Write skill that addresses those specific rationalizations. Don't add extra content for hypothetical cases.
+
+**Verify through application**: Apply skill to real scenarios in this session. Does it make the correct choice obvious? Does it address the rationalizations you identified?
+
+### REFACTOR: Close Loopholes
+
+Found new rationalizations during application? Add explicit counter. Re-verify until bulletproof.
+
+**For each new rationalization, add:**
+
+1. **Explicit Negation in Rules**
+
+```markdown
+Rule before test? Delete it. Start over.
+
+**No exceptions:**
+
+- Don't keep it as "reference"
+- Don't "adapt" it while writing
+- Don't look at it
+- Delete means delete
+```
+
+2. **Entry in Rationalization Table**
+
+```markdown
+| Excuse              | Reality                                                          |
+| ------------------- | ---------------------------------------------------------------- |
+| "Keep as reference" | You'll adapt it. That's violating the rule. Delete means delete. |
+```
+
+3. **Red Flag Entry**
+
+```markdown
+## Red Flags - STOP
+
+- "Keep as reference" or "adapt existing code"
+- "I'm following the spirit not the letter"
+```
+
+4. **Update description** - Add symptoms of ABOUT to violate:
+
+```yaml
+description: Use when [about to violate], when tempted to [rationalization]...
+```
+
+## Anti-Patterns
+
+### ❌ Narrative Example
+
+"In session 2025-10-03, we found empty projectDir caused..."
+**Why bad:** Too specific, not reusable
+
+### ❌ Multi-Language Dilution
+
+example-js.js, example-py.py, example-go.go
+**Why bad:** Mediocre quality, maintenance burden
+
+### ❌ Code in Flowcharts
+
+```dot
+step1 [label="import fs"];
+step2 [label="read file"];
+```
+
+**Why bad:** Can't copy-paste, hard to read
+
+### ❌ Generic Labels
+
+helper1, helper2, step3, pattern4
+**Why bad:** Labels should have semantic meaning
+
+## STOP: Before Moving to Next Skill
+
+**After writing ANY skill, you MUST STOP and complete the deployment process.**
+
+**Do NOT:**
+
+- Create multiple skills in batch without testing each
+- Move to next skill before current one is verified
+- Skip testing because "batching is more efficient"
+
+**The deployment checklist below is MANDATORY for EACH skill.**
+
+Deploying untested skills = deploying untested code. It's a violation of quality standards.
+
+## Skill Creation Checklist (TDD Adapted)
+
+**IMPORTANT: Use TodoWrite to create todos for EACH checklist item below.**
+
+**RED Phase - Identify Failure Patterns:**
+
+- [ ] Document common mistakes from experience (what goes wrong without this skill?)
+- [ ] Identify rationalizations that lead to mistakes (verbatim phrases)
+- [ ] Identify pressures that trigger violations (time, sunk cost, "already working", etc.)
+
+**GREEN Phase - Write Minimal Skill:**
+
+- [ ] Name uses only letters, numbers, hyphens (no parentheses/special chars)
+- [ ] YAML frontmatter with only name and description (max 1024 chars)
+- [ ] Description starts with "Use when..." and includes specific triggers/symptoms
+- [ ] Description written in third person
+- [ ] Keywords throughout for search (errors, symptoms, tools)
+- [ ] Clear overview with core principle
+- [ ] Address specific baseline failures identified in RED
+- [ ] Code inline OR link to separate file
+- [ ] One excellent example (not multi-language)
+- [ ] **MANDATORY:** Run `npx skiller@latest apply` to generate .claude/skills/
+- [ ] Verify skill clarity through application to real scenarios
+
+**REFACTOR Phase - Close Loopholes:**
+
+- [ ] Identify NEW rationalizations during application
+- [ ] Add explicit counters (if discipline skill)
+- [ ] Build rationalization table from all identified patterns
+- [ ] Create red flags list
+- [ ] **MANDATORY:** Run `npx skiller@latest apply` after ANY changes
+- [ ] Re-verify clarity and completeness
+
+**Quality Checks:**
+
+- [ ] Small flowchart only if decision non-obvious
+- [ ] Quick reference table
+- [ ] Common mistakes section
+- [ ] No narrative storytelling
+- [ ] Supporting files only for tools or heavy reference
+
+**Deployment:**
+
+- [ ] Skill is ready for use
+
+## Discovery Workflow
+
+How future Claude finds your skill:
+
+1. **Encounters problem** ("tests are flaky")
+2. **Finds SKILL** (description matches)
+3. **Scans overview** (is this relevant?)
+4. **Reads patterns** (quick reference table)
+5. **Loads example** (only when implementing)
+
+**Optimize for this flow** - put searchable terms early and often.
+
+## The Bottom Line
+
+**Creating skills IS TDD for process documentation.**
+
+Same Iron Law: No skill without identifying failure patterns first.
+Same cycle: RED (identify patterns) → GREEN (write skill) → REFACTOR (close loopholes).
+Same benefits: Better quality, fewer surprises, bulletproof results.
+
+If you follow TDD for code, follow it for skills. It's the same discipline applied to documentation.