Initial commit
This commit is contained in:
641
skills/writing-skills/SKILL.md
Normal file
641
skills/writing-skills/SKILL.md
Normal file
@@ -0,0 +1,641 @@
|
||||
---
|
||||
name: writing-skills
|
||||
description: |
|
||||
TDD for process documentation - write test cases (pressure scenarios), watch
|
||||
baseline fail, write skill, iterate until bulletproof against rationalization.
|
||||
|
||||
trigger: |
|
||||
- Creating a new skill
|
||||
- Editing an existing skill
|
||||
- Skill needs to resist rationalization under pressure
|
||||
|
||||
skip_when: |
|
||||
- Writing pure reference skill (API docs) → no rules to test
|
||||
- Skill has no compliance costs → no rationalization risk
|
||||
|
||||
related:
|
||||
complementary: [testing-skills-with-subagents]
|
||||
---
|
||||
|
||||
# Writing Skills
|
||||
|
||||
## Overview
|
||||
|
||||
**Writing skills IS Test-Driven Development applied to process documentation.**
|
||||
|
||||
**Personal skills live in agent-specific directories (e.g., `~/.claude/skills` for Claude Code, `~/.codex/skills` for Codex, or custom agent directories)**
|
||||
|
||||
You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).
|
||||
|
||||
**Core principle:** If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.
|
||||
|
||||
**REQUIRED BACKGROUND:** You MUST understand ring-default:test-driven-development before using this skill. That skill defines the fundamental RED-GREEN-REFACTOR cycle. This skill adapts TDD to documentation.
|
||||
|
||||
**Official guidance:** For Anthropic's official skill authoring best practices, see anthropic-best-practices.md. This document provides additional patterns and guidelines that complement the TDD-focused approach in this skill.
|
||||
|
||||
## What is a Skill?
|
||||
|
||||
A **skill** is a reference guide for proven techniques, patterns, or tools. Skills help future agent instances find and apply effective approaches.
|
||||
|
||||
**Skills are:** Reusable techniques, patterns, tools, reference guides
|
||||
|
||||
**Skills are NOT:** Narratives about how you solved a problem once
|
||||
|
||||
## TDD Mapping for Skills
|
||||
|
||||
| TDD Concept | Skill Creation |
|
||||
|-------------|----------------|
|
||||
| **Test case** | Pressure scenario with subagent |
|
||||
| **Production code** | Skill document (SKILL.md) |
|
||||
| **Test fails (RED)** | Agent violates rule without skill (baseline) |
|
||||
| **Test passes (GREEN)** | Agent complies with skill present |
|
||||
| **Refactor** | Close loopholes while maintaining compliance |
|
||||
| **Write test first** | Run baseline scenario BEFORE writing skill |
|
||||
| **Watch it fail** | Document exact rationalizations agent uses |
|
||||
| **Minimal code** | Write skill addressing those specific violations |
|
||||
| **Watch it pass** | Verify agent now complies |
|
||||
| **Refactor cycle** | Find new rationalizations → plug → re-verify |
|
||||
|
||||
The entire skill creation process follows RED-GREEN-REFACTOR.
|
||||
|
||||
## When to Create a Skill
|
||||
|
||||
**Create when:**
|
||||
- Technique wasn't intuitively obvious to you
|
||||
- You'd reference this again across projects
|
||||
- Pattern applies broadly (not project-specific)
|
||||
- Others would benefit
|
||||
|
||||
**Don't create for:**
|
||||
- One-off solutions
|
||||
- Standard practices well-documented elsewhere
|
||||
- Project-specific conventions (put in CLAUDE.md)
|
||||
|
||||
## Skill Types
|
||||
|
||||
### Technique
|
||||
Concrete method with steps to follow (condition-based-waiting, root-cause-tracing)
|
||||
|
||||
### Pattern
|
||||
Way of thinking about problems (flatten-with-flags, test-invariants)
|
||||
|
||||
### Reference
|
||||
API docs, syntax guides, tool documentation (office docs)
|
||||
|
||||
## Directory Structure
|
||||
|
||||
|
||||
```
|
||||
skills/
|
||||
skill-name/
|
||||
SKILL.md # Main reference (required)
|
||||
supporting-file.* # Only if needed
|
||||
```
|
||||
|
||||
**Flat namespace** - all skills in one searchable namespace
|
||||
|
||||
**Separate files for:**
|
||||
1. **Heavy reference** (100+ lines) - API docs, comprehensive syntax
|
||||
2. **Reusable tools** - Scripts, utilities, templates
|
||||
|
||||
**Keep inline:**
|
||||
- Principles and concepts
|
||||
- Code patterns (< 50 lines)
|
||||
- Everything else
|
||||
|
||||
## SKILL.md Structure
|
||||
|
||||
**Frontmatter (YAML):**
|
||||
- Only two fields supported: `name` and `description`
|
||||
- Max 1024 characters total
|
||||
- `name`: Use letters, numbers, and hyphens only (no parentheses, special chars)
|
||||
- `description`: Third-person, includes BOTH what it does AND when to use it
|
||||
- Start with "Use when..." to focus on triggering conditions
|
||||
- Include specific symptoms, situations, and contexts
|
||||
- Keep under 500 characters if possible
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: Skill-Name-With-Hyphens
|
||||
description: Use when [specific triggering conditions and symptoms] - [what the skill does and how it helps, written in third person]
|
||||
---
|
||||
|
||||
# Skill Name
|
||||
|
||||
## Overview
|
||||
What is this? Core principle in 1-2 sentences.
|
||||
|
||||
## When to Use
|
||||
[Small inline flowchart IF decision non-obvious]
|
||||
|
||||
Bullet list with SYMPTOMS and use cases
|
||||
When NOT to use
|
||||
|
||||
## Core Pattern (for techniques/patterns)
|
||||
Before/after code comparison
|
||||
|
||||
## Quick Reference
|
||||
Table or bullets for scanning common operations
|
||||
|
||||
## Implementation
|
||||
Inline code for simple patterns
|
||||
Link to file for heavy reference or reusable tools
|
||||
|
||||
## Common Mistakes
|
||||
What goes wrong + fixes
|
||||
|
||||
## Real-World Impact (optional)
|
||||
Concrete results
|
||||
```
|
||||
|
||||
|
||||
## Agent Search Optimization (ASO)
|
||||
|
||||
**Critical for discovery:** Future agents need to FIND your skill
|
||||
|
||||
### 1. Rich Description Field
|
||||
|
||||
**Purpose:** Agents read description to decide which skills to load for a given task. Make it answer: "Should I read this skill right now?"
|
||||
|
||||
**Format:** Start with "Use when..." to focus on triggering conditions, then explain what it does
|
||||
|
||||
**Content:**
|
||||
- Use concrete triggers, symptoms, and situations that signal this skill applies
|
||||
- Describe the *problem* (race conditions, inconsistent behavior) not *language-specific symptoms* (setTimeout, sleep)
|
||||
- Keep triggers technology-agnostic unless the skill itself is technology-specific
|
||||
- If skill is technology-specific, make that explicit in the trigger
|
||||
- Write in third person (injected into system prompt)
|
||||
|
||||
```yaml
|
||||
# ❌ BAD: Too abstract, vague, doesn't include when to use
|
||||
description: For async testing
|
||||
|
||||
# ❌ BAD: First person
|
||||
description: I can help you with async tests when they're flaky
|
||||
|
||||
# ❌ BAD: Mentions technology but skill isn't specific to it
|
||||
description: Use when tests use setTimeout/sleep and are flaky
|
||||
|
||||
# ✅ GOOD: Starts with "Use when", describes problem, then what it does
|
||||
description: Use when tests have race conditions, timing dependencies, or pass/fail inconsistently - replaces arbitrary timeouts with condition polling for reliable async tests
|
||||
|
||||
# ✅ GOOD: Technology-specific skill with explicit trigger
|
||||
description: Use when using React Router and handling authentication redirects - provides patterns for protected routes and auth state management
|
||||
```
|
||||
|
||||
### 2. Keyword Coverage
|
||||
|
||||
Use words agents would search for:
|
||||
- Error messages: "Hook timed out", "ENOTEMPTY", "race condition"
|
||||
- Symptoms: "flaky", "hanging", "zombie", "pollution"
|
||||
- Synonyms: "timeout/hang/freeze", "cleanup/teardown/afterEach"
|
||||
- Tools: Actual commands, library names, file types
|
||||
|
||||
### 3. Descriptive Naming
|
||||
|
||||
**Use active voice, verb-first:**
|
||||
- ✅ `creating-skills` not `skill-creation`
|
||||
- ✅ `testing-skills-with-subagents` not `subagent-skill-testing`
|
||||
|
||||
### 4. Token Efficiency (Critical)
|
||||
|
||||
**Problem:** getting-started and frequently-referenced skills load into EVERY conversation. Every token counts.
|
||||
|
||||
**Target word counts by skill type:**
|
||||
- **Bootstrap/Getting-started**: <150 words each (loads in every session)
|
||||
- **Simple technique skills**: <500 words (procedures, patterns, single concept)
|
||||
- **Discipline-enforcing skills**: <2,000 words (TDD, verification, systematic debugging - need rationalization tables)
|
||||
- **Process/workflow skills**: <4,000 words (multi-phase workflows with comprehensive templates)
|
||||
|
||||
**Rationale:** Complex skills need extensive rationalization prevention and complete templates. Don't artificially compress at the cost of effectiveness.
|
||||
|
||||
**Techniques:**
|
||||
|
||||
**Move details to tool help:**
|
||||
```bash
|
||||
# ❌ BAD: Document all flags in SKILL.md
|
||||
search-conversations supports --text, --both, --after DATE, --before DATE, --limit N
|
||||
|
||||
# ✅ GOOD: Reference --help
|
||||
search-conversations supports multiple modes and filters. Run --help for details.
|
||||
```
|
||||
|
||||
**Use cross-references:**
|
||||
```markdown
|
||||
# ❌ BAD: Repeat workflow details
|
||||
When searching, dispatch subagent with template...
|
||||
[20 lines of repeated instructions]
|
||||
|
||||
# ✅ GOOD: Reference other skill
|
||||
Always use subagents (50-100x context savings). REQUIRED: Use [other-skill-name] for workflow.
|
||||
```
|
||||
|
||||
**Compress examples:**
|
||||
```markdown
|
||||
# ❌ BAD: Verbose example (42 words)
|
||||
your human partner: "How did we handle authentication errors in React Router before?"
|
||||
You: I'll search past conversations for React Router authentication patterns.
|
||||
[Dispatch subagent with search query: "React Router authentication error handling 401"]
|
||||
|
||||
# ✅ GOOD: Minimal example (20 words)
|
||||
Partner: "How did we handle auth errors in React Router?"
|
||||
You: Searching...
|
||||
[Dispatch subagent → synthesis]
|
||||
```
|
||||
|
||||
**Eliminate redundancy:**
|
||||
- Don't repeat what's in cross-referenced skills
|
||||
- Don't explain what's obvious from command
|
||||
- Don't include multiple examples of same pattern
|
||||
|
||||
**Verification:**
|
||||
```bash
|
||||
wc -w skills/path/SKILL.md
|
||||
# Bootstrap skills: <150 words
|
||||
# Technique skills: <500 words
|
||||
# Discipline skills: <2,000 words
|
||||
# Process skills: <4,000 words
|
||||
```
|
||||
|
||||
**Name by what you DO or core insight:**
|
||||
- ✅ `condition-based-waiting` > `async-test-helpers`
|
||||
- ✅ `using-skills` not `skill-usage`
|
||||
- ✅ `flatten-with-flags` > `data-structure-refactoring`
|
||||
- ✅ `root-cause-tracing` > `debugging-techniques`
|
||||
|
||||
**Gerunds (-ing) work well for processes:**
|
||||
- `creating-skills`, `testing-skills`, `debugging-with-logs`
|
||||
- Active, describes the action you're taking
|
||||
|
||||
### 4. Cross-Referencing Other Skills
|
||||
|
||||
**When writing documentation that references other skills:**
|
||||
|
||||
Use skill name only, with explicit requirement markers:
|
||||
- ✅ Good: `**REQUIRED SUB-SKILL:** Use ring-default:test-driven-development`
|
||||
- ✅ Good: `**REQUIRED BACKGROUND:** You MUST understand ring-default:systematic-debugging`
|
||||
- ❌ Bad: `See skills/testing/test-driven-development` (unclear if required)
|
||||
- ❌ Bad: `@skills/testing/test-driven-development/SKILL.md` (force-loads, burns context)
|
||||
|
||||
**Why no @ links:** `@` syntax force-loads files immediately, consuming 200k+ context before you need them.
|
||||
|
||||
## Flowchart Usage
|
||||
|
||||
```dot
|
||||
digraph when_flowchart {
|
||||
"Need to show information?" [shape=diamond];
|
||||
"Decision where I might go wrong?" [shape=diamond];
|
||||
"Use markdown" [shape=box];
|
||||
"Small inline flowchart" [shape=box];
|
||||
|
||||
"Need to show information?" -> "Decision where I might go wrong?" [label="yes"];
|
||||
"Decision where I might go wrong?" -> "Small inline flowchart" [label="yes"];
|
||||
"Decision where I might go wrong?" -> "Use markdown" [label="no"];
|
||||
}
|
||||
```
|
||||
|
||||
**Use flowcharts ONLY for:**
|
||||
- Non-obvious decision points
|
||||
- Process loops where you might stop too early
|
||||
- "When to use A vs B" decisions
|
||||
|
||||
**Never use flowcharts for:**
|
||||
- Reference material → Tables, lists
|
||||
- Code examples → Markdown blocks
|
||||
- Linear instructions → Numbered lists
|
||||
- Labels without semantic meaning (step1, helper2)
|
||||
|
||||
See @graphviz-conventions.dot for graphviz style rules.
|
||||
|
||||
## Code Examples
|
||||
|
||||
**One excellent example beats many mediocre ones**
|
||||
|
||||
Choose most relevant language:
|
||||
- Testing techniques → TypeScript/JavaScript
|
||||
- System debugging → Shell/Python
|
||||
- Data processing → Python
|
||||
|
||||
**Good example:**
|
||||
- Complete and runnable
|
||||
- Well-commented explaining WHY
|
||||
- From real scenario
|
||||
- Shows pattern clearly
|
||||
- Ready to adapt (not generic template)
|
||||
|
||||
**Don't:**
|
||||
- Implement in 5+ languages
|
||||
- Create fill-in-the-blank templates
|
||||
- Write contrived examples
|
||||
|
||||
You're good at porting - one great example is enough.
|
||||
|
||||
## File Organization
|
||||
|
||||
### Self-Contained Skill
|
||||
```
|
||||
defense-in-depth/
|
||||
SKILL.md # Everything inline
|
||||
```
|
||||
When: All content fits, no heavy reference needed
|
||||
|
||||
### Skill with Reusable Tool
|
||||
```
|
||||
condition-based-waiting/
|
||||
SKILL.md # Overview + patterns
|
||||
example.ts # Working helpers to adapt
|
||||
```
|
||||
When: Tool is reusable code, not just narrative
|
||||
|
||||
### Skill with Heavy Reference
|
||||
```
|
||||
pptx/
|
||||
SKILL.md # Overview + workflows
|
||||
pptxgenjs.md # 600 lines API reference
|
||||
ooxml.md # 500 lines XML structure
|
||||
scripts/ # Executable tools
|
||||
```
|
||||
When: Reference material too large for inline
|
||||
|
||||
## The Iron Law (Same as TDD)
|
||||
|
||||
```
|
||||
NO SKILL WITHOUT A FAILING TEST FIRST
|
||||
```
|
||||
|
||||
This applies to NEW skills AND EDITS to existing skills.
|
||||
|
||||
Write skill before testing? Delete it. Start over.
|
||||
Edit skill without testing? Same violation.
|
||||
|
||||
**No exceptions:**
|
||||
- Not for "simple additions"
|
||||
- Not for "just adding a section"
|
||||
- Not for "documentation updates"
|
||||
- Don't keep untested changes as "reference"
|
||||
- Don't "adapt" while running tests
|
||||
- Delete means delete
|
||||
|
||||
**REQUIRED BACKGROUND:** The ring-default:test-driven-development skill explains why this matters. Same principles apply to documentation.
|
||||
|
||||
## Testing All Skill Types
|
||||
|
||||
Different skill types need different test approaches:
|
||||
|
||||
### Discipline-Enforcing Skills (rules/requirements)
|
||||
|
||||
**Examples:** TDD, verification-before-completion, designing-before-coding
|
||||
|
||||
**Test with:**
|
||||
- Academic questions: Do they understand the rules?
|
||||
- Pressure scenarios: Do they comply under stress?
|
||||
- Multiple pressures combined: time + sunk cost + exhaustion
|
||||
- Identify rationalizations and add explicit counters
|
||||
|
||||
**Success criteria:** Agent follows rule under maximum pressure
|
||||
|
||||
### Technique Skills (how-to guides)
|
||||
|
||||
**Examples:** condition-based-waiting, root-cause-tracing, defensive-programming
|
||||
|
||||
**Test with:**
|
||||
- Application scenarios: Can they apply the technique correctly?
|
||||
- Variation scenarios: Do they handle edge cases?
|
||||
- Missing information tests: Do instructions have gaps?
|
||||
|
||||
**Success criteria:** Agent successfully applies technique to new scenario
|
||||
|
||||
### Pattern Skills (mental models)
|
||||
|
||||
**Examples:** reducing-complexity, information-hiding concepts
|
||||
|
||||
**Test with:**
|
||||
- Recognition scenarios: Do they recognize when pattern applies?
|
||||
- Application scenarios: Can they use the mental model?
|
||||
- Counter-examples: Do they know when NOT to apply?
|
||||
|
||||
**Success criteria:** Agent correctly identifies when/how to apply pattern
|
||||
|
||||
### Reference Skills (documentation/APIs)
|
||||
|
||||
**Examples:** API documentation, command references, library guides
|
||||
|
||||
**Test with:**
|
||||
- Retrieval scenarios: Can they find the right information?
|
||||
- Application scenarios: Can they use what they found correctly?
|
||||
- Gap testing: Are common use cases covered?
|
||||
|
||||
**Success criteria:** Agent finds and correctly applies reference information
|
||||
|
||||
## Common Rationalizations for Skipping Testing
|
||||
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Skill is obviously clear" | Clear to you ≠ clear to other agents. Test it. |
|
||||
| "It's just a reference" | References can have gaps, unclear sections. Test retrieval. |
|
||||
| "Testing is overkill" | Untested skills have issues. Always. 15 min testing saves hours. |
|
||||
| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. |
|
||||
| "Too tedious to test" | Testing is less tedious than debugging bad skill in production. |
|
||||
| "I'm confident it's good" | Overconfidence guarantees issues. Test anyway. |
|
||||
| "Academic review is enough" | Reading ≠ using. Test application scenarios. |
|
||||
| "No time to test" | Deploying untested skill wastes more time fixing it later. |
|
||||
|
||||
**All of these mean: Test before deploying. No exceptions.**
|
||||
|
||||
## Bulletproofing Skills Against Rationalization
|
||||
|
||||
Skills that enforce discipline (like TDD) need to resist rationalization. Agents are smart and will find loopholes when under pressure.
|
||||
|
||||
**Psychology note:** Understanding WHY persuasion techniques work helps you apply them systematically. See persuasion-principles.md for research foundation (Cialdini, 2021; Meincke et al., 2025) on authority, commitment, scarcity, social proof, and unity principles.
|
||||
|
||||
### Close Every Loophole Explicitly
|
||||
|
||||
Don't just state the rule - forbid specific workarounds:
|
||||
|
||||
<Bad>
|
||||
```markdown
|
||||
Write code before test? Delete it.
|
||||
```
|
||||
</Bad>
|
||||
|
||||
<Good>
|
||||
```markdown
|
||||
Write code before test? Delete it. Start over.
|
||||
|
||||
**No exceptions:**
|
||||
- Don't keep it as "reference"
|
||||
- Don't "adapt" it while writing tests
|
||||
- Don't look at it
|
||||
- Delete means delete
|
||||
```
|
||||
</Good>
|
||||
|
||||
### Address "Spirit vs Letter" Arguments
|
||||
|
||||
Add foundational principle early:
|
||||
|
||||
```markdown
|
||||
**Violating the letter of the rules is violating the spirit of the rules.**
|
||||
```
|
||||
|
||||
This cuts off entire class of "I'm following the spirit" rationalizations.
|
||||
|
||||
### Build Rationalization Table
|
||||
|
||||
Capture rationalizations from baseline testing (see Testing section below). Every excuse agents make goes in the table:
|
||||
|
||||
```markdown
|
||||
| Excuse | Reality |
|
||||
|--------|---------|
|
||||
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
|
||||
| "I'll test after" | Tests passing immediately prove nothing. |
|
||||
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
|
||||
```
|
||||
|
||||
### Create Red Flags List
|
||||
|
||||
Make it easy for agents to self-check when rationalizing:
|
||||
|
||||
```markdown
|
||||
## Red Flags - STOP and Start Over
|
||||
|
||||
- Code before test
|
||||
- "I already manually tested it"
|
||||
- "Tests after achieve the same purpose"
|
||||
- "It's about spirit not ritual"
|
||||
- "This is different because..."
|
||||
|
||||
**All of these mean: Delete code. Start over with TDD.**
|
||||
```
|
||||
|
||||
### Update CSO for Violation Symptoms
|
||||
|
||||
Add to description: symptoms of when you're ABOUT to violate the rule:
|
||||
|
||||
```yaml
|
||||
description: use when implementing any feature or bugfix, before writing implementation code
|
||||
```
|
||||
|
||||
## RED-GREEN-REFACTOR for Skills
|
||||
|
||||
Follow the TDD cycle:
|
||||
|
||||
### RED: Write Failing Test (Baseline)
|
||||
|
||||
Run pressure scenario with subagent WITHOUT the skill. Document exact behavior:
|
||||
- What choices did they make?
|
||||
- What rationalizations did they use (verbatim)?
|
||||
- Which pressures triggered violations?
|
||||
|
||||
This is "watch the test fail" - you must see what agents naturally do before writing the skill.
|
||||
|
||||
### GREEN: Write Minimal Skill
|
||||
|
||||
Write skill that addresses those specific rationalizations. Don't add extra content for hypothetical cases.
|
||||
|
||||
Run same scenarios WITH skill. Agent should now comply.
|
||||
|
||||
### REFACTOR: Close Loopholes
|
||||
|
||||
Agent found new rationalization? Add explicit counter. Re-test until bulletproof.
|
||||
|
||||
**REQUIRED SUB-SKILL:** Use ring-default:testing-skills-with-subagents for the complete testing methodology:
|
||||
- How to write pressure scenarios
|
||||
- Pressure types (time, sunk cost, authority, exhaustion)
|
||||
- Plugging holes systematically
|
||||
- Meta-testing techniques
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### ❌ Narrative Example
|
||||
"In session 2025-10-03, we found empty projectDir caused..."
|
||||
**Why bad:** Too specific, not reusable
|
||||
|
||||
### ❌ Multi-Language Dilution
|
||||
example-js.js, example-py.py, example-go.go
|
||||
**Why bad:** Mediocre quality, maintenance burden
|
||||
|
||||
### ❌ Code in Flowcharts
|
||||
```dot
|
||||
step1 [label="import fs"];
|
||||
step2 [label="read file"];
|
||||
```
|
||||
**Why bad:** Can't copy-paste, hard to read
|
||||
|
||||
### ❌ Generic Labels
|
||||
helper1, helper2, step3, pattern4
|
||||
**Why bad:** Labels should have semantic meaning
|
||||
|
||||
## STOP: Before Moving to Next Skill
|
||||
|
||||
**After writing ANY skill, you MUST STOP and complete the deployment process.**
|
||||
|
||||
**Do NOT:**
|
||||
- Create multiple skills in batch without testing each
|
||||
- Move to next skill before current one is verified
|
||||
- Skip testing because "batching is more efficient"
|
||||
|
||||
**The deployment checklist below is MANDATORY for EACH skill.**
|
||||
|
||||
Deploying untested skills = deploying untested code. It's a violation of quality standards.
|
||||
|
||||
## Skill Creation Checklist (TDD Adapted)
|
||||
|
||||
**IMPORTANT: Use TodoWrite to create todos for EACH checklist item below.**
|
||||
|
||||
**RED Phase - Write Failing Test:**
|
||||
- [ ] Create pressure scenarios (3+ combined pressures for discipline skills)
|
||||
- [ ] Run scenarios WITHOUT skill - document baseline behavior verbatim
|
||||
- [ ] Identify patterns in rationalizations/failures
|
||||
|
||||
**GREEN Phase - Write Minimal Skill:**
|
||||
- [ ] Name uses only letters, numbers, hyphens (no parentheses/special chars)
|
||||
- [ ] YAML frontmatter with only name and description (max 1024 chars)
|
||||
- [ ] Description starts with "Use when..." and includes specific triggers/symptoms
|
||||
- [ ] Description written in third person
|
||||
- [ ] Keywords throughout for search (errors, symptoms, tools)
|
||||
- [ ] Clear overview with core principle
|
||||
- [ ] Address specific baseline failures identified in RED
|
||||
- [ ] Code inline OR link to separate file
|
||||
- [ ] One excellent example (not multi-language)
|
||||
- [ ] Run scenarios WITH skill - verify agents now comply
|
||||
|
||||
**REFACTOR Phase - Close Loopholes:**
|
||||
- [ ] Identify NEW rationalizations from testing
|
||||
- [ ] Add explicit counters (if discipline skill)
|
||||
- [ ] Build rationalization table from all test iterations
|
||||
- [ ] Create red flags list
|
||||
- [ ] Re-test until bulletproof
|
||||
|
||||
**Quality Checks:**
|
||||
- [ ] Small flowchart only if decision non-obvious
|
||||
- [ ] Quick reference table
|
||||
- [ ] Common mistakes section
|
||||
- [ ] No narrative storytelling
|
||||
- [ ] Supporting files only for tools or heavy reference
|
||||
|
||||
**Deployment:**
|
||||
- [ ] Commit skill to git and push to your fork (if configured)
|
||||
- [ ] Consider contributing back via PR (if broadly useful)
|
||||
|
||||
## Discovery Workflow
|
||||
|
||||
How future agents find your skill:
|
||||
|
||||
1. **Encounters problem** ("tests are flaky")
|
||||
3. **Finds SKILL** (description matches)
|
||||
4. **Scans overview** (is this relevant?)
|
||||
5. **Reads patterns** (quick reference table)
|
||||
6. **Loads example** (only when implementing)
|
||||
|
||||
**Optimize for this flow** - put searchable terms early and often.
|
||||
|
||||
## The Bottom Line
|
||||
|
||||
**Creating skills IS TDD for process documentation.**
|
||||
|
||||
Same Iron Law: No skill without failing test first.
|
||||
Same cycle: RED (baseline) → GREEN (write skill) → REFACTOR (close loopholes).
|
||||
Same benefits: Better quality, fewer surprises, bulletproof results.
|
||||
|
||||
If you follow TDD for code, follow it for skills. It's the same discipline applied to documentation.
|
||||
1150
skills/writing-skills/anthropic-best-practices.md
Normal file
1150
skills/writing-skills/anthropic-best-practices.md
Normal file
File diff suppressed because it is too large
Load Diff
172
skills/writing-skills/graphviz-conventions.dot
Normal file
172
skills/writing-skills/graphviz-conventions.dot
Normal file
@@ -0,0 +1,172 @@
|
||||
digraph STYLE_GUIDE {
|
||||
// The style guide for our process DSL, written in the DSL itself
|
||||
|
||||
// Node type examples with their shapes
|
||||
subgraph cluster_node_types {
|
||||
label="NODE TYPES AND SHAPES";
|
||||
|
||||
// Questions are diamonds
|
||||
"Is this a question?" [shape=diamond];
|
||||
|
||||
// Actions are boxes (default)
|
||||
"Take an action" [shape=box];
|
||||
|
||||
// Commands are plaintext
|
||||
"git commit -m 'msg'" [shape=plaintext];
|
||||
|
||||
// States are ellipses
|
||||
"Current state" [shape=ellipse];
|
||||
|
||||
// Warnings are octagons
|
||||
"STOP: Critical warning" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
|
||||
|
||||
// Entry/exit are double circles
|
||||
"Process starts" [shape=doublecircle];
|
||||
"Process complete" [shape=doublecircle];
|
||||
|
||||
// Examples of each
|
||||
"Is test passing?" [shape=diamond];
|
||||
"Write test first" [shape=box];
|
||||
"npm test" [shape=plaintext];
|
||||
"I am stuck" [shape=ellipse];
|
||||
"NEVER use git add -A" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
|
||||
}
|
||||
|
||||
// Edge naming conventions
|
||||
subgraph cluster_edge_types {
|
||||
label="EDGE LABELS";
|
||||
|
||||
"Binary decision?" [shape=diamond];
|
||||
"Yes path" [shape=box];
|
||||
"No path" [shape=box];
|
||||
|
||||
"Binary decision?" -> "Yes path" [label="yes"];
|
||||
"Binary decision?" -> "No path" [label="no"];
|
||||
|
||||
"Multiple choice?" [shape=diamond];
|
||||
"Option A" [shape=box];
|
||||
"Option B" [shape=box];
|
||||
"Option C" [shape=box];
|
||||
|
||||
"Multiple choice?" -> "Option A" [label="condition A"];
|
||||
"Multiple choice?" -> "Option B" [label="condition B"];
|
||||
"Multiple choice?" -> "Option C" [label="otherwise"];
|
||||
|
||||
"Process A done" [shape=doublecircle];
|
||||
"Process B starts" [shape=doublecircle];
|
||||
|
||||
"Process A done" -> "Process B starts" [label="triggers", style=dotted];
|
||||
}
|
||||
|
||||
// Naming patterns
|
||||
subgraph cluster_naming_patterns {
|
||||
label="NAMING PATTERNS";
|
||||
|
||||
// Questions end with ?
|
||||
"Should I do X?";
|
||||
"Can this be Y?";
|
||||
"Is Z true?";
|
||||
"Have I done W?";
|
||||
|
||||
// Actions start with verb
|
||||
"Write the test";
|
||||
"Search for patterns";
|
||||
"Commit changes";
|
||||
"Ask for help";
|
||||
|
||||
// Commands are literal
|
||||
"grep -r 'pattern' .";
|
||||
"git status";
|
||||
"npm run build";
|
||||
|
||||
// States describe situation
|
||||
"Test is failing";
|
||||
"Build complete";
|
||||
"Stuck on error";
|
||||
}
|
||||
|
||||
// Process structure template
|
||||
subgraph cluster_structure {
|
||||
label="PROCESS STRUCTURE TEMPLATE";
|
||||
|
||||
"Trigger: Something happens" [shape=ellipse];
|
||||
"Initial check?" [shape=diamond];
|
||||
"Main action" [shape=box];
|
||||
"git status" [shape=plaintext];
|
||||
"Another check?" [shape=diamond];
|
||||
"Alternative action" [shape=box];
|
||||
"STOP: Don't do this" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
|
||||
"Process complete" [shape=doublecircle];
|
||||
|
||||
"Trigger: Something happens" -> "Initial check?";
|
||||
"Initial check?" -> "Main action" [label="yes"];
|
||||
"Initial check?" -> "Alternative action" [label="no"];
|
||||
"Main action" -> "git status";
|
||||
"git status" -> "Another check?";
|
||||
"Another check?" -> "Process complete" [label="ok"];
|
||||
"Another check?" -> "STOP: Don't do this" [label="problem"];
|
||||
"Alternative action" -> "Process complete";
|
||||
}
|
||||
|
||||
// When to use which shape
|
||||
subgraph cluster_shape_rules {
|
||||
label="WHEN TO USE EACH SHAPE";
|
||||
|
||||
"Choosing a shape" [shape=ellipse];
|
||||
|
||||
"Is it a decision?" [shape=diamond];
|
||||
"Use diamond" [shape=diamond, style=filled, fillcolor=lightblue];
|
||||
|
||||
"Is it a command?" [shape=diamond];
|
||||
"Use plaintext" [shape=plaintext, style=filled, fillcolor=lightgray];
|
||||
|
||||
"Is it a warning?" [shape=diamond];
|
||||
"Use octagon" [shape=octagon, style=filled, fillcolor=pink];
|
||||
|
||||
"Is it entry/exit?" [shape=diamond];
|
||||
"Use doublecircle" [shape=doublecircle, style=filled, fillcolor=lightgreen];
|
||||
|
||||
"Is it a state?" [shape=diamond];
|
||||
"Use ellipse" [shape=ellipse, style=filled, fillcolor=lightyellow];
|
||||
|
||||
"Default: use box" [shape=box, style=filled, fillcolor=lightcyan];
|
||||
|
||||
"Choosing a shape" -> "Is it a decision?";
|
||||
"Is it a decision?" -> "Use diamond" [label="yes"];
|
||||
"Is it a decision?" -> "Is it a command?" [label="no"];
|
||||
"Is it a command?" -> "Use plaintext" [label="yes"];
|
||||
"Is it a command?" -> "Is it a warning?" [label="no"];
|
||||
"Is it a warning?" -> "Use octagon" [label="yes"];
|
||||
"Is it a warning?" -> "Is it entry/exit?" [label="no"];
|
||||
"Is it entry/exit?" -> "Use doublecircle" [label="yes"];
|
||||
"Is it entry/exit?" -> "Is it a state?" [label="no"];
|
||||
"Is it a state?" -> "Use ellipse" [label="yes"];
|
||||
"Is it a state?" -> "Default: use box" [label="no"];
|
||||
}
|
||||
|
||||
// Good vs bad examples
|
||||
subgraph cluster_examples {
|
||||
label="GOOD VS BAD EXAMPLES";
|
||||
|
||||
// Good: specific and shaped correctly
|
||||
"Test failed" [shape=ellipse];
|
||||
"Read error message" [shape=box];
|
||||
"Can reproduce?" [shape=diamond];
|
||||
"git diff HEAD~1" [shape=plaintext];
|
||||
"NEVER ignore errors" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
|
||||
|
||||
"Test failed" -> "Read error message";
|
||||
"Read error message" -> "Can reproduce?";
|
||||
"Can reproduce?" -> "git diff HEAD~1" [label="yes"];
|
||||
|
||||
// Bad: vague and wrong shapes
|
||||
bad_1 [label="Something wrong", shape=box]; // Should be ellipse (state)
|
||||
bad_2 [label="Fix it", shape=box]; // Too vague
|
||||
bad_3 [label="Check", shape=box]; // Should be diamond
|
||||
bad_4 [label="Run command", shape=box]; // Should be plaintext with actual command
|
||||
|
||||
bad_1 -> bad_2;
|
||||
bad_2 -> bad_3;
|
||||
bad_3 -> bad_4;
|
||||
}
|
||||
}
|
||||
187
skills/writing-skills/persuasion-principles.md
Normal file
187
skills/writing-skills/persuasion-principles.md
Normal file
@@ -0,0 +1,187 @@
|
||||
# Persuasion Principles for Skill Design
|
||||
|
||||
## Overview
|
||||
|
||||
LLMs respond to the same persuasion principles as humans. Understanding this psychology helps you design more effective skills - not to manipulate, but to ensure critical practices are followed even under pressure.
|
||||
|
||||
**Research foundation:** Meincke et al. (2025) tested 7 persuasion principles with N=28,000 AI conversations. Persuasion techniques more than doubled compliance rates (33% → 72%, p < .001).
|
||||
|
||||
## The Seven Principles
|
||||
|
||||
### 1. Authority
|
||||
**What it is:** Deference to expertise, credentials, or official sources.
|
||||
|
||||
**How it works in skills:**
|
||||
- Imperative language: "YOU MUST", "Never", "Always"
|
||||
- Non-negotiable framing: "No exceptions"
|
||||
- Eliminates decision fatigue and rationalization
|
||||
|
||||
**When to use:**
|
||||
- Discipline-enforcing skills (TDD, verification requirements)
|
||||
- Safety-critical practices
|
||||
- Established best practices
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ Write code before test? Delete it. Start over. No exceptions.
|
||||
❌ Consider writing tests first when feasible.
|
||||
```
|
||||
|
||||
### 2. Commitment
|
||||
**What it is:** Consistency with prior actions, statements, or public declarations.
|
||||
|
||||
**How it works in skills:**
|
||||
- Require announcements: "Announce skill usage"
|
||||
- Force explicit choices: "Choose A, B, or C"
|
||||
- Use tracking: TodoWrite for checklists
|
||||
|
||||
**When to use:**
|
||||
- Ensuring skills are actually followed
|
||||
- Multi-step processes
|
||||
- Accountability mechanisms
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ When you find a skill, you MUST announce: "I'm using [Skill Name]"
|
||||
❌ Consider letting your partner know which skill you're using.
|
||||
```
|
||||
|
||||
### 3. Scarcity
|
||||
**What it is:** Urgency from time limits or limited availability.
|
||||
|
||||
**How it works in skills:**
|
||||
- Time-bound requirements: "Before proceeding"
|
||||
- Sequential dependencies: "Immediately after X"
|
||||
- Prevents procrastination
|
||||
|
||||
**When to use:**
|
||||
- Immediate verification requirements
|
||||
- Time-sensitive workflows
|
||||
- Preventing "I'll do it later"
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ After completing a task, IMMEDIATELY request code review before proceeding.
|
||||
❌ You can review code when convenient.
|
||||
```
|
||||
|
||||
### 4. Social Proof
|
||||
**What it is:** Conformity to what others do or what's considered normal.
|
||||
|
||||
**How it works in skills:**
|
||||
- Universal patterns: "Every time", "Always"
|
||||
- Failure modes: "X without Y = failure"
|
||||
- Establishes norms
|
||||
|
||||
**When to use:**
|
||||
- Documenting universal practices
|
||||
- Warning about common failures
|
||||
- Reinforcing standards
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ Checklists without TodoWrite tracking = steps get skipped. Every time.
|
||||
❌ Some people find TodoWrite helpful for checklists.
|
||||
```
|
||||
|
||||
### 5. Unity
|
||||
**What it is:** Shared identity, "we-ness", in-group belonging.
|
||||
|
||||
**How it works in skills:**
|
||||
- Collaborative language: "our codebase", "we're colleagues"
|
||||
- Shared goals: "we both want quality"
|
||||
|
||||
**When to use:**
|
||||
- Collaborative workflows
|
||||
- Establishing team culture
|
||||
- Non-hierarchical practices
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ We're colleagues working together. I need your honest technical judgment.
|
||||
❌ You should probably tell me if I'm wrong.
|
||||
```
|
||||
|
||||
### 6. Reciprocity
|
||||
**What it is:** Obligation to return benefits received.
|
||||
|
||||
**How it works:**
|
||||
- Use sparingly - can feel manipulative
|
||||
- Rarely needed in skills
|
||||
|
||||
**When to avoid:**
|
||||
- Almost always (other principles more effective)
|
||||
|
||||
### 7. Liking
|
||||
**What it is:** Preference for cooperating with those we like.
|
||||
|
||||
**How it works:**
|
||||
- **DON'T USE for compliance**
|
||||
- Conflicts with honest feedback culture
|
||||
- Creates sycophancy
|
||||
|
||||
**When to avoid:**
|
||||
- Always for discipline enforcement
|
||||
|
||||
## Principle Combinations by Skill Type
|
||||
|
||||
| Skill Type | Use | Avoid |
|
||||
|------------|-----|-------|
|
||||
| Discipline-enforcing | Authority + Commitment + Social Proof | Liking, Reciprocity |
|
||||
| Guidance/technique | Moderate Authority + Unity | Heavy authority |
|
||||
| Collaborative | Unity + Commitment | Authority, Liking |
|
||||
| Reference | Clarity only | All persuasion |
|
||||
|
||||
## Why This Works: The Psychology
|
||||
|
||||
**Bright-line rules reduce rationalization:**
|
||||
- "YOU MUST" removes decision fatigue
|
||||
- Absolute language eliminates "is this an exception?" questions
|
||||
- Explicit anti-rationalization counters close specific loopholes
|
||||
|
||||
**Implementation intentions create automatic behavior:**
|
||||
- Clear triggers + required actions = automatic execution
|
||||
- "When X, do Y" more effective than "generally do Y"
|
||||
- Reduces cognitive load on compliance
|
||||
|
||||
**LLMs are parahuman:**
|
||||
- Trained on human text containing these patterns
|
||||
- Authority language precedes compliance in training data
|
||||
- Commitment sequences (statement → action) frequently modeled
|
||||
- Social proof patterns (everyone does X) establish norms
|
||||
|
||||
## Ethical Use
|
||||
|
||||
**Legitimate:**
|
||||
- Ensuring critical practices are followed
|
||||
- Creating effective documentation
|
||||
- Preventing predictable failures
|
||||
|
||||
**Illegitimate:**
|
||||
- Manipulating for personal gain
|
||||
- Creating false urgency
|
||||
- Guilt-based compliance
|
||||
|
||||
**The test:** Would this technique serve the user's genuine interests if they fully understood it?
|
||||
|
||||
## Research Citations
|
||||
|
||||
**Cialdini, R. B. (2021).** *Influence: The Psychology of Persuasion (New and Expanded).* Harper Business.
|
||||
- Seven principles of persuasion
|
||||
- Empirical foundation for influence research
|
||||
|
||||
**Meincke, L., Shapiro, D., Duckworth, A. L., Mollick, E., Mollick, L., & Cialdini, R. (2025).** Call Me A Jerk: Persuading AI to Comply with Objectionable Requests. University of Pennsylvania.
|
||||
- Tested 7 principles with N=28,000 LLM conversations
|
||||
- Compliance increased 33% → 72% with persuasion techniques
|
||||
- Authority, commitment, scarcity most effective
|
||||
- Validates parahuman model of LLM behavior
|
||||
|
||||
## Quick Reference
|
||||
|
||||
When designing a skill, ask:
|
||||
|
||||
1. **What type is it?** (Discipline vs. guidance vs. reference)
|
||||
2. **What behavior am I trying to change?**
|
||||
3. **Which principle(s) apply?** (Usually authority + commitment for discipline)
|
||||
4. **Am I combining too many?** (Don't use all seven)
|
||||
5. **Is this ethical?** (Serves user's genuine interests?)
|
||||
Reference in New Issue
Block a user