# Skill Creation Process: Step-by-Step Guide
> **Use this guide** to systematically build a new Claude Code skill following progressive disclosure principles and token optimization.

**Example Used**: `incident-triage` skill (adapt for your use case)

---
## 📋 Process Overview
```
Phase 1: Planning → Phase 2: Structure → Phase 3: Implementation → Phase 4: Testing → Phase 5: Refinement
(30 min)            (15 min)             (2-4 hours)               (30 min)           (ongoing)
```
---
## Phase 1: Planning (30 minutes)
### Step 1.1: Define the Core Problem
**Questions to answer:**
- [ ] What specific, repeatable task does this solve?
- [ ] When should Claude invoke this skill?
- [ ] What are the inputs and outputs?
- [ ] What's the 1-sentence description?
**Example (incident-triage):**
- **Task**: Triage incidents by extracting facts, enriching with data, proposing severity/priority
- **Triggers**: "triage", "new incident", "assign severity", "prioritize ticket"
- **Inputs**: Free text or JSON ticket payload
- **Outputs**: Summary, severity/priority, next steps, assignment hint
- **Description**: "Triage incidents by extracting key facts, enriching with CMDB/log data, and proposing severity, priority, and next actions."
### Step 1.2: Identify the Three Levels
**Level 1: Metadata** (~100 tokens, always loaded)
- [ ] Skill name (kebab-case)
- [ ] Description (triggers Claude's router)
- [ ] Version
**Level 2: SKILL.md Body** (<2k tokens, loaded on trigger)
- [ ] When to Use (2-3 bullet points)
- [ ] What It Does (high-level flow)
- [ ] Inputs/Outputs (contract)
- [ ] Quick Start (1-3 commands)
- [ ] Links to Level 3 docs
**Level 3: Bundled Files** (unlimited, loaded as-needed)
- [ ] Detailed documentation
- [ ] Executable scripts
- [ ] API specs, examples, decision matrices
- [ ] Shared utilities
### Step 1.3: Token Budget Plan
Fill out this table:
| Component | Target Tokens | What Goes Here |
|-----------|--------------|----------------|
| Metadata | ~100 | Name, description, version |
| SKILL.md Body | <2k (aim for 1.5k) | Quick ref, links to Level 3 |
| reference/*.md | 500-1000 each | Detailed docs (as many files as needed) |
| scripts/*.py | n/a | Executable code (not loaded unless run) |
---
## Phase 2: Structure (15 minutes)
### Step 2.1: Create Folder Layout
**⚠️ CRITICAL: Create `/reference/` folder and put ALL reference .md files there!**
```bash
# Navigate to skills directory
cd .claude/skills
# Create skill structure
mkdir -p incident-triage/{scripts,reference,shared}
touch incident-triage/SKILL.md
touch incident-triage/scripts/{triage_main.py,enrich_ticket.py,suggest_priority.py,common.py}
touch incident-triage/reference/{inputs-and-prompts.md,decision-matrix.md,runbook-links.md,api-specs.md,examples.md}
touch incident-triage/shared/{config.py,api_client.py,formatters.py}
```
**Verify structure matches this EXACT pattern:**
```
incident-triage/
├── SKILL.md # ✅ Level 1+2 (≤2k tokens) - ONLY .md in root
├── reference/ # ✅ REQUIRED: Level 3 docs folder
│ ├── inputs-and-prompts.md # ✅ All reference .md files go HERE
│ ├── decision-matrix.md # ✅ NOT in root!
│ ├── runbook-links.md
│ ├── api-specs.md
│ └── examples.md
├── scripts/ # Level 3: executable code
│ ├── triage_main.py
│ ├── enrich_ticket.py
│ ├── suggest_priority.py
│ └── common.py
└── shared/ # Level 3: utilities
    ├── config.py
    ├── api_client.py
    └── formatters.py
```
**❌ WRONG - DO NOT DO THIS:**
```
incident-triage/
├── SKILL.md
├── inputs-and-prompts.md # ❌ WRONG! Should be in reference/
├── decision-matrix.md # ❌ WRONG! Should be in reference/
└── scripts/
```
### Step 2.2: Stub Out Files
Create minimal stubs for each file to establish contracts:
**SKILL.md** (copy template from best-practices.md)
**reference/*.md** (headers only for now)
**scripts/*.py** (function signatures with pass)
**shared/*.py** (class/function signatures)
### Step 2.3: Validate Folder Structure
**Run this validation BEFORE moving to Phase 3:**
```bash
# Check structure
ls -la incident-triage/
# Verify:
# ✅ SKILL.md exists in root
# ✅ reference/ folder exists
# ✅ NO .md files in root except SKILL.md
# ✅ scripts/ folder exists (if needed)
# ✅ shared/ folder exists (if needed)
# Check reference folder
ls -la incident-triage/reference/
# Verify:
# ✅ All .md reference files are HERE
# ✅ inputs-and-prompts.md
# ✅ decision-matrix.md
# ✅ api-specs.md
# ✅ examples.md
```
**Checklist:**
- [ ] `/reference/` folder created
- [ ] All reference .md files in `/reference/` (not root)
- [ ] SKILL.md links use `./reference/filename.md` format
- [ ] No .md files in root except SKILL.md
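
The manual `ls` checks above can also be automated. The following is a minimal sketch, not part of the skill itself: `validate_skill_structure` is a hypothetical helper name, and it only checks the three rules from the checklist (a `SKILL.md` in the root, a `reference/` folder, and no other `.md` files in the root).

```python
from pathlib import Path


def validate_skill_structure(skill_dir):
    """Return a list of problems found in a skill folder (empty list means OK)."""
    root = Path(skill_dir)
    problems = []

    # Rule 1: SKILL.md must live in the skill root
    if not (root / "SKILL.md").is_file():
        problems.append("SKILL.md is missing from the skill root")

    # Rule 2: reference/ must exist for Level 3 docs
    if not (root / "reference").is_dir():
        problems.append("reference/ folder is missing")

    # Rule 3: any other .md file in the root belongs in reference/
    for md in sorted(root.glob("*.md")):
        if md.name != "SKILL.md":
            problems.append(f"{md.name} should be moved to reference/")

    return problems


# Example call (path is illustrative):
# print(validate_skill_structure("incident-triage"))
```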
---
## Phase 3: Implementation (2-4 hours)
Work in this order to maintain focus and avoid scope creep:
### Step 3.1: Write Level 1 (Metadata) - 5 minutes
Open `SKILL.md` and write the frontmatter:
```yaml
---
name: incident-triage
description: Triage incidents by extracting key facts, enriching with CMDB/log data, and proposing severity, priority, and next actions.
version: 1.0.0
---
```
**Checklist:**
- [ ] Name is clear and specific (not "helper" or "utility")
- [ ] Description contains trigger keywords
- [ ] Description explains what it does (not what it is)
- [ ] Total metadata ≤100 tokens
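
Parts of this checklist can be checked mechanically. The sketch below assumes the frontmatter only contains flat `key: value` lines, as in the example above; it is not a full YAML parser, and `parse_frontmatter` and `check_metadata` are hypothetical helper names.

```python
import re

# Lowercase words or digits separated by single hyphens, e.g. "incident-triage"
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def parse_frontmatter(skill_md_text):
    """Parse flat `key: value` lines between the first two `---` lines."""
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_metadata(fields):
    """Return a list of checklist violations (empty list means OK)."""
    problems = []
    name = fields.get("name", "")
    if not KEBAB_CASE.match(name):
        problems.append(f"name {name!r} is not kebab-case")
    if name in {"helper", "utility"}:
        problems.append("name is too generic")
    if not fields.get("description"):
        problems.append("description is missing")
    return problems
```

Whether the description actually contains useful trigger keywords still needs a human review; the script only catches structural slips.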
### Step 3.2: Write Level 2 (SKILL.md Body) - 30 minutes
Follow this exact structure:
````markdown
# Level 2: Body (<2k tokens recommended) - Loaded when the skill triggers
## When to Use
- [Trigger condition 1]
- [Trigger condition 2]
- [Trigger condition 3]
## What It Does (at a glance)
- **[Action 1]**: [brief description]
- **[Action 2]**: [brief description]
- **[Action 3]**: [brief description]
- **[Action 4]**: [brief description]
## Inputs
- [Input format 1]
- [Input format 2]
Details: see [reference/inputs-and-prompts.md](./reference/inputs-and-prompts.md).
## Quick Start
1. **Dry-run** (no external calls):
   ```bash
   python scripts/main.py --example --dry-run
   ```
2. **With enrichment**:
   ```bash
   python scripts/main.py --ticket-id 12345 --include-logs
   ```
3. Review output
Examples: [reference/examples.md](./reference/examples.md)
## Decision Logic (high-level)
[2-3 sentences on how decisions are made]
Full details: [reference/decision-matrix.md](./reference/decision-matrix.md)
## Outputs (contract)
- `field1`: [description]
- `field2`: [description]
- `field3`: [description]
## Guardrails
- [Security consideration 1]
- [Token budget note]
- [Error handling approach]
## Links (Level 3, loaded only when needed)
- Prompts: [reference/inputs-and-prompts.md](./reference/inputs-and-prompts.md)
- Decision logic: [reference/decision-matrix.md](./reference/decision-matrix.md)
- Examples: [reference/examples.md](./reference/examples.md)
- API specs: [reference/api-specs.md](./reference/api-specs.md)
## Triggers (help the router)
Keywords: [keyword1], [keyword2], [keyword3]
Inputs containing: [field1], [field2]
## Security & Config
Set environment variables:
- `VAR1_API_KEY`
- `VAR2_API_KEY`
Centralized in `shared/config.py`. Never echo secrets.
## Testing
```bash
# Smoke test
python scripts/main.py --fixture reference/examples.md
# End-to-end
python scripts/main.py --text "Example input" --dry-run
```
````
**Checklist:**
- [ ] <2k tokens (aim for 1.5k)
- [ ] Links to Level 3 for details
- [ ] Quick Start is copy-paste ready
- [ ] Output contract is clear
- [ ] No extensive examples or specs embedded
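
Because the body leans on links to Level 3 files, a broken link silently hides documentation from Claude. The following sketch, under the assumption that all Level 3 links use the `./reference/filename.md` form shown in the template, lists link targets that do not exist on disk. `find_broken_reference_links` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Captures markdown link targets that start with ./reference/
# (stops at ")", "#" or whitespace so anchors and titles are ignored)
REFERENCE_LINK = re.compile(r"\]\((\./reference/[^)#\s]+)")


def find_broken_reference_links(skill_dir):
    """Return the ./reference/ link targets in SKILL.md that are missing on disk."""
    root = Path(skill_dir)
    text = (root / "SKILL.md").read_text(encoding="utf-8")
    targets = sorted(set(REFERENCE_LINK.findall(text)))
    return [target for target in targets if not (root / target).is_file()]
```

An empty list means every linked reference file exists; it does not check that every file in `reference/` is linked from `SKILL.md`.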
### Step 3.3: Write Level 3 Reference Docs - 45 minutes
Create each reference file systematically:
#### reference/inputs-and-prompts.md
````markdown
# Inputs and Prompt Shapes
## Input Format 1: Free Text
- Description
- Example
## Input Format 2: Structured JSON
```json
{
  "field": "value"
}
```
## Prompt Snippets
- Extraction goals
- Summarization style
- Redaction rules
````
#### reference/decision-matrix.md
```markdown
# Decision Matrix
[Full decision logic with tables, formulas, edge cases]
## Base Matrix
| Dimension 1 \ Dimension 2 | Value A | Value B | Value C |
|---|---|---|---|
| Low | Result | Result | Result |
| Med | Result | Result | Result |
| High | Result | Result | Result |
## Adjustments
- Adjustment rule 1
- Adjustment rule 2
## Rationale
[Why this matrix, examples, edge cases]
```
#### reference/api-specs.md
```markdown
# API Specs & Schemas
## API 1: CMDB
- Base URL: `{SERVICE_MAP_URL}`
- Auth: Header `X-API-Key: {CMDB_API_KEY}`
- Endpoints:
  - GET `/service/{name}/dependencies`
  - Response schema: [...]
## API 2: Logs
- Base URL: [...]
- Endpoints: [...]
```
#### reference/examples.md
````markdown
# Examples
## Example 1: [Scenario Name]
**Input:**
```
[Example input]
```
**Output:**
```
[Example output with all fields]
```
**Explanation:** [Why these decisions were made]
## Example 2: [Another Scenario]
[...]
````
#### reference/runbook-links.md
```markdown
# Runbook Links
- [Service 1]: <URL>
- [Service 2]: <URL>
- [Escalation tree]: <URL>
```
**Checklist for all reference docs:**
- [ ] Each file focuses on one aspect
- [ ] 500-1000 tokens per file (can be more if needed)
- [ ] Referenced from SKILL.md but not embedded
- [ ] Includes examples where helpful
### Step 3.4: Write Shared Utilities - 30 minutes
#### shared/config.py
```python
"""Centralized configuration from environment variables."""
import os


class Config:
    """Config object - never logs secrets"""
    CMDB_API_KEY = os.getenv("CMDB_API_KEY")
    LOGS_API_KEY = os.getenv("LOGS_API_KEY")
    SERVICE_MAP_URL = os.getenv("SERVICE_MAP_URL")
    # Base URL read by LogsClient in shared/api_client.py
    LOGS_API_URL = os.getenv("LOGS_API_URL")
    DASHBOARD_BASE_URL = os.getenv("DASHBOARD_BASE_URL")

    @classmethod
    def validate(cls):
        """Check required env vars are set"""
        missing = []
        for key in ["CMDB_API_KEY", "LOGS_API_KEY"]:
            if not getattr(cls, key):
                missing.append(key)
        if missing:
            raise ValueError(f"Missing required env vars: {missing}")


cfg = Config()
```
#### shared/api_client.py
```python
"""API client wrappers."""
import requests
from .config import cfg


class CMDBClient:
    def __init__(self):
        self.base_url = cfg.SERVICE_MAP_URL
        self.headers = {"X-API-Key": cfg.CMDB_API_KEY}

    def get_service_dependencies(self, service_name):
        """Fetch service dependencies"""
        try:
            resp = requests.get(
                f"{self.base_url}/service/{service_name}/dependencies",
                headers=self.headers,
                timeout=5
            )
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as e:
            # Chain the original exception so the root cause stays visible
            raise ConnectionError(f"CMDB API failed: {e}") from e


class LogsClient:
    def __init__(self):
        self.base_url = cfg.LOGS_API_URL
        self.headers = {"Authorization": f"Bearer {cfg.LOGS_API_KEY}"}

    def recent_errors(self, service_name, last_minutes=15):
        """Fetch recent error logs"""
        # Stub: replace with a real call to your logs API.
        # Raising (instead of returning None) lets enrich() record a logs_error.
        raise NotImplementedError("LogsClient.recent_errors is not implemented yet")


def cmdb_client():
    return CMDBClient()


def logs_client():
    return LogsClient()
```
#### shared/formatters.py
```python
"""Output formatting helpers."""
def format_output(enriched, severity, priority, rationale, next_steps):
    """Format triage result as markdown."""
    lines = [
        "### Incident Triage Result",
        f"**Severity**: {severity} | **Priority**: {priority}",
        f"**Rationale**: {rationale}",
        "",
        "**Summary**:",
        enriched.get("summary", "N/A"),
        "",
        "**Next Steps**:",
    ]
    for i, step in enumerate(next_steps, 1):
        lines.append(f"{i}. {step}")
    if "evidence" in enriched:
        lines.extend(["", "**Evidence**:"])
        for link in enriched["evidence"]:
            lines.append(f"- {link}")
    return "\n".join(lines)
```
### Step 3.5: Write Main Scripts - 1 hour
#### scripts/triage_main.py (entry point)
```python
#!/usr/bin/env python3
"""Main entry point for incident triage."""
import argparse
import json
import sys
from pathlib import Path

# Add parent to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent))

from shared.config import cfg
from shared.formatters import format_output
from scripts.enrich_ticket import enrich
from scripts.suggest_priority import score


def main():
    parser = argparse.ArgumentParser(description="Triage an incident")
    parser.add_argument("--text", help="Free-text incident description")
    parser.add_argument("--ticket-id", help="Ticket ID to enrich")
    parser.add_argument("--include-logs", action="store_true")
    parser.add_argument("--include-cmdb", action="store_true")
    parser.add_argument("--dry-run", action="store_true",
                        help="Skip external API calls")
    args = parser.parse_args()

    # Validate inputs
    if not args.text and not args.ticket_id:
        print("Error: Provide --text or --ticket-id")
        sys.exit(1)

    # Build payload
    payload = {
        "text": args.text,
        "ticket_id": args.ticket_id
    }

    try:
        # Enrich (respects --dry-run)
        enriched = enrich(
            payload,
            include_logs=args.include_logs and not args.dry_run,
            include_cmdb=args.include_cmdb and not args.dry_run
        )
        # Score (deterministic)
        severity, priority, rationale = score(enriched)
        # Generate next steps
        next_steps = generate_next_steps(enriched, severity)
        # Format output
        output = format_output(enriched, severity, priority, rationale, next_steps)
        print(output)
    except Exception as e:
        print(f"❌ Triage failed: {e}")
        print("\nTroubleshooting:")
        print("1. Check environment variables are set")
        print("2. Verify API endpoints are accessible")
        print("3. Run with --dry-run to test without external calls")
        sys.exit(1)


def generate_next_steps(enriched, severity):
    """Generate action items based on enrichment and severity"""
    steps = []
    if severity in ["SEV1", "SEV2"]:
        steps.append("Page on-call immediately")
    if "dashboard_url" in enriched:
        steps.append(f"Review dashboard: {enriched['dashboard_url']}")
    steps.append("Compare last 15m vs 24h baseline")
    if enriched.get("recent_deploy"):
        steps.append("Consider rollback if error budget breached")
    return steps


if __name__ == "__main__":
    main()
```
#### scripts/enrich_ticket.py
```python
"""Enrich ticket with external data."""
from shared.config import cfg
from shared.api_client import cmdb_client, logs_client


def enrich(payload, include_logs=False, include_cmdb=False):
    """
    Enrich ticket payload with CMDB/logs data.

    Args:
        payload: Dict with 'text' and/or 'ticket_id'
        include_logs: Fetch recent logs
        include_cmdb: Fetch CMDB dependencies

    Returns:
        Dict with original payload + enrichment
    """
    result = {"input": payload}

    # Extract service name from text or ticket
    service = extract_service(payload)
    if service:
        result["service"] = service

    # Enrich with CMDB
    if include_cmdb and service:
        try:
            cmdb_data = cmdb_client().get_service_dependencies(service)
            result["cmdb"] = cmdb_data
            result["blast_radius"] = cmdb_data.get("dependent_services", [])
        except Exception as e:
            result["cmdb_error"] = str(e)

    # Enrich with logs
    if include_logs and service:
        try:
            logs = logs_client().recent_errors(service)
            result["logs"] = logs
        except Exception as e:
            result["logs_error"] = str(e)

    # Derive scope/impact hints
    result["scope"] = derive_scope(result)
    result["impact"] = derive_impact(result)
    return result


def extract_service(payload):
    """Extract service name from payload."""
    # Check explicit service field
    if "service" in payload:
        return payload["service"]

    # Parse from text (simple keyword matching).
    # "text" is present but None when only --ticket-id is given, so guard with `or ""`.
    text = (payload.get("text") or "").lower()
    known_services = ["checkout", "payments", "inventory", "auth"]
    for service in known_services:
        if service in text:
            return service
    return None


def derive_scope(enriched):
    """Determine blast radius scope."""
    blast_radius = len(enriched.get("blast_radius", []))
    if blast_radius == 0:
        return "single-service"
    elif blast_radius < 3:
        return "few-services"
    else:
        return "multi-service"


def derive_impact(enriched):
    """Estimate user impact level."""
    # Check for explicit impact data
    if "impact" in enriched.get("input", {}):
        pct = enriched["input"]["impact"].get("users_affected_pct", 0)
        if pct > 50:
            return "high"
        elif pct > 10:
            return "medium"
        else:
            return "low"

    # Infer from service criticality
    service = enriched.get("service", "")
    critical_services = ["checkout", "payments", "auth"]
    if service in critical_services:
        return "medium"  # Default to medium for critical services
    return "low"
```
#### scripts/suggest_priority.py
```python
"""Deterministic severity/priority scoring."""
DECISION_MATRIX = {
    # (impact, scope) -> (severity, priority)
    ("low", "single-service"): ("SEV4", "P4"),
    ("low", "few-services"): ("SEV3", "P3"),
    ("low", "multi-service"): ("SEV3", "P3"),
    ("medium", "single-service"): ("SEV3", "P3"),
    ("medium", "few-services"): ("SEV2", "P2"),
    ("medium", "multi-service"): ("SEV2", "P2"),
    ("high", "single-service"): ("SEV2", "P2"),
    ("high", "few-services"): ("SEV1", "P1"),
    ("high", "multi-service"): ("SEV1", "P1"),
}


def score(enriched):
    """
    Score incident severity and priority.

    Args:
        enriched: Dict returned by enrich() in scripts/enrich_ticket.py

    Returns:
        Tuple of (severity, priority, rationale)
    """
    impact = enriched.get("impact", "medium")
    scope = enriched.get("scope", "single-service")

    # Base score from matrix
    key = (impact, scope)
    if key not in DECISION_MATRIX:
        # Default fallback
        severity, priority = "SEV3", "P3"
        rationale = f"Default scoring (impact={impact}, scope={scope})"
    else:
        severity, priority = DECISION_MATRIX[key]
        rationale = f"{impact.title()} impact, {scope} scope"

    # Apply adjustments
    if should_escalate(enriched):
        severity, priority = escalate(severity, priority)
        rationale += " (escalated: long recovery expected)"
    return severity, priority, rationale


def should_escalate(enriched):
    """Check if incident should be escalated."""
    # Check for long recovery indicators.
    # `or {}` also covers the case where "logs" is present but None.
    logs = enriched.get("logs") or {}
    if logs.get("error_rate_increasing"):
        return True
    # Check for repeated incidents
    if enriched.get("recent_incidents_count", 0) > 3:
        return True
    return False


def escalate(severity, priority):
    """Escalate severity/priority by one level."""
    sev_map = {"SEV4": "SEV3", "SEV3": "SEV2", "SEV2": "SEV1", "SEV1": "SEV1"}
    pri_map = {"P4": "P3", "P3": "P2", "P2": "P1", "P1": "P1"}
    return sev_map.get(severity, severity), pri_map.get(priority, priority)
```
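
Because `score()` falls back to SEV3/P3 for any key it cannot find, a gap in the matrix would go unnoticed at runtime. A small coverage check can catch that early. This is a sketch under the assumption that the impact and scope labels are exactly the three values each used above; `missing_matrix_keys` is a hypothetical helper name.

```python
from itertools import product

# Label sets assumed from the decision matrix shown above
IMPACTS = ("low", "medium", "high")
SCOPES = ("single-service", "few-services", "multi-service")


def missing_matrix_keys(matrix):
    """Return the (impact, scope) pairs that the matrix does not cover."""
    return [key for key in product(IMPACTS, SCOPES) if key not in matrix]


# Example: pass in DECISION_MATRIX from suggest_priority.py;
# an empty list means all nine combinations are defined.
```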
---
## Phase 4: Testing (30 minutes)
### Step 4.1: Create Test Fixtures
Create `reference/test-fixtures.json`:
```json
{
  "test1": {
    "text": "Checkout API seeing 500 errors at 12%; started 15:05Z",
    "expected_severity": "SEV2",
    "expected_priority": "P2"
  },
  "test2": {
    "text": "Single user reports login issue on mobile app",
    "expected_severity": "SEV4",
    "expected_priority": "P4"
  }
}
```
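
One way to use these fixtures is a small runner that feeds each `text` through the triage pipeline and compares the result with the expected values. The sketch below is an assumption about how you might wire it up: `run_fixtures` is a hypothetical helper, and `triage` is any callable you supply that maps text to a `(severity, priority)` tuple, for example a thin wrapper around `enrich()` and `score()`.

```python
import json


def run_fixtures(fixtures_json, triage):
    """Run every fixture through triage(text) and collect mismatches.

    Returns a list of (fixture_name, expected, actual) tuples; empty means all passed.
    """
    failures = []
    for name, case in json.loads(fixtures_json).items():
        actual = triage(case["text"])
        expected = (case["expected_severity"], case["expected_priority"])
        if actual != expected:
            failures.append((name, expected, actual))
    return failures
```

Expected values can depend on which enrichment data is available (blast radius feeds into scope), so it may help to record in each fixture whether it is meant for a dry run or an enriched run.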
### Step 4.2: Run Tests
```bash
# 1. Smoke test deterministic components
#    (assumes you add a --test self-check to suggest_priority.py; the listing in Step 3.5 does not define one)
python scripts/suggest_priority.py --test
# 2. Dry-run end-to-end
python scripts/triage_main.py --text "API timeouts on checkout" --dry-run
# 3. With enrichment (requires the API keys, plus SERVICE_MAP_URL and LOGS_API_URL pointing at your endpoints)
export CMDB_API_KEY="test_key"
export LOGS_API_KEY="test_key"
python scripts/triage_main.py --ticket-id 12345 --include-logs --include-cmdb
```
### Step 4.3: Test with Claude
Ask Claude:
```
"I have a new incident: checkout API showing 500 errors affecting 15% of users in EU region. Can you triage this?"
```
Verify:
- [ ] Skill triggers correctly
- [ ] Output is well-formatted
- [ ] Severity/priority makes sense
- [ ] Next steps are actionable
- [ ] Links work
---
## Phase 5: Refinement (Ongoing)
### Step 5.1: Token Count Audit
```bash
# Count words in SKILL.md (wc counts the whole file, frontmatter included)
wc -w incident-triage/SKILL.md
# Rough token count: divide the word count by 0.75 (about 1.33 tokens per English word)
```
**Checklist:**
- [ ] Metadata ~100 tokens
- [ ] Body <2k tokens
- [ ] If over, move content to reference/*.md
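
To audit the body separately from the metadata, the frontmatter has to be split off first. The sketch below does that and applies a common rule of thumb of roughly 0.75 English words per token, so the estimate divides the word count by 0.75. It is only a heuristic, not a real tokenizer, and `split_frontmatter` and `estimate_tokens` are hypothetical helper names.

```python
def split_frontmatter(text):
    """Split SKILL.md text into (frontmatter, body) at the leading `---` pair."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # No frontmatter found: treat everything as body
    return "", text


def estimate_tokens(text, words_per_token=0.75):
    """Rough token estimate: whitespace-separated words divided by words_per_token."""
    return round(len(text.split()) / words_per_token)
```

For example, calling `estimate_tokens` on each half returned by `split_frontmatter` gives separate numbers to compare against the ~100 token and <2k token targets.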
### Step 5.2: Real-World Usage Monitoring
Track these metrics:
- [ ] Does Claude trigger the skill appropriately?
- [ ] Are users getting helpful results?
- [ ] What questions/errors come up?
- [ ] Which Level 3 docs are never used?
### Step 5.3: Iterate Based on Feedback
**If skill triggers too often:**
→ Make description more specific

**If skill triggers too rarely:**
→ Add more trigger keywords

**If output is unhelpful:**
→ Improve decision logic or examples

**If token limit exceeded:**
→ Move more content to Level 3

---
## 🎓 Adaptation Checklist
To create YOUR skill from this template:
- [ ] **Folder Structure** (CRITICAL):
- [ ] Create `/reference/` folder
- [ ] Put ALL reference .md files IN `/reference/` folder
- [ ] NO .md files in root except SKILL.md
- [ ] Links in SKILL.md use `./reference/filename.md` format
- [ ] **Rename**: Replace "incident-triage" with your skill name
- [ ] **Metadata**: Write name/description with your trigger keywords
- [ ] **Triggers**: List all keywords/patterns that should invoke your skill
- [ ] **Inputs/Outputs**: Define your specific contract
- [ ] **Scripts**: Replace enrichment/scoring with your logic
- [ ] **Reference docs**: Create docs for your domain (decision matrices, API specs, etc.)
- [ ] **Config**: Add your required environment variables
- [ ] **Examples**: Create 3-5 realistic examples
- [ ] **Test**: Dry-run → with real data → with Claude
- [ ] **Validate Structure**: Run structure validation checklist
- [ ] **Refine**: Monitor usage, iterate based on feedback
---
## 📚 Related Resources
- [Agent Skills Best Practices](../best-practices.md) - Quick reference
- [Progressive Disclosure](../topics/progressive-disclosure.md) - Design philosophy
- [Token Optimization](../README.md#token-optimized-structure) - Token limits explained
---
**Last Updated**: 2025-10-20
**Version**: 1.0.0