Skill Creation Process: Step-by-Step Guide
Use this guide to systematically build a new Claude Code skill following progressive disclosure principles and token optimization.
Example Used: incident-triage skill (adapt for your use case)
📋 Process Overview
Phase 1: Planning (30 min) → Phase 2: Structure (15 min) → Phase 3: Implementation (2-4 hours) → Phase 4: Testing (30 min) → Phase 5: Refinement (ongoing)
Phase 1: Planning (30 minutes)
Step 1.1: Define the Core Problem
Questions to answer:
- What specific, repeatable task does this solve?
- When should Claude invoke this skill?
- What are the inputs and outputs?
- What's the 1-sentence description?
Example (incident-triage):
- Task: Triage incidents by extracting facts, enriching with data, proposing severity/priority
- Triggers: "triage", "new incident", "assign severity", "prioritize ticket"
- Inputs: Free text or JSON ticket payload
- Outputs: Summary, severity/priority, next steps, assignment hint
- Description: "Triage incidents by extracting key facts, enriching with CMDB/log data, and proposing severity, priority, and next actions."
Step 1.2: Identify the Three Levels
Level 1: Metadata (~100 tokens, always loaded)
- Skill name (kebab-case)
- Description (triggers Claude's router)
- Version
Level 2: SKILL.md Body (<2k tokens, loaded on trigger)
- When to Use (2-3 bullet points)
- What It Does (high-level flow)
- Inputs/Outputs (contract)
- Quick Start (1-3 commands)
- Links to Level 3 docs
Level 3: Bundled Files (unlimited, loaded as-needed)
- Detailed documentation
- Executable scripts
- API specs, examples, decision matrices
- Shared utilities
Step 1.3: Token Budget Plan
Fill out this table:
| Component | Target Tokens | What Goes Here |
|---|---|---|
| Metadata | ~100 | Name, description, version |
| SKILL.md Body | <2k (aim for 1.5k) | Quick ref, links to Level 3 |
| reference/*.md | 500-1000 each | Detailed docs (as many files as needed) |
| scripts/*.py | n/a | Executable code (not loaded unless run) |
Phase 2: Structure (15 minutes)
Step 2.1: Create Folder Layout
⚠️ CRITICAL: Create /reference/ folder and put ALL reference .md files there!
# Navigate to skills directory
cd .claude/skills
# Create skill structure
mkdir -p incident-triage/{scripts,reference,shared}
touch incident-triage/SKILL.md
touch incident-triage/scripts/{triage_main.py,enrich_ticket.py,suggest_priority.py,common.py}
touch incident-triage/reference/{inputs-and-prompts.md,decision-matrix.md,runbook-links.md,api-specs.md,examples.md}
touch incident-triage/shared/{config.py,api_client.py,formatters.py}
Verify structure matches this EXACT pattern:
incident-triage/
├── SKILL.md # ✅ Level 1+2 (≤2k tokens) - ONLY .md in root
├── reference/ # ✅ REQUIRED: Level 3 docs folder
│ ├── inputs-and-prompts.md # ✅ All reference .md files go HERE
│ ├── decision-matrix.md # ✅ NOT in root!
│ ├── runbook-links.md
│ ├── api-specs.md
│ └── examples.md
├── scripts/ # Level 3: executable code
│ ├── triage_main.py
│ ├── enrich_ticket.py
│ ├── suggest_priority.py
│ └── common.py
└── shared/ # Level 3: utilities
    ├── config.py
    ├── api_client.py
    └── formatters.py
❌ WRONG - DO NOT DO THIS:
incident-triage/
├── SKILL.md
├── inputs-and-prompts.md # ❌ WRONG! Should be in reference/
├── decision-matrix.md # ❌ WRONG! Should be in reference/
└── scripts/
Step 2.2: Stub Out Files
Create minimal stubs for each file to establish contracts:
- SKILL.md (copy template from best-practices.md)
- reference/*.md (headers only for now)
- scripts/*.py (function signatures with pass)
- shared/*.py (class/function signatures)
Step 2.3: Validate Folder Structure
Run this validation BEFORE moving to Phase 3:
# Check structure
ls -la incident-triage/
# Verify:
# ✅ SKILL.md exists in root
# ✅ reference/ folder exists
# ✅ NO .md files in root except SKILL.md
# ✅ scripts/ folder exists (if needed)
# ✅ shared/ folder exists (if needed)
# Check reference folder
ls -la incident-triage/reference/
# Verify:
# ✅ All .md reference files are HERE
# ✅ inputs-and-prompts.md
# ✅ decision-matrix.md
# ✅ api-specs.md
# ✅ examples.md
Checklist:
- /reference/ folder created
- All reference .md files in /reference/ (not root)
- SKILL.md links use ./reference/filename.md format
- No .md files in root except SKILL.md
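The manual checks above can also be automated. The following is a minimal sketch, assuming the layout shown in Step 2.1; `validate_structure` is a hypothetical helper name, not part of any Claude Code tooling.

```python
from pathlib import Path


def validate_structure(skill_dir: Path) -> list[str]:
    """Return a list of layout problems; an empty list means the checks pass."""
    problems = []
    if not (skill_dir / "SKILL.md").is_file():
        problems.append("SKILL.md missing from skill root")
    if not (skill_dir / "reference").is_dir():
        problems.append("reference/ folder missing")
    # Any .md file in the root other than SKILL.md belongs in reference/.
    for path in sorted(skill_dir.glob("*.md")):
        if path.name != "SKILL.md":
            problems.append(f"{path.name} should be in reference/")
    return problems
```

Calling `validate_structure(Path("incident-triage"))` from the skills directory returns the problems as strings, so the result can be printed or used to fail a pre-commit check.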
Phase 3: Implementation (2-4 hours)
Work in this order to maintain focus and avoid scope creep:
Step 3.1: Write Level 1 (Metadata) - 5 minutes
Open SKILL.md and write the frontmatter:
---
name: incident-triage
description: Triage incidents by extracting key facts, enriching with CMDB/log data, and proposing severity, priority, and next actions.
version: 1.0.0
---
Checklist:
- Name is clear and specific (not "helper" or "utility")
- Description contains trigger keywords
- Description explains what it does (not what it is)
- Total metadata ≤100 tokens
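The first three checklist items can be checked mechanically. This is a rough sketch only: `check_metadata`, `KEBAB_CASE`, and `VAGUE_NAMES` are hypothetical names, and the list of "too generic" names is an assumption to adapt.

```python
import re

KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
VAGUE_NAMES = {"helper", "utility", "utils", "tool"}  # assumed examples of generic names


def check_metadata(meta: dict, triggers: list[str]) -> list[str]:
    """Return problems found in the frontmatter fields; empty list means OK."""
    problems = []
    name = meta.get("name", "")
    if not KEBAB_CASE.match(name):
        problems.append(f"name {name!r} is not kebab-case")
    if name in VAGUE_NAMES:
        problems.append(f"name {name!r} is too generic")
    # Trigger keywords are matched case-insensitively against the description.
    description = meta.get("description", "").lower()
    missing = [t for t in triggers if t.lower() not in description]
    if missing:
        problems.append(f"description lacks trigger keywords: {missing}")
    return problems


meta = {
    "name": "incident-triage",
    "description": "Triage incidents by extracting key facts, enriching with "
                   "CMDB/log data, and proposing severity, priority, and next actions.",
    "version": "1.0.0",
}
print(check_metadata(meta, ["triage", "severity", "priority"]))  # []
```

Multi-word triggers such as "new incident" only pass if the exact phrase appears in the description, which is a useful prompt to decide whether the phrase belongs there.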
Step 3.2: Write Level 2 (SKILL.md Body) - 30 minutes
Follow this exact structure:
# Level 2: Body (<2k tokens recommended) — Loaded when the skill triggers
## When to Use
- [Trigger condition 1]
- [Trigger condition 2]
- [Trigger condition 3]
## What It Does (at a glance)
- **[Action 1]**: [brief description]
- **[Action 2]**: [brief description]
- **[Action 3]**: [brief description]
- **[Action 4]**: [brief description]
## Inputs
- [Input format 1]
- [Input format 2]
Details: see [reference/inputs-and-prompts.md](./reference/inputs-and-prompts.md).
## Quick Start
1. **Dry-run** (no external calls):
```bash
python scripts/main.py --example --dry-run
```
2. With enrichment:
```bash
python scripts/main.py --ticket-id 12345 --include-logs
```
3. Review output
Examples: [reference/examples.md](./reference/examples.md)
## Decision Logic (high-level)
[2-3 sentences on how decisions are made]
Full details: [reference/decision-matrix.md](./reference/decision-matrix.md)
## Outputs (contract)
- `field1`: [description]
- `field2`: [description]
- `field3`: [description]
## Guardrails
- [Security consideration 1]
- [Token budget note]
- [Error handling approach]
## Links (Level 3, loaded only when needed)
- Prompts: [reference/inputs-and-prompts.md](./reference/inputs-and-prompts.md)
- Decision logic: [reference/decision-matrix.md](./reference/decision-matrix.md)
- Examples: [reference/examples.md](./reference/examples.md)
- API specs: [reference/api-specs.md](./reference/api-specs.md)
## Triggers (help the router)
- Keywords: [keyword1], [keyword2], [keyword3]
- Inputs containing: [field1], [field2]
## Security & Config
Set environment variables:
- `VAR1_API_KEY`
- `VAR2_API_KEY`
Centralized in `shared/config.py`. Never echo secrets.
## Testing
```bash
# Smoke test
python scripts/main.py --fixture reference/examples.md
# End-to-end
python scripts/main.py --text "Example input" --dry-run
```
Checklist:
- <2k tokens (aim for 1.5k)
- Links to Level 3 for details
- Quick Start is copy-paste ready
- Output contract is clear
- No extensive examples or specs embedded
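To confirm that every Level 3 link in SKILL.md uses the ./reference/filename.md format and points at a real file, a small link check helps. This is a sketch under assumptions: `check_links` is a hypothetical helper, and the regex only handles plain inline Markdown links.

```python
import re
from pathlib import Path

# Matches inline Markdown links and captures the target: [text](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def check_links(skill_md_text: str, skill_dir: Path) -> list[str]:
    """Return problems for relative links found in the SKILL.md text."""
    problems = []
    for target in LINK.findall(skill_md_text):
        if "://" in target or target.startswith("#"):
            continue  # external URLs and in-page anchors are out of scope
        if not target.startswith("./reference/"):
            problems.append(f"{target}: expected ./reference/<file>.md")
        elif not (skill_dir / target).is_file():
            problems.append(f"{target}: file not found")
    return problems
```

Run it with the contents of SKILL.md and the skill's root folder; an empty list means all links follow the convention and resolve.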
Step 3.3: Write Level 3 Reference Docs - 45 minutes
Create each reference file systematically:
reference/inputs-and-prompts.md
# Inputs and Prompt Shapes
## Input Format 1: Free Text
- Description
- Example
## Input Format 2: Structured JSON
```json
{
  "field": "value"
}
```
## Prompt Snippets
- Extraction goals
- Summarization style
- Redaction rules
reference/decision-matrix.md
# Decision Matrix
[Full decision logic with tables, formulas, edge cases]
## Base Matrix
| Dimension 1 \ Dimension 2 | Value A | Value B | Value C |
|---|---|---|---|
| Low | Result | Result | Result |
| Med | Result | Result | Result |
| High | Result | Result | Result |
## Adjustments
- Adjustment rule 1
- Adjustment rule 2
## Rationale
[Why this matrix, examples, edge cases]
reference/api-specs.md
# API Specs & Schemas
## API 1: CMDB
- Base URL: `{SERVICE_MAP_URL}`
- Auth: Header `X-API-Key: {CMDB_API_KEY}`
- Endpoints:
  - GET `/service/{name}/dependencies`
  - Response schema: [...]
## API 2: Logs
- Base URL: [...]
- Endpoints: [...]
reference/examples.md
# Examples
## Example 1: [Scenario Name]
**Input:**
[Example input]
**Output:**
[Example output with all fields]
**Explanation:** [Why these decisions were made]
## Example 2: [Another Scenario]
[...]
reference/runbook-links.md
# Runbook Links
- [Service 1]: <URL>
- [Service 2]: <URL>
- [Escalation tree]: <URL>
Checklist for all reference docs:
- Each file focuses on one aspect
- 500-1000 tokens per file (can be more if needed)
- Referenced from SKILL.md but not embedded
- Includes examples where helpful
Step 3.4: Write Shared Utilities - 30 minutes
shared/config.py
"""Centralized configuration from environment variables."""
import os


class Config:
    """Config object - never logs secrets."""
    CMDB_API_KEY = os.getenv("CMDB_API_KEY")
    LOGS_API_KEY = os.getenv("LOGS_API_KEY")
    SERVICE_MAP_URL = os.getenv("SERVICE_MAP_URL")
    # Read by LogsClient in shared/api_client.py
    LOGS_API_URL = os.getenv("LOGS_API_URL")
    DASHBOARD_BASE_URL = os.getenv("DASHBOARD_BASE_URL")

    @classmethod
    def validate(cls):
        """Check required env vars are set."""
        missing = []
        for key in ["CMDB_API_KEY", "LOGS_API_KEY"]:
            if not getattr(cls, key):
                missing.append(key)
        if missing:
            raise ValueError(f"Missing required env vars: {missing}")


cfg = Config()
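One way to honor the "never logs secrets" rule when printing configuration for debugging is to report only whether each secret is set. The sketch below is illustrative: `describe_config` and the suffix list are assumptions, not part of the skill's scripts.

```python
import os

# Assumed naming convention: variables ending in these suffixes hold secrets.
SECRET_SUFFIXES = ("_KEY", "_TOKEN", "_SECRET")


def describe_config(names, environ=os.environ):
    """Return a log-safe dict: secret values are replaced by '<set>' or '<missing>'."""
    safe = {}
    for name in names:
        value = environ.get(name)
        if name.endswith(SECRET_SUFFIXES):
            safe[name] = "<set>" if value else "<missing>"
        else:
            safe[name] = value
    return safe
```

For example, `describe_config(["CMDB_API_KEY", "SERVICE_MAP_URL"])` shows the URL in full but never the key itself.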
shared/api_client.py
"""API client wrappers."""
import requests

from .config import cfg


class CMDBClient:
    def __init__(self):
        self.base_url = cfg.SERVICE_MAP_URL
        self.headers = {"X-API-Key": cfg.CMDB_API_KEY}

    def get_service_dependencies(self, service_name):
        """Fetch service dependencies."""
        try:
            resp = requests.get(
                f"{self.base_url}/service/{service_name}/dependencies",
                headers=self.headers,
                timeout=5,
            )
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as e:
            # Chain the original exception so the root cause stays visible
            raise ConnectionError(f"CMDB API failed: {e}") from e


class LogsClient:
    def __init__(self):
        # LOGS_API_URL must be defined on Config in shared/config.py;
        # getattr avoids an AttributeError here if it is missing.
        self.base_url = getattr(cfg, "LOGS_API_URL", None)
        self.headers = {"Authorization": f"Bearer {cfg.LOGS_API_KEY}"}

    def recent_errors(self, service_name, last_minutes=15):
        """Fetch recent error logs."""
        # Implementation (this stub returns None until it is filled in)
        pass


def cmdb_client():
    return CMDBClient()


def logs_client():
    return LogsClient()
shared/formatters.py
"""Output formatting helpers."""
def format_output(enriched, severity, priority, rationale, next_steps):
    """Format triage result as markdown."""
    lines = [
        "### Incident Triage Result",
        f"**Severity**: {severity} | **Priority**: {priority}",
        f"**Rationale**: {rationale}",
        "",
        "**Summary**:",
        enriched.get("summary", "N/A"),
        "",
        "**Next Steps**:",
    ]
    for i, step in enumerate(next_steps, 1):
        lines.append(f"{i}. {step}")
    if "evidence" in enriched:
        lines.extend(["", "**Evidence**:"])
        for link in enriched["evidence"]:
            lines.append(f"- {link}")
    return "\n".join(lines)
Step 3.5: Write Main Scripts - 1 hour
scripts/triage_main.py (entry point)
#!/usr/bin/env python3
"""Main entry point for incident triage."""
import argparse
import json
import sys
from pathlib import Path

# Add parent to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent))

from shared.config import cfg
from shared.formatters import format_output
from scripts.enrich_ticket import enrich
from scripts.suggest_priority import score


def main():
    parser = argparse.ArgumentParser(description="Triage an incident")
    parser.add_argument("--text", help="Free-text incident description")
    parser.add_argument("--ticket-id", help="Ticket ID to enrich")
    parser.add_argument("--include-logs", action="store_true")
    parser.add_argument("--include-cmdb", action="store_true")
    parser.add_argument("--dry-run", action="store_true",
                        help="Skip external API calls")
    args = parser.parse_args()

    # Validate inputs
    if not args.text and not args.ticket_id:
        print("Error: Provide --text or --ticket-id")
        sys.exit(1)

    # Build payload
    payload = {
        "text": args.text,
        "ticket_id": args.ticket_id
    }

    try:
        # Enrich (respects --dry-run)
        enriched = enrich(
            payload,
            include_logs=args.include_logs and not args.dry_run,
            include_cmdb=args.include_cmdb and not args.dry_run
        )
        # Score (deterministic)
        severity, priority, rationale = score(enriched)
        # Generate next steps
        next_steps = generate_next_steps(enriched, severity)
        # Format output
        output = format_output(enriched, severity, priority, rationale, next_steps)
        print(output)
    except Exception as e:
        print(f"❌ Triage failed: {e}")
        print("\nTroubleshooting:")
        print("1. Check environment variables are set")
        print("2. Verify API endpoints are accessible")
        print("3. Run with --dry-run to test without external calls")
        sys.exit(1)


def generate_next_steps(enriched, severity):
    """Generate action items based on enrichment and severity."""
    steps = []
    if severity in ["SEV1", "SEV2"]:
        steps.append("Page on-call immediately")
    if "dashboard_url" in enriched:
        steps.append(f"Review dashboard: {enriched['dashboard_url']}")
    steps.append("Compare last 15m vs 24h baseline")
    if enriched.get("recent_deploy"):
        steps.append("Consider rollback if error budget breached")
    return steps


if __name__ == "__main__":
    main()
scripts/enrich_ticket.py
"""Enrich ticket with external data."""
from shared.config import cfg
from shared.api_client import cmdb_client, logs_client


def enrich(payload, include_logs=False, include_cmdb=False):
    """
    Enrich ticket payload with CMDB/logs data.

    Args:
        payload: Dict with 'text' and/or 'ticket_id'
        include_logs: Fetch recent logs
        include_cmdb: Fetch CMDB dependencies

    Returns:
        Dict with original payload + enrichment
    """
    result = {"input": payload}
    # Extract service name from text or ticket
    service = extract_service(payload)
    if service:
        result["service"] = service
    # Enrich with CMDB
    if include_cmdb and service:
        try:
            cmdb_data = cmdb_client().get_service_dependencies(service)
            result["cmdb"] = cmdb_data
            result["blast_radius"] = cmdb_data.get("dependent_services", [])
        except Exception as e:
            result["cmdb_error"] = str(e)
    # Enrich with logs
    if include_logs and service:
        try:
            logs = logs_client().recent_errors(service)
            result["logs"] = logs
        except Exception as e:
            result["logs_error"] = str(e)
    # Derive scope/impact hints
    result["scope"] = derive_scope(result)
    result["impact"] = derive_impact(result)
    return result


def extract_service(payload):
    """Extract service name from payload."""
    # Check explicit service field
    if "service" in payload:
        return payload["service"]
    # Parse from text (simple keyword matching).
    # "text" is None when only --ticket-id is given, so fall back to "".
    text = (payload.get("text") or "").lower()
    known_services = ["checkout", "payments", "inventory", "auth"]
    for service in known_services:
        if service in text:
            return service
    return None


def derive_scope(enriched):
    """Determine blast radius scope."""
    blast_radius = len(enriched.get("blast_radius", []))
    if blast_radius == 0:
        return "single-service"
    elif blast_radius < 3:
        return "few-services"
    else:
        return "multi-service"


def derive_impact(enriched):
    """Estimate user impact level."""
    # Check for explicit impact data
    if "impact" in enriched.get("input", {}):
        pct = enriched["input"]["impact"].get("users_affected_pct", 0)
        if pct > 50:
            return "high"
        elif pct > 10:
            return "medium"
        else:
            return "low"
    # Infer from service criticality
    service = enriched.get("service", "")
    critical_services = ["checkout", "payments", "auth"]
    if service in critical_services:
        return "medium"  # Default to medium for critical services
    return "low"
scripts/suggest_priority.py
"""Deterministic severity/priority scoring."""
DECISION_MATRIX = {
    # (impact, scope) -> (severity, priority)
    ("low", "single-service"): ("SEV4", "P4"),
    ("low", "few-services"): ("SEV3", "P3"),
    ("low", "multi-service"): ("SEV3", "P3"),
    ("medium", "single-service"): ("SEV3", "P3"),
    ("medium", "few-services"): ("SEV2", "P2"),
    ("medium", "multi-service"): ("SEV2", "P2"),
    ("high", "single-service"): ("SEV2", "P2"),
    ("high", "few-services"): ("SEV1", "P1"),
    ("high", "multi-service"): ("SEV1", "P1"),
}


def score(enriched):
    """
    Score incident severity and priority.

    Args:
        enriched: Dict returned by enrich() in scripts/enrich_ticket.py

    Returns:
        Tuple of (severity, priority, rationale)
    """
    impact = enriched.get("impact", "medium")
    scope = enriched.get("scope", "single-service")
    # Base score from matrix
    key = (impact, scope)
    if key not in DECISION_MATRIX:
        # Default fallback
        severity, priority = "SEV3", "P3"
        rationale = f"Default scoring (impact={impact}, scope={scope})"
    else:
        severity, priority = DECISION_MATRIX[key]
        rationale = f"{impact.title()} impact, {scope} scope"
    # Apply adjustments
    if should_escalate(enriched):
        severity, priority = escalate(severity, priority)
        rationale += " (escalated: long recovery expected)"
    return severity, priority, rationale


def should_escalate(enriched):
    """Check if incident should be escalated."""
    # Check for long recovery indicators.
    # "logs" can be present but None (e.g. a stubbed client), so guard with `or {}`.
    logs = enriched.get("logs") or {}
    if logs.get("error_rate_increasing"):
        return True
    # Check for repeated incidents
    if enriched.get("recent_incidents_count", 0) > 3:
        return True
    return False


def escalate(severity, priority):
    """Escalate severity/priority by one level."""
    sev_map = {"SEV4": "SEV3", "SEV3": "SEV2", "SEV2": "SEV1", "SEV1": "SEV1"}
    pri_map = {"P4": "P3", "P3": "P2", "P2": "P1", "P1": "P1"}
    return sev_map.get(severity, severity), pri_map.get(priority, priority)
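Because score() silently falls back to SEV3/P3 when a key is absent, it can be worth checking that the matrix covers every (impact, scope) combination. A minimal sketch follows; `missing_cells` is a hypothetical helper, and the value lists are taken from the labels the scripts above return.

```python
from itertools import product

IMPACTS = ("low", "medium", "high")
SCOPES = ("single-service", "few-services", "multi-service")


def missing_cells(matrix: dict) -> list[tuple[str, str]]:
    """Return every (impact, scope) pair that has no entry in the matrix."""
    return [key for key in product(IMPACTS, SCOPES) if key not in matrix]


# Deliberately incomplete example matrix to show the check firing.
partial = {
    ("low", "single-service"): ("SEV4", "P4"),
    ("high", "multi-service"): ("SEV1", "P1"),
}
print(len(missing_cells(partial)))  # 7 of the 9 cells are unfilled
```

Passing the real DECISION_MATRIX should return an empty list; a non-empty result names the cells that would hit the default fallback.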
Phase 4: Testing (30 minutes)
Step 4.1: Create Test Fixtures
Create reference/test-fixtures.json:
{
  "test1": {
    "text": "Checkout API seeing 500 errors at 12%; started 15:05Z",
    "expected_severity": "SEV2",
    "expected_priority": "P2"
  },
  "test2": {
    "text": "Single user reports login issue on mobile app",
    "expected_severity": "SEV4",
    "expected_priority": "P4"
  }
}
Step 4.2: Run Tests
# 1. Smoke test deterministic components
# (suggest_priority.py as listed above has no CLI; this assumes you add a --test handler)
python scripts/suggest_priority.py --test
# 2. Dry-run end-to-end
python scripts/triage_main.py --text "API timeouts on checkout" --dry-run
# 3. With enrichment (requires env vars)
export CMDB_API_KEY="test_key"
export LOGS_API_KEY="test_key"
python scripts/triage_main.py --ticket-id 12345 --include-logs --include-cmdb
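The fixtures from Step 4.1 can drive a small regression check. This is a sketch under assumptions: `run_fixtures` and `stub_triage` are hypothetical names, and the triage callable is injected so the runner works with any pipeline.

```python
import json


def run_fixtures(fixtures: dict, triage) -> list[str]:
    """Compare triage(text) with each fixture's expected severity and priority.

    `triage` is any callable returning (severity, priority). The result is a
    list of human-readable mismatches, empty when every fixture passes.
    """
    failures = []
    for name, case in fixtures.items():
        actual = tuple(triage(case["text"]))
        expected = (case["expected_severity"], case["expected_priority"])
        if actual != expected:
            failures.append(f"{name}: expected {expected}, got {actual}")
    return failures


FIXTURES = json.loads("""
{
  "test2": {
    "text": "Single user reports login issue on mobile app",
    "expected_severity": "SEV4",
    "expected_priority": "P4"
  }
}
""")


def stub_triage(text):
    """Placeholder that always answers SEV4/P4; swap in the real pipeline."""
    return "SEV4", "P4"


print(run_fixtures(FIXTURES, stub_triage))  # []
```

To exercise the real scripts, load reference/test-fixtures.json and pass something like `lambda text: score(enrich({"text": text}))[:2]`. With the scripts exactly as listed, a dry run of test1 resolves to medium impact and single-service scope, which the matrix maps to SEV3/P3, so expect the runner to flag it until either the fixture or the scoring heuristics are adjusted.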
Step 4.3: Test with Claude
Ask Claude:
"I have a new incident: checkout API showing 500 errors affecting 15% of users in EU region. Can you triage this?"
Verify:
- Skill triggers correctly
- Output is well-formatted
- Severity/priority makes sense
- Next steps are actionable
- Links work
Phase 5: Refinement (Ongoing)
Step 5.1: Token Count Audit
# Count words in SKILL.md (this includes the frontmatter; subtract it for a body-only figure)
wc -w incident-triage/SKILL.md
# Divide by 0.75 (about 1.33 tokens per word) for a rough token count of English prose
Checklist:
- Metadata ~100 tokens
- Body <2k tokens
- If over, move content to reference/*.md
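The audit can be scripted so that metadata and body are counted separately. The sketch below assumes the common rule of thumb of roughly 0.75 English words per token, which is only an approximation of a real tokenizer; `split_frontmatter`, `estimate_tokens`, and `audit` are hypothetical helper names.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (metadata, body); metadata is empty if there is no frontmatter."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text


def estimate_tokens(text: str) -> int:
    """Rough estimate: word count divided by 0.75 words per token."""
    return round(len(text.split()) / 0.75)


def audit(text: str, metadata_budget: int = 100, body_budget: int = 2000) -> dict:
    """Estimate tokens per level and list the parts that exceed their budget."""
    metadata, body = split_frontmatter(text)
    report = {"metadata": estimate_tokens(metadata), "body": estimate_tokens(body)}
    budgets = (("metadata", metadata_budget), ("body", body_budget))
    report["over_budget"] = [part for part, limit in budgets if report[part] > limit]
    return report
```

Calling `audit(Path("incident-triage/SKILL.md").read_text())` (with `pathlib.Path` imported) returns the two estimates plus the parts that are over budget; anything listed there is a candidate for moving to reference/*.md.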
Step 5.2: Real-World Usage Monitoring
Track these metrics:
- Does Claude trigger the skill appropriately?
- Are users getting helpful results?
- What questions/errors come up?
- Which Level 3 docs are never used?
Step 5.3: Iterate Based on Feedback
If skill triggers too often: → Make description more specific
If skill triggers too rarely: → Add more trigger keywords
If output is unhelpful: → Improve decision logic or examples
If token limit exceeded: → Move more content to Level 3
🎓 Adaptation Checklist
To create YOUR skill from this template:
- Folder Structure (CRITICAL):
  - Create /reference/ folder
  - Put ALL reference .md files IN /reference/ folder
  - NO .md files in root except SKILL.md
  - Links in SKILL.md use ./reference/filename.md format
- Rename: Replace "incident-triage" with your skill name
- Metadata: Write name/description with your trigger keywords
- Triggers: List all keywords/patterns that should invoke your skill
- Inputs/Outputs: Define your specific contract
- Scripts: Replace enrichment/scoring with your logic
- Reference docs: Create docs for your domain (decision matrices, API specs, etc.)
- Config: Add your required environment variables
- Examples: Create 3-5 realistic examples
- Test: Dry-run → with real data → with Claude
- Validate Structure: Run structure validation checklist
- Refine: Monitor usage, iterate based on feedback
📚 Related Resources
- Agent Skills Best Practices - Quick reference
- Progressive Disclosure - Design philosophy
- Token Optimization - Token limits explained
Last Updated: 2025-10-20 Version: 1.0.0