Initial commit

Zhongwei Li
2025-11-30 08:51:34 +08:00
commit acde81dcfe
59 changed files with 22282 additions and 0 deletions


@@ -0,0 +1,272 @@
# Best Practices Guide
**Progressive Disclosure Applied**: This guide uses a hierarchical structure where you start with high-level concepts and progressively drill down into technical details.
**Token-Optimized Structure**:
- This file: ~628 tokens (overview & navigation)
- [best-practices.md](./best-practices.md): ~920 tokens (quick reference for building skills)
- Topic files: 1.4k-2.2k tokens each (deep dives loaded as-needed)
**📑 Navigation**: See [INDEX.md](./INDEX.md) for complete file reference and navigation patterns.
---
## 🎯 Quick Start (Level 1)
### Building a New Skill?
**[Skill Creation Process](./reference/skill-creation-process.md)** - Follow this step-by-step guide
### Learning Patterns?
Choose your learning path:
- **[Progressive Disclosure](./topics/progressive-disclosure.md)** - Learn the core UX/architectural pattern
- **[Dynamic Manifests](./topics/dynamic-manifests.md)** - Implement runtime capability discovery
- **[Deferred Loading](./topics/deferred-loading.md)** - Optimize resource initialization
### Need Quick Reference?
**[best-practices.md](./best-practices.md)** - Checklists, templates, and common pitfalls
---
## 📚 Concept Map
```
Best Practices
├── Progressive Disclosure (design pattern)
│         │ influences
│         ▼
├── Dynamic Manifests (runtime discovery)
│         │ enables
│         ▼
└── Deferred Loading (lazy initialization)
```
---
## 🚀 Why These Patterns Matter
### The Problem
Traditional systems load everything at startup:
- ❌ Slow initialization
- ❌ High memory consumption
- ❌ Wasted resources on unused features
- ❌ Poor scalability
### The Solution
Progressive Disclosure + Dynamic Manifests + Deferred Loading:
- ✅ Fast startup (load on-demand)
- ✅ Efficient resource usage
- ✅ Adaptive capabilities
- ✅ Context-aware feature availability
---
## 📖 Learning Path
### For Beginners
1. Start with **[Progressive Disclosure](./topics/progressive-disclosure.md#what-is-it)** - Understand the philosophy
2. See **[Simple Examples](./topics/progressive-disclosure.md#simple-examples)**
3. Review **[Quick Start](./topics/dynamic-manifests.md#quick-start)**
### For Practitioners
1. Read **[Implementation Patterns](./topics/progressive-disclosure.md#implementation-patterns)**
2. Configure **[Dynamic Manifests](./topics/dynamic-manifests.md#configuration)**
3. Optimize with **[Deferred Loading](./topics/deferred-loading.md#strategies)**
### For Architects
1. Study **[Architectural Principles](./topics/progressive-disclosure.md#architectural-principles)**
2. Design **[Capability Systems](./topics/dynamic-manifests.md#capability-systems)**
3. Implement **[Advanced Optimization](./topics/deferred-loading.md#advanced-techniques)**
---
## 🔗 Topic Relationships
### Progressive Disclosure → Dynamic Manifests
Progressive disclosure provides the **design philosophy**: show users only what they need, when they need it.
Dynamic manifests provide the **technical implementation**: systems query capabilities at runtime, enabling features progressively.
**Example**: A chat interface starts with basic tools (Level 1), then reveals advanced tools (Level 2) as the user demonstrates expertise → The system's dynamic manifest adjusts which tools are available based on context.
### Dynamic Manifests → Deferred Loading
Dynamic manifests tell you **what's available**.
Deferred loading determines **when to initialize it**.
**Example**: Dynamic manifest says "Tool X is available" → Deferred loading ensures Tool X's code isn't loaded until first use → Saves memory and startup time.
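The hand-off above can be sketched in a few lines of Python. This is a minimal illustration, not MCP's actual API; `MANIFEST` and the module names it maps to are hypothetical stand-ins:

```python
import importlib

# Hypothetical manifest: declares what is available without loading it
MANIFEST = {"export": "json", "search": "re"}

_loaded = {}  # cache so each tool initializes at most once


def get_tool(name):
    """Deferred loading: import a tool's module only on first use."""
    if name not in MANIFEST:
        raise KeyError(f"tool {name!r} not in manifest")
    if name not in _loaded:  # first access pays the import cost
        _loaded[name] = importlib.import_module(MANIFEST[name])
    return _loaded[name]
```

Listing `MANIFEST` answers "what's available" instantly; `get_tool("export")` pays the import cost only on the first call, and subsequent calls hit the cache.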
---
## 🎓 Real-World Applications
### MCP (Model Context Protocol) Skills
```
User opens Claude Code
[Progressive Disclosure]
→ Only basic skills shown initially
User works with project files
[Dynamic Manifests]
→ System detects project type
→ New relevant skills appear
User invokes advanced skill
[Deferred Loading]
→ Skill code loaded on first use
→ Subsequent calls use cached version
```
### Web Applications
```
User visits page
[Progressive Disclosure]
→ Core UI loads first
User navigates to dashboard
[Dynamic Manifests]
→ Check user permissions
→ Build feature menu dynamically
User clicks "Export Data"
[Deferred Loading]
→ Load export library on demand
→ Initialize only when needed
```
---
## 🛠️ Implementation Checklist
Use this as a quick reference when implementing these patterns:
- [ ] Design information hierarchy (Progressive Disclosure)
- [ ] Identify capability tiers (Basic → Intermediate → Advanced)
- [ ] Implement runtime discovery endpoints (Dynamic Manifests)
- [ ] Create `.well-known/mcp/manifest.json` (MCP specific)
- [ ] Enable lazy initialization (Deferred Loading)
- [ ] Add caching strategies (Optimization)
- [ ] Implement change notifications (Dynamic updates)
- [ ] Test without system restart (Validation)
---
## 📊 Performance Metrics
Track these to measure success:
| Metric | Before | Target | Pattern |
|--------|--------|--------|---------|
| Initial Load Time | 5s | < 1s | Progressive Disclosure |
| Memory at Startup | 500MB | < 100MB | Deferred Loading |
| Feature Discovery | Static | Dynamic | Dynamic Manifests |
| Context Tokens Used | 10k | < 2k | Progressive Loading |
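The first two metrics can be collected by instrumenting startup with the standard library. A sketch only; `init_fn` stands in for your real initialization routine:

```python
import time
import tracemalloc


def measure_startup(init_fn):
    """Return (seconds elapsed, peak bytes allocated) for an init function."""
    tracemalloc.start()
    t0 = time.perf_counter()
    init_fn()
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak


# Compare an eager init (builds everything) against a lazy one (defers work)
eager_time, eager_peak = measure_startup(lambda: list(range(100_000)))
lazy_time, lazy_peak = measure_startup(lambda: iter(range(100_000)))
```

The lazy variant allocates almost nothing at startup, which is exactly the gap these metrics are meant to expose.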
---
## 🔍 Deep Dive Topics
Ready to go deeper? Click any topic:
1. **[Progressive Disclosure](./topics/progressive-disclosure.md)**
- Design philosophy
- UX patterns
- Information architecture
- Cognitive load management
2. **[Dynamic Manifests](./topics/dynamic-manifests.md)**
- Configuration guide
- Endpoint implementation
- Registry patterns
- MCP-specific setup
3. **[Deferred Loading](./topics/deferred-loading.md)**
- Lazy initialization
- Code splitting
- Resource optimization
- Caching strategies
---
## 🎯 Quick Wins
Want immediate improvements? Start here:
### 5-Minute Win: Enable Dynamic Discovery
```json
// claude_desktop_config.json
{
  "mcpServers": {
    "your-server": {
      "dynamicDiscovery": true,
      "discoveryInterval": 5000
    }
  }
}
```
See [Dynamic Manifests: Quick Start](./topics/dynamic-manifests.md#quick-start)
### 15-Minute Win: Implement Lazy Loading
```python
from functools import lru_cache

@lru_cache(maxsize=128)
def load_expensive_resource():
    # Only loads on first call
    return initialize_resource()
```
See [Deferred Loading: Basic Patterns](./topics/deferred-loading.md#basic-patterns)
### 30-Minute Win: Progressive Disclosure UI
```markdown
# Level 1: Essentials (always visible)
## Getting Started
# Level 2: Intermediate (click to expand)
<details>
<summary>Advanced Features</summary>
...
</details>
# Level 3: Expert (separate page)
See [Advanced Guide](./advanced.md)
```
See [Progressive Disclosure: UI Patterns](./topics/progressive-disclosure.md#ui-patterns)
---
## 📚 Additional Resources
- [MCP Official Spec](https://spec.modelcontextprotocol.io/)
- [Progressive Disclosure (Nielsen Norman Group)](https://www.nngroup.com/articles/progressive-disclosure/)
- [Lazy Loading Best Practices](https://web.dev/lazy-loading/)
---
## 🆘 Troubleshooting
**Problem**: Changes not appearing without restart
**Solution**: Check [Dynamic Manifests: Configuration](./topics/dynamic-manifests.md#configuration)
**Problem**: High memory usage at startup
**Solution**: Review [Deferred Loading: Strategies](./topics/deferred-loading.md#strategies)
**Problem**: Users overwhelmed by options
**Solution**: Apply [Progressive Disclosure: Principles](./topics/progressive-disclosure.md#principles)
---
**Last Updated**: 2025-10-20
**Version**: 1.0.0

@@ -0,0 +1,620 @@
# Progressive Disclosure
> **Definition**: A design pattern that sequences information and actions across multiple screens to reduce cognitive load and improve user experience.
**Navigation**: [← Back to Best Practices](../README.md) | [Next: Dynamic Manifests →](./dynamic-manifests.md)
---
## Table of Contents
- [What Is It?](#what-is-it) ← Start here
- [Why Use It?](#why-use-it)
- [Simple Examples](#simple-examples)
- [Implementation Patterns](#implementation-patterns) ← For practitioners
- [Architectural Principles](#architectural-principles) ← For architects
- [UI Patterns](#ui-patterns)
- [Related Concepts](#related-concepts)
---
## What Is It?
Progressive disclosure is revealing information **gradually** rather than all at once.
### The Core Idea
```
❌ Bad: Show everything immediately
User sees: [100 buttons] [50 options] [20 menus]
Result: Overwhelmed, confused
✅ Good: Show essentials, reveal more as needed
User sees: [5 core actions]
User clicks "More": [15 additional options appear]
User clicks "Advanced": [Advanced features panel opens]
Result: Focused, confident
```
### Real-World Analogy
**Restaurant Menu**
```
1. Main categories (Appetizers, Entrees, Desserts) ← Level 1
└─ Click "Entrees"
2. Entree types (Pasta, Seafood, Steak) ← Level 2
└─ Click "Pasta"
3. Specific dishes with details ← Level 3
```
This prevents menu overwhelm while still providing complete information.
---
## Why Use It?
### Benefits
| Benefit | Description | Impact |
|---------|-------------|--------|
| **Reduced Cognitive Load** | Users process less information at once | Less confusion, faster decisions |
| **Improved Discoverability** | Users find relevant features easier | Better feature adoption |
| **Faster Performance** | Load only what's needed now | Quicker startup, less memory |
| **Adaptive Complexity** | Beginners see simple, experts see advanced | Serves all skill levels |
### When to Use
**Use progressive disclosure when:**
- Users don't need all features/info immediately
- Feature set is large or complex
- Users have varying skill levels
- Performance/load time matters
**Don't use when:**
- All information is equally critical
- Users need to compare all options at once
- Feature set is small (< 7 items)
- Extra clicks harm the experience
---
## Simple Examples
### Example 1: Settings Panel
**Traditional Approach:**
```
Settings
├── Profile Name: _______
├── Email: _______
├── Password: _______
├── Theme: [ Dark | Light ]
├── Language: [ English ▼ ]
├── Timezone: [ UTC-5 ▼ ]
├── Date Format: [ MM/DD/YYYY ▼ ]
├── Currency: [ USD ▼ ]
├── API Keys: _______
├── Webhook URL: _______
├── Debug Mode: [ ]
├── Log Level: [ Info ▼ ]
└── ... (20 more settings)
```
Result: Users scroll, scan, and feel lost.
**Progressive Disclosure:**
```
Settings
├── Profile Name: _______
├── Email: _______
├── Theme: [ Dark | Light ]
├── [▼ Advanced Settings]
│ └── (collapsed by default)
└── [▼ Developer Settings]
└── (collapsed by default)
```
Click "Advanced Settings":
```
Advanced Settings
├── Language: [ English ▼ ]
├── Timezone: [ UTC-5 ▼ ]
├── Date Format: [ MM/DD/YYYY ▼ ]
└── Currency: [ USD ▼ ]
```
### Example 2: MCP Skills
**Traditional: All Skills Loaded**
```python
# Load everything at startup
available_skills = [
    "basic-search",
    "file-operations",
    "web-scraping",
    "data-analysis",
    "machine-learning",
    "blockchain-analysis",
    "video-processing",
    # ... 50 more skills
]
```
Result: Slow startup, high memory, overwhelming list.
**Progressive Disclosure:**
```python
# Level 1: Always available
tier_1_skills = ["basic-search", "file-operations"]

# Level 2: Loaded when project type detected
if is_data_project():
    tier_2_skills = ["data-analysis", "visualization"]

# Level 3: Loaded on explicit request
if user_requests("machine-learning"):
    tier_3_skills = ["ml-training", "model-deployment"]
```
### Example 3: Command Line Tool
**Traditional:**
```bash
$ mytool --help
Usage: mytool [OPTIONS] COMMAND [ARGS]...
Options:
--config PATH Configuration file path
--verbose Verbose output
--debug Debug mode
--log-file PATH Log file path
--log-level LEVEL Logging level
--timeout SECONDS Operation timeout
--retry-count N Number of retries
--parallel N Parallel workers
--cache-dir PATH Cache directory
--no-cache Disable caching
--format FORMAT Output format
... (30 more options)
Commands:
init Initialize project
build Build project
deploy Deploy project
test Run tests
... (20 more commands)
```
**Progressive Disclosure:**
```bash
$ mytool --help
Usage: mytool [OPTIONS] COMMAND
Common Commands:
init Initialize project
build Build project
deploy Deploy project
Run 'mytool COMMAND --help' for command-specific options
Run 'mytool --help-all' for complete documentation
$ mytool build --help
Usage: mytool build [OPTIONS]
Essential Options:
--output PATH Output directory (default: ./dist)
--watch Watch for changes
Advanced Options (mytool build --help-advanced):
--parallel N Parallel workers
--cache-dir PATH Cache directory
... (more advanced options)
```
---
## Implementation Patterns
### Pattern 1: Tiered Information Architecture
Organize content into logical tiers:
```
Tier 1: Essentials (80% of users need this)
├── Core functionality
├── Most common tasks
└── Critical information
Tier 2: Intermediate (30% of users need this)
├── Advanced features
├── Customization options
└── Detailed documentation
Tier 3: Expert (5% of users need this)
├── Edge cases
├── Debug/diagnostic tools
└── API reference
```
**Implementation:**
```markdown
# My API Documentation
## Quick Start (Tier 1)
Basic usage examples that work for most cases.
<details>
<summary>Advanced Usage (Tier 2)</summary>
## Authentication Options
Detailed authentication flows...
## Rate Limiting
How to handle rate limits...
</details>
[Expert Guide](./expert-guide.md) (Tier 3) →
```
### Pattern 2: Context-Aware Disclosure
Show features based on user context:
```python
class FeatureDisclosure:
    def get_available_features(self, user_context):
        features = ["core_feature_1", "core_feature_2"]  # Always available

        # Intermediate features
        if user_context.skill_level >= "intermediate":
            features.extend(["advanced_search", "bulk_operations"])

        # Expert features
        if user_context.has_permission("admin"):
            features.extend(["system_config", "user_management"])

        # Contextual features
        if user_context.project_type == "data_science":
            features.extend(["ml_tools", "visualization"])

        return features
```
### Pattern 3: Progressive Enhancement
Start minimal, add capabilities:
```javascript
// Level 1: Basic functionality works everywhere
let saveData = (data) => {
    localStorage.setItem('data', JSON.stringify(data));
};

// Level 2: Enhanced with sync (if available)
if (navigator.onLine && hasCloudSync()) {
    const baseSave = saveData;
    saveData = (data) => {
        baseSave(data);
        cloudSync.upload(data); // Progressive enhancement
    };
}

// Level 3: Real-time collaboration (if enabled)
if (hasFeature('realtime_collaboration')) {
    const syncedSave = saveData;
    saveData = (data) => {
        syncedSave(data);
        websocket.broadcast(data); // Further enhancement
    };
}
```
### Pattern 4: Lazy Loading
Defer initialization until needed:
```python
from importlib import import_module

class SkillManager:
    def __init__(self):
        self._skills = {}
        self._skill_registry = {
            'basic': ['search', 'files'],
            'advanced': ['ml', 'data_analysis'],
            'expert': ['custom_models']
        }

    def get_skill(self, skill_name):
        # Progressive disclosure: load on first access
        if skill_name not in self._skills:
            self._skills[skill_name] = self._load_skill(skill_name)
        return self._skills[skill_name]

    def _load_skill(self, skill_name):
        # Deferred loading happens here
        module = import_module(f'skills.{skill_name}')
        return module.SkillClass()
```
---
## Architectural Principles
### Principle 1: Information Hierarchy
Design with clear levels:
```
Level 0: Critical (always visible, < 5 items)
└─ Things users MUST see/do immediately
Level 1: Primary (visible by default, < 10 items)
└─ Core functionality, 80% use case
Level 2: Secondary (behind 1 click, < 20 items)
└─ Advanced features, configuration
Level 3: Tertiary (behind 2+ clicks, unlimited)
└─ Expert features, detailed docs, edge cases
```
### Principle 2: Cognitive Load Management
**Miller's Law**: Humans can hold 7±2 items in working memory.
**Application:**
- Level 1 UI: Show ≤ 7 primary actions
- Menus: Group into ≤ 7 categories
- Forms: Break into ≤ 7 fields per step
**Bad Example:**
```
[Button1] [Button2] [Button3] [Button4] [Button5]
[Button6] [Button7] [Button8] [Button9] [Button10]
[Button11] [Button12] [Button13] [Button14] [Button15]
```
**Good Example:**
```
[Common Actions ▼]
├─ Action 1
├─ Action 2
└─ Action 3
[Advanced ▼]
├─ Action 4
└─ Action 5
[Expert ▼]
└─ More...
```
### Principle 3: Discoverability vs. Visibility
Balance showing enough vs. hiding too much:
```
High Discoverability
│ Ideal Zone:
│ Core features visible,
│ Advanced features discoverable
│ ┌─────────────┐
│ │ ✓ Sweet │
│ │ Spot │
│ └─────────────┘
└──────────────────────────→ High Visibility
Hidden features Feature overload
```
**Techniques:**
- Visual cues: "▼ More options" "⚙ Advanced"
- Tooltips: Hint at hidden features
- Progressive help: "New features available!"
- Analytics: Track if users find features
### Principle 4: Reversible Disclosure
Users should control disclosure:
```
✅ Good: User-controlled
[▼ Show Advanced Options] ← User clicks to expand
[▲ Hide Advanced Options] ← User clicks to collapse
❌ Bad: Forced progression
Step 1 → Step 2 → Step 3 (can't go back)
```
**Implementation:**
- Persistent state: Remember user's disclosure preferences
- Keyboard shortcuts: Power users want quick access
- Breadcrumbs: Show where user is in hierarchy
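Persistent disclosure state can be sketched in a few lines. A minimal illustration, assuming a hypothetical prefs file location:

```python
import json
from pathlib import Path

PREFS_FILE = Path("disclosure_prefs.json")  # hypothetical location


def load_prefs():
    """Restore which sections the user last left expanded."""
    if PREFS_FILE.exists():
        return json.loads(PREFS_FILE.read_text())
    return {}  # default: everything collapsed


def toggle_section(prefs, section):
    """Flip a section's expanded state and persist the choice."""
    prefs[section] = not prefs.get(section, False)
    PREFS_FILE.write_text(json.dumps(prefs))
    return prefs
```

Because the state survives restarts, a user who expanded "Advanced Options" once is not forced to re-discover it every session.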
---
## UI Patterns
### Pattern: Accordion/Collapsible Sections
```html
<details>
<summary>Basic Configuration</summary>
<p>Essential settings here...</p>
</details>
<details>
<summary>Advanced Configuration</summary>
<p>Advanced settings here...</p>
</details>
```
### Pattern: Tabs
```
┌─────────┬──────────┬──────────┐
│ Basic │ Advanced │ Expert │
├─────────┴──────────┴──────────┤
│ │
│ [Content for selected tab] │
│ │
└────────────────────────────────┘
```
### Pattern: Modal/Dialog
```
Main Screen (Simple)
[Click "Advanced Settings" button]
┌─────────────────────────┐
│ Advanced Settings │
│ │
│ [Complex options here] │
│ │
│ [Cancel] [Apply] │
└─────────────────────────┘
```
### Pattern: Progressive Form
```
Step 1: Basic Info Step 2: Details Step 3: Preferences
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Name: _______ │ → │ Address: ____ │ → │ Theme: [ ] │
│ Email: ______ │ │ Phone: ______ │ │ Notifications: │
│ │ │ │ │ [ ] Email │
│ [Next] │ │ [Back] [Next] │ │ [ ] SMS │
└─────────────────┘ └─────────────────┘ │ [Back] [Finish] │
└─────────────────┘
```
### Pattern: Contextual Help
```
Setting Name [?] ← Hover shows basic help
Hover: "Controls the display theme"
Click [?]: Opens detailed documentation
```
---
## Related Concepts
### Progressive Disclosure → [Dynamic Manifests](./dynamic-manifests.md)
Progressive disclosure = **design philosophy**
Dynamic manifests = **technical implementation**
Example:
- Progressive disclosure says: "Show basic tools first"
- Dynamic manifests implement: Runtime query of available tools based on context
See: [Dynamic Manifests: Configuration](./dynamic-manifests.md#configuration)
### Progressive Disclosure → [Deferred Loading](./deferred-loading.md)
Progressive disclosure = **what to show**
Deferred loading = **when to load**
Example:
- Progressive disclosure: "Advanced feature hidden until clicked"
- Deferred loading: "Advanced feature code loaded on first access"
See: [Deferred Loading: Strategies](./deferred-loading.md#strategies)
### Progressive Disclosure in MCP
MCP Skills use progressive disclosure:
```
User starts → Basic skills available
User works with Python files → Python skills appear
User requests ML feature → ML skills loaded
```
Implemented via:
- Metadata scanning (what's available)
- Lazy loading (when to load)
- Context awareness (what to show)
See: [Best Practices: MCP Applications](../README.md#real-world-applications)
---
## Measurement & Testing
### Key Metrics
Track these to validate progressive disclosure:
| Metric | Good | Bad |
|--------|------|-----|
| Time to first action | < 5s | > 30s |
| Feature discovery rate | > 70% | < 30% |
| User confusion (support tickets) | Decreasing | Increasing |
| Task completion rate | > 85% | < 60% |
### A/B Testing
```
Group A: Everything visible (control)
Group B: Progressive disclosure (test)
Measure:
- Time to complete common task
- Number of clicks
- Error rate
- User satisfaction
```
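Deterministic bucketing keeps group assignment stable across sessions without storing any state — a sketch using a hash of the user ID; the experiment name is an arbitrary label:

```python
import hashlib


def assign_group(user_id, experiment="progressive-disclosure"):
    """Deterministically bucket a user into A (control) or B (test)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"
```

The same user always lands in the same group, so repeat visits see a consistent UI while the two cohorts stay comparable.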
---
## Anti-Patterns
### ❌ Hiding Critical Information
```
❌ Bad: Hide error messages in collapsed section
✅ Good: Show errors prominently, hide resolution steps
```
### ❌ Too Many Levels
```
❌ Bad: Menu → Submenu → Submenu → Submenu → Action
✅ Good: Menu → Submenu → Action (max 3 levels)
```
### ❌ Inconsistent Disclosure
```
❌ Bad: Some settings in tabs, others in accordions, others in modals
✅ Good: Consistent pattern throughout app
```
### ❌ No Visual Cues
```
❌ Bad: Hidden features with no hint they exist
✅ Good: "⚙ Advanced settings" or "▼ Show more"
```
---
## Further Reading
- [Jakob Nielsen: Progressive Disclosure](https://www.nngroup.com/articles/progressive-disclosure/)
- [Information Architecture Basics](https://www.usability.gov/what-and-why/information-architecture.html)
- [Cognitive Load Theory](https://en.wikipedia.org/wiki/Cognitive_load)
---
**Navigation**: [← Back to Best Practices](../README.md) | [Next: Dynamic Manifests →](./dynamic-manifests.md)
**Last Updated**: 2025-10-20

@@ -0,0 +1,362 @@
# Agent Skills Best Practices - Quick Reference
> **Quick access guide** for building efficient, maintainable Claude Code skills. For detailed architectural patterns, see [README.md](./README.md).
**📑 Navigation**: [INDEX.md](./INDEX.md) | [README.md](./README.md) | [Skill Creation Process](./reference/skill-creation-process.md)
---
## 🎯 Progressive Disclosure: Core Principle
**Progressive disclosure is the core design principle that makes Agent Skills flexible and scalable.** Like a well-organized manual that starts with a table of contents, then specific chapters, and finally a detailed appendix, skills let Claude load information only as needed:
| Level | File | Context Window | # Tokens |
|-------|------|----------------|----------|
| **1** | SKILL.md Metadata (YAML) | Always loaded | ~100 |
| **2** | SKILL.md Body (Markdown) | Loaded when Skill triggers | <5k |
| **3+** | Bundled files (text files, scripts, data) | Loaded as-needed by Claude | unlimited |
**Key takeaways:**
- **Level 1 (Metadata)**: ~100 tokens, always in context - make it count!
- **Level 2 (Body)**: <5k tokens, loaded on trigger - keep focused
- **Level 3+ (Bundled)**: Unlimited, loaded as needed - reference from Level 2
**This means:** Your SKILL.md should be a **table of contents and quick reference**, not a comprehensive manual. Link to detailed files that Claude loads only when needed.
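A rough budget check can be scripted with the common ~4-characters-per-token heuristic. This is an approximation for planning only; an exact count requires the model's tokenizer:

```python
def estimate_tokens(text):
    """Rough token estimate: ~4 characters per token for English prose."""
    return len(text) // 4


def check_budget(metadata, body, meta_limit=100, body_limit=5000):
    """Return whether each SKILL.md level fits its token budget."""
    return {
        "metadata": estimate_tokens(metadata) <= meta_limit,
        "body": estimate_tokens(body) <= body_limit,
    }
```

Running this over a draft SKILL.md flags which level is over budget before you ship the skill.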
---
## 📑 Navigation
- **[README.md](./README.md)** - Comprehensive guide with architectural patterns
- **[Progressive Disclosure](./topics/progressive-disclosure.md)** - Design philosophy & UX patterns
- **[Dynamic Manifests](./topics/dynamic-manifests.md)** - Runtime capability discovery
- **[Deferred Loading](./topics/deferred-loading.md)** - Lazy initialization & optimization
---
## ⚡ Quick Start Checklist
Building a new skill? Follow this checklist:
- [ ] **Metadata (Level 1)**: Clear `name` and `description` (~100 tokens total)
- [ ] **Body (Level 2)**: Core instructions under 5k tokens (aim for <2k)
- [ ] **Bundled files (Level 3+)**: Complex details in separate files
- [ ] Move deterministic logic to executable scripts (not generated code)
- [ ] Extract shared utilities to reusable modules
- [ ] Add environment variable support for credentials
- [ ] Include error messages with troubleshooting steps
- [ ] Test with actual Claude usage
---
## 🎯 Core Principles (Summary)
### 1. Progressive Disclosure
Structure in layers:
- **Metadata** (always loaded) → **SKILL.md body** (on trigger) → **Linked files** (as needed)
### 2. Code > Tokens
Use scripts for deterministic tasks (API calls, data processing, calculations)
### 3. Keep SKILL.md Focused
<5k tokens (<2k recommended), scannable, action-oriented
### 4. Reusable Components
Extract shared logic to prevent duplication
### 5. Clear Metadata
Specific description helps Claude know when to trigger
### 6. Error Handling
Provide actionable feedback and troubleshooting steps
### 7. Logical Structure (Respecting Token Limits)
**⚠️ CRITICAL: Reference files MUST be in `/reference/` folder, NOT in root!**
```
skill-name/
├── SKILL.md # Level 1+2: Metadata (~100) + Body (<5k tokens)
├── reference/ # ✅ REQUIRED: Level 3 detailed docs (loaded as-needed)
│ ├── detail1.md # ✅ All .md reference files go HERE
│ └── detail2.md # ✅ NOT in root directory
├── scripts/ # Level 3: Executable code
└── shared/ # Level 3: Reusable utilities
```
**❌ WRONG - Reference files in root:**
```
skill-name/
├── SKILL.md
├── detail1.md # ❌ WRONG! Should be in reference/
├── detail2.md # ❌ WRONG! Should be in reference/
└── scripts/
```
**✅ CORRECT - Reference files in /reference/ folder:**
```
skill-name/
├── SKILL.md
├── reference/
│ ├── detail1.md # ✅ CORRECT!
│ └── detail2.md # ✅ CORRECT!
└── scripts/
```
### 8. Iterate
Test → Monitor → Refine based on actual usage
### 9. Security
No hardcoded secrets, audit third-party skills
### 10. Test
Smoke test scripts, verify with Claude, check error messages
---
## 📝 SKILL.md Template (Token-Aware)
```markdown
---
# Level 1: Metadata (~100 tokens) - Always loaded
name: skill-name
description: Specific description of what this does (triggers skill selection)
version: 1.0.0
---
# Level 2: Body (<5k tokens, <2k recommended) - Loaded on trigger
## When to Use
- Trigger condition 1
- Trigger condition 2
## Quick Start
1. Run `scripts/main.py --arg value`
2. Review output
## Advanced Usage
For complex scenarios, see [reference/advanced.md](./reference/advanced.md)
For API details, see [reference/api-spec.md](./reference/api-spec.md)
# Level 3: Bundled files - Loaded as-needed by Claude
# (Don't embed large content here - link to it!)
```
**Token budget guide:**
- Metadata: ~100 tokens
- Body target: <2k tokens (max 5k)
- If approaching 2k, move details to bundled files
---
## 🚫 Common Pitfalls
| ❌ Don't | ✅ Do |
|----------|-------|
| **Put reference files in root** | **Put reference files in /reference/ folder** |
| Put everything in SKILL.md | Split into focused files (Level 3) |
| Generate code via tokens | Write executable scripts |
| Vague names ("helper-skill") | Specific names ("pdf-form-filler") |
| Hardcode credentials | Use environment variables |
| >5k token SKILL.md body | Keep under 2k tokens (max 5k) |
| >100 token metadata | Concise name + description (~100) |
| Duplicate logic | Extract to shared modules |
| Generic descriptions | Specific trigger keywords |
---
## 🔧 Recommended Structure (Token-Optimized)
**⚠️ MANDATORY: All reference .md files MUST be in `/reference/` folder!**
```
my-skill/
├── SKILL.md # Level 1+2: Metadata (~100) + Body (<2k tokens)
│ # Quick reference + links to Level 3
├── README.md # Human documentation (optional, not loaded)
├── reference/ # ✅ REQUIRED: Level 3 detailed docs (loaded as-needed)
│ ├── api_spec.md # ✅ All detailed .md files go HERE
│ ├── examples.md # ✅ NOT in root directory!
│ └── advanced.md # ✅ Link from SKILL.md as ./reference/file.md
├── scripts/ # Level 3: Executable tools (loaded as-needed)
│ ├── main_tool.py
│ └── helper.py
└── shared/ # Level 3: Reusable components
├── __init__.py
├── config.py # Centralized config
├── api_client.py # API wrapper
└── formatters.py # Output formatting
```
**Key principles:**
1. SKILL.md is the table of contents. Details go in Level 3 files.
2. **ALL reference .md files MUST be in `/reference/` folder**
3. Link to them as `./reference/filename.md` from SKILL.md
---
## 🎨 Metadata Best Practices
### Good Metadata
```yaml
---
name: pdf-form-filler
description: Fill out PDF forms by extracting fields and inserting values
---
```
- Specific about function
- Contains keywords Claude might see
- Clear trigger conditions
### Poor Metadata
```yaml
---
name: pdf-skill
description: A skill for working with PDFs
---
```
- Too generic
- Vague purpose
- Unclear when to trigger
---
## 🛡️ Error Handling Pattern
```python
class AuthenticationError(Exception):
    """Raised when API authentication fails"""
    pass

try:
    client.authenticate()
except AuthenticationError:
    print("❌ Authentication failed")
    print("\nTroubleshooting:")
    print("1. Verify API_KEY environment variable is set")
    print("2. Check API endpoint is accessible")
    print("3. Ensure network connectivity")
```
**Include:**
- Custom exception types
- Clear error messages with context
- Numbered troubleshooting steps
- Graceful degradation when possible
---
## 🔍 When to Use Each Pattern
### Use Progressive Disclosure When:
- Skill has optional advanced features
- Documentation is extensive
- Users have varying expertise levels
- See: [topics/progressive-disclosure.md](./topics/progressive-disclosure.md)
### Use Dynamic Manifests When:
- Capabilities change based on context
- Features depend on user permissions
- Tools should appear/disappear dynamically
- See: [topics/dynamic-manifests.md](./topics/dynamic-manifests.md)
### Use Deferred Loading When:
- Skill has heavy dependencies
- Not all features used every time
- Startup time matters
- See: [topics/deferred-loading.md](./topics/deferred-loading.md)
---
## ✅ Skill Structure Validation Checklist
**Run this checklist BEFORE considering a skill complete:**
- [ ] **Folder Structure**:
  - [ ] `/reference/` folder exists
  - [ ] ALL .md reference files are IN `/reference/` folder
  - [ ] NO .md files in root (except SKILL.md and optional README.md)
  - [ ] `/scripts/` folder exists (if scripts needed)
  - [ ] `/shared/` folder exists (if shared utilities needed)
- [ ] **SKILL.md Structure**:
  - [ ] Metadata section exists (~100 tokens)
  - [ ] Body is <2k tokens (max 5k)
  - [ ] Links to reference files use `./reference/filename.md` format
  - [ ] No large content blocks embedded (moved to /reference/)
- [ ] **Progressive Disclosure**:
  - [ ] Level 1 (metadata) is concise
  - [ ] Level 2 (body) is a table of contents
  - [ ] Level 3 (reference files) contains details
## 📊 Optimization Checklist
- [ ] **Token Efficiency**:
  - Metadata ~100 tokens
  - Body <2k tokens (max 5k)
  - Detailed content in Level 3 files in `/reference/` folder
- [ ] **Code Execution**: Deterministic tasks in scripts
- [ ] **Lazy Loading**: Heavy imports deferred (Level 3)
- [ ] **Caching**: Results cached when appropriate
- [ ] **Shared Utilities**: Common code extracted
- [ ] **Environment Config**: Credentials via env vars
- [ ] **Error Recovery**: Graceful failure handling
- [ ] **Progressive Disclosure**: SKILL.md links to details in `/reference/`, doesn't embed them
- [ ] **Folder Hierarchy**: All reference .md files in `/reference/` folder
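The environment-config and error-recovery items combine naturally. A minimal sketch; the variable name is hypothetical:

```python
import os


class ConfigError(Exception):
    """Raised when required configuration is missing."""


def get_api_key(var="MY_SKILL_API_KEY"):  # hypothetical variable name
    """Read a credential from the environment; never hardcode it."""
    value = os.environ.get(var)
    if not value:
        raise ConfigError(
            f"{var} is not set.\n"
            "Troubleshooting:\n"
            f"1. Run: export {var}=<your-key>\n"
            "2. Restart the session so the variable is picked up"
        )
    return value
```

Failing fast with numbered troubleshooting steps gives Claude (and humans) something actionable instead of a bare traceback.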
---
## 🧪 Testing Workflow
```bash
# 1. Manual smoke test
cd skill-name/scripts
python main_tool.py --test-mode
# 2. Test with Claude
# Prompt: "Use my-skill to process test data"

# 3. Verify checklist
# ✓ Works on first try?
# ✓ Error messages helpful?
# ✓ Claude understands how to use it?
# ✓ No credentials in code?
```
---
## 🛠️ Step-by-Step Process
**Building a new skill?** Follow the systematic process:
**[Skill Creation Process Guide](./reference/skill-creation-process.md)** - Complete walkthrough from planning to deployment
Includes:
- 5-phase process (Planning → Structure → Implementation → Testing → Refinement)
- Full working example: `incident-triage` skill
- Copy-paste templates for all components
- Token optimization at every step
- Adaptation checklist for your use case
---
## 📚 Additional Resources
- [Skill Creation Process](./reference/skill-creation-process.md) - Step-by-step guide with example
- [Anthropic: Equipping Agents with Skills](https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills)
- [Skills Documentation](https://docs.claude.com/en/docs/agents-and-tools/agent-skills/overview)
- [Skills Cookbook](https://github.com/anthropics/claude-cookbooks/tree/main/skills)
- [MCP Official Spec](https://spec.modelcontextprotocol.io/)
---
## 🗺️ Full Documentation
For comprehensive guides on architectural patterns, implementation details, and advanced techniques, see:
**[README.md](./README.md)** - Start here for the complete best practices guide
**Last Updated**: 2025-10-20

# Skill Creation Process: Step-by-Step Guide
> **Use this guide** to systematically build a new Claude Code skill following progressive disclosure principles and token optimization.
**Example Used**: `incident-triage` skill (adapt for your use case)
---
## 📋 Process Overview
```
Phase 1: Planning → Phase 2: Structure → Phase 3: Implementation → Phase 4: Testing → Phase 5: Refinement
(30 min) (15 min) (2-4 hours) (30 min) (ongoing)
```
---
## Phase 1: Planning (30 minutes)
### Step 1.1: Define the Core Problem
**Questions to answer:**
- [ ] What specific, repeatable task does this solve?
- [ ] When should Claude invoke this skill?
- [ ] What are the inputs and outputs?
- [ ] What's the 1-sentence description?
**Example (incident-triage):**
- **Task**: Triage incidents by extracting facts, enriching with data, proposing severity/priority
- **Triggers**: "triage", "new incident", "assign severity", "prioritize ticket"
- **Inputs**: Free text or JSON ticket payload
- **Outputs**: Summary, severity/priority, next steps, assignment hint
- **Description**: "Triage incidents by extracting key facts, enriching with CMDB/log data, and proposing severity, priority, and next actions."
### Step 1.2: Identify the Three Levels
**Level 1: Metadata** (~100 tokens, always loaded)
- [ ] Skill name (kebab-case)
- [ ] Description (triggers Claude's router)
- [ ] Version
**Level 2: SKILL.md Body** (<2k tokens, loaded on trigger)
- [ ] When to Use (2-3 bullet points)
- [ ] What It Does (high-level flow)
- [ ] Inputs/Outputs (contract)
- [ ] Quick Start (1-3 commands)
- [ ] Links to Level 3 docs
**Level 3: Bundled Files** (unlimited, loaded as-needed)
- [ ] Detailed documentation
- [ ] Executable scripts
- [ ] API specs, examples, decision matrices
- [ ] Shared utilities
### Step 1.3: Token Budget Plan
Fill out this table:
| Component | Target Tokens | What Goes Here |
|-----------|--------------|----------------|
| Metadata | ~100 | Name, description, version |
| SKILL.md Body | <2k (aim for 1.5k) | Quick ref, links to Level 3 |
| reference/*.md | 500-1000 each | Detailed docs (as many files as needed) |
| scripts/*.py | n/a | Executable code (not loaded unless run) |
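To sanity-check the budget while writing, a rough word-based estimate is usually enough (English prose averages about 0.75 words per token). A minimal sketch:

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~0.75 words per token for English prose."""
    return round(len(text.split()) / 0.75)

# Example: check a SKILL.md body against the <2k target
body = "word " * 1200  # stand-in for the file contents
if estimate_tokens(body) > 2000:
    print("Over budget - move detail into reference/*.md")
```

Treat the result as a ceiling check, not an exact count; markdown syntax and code blocks tokenize differently from prose.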
---
## Phase 2: Structure (15 minutes)
### Step 2.1: Create Folder Layout
**⚠️ CRITICAL: Create `/reference/` folder and put ALL reference .md files there!**
```bash
# Navigate to skills directory
cd .claude/skills
# Create skill structure
mkdir -p incident-triage/{scripts,reference,shared}
touch incident-triage/SKILL.md
touch incident-triage/scripts/{triage_main.py,enrich_ticket.py,suggest_priority.py,common.py}
touch incident-triage/reference/{inputs-and-prompts.md,decision-matrix.md,runbook-links.md,api-specs.md,examples.md}
touch incident-triage/shared/{config.py,api_client.py,formatters.py}
```
**Verify structure matches this EXACT pattern:**
```
incident-triage/
├── SKILL.md # ✅ Level 1+2 (≤2k tokens) - ONLY .md in root
├── reference/ # ✅ REQUIRED: Level 3 docs folder
│ ├── inputs-and-prompts.md # ✅ All reference .md files go HERE
│ ├── decision-matrix.md # ✅ NOT in root!
│ ├── runbook-links.md
│ ├── api-specs.md
│ └── examples.md
├── scripts/ # Level 3: executable code
│ ├── triage_main.py
│ ├── enrich_ticket.py
│ ├── suggest_priority.py
│ └── common.py
└── shared/ # Level 3: utilities
├── config.py
├── api_client.py
└── formatters.py
```
**❌ WRONG - DO NOT DO THIS:**
```
incident-triage/
├── SKILL.md
├── inputs-and-prompts.md # ❌ WRONG! Should be in reference/
├── decision-matrix.md # ❌ WRONG! Should be in reference/
└── scripts/
```
### Step 2.2: Stub Out Files
Create minimal stubs for each file to establish contracts:
**SKILL.md** (copy template from best-practices.md)
**reference/*.md** (headers only for now)
**scripts/*.py** (function signatures with pass)
**shared/*.py** (class/function signatures)
### Step 2.3: Validate Folder Structure
**Run this validation BEFORE moving to Phase 3:**
```bash
# Check structure
ls -la incident-triage/
# Verify:
# ✅ SKILL.md exists in root
# ✅ reference/ folder exists
# ✅ NO .md files in root except SKILL.md
# ✅ scripts/ folder exists (if needed)
# ✅ shared/ folder exists (if needed)
# Check reference folder
ls -la incident-triage/reference/
# Verify:
# ✅ All .md reference files are HERE
# ✅ inputs-and-prompts.md
# ✅ decision-matrix.md
# ✅ api-specs.md
# ✅ examples.md
```
**Checklist:**
- [ ] `/reference/` folder created
- [ ] All reference .md files in `/reference/` (not root)
- [ ] SKILL.md links use `./reference/filename.md` format
- [ ] No .md files in root except SKILL.md
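The same checks can be automated so a broken layout fails fast. A minimal sketch using only `pathlib` (the skill path is whatever you pass in):

```python
from pathlib import Path

def validate_structure(skill_dir: str) -> list[str]:
    """Return a list of structure problems (empty list = valid)."""
    root = Path(skill_dir)
    problems = []
    if not (root / "SKILL.md").is_file():
        problems.append("SKILL.md missing from root")
    if not (root / "reference").is_dir():
        problems.append("reference/ folder missing")
    # No .md files in root except SKILL.md
    for md in root.glob("*.md"):
        if md.name != "SKILL.md":
            problems.append(f"{md.name} should live in reference/")
    return problems
```

Run it from `.claude/skills` against each skill folder and treat a non-empty list as a failure before moving to Phase 3.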
---
## Phase 3: Implementation (2-4 hours)
Work in this order to maintain focus and avoid scope creep:
### Step 3.1: Write Level 1 (Metadata) - 5 minutes
Open `SKILL.md` and write the frontmatter:
```yaml
---
name: incident-triage
description: Triage incidents by extracting key facts, enriching with CMDB/log data, and proposing severity, priority, and next actions.
version: 1.0.0
---
```
**Checklist:**
- [ ] Name is clear and specific (not "helper" or "utility")
- [ ] Description contains trigger keywords
- [ ] Description explains what it does (not what it is)
- [ ] Total metadata ≤100 tokens
### Step 3.2: Write Level 2 (SKILL.md Body) - 30 minutes
Follow this exact structure:
````markdown
# Level 2: Body (<2k tokens recommended) — Loaded when the skill triggers
## When to Use
- [Trigger condition 1]
- [Trigger condition 2]
- [Trigger condition 3]
## What It Does (at a glance)
- **[Action 1]**: [brief description]
- **[Action 2]**: [brief description]
- **[Action 3]**: [brief description]
- **[Action 4]**: [brief description]
## Inputs
- [Input format 1]
- [Input format 2]
Details: see [reference/inputs-and-prompts.md](./reference/inputs-and-prompts.md).
## Quick Start
1. **Dry-run** (no external calls):
   ```bash
   python scripts/main.py --example --dry-run
   ```
2. **With enrichment**:
   ```bash
   python scripts/main.py --ticket-id 12345 --include-logs
   ```
3. Review output
Examples: [reference/examples.md](./reference/examples.md)
## Decision Logic (high-level)
[2-3 sentences on how decisions are made]
Full details: [reference/decision-matrix.md](./reference/decision-matrix.md)
## Outputs (contract)
- `field1`: [description]
- `field2`: [description]
- `field3`: [description]
## Guardrails
- [Security consideration 1]
- [Token budget note]
- [Error handling approach]
## Links (Level 3, loaded only when needed)
- Prompts: [reference/inputs-and-prompts.md](./reference/inputs-and-prompts.md)
- Decision logic: [reference/decision-matrix.md](./reference/decision-matrix.md)
- Examples: [reference/examples.md](./reference/examples.md)
- API specs: [reference/api-specs.md](./reference/api-specs.md)
## Triggers (help the router)
Keywords: [keyword1], [keyword2], [keyword3]
Inputs containing: [field1], [field2]
## Security & Config
Set environment variables:
- `VAR1_API_KEY`
- `VAR2_API_KEY`
Centralized in `shared/config.py`. Never echo secrets.
## Testing
```bash
# Smoke test
python scripts/main.py --fixture reference/examples.md
# End-to-end
python scripts/main.py --text "Example input" --dry-run
```
````
**Checklist:**
- [ ] <2k tokens (aim for 1.5k)
- [ ] Links to Level 3 for details
- [ ] Quick Start is copy-paste ready
- [ ] Output contract is clear
- [ ] No extensive examples or specs embedded
### Step 3.3: Write Level 3 Reference Docs - 45 minutes
Create each reference file systematically:
#### reference/inputs-and-prompts.md
````markdown
# Inputs and Prompt Shapes
## Input Format 1: Free Text
- Description
- Example
## Input Format 2: Structured JSON
```json
{
  "field": "value"
}
```
## Prompt Snippets
- Extraction goals
- Summarization style
- Redaction rules
````
#### reference/decision-matrix.md
```markdown
# Decision Matrix
[Full decision logic with tables, formulas, edge cases]
## Base Matrix
| Dimension 1 \ Dimension 2 | Value A | Value B | Value C |
|---|---|---|---|
| Low | Result | Result | Result |
| Med | Result | Result | Result |
| High | Result | Result | Result |
## Adjustments
- Adjustment rule 1
- Adjustment rule 2
## Rationale
[Why this matrix, examples, edge cases]
```
#### reference/api-specs.md
```markdown
# API Specs & Schemas
## API 1: CMDB
- Base URL: `{SERVICE_MAP_URL}`
- Auth: Header `X-API-Key: {CMDB_API_KEY}`
- Endpoints:
- GET `/service/{name}/dependencies`
- Response schema: [...]
## API 2: Logs
- Base URL: [...]
- Endpoints: [...]
```
#### reference/examples.md
````markdown
# Examples
## Example 1: [Scenario Name]
**Input:**
```
[Example input]
```
**Output:**
```
[Example output with all fields]
```
**Explanation:** [Why these decisions were made]
## Example 2: [Another Scenario]
[...]
````
#### reference/runbook-links.md
```markdown
# Runbook Links
- [Service 1]: <URL>
- [Service 2]: <URL>
- [Escalation tree]: <URL>
```
**Checklist for all reference docs:**
- [ ] Each file focuses on one aspect
- [ ] 500-1000 tokens per file (can be more if needed)
- [ ] Referenced from SKILL.md but not embedded
- [ ] Includes examples where helpful
### Step 3.4: Write Shared Utilities - 30 minutes
#### shared/config.py
```python
"""Centralized configuration from environment variables."""
import os
class Config:
"""Config object - never logs secrets"""
CMDB_API_KEY = os.getenv("CMDB_API_KEY")
LOGS_API_KEY = os.getenv("LOGS_API_KEY")
    LOGS_API_URL = os.getenv("LOGS_API_URL")
SERVICE_MAP_URL = os.getenv("SERVICE_MAP_URL")
DASHBOARD_BASE_URL = os.getenv("DASHBOARD_BASE_URL")
@classmethod
def validate(cls):
"""Check required env vars are set"""
missing = []
for key in ["CMDB_API_KEY", "LOGS_API_KEY"]:
if not getattr(cls, key):
missing.append(key)
if missing:
raise ValueError(f"Missing required env vars: {missing}")
cfg = Config()
```
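The fail-fast behavior of `validate()` is worth exercising before wiring it into the entry point. A self-contained sketch of the same pattern (class and env-var names mirror the template above for illustration):

```python
import os

class DemoConfig:
    """Same fail-fast shape as shared/config.py above."""
    REQUIRED = ["CMDB_API_KEY", "LOGS_API_KEY"]

    @classmethod
    def validate(cls):
        missing = [k for k in cls.REQUIRED if not os.getenv(k)]
        if missing:
            raise ValueError(f"Missing required env vars: {missing}")
```

Call `validate()` at the top of the entry-point script so a missing key fails immediately with a clear message instead of surfacing mid-triage as an opaque API error.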
#### shared/api_client.py
```python
"""API client wrappers."""
import requests
from .config import cfg
class CMDBClient:
def __init__(self):
self.base_url = cfg.SERVICE_MAP_URL
self.headers = {"X-API-Key": cfg.CMDB_API_KEY}
def get_service_dependencies(self, service_name):
"""Fetch service dependencies"""
try:
resp = requests.get(
f"{self.base_url}/service/{service_name}/dependencies",
headers=self.headers,
timeout=5
)
resp.raise_for_status()
return resp.json()
except requests.RequestException as e:
raise ConnectionError(f"CMDB API failed: {e}")
class LogsClient:
def __init__(self):
self.base_url = cfg.LOGS_API_URL
self.headers = {"Authorization": f"Bearer {cfg.LOGS_API_KEY}"}
def recent_errors(self, service_name, last_minutes=15):
"""Fetch recent error logs"""
# Implementation
pass
def cmdb_client():
return CMDBClient()
def logs_client():
return LogsClient()
```
#### shared/formatters.py
```python
"""Output formatting helpers."""
def format_output(enriched, severity, priority, rationale, next_steps):
"""Format triage result as markdown."""
lines = [
"### Incident Triage Result",
f"**Severity**: {severity} | **Priority**: {priority}",
f"**Rationale**: {rationale}",
"",
"**Summary**:",
enriched.get("summary", "N/A"),
"",
"**Next Steps**:",
]
for i, step in enumerate(next_steps, 1):
lines.append(f"{i}. {step}")
if "evidence" in enriched:
lines.extend(["", "**Evidence**:"])
for link in enriched["evidence"]:
lines.append(f"- {link}")
return "\n".join(lines)
```
### Step 3.5: Write Main Scripts - 1 hour
#### scripts/triage_main.py (entry point)
```python
#!/usr/bin/env python3
"""Main entry point for incident triage."""
import argparse
import json
import sys
from pathlib import Path
# Add parent to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent))
from shared.config import cfg
from shared.formatters import format_output
from scripts.enrich_ticket import enrich
from scripts.suggest_priority import score
def main():
parser = argparse.ArgumentParser(description="Triage an incident")
parser.add_argument("--text", help="Free-text incident description")
parser.add_argument("--ticket-id", help="Ticket ID to enrich")
parser.add_argument("--include-logs", action="store_true")
parser.add_argument("--include-cmdb", action="store_true")
parser.add_argument("--dry-run", action="store_true",
help="Skip external API calls")
args = parser.parse_args()
# Validate inputs
if not args.text and not args.ticket_id:
print("Error: Provide --text or --ticket-id")
sys.exit(1)
# Build payload
payload = {
"text": args.text,
"ticket_id": args.ticket_id
}
try:
# Enrich (respects --dry-run)
enriched = enrich(
payload,
include_logs=args.include_logs and not args.dry_run,
include_cmdb=args.include_cmdb and not args.dry_run
)
# Score (deterministic)
severity, priority, rationale = score(enriched)
# Generate next steps
next_steps = generate_next_steps(enriched, severity)
# Format output
output = format_output(enriched, severity, priority, rationale, next_steps)
print(output)
except Exception as e:
print(f"❌ Triage failed: {e}")
print("\nTroubleshooting:")
print("1. Check environment variables are set")
print("2. Verify API endpoints are accessible")
print("3. Run with --dry-run to test without external calls")
sys.exit(1)
def generate_next_steps(enriched, severity):
"""Generate action items based on enrichment and severity"""
steps = []
if severity in ["SEV1", "SEV2"]:
steps.append("Page on-call immediately")
if "dashboard_url" in enriched:
steps.append(f"Review dashboard: {enriched['dashboard_url']}")
steps.append("Compare last 15m vs 24h baseline")
if enriched.get("recent_deploy"):
steps.append("Consider rollback if error budget breached")
return steps
if __name__ == "__main__":
main()
```
#### scripts/enrich_ticket.py
```python
"""Enrich ticket with external data."""
from shared.config import cfg
from shared.api_client import cmdb_client, logs_client
def enrich(payload, include_logs=False, include_cmdb=False):
"""
Enrich ticket payload with CMDB/logs data.
Args:
payload: Dict with 'text' and/or 'ticket_id'
include_logs: Fetch recent logs
include_cmdb: Fetch CMDB dependencies
Returns:
Dict with original payload + enrichment
"""
result = {"input": payload}
# Extract service name from text or ticket
service = extract_service(payload)
if service:
result["service"] = service
# Enrich with CMDB
if include_cmdb and service:
try:
cmdb_data = cmdb_client().get_service_dependencies(service)
result["cmdb"] = cmdb_data
result["blast_radius"] = cmdb_data.get("dependent_services", [])
except Exception as e:
result["cmdb_error"] = str(e)
# Enrich with logs
if include_logs and service:
try:
logs = logs_client().recent_errors(service)
result["logs"] = logs
except Exception as e:
result["logs_error"] = str(e)
# Derive scope/impact hints
result["scope"] = derive_scope(result)
result["impact"] = derive_impact(result)
return result
def extract_service(payload):
"""Extract service name from payload."""
# Check explicit service field
if "service" in payload:
return payload["service"]
# Parse from text (simple keyword matching)
text = payload.get("text", "").lower()
known_services = ["checkout", "payments", "inventory", "auth"]
for service in known_services:
if service in text:
return service
return None
def derive_scope(enriched):
"""Determine blast radius scope."""
blast_radius = len(enriched.get("blast_radius", []))
if blast_radius == 0:
return "single-service"
elif blast_radius < 3:
return "few-services"
else:
return "multi-service"
def derive_impact(enriched):
"""Estimate user impact level."""
# Check for explicit impact data
if "impact" in enriched.get("input", {}):
pct = enriched["input"]["impact"].get("users_affected_pct", 0)
if pct > 50:
return "high"
elif pct > 10:
return "medium"
else:
return "low"
# Infer from service criticality
service = enriched.get("service", "")
critical_services = ["checkout", "payments", "auth"]
if service in critical_services:
return "medium" # Default to medium for critical services
return "low"
```
#### scripts/suggest_priority.py
```python
"""Deterministic severity/priority scoring."""
DECISION_MATRIX = {
# (impact, scope) -> (severity, priority)
("low", "single-service"): ("SEV4", "P4"),
("low", "few-services"): ("SEV3", "P3"),
("low", "multi-service"): ("SEV3", "P3"),
("medium", "single-service"): ("SEV3", "P3"),
("medium", "few-services"): ("SEV2", "P2"),
("medium", "multi-service"): ("SEV2", "P2"),
("high", "single-service"): ("SEV2", "P2"),
("high", "few-services"): ("SEV1", "P1"),
("high", "multi-service"): ("SEV1", "P1"),
}
def score(enriched):
"""
Score incident severity and priority.
Args:
enriched: Dict from enrich_ticket()
Returns:
Tuple of (severity, priority, rationale)
"""
impact = enriched.get("impact", "medium")
scope = enriched.get("scope", "single-service")
# Base score from matrix
key = (impact, scope)
if key not in DECISION_MATRIX:
# Default fallback
severity, priority = "SEV3", "P3"
rationale = f"Default scoring (impact={impact}, scope={scope})"
else:
severity, priority = DECISION_MATRIX[key]
rationale = f"{impact.title()} impact, {scope} scope"
# Apply adjustments
if should_escalate(enriched):
severity, priority = escalate(severity, priority)
rationale += " (escalated: long recovery expected)"
return severity, priority, rationale
def should_escalate(enriched):
"""Check if incident should be escalated."""
# Check for long recovery indicators
logs = enriched.get("logs", {})
if logs.get("error_rate_increasing"):
return True
# Check for repeated incidents
if enriched.get("recent_incidents_count", 0) > 3:
return True
return False
def escalate(severity, priority):
"""Escalate severity/priority by one level."""
sev_map = {"SEV4": "SEV3", "SEV3": "SEV2", "SEV2": "SEV1", "SEV1": "SEV1"}
pri_map = {"P4": "P3", "P3": "P2", "P2": "P1", "P1": "P1"}
return sev_map.get(severity, severity), pri_map.get(priority, priority)
```
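Phase 4 invokes `suggest_priority.py --test`, so the script needs a self-test entry point. One way to sketch it is a handful of assertions over the matrix and the escalation maps (matrix inlined here for illustration; in the real script this would sit under `if __name__ == "__main__"` behind an `argparse` `--test` flag):

```python
# Inlined copy of a few matrix entries for the self-test sketch
DECISION_MATRIX = {
    ("high", "multi-service"): ("SEV1", "P1"),
    ("medium", "few-services"): ("SEV2", "P2"),
    ("low", "single-service"): ("SEV4", "P4"),
}

def run_self_test():
    assert DECISION_MATRIX[("high", "multi-service")] == ("SEV1", "P1")
    assert DECISION_MATRIX[("low", "single-service")] == ("SEV4", "P4")
    # Escalation must never go past SEV1/P1
    sev_map = {"SEV4": "SEV3", "SEV3": "SEV2", "SEV2": "SEV1", "SEV1": "SEV1"}
    assert sev_map["SEV1"] == "SEV1"
    print("self-test passed")

run_self_test()
```

Because scoring is deterministic, these tests need no mocks or network access and can run in CI on every change to the matrix.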
---
## Phase 4: Testing (30 minutes)
### Step 4.1: Create Test Fixtures
Create `reference/test-fixtures.json`:
```json
{
"test1": {
"text": "Checkout API seeing 500 errors at 12%; started 15:05Z",
"expected_severity": "SEV2",
"expected_priority": "P2"
},
"test2": {
"text": "Single user reports login issue on mobile app",
"expected_severity": "SEV4",
"expected_priority": "P4"
}
}
```
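A small runner can replay those fixtures against the scorer. A sketch with the JSON inlined and a stub `triage` function (swap in `enrich()` + `score()` from the real scripts; the stub's keyword match is purely illustrative):

```python
import json

FIXTURES = json.loads("""
{
  "test1": {"text": "Checkout API seeing 500 errors at 12%; started 15:05Z",
            "expected_severity": "SEV2", "expected_priority": "P2"},
  "test2": {"text": "Single user reports login issue on mobile app",
            "expected_severity": "SEV4", "expected_priority": "P4"}
}
""")

def triage(text):
    """Stub - replace with enrich() + score() from the real pipeline."""
    return ("SEV2", "P2") if "checkout" in text.lower() else ("SEV4", "P4")

failures = []
for name, case in FIXTURES.items():
    sev, pri = triage(case["text"])
    if (sev, pri) != (case["expected_severity"], case["expected_priority"]):
        failures.append(name)
print("failures:", failures)
```

In the real runner, load `reference/test-fixtures.json` from disk instead of the inline string, and exit non-zero when `failures` is non-empty.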
### Step 4.2: Run Tests
```bash
# 1. Smoke test deterministic components
python scripts/suggest_priority.py --test
# 2. Dry-run end-to-end
python scripts/triage_main.py --text "API timeouts on checkout" --dry-run
# 3. With enrichment (requires env vars)
export CMDB_API_KEY="test_key"
export LOGS_API_KEY="test_key"
python scripts/triage_main.py --ticket-id 12345 --include-logs --include-cmdb
```
### Step 4.3: Test with Claude
Ask Claude:
```
"I have a new incident: checkout API showing 500 errors affecting 15% of users in EU region. Can you triage this?"
```
Verify:
- [ ] Skill triggers correctly
- [ ] Output is well-formatted
- [ ] Severity/priority makes sense
- [ ] Next steps are actionable
- [ ] Links work
---
## Phase 5: Refinement (Ongoing)
### Step 5.1: Token Count Audit
```bash
# Count tokens in SKILL.md body (exclude metadata)
wc -w incident-triage/SKILL.md
# Divide word count by 0.75 (~1.33 tokens per word) for a rough token count
```
**Checklist:**
- [ ] Metadata ~100 tokens
- [ ] Body <2k tokens
- [ ] If over, move content to reference/*.md
### Step 5.2: Real-World Usage Monitoring
Track these metrics:
- [ ] Does Claude trigger the skill appropriately?
- [ ] Are users getting helpful results?
- [ ] What questions/errors come up?
- [ ] Which Level 3 docs are never used?
### Step 5.3: Iterate Based on Feedback
**If skill triggers too often:**
→ Make description more specific
**If skill triggers too rarely:**
→ Add more trigger keywords
**If output is unhelpful:**
→ Improve decision logic or examples
**If token limit exceeded:**
→ Move more content to Level 3
---
## 🎓 Adaptation Checklist
To create YOUR skill from this template:
- [ ] **Folder Structure** (CRITICAL):
- [ ] Create `/reference/` folder
- [ ] Put ALL reference .md files IN `/reference/` folder
- [ ] NO .md files in root except SKILL.md
- [ ] Links in SKILL.md use `./reference/filename.md` format
- [ ] **Rename**: Replace "incident-triage" with your skill name
- [ ] **Metadata**: Write name/description with your trigger keywords
- [ ] **Triggers**: List all keywords/patterns that should invoke your skill
- [ ] **Inputs/Outputs**: Define your specific contract
- [ ] **Scripts**: Replace enrichment/scoring with your logic
- [ ] **Reference docs**: Create docs for your domain (decision matrices, API specs, etc.)
- [ ] **Config**: Add your required environment variables
- [ ] **Examples**: Create 3-5 realistic examples
- [ ] **Test**: Dry-run → with real data → with Claude
- [ ] **Validate Structure**: Run structure validation checklist
- [ ] **Refine**: Monitor usage, iterate based on feedback
---
## 📚 Related Resources
- [Agent Skills Best Practices](../best-practices.md) - Quick reference
- [Progressive Disclosure](../topics/progressive-disclosure.md) - Design philosophy
- [Token Optimization](../README.md#token-optimized-structure) - Token limits explained
---
**Last Updated**: 2025-10-20
**Version**: 1.0.0