Initial commit
247
skills/skill-creator/patterns/data-processing.md
Normal file
@@ -0,0 +1,247 @@
# Data Processing Skill Pattern

Use this pattern when your skill **processes, analyzes, or transforms** data to extract insights.

## When to Use

- Skill ingests data from files or APIs
- Performs analysis or transformation
- Generates insights, reports, or visualizations
- Examples: cc-insights (conversation analysis)

## Structure

### Data Flow Architecture

Define a clear data pipeline:

```
Input Sources → Processing → Storage → Query/Analysis → Output
```

Example:

```
JSONL files → Parser → SQLite + Vector DB → Search/Analytics → Reports/Dashboard
```

### Processing Modes

**Batch Processing:**
- Process all data at once
- Good for: Initial setup, complete reprocessing
- Trade-off: Slow startup, complete data

**Incremental Processing:**
- Process only new/changed data
- Good for: Regular updates, performance
- Trade-off: Complex state tracking

**Streaming Processing:**
- Process data as it arrives
- Good for: Real-time updates
- Trade-off: Complex implementation

### Storage Strategy

Choose appropriate storage:

**SQLite:**
- Structured metadata
- Fast queries
- Relational data
- Good for: Indexes, aggregations

**Vector Database (ChromaDB):**
- Semantic embeddings
- Similarity search
- Good for: RAG, semantic queries

**File System:**
- Raw data
- Large blobs
- Good for: Backups, archives

## Example: CC Insights

**Input**: Claude Code conversation JSONL files

**Processing Pipeline:**
1. JSONL Parser - Decode base64, extract messages
2. Metadata Extractor - Timestamps, files, tools
3. Embeddings Generator - Vector representations
4. Pattern Detector - Identify trends

**Storage:**
- SQLite: Conversation metadata, fast queries
- ChromaDB: Vector embeddings, semantic search
- Cache: Processed conversation data

**Query Interfaces:**
1. CLI Search - Command-line semantic search
2. Insight Generator - Pattern-based reports
3. Dashboard - Interactive web UI

**Outputs:**
- Search results with similarity scores
- Weekly activity reports
- File heatmaps
- Tool usage analytics

## Data Processing Workflow

### Phase 1: Ingestion
```markdown
1. **Discover Data Sources**
   - Locate input files/APIs
   - Validate accessibility
   - Calculate scope (file count, size)

2. **Initial Validation**
   - Check format validity
   - Verify schema compliance
   - Estimate processing time

3. **State Management**
   - Track what's been processed
   - Support incremental updates
   - Handle failures gracefully
```

### Phase 2: Processing
```markdown
1. **Parse/Transform**
   - Read raw data
   - Apply transformations
   - Handle errors and edge cases

2. **Extract Features**
   - Generate metadata
   - Calculate metrics
   - Create embeddings (if semantic search)

3. **Store Results**
   - Write to database(s)
   - Update indexes
   - Maintain consistency
```

### Phase 3: Analysis
```markdown
1. **Query Interface**
   - Support multiple query types
   - Optimize for common patterns
   - Return ranked results

2. **Pattern Detection**
   - Aggregate data
   - Identify trends
   - Generate insights

3. **Visualization**
   - Format for human consumption
   - Support multiple output formats
   - Interactive when possible
```

## Performance Characteristics

Document expected performance:

```markdown
### Performance Characteristics

- **Initial indexing**: ~1-2 minutes for 100 records
- **Incremental updates**: <5 seconds for new records
- **Search latency**: <1 second for queries
- **Report generation**: <10 seconds for standard reports
- **Memory usage**: ~200MB for 1000 records
```

## Best Practices

1. **Incremental Processing**: Don't reprocess everything on each run
2. **State Tracking**: Track what's been processed to avoid duplicates
3. **Batch Operations**: Process in batches for memory efficiency
4. **Progress Indicators**: Show progress for long operations
5. **Error Recovery**: Handle failures gracefully, resume where left off
6. **Data Validation**: Validate inputs before expensive processing
7. **Index Optimization**: Optimize databases for common queries
8. **Memory Management**: Stream large files, don't load everything
9. **Parallel Processing**: Use parallelism when possible
10. **Cache Wisely**: Cache expensive computations

## Scripts Structure

For data processing skills, provide helper scripts:

```
scripts/
├── processor.py   # Main data processing script
├── indexer.py     # Build indexes/embeddings
├── query.py       # Query interface (CLI)
└── generator.py   # Report/insight generation
```

### Script Best Practices

```python
# Good patterns for processing scripts, combined into one sketch:

import logging

import click           # 1. Use click for CLI
from tqdm import tqdm  # 2. Show progress

logger = logging.getLogger(__name__)


@click.command()
@click.option('--input', 'input_path', help='Input path')
@click.option('--reindex', is_flag=True, help='Reprocess everything')
def process(input_path, reindex):
    """Process data from input source."""
    items = load_items(input_path)

    # 4. Support incremental updates
    if not reindex:
        items = [item for item in items if not is_already_processed(item)]

    # 5. Use batch processing
    for batch in chunks(items, batch_size=32):
        # 2. Show progress
        for item in tqdm(batch, desc="Processing"):
            # 3. Handle errors gracefully
            try:
                process_item(item)
            except Exception as e:
                logger.error(f"Failed to process {item}: {e}")
                continue  # Continue with next item
```

(`load_items`, `is_already_processed`, `chunks`, and `process_item` are placeholders for the skill's own helpers.)

## Storage Schema

Document your data schema:

```sql
-- Example SQLite schema
CREATE TABLE conversations (
    id TEXT PRIMARY KEY,
    timestamp INTEGER,
    message_count INTEGER,
    files_modified TEXT,  -- JSON array
    tools_used TEXT       -- JSON array
);

CREATE INDEX idx_timestamp ON conversations(timestamp);
CREATE INDEX idx_files ON conversations(files_modified);
```

## Output Formats

Support multiple output formats:

1. **Markdown**: Human-readable reports
2. **JSON**: Machine-readable for integration
3. **CSV**: Spreadsheet-compatible data
4. **HTML**: Styled reports with charts
5. **Interactive**: Web dashboards (optional)
78
skills/skill-creator/patterns/mode-based.md
Normal file
@@ -0,0 +1,78 @@
# Mode-Based Skill Pattern

Use this pattern when your skill has **multiple distinct operating modes** based on user intent.

## When to Use

- Skill performs fundamentally different operations based on context
- Each mode has its own workflow and outputs
- User intent determines which mode to activate
- Examples: git-worktree-setup (single/batch/cleanup/list modes)

## Structure

### Quick Decision Matrix

Create a clear mapping of user requests to modes:

```
User Request         → Mode   → Action
───────────────────────────────────────────────────────────
"trigger phrase 1"   → Mode 1 → High-level action
"trigger phrase 2"   → Mode 2 → High-level action
"trigger phrase 3"   → Mode 3 → High-level action
```

### Mode Detection Logic

Provide clear logic for mode selection:

```javascript
// Mode 1: [Name]
if (userMentions("keyword1", "keyword2")) {
  return "mode1-name";
}

// Mode 2: [Name]
if (userMentions("keyword3", "keyword4")) {
  return "mode2-name";
}

// Ambiguous - ask user
return askForClarification();
```

### Separate Mode Documentation

For complex skills, create separate files for each mode:

```
skill-name/
├── SKILL.md            # Overview and mode detection
├── modes/
│   ├── mode1-name.md   # Detailed workflow for mode 1
│   ├── mode2-name.md   # Detailed workflow for mode 2
│   └── mode3-name.md   # Detailed workflow for mode 3
```

## Example: Git Worktree Setup

**Modes:**
1. Single Worktree - Create one worktree
2. Batch Worktrees - Create multiple worktrees
3. Cleanup - Remove worktrees
4. List/Manage - Show worktree status

**Detection Logic:**
- "create worktree for X" → Single mode
- "create worktrees for A, B, C" → Batch mode
- "remove worktree" → Cleanup mode
- "list worktrees" → List mode

## Best Practices

1. **Clear Mode Boundaries**: Each mode should be distinct and non-overlapping
2. **Explicit Detection**: Provide clear rules for mode selection
3. **Clarification Path**: Always have a fallback to ask the user when ambiguous
4. **Mode Independence**: Each mode should work standalone
5. **Shared Prerequisites**: Extract common validation to reduce duplication
115
skills/skill-creator/patterns/phase-based.md
Normal file
@@ -0,0 +1,115 @@
# Phase-Based Skill Pattern

Use this pattern when your skill follows **sequential phases** that build on each other.

## When to Use

- Skill has a linear workflow with clear stages
- Each phase depends on the previous phase
- Progressive disclosure of complexity
- Examples: codebase-auditor (discovery → analysis → reporting → remediation)

## Structure

### Phase Overview

Define clear phases with dependencies:

```
Phase 1: Discovery
    ↓
Phase 2: Analysis
    ↓
Phase 3: Reporting
    ↓
Phase 4: Action/Remediation
```

### Phase Workflow Template

```markdown
## Workflow

### Phase 1: [Name]

**Purpose**: [What this phase accomplishes]

**Steps:**
1. [Step 1]
2. [Step 2]
3. [Step 3]

**Output**: [What information is produced]

**Transition**: [When to move to next phase]

### Phase 2: [Name]

**Purpose**: [What this phase accomplishes]

**Inputs**: [Required from previous phase]

**Steps:**
1. [Step 1]
2. [Step 2]

**Output**: [What information is produced]
```

## Example: Codebase Auditor

**Phase 1: Initial Assessment** (Progressive Disclosure)
- Lightweight scan to understand codebase
- Identify tech stack and structure
- Quick health check
- **Output**: Project profile and initial findings

**Phase 2: Deep Analysis** (Load on Demand)
- Based on Phase 1, perform targeted analysis
- Code quality, security, testing, etc.
- **Output**: Detailed findings with severity

**Phase 3: Report Generation**
- Aggregate findings from Phase 2
- Calculate scores and metrics
- **Output**: Comprehensive audit report

**Phase 4: Remediation Planning**
- Prioritize findings by severity
- Generate action plan
- **Output**: Prioritized task list

## Best Practices

1. **Progressive Disclosure**: Start lightweight, go deep only when needed
2. **Clear Transitions**: Explicitly state when moving between phases
3. **Phase Independence**: Each phase should have clear inputs/outputs
4. **Checkpoint Validation**: Verify prerequisites before advancing
5. **Early Exit**: Allow stopping after any phase if the user only needs partial analysis
6. **Incremental Value**: Each phase should provide standalone value

## Phase Characteristics

### Discovery Phase
- Fast and lightweight
- Gather context and identify scope
- No expensive operations
- Output guides subsequent phases

### Analysis Phase
- Deep dive based on discovery
- Resource-intensive operations
- Parallel processing when possible
- Structured output for reporting

### Reporting Phase
- Aggregate and synthesize data
- Calculate metrics and scores
- Generate human-readable output
- Support multiple formats

### Action Phase
- Provide recommendations
- Generate implementation guidance
- Offer to perform actions
- Track completion
174
skills/skill-creator/patterns/validation.md
Normal file
@@ -0,0 +1,174 @@
# Validation/Audit Skill Pattern

Use this pattern when your skill **validates, audits, or checks** artifacts against standards.

## When to Use

- Skill checks compliance against defined standards
- Detects issues and provides remediation guidance
- Generates reports with severity levels
- Examples: claude-md-auditor, codebase-auditor

## Structure

### Validation Sources

Clearly define what you're validating against:

```markdown
## Validation Sources

### 1. ✅ Official Standards
- **Source**: [Authority/documentation]
- **Authority**: Highest (requirements)
- **Examples**: [List key standards]

### 2. 💡 Best Practices
- **Source**: Community/field experience
- **Authority**: Medium (recommendations)
- **Examples**: [List practices]

### 3. 🔬 Research/Optimization
- **Source**: Academic research
- **Authority**: Medium (evidence-based)
- **Examples**: [List findings]
```

### Finding Structure

Use a consistent structure for all findings:

```markdown
**Severity**: Critical | High | Medium | Low
**Category**: [Type of issue]
**Location**: [File:line or context]
**Description**: [What the issue is]
**Impact**: [Why it matters]
**Remediation**: [How to fix]
**Effort**: [Time estimate]
**Source**: Official | Community | Research
```

### Severity Levels

Define clear severity criteria:

- **Critical**: Security risk, production-blocking (fix immediately)
- **High**: Significant quality issue (fix this sprint)
- **Medium**: Moderate improvement (schedule for next quarter)
- **Low**: Minor optimization (backlog)

### Score Calculation

Provide quantitative scoring:

```
Overall Health Score (0-100):
- 90-100: Excellent
- 75-89:  Good
- 60-74:  Fair
- 40-59:  Poor
- 0-39:   Critical

## Example: CLAUDE.md Auditor

**Validation Against:**
1. Official Anthropic documentation (docs.claude.com)
2. Community best practices (field experience)
3. Academic research (LLM context optimization)

**Finding Categories:**
- Security (secrets, sensitive data)
- Official Compliance (Anthropic guidelines)
- Best Practices (community recommendations)
- Structure (organization, formatting)

**Output Modes:**
1. Audit Report - Detailed findings
2. JSON Report - Machine-readable for CI/CD
3. Refactored File - Production-ready output

## Validation Workflow

### Step 1: Discovery
- Locate target artifact(s)
- Calculate metrics (size, complexity)
- Read content for analysis

### Step 2: Analysis
Run validators in priority order:
1. Security Validation (CRITICAL)
2. Official Compliance
3. Best Practices
4. Optimization Opportunities

### Step 3: Scoring
- Calculate overall health score
- Generate category-specific scores
- Count findings by severity

### Step 4: Reporting
- Generate human-readable report
- Provide machine-readable output
- Offer remediation options

## Best Practices

1. **Prioritize Security**: Always check security first
2. **Source Attribution**: Label each finding with its source
3. **Actionable Remediation**: Provide specific fix instructions
4. **Multiple Output Formats**: Support markdown, JSON, HTML
5. **Incremental Improvement**: Don't overwhelm users with every issue at once
6. **Track Over Time**: Support baseline comparisons
7. **CI/CD Integration**: Provide exit codes and JSON output

## Report Structure

```markdown
# [Artifact] Audit Report

## Executive Summary
- Overall health score: [X/100]
- Critical findings: [count]
- High findings: [count]
- Top 3 priorities

## File Metrics
- [Relevant size/complexity metrics]

## Detailed Findings

### Critical Issues
[Grouped by category]

### High Priority Issues
[Grouped by category]

### Medium Priority Issues
[Grouped by category]

## Remediation Plan
- P0: IMMEDIATE (critical)
- P1: THIS SPRINT (high)
- P2: NEXT QUARTER (medium)
- P3: BACKLOG (low)
```

## Success Criteria Template

```markdown
A well-validated [artifact] should achieve:

- ✅ Security Score: 100/100
- ✅ Compliance Score: 80+/100
- ✅ Overall Health: 75+/100
- ✅ Zero CRITICAL findings
- ✅ < 3 HIGH findings
- ✅ [Artifact-specific criteria]
```