Initial commit

Zhongwei Li, 2025-11-29 18:16:56 +08:00, commit 8a3d331e04
61 changed files with 11808 additions and 0 deletions

# Phase 1: Insight Discovery and Parsing
**Purpose**: Locate, read, deduplicate, and structure all insights from the project's lessons-learned directory.
## Steps
### 1. Verify project structure
- Ask user for project root directory (default: current working directory)
- Check if `docs/lessons-learned/` exists
- If not found, explain the expected structure and offer to search alternative locations
- List all categories found (testing, configuration, hooks-and-events, etc.)
### 2. Scan and catalog insight files
**File Naming Convention**:
Files MUST follow: `YYYY-MM-DD-descriptive-slug.md`
- Date prefix for chronological sorting
- Descriptive slug (3-5 words) summarizing the insight topic
- Examples:
- `2025-11-21-jwt-refresh-token-pattern.md`
- `2025-11-20-vitest-mocking-best-practices.md`
- `2025-11-19-react-testing-library-queries.md`
**Scanning**:
- Use Glob tool to find all markdown files: `docs/lessons-learned/**/*.md`
- For each file found, extract:
- File path and category (from directory name)
- Creation date (from filename prefix)
- Descriptive title (from filename slug)
- File size and line count
- Build initial inventory report
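As a sketch, the scan and catalog steps above can be done with `pathlib` instead of a Glob tool call; the `catalog_insight_files` name and the inventory dict shape are illustrative assumptions, not a fixed interface:
```python
import re
from pathlib import Path

# YYYY-MM-DD prefix plus kebab-case slug, per the naming convention
FILENAME_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})-([a-z0-9-]+)\.md$")

def catalog_insight_files(root):
    """Walk docs/lessons-learned/ and build the initial inventory."""
    inventory = []
    for path in Path(root, "docs", "lessons-learned").glob("**/*.md"):
        match = FILENAME_RE.match(path.name)
        if not match:
            continue  # flag for renaming in the discovery summary
        text = path.read_text(encoding="utf-8")
        inventory.append({
            "path": str(path),
            "category": path.parent.name,  # directory name = category
            "date": match.group(1),
            "title": match.group(2).replace("-", " "),
            "size": path.stat().st_size,
            "line_count": text.count("\n") + 1,
        })
    return inventory
```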
### 3. Deduplicate insights (CRITICAL)
**Why**: The extraction hook may create duplicate entries within files.
**Deduplication Algorithm**:
```python
import hashlib
import re

def normalize(text):
    """Lowercase and collapse whitespace so trivial edits still match."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def deduplicate_insights(insights):
    seen_hashes = set()
    unique_insights = []
    duplicates = []
    for insight in insights:
        # Stable hash of normalized title + first 200 chars of content
        key = normalize(insight.title + insight.content[:200])
        content_hash = hashlib.sha256(key.encode("utf-8")).hexdigest()
        if content_hash not in seen_hashes:
            seen_hashes.add(content_hash)
            unique_insights.append(insight)
        else:
            duplicates.append(insight)  # surface these in the summary report
    return unique_insights, duplicates
```
**Deduplication Checks**:
- Exact title match → duplicate
- First 200 chars content match → duplicate
- Same code blocks in same order → duplicate
- Report: "Found X insights, removed Y duplicates (Z unique)"
### 4. Parse individual insights
- Read each file using Read tool
- Extract session metadata (session ID, timestamp from file headers)
- Split file content on `---` separator (insights are separated by horizontal rules)
- For each insight section:
- Extract title (first line, often wrapped in `**bold**`)
- Extract body content (remaining markdown)
- Identify code blocks
- Extract actionable items (lines starting with `- [ ]` or numbered lists)
- Note any warnings/cautions
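A minimal parser for step 4 might look like the sketch below. The naive `---` split (which would also split on a horizontal rule inside a code fence) and the returned dict keys are assumptions:
```python
import re

def parse_insight_sections(file_text):
    """Split a lessons-learned file into insight sections on '---' rules."""
    sections = []
    # Note: this split is naive and also fires on '---' inside code fences
    for chunk in re.split(r"^---\s*$", file_text, flags=re.MULTILINE):
        chunk = chunk.strip()
        if not chunk:
            continue
        first_line, _, body = chunk.partition("\n")
        title = first_line.strip().strip("*# ")  # unwrap **bold** or heading
        action_items = re.findall(
            r"^\s*(?:- \[ \]|\d+\.)\s+(.+)$", body, re.MULTILINE)
        code_blocks = re.findall(r"```(\w*)\n(.*?)```", body, re.DOTALL)
        sections.append({
            "title": title,
            "content": body.strip(),
            "action_items": action_items,
            "code_blocks": [{"language": lang or "text", "code": code}
                            for lang, code in code_blocks],
        })
    return sections
```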
### 5. Apply quality filters
**Filter out low-depth insights** that are:
- Basic explanatory notes without actionable steps
- Simple definitions or concept explanations
- Single-paragraph observations
**Keep insights that have**:
- Actionable workflows (numbered steps, checklists)
- Decision frameworks (trade-offs, when to use X vs Y)
- Code patterns with explanation of WHY
- Troubleshooting guides with solutions
- Best practices with concrete examples
**Quality Score Calculation**:
```python
MIN_SKILL_SCORE = 4  # minimum score for skill consideration

def quality_score(insight):
    score = 0
    if insight.has_actionable_items:  score += 3
    if insight.has_code_examples:     score += 2
    if insight.has_numbered_steps:    score += 2
    if insight.word_count > 200:      score += 1
    if insight.has_warnings_or_notes: score += 1
    return score
```
### 6. Build structured insight inventory
```
{
id: unique_id,
title: string,
content: string,
category: string,
date: ISO_date,
session_id: string,
source_file: path,
code_examples: [{ language, code }],
action_items: [string],
keywords: [string],
quality_score: int,
paragraph_count: int,
line_count: int
}
```
### 7. Present discovery summary
- Total insights found (before deduplication)
- Duplicates removed
- Low-quality insights filtered
- **Final count**: Unique, quality insights
- Category breakdown
- Date range (earliest to latest)
- Preview of top 5 insights by quality score
## Output
Deduplicated, quality-filtered inventory of insights with metadata and categorization.
## Common Issues
- **No lessons-learned directory**: Ask if user wants to search elsewhere or exit
- **Empty files**: Skip and report count of empty files
- **Malformed markdown**: Log warning but continue parsing (best effort)
- **Missing session metadata**: Use filename date as fallback
- **High duplicate count**: Indicates a bug in the extraction hook; warn the user
- **All insights filtered as low-quality**: Lower threshold or suggest manual curation
- **Files without descriptive names**: Suggest renaming for better organization

# Phase 2: Smart Clustering
**Purpose**: Group related insights using similarity analysis to identify skill candidates.
## Steps
### 1. Load clustering configuration
- Read `data/clustering-config.yaml` for weights and thresholds
- Similarity weights:
- Same category: +0.3
- Shared keyword: +0.1 per keyword
- Temporal proximity (within 7 days): +0.05
- Title similarity: +0.15
- Content overlap: +0.2
- Clustering threshold: 0.6 minimum to group
- Standalone quality threshold: 0.8 for single-insight skills
### 2. Extract keywords from each insight
- Normalize text (lowercase, remove punctuation)
- Extract significant words from title (weight 2x)
- Extract significant words from body (weight 1x)
- Filter out common stop words
- Apply category-specific keyword boosting
- Build keyword vector for each insight
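A keyword-vector builder for this step might look like the following; the 2x title weighting comes from the step above, while the stop-word list and `Counter` representation are assumptions:
```python
import re
from collections import Counter

# Illustrative stop-word list; a real one would be larger
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for",
              "with", "is", "on"}

def keyword_vector(title, body, title_weight=2):
    """Weighted bag-of-words: title terms count double, stop words dropped."""
    def terms(text):
        return [w for w in re.findall(r"[a-z0-9]+", text.lower())
                if w not in STOP_WORDS]
    vector = Counter()
    for word in terms(title):
        vector[word] += title_weight
    for word in terms(body):
        vector[word] += 1
    return vector
```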
### 3. Calculate pairwise similarity scores
For each pair of insights (i, j):
- Base score = 0
- If same category: +0.3
- For each shared keyword: +0.1
- If dates within 7 days: +0.05
- Calculate title word overlap: shared_words / total_words * 0.15
- Calculate content concept overlap: shared_concepts / total_concepts * 0.2
- Final score = sum of all components
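The weighted sum above can be sketched directly; the dict keys on the insight records are assumptions, and the content-concept overlap term is stubbed out for brevity:
```python
def similarity(a, b, days_apart):
    """Pairwise similarity per the weights above. `a` and `b` are dicts
    with 'category', 'keywords' (set), and 'title_words' (set)."""
    score = 0.0
    if a["category"] == b["category"]:
        score += 0.3
    score += 0.1 * len(a["keywords"] & b["keywords"])  # per shared keyword
    if days_apart <= 7:
        score += 0.05
    total = len(a["title_words"] | b["title_words"])
    if total:
        score += len(a["title_words"] & b["title_words"]) / total * 0.15
    # Content-concept overlap (+0.2 weight) omitted in this sketch
    return score
```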
### 4. Build clusters
- Start with highest similarity pairs
- Group insights with similarity >= 0.6
- Use connected components algorithm
- Identify standalone insights (those that do not cluster with any others)
- For standalone insights, check if quality score >= 0.8
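The connected-components step can be sketched as a graph traversal: any pair scoring at or above the threshold becomes an edge, and each component becomes a cluster. The input shapes are assumptions:
```python
from collections import defaultdict

def build_clusters(ids, pair_scores, threshold=0.6):
    """pair_scores maps (id_i, id_j) -> similarity; components of the
    thresholded graph are the clusters, singletons are standalone."""
    adjacency = defaultdict(set)
    for (i, j), score in pair_scores.items():
        if score >= threshold:
            adjacency[i].add(j)
            adjacency[j].add(i)
    clusters, seen = [], set()
    for node in ids:
        if node in seen:
            continue
        component, stack = [], [node]
        seen.add(node)
        while stack:  # iterative DFS over the component
            current = stack.pop()
            component.append(current)
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters
```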
### 5. Assess cluster characteristics
For each cluster:
- Count insights
- Identify dominant category
- Extract common keywords
- Assess complexity (lines, code examples, etc.)
- Recommend skill complexity (minimal/standard/complex)
- Suggest skill pattern (phase-based/mode-based/validation)
### 6. Handle large clusters (>5 insights)
- Attempt sub-clustering by:
- Temporal splits (early vs. late insights)
- Sub-topic splits (different keyword groups)
- Complexity splits (simple vs. complex insights)
- Ask user if they want to split or keep as comprehensive skill
### 7. Present clustering results interactively
For each cluster, show:
- Cluster ID and size
- Suggested skill name (from keywords)
- Dominant category
- Insight titles in cluster
- Similarity scores
- Recommended complexity
Ask user to:
- Review proposed clusters
- Accept/reject/modify groupings
- Combine or split clusters
- Remove low-value insights
## Output
Validated clusters of insights, each representing a skill candidate.
## Common Issues
- **All insights are unrelated** (no clusters): Offer to generate standalone skills or exit
- **One giant cluster**: Suggest sub-clustering or mode-based skill
- **Too many standalone insights**: Suggest raising similarity threshold or manual grouping

# Phase 3: Interactive Skill Design
**Purpose**: For each skill candidate, design the skill structure with user customization.
## Steps
### 1. Propose skill name
- Extract top keywords from cluster
- Apply naming heuristics:
- Max 40 characters
- Kebab-case
- Remove filler words ("insight", "lesson", "the")
- Add preferred suffix ("guide", "advisor", "helper")
- Example: "hook-deduplication-session-management" → "hook-deduplication-guide"
- Present to user with alternatives
- Allow user to customize
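The naming heuristics can be sketched as below; the drop-least-important-keyword trimming strategy is an assumption on top of the rules above:
```python
import re

FILLER_WORDS = {"insight", "lesson", "the", "a", "an", "and"}
MAX_NAME_LEN = 40

def propose_skill_name(keywords, suffix="guide"):
    """Kebab-case name from top keywords: drop filler, append suffix,
    trim trailing keywords until the 40-char cap is met."""
    words = [re.sub(r"[^a-z0-9]", "", w.lower()) for w in keywords]
    words = [w for w in words if w and w not in FILLER_WORDS]
    name = "-".join(words + [suffix])
    while len(name) > MAX_NAME_LEN and words:
        words.pop()  # assumes keywords arrive most-important-first
        name = "-".join(words + [suffix])
    return name
```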
### 2. Generate description
- Use action verbs: "Use PROACTIVELY when", "Guides", "Analyzes"
- Include trigger context (what scenario)
- Include benefit (what outcome)
- Keep under 150 chars (soft limit, hard limit 1024)
- Present to user and allow editing
### 3. Assess complexity
Calculate based on:
- Number of insights (1 = minimal, 2-4 = standard, 5+ = complex)
- Total content length
- Presence of code examples
- Actionable items count
Recommend: minimal, standard, or complex
- Minimal: SKILL.md + README.md + plugin.json + CHANGELOG.md
- Standard: + data/insights-reference.md + examples/
- Complex: + templates/ + multiple examples/
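A sketch of the tier recommendation: the insight-count cutoffs come from the step above, while the content-length and code-example bump thresholds are assumptions:
```python
def recommend_complexity(insight_count, total_chars=0, code_examples=0):
    """Insight count sets the baseline tier; heavy content or many
    code examples bump it up one level."""
    tiers = ["minimal", "standard", "complex"]
    tier = 0 if insight_count <= 1 else (1 if insight_count <= 4 else 2)
    if total_chars > 10_000 or code_examples >= 5:  # assumed thresholds
        tier = min(tier + 1, 2)
    return tiers[tier]
```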
### 4. Select skill pattern
Analyze insight content for pattern indicators:
- **Phase-based**: sequential steps, "first/then/finally"
- **Mode-based**: multiple approaches, "alternatively", "option"
- **Validation**: checking/auditing language, "ensure", "verify"
- **Data-processing**: parsing/transformation language
Recommend pattern with confidence level and explain trade-offs.
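One lightweight way to implement the pattern detection is indicator counting; the indicator lists extend the cues above, and the winner's-share confidence formula is an assumption:
```python
PATTERN_INDICATORS = {
    "phase-based": ["first", "then", "finally", "step"],
    "mode-based": ["alternatively", "option", "approach"],
    "validation": ["ensure", "verify", "check", "audit"],
    "data-processing": ["parse", "transform", "convert"],
}

def recommend_pattern(text):
    """Count indicator hits per pattern; highest count wins, with
    confidence = the winner's share of all hits."""
    lowered = text.lower()
    hits = {pattern: sum(lowered.count(word) for word in words)
            for pattern, words in PATTERN_INDICATORS.items()}
    best = max(hits, key=hits.get)
    total = sum(hits.values())
    confidence = hits[best] / total if total else 0.0
    return best, confidence
```
Substring counting is crude (`"option"` also matches `"optional"`), but it is cheap and easy to tune, and the confidence value gives the user something concrete to accept or override.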
### 5. Map insights to skill structure
For each insight, identify content types:
- Problem description → Overview section
- Solution explanation → Workflow/Phases
- Code examples → examples/ directory
- Best practices → Important Reminders
- Checklists → templates/checklist.md
- Trade-offs → Decision Guide section
- Warnings → Important Reminders (high priority)
### 6. Define workflow phases (if phase-based)
For each phase:
- Generate phase name from insight content
- Extract purpose statement
- List steps (from insight action items or narrative)
- Define expected output
- Note common issues (from warnings in insights)
### 7. Preview the skill design
Show complete outline:
- Name, description, complexity
- Pattern and structure
- Section breakdown
- File structure
Ask for final confirmation or modifications.
## Output
Approved skill design specification ready for generation.
## Common Issues
- **User unsure about pattern**: Show examples from existing skills, offer recommendation
- **Naming conflicts**: Check ~/.claude/skills/ and .claude/skills/ for existing skills
- **Description too long**: Auto-trim and ask user to review
- **Unclear structure**: Fall back to default phase-based pattern

# Phase 4: Skill Generation
**Purpose**: Create all skill files following the approved design.
## Steps
### 1. Prepare generation workspace
- Create temporary directory for skill assembly
- Load templates from `templates/` directory
### 2. Generate SKILL.md
- Create frontmatter with name and description
- Add h1 heading
- Generate Overview section (what, based on X insights, capabilities)
- Generate "When to Use" section (trigger phrases, use cases, anti-use cases)
- Generate Response Style section
- Generate workflow sections based on pattern:
- Phase-based: Phase 1, Phase 2, etc. with Purpose, Steps, Output, Common Issues
- Mode-based: Mode 1, Mode 2, etc. with When to use, Steps, Output
- Validation: Analysis → Detection → Recommendations
- Generate Reference Materials section
- Generate Important Reminders
- Generate Best Practices
- Generate Troubleshooting
- Add Metadata section with source insight attribution
### 3. Generate README.md
- Brief overview (1-2 sentences)
- Installation instructions (standard)
- Quick start example
- Trigger phrases list
- Link to SKILL.md for details
### 4. Generate plugin.json
```json
{
"name": "[skill-name]",
"version": "0.1.0",
"description": "[description]",
"type": "skill",
"author": "Connor",
"category": "[category from clustering-config]",
"tags": ["insights", "lessons-learned", "[domain]"]
}
```
### 5. Generate CHANGELOG.md
Initialize with v0.1.0 and list key features.
### 6. Generate data/insights-reference.md (if complexity >= standard)
- Add overview (insight count, date range, categories)
- For each insight: title, metadata, original content, code examples, related insights
- Add clustering analysis section
- Add insight-to-skill mapping explanation
### 7. Generate examples/ (if needed)
- Extract and organize code blocks by language or topic
- Add explanatory context
- Create usage examples showing example prompts and expected behaviors
### 8. Generate templates/ (if needed)
- Create templates/checklist.md from actionable items
- Organize items by section
- Add verification steps
- Include common mistakes section
### 9. Validate all generated files
- Check YAML frontmatter syntax
- Validate JSON syntax
- Check file references are valid
- Verify no broken markdown links
- Run quality checklist
- Report validation results to user
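The syntax checks in step 9 can be sketched without third-party dependencies; a real pass would use a YAML parser for the frontmatter, whereas this sketch only pattern-checks for required keys:
```python
import json
import re

def validate_skill_files(skill_md_text, plugin_json_text):
    """Return a list of problems; empty list means the checks passed."""
    problems = []
    # Frontmatter must open the file as a '---' delimited block
    fm = re.match(r"^---\n(.*?)\n---\n", skill_md_text, re.DOTALL)
    if not fm:
        problems.append("SKILL.md: missing YAML frontmatter block")
    else:
        for key in ("name", "description"):
            if not re.search(rf"^{key}:\s*\S", fm.group(1), re.MULTILINE):
                problems.append(f"SKILL.md: frontmatter missing '{key}'")
    try:
        json.loads(plugin_json_text)
    except json.JSONDecodeError as exc:
        problems.append(f"plugin.json: {exc}")
    return problems
```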
### 10. Preview generated skill
- Show file tree
- Show key sections from SKILL.md
- Show README.md preview
- Highlight any validation warnings
## Output
Complete, validated skill in temporary workspace, ready for installation.
## Common Issues
- **Validation failures**: Fix automatically if possible, otherwise ask user
- **Missing code examples**: Offer to generate placeholder or skip examples/ directory
- **Large SKILL.md** (>500 lines): Suggest splitting content into separate files

# Phase 5: Installation and Testing
**Purpose**: Install the skill and provide testing guidance.
## Steps
### 1. Ask installation location
Present options:
- **Project-specific**: `[project]/.claude/skills/[skill-name]/`
- Pros: Version controlled with project, only available in this project
- Cons: Not available in other projects
- **Global**: `~/.claude/skills/[skill-name]/`
- Pros: Available in all projects
- Cons: Not version controlled (unless user manages ~/.claude with git)
### 2. Check for conflicts
- Verify chosen location doesn't already have a skill with same name
- If conflict found:
- Show existing skill details
- Offer options: Choose different name, Overwrite (with confirmation), Cancel
### 3. Copy skill files
- Create target directory
- Copy all generated files preserving structure
- Set appropriate permissions
- Verify all files copied successfully
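The copy-and-verify step can be sketched with `shutil`; `install_skill` and its refuse-to-overwrite behavior are assumptions (the doc handles conflicts interactively in step 2):
```python
import shutil
from pathlib import Path

def install_skill(workspace, target):
    """Copy the assembled skill into place and verify every file arrived."""
    src, dst = Path(workspace), Path(target)
    if dst.exists():
        raise FileExistsError(f"skill already installed at {dst}")
    shutil.copytree(src, dst)  # preserves the directory structure
    missing = [p for p in src.rglob("*")
               if p.is_file() and not (dst / p.relative_to(src)).is_file()]
    if missing:
        raise RuntimeError(f"files missing after copy: {missing}")
    return dst
```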
### 4. Re-validate installed skill
- Read SKILL.md from install location
- Verify frontmatter is still valid
- Check file references work from install location
- Confirm no corruption during copy
### 5. Test skill loading
- Attempt to trigger skill using one of the trigger phrases
- Verify Claude Code recognizes the skill
- Check skill appears in available skills list
- Report results to user
### 6. Provide testing guidance
Show trigger phrases to test:
```
Try these phrases to test your new skill:
- "[trigger phrase 1]"
- "[trigger phrase 2]"
- "[trigger phrase 3]"
```
Suggest test scenarios based on skill purpose and explain expected behavior.
### 7. Offer refinement suggestions
Based on skill characteristics, suggest potential improvements:
- Add more examples if skill is complex
- Refine trigger phrases if they're too broad/narrow
- Split into multiple skills if scope is too large
- Add troubleshooting section if skill has edge cases
Ask if user wants to iterate on the skill.
### 8. Document the skill
Offer to add skill to project documentation:
```markdown
### [Skill Name]
**Location**: [path]
**Purpose**: [description]
**Trigger**: "[main trigger phrase]"
**Source**: Generated from [X] insights ([categories])
```
### 9. Next steps
Suggest:
- Test the skill with real scenarios
- Share with team if relevant
- Iterate based on usage (version 0.2.0)
- Generate more skills from other insight clusters
Ask if user wants to generate another skill from remaining insights.
## Output
Installed, validated skill with testing guidance and refinement suggestions.
## Common Issues
- **Installation permission errors**: Check directory permissions, suggest sudo if needed
- **Skill not recognized**: Verify frontmatter format, check Claude Code skill discovery
- **Trigger phrases don't work**: Suggest broadening or clarifying phrases
- **Conflicts with existing skills**: Help user choose unique name or merge functionality