Initial commit

Zhongwei Li, 2025-11-29 18:16:56 +08:00, commit 8a3d331e04
61 changed files with 11808 additions and 0 deletions

# Phase 1: Insight Discovery and Parsing
**Purpose**: Locate, read, deduplicate, and structure all insights from the project's lessons-learned directory.
## Steps
### 1. Verify project structure
- Ask user for project root directory (default: current working directory)
- Check if `docs/lessons-learned/` exists
- If not found, explain the expected structure and offer to search alternative locations
- List all categories found (testing, configuration, hooks-and-events, etc.)
### 2. Scan and catalog insight files
**File Naming Convention**:
Files MUST follow: `YYYY-MM-DD-descriptive-slug.md`
- Date prefix for chronological sorting
- Descriptive slug (3-5 words) summarizing the insight topic
- Examples:
- `2025-11-21-jwt-refresh-token-pattern.md`
- `2025-11-20-vitest-mocking-best-practices.md`
- `2025-11-19-react-testing-library-queries.md`
**Scanning**:
- Use Glob tool to find all markdown files: `docs/lessons-learned/**/*.md`
- For each file found, extract:
- File path and category (from directory name)
- Creation date (from filename prefix)
- Descriptive title (from filename slug)
- File size and line count
- Build initial inventory report
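As a sketch, the scan and catalog steps above can be done with `pathlib` instead of a Glob tool call; the `catalog_insight_files` name and the inventory dict shape are illustrative assumptions, not a fixed interface:
```python
import re
from pathlib import Path

# YYYY-MM-DD prefix plus kebab-case slug, per the naming convention
FILENAME_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})-([a-z0-9-]+)\.md$")

def catalog_insight_files(root):
    """Walk docs/lessons-learned/ and build the initial inventory."""
    inventory = []
    for path in Path(root, "docs", "lessons-learned").glob("**/*.md"):
        match = FILENAME_RE.match(path.name)
        if not match:
            continue  # flag for renaming in the discovery summary
        text = path.read_text(encoding="utf-8")
        inventory.append({
            "path": str(path),
            "category": path.parent.name,  # directory name = category
            "date": match.group(1),
            "title": match.group(2).replace("-", " "),
            "size": path.stat().st_size,
            "line_count": text.count("\n") + 1,
        })
    return inventory
```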
### 3. Deduplicate insights (CRITICAL)
**Why**: The extraction hook may create duplicate entries within files.
**Deduplication Algorithm**:
```python
import hashlib
import re

def normalize(text):
    """Lowercase and collapse whitespace so trivial edits still match."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def deduplicate_insights(insights):
    seen_hashes = set()
    unique_insights = []
    duplicates = []
    for insight in insights:
        # Stable hash of normalized title + first 200 chars of content
        key = normalize(insight.title + insight.content[:200])
        content_hash = hashlib.sha256(key.encode("utf-8")).hexdigest()
        if content_hash not in seen_hashes:
            seen_hashes.add(content_hash)
            unique_insights.append(insight)
        else:
            duplicates.append(insight)  # surface these in the summary report
    return unique_insights, duplicates
```
**Deduplication Checks**:
- Exact title match → duplicate
- First 200 chars content match → duplicate
- Same code blocks in same order → duplicate
- Report: "Found X insights, removed Y duplicates (Z unique)"
### 4. Parse individual insights
- Read each file using Read tool
- Extract session metadata (session ID, timestamp from file headers)
- Split file content on `---` separator (insights are separated by horizontal rules)
- For each insight section:
- Extract title (first line, often wrapped in `**bold**`)
- Extract body content (remaining markdown)
- Identify code blocks
- Extract actionable items (lines starting with `- [ ]` or numbered lists)
- Note any warnings/cautions
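A minimal parser for step 4 might look like the sketch below. The naive `---` split (which would also split on a horizontal rule inside a code fence) and the returned dict keys are assumptions:
```python
import re

def parse_insight_sections(file_text):
    """Split a lessons-learned file into insight sections on '---' rules."""
    sections = []
    # Note: this split is naive and also fires on '---' inside code fences
    for chunk in re.split(r"^---\s*$", file_text, flags=re.MULTILINE):
        chunk = chunk.strip()
        if not chunk:
            continue
        first_line, _, body = chunk.partition("\n")
        title = first_line.strip().strip("*# ")  # unwrap **bold** or heading
        action_items = re.findall(
            r"^\s*(?:- \[ \]|\d+\.)\s+(.+)$", body, re.MULTILINE)
        code_blocks = re.findall(r"```(\w*)\n(.*?)```", body, re.DOTALL)
        sections.append({
            "title": title,
            "content": body.strip(),
            "action_items": action_items,
            "code_blocks": [{"language": lang or "text", "code": code}
                            for lang, code in code_blocks],
        })
    return sections
```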
### 5. Apply quality filters
**Filter out low-depth insights** that are:
- Basic explanatory notes without actionable steps
- Simple definitions or concept explanations
- Single-paragraph observations
**Keep insights that have**:
- Actionable workflows (numbered steps, checklists)
- Decision frameworks (trade-offs, when to use X vs Y)
- Code patterns with explanation of WHY
- Troubleshooting guides with solutions
- Best practices with concrete examples
**Quality Score Calculation**:
```python
MIN_SKILL_SCORE = 4  # minimum score for skill consideration

def quality_score(insight):
    score = 0
    if insight.has_actionable_items:  score += 3
    if insight.has_code_examples:     score += 2
    if insight.has_numbered_steps:    score += 2
    if insight.word_count > 200:      score += 1
    if insight.has_warnings_or_notes: score += 1
    return score
```
### 6. Build structured insight inventory
```
{
id: unique_id,
title: string,
content: string,
category: string,
date: ISO_date,
session_id: string,
source_file: path,
code_examples: [{ language, code }],
action_items: [string],
keywords: [string],
quality_score: int,
paragraph_count: int,
line_count: int
}
```
### 7. Present discovery summary
- Total insights found (before deduplication)
- Duplicates removed
- Low-quality insights filtered
- **Final count**: Unique, quality insights
- Category breakdown
- Date range (earliest to latest)
- Preview of top 5 insights by quality score
## Output
Deduplicated, quality-filtered inventory of insights with metadata and categorization.
## Common Issues
- **No lessons-learned directory**: Ask if user wants to search elsewhere or exit
- **Empty files**: Skip and report count of empty files
- **Malformed markdown**: Log warning but continue parsing (best effort)
- **Missing session metadata**: Use filename date as fallback
- **High duplicate count**: Indicates a bug in the extraction hook; warn the user
- **All insights filtered as low-quality**: Lower threshold or suggest manual curation
- **Files without descriptive names**: Suggest renaming for better organization

# Phase 2: Smart Clustering
**Purpose**: Group related insights using similarity analysis to identify skill candidates.
## Steps
### 1. Load clustering configuration
- Read `data/clustering-config.yaml` for weights and thresholds
- Similarity weights:
- Same category: +0.3
- Shared keyword: +0.1 per keyword
- Temporal proximity (within 7 days): +0.05
- Title similarity: +0.15
- Content overlap: +0.2
- Clustering threshold: 0.6 minimum to group
- Standalone quality threshold: 0.8 for single-insight skills
### 2. Extract keywords from each insight
- Normalize text (lowercase, remove punctuation)
- Extract significant words from title (weight 2x)
- Extract significant words from body (weight 1x)
- Filter out common stop words
- Apply category-specific keyword boosting
- Build keyword vector for each insight
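A keyword-vector builder for this step might look like the following; the 2x title weighting comes from the step above, while the stop-word list and `Counter` representation are assumptions:
```python
import re
from collections import Counter

# Illustrative stop-word list; a real one would be larger
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for",
              "with", "is", "on"}

def keyword_vector(title, body, title_weight=2):
    """Weighted bag-of-words: title terms count double, stop words dropped."""
    def terms(text):
        return [w for w in re.findall(r"[a-z0-9]+", text.lower())
                if w not in STOP_WORDS]
    vector = Counter()
    for word in terms(title):
        vector[word] += title_weight
    for word in terms(body):
        vector[word] += 1
    return vector
```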
### 3. Calculate pairwise similarity scores
For each pair of insights (i, j):
- Base score = 0
- If same category: +0.3
- For each shared keyword: +0.1
- If dates within 7 days: +0.05
- Calculate title word overlap: shared_words / total_words * 0.15
- Calculate content concept overlap: shared_concepts / total_concepts * 0.2
- Final score = sum of all components
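The weighted sum above can be sketched directly; the dict keys on the insight records are assumptions, and the content-concept overlap term is stubbed out for brevity:
```python
def similarity(a, b, days_apart):
    """Pairwise similarity per the weights above. `a` and `b` are dicts
    with 'category', 'keywords' (set), and 'title_words' (set)."""
    score = 0.0
    if a["category"] == b["category"]:
        score += 0.3
    score += 0.1 * len(a["keywords"] & b["keywords"])  # per shared keyword
    if days_apart <= 7:
        score += 0.05
    total = len(a["title_words"] | b["title_words"])
    if total:
        score += len(a["title_words"] & b["title_words"]) / total * 0.15
    # Content-concept overlap (+0.2 weight) omitted in this sketch
    return score
```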
### 4. Build clusters
- Start with highest similarity pairs
- Group insights with similarity >= 0.6
- Use connected components algorithm
- Identify standalone insights (those that do not cluster with any others)
- For standalone insights, check if quality score >= 0.8
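The connected-components step can be sketched as a graph traversal: any pair scoring at or above the threshold becomes an edge, and each component becomes a cluster. The input shapes are assumptions:
```python
from collections import defaultdict

def build_clusters(ids, pair_scores, threshold=0.6):
    """pair_scores maps (id_i, id_j) -> similarity; components of the
    thresholded graph are the clusters, singletons are standalone."""
    adjacency = defaultdict(set)
    for (i, j), score in pair_scores.items():
        if score >= threshold:
            adjacency[i].add(j)
            adjacency[j].add(i)
    clusters, seen = [], set()
    for node in ids:
        if node in seen:
            continue
        component, stack = [], [node]
        seen.add(node)
        while stack:  # iterative DFS over the component
            current = stack.pop()
            component.append(current)
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters
```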
### 5. Assess cluster characteristics
For each cluster:
- Count insights
- Identify dominant category
- Extract common keywords
- Assess complexity (lines, code examples, etc.)
- Recommend skill complexity (minimal/standard/complex)
- Suggest skill pattern (phase-based/mode-based/validation)
### 6. Handle large clusters (>5 insights)
- Attempt sub-clustering by:
- Temporal splits (early vs. late insights)
- Sub-topic splits (different keyword groups)
- Complexity splits (simple vs. complex insights)
- Ask user if they want to split or keep as comprehensive skill
### 7. Present clustering results interactively
For each cluster, show:
- Cluster ID and size
- Suggested skill name (from keywords)
- Dominant category
- Insight titles in cluster
- Similarity scores
- Recommended complexity
Ask user to:
- Review proposed clusters
- Accept/reject/modify groupings
- Combine or split clusters
- Remove low-value insights
## Output
Validated clusters of insights, each representing a skill candidate.
## Common Issues
- **All insights are unrelated** (no clusters): Offer to generate standalone skills or exit
- **One giant cluster**: Suggest sub-clustering or mode-based skill
- **Too many standalone insights**: Suggest raising similarity threshold or manual grouping

# Phase 3: Interactive Skill Design
**Purpose**: For each skill candidate, design the skill structure with user customization.
## Steps
### 1. Propose skill name
- Extract top keywords from cluster
- Apply naming heuristics:
- Max 40 characters
- Kebab-case
- Remove filler words ("insight", "lesson", "the")
- Add preferred suffix ("guide", "advisor", "helper")
- Example: "hook-deduplication-session-management" → "hook-deduplication-guide"
- Present to user with alternatives
- Allow user to customize
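The naming heuristics can be sketched as below; the drop-least-important-keyword trimming strategy is an assumption on top of the rules above:
```python
import re

FILLER_WORDS = {"insight", "lesson", "the", "a", "an", "and"}
MAX_NAME_LEN = 40

def propose_skill_name(keywords, suffix="guide"):
    """Kebab-case name from top keywords: drop filler, append suffix,
    trim trailing keywords until the 40-char cap is met."""
    words = [re.sub(r"[^a-z0-9]", "", w.lower()) for w in keywords]
    words = [w for w in words if w and w not in FILLER_WORDS]
    name = "-".join(words + [suffix])
    while len(name) > MAX_NAME_LEN and words:
        words.pop()  # assumes keywords arrive most-important-first
        name = "-".join(words + [suffix])
    return name
```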
### 2. Generate description
- Use action verbs: "Use PROACTIVELY when", "Guides", "Analyzes"
- Include trigger context (what scenario)
- Include benefit (what outcome)
- Keep under 150 chars (soft limit, hard limit 1024)
- Present to user and allow editing
### 3. Assess complexity
Calculate based on:
- Number of insights (1 = minimal, 2-4 = standard, 5+ = complex)
- Total content length
- Presence of code examples
- Actionable items count
Recommend: minimal, standard, or complex
- Minimal: SKILL.md + README.md + plugin.json + CHANGELOG.md
- Standard: + data/insights-reference.md + examples/
- Complex: + templates/ + multiple examples/
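A sketch of the tier recommendation: the insight-count cutoffs come from the step above, while the content-length and code-example bump thresholds are assumptions:
```python
def recommend_complexity(insight_count, total_chars=0, code_examples=0):
    """Insight count sets the baseline tier; heavy content or many
    code examples bump it up one level."""
    tiers = ["minimal", "standard", "complex"]
    tier = 0 if insight_count <= 1 else (1 if insight_count <= 4 else 2)
    if total_chars > 10_000 or code_examples >= 5:  # assumed thresholds
        tier = min(tier + 1, 2)
    return tiers[tier]
```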
### 4. Select skill pattern
Analyze insight content for pattern indicators:
- **Phase-based**: sequential steps, "first/then/finally"
- **Mode-based**: multiple approaches, "alternatively", "option"
- **Validation**: checking/auditing language, "ensure", "verify"
- **Data-processing**: parsing/transformation language
Recommend pattern with confidence level and explain trade-offs.
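One lightweight way to implement the pattern detection is indicator counting; the indicator lists extend the cues above, and the winner's-share confidence formula is an assumption:
```python
PATTERN_INDICATORS = {
    "phase-based": ["first", "then", "finally", "step"],
    "mode-based": ["alternatively", "option", "approach"],
    "validation": ["ensure", "verify", "check", "audit"],
    "data-processing": ["parse", "transform", "convert"],
}

def recommend_pattern(text):
    """Count indicator hits per pattern; highest count wins, with
    confidence = the winner's share of all hits."""
    lowered = text.lower()
    hits = {pattern: sum(lowered.count(word) for word in words)
            for pattern, words in PATTERN_INDICATORS.items()}
    best = max(hits, key=hits.get)
    total = sum(hits.values())
    confidence = hits[best] / total if total else 0.0
    return best, confidence
```
Substring counting is crude (`"option"` also matches `"optional"`), but it is cheap and easy to tune, and the confidence value gives the user something concrete to accept or override.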
### 5. Map insights to skill structure
For each insight, identify content types:
- Problem description → Overview section
- Solution explanation → Workflow/Phases
- Code examples → examples/ directory
- Best practices → Important Reminders
- Checklists → templates/checklist.md
- Trade-offs → Decision Guide section
- Warnings → Important Reminders (high priority)
### 6. Define workflow phases (if phase-based)
For each phase:
- Generate phase name from insight content
- Extract purpose statement
- List steps (from insight action items or narrative)
- Define expected output
- Note common issues (from warnings in insights)
### 7. Preview the skill design
Show complete outline:
- Name, description, complexity
- Pattern and structure
- Section breakdown
- File structure
Ask for final confirmation or modifications.
## Output
Approved skill design specification ready for generation.
## Common Issues
- **User unsure about pattern**: Show examples from existing skills, offer recommendation
- **Naming conflicts**: Check ~/.claude/skills/ and .claude/skills/ for existing skills
- **Description too long**: Auto-trim and ask user to review
- **Unclear structure**: Fall back to default phase-based pattern

# Phase 4: Skill Generation
**Purpose**: Create all skill files following the approved design.
## Steps
### 1. Prepare generation workspace
- Create temporary directory for skill assembly
- Load templates from `templates/` directory
### 2. Generate SKILL.md
- Create frontmatter with name and description
- Add h1 heading
- Generate Overview section (what, based on X insights, capabilities)
- Generate "When to Use" section (trigger phrases, use cases, anti-use cases)
- Generate Response Style section
- Generate workflow sections based on pattern:
- Phase-based: Phase 1, Phase 2, etc. with Purpose, Steps, Output, Common Issues
- Mode-based: Mode 1, Mode 2, etc. with When to use, Steps, Output
- Validation: Analysis → Detection → Recommendations
- Generate Reference Materials section
- Generate Important Reminders
- Generate Best Practices
- Generate Troubleshooting
- Add Metadata section with source insight attribution
### 3. Generate README.md
- Brief overview (1-2 sentences)
- Installation instructions (standard)
- Quick start example
- Trigger phrases list
- Link to SKILL.md for details
### 4. Generate plugin.json
```json
{
"name": "[skill-name]",
"version": "0.1.0",
"description": "[description]",
"type": "skill",
"author": "Connor",
"category": "[category from clustering-config]",
"tags": ["insights", "lessons-learned", "[domain]"]
}
```
### 5. Generate CHANGELOG.md
Initialize with v0.1.0 and list key features.
### 6. Generate data/insights-reference.md (if complexity >= standard)
- Add overview (insight count, date range, categories)
- For each insight: title, metadata, original content, code examples, related insights
- Add clustering analysis section
- Add insight-to-skill mapping explanation
### 7. Generate examples/ (if needed)
- Extract and organize code blocks by language or topic
- Add explanatory context
- Create usage examples showing example prompts and expected behaviors
### 8. Generate templates/ (if needed)
- Create templates/checklist.md from actionable items
- Organize items by section
- Add verification steps
- Include common mistakes section
### 9. Validate all generated files
- Check YAML frontmatter syntax
- Validate JSON syntax
- Check file references are valid
- Verify no broken markdown links
- Run quality checklist
- Report validation results to user
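The syntax checks in step 9 can be sketched without third-party dependencies; a real pass would use a YAML parser for the frontmatter, whereas this sketch only pattern-checks for required keys:
```python
import json
import re

def validate_skill_files(skill_md_text, plugin_json_text):
    """Return a list of problems; empty list means the checks passed."""
    problems = []
    # Frontmatter must open the file as a '---' delimited block
    fm = re.match(r"^---\n(.*?)\n---\n", skill_md_text, re.DOTALL)
    if not fm:
        problems.append("SKILL.md: missing YAML frontmatter block")
    else:
        for key in ("name", "description"):
            if not re.search(rf"^{key}:\s*\S", fm.group(1), re.MULTILINE):
                problems.append(f"SKILL.md: frontmatter missing '{key}'")
    try:
        json.loads(plugin_json_text)
    except json.JSONDecodeError as exc:
        problems.append(f"plugin.json: {exc}")
    return problems
```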
### 10. Preview generated skill
- Show file tree
- Show key sections from SKILL.md
- Show README.md preview
- Highlight any validation warnings
## Output
Complete, validated skill in temporary workspace, ready for installation.
## Common Issues
- **Validation failures**: Fix automatically if possible, otherwise ask user
- **Missing code examples**: Offer to generate placeholder or skip examples/ directory
- **Large SKILL.md** (>500 lines): Suggest splitting content into separate files

# Phase 5: Installation and Testing
**Purpose**: Install the skill and provide testing guidance.
## Steps
### 1. Ask installation location
Present options:
- **Project-specific**: `[project]/.claude/skills/[skill-name]/`
- Pros: Version controlled with project, only available in this project
- Cons: Not available in other projects
- **Global**: `~/.claude/skills/[skill-name]/`
- Pros: Available in all projects
- Cons: Not version controlled (unless user manages ~/.claude with git)
### 2. Check for conflicts
- Verify chosen location doesn't already have a skill with same name
- If conflict found:
- Show existing skill details
- Offer options: Choose different name, Overwrite (with confirmation), Cancel
### 3. Copy skill files
- Create target directory
- Copy all generated files preserving structure
- Set appropriate permissions
- Verify all files copied successfully
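The copy-and-verify step can be sketched with `shutil`; `install_skill` and its refuse-to-overwrite behavior are assumptions (the doc handles conflicts interactively in step 2):
```python
import shutil
from pathlib import Path

def install_skill(workspace, target):
    """Copy the assembled skill into place and verify every file arrived."""
    src, dst = Path(workspace), Path(target)
    if dst.exists():
        raise FileExistsError(f"skill already installed at {dst}")
    shutil.copytree(src, dst)  # preserves the directory structure
    missing = [p for p in src.rglob("*")
               if p.is_file() and not (dst / p.relative_to(src)).is_file()]
    if missing:
        raise RuntimeError(f"files missing after copy: {missing}")
    return dst
```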
### 4. Re-validate installed skill
- Read SKILL.md from install location
- Verify frontmatter is still valid
- Check file references work from install location
- Confirm no corruption during copy
### 5. Test skill loading
- Attempt to trigger skill using one of the trigger phrases
- Verify Claude Code recognizes the skill
- Check skill appears in available skills list
- Report results to user
### 6. Provide testing guidance
Show trigger phrases to test:
```
Try these phrases to test your new skill:
- "[trigger phrase 1]"
- "[trigger phrase 2]"
- "[trigger phrase 3]"
```
Suggest test scenarios based on skill purpose and explain expected behavior.
### 7. Offer refinement suggestions
Based on skill characteristics, suggest potential improvements:
- Add more examples if skill is complex
- Refine trigger phrases if they're too broad/narrow
- Split into multiple skills if scope is too large
- Add troubleshooting section if skill has edge cases
Ask if user wants to iterate on the skill.
### 8. Document the skill
Offer to add skill to project documentation:
```markdown
### [Skill Name]
**Location**: [path]
**Purpose**: [description]
**Trigger**: "[main trigger phrase]"
**Source**: Generated from [X] insights ([categories])
```
### 9. Next steps
Suggest:
- Test the skill with real scenarios
- Share with team if relevant
- Iterate based on usage (version 0.2.0)
- Generate more skills from other insight clusters
Ask if user wants to generate another skill from remaining insights.
## Output
Installed, validated skill with testing guidance and refinement suggestions.
## Common Issues
- **Installation permission errors**: Check directory permissions, suggest sudo if needed
- **Skill not recognized**: Verify frontmatter format, check Claude Code skill discovery
- **Trigger phrases don't work**: Suggest broadening or clarifying phrases
- **Conflicts with existing skills**: Help user choose unique name or merge functionality