Initial commit

2025-11-29 18:16:56 +08:00
commit 8a3d331e04
61 changed files with 11808 additions and 0 deletions
--- a/skills/insight-skill-generator/examples/example-clustering-output.md
+++ b/skills/insight-skill-generator/examples/example-clustering-output.md
@@ -0,0 +1,286 @@
+# Example: Clustering Analysis Output
+
+This example shows what the clustering phase produces when analyzing a project's insights.
+
+## Scenario
+
+A project has been using the extract-explanatory-insights hook for 2 weeks, generating 12 insights across different categories.
+
+---
+
+## Phase 1: Discovery Summary
+
+**Total Insights Found**: 12
+**Date Range**: 2025-11-01 to 2025-11-14
+**Unique Sessions**: 8
+**Categories**:
+- testing: 5 insights
+- hooks-and-events: 3 insights
+- architecture: 2 insights
+- performance: 2 insights
+
+**Preview**:
+1. "Modern Testing Strategy with Testing Trophy" (testing, 2025-11-01)
+2. "Hook Deduplication Session Management" (hooks-and-events, 2025-11-03)
+3. "CPU Usage Prevention in Vitest" (testing, 2025-11-03)
+4. "BSD awk Compatibility in Hook Scripts" (hooks-and-events, 2025-11-05)
+5. "Semantic Query Priorities in React Testing Library" (testing, 2025-11-06)
+
+---
+
+## Phase 2: Clustering Analysis
+
+### Cluster 1: Testing Strategy
+**Size**: 3 insights
+**Similarity Score**: 0.75 (high)
+**Recommended Complexity**: Standard
+**Recommended Pattern**: Validation
+
+**Insights**:
+1. "Modern Testing Strategy with Testing Trophy"
+   - Keywords: testing, integration, unit, e2e, trophy, kent-c-dodds
+   - Category: testing
+   - Date: 2025-11-01
+   - Length: 156 lines
+   - Has code examples: Yes
+
+2. "Semantic Query Priorities in React Testing Library"
+   - Keywords: testing, react, semantic, query, getByRole, accessibility
+   - Category: testing
+   - Date: 2025-11-06
+   - Length: 89 lines
+   - Has code examples: Yes
+
+3. "What NOT to Test - Brittle Patterns"
+   - Keywords: testing, avoid, brittle, implementation-details, user-behavior
+   - Category: testing
+   - Date: 2025-11-08
+   - Length: 67 lines
+   - Has code examples: No
+
+**Shared Keywords**: testing (3), react (2), user (2), behavior (2), semantic (2)
+
+**Cluster Characteristics**:
+- All in same category (testing)
+- Temporal span: 7 days
+- Common theme: User-focused testing approach
+- Total code examples: 5 blocks
+- Actionable items: 12
+
+**Suggested Skill Name**: "user-focused-testing-guide"
+
+**Suggested Description**: "Use PROACTIVELY when writing tests to ensure user-centric testing strategy following Testing Trophy methodology and React Testing Library best practices"
+
+**Skill Structure Recommendation**:
+```
+SKILL.md sections:
+- Overview (Testing Trophy philosophy)
+- Phase 1: Query Selection (semantic queries)
+- Phase 2: Test Writing (user workflows)
+- Phase 3: Avoiding Brittle Tests
+- Important Reminders (what NOT to test)
+- Examples (from code blocks)
+```
+
+---
+
+### Cluster 2: Hook Development
+**Size**: 2 insights
+**Similarity Score**: 0.68 (medium-high)
+**Recommended Complexity**: Minimal to Standard
+**Recommended Pattern**: Phase-based
+
+**Insights**:
+1. "Hook Deduplication Session Management"
+   - Keywords: hook, deduplication, session, hash, state
+   - Category: hooks-and-events
+   - Date: 2025-11-03
+   - Length: 124 lines
+   - Has code examples: Yes (bash)
+
+2. "BSD awk Compatibility in Hook Scripts"
+   - Keywords: hook, awk, bsd, macos, compatibility, portable
+   - Category: hooks-and-events
+   - Date: 2025-11-05
+   - Length: 78 lines
+   - Has code examples: Yes (bash)
+
+**Shared Keywords**: hook (2), bash (2), script (2), compatibility (2)
+
+**Cluster Characteristics**:
+- Same category (hooks-and-events)
+- Temporal span: 2 days (high proximity)
+- Common theme: Hook script development best practices
+- Total code examples: 4 blocks (all bash)
+- Actionable items: 8
+
+**Suggested Skill Name**: "hook-script-development-guide"
+
+**Suggested Description**: "Guides Claude Code hook script development with deduplication patterns and cross-platform compatibility best practices for macOS/Linux environments"
+
+**Skill Structure Recommendation**:
+```
+SKILL.md sections:
+- Overview (Hook development challenges)
+- Phase 1: Deduplication Strategy (session vs content)
+- Phase 2: Cross-Platform Compatibility (BSD awk)
+- Phase 3: Testing and Validation
+- Examples (bash scripts)
+- Templates (hook-template.sh)
+```
+
+---
+
+### Cluster 3: Performance Optimization
+**Size**: 2 insights
+**Similarity Score**: 0.62 (medium)
+**Recommended Complexity**: Minimal
+**Recommended Pattern**: Phase-based
+
+**Insights**:
+1. "CPU Usage Prevention in Vitest"
+   - Keywords: vitest, cpu, performance, threads, watch-mode
+   - Category: testing
+   - Date: 2025-11-03
+   - Length: 45 lines
+   - Has code examples: Yes (npm scripts)
+
+2. "Bundle Size Optimization"
+   - Keywords: bundle, size, webpack, optimization, tree-shaking
+   - Category: performance
+   - Date: 2025-11-09
+   - Length: 92 lines
+   - Has code examples: Yes (webpack config)
+
+**Shared Keywords**: performance (2), optimization (2), build (2)
+
+**Cluster Characteristics**:
+- Different categories (testing, performance)
+- Temporal span: 6 days
+- Common theme: Build/test performance
+- Total code examples: 3 blocks
+- Actionable items: 6
+
+**Suggested Skill Name**: "build-performance-optimizer"
+
+**Suggested Description**: "Analyzes build and test performance issues and generates optimization recommendations for Vitest test runs and webpack bundle sizes"
+
+**Skill Structure Recommendation**:
+```
+SKILL.md sections:
+- Overview (Performance impact on DX)
+- Phase 1: Test Performance Analysis
+- Phase 2: Bundle Size Analysis
+- Phase 3: Optimization Implementation
+- Important Reminders (check before test, monitor bundle)
+- Examples (npm scripts, webpack config)
+```
+
+---
+
+### Standalone Insights
+
+#### Standalone 1: "Hook State Management Patterns"
+**Quality Score**: 0.85 (high - qualifies for standalone skill)
+**Category**: hooks-and-events
+**Date**: 2025-11-07
+**Length**: 134 lines
+**Code Examples**: Yes (3 blocks)
+
+**Why Standalone**:
+- Doesn't cluster with other hook insights (different focus)
+- High quality with comprehensive coverage
+- Self-contained topic (state management)
+- Multiple actionable patterns
+
+**Suggested Skill Name**: "hook-state-manager"
+
+**Suggested Description**: "Automates state management setup for Claude Code hooks with persistent storage, cleanup, and safe concurrency patterns"
+
+---
+
+#### Standalone 2: "Architecture Decision Records"
+**Quality Score**: 0.82 (high - qualifies for standalone skill)
+**Category**: architecture
+**Date**: 2025-11-12
+**Length**: 156 lines
+**Code Examples**: Yes (template)
+
+**Why Standalone**:
+- Unique topic (no closely related insights to cluster with)
+- High quality with complete template
+- Valuable for documentation
+- Industry best practice
+
+**Suggested Skill Name**: "adr-documentation-helper"
+
+**Suggested Description**: "Guides creation of Architecture Decision Records (ADRs) following industry standards with templates and integration with project documentation"
+
+---
+
+### Low-Quality Insights (Not Recommended for Skills)
+
+#### "Git Branch Naming Convention"
+**Quality Score**: 0.42 (low)
+**Category**: version-control
+**Reason for Exclusion**: Too simple, covered by existing conventions, no unique value
+
+#### "TypeScript Strict Mode Benefits"
+**Quality Score**: 0.38 (low)
+**Category**: typescript
+**Reason for Exclusion**: Common knowledge, well-documented elsewhere, not actionable enough
+
+---
+
+## User Decision Points
+
+At this stage, the skill would present the following options to the user:
+
+**Option 1: Generate All Recommended Skills** (5 skills)
+- user-focused-testing-guide (Cluster 1)
+- hook-script-development-guide (Cluster 2)
+- build-performance-optimizer (Cluster 3)
+- hook-state-manager (Standalone 1)
+- adr-documentation-helper (Standalone 2)
+
+**Option 2: Select Specific Skills**
+- User picks which clusters/standalones to convert
+
+**Option 3: Modify Clusters**
+- Split large clusters
+- Merge small clusters
+- Recategorize insights
+- Adjust complexity levels
+
+**Option 4: Tune Thresholds and Retry**
+- Increase cluster_minimum (0.6 → 0.7) for tighter clusters
+- Decrease standalone_quality (0.8 → 0.7) for more standalone skills
+
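+The two thresholds named in Option 4 can be pictured as simple cutoffs. The sketch below is illustrative only: `meets_threshold` and `classify_standalone` are hypothetical names, and the defaults (0.6 and 0.8) are taken from the values shown above rather than from the generator's actual code.
+
+```bash
+# Hypothetical helpers; the generator's real implementation may differ.
+CLUSTER_MINIMUM="${CLUSTER_MINIMUM:-0.6}"        # minimum similarity to form a cluster
+STANDALONE_QUALITY="${STANDALONE_QUALITY:-0.8}"  # minimum quality for a standalone skill
+
+# Succeeds (exit 0) when score >= threshold; awk does the float comparison.
+meets_threshold() {
+  awk -v s="$1" -v t="$2" 'BEGIN { exit !(s >= t) }'
+}
+
+# Prints "standalone" or "excluded" for an unclustered insight's quality score.
+classify_standalone() {
+  if meets_threshold "$1" "$STANDALONE_QUALITY"; then
+    echo "standalone"
+  else
+    echo "excluded"
+  fi
+}
+
+classify_standalone 0.85   # prints "standalone"
+classify_standalone 0.42   # prints "excluded"
+```
+
+Raising `CLUSTER_MINIMUM` or lowering `STANDALONE_QUALITY` before a retry only moves these cutoffs; the scores themselves are unchanged.
+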
+---
+
+## Proceeding to Phase 3
+
+If user selects "user-focused-testing-guide" to generate, the skill would proceed to Phase 3: Interactive Skill Design with the following proposal:
+
+**Skill Design Proposal**:
+- Name: `user-focused-testing-guide`
+- Description: "Use PROACTIVELY when writing tests to ensure user-centric testing strategy following Testing Trophy methodology and React Testing Library best practices"
+- Complexity: Standard
+- Pattern: Validation
+- Structure:
+  - SKILL.md with validation workflow
+  - data/insights-reference.md with 3 source insights
+  - examples/query-examples.md with semantic query patterns
+  - templates/test-checklist.md with testing checklist
+
+User can then customize before generation begins.
+
+---
+
+**This example demonstrates**:
+1. How clustering groups related insights
+2. What information is presented for each cluster
+3. How standalone insights are identified
+4. Why some insights are excluded
+5. What decisions users can make
+6. How the process flows into Phase 3
--- a/skills/insight-skill-generator/examples/example-generated-skill/CHANGELOG.md
+++ b/skills/insight-skill-generator/examples/example-generated-skill/CHANGELOG.md
@@ -0,0 +1,24 @@
+# Changelog
+
+## [0.1.0] - 2025-11-16
+
+### Added
+- Initial release
+- Generated from 1 insight (Hook Deduplication Session Management)
+- Phase 1: Choose Deduplication Strategy
+- Phase 2: Implement Content-Based Deduplication
+- Phase 3: Implement Hash Rotation
+- Phase 4: Testing and Validation
+- Code examples for bash hook implementation
+- Troubleshooting section
+
+### Features
+- Content-based deduplication using SHA256 hashes
+- Session-independent duplicate detection
+- Efficient hash storage with rotation
+- State management best practices
+
+### Generated By
+- insight-skill-generator v0.1.0
+- Source category: hooks-and-events
+- Original insight date: 2025-11-03
--- a/skills/insight-skill-generator/examples/example-generated-skill/README.md
+++ b/skills/insight-skill-generator/examples/example-generated-skill/README.md
@@ -0,0 +1,51 @@
+# Hook Deduplication Guide
+
+Implement robust content-based deduplication for Claude Code hooks.
+
+## Overview
+
+This skill guides you through implementing SHA256 hash-based deduplication to prevent duplicate insights or data from being stored across sessions.
+
+## When to Use
+
+**Trigger Phrases**:
+- "implement hook deduplication"
+- "prevent duplicate insights in hooks"
+- "content-based deduplication for hooks"
+
+## Quick Start
+
+```
+# Test the skill
+You: "I need to add deduplication to my hook to prevent storing the same insight twice"
+
+Claude: [Activates hook-deduplication-guide]
+- Explains content-based vs session-based strategies
+- Guides implementation of SHA256 hashing
+- Shows hash rotation to prevent file bloat
+- Provides testing validation
+```
+
+## What You'll Get
+
+- Content-based deduplication using SHA256
+- Efficient hash storage with rotation
+- Testing and validation guidance
+- Best practices for hook state management
+
+## Installation
+
+```bash
+# This is an example generated by insight-skill-generator
+# Copy to your skills directory if you want to use it
+cp -r examples/example-generated-skill ~/.claude/skills/hook-deduplication-guide
+```
+
+## Learn More
+
+See [SKILL.md](SKILL.md) for complete workflow documentation.
+
+---
+
+**Generated by**: insight-skill-generator v0.1.0
+**Source**: 1 insight from hooks-and-events category
--- a/skills/insight-skill-generator/examples/example-generated-skill/SKILL.md
+++ b/skills/insight-skill-generator/examples/example-generated-skill/SKILL.md
@@ -0,0 +1,342 @@
+---
+name: hook-deduplication-guide
+description: Use PROACTIVELY when developing Claude Code hooks to implement content-based deduplication and prevent duplicate insight storage across sessions
+---
+
+# Hook Deduplication Guide
+
+## Overview
+
+This skill guides you through implementing robust deduplication for Claude Code hooks, using content-based hashing instead of session-based tracking. This prevents duplicate insights from being stored while still allowing multiple unique insights per session.
+
+**Based on 1 insight**:
+- Hook Deduplication Session Management (hooks-and-events, 2025-11-03)
+
+**Key Capabilities**:
+- Content-based deduplication using SHA256 hashes
+- Session-independent duplicate detection
+- Efficient hash storage with rotation
+- State management best practices
+
+## When to Use This Skill
+
+**Trigger Phrases**:
+- "implement hook deduplication"
+- "prevent duplicate insights in hooks"
+- "content-based deduplication for hooks"
+- "hook state management patterns"
+
+**Use Cases**:
+- Developing new Claude Code hooks that store data
+- Refactoring hooks to prevent duplicates
+- Implementing efficient state management for hooks
+- Debugging duplicate data issues in hooks
+
+**Do NOT use when**:
+- Creating hooks that don't store data (read-only hooks)
+- Session-based deduplication is actually desired
+- Hook doesn't run frequently enough to need deduplication
+
+## Response Style
+
+Educational and practical - explain the why behind content-based vs. session-based deduplication, then guide implementation with code examples.
+
+---
+
+## Workflow
+
+### Phase 1: Choose Deduplication Strategy
+
+**Purpose**: Determine whether content-based or session-based deduplication is appropriate.
+
+**Steps**:
+
+1. **Assess hook behavior**:
+   - How often does the hook run? (per message, per session, per event)
+   - What data is being stored? (insights, logs, metrics)
+   - Is the same content likely to appear across sessions?
+
+2. **Evaluate deduplication needs**:
+   - **Content-based**: Use when the same insight/data might appear in different sessions
+     - Example: the extract-explanatory-insights hook (same insight might appear in multiple conversations)
+   - **Session-based**: Use when duplicates should only be prevented within a session
+     - Example: Error logging (same error in different sessions should be logged)
+
+3. **Recommend strategy**:
+   - For insights/lessons-learned: Content-based (SHA256 hashing)
+   - For session logs/events: Session-based (session ID tracking)
+   - For unique events: No deduplication needed
+
+**Output**: Clear recommendation on deduplication strategy.
+
+**Common Issues**:
+- **Unsure which to use**: Default to content-based for data that's meant to be unique (insights, documentation)
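+- **Performance concerns**: Hashing typical insight-sized content with SHA256 is cheap; in a shell hook the cost is dominated by starting the `sha256sum` process, not by the hash itself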
+
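+To make the difference between the two strategies concrete, the sketch below builds the deduplication key each one would record. It is an illustration under assumptions: `dedup_key` and its arguments are hypothetical names, and `sha256sum` is assumed to be installed (some systems provide `shasum -a 256` instead).
+
+```bash
+# Hypothetical helper: prints the key a hook would record for a piece of content.
+#   strategy "content": key depends only on the content
+#   strategy "session": key depends on the session ID and the content
+dedup_key() {
+  local strategy="$1" session_id="$2" content="$3"
+  local hash
+  hash=$(printf '%s' "$content" | sha256sum | awk '{print $1}')
+  if [ "$strategy" = "session" ]; then
+    printf '%s:%s\n' "$session_id" "$hash"
+  else
+    printf '%s\n' "$hash"
+  fi
+}
+
+# Same content seen in two different sessions:
+dedup_key content session-a "Same insight"   # identical key in both sessions
+dedup_key content session-b "Same insight"
+dedup_key session session-a "Same insight"   # keys differ per session
+dedup_key session session-b "Same insight"
+```
+
+With content keys the second session is recognized as a duplicate; with session keys it is recorded again, which is the desired behavior for per-session logs.
+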
+---
+
+### Phase 2: Implement Content-Based Deduplication
+
+**Purpose**: Set up SHA256 hash-based deduplication with state management.
+
+**Steps**:
+
+1. **Create state directory**:
+   ```bash
+   mkdir -p ~/.claude/state/hook-state/
+   ```
+
+2. **Initialize hash storage file**:
+   ```bash
+   HASH_FILE="$HOME/.claude/state/hook-state/content-hashes.txt"
+   touch "$HASH_FILE"
+   ```
+
+3. **Implement hash generation**:
+   ```bash
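+   # Generate SHA256 hash of content.
+   # printf '%s' avoids the portability quirks of `echo -n`.
+   # If sha256sum is not installed, `shasum -a 256` is a common substitute.
+   compute_content_hash() {
+     local content="$1"
+     printf '%s' "$content" | sha256sum | awk '{print $1}'
+   }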
+   ```
+
+4. **Check for duplicates**:
+   ```bash
+   # Returns 0 (success) if content is a duplicate, 1 if it is new,
+   # so that `if is_duplicate ...; then` takes the "skip" branch for duplicates
+   is_duplicate() {
+     local content="$1"
+     local content_hash
+     content_hash=$(compute_content_hash "$content")
+
+     # grep -Fxq exits 0 only when the hash matches a whole line
+     grep -Fxq "$content_hash" "$HASH_FILE"
+   }
+   ```
+
+5. **Store hash after processing**:
+   ```bash
+   store_content_hash() {
+     local content="$1"
+     local content_hash=$(compute_content_hash "$content")
+     echo "$content_hash" >> "$HASH_FILE"
+   }
+   ```
+
+6. **Integrate into hook**:
+   ```bash
+   # In your hook script
+   content="extracted insight or data"
+
+   if is_duplicate "$content"; then
+     # Skip - duplicate content
+     echo "Duplicate detected, skipping..." >&2
+     exit 0
+   fi
+
+   # Process new content
+   process_content "$content"
+
+   # Store hash to prevent future duplicates
+   store_content_hash "$content"
+   ```
+
+**Output**: Working content-based deduplication in your hook.
+
+**Common Issues**:
+- **Hash file grows too large**: Implement rotation (see Phase 3)
+- **Missed duplicates (false negatives)**: Normalize content before hashing (whitespace, formatting) so that formatting-only differences do not produce different hashes
+
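+Normalization can be as simple as collapsing whitespace before hashing. The following is a minimal sketch, not the hook's actual code: `normalize_content` and `compute_normalized_hash` are hypothetical names, and the choice to collapse all whitespace runs (including newlines) is an assumption that may be too aggressive for some content.
+
+```bash
+# Hypothetical normalization: collapse whitespace runs (including newlines)
+# into single spaces and trim both ends.
+normalize_content() {
+  printf '%s' "$1" | tr -s '[:space:]' ' ' | sed -e 's/^ //' -e 's/ $//'
+}
+
+# Hash the normalized form so formatting-only differences share one hash.
+compute_normalized_hash() {
+  normalize_content "$1" | sha256sum | awk '{print $1}'
+}
+
+a=$(compute_normalized_hash "Use  content hashing ")
+b=$(compute_normalized_hash "Use content hashing")
+[ "$a" = "$b" ] && echo "same hash"
+```
+
+Whatever rules you choose, apply the same normalization in both the duplicate check and the store step; otherwise the two will compute different hashes for the same content.
+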
+---
+
+### Phase 3: Implement Hash Rotation
+
+**Purpose**: Prevent hash file from growing indefinitely.
+
+**Steps**:
+
+1. **Set rotation limit**:
+   ```bash
+   MAX_HASHES=10000  # Keep last 10,000 hashes
+   ```
+
+2. **Implement rotation logic**:
+   ```bash
+   rotate_hash_file() {
+     local hash_file="$1"
+     local max_hashes="${2:-10000}"
+
+     # Count current hashes
+     local current_count=$(wc -l < "$hash_file")
+
+     # Rotate if needed
+     if [ "$current_count" -gt "$max_hashes" ]; then
+       tail -n "$max_hashes" "$hash_file" > "${hash_file}.tmp"
+       mv "${hash_file}.tmp" "$hash_file"
+       echo "Rotated hash file: kept last $max_hashes hashes" >&2
+     fi
+   }
+   ```
+
+3. **Call rotation periodically**:
+   ```bash
+   # After storing new hash
+   store_content_hash "$content"
+   rotate_hash_file "$HASH_FILE" 10000
+   ```
+
+**Output**: Self-maintaining hash storage with bounded size.
+
+**Common Issues**:
+- **Rotation too aggressive**: Increase MAX_HASHES
+- **Rotation too infrequent**: Consider checking count before every append
+
+---
+
+### Phase 4: Testing and Validation
+
+**Purpose**: Verify deduplication works correctly.
+
+**Steps**:
+
+1. **Test duplicate detection**:
+   ```bash
+   # First run - should process
+   echo "Test insight" | your_hook.sh
+   # Check: Content was processed
+
+   # Second run - should skip
+   echo "Test insight" | your_hook.sh
+   # Check: Duplicate detected message
+   ```
+
+2. **Test multiple unique items**:
+   ```bash
+   echo "Insight 1" | your_hook.sh  # Processed
+   echo "Insight 2" | your_hook.sh  # Processed
+   echo "Insight 3" | your_hook.sh  # Processed
+   echo "Insight 1" | your_hook.sh  # Skipped (duplicate)
+   ```
+
+3. **Verify hash file**:
+   ```bash
+   cat ~/.claude/state/hook-state/content-hashes.txt
+   # Should show one hash per unique input, with no repeated lines
+   # (4 hashes if you ran steps 1 and 2 starting from an empty file)
+   ```
+
+4. **Test rotation**:
+   ```bash
+   # Generate more than MAX_HASHES entries
+   for i in {1..10500}; do
+     echo "Insight $i" | your_hook.sh
+   done
+
+   # Verify file size bounded
+   wc -l ~/.claude/state/hook-state/content-hashes.txt
+   # Should be at most 10000, not 10500
+   ```
+
+**Output**: Confirmed working deduplication with proper rotation.
+
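+The checks above can also be bundled into one self-contained smoke test. This sketch makes several assumptions: it redefines the helpers locally (with `is_duplicate` succeeding when a hash is already recorded), writes to a temporary file instead of `~/.claude/state/hook-state/`, uses a deliberately small `MAX_HASHES`, and substitutes a hypothetical `handle` function for a real hook script.
+
+```bash
+# Self-contained smoke test using a throwaway hash file (not ~/.claude).
+HASH_FILE="$(mktemp)"
+MAX_HASHES=5   # small limit so rotation is easy to observe
+
+compute_content_hash() { printf '%s' "$1" | sha256sum | awk '{print $1}'; }
+
+# Succeeds (exit 0) when the content's hash is already recorded.
+is_duplicate() { grep -Fxq "$(compute_content_hash "$1")" "$HASH_FILE"; }
+
+store_content_hash() {
+  compute_content_hash "$1" >> "$HASH_FILE"
+  # Keep only the newest MAX_HASHES lines.
+  if [ "$(wc -l < "$HASH_FILE")" -gt "$MAX_HASHES" ]; then
+    tail -n "$MAX_HASHES" "$HASH_FILE" > "${HASH_FILE}.tmp"
+    mv "${HASH_FILE}.tmp" "$HASH_FILE"
+  fi
+}
+
+# Stand-in for a hook: store new content, skip duplicates.
+handle() {
+  if is_duplicate "$1"; then echo "skipped"; else store_content_hash "$1"; echo "stored"; fi
+}
+
+handle "Insight 1"   # stored
+handle "Insight 1"   # skipped
+for i in 2 3 4 5 6 7 8; do handle "Insight $i" > /dev/null; done
+wc -l < "$HASH_FILE" # 5: rotation kept only the newest hashes
+```
+
+Note the trade-off this exposes: once a hash has been rotated out (here, the one for "Insight 1"), the same content will be stored again, so `MAX_HASHES` should comfortably exceed the number of insights you expect to recur.
+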
+---
+
+## Reference Materials
+
+- [Original Insight](data/insights-reference.md) - Full context on hook deduplication patterns
+
+---
+
+## Important Reminders
+
+- **Use content-based deduplication for insights/documentation** - prevents duplicates across sessions
+- **Use session-based deduplication for logs/events** - same event in different sessions is meaningful
+- **Normalize content before hashing** - whitespace differences shouldn't create false negatives
+- **Implement rotation** - prevent unbounded hash file growth
+- **Hash storage location**: `~/.claude/state/hook-state/` (not project-specific)
+- **SHA256 is fast** - no performance concerns for typical hook data
+- **Test both paths** - verify both new content and duplicates work correctly
+
+**Warnings**:
+- ⚠️  **Do not use session ID alone** - it blocks additional unique insights within a session and does not catch the same insight recurring in a different session
+- ⚠️  **Do not skip rotation** - hash file will grow indefinitely
+- ⚠️  **Do not hash before normalization** - formatting changes will cause false negatives
+
+---
+
+## Best Practices
+
+1. **Choose the Right Strategy**: Content-based for unique data, session-based for session-specific events
+2. **Normalize Before Hashing**: Strip whitespace, lowercase if appropriate, consistent formatting
+3. **Efficient Storage**: Use grep -Fxq for fast hash lookups (fixed-string, line-match, quiet)
+4. **Bounded Growth**: Implement rotation to prevent file bloat
+5. **Clear Logging**: Log when duplicates are detected for debugging
+6. **State Location**: Use ~/.claude/state/hook-state/ for cross-project state
+
+---
+
+## Troubleshooting
+
+### Duplicates not being detected
+
+**Symptoms**: Same content processed multiple times
+
+**Solution**:
+1. Check hash file exists and is writable
+2. Verify store_content_hash is called after processing
+3. Check content normalization (whitespace differences)
+4. Verify grep command uses -Fxq flags
+
+**Prevention**: Test deduplication immediately after implementation
+
+---
+
+### Hash file growing too large
+
+**Symptoms**: Hash file exceeds MAX_HASHES significantly
+
+**Solution**:
+1. Verify rotate_hash_file is called
+2. Check MAX_HASHES value is reasonable
+3. Manually rotate if needed: `tail -n 10000 hashes.txt > hashes.tmp && mv hashes.tmp hashes.txt`
+
+**Prevention**: Call rotation after every hash storage
+
+---
+
+### False positives (new content marked as duplicate)
+
+**Symptoms**: Different content being skipped
+
+**Solution**:
+1. Check for hash collisions (extremely unlikely with SHA256)
+2. Verify content is actually different
+3. Check normalization isn't too aggressive
+4. Review recent hashes in file
+
+**Prevention**: Use consistent normalization, test with diverse content
+
+---
+
+## Next Steps
+
+After implementing deduplication:
+1. Monitor hash file growth over time
+2. Tune MAX_HASHES based on usage patterns
+3. Consider adding metrics (duplicates prevented, storage size)
+4. Share pattern with team for other hooks
+
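+For the metrics suggestion, a counter file next to the hash file is one simple option. The sketch below is an assumption-heavy illustration: `STATE_DIR`, `SKIP_COUNT_FILE`, `record_duplicate_skipped`, and `report_dedup_metrics` are hypothetical names, and a temporary directory stands in for the real state directory.
+
+```bash
+# Hypothetical metrics helpers; file locations are placeholders.
+STATE_DIR="${STATE_DIR:-$(mktemp -d)}"
+HASH_FILE="$STATE_DIR/content-hashes.txt"
+SKIP_COUNT_FILE="$STATE_DIR/duplicates-skipped.count"
+touch "$HASH_FILE"
+
+# Increment the running count of skipped duplicates.
+record_duplicate_skipped() {
+  local n=0
+  [ -f "$SKIP_COUNT_FILE" ] && n=$(cat "$SKIP_COUNT_FILE")
+  echo $((n + 1)) > "$SKIP_COUNT_FILE"
+}
+
+# Print a one-line summary: stored hashes, skipped duplicates, file size in bytes.
+report_dedup_metrics() {
+  local hashes skipped=0 bytes
+  hashes=$(wc -l < "$HASH_FILE" | tr -d ' ')
+  [ -f "$SKIP_COUNT_FILE" ] && skipped=$(cat "$SKIP_COUNT_FILE")
+  bytes=$(wc -c < "$HASH_FILE" | tr -d ' ')
+  echo "hashes=$hashes duplicates_skipped=$skipped bytes=$bytes"
+}
+
+record_duplicate_skipped
+record_duplicate_skipped
+report_dedup_metrics   # with a fresh state directory: hashes=0 duplicates_skipped=2 bytes=0
+```
+
+In a hook, `record_duplicate_skipped` would be called in the branch that skips a duplicate, just before exiting.
+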
+---
+
+## Metadata
+
+**Source Insights**:
+- Session: abc123-session-id
+- Date: 2025-11-03
+- Category: hooks-and-events
+- File: docs/lessons-learned/hooks-and-events/2025-11-03-hook-deduplication.md
+
+**Skill Version**: 0.1.0
+**Generated**: 2025-11-16
+**Last Updated**: 2025-11-16
--- a/skills/insight-skill-generator/examples/example-generated-skill/data/insights-reference.md
+++ b/skills/insight-skill-generator/examples/example-generated-skill/data/insights-reference.md
@@ -0,0 +1,116 @@
+# Insights Reference: hook-deduplication-guide
+
+This document contains the original insight from Claude Code's Explanatory output style that was used to create the **Hook Deduplication Guide** skill.
+
+## Overview
+
+**Total Insights**: 1
+**Date Range**: 2025-11-03
+**Categories**: hooks-and-events
+**Sessions**: 1 unique session
+
+---
+
+## 1. Hook Deduplication Session Management
+
+**Metadata**:
+- **Date**: 2025-11-03
+- **Category**: hooks-and-events
+- **Session**: abc123-session-id
+- **Source File**: docs/lessons-learned/hooks-and-events/2025-11-03-hook-deduplication.md
+
+**Original Content**:
+
+The extract-explanatory-insights hook initially used session-based deduplication, which prevented multiple insights from the same session from being stored. It also had a second limitation: if the same insight appeared in different sessions, it was stored again each time, because the session IDs differed.
+
+By switching to content-based deduplication using SHA256 hashing, we can:
+
+1. **Allow multiple unique insights per session** - Different insights in the same conversation are all preserved
+2. **Prevent true duplicates across sessions** - The same insight appearing in multiple conversations is stored only once
+3. **Maintain efficient storage** - Hash file rotation keeps storage bounded
+
+The implementation involves:
+
+**Hash Generation**:
+```bash
+compute_content_hash() {
+  local content="$1"
+  echo -n "$content" | sha256sum | awk '{print $1}'
+}
+```
+
+**Duplicate Detection**:
+```bash
+# Returns 0 (success) if content is a duplicate, 1 if it is new
+is_duplicate() {
+  local content="$1"
+  local content_hash
+  content_hash=$(compute_content_hash "$content")
+
+  # grep -Fxq exits 0 only when the hash matches a whole line
+  grep -Fxq "$content_hash" "$HASH_FILE"
+}
+```
+
+**Hash Storage with Rotation**:
+```bash
+store_content_hash() {
+  local content="$1"
+  local content_hash=$(compute_content_hash "$content")
+  echo "$content_hash" >> "$HASH_FILE"
+
+  # Rotate if file exceeds MAX_HASHES
+  local count=$(wc -l < "$HASH_FILE")
+  if [ "$count" -gt 10000 ]; then
+    tail -n 10000 "$HASH_FILE" > "${HASH_FILE}.tmp"
+    mv "${HASH_FILE}.tmp" "$HASH_FILE"
+  fi
+}
+```
+
+This approach provides the best of both worlds: session independence and true deduplication based on content, not session boundaries.
+
+---
+
+## How This Insight Informs the Skill
+
+### Hook Deduplication Session Management → Phase-Based Workflow
+
+The insight's structure (problem → solution → implementation) maps directly to the skill's phases:
+
+- **Problem Description** → Phase 1: Choose Deduplication Strategy
+  - Explains why session-based is insufficient
+  - Defines when content-based is needed
+
+- **Solution Explanation** → Phase 2: Implement Content-Based Deduplication
+  - Hash generation logic
+  - Duplicate detection mechanism
+  - State file management
+
+- **Implementation Details** → Phase 3: Implement Hash Rotation
+  - Rotation logic to prevent unbounded growth
+  - MAX_HASHES configuration
+
+- **Code Examples** → All phases
+  - Bash functions extracted and integrated into workflow steps
+
+---
+
+## Additional Context
+
+**Why This Insight Was Selected**:
+
+This insight was selected for skill generation because it:
+1. Provides a complete, actionable pattern
+2. Includes working code examples
+3. Solves a common problem in hook development
+4. Is generally applicable (not project-specific)
+5. Has clear benefits over the naive approach
+
+**Quality Score**: 0.85 (high - qualified for standalone skill)
+
+---
+
+**Generated**: 2025-11-16
+**Last Updated**: 2025-11-16
--- a/skills/insight-skill-generator/examples/example-generated-skill/plugin.json
+++ b/skills/insight-skill-generator/examples/example-generated-skill/plugin.json
@@ -0,0 +1,15 @@
+{
+  "name": "hook-deduplication-guide",
+  "version": "0.1.0",
+  "description": "Use PROACTIVELY when developing Claude Code hooks to implement content-based deduplication and prevent duplicate insight storage across sessions",
+  "type": "skill",
+  "author": "Connor",
+  "category": "productivity",
+  "tags": [
+    "hooks",
+    "deduplication",
+    "state-management",
+    "bash",
+    "generated-from-insights"
+  ]
+}