Initial commit

Zhongwei Li
2025-11-29 18:16:51 +08:00
commit 4e8a12140c
88 changed files with 17078 additions and 0 deletions

skills/cc-insights/.gitignore vendored Normal file

@@ -0,0 +1,29 @@
# Ignore processed data and cache
.processed/
*.db
*.db-journal
# Python cache
__pycache__/
*.py[cod]
*$py.class
*.so
# Virtual environment
venv/
env/
ENV/
# IDE
.vscode/
.idea/
*.swp
*.swo
# OS
.DS_Store
Thumbs.db
# Logs
*.log
logs/

skills/cc-insights/CHANGELOG.md Normal file

@@ -0,0 +1,14 @@
# Changelog
## 0.2.0
- Refactored to Anthropic progressive disclosure pattern
- Updated description with "Use PROACTIVELY when..." format
- Extracted detailed content to modes/, workflow/, reference/ directories
## 0.1.0
- Initial skill release
- RAG-powered conversation analysis
- Semantic search and insight reports
- Optional Next.js dashboard

skills/cc-insights/README.md Normal file

@@ -0,0 +1,500 @@
# cc-insights: Claude Code Conversation Insights
Automatically process, search, and analyze your Claude Code conversation history using RAG-powered semantic search and intelligent pattern detection.
## Overview
This skill transforms your Claude Code conversations into actionable insights without any manual effort. It automatically processes conversations stored in `~/.claude/projects/`, builds a searchable knowledge base with semantic understanding, and generates insightful reports about your development patterns.
### Key Features
- 🔍 **RAG-Powered Semantic Search**: Find conversations by meaning, not just keywords
- 📊 **Automatic Insight Reports**: Detect patterns, file hotspots, and tool usage analytics
- 📈 **Activity Trends**: Understand development patterns over time
- 💡 **Knowledge Extraction**: Surface recurring topics and solutions
- 🎯 **Zero Manual Effort**: Fully automatic processing of existing conversations
- 🚀 **Fast Performance**: <1s search, <10s report generation
## Quick Start
### 1. Installation
```bash
# Navigate to the skill directory
cd .claude/skills/cc-insights
# Install Python dependencies
pip install -r requirements.txt
# Verify installation
python scripts/conversation-processor.py --help
```
### 2. Initial Setup
Process your existing conversations:
```bash
# Process all conversations for current project
python scripts/conversation-processor.py --project-name annex --verbose --stats
# Build semantic search index
python scripts/rag-indexer.py --verbose --stats
```
This one-time setup will:
- Parse all JSONL files from `~/.claude/projects/`
- Extract metadata (files, tools, topics, timestamps)
- Build SQLite database for fast queries
- Generate vector embeddings for semantic search
- Create ChromaDB index
**Time**: ~1-2 minutes for 100 conversations
### 3. Search Conversations
```bash
# Semantic search (understands meaning)
python scripts/search-conversations.py "fixing authentication bugs"
# Search by file
python scripts/search-conversations.py --file "src/auth/token.ts"
# Search by tool
python scripts/search-conversations.py --tool "Write"
# Keyword search with date filter
python scripts/search-conversations.py "refactoring" --keyword --date-from 2025-10-01
```
### 4. Generate Insights
```bash
# Weekly activity report
python scripts/insight-generator.py weekly --verbose
# File heatmap (most modified files)
python scripts/insight-generator.py file-heatmap
# Tool usage analytics
python scripts/insight-generator.py tool-usage
# Save report to file
python scripts/insight-generator.py weekly --output weekly-report.md
```
## Usage via Skill
Once set up, you can interact with the skill naturally:
```
User: "Search conversations about React performance optimization"
→ Returns top semantic matches with context
User: "Generate insights for the past week"
→ Creates comprehensive weekly report with metrics
User: "Show me files I've modified most often"
→ Generates file heatmap with recommendations
```
## Architecture
```
.claude/skills/cc-insights/
├── SKILL.md # Skill definition for Claude
├── README.md # This file
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore rules
├── scripts/ # Core functionality
│ ├── conversation-processor.py # Parse JSONL, extract metadata
│ ├── rag-indexer.py # Build vector embeddings
│ ├── search-conversations.py # Search interface
│ └── insight-generator.py # Report generation
├── templates/ # Report templates
│ └── weekly-summary.md
└── .processed/ # Generated data (gitignored)
├── conversations.db # SQLite metadata
└── embeddings/ # ChromaDB vector store
├── chroma.sqlite3
└── [embedding data]
```
## Scripts Reference
### conversation-processor.py
Parse JSONL files and extract conversation metadata.
**Usage:**
```bash
python scripts/conversation-processor.py [OPTIONS]
Options:
--project-name TEXT Project to process (default: annex)
--db-path PATH Database path
--reindex Reprocess all (ignore cache)
--verbose Show detailed logs
--stats Display statistics after processing
```
**What it does:**
- Scans `~/.claude/projects/[project]/*.jsonl`
- Decodes base64-encoded conversation content
- Extracts: messages, files, tools, topics, timestamps
- Stores in SQLite with indexes for fast queries
- Tracks processing state for incremental updates
**Output:**
- SQLite database at `.processed/conversations.db`
- Processing state for incremental updates
### rag-indexer.py
Build vector embeddings for semantic search.
**Usage:**
```bash
python scripts/rag-indexer.py [OPTIONS]
Options:
--db-path PATH Database path
--embeddings-dir PATH ChromaDB directory
--model TEXT Embedding model (default: all-MiniLM-L6-v2)
--rebuild Rebuild entire index
--batch-size INT Batch size (default: 32)
--verbose Show detailed logs
--stats Display statistics
--test-search TEXT Test search with query
```
**What it does:**
- Reads conversations from SQLite
- Generates embeddings using sentence-transformers
- Stores in ChromaDB for similarity search
- Supports incremental indexing (only new conversations)
**Models:**
- `all-MiniLM-L6-v2` (default): Fast, good quality, 384 dimensions
- `all-mpnet-base-v2`: Higher quality, slower, 768 dimensions
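For orientation, the indexing step amounts to roughly the following. This is a minimal sketch, not the actual `rag-indexer.py`: the collection name, the paths, and exactly which text gets embedded are assumptions.
```python
# Minimal sketch of the embed-and-index step (run from the skill directory).
# Not the actual rag-indexer.py; collection name and embedded text are assumptions.
import sqlite3

import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")                # fast default model
client = chromadb.PersistentClient(path=".processed/embeddings")
collection = client.get_or_create_collection("conversations")  # assumed name

conn = sqlite3.connect(".processed/conversations.db")
rows = conn.execute(
    "SELECT id, first_user_message, topics FROM conversations"
).fetchall()

ids = [row[0] for row in rows]
docs = [f"{row[1] or ''} {row[2] or ''}" for row in rows]      # text to embed (assumed)

if ids:
    vectors = model.encode(docs, batch_size=32, show_progress_bar=False)
    collection.add(ids=ids, documents=docs, embeddings=vectors.tolist())
    print(f"Indexed {len(ids)} conversations")
```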
### search-conversations.py
Search conversations with semantic + metadata filters.
**Usage:**
```bash
python scripts/search-conversations.py QUERY [OPTIONS]
Options:
--semantic/--keyword Semantic (RAG) or keyword search (default: semantic)
--file TEXT Filter by file pattern
--tool TEXT Search by tool name
--date-from TEXT Start date (ISO format)
--date-to TEXT End date (ISO format)
--limit INT Max results (default: 10)
--format TEXT Output: text|json|markdown (default: text)
--verbose Show detailed logs
```
**Examples:**
```bash
# Semantic search
python scripts/search-conversations.py "authentication bugs"
# Filter by file
python scripts/search-conversations.py "React optimization" --file "src/components"
# Search by tool
python scripts/search-conversations.py --tool "Edit"
# Date range
python scripts/search-conversations.py "deployment" --date-from 2025-10-01
# JSON output for integration
python scripts/search-conversations.py "testing" --format json > results.json
```
### insight-generator.py
Generate pattern-based reports and analytics.
**Usage:**
```bash
python scripts/insight-generator.py REPORT_TYPE [OPTIONS]
Report Types:
weekly Weekly activity summary
file-heatmap File modification heatmap
tool-usage Tool usage analytics
Options:
--date-from TEXT Start date (ISO format)
--date-to TEXT End date (ISO format)
--output PATH Save to file (default: stdout)
--verbose Show detailed logs
```
**Examples:**
```bash
# Weekly report (last 7 days)
python scripts/insight-generator.py weekly
# Custom date range
python scripts/insight-generator.py weekly --date-from 2025-10-01 --date-to 2025-10-15
# File heatmap with output
python scripts/insight-generator.py file-heatmap --output heatmap.md
# Tool analytics
python scripts/insight-generator.py tool-usage
```
## Data Storage
All processed data is stored locally in `.processed/` (gitignored):
### SQLite Database (`conversations.db`)
**Tables:**
- `conversations`: Main metadata (timestamps, messages, topics)
- `file_interactions`: File-level interactions (read, write, edit)
- `tool_usage`: Tool usage counts per conversation
- `processing_state`: Tracks processed files for incremental updates
**Indexes:**
- `idx_timestamp`: Fast date-range queries
- `idx_project`: Filter by project
- `idx_file_path`: File-based searches
- `idx_tool_name`: Tool usage queries
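Because everything is plain SQLite, you can also query the metadata directly. A minimal read-only sketch (the 30-day window and output format are illustrative):
```python
# Minimal sketch: query the metadata database directly (run from the skill directory).
import json
import sqlite3

conn = sqlite3.connect("file:.processed/conversations.db?mode=ro", uri=True)
conn.row_factory = sqlite3.Row

# Files touched in the most conversations over the last 30 days
rows = conn.execute("""
    SELECT fi.file_path, COUNT(DISTINCT fi.conversation_id) AS conversations
    FROM file_interactions AS fi
    JOIN conversations AS c ON c.id = fi.conversation_id
    WHERE c.timestamp >= date('now', '-30 days')
    GROUP BY fi.file_path
    ORDER BY conversations DESC
    LIMIT 5
""").fetchall()
for row in rows:
    print(f"{row['file_path']}: {row['conversations']} conversations")

# Topics are stored as JSON arrays in the conversations table
for row in conn.execute("SELECT topics FROM conversations LIMIT 3"):
    print(json.loads(row["topics"] or "[]"))
```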
### ChromaDB Vector Store (`embeddings/`)
**Contents:**
- Vector embeddings (384-dimensional by default)
- Document text for retrieval
- Metadata for filtering
- HNSW index for fast similarity search
**Performance:**
- <1 second for semantic search
- Handles 10,000+ conversations efficiently
- ~100MB per 1,000 conversations
## Performance
| Operation | Time | Notes |
|-----------|------|-------|
| Initial processing (100 convs) | ~30s | One-time setup |
| Initial indexing (100 convs) | ~60s | One-time setup |
| Incremental processing | <5s | Only new conversations |
| Semantic search | <1s | Top 10 results |
| Keyword search | <0.1s | SQLite FTS |
| Weekly report generation | <10s | Includes visualizations |
## Troubleshooting
### "Database not found"
**Problem:** Scripts can't find `conversations.db`
**Solution:**
```bash
# Run processor first
python scripts/conversation-processor.py --project-name annex --verbose
```
### "No conversations found"
**Problem:** Project name doesn't match or no JSONL files
**Solution:**
```bash
# Check project directories
ls ~/.claude/projects/
# Use correct project name (may be encoded)
python scripts/conversation-processor.py --project-name [actual-name] --verbose
```
### "ImportError: sentence_transformers"
**Problem:** Dependencies not installed
**Solution:**
```bash
# Install requirements
pip install -r requirements.txt
# Or individually
pip install sentence-transformers chromadb jinja2 click python-dateutil
```
### "Slow embedding generation"
**Problem:** Large number of conversations
**Solution:**
```bash
# Use smaller batch size
python scripts/rag-indexer.py --batch-size 16
# Or make sure you're using the fast default model (all-MiniLM-L6-v2) rather than all-mpnet-base-v2
python scripts/rag-indexer.py --model all-MiniLM-L6-v2
```
### "Out of memory"
**Problem:** Too many conversations processed at once
**Solution:**
```bash
# Smaller batch size
python scripts/rag-indexer.py --batch-size 8
# Or process the history in stages (e.g., temporarily move older JSONL files
# out of ~/.claude/projects/<project>/ and add them back in batches)
```
## Incremental Updates
The system automatically handles incremental updates:
1. **Conversation Processor**: Tracks file hashes in `processing_state` table
- Only reprocesses changed files
- Detects new JSONL files automatically
2. **RAG Indexer**: Checks ChromaDB for existing IDs
- Only indexes new conversations
- Skips already-embedded conversations
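The change detection behind step 1 amounts to hashing each JSONL file and comparing it with the hash stored in `processing_state`. A minimal sketch of that check:
```python
# Minimal sketch of the change-detection check used for incremental updates.
import hashlib
import sqlite3
from pathlib import Path

def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large JSONL files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def needs_processing(conn: sqlite3.Connection, path: Path) -> bool:
    """A file is (re)processed only if it was never seen or its content changed."""
    row = conn.execute(
        "SELECT file_hash FROM processing_state WHERE file_path = ?",
        (str(path),),
    ).fetchone()
    return row is None or row[0] != file_sha256(path)
```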
**Recommended workflow:**
```bash
# Daily/weekly: Run both for new conversations
python scripts/conversation-processor.py --project-name annex
python scripts/rag-indexer.py
# Takes <5s if only a few new conversations
```
## Integration Examples
### Search from command line
```bash
# Quick search function in .bashrc or .zshrc
cc-search() {
python ~/.claude/skills/cc-insights/scripts/search-conversations.py "$@"
}
# Usage
cc-search "authentication bugs"
```
### Generate weekly report automatically
```bash
# Add to crontab for weekly reports
0 9 * * MON cd ~/.claude/skills/cc-insights && python scripts/insight-generator.py weekly --output ~/reports/weekly-$(date +\%Y-\%m-\%d).md
```
### Export data for external tools
```bash
# Export to JSON
python scripts/search-conversations.py "testing" --format json | jq
# Export metadata
sqlite3 -json .processed/conversations.db "SELECT * FROM conversations" > export.json
```
## Privacy & Security
- **Local-only**: All data stays on your machine
- **No external APIs**: Embeddings generated locally
- **Project-scoped**: Only accesses current project
- **Gitignored**: `.processed/` excluded from version control
- **Sensitive data**: Review before sharing reports (may contain secrets)
## Requirements
### Python Dependencies
- `sentence-transformers>=2.2.0` - Semantic embeddings
- `chromadb>=0.4.0` - Vector database
- `jinja2>=3.1.0` - Template engine
- `click>=8.1.0` - CLI framework
- `python-dateutil>=2.8.0` - Date utilities
### System Requirements
- Python 3.8+
- 500MB disk space (for 1,000 conversations)
- 2GB RAM (for embedding generation)
## Limitations
- **Read-only**: Analyzes existing conversations, doesn't modify them
- **Single project**: Designed for per-project insights (not cross-project)
- **Static analysis**: Analyzes saved conversations, not real-time
- **Embedding quality**: Good but not GPT-4 level (local models)
- **JSONL format**: Depends on Claude Code's internal storage format
## Future Enhancements
Potential additions (not currently implemented):
- [ ] Cross-project analytics dashboard
- [ ] AI-powered summarization with LLM
- [ ] Slack/Discord integration for weekly reports
- [ ] Git commit correlation
- [ ] VS Code extension
- [ ] Web dashboard (Next.js)
- [ ] Confluence/Notion export
- [ ] Custom embedding models
## FAQ
**Q: How often should I rebuild the index?**
A: Never, unless changing models. Use incremental updates.
**Q: Can I change the embedding model?**
A: Yes, use `--model` flag with rag-indexer.py, then `--rebuild`.
**Q: Does this work with incognito mode?**
A: No, incognito conversations aren't saved to JSONL files.
**Q: Can I share reports with my team?**
A: Yes, but review for sensitive information first (API keys, secrets).
**Q: What if Claude Code changes the JSONL format?**
A: The processor may need updates. File an issue if parsing breaks.
**Q: Can I delete old conversations?**
A: Yes, remove JSONL files and run `--reindex` to rebuild.
## Contributing
Contributions welcome! Areas to improve:
- Additional report templates
- Better pattern detection algorithms
- Performance optimizations
- Web dashboard implementation
- Documentation improvements
## License
MIT License - See repository root for details
## Support
For issues or questions:
1. Check this README and SKILL.md
2. Review script `--help` output
3. Run with `--verbose` to see detailed logs
4. Check `.processed/logs/` if created
5. Open an issue in the repository
---
**Built for Connor's annex project**
*Zero-effort conversation intelligence*

skills/cc-insights/SKILL.md Normal file

@@ -0,0 +1,120 @@
---
name: cc-insights
description: Use PROACTIVELY when searching past Claude Code conversations, analyzing development patterns, or generating activity reports. Automatically processes conversation history from the project, enables RAG-powered semantic search, and generates insight reports with pattern detection. Provides optional dashboard for visualization. Not for real-time analysis or cross-project searches.
---
# Claude Code Insights
Unlock the hidden value in your Claude Code conversation history through automatic processing, semantic search, and intelligent insight generation.
## Overview
This skill automatically analyzes your project's Claude Code conversations (stored in `~/.claude/projects/[project]/*.jsonl`) to provide:
- **RAG-Powered Semantic Search**: Find conversations by meaning, not just keywords
- **Automatic Insight Reports**: Pattern detection, file hotspots, tool usage analytics
- **Activity Trends**: Understand your development patterns over time
- **Knowledge Extraction**: Surface recurring topics, solutions, and best practices
- **Zero Manual Effort**: Fully automatic processing of existing conversations
## When to Use This Skill
**Trigger Phrases**:
- "Find conversations about [topic]"
- "Generate weekly insights report"
- "What files do I modify most often?"
- "Launch the insights dashboard"
- "Export insights as [format]"
**Use Cases**:
- Search past conversations by topic or file
- Generate activity reports and insights
- Understand development patterns over time
- Extract knowledge and recurring solutions
- Visualize activity with interactive dashboard
**NOT for**:
- Real-time conversation analysis (analyzes history only)
- Conversations from other projects (project-specific)
- Manual conversation logging (automatic only)
## Response Style
**Informative and Visual**: Present search results with relevance scores and snippets. Generate reports with clear metrics and ASCII visualizations. Offer to save or export results.
## Mode Selection
| User Request | Mode | Reference |
|--------------|------|-----------|
| "Find conversations about X" | Search | `modes/mode-1-search.md` |
| "Generate insights report" | Insights | `modes/mode-2-insights.md` |
| "Launch dashboard" | Dashboard | `modes/mode-3-dashboard.md` |
| "Export as JSON/CSV/HTML" | Export | `modes/mode-4-export.md` |
## Mode Overview
### Mode 1: Search Conversations
Find past conversations using semantic search (by meaning) or metadata search (by files/tools).
**Details**: `modes/mode-1-search.md`
### Mode 2: Generate Insights
Analyze patterns and generate reports with file hotspots, tool usage, and knowledge highlights.
**Details**: `modes/mode-2-insights.md`
### Mode 3: Interactive Dashboard
Launch a Next.js web dashboard for rich visualization and exploration.
**Details**: `modes/mode-3-dashboard.md`
### Mode 4: Export and Integration
Export insights as Markdown, JSON, CSV, or HTML for sharing and integration.
**Details**: `modes/mode-4-export.md`
## Initial Setup
**First time usage**:
1. Install dependencies: `pip install -r requirements.txt`
2. Run initial processing (automatic on first use)
3. Build embeddings (one-time, ~1-2 min)
4. Ready to search and analyze!
**What happens automatically**:
- Scans `~/.claude/projects/[current-project]/*.jsonl`
- Extracts and indexes conversation metadata
- Builds vector embeddings for semantic search
- Creates SQLite database for fast queries
## Important Reminders
- **Automatic processing**: Skill updates index on each use (incremental)
- **First run is slow**: Embedding creation takes 1-2 minutes
- **Project-specific**: Analyzes only current project's conversations
- **Dashboard requires Node.js**: v18+ for the Next.js dashboard
- **ChromaDB for search**: Vector similarity search for semantic queries
## Limitations
- Only analyzes JSONL conversation files from Claude Code
- Requires sentence-transformers for embedding creation
- Dashboard is local only (localhost:3000)
- Large conversation histories may take longer to process initially
## Reference Materials
| Resource | Purpose |
|----------|---------|
| `modes/*.md` | Detailed mode instructions |
| `reference/troubleshooting.md` | Common issues and fixes |
| `scripts/` | Processing and indexing scripts |
| `dashboard/` | Next.js dashboard application |
## Success Criteria
- [ ] Conversations processed and indexed
- [ ] Embeddings built for semantic search
- [ ] Search returns relevant results
- [ ] Insights reports generated correctly
- [ ] Dashboard launches and displays data
---
**Tech Stack**: Python (processing), SQLite (metadata), ChromaDB (vectors), Next.js (dashboard)

skills/cc-insights/modes/mode-1-search.md Normal file

@@ -0,0 +1,68 @@
# Mode 1: Search Conversations
**When to use**: Find specific past conversations
## Trigger Phrases
- "Find conversations about React performance optimization"
- "Search for times I fixed authentication bugs"
- "Show me conversations that modified Auth.tsx"
- "What conversations mention TypeScript strict mode?"
## Process
1. User asks to search for a topic or file
2. Skill performs RAG semantic search
3. Returns ranked results with context snippets
4. Optionally show full conversation details
## Search Types
### Semantic Search (by meaning)
```
User: "Find conversations about fixing bugs related to user authentication"
Skill: [Performs RAG search]
Found 3 conversations:
1. "Debug JWT token expiration" (Oct 24)
2. "Fix OAuth redirect loop" (Oct 20)
3. "Implement session timeout handling" (Oct 18)
```
### Metadata Search (by files/tools)
```
User: "Show conversations that modified src/auth/token.ts"
Skill: [Queries SQLite metadata]
Found 5 conversations touching src/auth/token.ts:
1. "Implement token refresh logic" (Oct 25)
2. "Add token validation" (Oct 22)
...
```
### Time-based Search
```
User: "What did I work on last week?"
Skill: [Queries by date range]
Last week (Oct 19-25) you had 12 conversations:
- 5 about authentication features
- 3 about bug fixes
- 2 about testing
- 2 about refactoring
```
## Output Format
```
Found 5 conversations about "React performance optimization":
1. [Similarity: 0.89] "Optimize UserProfile re-renders" (Oct 25, 2025)
Files: src/components/UserProfile.tsx, src/hooks/useUser.ts
Snippet: "...implemented useMemo to prevent unnecessary re-renders..."
2. [Similarity: 0.82] "Fix dashboard performance issues" (Oct 20, 2025)
Files: src/pages/Dashboard.tsx
Snippet: "...React.memo wrapper reduced render count by 60%..."
[View full conversations? Type the number]
```
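Under the hood, this mode is essentially a ChromaDB similarity query over locally generated embeddings. A minimal sketch (the embeddings path and collection name are assumptions; `search-conversations.py` layers metadata filters and formatting on top):
```python
# Minimal sketch of a semantic query against the local vector store.
import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path=".processed/embeddings")   # assumed path
collection = client.get_or_create_collection("conversations")      # assumed name

query = "fixing bugs related to user authentication"
results = collection.query(
    query_embeddings=[model.encode(query).tolist()],
    n_results=5,
)

for conv_id, distance in zip(results["ids"][0], results["distances"][0]):
    # Lower distance means a closer match; the skill presents this as a similarity score.
    print(f"{conv_id}: distance {distance:.3f}")
```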

skills/cc-insights/modes/mode-2-insights.md Normal file

@@ -0,0 +1,75 @@
# Mode 2: Generate Insights
**When to use**: Understand patterns and trends
## Trigger Phrases
- "Generate weekly insights report"
- "Show me my most active files this month"
- "What patterns do you see in my conversations?"
- "Create a project summary report"
## Process
1. User asks for insights on a timeframe
2. Skill analyzes metadata and patterns
3. Creates markdown report with visualizations
4. Offers to save report to file
## Report Sections
- **Executive Summary**: Key metrics
- **Activity Timeline**: Conversations over time
- **File Hotspots**: Most modified files
- **Tool Usage Breakdown**: Which tools you use most
- **Topic Clusters**: Recurring themes
- **Knowledge Highlights**: Key solutions and learnings
## Example Output
```markdown
# Weekly Insights (Oct 19-25, 2025)
## Overview
- 12 conversations
- 8 active days
- 23 files modified
- 45 tool uses
## Top Files
1. src/auth/token.ts (5 modifications)
2. src/components/Login.tsx (3 modifications)
3. src/api/auth.ts (3 modifications)
## Activity Pattern
Mon: ████████ 4 conversations
Tue: ██████ 3 conversations
Wed: ██████ 3 conversations
Thu: ████ 2 conversations
Fri: ████ 2 conversations
## Key Topics
- Authentication (6 conversations)
- Testing (3 conversations)
- Bug fixes (2 conversations)
## Knowledge Highlights
- Implemented JWT refresh token pattern
- Added React Testing Library for auth components
- Fixed OAuth redirect edge case
[Save report to file? Y/n]
```
## File-Centric Analysis
```
# File Hotspots (All Time)
🔥🔥🔥 src/auth/token.ts (15 conversations)
🔥🔥 src/components/Login.tsx (9 conversations)
🔥🔥 src/api/auth.ts (8 conversations)
🔥 src/hooks/useAuth.ts (6 conversations)
Insight: Authentication module is your most active area.
Consider: Review token.ts for refactoring opportunities.
```
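The activity bars in these reports are plain text scaled against the busiest day. A minimal sketch of how such a chart can be derived from the metadata database (the 7-day window and 20-character bar width are illustrative):
```python
# Minimal sketch: conversations-per-day bars, like the Activity Pattern above.
import sqlite3

conn = sqlite3.connect(".processed/conversations.db")
rows = conn.execute("""
    SELECT DATE(timestamp) AS day, COUNT(*) AS n
    FROM conversations
    WHERE timestamp >= date('now', '-7 days')
    GROUP BY day
    ORDER BY day
""").fetchall()

peak = max((n for _, n in rows), default=1)
for day, n in rows:
    bar = "█" * max(1, round(n / peak * 20))   # scale each bar to at most 20 chars
    print(f"{day}  {bar} {n}")
```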

skills/cc-insights/modes/mode-3-dashboard.md Normal file

@@ -0,0 +1,69 @@
# Mode 3: Interactive Dashboard
**When to use**: Rich visual exploration and ongoing monitoring
## Trigger Phrases
- "Launch the insights dashboard"
- "Start the visualization server"
- "Show me the interactive insights"
## Process
1. User asks to start the dashboard
2. Skill launches Next.js dev server
3. Opens browser to http://localhost:3000
4. Provides real-time data from SQLite + ChromaDB
## Dashboard Pages
### Home
- Timeline of recent conversations
- Activity stats and quick metrics
- Summary cards
### Search
- Interactive semantic + keyword search interface
- Real-time results
- Filter by date, files, tools
### Insights
- Auto-generated reports with interactive charts
- Trend visualizations
- Pattern detection results
### Files
- File-centric view of all conversations
- Click to see all conversations touching a file
- Modification frequency heatmap
### Analytics
- Deep-dive into patterns and trends
- Tool usage statistics
- Activity patterns by time of day/week
## Tech Stack
- **Framework**: Next.js 15 with React Server Components
- **Styling**: Tailwind CSS
- **Charts**: Recharts
- **Data**: SQLite + ChromaDB
- **URL**: http://localhost:3000
## Starting the Dashboard
```bash
# Navigate to dashboard directory
cd ~/.claude/skills/cc-insights/dashboard
# Install dependencies (first time only)
npm install
# Start development server
npm run dev
```
Then open http://localhost:3000 in your browser.
## Stopping the Dashboard
Press `Ctrl+C` in the terminal or close the terminal window.

skills/cc-insights/modes/mode-4-export.md Normal file

@@ -0,0 +1,68 @@
# Mode 4: Export and Integration
**When to use**: Share insights or integrate with other tools
## Trigger Phrases
- "Export weekly insights as markdown"
- "Save conversation metadata as JSON"
- "Generate HTML report for sharing"
## Process
1. User asks to export in a specific format
2. Skill generates formatted output
3. Saves to specified location
## Export Formats
### Markdown
Human-readable reports with formatting.
```bash
Export location: ./insights/weekly-report.md
```
### JSON
Machine-readable data for integration with other tools.
```json
{
"period": "2025-10-19 to 2025-10-25",
"conversations": 12,
"files_modified": 23,
"tool_uses": 45,
"top_files": [
{"path": "src/auth/token.ts", "count": 5},
{"path": "src/components/Login.tsx", "count": 3}
],
"topics": ["authentication", "testing", "bug fixes"]
}
```
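A minimal sketch of how a JSON export along these lines could be assembled from the metadata database (field names follow the example above; the skill's actual export logic may differ):
```python
# Minimal sketch: build a JSON summary for a date range from conversations.db.
import json
import sqlite3

DATE_FROM, DATE_TO = "2025-10-19", "2025-10-25"
conn = sqlite3.connect(".processed/conversations.db")

conversations = conn.execute(
    "SELECT COUNT(*) FROM conversations WHERE DATE(timestamp) BETWEEN ? AND ?",
    (DATE_FROM, DATE_TO),
).fetchone()[0]

top_files = conn.execute("""
    SELECT fi.file_path, COUNT(DISTINCT fi.conversation_id) AS n
    FROM file_interactions AS fi
    JOIN conversations AS c ON c.id = fi.conversation_id
    WHERE DATE(c.timestamp) BETWEEN ? AND ?
    GROUP BY fi.file_path ORDER BY n DESC LIMIT 5
""", (DATE_FROM, DATE_TO)).fetchall()

report = {
    "period": f"{DATE_FROM} to {DATE_TO}",
    "conversations": conversations,
    "top_files": [{"path": path, "count": n} for path, n in top_files],
}
print(json.dumps(report, indent=2))
```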
### CSV
Activity data for spreadsheets.
```csv
date,conversation_count,files_modified,tool_uses
2025-10-19,4,8,12
2025-10-20,3,6,9
...
```
### HTML
Standalone report with styling for sharing.
```html
<!-- Self-contained report with inline CSS -->
<!-- Can be opened in any browser -->
```
## Example Usage
```
User: "Export this month's insights as JSON for my dashboard"
Skill: [Generates JSON report]
Exported to: ./insights/october-2025.json
Contains: 45 conversations, 89 files, 156 tool uses
```

skills/cc-insights/reference/troubleshooting.md Normal file

@@ -0,0 +1,84 @@
# Troubleshooting Guide
## No conversations found
**Symptoms**: Skill reports 0 conversations
**Solution**:
1. Verify you're in a project with conversation history
2. Check `~/.claude/projects/` for your project folder
3. Ensure JSONL files exist in the project folder
4. Run initial processing if first time
---
## Search returns no results
**Symptoms**: Semantic search finds nothing
**Solution**:
1. Check embeddings were built (look for ChromaDB folder)
2. Rebuild embeddings: skill will do this automatically
3. Try broader search terms
4. Use metadata search for specific files
---
## Dashboard won't start
**Symptoms**: Error when launching dashboard
**Solution**:
1. Check Node.js is installed (v18+)
2. Run `npm install` in dashboard directory
3. Check port 3000 is available
4. Find the process holding the port (`lsof -i :3000`) and kill it if needed
---
## Slow processing
**Symptoms**: Initial setup takes very long
**Solution**:
1. First-time embedding creation is slow (1-2 min normal)
2. Subsequent runs use incremental updates (fast)
3. For very large history, consider limiting date range
---
## Missing dependencies
**Symptoms**: Import errors when running
**Solution**:
```bash
pip install -r requirements.txt
```
Required packages:
- sentence-transformers
- chromadb
- jinja2
- click
- python-dateutil
- sqlite3 (built-in)
---
## Embeddings out of date
**Symptoms**: New conversations not appearing in search
**Solution**:
1. Skill automatically updates on each use
2. Force rebuild: delete ChromaDB folder and rerun
3. Check incremental processing completed
---
## Database locked
**Symptoms**: SQLite errors about locked database
**Solution**:
1. Close other processes using the database
2. Close the dashboard if running
3. Wait a moment and retry
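If you only need to read while another process holds the database (for example, the dashboard), a read-only connection with a longer busy timeout usually avoids the error. A minimal sketch:
```python
# Minimal sketch: read the database while another process (e.g. the dashboard) has it open.
import sqlite3

conn = sqlite3.connect(
    "file:.processed/conversations.db?mode=ro",  # read-only: never takes a write lock
    uri=True,
    timeout=30,  # wait up to 30 seconds instead of failing immediately on a lock
)
count = conn.execute("SELECT COUNT(*) FROM conversations").fetchone()[0]
print(f"{count} conversations indexed")
```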

skills/cc-insights/requirements.txt Normal file

@@ -0,0 +1,16 @@
# Core dependencies for cc-insights skill
# Sentence transformers for semantic embeddings
sentence-transformers>=2.2.0
# ChromaDB for vector database
chromadb>=0.4.0
# Template engine for reports
jinja2>=3.1.0
# CLI framework
click>=8.1.0
# Date utilities
python-dateutil>=2.8.0

skills/cc-insights/scripts/conversation-processor.py Normal file

@@ -0,0 +1,634 @@
#!/usr/bin/env python3
"""
Conversation Processor for Claude Code Insights
Parses JSONL conversation files from ~/.claude/projects/, extracts metadata,
and stores in SQLite for fast querying. Supports incremental processing.
"""
import json
import sqlite3
import base64
import hashlib
from pathlib import Path
from datetime import datetime
from typing import List, Dict, Any, Optional
from dataclasses import dataclass, asdict
import click
import re
@dataclass
class ConversationMetadata:
"""Structured conversation metadata"""
id: str
project_path: str
timestamp: datetime
message_count: int
user_messages: int
assistant_messages: int
files_read: List[str]
files_written: List[str]
files_edited: List[str]
tools_used: List[str]
topics: List[str]
first_user_message: str
last_assistant_message: str
conversation_hash: str
file_size_bytes: int
processed_at: datetime
class ConversationProcessor:
"""Processes Claude Code conversation JSONL files"""
def __init__(self, db_path: Path, verbose: bool = False):
self.db_path = db_path
self.verbose = verbose
self.conn = None
self._init_database()
def _init_database(self):
"""Initialize SQLite database with schema"""
self.db_path.parent.mkdir(parents=True, exist_ok=True)
self.conn = sqlite3.connect(str(self.db_path))
self.conn.row_factory = sqlite3.Row
# Create tables
self.conn.executescript("""
CREATE TABLE IF NOT EXISTS conversations (
id TEXT PRIMARY KEY,
project_path TEXT NOT NULL,
timestamp TEXT NOT NULL,
message_count INTEGER NOT NULL,
user_messages INTEGER NOT NULL,
assistant_messages INTEGER NOT NULL,
files_read TEXT, -- JSON array
files_written TEXT, -- JSON array
files_edited TEXT, -- JSON array
tools_used TEXT, -- JSON array
topics TEXT, -- JSON array
first_user_message TEXT,
last_assistant_message TEXT,
conversation_hash TEXT UNIQUE NOT NULL,
file_size_bytes INTEGER NOT NULL,
processed_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_timestamp ON conversations(timestamp);
CREATE INDEX IF NOT EXISTS idx_project ON conversations(project_path);
CREATE INDEX IF NOT EXISTS idx_processed ON conversations(processed_at);
CREATE TABLE IF NOT EXISTS file_interactions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
conversation_id TEXT NOT NULL,
file_path TEXT NOT NULL,
interaction_type TEXT NOT NULL, -- read, write, edit
FOREIGN KEY (conversation_id) REFERENCES conversations(id)
);
CREATE INDEX IF NOT EXISTS idx_file_path ON file_interactions(file_path);
CREATE INDEX IF NOT EXISTS idx_conversation ON file_interactions(conversation_id);
CREATE TABLE IF NOT EXISTS tool_usage (
id INTEGER PRIMARY KEY AUTOINCREMENT,
conversation_id TEXT NOT NULL,
tool_name TEXT NOT NULL,
usage_count INTEGER NOT NULL,
FOREIGN KEY (conversation_id) REFERENCES conversations(id)
);
CREATE INDEX IF NOT EXISTS idx_tool_name ON tool_usage(tool_name);
CREATE TABLE IF NOT EXISTS processing_state (
file_path TEXT PRIMARY KEY,
last_modified TEXT NOT NULL,
last_processed TEXT NOT NULL,
file_hash TEXT NOT NULL
);
""")
self.conn.commit()
def _log(self, message: str):
"""Log if verbose mode is enabled"""
if self.verbose:
print(f"[{datetime.now().strftime('%H:%M:%S')}] {message}")
def _compute_file_hash(self, file_path: Path) -> str:
"""Compute SHA256 hash of file for change detection"""
sha256 = hashlib.sha256()
with open(file_path, 'rb') as f:
for chunk in iter(lambda: f.read(8192), b''):
sha256.update(chunk)
return sha256.hexdigest()
def _needs_processing(self, file_path: Path, reindex: bool = False) -> bool:
"""Check if file needs (re)processing"""
if reindex:
return True
file_stat = file_path.stat()
file_hash = self._compute_file_hash(file_path)
cursor = self.conn.execute(
"SELECT last_modified, file_hash FROM processing_state WHERE file_path = ?",
(str(file_path),)
)
row = cursor.fetchone()
if not row:
return True # Never processed
last_modified, stored_hash = row
return stored_hash != file_hash # File changed
def _update_processing_state(self, file_path: Path):
"""Update processing state for file"""
file_hash = self._compute_file_hash(file_path)
last_modified = datetime.fromtimestamp(file_path.stat().st_mtime).isoformat()
self.conn.execute("""
INSERT OR REPLACE INTO processing_state (file_path, last_modified, last_processed, file_hash)
VALUES (?, ?, ?, ?)
""", (str(file_path), last_modified, datetime.now().isoformat(), file_hash))
def _parse_jsonl_file(self, file_path: Path) -> List[Dict[str, Any]]:
"""Parse JSONL file with base64-encoded content"""
messages = []
with open(file_path, 'r', encoding='utf-8') as f:
for line_num, line in enumerate(f, 1):
try:
if line.strip():
data = json.loads(line)
messages.append(data)
except json.JSONDecodeError as e:
self._log(f"Warning: Failed to parse line {line_num} in {file_path.name}: {e}")
return messages
def _extract_tool_uses(self, content: str) -> List[str]:
"""Extract tool names from assistant messages"""
tools = []
# Look for tool use patterns in content
tool_patterns = [
r'"name":\s*"([A-Z][a-zA-Z]+)"', # JSON tool calls
r'<tool>([A-Z][a-zA-Z]+)</tool>', # XML tool calls
]
for pattern in tool_patterns:
matches = re.findall(pattern, content)
tools.extend(matches)
return list(set(tools)) # Unique tools
def _extract_file_paths(self, content: str) -> Dict[str, List[str]]:
"""Extract file paths and their interaction types from content"""
files = {
'read': [],
'written': [],
'edited': []
}
# Patterns for file operations
read_patterns = [
r'Reading\s+(.+\.(?:py|js|ts|tsx|jsx|md|json|yaml|yml))',
r'Read\s+file:\s*(.+)',
r'"file_path":\s*"([^"]+)"', # Tool parameters
]
write_patterns = [
r'Writing\s+(.+\.(?:py|js|ts|tsx|jsx|md|json|yaml|yml))',
r'Created\s+file:\s*(.+)',
r'Write\s+(.+)',
]
edit_patterns = [
r'Editing\s+(.+\.(?:py|js|ts|tsx|jsx|md|json|yaml|yml))',
r'Modified\s+file:\s*(.+)',
r'Edit\s+(.+)',
]
for pattern in read_patterns:
files['read'].extend(re.findall(pattern, content, re.IGNORECASE))
for pattern in write_patterns:
files['written'].extend(re.findall(pattern, content, re.IGNORECASE))
for pattern in edit_patterns:
files['edited'].extend(re.findall(pattern, content, re.IGNORECASE))
# Deduplicate and clean
for key in files:
files[key] = list(set(path.strip() for path in files[key]))
return files
def _extract_topics(self, messages: List[Dict[str, Any]]) -> List[str]:
"""Extract topic keywords from conversation"""
# Combine first user message and some assistant responses
text = ""
user_count = 0
for msg in messages:
msg_type = msg.get('type', '')
# Handle event-stream format
if msg_type == 'user':
message_dict = msg.get('message', {})
content = message_dict.get('content', '') if isinstance(message_dict, dict) else ''
# Handle content that's a list (content blocks)
if isinstance(content, list):
message_content = ' '.join(
block.get('text', '') if isinstance(block, dict) and block.get('type') == 'text' else ''
for block in content
)
else:
message_content = content
if message_content:
text += message_content + " "
user_count += 1
if user_count >= 3: # Only use first few user messages
break
elif msg_type == 'assistant' and user_count < 3:
# Also include some assistant responses for context
message_dict = msg.get('message', {})
content = message_dict.get('content', '') if isinstance(message_dict, dict) else ''
# Handle content that's a list (content blocks)
if isinstance(content, list):
message_content = ' '.join(
block.get('text', '') if isinstance(block, dict) and block.get('type') == 'text' else ''
for block in content
)
else:
message_content = content
if message_content:
text += message_content[:200] + " " # Just a snippet
# Extract common programming keywords
keywords = []
common_topics = [
'authentication', 'auth', 'login', 'jwt', 'oauth',
'testing', 'test', 'unit test', 'integration test',
'bug', 'fix', 'error', 'issue', 'debug',
'performance', 'optimization', 'optimize', 'slow',
'refactor', 'refactoring', 'cleanup',
'feature', 'implement', 'add', 'create',
'database', 'sql', 'query', 'schema',
'api', 'endpoint', 'rest', 'graphql',
'typescript', 'javascript', 'react', 'node',
'css', 'style', 'styling', 'tailwind',
'security', 'vulnerability', 'xss', 'csrf',
'deploy', 'deployment', 'ci/cd', 'docker',
]
text_lower = text.lower()
for topic in common_topics:
if topic in text_lower:
keywords.append(topic)
return list(set(keywords))[:10] # Max 10 topics
def _process_conversation(self, file_path: Path, messages: List[Dict[str, Any]]) -> ConversationMetadata:
"""Extract metadata from parsed conversation"""
# Generate conversation ID from filename
conv_id = file_path.stem
# Count messages by role
user_messages = 0
assistant_messages = 0
first_user_msg = ""
last_assistant_msg = ""
all_tools = []
all_files = {'read': [], 'written': [], 'edited': []}
for msg in messages:
msg_type = msg.get('type', '')
# Handle event-stream format
if msg_type == 'user':
user_messages += 1
message_dict = msg.get('message', {})
content = message_dict.get('content', '') if isinstance(message_dict, dict) else ''
# Handle content that's a list (content blocks)
if isinstance(content, list):
message_content = ' '.join(
block.get('text', '') if isinstance(block, dict) and block.get('type') == 'text' else ''
for block in content
)
else:
message_content = content
if not first_user_msg and message_content:
first_user_msg = message_content[:500] # First 500 chars
elif msg_type == 'assistant':
assistant_messages += 1
message_dict = msg.get('message', {})
content = message_dict.get('content', '') if isinstance(message_dict, dict) else ''
# Handle content that's a list (content blocks)
if isinstance(content, list):
message_content = ' '.join(
block.get('text', '') if isinstance(block, dict) and block.get('type') == 'text' else ''
for block in content
)
# Also extract tools from content blocks
for block in content:
if isinstance(block, dict) and block.get('type') == 'tool_use':
tool_name = block.get('name', '')
if tool_name:
all_tools.append(tool_name)
else:
message_content = content
if message_content:
last_assistant_msg = message_content[:500]
# Extract tools and files from assistant messages
tools = self._extract_tool_uses(message_content)
all_tools.extend(tools)
files = self._extract_file_paths(message_content)
for key in all_files:
all_files[key].extend(files[key])
# Deduplicate
all_tools = list(set(all_tools))
for key in all_files:
all_files[key] = list(set(all_files[key]))
# Extract topics
topics = self._extract_topics(messages)
# File stats
file_stat = file_path.stat()
# Compute conversation hash
conv_hash = self._compute_file_hash(file_path)
# Extract timestamp (from filename or file mtime)
try:
# Try to get timestamp from file modification time
timestamp = datetime.fromtimestamp(file_stat.st_mtime)
except Exception:
timestamp = datetime.now()
return ConversationMetadata(
id=conv_id,
project_path=str(file_path.parent),
timestamp=timestamp,
message_count=len(messages),
user_messages=user_messages,
assistant_messages=assistant_messages,
files_read=all_files['read'],
files_written=all_files['written'],
files_edited=all_files['edited'],
tools_used=all_tools,
topics=topics,
first_user_message=first_user_msg,
last_assistant_message=last_assistant_msg,
conversation_hash=conv_hash,
file_size_bytes=file_stat.st_size,
processed_at=datetime.now()
)
def _store_conversation(self, metadata: ConversationMetadata):
"""Store conversation metadata in database"""
# Store main conversation record
self.conn.execute("""
INSERT OR REPLACE INTO conversations
(id, project_path, timestamp, message_count, user_messages, assistant_messages,
files_read, files_written, files_edited, tools_used, topics,
first_user_message, last_assistant_message, conversation_hash,
file_size_bytes, processed_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""", (
metadata.id,
metadata.project_path,
metadata.timestamp.isoformat(),
metadata.message_count,
metadata.user_messages,
metadata.assistant_messages,
json.dumps(metadata.files_read),
json.dumps(metadata.files_written),
json.dumps(metadata.files_edited),
json.dumps(metadata.tools_used),
json.dumps(metadata.topics),
metadata.first_user_message,
metadata.last_assistant_message,
metadata.conversation_hash,
metadata.file_size_bytes,
metadata.processed_at.isoformat()
))
# Store file interactions
self.conn.execute(
"DELETE FROM file_interactions WHERE conversation_id = ?",
(metadata.id,)
)
for file_path in metadata.files_read:
self.conn.execute(
"INSERT INTO file_interactions (conversation_id, file_path, interaction_type) VALUES (?, ?, ?)",
(metadata.id, file_path, 'read')
)
for file_path in metadata.files_written:
self.conn.execute(
"INSERT INTO file_interactions (conversation_id, file_path, interaction_type) VALUES (?, ?, ?)",
(metadata.id, file_path, 'write')
)
for file_path in metadata.files_edited:
self.conn.execute(
"INSERT INTO file_interactions (conversation_id, file_path, interaction_type) VALUES (?, ?, ?)",
(metadata.id, file_path, 'edit')
)
# Store tool usage
self.conn.execute(
"DELETE FROM tool_usage WHERE conversation_id = ?",
(metadata.id,)
)
for tool_name in metadata.tools_used:
self.conn.execute(
"INSERT INTO tool_usage (conversation_id, tool_name, usage_count) VALUES (?, ?, ?)",
(metadata.id, tool_name, 1)
)
def process_file(self, file_path: Path, reindex: bool = False) -> bool:
"""Process a single conversation file"""
if not self._needs_processing(file_path, reindex):
self._log(f"Skipping {file_path.name} (already processed)")
return False
self._log(f"Processing {file_path.name}...")
try:
# Parse JSONL
messages = self._parse_jsonl_file(file_path)
if not messages:
self._log(f"Warning: No messages found in {file_path.name}")
return False
# Extract metadata
metadata = self._process_conversation(file_path, messages)
# Store in database
self._store_conversation(metadata)
# Update processing state
self._update_processing_state(file_path)
self.conn.commit()
self._log(f"✓ Processed {file_path.name}: {metadata.message_count} messages, "
f"{metadata.user_messages} user, {metadata.assistant_messages} assistant")
return True
except Exception as e:
self._log(f"Error processing {file_path.name}: {e}")
import traceback
if self.verbose:
traceback.print_exc()
return False
def process_project(self, project_name: str, reindex: bool = False) -> int:
"""Process all conversations for a project"""
# Find conversation files
claude_projects = Path.home() / ".claude" / "projects"
if not claude_projects.exists():
self._log(f"Error: {claude_projects} does not exist")
return 0
# Find project directory (may be encoded)
project_dirs = list(claude_projects.glob(f"*{project_name}*"))
if not project_dirs:
self._log(f"Error: No project directory found matching '{project_name}'")
return 0
if len(project_dirs) > 1:
self._log(f"Warning: Multiple project directories found, using {project_dirs[0].name}")
project_dir = project_dirs[0]
self._log(f"Processing conversations from {project_dir}")
# Find all JSONL files
jsonl_files = list(project_dir.glob("*.jsonl"))
if not jsonl_files:
self._log(f"No conversation files found in {project_dir}")
return 0
self._log(f"Found {len(jsonl_files)} conversation files")
# Process each file
processed_count = 0
for jsonl_file in jsonl_files:
if self.process_file(jsonl_file, reindex):
processed_count += 1
self._log(f"\nProcessed {processed_count}/{len(jsonl_files)} conversations")
return processed_count
def get_stats(self) -> Dict[str, Any]:
"""Get processing statistics"""
cursor = self.conn.execute("""
SELECT
COUNT(*) as total_conversations,
SUM(message_count) as total_messages,
SUM(user_messages) as total_user_messages,
SUM(assistant_messages) as total_assistant_messages,
MIN(timestamp) as earliest_conversation,
MAX(timestamp) as latest_conversation
FROM conversations
""")
row = cursor.fetchone()
stats = {
'total_conversations': row['total_conversations'],
'total_messages': row['total_messages'],
'total_user_messages': row['total_user_messages'],
'total_assistant_messages': row['total_assistant_messages'],
'earliest_conversation': row['earliest_conversation'],
'latest_conversation': row['latest_conversation']
}
# Top files
cursor = self.conn.execute("""
SELECT file_path, COUNT(*) as interaction_count
FROM file_interactions
GROUP BY file_path
ORDER BY interaction_count DESC
LIMIT 10
""")
stats['top_files'] = [
{'file': row['file_path'], 'count': row['interaction_count']}
for row in cursor.fetchall()
]
# Top tools
cursor = self.conn.execute("""
SELECT tool_name, SUM(usage_count) as total_usage
FROM tool_usage
GROUP BY tool_name
ORDER BY total_usage DESC
LIMIT 10
""")
stats['top_tools'] = [
{'tool': row['tool_name'], 'count': row['total_usage']}
for row in cursor.fetchall()
]
return stats
def close(self):
"""Close database connection"""
if self.conn:
self.conn.close()
@click.command()
@click.option('--project-name', default='annex', help='Project name to process')
@click.option('--db-path', type=click.Path(), default='.claude/skills/cc-insights/.processed/conversations.db',
help='SQLite database path')
@click.option('--reindex', is_flag=True, help='Reprocess all conversations (ignore cache)')
@click.option('--verbose', is_flag=True, help='Show detailed processing logs')
@click.option('--stats', is_flag=True, help='Show statistics after processing')
def main(project_name: str, db_path: str, reindex: bool, verbose: bool, stats: bool):
"""Process Claude Code conversations and store metadata"""
db_path = Path(db_path)
processor = ConversationProcessor(db_path, verbose=verbose)
try:
# Process conversations
count = processor.process_project(project_name, reindex=reindex)
print(f"\n✓ Processed {count} conversations")
if stats:
print("\n=== Statistics ===")
stats_data = processor.get_stats()
print(f"Total conversations: {stats_data['total_conversations']}")
print(f"Total messages: {stats_data['total_messages']}")
print(f"User messages: {stats_data['total_user_messages']}")
print(f"Assistant messages: {stats_data['total_assistant_messages']}")
print(f"Date range: {stats_data['earliest_conversation']} to {stats_data['latest_conversation']}")
print("\nTop 10 Files:")
for item in stats_data['top_files']:
print(f" {item['file']}: {item['count']} interactions")
print("\nTop 10 Tools:")
for item in stats_data['top_tools']:
print(f" {item['tool']}: {item['count']} uses")
finally:
processor.close()
if __name__ == '__main__':
main()

skills/cc-insights/scripts/insight-generator.py Normal file

@@ -0,0 +1,509 @@
#!/usr/bin/env python3
"""
Insight Generator for Claude Code Insights
Analyzes conversation patterns and generates insight reports with
visualizations, metrics, and actionable recommendations.
"""
import sqlite3
import json
from pathlib import Path
from typing import List, Dict, Any, Optional, Tuple
from datetime import datetime, timedelta
from collections import Counter, defaultdict
import click
try:
from jinja2 import Template, Environment, FileSystemLoader
except ImportError:
print("Error: jinja2 not installed. Run: pip install jinja2")
exit(1)
class PatternDetector:
"""Detects patterns in conversation data"""
def __init__(self, db_path: Path, verbose: bool = False):
self.db_path = db_path
self.verbose = verbose
self.conn = sqlite3.connect(str(db_path))
self.conn.row_factory = sqlite3.Row
def _log(self, message: str):
"""Log if verbose mode is enabled"""
if self.verbose:
print(f"[{datetime.now().strftime('%H:%M:%S')}] {message}")
def get_date_range_filter(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> Tuple[str, List]:
"""Build date range SQL filter"""
conditions = []
params = []
if date_from:
conditions.append("timestamp >= ?")
params.append(date_from)
if date_to:
conditions.append("timestamp <= ?")
params.append(date_to)
where_clause = " AND ".join(conditions) if conditions else "1=1"
return where_clause, params
def get_overview_metrics(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> Dict[str, Any]:
"""Get high-level overview metrics"""
where_clause, params = self.get_date_range_filter(date_from, date_to)
cursor = self.conn.execute(f"""
SELECT
COUNT(*) as total_conversations,
SUM(message_count) as total_messages,
SUM(user_messages) as total_user_messages,
SUM(assistant_messages) as total_assistant_messages,
AVG(message_count) as avg_messages_per_conversation,
MIN(timestamp) as earliest_conversation,
MAX(timestamp) as latest_conversation,
COUNT(DISTINCT DATE(timestamp)) as active_days
FROM conversations
WHERE {where_clause}
""", params)
row = cursor.fetchone()
return {
'total_conversations': row['total_conversations'] or 0,
'total_messages': row['total_messages'] or 0,
'total_user_messages': row['total_user_messages'] or 0,
'total_assistant_messages': row['total_assistant_messages'] or 0,
'avg_messages_per_conversation': round(row['avg_messages_per_conversation'] or 0, 1),
'earliest_conversation': row['earliest_conversation'],
'latest_conversation': row['latest_conversation'],
'active_days': row['active_days'] or 0
}
def get_file_hotspots(self, date_from: Optional[str] = None, date_to: Optional[str] = None, limit: int = 20) -> List[Dict[str, Any]]:
"""Get most frequently modified files"""
where_clause, params = self.get_date_range_filter(date_from, date_to)
cursor = self.conn.execute(f"""
SELECT
fi.file_path,
COUNT(DISTINCT fi.conversation_id) as conversation_count,
SUM(CASE WHEN fi.interaction_type = 'read' THEN 1 ELSE 0 END) as read_count,
SUM(CASE WHEN fi.interaction_type = 'write' THEN 1 ELSE 0 END) as write_count,
SUM(CASE WHEN fi.interaction_type = 'edit' THEN 1 ELSE 0 END) as edit_count
FROM file_interactions fi
JOIN conversations c ON fi.conversation_id = c.id
WHERE {where_clause}
GROUP BY fi.file_path
ORDER BY conversation_count DESC
LIMIT ?
""", params + [limit])
return [
{
'file_path': row['file_path'],
'conversation_count': row['conversation_count'],
'read_count': row['read_count'],
'write_count': row['write_count'],
'edit_count': row['edit_count'],
'total_interactions': row['read_count'] + row['write_count'] + row['edit_count']
}
for row in cursor.fetchall()
]
def get_tool_usage(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> List[Dict[str, Any]]:
"""Get tool usage statistics"""
where_clause, params = self.get_date_range_filter(date_from, date_to)
cursor = self.conn.execute(f"""
SELECT
tu.tool_name,
COUNT(DISTINCT tu.conversation_id) as conversation_count,
SUM(tu.usage_count) as total_uses
FROM tool_usage tu
JOIN conversations c ON tu.conversation_id = c.id
WHERE {where_clause}
GROUP BY tu.tool_name
ORDER BY total_uses DESC
""", params)
return [
{
'tool_name': row['tool_name'],
'conversation_count': row['conversation_count'],
'total_uses': row['total_uses']
}
for row in cursor.fetchall()
]
def get_topic_clusters(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> List[Dict[str, Any]]:
"""Get most common topics"""
where_clause, params = self.get_date_range_filter(date_from, date_to)
cursor = self.conn.execute(f"""
SELECT topics FROM conversations
WHERE {where_clause} AND topics IS NOT NULL
""", params)
topic_counter = Counter()
for row in cursor.fetchall():
topics = json.loads(row['topics'])
topic_counter.update(topics)
return [
{'topic': topic, 'count': count}
for topic, count in topic_counter.most_common(20)
]
def get_activity_timeline(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> Dict[str, int]:
"""Get conversation count by date"""
where_clause, params = self.get_date_range_filter(date_from, date_to)
cursor = self.conn.execute(f"""
SELECT DATE(timestamp) as date, COUNT(*) as count
FROM conversations
WHERE {where_clause}
GROUP BY DATE(timestamp)
ORDER BY date
""", params)
return {row['date']: row['count'] for row in cursor.fetchall()}
def get_hourly_distribution(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> Dict[int, int]:
"""Get conversation distribution by hour of day"""
where_clause, params = self.get_date_range_filter(date_from, date_to)
cursor = self.conn.execute(f"""
SELECT
CAST(strftime('%H', timestamp) AS INTEGER) as hour,
COUNT(*) as count
FROM conversations
WHERE {where_clause}
GROUP BY hour
ORDER BY hour
""", params)
return {row['hour']: row['count'] for row in cursor.fetchall()}
def get_weekday_distribution(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> Dict[str, int]:
"""Get conversation distribution by day of week"""
where_clause, params = self.get_date_range_filter(date_from, date_to)
cursor = self.conn.execute(f"""
SELECT
CASE CAST(strftime('%w', timestamp) AS INTEGER)
WHEN 0 THEN 'Sunday'
WHEN 1 THEN 'Monday'
WHEN 2 THEN 'Tuesday'
WHEN 3 THEN 'Wednesday'
WHEN 4 THEN 'Thursday'
WHEN 5 THEN 'Friday'
WHEN 6 THEN 'Saturday'
END as weekday,
COUNT(*) as count
FROM conversations
WHERE {where_clause}
GROUP BY weekday
""", params)
weekday_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
result = {day: 0 for day in weekday_order}
for row in cursor.fetchall():
result[row['weekday']] = row['count']
return result
def close(self):
"""Close database connection"""
if self.conn:
self.conn.close()
class InsightGenerator:
"""Generates insight reports from pattern data"""
def __init__(self, db_path: Path, templates_dir: Path, verbose: bool = False):
self.db_path = db_path
self.templates_dir = templates_dir
self.verbose = verbose
self.detector = PatternDetector(db_path, verbose=verbose)
# Setup Jinja2 environment
if templates_dir.exists():
self.jinja_env = Environment(loader=FileSystemLoader(str(templates_dir)))
else:
self.jinja_env = None
def _log(self, message: str):
"""Log if verbose mode is enabled"""
if self.verbose:
print(f"[{datetime.now().strftime('%H:%M:%S')}] {message}")
def _create_ascii_bar_chart(self, data: Dict[str, int], max_width: int = 50) -> str:
"""Create ASCII bar chart"""
if not data:
return "No data"
max_value = max(data.values())
lines = []
for label, value in data.items():
bar_length = int((value / max_value) * max_width) if max_value > 0 else 0
bar = "" * bar_length
lines.append(f"{label:15} {bar} {value}")
return "\n".join(lines)
def _create_sparkline(self, values: List[int]) -> str:
"""Create sparkline chart"""
if not values:
return ""
chars = "▁▂▃▄▅▆▇█"
min_val = min(values)
max_val = max(values)
if max_val == min_val:
return chars[0] * len(values)
normalized = [(v - min_val) / (max_val - min_val) for v in values]
return "".join(chars[int(n * (len(chars) - 1))] for n in normalized)
def generate_weekly_report(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> str:
"""Generate weekly activity report"""
self._log("Generating weekly report...")
# Auto-calculate date range if not provided
if not date_from:
date_from = (datetime.now() - timedelta(days=7)).date().isoformat()
if not date_to:
date_to = datetime.now().date().isoformat()
# Gather data
overview = self.detector.get_overview_metrics(date_from, date_to)
file_hotspots = self.detector.get_file_hotspots(date_from, date_to, limit=10)
tool_usage = self.detector.get_tool_usage(date_from, date_to)
topics = self.detector.get_topic_clusters(date_from, date_to)
timeline = self.detector.get_activity_timeline(date_from, date_to)
weekday_dist = self.detector.get_weekday_distribution(date_from, date_to)
# Build report
report_lines = [
f"# Weekly Insights Report",
f"**Period:** {date_from} to {date_to}",
f"**Generated:** {datetime.now().strftime('%Y-%m-%d %H:%M')}",
"",
"## Overview",
f"- **Total Conversations:** {overview['total_conversations']}",
f"- **Active Days:** {overview['active_days']}",
f"- **Total Messages:** {overview['total_messages']}",
f"- **Avg Messages/Conversation:** {overview['avg_messages_per_conversation']}",
"",
"## Activity Timeline",
"```",
self._create_ascii_bar_chart(timeline, max_width=40),
"```",
"",
"## Weekday Distribution",
"```",
self._create_ascii_bar_chart(weekday_dist, max_width=40),
"```",
""
]
if file_hotspots:
report_lines.extend([
"## File Hotspots (Top 10)",
""
])
for i, file in enumerate(file_hotspots, 1):
heat = "🔥" * min(3, (file['conversation_count'] + 2) // 3)
report_lines.append(
f"{i}. {heat} **{file['file_path']}** "
f"({file['conversation_count']} conversations, "
f"R:{file['read_count']} W:{file['write_count']} E:{file['edit_count']})"
)
report_lines.append("")
if tool_usage:
report_lines.extend([
"## Tool Usage",
""
])
tool_dict = {t['tool_name']: t['total_uses'] for t in tool_usage[:10]}
report_lines.append("```")
report_lines.append(self._create_ascii_bar_chart(tool_dict, max_width=40))
report_lines.append("```")
report_lines.append("")
if topics:
report_lines.extend([
"## Top Topics",
""
])
topic_dict = {t['topic']: t['count'] for t in topics[:15]}
report_lines.append("```")
report_lines.append(self._create_ascii_bar_chart(topic_dict, max_width=40))
report_lines.append("```")
report_lines.append("")
# Insights and recommendations
report_lines.extend([
"## Insights & Recommendations",
""
])
# File hotspot insights
if file_hotspots and file_hotspots[0]['conversation_count'] >= 5:
top_file = file_hotspots[0]
report_lines.append(
f"- 🔥 **High Activity File:** `{top_file['file_path']}` was modified in "
f"{top_file['conversation_count']} conversations. Consider reviewing for refactoring opportunities."
)
# Topic insights
if topics and topics[0]['count'] >= 3:
top_topic = topics[0]
report_lines.append(
f"- 📌 **Trending Topic:** '{top_topic['topic']}' appeared in {top_topic['count']} conversations. "
f"This might warrant documentation or team knowledge sharing."
)
# Activity pattern insights
if overview['active_days'] < 3:
report_lines.append(
f"- 📅 **Low Activity:** Only {overview['active_days']} active days this week. "
f"Consider scheduling regular development sessions."
)
if not report_lines[-1]: # If no insights were added
report_lines.append("- No significant patterns detected this period.")
return "\n".join(report_lines)
def generate_file_heatmap_report(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> str:
"""Generate detailed file interaction heatmap"""
self._log("Generating file heatmap report...")
file_hotspots = self.detector.get_file_hotspots(date_from, date_to, limit=50)
report_lines = [
"# File Interaction Heatmap",
f"**Generated:** {datetime.now().strftime('%Y-%m-%d %H:%M')}",
"",
"## File Hotspots",
""
]
if not file_hotspots:
report_lines.append("No file interactions found in the specified period.")
return "\n".join(report_lines)
for i, file in enumerate(file_hotspots, 1):
heat_level = min(5, (file['conversation_count'] + 1) // 2)
heat_emoji = "🔥" * heat_level
report_lines.extend([
f"### {i}. {heat_emoji} {file['file_path']}",
f"- **Conversations:** {file['conversation_count']}",
f"- **Reads:** {file['read_count']}",
f"- **Writes:** {file['write_count']}",
f"- **Edits:** {file['edit_count']}",
f"- **Total Interactions:** {file['total_interactions']}",
""
])
return "\n".join(report_lines)
def generate_tool_usage_report(self, date_from: Optional[str] = None, date_to: Optional[str] = None) -> str:
"""Generate tool usage analytics report"""
self._log("Generating tool usage report...")
tool_usage = self.detector.get_tool_usage(date_from, date_to)
report_lines = [
"# Tool Usage Analytics",
f"**Generated:** {datetime.now().strftime('%Y-%m-%d %H:%M')}",
"",
"## Tool Statistics",
""
]
if not tool_usage:
report_lines.append("No tool usage data found.")
return "\n".join(report_lines)
total_uses = sum(t['total_uses'] for t in tool_usage)
for i, tool in enumerate(tool_usage, 1):
percentage = (tool['total_uses'] / total_uses * 100) if total_uses > 0 else 0
report_lines.extend([
f"### {i}. {tool['tool_name']}",
f"- **Total Uses:** {tool['total_uses']}",
f"- **Used in Conversations:** {tool['conversation_count']}",
f"- **Percentage of Total:** {percentage:.1f}%",
""
])
return "\n".join(report_lines)
def close(self):
"""Close connections"""
self.detector.close()
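# Minimal programmatic sketch (assumes conversation-processor.py has already built the
# database at the default path used by the CLI below):
#
#   gen = InsightGenerator(Path(".claude/skills/cc-insights/.processed/conversations.db"),
#                          Path(".claude/skills/cc-insights/templates"))
#   try:
#       print(gen.generate_weekly_report())  # defaults to the last 7 days
#   finally:
#       gen.close()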
@click.command()
@click.argument('report_type', type=click.Choice(['weekly', 'file-heatmap', 'tool-usage', 'custom']))
@click.option('--db-path', type=click.Path(), default='.claude/skills/cc-insights/.processed/conversations.db',
help='SQLite database path')
@click.option('--templates-dir', type=click.Path(), default='.claude/skills/cc-insights/templates',
help='Templates directory')
@click.option('--date-from', type=str, help='Start date (ISO format)')
@click.option('--date-to', type=str, help='End date (ISO format)')
@click.option('--output', type=click.Path(), help='Save to file (default: stdout)')
@click.option('--verbose', is_flag=True, help='Show detailed logs')
def main(report_type: str, db_path: str, templates_dir: str, date_from: Optional[str],
date_to: Optional[str], output: Optional[str], verbose: bool):
"""Generate insight reports from conversation data
Report types:
weekly - Weekly activity summary with metrics
file-heatmap - File modification heatmap
tool-usage - Tool usage analytics
custom - Custom report from template
"""
db_path = Path(db_path)
templates_dir = Path(templates_dir)
if not db_path.exists():
print(f"Error: Database not found at {db_path}")
exit(1)
generator = InsightGenerator(db_path, templates_dir, verbose=verbose)
try:
# Generate report based on type
if report_type == 'weekly':
report = generator.generate_weekly_report(date_from, date_to)
elif report_type == 'file-heatmap':
report = generator.generate_file_heatmap_report(date_from, date_to)
elif report_type == 'tool-usage':
report = generator.generate_tool_usage_report(date_from, date_to)
else:
print("Custom templates not yet implemented")
exit(1)
# Output report
if output:
Path(output).write_text(report)
print(f"✓ Report saved to {output}")
else:
print(report)
finally:
generator.close()
if __name__ == '__main__':
main()


@@ -0,0 +1,298 @@
#!/usr/bin/env python3
"""
RAG Indexer for Claude Code Insights
Builds vector embeddings for semantic search using sentence-transformers
and ChromaDB. Supports incremental indexing and efficient similarity search.
"""
import sqlite3
import json
from pathlib import Path
from typing import List, Dict, Any, Optional
from datetime import datetime
import click
try:
from sentence_transformers import SentenceTransformer
import chromadb
from chromadb.config import Settings
except ImportError as e:
print(f"Error: Required packages not installed. Run: pip install sentence-transformers chromadb")
print(f"Missing: {e}")
exit(1)
class RAGIndexer:
"""Builds and manages vector embeddings for conversations"""
def __init__(self, db_path: Path, embeddings_dir: Path, model_name: str = "all-MiniLM-L6-v2", verbose: bool = False):
self.db_path = db_path
self.embeddings_dir = embeddings_dir
self.model_name = model_name
self.verbose = verbose
# Initialize sentence transformer model
self._log("Loading embedding model...")
self.model = SentenceTransformer(model_name)
self._log(f"✓ Loaded {model_name}")
# Initialize ChromaDB
self.embeddings_dir.mkdir(parents=True, exist_ok=True)
self.chroma_client = chromadb.PersistentClient(
path=str(self.embeddings_dir),
settings=Settings(anonymized_telemetry=False)
)
# Get or create collection
self.collection = self.chroma_client.get_or_create_collection(
name="conversations",
metadata={"hnsw:space": "cosine"} # Use cosine similarity
)
# Connect to SQLite
self.conn = sqlite3.connect(str(self.db_path))
self.conn.row_factory = sqlite3.Row
def _log(self, message: str):
"""Log if verbose mode is enabled"""
if self.verbose:
print(f"[{datetime.now().strftime('%H:%M:%S')}] {message}")
def _get_indexed_conversation_ids(self) -> set:
"""Get set of conversation IDs already indexed"""
try:
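# include=[] keeps the payload small: ChromaDB always returns the stored IDs,
# which is all the incremental-indexing check needs.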
results = self.collection.get(include=[])
return set(results['ids'])
except Exception:
return set()
def _fetch_conversations_to_index(self, rebuild: bool = False) -> List[Dict[str, Any]]:
"""Fetch conversations that need indexing"""
if rebuild:
# Rebuild: get all conversations
cursor = self.conn.execute("""
SELECT id, first_user_message, last_assistant_message, topics,
files_read, files_written, files_edited, timestamp
FROM conversations
ORDER BY timestamp DESC
""")
else:
# Incremental: only get conversations not yet indexed
indexed_ids = self._get_indexed_conversation_ids()
if not indexed_ids:
# Nothing indexed yet, get all
cursor = self.conn.execute("""
SELECT id, first_user_message, last_assistant_message, topics,
files_read, files_written, files_edited, timestamp
FROM conversations
ORDER BY timestamp DESC
""")
else:
# Get conversations not in indexed set
placeholders = ','.join('?' * len(indexed_ids))
cursor = self.conn.execute(f"""
SELECT id, first_user_message, last_assistant_message, topics,
files_read, files_written, files_edited, timestamp
FROM conversations
WHERE id NOT IN ({placeholders})
ORDER BY timestamp DESC
""", tuple(indexed_ids))
conversations = []
for row in cursor.fetchall():
conversations.append({
'id': row['id'],
'first_user_message': row['first_user_message'] or "",
'last_assistant_message': row['last_assistant_message'] or "",
'topics': json.loads(row['topics']) if row['topics'] else [],
'files_read': json.loads(row['files_read']) if row['files_read'] else [],
'files_written': json.loads(row['files_written']) if row['files_written'] else [],
'files_edited': json.loads(row['files_edited']) if row['files_edited'] else [],
'timestamp': row['timestamp']
})
return conversations
def _create_document_text(self, conversation: Dict[str, Any]) -> str:
"""Create text document for embedding"""
# Combine relevant fields into searchable text
parts = []
if conversation['first_user_message']:
parts.append(f"User: {conversation['first_user_message']}")
if conversation['last_assistant_message']:
parts.append(f"Assistant: {conversation['last_assistant_message']}")
if conversation['topics']:
parts.append(f"Topics: {', '.join(conversation['topics'])}")
all_files = conversation['files_read'] + conversation['files_written'] + conversation['files_edited']
if all_files:
parts.append(f"Files: {', '.join(all_files)}")
return "\n\n".join(parts)
def _create_metadata(self, conversation: Dict[str, Any]) -> Dict[str, Any]:
"""Create metadata for ChromaDB"""
return {
'timestamp': conversation['timestamp'],
'topics': json.dumps(conversation['topics']),
'files_read': json.dumps(conversation['files_read']),
'files_written': json.dumps(conversation['files_written']),
'files_edited': json.dumps(conversation['files_edited']),
}
def index_conversations(self, rebuild: bool = False, batch_size: int = 32) -> int:
"""Index conversations for semantic search"""
if rebuild:
self._log("Rebuilding entire index...")
# Clear existing collection
self.chroma_client.delete_collection("conversations")
self.collection = self.chroma_client.create_collection(
name="conversations",
metadata={"hnsw:space": "cosine"}
)
else:
self._log("Incremental indexing...")
# Fetch conversations to index
conversations = self._fetch_conversations_to_index(rebuild)
if not conversations:
self._log("No conversations to index")
return 0
self._log(f"Indexing {len(conversations)} conversations...")
# Process in batches
indexed_count = 0
for i in range(0, len(conversations), batch_size):
batch = conversations[i:i + batch_size]
# Prepare batch data
ids = []
documents = []
metadatas = []
for conv in batch:
ids.append(conv['id'])
documents.append(self._create_document_text(conv))
metadatas.append(self._create_metadata(conv))
# Generate embeddings
embeddings = self.model.encode(documents, show_progress_bar=self.verbose)
# Add to ChromaDB
self.collection.add(
ids=ids,
documents=documents,
embeddings=embeddings.tolist(),
metadatas=metadatas
)
indexed_count += len(batch)
self._log(f"Indexed {indexed_count}/{len(conversations)} conversations")
self._log(f"✓ Indexing complete: {indexed_count} conversations")
return indexed_count
def search(self, query: str, n_results: int = 10, filters: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
"""Search conversations by semantic similarity"""
# Generate query embedding
query_embedding = self.model.encode([query])[0]
# Search in ChromaDB
results = self.collection.query(
query_embeddings=[query_embedding.tolist()],
n_results=n_results,
where=filters if filters else None
)
# Format results
formatted_results = []
for i in range(len(results['ids'][0])):
formatted_results.append({
'id': results['ids'][0][i],
'distance': results['distances'][0][i],
'similarity': 1 - results['distances'][0][i], # Convert distance to similarity
'document': results['documents'][0][i],
'metadata': results['metadatas'][0][i] if results['metadatas'] else {}
})
return formatted_results
def get_stats(self) -> Dict[str, Any]:
"""Get indexing statistics"""
try:
count = self.collection.count()
return {
'total_indexed': count,
'model': self.model_name,
'collection_name': self.collection.name,
'embedding_dimension': self.model.get_sentence_embedding_dimension()
}
except Exception as e:
return {
'error': str(e)
}
def close(self):
"""Close connections"""
if self.conn:
self.conn.close()
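# Minimal programmatic sketch (assumes the SQLite database built by conversation-processor.py
# exists at the default path):
#
#   indexer = RAGIndexer(Path(".claude/skills/cc-insights/.processed/conversations.db"),
#                        Path(".claude/skills/cc-insights/.processed/embeddings"))
#   indexer.index_conversations()  # incremental by default; pass rebuild=True to start over
#   for hit in indexer.search("authentication bugs", n_results=3):
#       print(f"{hit['similarity']:.3f}  {hit['id']}")
#   indexer.close()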
@click.command()
@click.option('--db-path', type=click.Path(), default='.claude/skills/cc-insights/.processed/conversations.db',
help='SQLite database path')
@click.option('--embeddings-dir', type=click.Path(), default='.claude/skills/cc-insights/.processed/embeddings',
help='ChromaDB embeddings directory')
@click.option('--model', default='all-MiniLM-L6-v2', help='Sentence transformer model name')
@click.option('--rebuild', is_flag=True, help='Rebuild entire index (delete and recreate)')
@click.option('--batch-size', default=32, help='Batch size for embedding generation')
@click.option('--verbose', is_flag=True, help='Show detailed logs')
@click.option('--stats', is_flag=True, help='Show statistics after indexing')
@click.option('--test-search', type=str, help='Test search with query')
def main(db_path: str, embeddings_dir: str, model: str, rebuild: bool, batch_size: int, verbose: bool, stats: bool, test_search: Optional[str]):
"""Build vector embeddings for semantic search"""
db_path = Path(db_path)
embeddings_dir = Path(embeddings_dir)
if not db_path.exists():
print(f"Error: Database not found at {db_path}")
print("Run conversation-processor.py first to process conversations")
exit(1)
indexer = RAGIndexer(db_path, embeddings_dir, model, verbose=verbose)
try:
# Index conversations
count = indexer.index_conversations(rebuild=rebuild, batch_size=batch_size)
print(f"\n✓ Indexed {count} conversations")
if stats:
print("\n=== Indexing Statistics ===")
stats_data = indexer.get_stats()
for key, value in stats_data.items():
print(f"{key}: {value}")
if test_search:
print(f"\n=== Test Search: '{test_search}' ===")
results = indexer.search(test_search, n_results=5)
if not results:
print("No results found")
else:
for i, result in enumerate(results, 1):
print(f"\n{i}. [Similarity: {result['similarity']:.3f}] {result['id']}")
print(f" {result['document'][:200]}...")
finally:
indexer.close()
if __name__ == '__main__':
main()


@@ -0,0 +1,384 @@
#!/usr/bin/env python3
"""
Search Interface for Claude Code Insights
Provides unified search across conversations using semantic (RAG) and keyword search.
Supports filtering by dates, files, and output formatting.
"""
import sqlite3
import json
from pathlib import Path
from typing import List, Dict, Any, Optional
from datetime import datetime
import click
try:
    from rag_indexer import RAGIndexer
except ImportError:
    # The sibling script is named rag-indexer.py (hyphenated), so a plain import cannot
    # resolve it; load the module from its file path instead.
    import importlib.util
    _spec = importlib.util.spec_from_file_location("rag_indexer", Path(__file__).parent / "rag-indexer.py")
    _rag_indexer = importlib.util.module_from_spec(_spec)
    _spec.loader.exec_module(_rag_indexer)
    RAGIndexer = _rag_indexer.RAGIndexer
class ConversationSearch:
"""Unified search interface for conversations"""
def __init__(self, db_path: Path, embeddings_dir: Path, verbose: bool = False):
self.db_path = db_path
self.embeddings_dir = embeddings_dir
self.verbose = verbose
# Initialize RAG indexer for semantic search
self.indexer = RAGIndexer(db_path, embeddings_dir, verbose=verbose)
# Separate SQLite connection for metadata queries
self.conn = sqlite3.connect(str(db_path))
self.conn.row_factory = sqlite3.Row
def _log(self, message: str):
"""Log if verbose mode is enabled"""
if self.verbose:
print(f"[{datetime.now().strftime('%H:%M:%S')}] {message}")
def _get_conversation_details(self, conversation_id: str) -> Optional[Dict[str, Any]]:
"""Get full conversation details from SQLite"""
cursor = self.conn.execute("""
SELECT * FROM conversations WHERE id = ?
""", (conversation_id,))
row = cursor.fetchone()
if not row:
return None
return {
'id': row['id'],
'timestamp': row['timestamp'],
'message_count': row['message_count'],
'user_messages': row['user_messages'],
'assistant_messages': row['assistant_messages'],
'files_read': json.loads(row['files_read']) if row['files_read'] else [],
'files_written': json.loads(row['files_written']) if row['files_written'] else [],
'files_edited': json.loads(row['files_edited']) if row['files_edited'] else [],
'tools_used': json.loads(row['tools_used']) if row['tools_used'] else [],
'topics': json.loads(row['topics']) if row['topics'] else [],
'first_user_message': row['first_user_message'],
'last_assistant_message': row['last_assistant_message']
}
def semantic_search(
self,
query: str,
limit: int = 10,
date_from: Optional[str] = None,
date_to: Optional[str] = None,
file_pattern: Optional[str] = None
) -> List[Dict[str, Any]]:
"""Perform RAG-based semantic search"""
self._log(f"Semantic search: '{query}'")
# TODO: Add ChromaDB filters for dates/files when supported
results = self.indexer.search(query, n_results=limit * 2) # Get extra for filtering
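# Date and file filters are applied after retrieval (below), so a restrictive filter can
# return fewer than `limit` results even when more matches exist in the index.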
# Enrich with full conversation details
enriched_results = []
for result in results:
details = self._get_conversation_details(result['id'])
if details:
# Apply post-search filters
if date_from and details['timestamp'] < date_from:
continue
if date_to and details['timestamp'] > date_to:
continue
if file_pattern:
all_files = details['files_read'] + details['files_written'] + details['files_edited']
if not any(file_pattern in f for f in all_files):
continue
enriched_results.append({
**result,
**details
})
if len(enriched_results) >= limit:
break
return enriched_results
def keyword_search(
self,
query: str,
limit: int = 10,
date_from: Optional[str] = None,
date_to: Optional[str] = None,
file_pattern: Optional[str] = None
) -> List[Dict[str, Any]]:
"""Perform SQL-based keyword search"""
self._log(f"Keyword search: '{query}'")
# Build SQL query
conditions = [
"(first_user_message LIKE ? OR last_assistant_message LIKE ? OR topics LIKE ?)"
]
params = [f"%{query}%", f"%{query}%", f"%{query}%"]
if date_from:
conditions.append("timestamp >= ?")
params.append(date_from)
if date_to:
conditions.append("timestamp <= ?")
params.append(date_to)
if file_pattern:
conditions.append(
"(files_read LIKE ? OR files_written LIKE ? OR files_edited LIKE ?)"
)
params.extend([f"%{file_pattern}%"] * 3)
where_clause = " AND ".join(conditions)
cursor = self.conn.execute(f"""
SELECT * FROM conversations
WHERE {where_clause}
ORDER BY timestamp DESC
LIMIT ?
""", params + [limit])
results = []
for row in cursor.fetchall():
results.append({
'id': row['id'],
'timestamp': row['timestamp'],
'message_count': row['message_count'],
'user_messages': row['user_messages'],
'assistant_messages': row['assistant_messages'],
'files_read': json.loads(row['files_read']) if row['files_read'] else [],
'files_written': json.loads(row['files_written']) if row['files_written'] else [],
'files_edited': json.loads(row['files_edited']) if row['files_edited'] else [],
'tools_used': json.loads(row['tools_used']) if row['tools_used'] else [],
'topics': json.loads(row['topics']) if row['topics'] else [],
'first_user_message': row['first_user_message'],
'last_assistant_message': row['last_assistant_message']
})
return results
def search_by_file(self, file_pattern: str, limit: int = 10) -> List[Dict[str, Any]]:
"""Find all conversations that touched specific files"""
self._log(f"File search: '{file_pattern}'")
cursor = self.conn.execute("""
SELECT DISTINCT c.*
FROM conversations c
JOIN file_interactions fi ON c.id = fi.conversation_id
WHERE fi.file_path LIKE ?
ORDER BY c.timestamp DESC
LIMIT ?
""", (f"%{file_pattern}%", limit))
results = []
for row in cursor.fetchall():
results.append({
'id': row['id'],
'timestamp': row['timestamp'],
'message_count': row['message_count'],
'files_read': json.loads(row['files_read']) if row['files_read'] else [],
'files_written': json.loads(row['files_written']) if row['files_written'] else [],
'files_edited': json.loads(row['files_edited']) if row['files_edited'] else [],
'tools_used': json.loads(row['tools_used']) if row['tools_used'] else [],
'topics': json.loads(row['topics']) if row['topics'] else [],
'first_user_message': row['first_user_message']
})
return results
def search_by_tool(self, tool_name: str, limit: int = 10) -> List[Dict[str, Any]]:
"""Find conversations using specific tools"""
self._log(f"Tool search: '{tool_name}'")
cursor = self.conn.execute("""
SELECT DISTINCT c.*
FROM conversations c
JOIN tool_usage tu ON c.id = tu.conversation_id
WHERE tu.tool_name LIKE ?
ORDER BY c.timestamp DESC
LIMIT ?
""", (f"%{tool_name}%", limit))
results = []
for row in cursor.fetchall():
results.append({
'id': row['id'],
'timestamp': row['timestamp'],
'message_count': row['message_count'],
'tools_used': json.loads(row['tools_used']) if row['tools_used'] else [],
'topics': json.loads(row['topics']) if row['topics'] else [],
'first_user_message': row['first_user_message']
})
return results
def format_results(self, results: List[Dict[str, Any]], format: str = 'text') -> str:
"""Format search results"""
if format == 'json':
return json.dumps(results, indent=2)
elif format == 'markdown':
output = [f"# Search Results ({len(results)} found)\n"]
for i, result in enumerate(results, 1):
timestamp = datetime.fromisoformat(result['timestamp']).strftime('%b %d, %Y %H:%M')
similarity = f"[Similarity: {result['similarity']:.3f}] " if 'similarity' in result else ""
output.append(f"## {i}. {similarity}{result['id']}")
output.append(f"**Date:** {timestamp}")
output.append(f"**Messages:** {result.get('message_count', 'N/A')}")
if result.get('topics'):
output.append(f"**Topics:** {', '.join(result['topics'])}")
all_files = (result.get('files_read', []) +
result.get('files_written', []) +
result.get('files_edited', []))
if all_files:
output.append(f"**Files:** {', '.join(all_files[:5])}")
if len(all_files) > 5:
output.append(f" _(and {len(all_files) - 5} more)_")
if result.get('tools_used'):
output.append(f"**Tools:** {', '.join(result['tools_used'][:5])}")
if result.get('first_user_message'):
msg = result['first_user_message'][:200]
output.append(f"\n**Snippet:** {msg}...")
output.append("")
return "\n".join(output)
else: # text format
output = [f"\nFound {len(results)} conversations:\n"]
for i, result in enumerate(results, 1):
timestamp = datetime.fromisoformat(result['timestamp']).strftime('%b %d, %Y %H:%M')
similarity = f"[Similarity: {result['similarity']:.3f}] " if 'similarity' in result else ""
output.append(f"{i}. {similarity}{result['id']}")
output.append(f" Date: {timestamp}")
output.append(f" Messages: {result.get('message_count', 'N/A')}")
if result.get('topics'):
output.append(f" Topics: {', '.join(result['topics'][:3])}")
all_files = (result.get('files_read', []) +
result.get('files_written', []) +
result.get('files_edited', []))
if all_files:
output.append(f" Files: {', '.join(all_files[:3])}")
if result.get('first_user_message'):
msg = result['first_user_message'][:150].replace('\n', ' ')
output.append(f" Preview: {msg}...")
output.append("")
return "\n".join(output)
def close(self):
"""Close connections"""
self.indexer.close()
if self.conn:
self.conn.close()
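# Minimal programmatic sketch (the query string is just an example; paths match the CLI
# defaults below):
#
#   searcher = ConversationSearch(Path(".claude/skills/cc-insights/.processed/conversations.db"),
#                                 Path(".claude/skills/cc-insights/.processed/embeddings"))
#   try:
#       hits = searcher.semantic_search("flaky integration tests", limit=5)
#       print(searcher.format_results(hits, format="markdown"))
#   finally:
#       searcher.close()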
@click.command()
@click.argument('query', required=False)
@click.option('--db-path', type=click.Path(), default='.claude/skills/cc-insights/.processed/conversations.db',
help='SQLite database path')
@click.option('--embeddings-dir', type=click.Path(), default='.claude/skills/cc-insights/.processed/embeddings',
help='ChromaDB embeddings directory')
@click.option('--semantic/--keyword', default=True, help='Use semantic (RAG) or keyword search')
@click.option('--file', type=str, help='Filter by file pattern')
@click.option('--tool', type=str, help='Search by tool name')
@click.option('--date-from', type=str, help='Start date (ISO format)')
@click.option('--date-to', type=str, help='End date (ISO format)')
@click.option('--limit', default=10, help='Maximum results')
@click.option('--format', type=click.Choice(['text', 'json', 'markdown']), default='text', help='Output format')
@click.option('--verbose', is_flag=True, help='Show detailed logs')
def main(query: Optional[str], db_path: str, embeddings_dir: str, semantic: bool, file: Optional[str],
tool: Optional[str], date_from: Optional[str], date_to: Optional[str], limit: int, format: str, verbose: bool):
"""Search Claude Code conversations
Examples:
# Semantic search
python search-conversations.py "authentication bugs"
# Keyword search
python search-conversations.py "React optimization" --keyword
# Filter by file
python search-conversations.py "testing" --file "src/components"
# Search by tool
python search-conversations.py --tool "Write"
# Date range
python search-conversations.py "refactoring" --date-from 2025-10-01
# JSON output
python search-conversations.py "deployment" --format json
"""
db_path = Path(db_path)
embeddings_dir = Path(embeddings_dir)
if not db_path.exists():
print(f"Error: Database not found at {db_path}")
print("Run conversation-processor.py first")
exit(1)
searcher = ConversationSearch(db_path, embeddings_dir, verbose=verbose)
try:
results = []
if tool:
# Search by tool
results = searcher.search_by_tool(tool, limit=limit)
elif file:
# Search by file
results = searcher.search_by_file(file, limit=limit)
elif query:
# Text search
if semantic:
results = searcher.semantic_search(
query,
limit=limit,
date_from=date_from,
date_to=date_to,
file_pattern=file
)
else:
results = searcher.keyword_search(
query,
limit=limit,
date_from=date_from,
date_to=date_to,
file_pattern=file
)
else:
print("Error: Provide a query, --file, or --tool option")
exit(1)
# Format and output
output = searcher.format_results(results, format=format)
print(output)
finally:
searcher.close()
if __name__ == '__main__':
main()


@@ -0,0 +1,32 @@
# Weekly Activity Summary
**Period:** {{ date_from }} to {{ date_to }}
**Generated:** {{ generation_date }}
## Overview
- **Total Conversations:** {{ total_conversations }}
- **Active Days:** {{ active_days }}
- **Total Messages:** {{ total_messages }}
- **Average Messages per Conversation:** {{ avg_messages }}
## Activity Timeline
{{ activity_timeline }}
## Top Files Modified
{% for file in top_files %}
{{ loop.index }}. {{ file.heat_emoji }} **{{ file.path }}**
- Conversations: {{ file.count }}
- Interactions: Read {{ file.read }}, Write {{ file.write }}, Edit {{ file.edit }}
{% endfor %}
## Tool Usage
{% for tool in top_tools %}
{{ loop.index }}. **{{ tool.name }}**: {{ tool.count }} uses ({{ tool.percentage }}%)
{% endfor %}
## Topics
{% for topic in top_topics %}
- {{ topic.name }}: {{ topic.count }} mentions
{% endfor %}
## Insights & Recommendations
{{ insights }}