# cc-insights: Claude Code Conversation Insights

Automatically process, search, and analyze your Claude Code conversation history using RAG-powered semantic search and intelligent pattern detection.
## Overview

This skill turns your Claude Code conversations into actionable insights with no manual effort. It automatically processes conversations stored in `~/.claude/projects/`, builds a searchable knowledge base with semantic understanding, and generates reports about your development patterns.
## Key Features
- 🔍 RAG-Powered Semantic Search: Find conversations by meaning, not just keywords
- 📊 Automatic Insight Reports: Detect patterns, file hotspots, and tool usage analytics
- 📈 Activity Trends: Understand development patterns over time
- 💡 Knowledge Extraction: Surface recurring topics and solutions
- 🎯 Zero Manual Effort: Fully automatic processing of existing conversations
- 🚀 Fast Performance: <1s search, <10s report generation
## Quick Start

### 1. Installation

```bash
# Navigate to the skill directory
cd .claude/skills/cc-insights

# Install Python dependencies
pip install -r requirements.txt

# Verify installation
python scripts/conversation-processor.py --help
```
### 2. Initial Setup

Process your existing conversations:

```bash
# Process all conversations for the current project
python scripts/conversation-processor.py --project-name annex --verbose --stats

# Build the semantic search index
python scripts/rag-indexer.py --verbose --stats
```
This one-time setup will:

- Parse all JSONL files from `~/.claude/projects/`
- Extract metadata (files, tools, topics, timestamps)
- Build a SQLite database for fast queries
- Generate vector embeddings for semantic search
- Create the ChromaDB index

**Time:** ~1-2 minutes for 100 conversations
### 3. Search Conversations

```bash
# Semantic search (understands meaning)
python scripts/search-conversations.py "fixing authentication bugs"

# Search by file
python scripts/search-conversations.py --file "src/auth/token.ts"

# Search by tool
python scripts/search-conversations.py --tool "Write"

# Keyword search with a date filter
python scripts/search-conversations.py "refactoring" --keyword --date-from 2025-10-01
```
### 4. Generate Insights

```bash
# Weekly activity report
python scripts/insight-generator.py weekly --verbose

# File heatmap (most-modified files)
python scripts/insight-generator.py file-heatmap

# Tool usage analytics
python scripts/insight-generator.py tool-usage

# Save a report to a file
python scripts/insight-generator.py weekly --output weekly-report.md
```
## Usage via Skill

Once set up, you can interact with the skill naturally:

```
User: "Search conversations about React performance optimization"
→ Returns top semantic matches with context

User: "Generate insights for the past week"
→ Creates a comprehensive weekly report with metrics

User: "Show me files I've modified most often"
→ Generates a file heatmap with recommendations
```
## Architecture

```
.claude/skills/cc-insights/
├── SKILL.md                      # Skill definition for Claude
├── README.md                     # This file
├── requirements.txt              # Python dependencies
├── .gitignore                    # Git ignore rules
│
├── scripts/                      # Core functionality
│   ├── conversation-processor.py # Parse JSONL, extract metadata
│   ├── rag-indexer.py            # Build vector embeddings
│   ├── search-conversations.py   # Search interface
│   └── insight-generator.py      # Report generation
│
├── templates/                    # Report templates
│   └── weekly-summary.md
│
└── .processed/                   # Generated data (gitignored)
    ├── conversations.db          # SQLite metadata
    └── embeddings/               # ChromaDB vector store
        ├── chroma.sqlite3
        └── [embedding data]
```
## Scripts Reference

### conversation-processor.py

Parse JSONL files and extract conversation metadata.

Usage:

```bash
python scripts/conversation-processor.py [OPTIONS]
```

Options:

```
--project-name TEXT   Project to process (default: annex)
--db-path PATH        Database path
--reindex             Reprocess all (ignore cache)
--verbose             Show detailed logs
--stats               Display statistics after processing
```
What it does:

- Scans `~/.claude/projects/[project]/*.jsonl`
- Decodes base64-encoded conversation content
- Extracts messages, files, tools, topics, and timestamps
- Stores results in SQLite with indexes for fast queries
- Tracks processing state for incremental updates
Output:

- SQLite database at `.processed/conversations.db`
- Processing state for incremental updates
### rag-indexer.py

Build vector embeddings for semantic search.

Usage:

```bash
python scripts/rag-indexer.py [OPTIONS]
```

Options:

```
--db-path PATH         Database path
--embeddings-dir PATH  ChromaDB directory
--model TEXT           Embedding model (default: all-MiniLM-L6-v2)
--rebuild              Rebuild entire index
--batch-size INT       Batch size (default: 32)
--verbose              Show detailed logs
--stats                Display statistics
--test-search TEXT     Test search with a query
```
What it does:
- Reads conversations from SQLite
- Generates embeddings using sentence-transformers
- Stores in ChromaDB for similarity search
- Supports incremental indexing (only new conversations)
Models:

- `all-MiniLM-L6-v2` (default): fast, good quality, 384 dimensions
- `all-mpnet-base-v2`: higher quality, slower, 768 dimensions
### search-conversations.py

Search conversations with semantic search plus metadata filters.

Usage:

```bash
python scripts/search-conversations.py QUERY [OPTIONS]
```

Options:

```
--semantic/--keyword  Semantic (RAG) or keyword search (default: semantic)
--file TEXT           Filter by file pattern
--tool TEXT           Filter by tool name
--date-from TEXT      Start date (ISO format)
--date-to TEXT        End date (ISO format)
--limit INT           Max results (default: 10)
--format TEXT         Output: text|json|markdown (default: text)
--verbose             Show detailed logs
```
Examples:

```bash
# Semantic search
python scripts/search-conversations.py "authentication bugs"

# Filter by file
python scripts/search-conversations.py "React optimization" --file "src/components"

# Search by tool
python scripts/search-conversations.py --tool "Edit"

# Date range
python scripts/search-conversations.py "deployment" --date-from 2025-10-01

# JSON output for integration
python scripts/search-conversations.py "testing" --format json > results.json
```
### insight-generator.py

Generate pattern-based reports and analytics.

Usage:

```bash
python scripts/insight-generator.py REPORT_TYPE [OPTIONS]
```

Report types:

```
weekly        Weekly activity summary
file-heatmap  File modification heatmap
tool-usage    Tool usage analytics
```

Options:

```
--date-from TEXT  Start date (ISO format)
--date-to TEXT    End date (ISO format)
--output PATH     Save to file (default: stdout)
--verbose         Show detailed logs
```
Examples:

```bash
# Weekly report (last 7 days)
python scripts/insight-generator.py weekly

# Custom date range
python scripts/insight-generator.py weekly --date-from 2025-10-01 --date-to 2025-10-15

# File heatmap saved to a file
python scripts/insight-generator.py file-heatmap --output heatmap.md

# Tool analytics
python scripts/insight-generator.py tool-usage
```
## Data Storage

All processed data is stored locally in `.processed/` (gitignored):
### SQLite Database (`conversations.db`)

Tables:

- `conversations`: main metadata (timestamps, messages, topics)
- `file_interactions`: file-level interactions (read, write, edit)
- `tool_usage`: tool usage counts per conversation
- `processing_state`: tracks processed files for incremental updates

Indexes:

- `idx_timestamp`: fast date-range queries
- `idx_project`: filter by project
- `idx_file_path`: file-based searches
- `idx_tool_name`: tool usage queries
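The authoritative schema lives in `conversation-processor.py`; the sketch below only illustrates the tables and indexes described above, and the column names are assumptions, not the script's actual schema:

```python
import sqlite3

# Illustrative schema only: column names are guesses based on the table
# descriptions in this README, not the real conversation-processor.py schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS conversations (
    id TEXT PRIMARY KEY,
    project TEXT,
    timestamp TEXT,
    message_count INTEGER,
    topics TEXT
);
CREATE TABLE IF NOT EXISTS file_interactions (
    conversation_id TEXT REFERENCES conversations(id),
    file_path TEXT,
    action TEXT  -- read, write, or edit
);
CREATE TABLE IF NOT EXISTS tool_usage (
    conversation_id TEXT REFERENCES conversations(id),
    tool_name TEXT,
    use_count INTEGER
);
CREATE TABLE IF NOT EXISTS processing_state (
    source_file TEXT PRIMARY KEY,
    content_hash TEXT
);
CREATE INDEX IF NOT EXISTS idx_timestamp ON conversations(timestamp);
CREATE INDEX IF NOT EXISTS idx_project ON conversations(project);
CREATE INDEX IF NOT EXISTS idx_file_path ON file_interactions(file_path);
CREATE INDEX IF NOT EXISTS idx_tool_name ON tool_usage(tool_name);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")}
print(sorted(tables))
```

Queries like the file heatmap reduce to simple `GROUP BY` aggregations over `file_interactions`, which is why the indexes above matter for speed.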
### ChromaDB Vector Store (`embeddings/`)
Contents:
- Vector embeddings (384-dimensional by default)
- Document text for retrieval
- Metadata for filtering
- HNSW index for fast similarity search
Performance:
- <1 second for semantic search
- Handles 10,000+ conversations efficiently
- ~100MB per 1,000 conversations
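At its core, semantic search ranks stored conversation embeddings by cosine similarity to the query embedding. A stdlib-only sketch of that ranking step, with toy 3-dimensional vectors standing in for the real 384-dimensional sentence-transformers embeddings (the actual pipeline delegates this to ChromaDB's HNSW index):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy embeddings for two indexed conversations (hypothetical titles).
index = {
    "fixing auth token refresh": [0.9, 0.1, 0.0],
    "react render performance":  [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "authentication bugs"

# Rank documents by similarity to the query, best match first.
ranked = sorted(index, key=lambda doc: cosine_similarity(query, index[doc]),
                reverse=True)
print(ranked[0])  # → fixing auth token refresh
```

An HNSW index gives approximately the same ranking without comparing the query against every stored vector, which is how search stays under a second at 10,000+ conversations.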
## Performance
| Operation | Time | Notes |
|---|---|---|
| Initial processing (100 convs) | ~30s | One-time setup |
| Initial indexing (100 convs) | ~60s | One-time setup |
| Incremental processing | <5s | Only new conversations |
| Semantic search | <1s | Top 10 results |
| Keyword search | <0.1s | SQLite FTS |
| Weekly report generation | <10s | Includes visualizations |
## Troubleshooting

### "Database not found"

**Problem:** Scripts can't find `conversations.db`.

**Solution:**

```bash
# Run the processor first
python scripts/conversation-processor.py --project-name annex --verbose
```
### "No conversations found"

**Problem:** The project name doesn't match, or there are no JSONL files.

**Solution:**

```bash
# Check project directories
ls ~/.claude/projects/

# Use the correct project name (it may be encoded)
python scripts/conversation-processor.py --project-name [actual-name] --verbose
```
### "ImportError: sentence_transformers"

**Problem:** Dependencies are not installed.

**Solution:**

```bash
# Install requirements
pip install -r requirements.txt

# Or individually
pip install sentence-transformers chromadb jinja2 click python-dateutil
```
### "Slow embedding generation"

**Problem:** A large number of conversations.

**Solution:**

```bash
# Use a smaller batch size
python scripts/rag-indexer.py --batch-size 16

# Or use the faster (lower-quality) model
python scripts/rag-indexer.py --model all-MiniLM-L6-v2
```
### "Out of memory"

**Problem:** Too many conversations processed at once.

**Solution:**

```bash
# Smaller batch size
python scripts/rag-indexer.py --batch-size 8

# Or process in chunks by date
python scripts/conversation-processor.py --date-from 2025-10-01 --date-to 2025-10-15
```
## Incremental Updates

The system handles incremental updates automatically:

- **Conversation Processor**: tracks file hashes in the `processing_state` table
  - Only reprocesses changed files
  - Detects new JSONL files automatically
- **RAG Indexer**: checks ChromaDB for existing IDs
  - Only indexes new conversations
  - Skips already-embedded conversations
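The hash-tracking idea can be sketched as follows. This is an illustration, not the processor's actual code; the helper names (`content_hash`, `files_to_process`) are hypothetical:

```python
import hashlib
from pathlib import Path

def content_hash(path: Path) -> str:
    """SHA-256 of a file's bytes, used to detect changed JSONL files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def files_to_process(jsonl_files, processed_hashes):
    """Return only the files that are new or whose content changed.

    processed_hashes maps file path -> hash recorded on the last run
    (persisted in the processing_state table in the real pipeline).
    """
    todo = []
    for path in jsonl_files:
        h = content_hash(path)
        if processed_hashes.get(str(path)) != h:
            todo.append((path, h))
    return todo
```

After a file is processed, its new hash is written back, so the next run skips it unless the file changes again. That is what keeps incremental runs under a few seconds.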
Recommended workflow:

```bash
# Daily/weekly: run both to pick up new conversations
python scripts/conversation-processor.py --project-name annex
python scripts/rag-indexer.py

# Takes <5s if only a few conversations are new
```
## Integration Examples

### Search from the command line

```bash
# Quick search function for .bashrc or .zshrc
cc-search() {
  python ~/.claude/skills/cc-insights/scripts/search-conversations.py "$@"
}

# Usage
cc-search "authentication bugs"
```
### Generate weekly reports automatically

```bash
# Add to crontab for weekly reports
0 9 * * MON cd ~/.claude/skills/cc-insights && python scripts/insight-generator.py weekly --output ~/reports/weekly-$(date +\%Y-\%m-\%d).md
```
### Export data for external tools

```bash
# Export search results to JSON
python scripts/search-conversations.py "testing" --format json | jq

# Export metadata
sqlite3 -json .processed/conversations.db "SELECT * FROM conversations" > export.json
```
## Privacy & Security

- **Local-only**: all data stays on your machine
- **No external APIs**: embeddings are generated locally
- **Project-scoped**: only accesses the current project
- **Gitignored**: `.processed/` is excluded from version control
- **Sensitive data**: review reports before sharing (they may contain secrets)
## Requirements

### Python Dependencies

- `sentence-transformers>=2.2.0` - semantic embeddings
- `chromadb>=0.4.0` - vector database
- `jinja2>=3.1.0` - template engine
- `click>=8.1.0` - CLI framework
- `python-dateutil>=2.8.0` - date utilities
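If `requirements.txt` is missing or out of date, a file matching the list above would look like this (reconstructed from the dependency list, not copied from the repository):

```text
sentence-transformers>=2.2.0
chromadb>=0.4.0
jinja2>=3.1.0
click>=8.1.0
python-dateutil>=2.8.0
```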
### System Requirements
- Python 3.8+
- 500MB disk space (for 1,000 conversations)
- 2GB RAM (for embedding generation)
## Limitations
- Read-only: Analyzes existing conversations, doesn't modify them
- Single project: Designed for per-project insights (not cross-project)
- Static analysis: Analyzes saved conversations, not real-time
- Embedding quality: Good but not GPT-4 level (local models)
- JSONL format: Depends on Claude Code's internal storage format
## Future Enhancements
Potential additions (not currently implemented):
- Cross-project analytics dashboard
- AI-powered summarization with LLM
- Slack/Discord integration for weekly reports
- Git commit correlation
- VS Code extension
- Web dashboard (Next.js)
- Confluence/Notion export
- Custom embedding models
## FAQ

**Q: How often should I rebuild the index?**
A: Never, unless you change models; incremental updates handle the rest.

**Q: Can I change the embedding model?**
A: Yes. Pass `--model` to `rag-indexer.py`, then run with `--rebuild`.

**Q: Does this work with incognito mode?**
A: No. Incognito conversations aren't saved to JSONL files.

**Q: Can I share reports with my team?**
A: Yes, but review them for sensitive information first (API keys, secrets).

**Q: What if Claude Code changes the JSONL format?**
A: The processor may need updates. File an issue if parsing breaks.

**Q: Can I delete old conversations?**
A: Yes. Remove the JSONL files, then run with `--reindex` to rebuild.
## Contributing
Contributions welcome! Areas to improve:
- Additional report templates
- Better pattern detection algorithms
- Performance optimizations
- Web dashboard implementation
- Documentation improvements
## License
MIT License - See repository root for details
## Support

For issues or questions:

- Check this README and SKILL.md
- Review each script's `--help` output
- Run with `--verbose` to see detailed logs
- Check `.processed/logs/` if it exists
- Open an issue in the repository

---

*Built for Connor's annex project. Zero-effort conversation intelligence.*