Initial commit

Zhongwei Li
2025-11-30 08:46:50 +08:00
commit a3a73d67d7
67 changed files with 19703 additions and 0 deletions

@@ -0,0 +1,302 @@
# Oracle SessionStart Hook Setup
This guide explains how to configure Claude Code to automatically load Oracle context when sessions start.
## Overview
The SessionStart hook automatically injects Oracle knowledge into every new or resumed Claude Code session, ensuring Claude always has access to:
- Critical gotchas and warnings
- Recent corrections
- High-priority patterns and solutions
- Project-specific preferences
## Quick Setup
### 1. Add Hook to Claude Code Settings
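Edit the Claude Code settings file that matches the scope you want:
- **User settings (all projects)**: `~/.claude/settings.json` (on Windows, `%USERPROFILE%\.claude\settings.json`)
- **Project settings (shared with the team)**: `.claude/settings.json` in the project root
- **Project settings (local only)**: `.claude/settings.local.json` in the project root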
Add this configuration to the `hooks` section:
```json
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup",
        "hooks": [
          {
            "type": "command",
            "command": "python /full/path/to/ClaudeShack/skills/oracle/scripts/session_start_hook.py"
          }
        ]
      }
    ]
  }
}
```
**Important**: Replace `/full/path/to/ClaudeShack` with the actual absolute path to your ClaudeShack directory.
### 2. Test the Hook
Test that the hook works by running it manually:
```bash
cd /path/to/your/project
python /path/to/ClaudeShack/skills/oracle/scripts/session_start_hook.py --debug
```
You should see Oracle context output to stderr. If you see "Oracle: Not initialized", run:
```bash
python /path/to/ClaudeShack/skills/oracle/scripts/init_oracle.py
```
### 3. Start a New Session
Start a new Claude Code session. Oracle context should automatically be injected!
You'll see something like:
```markdown
# Oracle Project Knowledge
Knowledge Base: 25 entries | 5 sessions recorded
## Key Knowledge
### Gotchas (Watch Out!)
- **[CRITICAL]** Database connections must be closed explicitly
- **API rate limit is 100 req/min**
### Recent Corrections
- Use textContent instead of innerHTML for user input (XSS prevention)
- Always use async/await, not callbacks
```
## Configuration Options
### Context Tier Levels
Control how much context is loaded using the `ORACLE_CONTEXT_TIER` environment variable:
```bash
# In your shell profile (.bashrc, .zshrc, etc.):
export ORACLE_CONTEXT_TIER=1 # Default: Critical + High priority only
export ORACLE_CONTEXT_TIER=2 # Include Medium priority
export ORACLE_CONTEXT_TIER=3 # All knowledge
```
Or pass it directly in the hook configuration:
```json
{
  "type": "command",
  "command": "ORACLE_CONTEXT_TIER=2 python /path/to/session_start_hook.py"
}
```
Or use the CLI argument:
```json
{
  "type": "command",
  "command": "python /path/to/session_start_hook.py --tier 2"
}
```
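As a rough illustration of how these three options might interact, the sketch below lets the `--tier` value win over `ORACLE_CONTEXT_TIER` and falls back to tier 1. The precedence order, the function names `resolve_tier` and `filter_by_tier`, and the priority labels are assumptions made for this example; the actual `session_start_hook.py` may behave differently.

```python
import os

# Hypothetical mapping from tier to the priorities it includes.
# Tier 3 means "all knowledge", so it is handled separately below.
TIER_PRIORITIES = {
    1: {"critical", "high"},
    2: {"critical", "high", "medium"},
}
VALID_TIERS = (1, 2, 3)


def resolve_tier(cli_tier=None, environ=None):
    """Pick the tier: CLI value first, then ORACLE_CONTEXT_TIER, then 1."""
    environ = os.environ if environ is None else environ
    raw = cli_tier if cli_tier is not None else environ.get("ORACLE_CONTEXT_TIER", "1")
    try:
        tier = int(raw)
    except (TypeError, ValueError):
        return 1  # unparsable value: fall back to the default tier
    return tier if tier in VALID_TIERS else 1


def filter_by_tier(entries, tier):
    """Keep only the knowledge entries whose priority belongs to the tier."""
    if tier >= 3:
        return list(entries)
    allowed = TIER_PRIORITIES[tier]
    return [entry for entry in entries if entry.get("priority") in allowed]
```

Passing `environ` explicitly keeps the functions easy to test without touching the real environment.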
### Maximum Context Length
Limit context size to avoid overwhelming the session:
```bash
export ORACLE_MAX_CONTEXT_LENGTH=5000 # Default: 5000 characters
export ORACLE_MAX_CONTEXT_LENGTH=10000 # More context
export ORACLE_MAX_CONTEXT_LENGTH=2000 # Less context
```
Or via CLI:
```json
{
  "type": "command",
  "command": "python /path/to/session_start_hook.py --max-length 10000"
}
```
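To make the length limit concrete, here is a minimal sketch of how a hook could read `ORACLE_MAX_CONTEXT_LENGTH` and trim its output. The helper names, the truncation notice, and the choice to cut at the last complete line are assumptions for illustration, not a description of what `session_start_hook.py` actually does.

```python
import os


def max_context_length(environ=None, default=5000):
    """Read ORACLE_MAX_CONTEXT_LENGTH, falling back to the default on bad input."""
    environ = os.environ if environ is None else environ
    try:
        value = int(environ.get("ORACLE_MAX_CONTEXT_LENGTH", default))
    except ValueError:
        return default
    return value if value > 0 else default


def truncate_context(text, limit, notice="\n\n[Oracle context truncated]"):
    """Trim text to at most `limit` characters, ending with a short notice."""
    if len(text) <= limit:
        return text
    budget = max(limit - len(notice), 0)
    cut = text[:budget]
    # Prefer to stop at the last complete line so no entry is cut mid-sentence.
    newline = cut.rfind("\n")
    if newline > 0:
        cut = cut[:newline]
    return (cut + notice)[:limit]
```

Cutting on a line boundary keeps each remaining Markdown bullet intact, at the cost of using slightly fewer characters than the limit allows.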
## Advanced Configuration
### Hook on Resume Only
To load Oracle context only when resuming sessions (not on new sessions):
```json
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "resume",
        "hooks": [
          {
            "type": "command",
            "command": "python /path/to/session_start_hook.py --source resume"
          }
        ]
      }
    ]
  }
}
```
### Hook on Both Startup and Resume
```json
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup",
        "hooks": [
          {
            "type": "command",
            "command": "python /path/to/session_start_hook.py --source startup"
          }
        ]
      },
      {
        "matcher": "resume",
        "hooks": [
          {
            "type": "command",
            "command": "python /path/to/session_start_hook.py --source resume"
          }
        ]
      }
    ]
  }
}
```
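The `--source` flag above tells the script which event triggered it. Claude Code also passes hook input as JSON on stdin, and for SessionStart that payload is understood to carry a `source` field. The sketch below shows one way a script could combine the two; the function name `read_session_source`, the precedence of the flag over stdin, and the `"startup"` fallback are assumptions, so check them against the hooks documentation for your Claude Code version.

```python
import json


def read_session_source(stdin_text, cli_source=None):
    """Return the session source: explicit --source first, then stdin JSON."""
    if cli_source:
        return cli_source
    try:
        payload = json.loads(stdin_text or "{}")
    except json.JSONDecodeError:
        return "startup"  # unreadable input: assume a fresh session
    if not isinstance(payload, dict):
        return "startup"
    return payload.get("source", "startup")
```

With this approach a single hook entry could serve both events, since the script can tell them apart itself.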
### Per-Project Configuration
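If you work with multiple projects, you can use different configurations. The `pathPattern` key used in this example may not be recognized by every Claude Code version, so verify it against the hooks documentation before relying on it; placing each project's hook in that project's own `.claude/settings.json` is an alternative that does not depend on it.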
```json
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup",
        "pathPattern": "**/my-critical-project/**",
        "hooks": [
          {
            "type": "command",
            "command": "ORACLE_CONTEXT_TIER=1 python /path/to/session_start_hook.py"
          }
        ]
      },
      {
        "matcher": "startup",
        "pathPattern": "**/my-casual-project/**",
        "hooks": [
          {
            "type": "command",
            "command": "ORACLE_CONTEXT_TIER=3 python /path/to/session_start_hook.py"
          }
        ]
      }
    ]
  }
}
```
## Troubleshooting
### Hook Not Running
1. **Check settings file syntax**: Ensure valid JSON (no trailing commas, proper quotes)
2. **Check paths**: Use absolute paths, not relative
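3. **Check permissions**: Ensure the script is readable by your user; the executable bit (`chmod +x session_start_hook.py`) only matters if the command runs the script directly instead of through `python`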
4. **Test manually**: Run the script from your project directory
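The first two checks can be automated with a few lines of Python. This is a minimal sketch, assuming the settings layout shown in Quick Setup; `find_session_start_commands` is a hypothetical helper, not part of Oracle.

```python
import json


def find_session_start_commands(settings_text):
    """Parse settings JSON and return (matcher, command) pairs for SessionStart.

    Raises json.JSONDecodeError (a ValueError) if the file is not valid JSON,
    for example because of a trailing comma.
    """
    settings = json.loads(settings_text)
    pairs = []
    for group in settings.get("hooks", {}).get("SessionStart", []):
        matcher = group.get("matcher", "")
        for hook in group.get("hooks", []):
            if hook.get("type") == "command":
                pairs.append((matcher, hook.get("command", "")))
    return pairs
```

Read your settings file into a string, pass it to the function, and confirm that every returned command uses an absolute path to `session_start_hook.py`.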
### No Context Showing
1. **Verify Oracle is initialized**: Run `ls -la .oracle/` in your project
2. **Check if knowledge exists**: Run `python /path/to/query_knowledge.py --summary`
3. **Test hook in debug mode**: `python session_start_hook.py --debug`
### Context Too Large
Reduce context with:
- Lower tier level (`ORACLE_CONTEXT_TIER=1`)
- Smaller max length (`ORACLE_MAX_CONTEXT_LENGTH=3000`)
- Prioritize your knowledge entries (set priority to `low` for less critical items)
### Context Not Relevant
By default (tier 1), the SessionStart hook loads only critical and high-priority items, regardless of the task at hand. To get task-specific context:
1. Use the oracle skill manually: `/oracle` (if available)
2. Run: `python /path/to/generate_context.py --task "your task description"`
3. Query specific knowledge: `python /path/to/query_knowledge.py "keywords"`
## Best Practices
1. **Keep critical items truly critical**: Only mark security, data loss, and breaking issues as critical
2. **Regular cleanup**: Review and remove outdated knowledge monthly
3. **Use tags**: Tag knowledge for better organization
4. **Record sessions**: Use `record_session.py` after important sessions
5. **Analyze history**: Run `analyze_history.py --auto-populate` weekly to mine conversation history
## Hook Output Format
The hook outputs JSON in this format:
```json
{
  "hookSpecificOutput": {
    "hookEventName": "SessionStart",
    "additionalContext": "# Oracle Project Knowledge\n\n..."
  }
}
```
Claude Code reads the `additionalContext` field and injects it into the session context.
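A script that produces this structure needs little more than `json.dump`. The sketch below follows the format shown above; the function names `build_hook_output` and `emit_hook_output` are hypothetical and only illustrate the shape of the output.

```python
import json
import sys


def build_hook_output(context):
    """Wrap a Markdown context string in the SessionStart output structure."""
    return {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": context,
        }
    }


def emit_hook_output(context, stream=None):
    """Write the hook output as a single JSON document (stdout by default)."""
    stream = sys.stdout if stream is None else stream
    json.dump(build_hook_output(context), stream)
    stream.write("\n")
```

Using `json.dump` rather than string formatting ensures that newlines and quotes inside the Markdown context are escaped correctly.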
## Verification
To verify the hook is working:
1. Start a new session
2. Ask Claude: "What do you know about this project from Oracle?"
3. Claude should reference the injected knowledge
## Disable Hook Temporarily
To temporarily disable the hook without removing configuration:
1. Change the `matcher` to a value that never matches a session source
2. Or comment out the hook in settings (use `//` in JSONC format if supported)
3. Or set environment variable: `export ORACLE_HOOK_DISABLED=1`
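For the environment-variable option, a check like the following could sit at the top of the hook script. It is a sketch only: `hook_disabled` is a hypothetical name, and accepting `true` and `yes` in addition to `1` is an assumption rather than documented behavior.

```python
import os


def hook_disabled(environ=None):
    """Return True when ORACLE_HOOK_DISABLED is set to a truthy value."""
    environ = os.environ if environ is None else environ
    value = environ.get("ORACLE_HOOK_DISABLED", "")
    return value.strip().lower() in {"1", "true", "yes"}


# Typical use at the start of the script's main():
#     if hook_disabled():
#         sys.exit(0)  # exit quietly without emitting any context
```

Exiting with status 0 and no output lets the session start normally with no Oracle context injected.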
## Related Scripts
- `init_oracle.py` - Initialize Oracle for a project
- `record_session.py` - Record session learnings
- `query_knowledge.py` - Query knowledge base
- `generate_context.py` - Generate context summaries
- `analyze_history.py` - Mine conversation history
## Support
For issues or questions:
1. Check the troubleshooting section above
2. Review Claude Code hooks documentation
3. Test the script manually with `--debug` flag
4. Check Claude Code logs for hook execution errors

@@ -0,0 +1,701 @@
#!/usr/bin/env python3
"""
Oracle Conversation History Analyzer
Analyzes Claude Code conversation history from ~/.claude/projects/ and extracts:
- Patterns and repeated tasks
- Corrections and learnings
- User preferences and gotchas
- Automation opportunities
This script mines existing conversation data without requiring manual capture.
Usage:
python analyze_history.py [options]
python analyze_history.py --project-hash abc123 --auto-populate
python analyze_history.py --all-projects --recent-days 30
python analyze_history.py --analyze-only
Examples:
python analyze_history.py --auto-populate
python analyze_history.py --project-hash abc123def456
python analyze_history.py --recent-days 30 --min-task-occurrences 5
"""
import os
import sys
import json
import argparse
from datetime import datetime, timedelta
from pathlib import Path
from collections import defaultdict, Counter
import re
import uuid
CLAUDE_PROJECTS_PATH = Path.home() / '.claude' / 'projects'
# Configuration constants
CONFIG = {
'MAX_TITLE_LENGTH': 200,
'ACTION_CONTEXT_MIN_LEN': 10,
'ACTION_CONTEXT_MAX_LEN': 50,
'TOP_TOOLS_TO_REPORT': 20,
'TOP_CORRECTIONS_TO_ADD': 10,
'TOP_GOTCHAS_TO_ADD': 10,
'TOP_TASKS_TO_ADD': 5,
'MAX_PREFERENCES_TO_ADD': 10,
'DEFAULT_MIN_TASK_OCCURRENCES': 3,
'SNIPPET_LENGTH': 80,
}
# Precompiled regex patterns for performance
CORRECTION_PATTERNS = [
re.compile(r"(?:that's|thats)\s+(?:wrong|incorrect|not right)", re.IGNORECASE),
re.compile(r"(?:don't|dont|do not)\s+(?:do|use|implement)", re.IGNORECASE),
re.compile(r"(?:should|need to)\s+(?:use|do|implement).+(?:instead|not)", re.IGNORECASE),
re.compile(r"(?:actually|correction|fix)[:,]\s+", re.IGNORECASE),
re.compile(r"(?:no|nope),?\s+(?:use|do|try|implement)", re.IGNORECASE),
re.compile(r"(?:wrong|incorrect|mistake)[:,]", re.IGNORECASE),
re.compile(r"(?:better to|prefer to|should)\s+(?:use|do)", re.IGNORECASE),
]
PREFERENCE_PATTERNS = [
re.compile(r"(?:i prefer|i'd prefer|prefer to|i like)\s+(.+)", re.IGNORECASE),
re.compile(r"(?:always|never)\s+(?:use|do|implement)\s+(.+)", re.IGNORECASE),
re.compile(r"(?:i want|i'd like|i need)\s+(.+)", re.IGNORECASE),
re.compile(r"(?:make sure|ensure|remember)\s+(?:to|that)?\s+(.+)", re.IGNORECASE),
re.compile(r"(?:use|implement|do)\s+(.+)\s+(?:instead|not)", re.IGNORECASE),
]
GOTCHA_PATTERNS = [
re.compile(r"(?:error|issue|problem|bug|failing|broken)[:,]?\s+(.+)", re.IGNORECASE),
re.compile(r"(?:warning|careful|watch out)[:,]?\s+(.+)", re.IGNORECASE),
re.compile(r"(?:doesn't work|not working|fails when)\s+(.+)", re.IGNORECASE),
re.compile(r"(?:remember|don't forget)[:,]?\s+(.+)", re.IGNORECASE),
]
def truncate_text(text, max_length=100, suffix='...'):
"""Truncate text to max_length, breaking at word boundaries."""
if len(text) <= max_length:
return text
truncated = text[:max_length].rsplit(' ', 1)[0]
return truncated + suffix
def ensure_knowledge_file(file_path, default_content=None):
"""Ensure knowledge file exists, create with default content if missing."""
if not file_path.exists():
file_path.parent.mkdir(parents=True, exist_ok=True)
with open(file_path, 'w', encoding='utf-8') as f:
json.dump(default_content or [], f, indent=2)
with open(file_path, 'r', encoding='utf-8') as f:
return json.load(f)
def find_oracle_root():
"""Find the .oracle directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None
def find_project_hash(oracle_path):
    """Try to determine the Claude Code project directory for the current project."""
    if not CLAUDE_PROJECTS_PATH.exists():
        return None

    project_root = oracle_path.parent.resolve()

    # Assumption: Claude Code names each project directory after the project's
    # absolute path with non-alphanumeric characters replaced by '-'.
    encoded = re.sub(r'[^A-Za-z0-9]', '-', str(project_root))
    if (CLAUDE_PROJECTS_PATH / encoded).is_dir():
        return encoded

    # Fallback: the most recently modified project directory. This is only a
    # guess and may belong to a different project; pass --project-hash to be sure.
    project_dirs = [d for d in CLAUDE_PROJECTS_PATH.iterdir() if d.is_dir()]
    project_dirs.sort(key=lambda d: d.stat().st_mtime, reverse=True)
    return project_dirs[0].name if project_dirs else None
def load_conversation_history(project_hash, recent_days=None):
"""Load conversation history from JSONL files."""
project_path = CLAUDE_PROJECTS_PATH / project_hash
if not project_path.exists():
print(f"[ERROR] Project path not found: {project_path}")
return []
conversations = []
cutoff_date = None
if recent_days:
cutoff_date = datetime.now() - timedelta(days=recent_days)
# Find all JSONL files
jsonl_files = list(project_path.glob('*.jsonl'))
print(f"[INFO] Found {len(jsonl_files)} conversation files in project {project_hash[:8]}...")
for jsonl_file in jsonl_files:
# Check modification date
if cutoff_date:
mtime = datetime.fromtimestamp(jsonl_file.stat().st_mtime)
if mtime < cutoff_date:
continue
try:
# Use streaming approach for memory efficiency
session_data = {
'session_id': jsonl_file.stem,
'file_path': jsonl_file,
'messages': [],
'tools_used': [],
'created': datetime.fromtimestamp(jsonl_file.stat().st_mtime)
}
with open(jsonl_file, 'r', encoding='utf-8') as f:
for line in f: # Stream line by line - memory efficient
if line.strip():
try:
entry = json.loads(line)
session_data['messages'].append(entry)
# Extract tool usage
if 'message' in entry:
content = entry['message'].get('content', [])
if isinstance(content, list):
for item in content:
if isinstance(item, dict) and item.get('type') == 'tool_use':
session_data['tools_used'].append(item.get('name'))
except json.JSONDecodeError:
continue
conversations.append(session_data)
except Exception as e:
print(f"[WARNING] Failed to load {jsonl_file.name}: {e}")
continue
print(f"[OK] Loaded {len(conversations)} conversations")
return conversations
def extract_messages_by_role(conversations, role='user'):
"""Extract messages of specified role from conversations."""
messages = []
for session in conversations:
for msg in session['messages']:
if 'message' not in msg:
continue
message = msg['message']
if message.get('role') != role:
continue
content = message.get('content', '')
# Handle both string and list content
if isinstance(content, list):
text_parts = []
for item in content:
if isinstance(item, dict) and item.get('type') == 'text':
text_parts.append(item.get('text', ''))
content = ' '.join(text_parts)
if content:
messages.append({
'session_id': session['session_id'],
'content': content,
'timestamp': session['created']
})
return messages
def detect_corrections(user_messages):
"""Detect correction patterns in user messages."""
corrections = []
for msg in user_messages:
content = msg['content']
for pattern in CORRECTION_PATTERNS:
if pattern.search(content):
corrections.append({
'session_id': msg['session_id'],
'content': msg['content'],
'timestamp': msg['timestamp'],
'pattern_matched': pattern.pattern
})
break
return corrections
def detect_preferences(user_messages):
"""Detect user preferences from messages."""
preferences = []
for msg in user_messages:
content = msg['content']
for pattern in PREFERENCE_PATTERNS:
matches = pattern.findall(content)
if matches:
for match in matches:
match_text = match.strip() if isinstance(match, str) else match
# Only capture meaningful preferences
if len(match_text) > 5:
preferences.append({
'session_id': msg['session_id'],
'preference': match_text,
'full_context': content,
'timestamp': msg['timestamp']
})
return preferences
def detect_repeated_tasks(user_messages, min_occurrences=None):
    """Detect repeated tasks that could be automated."""
    if min_occurrences is None:
        min_occurrences = CONFIG['DEFAULT_MIN_TASK_OCCURRENCES']

    # Map each "verb + phrase" task to the messages it appears in
    task_patterns = defaultdict(list)

    # Common action verbs
    action_verbs = [
        'create', 'add', 'update', 'delete', 'remove', 'fix', 'refactor',
        'implement', 'write', 'generate', 'build', 'run', 'test', 'deploy'
    ]

    min_len = CONFIG['ACTION_CONTEXT_MIN_LEN']
    max_len = CONFIG['ACTION_CONTEXT_MAX_LEN']

    for msg in user_messages:
        content = msg['content'].lower()

        for verb in action_verbs:
            # Match the verb as a whole word, followed by min_len..max_len
            # letters, whitespace, or hyphens. The quantifier braces are doubled
            # because this is an f-string; the original concatenation left a
            # stray '}' in the pattern, so it almost never matched.
            pattern = rf'\b{verb}\b\s+([a-zA-Z\s-]{{{min_len},{max_len}}})'
            matches = re.findall(pattern, content)

            for match in matches:
                # Clean up the match
                clean_match = re.sub(r'[^\w\s-]', '', match).strip()
                if len(clean_match) > 5:
                    task_patterns[f"{verb} {clean_match}"].append({
                        'session_id': msg['session_id'],
                        'full_content': msg['content'],
                        'timestamp': msg['timestamp']
                    })

    # Keep only tasks that occur at least min_occurrences times
    repeated_tasks = []
    for task, occurrences in task_patterns.items():
        if len(occurrences) >= min_occurrences:
            repeated_tasks.append({
                'task': task,
                'occurrences': len(occurrences),
                'instances': occurrences
            })

    # Sort by frequency, most frequent first
    repeated_tasks.sort(key=lambda x: x['occurrences'], reverse=True)
    return repeated_tasks
def detect_gotchas(user_messages, assistant_messages):
"""Detect gotchas from conversations about problems/errors."""
gotchas = []
# Check user messages for problem reports
for msg in user_messages:
content = msg['content']
for pattern in GOTCHA_PATTERNS:
matches = pattern.findall(content)
if matches:
for match in matches:
match_text = match.strip() if isinstance(match, str) else match
gotchas.append({
'session_id': msg['session_id'],
'gotcha': match_text,
'context': content,
'timestamp': msg['timestamp'],
'source': 'user'
})
return gotchas
def analyze_tool_usage(conversations):
"""Analyze which tools are used most frequently."""
tool_counter = Counter()
for session in conversations:
for tool in session['tools_used']:
tool_counter[tool] += 1
return tool_counter.most_common(CONFIG['TOP_TOOLS_TO_REPORT'])
def create_knowledge_entry(category, title, content, context='', priority='medium',
learned_from='conversation_history', tags=None):
"""Create a knowledge entry in Oracle format."""
return {
'id': str(uuid.uuid4()),
'category': category,
'priority': priority,
'title': truncate_text(title, CONFIG['MAX_TITLE_LENGTH']),
'content': content,
'context': context,
'examples': [],
'learned_from': learned_from,
'created': datetime.now().isoformat(),
'last_used': datetime.now().isoformat(),
'use_count': 1,
'tags': tags or []
}
def populate_oracle_knowledge(oracle_path, corrections, preferences, gotchas, repeated_tasks):
"""Populate Oracle knowledge base with extracted data."""
knowledge_dir = oracle_path / 'knowledge'
# Ensure knowledge directory exists
knowledge_dir.mkdir(parents=True, exist_ok=True)
added_counts = {
'corrections': 0,
'preferences': 0,
'gotchas': 0,
'patterns': 0
}
# Add corrections
if corrections:
corrections_file = knowledge_dir / 'corrections.json'
existing_corrections = ensure_knowledge_file(corrections_file, [])
for correction in corrections[:CONFIG['TOP_CORRECTIONS_TO_ADD']]:
# Create entry
entry = create_knowledge_entry(
category='correction',
title=f"Correction: {correction['content']}",
content=correction['content'],
context='Extracted from conversation history',
priority='high',
learned_from='conversation_history_analyzer',
tags=['auto-extracted', 'correction']
)
existing_corrections.append(entry)
added_counts['corrections'] += 1
with open(corrections_file, 'w', encoding='utf-8') as f:
json.dump(existing_corrections, f, indent=2)
# Add preferences
if preferences:
preferences_file = knowledge_dir / 'preferences.json'
existing_preferences = ensure_knowledge_file(preferences_file, [])
# Deduplicate preferences
seen_preferences = set()
for pref in preferences:
pref_text = pref['preference'].lower()
# Skip if too similar to existing
if pref_text in seen_preferences:
continue
seen_preferences.add(pref_text)
entry = create_knowledge_entry(
category='preference',
title=f"Preference: {pref['preference']}",
content=pref['preference'],
context=truncate_text(pref['full_context'], 500),
priority='medium',
learned_from='conversation_history_analyzer',
tags=['auto-extracted', 'preference']
)
existing_preferences.append(entry)
added_counts['preferences'] += 1
if added_counts['preferences'] >= CONFIG['MAX_PREFERENCES_TO_ADD']:
break
with open(preferences_file, 'w', encoding='utf-8') as f:
json.dump(existing_preferences, f, indent=2)
# Add gotchas
if gotchas:
gotchas_file = knowledge_dir / 'gotchas.json'
existing_gotchas = ensure_knowledge_file(gotchas_file, [])
for gotcha in gotchas[:CONFIG['TOP_GOTCHAS_TO_ADD']]:
entry = create_knowledge_entry(
category='gotcha',
title=f"Gotcha: {gotcha['gotcha']}",
content=gotcha['gotcha'],
context=truncate_text(gotcha['context'], 500),
priority='high',
learned_from='conversation_history_analyzer',
tags=['auto-extracted', 'gotcha']
)
existing_gotchas.append(entry)
added_counts['gotchas'] += 1
with open(gotchas_file, 'w', encoding='utf-8') as f:
json.dump(existing_gotchas, f, indent=2)
# Add repeated tasks as patterns (automation candidates)
if repeated_tasks:
patterns_file = knowledge_dir / 'patterns.json'
existing_patterns = ensure_knowledge_file(patterns_file, [])
for task in repeated_tasks[:CONFIG['TOP_TASKS_TO_ADD']]:
entry = create_knowledge_entry(
category='pattern',
title=f"Repeated task: {task['task']}",
content=f"This task has been performed {task['occurrences']} times. Consider automating it.",
context='Detected from conversation history analysis',
priority='medium',
learned_from='conversation_history_analyzer',
tags=['auto-extracted', 'automation-candidate', 'repeated-task']
)
existing_patterns.append(entry)
added_counts['patterns'] += 1
with open(patterns_file, 'w', encoding='utf-8') as f:
json.dump(existing_patterns, f, indent=2)
return added_counts
def generate_analysis_report(conversations, corrections, preferences, gotchas,
repeated_tasks, tool_usage):
"""Generate a comprehensive analysis report."""
report = []
report.append("="*70)
report.append("Oracle Conversation History Analysis Report")
report.append("="*70)
report.append("")
# Summary
total_messages = sum(len(c['messages']) for c in conversations)
report.append(f"Analyzed Conversations: {len(conversations)}")
report.append(f"Total Messages: {total_messages}")
report.append("")
# Corrections
report.append(f"Corrections Detected: {len(corrections)}")
if corrections:
report.append(" Top Corrections:")
for i, corr in enumerate(corrections[:5], 1):
snippet = truncate_text(corr['content'].replace('\n', ' '), CONFIG['SNIPPET_LENGTH'])
report.append(f" {i}. {snippet}")
report.append("")
# Preferences
report.append(f"User Preferences Detected: {len(preferences)}")
if preferences:
report.append(" Sample Preferences:")
for i, pref in enumerate(preferences[:5], 1):
snippet = truncate_text(pref['preference'], CONFIG['SNIPPET_LENGTH'])
report.append(f" {i}. {snippet}")
report.append("")
# Gotchas
report.append(f"Gotchas/Issues Detected: {len(gotchas)}")
if gotchas:
report.append(" Sample Gotchas:")
for i, gotcha in enumerate(gotchas[:5], 1):
snippet = truncate_text(str(gotcha['gotcha']), CONFIG['SNIPPET_LENGTH'])
report.append(f" {i}. {snippet}")
report.append("")
# Repeated Tasks
report.append(f"Repeated Tasks (Automation Candidates): {len(repeated_tasks)}")
if repeated_tasks:
report.append(" Top Repeated Tasks:")
for i, task in enumerate(repeated_tasks[:5], 1):
report.append(f" {i}. {task['task']} (x{task['occurrences']})")
report.append("")
# Tool Usage
report.append("Most Used Tools:")
for i, (tool, count) in enumerate(tool_usage[:10], 1):
report.append(f" {i}. {tool}: {count} times")
report.append("")
report.append("="*70)
return "\n".join(report)
def main():
parser = argparse.ArgumentParser(
description='Analyze Claude Code conversation history',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python analyze_history.py --auto-populate
python analyze_history.py --project-hash abc123def456
python analyze_history.py --all-projects --recent-days 30
python analyze_history.py --analyze-only --min-task-occurrences 5
"""
)
parser.add_argument(
'--project-hash',
help='Specific project hash to analyze'
)
parser.add_argument(
'--all-projects',
action='store_true',
help='Analyze all projects (accepted, but not yet acted on by this script)'
)
parser.add_argument(
'--recent-days',
type=int,
help='Only analyze conversations from last N days'
)
parser.add_argument(
'--auto-populate',
action='store_true',
help='Automatically populate Oracle knowledge base'
)
parser.add_argument(
'--analyze-only',
action='store_true',
help='Only analyze and report, do not populate Oracle'
)
parser.add_argument(
'--min-task-occurrences',
type=int,
default=CONFIG['DEFAULT_MIN_TASK_OCCURRENCES'],
help='Minimum occurrences to consider a task as repeated'
)
args = parser.parse_args()
# Find Oracle
oracle_path = find_oracle_root()
if not oracle_path and not args.analyze_only:
print("[ERROR] .oracle directory not found.")
print(" Run: python .claude/skills/oracle/scripts/init_oracle.py")
sys.exit(1)
# Determine project hash
if args.project_hash:
project_hash = args.project_hash
elif oracle_path:
project_hash = find_project_hash(oracle_path)
if not project_hash:
print("[ERROR] Could not determine project hash.")
print(" Use --project-hash to specify manually")
sys.exit(1)
else:
print("[ERROR] Please specify --project-hash")
sys.exit(1)
print(f"\n[INFO] Analyzing project: {project_hash[:8]}...")
print(f"[INFO] Claude projects path: {CLAUDE_PROJECTS_PATH}\n")
# Load conversations
conversations = load_conversation_history(project_hash, args.recent_days)
if not conversations:
print("[ERROR] No conversations found.")
sys.exit(1) # Exit with error code
# Extract messages
print("[INFO] Extracting user and assistant messages...")
user_messages = extract_messages_by_role(conversations, role='user')
assistant_messages = extract_messages_by_role(conversations, role='assistant')
print(f"[OK] Found {len(user_messages)} user messages")
print(f"[OK] Found {len(assistant_messages)} assistant messages\n")
# Analyze
print("[INFO] Detecting corrections...")
corrections = detect_corrections(user_messages)
print("[INFO] Detecting preferences...")
preferences = detect_preferences(user_messages)
print("[INFO] Detecting gotchas...")
gotchas = detect_gotchas(user_messages, assistant_messages)
print("[INFO] Detecting repeated tasks...")
repeated_tasks = detect_repeated_tasks(user_messages, args.min_task_occurrences)
print("[INFO] Analyzing tool usage...")
tool_usage = analyze_tool_usage(conversations)
print("")
# Generate report
report = generate_analysis_report(
conversations, corrections, preferences, gotchas,
repeated_tasks, tool_usage
)
print(report)
# Populate Oracle if requested
if args.auto_populate and oracle_path and not args.analyze_only:
print("\n[INFO] Populating Oracle knowledge base...")
added_counts = populate_oracle_knowledge(
oracle_path, corrections, preferences, gotchas, repeated_tasks
)
print("\n[OK] Knowledge base updated:")
for category, count in added_counts.items():
if count > 0:
print(f" {category.capitalize()}: +{count} entries")
print("\n[OK] Analysis complete! Knowledge base has been updated.")
print(" Query knowledge: python .claude/skills/oracle/scripts/query_knowledge.py")
elif args.analyze_only:
print("\n[INFO] Analysis complete (no changes made to Oracle)")
if __name__ == '__main__':
main()

@@ -0,0 +1,413 @@
#!/usr/bin/env python3
"""
Oracle Pattern Analysis Script
Analyze Oracle knowledge base and session logs to identify:
- Repeated tasks (candidates for automation)
- Common corrections (update defaults/documentation)
- Frequent queries (add to auto-inject context)
- Token-heavy operations (automate)
Usage:
python analyze_patterns.py
python analyze_patterns.py --generate-scripts
python analyze_patterns.py --threshold 3
Examples:
python analyze_patterns.py
python analyze_patterns.py --generate-scripts --threshold 5
"""
import os
import sys
import json
import argparse
from datetime import datetime
from pathlib import Path
from collections import Counter, defaultdict
import re
def find_oracle_root():
"""Find the .oracle directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None
def load_all_sessions(oracle_path):
"""Load all session logs."""
sessions_dir = oracle_path / 'sessions'
sessions = []
for session_file in sessions_dir.glob('*.md'):
try:
with open(session_file, 'r') as f:
content = f.read()
sessions.append({
'id': session_file.stem,
'file': session_file,
'content': content
})
except Exception as e:
print(f"Warning: Could not read {session_file}: {e}")
return sessions
def analyze_repeated_activities(sessions):
"""Find repeated activities across sessions."""
all_activities = []
for session in sessions:
# Extract activities from session log
content = session['content']
if '## Activities' in content:
activities_section = content.split('## Activities')[1].split('\n\n')[0]
activities = re.findall(r'^- (.+)$', activities_section, re.MULTILINE)
all_activities.extend(activities)
# Count occurrences
activity_counts = Counter(all_activities)
return activity_counts
def analyze_corrections(oracle_path):
"""Analyze correction patterns."""
knowledge_dir = oracle_path / 'knowledge'
corrections_file = knowledge_dir / 'corrections.json'
if not corrections_file.exists():
return {}
with open(corrections_file, 'r') as f:
corrections = json.load(f)
# Group by common themes
themes = defaultdict(list)
for correction in corrections:
content = correction.get('content', '')
# Try to identify theme
if 'async' in content.lower() or 'await' in content.lower():
themes['async-programming'].append(correction)
elif 'security' in content.lower() or 'xss' in content.lower() or 'injection' in content.lower():
themes['security'].append(correction)
elif 'performance' in content.lower() or 'optimization' in content.lower():
themes['performance'].append(correction)
elif 'test' in content.lower():
themes['testing'].append(correction)
else:
themes['general'].append(correction)
return themes
def analyze_file_patterns(sessions):
"""Analyze which files are changed most often."""
file_changes = Counter()
for session in sessions:
content = session['content']
if '## Changes Made' in content:
# Extract file paths
files = re.findall(r'\*\*File\*\*: `([^`]+)`', content)
file_changes.update(files)
return file_changes
def identify_automation_candidates(activity_counts, threshold=3):
"""Identify tasks that are repeated enough to warrant automation."""
candidates = []
for activity, count in activity_counts.items():
if count >= threshold:
# Analyze if it's automatable
automation_score = 0
# Keyword-based scoring
deterministic_keywords = ['run tests', 'build', 'lint', 'format', 'deploy', 'update dependencies']
for keyword in deterministic_keywords:
if keyword in activity.lower():
automation_score += 2
if automation_score > 0 or count >= threshold * 2:
candidates.append({
'activity': activity,
'count': count,
'automation_score': automation_score,
'confidence': 'high' if automation_score >= 2 else 'medium'
})
return sorted(candidates, key=lambda x: (x['automation_score'], x['count']), reverse=True)
def generate_automation_script(activity):
"""Generate a basic automation script for an activity."""
activity_lower = activity.lower()
script_name = re.sub(r'[^a-z0-9]+', '_', activity_lower).strip('_')
script_name = f"auto_{script_name}.sh"
# Basic script template
script = f"""#!/bin/bash
# Auto-generated by Oracle Pattern Analysis
# Purpose: {activity}
# Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
set -e # Exit on error
echo " Automated task: {activity}"
echo "---"
# TODO: Implement automation logic
# Based on the activity pattern, add appropriate commands here
"""
# Add common commands based on keywords
if 'test' in activity_lower:
script += """# Run tests
# npm test
# pytest
# cargo test
"""
elif 'build' in activity_lower:
script += """# Build project
# npm run build
# cargo build
# make
"""
elif 'lint' in activity_lower:
script += """# Run linter
# npm run lint
# cargo clippy
# pylint
"""
elif 'format' in activity_lower:
script += """# Format code
# npm run format
# cargo fmt
# black .
"""
script += """
echo "---"
echo "[OK] Completed: {activity}"
""".format(activity=activity)
return script_name, script
def generate_report(oracle_path, sessions, threshold):
"""Generate analysis report."""
print("="*70)
print("[SEARCH] Oracle Pattern Analysis Report")
print("="*70)
print(f"\nAnalyzing {len(sessions)} sessions\n")
# Repeated activities
print("## Repeated Activities\n")
activity_counts = analyze_repeated_activities(sessions)
if activity_counts:
print("Top repeated tasks:\n")
for activity, count in activity_counts.most_common(10):
emoji = "[!]" if count >= threshold else "[ ]"
print(f" {emoji} [{count}x] {activity}")
else:
print(" No repeated activities found\n")
print()
# Automation candidates
print("## Automation Opportunities\n")
candidates = identify_automation_candidates(activity_counts, threshold)
if candidates:
print(f"Found {len(candidates)} automation candidates:\n")
for candidate in candidates:
confidence_emoji = "[HIGH]" if candidate['confidence'] == 'high' else "[MED]"
print(f" {confidence_emoji} [{candidate['count']}x] {candidate['activity']}")
print(f" Confidence: {candidate['confidence']}, Score: {candidate['automation_score']}\n")
else:
print(f" No automation candidates (threshold: {threshold} occurrences)\n")
print()
# Correction patterns
print("## Correction Patterns\n")
correction_themes = analyze_corrections(oracle_path)
if correction_themes:
print("Corrections by theme:\n")
for theme, corrections in sorted(correction_themes.items(), key=lambda x: len(x[1]), reverse=True):
print(f" {theme.capitalize()}: {len(corrections)} corrections")
print("\n[WARNING] Consider updating documentation or creating safeguards for common themes\n")
else:
print(" No corrections recorded yet\n")
print()
# File change patterns
print("## Frequently Modified Files\n")
file_changes = analyze_file_patterns(sessions)
if file_changes:
print("Most frequently changed files:\n")
for file_path, count in file_changes.most_common(10):
print(f" [{count}x] {file_path}")
print("\n[TIP] Consider if these files need refactoring or better structure\n")
else:
print(" No file change patterns found\n")
print()
# Recommendations
print("="*70)
print("[INFO] Recommendations")
print("="*70)
print()
if candidates:
print(f"1. **Automate {len(candidates)} repeated tasks**")
print(f" Run with --generate-scripts to create automation scripts\n")
if correction_themes:
most_common_theme = max(correction_themes.items(), key=lambda x: len(x[1]))[0]
print(f"2. **Address {most_common_theme} corrections**")
print(f" Review and create guidelines or linting rules\n")
if file_changes:
top_file = file_changes.most_common(1)[0]
print(f"3. **Review frequently changed file: {top_file[0]}**")
print(f" Changed {top_file[1]} times - may need refactoring\n")
print("="*70)
def save_automation_scripts(oracle_path, candidates):
"""Generate and save automation scripts."""
scripts_dir = oracle_path / 'scripts'
scripts_dir.mkdir(parents=True, exist_ok=True)  # create the directory if it is missing
scripts_generated = []
for candidate in candidates:
script_name, script_content = generate_automation_script(candidate['activity'])
script_path = scripts_dir / script_name
with open(script_path, 'w') as f:
f.write(script_content)
# Make executable
os.chmod(script_path, 0o755)
scripts_generated.append(script_path)
print(f"[OK] Generated: {script_path}")
# Create README in scripts dir
readme_path = scripts_dir / 'README.md'
readme_content = f"""# Auto-Generated Automation Scripts
These scripts were generated by Oracle pattern analysis on {datetime.now().strftime('%Y-%m-%d')}.
## Scripts
"""
for candidate in candidates:
script_name = re.sub(r'[^a-z0-9]+', '_', candidate['activity'].lower()).strip('_')
readme_content += f"- `auto_{script_name}.sh` - {candidate['activity']} (used {candidate['count']}x)\n"
readme_content += """
## Usage
Each script is executable:
```bash
./auto_script_name.sh
```
## Customization
These scripts are templates. Review and customize them for your specific needs.
"""
with open(readme_path, 'w') as f:
f.write(readme_content)
print(f"\n Created README: {readme_path}")
return scripts_generated
def main():
parser = argparse.ArgumentParser(
description='Analyze Oracle patterns and identify automation opportunities',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument(
'--threshold',
type=int,
default=3,
help='Minimum occurrences to consider for automation (default: 3)'
)
parser.add_argument(
'--generate-scripts',
action='store_true',
help='Generate automation scripts for candidates'
)
args = parser.parse_args()
# Find Oracle
oracle_path = find_oracle_root()
if not oracle_path:
print("[ERROR] Error: .oracle directory not found.")
sys.exit(1)
# Load sessions
sessions = load_all_sessions(oracle_path)
if not sessions:
print("[WARNING] No sessions found. Start recording sessions to enable pattern analysis.")
sys.exit(0)
# Generate report
generate_report(oracle_path, sessions, args.threshold)
# Generate scripts if requested
if args.generate_scripts:
activity_counts = analyze_repeated_activities(sessions)
candidates = identify_automation_candidates(activity_counts, args.threshold)
if candidates:
print("\n" + "="*70)
print(" Generating Automation Scripts")
print("="*70 + "\n")
scripts = save_automation_scripts(oracle_path, candidates)
print(f"\n Generated {len(scripts)} automation scripts!")
print(f" Location: {oracle_path / 'scripts'}")
print("\n[WARNING] Review and customize these scripts before use.\n")
else:
print(f"\n[WARNING] No automation candidates found (threshold: {args.threshold})\n")
if __name__ == '__main__':
main()
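
The generator above interpolates the raw activity text into `echo "..."` lines, so an activity containing `"`, `$`, or backticks could change what the generated script does. Below is a minimal sketch of one way to guard against that with `shlex.quote` from the standard library. The helper names `slugify_activity` and `safe_echo_line` are hypothetical and not part of the original script.

```python
import re
import shlex


def slugify_activity(activity: str) -> str:
    """Mirror the naming rule used above: lowercase, runs of other characters become '_'."""
    slug = re.sub(r'[^a-z0-9]+', '_', activity.lower()).strip('_')
    return f"auto_{slug}.sh"


def safe_echo_line(prefix: str, activity: str) -> str:
    """Build an echo command whose single argument is quoted for POSIX shells."""
    message = f"{prefix}: {activity}"
    return "echo " + shlex.quote(message)


if __name__ == "__main__":
    print(slugify_activity("Run Tests (unit)"))
    # The command substitution stays inert because the argument is single-quoted.
    print(safe_echo_line("Automated task", 'deploy "$(whoami)"'))
```

Quoting only protects the `echo` lines; the `# Purpose:` comment in the template would still need newlines stripped from the activity text.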


@@ -0,0 +1,396 @@
#!/usr/bin/env python3
"""
Oracle Context Generation Script
Generate context summaries from Oracle knowledge base for injection into
claude.md, session starts, or specific tasks.
Usage:
python generate_context.py --session-start
python generate_context.py --task "implement API"
python generate_context.py --output claude.md
python generate_context.py --tier 1
Examples:
python generate_context.py --session-start
python generate_context.py --task "database migration" --tier 2
python generate_context.py --output ../claude.md --update
"""
import os
import sys
import json
import argparse
from datetime import datetime
from pathlib import Path
def find_oracle_root():
"""Find the .oracle directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None
def load_all_knowledge(oracle_path):
"""Load all knowledge from Oracle."""
knowledge_dir = oracle_path / 'knowledge'
all_knowledge = []
for category in ['patterns', 'preferences', 'gotchas', 'solutions', 'corrections']:
file_path = knowledge_dir / f'{category}.json'
if file_path.exists():
with open(file_path, 'r') as f:
entries = json.load(f)
for entry in entries:
entry['_category'] = category
all_knowledge.append(entry)
return all_knowledge
def filter_by_tier(knowledge, tier=1):
"""Filter knowledge by tier level."""
if tier == 1:
# Critical only - always load
return [k for k in knowledge if k.get('priority') in ['critical', 'high']]
elif tier == 2:
# Medium priority - load on relevance
return [k for k in knowledge if k.get('priority') in ['critical', 'high', 'medium']]
else:
# All knowledge
return knowledge
def filter_by_relevance(knowledge, task_description):
"""Filter knowledge relevant to a specific task."""
if not task_description:
return knowledge
task_lower = task_description.lower()
relevant = []
for entry in knowledge:
# Check if task keywords appear in entry
score = 0
if task_lower in entry.get('title', '').lower():
score += 3
if task_lower in entry.get('content', '').lower():
score += 2
if task_lower in entry.get('context', '').lower():
score += 1
# Check tags
for tag in entry.get('tags', []):
if tag.lower() in task_lower or task_lower in tag.lower():
score += 2
if score > 0:
entry['_relevance_score'] = score
relevant.append(entry)
# Sort by relevance
return sorted(relevant, key=lambda x: x.get('_relevance_score', 0), reverse=True)
def get_recent_corrections(oracle_path, limit=5):
"""Get most recent corrections."""
knowledge_dir = oracle_path / 'knowledge'
corrections_file = knowledge_dir / 'corrections.json'
if not corrections_file.exists():
return []
with open(corrections_file, 'r') as f:
corrections = json.load(f)
# Sort by creation date
sorted_corrections = sorted(
corrections,
key=lambda x: x.get('created', ''),
reverse=True
)
return sorted_corrections[:limit]
def generate_session_start_context(oracle_path):
"""Generate context for session start."""
knowledge = load_all_knowledge(oracle_path)
# Tier 1: Critical items
critical_items = filter_by_tier(knowledge, tier=1)
# Recent corrections
recent_corrections = get_recent_corrections(oracle_path, limit=5)
context = """# Oracle Project Knowledge
*Auto-generated context for this session*
"""
# Project Overview
index_file = oracle_path / 'index.json'
if index_file.exists():
with open(index_file, 'r') as f:
index = json.load(f)
context += f"""## Project Status
- Total Knowledge Entries: {index.get('total_entries', 0)}
- Last Updated: {index.get('last_updated', 'Unknown')}
- Sessions Recorded: {len(index.get('sessions', []))}
"""
# Critical Knowledge
if critical_items:
context += "## [WARNING] Critical Knowledge\n\n"
for item in critical_items[:10]: # Top 10
context += f"### {item.get('title', 'Untitled')}\n\n"
context += f"**Category**: {item['_category'].capitalize()} | **Priority**: {item.get('priority', 'N/A')}\n\n"
context += f"{item.get('content', 'No content')}\n\n"
if item.get('context'):
context += f"*When to apply*: {item['context']}\n\n"
context += "---\n\n"
# Recent Corrections
if recent_corrections:
context += "## Recent Corrections (Learn from these!)\n\n"
for correction in recent_corrections:
context += f"- **{correction.get('title', 'Correction')}**\n"
context += f" {correction.get('content', '')}\n"
if correction.get('context'):
context += f" *Context*: {correction['context']}\n"
context += "\n"
context += f"\n*Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}*\n"
return context
def generate_task_context(oracle_path, task_description):
"""Generate context for a specific task."""
knowledge = load_all_knowledge(oracle_path)
# Filter by relevance to task
relevant = filter_by_relevance(knowledge, task_description)
context = f"""# Oracle Context for: {task_description}
*Relevant knowledge from Oracle*
"""
if not relevant:
context += "No specific knowledge found for this task.\n\n"
context += "This might be a new area - remember to record learnings!\n"
return context
context += f"Found {len(relevant)} relevant knowledge entries.\n\n"
# Group by category
by_category = {}
for item in relevant[:20]: # Top 20 most relevant
category = item['_category']
if category not in by_category:
by_category[category] = []
by_category[category].append(item)
# Format by category
category_names = {
'patterns': ' Patterns',
'preferences': ' Preferences',
'gotchas': '[WARNING] Gotchas',
'solutions': '[OK] Solutions',
'corrections': ' Corrections'
}
for category, items in by_category.items():
context += f"## {category_names.get(category, category.capitalize())}\n\n"
for item in items:
context += f"### {item.get('title', 'Untitled')}\n\n"
context += f"{item.get('content', 'No content')}\n\n"
if item.get('examples'):
context += "**Examples**:\n"
for ex in item['examples']:
context += f"- {ex}\n"
context += "\n"
context += "---\n\n"
context += f"\n*Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}*\n"
return context
def generate_compact_context(oracle_path):
"""Generate compact context for claude.md injection."""
knowledge = load_all_knowledge(oracle_path)
critical = [k for k in knowledge if k.get('priority') == 'critical']
high = [k for k in knowledge if k.get('priority') == 'high']
context = "<!-- ORACLE_CONTEXT_START -->\n"
context += "<!-- Auto-generated by Oracle - Do not edit manually -->\n\n"
if critical:
context += "**Critical Knowledge**:\n"
for item in critical[:5]:
context += f"- {item.get('title', 'Untitled')}\n"
context += "\n"
if high:
context += "**Important Patterns**:\n"
for item in high[:5]:
context += f"- {item.get('title', 'Untitled')}\n"
context += "\n"
# Recent corrections
recent_corrections = get_recent_corrections(oracle_path, limit=3)
if recent_corrections:
context += "**Recent Learnings**:\n"
for correction in recent_corrections:
content = correction.get('content', '')
# Extract just the "right" part if it's a correction
if '[CHECK] Right:' in content:
right_part = content.split('[CHECK] Right:')[1].split('\n')[0].strip()
context += f"- {right_part}\n"
else:
context += f"- {correction.get('title', '')}\n"
context += "\n"
context += f"*Updated: {datetime.now().strftime('%Y-%m-%d %H:%M')}*\n"
context += "<!-- ORACLE_CONTEXT_END -->\n"
return context
def update_claude_md(oracle_path, project_path):
"""Update claude.md with Oracle context."""
claude_md = project_path / 'claude.md'
context = generate_compact_context(oracle_path)
if not claude_md.exists():
# Create new claude.md with Oracle section
content = f"""# Project Documentation
## Project Knowledge (Oracle)
{context}
## Additional Context
[Add your project-specific context here]
"""
with open(claude_md, 'w') as f:
f.write(content)
print(f"[OK] Created new claude.md with Oracle context")
return
# Update existing claude.md
with open(claude_md, 'r') as f:
content = f.read()
# Replace Oracle section if it exists
if '<!-- ORACLE_CONTEXT_START -->' in content:
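import re
# Also consume one trailing newline, because the generated context already ends with one.
pattern = r'<!-- ORACLE_CONTEXT_START -->.*?<!-- ORACLE_CONTEXT_END -->\n?'
# Pass a function as the replacement so backslashes in the context are inserted literally.
content = re.sub(pattern, lambda _match: context, content, flags=re.DOTALL)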
print(f"[OK] Updated Oracle context in claude.md")
else:
# Add Oracle section at the top
content = f"## Project Knowledge (Oracle)\n\n{context}\n\n{content}"
print(f"[OK] Added Oracle context to claude.md")
with open(claude_md, 'w') as f:
f.write(content)
def main():
parser = argparse.ArgumentParser(
description='Generate Oracle context summaries',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument(
'--session-start',
action='store_true',
help='Generate context for session start'
)
parser.add_argument(
'--task',
help='Generate context for specific task'
)
parser.add_argument(
'--tier',
type=int,
choices=[1, 2, 3],
default=1,
help='Context tier level (1=critical, 2=medium, 3=all)'
)
parser.add_argument(
'--output',
help='Output file (default: stdout)'
)
parser.add_argument(
'--update',
action='store_true',
help='Update the output file (for claude.md)'
)
args = parser.parse_args()
# Find Oracle
oracle_path = find_oracle_root()
if not oracle_path:
print("[ERROR] Error: .oracle directory not found.")
sys.exit(1)
# Generate context
if args.session_start:
context = generate_session_start_context(oracle_path)
elif args.task:
context = generate_task_context(oracle_path, args.task)
elif args.update and args.output:
project_path = oracle_path.parent
update_claude_md(oracle_path, project_path)
return
else:
context = generate_compact_context(oracle_path)
# Output
if args.output:
output_path = Path(args.output)
with open(output_path, 'w') as f:
f.write(context)
print(f"[OK] Context written to: {output_path}")
else:
print(context)
if __name__ == '__main__':
main()
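
`filter_by_relevance` looks for the whole task description as a single substring, so a task such as "implement API" only matches entries that contain that exact phrase. The sketch below shows a keyword-based alternative that keeps the same weights (title 3, content 2, context 1, tag 2). It is an illustration under assumptions, not the script's actual behaviour: `score_entry` and `rank_by_relevance` are hypothetical names, and words shorter than three characters are ignored as an arbitrary noise filter.

```python
import re


def score_entry(entry: dict, task_description: str) -> int:
    """Score one knowledge entry against the individual words of a task."""
    keywords = {
        word for word in re.findall(r'[a-z0-9]+', task_description.lower())
        if len(word) > 2
    }
    title = entry.get('title', '').lower()
    content = entry.get('content', '').lower()
    context = entry.get('context', '').lower()
    tags = {tag.lower() for tag in entry.get('tags', [])}

    score = 0
    for word in keywords:
        if word in title:      # substring match, as in the original
            score += 3
        if word in content:
            score += 2
        if word in context:
            score += 1
        if word in tags:       # exact tag match
            score += 2
    return score


def rank_by_relevance(knowledge: list, task_description: str) -> list:
    """Return entries with a positive score, best first, without mutating them."""
    scored = [(score_entry(entry, task_description), entry) for entry in knowledge]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]
```

Unlike the original, this version does not write `_relevance_score` back into the entries, which keeps the loaded knowledge unchanged between calls.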


@@ -0,0 +1,249 @@
#!/usr/bin/env python3
"""
Oracle Initialization Script
Initializes the Oracle knowledge management system for a project.
Creates directory structure and base files.
Usage:
python init_oracle.py [--path /path/to/project]
Example:
python init_oracle.py
python init_oracle.py --path ~/my-project
"""
import os
import sys
import json
import argparse
from datetime import datetime
from pathlib import Path
ORACLE_STRUCTURE = {
'knowledge': {
'patterns.json': [],
'preferences.json': [],
'gotchas.json': [],
'solutions.json': [],
'corrections.json': []
},
'sessions': {},
'timeline': {
'project_timeline.md': '# Project Timeline\n\nChronological history of project development.\n\n'
},
'scripts': {},
'hooks': {}
}
INDEX_TEMPLATE = {
'created': None,
'last_updated': None,
'total_entries': 0,
'categories': {
'patterns': 0,
'preferences': 0,
'gotchas': 0,
'solutions': 0,
'corrections': 0
},
'sessions': [],
'version': '1.0'
}
def create_oracle_structure(base_path):
"""Create Oracle directory structure."""
oracle_path = Path(base_path) / '.oracle'
if oracle_path.exists():
response = input(f"[WARNING] Oracle already exists at {oracle_path}. Reinitialize? [y/N]: ")
if response.lower() != 'y':
print("[ERROR] Initialization cancelled.")
return False
print(f" Creating Oracle structure at {oracle_path}")
# Create directories and files
for dir_name, contents in ORACLE_STRUCTURE.items():
dir_path = oracle_path / dir_name
dir_path.mkdir(parents=True, exist_ok=True)
print(f" [OK] Created {dir_name}/")
# Create files in directory
for filename, content in contents.items():
file_path = dir_path / filename
if filename.endswith('.json'):
with open(file_path, 'w') as f:
json.dump(content, f, indent=2)
else:
with open(file_path, 'w') as f:
f.write(content)
print(f" Created {filename}")
# Create index.json
index_data = INDEX_TEMPLATE.copy()
index_data['created'] = datetime.now().isoformat()
index_data['last_updated'] = datetime.now().isoformat()
with open(oracle_path / 'index.json', 'w') as f:
json.dump(index_data, f, indent=2)
print(f" [OK] Created index.json")
# Create README
readme_content = """# Oracle Knowledge Base
This directory contains the Oracle knowledge management system for this project.
## Structure
- `knowledge/`: Categorized knowledge entries
- `patterns.json`: Code patterns and conventions
- `preferences.json`: User/team preferences
- `gotchas.json`: Known issues and pitfalls
- `solutions.json`: Proven solutions
- `corrections.json`: Historical corrections
- `sessions/`: Session logs by date
- `timeline/`: Chronological project history
- `scripts/`: Auto-generated automation scripts
- `hooks/`: Integration hooks
- `index.json`: Fast lookup index
## Usage
See `.claude/skills/oracle/README.md` for complete documentation.
## Quick Commands
```bash
# Query knowledge
python .claude/skills/oracle/scripts/query_knowledge.py "search term"
# Record session
python .claude/skills/oracle/scripts/record_session.py
# Generate context
python .claude/skills/oracle/scripts/generate_context.py
# Analyze patterns
python .claude/skills/oracle/scripts/analyze_patterns.py
```
---
*Initialized: {}*
""".format(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
with open(oracle_path / 'README.md', 'w') as f:
f.write(readme_content)
print(f" [OK] Created README.md")
# Create .gitignore
gitignore_content = """# Session logs may contain sensitive information
sessions/*.md
# Keep the structure
!sessions/.gitkeep
# Generated scripts
scripts/*
!scripts/.gitkeep
!scripts/README.md
"""
with open(oracle_path / '.gitignore', 'w') as f:
f.write(gitignore_content)
# Create .gitkeep files
(oracle_path / 'sessions' / '.gitkeep').touch()
(oracle_path / 'scripts' / '.gitkeep').touch()
(oracle_path / 'hooks' / '.gitkeep').touch()
print(f" [OK] Created .gitignore")
return oracle_path
def create_integration_hints(oracle_path, project_path):
"""Create hints for integrating Oracle."""
print("\n" + "="*70)
print(" Oracle Initialized Successfully!")
print("="*70)
print(f"\n Location: {oracle_path}")
print("\n Next Steps:\n")
print("1. **Add to claude.md** (if you have one):")
print(" Add this section to your project's claude.md:")
print("""
## Project Knowledge (Oracle)
<!-- ORACLE_CONTEXT_START -->
Run: python .claude/skills/oracle/scripts/generate_context.py --session-start
<!-- ORACLE_CONTEXT_END -->
""")
print("\n2. **Create Session Start Hook** (optional):")
print(f" Create: {project_path}/.claude/hooks/session-start.sh")
print("""
#!/bin/bash
python .claude/skills/oracle/scripts/load_context.py
""")
print("\n3. **Start Recording Knowledge:**")
print(" After sessions, run:")
print(" python .claude/skills/oracle/scripts/record_session.py")
print("\n4. **Query Knowledge:**")
print(" python .claude/skills/oracle/scripts/query_knowledge.py \"search term\"")
print("\n" + "="*70)
print("Oracle is ready to learn and remember! ")
print("="*70 + "\n")
def main():
parser = argparse.ArgumentParser(
description='Initialize Oracle knowledge management system',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python init_oracle.py
python init_oracle.py --path ~/my-project
"""
)
parser.add_argument(
'--path',
type=str,
default='.',
help='Path to project root (default: current directory)'
)
args = parser.parse_args()
project_path = Path(args.path).resolve()
if not project_path.exists():
print(f"[ERROR] Error: Path does not exist: {project_path}")
sys.exit(1)
print(f"> Initializing Oracle for project at: {project_path}\n")
oracle_path = create_oracle_structure(project_path)
if oracle_path:
create_integration_hints(oracle_path, project_path)
sys.exit(0)
else:
sys.exit(1)
if __name__ == '__main__':
main()
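
`INDEX_TEMPLATE.copy()` is a shallow copy, so the nested `categories` dict and `sessions` list are shared with the template. The script above only sets top-level keys, so this is harmless as written, but it would matter if later code mutated the nested values. A small sketch of the difference, using a trimmed stand-in template (`TEMPLATE` and both helper names are hypothetical):

```python
import copy

# Trimmed stand-in for INDEX_TEMPLATE, for illustration only.
TEMPLATE = {
    'total_entries': 0,
    'categories': {'patterns': 0},
    'sessions': [],
}


def new_index_shallow() -> dict:
    """Top-level keys are independent; nested objects are shared with TEMPLATE."""
    return TEMPLATE.copy()


def new_index_deep() -> dict:
    """Every nested dict and list is duplicated, so TEMPLATE is never touched."""
    return copy.deepcopy(TEMPLATE)


if __name__ == "__main__":
    deep = new_index_deep()
    deep['sessions'].append('session-1')
    print(TEMPLATE['sessions'])      # template list is still empty

    shallow = new_index_shallow()
    shallow['sessions'].append('session-2')
    print(TEMPLATE['sessions'])      # template list now contains 'session-2'
```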


@@ -0,0 +1,67 @@
#!/usr/bin/env python3
"""
Oracle Context Loader
Load Oracle context at session start (for use in hooks).
Displays relevant knowledge to Claude at the beginning of a session.
Usage:
python load_context.py
python load_context.py --verbose
Example (in .claude/hooks/session-start.sh):
#!/bin/bash
python .claude/skills/oracle/scripts/load_context.py
"""
import sys
from pathlib import Path
import subprocess
def find_oracle_root():
"""Find the .oracle directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None
def main():
verbose = '--verbose' in sys.argv
oracle_path = find_oracle_root()
if not oracle_path:
if verbose:
print("Oracle not initialized for this project.")
return
# Run generate_context.py with --session-start
script_path = Path(__file__).parent / 'generate_context.py'
try:
result = subprocess.run(
[sys.executable, str(script_path), '--session-start'],  # reuse the interpreter running this script
capture_output=True,
text=True
)
if result.returncode == 0:
print(result.stdout)
else:
if verbose:
print(f"Warning: Could not load Oracle context: {result.stderr}")
except Exception as e:
if verbose:
print(f"Warning: Error loading Oracle context: {e}")
if __name__ == '__main__':
main()
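
One portable way to launch a sibling script from a hook is to reuse the running interpreter via `sys.executable` and to bound the call with a timeout, so a stuck child process cannot hold up session start. This is a sketch under assumptions: `run_sibling_script` is a hypothetical helper, and the 10-second default is an arbitrary choice.

```python
import subprocess
import sys
from pathlib import Path
from typing import Optional


def run_sibling_script(script_path: Path, *args: str, timeout: float = 10.0) -> Optional[str]:
    """Run a Python script with the current interpreter.

    Returns the script's stdout on success, or None if it could not be
    started, timed out, or exited with a non-zero status.
    """
    try:
        result = subprocess.run(
            [sys.executable, str(script_path), *args],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    generator = Path(__file__).parent / 'generate_context.py'
    output = run_sibling_script(generator, '--session-start')
    if output is not None:
        print(output)
```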


@@ -0,0 +1,298 @@
#!/usr/bin/env python3
"""
Oracle Knowledge Query Script
Search and retrieve knowledge from the Oracle knowledge base.
Usage:
python query_knowledge.py "search term"
python query_knowledge.py --category patterns
python query_knowledge.py --priority critical
python query_knowledge.py --tags api,auth
python query_knowledge.py --recent 5
Examples:
python query_knowledge.py "authentication"
python query_knowledge.py --category gotchas --priority high
python query_knowledge.py --tags database --recent 10
"""
import os
import sys
import json
import argparse
from datetime import datetime
from pathlib import Path
def find_oracle_root():
"""Find the .oracle directory by walking up from current directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None
def load_knowledge(oracle_path, category=None):
"""Load knowledge from specified category or all categories."""
knowledge_dir = oracle_path / 'knowledge'
all_knowledge = []
categories = [category] if category else ['patterns', 'preferences', 'gotchas', 'solutions', 'corrections']
for cat in categories:
file_path = knowledge_dir / f'{cat}.json'
if file_path.exists():
with open(file_path, 'r') as f:
entries = json.load(f)
for entry in entries:
entry['_category'] = cat
all_knowledge.append(entry)
return all_knowledge
def search_knowledge(knowledge, query=None, priority=None, tags=None):
"""Filter knowledge based on search criteria."""
results = knowledge
# Filter by query (search in title and content)
if query:
query_lower = query.lower()
results = [
entry for entry in results
if query_lower in entry.get('title', '').lower()
or query_lower in entry.get('content', '').lower()
or query_lower in str(entry.get('context', '')).lower()
]
# Filter by priority
if priority:
results = [entry for entry in results if entry.get('priority') == priority]
# Filter by tags
if tags:
tag_list = [t.strip() for t in tags.split(',')]
results = [
entry for entry in results
if any(tag in entry.get('tags', []) for tag in tag_list)
]
return results
def sort_knowledge(knowledge, sort_by='priority'):
"""Sort knowledge by various criteria."""
priority_order = {'critical': 0, 'high': 1, 'medium': 2, 'low': 3}
if sort_by == 'priority':
return sorted(knowledge, key=lambda x: priority_order.get(x.get('priority', 'low'), 3))
elif sort_by == 'recent':
return sorted(knowledge, key=lambda x: x.get('created', ''), reverse=True)
elif sort_by == 'used':
return sorted(knowledge, key=lambda x: x.get('use_count', 0), reverse=True)
else:
return knowledge
def format_entry(entry, compact=False):
"""Format a knowledge entry for display."""
if compact:
return f" [{entry['_category']}] {entry.get('title', 'Untitled')} (Priority: {entry.get('priority', 'N/A')})"
output = []
output.append("-" * 70)  # horizontal rule between entries
output.append(f" {entry.get('title', 'Untitled')}")
output.append(f" Category: {entry['_category']} | Priority: {entry.get('priority', 'N/A')}")
if entry.get('tags'):
output.append(f" Tags: {', '.join(entry['tags'])}")
output.append("")
output.append(f" {entry.get('content', 'No content')}")
if entry.get('context'):
output.append("")
output.append(f" Context: {entry['context']}")
if entry.get('examples'):
output.append("")
output.append(" Examples:")
for ex in entry['examples']:
output.append(f" - {ex}")
output.append("")
output.append(f" Created: {entry.get('created', 'Unknown')}")
output.append(f" Used: {entry.get('use_count', 0)} times")
if entry.get('learned_from'):
output.append(f" Source: {entry['learned_from']}")
return "\n".join(output)
def display_results(results, compact=False, limit=None):
"""Display search results."""
if not results:
print("[ERROR] No knowledge found matching your criteria.")
return
total = len(results)
display_count = min(limit, total) if limit else total
print(f"\n[SEARCH] Found {total} result(s)")
if limit and total > limit:
print(f" Showing first {display_count} results\n")
else:
print()
for i, entry in enumerate(results[:display_count], 1):
if compact:
print(format_entry(entry, compact=True))
else:
print(format_entry(entry, compact=False))
if i < display_count:
print()
def display_summary(oracle_path):
"""Display summary of knowledge base."""
index_path = oracle_path / 'index.json'
if not index_path.exists():
print("[WARNING] No index found. Knowledge base may be empty.")
return
with open(index_path, 'r') as f:
index = json.load(f)
print("="*70)
print("[INFO] Oracle Knowledge Base Summary")
print("="*70)
print(f"\nCreated: {index.get('created', 'Unknown')}")
print(f"Last Updated: {index.get('last_updated', 'Unknown')}")
print(f"Total Entries: {index.get('total_entries', 0)}")
print("\nEntries by Category:")
for category, count in index.get('categories', {}).items():
print(f" {category.capitalize()}: {count}")
print(f"\nSessions Recorded: {len(index.get('sessions', []))}")
print("="*70)
def main():
parser = argparse.ArgumentParser(
description='Query Oracle knowledge base',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python query_knowledge.py "authentication"
python query_knowledge.py --category patterns
python query_knowledge.py --priority critical
python query_knowledge.py --tags api,database
python query_knowledge.py --recent 5
python query_knowledge.py --summary
"""
)
parser.add_argument(
'query',
nargs='?',
help='Search query (searches title, content, context)'
)
parser.add_argument(
'--category',
choices=['patterns', 'preferences', 'gotchas', 'solutions', 'corrections'],
help='Filter by category'
)
parser.add_argument(
'--priority',
choices=['critical', 'high', 'medium', 'low'],
help='Filter by priority'
)
parser.add_argument(
'--tags',
help='Filter by tags (comma-separated)'
)
parser.add_argument(
'--sort',
choices=['priority', 'recent', 'used'],
default='priority',
help='Sort results by (default: priority)'
)
parser.add_argument(
'--limit',
type=int,
help='Limit number of results'
)
parser.add_argument(
'--recent',
type=int,
metavar='N',
help='Show N most recent entries'
)
parser.add_argument(
'--compact',
action='store_true',
help='Display compact results'
)
parser.add_argument(
'--summary',
action='store_true',
help='Display knowledge base summary'
)
args = parser.parse_args()
# Find Oracle directory
oracle_path = find_oracle_root()
if not oracle_path:
print("[ERROR] Error: .oracle directory not found.")
print(" Run: python .claude/skills/oracle/scripts/init_oracle.py")
sys.exit(1)
# Display summary if requested
if args.summary:
display_summary(oracle_path)
sys.exit(0)
# Load knowledge
knowledge = load_knowledge(oracle_path, args.category)
if not knowledge:
print("[ERROR] No knowledge entries found.")
print(" Start recording sessions to build the knowledge base.")
sys.exit(0)
# Search and filter
results = search_knowledge(knowledge, args.query, args.priority, args.tags)
# Sort
if args.recent:
results = sort_knowledge(results, 'recent')
limit = args.recent
else:
results = sort_knowledge(results, args.sort)
limit = args.limit
# Display
display_results(results, args.compact, limit)
if __name__ == '__main__':
main()
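
The query script reads a handful of keys from each entry (`title`, `content`, `context`, `tags`, `priority`, `created`, `use_count`). The sketch below shows what such entries might look like, plus a case-insensitive variant of the tag filter; the original filter compares tags exactly, so `--tags API` would not match an entry tagged `api`. The sample values and the helper names are illustrative assumptions, not data from a real knowledge base.

```python
PRIORITY_ORDER = {'critical': 0, 'high': 1, 'medium': 2, 'low': 3}

# Hypothetical entries using the keys the query script reads.
SAMPLE_ENTRIES = [
    {
        'title': 'API rate limit is 100 req/min',
        'content': 'Batch requests and back off when the limit is hit.',
        'tags': ['api', 'limits'],
        'priority': 'high',
        'created': '2025-01-02T10:00:00',
        'use_count': 3,
    },
    {
        'title': 'Database connections must be closed explicitly',
        'content': 'Wrap connections in a context manager.',
        'tags': ['Database'],
        'priority': 'critical',
        'created': '2025-01-01T09:00:00',
        'use_count': 7,
    },
    {'title': 'Prefer small commits', 'content': 'Easier to review.'},
]


def filter_by_tags(entries: list, tags_csv: str) -> list:
    """Keep entries that share at least one tag with the comma-separated list."""
    wanted = {tag.strip().lower() for tag in tags_csv.split(',') if tag.strip()}
    return [
        entry for entry in entries
        if wanted & {tag.lower() for tag in entry.get('tags', [])}
    ]


def sort_by_priority(entries: list) -> list:
    """Order entries critical first; entries without a priority sort as 'low'."""
    return sorted(
        entries,
        key=lambda entry: PRIORITY_ORDER.get(entry.get('priority', 'low'), 3),
    )
```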


@@ -0,0 +1,114 @@
#!/usr/bin/env python3
"""
Oracle Commit Recorder
Record git commits in Oracle timeline (for use in git hooks).
Usage:
python record_commit.py
python record_commit.py --message "commit message"
Example (in .oracle/hooks/post-commit.sh, so that HEAD is the commit just made):
#!/bin/bash
python .claude/skills/oracle/scripts/record_commit.py
"""
import sys
import subprocess
from datetime import datetime
from pathlib import Path
def find_oracle_root():
"""Find the .oracle directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None
def get_commit_info():
    """Get information about the most recent commit (HEAD)."""
    try:
        # Get last commit message
        message = subprocess.check_output(
            ['git', 'log', '-1', '--pretty=%B'],
            text=True
        ).strip()
        # Get the files touched by HEAD itself. Unlike `git diff HEAD~1`,
        # this ignores uncommitted changes, and --root also covers the
        # initial commit, where HEAD~1 does not exist.
        files = subprocess.check_output(
            ['git', 'diff-tree', '--no-commit-id', '--name-only', '-r', '--root', 'HEAD'],
            text=True
        ).strip().split('\n')
        # Get author
        author = subprocess.check_output(
            ['git', 'log', '-1', '--pretty=%an'],
            text=True
        ).strip()
        # Get hash
        commit_hash = subprocess.check_output(
            ['git', 'rev-parse', '--short', 'HEAD'],
            text=True
        ).strip()
        return {
            'message': message,
            'files': [f for f in files if f],
            'author': author,
            'hash': commit_hash
        }
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Not a git repository, no commits yet, or git is not installed.
        return None
def record_to_timeline(oracle_path, commit_info):
"""Record commit to timeline."""
timeline_file = oracle_path / 'timeline' / 'project_timeline.md'
entry = f"""
## {datetime.now().strftime('%Y-%m-%d %H:%M')} - Commit: {commit_info['hash']}
**Author**: {commit_info['author']}
**Message**: {commit_info['message']}
**Files Changed**:
"""
for file_path in commit_info['files'][:10]: # Top 10 files
entry += f"- `{file_path}`\n"
if len(commit_info['files']) > 10:
entry += f"- ... and {len(commit_info['files']) - 10} more\n"
entry += "\n---\n"
# Append to timeline
with open(timeline_file, 'a') as f:
f.write(entry)
def main():
oracle_path = find_oracle_root()
if not oracle_path:
# Silent fail - not all projects will have Oracle
sys.exit(0)
commit_info = get_commit_info()
if commit_info:
record_to_timeline(oracle_path, commit_info)
if __name__ == '__main__':
main()
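
`record_to_timeline` builds the Markdown entry and appends it to the file in one step. Splitting the formatting into a pure function makes the layout easy to check without a git repository. The sketch below follows the same layout and the same ten-file cap as the script above; `format_timeline_entry` and its `when` and `max_files` parameters are hypothetical.

```python
from datetime import datetime


def format_timeline_entry(commit_info: dict, when: datetime, max_files: int = 10) -> str:
    """Render one commit as a Markdown timeline entry."""
    lines = [
        "",
        f"## {when.strftime('%Y-%m-%d %H:%M')} - Commit: {commit_info['hash']}",
        f"**Author**: {commit_info['author']}",
        f"**Message**: {commit_info['message']}",
        "**Files Changed**:",
    ]
    files = commit_info['files']
    lines.extend(f"- `{path}`" for path in files[:max_files])
    if len(files) > max_files:
        lines.append(f"- ... and {len(files) - max_files} more")
    lines.extend(["", "---", ""])
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative commit data, not taken from a real repository.
    example = {
        'hash': 'abc1234',
        'author': 'Example Author',
        'message': 'Add timeline formatter',
        'files': ['scripts/record_commit.py'],
    }
    print(format_timeline_entry(example, datetime.now()))
```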


@@ -0,0 +1,452 @@
#!/usr/bin/env python3
"""
Oracle Session Recording Script
Record a session's activities, learnings, and corrections to the Oracle knowledge base.
Usage:
python record_session.py [options]
python record_session.py --interactive
python record_session.py --summary "Implemented auth" --learnings "Use bcrypt"
Examples:
python record_session.py --interactive
python record_session.py --summary "Fixed bug in API" --corrections "Use async/await not callbacks"
"""
import os
import sys
import json
import argparse
from datetime import datetime
from pathlib import Path
import uuid
def find_oracle_root():
"""Find the .oracle directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None
def generate_session_id():
"""Generate a unique session ID."""
timestamp = datetime.now().strftime('%Y-%m-%d_%H%M%S')
short_uuid = str(uuid.uuid4())[:8]
return f"{timestamp}_{short_uuid}"
def interactive_session_record():
"""Interactive mode for recording a session."""
print("="*70)
print("[NOTE] Oracle Session Recording (Interactive Mode)")
print("="*70)
print("\nPress Enter to skip any field.\n")
session = {}
# Summary
session['summary'] = input("Summary of this session:\n> ").strip()
# Activities
print("\nActivities (one per line, empty line to finish):")
activities = []
while True:
activity = input("> ").strip()
if not activity:
break
activities.append(activity)
session['activities'] = activities
# Changes
print("\nFiles changed (format: path/to/file.ts, empty line to finish):")
changes = []
while True:
file_path = input("File: ").strip()
if not file_path:
break
change_desc = input(" Change: ").strip()
reason = input(" Reason: ").strip()
changes.append({
'file': file_path,
'change': change_desc,
'reason': reason
})
session['changes'] = changes
# Decisions
print("\nDecisions made (empty line to finish):")
decisions = []
while True:
decision = input("Decision: ").strip()
if not decision:
break
rationale = input(" Rationale: ").strip()
decisions.append({
'decision': decision,
'rationale': rationale
})
session['decisions'] = decisions
# Learnings
print("\nLearnings (empty line to finish):")
learnings = []
while True:
learning = input("Learning: ").strip()
if not learning:
break
print(" Priority? [critical/high/medium/low]")
priority = input(" > ").strip() or 'medium'
learnings.append({
'content': learning,
'priority': priority
})
session['learnings'] = learnings
# Corrections
print("\nCorrections (what was wrong -> what's right, empty line to finish):")
corrections = []
while True:
wrong = input("[ERROR] What was wrong: ").strip()
if not wrong:
break
right = input("[CHECK] What's right: ").strip()
context = input(" When this applies: ").strip()
corrections.append({
'wrong': wrong,
'right': right,
'context': context
})
session['corrections'] = corrections
# Questions
print("\nQuestions asked (empty line to finish):")
questions = []
while True:
question = input("Q: ").strip()
if not question:
break
answer = input("A: ").strip()
questions.append({
'question': question,
'answer': answer
})
session['questions'] = questions
return session
def create_session_log(oracle_path, session_id, session_data):
"""Create a session log markdown file."""
sessions_dir = oracle_path / 'sessions'
sessions_dir.mkdir(parents=True, exist_ok=True) # Make sure the directory exists before writing
log_file = sessions_dir / f'{session_id}.md'
content = f"""# Session: {datetime.now().strftime('%Y-%m-%d %H:%M')}
**Session ID**: `{session_id}`
## Summary
{session_data.get('summary', 'No summary provided')}
"""
# Activities
if session_data.get('activities'):
content += "## Activities\n\n"
for activity in session_data['activities']:
content += f"- {activity}\n"
content += "\n"
# Changes
if session_data.get('changes'):
content += "## Changes Made\n\n"
for change in session_data['changes']:
content += f"- **File**: `{change['file']}`\n"
content += f" - Change: {change['change']}\n"
if change.get('reason'):
content += f" - Reason: {change['reason']}\n"
content += "\n"
# Decisions
if session_data.get('decisions'):
content += "## Decisions\n\n"
for decision in session_data['decisions']:
content += f"- **Decision**: {decision['decision']}\n"
if decision.get('rationale'):
content += f" - Rationale: {decision['rationale']}\n"
content += "\n"
# Learnings
if session_data.get('learnings'):
content += "## Learnings\n\n"
for learning in session_data['learnings']:
priority = learning.get('priority', 'medium')
priority_emoji = {'critical': '', 'high': '', 'medium': '', 'low': ''}.get(priority, '')
content += f"- {priority_emoji} **[{priority.upper()}]** {learning['content']}\n"
content += "\n"
# Corrections
if session_data.get('corrections'):
content += "## Corrections\n\n"
for correction in session_data['corrections']:
content += f"- [ERROR] Wrong: {correction['wrong']}\n"
content += f" [CHECK] Right: {correction['right']}\n"
if correction.get('context'):
content += f" [NOTE] Context: {correction['context']}\n"
content += "\n"
# Questions
if session_data.get('questions'):
content += "## Questions Asked\n\n"
for qa in session_data['questions']:
content += f"- **Q**: {qa['question']}\n"
content += f" **A**: {qa['answer']}\n"
content += "\n"
content += f"\n---\n\n*Recorded: {datetime.now().isoformat()}*\n"
with open(log_file, 'w') as f:
f.write(content)
return log_file
def update_knowledge_base(oracle_path, session_id, session_data):
"""Update knowledge base with session learnings and corrections."""
knowledge_dir = oracle_path / 'knowledge'
updated_categories = set()
# Add learnings as solutions or patterns
if session_data.get('learnings'):
for learning in session_data['learnings']:
# All learnings are filed under 'solutions' for now; no pattern/solution classification is performed
category = 'solutions'
entry = {
'id': str(uuid.uuid4()),
'category': category,
'priority': learning.get('priority', 'medium'),
'title': learning['content'][:100], # Truncate for title
'content': learning['content'],
'context': learning.get('context', ''),
'examples': [],
'learned_from': session_id,
'created': datetime.now().isoformat(),
'last_used': datetime.now().isoformat(),
'use_count': 1,
'tags': learning.get('tags', [])
}
# Load existing and append
solutions_file = knowledge_dir / f'{category}.json'
with open(solutions_file, 'r') as f:
entries = json.load(f)
entries.append(entry)
with open(solutions_file, 'w') as f:
json.dump(entries, f, indent=2)
updated_categories.add(category)
# Add corrections
if session_data.get('corrections'):
corrections_file = knowledge_dir / 'corrections.json'
with open(corrections_file, 'r') as f:
corrections = json.load(f)
for correction in session_data['corrections']:
entry = {
'id': str(uuid.uuid4()),
'category': 'correction',
'priority': 'high', # Corrections are high priority
'title': f"Don't: {correction['wrong'][:50]}...",
'content': f"[ERROR] Wrong: {correction['wrong']}\n[CHECK] Right: {correction['right']}",
'context': correction.get('context', ''),
'examples': [],
'learned_from': session_id,
'created': datetime.now().isoformat(),
'last_used': datetime.now().isoformat(),
'use_count': 1,
'tags': []
}
corrections.append(entry)
with open(corrections_file, 'w') as f:
json.dump(corrections, f, indent=2)
updated_categories.add('corrections')
return updated_categories
def update_index(oracle_path, session_id):
"""Update the index with new session."""
index_file = oracle_path / 'index.json'
with open(index_file, 'r') as f:
index = json.load(f)
# Add session to list
if session_id not in index['sessions']:
index['sessions'].append(session_id)
# Update counts (reset the total first so repeated runs do not double-count)
index['total_entries'] = 0
knowledge_dir = oracle_path / 'knowledge'
for category in ['patterns', 'preferences', 'gotchas', 'solutions', 'corrections']:
category_file = knowledge_dir / f'{category}.json'
with open(category_file, 'r') as f:
entries = json.load(f)
index['categories'][category] = len(entries)
index['total_entries'] += len(entries)
# Update timestamp
index['last_updated'] = datetime.now().isoformat()
with open(index_file, 'w') as f:
json.dump(index, f, indent=2)
def update_timeline(oracle_path, session_id, session_data):
"""Update project timeline."""
timeline_file = oracle_path / 'timeline' / 'project_timeline.md'
entry = f"""
## {datetime.now().strftime('%Y-%m-%d %H:%M')} - {session_data.get('summary', 'Session recorded')}
**Session ID**: `{session_id}`
"""
if session_data.get('activities'):
entry += "**Activities**:\n"
for activity in session_data['activities'][:3]: # Top 3
entry += f"- {activity}\n"
if len(session_data['activities']) > 3:
entry += f"- ... and {len(session_data['activities']) - 3} more\n"
entry += "\n"
if session_data.get('learnings'):
entry += f"**Key Learnings**: {len(session_data['learnings'])}\n\n"
if session_data.get('corrections'):
entry += f"**Corrections Made**: {len(session_data['corrections'])}\n\n"
entry += "---\n"
# Append to timeline
with open(timeline_file, 'a') as f:
f.write(entry)
def main():
parser = argparse.ArgumentParser(
description='Record Oracle session',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument(
'--interactive',
action='store_true',
help='Interactive mode with prompts'
)
parser.add_argument(
'--summary',
help='Session summary'
)
parser.add_argument(
'--learnings',
help='Learnings (semicolon-separated)'
)
parser.add_argument(
'--corrections',
help='Corrections in format "wrong->right" (semicolon-separated)'
)
args = parser.parse_args()
# Find Oracle
oracle_path = find_oracle_root()
if not oracle_path:
print("[ERROR] Error: .oracle directory not found.")
print(" Run: python .claude/skills/oracle/scripts/init_oracle.py")
sys.exit(1)
# Get session data
if args.interactive:
session_data = interactive_session_record()
else:
session_data = {
'summary': args.summary or '',
'activities': [],
'changes': [],
'decisions': [],
'learnings': [],
'corrections': [],
'questions': []
}
if args.learnings:
for learning in args.learnings.split(';'):
session_data['learnings'].append({
'content': learning.strip(),
'priority': 'medium'
})
if args.corrections:
for correction in args.corrections.split(';'):
if '->' in correction:
wrong, right = correction.split('->', 1)
session_data['corrections'].append({
'wrong': wrong.strip(),
'right': right.strip(),
'context': ''
})
# Generate session ID
session_id = generate_session_id()
print(f"\n[NOTE] Recording session: {session_id}\n")
# Create session log
log_file = create_session_log(oracle_path, session_id, session_data)
print(f"[OK] Session log created: {log_file}")
# Update knowledge base
updated_categories = update_knowledge_base(oracle_path, session_id, session_data)
if updated_categories:
print(f"[OK] Knowledge base updated: {', '.join(updated_categories)}")
# Update timeline
update_timeline(oracle_path, session_id, session_data)
print(f"[OK] Timeline updated")
# Update index
update_index(oracle_path, session_id)
print(f"[OK] Index updated")
print("\nSession recorded successfully!\n")
print(f"View log: {log_file}")
print(f"Query knowledge: python .claude/skills/oracle/scripts/query_knowledge.py\n")
if __name__ == '__main__':
main()
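
The non-interactive flags of the recording script above can be illustrated with a small standalone sketch. It mirrors how `main()` splits `--learnings` on semicolons and `--corrections` on `->`. The helper names `parse_learnings` and `parse_corrections` are hypothetical and do not exist in the script, and unlike the script the sketch drops empty fragments, which is an assumption about the desired behavior.

```python
from typing import Dict, List


def parse_learnings(raw: str) -> List[Dict[str, str]]:
    """Split a semicolon-separated string into medium-priority learnings."""
    return [
        {'content': part.strip(), 'priority': 'medium'}
        for part in raw.split(';')
        if part.strip()  # assumption: skip empty fragments (the script keeps them)
    ]


def parse_corrections(raw: str) -> List[Dict[str, str]]:
    """Split semicolon-separated 'wrong->right' pairs into correction records."""
    corrections = []
    for part in raw.split(';'):
        if '->' not in part:
            continue  # fragments without an arrow are ignored, as in main()
        wrong, right = part.split('->', 1)
        corrections.append({
            'wrong': wrong.strip(),
            'right': right.strip(),
            'context': '',
        })
    return corrections


if __name__ == '__main__':
    print(parse_learnings("Use bcrypt; Close DB connections"))
    print(parse_corrections("callbacks -> async/await; innerHTML -> textContent"))
```

Splitting with `split('->', 1)` keeps any later arrows inside the "right" part, so a correction such as `a -> b -> c` is stored as wrong `a`, right `b -> c`.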


@@ -0,0 +1,448 @@
#!/usr/bin/env python3
"""
Oracle - Enhanced Session Handoff
Generates comprehensive context for new sessions to prevent degradation from compaction.
This solves the "sessions going insane" problem by preserving critical context
when switching to a fresh session.
Usage:
# Generate handoff context for new session
python session_handoff.py --export
# Import handoff context in new session
python session_handoff.py --import handoff_context.json
# Show what would be included (dry run)
python session_handoff.py --preview
Environment Variables:
ORACLE_VERBOSE: Set to '1' for detailed output
"""
import os
import sys
import json
import argparse
from pathlib import Path
from typing import Dict, List, Any, Optional
from datetime import datetime, timezone
def get_session_context() -> Dict[str, Any]:
"""Extract critical session context for handoff.
Returns:
Dictionary with session context for new session
"""
context = {
'handoff_timestamp': datetime.now(timezone.utc).isoformat(),
'handoff_reason': 'session_degradation',
'oracle_knowledge': {},
'guardian_health': {},
'summoner_state': {},
'active_tasks': [],
'critical_patterns': [],
'recent_corrections': [],
'session_stats': {}
}
# Load Oracle knowledge (critical patterns only)
oracle_dir = Path('.oracle')
if oracle_dir.exists():
context['oracle_knowledge'] = load_critical_oracle_knowledge(oracle_dir)
# Load Guardian session health
guardian_dir = Path('.guardian')
if guardian_dir.exists():
context['guardian_health'] = load_guardian_health(guardian_dir)
# Load Summoner state (active MCDs)
summoner_dir = Path('.summoner')
if summoner_dir.exists():
context['summoner_state'] = load_summoner_state(summoner_dir)
# Get active tasks from current session
context['active_tasks'] = extract_active_tasks()
# Get session statistics
context['session_stats'] = get_session_statistics()
return context
def load_critical_oracle_knowledge(oracle_dir: Path) -> Dict[str, Any]:
"""Load only critical/high-priority Oracle knowledge.
This is KISS - we don't dump everything, just what matters.
Args:
oracle_dir: Path to .oracle directory
Returns:
Critical knowledge for handoff
"""
knowledge = {
'critical_patterns': [],
'recent_corrections': [],
'active_gotchas': [],
'project_context': ''
}
knowledge_dir = oracle_dir / 'knowledge'
if not knowledge_dir.exists():
return knowledge
# Load critical patterns
patterns_file = knowledge_dir / 'patterns.json'
if patterns_file.exists():
try:
with open(patterns_file, 'r', encoding='utf-8') as f:
patterns = json.load(f)
# Only critical/high priority
knowledge['critical_patterns'] = [
p for p in patterns
if p.get('priority') in ['critical', 'high']
][:10] # Max 10 patterns
except (OSError, IOError, json.JSONDecodeError):
pass
# Load recent corrections (last 5)
corrections_file = knowledge_dir / 'corrections.json'
if corrections_file.exists():
try:
with open(corrections_file, 'r', encoding='utf-8') as f:
corrections = json.load(f)
# Sort newest first by 'created', keep the 5 most recent
sorted_corrections = sorted(
corrections,
key=lambda x: x.get('created', ''),
reverse=True
)
knowledge['recent_corrections'] = sorted_corrections[:5]
except (OSError, IOError, json.JSONDecodeError):
pass
# Load active gotchas
gotchas_file = knowledge_dir / 'gotchas.json'
if gotchas_file.exists():
try:
with open(gotchas_file, 'r', encoding='utf-8') as f:
gotchas = json.load(f)
# Only critical/high priority gotchas
knowledge['active_gotchas'] = [
g for g in gotchas
if g.get('priority') in ['critical', 'high']
][:5] # Max 5 gotchas
except (OSError, IOError, json.JSONDecodeError):
pass
return knowledge
def load_guardian_health(guardian_dir: Path) -> Dict[str, Any]:
"""Load Guardian session health metrics.
Args:
guardian_dir: Path to .guardian directory
Returns:
Health metrics and degradation signals
"""
health = {
'last_health_score': None,
'degradation_signals': [],
'handoff_reason': '',
'session_duration_minutes': 0
}
health_file = guardian_dir / 'session_health.json'
if health_file.exists():
try:
with open(health_file, 'r', encoding='utf-8') as f:
data = json.load(f)
health['last_health_score'] = data.get('health_score')
health['degradation_signals'] = data.get('degradation_signals', [])
health['handoff_reason'] = data.get('handoff_reason', '')
health['session_duration_minutes'] = data.get('duration_minutes', 0)
except (OSError, IOError, json.JSONDecodeError):
pass
return health
def load_summoner_state(summoner_dir: Path) -> Dict[str, Any]:
"""Load Summoner active MCDs and task state.
Args:
summoner_dir: Path to .summoner directory
Returns:
Active mission state
"""
state = {
'active_mcds': [],
'pending_tasks': [],
'completed_phases': []
}
# Check for active MCDs
mcds_dir = summoner_dir / 'mcds'
if mcds_dir.exists():
for mcd_file in mcds_dir.glob('*.md'):
try:
with open(mcd_file, 'r', encoding='utf-8') as f:
content = f.read()
# Extract summary and pending tasks
state['active_mcds'].append({
'name': mcd_file.stem,
'file': str(mcd_file),
'summary': extract_mcd_summary(content),
'pending_tasks': extract_pending_tasks(content)
})
except (OSError, IOError, UnicodeDecodeError):
continue
return state
def extract_mcd_summary(mcd_content: str) -> str:
"""Extract executive summary from MCD.
Args:
mcd_content: MCD markdown content
Returns:
Summary text (max 200 chars)
"""
lines = mcd_content.split('\n')
in_summary = False
summary_lines = []
for line in lines:
if '## Executive Summary' in line:
in_summary = True
continue
elif in_summary and line.startswith('##'):
break
elif in_summary and line.strip():
summary_lines.append(line.strip())
summary = ' '.join(summary_lines)
return summary[:200] + '...' if len(summary) > 200 else summary
def extract_pending_tasks(mcd_content: str) -> List[str]:
"""Extract uncompleted tasks from MCD.
Args:
mcd_content: MCD markdown content
Returns:
List of pending task descriptions
"""
pending = []
lines = mcd_content.split('\n')
for line in lines:
# Look for unchecked checkboxes
if '- [ ]' in line:
task = line.replace('- [ ]', '').strip()
pending.append(task)
return pending[:10] # Max 10 pending tasks
def extract_active_tasks() -> List[str]:
"""Extract active tasks from current session.
Returns:
List of active task descriptions
"""
# This would integrate with Claude Code's task system
# For now, return placeholder
return []
def get_session_statistics() -> Dict[str, Any]:
"""Get current session statistics.
Returns:
Session stats (duration, files modified, etc.)
"""
stats = {
'duration_minutes': 0,
'files_modified': 0,
'commands_run': 0,
'errors_encountered': 0
}
# Would integrate with Claude Code session tracking
# For now, return placeholder
return stats
def generate_handoff_message(context: Dict[str, Any]) -> str:
"""Generate human-readable handoff message for new session.
Args:
context: Session context dictionary
Returns:
Formatted handoff message
"""
lines = []
lines.append("=" * 70)
lines.append("SESSION HANDOFF CONTEXT")
lines.append("=" * 70)
lines.append("")
# Handoff reason
health = context.get('guardian_health', {})
if health.get('handoff_reason'):
lines.append(f"Handoff Reason: {health['handoff_reason']}")
lines.append(f"Previous Session Health: {health.get('last_health_score', 'N/A')}/100")
lines.append(f"Session Duration: {health.get('session_duration_minutes', 0)} minutes")
lines.append("")
# Critical Oracle knowledge
oracle = context.get('oracle_knowledge', {})
if oracle.get('critical_patterns'):
lines.append("CRITICAL PATTERNS:")
lines.append("-" * 70)
for pattern in oracle['critical_patterns'][:5]:
lines.append(f"{pattern.get('title', 'Unknown')}")
if pattern.get('content'):
lines.append(f" {pattern['content'][:100]}...")
lines.append("")
if oracle.get('recent_corrections'):
lines.append("RECENT CORRECTIONS (Don't repeat these mistakes):")
lines.append("-" * 70)
for correction in oracle['recent_corrections']:
lines.append(f"{correction.get('title', 'Unknown')}")
lines.append("")
if oracle.get('active_gotchas'):
lines.append("ACTIVE GOTCHAS:")
lines.append("-" * 70)
for gotcha in oracle['active_gotchas']:
lines.append(f"{gotcha.get('title', 'Unknown')}")
lines.append("")
# Active Summoner MCDs
summoner = context.get('summoner_state', {})
if summoner.get('active_mcds'):
lines.append("ACTIVE MISSION CONTROL DOCUMENTS:")
lines.append("-" * 70)
for mcd in summoner['active_mcds']:
lines.append(f"{mcd['name']}")
if mcd.get('summary'):
lines.append(f" Summary: {mcd['summary']}")
if mcd.get('pending_tasks'):
lines.append(f" Pending tasks: {len(mcd['pending_tasks'])}")
lines.append("")
lines.append("=" * 70)
lines.append("Use '/handoff-continue' to pick up where we left off")
lines.append("=" * 70)
return "\n".join(lines)
def export_handoff_context(output_file: str = 'handoff_context.json') -> None:
"""Export session context for handoff.
Args:
output_file: Path to output JSON file
"""
context = get_session_context()
# Save JSON
with open(output_file, 'w', encoding='utf-8') as f:
json.dump(context, f, indent=2)
# Print human-readable message
message = generate_handoff_message(context)
print(message)
print(f"\n✅ Handoff context saved to: {output_file}")
print("\nIn your new session, run:")
print(f" python session_handoff.py --import {output_file}")
def import_handoff_context(input_file: str) -> None:
"""Import handoff context in new session.
Args:
input_file: Path to handoff JSON file
"""
if not Path(input_file).exists():
print(f"❌ Handoff file not found: {input_file}")
sys.exit(1)
with open(input_file, 'r', encoding='utf-8') as f:
context = json.load(f)
# Display handoff message
message = generate_handoff_message(context)
print(message)
print("\n✅ Session handoff complete!")
print("You're now up to speed with critical context from the previous session.")
def preview_handoff() -> None:
"""Preview what would be included in handoff."""
context = get_session_context()
message = generate_handoff_message(context)
print(message)
def main():
parser = argparse.ArgumentParser(
description='Enhanced session handoff with Oracle/Guardian/Summoner integration',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument(
'--export',
action='store_true',
help='Export handoff context for new session'
)
parser.add_argument(
'--import',
dest='import_file',
help='Import handoff context from file'
)
parser.add_argument(
'--preview',
action='store_true',
help='Preview handoff context without exporting'
)
parser.add_argument(
'--output',
default='handoff_context.json',
help='Output file for export (default: handoff_context.json)'
)
args = parser.parse_args()
if args.export:
export_handoff_context(args.output)
elif args.import_file:
import_handoff_context(args.import_file)
elif args.preview:
preview_handoff()
else:
parser.print_help()
sys.exit(1)
if __name__ == '__main__':
main()
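
To make the MCD parsing in the handoff script concrete, here is a sketch that runs the same summary and checkbox extraction on an inline sample. The sample document is invented for illustration; it assumes real MCD files use a `## Executive Summary` heading and `- [ ]` checkboxes, which is what the functions above look for. The names `extract_summary` and `extract_pending` are shortened stand-ins, not the script's own functions.

```python
from typing import List

# Hypothetical MCD content, used only to exercise the parsing logic.
SAMPLE_MCD = """# Mission: Auth Refactor

## Executive Summary
Replace session cookies with tokens.
Keep the old login flow working.

## Tasks
- [x] Audit current endpoints
- [ ] Add token issuer
- [ ] Migrate clients
"""


def extract_summary(content: str, limit: int = 200) -> str:
    """Join the non-empty lines under '## Executive Summary', capped at `limit` chars."""
    in_summary = False
    collected: List[str] = []
    for line in content.split('\n'):
        if '## Executive Summary' in line:
            in_summary = True
        elif in_summary and line.startswith('##'):
            break  # the next heading ends the summary section
        elif in_summary and line.strip():
            collected.append(line.strip())
    summary = ' '.join(collected)
    return summary[:limit] + '...' if len(summary) > limit else summary


def extract_pending(content: str, limit: int = 10) -> List[str]:
    """Return the text of unchecked '- [ ]' items, at most `limit` of them."""
    pending = [
        line.replace('- [ ]', '').strip()
        for line in content.split('\n')
        if '- [ ]' in line
    ]
    return pending[:limit]


if __name__ == '__main__':
    print(extract_summary(SAMPLE_MCD))
    print(extract_pending(SAMPLE_MCD))
```

Checked items (`- [x]`) never match the `- [ ]` substring, so only open tasks are carried into the handoff.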


@@ -0,0 +1,402 @@
#!/usr/bin/env python3
"""
Oracle SessionStart Hook
Automatically loads Oracle context when a Claude Code session starts or resumes.
This script is designed to be called by Claude Code's SessionStart hook system.
The script outputs JSON with hookSpecificOutput.additionalContext containing
relevant Oracle knowledge for the session.
Usage:
python session_start_hook.py [--session-id SESSION_ID] [--source SOURCE]
Hook Configuration (add to Claude Code settings):
{
"hooks": {
"SessionStart": [
{
"matcher": "startup",
"hooks": [
{
"type": "command",
"command": "python /path/to/oracle/scripts/session_start_hook.py"
}
]
}
]
}
}
Environment Variables:
ORACLE_CONTEXT_TIER: Context tier level (1=critical/high, 2=adds medium, 3=all) [default: 1]
ORACLE_MAX_CONTEXT_LENGTH: Maximum context length in characters [default: 5000]
"""
import os
import sys
import json
import argparse
from datetime import datetime
from pathlib import Path
from typing import List, Dict, Optional, Any
def find_oracle_root() -> Optional[Path]:
"""Find the .oracle directory by walking up from current directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None
def load_all_knowledge(oracle_path: Path) -> List[Dict[str, Any]]:
"""Load all knowledge from Oracle.
Args:
oracle_path: Path to the .oracle directory
Returns:
List of knowledge entries with _category field added
"""
knowledge_dir = oracle_path / 'knowledge'
all_knowledge: List[Dict[str, Any]] = []
categories = ['patterns', 'preferences', 'gotchas', 'solutions', 'corrections']
for category in categories:
file_path = knowledge_dir / f'{category}.json'
if file_path.exists():
try:
with open(file_path, 'r', encoding='utf-8') as f:
entries = json.load(f)
for entry in entries:
if isinstance(entry, dict):
entry['_category'] = category
all_knowledge.append(entry)
except (json.JSONDecodeError, FileNotFoundError, OSError, IOError):
# Skip corrupted or inaccessible files
continue
return all_knowledge
def filter_by_tier(knowledge: List[Dict[str, Any]], tier: int = 1) -> List[Dict[str, Any]]:
"""Filter knowledge by tier level.
Args:
knowledge: List of knowledge entries
tier: Tier level (1=critical/high, 2=include medium, 3=all)
Returns:
Filtered knowledge entries
"""
if tier == 1:
# Critical and high priority - always load
return [k for k in knowledge if k.get('priority') in ['critical', 'high']]
elif tier == 2:
# Include medium priority
return [k for k in knowledge if k.get('priority') in ['critical', 'high', 'medium']]
else:
# All knowledge
return knowledge
def get_recent_corrections(oracle_path: Path, limit: int = 5) -> List[Dict[str, Any]]:
"""Get most recent corrections.
Args:
oracle_path: Path to the .oracle directory
limit: Maximum number of corrections to return
Returns:
List of recent correction entries
"""
knowledge_dir = oracle_path / 'knowledge'
corrections_file = knowledge_dir / 'corrections.json'
if not corrections_file.exists():
return []
try:
with open(corrections_file, 'r', encoding='utf-8') as f:
corrections = json.load(f)
# Sort by creation date (safely handle missing 'created' field)
sorted_corrections = sorted(
corrections,
key=lambda x: x.get('created', ''),
reverse=True
)
return sorted_corrections[:limit]
except (json.JSONDecodeError, FileNotFoundError, OSError, IOError):
return []
def get_project_stats(oracle_path: Path) -> Optional[Dict[str, Any]]:
"""Get project statistics from index.
Args:
oracle_path: Path to the .oracle directory
Returns:
Index data dictionary or None if unavailable
"""
index_file = oracle_path / 'index.json'
if not index_file.exists():
return None
try:
with open(index_file, 'r', encoding='utf-8') as f:
index = json.load(f)
return index
except (json.JSONDecodeError, FileNotFoundError, OSError, IOError):
return None
# Configuration constants
MAX_KEY_KNOWLEDGE_ITEMS = 15 # Limit before truncation
MAX_ITEMS_PER_CATEGORY = 5 # How many to show per category
RECENT_CORRECTIONS_LIMIT = 3 # How many recent corrections to show
CONTENT_LENGTH_THRESHOLD = 200 # Content shorter than this is shown inline under the title
def generate_context(oracle_path: Path, tier: int = 1, max_length: int = 5000) -> str:
"""Generate context summary for session start.
Args:
oracle_path: Path to the .oracle directory
tier: Context tier level (1=critical/high, 2=adds medium, 3=all)
max_length: Maximum context length in characters
Returns:
Formatted context string ready for injection
"""
knowledge = load_all_knowledge(oracle_path)
if not knowledge:
return "Oracle: No knowledge base found. Start recording sessions to build project knowledge."
# Filter by tier
relevant_knowledge = filter_by_tier(knowledge, tier)
# Get recent corrections
recent_corrections = get_recent_corrections(oracle_path, limit=RECENT_CORRECTIONS_LIMIT)
# Get stats
stats = get_project_stats(oracle_path)
# Build context
lines = []
lines.append("# Oracle Project Knowledge")
lines.append("")
# Add stats if available
if stats:
total_entries = stats.get('total_entries', 0)
sessions = len(stats.get('sessions', []))
if total_entries > 0 or sessions > 0:
lines.append(f"Knowledge Base: {total_entries} entries | {sessions} sessions recorded")
lines.append("")
# Add critical/high priority knowledge
if relevant_knowledge:
lines.append("## Key Knowledge")
lines.append("")
# Group by category
by_category: Dict[str, List[Dict[str, Any]]] = {}
for item in relevant_knowledge[:MAX_KEY_KNOWLEDGE_ITEMS]:
category = item['_category']
if category not in by_category:
by_category[category] = []
by_category[category].append(item)
# Category labels
category_labels = {
'patterns': 'Patterns',
'preferences': 'Preferences',
'gotchas': 'Gotchas (Watch Out!)',
'solutions': 'Solutions',
'corrections': 'Corrections'
}
for category, items in by_category.items():
label = category_labels.get(category, category.capitalize())
lines.append(f"### {label}")
lines.append("")
for item in items[:MAX_ITEMS_PER_CATEGORY]:
priority = item.get('priority', 'medium')
title = item.get('title', 'Untitled')
content = item.get('content', '')
# Compact format
if priority == 'critical':
lines.append(f"- **[CRITICAL]** {title}")
elif priority == 'high':
lines.append(f"- **{title}**")
else:
lines.append(f"- {title}")
# Add brief content if it fits
if content and len(content) < CONTENT_LENGTH_THRESHOLD:
lines.append(f" {content}")
lines.append("")
# Add recent corrections
if recent_corrections:
lines.append("## Recent Corrections")
lines.append("")
for correction in recent_corrections:
content = correction.get('content', '')
title = correction.get('title', 'Correction')
# Try to extract the "right" part if available
if content and 'Right:' in content:
try:
right_part = content.split('Right:', 1)[1].split('\n', 1)[0].strip()
if right_part:
lines.append(f"- {right_part}")
else:
lines.append(f"- {title}")
except (IndexError, ValueError, AttributeError):
lines.append(f"- {title}")
else:
lines.append(f"- {title}")
lines.append("")
# Combine and truncate if needed
full_context = "\n".join(lines)
if len(full_context) > max_length:
# Truncate and add note
truncated = full_context[:max_length].rsplit('\n', 1)[0]
truncated += f"\n\n*[Context truncated to {max_length} chars. Use /oracle skill for full knowledge base]*"
return truncated
return full_context
def output_hook_result(context: str, session_id: Optional[str] = None, source: Optional[str] = None) -> None:
"""Output result in Claude Code hook format.
Args:
context: Context string to inject
session_id: Optional session ID
source: Optional session source (startup/resume/clear)
"""
result = {
"hookSpecificOutput": {
"hookEventName": "SessionStart",
"additionalContext": context
}
}
# Add metadata if available
if session_id:
result["hookSpecificOutput"]["sessionId"] = session_id
if source:
result["hookSpecificOutput"]["source"] = source
# Output as JSON
print(json.dumps(result, indent=2))
def main():
parser = argparse.ArgumentParser(
description='Oracle SessionStart hook for Claude Code',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument(
'--session-id',
help='Session ID (passed by Claude Code)'
)
parser.add_argument(
'--source',
help='Session source: startup, resume, or clear'
)
parser.add_argument(
'--tier',
type=int,
choices=[1, 2, 3],
help='Context tier level (1=critical/high, 2=adds medium, 3=all)'
)
parser.add_argument(
'--max-length',
type=int,
help='Maximum context length in characters'
)
parser.add_argument(
'--debug',
action='store_true',
help='Debug mode - output to stderr instead of JSON'
)
args = parser.parse_args()
# Find Oracle
oracle_path = find_oracle_root()
if not oracle_path:
# No Oracle found - output minimal context
if args.debug:
print("Oracle not initialized for this project", file=sys.stderr)
else:
# Get path to init script relative to this script
script_dir = Path(__file__).parent
init_script_path = script_dir / 'init_oracle.py'
output_hook_result(
f"Oracle: Not initialized. Run `python {init_script_path}` to set up project knowledge tracking.",
args.session_id,
args.source
)
sys.exit(0)
# Get configuration from environment or arguments
tier = args.tier or int(os.getenv('ORACLE_CONTEXT_TIER', '1'))
max_length = args.max_length or int(os.getenv('ORACLE_MAX_CONTEXT_LENGTH', '5000'))
# Generate context
try:
context = generate_context(oracle_path, tier, max_length)
if args.debug:
print(context, file=sys.stderr)
else:
output_hook_result(context, args.session_id, args.source)
except Exception as e:
if args.debug:
print(f"Error generating context: {e}", file=sys.stderr)
import traceback
traceback.print_exc(file=sys.stderr)
else:
# Don't expose internal error details to user
output_hook_result(
"Oracle: Error loading context. Use /oracle skill to query knowledge manually.",
args.session_id,
args.source
)
sys.exit(1)
if __name__ == '__main__':
main()
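
The core of the SessionStart hook above is three steps: filter entries by tier, cap the text at a line boundary, and wrap it in the hook JSON. The sketch below shows those steps in isolation. The sample entries are made up, `TIER_PRIORITIES` and `truncate_at_line` are hypothetical names, and the truncation note that the real script appends is left out for brevity.

```python
import json
from typing import Any, Dict, List

# Priorities admitted by each tier; any other tier value means "everything".
TIER_PRIORITIES = {
    1: {'critical', 'high'},
    2: {'critical', 'high', 'medium'},
}


def filter_by_tier(entries: List[Dict[str, Any]], tier: int = 1) -> List[Dict[str, Any]]:
    """Keep only the entries whose priority is admitted by the given tier."""
    allowed = TIER_PRIORITIES.get(tier)
    if allowed is None:
        return list(entries)
    return [e for e in entries if e.get('priority') in allowed]


def truncate_at_line(text: str, max_length: int) -> str:
    """Cut text to at most max_length chars, backing up to the last full line."""
    if len(text) <= max_length:
        return text
    return text[:max_length].rsplit('\n', 1)[0]


# Invented sample data for illustration only.
ENTRIES = [
    {'title': 'Close DB connections explicitly', 'priority': 'critical'},
    {'title': 'API rate limit is 100 req/min', 'priority': 'high'},
    {'title': 'Prefer tabs in Makefiles', 'priority': 'low'},
]

if __name__ == '__main__':
    context = "\n".join(f"- {e['title']}" for e in filter_by_tier(ENTRIES, tier=1))
    payload = {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": truncate_at_line(context, 5000),
        }
    }
    print(json.dumps(payload, indent=2))
```

Backing up to the last newline avoids injecting a half-written bullet into the session context, at the cost of dropping the final partial line.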


@@ -0,0 +1,504 @@
#!/usr/bin/env python3
"""
Smart Context Generator for Oracle
Enhances context generation by analyzing:
- Current git status (files changed, branch name)
- File patterns and paths in knowledge tags
- Time-decay for older knowledge
- Relevance scoring based on current work
Usage:
python smart_context.py [--format text|json] [--max-length 5000]
This can be used standalone or integrated into generate_context.py
Examples:
python smart_context.py
python smart_context.py --format json --max-length 10000
"""
import os
import sys
import json
import subprocess
from datetime import datetime, timedelta
from pathlib import Path
from typing import List, Dict, Optional, Any, Tuple
import re
def find_oracle_root() -> Optional[Path]:
"""Find the .oracle directory."""
current = Path.cwd()
while current != current.parent:
oracle_path = current / '.oracle'
if oracle_path.exists():
return oracle_path
current = current.parent
return None


def get_git_status() -> Dict[str, Any]:
    """Get current git status information.

    Returns:
        Dictionary with git status information
    """
    git_info = {
        'branch': None,
        'modified_files': [],
        'staged_files': [],
        'untracked_files': [],
        'is_repo': False
    }

    try:
        # Check if we're in a git repo
        subprocess.run(
            ['git', 'rev-parse', '--git-dir'],
            check=True,
            capture_output=True,
            text=True,
            timeout=5
        )
        git_info['is_repo'] = True

        # Get current branch
        result = subprocess.run(
            ['git', 'branch', '--show-current'],
            capture_output=True,
            text=True,
            check=False,
            timeout=5
        )
        if result.returncode == 0:
            git_info['branch'] = result.stdout.strip()

        # Get modified (unstaged) files
        result = subprocess.run(
            ['git', 'diff', '--name-only'],
            capture_output=True,
            text=True,
            check=False,
            timeout=5
        )
        if result.returncode == 0:
            git_info['modified_files'] = [f.strip() for f in result.stdout.split('\n') if f.strip()]

        # Get staged files
        result = subprocess.run(
            ['git', 'diff', '--staged', '--name-only'],
            capture_output=True,
            text=True,
            check=False,
            timeout=5
        )
        if result.returncode == 0:
            git_info['staged_files'] = [f.strip() for f in result.stdout.split('\n') if f.strip()]

        # Get untracked files
        result = subprocess.run(
            ['git', 'ls-files', '--others', '--exclude-standard'],
            capture_output=True,
            text=True,
            check=False,
            timeout=5
        )
        if result.returncode == 0:
            git_info['untracked_files'] = [f.strip() for f in result.stdout.split('\n') if f.strip()]

    except (subprocess.CalledProcessError, FileNotFoundError, subprocess.TimeoutExpired):
        # Not a git repo, git not available, or git command timed out
        pass

    return git_info


def extract_file_patterns(files: List[str]) -> List[str]:
    """Extract patterns from file paths for matching knowledge.

    Args:
        files: List of file paths

    Returns:
        List of patterns (file types, directory names, etc.)
    """
    patterns = set()

    for file_path in files:
        path = Path(file_path)

        # Add file extension
        if path.suffix:
            patterns.add(path.suffix[1:])  # Remove the dot

        # Add directory components
        for part in path.parts[:-1]:  # Exclude filename
            if part and part != '.':
                patterns.add(part)

        # Add filename without extension
        stem = path.stem
        if stem:
            patterns.add(stem)

    return list(patterns)
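

# Illustrative sketch (not part of the original module): how a single changed
# path breaks down into matching patterns. `_sketch_patterns_for` is a
# hypothetical helper that repeats the three rules above for one path, and it
# assumes POSIX-style separators.
from pathlib import PurePosixPath


def _sketch_patterns_for(file_path: str) -> set:
    """Return the extension, directory names, and stem of one path."""
    path = PurePosixPath(file_path)
    found = set()
    if path.suffix:
        found.add(path.suffix[1:])  # extension without the dot
    found.update(part for part in path.parts[:-1] if part and part != '.')
    if path.stem:
        found.add(path.stem)
    return found


# Under these assumptions, 'src/api/users.py' yields
# {'py', 'src', 'api', 'users'}, so a knowledge entry tagged 'api' would count
# as a tag match while that file is being edited.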


def load_all_knowledge(oracle_path: Path) -> List[Dict[str, Any]]:
    """Load all knowledge from Oracle.

    Args:
        oracle_path: Path to .oracle directory

    Returns:
        List of knowledge entries
    """
    knowledge_dir = oracle_path / 'knowledge'
    all_knowledge: List[Dict[str, Any]] = []

    categories = ['patterns', 'preferences', 'gotchas', 'solutions', 'corrections']

    for category in categories:
        file_path = knowledge_dir / f'{category}.json'
        if file_path.exists():
            try:
                with open(file_path, 'r', encoding='utf-8') as f:
                    entries = json.load(f)
                    for entry in entries:
                        if isinstance(entry, dict):
                            entry['_category'] = category
                            all_knowledge.append(entry)
            except json.JSONDecodeError as e:
                # Log parsing errors for debugging
                print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
                continue
            except OSError as e:
                # Log file access errors (OSError also covers FileNotFoundError and IOError)
                print(f"Warning: Cannot read {file_path}: {e}", file=sys.stderr)
                continue

    return all_knowledge


def calculate_time_decay_score(created_date: str, days_half_life: int = 30) -> float:
    """Calculate time decay score for knowledge based on age.

    Args:
        created_date: ISO format date string
        days_half_life: Number of days for score to decay to 0.5 (must be positive)

    Returns:
        Score between 0 and 1 (1 = created today, decays over time)

    Raises:
        ValueError: If days_half_life is not positive
    """
    if days_half_life <= 0:
        raise ValueError(f"days_half_life must be positive, got {days_half_life}")

    try:
        created = datetime.fromisoformat(created_date)
        # Match the timestamp's timezone when it has one; otherwise compare
        # against naive local time
        now = datetime.now(created.tzinfo) if created.tzinfo else datetime.now()

        # Use total_seconds for precise calculation (includes hours/minutes)
        age_seconds = (now - created).total_seconds()
        age_days = age_seconds / (24 * 3600)  # Convert to days with decimals

        # Exponential decay: score = 0.5 ^ (days_old / half_life)
        score = 0.5 ** (age_days / days_half_life)
        return max(0.0, min(1.0, score))

    except (ValueError, TypeError):
        # If date parsing fails, return neutral score
        return 0.5
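

# Illustrative sketch (not part of the original module): the half-life formula
# used above, score = 0.5 ** (age_days / half_life), evaluated for a few ages.
# `_sketch_decay` is a hypothetical helper that takes the age directly instead
# of parsing a date string.
def _sketch_decay(age_days: float, half_life: float = 30.0) -> float:
    """Return the exponential-decay score for an entry of the given age."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return max(0.0, min(1.0, 0.5 ** (age_days / half_life)))


# With a 30-day half-life, a new entry scores 1.0, a 30-day-old entry 0.5, and
# a 60-day-old entry 0.25. A timestamp in the future (negative age) is clamped
# to 1.0.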


def calculate_relevance_score(
    entry: Dict[str, Any],
    file_patterns: List[str],
    branch: Optional[str] = None
) -> float:
    """Calculate relevance score for a knowledge entry.

    Args:
        entry: Knowledge entry dictionary
        file_patterns: List of file patterns from current work
        branch: Current git branch name (accepted but not used in scoring)

    Returns:
        Relevance score (0.0 to 1.0)
    """
    score = 0.0

    # Base score from priority
    priority_scores = {
        'critical': 1.0,
        'high': 0.8,
        'medium': 0.5,
        'low': 0.2
    }
    priority = entry.get('priority', 'medium')
    score += priority_scores.get(priority, 0.5) * 0.3  # 30% weight to priority

    # Score from tag matches (skipped when there are no tags or no file patterns)
    tags = entry.get('tags', [])
    if tags and file_patterns:
        # Check how many patterns match tags (using word boundary matching)
        matches = sum(1 for pattern in file_patterns
                      if any(re.search(r'\b' + re.escape(pattern.lower()) + r'\b', tag.lower())
                             for tag in tags))
        tag_score = matches / len(file_patterns)  # Safe: len(file_patterns) > 0
        score += min(1.0, tag_score) * 0.4  # 40% weight to tag matching

    # Score from content/title keyword matching (skipped when there are no file patterns)
    if file_patterns:
        content = f"{entry.get('title', '')} {entry.get('content', '')} {entry.get('context', '')}".lower()
        # Use word boundary matching to avoid false positives
        keyword_matches = sum(1 for pattern in file_patterns
                              if re.search(r'\b' + re.escape(pattern.lower()) + r'\b', content))
        keyword_score = keyword_matches / len(file_patterns)  # Safe: len(file_patterns) > 0
        score += min(1.0, keyword_score) * 0.2  # 20% weight to keyword matching

    # Score from time decay
    created = entry.get('created', '')
    time_score = calculate_time_decay_score(created)
    score += time_score * 0.1  # 10% weight to recency

    return min(1.0, score)
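

# Illustrative sketch (not part of the original module): how the four weighted
# components combine. `_sketch_combine` is a hypothetical helper that takes
# component scores that have already been computed, each expected in [0.0, 1.0].
def _sketch_combine(priority: float, tag: float, keyword: float, recency: float) -> float:
    """Blend component scores with the 30/40/20/10 weighting used above."""
    total = priority * 0.3 + tag * 0.4 + keyword * 0.2 + recency * 0.1
    return min(1.0, total)


# Example: a 'high' priority entry (0.8) whose tags match half of the current
# file patterns (0.5), with no keyword hits (0.0) and a 30-day-old timestamp
# (0.5), scores 0.8*0.3 + 0.5*0.4 + 0.0*0.2 + 0.5*0.1 = 0.49.
# When no files are changed, the tag and keyword terms contribute nothing, so
# the score cannot exceed 0.3 + 0.1 = 0.4.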


def score_and_rank_knowledge(
    knowledge: List[Dict[str, Any]],
    git_info: Dict[str, Any]
) -> List[Tuple[Dict[str, Any], float]]:
    """Score and rank knowledge entries by relevance.

    Args:
        knowledge: List of knowledge entries
        git_info: Git status information

    Returns:
        List of tuples (entry, score) sorted by score descending
    """
    # Extract file patterns from all changed files
    all_files = (
        git_info['modified_files'] +
        git_info['staged_files'] +
        git_info['untracked_files']
    )
    file_patterns = extract_file_patterns(all_files)

    # Score each entry
    scored_entries = []
    for entry in knowledge:
        score = calculate_relevance_score(entry, file_patterns, git_info.get('branch'))
        scored_entries.append((entry, score))

    # Sort by score descending
    scored_entries.sort(key=lambda x: x[1], reverse=True)

    return scored_entries


def generate_smart_context(
    oracle_path: Path,
    max_length: int = 5000,
    min_score: float = 0.3
) -> str:
    """Generate smart context based on current git status.

    Args:
        oracle_path: Path to .oracle directory
        max_length: Maximum context length (must be > 0); a short truncation
            notice is appended after cutting, so the result can slightly exceed it
        min_score: Minimum relevance score to include (0.0-1.0)

    Returns:
        Formatted context string

    Raises:
        ValueError: If parameters are invalid
    """
    # Validate parameters
    if not 0.0 <= min_score <= 1.0:
        raise ValueError(f"min_score must be in [0.0, 1.0], got {min_score}")
    if max_length <= 0:
        raise ValueError(f"max_length must be positive, got {max_length}")

    # Get git status
    git_info = get_git_status()

    # Load all knowledge
    knowledge = load_all_knowledge(oracle_path)
    if not knowledge:
        return "Oracle: No knowledge base found."

    # Score and rank knowledge
    scored_knowledge = score_and_rank_knowledge(knowledge, git_info)

    # Filter by minimum score
    relevant_knowledge = [(entry, score) for entry, score in scored_knowledge if score >= min_score]

    # Build context
    lines = []
    lines.append("# Oracle Smart Context")
    lines.append("")

    # Add git status if available
    if git_info['is_repo']:
        lines.append("## Current Work Context")
        if git_info['branch']:
            lines.append(f"Branch: `{git_info['branch']}`")
        # Count each path once, even if it has both staged and unstaged changes
        total_files = len(set(git_info['modified_files']) | set(git_info['staged_files']))
        if total_files > 0:
            lines.append(f"Files being worked on: {total_files}")
        lines.append("")

    # Add relevant knowledge
    if relevant_knowledge:
        lines.append("## Relevant Knowledge")
        lines.append("")

        # Group by category
        by_category: Dict[str, List[Tuple[Dict[str, Any], float]]] = {}
        for entry, score in relevant_knowledge[:20]:  # Top 20
            category = entry['_category']
            if category not in by_category:
                by_category[category] = []
            by_category[category].append((entry, score))

        category_labels = {
            'patterns': 'Patterns',
            'preferences': 'Preferences',
            'gotchas': 'Gotchas (Watch Out!)',
            'solutions': 'Solutions',
            'corrections': 'Corrections'
        }

        for category, items in by_category.items():
            label = category_labels.get(category, category.capitalize())
            lines.append(f"### {label}")
            lines.append("")

            for entry, score in items[:10]:  # Top 10 per category
                priority = entry.get('priority', 'medium')
                title = entry.get('title', 'Untitled')
                content = entry.get('content', '')

                # Format based on priority and score
                if priority == 'critical' or score >= 0.8:
                    lines.append(f"- **[{score:.1f}] {title}**")
                else:
                    lines.append(f"- [{score:.1f}] {title}")

                # Add content if it's brief
                if content and len(content) < 200:
                    lines.append(f"  {content}")

                # Add tags (up to five) when the entry has any
                tags = entry.get('tags', [])
                if tags:
                    lines.append(f"  *Tags: {', '.join(tags[:5])}*")

            lines.append("")
    else:
        lines.append("No highly relevant knowledge found for current work.")
        lines.append("")
        lines.append("Showing high-priority items:")
        lines.append("")

        # Fall back to high-priority items
        high_priority = [e for e in knowledge if e.get('priority') in ['critical', 'high']]
        for entry in high_priority[:10]:
            title = entry.get('title', 'Untitled')
            lines.append(f"- {title}")
        lines.append("")

    # Combine and truncate if needed
    full_context = "\n".join(lines)
    if len(full_context) > max_length:
        truncated = full_context[:max_length]
        # Find last newline to avoid breaking mid-line
        last_newline = truncated.rfind('\n')
        if last_newline != -1:
            truncated = truncated[:last_newline]
        truncated += f"\n\n*[Context truncated to {max_length} chars]*"
        return truncated

    return full_context
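

# Illustrative sketch (not part of the original module): the line-aware
# truncation used above, isolated so its behavior is easy to see.
# `_sketch_truncate` is a hypothetical helper. As in the function above, the
# appended notice can push the result past `max_length`.
def _sketch_truncate(text: str, max_length: int) -> str:
    """Cut text at the last full line that fits within max_length characters."""
    if len(text) <= max_length:
        return text
    truncated = text[:max_length]
    last_newline = truncated.rfind('\n')
    if last_newline != -1:
        truncated = truncated[:last_newline]
    return truncated + f"\n\n*[Context truncated to {max_length} chars]*"


# Example: _sketch_truncate("abc\ndefgh", 6) keeps only the complete first line
# "abc" and then appends the truncation notice.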


def main():
    import argparse

    parser = argparse.ArgumentParser(
        description='Generate smart context from Oracle knowledge',
        formatter_class=argparse.RawDescriptionHelpFormatter
    )
    parser.add_argument(
        '--format',
        choices=['text', 'json'],
        default='text',
        help='Output format (text or json)'
    )
    parser.add_argument(
        '--max-length',
        type=int,
        default=5000,
        help='Maximum context length'
    )
    parser.add_argument(
        '--min-score',
        type=float,
        default=0.3,
        help='Minimum relevance score (0.0-1.0)'
    )

    args = parser.parse_args()

    # Find Oracle
    oracle_path = find_oracle_root()
    if not oracle_path:
        if args.format == 'json':
            print(json.dumps({'error': 'Oracle not initialized'}))
        else:
            print("[ERROR] .oracle directory not found.")
        sys.exit(1)

    # Generate context
    try:
        context = generate_smart_context(oracle_path, args.max_length, args.min_score)

        if args.format == 'json':
            output = {
                'context': context,
                'git_status': get_git_status()
            }
            print(json.dumps(output, indent=2))
        else:
            print(context)

    except Exception as e:
        if args.format == 'json':
            print(json.dumps({'error': str(e)}))
        else:
            print(f"[ERROR] {e}")
        sys.exit(1)


if __name__ == '__main__':
    main()