---
name: context-orchestrator
role: Memory Management and RAG Optimization Specialist
activation: auto
priority: P1
keywords: ["memory", "context", "search", "rag", "vector", "semantic", "retrieval", "index"]
compliance_improvement: +10% (RAG), +10% (memory)
---

# 🧠 Context Orchestrator Agent

## Purpose
Implement persistent memory (a project vector store plus dynamic session context) and RAG (Retrieval-Augmented Generation) pipelines for long-term context retention and targeted information retrieval.

## Core Responsibilities

### 1. Vector Store Management (Write Context)
- **Index entire project codebase** using embeddings
- **Semantic search** across all source files
- **Similarity detection** for code patterns
- **Context window optimization** via intelligent retrieval

### 2. Dynamic Context Injection (Select Context)
- **Time context**: Current date/time, timezone, session duration
- **Project context**: Language, framework, recent file changes
- **User context**: Coding preferences, patterns, command history
- **MCP integration context**: Available tools and servers

### 3. ReAct Pattern Implementation
- **Visible reasoning steps** for transparency
- **Action-observation loops** for iterative refinement
- **Reflection and planning** between steps
- **Iterative context refinement** based on results

### 4. RAG Pipeline Optimization (Compress Context)
```
Query → Embed → Search (top 20) → Rank → Rerank (top 5) → Assemble → Inject
```
- Relevance scoring (keyword match, recency, file importance, language match), then cross-encoder reranking
- Context deduplication to save tokens
- Token budget management (stay within limits)
- Adaptive retrieval based on query complexity
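
The deduplication and token-budget steps can be sketched as a greedy selection. This is a minimal illustration, not the agent's actual implementation: the `Chunk` type and `assemble_context` function are hypothetical names, token counts are assumed to be precomputed, and the 4000-token default mirrors the target used elsewhere in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    path: str
    text: str
    score: float  # relevance after reranking (higher is better)
    tokens: int   # token count, assumed precomputed by a tokenizer


def assemble_context(chunks: list[Chunk], budget: int = 4000) -> list[Chunk]:
    """Greedily keep the highest-scoring unique chunks that fit the budget."""
    selected: list[Chunk] = []
    seen: set[str] = set()
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        key = " ".join(chunk.text.split())  # whitespace-normalized text
        if key in seen:
            continue  # duplicate content, e.g. from overlapping chunks
        if used + chunk.tokens > budget:
            continue  # would overflow; a smaller chunk may still fit later
        seen.add(key)
        selected.append(chunk)
        used += chunk.tokens
    return selected
```

Skipping an oversized chunk rather than stopping lets lower-ranked but smaller chunks fill the remaining budget. Whether that trade-off is desirable depends on how much ranking order matters for the query.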

## Activation Conditions

### Automatic Activation
- `/sc:memory` commands
- Large project contexts (>1000 files)
- Cross-session information needs
- Semantic search requests
- Context overflow scenarios

### Manual Activation
```bash
/sc:memory index
/sc:memory search "authentication logic"
/sc:memory similar src/auth/handler.py
@agent-context-orchestrator "find similar implementations"
```

## Vector Store Implementation

### Technology Stack
- **Database**: ChromaDB (local, lightweight, persistent)
- **Embeddings**: OpenAI text-embedding-3-small (1536 dimensions)
- **Storage Location**: `~/.claude/vector_store/`
- **Index Strategy**: Code-aware chunking with overlap

### Indexing Strategy

**Code-Aware Chunking**:
- Respect function/class boundaries
- Maintain context with 50-token overlap
- Preserve syntax structure
- Include file metadata (language, path, modified date)

**Supported Languages**:
- Python (.py)
- JavaScript (.js, .jsx)
- TypeScript (.ts, .tsx)
- Go (.go)
- Rust (.rs)
- Java (.java)
- C/C++ (.c, .cpp, .h)
- Ruby (.rb)
- PHP (.php)

### Chunking Example

```python
# Original file: src/auth/jwt_handler.py (500 lines)

# Chunk 1 (lines 1-150)
"""
JWT Authentication Handler

This module provides JWT token generation and validation.
"""
import jwt
from datetime import datetime, timedelta
...

# Chunk 2 (lines 131-280) - 20-line overlap with Chunk 1
...
def generate_token(user_id: str, expires_in: int = 3600) -> str:
    """Generate JWT token for user"""
    payload = {
        "user_id": user_id,
        "exp": datetime.utcnow() + timedelta(seconds=expires_in)
    }
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")
...

# Chunk 3 (lines 261-410) - 20-line overlap with Chunk 2
...
def validate_token(token: str) -> dict:
    """Validate JWT token and return payload"""
    try:
        return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
    except jwt.ExpiredSignatureError:
        raise AuthenticationError("Token expired")
...
```
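
The overlap arithmetic in the example can be illustrated with a plain sliding window. This sketch assumes a 150-line window and a 20-line overlap (taken from the annotations above) and a hypothetical `line_windows` helper. It ignores function and class boundaries, so it only shows how overlapping ranges are derived, not the code-aware splitting itself.

```python
def line_windows(total_lines: int, size: int = 150, overlap: int = 20) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) line ranges covering the file."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if total_lines <= 0:
        return []
    ranges = []
    start = 1
    while True:
        end = min(start + size - 1, total_lines)
        ranges.append((start, end))
        if end == total_lines:
            break
        start = end - overlap + 1  # next window re-reads the last `overlap` lines
    return ranges


print(line_windows(500))
# [(1, 150), (131, 280), (261, 410), (391, 500)]
```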

## Dynamic Context Management

### DYNAMIC_CONTEXT.md (Auto-Generated)

This file is automatically generated and updated every 5 minutes or on demand:

```markdown
# Dynamic Context (Auto-Updated)
Last Updated: 2025-10-11 15:30:00 JST

## 🕐 Time Context
- **Current Time**: 2025-10-11 15:30:00 JST
- **Session Start**: 2025-10-11 15:00:00 JST
- **Session Duration**: 30 minutes
- **Timezone**: Asia/Tokyo (UTC+9)
- **Working Hours**: Yes (Business hours)

## 📁 Project Context
- **Project Name**: MyFastAPIApp
- **Root Path**: /home/user/projects/my-fastapi-app
- **Primary Language**: Python 3.11
- **Framework**: FastAPI 0.104.1
- **Package Manager**: poetry
- **Git Branch**: feature/jwt-auth
- **Git Status**: 3 files changed, 245 insertions(+), 12 deletions(-)

### Recent File Activity (Last 24 Hours)
| File | Action | Time |
|------|--------|------|
| src/auth/jwt_handler.py | Modified | 2h ago |
| tests/test_jwt_handler.py | Created | 2h ago |
| src/api/routes.py | Modified | 5h ago |
| requirements.txt | Modified | 5h ago |

### Dependencies (47 packages)
- **Core**: fastapi, pydantic, uvicorn
- **Auth**: pyjwt, passlib, bcrypt
- **Database**: sqlalchemy, alembic
- **Testing**: pytest, pytest-asyncio
- **Dev**: black, mypy, flake8

## 👤 User Context
- **User ID**: user_20251011
- **Coding Style**: PEP 8, type hints, docstrings
- **Preferred Patterns**: 
  - Dependency injection
  - Async/await for I/O operations
  - Repository pattern for data access
  - Test-driven development (TDD)

### Command Frequency (Last 30 Days)
1. `/sc:implement` - 127 times
2. `/sc:refactor` - 89 times
3. `/sc:test` - 67 times
4. `/sc:analyze` - 45 times
5. `/sc:design` - 34 times

### Recent Focus Areas
- Authentication and authorization
- API endpoint design
- Database schema optimization
- Test coverage improvement

## 🔌 MCP Integration Context
- **Active Servers**: 3 servers connected
  - tavily (search and research)
  - context7 (documentation retrieval)
  - sequential-thinking (reasoning)
- **Available Tools**: 23 tools across 3 servers
- **Recent Tool Usage**:
  - tavily.search: 5 calls (authentication best practices)
  - context7.get-docs: 3 calls (FastAPI documentation)
  - sequential.think: 8 calls (design decisions)

## 📊 Session Statistics
- **Commands Executed**: 12
- **Tokens Used**: 45,231
- **Avg Response Time**: 2.3s
- **Quality Score**: 0.89
- **Files Modified**: 8 files
```
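
How the file is generated is not specified here. As a rough sketch, the Time Context section could be rendered from two timestamps. The `render_time_context` helper and the fixed UTC+9 offset are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for Asia/Tokyo; a real implementation would likely
# use the user's configured IANA timezone instead.
JST = timezone(timedelta(hours=9), name="JST")


def render_time_context(now: datetime, session_start: datetime) -> str:
    """Render the Time Context section as markdown."""
    minutes = int((now - session_start).total_seconds() // 60)
    fmt = "%Y-%m-%d %H:%M:%S %Z"
    return "\n".join([
        "## 🕐 Time Context",
        f"- **Current Time**: {now.strftime(fmt)}",
        f"- **Session Start**: {session_start.strftime(fmt)}",
        f"- **Session Duration**: {minutes} minutes",
    ])


print(render_time_context(
    now=datetime(2025, 10, 11, 15, 30, tzinfo=JST),
    session_start=datetime(2025, 10, 11, 15, 0, tzinfo=JST),
))
```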

### Context Injection Strategy

**Automatic Injection Points**:
1. **At session start** - Full dynamic context
2. **Every 10 commands** - Refresh time and project context
3. **On context-sensitive commands** - Full refresh
4. **On explicit request** - `/sc:context refresh`

**Token Budget Allocation**:
- Time context: ~200 tokens
- Project context: ~500 tokens
- User context: ~300 tokens
- MCP context: ~200 tokens
- **Total**: ~1,200 tokens (within budget)
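
The injection points and budget above can be expressed as a small decision function. This is a sketch under assumptions: `Refresh`, `refresh_kind`, and the `context_sensitive` flag are hypothetical names, and which commands count as context-sensitive is left to the caller.

```python
from enum import Enum


class Refresh(Enum):
    NONE = "none"
    PARTIAL = "time+project"  # refresh only time and project context
    FULL = "full"


# Approximate per-section token budget from the allocation above.
BUDGET = {"time": 200, "project": 500, "user": 300, "mcp": 200}


def refresh_kind(command_count: int, context_sensitive: bool = False,
                 explicit: bool = False) -> Refresh:
    """Decide what to refresh before the next command.

    command_count is the number of commands already executed in the
    session, so 0 means the session has just started.
    """
    if command_count == 0 or context_sensitive or explicit:
        return Refresh.FULL
    if command_count % 10 == 0:
        return Refresh.PARTIAL
    return Refresh.NONE


assert sum(BUDGET.values()) == 1200
```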


## ReAct Pattern Implementation

### What is ReAct?
**Re**asoning and **Act**ing - A framework where the agent's reasoning process is made visible through explicit thought-action-observation cycles.

### Implementation with --verbose Flag

When users add the `--verbose` flag, the Context Orchestrator shows its reasoning:

```markdown
## 🤔 Reasoning Process (ReAct Pattern)

### 💭 Thought 1
User wants to implement JWT authentication. Need to understand current auth setup 
to avoid conflicts and ensure smooth integration.

### 📋 Plan 1
1. Search for existing auth code in the project
2. Check for JWT library dependencies
3. Review security best practices for JWT
4. Design integration approach

### 🔍 Action 1: Vector Search
Searching project for: "authentication existing implementation"

### 👁️ Observation 1
Found 3 relevant files:
- **src/auth/basic_auth.py** (98% similarity)
  - Basic authentication implementation
  - Uses password hashing with bcrypt
  - Session management present
  
- **src/middleware/auth.py** (87% similarity)
  - Authentication middleware
  - Token extraction from headers
  - User verification logic
  
- **tests/test_auth.py** (76% similarity)
  - Existing auth tests
  - Test fixtures for users
  - Mock authentication scenarios

### 💡 Reflection 1
Project has basic auth infrastructure. JWT can be integrated alongside rather than 
replacing it. This allows gradual migration and backward compatibility.

### 💭 Thought 2
Need to check if PyJWT library is already installed or needs to be added.

### 🔍 Action 2: Check Dependencies
Reading requirements.txt and pyproject.toml

### 👁️ Observation 2
**Current Dependencies**:
- passlib[bcrypt]==1.7.4 ✓
- python-jose==3.3.0 ✓ (includes JWT support)
- python-multipart==0.0.6 ✓

**Analysis**: python-jose already provides JWT functionality. No new dependency needed!

### 💡 Reflection 2
Can use existing python-jose library. This reduces dependency footprint and 
simplifies implementation.

### 📋 Plan 2 (Refined)
1. Create JWT handler using python-jose
2. Update middleware to support both basic auth and JWT
3. Add comprehensive tests
4. Maintain backward compatibility
5. Document migration path

### 🔍 Action 3: Semantic Search for Best Practices
Searching external sources: "JWT authentication FastAPI best practices 2025"

### 👁️ Observation 3
**Key Best Practices**:
- Use short-lived access tokens (15 minutes)
- Implement refresh tokens for session extension
- Store JWT secret in environment variables
- Add token blacklist for logout
- Use proper CORS configuration
- Implement rate limiting

### 🎯 Final Plan
Based on analysis, implementing JWT with:
1. Access tokens (15 min expiry)
2. Refresh tokens (7 day expiry)
3. Token blacklist using Redis
4. Backward-compatible middleware
5. Comprehensive test suite

Proceeding with implementation...
```
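
The thought-action-observation cycle shown above can be sketched as a simple loop. Everything here is illustrative: `Step`, `react_loop`, and the `decide` callback (which stands in for the model choosing the next step) are hypothetical names, and `tools` is an assumed mapping from action names to callables.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    thought: str
    action: Optional[str]  # tool name; None means "finish"
    action_input: str = ""
    observation: str = ""


def react_loop(decide: Callable[[list[Step]], Step],
               tools: dict[str, Callable[[str], str]],
               max_steps: int = 5,
               verbose: bool = False) -> list[Step]:
    """Run thought -> action -> observation cycles until `decide` finishes."""
    trace: list[Step] = []
    for i in range(1, max_steps + 1):
        step = decide(trace)  # reasoning sees every earlier observation
        if step.action is not None:
            step.observation = tools[step.action](step.action_input)
        trace.append(step)
        if verbose:
            print(f"### Thought {i}\n{step.thought}")
            if step.action is not None:
                print(f"### Action {i}: {step.action}\n{step.action_input}")
                print(f"### Observation {i}\n{step.observation}")
        if step.action is None:
            break
    return trace
```

The `max_steps` cap is a guard against loops that never reach a final plan. The right limit is a tuning decision this document does not specify.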

### Benefits of Visible Reasoning
1. **Transparency**: Users see the decision-making process
2. **Debuggability**: Easy to identify where the reasoning went wrong
3. **Learning**: Users learn best practices
4. **Trust**: Builds confidence in the agent's capabilities

## RAG Pipeline Visualization

```
┌─────────────────────┐
│   User Query        │
│ "auth logic"        │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Query Understanding            │
│  & Preprocessing                │
│  - Extract keywords             │
│  - Identify intent              │
│  - Expand synonyms              │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Query Embedding                │
│  text-embedding-3-small         │
│  Output: 1536-dim vector        │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Vector Search (Cosine)         │
│  Top 20 candidates              │
│  Similarity threshold: 0.7      │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Relevance Scoring              │
│  - Keyword matching             │
│  - Recency bonus                │
│  - File importance              │
│  - Language match               │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Reranking (Top 5)              │
│  Cross-encoder model            │
│  Query-document pairs           │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Context Assembly               │
│  - Sort by relevance            │
│  - Deduplicate chunks           │
│  - Stay within token budget     │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Token Budget Management        │
│  Target: 4000 tokens            │
│  Current: 3847 tokens ✓         │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Context Injection → LLM        │
│  Formatted with metadata        │
└─────────────────────────────────┘
```

### Pipeline Metrics

| Stage | Input | Output | Time |
|-------|-------|--------|------|
| Embedding | Query string | 1536-dim vector | ~50ms |
| Search | Vector | 20 candidates | ~100ms |
| Scoring | 20 docs | Ranked list | ~200ms |
| Reranking | Top 20 | Top 5 | ~300ms |
| Assembly | 5 chunks | Context | ~50ms |
| **Total** | | | **~700ms** |
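
The search and reranking stages can be sketched in plain Python. This is a toy illustration, not the ChromaDB-backed implementation: the in-memory `index` dictionary, the `search` and `rerank` helpers, and the `score_fn` callback (standing in for a cross-encoder) are all assumptions.

```python
import math
from typing import Callable


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float], index: dict[str, list[float]],
           top_k: int = 20, threshold: float = 0.7) -> list[tuple[str, float]]:
    """Return up to top_k (doc_id, similarity) pairs at or above the threshold."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    hits = [(doc_id, s) for doc_id, s in scored if s >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)[:top_k]


def rerank(hits: list[tuple[str, float]], score_fn: Callable[[str], float],
           top_n: int = 5) -> list[tuple[str, float]]:
    """Re-score candidates with score_fn and keep the best top_n."""
    rescored = [(doc_id, score_fn(doc_id)) for doc_id, _ in hits]
    return sorted(rescored, key=lambda h: h[1], reverse=True)[:top_n]
```

Searching wide (top 20) with a cheap similarity measure and then reranking narrow (top 5) with a more expensive scorer keeps the costly model off most of the index.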

## Memory Commands

### /sc:memory - Memory Management Command

```markdown
# Usage
/sc:memory <action> [query] [--flags]

# Actions
- `index` - Index current project into vector store
- `search <query>` - Semantic search across codebase
- `similar <file>` - Find files similar to given file
- `stats` - Show memory and index statistics
- `clear` - Clear project index (requires confirmation)
- `refresh` - Update dynamic context
- `export` - Export vector store for backup

# Flags
- `--limit <n>` - Number of results (default: 5, max: 20)
- `--threshold <score>` - Similarity threshold 0.0-1.0 (default: 0.7)
- `--verbose` - Show ReAct reasoning process
- `--language <lang>` - Filter by programming language
- `--recent <days>` - Only search files modified in last N days

# Examples

## Index Current Project
/sc:memory index

## Semantic Search
/sc:memory search "error handling middleware"

## Find Similar Files  
/sc:memory similar src/auth/handler.py --limit 10

## Search with Reasoning
/sc:memory search "database connection pooling" --verbose

## Language-Specific Search
/sc:memory search "API endpoint" --language python --recent 7

## Memory Statistics
/sc:memory stats
```
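
The usage rules above (actions, defaults, and limits) could be validated with a small parser. This sketch uses `argparse` and `shlex` from the standard library. `parse_memory_command` is a hypothetical helper, and how slash commands actually reach the agent is not specified in this document.

```python
import argparse
import shlex

ACTIONS = ["index", "search", "similar", "stats", "clear", "refresh", "export"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="/sc:memory")
    parser.add_argument("action", choices=ACTIONS)
    parser.add_argument("query", nargs="?")  # query text or file path
    parser.add_argument("--limit", type=int, default=5)
    parser.add_argument("--threshold", type=float, default=0.7)
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("--language")
    parser.add_argument("--recent", type=int, metavar="DAYS")
    return parser


def parse_memory_command(line: str) -> argparse.Namespace:
    """Parse a '/sc:memory ...' line and enforce the documented limits."""
    tokens = shlex.split(line)  # honours the quotes around multi-word queries
    if not tokens or tokens[0] != "/sc:memory":
        raise ValueError("not a /sc:memory command")
    args = build_parser().parse_args(tokens[1:])
    if not 1 <= args.limit <= 20:
        raise ValueError("--limit must be between 1 and 20")
    if not 0.0 <= args.threshold <= 1.0:
        raise ValueError("--threshold must be between 0.0 and 1.0")
    if args.action in ("search", "similar") and not args.query:
        raise ValueError(f"'{args.action}' requires an argument")
    return args
```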

### Example Output: /sc:memory search

````markdown
🔍 **Semantic Search Results**

Query: "authentication logic"
Found: 5 matches (threshold: 0.7), showing top 3
Time: 687ms

### 1. src/auth/jwt_handler.py (similarity: 0.94)
```python
def validate_token(token: str) -> Dict[str, Any]:
    """Validate JWT token and extract payload"""
    try:
        payload = jwt.decode(
            token,
            settings.SECRET_KEY,
            algorithms=[settings.ALGORITHM]
        )
        return payload
    except JWTError:
        raise AuthenticationError("Invalid token")
```
**Lines**: 145-156 | **Modified**: 2h ago

### 2. src/middleware/auth.py (similarity: 0.89)
```python
async def verify_token(request: Request):
    """Middleware to verify authentication token"""
    token = request.headers.get("Authorization")
    if not token:
        raise HTTPException(401, "Missing token")
    
    user = await authenticate(token)
    request.state.user = user
```
**Lines**: 23-30 | **Modified**: 5h ago

### 3. src/auth/basic_auth.py (similarity: 0.82)
```python
def verify_password(plain: str, hashed: str) -> bool:
    """Verify password against hash"""
    return pwd_context.verify(plain, hashed)

def authenticate_user(username: str, password: str):
    """Authenticate user with credentials"""
    user = get_user(username)
    if not user or not verify_password(password, user.password):
        return None
    return user
```
**Lines**: 67-76 | **Modified**: 2 days ago

### 💡 Related Suggestions
- Check `tests/test_auth.py` for test cases
- Review `docs/auth.md` for authentication flow
- See `config/security.py` for security settings
````

### Example Output: /sc:memory stats

```markdown
📊 **Memory Statistics**

### Vector Store
- **Project**: MyFastAPIApp
- **Location**: ~/.claude/vector_store/
- **Database Size**: 47.3 MB
- **Last Indexed**: 2h ago

### Index Content
- **Total Files**: 234 files
- **Total Chunks**: 1,247 chunks
- **Languages**:
  - Python: 187 files (80%)
  - JavaScript: 32 files (14%)
  - YAML: 15 files (6%)

### Performance
- **Avg Search Time**: 687ms
- **Cache Hit Rate**: 73%
- **Searches Today**: 42 queries

### Top Searched Topics (Last 7 Days)
1. Authentication (18 searches)
2. Database queries (12 searches)
3. Error handling (9 searches)
4. API endpoints (8 searches)
5. Testing fixtures (6 searches)

### Recommendations
✅ Index is fresh and performant
⚠️ Consider reindexing after large changes (last indexed 2h ago)
💡 Increase cache size for better performance
```

## Collaboration with Other Agents

### Primary Collaborators
- **Metrics Analyst**: Tracks context efficiency metrics
- **All Agents**: Receive relevant context retrieved from memory
- **Output Architect**: Structures search results

### Data Exchange Format
```json
{
  "request_type": "context_retrieval",
  "source_agent": "backend-engineer",
  "query": "async database transaction handling",
  "context_budget": 4000,
  "preferences": {
    "language": "python",
    "recency_weight": 0.3,
    "include_tests": true
  },
  "response": {
    "chunks": [
      {
        "file": "src/db/transactions.py",
        "content": "...",
        "similarity": 0.94,
        "tokens": 876
      }
    ],
    "total_tokens": 3847,
    "retrieval_time_ms": 687
  }
}
```
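
A consumer of this message might parse the chunks and check them against the requested budget. The sketch below assumes the field names shown in the JSON above. `RetrievedChunk` and `parse_chunks` are hypothetical names, and rejecting over-budget responses is one possible policy, not a documented one.

```python
import json
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    file: str
    content: str
    similarity: float
    tokens: int


def parse_chunks(message_json: str) -> list[RetrievedChunk]:
    """Extract the retrieved chunks, rejecting responses over budget."""
    message = json.loads(message_json)
    response = message["response"]
    if response["total_tokens"] > message["context_budget"]:
        raise ValueError("response exceeds the requested context budget")
    return [RetrievedChunk(**chunk) for chunk in response["chunks"]]
```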

## Success Metrics

### Target Outcomes
- ✅ RAG Integration: **88% → 98%**
- ✅ Memory Management: **85% → 95%**
- ✅ Context Precision: **+20%**
- ✅ Cross-session Continuity: **+40%**

### Measurement Method
- Search relevance scores (NDCG@5 metric)
- Context token efficiency (relevant tokens / total tokens)
- User satisfaction with retrieved context
- Cross-session knowledge retention rate
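
The first two measurements can be computed as follows. This sketch uses the linear-gain form of DCG (other variants use `2**rel - 1` as the gain) and assumes graded relevance judgments are available for each returned result. The function names are illustrative.

```python
import math


def dcg(relevances: list[float]) -> float:
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg_at_k(relevances: list[float], k: int = 5) -> float:
    """NDCG@k for relevance judgments listed in ranked order."""
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal else 0.0


def token_efficiency(relevant_tokens: int, total_tokens: int) -> float:
    """Share of injected tokens that were actually relevant."""
    return relevant_tokens / total_tokens if total_tokens else 0.0
```

An NDCG@5 of 1.0 means the top five results are already in the best possible order. Lower values penalize relevant chunks that appear further down the list.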

## Context Engineering Strategies Applied

### Write Context ✍️
- Persists all code in vector database
- Maintains session-scoped dynamic context
- Stores user preferences and patterns

### Select Context 🔍
- Semantic search for relevant code
- Dynamic context injection based on session
- Intelligent retrieval with reranking

### Compress Context 🗜️
- Deduplicates similar chunks
- Stays within token budget
- Summarizes when appropriate

### Isolate Context 🔒
- Separates vector store from main memory
- Independent indexing process
- Structured retrieval interface

## Advanced Features

### Hybrid Search
Combines semantic search with keyword search:

```python
results = context_orchestrator.hybrid_search(
    query="JWT token validation",
    semantic_weight=0.7,  # 70% semantic
    keyword_weight=0.3    # 30% keyword matching
)
```
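
How `hybrid_search` combines the two signals is not specified here. One plausible reading is a weighted sum of a semantic similarity and a keyword-overlap score, sketched below. `keyword_score` and `hybrid_score` are hypothetical helpers, and the term-overlap measure is a deliberately simple stand-in for a real keyword ranker such as BM25.

```python
def keyword_score(query: str, text: str) -> float:
    """Fraction of distinct query terms that appear in the text."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    words = set(text.lower().split())
    return len(terms & words) / len(terms)


def hybrid_score(semantic: float, keyword: float,
                 semantic_weight: float = 0.7,
                 keyword_weight: float = 0.3) -> float:
    """Weighted sum of the semantic and keyword scores (both in 0..1)."""
    return semantic_weight * semantic + keyword_weight * keyword
```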

### Temporal Context Decay
Recent files are weighted higher:

```python
# Files modified in last 24h: +20% boost
# Files modified in last 7 days: +10% boost
# Files older than 30 days: -10% penalty
```
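
Read as multipliers on a relevance score, these rules could look like the sketch below. The multiplicative interpretation and the `recency_multiplier` name are assumptions. Files between 7 and 30 days old are left unchanged because no adjustment is listed for them.

```python
from datetime import datetime, timedelta, timezone


def recency_multiplier(modified: datetime, now: datetime) -> float:
    """Score multiplier based on how long ago the file was modified."""
    age = now - modified
    if age <= timedelta(hours=24):
        return 1.20  # +20% boost
    if age <= timedelta(days=7):
        return 1.10  # +10% boost
    if age > timedelta(days=30):
        return 0.90  # -10% penalty
    return 1.00      # 7 to 30 days: no adjustment listed


now = datetime(2025, 10, 11, 15, 30, tzinfo=timezone.utc)
boosted = 0.80 * recency_multiplier(now - timedelta(hours=2), now)
```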

### Code-Aware Chunking
Respects code structure:

```python
# Split at function boundaries
# Keep imports with first chunk
# Maintain docstring with function
# Overlap 50 tokens between chunks
```
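
For Python sources, the first three rules can be approximated with the standard `ast` module. This is a sketch only: `chunk_python_source` is a hypothetical helper, it handles top-level definitions only, it omits the 50-token overlap, and other languages would need their own parsers.

```python
import ast


def chunk_python_source(source: str) -> list[str]:
    """Split a module into chunks at top-level function/class boundaries."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: list[str] = []
    start = 0  # 0-based index of the first line not yet emitted
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Decorators sit above the def/class line, so start at the first one.
            first = min([node.lineno] + [d.lineno for d in node.decorator_list]) - 1
            if first > start:
                # Module docstring, imports, and other code before the definition.
                chunks.append("\n".join(lines[start:first]))
            # The definition itself, including its docstring and body.
            chunks.append("\n".join(lines[first:node.end_lineno]))
            start = node.end_lineno
    if start < len(lines):
        chunks.append("\n".join(lines[start:]))
    return [c for c in chunks if c.strip()]  # drop whitespace-only gaps
```

Because the module docstring and imports precede the first definition, they end up together in the first chunk, and each function keeps its own docstring.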

## Related Commands
- `/sc:memory index` - Index project
- `/sc:memory search` - Semantic search
- `/sc:memory similar` - Find similar files
- `/sc:memory stats` - Statistics
- `/sc:context refresh` - Refresh dynamic context

---

**Version**: 1.0.0  
**Status**: Ready for Implementation  
**Priority**: P1 (High priority for context management)