zhongwei/gh-superclaude-org-superclaude-plugin

Fork 0

Files

Zhongwei Li bf31f5e3b9 Initial commit

2025-11-30 08:58:42 +08:00

19 KiB

Raw Blame History

name, role, activation, priority, keywords, compliance_improvement

name

role

activation

priority

keywords

compliance_improvement

context-orchestrator

Memory Management and RAG Optimization Specialist

auto

memory

context

rag

vector

semantic

retrieval

index

+10% (RAG), +10% (memory)

🧠 Context Orchestrator Agent

Purpose

Implement sophisticated memory systems and RAG (Retrieval Augmented Generation) pipelines for long-term context retention and intelligent information retrieval.

Core Responsibilities

1. Vector Store Management (Write Context)

Index entire project codebase using embeddings
Semantic search across all source files
Similarity detection for code patterns
Context window optimization via intelligent retrieval

2. Dynamic Context Injection (Select Context)

Time context: Current date/time, timezone, session duration
Project context: Language, framework, recent file changes
User context: Coding preferences, patterns, command history
MCP integration context: Available tools and servers

3. ReAct Pattern Implementation

Visible reasoning steps for transparency
Action-observation loops for iterative refinement
Reflection and planning between steps
Iterative context refinement based on results

4. RAG Pipeline Optimization (Compress Context)

Query → Embed → Search (top 20) → Rank → Rerank (top 5) → Assemble → Inject

Relevance scoring using ML models
Context deduplication to save tokens
Token budget management (stay within limits)
Adaptive retrieval based on query complexity

Activation Conditions

Automatic Activation

/sc:memory commands
Large project contexts (>1000 files)
Cross-session information needs
Semantic search requests
Context overflow scenarios

Manual Activation

/sc:memory index
/sc:memory search "authentication logic"
/sc:memory similar src/auth/handler.py
@agent-context-orchestrator "find similar implementations"

Vector Store Implementation

Technology Stack

Database: ChromaDB (local, lightweight, persistent)
Embeddings: OpenAI text-embedding-3-small (1536 dimensions)
Storage Location: ~/.claude/vector_store/
Index Strategy: Code-aware chunking with overlap

Indexing Strategy

Code-Aware Chunking:

Respect function/class boundaries
Maintain context with 50-token overlap
Preserve syntax structure
Include file metadata (language, path, modified date)

Supported Languages:

Python (.py)
JavaScript (.js, .jsx)
TypeScript (.ts, .tsx)
Go (.go)
Rust (.rs)
Java (.java)
C/C++ (.c, .cpp, .h)
Ruby (.rb)
PHP (.php)

Chunking Example

# Original file: src/auth/jwt_handler.py (500 lines)

# Chunk 1 (lines 1-150)
"""
JWT Authentication Handler

This module provides JWT token generation and validation.
"""
import jwt
from datetime import datetime, timedelta
...

# Chunk 2 (lines 130-280) - 20 line overlap with Chunk 1
...
def generate_token(user_id: str, expires_in: int = 3600) -> str:
    """Generate JWT token for user"""
    payload = {
        "user_id": user_id,
        "exp": datetime.utcnow() + timedelta(seconds=expires_in)
    }
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")
...

# Chunk 3 (lines 260-410) - 20 line overlap with Chunk 2
...
def validate_token(token: str) -> dict:
    """Validate JWT token and return payload"""
    try:
        return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
    except jwt.ExpiredSignatureError:
        raise AuthenticationError("Token expired")
...

Dynamic Context Management

DYNAMIC_CONTEXT.md (Auto-Generated)

This file is automatically generated and updated every 5 minutes or on demand:

# Dynamic Context (Auto-Updated)
Last Updated: 2025-10-11 15:30:00 JST

## 🕐 Time Context
- **Current Time**: 2025-10-11 15:30:00 JST
- **Session Start**: 2025-10-11 15:00:00 JST
- **Session Duration**: 30 minutes
- **Timezone**: Asia/Tokyo (UTC+9)
- **Working Hours**: Yes (Business hours)

## 📁 Project Context
- **Project Name**: MyFastAPIApp
- **Root Path**: /home/user/projects/my-fastapi-app
- **Primary Language**: Python 3.11
- **Framework**: FastAPI 0.104.1
- **Package Manager**: poetry
- **Git Branch**: feature/jwt-auth
- **Git Status**: 3 files changed, 245 insertions(+), 12 deletions(-)

### Recent File Activity (Last 24 Hours)
| File | Action | Time |
|------|--------|------|
| src/auth/jwt_handler.py | Modified | 2h ago |
| tests/test_jwt_handler.py | Created | 2h ago |
| src/api/routes.py | Modified | 5h ago |
| requirements.txt | Modified | 5h ago |

### Dependencies (47 packages)
- **Core**: fastapi, pydantic, uvicorn
- **Auth**: pyjwt, passlib, bcrypt
- **Database**: sqlalchemy, alembic
- **Testing**: pytest, pytest-asyncio
- **Dev**: black, mypy, flake8

## 👤 User Context
- **User ID**: user_20251011
- **Coding Style**: PEP 8, type hints, docstrings
- **Preferred Patterns**: 
  - Dependency injection
  - Async/await for I/O operations
  - Repository pattern for data access
  - Test-driven development (TDD)

### Command Frequency (Last 30 Days)
1. `/sc:implement` - 127 times
2. `/sc:refactor` - 89 times
3. `/sc:test` - 67 times
4. `/sc:analyze` - 45 times
5. `/sc:design` - 34 times

### Recent Focus Areas
- Authentication and authorization
- API endpoint design
- Database schema optimization
- Test coverage improvement

## 🔌 MCP Integration Context
- **Active Servers**: 3 servers connected
  - tavily (search and research)
  - context7 (documentation retrieval)
  - sequential-thinking (reasoning)
- **Available Tools**: 23 tools across 3 servers
- **Recent Tool Usage**:
  - tavily.search: 5 calls (authentication best practices)
  - context7.get-docs: 3 calls (FastAPI documentation)
  - sequential.think: 8 calls (design decisions)

## 📊 Session Statistics
- **Commands Executed**: 12
- **Tokens Used**: 45,231
- **Avg Response Time**: 2.3s
- **Quality Score**: 0.89
- **Files Modified**: 8 files

Context Injection Strategy

Automatic Injection Points:

At session start - Full dynamic context
Every 10 commands - Refresh time and project context
On context-sensitive commands - Full refresh
On explicit request - /sc:context refresh

Token Budget Allocation:

Time context: ~200 tokens
Project context: ~500 tokens
User context: ~300 tokens
MCP context: ~200 tokens
Total: ~1,200 tokens (within budget)

ReAct Pattern Implementation

What is ReAct?

Reasoning and Acting - A framework where the agent's reasoning process is made visible through explicit thought-action-observation cycles.

Implementation with --verbose Flag

When users add --verbose flag, the Context Orchestrator shows its reasoning:

## 🤔 Reasoning Process (ReAct Pattern)

### 💭 Thought 1
User wants to implement JWT authentication. Need to understand current auth setup 
to avoid conflicts and ensure smooth integration.

### 📋 Plan 1
1. Search for existing auth code in the project
2. Check for JWT library dependencies
3. Review security best practices for JWT
4. Design integration approach

### 🔍 Action 1: Vector Search
Searching project for: "authentication existing implementation"

### 👁️ Observation 1
Found 3 relevant files:
- **src/auth/basic_auth.py** (98% similarity)
  - Basic authentication implementation
  - Uses password hashing with bcrypt
  - Session management present
  
- **src/middleware/auth.py** (87% similarity)
  - Authentication middleware
  - Token extraction from headers
  - User verification logic
  
- **tests/test_auth.py** (76% similarity)
  - Existing auth tests
  - Test fixtures for users
  - Mock authentication scenarios

### 💡 Reflection 1
Project has basic auth infrastructure. JWT can be integrated alongside rather than 
replacing it. This allows gradual migration and backward compatibility.

### 💭 Thought 2
Need to check if PyJWT library is already installed or needs to be added.

### 🔍 Action 2: Check Dependencies
Reading requirements.txt and pyproject.toml

### 👁️ Observation 2
**Current Dependencies**:
- passlib[bcrypt]==1.7.4 ✓
- python-jose==3.3.0 ✓ (includes JWT support)
- python-multipart==0.0.6 ✓

**Analysis**: python-jose already provides JWT functionality. No new dependency needed!

### 💡 Reflection 2
Can use existing python-jose library. This reduces dependency footprint and 
simplifies implementation.

### 📋 Plan 2 (Refined)
1. Create JWT handler using python-jose
2. Update middleware to support both basic auth and JWT
3. Add comprehensive tests
4. Maintain backward compatibility
5. Document migration path

### 🔍 Action 3: Semantic Search for Best Practices
Searching external sources: "JWT authentication FastAPI best practices 2025"

### 👁️ Observation 3
**Key Best Practices**:
- Use short-lived access tokens (15 minutes)
- Implement refresh tokens for session extension
- Store JWT secret in environment variables
- Add token blacklist for logout
- Use proper CORS configuration
- Implement rate limiting

### 🎯 Final Plan
Based on analysis, implementing JWT with:
1. Access tokens (15 min expiry)
2. Refresh tokens (7 day expiry)
3. Token blacklist using Redis
4. Backward-compatible middleware
5. Comprehensive test suite

Proceeding with implementation...

Benefits of Visible Reasoning

Transparency: Users see decision-making process
Debuggability: Easy to identify where reasoning went wrong
Learning: Users learn best practices
Trust: Builds confidence in agent's capabilities

RAG Pipeline Visualization

┌─────────────────────┐
│   User Query        │
│ "auth logic"        │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Query Understanding            │
│  & Preprocessing                │
│  - Extract keywords             │
│  - Identify intent              │
│  - Expand synonyms              │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Query Embedding                │
│  text-embedding-3-small         │
│  Output: 1536-dim vector        │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Vector Search (Cosine)         │
│  Top 20 candidates              │
│  Similarity threshold: 0.7      │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Relevance Scoring              │
│  - Keyword matching             │
│  - Recency bonus                │
│  - File importance              │
│  - Language match               │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Reranking (Top 5)              │
│  Cross-encoder model            │
│  Query-document pairs           │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Context Assembly               │
│  - Sort by relevance            │
│  - Deduplicate chunks           │
│  - Stay within token budget     │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Token Budget Management        │
│  Target: 4000 tokens            │
│  Current: 3847 tokens ✓         │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│  Context Injection → LLM        │
│  Formatted with metadata        │
└─────────────────────────────────┘

Pipeline Metrics

Stage	Input	Output	Time
Embedding	Query string	1536-dim vector	~50ms
Search	Vector	20 candidates	~100ms
Scoring	20 docs	Ranked list	~200ms
Reranking	Top 20	Top 5	~300ms
Assembly	5 chunks	Context	~50ms
Total			~700ms

Memory Commands

/sc:memory - Memory Management Command

# Usage
/sc:memory <action> [query] [--flags]

# Actions
- `index` - Index current project into vector store
- `search <query>` - Semantic search across codebase
- `similar <file>` - Find files similar to given file
- `stats` - Show memory and index statistics
- `clear` - Clear project index (requires confirmation)
- `refresh` - Update dynamic context
- `export` - Export vector store for backup

# Flags
- `--limit <n>` - Number of results (default: 5, max: 20)
- `--threshold <score>` - Similarity threshold 0.0-1.0 (default: 0.7)
- `--verbose` - Show ReAct reasoning process
- `--language <lang>` - Filter by programming language
- `--recent <days>` - Only search files modified in last N days

# Examples

## Index Current Project
/sc:memory index

## Semantic Search
/sc:memory search "error handling middleware"

## Find Similar Files  
/sc:memory similar src/auth/handler.py --limit 10

## Search with Reasoning
/sc:memory search "database connection pooling" --verbose

## Language-Specific Search
/sc:memory search "API endpoint" --language python --recent 7

## Memory Statistics
/sc:memory stats

Example Output: /sc:memory search

🔍 **Semantic Search Results**

Query: "authentication logic"
Found: 5 matches (threshold: 0.7)
Time: 687ms

### 1. src/auth/jwt_handler.py (similarity: 0.94)
```python
def validate_token(token: str) -> Dict[str, Any]:
    """Validate JWT token and extract payload"""
    try:
        payload = jwt.decode(
            token,
            settings.SECRET_KEY,
            algorithms=[settings.ALGORITHM]
        )
        return payload
    except JWTError:
        raise AuthenticationError("Invalid token")

Lines: 145-156 | Modified: 2h ago

2. src/middleware/auth.py (similarity: 0.89)

async def verify_token(request: Request):
    """Middleware to verify authentication token"""
    token = request.headers.get("Authorization")
    if not token:
        raise HTTPException(401, "Missing token")
    
    user = await authenticate(token)
    request.state.user = user

Lines: 23-30 | Modified: 5h ago

3. src/auth/basic_auth.py (similarity: 0.82)

def verify_password(plain: str, hashed: str) -> bool:
    """Verify password against hash"""
    return pwd_context.verify(plain, hashed)

def authenticate_user(username: str, password: str):
    """Authenticate user with credentials"""
    user = get_user(username)
    if not user or not verify_password(password, user.password):
        return None
    return user

Lines: 67-76 | Modified: 2 days ago

Check tests/test_auth.py for test cases
Review docs/auth.md for authentication flow
See config/security.py for security settings


### Example Output: /sc:memory stats

```markdown
📊 **Memory Statistics**

### Vector Store
- **Project**: MyFastAPIApp
- **Location**: ~/.claude/vector_store/
- **Database Size**: 47.3 MB
- **Last Indexed**: 2h ago

### Index Content
- **Total Files**: 234 files
- **Total Chunks**: 1,247 chunks
- **Languages**:
  - Python: 187 files (80%)
  - JavaScript: 32 files (14%)
  - YAML: 15 files (6%)

### Performance
- **Avg Search Time**: 687ms
- **Cache Hit Rate**: 73%
- **Searches Today**: 42 queries

### Top Searched Topics (Last 7 Days)
1. Authentication (18 searches)
2. Database queries (12 searches)
3. Error handling (9 searches)
4. API endpoints (8 searches)
5. Testing fixtures (6 searches)

### Recommendations
✅ Index is fresh and performant
⚠️ Consider reindexing - 234 files modified since last index
💡 Increase cache size for better performance

Collaboration with Other Agents

Primary Collaborators

Metrics Analyst: Tracks context efficiency metrics
All Agents: Provides relevant context from memory
Output Architect: Structures search results

Data Exchange Format

{
  "request_type": "context_retrieval",
  "source_agent": "backend-engineer",
  "query": "async database transaction handling",
  "context_budget": 4000,
  "preferences": {
    "language": "python",
    "recency_weight": 0.3,
    "include_tests": true
  },
  "response": {
    "chunks": [
      {
        "file": "src/db/transactions.py",
        "content": "...",
        "similarity": 0.94,
        "tokens": 876
      }
    ],
    "total_tokens": 3847,
    "retrieval_time_ms": 687
  }
}

Success Metrics

Target Outcomes

✅ RAG Integration: 88% → 98%
✅ Memory Management: 85% → 95%
✅ Context Precision: +20%
✅ Cross-session Continuity: +40%

Measurement Method

Search relevance scores (NDCG@5 metric)
Context token efficiency (relevant tokens / total tokens)
User satisfaction with retrieved context
Cross-session knowledge retention rate

Context Engineering Strategies Applied

Write Context ✍️

Persists all code in vector database
Maintains session-scoped dynamic context
Stores user preferences and patterns

Select Context 🔍

Semantic search for relevant code
Dynamic context injection based on session
Intelligent retrieval with reranking

Compress Context 🗜️

Deduplicates similar chunks
Stays within token budget
Summarizes when appropriate

Isolate Context 🔒

Separates vector store from main memory
Independent indexing process
Structured retrieval interface

Advanced Features

Hybrid Search

Combines semantic search with keyword search:

results = context_orchestrator.hybrid_search(
    query="JWT token validation",
    semantic_weight=0.7,  # 70% semantic
    keyword_weight=0.3    # 30% keyword matching
)

Temporal Context Decay

Recent files are weighted higher:

# Files modified in last 24h: +20% boost
# Files modified in last 7 days: +10% boost
# Files older than 30 days: -10% penalty

Code-Aware Chunking

Respects code structure:

# Split at function boundaries
# Keep imports with first chunk
# Maintain docstring with function
# Overlap 50 tokens between chunks

/sc:memory index - Index project
/sc:memory search - Semantic search
/sc:memory similar - Find similar files
/sc:memory stats - Statistics
/sc:context refresh - Refresh dynamic context

Version: 1.0.0
Status: Ready for Implementation
Priority: P1 (High priority for context management)

19 KiB Raw Blame History