# AgentDB Learning Flow: How Skills Learn and Improve
**Purpose**: Complete explanation of how AgentDB stores, retrieves, and uses creation interactions to improve future skill generation.
---
## 🎯 **The Big Picture: Learning Feedback Loop**
```
User Requests Skill Creation
        ↓
Agent Creator Uses /references + AgentDB Learning
        ↓
Skill Created & Deployed
        ↓
Creation Decision Stored in AgentDB
        ↓
Future Requests Benefit from Past Learning
        ↓
(Loop continues with each new creation)
```
---
## 📊 **What Exactly Gets Stored in AgentDB?**
### **1. Creation Episodes (Reflexion Store)**
**When**: Every time a skill is created
**Format**: Structured episode data
```python
# From _store_creation_decision():
session_id = f"creation-{datetime.now().strftime('%Y%m%d-%H%M%S')}"
# Data stored:
{
    "session_id": "creation-20251024-103406",
    "task": "agent_creation_decision",
    "reward": 85.0,                      # success probability * 100
    "success": True,                     # whether the creation succeeded
    "input": user_input,                 # "Create financial analysis agent..."
    "output": intelligence,              # template choice, improvements, etc.
    "latency": creation_time_ms,
    "critique": auto_generated_analysis
}
```
**Real Example** (from our tests):
```bash
agentdb reflexion retrieve "agent creation" 5 0.0
# Retrieved episodes show:
#1: Episode 1
# Task: agent_creation_decision
# Reward: 0.00 ← Note: Our test returned 0.00 (no success feedback yet)
# Success: No
# Similarity: 0.785
```
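Under the hood, this is a thin wrapper over the `agentdb reflexion store` command listed in the command reference at the end of this document. A minimal sketch of the storage call, assuming `_execute_agentdb_command` simply runs the CLI, and that the `intelligence` object carries hypothetical `creation_succeeded` and `critique` fields alongside the ones shown elsewhere:
```python
import json
from datetime import datetime

def _store_creation_decision(self, user_input: str, intelligence) -> None:
    """Persist one creation episode so future requests can learn from it (sketch)."""
    session_id = f"creation-{datetime.now().strftime('%Y%m%d-%H%M%S')}"

    # Maps onto:
    # agentdb reflexion store <session_id> <task> <reward> <success> [critique] [input] [output]
    self._execute_agentdb_command([
        "agentdb", "reflexion", "store",
        session_id,
        "agent_creation_decision",
        str(intelligence.success_probability * 100),             # reward = success probability * 100
        "true" if intelligence.creation_succeeded else "false",  # hypothetical field
        intelligence.critique or "",                             # hypothetical auto-generated analysis
        user_input,
        json.dumps(intelligence.__dict__, default=str),          # template choice, improvements, etc.
    ])
```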
### **2. Causal Relationships (Causal Edges)**
**When**: After each creation decision
**Purpose**: Learn cause→effect patterns
```python
# From _store_creation_decision():
if intelligence.template_choice:
self._execute_agentdb_command([
"npx", "agentdb", "causal", "store",
f"user_input:{user_input[:50]}...", # Cause
f"template_selected:{intelligence.template_choice}", # Effect
"created_successfully" # Outcome
])
# Stored as causal edge:
{
"cause": "user_input:Create financial analysis agent for stocks...",
"effect": "template_selected:financial-analysis-template",
"uplift": 0.25, # Calculated from success rate
"confidence": 0.8,
"sample_size": 1
}
```
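The `uplift` above is described as "calculated from success rate", but the formula itself isn't shown in the excerpts. One plausible reading, purely as an illustration: uplift is the template's observed success rate minus the baseline success rate across all templates. A hedged sketch:
```python
def estimate_uplift(template_episodes: list[dict], all_episodes: list[dict]) -> float:
    """Illustrative only: uplift as the gap between a template's success rate
    and the overall baseline. The actual AgentDB formula is not shown here."""
    if not template_episodes or not all_episodes:
        return 0.0
    template_rate = sum(e["success"] for e in template_episodes) / len(template_episodes)
    baseline_rate = sum(e["success"] for e in all_episodes) / len(all_episodes)
    return round(template_rate - baseline_rate, 2)

# e.g. 8/10 successes with financial-template vs 11/20 overall -> uplift ≈ 0.25
```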
### **3. Skills Database (Learned Patterns)**
**When**: When patterns are identified from multiple episodes
**Purpose**: Store reusable skills and patterns
```python
# From _enhance_with_real_agentdb():
skills_result = self._execute_agentdb_command([
"agentdb", "skill", "search", user_input, "5"
])
# Skills stored as:
{
"name": "financial-analysis-skill",
"description": "Pattern for financial analysis agents",
"code": "learned_code_patterns",
"success_rate": 0.85,
"uses": 12,
"domain": "finance"
}
```
---
## 🔍 **How Data Is Retrieved and Used**
### **Step 1: User Makes Request**
```
"Create financial analysis agent for stock market data"
```
### **Step 2: AgentDB Queries Past Episodes**
```python
# From _enhance_with_real_agentdb():
episodes_result = self._execute_agentdb_command([
"agentdb", "reflexion", "retrieve", user_input, "3", "0.6"
])
```
**What this query does:**
- Finds similar past creation requests
- Returns top 3 most relevant episodes
- Minimum similarity threshold: 0.6
- Includes success rates and outcomes
**Example Retrieved Data:**
```python
episodes = [
{
"task": "agent_creation_decision",
"success": True,
"reward": 85.0,
"input": "Create stock analysis tool with RSI indicators",
"template_used": "financial-analysis-template"
},
{
"task": "agent_creation_decision",
"success": False,
"reward": 0.0,
"input": "Build financial dashboard",
"template_used": "generic-dashboard-template"
}
]
```
### **Step 3: Calculate Success Patterns**
```python
# From _parse_episodes_from_output():
if episodes:
success_rate = sum(1 for e in episodes if e.get('success', False)) / len(episodes)
intelligence.success_probability = success_rate
# Example calculation:
# Episodes: [success=True, success=False, success=True]
# Success rate: 2/3 = 0.667
```
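`_parse_episodes_from_output()` is referenced above but not shown. A minimal sketch, assuming the CLI prints the labelled text format from the `agentdb reflexion retrieve` example earlier (the real parser may differ):
```python
import re

def _parse_episodes_from_output(self, output: str) -> list[dict]:
    """Turn the labelled text from `agentdb reflexion retrieve` into episode dicts.
    Sketch: field names follow the example output earlier in this document."""
    episodes = []
    current = None
    for line in output.splitlines():
        line = line.strip()
        if re.match(r"#\d+:", line):          # "#1: Episode 1" starts a new record
            current = {}
            episodes.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip().lower(), value.strip()
            if key == "task":
                current["task"] = value
            elif key == "reward":
                current["reward"] = float(value)
            elif key == "success":
                current["success"] = value.lower().startswith("y")
            elif key == "similarity":
                current["similarity"] = float(value)
    return episodes
```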
### **Step 4: Query Causal Effects**
```python
# From _enhance_with_real_agentdb():
causal_result = self._execute_agentdb_command([
"agentdb", "causal", "query",
f"use_{domain}_template", "", "0.7", "0.1", "5"
])
```
**What this learns:**
- Which templates work best for which domains
- Historical success rates by template
- Causal relationships between inputs and outcomes
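The `query_causal_effects()` helper referenced in the data-flow code later in this document would only need to wrap the `agentdb causal query` call above and reshape its output into the effect dicts used in the next step. A sketch, with the output parsing left as an assumed hypothetical helper:
```python
def query_causal_effects(self, domain: str) -> list[dict]:
    """Return effect dicts like {"cause": ..., "effect": ..., "uplift": ...}
    for template selection (sketch)."""
    result = self._execute_agentdb_command([
        "agentdb", "causal", "query",
        f"use_{domain}_template",  # cause filter
        "",                        # effect filter (any)
        "0.7",                     # min confidence
        "0.1",                     # min uplift
        "5",                       # limit
    ])
    # _parse_causal_edges_from_output is hypothetical: it would extract one
    # dict per edge from the CLI's printed output.
    return self._parse_causal_edges_from_output(result)
```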
### **Step 5: Select Optimal Template**
```python
# From causal effects analysis:
effects = [
{"cause": "finance_domain", "effect": "financial-template", "uplift": 0.25},
{"cause": "finance_domain", "effect": "generic-template", "uplift": 0.10}
]
# Choose best effect:
best_effect = max(effects, key=lambda x: x.get('uplift', 0))
intelligence.template_choice = "financial-analysis-template"
intelligence.mathematical_proof = f"Causal uplift: {best_effect['uplift']:.2%}"
```
---
## 🔄 **Complete Learning Flow Example**
### **First Creation (No Learning Data)**
```
User: "Create financial analysis agent"
AgentDB Query: reflexion retrieve "financial analysis" (0 results)
Template Selection: Uses /references guidelines (static)
Choice: financial-analysis-template
Storage:
- Episode stored with success=unknown
- Causal edge: "financial analysis" → "financial-template"
```
### **Tenth Creation (Rich Learning Data)**
```
User: "Create financial analysis agent for cryptocurrency"
AgentDB Query: reflexion retrieve "financial analysis" (12 results)
Success Analysis:
- financial-template: 80% success (8/10)
- generic-template: 40% success (2/5)
Causal Query: causal query "use_financial_template"
Result: financial-template shows 0.25 uplift for finance domain
Enhanced Decision:
- Template: financial-template (based on 80% success rate)
- Confidence: 0.80 (from historical data)
- Mathematical Proof: "Causal uplift: 25%"
- Learned Improvements: ["Include RSI indicators", "Add volatility analysis"]
```
---
## 📈 **How Improvement Actually Happens**
### **1. Success Rate Learning**
**Pattern**: Template success rates improve over time
```python
# After 5 uses of financial-template:
success_rate = successful_creations / total_creations
# Example: 4/5 = 0.8 (80% success rate)
# This influences future template selection:
if success_rate > 0.7:
prefer_this_template = True
```
### **2. Feature Learning**
**Pattern**: Agent learns which features work for which domains
```python
# From successful episodes:
successful_features = extract_common_features([
"RSI indicators", "MACD analysis", "volume analysis"
])
# Added to learned improvements:
intelligence.learned_improvements = [
"Include RSI indicators (82% success rate)",
"Add MACD analysis (75% success rate)",
"Volume analysis recommended (68% success rate)"
]
```
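`extract_common_features()` isn't defined in the excerpts. The snippet above passes bare feature names, but producing per-feature success rates like the annotations requires knowing which episodes succeeded; the sketch below assumes each episode carries a hypothetical `features` list:
```python
from collections import Counter

def extract_common_features(episodes: list[dict]) -> list[str]:
    """Rank features by frequency across episodes and annotate each with its
    observed success rate (sketch; the `features` field is an assumption)."""
    appearances = Counter()
    successes = Counter()
    for episode in episodes:
        for feature in episode.get("features", []):
            appearances[feature] += 1
            if episode.get("success"):
                successes[feature] += 1

    improvements = []
    for feature, total in appearances.most_common():
        rate = successes[feature] / total
        improvements.append(f"Include {feature} ({rate:.0%} success rate)")
    return improvements
```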
### **3. Domain Specialization**
**Pattern**: Templates become domain-specialized
```python
# Causal learning shows:
causal_edges = [
{"cause": "finance_domain", "effect": "financial-template", "uplift": 0.25},
{"cause": "climate_domain", "effect": "climate-template", "uplift": 0.30},
{"cause": "ecommerce_domain", "effect": "ecommerce-template", "uplift": 0.20}
]
# Future decisions use this pattern:
if "finance" in user_input:
recommended_template = "financial-template" # 25% uplift
```
---
## 🎯 **Key Insights About the Learning Process**
### **1. Learning is Cumulative**
- Every creation adds to the knowledge base
- More episodes = better pattern recognition
- Success rates become more reliable over time
### **2. Learning is Domain-Specific**
- Templates specialize for particular domains
- Cross-domain patterns are identified
- Recommendations shift from generic to specialized as domain evidence accumulates
### **3. Learning is Measurable**
- Success rates are tracked numerically
- Causal effects have confidence scores
- Mathematical proofs provide evidence
### **4. Learning is Adaptive**
- Failed attempts influence future decisions
- Successful patterns are reinforced
- System self-corrects based on outcomes
---
## 🔧 **Technical Implementation Details**
### **Storage Commands Used**
```python
# 1. Store episode (reflexion)
agentdb reflexion store <session_id> <task> <reward> <success> [critique] [input] [output]
# 2. Store causal edge
agentdb causal add-edge <cause> <effect> <uplift> [confidence] [sample-size]
# 3. Store skill pattern
agentdb skill create <name> <description> [code]
# 4. Query episodes
agentdb reflexion retrieve <task> [k] [min-reward] [only-failures] [only-successes]
# 5. Query causal effects
agentdb causal query [cause] [effect] [min-confidence] [min-uplift] [limit]
# 6. Search skills
agentdb skill search <query> [k]
```
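All of these commands flow through the `_execute_agentdb_command()` helper seen in the code excerpts. A minimal sketch of such a helper, assuming it shells out to the `agentdb` CLI and returns stdout (the real implementation may handle errors and the `npx` prefix differently):
```python
import subprocess

def _execute_agentdb_command(self, args: list[str], timeout: int = 30) -> str:
    """Run one agentdb CLI command and return its stdout (sketch)."""
    try:
        completed = subprocess.run(
            args,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
        return completed.stdout
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # On failure, return an empty string so callers fall back to the
        # static /references guidance instead of crashing.
        return ""
```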
### **Data Flow in Code**
```python
def enhance_agent_creation(user_input, domain):
# Step 1: Retrieve relevant past episodes
episodes = query_similar_episodes(user_input)
# Step 2: Analyze success patterns
success_rate = calculate_success_rate(episodes)
# Step 3: Query causal relationships
causal_effects = query_causal_effects(domain)
# Step 4: Search for relevant skills
relevant_skills = search_skills(user_input)
# Step 5: Make enhanced decision
intelligence = AgentDBIntelligence(
template_choice=select_best_template(causal_effects),
success_probability=success_rate,
learned_improvements=extract_improvements(relevant_skills),
mathematical_proof=generate_causal_proof(causal_effects)
)
# Step 6: Store this decision for future learning
store_creation_decision(user_input, intelligence)
return intelligence
```
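As a usage example (with hypothetical output values), the flow above would be driven like this:
```python
intelligence = enhance_agent_creation(
    user_input="Create financial analysis agent for stock market data",
    domain="finance",
)

print(intelligence.template_choice)       # e.g. "financial-analysis-template"
print(intelligence.success_probability)   # e.g. 0.8, from historical episodes
print(intelligence.learned_improvements)  # e.g. ["Include RSI indicators (82% success rate)", ...]
print(intelligence.mathematical_proof)    # e.g. "Causal uplift: 25.00%"
```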
---
## 🎉 **Summary: From "Magic" to Understandable Process**
**What seemed like magic is actually a systematic learning process:**
1. **Store** every creation decision with context and outcomes
2. **Query** past decisions when new requests arrive
3. **Analyze** patterns of success and failure
4. **Enhance** new decisions with learned insights
5. **Improve** continuously with each interaction
The AgentDB bridge turns Agent Creator from a **static tool** into a **learning system** that gets smarter with every skill created!