
AgentDB Learning Flow: How Skills Learn and Improve

Purpose: Complete explanation of how AgentDB stores, retrieves, and uses creation interactions to improve future skill generation.


🎯 The Big Picture: Learning Feedback Loop

User Request Skill Creation
        ↓
Agent Creator Uses /references + AgentDB Learning
        ↓
Skill Created & Deployed
        ↓
Creation Decision Stored in AgentDB
        ↓
Future Requests Benefit from Past Learning
        ↓
(Loop continues with each new creation)

📊 What Exactly Gets Stored in AgentDB?

1. Creation Episodes (Reflexion Store)

When: Every time a skill is created
Format: Structured episode data

# From _store_creation_decision():
session_id = f"creation-{datetime.now().strftime('%Y%m%d-%H%M%S')}"

# Data stored:
{
    "session_id": "creation-20251024-103406",
    "task": "agent_creation_decision",
    "reward": "85.0",  # Success probability * 100
    "success": true,    # If creation succeeded
    "input": user_input,  # "Create financial analysis agent..."
    "output": intelligence,  # Template choice, improvements, etc.
    "latency": creation_time_ms,
    "critique": auto_generated_analysis
}

Real Example (from our tests):

agentdb reflexion retrieve "agent creation" 5 0.0

# Retrieved episodes show:
#1: Episode 1
#  Task: agent_creation_decision
#  Reward: 0.00  ← Note: Our test returned 0.00 (no success feedback yet)
#  Success: No
#  Similarity: 0.785
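The retrieved text above has to be turned back into structured episodes before it can be analyzed. A minimal sketch of what a parser like `_parse_episodes_from_output` might do — the line layout and field names are assumptions based on the example output above, not a confirmed CLI format:

```python
def parse_episodes_from_output(output: str) -> list[dict]:
    """Parse agentdb reflexion CLI output into episode dicts (hypothetical format)."""
    episodes = []
    current = None
    for raw in output.splitlines():
        line = raw.strip().lstrip("#").strip()
        if line.startswith("Task:"):
            # Each "Task:" line opens a new episode record
            current = {"task": line.split(":", 1)[1].strip()}
            episodes.append(current)
        elif current is not None and line.startswith("Reward:"):
            current["reward"] = float(line.split(":", 1)[1])
        elif current is not None and line.startswith("Success:"):
            current["success"] = line.split(":", 1)[1].strip() == "Yes"
        elif current is not None and line.startswith("Similarity:"):
            current["similarity"] = float(line.split(":", 1)[1])
    return episodes
```

Feeding it the sample output above yields one episode with `task="agent_creation_decision"`, `reward=0.0`, `success=False`, `similarity=0.785`.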

2. Causal Relationships (Causal Edges)

When: After each creation decision
Purpose: Learn cause→effect patterns

# From _store_creation_decision():
if intelligence.template_choice:
    self._execute_agentdb_command([
        "npx", "agentdb", "causal", "store",
        f"user_input:{user_input[:50]}...",           # Cause
        f"template_selected:{intelligence.template_choice}",  # Effect
        "created_successfully"                          # Outcome
    ])

# Stored as causal edge:
{
    "cause": "user_input:Create financial analysis agent for stocks...",
    "effect": "template_selected:financial-analysis-template",
    "uplift": 0.25,  # Calculated from success rate
    "confidence": 0.8,
    "sample_size": 1
}

3. Skills Database (Learned Patterns)

When: When patterns are identified from multiple episodes
Purpose: Store reusable skills and patterns

# From _enhance_with_real_agentdb():
skills_result = self._execute_agentdb_command([
    "agentdb", "skill", "search", user_input, "5"
])

# Skills stored as:
{
    "name": "financial-analysis-skill",
    "description": "Pattern for financial analysis agents",
    "code": "learned_code_patterns",
    "success_rate": 0.85,
    "uses": 12,
    "domain": "finance"
}

🔍 How Data Is Retrieved and Used

Step 1: User Makes Request

"Create financial analysis agent for stock market data"

Step 2: AgentDB Queries Past Episodes

# From _enhance_with_real_agentdb():
episodes_result = self._execute_agentdb_command([
    "agentdb", "reflexion", "retrieve", user_input, "3", "0.6"
])

What this query does:

  • Finds similar past creation requests
  • Returns top 3 most relevant episodes
  • Minimum similarity threshold: 0.6
  • Includes success rates and outcomes

Example Retrieved Data:

episodes = [
    {
        "task": "agent_creation_decision",
        "success": True,
        "reward": 85.0,
        "input": "Create stock analysis tool with RSI indicators",
        "template_used": "financial-analysis-template"
    },
    {
        "task": "agent_creation_decision",
        "success": False,
        "reward": 0.0,
        "input": "Build financial dashboard",
        "template_used": "generic-dashboard-template"
    }
]

Step 3: Calculate Success Patterns

# From _parse_episodes_from_output():
if episodes:
    success_rate = sum(1 for e in episodes if e.get('success', False)) / len(episodes)
    intelligence.success_probability = success_rate

# Example calculation:
# Episodes: [success=True, success=False, success=True]
# Success rate: 2/3 = 0.667
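The calculation above can be run directly against the example episodes from Step 2:

```python
# Episodes as retrieved in Step 2 (illustrative data)
episodes = [
    {"task": "agent_creation_decision", "success": True,  "reward": 85.0},
    {"task": "agent_creation_decision", "success": False, "reward": 0.0},
    {"task": "agent_creation_decision", "success": True,  "reward": 90.0},
]

# Same expression as in _parse_episodes_from_output()
success_rate = sum(1 for e in episodes if e.get('success', False)) / len(episodes)
# 2 of 3 episodes succeeded → success_rate ≈ 0.667
```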

Step 4: Query Causal Effects

# From _enhance_with_real_agentdb():
causal_result = self._execute_agentdb_command([
    "agentdb", "causal", "query",
    f"use_{domain}_template", "", "0.7", "0.1", "5"
])

What this learns:

  • Which templates work best for which domains
  • Historical success rates by template
  • Causal relationships between inputs and outcomes

Step 5: Select Optimal Template

# From causal effects analysis:
effects = [
    {"cause": "finance_domain", "effect": "financial-template", "uplift": 0.25},
    {"cause": "finance_domain", "effect": "generic-template", "uplift": 0.10}
]

# Choose best effect:
best_effect = max(effects, key=lambda x: x.get('uplift', 0))
intelligence.template_choice = best_effect['effect']  # "financial-template"
intelligence.mathematical_proof = f"Causal uplift: {best_effect['uplift']:.2%}"

🔄 Complete Learning Flow Example

First Creation (No Learning Data)

User: "Create financial analysis agent"
↓
AgentDB Query: reflexion retrieve "financial analysis" (0 results)
↓
Template Selection: Uses /references guidelines (static)
↓
Choice: financial-analysis-template
↓
Storage:
  - Episode stored with success=unknown
  - Causal edge: "financial analysis" → "financial-template"

Tenth Creation (Rich Learning Data)

User: "Create financial analysis agent for cryptocurrency"
↓
AgentDB Query: reflexion retrieve "financial analysis" (12 results)
↓
Success Analysis:
  - financial-template: 80% success (8/10)
  - generic-template: 40% success (2/5)
↓
Causal Query: causal query "use_financial_template"
↓
Result: financial-template shows 0.25 uplift for finance domain
↓
Enhanced Decision:
  - Template: financial-template (based on 80% success rate)
  - Confidence: 0.80 (from historical data)
  - Mathematical Proof: "Causal uplift: 25%"
  - Learned Improvements: ["Include RSI indicators", "Add volatility analysis"]
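The tenth-creation walkthrough above can be sketched as one decision function. Function name, input shapes, and the confidence-from-success-rate mapping are illustrative assumptions:

```python
def make_enhanced_decision(episodes_by_template: dict, causal_effects: list) -> dict:
    """Combine historical success rates with causal uplift to pick a template."""
    # Success analysis: per-template success rate from past episode outcomes
    rates = {
        tpl: sum(outcomes) / len(outcomes)
        for tpl, outcomes in episodes_by_template.items()  # outcomes: list of bools
    }
    # Causal analysis: pick the highest-uplift effect for this domain
    best = max(causal_effects, key=lambda e: e.get("uplift", 0))
    template = best["effect"]
    return {
        "template": template,
        "confidence": rates.get(template, 0.0),
        "mathematical_proof": f"Causal uplift: {best['uplift']:.0%}",
    }

decision = make_enhanced_decision(
    {"financial-template": [True] * 8 + [False] * 2,   # 80% success (8/10)
     "generic-template":   [True] * 2 + [False] * 3},  # 40% success (2/5)
    [{"cause": "finance_domain", "effect": "financial-template", "uplift": 0.25},
     {"cause": "finance_domain", "effect": "generic-template",   "uplift": 0.10}],
)
```

With the data from the walkthrough, this yields the financial template at 0.80 confidence with the proof string "Causal uplift: 25%".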

📈 How Improvement Actually Happens

1. Success Rate Learning

Pattern: Template success rates improve over time

# After 5 uses of financial-template:
success_rate = successful_creations / total_creations
# Example: 4/5 = 0.8 (80% success rate)

# This influences future template selection:
if success_rate > 0.7:
    prefer_this_template = True

2. Feature Learning

Pattern: Agent learns which features work for which domains

# From successful episodes:
successful_features = extract_common_features([
    "RSI indicators", "MACD analysis", "volume analysis"
])

# Added to learned improvements:
intelligence.learned_improvements = [
    "Include RSI indicators (82% success rate)",
    "Add MACD analysis (75% success rate)",
    "Volume analysis recommended (68% success rate)"
]
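One plausible way `extract_common_features` could work is to count how often each feature appears across successful episodes and keep the frequent ones. This helper, its `features` field, and the `min_rate` threshold are hypothetical:

```python
from collections import Counter

def extract_common_features(successful_episodes: list[dict], min_rate: float = 0.5) -> list[str]:
    """Return features present in at least min_rate of the successful episodes."""
    counts = Counter(f for ep in successful_episodes for f in ep.get("features", []))
    n = len(successful_episodes)
    return [f"{feat} ({count / n:.0%} of successes)"
            for feat, count in counts.most_common()
            if count / n >= min_rate]
```

For example, if "RSI indicators" appears in both of two successful episodes and "MACD analysis" in one, the output annotates them with 100% and 50% respectively.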

3. Domain Specialization

Pattern: Templates become domain-specialized

# Causal learning shows:
causal_edges = [
    {"cause": "finance_domain", "effect": "financial-template", "uplift": 0.25},
    {"cause": "climate_domain", "effect": "climate-template", "uplift": 0.30},
    {"cause": "ecommerce_domain", "effect": "ecommerce-template", "uplift": 0.20}
]

# Future decisions use this pattern:
if "finance" in user_input:
    recommended_template = "financial-template"  # 25% uplift

🎯 Key Insights About the Learning Process

1. Learning is Cumulative

  • Every creation adds to the knowledge base
  • More episodes = better pattern recognition
  • Success rates become more reliable over time

2. Learning is Domain-Specific

  • Templates specialize for particular domains
  • Cross-domain patterns are identified
  • Generic vs specialized recommendations

3. Learning is Measurable

  • Success rates are tracked numerically
  • Causal effects have confidence scores
  • Mathematical proofs provide evidence

4. Learning is Adaptive

  • Failed attempts influence future decisions
  • Successful patterns are reinforced
  • System self-corrects based on outcomes

🔧 Technical Implementation Details

Storage Commands Used

# 1. Store episode (reflexion)
agentdb reflexion store <session_id> <task> <reward> <success> [critique] [input] [output]

# 2. Store causal edge
agentdb causal add-edge <cause> <effect> <uplift> [confidence] [sample-size]

# 3. Store skill pattern
agentdb skill create <name> <description> [code]

# 4. Query episodes
agentdb reflexion retrieve <task> [k] [min-reward] [only-failures] [only-successes]

# 5. Query causal effects
agentdb causal query [cause] [effect] [min-confidence] [min-uplift] [limit]

# 6. Search skills
agentdb skill search <query> [k]

Data Flow in Code

def enhance_agent_creation(user_input, domain):
    # Step 1: Retrieve relevant past episodes
    episodes = query_similar_episodes(user_input)

    # Step 2: Analyze success patterns
    success_rate = calculate_success_rate(episodes)

    # Step 3: Query causal relationships
    causal_effects = query_causal_effects(domain)

    # Step 4: Search for relevant skills
    relevant_skills = search_skills(user_input)

    # Step 5: Make enhanced decision
    intelligence = AgentDBIntelligence(
        template_choice=select_best_template(causal_effects),
        success_probability=success_rate,
        learned_improvements=extract_improvements(relevant_skills),
        mathematical_proof=generate_causal_proof(causal_effects)
    )

    # Step 6: Store this decision for future learning
    store_creation_decision(user_input, intelligence)

    return intelligence

🎉 Summary: From "Magic" to Understandable Process

What seemed like magic is actually a systematic learning process:

  1. Store every creation decision with context and outcomes
  2. Query past decisions when new requests arrive
  3. Analyze patterns of success and failure
  4. Enhance new decisions with learned insights
  5. Improve continuously with each interaction

The AgentDB bridge turns Agent Creator from a static tool into a learning system that gets smarter with every skill created!