# LLM Provider Configuration

Comprehensive guide for configuring different LLM providers with biomni.

## Overview

Biomni supports multiple LLM providers for flexible deployment across different infrastructure and cost requirements. The framework abstracts provider differences through a unified interface.
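
Because the provider is selected entirely by the `llm` string passed to the agent, switching backends is a one-line change. A minimal sketch (the model strings are the ones used later in this guide):

```python
from biomni.agent import A1

# The same task runs unchanged against any configured provider;
# only the model string differs.
for llm in ['claude-sonnet-4-20250514', 'gpt-4-turbo', 'gemini-2.0-flash-exp']:
    agent = A1(path='./data', llm=llm)
    result = agent.go("List known tumor suppressor genes on chromosome 17")
```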

## Supported Providers

1. **Anthropic Claude** (Recommended)
2. **OpenAI**
3. **Azure OpenAI**
4. **Google Gemini**
5. **Groq**
6. **AWS Bedrock**
7. **Custom Endpoints**

## Anthropic Claude

**Recommended for:** Best balance of reasoning quality, speed, and biomedical knowledge.

### Setup

```bash
# Set API key
export ANTHROPIC_API_KEY="sk-ant-..."

# Or in .env file
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env
```

### Available Models

```python
from biomni.agent import A1

# Sonnet 4 - Balanced performance (recommended)
agent = A1(path='./data', llm='claude-sonnet-4-20250514')

# Opus 4 - Maximum capability
agent = A1(path='./data', llm='claude-opus-4-20250514')

# Haiku 4 - Fast and economical
agent = A1(path='./data', llm='claude-haiku-4-20250514')
```

### Configuration Options

```python
from biomni.config import default_config

default_config.llm = "claude-sonnet-4-20250514"
default_config.llm_temperature = 0.7
default_config.max_tokens = 4096
default_config.anthropic_api_key = "sk-ant-..."  # Or use env var
```

**Model Characteristics:**

| Model | Best For | Speed | Cost | Reasoning Quality |
|-------|----------|-------|------|-------------------|
| Opus 4 | Complex multi-step analyses | Slower | High | Highest |
| Sonnet 4 | General biomedical tasks | Fast | Medium | High |
| Haiku 4 | Simple queries, bulk processing | Fastest | Low | Good |

## OpenAI

**Recommended for:** Established infrastructure, GPT-4 optimization.

### Setup

```bash
export OPENAI_API_KEY="sk-..."
```

### Available Models

```python
# GPT-4 Turbo
agent = A1(path='./data', llm='gpt-4-turbo')

# GPT-4
agent = A1(path='./data', llm='gpt-4')

# GPT-4o
agent = A1(path='./data', llm='gpt-4o')
```

### Configuration

```python
from biomni.config import default_config

default_config.llm = "gpt-4-turbo"
default_config.openai_api_key = "sk-..."
default_config.openai_organization = "org-..."  # Optional
default_config.llm_temperature = 0.7
```

**Considerations:**
- GPT-4 Turbo is recommended for cost-effectiveness
- May require additional biomedical context for specialized tasks (see the sketch below)
- Rate limits vary by account tier
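
For the biomedical-context point above, one low-effort approach is to prepend a short domain preamble to the task string itself. This is a sketch relying only on `agent.go`, not a built-in biomni feature; the preamble text is illustrative:

```python
from biomni.agent import A1

# Hypothetical domain preamble; adjust to the task at hand
BIOMED_CONTEXT = (
    "You are assisting with a biomedical research task. "
    "Use standard HGNC gene symbols and flag uncertain claims.\n\n"
)

agent = A1(path='./data', llm='gpt-4-turbo')
result = agent.go(BIOMED_CONTEXT + "Summarize known BRCA1 interaction partners")
```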

## Azure OpenAI

**Recommended for:** Enterprise deployments, data residency requirements.

### Setup

```bash
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4"
export AZURE_OPENAI_API_VERSION="2024-02-01"
```

### Configuration

```python
from biomni.config import default_config

default_config.llm = "azure-gpt-4"
default_config.azure_openai_api_key = "..."
default_config.azure_openai_endpoint = "https://your-resource.openai.azure.com/"
default_config.azure_openai_deployment_name = "gpt-4"
default_config.azure_openai_api_version = "2024-02-01"
```

### Usage

```python
agent = A1(path='./data', llm='azure-gpt-4')
```

**Deployment Notes:**
- Requires Azure OpenAI Service provisioning
- Deployment names are set during Azure resource creation
- API versions are periodically updated by Microsoft

## Google Gemini

**Recommended for:** Google Cloud integration, multimodal tasks.

### Setup

```bash
export GOOGLE_API_KEY="..."
```

### Available Models

```python
# Gemini 2.0 Flash (recommended)
agent = A1(path='./data', llm='gemini-2.0-flash-exp')

# Gemini Pro
agent = A1(path='./data', llm='gemini-pro')
```

### Configuration

```python
from biomni.config import default_config

default_config.llm = "gemini-2.0-flash-exp"
default_config.google_api_key = "..."
default_config.llm_temperature = 0.7
```

**Features:**
- Native multimodal support (text, images, code)
- Fast inference
- Competitive pricing

## Groq

**Recommended for:** Ultra-fast inference, cost-sensitive applications.

### Setup

```bash
export GROQ_API_KEY="gsk_..."
```

### Available Models

```python
# Llama 3.3 70B
agent = A1(path='./data', llm='llama-3.3-70b-versatile')

# Mixtral 8x7B
agent = A1(path='./data', llm='mixtral-8x7b-32768')
```

### Configuration

```python
from biomni.config import default_config

default_config.llm = "llama-3.3-70b-versatile"
default_config.groq_api_key = "gsk_..."
```

**Characteristics:**
- Extremely fast inference via custom hardware
- Open-source model options
- Limited context windows for some models

## AWS Bedrock

**Recommended for:** AWS infrastructure, compliance requirements.

### Setup

```bash
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-east-1"
```

### Available Models

```python
# Claude via Bedrock
agent = A1(path='./data', llm='bedrock-claude-sonnet-4')

# Llama via Bedrock
agent = A1(path='./data', llm='bedrock-llama-3-70b')
```

### Configuration

```python
from biomni.config import default_config

default_config.llm = "bedrock-claude-sonnet-4"
default_config.aws_access_key_id = "..."
default_config.aws_secret_access_key = "..."
default_config.aws_region = "us-east-1"
```

**Requirements:**
- AWS account with Bedrock access enabled
- Model access requested through the AWS console
- IAM permissions configured for the Bedrock APIs (a quick preflight check is sketched below)
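
Before pointing biomni at Bedrock, it can save debugging time to confirm the credentials can actually see the models you intend to use. A minimal preflight sketch using boto3 (not part of biomni):

```python
import boto3

# Lists the foundation models this account/region can access.
# Requires the bedrock:ListFoundationModels IAM permission;
# invoking a model additionally needs bedrock:InvokeModel.
bedrock = boto3.client('bedrock', region_name='us-east-1')
response = bedrock.list_foundation_models()
for model in response['modelSummaries']:
    print(model['modelId'])
```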

## Custom Endpoints

**Recommended for:** Self-hosted models, custom infrastructure.

### Configuration

```python
from biomni.config import default_config

default_config.llm = "custom"
default_config.custom_llm_endpoint = "http://localhost:8000/v1/chat/completions"
default_config.custom_llm_api_key = "..."  # If required
default_config.custom_llm_model_name = "llama-3-70b"
```

### Usage

```python
agent = A1(path='./data', llm='custom')
```

**Endpoint Requirements:**
- Must implement an OpenAI-compatible chat completions API (a smoke test is sketched below)
- Support for function/tool calling is recommended
- JSON response format
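
A quick way to verify compatibility before wiring the endpoint into biomni is to send a raw chat completions request and check the response shape. A sketch using `requests` (endpoint and model name taken from the configuration above):

```python
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "llama-3-70b",
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=30,
)
resp.raise_for_status()
# An OpenAI-compatible server returns choices[0].message.content
print(resp.json()["choices"][0]["message"]["content"])
```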

**Example with vLLM:**

```bash
# Start vLLM server
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-3-70b-chat \
    --port 8000

# Configure biomni
export CUSTOM_LLM_ENDPOINT="http://localhost:8000/v1/chat/completions"
```

## Model Selection Guidelines

### By Task Complexity

**Simple queries** (gene lookup, basic calculations):
- Claude Haiku 4
- Gemini 2.0 Flash
- Groq Llama 3.3 70B

**Moderate tasks** (data analysis, literature search):
- Claude Sonnet 4 (recommended)
- GPT-4 Turbo
- Gemini 2.0 Flash

**Complex analyses** (multi-step reasoning, novel insights):
- Claude Opus 4 (recommended)
- GPT-4
- Claude Sonnet 4

### By Cost Sensitivity

**Budget-conscious:**
1. Groq (fastest, cheapest)
2. Claude Haiku 4
3. Gemini 2.0 Flash

**Balanced:**
1. Claude Sonnet 4 (recommended)
2. GPT-4 Turbo
3. Gemini Pro

**Quality-first:**
1. Claude Opus 4
2. GPT-4
3. Claude Sonnet 4

### By Infrastructure

**Cloud-agnostic:**
- Anthropic Claude (direct API)
- OpenAI (direct API)

**AWS ecosystem:**
- AWS Bedrock (Claude, Llama)

**Azure ecosystem:**
- Azure OpenAI Service

**Google Cloud:**
- Google Gemini

**On-premises:**
- Custom endpoints with self-hosted models

## Performance Comparison

Based on the Biomni-Eval1 benchmark:

| Provider | Model | Avg Score | Avg Time (s) | Cost / 1K tasks |
|----------|-------|-----------|--------------|-----------------|
| Anthropic | Opus 4 | 0.89 | 45 | $120 |
| Anthropic | Sonnet 4 | 0.85 | 28 | $45 |
| OpenAI | GPT-4 Turbo | 0.82 | 35 | $55 |
| Google | Gemini 2.0 Flash | 0.78 | 22 | $25 |
| Anthropic | Haiku 4 | 0.75 | 15 | $15 |
| Groq | Llama 3.3 70B | 0.73 | 12 | $8 |

*Note: Costs are approximate and vary by usage patterns.*

## Troubleshooting

### API Key Issues

```python
# Verify the key is set in the environment
import os
print(os.getenv('ANTHROPIC_API_KEY'))

# Or check the value resolved by biomni's config
from biomni.config import default_config
print(default_config.anthropic_api_key)
```

### Rate Limiting

```python
from biomni.config import default_config

# Add retry logic
default_config.max_retries = 5
default_config.retry_delay = 10  # seconds

# Reduce concurrency
default_config.max_concurrent_requests = 1
```
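
If a provider still rejects requests after these settings, a small application-level backoff around `agent.go` is a safe fallback. A sketch, with the caveat that the broad `Exception` catch should be narrowed to the provider's actual rate-limit error in practice:

```python
import time

def go_with_backoff(agent, task_query, attempts=5, base_delay=10):
    """Retry agent.go with exponential backoff between attempts."""
    for attempt in range(attempts):
        try:
            return agent.go(task_query)
        except Exception as e:  # narrow to the provider's rate-limit error
            if attempt == attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({e}); retrying in {delay}s")
            time.sleep(delay)
```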

### Timeout Errors

```python
from biomni.config import default_config

# Increase the timeout for slow providers
default_config.llm_timeout = 120  # seconds

# Or switch to a faster model
default_config.llm = "claude-sonnet-4-20250514"  # Fast and capable
```

### Model Not Available

```bash
# For Bedrock: enable model access in the AWS console, then verify with
aws bedrock list-foundation-models --region us-east-1

# For Azure: check the deployment name
az cognitiveservices account deployment list \
    --name your-resource-name \
    --resource-group your-rg
```

## Best Practices

### Cost Optimization

1. **Use appropriate models** - Don't use Opus 4 for simple queries
2. **Enable caching** - Reuse data lake access across tasks
3. **Batch processing** - Group similar tasks together (see the sketch below)
4. **Monitor usage** - Track API costs per task type

```python
from biomni.config import default_config

# Enable response caching
default_config.enable_caching = True
default_config.cache_ttl = 3600  # 1 hour
```
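
For the batching point above, reusing a single agent across a group of similar tasks avoids repeated setup and keeps cached data warm. A minimal sketch using only `A1` and `agent.go`, with an economical model suited to bulk queries:

```python
from biomni.agent import A1

# One agent, one cheap model, many similar single-gene queries
agent = A1(path='./data', llm='claude-haiku-4-20250514')

genes = ['TP53', 'BRCA1', 'EGFR']
results = {g: agent.go(f"Summarize the known function of {g}") for g in genes}
```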

### Multi-Provider Strategy

```python
from biomni.agent import A1

def get_agent_for_task(task_complexity):
    """Select a model tier based on task requirements."""
    if task_complexity == 'simple':
        return A1(path='./data', llm='claude-haiku-4-20250514')
    elif task_complexity == 'moderate':
        return A1(path='./data', llm='claude-sonnet-4-20250514')
    else:
        return A1(path='./data', llm='claude-opus-4-20250514')

# Use the appropriate model for the job
agent = get_agent_for_task('moderate')
result = agent.go(task_query)
```

### Fallback Configuration

```python
from biomni.agent import A1
from biomni.exceptions import LLMError

def execute_with_fallback(task_query):
    """Try multiple providers if the primary fails."""
    providers = [
        'claude-sonnet-4-20250514',
        'gpt-4-turbo',
        'gemini-2.0-flash-exp',
    ]

    for llm in providers:
        try:
            agent = A1(path='./data', llm=llm)
            return agent.go(task_query)
        except LLMError as e:
            print(f"{llm} failed: {e}")
            continue

    raise RuntimeError("All providers failed")
```

## Provider-Specific Tips

### Anthropic Claude
- Best for complex biomedical reasoning
- Use Sonnet 4 for most tasks
- Reserve Opus 4 for novel research questions

### OpenAI
- Add system prompts with biomedical context for better results
- Use JSON mode for structured outputs
- Monitor token usage to stay within context window limits

### Azure OpenAI
- Provision deployments in regions close to your data
- Use managed identity for secure authentication
- Monitor quota consumption in the Azure portal

### Google Gemini
- Leverage multimodal capabilities for image-based tasks
- Use streaming for long-running analyses
- Consider Gemini Pro for production workloads

### Groq
- Ideal for high-throughput screening tasks
- Limited reasoning depth vs. Claude/GPT-4
- Best for well-defined, structured problems

### AWS Bedrock
- Use IAM roles instead of access keys when possible
- Enable CloudWatch logging for debugging
- Monitor cross-region latency