Initial commit

2025-11-29 18:45:58 +08:00
commit 4b6db3349f
68 changed files with 15165 additions and 0 deletions
--- a/skills/langgraph-master/06_llm_model_ids_claude_advanced.md
+++ b/skills/langgraph-master/06_llm_model_ids_claude_advanced.md
@@ -0,0 +1,262 @@
+# Claude Advanced Features
+
+Advanced settings and parameter tuning for Claude models.
+
+## Context Window and Output Limits
+
+| Model | Context Window | Max Output Tokens | Notes |
+|--------|-------------------|---------------|------|
+| `claude-opus-4-1-20250805` | 200,000 | 32,000 | Highest performance |
+| `claude-sonnet-4-5` | 1,000,000 | 64,000 | Latest version |
+| `claude-sonnet-4-20250514` | 200,000 (1M beta) | 64,000 | 1M with beta header |
+| `claude-haiku-4-5-20251001` | 200,000 | 64,000 | Fast version |
+
+**Note**: To use 1M context with Sonnet 4, a beta header is required.
+
+## Parameter Configuration
+
+```python
+from langchain_anthropic import ChatAnthropic
+
+llm = ChatAnthropic(
+    model="claude-sonnet-4-5",
+    temperature=0.7,          # Creativity (0.0-1.0)
+    max_tokens=64000,         # Max output (Sonnet 4.5: 64K)
+    top_p=0.9,               # Diversity
+    top_k=40,                # Sampling
+)
+
+# Opus 4.1 (max output 32K)
+llm_opus = ChatAnthropic(
+    model="claude-opus-4-1-20250805",
+    max_tokens=32000,
+)
+```
+
+## Using 1M Context
+
+### Sonnet 4.5 (Standard)
+
+```python
+llm = ChatAnthropic(
+    model="claude-sonnet-4-5",
+    max_tokens=64000
+)
+
+# Can process 1M tokens of context
+long_document = "..." * 500000  # Long document
+response = llm.invoke(f"Please analyze the following document:\n\n{long_document}")
+```
+
+### Sonnet 4 (Beta Header)
+
+```python
+# Enable 1M context with beta header
+llm = ChatAnthropic(
+    model="claude-sonnet-4-20250514",
+    max_tokens=64000,
+    default_headers={
+        "anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"
+    }
+)
+```
+
+## Streaming
+
+```python
+llm = ChatAnthropic(
+    model="claude-sonnet-4-5",
+    streaming=True
+)
+
+for chunk in llm.stream("question"):
+    print(chunk.content, end="", flush=True)
+```
+
+## Prompt Caching
+
+Cache parts of long prompts for efficiency:
+
+```python
+from langchain_anthropic import ChatAnthropic
+
+llm = ChatAnthropic(
+    model="claude-sonnet-4-5",
+    max_tokens=4096
+)
+
+# System prompt for caching
+system_prompt = """
+You are a professional code reviewer.
+Please review according to the following coding guidelines:
+[long guidelines...]
+"""
+
+# Use cache
+response = llm.invoke(
+    [
+        {"role": "system", "content": system_prompt, "cache_control": {"type": "ephemeral"}},
+        {"role": "user", "content": "Please review this code"}
+    ]
+)
+```
+
+**Cache Benefits**:
+- Cost reduction (90% off on cache hits)
+- Latency reduction (faster processing on reuse)
+
+## Vision (Image Processing)
+
+```python
+from langchain_anthropic import ChatAnthropic
+from langchain_core.messages import HumanMessage
+
+llm = ChatAnthropic(model="claude-sonnet-4-5")
+
+message = HumanMessage(
+    content=[
+        {"type": "text", "text": "What's in this image?"},
+        {
+            "type": "image_url",
+            "image_url": {
+                "url": "https://example.com/image.jpg"
+            }
+        }
+    ]
+)
+
+response = llm.invoke([message])
+```
+
+## JSON Mode
+
+When structured output is needed:
+
+```python
+llm = ChatAnthropic(
+    model="claude-sonnet-4-5",
+    model_kwargs={
+        "response_format": {"type": "json_object"}
+    }
+)
+
+response = llm.invoke("Return user information in JSON format")
+```
+
+## Token Usage Tracking
+
+```python
+from langchain.callbacks import get_openai_callback
+
+llm = ChatAnthropic(model="claude-sonnet-4-5")
+
+with get_openai_callback() as cb:
+    response = llm.invoke("question")
+    print(f"Total Tokens: {cb.total_tokens}")
+    print(f"Prompt Tokens: {cb.prompt_tokens}")
+    print(f"Completion Tokens: {cb.completion_tokens}")
+```
+
+## Error Handling
+
+```python
+from anthropic import AnthropicError, RateLimitError
+
+try:
+    llm = ChatAnthropic(model="claude-sonnet-4-5")
+    response = llm.invoke("question")
+except RateLimitError:
+    print("Rate limit reached")
+except AnthropicError as e:
+    print(f"Anthropic error: {e}")
+```
+
+## Rate Limit Handling
+
+```python
+from tenacity import retry, wait_exponential, stop_after_attempt
+from anthropic import RateLimitError
+
+@retry(
+    wait=wait_exponential(multiplier=1, min=4, max=60),
+    stop=stop_after_attempt(5),
+    retry=lambda e: isinstance(e, RateLimitError)
+)
+def invoke_with_retry(llm, messages):
+    return llm.invoke(messages)
+
+llm = ChatAnthropic(model="claude-sonnet-4-5")
+response = invoke_with_retry(llm, ["question"])
+```
+
+## Listing Models
+
+```python
+import anthropic
+import os
+
+client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
+models = client.models.list()
+
+for model in models.data:
+    print(f"{model.id} - {model.display_name}")
+```
+
+## Cost Optimization
+
+### Cost Management by Model Selection
+
+```python
+# Low-cost version (simple tasks)
+llm_cheap = ChatAnthropic(model="claude-haiku-4-5-20251001")
+
+# Balanced version (general tasks)
+llm_balanced = ChatAnthropic(model="claude-sonnet-4-5")
+
+# High-performance version (complex tasks)
+llm_powerful = ChatAnthropic(model="claude-opus-4-1-20250805")
+
+# Select based on task
+def get_llm_for_task(complexity):
+    if complexity == "simple":
+        return llm_cheap
+    elif complexity == "medium":
+        return llm_balanced
+    else:
+        return llm_powerful
+```
+
+### Cost Reduction with Prompt Caching
+
+```python
+# Cache long system prompt
+system = {"role": "system", "content": long_guidelines, "cache_control": {"type": "ephemeral"}}
+
+# Reuse cache across multiple calls (90% cost reduction)
+for user_input in user_inputs:
+    response = llm.invoke([system, {"role": "user", "content": user_input}])
+```
+
+## Leveraging Large Context
+
+```python
+llm = ChatAnthropic(model="claude-sonnet-4-5")
+
+# Process large documents at once (1M token support)
+documents = load_large_documents()  # Large document collection
+
+response = llm.invoke(f"""
+Please analyze the following multiple documents:
+
+{documents}
+
+Tell me the main themes and conclusions.
+""")
+```
+
+## Reference Links
+
+- [Claude API Documentation](https://docs.anthropic.com/)
+- [Anthropic API Reference](https://docs.anthropic.com/en/api/)
+- [Claude Models Overview](https://docs.anthropic.com/en/docs/about-claude/models/overview)
+- [Prompt Caching Guide](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching)