Initial commit

2025-11-30 08:34:18 +08:00
commit e58b027cd1
20 changed files with 6428 additions and 0 deletions
--- a/plugins/llm-application-dev/skills/langchain-architecture/SKILL.md
+++ b/plugins/llm-application-dev/skills/langchain-architecture/SKILL.md
@@ -0,0 +1,338 @@
+---
+name: langchain-architecture
+description: Design LLM applications using the LangChain framework with agents, memory, and tool integration patterns. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
+---
+
+# LangChain Architecture
+
+Design and build LLM applications with LangChain's core building blocks: agents, chains, memory, document processing, and callbacks.
+
+## When to Use This Skill
+
+- Building autonomous AI agents with tool access
+- Implementing complex multi-step LLM workflows
+- Managing conversation memory and state
+- Integrating LLMs with external data sources and APIs
+- Creating modular, reusable LLM application components
+- Implementing document processing pipelines
+- Building production-grade LLM applications
+
+## Core Concepts
+
+### 1. Agents
+Autonomous systems that use LLMs to decide which actions to take.
+
+**Agent Types:**
+- **ReAct**: Interleaves reasoning steps with tool actions
+- **OpenAI Functions**: Uses the OpenAI function-calling API to select tools
+- **Structured Chat**: Supports multi-input tools
+- **Conversational**: Optimized for chat interfaces
+- **Self-Ask with Search**: Decomposes complex queries into follow-up questions answered by search
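+
+The ReAct loop can be sketched in plain Python to show the control flow. This is a minimal illustration, not LangChain's implementation: `TOOLS`, `scripted_llm`, and `run_react` are hypothetical names, and the "LLM" simply replays canned responses.
+
+```python
+import re
+
+# Hypothetical tools: plain functions keyed by name.
+TOOLS = {
+    "calculator": lambda expr: str(sum(int(x) for x in expr.split("+"))),
+    "echo": lambda text: text,
+}
+
+ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)")
+
+def scripted_llm(responses):
+    """Return a stand-in for an LLM that replays canned responses in order."""
+    remaining = iter(responses)
+    return lambda prompt: next(remaining)
+
+def run_react(llm, question, max_steps=5):
+    """Alternate between LLM output and tool calls until a final answer appears."""
+    scratchpad = f"Question: {question}\n"
+    for _ in range(max_steps):
+        output = llm(scratchpad)
+        if "Final Answer:" in output:
+            return output.split("Final Answer:", 1)[1].strip()
+        match = ACTION_RE.search(output)
+        if match is None:
+            raise ValueError(f"Could not parse LLM output: {output!r}")
+        tool_name, tool_input = match.group(1), match.group(2).strip()
+        observation = TOOLS[tool_name](tool_input)
+        # Feed the observation back so the next step can reason over it.
+        scratchpad += f"{output}\nObservation: {observation}\n"
+    raise RuntimeError("Agent did not finish within max_steps")
+
+fake_llm = scripted_llm([
+    "Thought: I need to add.\nAction: calculator\nAction Input: 2+3",
+    "Thought: I have the result.\nFinal Answer: 5",
+])
+answer = run_react(fake_llm, "What is 2 + 3?")
+```
+
+The `max_steps` cap matters in practice: without it, an agent that never emits a final answer would loop indefinitely.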
+
+### 2. Chains
+Sequences of calls to LLMs or other utilities.
+
+**Chain Types:**
+- **LLMChain**: Basic prompt + LLM combination
+- **SequentialChain**: Multiple chains in sequence
+- **RouterChain**: Routes inputs to specialized chains
+- **TransformChain**: Data transformations between steps
+- **MapReduceChain**: Parallel processing with aggregation
+
+### 3. Memory
+Systems for maintaining context across interactions.
+
+**Memory Types:**
+- **ConversationBufferMemory**: Stores all messages
+- **ConversationSummaryMemory**: Summarizes older messages
+- **ConversationBufferWindowMemory**: Keeps the last k exchanges
+- **ConversationEntityMemory**: Tracks information about entities
+- **VectorStoreRetrieverMemory**: Retrieves relevant history by semantic similarity
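+
+To make the windowing idea concrete, here is a small plain-Python sketch of a memory that keeps only the most recent exchanges. `WindowMemory` and its methods are hypothetical names chosen for illustration; the class only mimics the behavior described above and is not LangChain code.
+
+```python
+from collections import deque
+
+class WindowMemory:
+    """Keep only the most recent k exchanges (one exchange = user + AI message)."""
+
+    def __init__(self, k):
+        # A bounded deque drops the oldest exchange automatically.
+        self.exchanges = deque(maxlen=k)
+
+    def save_context(self, user_message, ai_message):
+        self.exchanges.append((user_message, ai_message))
+
+    def load_history(self):
+        lines = []
+        for user_message, ai_message in self.exchanges:
+            lines.append(f"Human: {user_message}")
+            lines.append(f"AI: {ai_message}")
+        return "\n".join(lines)
+
+window = WindowMemory(k=2)
+window.save_context("Hi", "Hello!")
+window.save_context("My name is Ada", "Nice to meet you, Ada.")
+window.save_context("What's 2 + 2?", "4")
+history = window.load_history()  # the first exchange has been dropped
+```
+
+The trade-off is simple: prompt size stays bounded, but anything said before the window is forgotten, which is where summary or vector-based memory becomes useful.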
+
+### 4. Document Processing
+Loading, transforming, and storing documents for retrieval.
+
+**Components:**
+- **Document Loaders**: Load from various sources
+- **Text Splitters**: Split documents into chunks sized for embedding and retrieval
+- **Vector Stores**: Store and retrieve embeddings
+- **Retrievers**: Fetch relevant documents
+- **Indexes**: Organize documents for efficient access
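+
+The effect of chunk size and overlap is easiest to see on a tiny input. The sketch below is a simplified, fixed-offset splitter written for illustration; `split_text` is a hypothetical helper, and real text splitters typically also respect separators such as paragraph breaks.
+
+```python
+def split_text(text, chunk_size, chunk_overlap):
+    """Split text into fixed-size character chunks that overlap their neighbors."""
+    if not 0 <= chunk_overlap < chunk_size:
+        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
+    step = chunk_size - chunk_overlap
+    chunks = []
+    for start in range(0, len(text), step):
+        chunks.append(text[start:start + chunk_size])
+        # Stop once a chunk reaches the end of the text.
+        if start + chunk_size >= len(text):
+            break
+    return chunks
+
+chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=2)
+# chunks == ["abcd", "cdef", "efgh", "ghij"]
+```
+
+Overlap repeats the tail of each chunk at the head of the next, so a sentence that straddles a boundary still appears intact in at least one chunk.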
+
+### 5. Callbacks
+Hooks for logging, monitoring, and debugging.
+
+**Use Cases:**
+- Request/response logging
+- Token usage tracking
+- Latency monitoring
+- Error handling
+- Custom metrics collection
+
+## Quick Start
+
+```python
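+# Note: these are legacy (pre-0.1) LangChain import paths; newer releases move
+# integrations into separate packages such as langchain_community and langchain_openai.
+# Requires the OPENAI_API_KEY and SERPAPI_API_KEY environment variables.
+from langchain.agents import AgentType, initialize_agent, load_tools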
+from langchain.llms import OpenAI
+from langchain.memory import ConversationBufferMemory
+
+# Initialize LLM
+llm = OpenAI(temperature=0)
+
+# Load tools
+tools = load_tools(["serpapi", "llm-math"], llm=llm)
+
+# Add memory
+memory = ConversationBufferMemory(memory_key="chat_history")
+
+# Create agent
+agent = initialize_agent(
+    tools,
+    llm,
+    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
+    memory=memory,
+    verbose=True
+)
+
+# Run agent
+result = agent.run("What's the weather in SF? Then calculate 25 * 4")
+```
+
+## Architecture Patterns
+
+### Pattern 1: RAG with LangChain
+```python
+from langchain.chains import RetrievalQA
+from langchain.document_loaders import TextLoader
+from langchain.text_splitter import CharacterTextSplitter
+from langchain.vectorstores import Chroma
+from langchain.embeddings import OpenAIEmbeddings
+
+# Load and process documents
+loader = TextLoader('documents.txt')
+documents = loader.load()
+
+text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
+texts = text_splitter.split_documents(documents)
+
+# Create vector store
+embeddings = OpenAIEmbeddings()
+vectorstore = Chroma.from_documents(texts, embeddings)
+
+# Create retrieval chain
+qa_chain = RetrievalQA.from_chain_type(
+    llm=llm,
+    chain_type="stuff",
+    retriever=vectorstore.as_retriever(),
+    return_source_documents=True
+)
+
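+# Query (assumes `llm` from the Quick Start)
+result = qa_chain({"query": "What is the main topic?"})
+answer = result["result"]
+sources = result["source_documents"]  # present because return_source_documents=True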
+```
+
+### Pattern 2: Custom Agent with Tools
+```python
+from langchain.agents import AgentType, initialize_agent
+from langchain.tools import tool
+
+@tool
+def search_database(query: str) -> str:
+    """Search internal database for information."""
+    # Your database search logic
+    return f"Results for: {query}"
+
+@tool
+def send_email(recipient: str, content: str) -> str:
+    """Send an email to specified recipient."""
+    # Email sending logic
+    return f"Email sent to {recipient}"
+
+tools = [search_database, send_email]
+
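+# send_email takes two arguments, and ZERO_SHOT_REACT_DESCRIPTION only accepts
+# single-input tools, so use the structured-chat agent instead.
+agent = initialize_agent(
+    tools,
+    llm,
+    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
+    verbose=True
+)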
+```
+
+### Pattern 3: Multi-Step Chain
+```python
+from langchain.chains import LLMChain, SequentialChain
+from langchain.prompts import PromptTemplate
+
+# Step 1: Extract key information
+extract_prompt = PromptTemplate(
+    input_variables=["text"],
+    template="Extract key entities from: {text}\n\nEntities:"
+)
+extract_chain = LLMChain(llm=llm, prompt=extract_prompt, output_key="entities")
+
+# Step 2: Analyze entities
+analyze_prompt = PromptTemplate(
+    input_variables=["entities"],
+    template="Analyze these entities: {entities}\n\nAnalysis:"
+)
+analyze_chain = LLMChain(llm=llm, prompt=analyze_prompt, output_key="analysis")
+
+# Step 3: Generate summary
+summary_prompt = PromptTemplate(
+    input_variables=["entities", "analysis"],
+    template="Summarize:\nEntities: {entities}\nAnalysis: {analysis}\n\nSummary:"
+)
+summary_chain = LLMChain(llm=llm, prompt=summary_prompt, output_key="summary")
+
+# Combine into sequential chain
+overall_chain = SequentialChain(
+    chains=[extract_chain, analyze_chain, summary_chain],
+    input_variables=["text"],
+    output_variables=["entities", "analysis", "summary"],
+    verbose=True
+)
+```
+
+## Memory Management Best Practices
+
+### Choosing the Right Memory Type
+```python
+# For short conversations (< 10 messages)
+from langchain.memory import ConversationBufferMemory
+memory = ConversationBufferMemory()
+
+# For long conversations (summarize old messages)
+from langchain.memory import ConversationSummaryMemory
+memory = ConversationSummaryMemory(llm=llm)
+
+# For a sliding window (keeps the last k exchanges)
+from langchain.memory import ConversationBufferWindowMemory
+memory = ConversationBufferWindowMemory(k=5)
+
+# For entity tracking
+from langchain.memory import ConversationEntityMemory
+memory = ConversationEntityMemory(llm=llm)
+
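+# For semantic retrieval of relevant history
+# (assumes `vectorstore` from the RAG pattern)
+from langchain.memory import VectorStoreRetrieverMemory
+retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
+memory = VectorStoreRetrieverMemory(retriever=retriever)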
+```
+
+## Callback System
+
+### Custom Callback Handler
+```python
+from langchain.callbacks.base import BaseCallbackHandler
+
+class CustomCallbackHandler(BaseCallbackHandler):
+    def on_llm_start(self, serialized, prompts, **kwargs):
+        print(f"LLM started with prompts: {prompts}")
+
+    def on_llm_end(self, response, **kwargs):
+        print(f"LLM ended with response: {response}")
+
+    def on_llm_error(self, error, **kwargs):
+        print(f"LLM error: {error}")
+
+    def on_chain_start(self, serialized, inputs, **kwargs):
+        print(f"Chain started with inputs: {inputs}")
+
+    def on_agent_action(self, action, **kwargs):
+        print(f"Agent taking action: {action}")
+
+# Use callback
+agent.run("query", callbacks=[CustomCallbackHandler()])
+```
+
+## Testing Strategies
+
+```python
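+from langchain.agents import AgentType, initialize_agent
+from langchain.llms.fake import FakeListLLM
+from langchain.memory import ConversationBufferMemory
+
+def test_agent_tool_selection():
+    # A Mock is not a valid LangChain LLM, so use FakeListLLM, which replays
+    # canned responses: first a tool call, then the final answer.
+    fake_llm = FakeListLLM(responses=[
+        "Action: search_database\nAction Input: test query",
+        "Final Answer: done",
+    ])
+
+    # search_database is the single-input tool from Pattern 2
+    agent = initialize_agent(
+        [search_database],
+        fake_llm,
+        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
+        return_intermediate_steps=True,
+    )
+
+    result = agent({"input": "test query"})
+
+    # Verify the agent actually invoked the expected tool
+    action, _observation = result["intermediate_steps"][0]
+    assert action.tool == "search_database"
+    assert action.tool_input == "test query"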
+
+def test_memory_persistence():
+    memory = ConversationBufferMemory()
+
+    memory.save_context({"input": "Hi"}, {"output": "Hello!"})
+
+    assert "Hi" in memory.load_memory_variables({})['history']
+    assert "Hello!" in memory.load_memory_variables({})['history']
+```
+
+## Performance Optimization
+
+### 1. Caching
+```python
+from langchain.cache import InMemoryCache
+import langchain
+
+langchain.llm_cache = InMemoryCache()
+```
+
+### 2. Batch Processing
+```python
+# Process multiple documents in parallel
+from langchain.document_loaders import DirectoryLoader
+from concurrent.futures import ThreadPoolExecutor
+
+loader = DirectoryLoader('./docs')
+docs = loader.load()
+
+def process_doc(doc):
+    return text_splitter.split_documents([doc])
+
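+# Assumes `text_splitter` from the RAG pattern
+with ThreadPoolExecutor(max_workers=4) as executor:
+    chunks_per_doc = list(executor.map(process_doc, docs))
+
+# Flatten the per-document lists into a single list of chunks
+split_docs = [chunk for chunks in chunks_per_doc for chunk in chunks]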
+```
+
+### 3. Streaming Responses
+```python
+from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
+from langchain.llms import OpenAI
+
+llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()])
+```
+
+## Resources
+
+- **references/agents.md**: Deep dive on agent architectures
+- **references/memory.md**: Memory system patterns
+- **references/chains.md**: Chain composition strategies
+- **references/document-processing.md**: Document loading and indexing
+- **references/callbacks.md**: Monitoring and observability
+- **assets/agent-template.py**: Production-ready agent template
+- **assets/memory-config.yaml**: Memory configuration examples
+- **assets/chain-example.py**: Complex chain examples
+
+## Common Pitfalls
+
+1. **Memory Overflow**: Not managing conversation history length
+2. **Tool Selection Errors**: Poor tool descriptions confuse agents
+3. **Context Window Exceeded**: Exceeding LLM token limits
+4. **No Error Handling**: Not catching and handling agent failures
+5. **Inefficient Retrieval**: Not optimizing vector store queries
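+
+Pitfalls 1 and 3 share a common mitigation: trimming history to a token budget before each call. The following is a rough sketch under stated assumptions: `count_tokens` is a crude whitespace-based stand-in for a real tokenizer, and `trim_history` is a hypothetical helper rather than a LangChain function.
+
+```python
+def count_tokens(text):
+    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
+    return len(text.split())
+
+def trim_history(messages, max_tokens):
+    """Keep the newest messages whose combined token count fits max_tokens."""
+    kept = []
+    total = 0
+    for message in reversed(messages):  # walk from newest to oldest
+        cost = count_tokens(message)
+        if total + cost > max_tokens:
+            break
+        kept.append(message)
+        total += cost
+    kept.reverse()  # restore chronological order
+    return kept
+
+conversation = [
+    "Human: Tell me about vector stores",
+    "AI: They index embeddings for search",
+    "Human: Which one should I use",
+    "AI: It depends on scale",
+]
+trimmed = trim_history(conversation, max_tokens=12)  # keeps the last two messages
+```
+
+In a real application the token count should come from the tokenizer that matches the target model, since word counts can differ substantially from token counts.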
+
+## Production Checklist
+
+- [ ] Implement proper error handling
+- [ ] Add request/response logging
+- [ ] Monitor token usage and costs
+- [ ] Set timeout limits for agent execution
+- [ ] Implement rate limiting
+- [ ] Add input validation
+- [ ] Test with edge cases
+- [ ] Set up observability (callbacks)
+- [ ] Implement fallback strategies
+- [ ] Version control prompts and configurations
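+
+Several checklist items (error handling, fallback strategies) can be combined in one small wrapper. This is an illustrative sketch, not a LangChain API: `call_with_fallback`, `flaky_primary`, and `simple_fallback` are hypothetical names, and the "models" are plain functions.
+
+```python
+import time
+
+def call_with_fallback(primary, fallback, prompt, max_retries=2, backoff_seconds=0.0):
+    """Try primary up to max_retries + 1 times, then use fallback."""
+    for attempt in range(max_retries + 1):
+        try:
+            return primary(prompt)
+        except Exception:  # in real code, catch specific error types
+            # Exponential backoff between attempts.
+            time.sleep(backoff_seconds * (2 ** attempt))
+    try:
+        return fallback(prompt)
+    except Exception as error:
+        raise RuntimeError("primary and fallback both failed") from error
+
+calls = {"primary": 0}
+
+def flaky_primary(prompt):
+    """Simulated model call that always times out."""
+    calls["primary"] += 1
+    raise TimeoutError("simulated timeout")
+
+def simple_fallback(prompt):
+    """Simulated cheaper or more reliable backup model."""
+    return f"[fallback] {prompt}"
+
+result = call_with_fallback(flaky_primary, simple_fallback, "Summarize the report")
+```
+
+Here the primary is attempted three times before the fallback answers. A production version would also log each failure and enforce a per-call timeout.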