Initial commit

2025-11-30 08:44:08 +08:00
commit 12f823b4a9
24 changed files with 4951 additions and 0 deletions
--- a/skills/pocketflow/SKILL.md
+++ b/skills/pocketflow/SKILL.md
@@ -0,0 +1,818 @@
+---
+name: pocketflow
+description: PocketFlow framework for building LLM applications with graph-based abstractions, design patterns, and agentic coding workflows
+---
+
+# PocketFlow Skill
+
+A comprehensive guide to building LLM applications using PocketFlow - a 100-line minimalist framework for Agents, Task Decomposition, RAG, and more.
+
+## When to Use This Skill
+
+Activate this skill when working with:
+- **Graph-based LLM workflows** - Building complex AI systems with nodes and flows
+- **Agentic applications** - Creating autonomous agents with dynamic action selection
+- **Task decomposition** - Breaking down complex LLM tasks into manageable steps
+- **RAG systems** - Implementing Retrieval Augmented Generation pipelines
+- **Batch processing** - Handling large inputs or multiple files with LLMs
+- **Multi-agent systems** - Coordinating multiple AI agents
+- **Async workflows** - Building I/O-bound LLM applications with concurrency
+
+## Core Concepts
+
+### Architecture Overview
+
+PocketFlow models LLM workflows as **Graph + Shared Store**:
+
+```python
+# Shared Store: Central data storage
+shared = {
+    "data": {},
+    "summary": {},
+    "config": {...}
+}
+
+# Graph: Nodes connected by transitions
+node_a >> node_b >> node_c
+flow = Flow(start=node_a)
+flow.run(shared)
+```
+
+### The Node: Building Block
+
+Every Node has 3 steps: `prep()` → `exec()` → `post()`
+
+```python
+class SummarizeFile(Node):
+    def prep(self, shared):
+        # Get data from shared store
+        return shared["data"]
+
+    def exec(self, prep_res):
+        # Process with LLM (retries built-in)
+        prompt = f"Summarize this text in 10 words: {prep_res}"
+        summary = call_llm(prompt)
+        return summary
+
+    def post(self, shared, prep_res, exec_res):
+        # Write results back to shared store
+        shared["summary"] = exec_res
+        return "default"  # Action for flow control
+```
+
+**Why 3 steps?** Separation of concerns - data storage and processing operate separately.
+
+### The Flow: Orchestration
+
+```python
+# Simple sequence
+load_data >> summarize >> save_result
+flow = Flow(start=load_data)
+flow.run(shared)
+
+# Branching with actions
+review - "approved" >> payment
+review - "needs_revision" >> revise
+review - "rejected" >> finish
+revise >> review  # Loop back
+
+flow = Flow(start=review)
+```
+
+## Quick Reference
+
+### 1. Basic Node Pattern
+
+```python
+class LoadData(Node):
+    def post(self, shared, prep_res, exec_res):
+        shared["data"] = "Some text content"
+        return None
+
+class Summarize(Node):
+    def prep(self, shared):
+        return shared["data"]
+
+    def exec(self, prep_res):
+        return call_llm(f"Summarize: {prep_res}")
+
+    def post(self, shared, prep_res, exec_res):
+        shared["summary"] = exec_res
+        return "default"
+
+# Connect and run
+load_data >> summarize
+flow = Flow(start=load_data)
+flow.run(shared)
+```
+
+### 2. Batch Processing
+
+**BatchNode** - Process large inputs in chunks:
+
+```python
+class MapSummaries(BatchNode):
+    def prep(self, shared):
+        # Chunk big file
+        content = shared["data"]
+        chunk_size = 10000
+        return [content[i:i+chunk_size]
+                for i in range(0, len(content), chunk_size)]
+
+    def exec(self, chunk):
+        # Process each chunk
+        return call_llm(f"Summarize: {chunk}")
+
+    def post(self, shared, prep_res, exec_res_list):
+        # Combine all results
+        shared["summary"] = "\n".join(exec_res_list)
+        return "default"
+```
+
+**BatchFlow** - Run flow multiple times with different parameters:
+
+```python
+class SummarizeAllFiles(BatchFlow):
+    def prep(self, shared):
+        filenames = list(shared["data"].keys())
+        # Return list of parameter dicts
+        return [{"filename": fn} for fn in filenames]
+
+class LoadFile(Node):
+    def prep(self, shared):
+        # Access filename from params
+        filename = self.params["filename"]
+        return filename
+```
+
+### 3. Agent Pattern
+
+```python
+class DecideAction(Node):
+    def exec(self, inputs):
+        query, context = inputs
+        prompt = f"""
+Given input: {query}
+Previous search results: {context}
+Should I: 1) Search web for more info 2) Answer with current knowledge
+
+Output in yaml:
+```yaml
+action: search/answer
+reason: why this action
+search_term: search phrase if action is search
+```"""
+        resp = call_llm(prompt)
+        yaml_str = resp.split("```yaml")[1].split("```")[0]
+        action_data = yaml.safe_load(yaml_str)
+        return action_data
+
+# Build agent graph
+decide >> search_web
+decide - "answer" >> provide_answer
+search_web >> decide  # Loop back for more searches
+
+agent_flow = Flow(start=decide)
+```
+
+### 4. RAG (Retrieval Augmented Generation)
+
+**Stage 1: Offline Indexing**
+
+```python
+class ChunkDocs(BatchNode):
+    def prep(self, shared):
+        return shared["files"]
+
+    def exec(self, filepath):
+        with open(filepath, "r") as f:
+            text = f.read()
+        # Chunk by 100 chars
+        size = 100
+        return [text[i:i+size] for i in range(0, len(text), size)]
+
+    def post(self, shared, prep_res, exec_res_list):
+        shared["all_chunks"] = [c for chunks in exec_res_list
+                                for c in chunks]
+
+chunk_docs >> embed_docs >> build_index
+offline_flow = Flow(start=chunk_docs)
+```
+
+**Stage 2: Online Query**
+
+```python
+class RetrieveDocs(Node):
+    def exec(self, inputs):
+        q_emb, index, chunks = inputs
+        I, D = search_index(index, q_emb, top_k=1)
+        return chunks[I[0][0]]
+
+embed_query >> retrieve_docs >> generate_answer
+online_flow = Flow(start=embed_query)
+```
+
+### 5. Async & Parallel
+
+**AsyncNode** for I/O-bound operations:
+
+```python
+class SummarizeThenVerify(AsyncNode):
+    async def prep_async(self, shared):
+        doc_text = await read_file_async(shared["doc_path"])
+        return doc_text
+
+    async def exec_async(self, prep_res):
+        summary = await call_llm_async(f"Summarize: {prep_res}")
+        return summary
+
+    async def post_async(self, shared, prep_res, exec_res):
+        decision = await gather_user_feedback(exec_res)
+        if decision == "approve":
+            shared["summary"] = exec_res
+        return "default"
+
+# Must wrap in AsyncFlow
+node = SummarizeThenVerify()
+flow = AsyncFlow(start=node)
+await flow.run_async(shared)
+```
+
+**AsyncParallelBatchNode** - Process multiple items concurrently:
+
+```python
+class ParallelSummaries(AsyncParallelBatchNode):
+    async def prep_async(self, shared):
+        return shared["texts"]  # List of texts
+
+    async def exec_async(self, text):
+        # Runs in parallel for each text
+        return await call_llm_async(f"Summarize: {text}")
+
+    async def post_async(self, shared, prep_res, exec_res_list):
+        shared["summary"] = "\n\n".join(exec_res_list)
+        return "default"
+```
+
+### 6. Workflow (Task Decomposition)
+
+```python
+class GenerateOutline(Node):
+    def prep(self, shared):
+        return shared["topic"]
+
+    def exec(self, topic):
+        return call_llm(f"Create outline for: {topic}")
+
+    def post(self, shared, prep_res, exec_res):
+        shared["outline"] = exec_res
+
+class WriteSection(Node):
+    def exec(self, outline):
+        return call_llm(f"Write content: {outline}")
+
+    def post(self, shared, prep_res, exec_res):
+        shared["draft"] = exec_res
+
+class ReviewAndRefine(Node):
+    def exec(self, draft):
+        return call_llm(f"Review and improve: {draft}")
+
+# Chain the workflow
+outline >> write >> review
+workflow = Flow(start=outline)
+```
+
+### 7. Structured Output
+
+```python
+class SummarizeNode(Node):
+    def exec(self, prep_res):
+        prompt = f"""
+Summarize the following text as YAML, with exactly 3 bullet points
+
+{prep_res}
+
+Output:
+```yaml
+summary:
+  - bullet 1
+  - bullet 2
+  - bullet 3
+```"""
+        response = call_llm(prompt)
+        yaml_str = response.split("```yaml")[1].split("```")[0].strip()
+
+        import yaml
+        structured_result = yaml.safe_load(yaml_str)
+
+        # Validate
+        assert "summary" in structured_result
+        assert isinstance(structured_result["summary"], list)
+
+        return structured_result
+```
+
+**Why YAML?** Modern LLMs handle YAML better than JSON (less escaping issues).
+
+---
+
+## 🍳 Cookbook: Real-World Examples
+
+This skill includes **6 production-ready examples** from the official PocketFlow cookbook, plus a complete **Python project template**.
+
+**📂 Location:** `assets/examples/` and `assets/template/`
+
+### Example 1: Interactive Chat Bot (☆☆☆)
+
+**File:** `assets/examples/01_chat.py`
+
+A chat bot with conversation history that loops back to itself.
+
+```python
+# Key pattern: Self-looping node
+chat_node = ChatNode()
+chat_node - "continue" >> chat_node  # Loop for continuous chat
+flow = Flow(start=chat_node)
+```
+
+**What it demonstrates:**
+- Message history management
+- Self-looping nodes
+- Graceful exit handling
+- User input processing
+
+**Run it:** `python assets/examples/01_chat.py`
+
+---
+
+### Example 2: Article Writing Workflow (☆☆☆)
+
+**File:** `assets/examples/02_workflow.py`
+
+Multi-step content creation: outline → draft → refine.
+
+```python
+# Sequential pipeline
+outline >> draft >> refine
+flow = Flow(start=outline)
+```
+
+**What it demonstrates:**
+- Task decomposition
+- Sequential workflows
+- Progressive content generation
+
+**Run it:** `python assets/examples/02_workflow.py "AI Safety"`
+
+---
+
+### Example 3: Research Agent (☆☆☆)
+
+**File:** `assets/examples/03_agent.py`
+
+Agent that decides whether to search or answer.
+
+```python
+# Branching based on decision
+decide - "search" >> search
+decide - "answer" >> answer
+search - "continue" >> decide  # Loop back
+```
+
+**What it demonstrates:**
+- Dynamic action selection
+- Branching logic
+- Agent decision-making
+- Iterative research loops
+
+**Run it:** `python assets/examples/03_agent.py "Nobel Prize 2024"`
+
+---
+
+### Example 4: RAG System (☆☆☆)
+
+**File:** `assets/examples/04_rag.py`
+
+Complete two-stage RAG pipeline with offline indexing and online querying.
+
+```python
+# Stage 1: Offline indexing
+embed_docs >> build_index
+offline_flow = Flow(start=embed_docs)
+
+# Stage 2: Online query
+embed_query >> retrieve >> generate
+online_flow = Flow(start=embed_query)
+```
+
+**What it demonstrates:**
+- Document embedding and indexing
+- Similarity search
+- Context-based generation
+- Multi-stage pipelines
+
+**Run it:** `python assets/examples/04_rag.py --"How to install PocketFlow?"`
+
+---
+
+### Example 5: Structured Output Parser (☆☆☆)
+
+**File:** `assets/examples/05_structured_output.py`
+
+Resume parser extracting structured data with YAML.
+
+```python
+# Parse YAML from LLM response
+yaml_str = response.split("```yaml")[1].split("```")[0]
+structured_result = yaml.safe_load(yaml_str)
+
+# Validate structure
+assert "name" in structured_result
+assert "email" in structured_result
+```
+
+**What it demonstrates:**
+- Structured LLM responses with YAML
+- Schema validation
+- Retry logic for parsing
+- Data extraction patterns
+
+**Run it:** `python assets/examples/05_structured_output.py`
+
+---
+
+### Example 6: Multi-Agent Communication (★☆☆)
+
+**File:** `assets/examples/06_multi_agent.py`
+
+Two async agents playing Taboo word game.
+
+```python
+# Agents with message queues
+shared = {
+    "hinter_queue": asyncio.Queue(),
+    "guesser_queue": asyncio.Queue()
+}
+
+# Run concurrently
+await asyncio.gather(
+    hinter_flow.run_async(shared),
+    guesser_flow.run_async(shared)
+)
+```
+
+**What it demonstrates:**
+- AsyncNode for concurrent operations
+- Message queues for inter-agent communication
+- Multi-agent coordination
+- Game logic with termination
+
+**Run it:** `python assets/examples/06_multi_agent.py`
+
+---
+
+### Python Project Template
+
+**Location:** `assets/template/`
+
+Official best-practice template with complete project structure.
+
+**Files:**
+- `main.py` - Entry point
+- `flow.py` - Flow definition
+- `nodes.py` - Node implementations
+- `utils.py` - LLM wrappers
+- `requirements.txt` - Dependencies
+
+**Quick Start:**
+```bash
+cd assets/template/
+pip install -r requirements.txt
+# Edit utils.py to add your LLM API key
+python main.py
+```
+
+**What it demonstrates:**
+- Separation of concerns
+- Factory pattern for flows
+- Clean data flow with shared store
+- Configuration best practices
+
+---
+
+### Full Cookbook (47 Examples)
+
+The complete cookbook has **47 progressively complex examples** on GitHub:
+
+**Dummy Level (☆☆☆):**
+Chat, Workflow, Agent, RAG, Map-Reduce, Streaming, Structured Output, Guardrails
+
+**Beginner Level (★☆☆):**
+Multi-Agent, Supervisor, Parallel (3x/8x), Thinking (CoT), Memory, MCP, Tracing
+
+**Plus 30+ more advanced patterns:**
+FastAPI integration, Code generator, Text-to-SQL, Voice chat, PDF vision, Website chatbot, and more.
+
+**Browse all:** https://github.com/The-Pocket/PocketFlow/tree/main/cookbook
+
+**Complete guide:** See `assets/COOKBOOK_GUIDE.md` for full index and learning path.
+
+---
+
+## Design Patterns Summary
+
+| Pattern | Use Case | Key Component |
+|---------|----------|---------------|
+| **Agent** | Dynamic action selection | Action space + context management |
+| **Workflow** | Multi-step task decomposition | Chained nodes |
+| **RAG** | Context-aware answers | Offline indexing + online retrieval |
+| **Map Reduce** | Large input processing | BatchNode with aggregation |
+| **Multi-Agent** | Collaborative agents | Message queues + AsyncNode |
+| **Structured Output** | Typed LLM responses | YAML prompting + validation |
+
+## Communication Patterns
+
+### Shared Store (Primary)
+
+```python
+# Design data structure first
+shared = {
+    "user": {
+        "id": "user123",
+        "context": {
+            "weather": {"temp": 72, "condition": "sunny"},
+            "location": "San Francisco"
+        }
+    },
+    "results": {}
+}
+```
+
+**Best Practice:** Separate data schema from compute logic using shared store.
+
+### Params (For Batch Only)
+
+```python
+class SummarizeFile(Node):
+    def prep(self, shared):
+        # Access node's params
+        filename = self.params["filename"]
+        return shared["data"].get(filename, "")
+
+# Set params
+node = SummarizeFile()
+node.set_params({"filename": "report.txt"})
+```
+
+## Advanced Features
+
+### Fault Tolerance
+
+```python
+# Automatic retries
+my_node = SummarizeFile(max_retries=3, wait=10)
+
+# Graceful fallback
+class ResilientNode(Node):
+    def exec_fallback(self, prep_res, exc):
+        # Return fallback instead of crashing
+        return "There was an error processing your request."
+```
+
+### Nested Flows
+
+```python
+# Flows can act as nodes
+node_a >> node_b
+subflow = Flow(start=node_a)
+
+# Connect to other nodes
+subflow >> node_c
+
+# Create parent flow
+parent_flow = Flow(start=subflow)
+```
+
+### Multi-Agent Communication
+
+```python
+class AgentNode(AsyncNode):
+    async def prep_async(self, _):
+        message_queue = self.params["messages"]
+        message = await message_queue.get()
+        print(f"Agent received: {message}")
+        return message
+
+# Create self-loop for continuous listening
+agent = AgentNode()
+agent >> agent
+flow = AsyncFlow(start=agent)
+```
+
+## Utility Functions
+
+### LLM Wrappers
+
+```python
+# OpenAI
+def call_llm(prompt):
+    from openai import OpenAI
+    client = OpenAI(api_key="YOUR_API_KEY")
+    r = client.chat.completions.create(
+        model="gpt-4o",
+        messages=[{"role": "user", "content": prompt}]
+    )
+    return r.choices[0].message.content
+
+# Anthropic Claude
+def call_llm(prompt):
+    from anthropic import Anthropic
+    client = Anthropic(api_key="YOUR_API_KEY")
+    r = client.messages.create(
+        model="claude-sonnet-4-0",
+        messages=[{"role": "user", "content": prompt}]
+    )
+    return r.content[0].text
+
+# Google Gemini
+def call_llm(prompt):
+    from google import genai
+    client = genai.Client(api_key='GEMINI_API_KEY')
+    response = client.models.generate_content(
+        model='gemini-2.5-pro',
+        contents=prompt
+    )
+    return response.text
+```
+
+### Embeddings
+
+```python
+# OpenAI
+from openai import OpenAI
+client = OpenAI(api_key="YOUR_API_KEY")
+response = client.embeddings.create(
+    model="text-embedding-ada-002",
+    input=text
+)
+embedding = response.data[0].embedding
+```
+
+### Text Chunking
+
+```python
+# Fixed-size chunking
+def fixed_size_chunk(text, chunk_size=100):
+    return [text[i:i+chunk_size]
+            for i in range(0, len(text), chunk_size)]
+
+# Sentence-based chunking
+import nltk
+def sentence_based_chunk(text, max_sentences=2):
+    sentences = nltk.sent_tokenize(text)
+    return [" ".join(sentences[i:i+max_sentences])
+            for i in range(0, len(sentences), max_sentences)]
+```
+
+## Agentic Coding Guidelines
+
+**IMPORTANT for AI Agents building LLM systems:**
+
+1. **Start Simple** - Begin with the smallest solution first
+2. **Design First** - Create high-level design (docs/design.md) before implementation
+3. **Manual Testing** - Solve example inputs manually to develop intuition
+4. **Iterate Frequently** - Expect hundreds of iterations on Steps 3-6
+5. **Ask Humans** - Request feedback and clarification regularly
+
+### Recommended Project Structure
+
+```
+my_project/
+├── main.py
+├── nodes.py
+├── flow.py
+├── utils/
+│   ├── __init__.py
+│   ├── call_llm.py
+│   └── search_web.py
+├── requirements.txt
+└── docs/
+    └── design.md
+```
+
+### Development Workflow
+
+```mermaid
+flowchart LR
+    start[Start] --> batch[Batch]
+    batch --> check[Check]
+    check -->|OK| process
+    check -->|Error| fix[Fix]
+    fix --> check
+
+    subgraph process[Process]
+      step1[Step 1] --> step2[Step 2]
+    end
+
+    process --> endNode[End]
+```
+
+## Best Practices
+
+### Context Management (Agents)
+- **Relevant & Minimal** - Retrieve most relevant via RAG, not entire history
+- **Avoid "lost in the middle"** - LLMs overlook mid-prompt content even with large windows
+
+### Action Space Design (Agents)
+- **Unambiguous** - Avoid overlapping actions (e.g., one `read_database` instead of separate `read_databases` and `read_csvs`)
+- **Incremental** - Feed 500 lines or 1 page at a time, not all at once
+- **Overview-zoom-in** - Show structure first (TOC, summary), then details
+- **Parameterized** - Enable flexible actions with parameters (columns, SQL queries)
+- **Backtracking** - Allow undo instead of full restart
+
+### Error Handling
+- **No try/except in utilities** - Let Node retry mechanism handle failures
+- **Use exec_fallback()** - Provide graceful degradation instead of crashes
+
+### Performance Tips
+- **Batch APIs** - Use LLM batch inference for multiple prompts
+- **Rate Limiting** - Use semaphores to avoid API limits
+- **Parallel only for I/O** - Python GIL prevents true CPU parallelism
+- **Independent tasks** - Don't parallelize dependent operations
+
+## Reference Files
+
+This skill includes comprehensive documentation in `references/core_abstraction.md`:
+
+- Node - Basic building block with prep/exec/post
+- Flow - Orchestration and graph control
+- Communication - Shared store vs params
+- Batch - BatchNode and BatchFlow patterns
+- Async - AsyncNode for I/O-bound tasks
+- Parallel - AsyncParallelBatchNode/Flow
+- Agent - Dynamic action selection
+- Workflow - Task decomposition chains
+- RAG - Retrieval augmented generation
+- Map Reduce - Large input processing
+- Structured Output - YAML-based schemas
+- Multi-Agents - Inter-agent communication
+- LLM Wrappers - OpenAI, Anthropic, Google, Azure
+- Embeddings - Text embedding APIs
+- Vector Databases - FAISS, Pinecone, Qdrant, etc.
+- Web Search - Google, Bing, DuckDuckGo, Brave
+- Text Chunking - Fixed-size and sentence-based
+- Text-to-Speech - AWS Polly, Google Cloud, Azure, IBM
+- Visualization - Mermaid diagrams and call stacks
+- Agentic Coding - Development workflow guidance
+
+## Navigation Guide
+
+### For Beginners
+1. Start with **Node** and **Flow** basics
+2. Learn **Communication** (shared store)
+3. Try simple **Workflow** example
+4. Read **Agentic Coding** guidelines
+
+### For Specific Use Cases
+- **Document processing** → Batch + Map Reduce
+- **Question answering** → RAG
+- **Dynamic task planning** → Agent
+- **Multi-step pipelines** → Workflow
+- **Real-time systems** → Async + Parallel
+- **Collaborative AI** → Multi-Agents
+
+### For Advanced Users
+- Nested flows for complex pipelines
+- Custom fault tolerance with exec_fallback
+- Parallel processing with rate limiting
+- Multi-agent communication patterns
+- Custom visualization and debugging tools
+
+## Common Pitfalls
+
+❌ **Don't** use Multi-Agents unless necessary - Start simple!
+❌ **Don't** parallelize dependent operations
+❌ **Don't** add try/except in utility functions called from exec()
+❌ **Don't** use node.run() in production - Always use flow.run()
+❌ **Don't** modify shared store in exec() - Use prep() and post()
+
+✅ **Do** design data schema before implementation
+✅ **Do** use shared store for data, params for identifiers
+✅ **Do** leverage built-in retry mechanisms
+✅ **Do** validate structured output with assertions
+✅ **Do** start with simplest solution and iterate
+
+## Resources
+
+**Official Docs:** https://the-pocket.github.io/PocketFlow/
+
+**Framework Philosophy:**
+- Minimalist (100 lines of core code)
+- No vendor lock-in (implement your own utilities)
+- Separation of concerns (graph + shared store)
+- Graph-based workflow modeling
+
+---
+
+*This skill was generated from PocketFlow official documentation. For detailed examples and complete API reference, see `references/core_abstraction.md`.*