# Cloudflare Workers Integration

**Status**: Experimental Support

The OpenAI Agents SDK has experimental support for Cloudflare Workers: some features work, while others are limited or unsupported.

---

## Compatibility

### What Works ✅
- Text agents (`Agent`, `run()`)
- Basic tool calling (see the sketch after these lists)
- Structured outputs with Zod
- Streaming responses (with caveats)
- Environment variable access

### What Doesn't Work ❌
- Realtime voice agents (WebRTC is not supported in Workers)
- Some Node.js APIs (timers, crypto edge cases)
- Long-running operations (CPU time limits)

### What's Experimental ⚠️
- Multi-agent handoffs (work, but untested at scale)
- Large context windows (may hit memory limits)
- Complex tool executions (CPU time limits)
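
To make the ✅ items concrete, here is a minimal sketch of a tool-calling agent with a Zod-typed structured output. It assumes the `tool` helper and the `outputType` agent option from `@openai/agents`; the weather tool and all its names are illustrative, not part of this reference.

```typescript
import { Agent, run, tool } from '@openai/agents';
import { z } from 'zod';

// Hypothetical tool: name, schema, and implementation are illustrative
const getWeather = tool({
  name: 'get_weather',
  description: 'Return the weather for a city',
  parameters: z.object({ city: z.string() }),
  execute: async ({ city }) => `Sunny in ${city}`,
});

const agent = new Agent({
  name: 'WeatherAgent',
  instructions: 'Answer weather questions using the tool.',
  tools: [getWeather],
  // Structured output: finalOutput is parsed against this schema
  outputType: z.object({ city: z.string(), forecast: z.string() }),
});

const result = await run(agent, 'Weather in Lisbon?');
// result.finalOutput is typed per outputType: { city, forecast }
```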

---

## Setup

### 1. Install Dependencies

```bash
npm install @openai/agents zod hono
```

### 2. Configure wrangler.jsonc

```jsonc
{
  "name": "openai-agents-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-10-26",
  "compatibility_flags": ["nodejs_compat"], // Required for the OpenAI SDK

  "observability": {
    "enabled": true
  },

  "limits": {
    "cpu_ms": 30000 // Paid plans only; adjust based on agent complexity
  }
}
```

Note: the legacy `node_compat` field cannot be combined with the `nodejs_compat` compatibility flag; use the flag only.

### 3. Set Environment Variable

```bash
# Set OPENAI_API_KEY secret
wrangler secret put OPENAI_API_KEY

# Enter your OpenAI API key when prompted
```
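
For local development with `wrangler dev`, put `OPENAI_API_KEY` in a `.dev.vars` file instead; `wrangler secret` only affects deployed Workers.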

---

## Basic Worker Example

```typescript
import { Agent, run } from '@openai/agents';

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== 'POST') {
      return new Response('Method not allowed', { status: 405 });
    }

    try {
      const { message } = await request.json() as { message: string };

      // Make the key visible to the SDK (requires nodejs_compat)
      process.env.OPENAI_API_KEY = env.OPENAI_API_KEY;

      const agent = new Agent({
        name: 'Assistant',
        instructions: 'You are helpful.',
        model: 'gpt-4o-mini', // Use smaller models for faster responses
      });

      const result = await run(agent, message, {
        maxTurns: 5, // Limit turns to control execution time
      });

      return new Response(JSON.stringify({
        response: result.finalOutput,
        tokens: result.usage.totalTokens,
      }), {
        headers: { 'Content-Type': 'application/json' },
      });

    } catch (error) {
      // Narrow the unknown error before reading its message
      const msg = error instanceof Error ? error.message : String(error);
      return new Response(JSON.stringify({ error: msg }), {
        status: 500,
        headers: { 'Content-Type': 'application/json' },
      });
    }
  },
};

interface Env {
  OPENAI_API_KEY: string;
}
```
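
A quick smoke test against the deployed Worker (the URL is a placeholder for your own `workers.dev` subdomain):

```typescript
// Hypothetical client-side check; adjust the URL to your deployment
const res = await fetch('https://openai-agents-worker.example.workers.dev', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message: 'Hello!' }),
});
console.log(await res.json()); // { response: "...", tokens: ... }
```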

---

## Hono Integration

```typescript
import { Hono } from 'hono';
import { Agent, run } from '@openai/agents';

const app = new Hono<{ Bindings: { OPENAI_API_KEY: string } }>();

app.post('/api/agent', async (c) => {
  const { message } = await c.req.json<{ message: string }>();

  process.env.OPENAI_API_KEY = c.env.OPENAI_API_KEY;

  const agent = new Agent({
    name: 'Assistant',
    instructions: 'You are helpful.',
  });

  const result = await run(agent, message);

  return c.json({
    response: result.finalOutput,
  });
});

export default app;
```

**See Template**: `templates/cloudflare-workers/worker-agent-hono.ts`

---

## Streaming Responses

Streaming works, but the SDK's event stream has to be bridged to a `Response` by hand:

```typescript
const stream = await run(agent, message, { stream: true });

const { readable, writable } = new TransformStream();
const writer = writable.getWriter();
const encoder = new TextEncoder();

// Stream in background
(async () => {
  try {
    for await (const event of stream) {
      if (event.type === 'raw_model_stream_event') {
        const chunk = event.data?.choices?.[0]?.delta?.content || '';
        if (chunk) {
          await writer.write(encoder.encode(`data: ${chunk}\n\n`));
        }
      }
    }
    await stream.completed;
  } finally {
    await writer.close();
  }
})();

return new Response(readable, {
  headers: {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
  },
});
```
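
On the client, the resulting SSE body can be read with `fetch` and a stream reader; this browser-side sketch assumes an endpoint wired to the handler above:

```typescript
// Hypothetical consumer of the text/event-stream response
const res = await fetch('/api/agent-stream', {
  method: 'POST',
  body: JSON.stringify({ message: 'Hello!' }),
});
const reader = res.body!.getReader();
const decoder = new TextDecoder();
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each chunk contains one or more "data: ..." SSE lines
  console.log(decoder.decode(value, { stream: true }));
}
```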

---

## Known Limitations

### 1. CPU Time Limits
Workers cap CPU time per request: 10 ms on the free plan, and 30 s by default on paid plans (adjustable via `limits.cpu_ms`).

**Solution**: Use smaller models and limit `maxTurns`:

```typescript
// Note: the model is set on the Agent, not in run() options
const agent = new Agent({ name: 'Assistant', model: 'gpt-4o-mini' }); // Faster model
const result = await run(agent, message, {
  maxTurns: 3, // Limit turns
});
```

### 2. Memory Limits
Large context windows may hit the 128 MB Worker memory limit.

**Solution**: Keep conversations concise and summarize history:

```typescript
const agent = new Agent({
  name: 'Assistant',
  instructions: 'Keep responses concise. Summarize context when needed.',
});
```
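
Beyond prompt instructions, history can be capped before each run. A sketch, assuming `history` holds input items carried over from a prior run's result (the exact item shape follows the SDK's input types):

```typescript
// Keep only the most recent items to bound memory and token usage
const MAX_ITEMS = 20;
const trimmed = history.slice(-MAX_ITEMS);
const result = await run(agent, [
  ...trimmed,
  { role: 'user', content: message },
]);
```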

### 3. No Realtime Voice
WebRTC is not supported in the Workers runtime.

**Solution**: Use realtime agents in Next.js or other Node.js environments.

### 4. Cold Starts
The first request after a period of inactivity may be slow.

**Solution**: Use warm-up requests or keep Workers warm with cron triggers, as sketched below.
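
A minimal sketch of the cron approach, assuming a `"triggers": { "crons": ["*/5 * * * *"] }` entry in wrangler.jsonc (Cloudflare does not guarantee isolates stay warm, so treat this as best-effort):

```typescript
// The scheduled handler runs on the cron schedule alongside fetch
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // ... normal agent handling ...
    return new Response('ok');
  },
  async scheduled(controller: ScheduledController, env: Env, ctx: ExecutionContext): Promise<void> {
    // No-op invocation; its only purpose is to exercise the Worker periodically
  },
};
```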

---

## Performance Tips

### 1. Use Smaller Models
```typescript
model: 'gpt-4o-mini' // Faster than gpt-4o
```

### 2. Limit Turns
```typescript
maxTurns: 3 // Prevent long-running loops
```

### 3. Stream Responses
```typescript
stream: true // Start returning data faster
```

### 4. Cache Results
```typescript
// Cache frequent queries in KV (assumes a KV namespace bound as `KV`)
const cached = await env.KV.get(cacheKey);
if (cached) return new Response(cached);

const result = await run(agent, message);
// KV stores strings, so serialize the output
await env.KV.put(cacheKey, String(result.finalOutput), { expirationTtl: 3600 });
```
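
The snippet assumes a `cacheKey`; one way to derive it from the message with Web Crypto (available in Workers):

```typescript
// SHA-256 of the message, hex-encoded, as a stable cache key
const digest = await crypto.subtle.digest(
  'SHA-256',
  new TextEncoder().encode(message),
);
const cacheKey = [...new Uint8Array(digest)]
  .map((b) => b.toString(16).padStart(2, '0'))
  .join('');
```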

### 5. Use Durable Objects for State
```typescript
// Store conversation state in a Durable Object for long conversations
export class AgentSession {
  constructor(private state: DurableObjectState) {}

  async fetch(request: Request): Promise<Response> {
    // Load history from this.state.storage, run the agent,
    // persist the updated history, then return the reply
    return new Response('TODO');
  }
}
```
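
Requests reach a session through a Durable Object binding; assuming one bound as `AGENT_SESSION` and a per-conversation ID:

```typescript
// One Durable Object instance per conversation
const id = env.AGENT_SESSION.idFromName(conversationId);
const session = env.AGENT_SESSION.get(id);
return session.fetch(request);
```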

---

## Deployment

```bash
# Test locally
wrangler dev

# Build and deploy
npm run build
wrangler deploy
```

---

## Cost Considerations

**Workers Costs**:
- Requests: $0.15 per million (after 100k free/day)
- CPU Time: $0.02 per million CPU-ms (after 10ms free per request)

**OpenAI Costs**:
- GPT-4o-mini: $0.15 / 1M input tokens, $0.60 / 1M output tokens
- GPT-4o: $2.50 / 1M input tokens, $10.00 / 1M output tokens

**Example**: 1M agent requests (avg 500 input tokens each)
- Workers: ~$1.50
- GPT-4o-mini: ~$75
- **Total**: ~$76.50
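
A quick check of the arithmetic above, at the listed rates:

```typescript
// Back-of-envelope verification of the example (rates as listed above)
const requests = 1_000_000;
const inputTokens = requests * 500;            // 500 input tokens per request
const openaiCost = (inputTokens / 1e6) * 0.15; // = $75 (gpt-4o-mini input rate)
const requestCost = (requests / 1e6) * 0.15;   // = $0.15
// The remaining ~$1.35 of the Workers estimate is billable CPU time
```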

**Use gpt-4o-mini for cost efficiency!**

---

## Monitoring

```typescript
// Log execution time
const start = Date.now();
const result = await run(agent, message);
const duration = Date.now() - start;

console.log(`Agent execution: ${duration}ms`);
console.log(`Tokens used: ${result.usage.totalTokens}`);
```

Enable Workers observability in wrangler.jsonc:

```jsonc
"observability": {
  "enabled": true,
  "head_sampling_rate": 0.1
}
```

---

## Error Handling

```typescript
try {
  const result = await run(agent, message, {
    maxTurns: 5,
  });
  return result;

} catch (error) {
  // Narrow the unknown error before inspecting its message
  const msg = error instanceof Error ? error.message : String(error);

  if (msg.includes('CPU time limit')) {
    // Hit Workers CPU limit - reduce complexity
    return { error: 'Request too complex' };
  }

  if (msg.includes('memory')) {
    // Hit memory limit - reduce context
    return { error: 'Context too large' };
  }

  throw error;
}
```

---

## Alternatives

If Workers limitations are problematic:

1. **Cloudflare Pages Functions** (same runtime, so the same limits apply)
2. **Next.js on Vercel** (better Node.js support)
3. **Node.js on Railway/Render** (full Node.js environment)
4. **AWS Lambda** (longer timeouts, more memory)

---

**Last Updated**: 2025-10-26
**Status**: Experimental - test thoroughly before production use