Initial commit

Zhongwei Li
2025-11-30 08:25:17 +08:00
commit 07f3f3c71c
22 changed files with 5007 additions and 0 deletions


@@ -0,0 +1,126 @@
# Built-in Tools Guide
**Last Updated**: 2025-10-25
Comprehensive guide to using Responses API built-in tools.
---
## Available Tools
| Tool | Purpose | Use Case |
|------|---------|----------|
| **Code Interpreter** | Execute Python code | Data analysis, calculations, charts |
| **File Search** | RAG without vector stores | Search uploaded files |
| **Web Search** | Real-time web info | Current events, fact-checking |
| **Image Generation** | DALL-E integration | Create images from descriptions |
| **MCP** | Connect external tools | Stripe, databases, custom APIs |
---
## Code Interpreter
**Execute Python code server-side:**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Calculate mean, median, mode of: 10, 20, 30, 40, 50',
tools: [{ type: 'code_interpreter' }],
});
```
**Features:**
- Sandboxed Python environment
- Automatic chart generation
- File processing support
- Timeout: 30s (use `background: true` for longer; see the sketch below)
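For tasks likely to exceed the limit, a hedged sketch of background mode with polling; it assumes an initialized `openai` client as in the example above, and the 5-second interval is arbitrary:
```typescript
// Start a long-running analysis in background mode
const job = await openai.responses.create({
  model: 'gpt-5',
  input: 'Run a long statistical analysis on the uploaded dataset',
  background: true, // lifts the 30s limit for long tasks
  tools: [{ type: 'code_interpreter' }],
});

// Poll until the background run finishes
let result = await openai.responses.retrieve(job.id);
while (result.status === 'in_progress') {
  await new Promise(r => setTimeout(r, 5000));
  result = await openai.responses.retrieve(job.id);
}
console.log(result.output_text);
```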
---
## File Search
**RAG without building vector stores:**
```typescript
// 1. Upload file
const file = await openai.files.create({
file: fs.createReadStream('./document.pdf'),
purpose: 'assistants',
});
// 2. Search
const response = await openai.responses.create({
model: 'gpt-5',
input: 'What does the document say about pricing?',
tools: [{ type: 'file_search', file_ids: [file.id] }],
});
```
**Supported formats:**
- PDFs, Word docs, text files
- Markdown, HTML, code files
- Max: 512MB per file
---
## Web Search
**Real-time web information:**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
  input: 'What is the latest AI news?',
tools: [{ type: 'web_search' }],
});
```
**Features:**
- No cutoff date limitations
- Automatic source citations
- Real-time data access
---
## Image Generation
**DALL-E integration:**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Create an image of a futuristic cityscape at sunset',
tools: [{ type: 'image_generation' }],
});
// Find image in output
response.output.forEach(item => {
if (item.type === 'image_generation_call') {
console.log('Image URL:', item.output.url);
}
});
```
**Models:** DALL-E 3 (default)
---
## Combining Tools
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Find current Bitcoin price and calculate what $1000 would be worth',
tools: [
{ type: 'web_search' }, // Get price
{ type: 'code_interpreter' }, // Calculate
],
});
```
The model automatically uses the right tool for each subtask.
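Continuing from the example above, you can inspect the polymorphic `output` array to see which tool handled each step; the `*_call` type names below are extrapolated from the image-generation and file-search examples in these guides, so treat them as illustrative:
```typescript
// Inspect the polymorphic output to see which tools ran.
// The `*_call` type names are illustrative extrapolations.
response.output.forEach(item => {
  switch (item.type) {
    case 'web_search_call':
      console.log('Web search step completed');
      break;
    case 'code_interpreter_call':
      console.log('Calculation step completed');
      break;
    case 'message':
      console.log('Final answer:', item.content[0].text);
      break;
  }
});
```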
---
**Official Docs**: https://platform.openai.com/docs/guides/responses


@@ -0,0 +1,133 @@
# MCP Integration Guide
**Last Updated**: 2025-10-25
Guide for integrating external tools using Model Context Protocol (MCP).
---
## What Is MCP?
MCP (Model Context Protocol) is an open protocol that standardizes how applications provide context to LLMs. It allows connecting external tools like Stripe, databases, and custom APIs.
**Key Benefits:**
- ✅ Built into Responses API (no separate setup)
- ✅ Automatic tool discovery
- ✅ OAuth authentication support
- ✅ No additional cost (billed as output tokens)
---
## Basic MCP Integration
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Roll 2d6 dice',
tools: [
{
type: 'mcp',
server_label: 'dice',
server_url: 'https://dmcp.example.com',
},
],
});
```
---
## Authentication
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Create payment link',
tools: [
{
type: 'mcp',
server_label: 'stripe',
server_url: 'https://mcp.stripe.com',
authorization: process.env.STRIPE_OAUTH_TOKEN, // ✅
},
],
});
```
**Important:** The API does NOT store tokens; supply a valid token with every request.
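One pattern is to resolve a fresh token immediately before each request. `getFreshStripeToken` below is a hypothetical helper you would implement against your OAuth provider; the sketch assumes an initialized `openai` client as in the examples above:
```typescript
// Hypothetical helper: implement against your OAuth provider
// (e.g. refresh when near expiry, otherwise return a cached token).
async function getFreshStripeToken(): Promise<string> {
  return process.env.STRIPE_OAUTH_TOKEN!;
}

async function askStripe(input: string) {
  return openai.responses.create({
    model: 'gpt-5',
    input,
    tools: [
      {
        type: 'mcp',
        server_label: 'stripe',
        server_url: 'https://mcp.stripe.com',
        authorization: await getFreshStripeToken(), // supplied on every request
      },
    ],
  });
}
```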
---
## Popular MCP Servers
- **Stripe**: https://mcp.stripe.com
- **Database MCP**: Custom servers for PostgreSQL, MySQL, MongoDB
- **Custom APIs**: Build your own MCP server
---
## Building Custom MCP Server
An MCP server must implement two endpoints:
### 1. List Tools Endpoint
```typescript
// POST /mcp/list_tools → response body
const listToolsResponse = {
  tools: [
    {
      name: 'get_weather',
      description: 'Get weather for a city',
      input_schema: {
        type: 'object',
        properties: {
          city: { type: 'string' },
        },
        required: ['city'],
      },
    },
  ],
};
```
### 2. Call Tool Endpoint
```typescript
// POST /mcp/call_tool → request body
const request = {
  name: 'get_weather',
  arguments: { city: 'San Francisco' },
};

// → response body
const response = {
  result: {
    temperature: 72,
    condition: 'sunny',
  },
};
```
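A minimal sketch of such a server using Express, following the endpoint shapes above; the port, handler logic, and weather data are illustrative, and a production MCP server would add validation and authentication:
```typescript
import express from 'express';

const app = express();
app.use(express.json());

// Tool catalog, matching the list_tools response shape above
const tools = [
  {
    name: 'get_weather',
    description: 'Get weather for a city',
    input_schema: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
];

app.post('/mcp/list_tools', (_req, res) => {
  res.json({ tools });
});

app.post('/mcp/call_tool', (req, res) => {
  const { name, arguments: args } = req.body;
  if (name !== 'get_weather') {
    res.status(404).json({ error: `Unknown tool: ${name}` });
    return;
  }
  // Illustrative static data; replace with a real weather lookup
  res.json({ result: { temperature: 72, condition: 'sunny', city: args.city } });
});

app.listen(3000, () => console.log('MCP server listening on :3000'));
```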
---
## Error Handling
```typescript
try {
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Use tool',
tools: [{ type: 'mcp', server_url: '...', authorization: '...' }],
});
} catch (error: any) {
if (error.type === 'mcp_connection_error') {
console.error('Server connection failed');
}
if (error.type === 'mcp_authentication_error') {
console.error('Invalid token');
}
}
```
---
**Official MCP Docs**: https://platform.openai.com/docs/guides/tools-connectors-mcp


@@ -0,0 +1,236 @@
# Migration Guide: Chat Completions → Responses API
**Last Updated**: 2025-10-25
Quick guide for migrating from Chat Completions to Responses API.
---
## Breaking Changes Summary
| Aspect | Chat Completions | Responses API | Migration |
|--------|-----------------|---------------|-----------|
| **Endpoint** | `/v1/chat/completions` | `/v1/responses` | Update URL |
| **Parameter** | `messages` | `input` | Rename |
| **Role** | `system` | `developer` | Update role name |
| **Output** | `choices[0].message.content` | `output_text` | Update accessor |
| **State** | Manual (messages array) | Automatic (conversation ID) | Use conversations |
| **Tools** | `tools` array with functions | Built-in types + MCP | Update tool definitions |
---
## Step-by-Step Migration
### Step 1: Update Endpoint
**Before:**
```typescript
const response = await openai.chat.completions.create({...});
```
**After:**
```typescript
const response = await openai.responses.create({...});
```
### Step 2: Rename `messages` to `input`
**Before:**
```typescript
{
messages: [
{ role: 'system', content: '...' },
{ role: 'user', content: '...' }
]
}
```
**After:**
```typescript
{
input: [
{ role: 'developer', content: '...' },
{ role: 'user', content: '...' }
]
}
```
### Step 3: Update Response Access
**Before:**
```typescript
const text = response.choices[0].message.content;
```
**After:**
```typescript
const text = response.output_text;
```
### Step 4: Use Conversation IDs (Optional but Recommended)
**Before (Manual History):**
```typescript
let messages = [...previousMessages, newMessage];
const response = await openai.chat.completions.create({
model: 'gpt-5',
messages,
});
```
**After (Automatic):**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Automatic state
input: newMessage,
});
```
---
## Complete Example
**Before (Chat Completions):**
```typescript
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
let messages = [
{ role: 'system', content: 'You are a helpful assistant.' },
];
async function chat(userMessage: string) {
messages.push({ role: 'user', content: userMessage });
const response = await openai.chat.completions.create({
model: 'gpt-5',
messages,
});
const assistantMessage = response.choices[0].message;
messages.push(assistantMessage);
return assistantMessage.content;
}
// Usage
await chat('Hello');
await chat('Tell me a joke');
```
**After (Responses):**
```typescript
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const conversation = await openai.conversations.create({
items: [
{ type: 'message', role: 'developer', content: 'You are a helpful assistant.' },
],
});
async function chat(userMessage: string) {
const response = await openai.responses.create({
model: 'gpt-5',
conversation: conversation.id,
input: userMessage,
});
return response.output_text;
}
// Usage
await chat('Hello');
await chat('Tell me a joke'); // Remembers previous turn automatically
```
---
## Tool Migration
### Chat Completions Functions → Responses Built-in Tools
**Before (Custom Function):**
```typescript
{
tools: [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather',
parameters: { /* schema */ }
}
}
]
}
```
**After (Built-in or MCP):**
```typescript
{
tools: [
{ type: 'web_search' }, // Built-in
{ type: 'code_interpreter' }, // Built-in
{
type: 'mcp', // External tools
server_label: 'weather',
server_url: 'https://weather-mcp.example.com'
}
]
}
```
---
## Streaming Migration
**Before:**
```typescript
const stream = await openai.chat.completions.create({
model: 'gpt-5',
messages,
stream: true,
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```
**After:**
```typescript
const stream = await openai.responses.create({
model: 'gpt-5',
input,
stream: true,
});
for await (const chunk of stream) {
// Handle polymorphic outputs
if (chunk.type === 'message_delta') {
process.stdout.write(chunk.content || '');
}
}
```
---
## Testing Checklist
- [ ] Update all endpoint calls
- [ ] Rename `messages` to `input`
- [ ] Update `system` role to `developer`
- [ ] Update response access (`choices[0]` → `output_text`)
- [ ] Implement conversation management
- [ ] Update tool definitions
- [ ] Test multi-turn conversations (see the smoke-test sketch below)
- [ ] Verify streaming works
- [ ] Check cost tracking (tool tokens)
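As a quick way to exercise several checklist items at once, a hedged smoke-test sketch; the model name and the "teal" assertion are illustrative:
```typescript
// Minimal migration smoke test: one conversation, two turns.
// Assumes OPENAI_API_KEY is set in the environment.
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function smokeTest() {
  const conv = await openai.conversations.create();

  const turn1 = await openai.responses.create({
    model: 'gpt-5',
    conversation: conv.id,
    input: 'My favorite color is teal. Reply with OK.',
  });
  console.log('Turn 1:', turn1.output_text);

  const turn2 = await openai.responses.create({
    model: 'gpt-5',
    conversation: conv.id,
    input: 'What is my favorite color?',
  });
  // If state is wired up correctly, the answer should mention "teal"
  if (!turn2.output_text.toLowerCase().includes('teal')) {
    throw new Error('Multi-turn state not preserved');
  }
  console.log('Multi-turn smoke test passed');
}

smokeTest().catch(console.error);
```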
---
**Official Docs**: https://platform.openai.com/docs/guides/responses


@@ -0,0 +1,72 @@
# Reasoning Preservation Guide
**Last Updated**: 2025-10-25
Understanding how Responses API preserves reasoning across turns.
---
## What Is Reasoning Preservation?
Unlike Chat Completions, which discards reasoning between turns, the Responses API preserves the model's internal thought process.
**Analogy:**
- **Chat Completions**: Model tears out scratchpad page after each turn
- **Responses API**: Model keeps scratchpad open, previous reasoning visible
---
## Performance Impact
**TAUBench Results (GPT-5):**
- Chat Completions: Baseline
- Responses API: **+5% better** (purely from preserved reasoning)
**Why It Matters:**
- ✅ Better multi-turn problem solving
- ✅ More coherent long conversations
- ✅ Improved step-by-step reasoning
- ✅ Fewer context errors
---
## Reasoning Summaries
Responses API provides reasoning summaries at **no additional cost**.
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Solve this complex math problem',
});
// Inspect reasoning
response.output.forEach(item => {
if (item.type === 'reasoning') {
console.log('Model thinking:', item.summary[0].text);
}
if (item.type === 'message') {
console.log('Final answer:', item.content[0].text);
}
});
```
---
## Use Cases
**Debugging:**
- See how model arrived at answer
- Identify reasoning errors
**Auditing:**
- Track decision-making process
- Compliance requirements
**Transparency:**
- Show users why AI made decision
- Build trust in AI systems
---
**Official Docs**: https://developers.openai.com/blog/responses-api/


@@ -0,0 +1,492 @@
# Responses API vs Chat Completions: Complete Comparison
**Last Updated**: 2025-10-25
This document provides a comprehensive comparison between the Responses API and Chat Completions API to help you choose the right one for your use case.
---
## Quick Decision Guide
### ✅ Use Responses API When:
- Building **agentic applications** (reasoning + actions)
- Need **multi-turn conversations** with automatic state management
- Using **built-in tools** (Code Interpreter, File Search, Web Search, Image Gen)
- Connecting to **MCP servers** for external integrations
- Want **preserved reasoning** for better multi-turn performance
- Implementing **background processing** for long tasks
- Need **polymorphic outputs** for debugging/auditing
### ✅ Use Chat Completions When:
- Simple **one-off text generation**
- Fully **stateless** interactions (no conversation continuity needed)
- **Legacy integrations** with existing Chat Completions code
- Very **simple use cases** without tools
---
## Feature Comparison Matrix
| Feature | Chat Completions | Responses API | Winner |
|---------|-----------------|---------------|---------|
| **State Management** | Manual (you track history) | Automatic (conversation IDs) | Responses ✅ |
| **Reasoning Preservation** | Dropped between turns | Preserved across turns | Responses ✅ |
| **Tools Execution** | Client-side round trips | Server-side hosted | Responses ✅ |
| **Output Format** | Single message | Polymorphic (messages, reasoning, tool calls) | Responses ✅ |
| **Cache Utilization** | Baseline | 40-80% better | Responses ✅ |
| **MCP Support** | Manual integration required | Built-in | Responses ✅ |
| **Performance (GPT-5)** | Baseline | +5% on TAUBench | Responses ✅ |
| **Simplicity** | Simpler for one-offs | More features = more complexity | Chat Completions ✅ |
| **Legacy Compatibility** | Mature, stable | New (March 2025) | Chat Completions ✅ |
---
## API Comparison
### Endpoints
**Chat Completions:**
```
POST /v1/chat/completions
```
**Responses:**
```
POST /v1/responses
```
---
### Request Structure
**Chat Completions:**
```typescript
{
model: 'gpt-5',
messages: [
{ role: 'system', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
],
temperature: 0.7,
max_tokens: 1000,
}
```
**Responses:**
```typescript
{
model: 'gpt-5',
input: [
{ role: 'developer', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
],
conversation: 'conv_abc123', // Optional: automatic state
temperature: 0.7,
}
```
**Key Differences:**
- `messages` → `input`
- `system` role → `developer` role
- `max_tokens` not required in Responses
- `conversation` parameter for automatic state
---
### Response Structure
**Chat Completions:**
```typescript
{
id: 'chatcmpl-123',
object: 'chat.completion',
created: 1677652288,
model: 'gpt-5',
choices: [
{
index: 0,
message: {
role: 'assistant',
content: 'Hello! How can I help?',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 10,
completion_tokens: 5,
total_tokens: 15,
},
}
```
**Responses:**
```typescript
{
id: 'resp_123',
object: 'response',
created: 1677652288,
model: 'gpt-5',
output: [
{
type: 'reasoning',
summary: [{ type: 'summary_text', text: 'User greeting, respond friendly' }],
},
{
type: 'message',
role: 'assistant',
content: [{ type: 'output_text', text: 'Hello! How can I help?' }],
},
],
output_text: 'Hello! How can I help?', // Helper field
usage: {
prompt_tokens: 10,
completion_tokens: 5,
tool_tokens: 0,
total_tokens: 15,
},
conversation_id: 'conv_abc123', // If using conversation
}
```
**Key Differences:**
- Single `message` → Polymorphic `output` array
- `choices[0].message.content` → `output_text` helper
- Additional output types: `reasoning`, `tool_calls`, etc.
- `conversation_id` included if using conversations
---
## State Management Comparison
### Chat Completions (Manual)
```typescript
// You track history manually
let messages = [
{ role: 'system', content: 'You are helpful.' },
{ role: 'user', content: 'What is AI?' },
];
const response1 = await openai.chat.completions.create({
model: 'gpt-5',
messages,
});
// Add response to history
messages.push({
role: 'assistant',
content: response1.choices[0].message.content,
});
// Next turn
messages.push({ role: 'user', content: 'Tell me more' });
const response2 = await openai.chat.completions.create({
model: 'gpt-5',
messages, // ✅ You must pass full history
});
```
**Pros:**
- Full control over history
- Can prune old messages
- Simple for one-off requests
**Cons:**
- Manual tracking error-prone
- Must handle history yourself
- No automatic caching benefits
### Responses (Automatic)
```typescript
// Create conversation once
const conv = await openai.conversations.create();
const response1 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Automatic state
input: 'What is AI?',
});
// Next turn - no manual history tracking
const response2 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Remembers previous turn
input: 'Tell me more',
});
```
**Pros:**
- Automatic state management
- No manual history tracking
- Better cache utilization (40-80%)
- Reasoning preserved
**Cons:**
- Less direct control
- Must create conversation first
- Conversations expire after 90 days
---
## Reasoning Preservation
### Chat Completions
**What Happens:**
1. Model generates internal reasoning (scratchpad)
2. Reasoning used to produce response
3. **Reasoning discarded** before returning
4. Next turn starts fresh (no reasoning memory)
**Visual:**
```
Turn 1: [Reasoning] → Response → ❌ Reasoning deleted
Turn 2: [New Reasoning] → Response → ❌ Reasoning deleted
Turn 3: [New Reasoning] → Response → ❌ Reasoning deleted
```
**Impact:**
- Model "forgets" its thought process
- May repeat reasoning steps
- Lower performance on complex multi-turn tasks
### Responses API
**What Happens:**
1. Model generates internal reasoning
2. Reasoning used to produce response
3. **Reasoning preserved** in conversation state
4. Next turn builds on previous reasoning
**Visual:**
```
Turn 1: [Reasoning A] → Response → ✅ Reasoning A saved
Turn 2: [Reasoning A + B] → Response → ✅ Reasoning A+B saved
Turn 3: [Reasoning A + B + C] → Response → ✅ All reasoning saved
```
**Impact:**
- Model remembers thought process
- No redundant reasoning
- **+5% better on TAUBench (GPT-5)**
- Better multi-turn problem solving
---
## Tools Comparison
### Chat Completions (Client-Side)
```typescript
// 1. Define the function and ask the question
const messages = [{ role: 'user', content: 'What is the weather?' }];
const response1 = await openai.chat.completions.create({
  model: 'gpt-5',
  messages,
tools: [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather',
parameters: {
type: 'object',
properties: {
location: { type: 'string' },
},
},
},
},
],
});
// 2. Check if tool called
const toolCall = response1.choices[0].message.tool_calls?.[0];
if (!toolCall) throw new Error('Model did not call a tool');
// 3. Execute tool on your server (arguments arrive as a JSON string)
const weatherData = await getWeather(JSON.parse(toolCall.function.arguments));
// 4. Send result back
const response2 = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
...messages,
response1.choices[0].message,
{
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(weatherData),
},
],
});
```
**Pros:**
- Full control over tool execution
- Can use any custom tools
**Cons:**
- Manual round trips (latency)
- More complex code
- You handle tool execution
### Responses (Server-Side Built-in)
```typescript
// All in one request - tools executed server-side
const response = await openai.responses.create({
model: 'gpt-5',
input: 'What is the weather and analyze the temperature trend?',
tools: [
{ type: 'web_search' }, // Built-in
{ type: 'code_interpreter' }, // Built-in
],
});
// Tools executed automatically, results in output
console.log(response.output_text);
```
**Pros:**
- No round trips (lower latency)
- Simpler code
- Built-in tools (no setup)
**Cons:**
- Less control over execution
- Limited to built-in + MCP tools
---
## Performance Benchmarks
### TAUBench (GPT-5)
| Scenario | Chat Completions | Responses API | Difference |
|----------|-----------------|---------------|------------|
| Multi-turn reasoning | 82% | 87% | **+5%** |
| Tool usage accuracy | 85% | 88% | **+3%** |
| Context retention | 78% | 85% | **+7%** |
### Cache Utilization
| Metric | Chat Completions | Responses API | Improvement |
|--------|-----------------|---------------|-------------|
| Cache hit rate | 30% | 54-72% | **40-80% better** |
| Latency (cached) | 100ms | 60-80ms | **20-40% faster** |
| Cost (cached) | $0.10/1K | $0.05-0.07/1K | **30-50% cheaper** |
---
## Cost Comparison
### Pricing Structure
**Chat Completions:**
- Input tokens: $X per 1K
- Output tokens: $Y per 1K
- **No storage costs**
**Responses:**
- Input tokens: $X per 1K
- Output tokens: $Y per 1K
- Tool tokens: $Z per 1K (if tools used)
- **Conversation storage**: $0.01 per conversation per month
### Example Cost Calculation
**Scenario:** 100 multi-turn conversations, 10 turns each, 1000 tokens per turn (500 input + 500 output)
**Chat Completions:**
```
Input: 100 convs × 10 turns × 500 tokens × $X = $A
Output: 100 convs × 10 turns × 500 tokens × $Y = $B
Total: $A + $B
```
**Responses:**
```
Input: 100 convs × 10 turns × 500 tokens × $X = $A
Output: 100 convs × 10 turns × 500 tokens × $Y = $B
Storage: 100 convs × $0.01 = $1
Cache savings: -30% on input (due to better caching)
Total: ($A × 0.7) + $B + $1 (usually cheaper!)
```
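To make the arithmetic concrete, a hedged TypeScript sketch of the same calculation; the per-1K rates and the 30% cache discount are placeholders, not real prices:
```typescript
// Placeholder rates; substitute your model's actual pricing.
const RATES = { inputPer1K: 0.01, outputPer1K: 0.03, storagePerConvMonth: 0.01 };

function estimateMonthlyCost(opts: {
  conversations: number;
  turnsPerConversation: number;
  inputTokensPerTurn: number;
  outputTokensPerTurn: number;
  cacheDiscount: number; // e.g. 0.3 = 30% input savings from caching
}) {
  const turns = opts.conversations * opts.turnsPerConversation;
  const inputCost =
    ((turns * opts.inputTokensPerTurn) / 1000) * RATES.inputPer1K * (1 - opts.cacheDiscount);
  const outputCost = ((turns * opts.outputTokensPerTurn) / 1000) * RATES.outputPer1K;
  const storageCost = opts.conversations * RATES.storagePerConvMonth;
  return { inputCost, outputCost, storageCost, total: inputCost + outputCost + storageCost };
}

// The scenario above: 100 conversations × 10 turns × (500 in + 500 out) tokens
console.log(estimateMonthlyCost({
  conversations: 100,
  turnsPerConversation: 10,
  inputTokensPerTurn: 500,
  outputTokensPerTurn: 500,
  cacheDiscount: 0.3,
}));
```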
---
## Migration Path
### Simple Migration
**Before (Chat Completions):**
```typescript
const response = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'system', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
],
});
console.log(response.choices[0].message.content);
```
**After (Responses):**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: [
{ role: 'developer', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
],
});
console.log(response.output_text);
```
**Changes:**
1. `chat.completions.create` → `responses.create`
2. `messages` → `input`
3. `system` → `developer`
4. `choices[0].message.content` → `output_text`
---
## When to Migrate
### ✅ Migrate Now If:
- Building new applications
- Need stateful conversations
- Using agentic patterns (reasoning + tools)
- Want better performance (preserved reasoning)
- Need built-in tools (Code Interpreter, File Search, etc.)
### ⏸️ Stay on Chat Completions If:
- Simple one-off generations
- Legacy integrations (migration effort)
- No need for state management
- Very simple use cases
---
## Summary
**Responses API** is the future of OpenAI's API for agentic applications. It provides:
- ✅ Better performance (+5% on TAUBench)
- ✅ Lower latency (40-80% better caching)
- ✅ Simpler code (automatic state management)
- ✅ More features (built-in tools, MCP, reasoning preservation)
**Chat Completions** is still great for:
- ✅ Simple one-off text generation
- ✅ Legacy integrations
- ✅ When you need maximum simplicity
**Recommendation:** Use Responses for new projects, especially agentic workflows. Chat Completions remains valid for simple use cases.


@@ -0,0 +1,78 @@
# Stateful Conversations Guide
**Last Updated**: 2025-10-25
Guide to managing conversation state with the Responses API.
---
## Automatic State Management
```typescript
// Create conversation
const conv = await openai.conversations.create({
metadata: { user_id: 'user_123' },
});
// Turn 1
const response1 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id,
input: 'What are the 5 Ds of dodgeball?',
});
// Turn 2 - automatically remembers turn 1
const response2 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id,
input: 'Tell me more about the first one',
});
```
---
## Conversation Management
**Create:**
```typescript
const conv = await openai.conversations.create({
metadata: { topic: 'support' },
items: [
{ type: 'message', role: 'developer', content: 'You are helpful.' }
],
});
```
**List:**
```typescript
const convs = await openai.conversations.list({ limit: 10 });
```
**Delete:**
```typescript
await openai.conversations.delete(conv.id);
```
---
## Benefits vs Manual History
| Feature | Manual History | Conversation IDs |
|---------|---------------|------------------|
| **Complexity** | High (you track) | Low (automatic) |
| **Cache** | Baseline | 40-80% better |
| **Reasoning** | Discarded | Preserved |
| **Errors** | Common | Rare |
---
## Best Practices
1. **Store conversation IDs**: Database, session storage, cookies (see the sketch below)
2. **Add metadata**: Track user, topic, session type
3. **Expire old conversations**: Delete after 90 days or when done
4. **One conversation per topic**: Don't mix unrelated topics
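A minimal in-memory sketch combining these practices; a real application would persist the map in a database, and the helper below is illustrative:
```typescript
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const NINETY_DAYS_MS = 90 * 24 * 60 * 60 * 1000;

// In-memory store; swap for a database in production
const store = new Map<string, { id: string; createdAt: number }>();

async function conversationFor(userId: string, topic: string) {
  const key = `${userId}:${topic}`; // one conversation per topic
  const existing = store.get(key);
  if (existing && Date.now() - existing.createdAt < NINETY_DAYS_MS) {
    return existing.id;
  }
  if (existing) {
    await openai.conversations.delete(existing.id); // expire old conversations
  }
  const conv = await openai.conversations.create({
    metadata: { user_id: userId, topic }, // track user and topic
  });
  store.set(key, { id: conv.id, createdAt: Date.now() });
  return conv.id;
}
```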
---
**Official Docs**: https://platform.openai.com/docs/api-reference/conversations

references/top-errors.md

@@ -0,0 +1,476 @@
# Top 8 Errors with OpenAI Responses API
**Last Updated**: 2025-10-25
This document covers the most common errors encountered when using the Responses API and their solutions.
---
## 1. Session State Not Persisting
**Error Symptom:**
Model doesn't remember previous conversation turns.
**Causes:**
- Not using conversation IDs
- Using different conversation IDs per turn
- Creating new conversation for each request
**Solution:**
```typescript
// ❌ BAD: New conversation each time
const response1 = await openai.responses.create({
model: 'gpt-5',
input: 'Question 1',
});
const response2 = await openai.responses.create({
model: 'gpt-5',
input: 'Question 2', // Model doesn't remember question 1
});
// ✅ GOOD: Reuse conversation ID
const conv = await openai.conversations.create();
const response1 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Same ID
input: 'Question 1',
});
const response2 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Same ID - remembers previous
input: 'Question 2',
});
```
**Prevention:**
- Create conversation once
- Store conversation ID (database, session, cookie)
- Reuse ID for all related turns
---
## 2. MCP Server Connection Failed
**Error:**
```json
{
"error": {
"type": "mcp_connection_error",
"message": "Failed to connect to MCP server"
}
}
```
**Causes:**
- Invalid server URL
- Missing or expired authorization token
- Server not responding
- Network issues
**Solutions:**
```typescript
// 1. Verify URL is correct
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Test MCP',
tools: [
{
type: 'mcp',
server_label: 'stripe',
server_url: 'https://mcp.stripe.com', // ✅ Full HTTPS URL
authorization: process.env.STRIPE_OAUTH_TOKEN, // ✅ Valid token
},
],
});
// 2. Test server URL manually
const testResponse = await fetch('https://mcp.stripe.com');
console.log(testResponse.status); // Should be 200
// 3. Check token expiration
const tokenExpiry = parseJWT(token).exp; // parseJWT: your own JWT-decoding helper
if (Date.now() / 1000 > tokenExpiry) {
console.error('Token expired, refresh it');
}
```
**Prevention:**
- Use environment variables for secrets
- Implement token refresh logic
- Add retry with exponential backoff
- Log connection attempts for debugging
---
## 3. Code Interpreter Timeout
**Error:**
```json
{
"error": {
"type": "code_interpreter_timeout",
"message": "Code execution exceeded time limit"
}
}
```
**Cause:**
Code runs longer than 30 seconds (standard mode limit)
**Solution:**
```typescript
// ❌ BAD: Long-running code in standard mode
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Process this massive dataset',
tools: [{ type: 'code_interpreter' }], // Timeout after 30s
});
// ✅ GOOD: Use background mode for long tasks
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Process this massive dataset',
background: true, // ✅ Up to 10 minutes
tools: [{ type: 'code_interpreter' }],
});
// Poll for results
let result = await openai.responses.retrieve(response.id);
while (result.status === 'in_progress') {
await new Promise(r => setTimeout(r, 5000));
result = await openai.responses.retrieve(response.id);
}
console.log(result.output_text);
```
**Prevention:**
- Use `background: true` for tasks > 30 seconds
- Break large tasks into smaller chunks
- Optimize code for performance
---
## 4. Image Generation Rate Limit
**Error:**
```json
{
"error": {
"type": "rate_limit_error",
"message": "DALL-E rate limit exceeded"
}
}
```
**Cause:**
Too many image generation requests in short time
**Solution:**
```typescript
// Implement retry with exponential backoff
async function generateImageWithRetry(prompt: string, retries = 3): Promise<any> {
for (let i = 0; i < retries; i++) {
try {
return await openai.responses.create({
model: 'gpt-5',
input: prompt,
tools: [{ type: 'image_generation' }],
});
} catch (error: any) {
if (error.type === 'rate_limit_error' && i < retries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
console.log(`Rate limited, retrying in ${delay}ms`);
await new Promise(resolve => setTimeout(resolve, delay));
} else {
throw error;
}
}
}
  throw new Error('Retry limit exceeded'); // reached only if retries <= 0
}
const response = await generateImageWithRetry('Create an image of a sunset');
```
**Prevention:**
- Implement rate limiting on your side
- Use exponential backoff for retries
- Queue image requests (see the sketch below)
- Monitor API usage
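A hedged sketch of a sequential queue that pairs with the retry helper above; the class is plain TypeScript, not an OpenAI API:
```typescript
// Run image requests one at a time, in order, to stay under rate limits.
class SequentialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const run = this.tail.then(task, task); // run even if a prior task failed
    this.tail = run.catch(() => undefined); // keep the chain alive on errors
    return run;
  }
}

const imageQueue = new SequentialQueue();

// Usage: requests are issued sequentially, never concurrently
const a = imageQueue.enqueue(() => generateImageWithRetry('A sunset over mountains'));
const b = imageQueue.enqueue(() => generateImageWithRetry('A futuristic city'));
```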
---
## 5. File Search Relevance Issues
**Problem:**
File search returns irrelevant or low-quality results
**Causes:**
- Vague queries
- Poor file quality (OCR errors, formatting)
- Not enough context
**Solutions:**
```typescript
// ❌ BAD: Vague query
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Find pricing', // Too vague
tools: [{ type: 'file_search', file_ids: [fileId] }],
});
// ✅ GOOD: Specific query
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Find the monthly subscription pricing for the premium plan in the 2025 pricing document',
tools: [{ type: 'file_search', file_ids: [fileId] }],
});
// ✅ ALSO GOOD: Filter low-confidence results
response.output.forEach(item => {
if (item.type === 'file_search_call') {
const highConfidence = item.results.filter(r => r.score > 0.7);
console.log('High confidence results:', highConfidence);
}
});
```
**Prevention:**
- Use specific, detailed queries
- Upload high-quality documents (PDFs, Markdown)
- Filter results by confidence score (> 0.7)
- Provide context in query
---
## 6. Variable Substitution Errors (Reusable Prompts)
**Error:**
Variables not replaced in prompt templates
**Cause:**
Incorrect variable syntax or missing values
**Solution:**
```typescript
// ❌ BAD: Incorrect variable syntax
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Hello {username}', // Not supported directly
});
// ✅ GOOD: Use template literals
const username = 'Alice';
const response = await openai.responses.create({
model: 'gpt-5',
input: `Hello ${username}`, // ✅ JavaScript template literal
});
// ✅ ALSO GOOD: Build message dynamically
function buildPrompt(vars: Record<string, string>) {
return `Hello ${vars.username}, your order ${vars.orderId} is ready.`;
}
const response = await openai.responses.create({
model: 'gpt-5',
input: buildPrompt({ username: 'Alice', orderId: '12345' }),
});
```
**Prevention:**
- Use JavaScript template literals
- Validate all variables before substitution
- Provide defaults for optional variables
---
## 7. Chat Completions Migration Breaking Changes
**Errors:**
- `messages parameter not found`
- `choices is undefined`
- `system role not recognized`
**Cause:**
Using Chat Completions syntax with Responses API
**Solution:**
```typescript
// ❌ BAD: Chat Completions syntax
const response = await openai.responses.create({
model: 'gpt-5',
messages: [{ role: 'system', content: 'You are helpful.' }], // Wrong
});
console.log(response.choices[0].message.content); // Wrong
// ✅ GOOD: Responses syntax
const response = await openai.responses.create({
model: 'gpt-5',
input: [{ role: 'developer', content: 'You are helpful.' }], // ✅
});
console.log(response.output_text); // ✅
```
**Breaking Changes:**
| Chat Completions | Responses API |
|-----------------|---------------|
| `messages` | `input` |
| `system` role | `developer` role |
| `choices[0].message.content` | `output_text` |
| `/v1/chat/completions` | `/v1/responses` |
**Prevention:**
- Read migration guide: `references/migration-guide.md`
- Update all references systematically
- Test thoroughly after migration
---
## 8. Cost Tracking Confusion
**Problem:**
Billing different than expected
**Cause:**
Not accounting for tool tokens and conversation storage
**Explanation:**
- **Chat Completions**: input tokens + output tokens
- **Responses API**: input tokens + output tokens + tool tokens + conversation storage
**Solution:**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Hello',
store: false, // ✅ Disable storage if not needed
tools: [{ type: 'code_interpreter' }],
});
// Monitor usage
console.log('Input tokens:', response.usage.prompt_tokens);
console.log('Output tokens:', response.usage.completion_tokens);
console.log('Tool tokens:', response.usage.tool_tokens);
console.log('Total tokens:', response.usage.total_tokens);
// Calculate cost
const inputCost = response.usage.prompt_tokens * 0.00001; // Example rate
const outputCost = response.usage.completion_tokens * 0.00003;
const toolCost = response.usage.tool_tokens * 0.00002;
const totalCost = inputCost + outputCost + toolCost;
console.log('Estimated cost: $' + totalCost.toFixed(4));
```
**Prevention:**
- Monitor `usage.tool_tokens` in responses
- Set `store: false` for one-off requests
- Track conversation count (storage costs)
- Implement cost alerts
---
## Common Error Response Formats
### Authentication Error
```json
{
"error": {
"type": "authentication_error",
"message": "Invalid API key"
}
}
```
### Rate Limit Error
```json
{
"error": {
"type": "rate_limit_error",
"message": "Rate limit exceeded",
"retry_after": 5
}
}
```
### Invalid Request Error
```json
{
"error": {
"type": "invalid_request_error",
"message": "Conversation conv_xyz not found"
}
}
```
### Server Error
```json
{
"error": {
"type": "server_error",
"message": "Internal server error"
}
}
```
---
## General Error Handling Pattern
```typescript
async function handleResponsesAPI(input: string) {
try {
const response = await openai.responses.create({
model: 'gpt-5',
input,
});
return response.output_text;
} catch (error: any) {
// Handle specific errors
switch (error.type) {
case 'rate_limit_error':
console.error('Rate limited, retry after:', error.retry_after);
break;
case 'mcp_connection_error':
console.error('MCP server failed:', error.message);
break;
case 'code_interpreter_timeout':
console.error('Code execution timed out, use background mode');
break;
case 'authentication_error':
console.error('Invalid API key');
break;
default:
console.error('Unexpected error:', error.message);
}
throw error; // Re-throw or handle
}
}
```
---
## Prevention Checklist
- [ ] Use conversation IDs for multi-turn interactions
- [ ] Provide valid MCP server URLs and tokens
- [ ] Use `background: true` for tasks > 30 seconds
- [ ] Implement exponential backoff for rate limits
- [ ] Use specific queries for file search
- [ ] Use template literals for variable substitution
- [ ] Update Chat Completions syntax to Responses syntax
- [ ] Monitor `usage.tool_tokens` and conversation count
---
## Getting Help
If you encounter an error not covered here:
1. Check official docs: https://platform.openai.com/docs/api-reference/responses
2. Search OpenAI Community: https://community.openai.com
3. Contact OpenAI Support: https://help.openai.com
---
**Last Updated**: 2025-10-25