Initial commit

Zhongwei Li
2025-11-30 08:25:17 +08:00
commit 07f3f3c71c
22 changed files with 5007 additions and 0 deletions


@@ -0,0 +1,126 @@
# Built-in Tools Guide
**Last Updated**: 2025-10-25
Comprehensive guide to using Responses API built-in tools.
---
## Available Tools
| Tool | Purpose | Use Case |
|------|---------|----------|
| **Code Interpreter** | Execute Python code | Data analysis, calculations, charts |
| **File Search** | RAG without vector stores | Search uploaded files |
| **Web Search** | Real-time web info | Current events, fact-checking |
| **Image Generation** | DALL-E integration | Create images from descriptions |
| **MCP** | Connect external tools | Stripe, databases, custom APIs |
---
## Code Interpreter
**Execute Python code server-side:**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Calculate mean, median, mode of: 10, 20, 30, 40, 50',
tools: [{ type: 'code_interpreter' }],
});
```
**Features:**
- Sandboxed Python environment
- Automatic chart generation
- File processing support
- Timeout: 30s (use `background: true` for longer; see the sketch below)
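For tasks likely to exceed the limit, a hedged sketch of background mode with polling; it assumes an initialized `openai` client as in the example above, and the 5-second interval is arbitrary:
```typescript
// Start a long-running analysis in background mode
const job = await openai.responses.create({
  model: 'gpt-5',
  input: 'Run a long statistical analysis on the uploaded dataset',
  background: true, // lifts the 30s limit for long tasks
  tools: [{ type: 'code_interpreter' }],
});

// Poll until the background run finishes
let result = await openai.responses.retrieve(job.id);
while (result.status === 'in_progress') {
  await new Promise(r => setTimeout(r, 5000));
  result = await openai.responses.retrieve(job.id);
}
console.log(result.output_text);
```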
---
## File Search
**RAG without building vector stores:**
```typescript
// 1. Upload file
const file = await openai.files.create({
file: fs.createReadStream('./document.pdf'),
purpose: 'assistants',
});
// 2. Search
const response = await openai.responses.create({
model: 'gpt-5',
input: 'What does the document say about pricing?',
tools: [{ type: 'file_search', file_ids: [file.id] }],
});
```
**Supported formats:**
- PDFs, Word docs, text files
- Markdown, HTML, code files
- Max: 512MB per file
---
## Web Search
**Real-time web information:**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
  input: 'What is the latest AI news?',
tools: [{ type: 'web_search' }],
});
```
**Features:**
- No cutoff date limitations
- Automatic source citations
- Real-time data access
---
## Image Generation
**DALL-E integration:**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Create an image of a futuristic cityscape at sunset',
tools: [{ type: 'image_generation' }],
});
// Find image in output
response.output.forEach(item => {
if (item.type === 'image_generation_call') {
console.log('Image URL:', item.output.url);
}
});
```
**Models:** DALL-E 3 (default)
---
## Combining Tools
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Find current Bitcoin price and calculate what $1000 would be worth',
tools: [
{ type: 'web_search' }, // Get price
{ type: 'code_interpreter' }, // Calculate
],
});
```
The model automatically uses the right tool for each subtask.
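Continuing from the example above, you can inspect the polymorphic `output` array to see which tool handled each step; the `*_call` type names below are extrapolated from the image-generation and file-search examples in these guides, so treat them as illustrative:
```typescript
// Inspect the polymorphic output to see which tools ran.
// The `*_call` type names are illustrative extrapolations.
response.output.forEach(item => {
  switch (item.type) {
    case 'web_search_call':
      console.log('Web search step completed');
      break;
    case 'code_interpreter_call':
      console.log('Calculation step completed');
      break;
    case 'message':
      console.log('Final answer:', item.content[0].text);
      break;
  }
});
```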
---
**Official Docs**: https://platform.openai.com/docs/guides/responses


@@ -0,0 +1,133 @@
# MCP Integration Guide
**Last Updated**: 2025-10-25
Guide for integrating external tools using Model Context Protocol (MCP).
---
## What Is MCP?
MCP (Model Context Protocol) is an open protocol that standardizes how applications provide context to LLMs. It allows connecting external tools like Stripe, databases, and custom APIs.
**Key Benefits:**
- ✅ Built into Responses API (no separate setup)
- ✅ Automatic tool discovery
- ✅ OAuth authentication support
- ✅ No additional cost (billed as output tokens)
---
## Basic MCP Integration
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Roll 2d6 dice',
tools: [
{
type: 'mcp',
server_label: 'dice',
server_url: 'https://dmcp.example.com',
},
],
});
```
---
## Authentication
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Create payment link',
tools: [
{
type: 'mcp',
server_label: 'stripe',
server_url: 'https://mcp.stripe.com',
authorization: process.env.STRIPE_OAUTH_TOKEN, // ✅
},
],
});
```
**Important:** The API does NOT store tokens; supply a valid token with every request.
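One pattern is to resolve a fresh token immediately before each request. `getFreshStripeToken` below is a hypothetical helper you would implement against your OAuth provider; the sketch assumes an initialized `openai` client as in the examples above:
```typescript
// Hypothetical helper: implement against your OAuth provider
// (e.g. refresh when near expiry, otherwise return a cached token).
async function getFreshStripeToken(): Promise<string> {
  return process.env.STRIPE_OAUTH_TOKEN!;
}

async function askStripe(input: string) {
  return openai.responses.create({
    model: 'gpt-5',
    input,
    tools: [
      {
        type: 'mcp',
        server_label: 'stripe',
        server_url: 'https://mcp.stripe.com',
        authorization: await getFreshStripeToken(), // supplied on every request
      },
    ],
  });
}
```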
---
## Popular MCP Servers
- **Stripe**: https://mcp.stripe.com
- **Database MCP**: Custom servers for PostgreSQL, MySQL, MongoDB
- **Custom APIs**: Build your own MCP server
---
## Building Custom MCP Server
An MCP server must implement two endpoints:
### 1. List Tools Endpoint
```typescript
// POST /mcp/list_tools → response body
const listToolsResponse = {
  tools: [
    {
      name: 'get_weather',
      description: 'Get weather for a city',
      input_schema: {
        type: 'object',
        properties: {
          city: { type: 'string' },
        },
        required: ['city'],
      },
    },
  ],
};
```
### 2. Call Tool Endpoint
```typescript
// POST /mcp/call_tool → request body
const request = {
  name: 'get_weather',
  arguments: { city: 'San Francisco' },
};

// → response body
const response = {
  result: {
    temperature: 72,
    condition: 'sunny',
  },
};
```
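A minimal sketch of such a server using Express, following the endpoint shapes above; the port, handler logic, and weather data are illustrative, and a production MCP server would add validation and authentication:
```typescript
import express from 'express';

const app = express();
app.use(express.json());

// Tool catalog, matching the list_tools response shape above
const tools = [
  {
    name: 'get_weather',
    description: 'Get weather for a city',
    input_schema: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
];

app.post('/mcp/list_tools', (_req, res) => {
  res.json({ tools });
});

app.post('/mcp/call_tool', (req, res) => {
  const { name, arguments: args } = req.body;
  if (name !== 'get_weather') {
    res.status(404).json({ error: `Unknown tool: ${name}` });
    return;
  }
  // Illustrative static data; replace with a real weather lookup
  res.json({ result: { temperature: 72, condition: 'sunny', city: args.city } });
});

app.listen(3000, () => console.log('MCP server listening on :3000'));
```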
---
## Error Handling
```typescript
try {
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Use tool',
tools: [{ type: 'mcp', server_url: '...', authorization: '...' }],
});
} catch (error: any) {
if (error.type === 'mcp_connection_error') {
console.error('Server connection failed');
}
if (error.type === 'mcp_authentication_error') {
console.error('Invalid token');
}
}
```
---
**Official MCP Docs**: https://platform.openai.com/docs/guides/tools-connectors-mcp


@@ -0,0 +1,236 @@
# Migration Guide: Chat Completions → Responses API
**Last Updated**: 2025-10-25
Quick guide for migrating from Chat Completions to Responses API.
---
## Breaking Changes Summary
| Aspect | Chat Completions | Responses API | Migration |
|--------|-----------------|---------------|-----------|
| **Endpoint** | `/v1/chat/completions` | `/v1/responses` | Update URL |
| **Parameter** | `messages` | `input` | Rename |
| **Role** | `system` | `developer` | Update role name |
| **Output** | `choices[0].message.content` | `output_text` | Update accessor |
| **State** | Manual (messages array) | Automatic (conversation ID) | Use conversations |
| **Tools** | `tools` array with functions | Built-in types + MCP | Update tool definitions |
---
## Step-by-Step Migration
### Step 1: Update Endpoint
**Before:**
```typescript
const response = await openai.chat.completions.create({...});
```
**After:**
```typescript
const response = await openai.responses.create({...});
```
### Step 2: Rename `messages` to `input`
**Before:**
```typescript
{
messages: [
{ role: 'system', content: '...' },
{ role: 'user', content: '...' }
]
}
```
**After:**
```typescript
{
input: [
{ role: 'developer', content: '...' },
{ role: 'user', content: '...' }
]
}
```
### Step 3: Update Response Access
**Before:**
```typescript
const text = response.choices[0].message.content;
```
**After:**
```typescript
const text = response.output_text;
```
### Step 4: Use Conversation IDs (Optional but Recommended)
**Before (Manual History):**
```typescript
let messages = [...previousMessages, newMessage];
const response = await openai.chat.completions.create({
model: 'gpt-5',
messages,
});
```
**After (Automatic):**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Automatic state
input: newMessage,
});
```
---
## Complete Example
**Before (Chat Completions):**
```typescript
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
let messages = [
{ role: 'system', content: 'You are a helpful assistant.' },
];
async function chat(userMessage: string) {
messages.push({ role: 'user', content: userMessage });
const response = await openai.chat.completions.create({
model: 'gpt-5',
messages,
});
const assistantMessage = response.choices[0].message;
messages.push(assistantMessage);
return assistantMessage.content;
}
// Usage
await chat('Hello');
await chat('Tell me a joke');
```
**After (Responses):**
```typescript
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const conversation = await openai.conversations.create({
items: [
{ type: 'message', role: 'developer', content: 'You are a helpful assistant.' },
],
});
async function chat(userMessage: string) {
const response = await openai.responses.create({
model: 'gpt-5',
conversation: conversation.id,
input: userMessage,
});
return response.output_text;
}
// Usage
await chat('Hello');
await chat('Tell me a joke'); // Remembers previous turn automatically
```
---
## Tool Migration
### Chat Completions Functions → Responses Built-in Tools
**Before (Custom Function):**
```typescript
{
tools: [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather',
parameters: { /* schema */ }
}
}
]
}
```
**After (Built-in or MCP):**
```typescript
{
tools: [
{ type: 'web_search' }, // Built-in
{ type: 'code_interpreter' }, // Built-in
{
type: 'mcp', // External tools
server_label: 'weather',
server_url: 'https://weather-mcp.example.com'
}
]
}
```
---
## Streaming Migration
**Before:**
```typescript
const stream = await openai.chat.completions.create({
model: 'gpt-5',
messages,
stream: true,
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```
**After:**
```typescript
const stream = await openai.responses.create({
model: 'gpt-5',
input,
stream: true,
});
for await (const chunk of stream) {
// Handle polymorphic outputs
if (chunk.type === 'message_delta') {
process.stdout.write(chunk.content || '');
}
}
```
---
## Testing Checklist
- [ ] Update all endpoint calls
- [ ] Rename `messages` to `input`
- [ ] Update `system` role to `developer`
- [ ] Update response access (`choices[0]` → `output_text`)
- [ ] Implement conversation management
- [ ] Update tool definitions
- [ ] Test multi-turn conversations (see the smoke-test sketch below)
- [ ] Verify streaming works
- [ ] Check cost tracking (tool tokens)
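As a quick way to exercise several checklist items at once, a hedged smoke-test sketch; the model name and the "teal" assertion are illustrative:
```typescript
// Minimal migration smoke test: one conversation, two turns.
// Assumes OPENAI_API_KEY is set in the environment.
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function smokeTest() {
  const conv = await openai.conversations.create();

  const turn1 = await openai.responses.create({
    model: 'gpt-5',
    conversation: conv.id,
    input: 'My favorite color is teal. Reply with OK.',
  });
  console.log('Turn 1:', turn1.output_text);

  const turn2 = await openai.responses.create({
    model: 'gpt-5',
    conversation: conv.id,
    input: 'What is my favorite color?',
  });
  // If state is wired up correctly, the answer should mention "teal"
  if (!turn2.output_text.toLowerCase().includes('teal')) {
    throw new Error('Multi-turn state not preserved');
  }
  console.log('Multi-turn smoke test passed');
}

smokeTest().catch(console.error);
```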
---
**Official Docs**: https://platform.openai.com/docs/guides/responses


@@ -0,0 +1,72 @@
# Reasoning Preservation Guide
**Last Updated**: 2025-10-25
Understanding how Responses API preserves reasoning across turns.
---
## What Is Reasoning Preservation?
Unlike Chat Completions, which discards reasoning between turns, the Responses API preserves the model's internal thought process.
**Analogy:**
- **Chat Completions**: Model tears out scratchpad page after each turn
- **Responses API**: Model keeps scratchpad open, previous reasoning visible
---
## Performance Impact
**TAUBench Results (GPT-5):**
- Chat Completions: Baseline
- Responses API: **+5% better** (purely from preserved reasoning)
**Why It Matters:**
- ✅ Better multi-turn problem solving
- ✅ More coherent long conversations
- ✅ Improved step-by-step reasoning
- ✅ Fewer context errors
---
## Reasoning Summaries
Responses API provides reasoning summaries at **no additional cost**.
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Solve this complex math problem',
});
// Inspect reasoning
response.output.forEach(item => {
if (item.type === 'reasoning') {
console.log('Model thinking:', item.summary[0].text);
}
if (item.type === 'message') {
console.log('Final answer:', item.content[0].text);
}
});
```
---
## Use Cases
**Debugging:**
- See how model arrived at answer
- Identify reasoning errors
**Auditing:**
- Track decision-making process
- Compliance requirements
**Transparency:**
- Show users why AI made decision
- Build trust in AI systems
---
**Official Docs**: https://developers.openai.com/blog/responses-api/


@@ -0,0 +1,492 @@
# Responses API vs Chat Completions: Complete Comparison
**Last Updated**: 2025-10-25
This document provides a comprehensive comparison between the Responses API and Chat Completions API to help you choose the right one for your use case.
---
## Quick Decision Guide
### ✅ Use Responses API When:
- Building **agentic applications** (reasoning + actions)
- Need **multi-turn conversations** with automatic state management
- Using **built-in tools** (Code Interpreter, File Search, Web Search, Image Gen)
- Connecting to **MCP servers** for external integrations
- Want **preserved reasoning** for better multi-turn performance
- Implementing **background processing** for long tasks
- Need **polymorphic outputs** for debugging/auditing
### ✅ Use Chat Completions When:
- Simple **one-off text generation**
- Fully **stateless** interactions (no conversation continuity needed)
- **Legacy integrations** with existing Chat Completions code
- Very **simple use cases** without tools
---
## Feature Comparison Matrix
| Feature | Chat Completions | Responses API | Winner |
|---------|-----------------|---------------|---------|
| **State Management** | Manual (you track history) | Automatic (conversation IDs) | Responses ✅ |
| **Reasoning Preservation** | Dropped between turns | Preserved across turns | Responses ✅ |
| **Tools Execution** | Client-side round trips | Server-side hosted | Responses ✅ |
| **Output Format** | Single message | Polymorphic (messages, reasoning, tool calls) | Responses ✅ |
| **Cache Utilization** | Baseline | 40-80% better | Responses ✅ |
| **MCP Support** | Manual integration required | Built-in | Responses ✅ |
| **Performance (GPT-5)** | Baseline | +5% on TAUBench | Responses ✅ |
| **Simplicity** | Simpler for one-offs | More features = more complexity | Chat Completions ✅ |
| **Legacy Compatibility** | Mature, stable | New (March 2025) | Chat Completions ✅ |
---
## API Comparison
### Endpoints
**Chat Completions:**
```
POST /v1/chat/completions
```
**Responses:**
```
POST /v1/responses
```
---
### Request Structure
**Chat Completions:**
```typescript
{
model: 'gpt-5',
messages: [
{ role: 'system', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
],
temperature: 0.7,
max_tokens: 1000,
}
```
**Responses:**
```typescript
{
model: 'gpt-5',
input: [
{ role: 'developer', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
],
conversation: 'conv_abc123', // Optional: automatic state
temperature: 0.7,
}
```
**Key Differences:**
- `messages` → `input`
- `system` role → `developer` role
- `max_tokens` not required in Responses
- `conversation` parameter for automatic state
---
### Response Structure
**Chat Completions:**
```typescript
{
id: 'chatcmpl-123',
object: 'chat.completion',
created: 1677652288,
model: 'gpt-5',
choices: [
{
index: 0,
message: {
role: 'assistant',
content: 'Hello! How can I help?',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 10,
completion_tokens: 5,
total_tokens: 15,
},
}
```
**Responses:**
```typescript
{
id: 'resp_123',
object: 'response',
created: 1677652288,
model: 'gpt-5',
output: [
{
type: 'reasoning',
summary: [{ type: 'summary_text', text: 'User greeting, respond friendly' }],
},
{
type: 'message',
role: 'assistant',
content: [{ type: 'output_text', text: 'Hello! How can I help?' }],
},
],
output_text: 'Hello! How can I help?', // Helper field
usage: {
prompt_tokens: 10,
completion_tokens: 5,
tool_tokens: 0,
total_tokens: 15,
},
conversation_id: 'conv_abc123', // If using conversation
}
```
**Key Differences:**
- Single `message` → Polymorphic `output` array
- `choices[0].message.content` → `output_text` helper
- Additional output types: `reasoning`, `tool_calls`, etc.
- `conversation_id` included if using conversations
---
## State Management Comparison
### Chat Completions (Manual)
```typescript
// You track history manually
let messages = [
{ role: 'system', content: 'You are helpful.' },
{ role: 'user', content: 'What is AI?' },
];
const response1 = await openai.chat.completions.create({
model: 'gpt-5',
messages,
});
// Add response to history
messages.push({
role: 'assistant',
content: response1.choices[0].message.content,
});
// Next turn
messages.push({ role: 'user', content: 'Tell me more' });
const response2 = await openai.chat.completions.create({
model: 'gpt-5',
messages, // ✅ You must pass full history
});
```
**Pros:**
- Full control over history
- Can prune old messages
- Simple for one-off requests
**Cons:**
- Manual tracking error-prone
- Must handle history yourself
- No automatic caching benefits
### Responses (Automatic)
```typescript
// Create conversation once
const conv = await openai.conversations.create();
const response1 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Automatic state
input: 'What is AI?',
});
// Next turn - no manual history tracking
const response2 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Remembers previous turn
input: 'Tell me more',
});
```
**Pros:**
- Automatic state management
- No manual history tracking
- Better cache utilization (40-80%)
- Reasoning preserved
**Cons:**
- Less direct control
- Must create conversation first
- Conversations expire after 90 days
---
## Reasoning Preservation
### Chat Completions
**What Happens:**
1. Model generates internal reasoning (scratchpad)
2. Reasoning used to produce response
3. **Reasoning discarded** before returning
4. Next turn starts fresh (no reasoning memory)
**Visual:**
```
Turn 1: [Reasoning] → Response → ❌ Reasoning deleted
Turn 2: [New Reasoning] → Response → ❌ Reasoning deleted
Turn 3: [New Reasoning] → Response → ❌ Reasoning deleted
```
**Impact:**
- Model "forgets" its thought process
- May repeat reasoning steps
- Lower performance on complex multi-turn tasks
### Responses API
**What Happens:**
1. Model generates internal reasoning
2. Reasoning used to produce response
3. **Reasoning preserved** in conversation state
4. Next turn builds on previous reasoning
**Visual:**
```
Turn 1: [Reasoning A] → Response → ✅ Reasoning A saved
Turn 2: [Reasoning A + B] → Response → ✅ Reasoning A+B saved
Turn 3: [Reasoning A + B + C] → Response → ✅ All reasoning saved
```
**Impact:**
- Model remembers thought process
- No redundant reasoning
- **+5% better on TAUBench (GPT-5)**
- Better multi-turn problem solving
---
## Tools Comparison
### Chat Completions (Client-Side)
```typescript
// 1. Define the function and ask the question
const messages = [{ role: 'user', content: 'What is the weather?' }];
const response1 = await openai.chat.completions.create({
  model: 'gpt-5',
  messages,
tools: [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather',
parameters: {
type: 'object',
properties: {
location: { type: 'string' },
},
},
},
},
],
});
// 2. Check if tool called
const toolCall = response1.choices[0].message.tool_calls?.[0];
if (!toolCall) throw new Error('Model did not call a tool');
// 3. Execute tool on your server (arguments arrive as a JSON string)
const weatherData = await getWeather(JSON.parse(toolCall.function.arguments));
// 4. Send result back
const response2 = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
...messages,
response1.choices[0].message,
{
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(weatherData),
},
],
});
```
**Pros:**
- Full control over tool execution
- Can use any custom tools
**Cons:**
- Manual round trips (latency)
- More complex code
- You handle tool execution
### Responses (Server-Side Built-in)
```typescript
// All in one request - tools executed server-side
const response = await openai.responses.create({
model: 'gpt-5',
input: 'What is the weather and analyze the temperature trend?',
tools: [
{ type: 'web_search' }, // Built-in
{ type: 'code_interpreter' }, // Built-in
],
});
// Tools executed automatically, results in output
console.log(response.output_text);
```
**Pros:**
- No round trips (lower latency)
- Simpler code
- Built-in tools (no setup)
**Cons:**
- Less control over execution
- Limited to built-in + MCP tools
---
## Performance Benchmarks
### TAUBench (GPT-5)
| Scenario | Chat Completions | Responses API | Difference |
|----------|-----------------|---------------|------------|
| Multi-turn reasoning | 82% | 87% | **+5%** |
| Tool usage accuracy | 85% | 88% | **+3%** |
| Context retention | 78% | 85% | **+7%** |
### Cache Utilization
| Metric | Chat Completions | Responses API | Improvement |
|--------|-----------------|---------------|-------------|
| Cache hit rate | 30% | 54-72% | **40-80% better** |
| Latency (cached) | 100ms | 60-80ms | **20-40% faster** |
| Cost (cached) | $0.10/1K | $0.05-0.07/1K | **30-50% cheaper** |
---
## Cost Comparison
### Pricing Structure
**Chat Completions:**
- Input tokens: $X per 1K
- Output tokens: $Y per 1K
- **No storage costs**
**Responses:**
- Input tokens: $X per 1K
- Output tokens: $Y per 1K
- Tool tokens: $Z per 1K (if tools used)
- **Conversation storage**: $0.01 per conversation per month
### Example Cost Calculation
**Scenario:** 100 multi-turn conversations, 10 turns each, 1000 tokens per turn (500 input + 500 output)
**Chat Completions:**
```
Input: 100 convs × 10 turns × 500 tokens × $X = $A
Output: 100 convs × 10 turns × 500 tokens × $Y = $B
Total: $A + $B
```
**Responses:**
```
Input: 100 convs × 10 turns × 500 tokens × $X = $A
Output: 100 convs × 10 turns × 500 tokens × $Y = $B
Storage: 100 convs × $0.01 = $1
Cache savings: -30% on input (due to better caching)
Total: ($A × 0.7) + $B + $1 (usually cheaper!)
```
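To make the arithmetic concrete, a hedged TypeScript sketch of the same calculation; the per-1K rates and the 30% cache discount are placeholders, not real prices:
```typescript
// Placeholder rates; substitute your model's actual pricing.
const RATES = { inputPer1K: 0.01, outputPer1K: 0.03, storagePerConvMonth: 0.01 };

function estimateMonthlyCost(opts: {
  conversations: number;
  turnsPerConversation: number;
  inputTokensPerTurn: number;
  outputTokensPerTurn: number;
  cacheDiscount: number; // e.g. 0.3 = 30% input savings from caching
}) {
  const turns = opts.conversations * opts.turnsPerConversation;
  const inputCost =
    ((turns * opts.inputTokensPerTurn) / 1000) * RATES.inputPer1K * (1 - opts.cacheDiscount);
  const outputCost = ((turns * opts.outputTokensPerTurn) / 1000) * RATES.outputPer1K;
  const storageCost = opts.conversations * RATES.storagePerConvMonth;
  return { inputCost, outputCost, storageCost, total: inputCost + outputCost + storageCost };
}

// The scenario above: 100 conversations × 10 turns × (500 in + 500 out) tokens
console.log(estimateMonthlyCost({
  conversations: 100,
  turnsPerConversation: 10,
  inputTokensPerTurn: 500,
  outputTokensPerTurn: 500,
  cacheDiscount: 0.3,
}));
```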
---
## Migration Path
### Simple Migration
**Before (Chat Completions):**
```typescript
const response = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'system', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
],
});
console.log(response.choices[0].message.content);
```
**After (Responses):**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: [
{ role: 'developer', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
],
});
console.log(response.output_text);
```
**Changes:**
1. `chat.completions.create` → `responses.create`
2. `messages` → `input`
3. `system` → `developer`
4. `choices[0].message.content` → `output_text`
---
## When to Migrate
### ✅ Migrate Now If:
- Building new applications
- Need stateful conversations
- Using agentic patterns (reasoning + tools)
- Want better performance (preserved reasoning)
- Need built-in tools (Code Interpreter, File Search, etc.)
### ⏸️ Stay on Chat Completions If:
- Simple one-off generations
- Legacy integrations (migration effort)
- No need for state management
- Very simple use cases
---
## Summary
**Responses API** is the future of OpenAI's API for agentic applications. It provides:
- ✅ Better performance (+5% on TAUBench)
- ✅ Lower latency (40-80% better caching)
- ✅ Simpler code (automatic state management)
- ✅ More features (built-in tools, MCP, reasoning preservation)
**Chat Completions** is still great for:
- ✅ Simple one-off text generation
- ✅ Legacy integrations
- ✅ When you need maximum simplicity
**Recommendation:** Use Responses for new projects, especially agentic workflows. Chat Completions remains valid for simple use cases.


@@ -0,0 +1,78 @@
# Stateful Conversations Guide
**Last Updated**: 2025-10-25
Guide to managing conversation state with the Responses API.
---
## Automatic State Management
```typescript
// Create conversation
const conv = await openai.conversations.create({
metadata: { user_id: 'user_123' },
});
// Turn 1
const response1 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id,
input: 'What are the 5 Ds of dodgeball?',
});
// Turn 2 - automatically remembers turn 1
const response2 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id,
input: 'Tell me more about the first one',
});
```
---
## Conversation Management
**Create:**
```typescript
const conv = await openai.conversations.create({
metadata: { topic: 'support' },
items: [
{ type: 'message', role: 'developer', content: 'You are helpful.' }
],
});
```
**List:**
```typescript
const convs = await openai.conversations.list({ limit: 10 });
```
**Delete:**
```typescript
await openai.conversations.delete(conv.id);
```
---
## Benefits vs Manual History
| Feature | Manual History | Conversation IDs |
|---------|---------------|------------------|
| **Complexity** | High (you track) | Low (automatic) |
| **Cache** | Baseline | 40-80% better |
| **Reasoning** | Discarded | Preserved |
| **Errors** | Common | Rare |
---
## Best Practices
1. **Store conversation IDs**: Database, session storage, cookies (see the sketch below)
2. **Add metadata**: Track user, topic, session type
3. **Expire old conversations**: Delete after 90 days or when done
4. **One conversation per topic**: Don't mix unrelated topics
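A minimal in-memory sketch combining these practices; a real application would persist the map in a database, and the helper below is illustrative:
```typescript
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const NINETY_DAYS_MS = 90 * 24 * 60 * 60 * 1000;

// In-memory store; swap for a database in production
const store = new Map<string, { id: string; createdAt: number }>();

async function conversationFor(userId: string, topic: string) {
  const key = `${userId}:${topic}`; // one conversation per topic
  const existing = store.get(key);
  if (existing && Date.now() - existing.createdAt < NINETY_DAYS_MS) {
    return existing.id;
  }
  if (existing) {
    await openai.conversations.delete(existing.id); // expire old conversations
  }
  const conv = await openai.conversations.create({
    metadata: { user_id: userId, topic }, // track user and topic
  });
  store.set(key, { id: conv.id, createdAt: Date.now() });
  return conv.id;
}
```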
---
**Official Docs**: https://platform.openai.com/docs/api-reference/conversations

references/top-errors.md

@@ -0,0 +1,476 @@
# Top 8 Errors with OpenAI Responses API
**Last Updated**: 2025-10-25
This document covers the most common errors encountered when using the Responses API and their solutions.
---
## 1. Session State Not Persisting
**Error Symptom:**
Model doesn't remember previous conversation turns.
**Causes:**
- Not using conversation IDs
- Using different conversation IDs per turn
- Creating new conversation for each request
**Solution:**
```typescript
// ❌ BAD: New conversation each time
const response1 = await openai.responses.create({
model: 'gpt-5',
input: 'Question 1',
});
const response2 = await openai.responses.create({
model: 'gpt-5',
input: 'Question 2', // Model doesn't remember question 1
});
// ✅ GOOD: Reuse conversation ID
const conv = await openai.conversations.create();
const response1 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Same ID
input: 'Question 1',
});
const response2 = await openai.responses.create({
model: 'gpt-5',
conversation: conv.id, // ✅ Same ID - remembers previous
input: 'Question 2',
});
```
**Prevention:**
- Create conversation once
- Store conversation ID (database, session, cookie)
- Reuse ID for all related turns
---
## 2. MCP Server Connection Failed
**Error:**
```json
{
"error": {
"type": "mcp_connection_error",
"message": "Failed to connect to MCP server"
}
}
```
**Causes:**
- Invalid server URL
- Missing or expired authorization token
- Server not responding
- Network issues
**Solutions:**
```typescript
// 1. Verify URL is correct
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Test MCP',
tools: [
{
type: 'mcp',
server_label: 'stripe',
server_url: 'https://mcp.stripe.com', // ✅ Full HTTPS URL
authorization: process.env.STRIPE_OAUTH_TOKEN, // ✅ Valid token
},
],
});
// 2. Test server URL manually
const testResponse = await fetch('https://mcp.stripe.com');
console.log(testResponse.status); // Should be 200
// 3. Check token expiration
const tokenExpiry = parseJWT(token).exp; // parseJWT: your own JWT-decoding helper
if (Date.now() / 1000 > tokenExpiry) {
console.error('Token expired, refresh it');
}
```
**Prevention:**
- Use environment variables for secrets
- Implement token refresh logic
- Add retry with exponential backoff
- Log connection attempts for debugging
---
## 3. Code Interpreter Timeout
**Error:**
```json
{
"error": {
"type": "code_interpreter_timeout",
"message": "Code execution exceeded time limit"
}
}
```
**Cause:**
Code runs longer than 30 seconds (standard mode limit)
**Solution:**
```typescript
// ❌ BAD: Long-running code in standard mode
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Process this massive dataset',
tools: [{ type: 'code_interpreter' }], // Timeout after 30s
});
// ✅ GOOD: Use background mode for long tasks
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Process this massive dataset',
background: true, // ✅ Up to 10 minutes
tools: [{ type: 'code_interpreter' }],
});
// Poll for results
let result = await openai.responses.retrieve(response.id);
while (result.status === 'in_progress') {
await new Promise(r => setTimeout(r, 5000));
result = await openai.responses.retrieve(response.id);
}
console.log(result.output_text);
```
**Prevention:**
- Use `background: true` for tasks > 30 seconds
- Break large tasks into smaller chunks
- Optimize code for performance
---
## 4. Image Generation Rate Limit
**Error:**
```json
{
"error": {
"type": "rate_limit_error",
"message": "DALL-E rate limit exceeded"
}
}
```
**Cause:**
Too many image generation requests in short time
**Solution:**
```typescript
// Implement retry with exponential backoff
async function generateImageWithRetry(prompt: string, retries = 3): Promise<any> {
for (let i = 0; i < retries; i++) {
try {
return await openai.responses.create({
model: 'gpt-5',
input: prompt,
tools: [{ type: 'image_generation' }],
});
} catch (error: any) {
if (error.type === 'rate_limit_error' && i < retries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
console.log(`Rate limited, retrying in ${delay}ms`);
await new Promise(resolve => setTimeout(resolve, delay));
} else {
throw error;
}
}
}
  throw new Error('Retry limit exceeded'); // reached only if retries <= 0
}
const response = await generateImageWithRetry('Create an image of a sunset');
```
**Prevention:**
- Implement rate limiting on your side
- Use exponential backoff for retries
- Queue image requests (see the sketch below)
- Monitor API usage
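A hedged sketch of a sequential queue that pairs with the retry helper above; the class is plain TypeScript, not an OpenAI API:
```typescript
// Run image requests one at a time, in order, to stay under rate limits.
class SequentialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const run = this.tail.then(task, task); // run even if a prior task failed
    this.tail = run.catch(() => undefined); // keep the chain alive on errors
    return run;
  }
}

const imageQueue = new SequentialQueue();

// Usage: requests are issued sequentially, never concurrently
const a = imageQueue.enqueue(() => generateImageWithRetry('A sunset over mountains'));
const b = imageQueue.enqueue(() => generateImageWithRetry('A futuristic city'));
```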
---
## 5. File Search Relevance Issues
**Problem:**
File search returns irrelevant or low-quality results
**Causes:**
- Vague queries
- Poor file quality (OCR errors, formatting)
- Not enough context
**Solutions:**
```typescript
// ❌ BAD: Vague query
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Find pricing', // Too vague
tools: [{ type: 'file_search', file_ids: [fileId] }],
});
// ✅ GOOD: Specific query
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Find the monthly subscription pricing for the premium plan in the 2025 pricing document',
tools: [{ type: 'file_search', file_ids: [fileId] }],
});
// ✅ ALSO GOOD: Filter low-confidence results
response.output.forEach(item => {
if (item.type === 'file_search_call') {
const highConfidence = item.results.filter(r => r.score > 0.7);
console.log('High confidence results:', highConfidence);
}
});
```
**Prevention:**
- Use specific, detailed queries
- Upload high-quality documents (PDFs, Markdown)
- Filter results by confidence score (> 0.7)
- Provide context in query
---
## 6. Variable Substitution Errors (Reusable Prompts)
**Error:**
Variables not replaced in prompt templates
**Cause:**
Incorrect variable syntax or missing values
**Solution:**
```typescript
// ❌ BAD: Incorrect variable syntax
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Hello {username}', // Not supported directly
});
// ✅ GOOD: Use template literals
const username = 'Alice';
const response = await openai.responses.create({
model: 'gpt-5',
input: `Hello ${username}`, // ✅ JavaScript template literal
});
// ✅ ALSO GOOD: Build message dynamically
function buildPrompt(vars: Record<string, string>) {
return `Hello ${vars.username}, your order ${vars.orderId} is ready.`;
}
const response = await openai.responses.create({
model: 'gpt-5',
input: buildPrompt({ username: 'Alice', orderId: '12345' }),
});
```
**Prevention:**
- Use JavaScript template literals
- Validate all variables before substitution
- Provide defaults for optional variables
---
## 7. Chat Completions Migration Breaking Changes
**Errors:**
- `messages parameter not found`
- `choices is undefined`
- `system role not recognized`
**Cause:**
Using Chat Completions syntax with Responses API
**Solution:**
```typescript
// ❌ BAD: Chat Completions syntax
const response = await openai.responses.create({
model: 'gpt-5',
messages: [{ role: 'system', content: 'You are helpful.' }], // Wrong
});
console.log(response.choices[0].message.content); // Wrong
// ✅ GOOD: Responses syntax
const response = await openai.responses.create({
model: 'gpt-5',
input: [{ role: 'developer', content: 'You are helpful.' }], // ✅
});
console.log(response.output_text); // ✅
```
**Breaking Changes:**
| Chat Completions | Responses API |
|-----------------|---------------|
| `messages` | `input` |
| `system` role | `developer` role |
| `choices[0].message.content` | `output_text` |
| `/v1/chat/completions` | `/v1/responses` |
**Prevention:**
- Read migration guide: `references/migration-guide.md`
- Update all references systematically
- Test thoroughly after migration
---
## 8. Cost Tracking Confusion
**Problem:**
Billing different than expected
**Cause:**
Not accounting for tool tokens and conversation storage
**Explanation:**
- **Chat Completions**: input tokens + output tokens
- **Responses API**: input tokens + output tokens + tool tokens + conversation storage
**Solution:**
```typescript
const response = await openai.responses.create({
model: 'gpt-5',
input: 'Hello',
store: false, // ✅ Disable storage if not needed
tools: [{ type: 'code_interpreter' }],
});
// Monitor usage
console.log('Input tokens:', response.usage.prompt_tokens);
console.log('Output tokens:', response.usage.completion_tokens);
console.log('Tool tokens:', response.usage.tool_tokens);
console.log('Total tokens:', response.usage.total_tokens);
// Calculate cost
const inputCost = response.usage.prompt_tokens * 0.00001; // Example rate
const outputCost = response.usage.completion_tokens * 0.00003;
const toolCost = response.usage.tool_tokens * 0.00002;
const totalCost = inputCost + outputCost + toolCost;
console.log('Estimated cost: $' + totalCost.toFixed(4));
```
**Prevention:**
- Monitor `usage.tool_tokens` in responses
- Set `store: false` for one-off requests
- Track conversation count (storage costs)
- Implement cost alerts
---
## Common Error Response Formats
### Authentication Error
```json
{
"error": {
"type": "authentication_error",
"message": "Invalid API key"
}
}
```
### Rate Limit Error
```json
{
"error": {
"type": "rate_limit_error",
"message": "Rate limit exceeded",
"retry_after": 5
}
}
```
### Invalid Request Error
```json
{
"error": {
"type": "invalid_request_error",
"message": "Conversation conv_xyz not found"
}
}
```
### Server Error
```json
{
"error": {
"type": "server_error",
"message": "Internal server error"
}
}
```
---
## General Error Handling Pattern
```typescript
async function handleResponsesAPI(input: string) {
try {
const response = await openai.responses.create({
model: 'gpt-5',
input,
});
return response.output_text;
} catch (error: any) {
// Handle specific errors
switch (error.type) {
case 'rate_limit_error':
console.error('Rate limited, retry after:', error.retry_after);
break;
case 'mcp_connection_error':
console.error('MCP server failed:', error.message);
break;
case 'code_interpreter_timeout':
console.error('Code execution timed out, use background mode');
break;
case 'authentication_error':
console.error('Invalid API key');
break;
default:
console.error('Unexpected error:', error.message);
}
throw error; // Re-throw or handle
}
}
```
---
## Prevention Checklist
- [ ] Use conversation IDs for multi-turn interactions
- [ ] Provide valid MCP server URLs and tokens
- [ ] Use `background: true` for tasks > 30 seconds
- [ ] Implement exponential backoff for rate limits
- [ ] Use specific queries for file search
- [ ] Use template literals for variable substitution
- [ ] Update Chat Completions syntax to Responses syntax
- [ ] Monitor `usage.tool_tokens` and conversation count
---
## Getting Help
If you encounter an error not covered here:
1. Check official docs: https://platform.openai.com/docs/api-reference/responses
2. Search OpenAI Community: https://community.openai.com
3. Contact OpenAI Support: https://help.openai.com
---
**Last Updated**: 2025-10-25