---
name: openai-responses
description: |
  Build agentic AI applications with OpenAI's Responses API - the stateful successor to Chat Completions. Preserves reasoning across turns for 5% better multi-turn performance and 40-80% improved cache utilization.
  Use when: building AI agents with persistent reasoning, integrating MCP servers for external tools, using built-in Code Interpreter/File Search/Web Search, managing stateful conversations, implementing background processing for long tasks, or migrating from Chat Completions to gain polymorphic outputs and server-side tools.
license: MIT
---
# OpenAI Responses API
**Status**: Production Ready
**Last Updated**: 2025-10-25
**API Launch**: March 2025
**Dependencies**: openai@5.19.1+ (Node.js) or fetch API (Cloudflare Workers)
---
## What Is the Responses API?
The Responses API (`/v1/responses`) is OpenAI's unified interface for building agentic applications, launched in March 2025. It fundamentally changes how you interact with OpenAI models by providing **stateful conversations** and a **structured loop for reasoning and acting**.
### Key Innovation: Preserved Reasoning State
Unlike Chat Completions where reasoning is discarded between turns, Responses **keeps the notebook open**. The model's step-by-step thought processes survive into the next turn, improving performance by approximately **5% on TAUBench** and enabling better multi-turn interactions.
### Why Use Responses Over Chat Completions?
| Feature | Chat Completions | Responses API | Benefit |
|---------|-----------------|---------------|---------|
| **State Management** | Manual (you track history) | Automatic (conversation IDs) | Simpler code, less error-prone |
| **Reasoning** | Dropped between turns | Preserved across turns | Better multi-turn performance |
| **Tools** | Client-side round trips | Server-side hosted | Lower latency, simpler code |
| **Output Format** | Single message | Polymorphic (messages, reasoning, tool calls) | Richer debugging, better UX |
| **Cache Utilization** | Baseline | 40-80% better | Lower costs, faster responses |
| **MCP Support** | Manual integration | Built-in | Easy external tool connections |
---
## Quick Start (5 Minutes)
### 1. Get API Key
```bash
# Sign up at https://platform.openai.com/
# Navigate to API Keys section
# Create new key and save securely
export OPENAI_API_KEY="sk-proj-..."
```
**Why this matters:**
- API key required for all requests
- Keep secure (never commit to git)
- Use environment variables
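As a minimal sketch of that last point, read the key server-side and fail fast if it is missing (the variable name matches the export above; the check itself is just a suggested pattern):
```typescript
// Read the key server-side; never ship it to the browser
const apiKey = process.env.OPENAI_API_KEY;
if (!apiKey) {
  throw new Error('OPENAI_API_KEY is not set');
}
```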
### 2. Install SDK (Node.js)
```bash
npm install openai
```
```typescript
import OpenAI from 'openai';
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'What are the 5 Ds of dodgeball?',
});

console.log(response.output_text);
```
**CRITICAL:**
- Always use server-side (never expose API key in client code)
- `model` is required; these examples use `gpt-5` (swap in `gpt-5-mini`, `gpt-4o`, etc. as needed)
- `input` can be string or array of messages
### 3. Or Use Direct API (Cloudflare Workers)
```typescript
// No SDK needed - use fetch()
const response = await fetch('https://api.openai.com/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-5',
    input: 'Hello, world!',
  }),
});

const data = await response.json();

// Note: `output_text` is an SDK convenience property. The raw JSON response
// contains an `output` array, so extract the text from message items:
const text = data.output
  .filter((item) => item.type === 'message')
  .flatMap((item) => item.content)
  .filter((part) => part.type === 'output_text')
  .map((part) => part.text)
  .join('');
console.log(text);
```
**Why fetch?**
- No dependencies in edge environments
- Full control over request/response
- Works in Cloudflare Workers, Deno, Bun
---
## Responses vs Chat Completions: Complete Comparison
### When to Use Each
**Use Responses API when:**
- ✅ Building agentic applications (reasoning + actions)
- ✅ Need preserved reasoning state across turns
- ✅ Want built-in tools (Code Interpreter, File Search, Web Search)
- ✅ Using MCP servers for external integrations
- ✅ Implementing conversational AI with automatic state management
- ✅ Background processing for long-running tasks
- ✅ Need polymorphic outputs (messages, reasoning, tool calls)
**Use Chat Completions when:**
- ✅ Simple one-off text generation
- ✅ Fully stateless interactions (no conversation continuity needed)
- ✅ Legacy integrations (existing Chat Completions code)
- ✅ Very simple use cases without tools
### Architecture Differences
**Chat Completions Flow:**
```
User Input → Model → Single Message → Done
(Reasoning discarded, state lost)
```
**Responses API Flow:**
```
User Input → Model (preserved reasoning) → Polymorphic Outputs
                     ↓ (server-side tools)
             Tool Call → Tool Result → Model → Final Response
(Reasoning preserved, state maintained)
```
### Performance Benefits
**Cache Utilization:**
- Chat Completions: Baseline performance
- Responses API: **40-80% better cache utilization**
- Result: Lower latency + reduced costs
**Reasoning Performance:**
- Chat Completions: Reasoning dropped between turns
- Responses API: Reasoning preserved across turns
- Result: **5% better on TAUBench** (GPT-5 with Responses vs Chat Completions)
---
## Stateful Conversations
### Automatic State Management
The Responses API can automatically manage conversation state using **conversation IDs**.
#### Creating a Conversation
```typescript
// Create conversation with initial message
const conversation = await openai.conversations.create({
  metadata: { user_id: 'user_123' },
  items: [
    {
      type: 'message',
      role: 'user',
      content: 'Hello!',
    },
  ],
});

console.log(conversation.id); // "conv_abc123..."
```
#### Using Conversation ID
```typescript
// First turn
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: 'conv_abc123',
  input: 'What are the 5 Ds of dodgeball?',
});
console.log(response1.output_text);

// Second turn - model remembers previous context
const response2 = await openai.responses.create({
  model: 'gpt-5',
  conversation: 'conv_abc123',
  input: 'Tell me more about the first one',
});
console.log(response2.output_text);
// Model automatically knows "first one" refers to the first D from the previous turn
```
**Why this matters:**
- No manual history tracking required
- Reasoning state preserved between turns
- Automatic context management
- Lower risk of context errors
### Manual State Management (Alternative)
If you need full control, you can manually manage history:
```typescript
let history = [
  { role: 'user', content: 'Tell me a joke' },
];

const response = await openai.responses.create({
  model: 'gpt-5',
  input: history,
  store: true, // Optional: store for retrieval later
});

// Add the model's messages to history (skip reasoning/tool items, which have no role)
history = [
  ...history,
  ...response.output
    .filter(el => el.type === 'message')
    .map(el => ({
      role: el.role,
      content: el.content,
    })),
];

// Next turn
history.push({ role: 'user', content: 'Tell me another' });

const secondResponse = await openai.responses.create({
  model: 'gpt-5',
  input: history,
});
```
**When to use manual management:**
- Need custom history pruning logic (see the sketch below)
- Want to modify conversation history programmatically
- Implementing custom caching strategies
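Here is a minimal pruning sketch that caps history at the most recent turns before each request. The `MAX_TURNS` constant and `prune` helper are illustrative, not part of the SDK:
```typescript
// Hypothetical helper: keep only the last N messages of history
const MAX_TURNS = 20;

function prune(history: Array<{ role: string; content: unknown }>) {
  return history.length <= MAX_TURNS
    ? history
    : history.slice(history.length - MAX_TURNS);
}

const response = await openai.responses.create({
  model: 'gpt-5',
  input: prune(history),
});
```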
---
## Built-in Tools (Server-Side)
The Responses API includes **server-side hosted tools** that eliminate costly backend round trips.
### Available Tools
| Tool | Purpose | Use Case |
|------|---------|----------|
| **Code Interpreter** | Execute Python code | Data analysis, calculations, charts |
| **File Search** | RAG without vector stores | Search uploaded files for answers |
| **Web Search** | Real-time web information | Current events, fact-checking |
| **Image Generation** | DALL-E integration | Create images from descriptions |
| **MCP** | Connect external tools | Stripe, databases, custom APIs |
### Code Interpreter
Execute Python code server-side for data analysis, calculations, and visualizations.
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Calculate the mean, median, and mode of: 10, 20, 30, 40, 50',
  // Depending on API version, code_interpreter may also require a container,
  // e.g. container: { type: 'auto' }
  tools: [{ type: 'code_interpreter' }],
});

console.log(response.output_text);
// Model writes and executes Python code, returns results
```
**Advanced Example: Data Analysis**
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Analyze this sales data and create a bar chart showing monthly revenue: [data here]',
  tools: [{ type: 'code_interpreter' }],
});

// Check output for code execution results
// (field names below are illustrative; inspect the items to confirm the exact shape)
response.output.forEach(item => {
  if (item.type === 'code_interpreter_call') {
    console.log('Code executed:', item.input);
    console.log('Result:', item.output);
  }
});
```
**Why this matters:**
- No need to run Python locally
- Sandboxed execution environment
- Automatic chart generation
- Can process uploaded files (see the sketch below)
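A plausible sketch of file processing, assuming the tool's container can be seeded with file IDs (the `container` parameter shape is an assumption and may vary by API version; assumes `import fs from 'node:fs'`):
```typescript
// Assumption: code_interpreter runs in a container seeded with uploaded files
const file = await openai.files.create({
  file: fs.createReadStream('sales.csv'),
  purpose: 'assistants',
});

const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Load sales.csv and plot monthly revenue',
  tools: [
    {
      type: 'code_interpreter',
      container: { type: 'auto', file_ids: [file.id] }, // assumed parameter shape
    },
  ],
});
```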
### File Search (RAG Without Vector Stores)
Search through uploaded files without building your own RAG pipeline.
```typescript
import fs from 'node:fs';

// 1. Upload a file and index it in a vector store (one-time setup)
const file = await openai.files.create({
  file: fs.createReadStream('knowledge-base.pdf'),
  purpose: 'assistants',
});

const vectorStore = await openai.vectorStores.create({
  name: 'knowledge-base',
  file_ids: [file.id],
});

// 2. Use file search against the vector store
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'What does the document say about pricing?',
  tools: [
    {
      type: 'file_search',
      vector_store_ids: [vectorStore.id],
    },
  ],
});

console.log(response.output_text);
// Model searches the file and provides an answer with citations
```
**Supported File Types:**
- PDFs, Word docs, text files
- Markdown, HTML
- Code files (Python, JavaScript, etc.)
- Max: 512MB per file
### Web Search
Get real-time information from the web.
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'What are the latest updates on GPT-5?',
  tools: [{ type: 'web_search' }],
});

console.log(response.output_text);
// Model searches the web and provides current information with sources
```
**Why this matters:**
- No cutoff date limitations
- Automatic source citations
- Real-time data access
- No need for external search APIs
### Image Generation (DALL-E)
Generate images directly in the Responses API.
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Create an image of a futuristic cityscape at sunset',
  tools: [{ type: 'image_generation' }],
});

// Find the image in the output
response.output.forEach(item => {
  if (item.type === 'image_generation_call') {
    // result holds the generated image (base64-encoded)
    console.log('Image data:', item.result);
  }
});
```
**Models Available:**
- DALL-E 3 (default)
- Various sizes and quality options
---
## MCP Server Integration
The Responses API has built-in support for **Model Context Protocol (MCP)** servers, allowing you to connect external tools.
### What Is MCP?
MCP is an open protocol that standardizes how applications provide context to LLMs. It allows you to:
- Connect to external APIs (Stripe, databases, CRMs)
- Use hosted MCP servers
- Build custom tool integrations
### Basic MCP Integration
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Roll 2d6 dice',
  tools: [
    {
      type: 'mcp',
      server_label: 'dice',
      server_url: 'https://example.com/mcp',
    },
  ],
});

// Model discovers available tools on the MCP server and uses them
console.log(response.output_text);
```
### MCP with Authentication (OAuth)
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Create a $20 payment link',
  tools: [
    {
      type: 'mcp',
      server_label: 'stripe',
      server_url: 'https://mcp.stripe.com',
      authorization: process.env.STRIPE_OAUTH_TOKEN,
    },
  ],
});

console.log(response.output_text);
// Model uses the Stripe MCP server to create a payment link
```
**CRITICAL:**
- API does NOT store authorization tokens
- Must provide token with each request
- Use environment variables for security
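If you do not want remote tools to execute automatically, the MCP tool also supports approval controls. A minimal sketch, assuming the `require_approval` parameter:
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Create a $20 payment link',
  tools: [
    {
      type: 'mcp',
      server_label: 'stripe',
      server_url: 'https://mcp.stripe.com',
      authorization: process.env.STRIPE_OAUTH_TOKEN,
      require_approval: 'never', // skip per-call approval for trusted servers
    },
  ],
});
```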
### Polymorphic Output: MCP Tool Calls
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Roll 2d4+1',
  tools: [
    {
      type: 'mcp',
      server_label: 'dice',
      server_url: 'https://dmcp.example.com',
    },
  ],
});

// Inspect tool calls
response.output.forEach(item => {
  if (item.type === 'mcp_call') {
    console.log('Tool:', item.name);
    console.log('Arguments:', item.arguments);
    console.log('Output:', item.output);
  }
  if (item.type === 'mcp_list_tools') {
    console.log('Available tools:', item.tools);
  }
});
```
**Output Types:**
- `mcp_list_tools` - Tools discovered on server
- `mcp_call` - Tool invocation and result
- `message` - Final response to user
---
## Reasoning Preservation
### How It Works
The Responses API preserves the model's **internal reasoning state** across turns, unlike Chat Completions which discards it.
**Visual Analogy:**
- **Chat Completions**: Model has a scratchpad, writes reasoning, then **tears out the page** before responding
- **Responses API**: Model keeps the scratchpad open, **previous reasoning visible** for next turn
### Performance Impact
**TAUBench Results (GPT-5):**
- Chat Completions: Baseline score
- Responses API: **+5% better** (purely from preserved reasoning)
**Why This Matters:**
- Better multi-turn problem solving
- More coherent long conversations
- Improved step-by-step reasoning
- Fewer context errors
### Reasoning Summaries (Free!)
The Responses API provides **reasoning summaries** at no additional cost.
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Solve this complex math problem: [problem]',
});

// Inspect reasoning
response.output.forEach(item => {
  if (item.type === 'reasoning') {
    console.log('Model reasoning:', item.summary[0].text);
  }
  if (item.type === 'message') {
    console.log('Final answer:', item.content[0].text);
  }
});
```
**Use Cases:**
- Debugging model decisions
- Audit trails for compliance
- Understanding model thought process
- Building transparent AI systems
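Note that on reasoning models you may need to request summaries explicitly via the `reasoning` parameter; a sketch, assuming the `summary: 'auto'` option:
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Solve this complex math problem: [problem]',
  // Ask the API to include a reasoning summary alongside the answer
  reasoning: { effort: 'medium', summary: 'auto' },
});
```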
---
## Background Mode (Long-Running Tasks)
For tasks that take longer than standard timeout limits, use **background mode**.
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Analyze this 500-page document and summarize key findings',
  background: true,
  tools: [{ type: 'file_search', vector_store_ids: [vectorStoreId] }], // vector store created earlier
});

// Returns immediately with status
console.log(response.status); // "in_progress"
console.log(response.id);     // Use to check status later

// Poll for completion
const checkStatus = async (responseId) => {
  const result = await openai.responses.retrieve(responseId);
  if (result.status === 'completed') {
    console.log(result.output_text);
  } else if (result.status === 'failed') {
    console.error('Task failed:', result.error);
  } else {
    // Still running, check again later
    setTimeout(() => checkStatus(responseId), 5000);
  }
};

checkStatus(response.id);
```
**When to Use:**
- Large file processing
- Complex calculations
- Multi-step research tasks
- Data analysis on large datasets
**Timeout Limits:**
- Standard mode: 60 seconds
- Background mode: Up to 10 minutes
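A background response can also be cancelled if the result is no longer needed; a minimal sketch using the SDK's cancel method:
```typescript
// Stop a background task that is no longer needed
const cancelled = await openai.responses.cancel(response.id);
console.log(cancelled.status); // e.g. "cancelled"
```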
---
## Polymorphic Outputs
The Responses API returns **multiple output types** instead of a single message.
### Output Types
| Type | Description | Example |
|------|-------------|---------|
| `message` | Text response to user | Final answer, explanation |
| `reasoning` | Model's internal thought process | Step-by-step reasoning summary |
| `code_interpreter_call` | Code execution | Python code + results |
| `mcp_call` | Tool invocation | Tool name, args, output |
| `mcp_list_tools` | Available tools | Tool definitions from MCP server |
| `file_search_call` | File search results | Matched chunks, citations |
| `web_search_call` | Web search results | URLs, snippets |
| `image_generation_call` | Image generation | Image URL |
### Processing Polymorphic Outputs
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Search the web for the latest AI news and summarize',
  tools: [{ type: 'web_search' }],
});

// Process different output types
// (web_search_call fields below are illustrative; inspect the item to confirm)
response.output.forEach(item => {
  switch (item.type) {
    case 'reasoning':
      console.log('Reasoning:', item.summary[0].text);
      break;
    case 'web_search_call':
      console.log('Searched:', item.query);
      console.log('Sources:', item.results);
      break;
    case 'message':
      console.log('Response:', item.content[0].text);
      break;
  }
});

// Or use the helper for text-only output
console.log(response.output_text);
```
**Why This Matters:**
- Better debugging (see all steps)
- Audit trails (track all tool calls)
- Richer UX (show progress to users)
- Compliance (log all actions)
---
## Migration from Chat Completions
### Breaking Changes
| Feature | Chat Completions | Responses API | Migration |
|---------|-----------------|---------------|-----------|
| **Endpoint** | `/v1/chat/completions` | `/v1/responses` | Update URL |
| **Parameter** | `messages` | `input` | Rename parameter |
| **State** | Manual (`messages` array) | Automatic (`conversation` ID) | Use conversation IDs |
| **Tools** | `tools` array with functions | Built-in types + MCP | Update tool definitions |
| **Output** | `choices[0].message.content` | `output_text` or `output` array | Update response parsing |
| **Streaming** | `data: {"choices":[...]}` | SSE with multiple item types | Update stream parser |
### Migration Example
**Before (Chat Completions):**
```typescript
const response = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});

console.log(response.choices[0].message.content);
```
**After (Responses):**
```typescript
const response = await openai.responses.create({
  model: 'gpt-5',
  input: [
    { role: 'developer', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
  ],
});

console.log(response.output_text);
```
**Key Differences:**
1. `chat.completions.create` → `responses.create`
2. `messages` → `input`
3. `system` role → `developer` role
4. `choices[0].message.content` → `output_text`
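The streaming row in the table above also changes shape. A sketch of Responses streaming with the SDK, assuming the `response.output_text.delta` event type:
```typescript
const stream = await openai.responses.create({
  model: 'gpt-5',
  input: 'Write a haiku about dodgeball',
  stream: true,
});

for await (const event of stream) {
  // Responses streams typed events rather than raw message chunks
  if (event.type === 'response.output_text.delta') {
    process.stdout.write(event.delta);
  }
}
```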
### When to Migrate
**Migrate now if:**
- ✅ Building new applications
- ✅ Need stateful conversations
- ✅ Using agentic patterns (reasoning + tools)
- ✅ Want better performance (preserved reasoning)
**Stay on Chat Completions if:**
- ✅ Simple one-off generations
- ✅ Legacy integrations
- ✅ No need for state management
---
## Error Handling
### Common Errors and Solutions
#### 1. Session State Not Persisting
**Error:**
```
Conversation state not maintained between turns
```
**Cause:**
- Not using conversation IDs
- Using different conversation IDs per turn
**Solution:**
```typescript
// Create conversation once
const conv = await openai.conversations.create();

// Reuse the conversation ID for all turns
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id, // ✅ Same ID
  input: 'First message',
});

const response2 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id, // ✅ Same ID
  input: 'Follow-up message',
});
```
#### 2. MCP Server Connection Failed
**Error:**
```json
{
"error": {
"type": "mcp_connection_error",
"message": "Failed to connect to MCP server"
}
}
```
**Causes:**
- Invalid server URL
- Missing or expired authorization token
- Server not responding
**Solutions:**
```typescript
// 1. Verify the URL is correct
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Test MCP',
  tools: [
    {
      type: 'mcp',
      server_label: 'test',
      server_url: 'https://api.example.com/mcp', // ✅ Full URL
      authorization: process.env.AUTH_TOKEN,     // ✅ Valid token
    },
  ],
});

// 2. Test the server URL manually
const testResponse = await fetch('https://api.example.com/mcp');
console.log(testResponse.status); // Should be 200

// 3. Check token expiration (parseJWT is your own JWT-decoding helper)
console.log('Token expires:', parseJWT(token).exp);
```
#### 3. Code Interpreter Timeout
**Error:**
```json
{
"error": {
"type": "code_interpreter_timeout",
"message": "Code execution exceeded time limit"
}
}
```
**Cause:**
- Code runs longer than 30 seconds
**Solution:**
```typescript
// Use background mode for long-running code
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Process this large dataset',
  background: true, // ✅ Extended timeout
  tools: [{ type: 'code_interpreter' }],
});

// Poll for results
const result = await openai.responses.retrieve(response.id);
```
#### 4. Image Generation Rate Limit
**Error:**
```json
{
"error": {
"type": "rate_limit_error",
"message": "DALL-E rate limit exceeded"
}
}
```
**Cause:**
- Too many image generation requests
**Solution:**
```typescript
// Implement retry with linear backoff
const generateImage = async (prompt, retries = 3) => {
  try {
    return await openai.responses.create({
      model: 'gpt-5',
      input: prompt,
      tools: [{ type: 'image_generation' }],
    });
  } catch (error) {
    if (error.type === 'rate_limit_error' && retries > 0) {
      const delay = (4 - retries) * 1000; // 1s, 2s, 3s
      await new Promise(resolve => setTimeout(resolve, delay));
      return generateImage(prompt, retries - 1);
    }
    throw error;
  }
};
```
#### 5. File Search Relevance Issues
**Problem:**
- File search returns irrelevant results
**Solution:**
```typescript
// Use more specific queries
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Find sections about pricing in Q4 2024 specifically', // ✅ Specific
  // NOT: 'Find pricing' (too vague)
  tools: [{ type: 'file_search', vector_store_ids: [vectorStoreId] }],
});

// Or filter results manually
// (the results/score fields are illustrative; inspect the item to confirm)
response.output.forEach(item => {
  if (item.type === 'file_search_call') {
    const relevantChunks = item.results.filter(
      chunk => chunk.score > 0.7 // ✅ Only high-confidence matches
    );
  }
});
```
#### 6. Cost Tracking Confusion
**Problem:**
- Bills come in higher (or lower) than expected
**Explanation:**
- Responses API bills for: input tokens + output tokens + tool usage + stored conversations
- Chat Completions bills only: input tokens + output tokens
**Solution:**
```typescript
// Monitor usage
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Hello',
  store: false, // ✅ Don't store if not needed
});

console.log('Usage:', response.usage);
// Responses reports input/output token counts, e.g.:
// {
//   input_tokens: 10,
//   output_tokens: 20,
//   total_tokens: 30
// }
```
#### 7. Conversation Not Found
**Error:**
```json
{
"error": {
"type": "invalid_request_error",
"message": "Conversation conv_xyz not found"
}
}
```
**Causes:**
- Conversation ID typo
- Conversation deleted
- Conversation expired (90 days)
**Solution:**
```typescript
// Verify the conversation exists before using it
let convId = 'conv_xyz';
try {
  await openai.conversations.retrieve(convId);
} catch {
  // Not found: create a fresh conversation and use its ID
  const newConv = await openai.conversations.create();
  convId = newConv.id;
}
```
#### 8. Tool Output Parsing Failed
**Problem:**
- Can't access tool outputs correctly
**Solution:**
```typescript
// Use helper methods
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Search for AI news',
  tools: [{ type: 'web_search' }],
});

// Helper: text-only output
console.log(response.output_text);

// Manual: inspect all outputs
response.output.forEach(item => {
  console.log('Type:', item.type);
  console.log('Content:', item);
});
```
---
## Production Patterns
### Cost Optimization
**1. Use Conversation IDs (Cache Benefits)**
```typescript
// ✅ GOOD: Reuse a conversation ID
const conv = await openai.conversations.create();
const response1 = await openai.responses.create({
  model: 'gpt-5',
  conversation: conv.id,
  input: 'Question 1',
});
// 40-80% better cache utilization

// ❌ BAD: Resend manual history each time
const response2 = await openai.responses.create({
  model: 'gpt-5',
  input: [...previousHistory, newMessage],
});
// No cache benefits
```
**2. Disable Storage When Not Needed**
```typescript
// For one-off requests
const response = await openai.responses.create({
  model: 'gpt-5',
  input: 'Quick question',
  store: false, // ✅ Don't store the conversation
});
```
**3. Use Smaller Models When Possible**
```typescript
// For simple tasks
const response = await openai.responses.create({
  model: 'gpt-5-mini', // ✅ Significantly cheaper than gpt-5
  input: 'Summarize this paragraph',
});
```
### Rate Limit Handling
```typescript
const createResponseWithRetry = async (params, maxRetries = 3) => {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await openai.responses.create(params);
    } catch (error) {
      if (error.type === 'rate_limit_error' && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // Exponential backoff: 1s, 2s, 4s
        console.log(`Rate limited, retrying in ${delay}ms`);
        await new Promise(resolve => setTimeout(resolve, delay));
      } else {
        throw error;
      }
    }
  }
};
```
### Monitoring and Logging
```typescript
const monitoredResponse = async (input) => {
  const startTime = Date.now();
  try {
    const response = await openai.responses.create({
      model: 'gpt-5',
      input,
    });

    // Log success metrics
    console.log({
      status: 'success',
      latency: Date.now() - startTime,
      tokens: response.usage.total_tokens,
      model: response.model,
      conversation: response.conversation_id,
    });

    return response;
  } catch (error) {
    // Log error metrics
    console.error({
      status: 'error',
      latency: Date.now() - startTime,
      error: error.message,
      type: error.type,
    });
    throw error;
  }
};
```
---
## Node.js vs Cloudflare Workers
### Node.js Implementation
```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function handleRequest(input: string) {
  const response = await openai.responses.create({
    model: 'gpt-5',
    input,
    tools: [{ type: 'web_search' }],
  });
  return response.output_text;
}
```
**Pros:**
- Full SDK support
- Type safety
- Streaming helpers
**Cons:**
- Requires Node.js runtime
- Larger bundle size
### Cloudflare Workers Implementation
```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { input } = await request.json();

    const response = await fetch('https://api.openai.com/v1/responses', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'gpt-5',
        input,
        tools: [{ type: 'web_search' }],
      }),
    });

    const data = await response.json();

    // `output_text` is an SDK convenience; with raw fetch, pull text from the output array
    const text = data.output
      .filter((item: any) => item.type === 'message')
      .flatMap((item: any) => item.content)
      .filter((part: any) => part.type === 'output_text')
      .map((part: any) => part.text)
      .join('');

    return new Response(text, {
      headers: { 'Content-Type': 'text/plain' },
    });
  },
};
```
**Pros:**
- No dependencies
- Edge deployment
- Faster cold starts
**Cons:**
- Manual request building
- No type safety without custom types
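If you want type safety without the SDK, you can hand-roll a minimal type for the fields used above. The interface below is an illustrative subset, not the official schema:
```typescript
// Illustrative subset of the /v1/responses JSON shape; not the official schema
interface ResponsesTextPart {
  type: 'output_text';
  text: string;
}

interface ResponsesOutputItem {
  type: string; // 'message', 'reasoning', 'web_search_call', ...
  content?: ResponsesTextPart[];
}

interface ResponsesResult {
  id: string;
  status: string;
  output: ResponsesOutputItem[];
}

const data = (await response.json()) as ResponsesResult;
```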
---
## Always Do / Never Do
### ✅ Always Do
1. **Use conversation IDs for multi-turn interactions**
```typescript
const conv = await openai.conversations.create();
// Reuse conv.id for all related turns
```
2. **Handle all output types in polymorphic responses**
```typescript
response.output.forEach(item => {
  if (item.type === 'reasoning') { /* log */ }
  if (item.type === 'message') { /* display */ }
});
```
3. **Use background mode for long-running tasks**
```typescript
const response = await openai.responses.create({
  background: true, // ✅ For tasks >30s
  ...
});
```
4. **Provide authorization tokens for MCP servers**
```typescript
tools: [{
  type: 'mcp',
  authorization: process.env.TOKEN, // ✅ Required for authenticated servers
}]
```
5. **Monitor token usage for cost control**
```typescript
console.log(response.usage.total_tokens);
```
### ❌ Never Do
1. **Never expose API keys in client-side code**
```typescript
// ❌ DANGER: API key in browser
const response = await fetch('https://api.openai.com/v1/responses', {
  headers: { 'Authorization': 'Bearer sk-proj-...' }
});
```
2. **Never assume single message output**
```typescript
// ❌ BAD: Ignores reasoning, tool calls
console.log(response.output[0].content);
// ✅ GOOD: Use helper or check all types
console.log(response.output_text);
```
3. **Never reuse conversation IDs across users**
```typescript
// ❌ DANGER: User A sees User B's conversation
const sharedConv = 'conv_123';
```
4. **Never ignore error types**
```typescript
// ❌ BAD: Generic error handling
try { ... } catch (e) { console.log('error'); }

// ✅ GOOD: Type-specific handling
catch (e) {
  if (e.type === 'rate_limit_error') { /* retry */ }
  if (e.type === 'mcp_connection_error') { /* alert */ }
}
```
5. **Never poll faster than 1 second for background tasks**
```typescript
// ❌ BAD: Too frequent
setInterval(() => checkStatus(), 100);
// ✅ GOOD: Reasonable interval
setInterval(() => checkStatus(), 5000);
```
---
## References
### Official Documentation
- **Responses API Guide**: https://platform.openai.com/docs/guides/responses
- **API Reference**: https://platform.openai.com/docs/api-reference/responses
- **MCP Integration**: https://platform.openai.com/docs/guides/tools-connectors-mcp
- **Blog Post (Why Responses API)**: https://developers.openai.com/blog/responses-api/
- **Starter App**: https://github.com/openai/openai-responses-starter-app
### Skill Resources
- `templates/` - Working code examples
- `references/responses-vs-chat-completions.md` - Feature comparison
- `references/mcp-integration-guide.md` - MCP server setup
- `references/built-in-tools-guide.md` - Tool usage patterns
- `references/stateful-conversations.md` - Conversation management
- `references/migration-guide.md` - Chat Completions → Responses
- `references/top-errors.md` - Common errors and solutions
---
## Next Steps
1. ✅ Read `templates/basic-response.ts` - Simple example
2. ✅ Try `templates/stateful-conversation.ts` - Multi-turn chat
3. ✅ Explore `templates/mcp-integration.ts` - External tools
4. ✅ Review `references/top-errors.md` - Avoid common pitfalls
5. ✅ Check `references/migration-guide.md` - If migrating from Chat Completions
**Happy building with the Responses API!** 🚀