Files
gh-jezweb-claude-skills-ski…/references/assistants-api-v2.md
2025-11-30 08:25:15 +08:00

243 lines
5.7 KiB
Markdown

# Assistants API v2 - Complete Overview
**Version**: v2 (v1 deprecated Dec 18, 2024)
**Status**: Production (Deprecated H1 2026)
**Replacement**: [Responses API](../../openai-responses/SKILL.md)
---
## Architecture
The Assistants API provides stateful conversational AI through four main objects:
```
Assistant (configured AI entity)
Thread (conversation container)
Messages (user + assistant messages)
Runs (execution on thread)
```
---
## Core Objects
### 1. Assistants
Configured AI entities with:
- **Instructions**: System prompt (max 256k characters)
- **Model**: gpt-4o, gpt-5, etc.
- **Tools**: code_interpreter, file_search, functions
- **Tool Resources**: Vector stores, files
- **Metadata**: Custom key-value pairs
**Lifecycle**: Create once, reuse many times.
### 2. Threads
Conversation containers:
- **Persistent**: Store entire conversation history
- **Capacity**: Up to 100,000 messages
- **Reusable**: One thread per user for continuity
- **Metadata**: Track ownership, session info
### 3. Messages
Individual conversation turns:
- **Roles**: user, assistant
- **Content**: Text, images, files
- **Attachments**: Files with tool associations
- **Metadata**: Custom tracking info
### 4. Runs
Asynchronous execution:
- **States**: queued → in_progress → completed/failed/requires_action
- **Streaming**: Real-time SSE events
- **Tool Calls**: Automatic handling or requires_action
- **Timeouts**: 10-minute max execution
---
## Workflow Patterns
### Basic Pattern
```typescript
// 1. Create assistant (once)
const assistant = await openai.beta.assistants.create({...});
// 2. Create thread (per conversation)
const thread = await openai.beta.threads.create();
// 3. Add message
await openai.beta.threads.messages.create(thread.id, {...});
// 4. Run
const run = await openai.beta.threads.runs.create(thread.id, {
assistant_id: assistant.id,
});
// 5. Poll for completion
while (run.status !== 'completed') {
await sleep(1000);
run = await openai.beta.threads.runs.retrieve(thread.id, run.id);
}
// 6. Get response
const messages = await openai.beta.threads.messages.list(thread.id);
```
### Streaming Pattern
```typescript
const stream = await openai.beta.threads.runs.stream(thread.id, {
assistant_id: assistant.id,
});
for await (const event of stream) {
if (event.event === 'thread.message.delta') {
process.stdout.write(event.data.delta.content?.[0]?.text?.value || '');
}
}
```
---
## Tools
### 1. Code Interpreter
- **Purpose**: Execute Python code
- **Capabilities**: Data analysis, charts, file processing
- **File Support**: CSV, JSON, images, etc.
- **Outputs**: Text logs, image files
### 2. File Search
- **Purpose**: Semantic search over documents
- **Capacity**: Up to 10,000 files per assistant
- **Technology**: Vector + keyword search
- **Pricing**: $0.10/GB/day (first 1GB free)
### 3. Function Calling
- **Purpose**: Custom tools integration
- **Pattern**: requires_action → submit_tool_outputs
- **Timeout**: Must respond within 10 minutes
- **Parallel**: Multiple functions can be called at once
---
## Key Limits
| Resource | Limit |
|----------|-------|
| Assistant instructions | 256,000 characters |
| Thread messages | 100,000 per thread |
| Tools per assistant | 128 tools |
| Vector store files | 10,000 per assistant |
| File size | 512 MB per file |
| Run execution time | 10 minutes |
| Metadata pairs | 16 per object |
---
## Pricing
### API Calls
- Same as Chat Completions (pay per token)
- Run usage reported in `run.usage`
### Vector Stores
- **Storage**: $0.10/GB/day
- **Free tier**: First 1GB
- **Auto-expiration**: Configurable
---
## Migration Timeline
- **✅ Dec 18, 2024**: v1 deprecated (no longer accessible)
- **⏳ H1 2026**: v2 planned sunset
- **✅ Now**: Responses API available (recommended replacement)
**Action**: Plan migration to Responses API for new projects.
---
## Best Practices
1. **Reuse Assistants**: Create once, use many times
2. **One Thread Per User**: Maintain conversation continuity
3. **Check Active Runs**: Before creating new runs
4. **Stream for UX**: Better user experience than polling
5. **Set Timeouts**: Prevent infinite polling
6. **Clean Up**: Delete old threads and vector stores
7. **Monitor Costs**: Track token usage and storage
---
## Common Patterns
### Multi-User Chatbot
```typescript
const userThreads = new Map<string, string>();
async function getUserThread(userId: string) {
if (!userThreads.has(userId)) {
const thread = await openai.beta.threads.create({
metadata: { user_id: userId },
});
userThreads.set(userId, thread.id);
}
return userThreads.get(userId)!;
}
```
### RAG Application
```typescript
// 1. Create vector store with documents
const vectorStore = await openai.beta.vectorStores.create({...});
await openai.beta.vectorStores.fileBatches.create(vectorStore.id, {...});
// 2. Create assistant with file_search
const assistant = await openai.beta.assistants.create({
tools: [{ type: "file_search" }],
tool_resources: {
file_search: { vector_store_ids: [vectorStore.id] },
},
});
```
### Data Analysis
```typescript
const assistant = await openai.beta.assistants.create({
tools: [{ type: "code_interpreter" }],
});
// Upload data
const file = await openai.files.create({...});
// Attach to message
await openai.beta.threads.messages.create(thread.id, {
content: "Analyze this data",
attachments: [{ file_id: file.id, tools: [{ type: "code_interpreter" }] }],
});
```
---
## Official Documentation
- **API Reference**: https://platform.openai.com/docs/api-reference/assistants
- **Overview**: https://platform.openai.com/docs/assistants/overview
- **Tools**: https://platform.openai.com/docs/assistants/tools
- **Migration**: https://platform.openai.com/docs/assistants/whats-new
---
**Last Updated**: 2025-10-25