Initial commit

references/audio-guide.md (new file)

# Audio Guide (Whisper & TTS)

**Last Updated**: 2025-10-25

Complete guide to OpenAI's Audio API for transcription and text-to-speech.

---

## Whisper Transcription

### Supported Formats

- mp3, mp4, mpeg, mpga, m4a, wav, webm

### Best Practices

✅ **Audio Quality**:
- Use clear audio with minimal background noise
- 16 kHz or higher sample rate recommended
- Mono or stereo both supported

✅ **File Size**:
- Max file size: 25 MB
- For larger files: split into chunks or compress

✅ **Languages**:
- Whisper automatically detects language
- Supports 50+ languages
- Best results with English, Spanish, French, German, Chinese

❌ **Limitations**:
- May struggle with heavy accents
- Background noise reduces accuracy
- Very quiet audio may fail
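
Recordings that exceed the 25 MB limit above have to be split (or compressed) before upload. A minimal sketch of the chunk-and-transcribe approach, assuming `ffmpeg` is available on the machine and the `openai` client is already configured; the 10-minute segment length is an arbitrary choice, not an API requirement:

```typescript
import { execFileSync } from 'node:child_process';
import fs from 'node:fs';
import path from 'node:path';

async function transcribeLargeFile(inputPath: string, outDir: string): Promise<string> {
  fs.mkdirSync(outDir, { recursive: true });

  // ffmpeg's segment muxer writes chunk_000.mp3, chunk_001.mp3, ...
  // (-c copy avoids re-encoding; drop it if the input is not already mp3)
  execFileSync('ffmpeg', [
    '-i', inputPath,
    '-f', 'segment',
    '-segment_time', '600', // 10 minutes per chunk
    '-c', 'copy',
    path.join(outDir, 'chunk_%03d.mp3'),
  ]);

  const chunks = fs.readdirSync(outDir).filter(f => f.startsWith('chunk_')).sort();
  const parts: string[] = [];

  for (const chunk of chunks) {
    const transcription = await openai.audio.transcriptions.create({
      file: fs.createReadStream(path.join(outDir, chunk)),
      model: 'whisper-1',
    });
    parts.push(transcription.text);
  }

  return parts.join('\n');
}
```

Transcribing chunks independently loses cross-chunk context at the cut points, so prefer splitting on silence or sentence boundaries when accuracy matters.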

---

## Text-to-Speech (TTS)

### Model Selection

| Model | Quality | Latency | Features | Best For |
|-------|---------|---------|----------|----------|
| tts-1 | Standard | Lowest | Basic TTS | Real-time streaming |
| tts-1-hd | High | Medium | Better fidelity | Offline audio, podcasts |
| gpt-4o-mini-tts | Best | Medium | Voice instructions, streaming | Maximum control |

### Voice Selection Guide

| Voice | Character | Best For |
|-------|-----------|----------|
| alloy | Neutral, balanced | General use, professional |
| ash | Clear, professional | Business, presentations |
| ballad | Warm, storytelling | Narration, audiobooks |
| coral | Soft, friendly | Customer service, greetings |
| echo | Calm, measured | Meditation, calm content |
| fable | Expressive, narrative | Stories, entertainment |
| onyx | Deep, authoritative | News, serious content |
| nova | Bright, energetic | Marketing, enthusiastic content |
| sage | Wise, thoughtful | Educational, informative |
| shimmer | Gentle, soothing | Relaxation, sleep content |
| verse | Poetic, rhythmic | Poetry, artistic content |

### Voice Instructions (gpt-4o-mini-tts only)

```typescript
// Professional tone
const professional = await openai.audio.speech.create({
  model: 'gpt-4o-mini-tts',
  voice: 'ash',
  input: 'Welcome to our service',
  instructions: 'Speak in a calm, professional, and friendly tone suitable for customer service.',
});

// Energetic marketing
const marketing = await openai.audio.speech.create({
  model: 'gpt-4o-mini-tts',
  voice: 'nova',
  input: 'Don\'t miss this sale!',
  instructions: 'Use an enthusiastic, energetic tone perfect for marketing and advertisements.',
});

// Meditation guidance
const meditation = await openai.audio.speech.create({
  model: 'gpt-4o-mini-tts',
  voice: 'shimmer',
  input: 'Take a deep breath',
  instructions: 'Adopt a calm, soothing voice suitable for meditation and relaxation guidance.',
});
```

### Speed Control

```typescript
// Slow (0.5x)
{ speed: 0.5 } // Good for: Learning, accessibility

// Normal (1.0x)
{ speed: 1.0 } // Default

// Fast (1.5x)
{ speed: 1.5 } // Good for: Previews, time-saving

// Very fast (2.0x)
{ speed: 2.0 } // Good for: Quick previews only
```

Range: 0.25 to 4.0

### Audio Format Selection

| Format | Compression | Quality | Best For |
|--------|-------------|---------|----------|
| mp3 | Lossy | Good | Maximum compatibility |
| opus | Lossy | Excellent | Web streaming, low bandwidth |
| aac | Lossy | Good | iOS, Apple devices |
| flac | Lossless | Best | Archiving, editing |
| wav | Uncompressed | Best | Editing, processing |
| pcm | Raw | Best | Low-level processing |

---

## Common Patterns

### 1. Transcribe Interview

```typescript
const transcription = await openai.audio.transcriptions.create({
  file: fs.createReadStream('./interview.mp3'),
  model: 'whisper-1',
});

// Save transcript
fs.writeFileSync('./interview.txt', transcription.text);
```

### 2. Generate Podcast Narration

```typescript
const script = "Welcome to today's podcast...";

const audio = await openai.audio.speech.create({
  model: 'tts-1-hd',
  voice: 'fable',
  input: script,
  response_format: 'mp3',
});

const buffer = Buffer.from(await audio.arrayBuffer());
fs.writeFileSync('./podcast.mp3', buffer);
```

### 3. Multi-Voice Conversation

```typescript
// Speaker 1
const speaker1 = await openai.audio.speech.create({
  model: 'tts-1',
  voice: 'onyx',
  input: 'Hello, how are you?',
});

// Speaker 2
const speaker2 = await openai.audio.speech.create({
  model: 'tts-1',
  voice: 'nova',
  input: 'I\'m doing great, thanks!',
});

// Combine audio files (requires audio processing library)
```

---

## Cost Optimization

1. **Use tts-1 for real-time** (cheaper, faster)
2. **Use tts-1-hd for final production** (better quality)
3. **Cache generated audio** (deterministic for same input)
4. **Choose appropriate format** (opus for web, mp3 for compatibility)
5. **Batch transcriptions** with delays to avoid rate limits

---

## Common Issues

### Transcription Accuracy
- Improve audio quality
- Reduce background noise
- Ensure adequate volume levels
- Use supported audio formats

### TTS Naturalness
- Test different voices
- Use voice instructions (gpt-4o-mini-tts)
- Adjust speed for better pacing
- Add punctuation for natural pauses

### File Size
- Compress audio before transcribing
- Choose lossy formats (mp3, opus) for TTS
- Use appropriate bitrates

---

**See Also**: Official Audio Guide (https://platform.openai.com/docs/guides/speech-to-text)

references/cost-optimization.md (new file)

# Cost Optimization Guide

**Last Updated**: 2025-10-25

Strategies to minimize OpenAI API costs while maintaining quality.

---

## Model Selection Strategies

### 1. Model Cascading

Start with cheaper models, escalate only when needed:

```typescript
async function smartCompletion(prompt: string) {
  // Try gpt-5-nano first
  const nanoResult = await openai.chat.completions.create({
    model: 'gpt-5-nano',
    messages: [{ role: 'user', content: prompt }],
  });

  // Validate quality
  if (isGoodEnough(nanoResult)) {
    return nanoResult;
  }

  // Escalate to gpt-5-mini
  const miniResult = await openai.chat.completions.create({
    model: 'gpt-5-mini',
    messages: [{ role: 'user', content: prompt }],
  });

  if (isGoodEnough(miniResult)) {
    return miniResult;
  }

  // Final escalation to gpt-5
  return await openai.chat.completions.create({
    model: 'gpt-5',
    messages: [{ role: 'user', content: prompt }],
  });
}
```

### 2. Task-Based Model Selection

| Task | Model | Why |
|------|-------|-----|
| Simple chat | gpt-5-nano | Fast, cheap, sufficient |
| Summarization | gpt-5-mini | Good quality, cost-effective |
| Code generation | gpt-5 | Best reasoning, worth the cost |
| Data extraction | gpt-4o + structured output | Reliable, accurate |
| Vision tasks | gpt-4o | Only model with vision |

---

## Token Optimization

### 1. Limit max_tokens

```typescript
// ❌ No limit: May generate unnecessarily long responses
{
  model: 'gpt-5',
  messages,
}

// ✅ Set reasonable limit
{
  model: 'gpt-5',
  messages,
  max_tokens: 500, // Prevent runaway generation
}
```

### 2. Trim Conversation History

```typescript
function trimHistory(messages: Message[], maxTokens: number = 4000) {
  // Keep system message and recent messages.
  // maxTokens is a rough budget; this simple version just keeps the last 10
  // messages — swap in a real token count if you need precision.
  const system = messages.find(m => m.role === 'system');
  const recent = messages.slice(-10); // Last 10 messages

  return [system, ...recent].filter(Boolean);
}
```

### 3. Use Shorter Prompts

```typescript
// ❌ Verbose
"Please analyze the following text and provide a detailed summary of the main points, including any key takeaways and important details..."

// ✅ Concise
"Summarize key points:"
```

---

## Caching Strategies

### 1. Cache Embeddings

```typescript
const embeddingCache = new Map<string, number[]>();

async function getCachedEmbedding(text: string) {
  if (embeddingCache.has(text)) {
    return embeddingCache.get(text)!;
  }

  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });

  const embedding = response.data[0].embedding;
  embeddingCache.set(text, embedding);

  return embedding;
}
```

### 2. Cache Common Completions

```typescript
const completionCache = new Map<string, string>();

async function getCachedCompletion(prompt: string, model = 'gpt-5-mini') {
  const cacheKey = `${model}:${prompt}`;

  if (completionCache.has(cacheKey)) {
    return completionCache.get(cacheKey)!;
  }

  const result = await openai.chat.completions.create({
    model,
    messages: [{ role: 'user', content: prompt }],
  });

  const content = result.choices[0].message.content;
  completionCache.set(cacheKey, content!);

  return content;
}
```

---

## Batch Processing

### 1. Use Embeddings Batch API

```typescript
// ❌ Individual requests (expensive)
for (const doc of documents) {
  await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: doc,
  });
}

// ✅ Batch request (cheaper)
const response = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: documents, // Array of up to 2048 documents
});
```

### 2. Group Similar Requests

```typescript
// Process non-urgent requests in batches during off-peak hours
const batchQueue: string[] = [];

function queueForBatch(prompt: string) {
  batchQueue.push(prompt);

  if (batchQueue.length >= 10) {
    processBatch();
  }
}

async function processBatch() {
  // Process all at once
  const results = await Promise.all(
    batchQueue.map(prompt =>
      openai.chat.completions.create({
        model: 'gpt-5-nano',
        messages: [{ role: 'user', content: prompt }],
      })
    )
  );

  batchQueue.length = 0;
  return results;
}
```

---

## Feature-Specific Optimization

### Embeddings

1. **Use custom dimensions**: 256 instead of 1536 = 6x storage reduction
2. **Use text-embedding-3-small**: Cheaper than large, good for most use cases
3. **Batch requests**: Up to 2048 documents per request

### Images

1. **Use standard quality**: Unless HD is critical
2. **Use smaller sizes**: Generate 1024x1024 instead of 1792x1024 when possible
3. **Use natural style**: Cheaper than vivid

### Audio

1. **Use tts-1 for real-time**: Cheaper than tts-1-hd
2. **Use opus format**: Smaller files, good quality
3. **Cache generated audio**: Deterministic for same input

---

## Monitoring and Alerts

```typescript
interface CostTracker {
  totalTokens: number;
  totalCost: number;
  requestCount: number;
}

const tracker: CostTracker = {
  totalTokens: 0,
  totalCost: 0,
  requestCount: 0,
};

async function trackCosts(fn: () => Promise<any>) {
  const result = await fn();

  if (result.usage) {
    tracker.totalTokens += result.usage.total_tokens;
    tracker.requestCount++;

    // Estimate cost (adjust rates based on actual pricing).
    // estimateCost is your own helper — see the sketch below.
    const cost = estimateCost(result.model, result.usage.total_tokens);
    tracker.totalCost += cost;

    // Alert if threshold exceeded
    if (tracker.totalCost > 100) {
      console.warn('Cost threshold exceeded!', tracker);
    }
  }

  return result;
}
```
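
`estimateCost` above is not an SDK function — it is a helper you supply. A minimal sketch, assuming you maintain your own per-model rate table (left empty here on purpose; fill it from the official pricing page rather than hard-coding guesses):

```typescript
// USD per 1K tokens, keyed by model name. Populate from https://openai.com/pricing.
const RATE_PER_1K_TOKENS: Record<string, number> = {
  // 'gpt-5-mini': 0.0, // placeholder key — replace with the real rate
};

function estimateCost(model: string, totalTokens: number): number {
  const rate = RATE_PER_1K_TOKENS[model];
  if (rate === undefined) {
    console.warn(`No rate configured for ${model}; counting cost as 0`);
    return 0;
  }
  return (totalTokens / 1000) * rate;
}
```

A more precise version would track `prompt_tokens` and `completion_tokens` separately, since input and output are usually priced differently.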

---

## Cost Reduction Checklist

- [ ] Use cheapest model that meets requirements
- [ ] Set max_tokens limits
- [ ] Trim conversation history
- [ ] Cache embeddings and common queries
- [ ] Batch requests when possible
- [ ] Use custom embedding dimensions (256-512)
- [ ] Monitor token usage
- [ ] Implement rate limiting
- [ ] Use structured outputs to avoid retries
- [ ] Compress prompts (remove unnecessary words)
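
For the "implement rate limiting" item, a simple client-side concurrency cap already prevents most burst-induced 429s. This is a generic limiter sketch, not an OpenAI SDK feature:

```typescript
// Allow at most `limit` requests in flight; extra callers wait their turn.
function createLimiter(limit: number) {
  let active = 0;
  const queue: Array<() => void> = [];

  const release = () => {
    const next = queue.shift();
    if (next) {
      next();   // hand the slot directly to the next waiting caller
    } else {
      active--; // no one waiting: free the slot
    }
  };

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active >= limit) {
      await new Promise<void>(resolve => queue.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}

// Usage:
// const limited = createLimiter(5);
// const result = await limited(() => openai.chat.completions.create({ ... }));
```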

---

**Estimated Savings**: Following these practices can reduce costs by 40-70% while maintaining quality.

references/embeddings-guide.md (new file)

# Embeddings Guide

**Last Updated**: 2025-10-25

Complete guide to OpenAI's Embeddings API for semantic search, RAG, and clustering.

---

## Model Comparison

| Model | Default Dimensions | Custom Dimensions | Best For |
|-------|-------------------|-------------------|----------|
| text-embedding-3-large | 3072 | 256-3072 | Highest quality semantic search |
| text-embedding-3-small | 1536 | 256-1536 | Most applications, cost-effective |
| text-embedding-ada-002 | 1536 | Fixed | Legacy (use v3 models) |

---

## Dimension Selection

### Full Dimensions
- **text-embedding-3-small**: 1536 (default)
- **text-embedding-3-large**: 3072 (default)
- Use for maximum accuracy

### Reduced Dimensions
- **256 dims**: 4-12x storage reduction, minimal quality loss
- **512 dims**: 2-6x storage reduction, good quality
- Use for cost/storage optimization

```typescript
// Full dimensions (1536)
const full = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'Sample text',
});

// Reduced dimensions (256)
const reduced = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'Sample text',
  dimensions: 256,
});
```

---

## RAG (Retrieval-Augmented Generation) Pattern

### 1. Build Knowledge Base

```typescript
const documents = [
  'TypeScript is a superset of JavaScript',
  'Python is a high-level programming language',
  'React is a JavaScript library for UIs',
];

const embeddings = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: documents,
});

const knowledgeBase = documents.map((text, i) => ({
  text,
  embedding: embeddings.data[i].embedding,
}));
```

### 2. Query with Similarity Search

```typescript
// Embed user query
const queryEmbedding = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'What is TypeScript?',
});

// Find similar documents
const similarities = knowledgeBase.map(doc => ({
  text: doc.text,
  similarity: cosineSimilarity(queryEmbedding.data[0].embedding, doc.embedding),
}));

similarities.sort((a, b) => b.similarity - a.similarity);
const topResults = similarities.slice(0, 3);
```

### 3. Generate Answer with Context

```typescript
const context = topResults.map(r => r.text).join('\n\n');

const completion = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: `Answer using this context:\n\n${context}` },
    { role: 'user', content: 'What is TypeScript?' },
  ],
});
```

---

## Similarity Metrics

### Cosine Similarity (Recommended)

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}
```

### Euclidean Distance

```typescript
function euclideanDistance(a: number[], b: number[]): number {
  return Math.sqrt(
    a.reduce((sum, val, i) => sum + Math.pow(val - b[i], 2), 0)
  );
}
```

---

## Batch Processing

```typescript
// Process up to 2048 documents
const embeddings = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: documents, // Array of strings
});

embeddings.data.forEach((item, index) => {
  console.log(`Doc ${index}: ${item.embedding.length} dimensions`);
});
```

**Limits**:
- Max tokens per input: 8192
- Max summed tokens across all inputs: 300,000
- Max inputs per request (array length): 2048

---

## Best Practices

✅ **Pre-processing**:
- Normalize text (lowercase, remove special chars)
- Be consistent across queries and documents
- Chunk long documents (max 8192 tokens)

✅ **Storage**:
- Use custom dimensions (256-512) for storage optimization
- Store embeddings in vector databases (Pinecone, Weaviate, Qdrant)
- Cache embeddings (deterministic for same input)

✅ **Search**:
- Use cosine similarity for comparison
- Normalize embeddings before storing (L2 normalization)
- Pre-filter with metadata before similarity search

❌ **Don't**:
- Mix models (incompatible dimensions)
- Exceed token limits (8192 per input)
- Skip normalization
- Use raw embeddings without similarity metric
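
A minimal sketch of the L2 normalization mentioned above: with unit-length vectors, cosine similarity reduces to a plain dot product, which is cheaper to compute and is what many vector databases expect.

```typescript
// Scale a vector to unit length (L2 norm = 1).
function l2Normalize(vector: number[]): number[] {
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  if (norm === 0) return vector; // guard against an all-zero vector
  return vector.map(v => v / norm);
}

// Normalize once at indexing time, e.g. the knowledgeBase built earlier:
const normalizedKnowledgeBase = knowledgeBase.map(doc => ({
  ...doc,
  embedding: l2Normalize(doc.embedding),
}));
```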

---

## Use Cases

1. **Semantic Search**: Find similar documents
2. **RAG**: Retrieve context for generation
3. **Clustering**: Group similar content
4. **Recommendations**: Content-based recommendations
5. **Anomaly Detection**: Detect outliers
6. **Duplicate Detection**: Find similar/duplicate content

---

**See Also**: Official Embeddings Guide (https://platform.openai.com/docs/guides/embeddings)

references/function-calling-patterns.md (new file)

# Function Calling Patterns

**Last Updated**: 2025-10-25

Advanced patterns for implementing function calling (tool calling) with OpenAI's Chat Completions API.

---

## Basic Pattern

```typescript
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get current weather for a location',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string', description: 'City name' },
          unit: { type: 'string', enum: ['celsius', 'fahrenheit'] },
        },
        required: ['location'],
      },
    },
  },
];
```

---

## Advanced Patterns

### 1. Parallel Tool Calls

The model can call multiple tools simultaneously:

```typescript
const completion = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'user', content: 'What is the weather in SF and NYC?' }
  ],
  tools: tools,
});

// Model may return multiple tool_calls
const toolCalls = completion.choices[0].message.tool_calls;

// Execute all in parallel
const results = await Promise.all(
  toolCalls.map(call => executeFunction(call.function.name, call.function.arguments))
);
```

### 2. Dynamic Tool Generation

Generate tools based on runtime context:

```typescript
function generateTools(database: Database) {
  const tables = database.getTables();

  return tables.map(table => ({
    type: 'function',
    function: {
      name: `query_${table.name}`,
      description: `Query the ${table.name} table`,
      parameters: {
        type: 'object',
        properties: table.columns.reduce((acc, col) => ({
          ...acc,
          [col.name]: { type: col.type, description: col.description },
        }), {}),
      },
    },
  }));
}
```

### 3. Tool Chaining

Chain tool results:

```typescript
async function chatWithToolChaining(userMessage: string, maxIterations = 10) {
  let messages = [{ role: 'user', content: userMessage }];

  // Cap the loop so a misbehaving tool chain cannot run forever
  // (see "Limit recursion depth" under Best Practices below).
  for (let i = 0; i < maxIterations; i++) {
    const completion = await openai.chat.completions.create({
      model: 'gpt-5',
      messages,
      tools,
    });

    const message = completion.choices[0].message;
    messages.push(message);

    if (!message.tool_calls) {
      return message.content; // Final answer
    }

    // Execute tool calls and add results
    for (const toolCall of message.tool_calls) {
      const result = await executeFunction(
        toolCall.function.name,
        toolCall.function.arguments
      );

      messages.push({
        role: 'tool',
        tool_call_id: toolCall.id,
        content: JSON.stringify(result),
      });
    }
  }

  throw new Error('Tool chain did not converge within maxIterations');
}
```

### 4. Error Handling in Tools

```typescript
async function executeFunction(name: string, argsString: string) {
  try {
    const args = JSON.parse(argsString);

    switch (name) {
      case 'get_weather':
        return await getWeather(args.location, args.unit);

      default:
        return { error: `Unknown function: ${name}` };
    }
  } catch (error: any) {
    return { error: error.message };
  }
}
```

### 5. Streaming with Tools

```typescript
const stream = await openai.chat.completions.create({
  model: 'gpt-5',
  messages,
  tools,
  stream: true,
});

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta;

  // Check for tool calls in streaming
  if (delta?.tool_calls) {
    // Accumulate tool call data
    console.log('Tool call chunk:', delta.tool_calls);
  }
}
```

---

## Best Practices

✅ **Schema Design**:
- Provide clear descriptions for each parameter
- Use enum when options are limited
- Mark required vs optional parameters

✅ **Error Handling**:
- Return structured error objects
- Don't throw exceptions from tool functions
- Let the model handle error recovery

✅ **Performance**:
- Execute independent tool calls in parallel
- Cache tool results when appropriate
- Limit recursion depth to avoid infinite loops

❌ **Don't**:
- Expose sensitive internal functions
- Allow unlimited recursion
- Skip parameter validation
- Return unstructured error messages
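
For the "skip parameter validation" point: tool arguments arrive as model-generated JSON and should be checked before execution. A minimal sketch using `zod` (the same library the structured-output guide uses); the schema mirrors the `get_weather` tool from the Basic Pattern section, and `getWeather` is the same application function used in pattern 4:

```typescript
import { z } from 'zod';

const getWeatherArgs = z.object({
  location: z.string().min(1),
  unit: z.enum(['celsius', 'fahrenheit']).optional(),
});

async function executeGetWeather(argsString: string) {
  try {
    const parsed = getWeatherArgs.safeParse(JSON.parse(argsString));
    if (!parsed.success) {
      // Structured error so the model can recover (see pattern 4 above)
      return { error: `Invalid arguments: ${parsed.error.message}` };
    }
    return await getWeather(parsed.data.location, parsed.data.unit);
  } catch (error: any) {
    return { error: error.message };
  }
}
```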

---

**See Also**: Official Function Calling Guide (https://platform.openai.com/docs/guides/function-calling)

references/images-guide.md (new file)

# Images Guide (DALL-E 3 & GPT-Image-1)

**Last Updated**: 2025-10-25

Best practices for image generation and editing with OpenAI's Images API.

---

## DALL-E 3 Generation

### Size Selection

| Size | Use Case |
|------|----------|
| 1024x1024 | Profile pictures, icons, square posts |
| 1024x1536 | Portrait photos, vertical ads |
| 1536x1024 | Landscape photos, banners |
| 1024x1792 | Tall portraits, mobile wallpapers |
| 1792x1024 | Wide banners, desktop wallpapers |

Note: DALL-E 3 itself accepts 1024x1024, 1024x1792, and 1792x1024; the 1024x1536 and 1536x1024 sizes are for gpt-image-1.

### Quality Settings

**standard**: Normal quality, faster, cheaper
- Use for: Prototyping, high-volume generation, quick iterations

**hd**: High definition, finer details, more expensive
- Use for: Final production images, marketing materials, print

### Style Options

**vivid**: Hyper-real, dramatic, high-contrast
- Use for: Marketing, advertising, eye-catching visuals

**natural**: More realistic, less dramatic
- Use for: Product photos, realistic scenes, professional content
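
A minimal generation call that pulls the options above together; the prompt is only an example, and the size/quality/style values are the ones documented in this guide:

```typescript
const image = await openai.images.generate({
  model: 'dall-e-3',
  prompt: 'Professional product photo of a ceramic mug, studio lighting',
  size: '1024x1024',
  quality: 'standard', // 'hd' for final production assets
  style: 'natural',    // 'vivid' for more dramatic results
  n: 1,
});

// URLs expire (see Common Issues below), so download the image if you need to keep it.
console.log(image.data[0].url);
console.log(image.data[0].revised_prompt); // DALL-E 3 may rewrite your prompt
```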

---

## Prompting Best Practices

### Be Specific

```
❌ "A cat"
✅ "A white siamese cat with striking blue eyes, sitting on a wooden table, golden hour lighting, professional photography"
```

### Include Art Style

```
✅ "Oil painting of a sunset in the style of Claude Monet"
✅ "3D render of a futuristic city, Pixar animation style"
✅ "Professional product photo with studio lighting"
```

### Specify Lighting

```
- "Golden hour lighting"
- "Soft studio lighting from the left"
- "Dramatic shadows"
- "Bright natural daylight"
```

### Composition Details

```
- "Shallow depth of field"
- "Wide angle lens"
- "Centered composition"
- "Rule of thirds"
```

---

## GPT-Image-1 Editing

### Input Fidelity

**low**: More creative freedom
- Use for: Major transformations, style changes

**medium**: Balance (default)
- Use for: Most editing tasks

**high**: Stay close to original
- Use for: Subtle edits, preserving details

### Common Editing Tasks

1. **Background Removal**

   ```typescript
   formData.append('prompt', 'Remove the background, keep only the product');
   formData.append('format', 'png');
   formData.append('background', 'transparent');
   ```

2. **Color Correction**

   ```typescript
   formData.append('prompt', 'Increase brightness and saturation, make colors more vibrant');
   ```

3. **Object Removal**

   ```typescript
   formData.append('prompt', 'Remove the person from the background');
   ```

4. **Compositing**

   ```typescript
   formData.append('image', mainImage);
   formData.append('image_2', logoImage);
   formData.append('prompt', 'Add the logo to the product, as if stamped on the surface');
   ```
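
The snippets above only assemble the form. One way to actually send it is a direct multipart request to the images edit endpoint; a sketch assuming Node 18+ (built-in `fetch`, `FormData`, and `Blob`) and a local `product.png` as the source image:

```typescript
import fs from 'node:fs';

const formData = new FormData();
formData.append('model', 'gpt-image-1');
formData.append('image', new Blob([fs.readFileSync('./product.png')]), 'product.png');
formData.append('prompt', 'Remove the background, keep only the product');

const response = await fetch('https://api.openai.com/v1/images/edits', {
  method: 'POST',
  headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
  body: formData, // fetch sets the multipart boundary automatically
});

const result = await response.json();
// gpt-image-1 returns base64-encoded image data
fs.writeFileSync('./edited.png', Buffer.from(result.data[0].b64_json, 'base64'));
```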

---

## Format Selection

| Format | Transparency | Compression | Best For |
|--------|--------------|-------------|----------|
| PNG | Yes | Lossless | Logos, transparency needed |
| JPEG | No | Lossy | Photos, smaller file size |
| WebP | Yes | Lossy | Web, best compression |

---

## Cost Optimization

1. Use standard quality unless HD is critical
2. Generate smaller sizes when possible
3. Cache generated images
4. Use natural style for most cases (vivid costs more)
5. Batch requests with delays to avoid rate limits

---

## Common Issues

### Prompt Revision
DALL-E 3 may revise prompts for safety/quality. Check `revised_prompt` in response.

### URL Expiration
Image URLs expire in 1 hour. Download and save if needed long-term.

### Non-Deterministic
Same prompt = different images. Cache results if consistency needed.

### Rate Limits
DALL-E has separate IPM (Images Per Minute) limits. Monitor and implement delays.

---

**See Also**: Official Images Guide (https://platform.openai.com/docs/guides/images)

references/models-guide.md (new file)

# OpenAI Models Guide

**Last Updated**: 2025-10-25

This guide provides a comprehensive comparison of OpenAI's language models to help you choose the right model for your use case.

---

## GPT-5 Series (Released August 2025)

### gpt-5
**Status**: Latest flagship model
**Best for**: Complex reasoning, advanced problem-solving, code generation

**Key Features**:
- Advanced reasoning capabilities
- Unique parameters: `reasoning_effort`, `verbosity`
- Best-in-class performance on complex tasks

**Limitations**:
- ❌ No `temperature` support
- ❌ No `top_p` support
- ❌ No `logprobs` support
- ❌ CoT (Chain of Thought) does NOT persist between turns

**When to use**:
- Complex mathematical problems
- Advanced code generation
- Logic puzzles and reasoning tasks
- Multi-step problem solving

**Cost**: Highest pricing tier

---

### gpt-5-mini
**Status**: Cost-effective GPT-5 variant
**Best for**: Balanced performance and cost

**Key Features**:
- Same parameter support as gpt-5 (`reasoning_effort`, `verbosity`)
- Better than GPT-4 Turbo performance
- Significantly cheaper than gpt-5

**When to use**:
- Most production applications
- When you need GPT-5 features but not maximum performance
- High-volume use cases where cost matters

**Cost**: Mid-tier pricing

---

### gpt-5-nano
**Status**: Smallest GPT-5 variant
**Best for**: Simple tasks, high-volume processing

**Key Features**:
- Fastest response times
- Lowest cost in GPT-5 series
- Still supports GPT-5 unique parameters

**When to use**:
- Simple text generation
- High-volume batch processing
- Real-time streaming applications
- Cost-sensitive deployments

**Cost**: Low-tier pricing

---

## GPT-4o Series

### gpt-4o
**Status**: Multimodal flagship (pre-GPT-5)
**Best for**: Vision tasks, multimodal applications

**Key Features**:
- ✅ Vision support (image understanding)
- ✅ Temperature control
- ✅ Top-p sampling
- ✅ Function calling
- ✅ Structured outputs

**Limitations**:
- ❌ No `reasoning_effort` parameter
- ❌ No `verbosity` parameter

**When to use**:
- Image understanding and analysis
- OCR / text extraction from images
- Visual question answering
- When you need temperature/top_p control
- Multimodal applications

**Cost**: High-tier pricing (cheaper than gpt-5)

---

### gpt-4-turbo
**Status**: Fast GPT-4 variant
**Best for**: When you need GPT-4 speed

**Key Features**:
- Faster than base GPT-4
- Full parameter support (temperature, top_p, logprobs)
- Good balance of quality and speed

**When to use**:
- When GPT-4 quality is needed with faster responses
- Legacy applications requiring specific parameters
- When vision is not required

**Cost**: Mid-tier pricing

---

## Comparison Table

| Feature | GPT-5 | GPT-5-mini | GPT-5-nano | GPT-4o | GPT-4 Turbo |
|---------|-------|------------|------------|--------|-------------|
| **Reasoning** | Best | Excellent | Good | Excellent | Excellent |
| **Speed** | Medium | Medium | Fastest | Medium | Fast |
| **Cost** | Highest | Mid | Lowest | High | Mid |
| **reasoning_effort** | ✅ | ✅ | ✅ | ❌ | ❌ |
| **verbosity** | ✅ | ✅ | ✅ | ❌ | ❌ |
| **temperature** | ❌ | ❌ | ❌ | ✅ | ✅ |
| **top_p** | ❌ | ❌ | ❌ | ✅ | ✅ |
| **Vision** | ❌ | ❌ | ❌ | ✅ | ❌ |
| **Function calling** | ✅ | ✅ | ✅ | ✅ | ✅ |
| **Structured outputs** | ✅ | ✅ | ✅ | ✅ | ✅ |
| **Max output tokens** | 16,384 | 16,384 | 16,384 | 16,384 | 16,384 |

---

## Selection Guide

### Use GPT-5 when:
- ✅ You need the best reasoning performance
- ✅ Complex mathematical or logical problems
- ✅ Advanced code generation
- ✅ Multi-step problem solving
- ❌ Cost is not the primary concern

### Use GPT-5-mini when:
- ✅ You want GPT-5 features at lower cost
- ✅ Production applications with high volume
- ✅ Good reasoning performance is needed
- ✅ Balance of quality and cost matters

### Use GPT-5-nano when:
- ✅ Simple text generation tasks
- ✅ High-volume batch processing
- ✅ Real-time streaming applications
- ✅ Cost optimization is critical
- ❌ Complex reasoning is not required

### Use GPT-4o when:
- ✅ Vision / image understanding is required
- ✅ You need temperature/top_p control
- ✅ Multimodal applications
- ✅ OCR and visual analysis
- ❌ Pure text tasks (use GPT-5 series)

### Use GPT-4 Turbo when:
- ✅ Legacy application compatibility
- ✅ You need specific parameters not in GPT-5
- ✅ Fast responses without vision
- ❌ Not recommended for new applications (use GPT-5 or GPT-4o)

---

## Cost Optimization Strategies

### 1. Model Cascading
Start with cheaper models and escalate only when needed:

```
gpt-5-nano (try first) → gpt-5-mini → gpt-5 (if needed)
```

### 2. Task-Specific Model Selection
- **Simple**: Use gpt-5-nano
- **Medium complexity**: Use gpt-5-mini
- **Complex reasoning**: Use gpt-5
- **Vision tasks**: Use gpt-4o

### 3. Hybrid Approach
- Use embeddings (cheap) for retrieval
- Use gpt-5-mini for generation
- Use gpt-5 only for critical decisions

### 4. Batch Processing
- Use cheaper models for bulk operations
- Reserve expensive models for user-facing requests

---

## Parameter Guide

### GPT-5 Unique Parameters

**reasoning_effort**: Controls reasoning depth
- "minimal": Quick responses
- "low": Basic reasoning
- "medium": Balanced (default)
- "high": Deep reasoning for complex problems

**verbosity**: Controls output length
- "low": Concise responses
- "medium": Balanced detail (default)
- "high": Verbose, detailed responses

### GPT-4o/GPT-4 Turbo Parameters

**temperature**: Controls randomness (0-2)
- 0: Deterministic, focused
- 1: Balanced creativity (default)
- 2: Maximum creativity

**top_p**: Nucleus sampling (0-1)
- Lower values: More focused
- Higher values: More diverse

**logprobs**: Get token probabilities
- Useful for debugging and analysis

---

## Common Patterns

### Pattern 1: Automatic Model Selection

```typescript
function selectModel(taskComplexity: 'simple' | 'medium' | 'complex') {
  switch (taskComplexity) {
    case 'simple':
      return 'gpt-5-nano';
    case 'medium':
      return 'gpt-5-mini';
    case 'complex':
      return 'gpt-5';
  }
}
```

### Pattern 2: Fallback Chain

```typescript
async function completionWithFallback(prompt: string) {
  const models = ['gpt-5-nano', 'gpt-5-mini', 'gpt-5'];

  for (const model of models) {
    try {
      const result = await openai.chat.completions.create({
        model,
        messages: [{ role: 'user', content: prompt }],
      });

      // Validate quality
      if (isGoodEnough(result)) {
        return result;
      }
    } catch (error) {
      continue;
    }
  }

  throw new Error('All models failed');
}
```

### Pattern 3: Vision + Text Hybrid

```typescript
// Use gpt-4o for image analysis
const imageAnalysis = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Describe this image' },
        { type: 'image_url', image_url: { url: imageUrl } },
      ],
    },
  ],
});

// Use gpt-5 for reasoning based on analysis
const reasoning = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: `Image analysis: ${imageAnalysis.choices[0].message.content}` },
    { role: 'user', content: 'What does this imply about...' },
  ],
});
```

---

## Official Documentation

- **GPT-5 Guide**: https://platform.openai.com/docs/guides/latest-model
- **Model Pricing**: https://openai.com/pricing
- **Model Comparison**: https://platform.openai.com/docs/models

---

**Summary**: Choose the right model based on your specific needs. GPT-5 series for reasoning, GPT-4o for vision, and optimize costs by selecting the smallest model that meets your requirements.

references/structured-output-guide.md (new file)

# Structured Output Guide

**Last Updated**: 2025-10-25

Best practices for using JSON schemas with OpenAI's structured outputs feature.

---

## When to Use Structured Outputs

Use structured outputs when you need:
- ✅ **Guaranteed JSON format**: Response will always be valid JSON
- ✅ **Schema validation**: Enforce specific structure
- ✅ **Type safety**: Parse directly into TypeScript types
- ✅ **Data extraction**: Pull specific fields from text
- ✅ **Classification**: Map to predefined categories

---

## Schema Best Practices

### 1. Keep Schemas Simple

```typescript
// ✅ Good: Simple, focused schema
{
  type: 'object',
  properties: {
    name: { type: 'string' },
    age: { type: 'number' },
  },
  required: ['name', 'age'],
  additionalProperties: false,
}

// ❌ Avoid: Overly complex nested structures
// (they work but are harder to debug)
```

### 2. Use Enums for Fixed Options

```typescript
{
  type: 'object',
  properties: {
    category: {
      type: 'string',
      enum: ['bug', 'feature', 'question'],
    },
    priority: {
      type: 'string',
      enum: ['low', 'medium', 'high', 'critical'],
    },
  },
  required: ['category', 'priority'],
}
```

### 3. Always Use `strict: true`

```typescript
response_format: {
  type: 'json_schema',
  json_schema: {
    name: 'response_schema',
    strict: true, // ✅ Enforces exact compliance
    schema: { /* ... */ },
  },
}
```

### 4. Set `additionalProperties: false`

```typescript
{
  type: 'object',
  properties: { /* ... */ },
  required: [ /* ... */ ],
  additionalProperties: false, // ✅ Prevents unexpected fields
}
```

---

## Common Use Cases

### Data Extraction

```typescript
const schema = {
  type: 'object',
  properties: {
    person: { type: 'string' },
    // With strict mode, every property must appear in `required`;
    // model optional fields as nullable instead.
    company: { type: ['string', 'null'] },
    email: { type: ['string', 'null'] },
    phone: { type: ['string', 'null'] },
  },
  required: ['person', 'company', 'email', 'phone'],
  additionalProperties: false,
};

// Extract from unstructured text
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'Extract contact information' },
    { role: 'user', content: 'John works at TechCorp, email: john@tech.com' },
  ],
  response_format: { type: 'json_schema', json_schema: { name: 'contact', strict: true, schema } },
});

const contact = JSON.parse(completion.choices[0].message.content);
// { person: "John", company: "TechCorp", email: "john@tech.com", phone: null }
```

### Classification

```typescript
const schema = {
  type: 'object',
  properties: {
    sentiment: { type: 'string', enum: ['positive', 'negative', 'neutral'] },
    confidence: { type: 'number' },
    topics: { type: 'array', items: { type: 'string' } },
  },
  required: ['sentiment', 'confidence', 'topics'],
  additionalProperties: false,
};

// Classify text
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'Classify the text' },
    { role: 'user', content: 'This product is amazing!' },
  ],
  response_format: { type: 'json_schema', json_schema: { name: 'classification', strict: true, schema } },
});

const result = JSON.parse(completion.choices[0].message.content);
// { sentiment: "positive", confidence: 0.95, topics: ["product", "satisfaction"] }
```

---

## TypeScript Integration

### Type-Safe Parsing

```typescript
interface PersonProfile {
  name: string;
  age: number;
  skills: string[];
}

const schema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    age: { type: 'number' },
    skills: { type: 'array', items: { type: 'string' } },
  },
  required: ['name', 'age', 'skills'],
  additionalProperties: false,
};

const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Generate a person profile' }],
  response_format: { type: 'json_schema', json_schema: { name: 'person', strict: true, schema } },
});

const person: PersonProfile = JSON.parse(completion.choices[0].message.content);
// TypeScript knows the shape!
```

---

## Error Handling

```typescript
try {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages,
    response_format: { type: 'json_schema', json_schema: { name: 'data', strict: true, schema } },
  });

  const data = JSON.parse(completion.choices[0].message.content);
  return data;
} catch (error: any) {
  if (error.message.includes('JSON')) {
    console.error('Failed to parse JSON (should not happen with strict mode)');
  }
  throw error;
}
```

---

## Validation

While `strict: true` ensures the response matches the schema, you may want additional validation:

```typescript
import { z } from 'zod';

const zodSchema = z.object({
  email: z.string().email(),
  age: z.number().min(0).max(120),
});

const data = JSON.parse(completion.choices[0].message.content);
const validated = zodSchema.parse(data); // Throws if invalid
```

---

**See Also**: Official Structured Outputs Guide (https://platform.openai.com/docs/guides/structured-outputs)

references/top-errors.md (new file)

# Top OpenAI API Errors & Solutions

**Last Updated**: 2025-10-25
**Skill**: openai-api
**Status**: Phase 1 Complete

---

## Overview

This document covers the 10 most common errors encountered when using OpenAI APIs, with causes, solutions, and code examples.

---

## 1. Rate Limit Error (429)

### Cause
Too many requests or tokens per minute/day.

### Error Response
```json
{
  "error": {
    "message": "Rate limit reached",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```

### Solution
Implement exponential backoff:

```typescript
async function completionWithRetry(params, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await openai.chat.completions.create(params);
    } catch (error: any) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
        console.log(`Rate limited. Retrying in ${delay}ms...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
```

---

## 2. Invalid API Key (401)

### Cause
Missing or incorrect `OPENAI_API_KEY`.

### Error Response
```json
{
  "error": {
    "message": "Incorrect API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
```

### Solution
Verify environment variable:

```bash
# Check if set
echo $OPENAI_API_KEY

# Set in .env
OPENAI_API_KEY=sk-...
```

```typescript
if (!process.env.OPENAI_API_KEY) {
  throw new Error('OPENAI_API_KEY environment variable is required');
}

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
```

---

## 3. Function Calling Schema Mismatch

### Cause
Tool definition doesn't match model expectations or arguments are invalid.

### Error Response
```json
{
  "error": {
    "message": "Invalid schema for function 'get_weather'",
    "type": "invalid_request_error"
  }
}
```

### Solution
Validate JSON schema:

```typescript
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get weather for a location', // Required
      parameters: { // Required
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'City name' // Add descriptions
          }
        },
        required: ['location'] // Specify required fields
      }
    }
  }
];
```

---

## 4. Streaming Parse Error

### Cause
Incomplete or malformed SSE (Server-Sent Events) chunks.

### Symptom
```
SyntaxError: Unexpected end of JSON input
```

### Solution
Properly handle SSE format:

```typescript
// `chunk` is one decoded text chunk read from the streaming HTTP response body
const lines = chunk.split('\n').filter(line => line.trim() !== '');

for (const line of lines) {
  if (line.startsWith('data: ')) {
    const data = line.slice(6);

    if (data === '[DONE]') {
      break;
    }

    try {
      const json = JSON.parse(data);
      const content = json.choices[0]?.delta?.content || '';
      console.log(content);
    } catch (e) {
      // Skip invalid JSON - don't crash
      console.warn('Skipping invalid JSON chunk');
    }
  }
}
```

---

## 5. Vision Image Encoding Error

### Cause
Invalid base64 encoding or unsupported image format.

### Error Response
```json
{
  "error": {
    "message": "Invalid image format",
    "type": "invalid_request_error"
  }
}
```

### Solution
Ensure proper base64 encoding:

```typescript
import fs from 'fs';

// Read and encode image
const imageBuffer = fs.readFileSync('./image.jpg');
const base64Image = imageBuffer.toString('base64');

// Use with correct MIME type
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        {
          type: 'image_url',
          image_url: {
            url: `data:image/jpeg;base64,${base64Image}` // Include MIME type
          }
        }
      ]
    }
  ]
});
```

---

## 6. Token Limit Exceeded

### Cause
Input + output tokens exceed model's context window.

### Error Response
```json
{
  "error": {
    "message": "This model's maximum context length is 128000 tokens",
    "type": "invalid_request_error",
    "code": "context_length_exceeded"
  }
}
```

### Solution
Truncate input or reduce max_tokens:

```typescript
function truncateMessages(messages, maxTokens = 120000) {
  // Rough estimate: 1 token ≈ 4 characters
  const maxChars = maxTokens * 4;
  let totalChars = 0;

  // Copy before reversing so the caller's array is not mutated
  const truncated = [];
  for (const msg of [...messages].reverse()) {
    const msgChars = msg.content.length;
    if (totalChars + msgChars > maxChars) break;
    truncated.unshift(msg);
    totalChars += msgChars;
  }

  return truncated;
}

const completion = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: truncateMessages(messages),
  max_tokens: 8000, // Limit output tokens
});
```

---

## 7. GPT-5 Temperature Not Supported

### Cause
Using `temperature` parameter with GPT-5 models.

### Error Response
```json
{
  "error": {
    "message": "temperature is not supported for gpt-5",
    "type": "invalid_request_error"
  }
}
```

### Solution
Use `reasoning_effort` instead or switch to GPT-4o:

```typescript
// ❌ Bad - GPT-5 doesn't support temperature
const completion = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [...],
  temperature: 0.7, // NOT SUPPORTED
});

// ✅ Good - Use reasoning_effort for GPT-5
const completion = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [...],
  reasoning_effort: 'medium',
});

// ✅ Or use GPT-4o if you need temperature
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [...],
  temperature: 0.7,
});
```

---

## 8. Streaming Not Closed Properly

### Cause
Stream not properly terminated, causing resource leaks.

### Symptom
Memory leaks, hanging connections.

### Solution
Always close streams:

```typescript
const stream = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [...],
  stream: true,
});

try {
  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || '';
    process.stdout.write(content);
  }
} finally {
  // Stream is automatically closed when iteration completes
  // But handle errors explicitly
}

// For fetch-based streaming:
const reader = response.body?.getReader();
try {
  while (true) {
    const { done, value } = await reader!.read();
    if (done) break;
    // Process chunk
  }
} finally {
  reader!.releaseLock(); // Important!
}
```

---

## 9. API Key Exposure in Client-Side Code

### Cause
Including API key in frontend JavaScript.

### Risk
API key visible to all users, can be stolen and abused.

### Solution
Use server-side proxy:

```typescript
// ❌ Bad - Client-side (NEVER DO THIS)
const apiKey = 'sk-...'; // Exposed to all users!
const response = await fetch('https://api.openai.com/v1/chat/completions', {
  headers: { 'Authorization': `Bearer ${apiKey}` }
});

// ✅ Good - Server-side proxy
// Frontend:
const response = await fetch('/api/chat', {
  method: 'POST',
  body: JSON.stringify({ message: 'Hello' }),
});

// Backend (e.g., Express):
app.post('/api/chat', async (req, res) => {
  const completion = await openai.chat.completions.create({
    model: 'gpt-5',
    messages: [{ role: 'user', content: req.body.message }],
  });
  res.json(completion);
});
```

---

## 10. Embeddings Dimension Mismatch

### Cause
Using wrong dimensions for embedding model.

### Error Response
```json
{
  "error": {
    "message": "dimensions must be less than or equal to 3072 for text-embedding-3-large",
    "type": "invalid_request_error"
  }
}
```

### Solution
Use correct dimensions for each model:

```typescript
// text-embedding-3-small: default 1536, max 1536
const embedding1 = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'Hello world',
  // dimensions: 256, // Optional: reduce from default 1536
});

// text-embedding-3-large: default 3072, max 3072
const embedding2 = await openai.embeddings.create({
  model: 'text-embedding-3-large',
  input: 'Hello world',
  // dimensions: 1024, // Optional: reduce from default 3072
});

// text-embedding-ada-002: fixed 1536 (no dimensions parameter)
const embedding3 = await openai.embeddings.create({
  model: 'text-embedding-ada-002',
  input: 'Hello world',
  // No dimensions parameter supported
});
```

---

## Quick Reference Table

| Error Code | HTTP Status | Primary Cause | Quick Fix |
|------------|-------------|---------------|-----------|
| `rate_limit_exceeded` | 429 | Too many requests | Exponential backoff |
| `invalid_api_key` | 401 | Wrong/missing key | Check OPENAI_API_KEY |
| `invalid_request_error` | 400 | Bad parameters | Validate schema/params |
| `context_length_exceeded` | 400 | Too many tokens | Truncate input |
| `model_not_found` | 404 | Invalid model name | Use correct model ID |
| `insufficient_quota` | 429 | No credits left | Add billing/credits |

---

## Additional Resources

- **Official Error Codes**: https://platform.openai.com/docs/guides/error-codes
- **Rate Limits Guide**: https://platform.openai.com/docs/guides/rate-limits
- **Best Practices**: https://platform.openai.com/docs/guides/production-best-practices

---

**Phase 1 Complete** ✅
**Phase 2**: Additional errors for Embeddings, Images, Audio, Moderation (next session)