# OpenAI Models Guide

**Last Updated**: 2025-10-25

This guide compares OpenAI's language models to help you choose the right one for your use case.

---

## GPT-5 Series (Released August 2025)

### gpt-5
**Status**: Latest flagship model
**Best for**: Complex reasoning, advanced problem-solving, code generation

**Key Features**:
- Advanced reasoning capabilities
- Unique parameters: `reasoning_effort`, `verbosity`
- Best-in-class performance on complex tasks

**Limitations**:
- ❌ No `temperature` support
- ❌ No `top_p` support
- ❌ No `logprobs` support
- ❌ CoT (Chain of Thought) does NOT persist between turns

**When to use**:
- Complex mathematical problems
- Advanced code generation
- Logic puzzles and reasoning tasks
- Multi-step problem solving

**Cost**: Highest pricing tier

---

### gpt-5-mini
**Status**: Cost-effective GPT-5 variant
**Best for**: Balanced performance and cost

**Key Features**:
- Same parameter support as gpt-5 (`reasoning_effort`, `verbosity`)
- Outperforms GPT-4 Turbo
- Significantly cheaper than gpt-5

**When to use**:
- Most production applications
- When you need GPT-5 features but not maximum performance
- High-volume use cases where cost matters

**Cost**: Mid-tier pricing

---

### gpt-5-nano
**Status**: Smallest GPT-5 variant
**Best for**: Simple tasks, high-volume processing

**Key Features**:
- Fastest response times
- Lowest cost in GPT-5 series
- Still supports GPT-5 unique parameters

**When to use**:
- Simple text generation
- High-volume batch processing
- Real-time streaming applications
- Cost-sensitive deployments

**Cost**: Low-tier pricing

---

## GPT-4o Series

### gpt-4o
**Status**: Multimodal flagship (pre-GPT-5)
**Best for**: Vision tasks, multimodal applications

**Key Features**:
- ✅ Vision support (image understanding)
- ✅ Temperature control
- ✅ Top-p sampling
- ✅ Function calling
- ✅ Structured outputs

**Limitations**:
- ❌ No `reasoning_effort` parameter
- ❌ No `verbosity` parameter

**When to use**:
- Image understanding and analysis
- OCR / text extraction from images
- Visual question answering
- When you need temperature/top_p control
- Multimodal applications

**Cost**: High-tier pricing (cheaper than gpt-5)

---

### gpt-4-turbo
**Status**: Fast GPT-4 variant
**Best for**: GPT-4 quality with faster responses

**Key Features**:
- Faster than base GPT-4
- Full parameter support (temperature, top_p, logprobs)
- Good balance of quality and speed

**When to use**:
- When GPT-4 quality is needed with faster responses
- Legacy applications requiring specific parameters
- When vision is not required

**Cost**: Mid-tier pricing

---

## Comparison Table

| Feature | GPT-5 | GPT-5-mini | GPT-5-nano | GPT-4o | GPT-4 Turbo |
|---------|-------|------------|------------|--------|-------------|
| **Reasoning** | Best | Excellent | Good | Excellent | Excellent |
| **Speed** | Medium | Medium | Fastest | Medium | Fast |
| **Cost** | Highest | Mid | Lowest | High | Mid |
| **reasoning_effort** | ✅ | ✅ | ✅ | ❌ | ❌ |
| **verbosity** | ✅ | ✅ | ✅ | ❌ | ❌ |
| **temperature** | ❌ | ❌ | ❌ | ✅ | ✅ |
| **top_p** | ❌ | ❌ | ❌ | ✅ | ✅ |
| **Vision** | ❌ | ❌ | ❌ | ✅ | ❌ |
| **Function calling** | ✅ | ✅ | ✅ | ✅ | ✅ |
| **Structured outputs** | ✅ | ✅ | ✅ | ✅ | ✅ |
| **Max output tokens** | 16,384 | 16,384 | 16,384 | 16,384 | 16,384 |

---

## Selection Guide

### Use GPT-5 when:
- ✅ You need the best reasoning performance
- ✅ Complex mathematical or logical problems
- ✅ Advanced code generation
- ✅ Multi-step problem solving
- ❌ Not when cost is the primary concern

### Use GPT-5-mini when:
- ✅ You want GPT-5 features at lower cost
- ✅ Production applications with high volume
- ✅ Good reasoning performance is needed
- ✅ Balance of quality and cost matters

### Use GPT-5-nano when:
- ✅ Simple text generation tasks
- ✅ High-volume batch processing
- ✅ Real-time streaming applications
- ✅ Cost optimization is critical
- ❌ Not when complex reasoning is required

### Use GPT-4o when:
- ✅ Vision / image understanding is required
- ✅ You need temperature/top_p control
- ✅ Multimodal applications
- ✅ OCR and visual analysis
- ❌ Not for pure text tasks (use the GPT-5 series)

### Use GPT-4 Turbo when:
- ✅ Legacy application compatibility
- ✅ You need specific parameters not in GPT-5
- ✅ Fast responses without vision
- ❌ Not recommended for new applications (use GPT-5 or GPT-4o)

---

## Cost Optimization Strategies

### 1. Model Cascading
Start with cheaper models and escalate only when needed:

```
gpt-5-nano (try first) → gpt-5-mini → gpt-5 (if needed)
```

### 2. Task-Specific Model Selection
- **Simple**: Use gpt-5-nano
- **Medium complexity**: Use gpt-5-mini
- **Complex reasoning**: Use gpt-5
- **Vision tasks**: Use gpt-4o

### 3. Hybrid Approach
- Use embeddings (cheap) for retrieval
- Use gpt-5-mini for generation
- Use gpt-5 only for critical decisions

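
The retrieval half of the hybrid approach can be sketched as a cosine-similarity ranking over embedding vectors. This is a minimal illustration, not a production retriever: the `Doc` shape, the helper names (`cosineSimilarity`, `topK`, `modelForStep`), and the toy three-dimensional vectors are all assumptions made for the example. In practice the vectors would come from an embeddings endpoint, and only the top hits would be passed to gpt-5-mini (or to gpt-5 for a step flagged as critical).

```typescript
// Hypothetical document shape: an id plus a precomputed embedding vector.
interface Doc {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embedding dimensions differ');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k documents most similar to the query, best match first.
function topK(query: number[], docs: Doc[], k: number): Doc[] {
  return [...docs]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding),
    )
    .slice(0, k);
}

// Generation uses gpt-5-mini; gpt-5 is reserved for critical decisions.
function modelForStep(critical: boolean): string {
  return critical ? 'gpt-5' : 'gpt-5-mini';
}

// Toy vectors standing in for real embeddings (assumption for illustration).
const docs: Doc[] = [
  { id: 'billing', embedding: [1, 0, 0] },
  { id: 'refunds', embedding: [0.9, 0.1, 0] },
  { id: 'shipping', embedding: [0, 1, 0] },
];

const hits = topK([1, 0, 0], docs, 2);
console.log(hits.map((d) => d.id), modelForStep(false));
```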

### 4. Batch Processing
- Use cheaper models for bulk operations
- Reserve expensive models for user-facing requests

---

## Parameter Guide

### GPT-5 Unique Parameters

**reasoning_effort**: Controls reasoning depth
- "minimal": Quick responses
- "low": Basic reasoning
- "medium": Balanced (default)
- "high": Deep reasoning for complex problems

**verbosity**: Controls output length
- "low": Concise responses
- "medium": Balanced detail (default)
- "high": Verbose, detailed responses

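
As a rough sketch of how these two parameters might be set, the helper below builds a plain request body using the values and defaults listed above. The names `Gpt5Request` and `buildGpt5Request` are hypothetical, and the example assumes that `reasoning_effort` and `verbosity` are passed as top-level fields of a Chat Completions request (the same shape handed to `openai.chat.completions.create` in the patterns later in this guide); check the API reference for your SDK version before relying on that. It makes no network calls.

```typescript
// Allowed values, as listed in the parameter guide.
type ReasoningEffort = 'minimal' | 'low' | 'medium' | 'high';
type Verbosity = 'low' | 'medium' | 'high';

// Hypothetical request shape for illustration only.
interface Gpt5Request {
  model: 'gpt-5' | 'gpt-5-mini' | 'gpt-5-nano';
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[];
  reasoning_effort: ReasoningEffort;
  verbosity: Verbosity;
}

interface Gpt5Options {
  model?: Gpt5Request['model'];
  reasoningEffort?: ReasoningEffort;
  verbosity?: Verbosity;
}

// Build a request body, falling back to the documented "medium" defaults.
function buildGpt5Request(prompt: string, options: Gpt5Options = {}): Gpt5Request {
  return {
    model: options.model ?? 'gpt-5-mini',
    messages: [{ role: 'user', content: prompt }],
    reasoning_effort: options.reasoningEffort ?? 'medium',
    verbosity: options.verbosity ?? 'medium',
  };
}

// Quick, terse answer for a simple task.
const quick = buildGpt5Request('Summarize this changelog', {
  reasoningEffort: 'minimal',
  verbosity: 'low',
});

// Deep reasoning on the flagship model; verbosity stays at its default.
const deep = buildGpt5Request('Prove that the loop invariant holds', {
  model: 'gpt-5',
  reasoningEffort: 'high',
});

console.log(quick, deep);
```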

### GPT-4o/GPT-4 Turbo Parameters

**temperature**: Controls randomness (0-2)
- 0: Deterministic, focused
- 1: Balanced creativity (default)
- 2: Maximum creativity

**top_p**: Nucleus sampling (0-1)
- Lower values: More focused
- Higher values: More diverse

**logprobs**: Get token probabilities
- Useful for debugging and analysis

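
Because this guide lists `temperature`, `top_p`, and `logprobs` as unsupported on the GPT-5 series, code that switches between model families may want to drop them before sending a request. The sketch below is one way to do that under this guide's assumptions; `SamplingParams`, `isGpt5Series`, and `samplingParamsFor` are hypothetical names, and the range checks simply mirror the 0-2 and 0-1 bounds stated above. The result can be spread into a request body next to `model` and `messages`.

```typescript
// Sampling parameters described in this guide.
interface SamplingParams {
  temperature?: number; // 0-2
  top_p?: number; // 0-1
  logprobs?: boolean;
}

// Assumption from this guide: every gpt-5* model rejects sampling parameters.
function isGpt5Series(model: string): boolean {
  return model === 'gpt-5' || model.startsWith('gpt-5-');
}

// Return only the parameters the target model accepts, validating ranges.
function samplingParamsFor(model: string, params: SamplingParams): SamplingParams {
  if (isGpt5Series(model)) {
    return {}; // drop everything for the GPT-5 series
  }
  const { temperature, top_p } = params;
  if (temperature !== undefined && (temperature < 0 || temperature > 2)) {
    throw new RangeError('temperature must be between 0 and 2');
  }
  if (top_p !== undefined && (top_p < 0 || top_p > 1)) {
    throw new RangeError('top_p must be between 0 and 1');
  }
  return { ...params };
}

const requested: SamplingParams = { temperature: 0.2, top_p: 0.9 };

const forGpt4o = samplingParamsFor('gpt-4o', requested);
const forGpt5Mini = samplingParamsFor('gpt-5-mini', requested);

console.log(forGpt4o, forGpt5Mini);
```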

---

## Common Patterns

### Pattern 1: Automatic Model Selection

```typescript
function selectModel(taskComplexity: 'simple' | 'medium' | 'complex'): string {
  switch (taskComplexity) {
    case 'simple':
      return 'gpt-5-nano';
    case 'medium':
      return 'gpt-5-mini';
    case 'complex':
      return 'gpt-5';
  }
}
```

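### Pattern 2: Fallback Chain

```typescript
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Placeholder quality check (an assumption); replace with your own validation.
function isGoodEnough(result: OpenAI.Chat.Completions.ChatCompletion): boolean {
  return (result.choices[0]?.message.content ?? '').trim().length > 0;
}

async function completionWithFallback(prompt: string) {
  // Ordered from cheapest to most capable
  const models = ['gpt-5-nano', 'gpt-5-mini', 'gpt-5'];

  for (const model of models) {
    try {
      const result = await openai.chat.completions.create({
        model,
        messages: [{ role: 'user', content: prompt }],
      });

      // Validate quality; fall through to the next model if it falls short
      if (isGoodEnough(result)) {
        return result;
      }
    } catch (error) {
      // Log the failure instead of silently swallowing it, then try the next model
      console.warn(`Model ${model} failed:`, error);
    }
  }

  throw new Error('All models failed or returned low-quality output');
}
```
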
### Pattern 3: Vision + Text Hybrid

```typescript
import OpenAI from 'openai';

// Top-level await below requires an ES module context.
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const imageUrl = 'https://example.com/image.jpg'; // placeholder; use your own image URL

// Use gpt-4o for image analysis
const imageAnalysis = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Describe this image' },
        { type: 'image_url', image_url: { url: imageUrl } },
      ],
    },
  ],
});

// The message content can be null, so fall back to an empty string
const description = imageAnalysis.choices[0].message.content ?? '';

// Use gpt-5 for reasoning based on analysis
const reasoning = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: `Image analysis: ${description}` },
    { role: 'user', content: 'What does this imply about...' },
  ],
});
```

---

## Official Documentation

- **GPT-5 Guide**: https://platform.openai.com/docs/guides/latest-model
- **Model Pricing**: https://openai.com/pricing
- **Model Comparison**: https://platform.openai.com/docs/models

---

**Summary**: Choose the model that fits your needs: use the GPT-5 series for reasoning and GPT-4o for vision, and control costs by selecting the smallest model that meets your requirements.