Initial commit

Zhongwei Li
2025-11-30 08:25:12 +08:00
commit 7a35a34caa
30 changed files with 8396 additions and 0 deletions

references/audio-guide.md Normal file

@@ -0,0 +1,205 @@
# Audio Guide (Whisper & TTS)
**Last Updated**: 2025-10-25
Complete guide to OpenAI's Audio API for transcription and text-to-speech.
---
## Whisper Transcription
### Supported Formats
- flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm
### Best Practices
**Audio Quality**:
- Use clear audio with minimal background noise
- 16 kHz or higher sample rate recommended
- Mono or stereo both supported
**File Size**:
- Max file size: 25 MB
- For larger files: split into chunks or compress (see the ffmpeg sketch below)
**Languages**:
- Whisper automatically detects language
- Supports 50+ languages
- Best results with English, Spanish, French, German, Chinese
**Limitations**:
- May struggle with heavy accents
- Background noise reduces accuracy
- Very quiet audio may fail
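For files over the 25 MB cap, one workable approach is to segment the audio with ffmpeg and transcribe each piece. A minimal sketch, assuming ffmpeg is installed and a local `./audio.mp3` (both illustrative):
```typescript
import { execSync } from 'child_process';
import fs from 'fs';
import OpenAI from 'openai';

const openai = new OpenAI();

// Split into ~10-minute segments: chunk-000.mp3, chunk-001.mp3, ...
execSync('ffmpeg -i ./audio.mp3 -f segment -segment_time 600 -c copy chunk-%03d.mp3');

let transcript = '';
for (const file of fs.readdirSync('.').filter(f => f.startsWith('chunk-')).sort()) {
  const result = await openai.audio.transcriptions.create({
    file: fs.createReadStream(file),
    model: 'whisper-1',
  });
  transcript += result.text + '\n';
}
fs.writeFileSync('./transcript.txt', transcript);
```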
---
## Text-to-Speech (TTS)
### Model Selection
| Model | Quality | Latency | Features | Best For |
|-------|---------|---------|----------|----------|
| tts-1 | Standard | Lowest | Basic TTS | Real-time streaming |
| tts-1-hd | High | Medium | Better fidelity | Offline audio, podcasts |
| gpt-4o-mini-tts | Best | Medium | Voice instructions, streaming | Maximum control |
### Voice Selection Guide
| Voice | Character | Best For |
|-------|-----------|----------|
| alloy | Neutral, balanced | General use, professional |
| ash | Clear, professional | Business, presentations |
| ballad | Warm, storytelling | Narration, audiobooks |
| coral | Soft, friendly | Customer service, greetings |
| echo | Calm, measured | Meditation, calm content |
| fable | Expressive, narrative | Stories, entertainment |
| onyx | Deep, authoritative | News, serious content |
| nova | Bright, energetic | Marketing, enthusiastic content |
| sage | Wise, thoughtful | Educational, informative |
| shimmer | Gentle, soothing | Relaxation, sleep content |
| verse | Poetic, rhythmic | Poetry, artistic content |
### Voice Instructions (gpt-4o-mini-tts only)
```typescript
// Professional tone
{
model: 'gpt-4o-mini-tts',
voice: 'ash',
input: 'Welcome to our service',
instructions: 'Speak in a calm, professional, and friendly tone suitable for customer service.',
}
// Energetic marketing
{
model: 'gpt-4o-mini-tts',
voice: 'nova',
input: 'Don\'t miss this sale!',
instructions: 'Use an enthusiastic, energetic tone perfect for marketing and advertisements.',
}
// Meditation guidance
{
model: 'gpt-4o-mini-tts',
voice: 'shimmer',
input: 'Take a deep breath',
instructions: 'Adopt a calm, soothing voice suitable for meditation and relaxation guidance.',
}
```
### Speed Control
```typescript
// Slow (0.5x)
{ speed: 0.5 } // Good for: Learning, accessibility
// Normal (1.0x)
{ speed: 1.0 } // Default
// Fast (1.5x)
{ speed: 1.5 } // Good for: Previews, time-saving
// Very fast (2.0x)
{ speed: 2.0 } // Good for: Quick previews only
```
Range: 0.25 to 4.0
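A complete request using `speed` (assumes the `openai` and `fs` handles from the other examples); values outside 0.25-4.0 are rejected by the API:
```typescript
const audio = await openai.audio.speech.create({
  model: 'tts-1',
  voice: 'alloy',
  input: 'This sentence plays back at one and a half times normal speed.',
  speed: 1.5,
});
fs.writeFileSync('./fast.mp3', Buffer.from(await audio.arrayBuffer()));
```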
### Audio Format Selection
| Format | Compression | Quality | Best For |
|--------|-------------|---------|----------|
| mp3 | Lossy | Good | Maximum compatibility |
| opus | Lossy | Excellent | Web streaming, low bandwidth |
| aac | Lossy | Good | iOS, Apple devices |
| flac | Lossless | Best | Archiving, editing |
| wav | Uncompressed | Best | Editing, processing |
| pcm | Raw | Best | Low-level processing |
---
## Common Patterns
### 1. Transcribe Interview
```typescript
const transcription = await openai.audio.transcriptions.create({
file: fs.createReadStream('./interview.mp3'),
model: 'whisper-1',
});
// Save transcript
fs.writeFileSync('./interview.txt', transcription.text);
```
### 2. Generate Podcast Narration
```typescript
const script = "Welcome to today's podcast...";
const audio = await openai.audio.speech.create({
model: 'tts-1-hd',
voice: 'fable',
input: script,
response_format: 'mp3',
});
const buffer = Buffer.from(await audio.arrayBuffer());
fs.writeFileSync('./podcast.mp3', buffer);
```
### 3. Multi-Voice Conversation
```typescript
// Speaker 1
const speaker1 = await openai.audio.speech.create({
model: 'tts-1',
voice: 'onyx',
input: 'Hello, how are you?',
});
// Speaker 2
const speaker2 = await openai.audio.speech.create({
model: 'tts-1',
voice: 'nova',
input: 'I\'m doing great, thanks!',
});
// Combine audio files (requires audio processing library)
```
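As a rough sketch of that last step, same-format MP3 clips can be joined by naive byte concatenation, which most players tolerate; for gapless or mixed-format output, use ffmpeg or a proper audio library:
```typescript
const part1 = Buffer.from(await speaker1.arrayBuffer());
const part2 = Buffer.from(await speaker2.arrayBuffer());
// Naive concatenation: adequate for simple playback, not for precise editing
fs.writeFileSync('./conversation.mp3', Buffer.concat([part1, part2]));
```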
---
## Cost Optimization
1. **Use tts-1 for real-time** (cheaper, faster)
2. **Use tts-1-hd for final production** (better quality)
3. **Cache generated audio** (reuse output for repeated inputs instead of regenerating)
4. **Choose appropriate format** (opus for web, mp3 for compatibility)
5. **Batch transcriptions** with delays to avoid rate limits
---
## Common Issues
### Transcription Accuracy
- Improve audio quality
- Reduce background noise
- Ensure adequate volume levels
- Use supported audio formats
### TTS Naturalness
- Test different voices
- Use voice instructions (gpt-4o-mini-tts)
- Adjust speed for better pacing
- Add punctuation for natural pauses
### File Size
- Compress audio before transcribing
- Choose lossy formats (mp3, opus) for TTS
- Use appropriate bitrates
---
**See Also**: Official Audio Guide (https://platform.openai.com/docs/guides/speech-to-text)


@@ -0,0 +1,278 @@
# Cost Optimization Guide
**Last Updated**: 2025-10-25
Strategies to minimize OpenAI API costs while maintaining quality.
---
## Model Selection Strategies
### 1. Model Cascading
Start with cheaper models, escalate only when needed:
```typescript
async function smartCompletion(prompt: string) {
// Try gpt-5-nano first
const nanoResult = await openai.chat.completions.create({
model: 'gpt-5-nano',
messages: [{ role: 'user', content: prompt }],
});
// Validate quality
if (isGoodEnough(nanoResult)) {
return nanoResult;
}
// Escalate to gpt-5-mini
const miniResult = await openai.chat.completions.create({
model: 'gpt-5-mini',
messages: [{ role: 'user', content: prompt }],
});
if (isGoodEnough(miniResult)) {
return miniResult;
}
// Final escalation to gpt-5
return await openai.chat.completions.create({
model: 'gpt-5',
messages: [{ role: 'user', content: prompt }],
});
}
```
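`isGoodEnough` is left to the caller. One hedged placeholder is a finish-reason and length check; real gates might use a validator model or task-specific rules:
```typescript
// Placeholder quality gate (illustrative only)
function isGoodEnough(result: any): boolean {
  const choice = result.choices[0];
  return choice.finish_reason === 'stop' && (choice.message.content?.length ?? 0) > 20;
}
```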
### 2. Task-Based Model Selection
| Task | Model | Why |
|------|-------|-----|
| Simple chat | gpt-5-nano | Fast, cheap, sufficient |
| Summarization | gpt-5-mini | Good quality, cost-effective |
| Code generation | gpt-5 | Best reasoning, worth the cost |
| Data extraction | gpt-4o + structured output | Reliable, accurate |
| Vision tasks | gpt-4o | Only model with vision |
---
## Token Optimization
### 1. Limit max_tokens
```typescript
// ❌ No limit: May generate unnecessarily long responses
{
model: 'gpt-5',
messages,
}
// ✅ Set reasonable limit
{
model: 'gpt-5',
messages,
max_tokens: 500, // Prevent runaway generation
}
```
### 2. Trim Conversation History
```typescript
function trimHistory(messages: Message[], maxRecent: number = 10) {
  // Keep the system message plus the most recent non-system turns
  const system = messages.find(m => m.role === 'system');
  const recent = messages.filter(m => m.role !== 'system').slice(-maxRecent);
  return [system, ...recent].filter(Boolean);
}
```
### 3. Use Shorter Prompts
```typescript
// ❌ Verbose
"Please analyze the following text and provide a detailed summary of the main points, including any key takeaways and important details..."
// ✅ Concise
"Summarize key points:"
```
---
## Caching Strategies
### 1. Cache Embeddings
```typescript
const embeddingCache = new Map<string, number[]>();
async function getCachedEmbedding(text: string) {
if (embeddingCache.has(text)) {
return embeddingCache.get(text)!;
}
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: text,
});
const embedding = response.data[0].embedding;
embeddingCache.set(text, embedding);
return embedding;
}
```
### 2. Cache Common Completions
```typescript
const completionCache = new Map<string, string>();
const model = 'gpt-5-mini';
async function getCachedCompletion(prompt: string) {
  const cacheKey = `${model}:${prompt}`;
  if (completionCache.has(cacheKey)) {
    return completionCache.get(cacheKey)!;
  }
  const result = await openai.chat.completions.create({
    model,
    messages: [{ role: 'user', content: prompt }],
  });
const content = result.choices[0].message.content;
completionCache.set(cacheKey, content!);
return content;
}
```
---
## Batch Processing
### 1. Use Embeddings Batch API
```typescript
// ❌ Individual requests (expensive)
for (const doc of documents) {
await openai.embeddings.create({
model: 'text-embedding-3-small',
input: doc,
});
}
// ✅ Batch request (cheaper)
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: documents, // Array of up to 2048 documents
});
```
### 2. Group Similar Requests
```typescript
// Process non-urgent requests in batches during off-peak hours
const batchQueue: string[] = [];
function queueForBatch(prompt: string) {
batchQueue.push(prompt);
if (batchQueue.length >= 10) {
processBatch();
}
}
async function processBatch() {
  // Send the queued prompts in parallel (still one API request per prompt)
const results = await Promise.all(
batchQueue.map(prompt =>
openai.chat.completions.create({
model: 'gpt-5-nano',
messages: [{ role: 'user', content: prompt }],
})
)
);
batchQueue.length = 0;
return results;
}
```
---
## Feature-Specific Optimization
### Embeddings
1. **Use custom dimensions**: 256 instead of 1536 = 6x storage reduction
2. **Use text-embedding-3-small**: Cheaper than large, good for most use cases
3. **Batch requests**: Up to 2048 documents per request
### Images
1. **Use standard quality**: Unless HD is critical
2. **Use smaller sizes**: Generate 1024x1024 instead of 1792x1024 when possible
3. **Use natural style**: Cheaper than vivid
### Audio
1. **Use tts-1 for real-time**: Cheaper than tts-1-hd
2. **Use opus format**: Smaller files, good quality
3. **Cache generated audio**: Reuse output for repeated inputs instead of regenerating
---
## Monitoring and Alerts
```typescript
interface CostTracker {
totalTokens: number;
totalCost: number;
requestCount: number;
}
const tracker: CostTracker = {
totalTokens: 0,
totalCost: 0,
requestCount: 0,
};
async function trackCosts(fn: () => Promise<any>) {
const result = await fn();
if (result.usage) {
tracker.totalTokens += result.usage.total_tokens;
tracker.requestCount++;
// Estimate cost (adjust rates based on actual pricing)
const cost = estimateCost(result.model, result.usage.total_tokens);
tracker.totalCost += cost;
// Alert if threshold exceeded
if (tracker.totalCost > 100) {
console.warn('Cost threshold exceeded!', tracker);
}
}
return result;
}
```
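`estimateCost` is not shown above; a sketch with placeholder per-million-token rates (substitute current prices from https://openai.com/pricing):
```typescript
// Hypothetical flat rates per 1M tokens; real pricing splits input/output tokens
const RATES_PER_1M: Record<string, number> = {
  'gpt-5': 10.0,
  'gpt-5-mini': 2.0,
  'gpt-5-nano': 0.4,
};

function estimateCost(model: string, totalTokens: number): number {
  const rate = RATES_PER_1M[model] ?? 10.0; // default to the most expensive tier
  return (totalTokens / 1_000_000) * rate;
}
```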
---
## Cost Reduction Checklist
- [ ] Use cheapest model that meets requirements
- [ ] Set max_tokens limits
- [ ] Trim conversation history
- [ ] Cache embeddings and common queries
- [ ] Batch requests when possible
- [ ] Use custom embedding dimensions (256-512)
- [ ] Monitor token usage
- [ ] Implement rate limiting
- [ ] Use structured outputs to avoid retries
- [ ] Compress prompts (remove unnecessary words)
---
**Estimated Savings**: Following these practices can reduce costs by 40-70% while maintaining quality.


@@ -0,0 +1,187 @@
# Embeddings Guide
**Last Updated**: 2025-10-25
Complete guide to OpenAI's Embeddings API for semantic search, RAG, and clustering.
---
## Model Comparison
| Model | Default Dimensions | Custom Dimensions | Best For |
|-------|-------------------|-------------------|----------|
| text-embedding-3-large | 3072 | 256-3072 | Highest quality semantic search |
| text-embedding-3-small | 1536 | 256-1536 | Most applications, cost-effective |
| text-embedding-ada-002 | 1536 | Fixed | Legacy (use v3 models) |
---
## Dimension Selection
### Full Dimensions
- **text-embedding-3-small**: 1536 (default)
- **text-embedding-3-large**: 3072 (default)
- Use for maximum accuracy
### Reduced Dimensions
- **256 dims**: 6-12x storage reduction vs. the defaults, minimal quality loss
- **512 dims**: 3-6x storage reduction, good quality
- Use for cost/storage optimization
```typescript
// Full dimensions (1536)
const full = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'Sample text',
});
// Reduced dimensions (256)
const reduced = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'Sample text',
dimensions: 256,
});
```
---
## RAG (Retrieval-Augmented Generation) Pattern
### 1. Build Knowledge Base
```typescript
const documents = [
'TypeScript is a superset of JavaScript',
'Python is a high-level programming language',
'React is a JavaScript library for UIs',
];
const embeddings = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: documents,
});
const knowledgeBase = documents.map((text, i) => ({
text,
embedding: embeddings.data[i].embedding,
}));
```
### 2. Query with Similarity Search
```typescript
// Embed user query
const queryEmbedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'What is TypeScript?',
});
// Find similar documents
const similarities = knowledgeBase.map(doc => ({
text: doc.text,
similarity: cosineSimilarity(queryEmbedding.data[0].embedding, doc.embedding),
}));
similarities.sort((a, b) => b.similarity - a.similarity);
const topResults = similarities.slice(0, 3);
```
### 3. Generate Answer with Context
```typescript
const context = topResults.map(r => r.text).join('\n\n');
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'system', content: `Answer using this context:\n\n${context}` },
{ role: 'user', content: 'What is TypeScript?' },
],
});
```
---
## Similarity Metrics
### Cosine Similarity (Recommended)
```typescript
function cosineSimilarity(a: number[], b: number[]): number {
const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
return dotProduct / (magnitudeA * magnitudeB);
}
```
### Euclidean Distance
```typescript
function euclideanDistance(a: number[], b: number[]): number {
return Math.sqrt(
a.reduce((sum, val, i) => sum + Math.pow(val - b[i], 2), 0)
);
}
```
---
## Batch Processing
```typescript
// Process up to 2048 documents
const embeddings = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: documents, // Array of strings
});
embeddings.data.forEach((item, index) => {
console.log(`Doc ${index}: ${item.embedding.length} dimensions`);
});
```
**Limits**:
- Max tokens per input: 8192
- Max total tokens summed across all inputs: 300,000
- Max inputs per request (array length): 2048
---
## Best Practices
**Pre-processing**:
- Normalize text (lowercase, remove special chars)
- Be consistent across queries and documents
- Chunk long documents (max 8192 tokens)
**Storage**:
- Use custom dimensions (256-512) for storage optimization
- Store embeddings in vector databases (Pinecone, Weaviate, Qdrant)
- Cache embeddings (deterministic for same input)
**Search**:
- Use cosine similarity for comparison
- Normalize embeddings before storing (L2 normalization; see the sketch after this list)
- Pre-filter with metadata before similarity search
**Don't**:
- Mix models (incompatible dimensions)
- Exceed token limits (8192 per input)
- Skip normalization
- Use raw embeddings without similarity metric
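A minimal sketch of the L2 normalization step mentioned above; once vectors are unit length, a plain dot product ranks results identically to cosine similarity:
```typescript
// Scale a vector to unit length (L2 norm = 1)
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v : v.map(x => x / norm);
}
```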
---
## Use Cases
1. **Semantic Search**: Find similar documents
2. **RAG**: Retrieve context for generation
3. **Clustering**: Group similar content
4. **Recommendations**: Content-based recommendations
5. **Anomaly Detection**: Detect outliers
6. **Duplicate Detection**: Find similar/duplicate content
---
**See Also**: Official Embeddings Guide (https://platform.openai.com/docs/guides/embeddings)


@@ -0,0 +1,189 @@
# Function Calling Patterns
**Last Updated**: 2025-10-25
Advanced patterns for implementing function calling (tool calling) with OpenAI's Chat Completions API.
---
## Basic Pattern
```typescript
const tools = [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' },
unit: { type: 'string', enum: ['celsius', 'fahrenheit'] },
},
required: ['location'],
},
},
},
];
```
---
## Advanced Patterns
### 1. Parallel Tool Calls
The model can call multiple tools simultaneously:
```typescript
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'user', content: 'What is the weather in SF and NYC?' }
],
tools: tools,
});
// Model may return multiple tool_calls (undefined when it answers directly)
const toolCalls = completion.choices[0].message.tool_calls ?? [];
// Execute all in parallel
const results = await Promise.all(
toolCalls.map(call => executeFunction(call.function.name, call.function.arguments))
);
```
### 2. Dynamic Tool Generation
Generate tools based on runtime context:
```typescript
function generateTools(database: Database) {
const tables = database.getTables();
return tables.map(table => ({
type: 'function',
function: {
name: `query_${table.name}`,
description: `Query the ${table.name} table`,
parameters: {
type: 'object',
properties: table.columns.reduce((acc, col) => ({
...acc,
[col.name]: { type: col.type, description: col.description },
}), {}),
},
},
}));
}
```
### 3. Tool Chaining
Chain tool results:
```typescript
async function chatWithToolChaining(userMessage: string, maxTurns = 10) {
  const messages: any[] = [{ role: 'user', content: userMessage }];
  // Bounded loop instead of while(true); see "limit recursion depth" under Best Practices
  for (let turn = 0; turn < maxTurns; turn++) {
    const completion = await openai.chat.completions.create({
      model: 'gpt-5',
      messages,
      tools,
    });
    const message = completion.choices[0].message;
    messages.push(message);
    if (!message.tool_calls) {
      return message.content; // Final answer
    }
    // Execute tool calls and add results
    for (const toolCall of message.tool_calls) {
      const result = await executeFunction(
        toolCall.function.name,
        toolCall.function.arguments
      );
      messages.push({
        role: 'tool',
        tool_call_id: toolCall.id,
        content: JSON.stringify(result),
      });
    }
  }
  throw new Error(`No final answer after ${maxTurns} tool-calling turns`);
}
```
### 4. Error Handling in Tools
```typescript
async function executeFunction(name: string, argsString: string) {
try {
const args = JSON.parse(argsString);
switch (name) {
case 'get_weather':
return await getWeather(args.location, args.unit);
default:
return { error: `Unknown function: ${name}` };
}
} catch (error: any) {
return { error: error.message };
}
}
```
### 5. Streaming with Tools
```typescript
const stream = await openai.chat.completions.create({
model: 'gpt-5',
messages,
tools,
stream: true,
});
for await (const chunk of stream) {
const delta = chunk.choices[0]?.delta;
// Check for tool calls in streaming
if (delta?.tool_calls) {
// Accumulate tool call data
console.log('Tool call chunk:', delta.tool_calls);
}
}
```
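Tool-call fragments arrive keyed by `index` and must be stitched together before their JSON arguments can be parsed. A sketch of an accumulator, replacing the loop above (a stream can only be iterated once):
```typescript
// Accumulate streamed tool-call fragments into complete calls
const calls: { id?: string; name?: string; args: string }[] = [];

for await (const chunk of stream) {
  for (const tc of chunk.choices[0]?.delta?.tool_calls ?? []) {
    const slot = (calls[tc.index] ??= { args: '' });
    if (tc.id) slot.id = tc.id;
    if (tc.function?.name) slot.name = tc.function.name;
    if (tc.function?.arguments) slot.args += tc.function.arguments;
  }
}
// Only after the stream ends is each slot.args a complete JSON string
const parsed = calls.map(c => ({ name: c.name, args: JSON.parse(c.args) }));
```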
---
## Best Practices
**Schema Design**:
- Provide clear descriptions for each parameter
- Use enum when options are limited
- Mark required vs optional parameters
**Error Handling**:
- Return structured error objects
- Don't throw exceptions from tool functions
- Let the model handle error recovery
**Performance**:
- Execute independent tool calls in parallel
- Cache tool results when appropriate
- Limit recursion depth to avoid infinite loops
**Don't**:
- Expose sensitive internal functions
- Allow unlimited recursion
- Skip parameter validation
- Return unstructured error messages
---
**See Also**: Official Function Calling Guide (https://platform.openai.com/docs/guides/function-calling)

references/images-guide.md Normal file

@@ -0,0 +1,153 @@
# Images Guide (DALL-E 3 & GPT-Image-1)
**Last Updated**: 2025-10-25
Best practices for image generation and editing with OpenAI's Images API.
---
## DALL-E 3 Generation
### Size Selection
| Size | Supported By | Use Case |
|------|--------------|----------|
| 1024x1024 | DALL-E 3, GPT-Image-1 | Profile pictures, icons, square posts |
| 1024x1536 | GPT-Image-1 | Portrait photos, vertical ads |
| 1536x1024 | GPT-Image-1 | Landscape photos, banners |
| 1024x1792 | DALL-E 3 | Tall portraits, mobile wallpapers |
| 1792x1024 | DALL-E 3 | Wide banners, desktop wallpapers |
### Quality Settings
**standard**: Normal quality, faster, cheaper
- Use for: Prototyping, high-volume generation, quick iterations
**hd**: High definition, finer details, more expensive
- Use for: Final production images, marketing materials, print
### Style Options
**vivid**: Hyper-real, dramatic, high-contrast
- Use for: Marketing, advertising, eye-catching visuals
**natural**: More realistic, less dramatic
- Use for: Product photos, realistic scenes, professional content
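A minimal generation call tying these options together; the prompt is illustrative, and DALL-E 3 accepts only `n: 1` per request:
```typescript
const result = await openai.images.generate({
  model: 'dall-e-3',
  prompt: 'Professional product photo of a ceramic mug, studio lighting',
  size: '1024x1024',
  quality: 'hd',
  style: 'natural',
  n: 1,
});

console.log(result.data[0].url);            // URL expires after ~1 hour
console.log(result.data[0].revised_prompt); // DALL-E 3 may rewrite your prompt
```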
---
## Prompting Best Practices
### Be Specific
```
❌ "A cat"
✅ "A white siamese cat with striking blue eyes, sitting on a wooden table, golden hour lighting, professional photography"
```
### Include Art Style
```
✅ "Oil painting of a sunset in the style of Claude Monet"
✅ "3D render of a futuristic city, Pixar animation style"
✅ "Professional product photo with studio lighting"
```
### Specify Lighting
```
- "Golden hour lighting"
- "Soft studio lighting from the left"
- "Dramatic shadows"
- "Bright natural daylight"
```
### Composition Details
```
- "Shallow depth of field"
- "Wide angle lens"
- "Centered composition"
- "Rule of thirds"
```
---
## GPT-Image-1 Editing
### Input Fidelity
**low**: More creative freedom
- Use for: Major transformations, style changes
**medium**: Balance (default)
- Use for: Most editing tasks
**high**: Stay close to original
- Use for: Subtle edits, preserving details
### Common Editing Tasks
1. **Background Removal**
```typescript
formData.append('prompt', 'Remove the background, keep only the product');
formData.append('format', 'png');
formData.append('background', 'transparent');
```
2. **Color Correction**
```typescript
formData.append('prompt', 'Increase brightness and saturation, make colors more vibrant');
```
3. **Object Removal**
```typescript
formData.append('prompt', 'Remove the person from the background');
```
4. **Compositing**
```typescript
formData.append('image', mainImage);
formData.append('image_2', logoImage);
formData.append('prompt', 'Add the logo to the product, as if stamped on the surface');
```
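The snippets above assume a raw multipart request; with the Node SDK, `images.edit` handles the form encoding. A sketch, assuming a local `./product.png`:
```typescript
import fs from 'fs';
import OpenAI from 'openai';

const openai = new OpenAI();

const edited = await openai.images.edit({
  model: 'gpt-image-1',
  image: fs.createReadStream('./product.png'),
  prompt: 'Remove the background, keep only the product',
});

// gpt-image-1 returns base64 image data rather than a URL
fs.writeFileSync('./edited.png', Buffer.from(edited.data[0].b64_json!, 'base64'));
```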
---
## Format Selection
| Format | Transparency | Compression | Best For |
|--------|--------------|-------------|----------|
| PNG | Yes | Lossless | Logos, transparency needed |
| JPEG | No | Lossy | Photos, smaller file size |
| WebP | Yes | Lossy | Web, best compression |
---
## Cost Optimization
1. Use standard quality unless HD is critical
2. Generate smaller sizes when possible
3. Cache generated images
4. Use natural style for most cases (vivid costs more)
5. Batch requests with delays to avoid rate limits
---
## Common Issues
### Prompt Revision
DALL-E 3 may revise prompts for safety/quality. Check `revised_prompt` in response.
### URL Expiration
Image URLs expire in 1 hour. Download and save if needed long-term.
### Non-Deterministic
Same prompt = different images. Cache results if consistency needed.
### Rate Limits
DALL-E has separate IPM (Images Per Minute) limits. Monitor and implement delays.
---
**See Also**: Official Images Guide (https://platform.openai.com/docs/guides/images)

references/models-guide.md Normal file

@@ -0,0 +1,311 @@
# OpenAI Models Guide
**Last Updated**: 2025-10-25
This guide provides a comprehensive comparison of OpenAI's language models to help you choose the right model for your use case.
---
## GPT-5 Series (Released August 2025)
### gpt-5
**Status**: Latest flagship model
**Best for**: Complex reasoning, advanced problem-solving, code generation
**Key Features**:
- Advanced reasoning capabilities
- Unique parameters: `reasoning_effort`, `verbosity`
- Best-in-class performance on complex tasks
**Limitations**:
- ❌ No `temperature` support
- ❌ No `top_p` support
- ❌ No `logprobs` support
- ❌ CoT (Chain of Thought) does NOT persist between turns
**When to use**:
- Complex mathematical problems
- Advanced code generation
- Logic puzzles and reasoning tasks
- Multi-step problem solving
**Cost**: Highest pricing tier
---
### gpt-5-mini
**Status**: Cost-effective GPT-5 variant
**Best for**: Balanced performance and cost
**Key Features**:
- Same parameter support as gpt-5 (`reasoning_effort`, `verbosity`)
- Better than GPT-4 Turbo performance
- Significantly cheaper than gpt-5
**When to use**:
- Most production applications
- When you need GPT-5 features but not maximum performance
- High-volume use cases where cost matters
**Cost**: Mid-tier pricing
---
### gpt-5-nano
**Status**: Smallest GPT-5 variant
**Best for**: Simple tasks, high-volume processing
**Key Features**:
- Fastest response times
- Lowest cost in GPT-5 series
- Still supports GPT-5 unique parameters
**When to use**:
- Simple text generation
- High-volume batch processing
- Real-time streaming applications
- Cost-sensitive deployments
**Cost**: Low-tier pricing
---
## GPT-4o Series
### gpt-4o
**Status**: Multimodal flagship (pre-GPT-5)
**Best for**: Vision tasks, multimodal applications
**Key Features**:
- ✅ Vision support (image understanding)
- ✅ Temperature control
- ✅ Top-p sampling
- ✅ Function calling
- ✅ Structured outputs
**Limitations**:
- ❌ No `reasoning_effort` parameter
- ❌ No `verbosity` parameter
**When to use**:
- Image understanding and analysis
- OCR / text extraction from images
- Visual question answering
- When you need temperature/top_p control
- Multimodal applications
**Cost**: High-tier pricing (cheaper than gpt-5)
---
### gpt-4-turbo
**Status**: Fast GPT-4 variant
**Best for**: When you need GPT-4 speed
**Key Features**:
- Faster than base GPT-4
- Full parameter support (temperature, top_p, logprobs)
- Good balance of quality and speed
**When to use**:
- When GPT-4 quality is needed with faster responses
- Legacy applications requiring specific parameters
- When vision is not required
**Cost**: Mid-tier pricing
---
## Comparison Table
| Feature | GPT-5 | GPT-5-mini | GPT-5-nano | GPT-4o | GPT-4 Turbo |
|---------|-------|------------|------------|--------|-------------|
| **Reasoning** | Best | Excellent | Good | Excellent | Excellent |
| **Speed** | Medium | Medium | Fastest | Medium | Fast |
| **Cost** | Highest | Mid | Lowest | High | Mid |
| **reasoning_effort** | ✅ | ✅ | ✅ | ❌ | ❌ |
| **verbosity** | ✅ | ✅ | ✅ | ❌ | ❌ |
| **temperature** | ❌ | ❌ | ❌ | ✅ | ✅ |
| **top_p** | ❌ | ❌ | ❌ | ✅ | ✅ |
| **Vision** | ❌ | ❌ | ❌ | ✅ | ❌ |
| **Function calling** | ✅ | ✅ | ✅ | ✅ | ✅ |
| **Structured outputs** | ✅ | ✅ | ✅ | ✅ | ✅ |
| **Max output tokens** | 16,384 | 16,384 | 16,384 | 16,384 | 16,384 |
---
## Selection Guide
### Use GPT-5 when:
- ✅ You need the best reasoning performance
- ✅ Complex mathematical or logical problems
- ✅ Advanced code generation
- ✅ Multi-step problem solving
- ❌ Cost is not the primary concern
### Use GPT-5-mini when:
- ✅ You want GPT-5 features at lower cost
- ✅ Production applications with high volume
- ✅ Good reasoning performance is needed
- ✅ Balance of quality and cost matters
### Use GPT-5-nano when:
- ✅ Simple text generation tasks
- ✅ High-volume batch processing
- ✅ Real-time streaming applications
- ✅ Cost optimization is critical
- ❌ Complex reasoning is not required
### Use GPT-4o when:
- ✅ Vision / image understanding is required
- ✅ You need temperature/top_p control
- ✅ Multimodal applications
- ✅ OCR and visual analysis
- ❌ Pure text tasks (use GPT-5 series)
### Use GPT-4 Turbo when:
- ✅ Legacy application compatibility
- ✅ You need specific parameters not in GPT-5
- ✅ Fast responses without vision
- ❌ Not recommended for new applications (use GPT-5 or GPT-4o)
---
## Cost Optimization Strategies
### 1. Model Cascading
Start with cheaper models and escalate only when needed:
```
gpt-5-nano (try first) → gpt-5-mini → gpt-5 (if needed)
```
### 2. Task-Specific Model Selection
- **Simple**: Use gpt-5-nano
- **Medium complexity**: Use gpt-5-mini
- **Complex reasoning**: Use gpt-5
- **Vision tasks**: Use gpt-4o
### 3. Hybrid Approach
- Use embeddings (cheap) for retrieval
- Use gpt-5-mini for generation
- Use gpt-5 only for critical decisions
### 4. Batch Processing
- Use cheaper models for bulk operations
- Reserve expensive models for user-facing requests
---
## Parameter Guide
### GPT-5 Unique Parameters
**reasoning_effort**: Controls reasoning depth
- "minimal": Quick responses
- "low": Basic reasoning
- "medium": Balanced (default)
- "high": Deep reasoning for complex problems
**verbosity**: Controls output length
- "low": Concise responses
- "medium": Balanced detail (default)
- "high": Verbose, detailed responses
### GPT-4o/GPT-4 Turbo Parameters
**temperature**: Controls randomness (0-2)
- 0: Deterministic, focused
- 1: Balanced creativity (default)
- 2: Maximum creativity
**top_p**: Nucleus sampling (0-1)
- Lower values: More focused
- Higher values: More diverse
**logprobs**: Get token probabilities
- Useful for debugging and analysis
---
## Common Patterns
### Pattern 1: Automatic Model Selection
```typescript
function selectModel(taskComplexity: 'simple' | 'medium' | 'complex') {
switch (taskComplexity) {
case 'simple':
return 'gpt-5-nano';
case 'medium':
return 'gpt-5-mini';
case 'complex':
return 'gpt-5';
}
}
```
### Pattern 2: Fallback Chain
```typescript
async function completionWithFallback(prompt: string) {
const models = ['gpt-5-nano', 'gpt-5-mini', 'gpt-5'];
for (const model of models) {
try {
const result = await openai.chat.completions.create({
model,
messages: [{ role: 'user', content: prompt }],
});
// Validate quality
if (isGoodEnough(result)) {
return result;
}
} catch (error) {
continue;
}
}
throw new Error('All models failed');
}
```
### Pattern 3: Vision + Text Hybrid
```typescript
// Use gpt-4o for image analysis
const imageAnalysis = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'Describe this image' },
{ type: 'image_url', image_url: { url: imageUrl } },
],
},
],
});
// Use gpt-5 for reasoning based on analysis
const reasoning = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'system', content: `Image analysis: ${imageAnalysis.choices[0].message.content}` },
{ role: 'user', content: 'What does this imply about...' },
],
});
```
---
## Official Documentation
- **GPT-5 Guide**: https://platform.openai.com/docs/guides/latest-model
- **Model Pricing**: https://openai.com/pricing
- **Model Comparison**: https://platform.openai.com/docs/models
---
**Summary**: Choose the right model based on your specific needs. GPT-5 series for reasoning, GPT-4o for vision, and optimize costs by selecting the smallest model that meets your requirements.


@@ -0,0 +1,220 @@
# Structured Output Guide
**Last Updated**: 2025-10-25
Best practices for using JSON schemas with OpenAI's structured outputs feature.
---
## When to Use Structured Outputs
Use structured outputs when you need:
- **Guaranteed JSON format**: Response will always be valid JSON
- **Schema validation**: Enforce specific structure
- **Type safety**: Parse directly into TypeScript types
- **Data extraction**: Pull specific fields from text
- **Classification**: Map to predefined categories
---
## Schema Best Practices
### 1. Keep Schemas Simple
```typescript
// ✅ Good: Simple, focused schema
{
type: 'object',
properties: {
name: { type: 'string' },
age: { type: 'number' },
},
required: ['name', 'age'],
additionalProperties: false,
}
// ❌ Avoid: Overly complex nested structures
// (they work but are harder to debug)
```
### 2. Use Enums for Fixed Options
```typescript
{
type: 'object',
properties: {
category: {
type: 'string',
enum: ['bug', 'feature', 'question'],
},
priority: {
type: 'string',
enum: ['low', 'medium', 'high', 'critical'],
},
},
  required: ['category', 'priority'],
  additionalProperties: false,
}
```
### 3. Always Use `strict: true`
```typescript
response_format: {
type: 'json_schema',
json_schema: {
name: 'response_schema',
strict: true, // ✅ Enforces exact compliance
schema: { /* ... */ },
},
}
```
### 4. Set `additionalProperties: false`
```typescript
{
type: 'object',
properties: { /* ... */ },
required: [ /* ... */ ],
additionalProperties: false, // ✅ Prevents unexpected fields
}
```
---
## Common Use Cases
### Data Extraction
```typescript
const schema = {
  type: 'object',
  properties: {
    person: { type: 'string' },
    company: { type: ['string', 'null'] },
    email: { type: ['string', 'null'] },
    phone: { type: ['string', 'null'] },
  },
  // strict mode requires every key in `required`; model optional fields as nullable
  required: ['person', 'company', 'email', 'phone'],
  additionalProperties: false,
};
// Extract from unstructured text
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{ role: 'system', content: 'Extract contact information' },
{ role: 'user', content: 'John works at TechCorp, email: john@tech.com' },
],
response_format: { type: 'json_schema', json_schema: { name: 'contact', strict: true, schema } },
});
const contact = JSON.parse(completion.choices[0].message.content);
// { person: "John", company: "TechCorp", email: "john@tech.com", phone: null }
```
### Classification
```typescript
const schema = {
type: 'object',
properties: {
sentiment: { type: 'string', enum: ['positive', 'negative', 'neutral'] },
confidence: { type: 'number' },
topics: { type: 'array', items: { type: 'string' } },
},
required: ['sentiment', 'confidence', 'topics'],
additionalProperties: false,
};
// Classify text
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{ role: 'system', content: 'Classify the text' },
{ role: 'user', content: 'This product is amazing!' },
],
response_format: { type: 'json_schema', json_schema: { name: 'classification', strict: true, schema } },
});
const result = JSON.parse(completion.choices[0].message.content);
// { sentiment: "positive", confidence: 0.95, topics: ["product", "satisfaction"] }
```
---
## TypeScript Integration
### Type-Safe Parsing
```typescript
interface PersonProfile {
name: string;
age: number;
skills: string[];
}
const schema = {
type: 'object',
properties: {
name: { type: 'string' },
age: { type: 'number' },
skills: { type: 'array', items: { type: 'string' } },
},
required: ['name', 'age', 'skills'],
additionalProperties: false,
};
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [{ role: 'user', content: 'Generate a person profile' }],
response_format: { type: 'json_schema', json_schema: { name: 'person', strict: true, schema } },
});
const person: PersonProfile = JSON.parse(completion.choices[0].message.content);
// TypeScript knows the shape!
```
---
## Error Handling
```typescript
try {
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages,
response_format: { type: 'json_schema', json_schema: { name: 'data', strict: true, schema } },
});
const data = JSON.parse(completion.choices[0].message.content);
return data;
} catch (error: any) {
if (error.message.includes('JSON')) {
console.error('Failed to parse JSON (should not happen with strict mode)');
}
throw error;
}
```
---
## Validation
While `strict: true` ensures the response matches the schema, you may want additional validation:
```typescript
import { z } from 'zod';
const zodSchema = z.object({
email: z.string().email(),
age: z.number().min(0).max(120),
});
const data = JSON.parse(completion.choices[0].message.content);
const validated = zodSchema.parse(data); // Throws if invalid
```
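If you are already writing zod schemas, the openai-node SDK (v4.x, an assumption about your SDK version) can derive the JSON schema and parse the response in one step via its zod helper:
```typescript
import OpenAI from 'openai';
import { z } from 'zod';
import { zodResponseFormat } from 'openai/helpers/zod';

const openai = new OpenAI();

const Contact = z.object({
  email: z.string(),
  age: z.number(),
});

const completion = await openai.beta.chat.completions.parse({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Extract: jane@example.com, age 34' }],
  response_format: zodResponseFormat(Contact, 'contact'),
});

const contact = completion.choices[0].message.parsed; // typed from the zod schema, or null on refusal
```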
---
**See Also**: Official Structured Outputs Guide (https://platform.openai.com/docs/guides/structured-outputs)

references/top-errors.md Normal file

@@ -0,0 +1,453 @@
# Top OpenAI API Errors & Solutions
**Last Updated**: 2025-10-25
**Skill**: openai-api
**Status**: Phase 1 Complete
---
## Overview
This document covers the 10 most common errors encountered when using OpenAI APIs, with causes, solutions, and code examples.
---
## 1. Rate Limit Error (429)
### Cause
Too many requests or tokens per minute/day.
### Error Response
```json
{
"error": {
"message": "Rate limit reached",
"type": "rate_limit_error",
"code": "rate_limit_exceeded"
}
}
```
### Solution
Implement exponential backoff:
```typescript
async function completionWithRetry(params: any, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await openai.chat.completions.create(params);
} catch (error: any) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
console.log(`Rate limited. Retrying in ${delay}ms...`);
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
```
---
## 2. Invalid API Key (401)
### Cause
Missing or incorrect `OPENAI_API_KEY`.
### Error Response
```json
{
"error": {
"message": "Incorrect API key provided",
"type": "invalid_request_error",
"code": "invalid_api_key"
}
}
```
### Solution
Verify environment variable:
```bash
# Check if set
echo $OPENAI_API_KEY
# Set in .env
OPENAI_API_KEY=sk-...
```
```typescript
if (!process.env.OPENAI_API_KEY) {
throw new Error('OPENAI_API_KEY environment variable is required');
}
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
```
---
## 3. Function Calling Schema Mismatch
### Cause
Tool definition doesn't match model expectations or arguments are invalid.
### Error Response
```json
{
"error": {
"message": "Invalid schema for function 'get_weather'",
"type": "invalid_request_error"
}
}
```
### Solution
Validate JSON schema:
```typescript
const tools = [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather for a location', // Required
parameters: { // Required
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name' // Add descriptions
}
},
required: ['location'] // Specify required fields
}
}
}
];
```
---
## 4. Streaming Parse Error
### Cause
Incomplete or malformed SSE (Server-Sent Events) chunks.
### Symptom
```
SyntaxError: Unexpected end of JSON input
```
### Solution
Properly handle SSE format:
```typescript
const lines = chunk.split('\n').filter(line => line.trim() !== '');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') {
break;
}
try {
const json = JSON.parse(data);
const content = json.choices[0]?.delta?.content || '';
console.log(content);
} catch (e) {
// Skip invalid JSON - don't crash
console.warn('Skipping invalid JSON chunk');
}
}
}
```
---
## 5. Vision Image Encoding Error
### Cause
Invalid base64 encoding or unsupported image format.
### Error Response
```json
{
"error": {
"message": "Invalid image format",
"type": "invalid_request_error"
}
}
```
### Solution
Ensure proper base64 encoding:
```typescript
import fs from 'fs';
// Read and encode image
const imageBuffer = fs.readFileSync('./image.jpg');
const base64Image = imageBuffer.toString('base64');
// Use with correct MIME type
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'What is in this image?' },
{
type: 'image_url',
image_url: {
url: `data:image/jpeg;base64,${base64Image}` // Include MIME type
}
}
]
}
]
});
```
---
## 6. Token Limit Exceeded
### Cause
Input + output tokens exceed model's context window.
### Error Response
```json
{
"error": {
"message": "This model's maximum context length is 128000 tokens",
"type": "invalid_request_error",
"code": "context_length_exceeded"
}
}
```
### Solution
Truncate input or reduce max_tokens:
```typescript
function truncateMessages(messages, maxTokens = 120000) {
  // Rough estimate: 1 token ≈ 4 characters
  const maxChars = maxTokens * 4;
  let totalChars = 0;
  const truncated = [];
  for (const msg of [...messages].reverse()) { // copy first so the caller's array isn't mutated
const msgChars = msg.content.length;
if (totalChars + msgChars > maxChars) break;
truncated.unshift(msg);
totalChars += msgChars;
}
return truncated;
}
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: truncateMessages(messages),
max_tokens: 8000, // Limit output tokens
});
```
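For exact counts instead of the 4-characters-per-token heuristic, a tokenizer library such as the `tiktoken` npm package (named here as an assumption; any tokenizer matching the model's encoding works) gives precise numbers:
```typescript
import { encoding_for_model } from 'tiktoken';

function countTokens(text: string): number {
  const enc = encoding_for_model('gpt-4o'); // pick the closest supported encoding
  const count = enc.encode(text).length;
  enc.free(); // the WASM-backed encoder must be released explicitly
  return count;
}
```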
---
## 7. GPT-5 Temperature Not Supported
### Cause
Using `temperature` parameter with GPT-5 models.
### Error Response
```json
{
"error": {
"message": "temperature is not supported for gpt-5",
"type": "invalid_request_error"
}
}
```
### Solution
Use `reasoning_effort` instead or switch to GPT-4o:
```typescript
// ❌ Bad - GPT-5 doesn't support temperature
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [...],
temperature: 0.7, // NOT SUPPORTED
});
// ✅ Good - Use reasoning_effort for GPT-5
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [...],
reasoning_effort: 'medium',
});
// ✅ Or use GPT-4o if you need temperature
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [...],
temperature: 0.7,
});
```
---
## 8. Streaming Not Closed Properly
### Cause
Stream not properly terminated, causing resource leaks.
### Symptom
Memory leaks, hanging connections.
### Solution
Always close streams:
```typescript
const stream = await openai.chat.completions.create({
model: 'gpt-5',
messages: [...],
stream: true,
});
try {
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
process.stdout.write(content);
}
} finally {
// Stream is automatically closed when iteration completes
// But handle errors explicitly
}
// For fetch-based streaming:
const reader = response.body?.getReader();
try {
while (true) {
const { done, value } = await reader!.read();
if (done) break;
// Process chunk
}
} finally {
reader!.releaseLock(); // Important!
}
```
---
## 9. API Key Exposure in Client-Side Code
### Cause
Including API key in frontend JavaScript.
### Risk
API key visible to all users, can be stolen and abused.
### Solution
Use server-side proxy:
```typescript
// ❌ Bad - Client-side (NEVER DO THIS)
const apiKey = 'sk-...'; // Exposed to all users!
const response = await fetch('https://api.openai.com/v1/chat/completions', {
headers: { 'Authorization': `Bearer ${apiKey}` }
});
// ✅ Good - Server-side proxy
// Frontend:
const response = await fetch('/api/chat', {
method: 'POST',
body: JSON.stringify({ message: 'Hello' }),
});
// Backend (e.g., Express):
app.post('/api/chat', async (req, res) => {
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [{ role: 'user', content: req.body.message }],
});
res.json(completion);
});
```
---
## 10. Embeddings Dimension Mismatch
### Cause
Using wrong dimensions for embedding model.
### Error Response
```json
{
"error": {
"message": "dimensions must be less than or equal to 3072 for text-embedding-3-large",
"type": "invalid_request_error"
}
}
```
### Solution
Use correct dimensions for each model:
```typescript
// text-embedding-3-small: default 1536, max 1536
const embedding1 = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'Hello world',
// dimensions: 256, // Optional: reduce from default 1536
});
// text-embedding-3-large: default 3072, max 3072
const embedding2 = await openai.embeddings.create({
model: 'text-embedding-3-large',
input: 'Hello world',
// dimensions: 1024, // Optional: reduce from default 3072
});
// text-embedding-ada-002: fixed 1536 (no dimensions parameter)
const embedding3 = await openai.embeddings.create({
model: 'text-embedding-ada-002',
input: 'Hello world',
// No dimensions parameter supported
});
```
---
## Quick Reference Table
| Error Code | HTTP Status | Primary Cause | Quick Fix |
|------------|-------------|---------------|-----------|
| `rate_limit_exceeded` | 429 | Too many requests | Exponential backoff |
| `invalid_api_key` | 401 | Wrong/missing key | Check OPENAI_API_KEY |
| `invalid_request_error` | 400 | Bad parameters | Validate schema/params |
| `context_length_exceeded` | 400 | Too many tokens | Truncate input |
| `model_not_found` | 404 | Invalid model name | Use correct model ID |
| `insufficient_quota` | 429 | No credits left | Add billing/credits |
---
## Additional Resources
- **Official Error Codes**: https://platform.openai.com/docs/guides/error-codes
- **Rate Limits Guide**: https://platform.openai.com/docs/guides/rate-limits
- **Best Practices**: https://platform.openai.com/docs/guides/production-best-practices
---
**Phase 1 Complete**
**Phase 2**: Additional errors for Embeddings, Images, Audio, Moderation (next session)