# OpenAI API Skill - Phase 2 Session Plan

**Created**: 2025-10-25
**Status**: Phase 1 Complete ✅ - Ready for Phase 2
**Estimated Phase 2 Time**: 3-4 hours

---

## Phase 1 Completion Summary ✅

### What's Done

1. **SKILL.md** - Complete foundation (900+ lines)
   - ✅ Full Chat Completions API documentation
   - ✅ GPT-5 series coverage with unique parameters
   - ✅ Streaming patterns (both SDK and fetch)
   - ✅ Function calling complete guide
   - ✅ Structured outputs examples
   - ✅ Vision (GPT-4o) coverage
   - ✅ Error handling section
   - ✅ Rate limits section
   - ✅ Production best practices
   - ✅ Relationship to openai-responses

2. **README.md** - Complete with comprehensive keywords ✅
   - All auto-trigger keywords
   - When to use guide
   - Quick examples
   - Known issues table
   - Token efficiency metrics

3. **Core Templates** (6 templates + package.json) ✅
   - chat-completion-basic.ts
   - chat-completion-nodejs.ts
   - streaming-chat.ts
   - streaming-fetch.ts
   - function-calling.ts
   - cloudflare-worker.ts
   - package.json

4. **Reference Docs** (1 file) ✅
   - top-errors.md (10 common errors with solutions)

5. **Scripts** (1 file) ✅
   - check-versions.sh

6. **Research** ✅
   - Complete research log: `/planning/research-logs/openai-api.md`

### Current Status

- **Usable NOW**: Chat Completions fully documented and working
- **Phase 1**: Production-ready for primary use case (Chat Completions)
- **Phase 2**: Remaining APIs to be completed

---

## Phase 2 Tasks

### 1. Complete SKILL.md Sections (2-3 hours)

#### Embeddings API Section

Location: `SKILL.md` line ~600 (marked as "Phase 2")

**Content to Add**:
- Models: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
- Custom dimensions parameter
- Batch processing patterns
- Request/response examples
- RAG integration patterns
- Dimension reduction techniques
- Token limits (8192 per input, 300k summed)

**Source**: `/planning/research-logs/openai-api.md` Section 2

#### Images API Section

Location: `SKILL.md` line ~620 (marked as "Phase 2")

**Content to Add**:
- DALL-E 3 generation (/v1/images/generations)
- Image editing (/v1/images/edits)
- Parameters: size, quality, style, response_format
- Quality settings (standard vs HD)
- Style options (vivid vs natural)
- Transparent backgrounds
- Output compression
- Request/response examples

**Source**: `/planning/research-logs/openai-api.md` Section 3

#### Audio API Section

Location: `SKILL.md` line ~640 (marked as "Phase 2")

**Content to Add**:
- Whisper transcription (/v1/audio/transcriptions)
- Text-to-Speech (/v1/audio/speech)
- Models: whisper-1, tts-1, tts-1-hd, gpt-4o-mini-tts
- 11 voices (alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse)
- Audio formats: mp3, opus, aac, flac, wav, pcm
- Speed control (0.25 to 4.0)
- Voice instructions (gpt-4o-mini-tts only)
- Streaming audio (SSE format)
- Request/response examples

**Source**: `/planning/research-logs/openai-api.md` Section 4

#### Moderation API Section

Location: `SKILL.md` line ~660 (marked as "Phase 2")

**Content to Add**:
- Moderation endpoint (/v1/moderations)
- Model: omni-moderation-latest
- Categories: sexual, hate, harassment, self-harm, violence, etc.
- Category scores (0-1 confidence)
- Multi-modal moderation (text + images)
- Batch moderation
- Request/response examples
- Threshold recommendations

**Source**: `/planning/research-logs/openai-api.md` Section 5

### 2. Create Remaining Templates (9 files, 1-2 hours)

#### Embeddings Templates

1. **embeddings.ts** - Basic embeddings generation
   ```typescript
   // text-embedding-3-small and text-embedding-3-large examples
   // Custom dimensions
   // Batch processing
   ```

#### Images Templates

2. **image-generation.ts** - DALL-E 3 generation
   ```typescript
   // Basic generation
   // Quality and style options
   // Transparent backgrounds
   ```

3. **image-editing.ts** - Image editing
   ```typescript
   // Edit with mask
   // Transparent backgrounds
   // Compression options
   ```

#### Audio Templates

4. **audio-transcription.ts** - Whisper transcription
   ```typescript
   // File transcription
   // Supported formats
   ```

5. **text-to-speech.ts** - TTS generation
   ```typescript
   // All 11 voices
   // gpt-4o-mini-tts with instructions
   // Speed control
   // Format options
   ```

#### Moderation Templates

6. **moderation.ts** - Content moderation
   ```typescript
   // Basic moderation
   // Category filtering
   // Batch moderation
   ```

#### Advanced Templates

7. **structured-output.ts** - JSON schema validation
   ```typescript
   // Using response_format with JSON schema
   // Strict mode
   // Complex nested schemas
   ```

8. **vision-gpt4o.ts** - Vision examples
   ```typescript
   // Image via URL
   // Image via base64
   // Multiple images
   ```

9. **rate-limit-handling.ts** - Production retry logic
   ```typescript
   // Exponential backoff
   // Rate limit header monitoring
   // Queue implementation
   ```

### 3. Create Remaining Reference Docs (7 files, 1 hour)

1. **models-guide.md**
   - GPT-5 vs GPT-4o vs GPT-4 Turbo comparison table
   - When to use each model
   - Cost comparison
   - Capability matrix

2. **function-calling-patterns.md**
   - Advanced tool patterns
   - Parallel tool calls
   - Dynamic tool generation
   - Error handling in tools

3. **structured-output-guide.md**
   - JSON schema best practices
   - Complex nested schemas
   - Validation strategies
   - Error handling

4. **embeddings-guide.md**
   - Model comparison (small vs large vs ada-002)
   - Dimension selection
   - RAG patterns
   - Cosine similarity examples
   - Batch processing strategies

5. **images-guide.md**
   - DALL-E 3 prompting tips
   - Quality vs cost trade-offs
   - Style guide (vivid vs natural)
   - Transparent backgrounds use cases
   - Editing best practices

6. **audio-guide.md**
   - Voice selection guide
   - TTS vs real recordings
   - Whisper accuracy tips
   - Format selection

7. **cost-optimization.md**
   - Model selection strategies
   - Caching patterns
   - Batch processing
   - Token optimization
   - Rate limit management

### 4. Testing & Validation (30 min)

- [ ] Install skill: `./scripts/install-skill.sh openai-api`
- [ ] Test auto-discovery with Claude Code
- [ ] Verify all templates compile (TypeScript check)
- [ ] Test at least 2-3 templates end-to-end with real API calls
- [ ] Check against ONE_PAGE_CHECKLIST.md

### 5. Final Documentation (30 min)

- [ ] Update roadmap: `/planning/skills-roadmap.md`
  - Mark openai-api as complete
  - Add completion metrics (token savings, errors prevented)
  - Update status to Production Ready
- [ ] Update SKILL.md
  - Remove all "Phase 2" markers
  - Update status to "Production Ready ✅"
  - Update Last Updated date
- [ ] Create final commit message

---

## Quick Start for Phase 2 Session

### Context to Load

1. Read this file (NEXT-SESSION.md)
2. Read `/planning/research-logs/openai-api.md` for all API details
3. Review current `SKILL.md` to see structure

### First Steps

```bash
# 1. Navigate to skill directory
cd /home/jez/Documents/claude-skills/skills/openai-api

# 2. Verify current state
ls -la templates/
ls -la references/

# 3. Start with Embeddings API section
# Edit SKILL.md around line 600
```

### Development Order

1. **Embeddings** (most requested after Chat Completions)
2. **Images** (DALL-E 3 popular)
3. **Audio** (Whisper + TTS)
4. **Moderation** (simple, quick)
5. **Templates** (parallel work)
6. **Reference docs** (parallel work)
7. **Testing**
8. **Commit**

---

## Reference Files

### Already Created

- `SKILL.md` - Foundation with Chat Completions complete
- `README.md` - Complete
- `templates/chat-completion-basic.ts` ✅
- `templates/chat-completion-nodejs.ts` ✅
- `templates/streaming-chat.ts` ✅
- `templates/streaming-fetch.ts` ✅
- `templates/function-calling.ts` ✅
- `templates/cloudflare-worker.ts` ✅
- `templates/package.json` ✅
- `references/top-errors.md` ✅
- `scripts/check-versions.sh` ✅
- `/planning/research-logs/openai-api.md` ✅

### To Be Created (Phase 2)

- **Templates** (9 files)
- **References** (7 files)
- **SKILL.md sections** (4 API sections)

---

## Success Criteria

### Phase 2 Complete When:

- [ ] All 4 API sections in SKILL.md complete (Embeddings, Images, Audio, Moderation)
- [ ] All 15 templates created (6 done + 9 new)
- [ ] At least 8 reference docs created (1 done + 7 new)
- [ ] Auto-discovery working
- [ ] All templates tested
- [ ] Token savings >= 60% (measured)
- [ ] Errors prevented: 10+ (documented)
- [ ] Roadmap updated
- [ ] Committed to git
- [ ] Status: Production Ready ✅

---

## Token Efficiency Target

**Phase 1 Baseline**:
- Manual Chat Completions setup: ~10,000 tokens
- With Phase 1 skill: ~4,000 tokens
- **Savings: ~60%**

**Phase 2 Target** (full skill):
- Manual full API setup: ~21,000 tokens
- With complete skill: ~8,500 tokens
- **Target savings: ~60%**

---

## Notes

- All research is complete and documented
- Templates follow consistent patterns
- Both SDK and fetch approaches where applicable
- Focus on copy-paste ready code
- Production patterns emphasized
- Clear relationship to openai-responses skill

---

**Ready to Execute Phase 2!** 🚀

When starting the next session, read this file and continue from Phase 2 Tasks.
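
As a quick sanity check on the Token Efficiency Target figures, the savings percentages can be recomputed from the token counts quoted in this plan. The sketch below is illustrative only; `tokenSavings` is a hypothetical helper name, not something that exists in the skill.

```typescript
// Hypothetical helper: fraction of tokens saved when a task drops
// from `manualTokens` to `skillTokens`.
function tokenSavings(manualTokens: number, skillTokens: number): number {
  if (manualTokens <= 0) {
    throw new RangeError("manualTokens must be positive");
  }
  return (manualTokens - skillTokens) / manualTokens;
}

// Token counts taken from the Token Efficiency Target section of this plan.
const phase1 = tokenSavings(10000, 4000);
const phase2 = tokenSavings(21000, 8500);

console.log(`Phase 1 savings: ${(phase1 * 100).toFixed(1)}%`);
console.log(`Phase 2 target:  ${(phase2 * 100).toFixed(1)}%`);
```

With these inputs the Phase 2 figure comes out slightly under 60% before rounding, so the measured value may need a little headroom to satisfy the ">= 60% (measured)" success criterion.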
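
For the planned `rate-limit-handling.ts` template, the exponential backoff item could start from something like the sketch below. It is a minimal outline under stated assumptions, not the template itself: `backoffDelayMs` and `withRetry` are hypothetical names, the default delays are arbitrary, and the caller supplies both the retry predicate and the `sleep` function.

```typescript
// Hypothetical helper: delay before retry number `attempt` (0-based),
// doubling from `baseMs` and capped at `maxMs`. Defaults are assumptions.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical wrapper: calls `fn`, and on a retryable error waits with
// exponential backoff before trying again, up to `maxAttempts` calls in total.
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 5,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on the last attempt or on errors the caller marks as fatal.
      if (attempt + 1 >= maxAttempts || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

In practice `isRetryable` would typically check for a rate-limit response (for example HTTP 429), and `sleep` would wrap a timer. Adding random jitter to the delay is a common refinement that this sketch leaves out.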