# Top 8 Embedding Errors (And How to Fix Them)

This document lists the 8 most common errors when working with Gemini embeddings, their root causes, and proven solutions.

---

## Error 1: Dimension Mismatch

### Error Message

```
Error: Vector dimensions do not match. Expected 768, got 3072
```

### Why It Happens

- Generated embedding with default dimensions (3072) but Vectorize index expects 768
- Mixed embeddings from different dimension settings

### Root Cause

Not specifying the `outputDimensionality` parameter when generating embeddings.

### Prevention

```typescript
// ❌ BAD: No outputDimensionality (defaults to 3072)
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text
});

// ✅ GOOD: Match Vectorize index dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text,
  config: { outputDimensionality: 768 } // ← Match your index
});
```

### Fix

1. **Option A**: Regenerate the embeddings with `outputDimensionality` set to the index's dimensions (768 in this example)
2. **Option B**: Recreate the Vectorize index so its dimensions match the embeddings you generate. Check Vectorize's maximum supported dimensions first, since the 3072 default may exceed it.

```bash
# Delete the old index first if you are reusing its name (this removes its stored vectors)
npx wrangler vectorize delete my-index
# Recreate the index with the dimensions your embeddings use (768 shown)
npx wrangler vectorize create my-index --dimensions 768 --metric cosine
```

**Sources**:
- https://ai.google.dev/gemini-api/docs/embeddings#embedding-dimensions
- Cloudflare Vectorize Docs: https://developers.cloudflare.com/vectorize/

---

## Error 2: Batch Size Limit Exceeded

### Error Message

```
Error: Request contains too many texts. Maximum: 100
```

### Why It Happens

- Tried to embed more texts than the API allows in a single request
- Different limits for single vs batch endpoints

### Root Cause

The Gemini API limits the number of texts per batch request.

### Prevention

```typescript
// ❌ BAD: Trying to embed 500 texts at once
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: largeArray, // 500 texts
  config: { taskType: 'RETRIEVAL_DOCUMENT' }
});

// ✅ GOOD: Chunk into batches
async function batchEmbed(texts: string[], batchSize = 100) {
  const allEmbeddings: number[][] = [];

  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);

    const response = await ai.models.embedContent({
      model: 'gemini-embedding-001',
      contents: batch,
      config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
    });

    // Guard against missing fields before collecting the vectors
    allEmbeddings.push(...(response.embeddings ?? []).map(e => e.values ?? []));

    // Rate limiting delay between batches (skipped after the last one)
    if (i + batchSize < texts.length) {
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
  }

  return allEmbeddings;
}
```

**Sources**:
- Gemini API Limits: https://ai.google.dev/gemini-api/docs/rate-limits

---

## Error 3: Rate Limiting (429 Too Many Requests)

### Error Message

```
Error: 429 Too Many Requests - Rate limit exceeded
```

### Why It Happens

- Exceeded 100 requests per minute (free tier)
- Exceeded the tokens per minute limit
- No exponential backoff implemented

### Root Cause

Free tier rate limits: 100 requests per minute (RPM), 30,000 tokens per minute (TPM), 1,000 requests per day (RPD).

### Prevention

```typescript
// ❌ BAD: No rate limiting
for (const text of texts) {
  await ai.models.embedContent({ /* ... */ }); // Will hit 429 after 100 requests
}

// ✅ GOOD: Exponential backoff
async function embedWithRetry(text: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await ai.models.embedContent({
        model: 'gemini-embedding-001',
        contents: text,
        config: { taskType: 'SEMANTIC_SIMILARITY', outputDimensionality: 768 }
      });
    } catch (error: any) {
      // Retry only on 429, and only while attempts remain
      if (error.status === 429 && attempt < maxRetries - 1) {
        const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s
        console.log(`Rate limit hit.
Retrying in ${delay / 1000}s...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
```

**Rate Limits**:

| Tier | RPM | TPM | RPD |
|------|-----|-----|-----|
| Free | 100 | 30,000 | 1,000 |
| Tier 1 | 3,000 | 1,000,000 | - |

**Sources**:
- https://ai.google.dev/gemini-api/docs/rate-limits

---

## Error 4: Text Truncation (Input Length Limit)

### Error Message

No error! Text is **silently truncated** at 2,048 tokens.

### Why It Happens

- Input text exceeds the 2,048 token limit
- No warning or error is raised
- Embeddings represent incomplete text

### Root Cause

The Gemini embeddings model has a 2,048 token input limit.

### Prevention

```typescript
// ❌ BAD: Long text (silently truncated)
const longText = "lorem ipsum ".repeat(10000); // Very long
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: longText // Truncated to ~2,048 tokens
});

// ✅ GOOD: Chunk long texts
function chunkText(text: string, maxTokens = 2000): string[] {
  const words = text.trim().split(/\s+/);
  const chunks: string[] = [];
  let currentChunk: string[] = [];

  for (const word of words) {
    currentChunk.push(word);

    // Rough estimate: 1 token ≈ 0.75 words, so tokens ≈ words / 0.75
    if (currentChunk.length / 0.75 >= maxTokens) {
      chunks.push(currentChunk.join(' '));
      currentChunk = [];
    }
  }

  // Flush the final partial chunk
  if (currentChunk.length > 0) {
    chunks.push(currentChunk.join(' '));
  }

  return chunks;
}

const chunks = chunkText(longText, 2000);
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: chunks,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});
```

**Sources**:
- https://ai.google.dev/gemini-api/docs/models/gemini#gemini-embedding-001

---

## Error 5: Cosine Similarity Calculation Errors

### Error Message

```
Error: Similarity values out of range (-1.5 to 1.2)
```

### Why It Happens

- Incorrect formula (using dot product instead of cosine similarity)
- Not normalizing magnitudes
- Division by zero for zero vectors

### Root Cause

Improper
implementation of the cosine similarity formula.

### Prevention

```typescript
// ❌ BAD: Just dot product (not cosine similarity)
function badSimilarity(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum; // Wrong! This is unbounded
}

// ✅ GOOD: Proper cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vector dimensions must match');
  }

  let dotProduct = 0;
  let magnitudeA = 0; // Accumulates the squared magnitude of a
  let magnitudeB = 0; // Accumulates the squared magnitude of b

  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    magnitudeA += a[i] * a[i];
    magnitudeB += b[i] * b[i];
  }

  if (magnitudeA === 0 || magnitudeB === 0) {
    return 0; // Handle zero vectors
  }

  return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}
```

**Formula**:

```
cosine_similarity(A, B) = (A · B) / (||A|| × ||B||)
```

Where:
- `A · B` = dot product
- `||A||` = magnitude of vector A = √(a₁² + a₂² + ... + aₙ²)

**Result Range**: Always between -1 and 1
- 1 = identical direction
- 0 = perpendicular
- -1 = opposite direction

**Sources**:
- https://en.wikipedia.org/wiki/Cosine_similarity

---

## Error 6: Incorrect Task Type (Reduces Quality)

### Error Message

No error, but search quality is poor (10-30% worse).

### Why It Happens

- Using `RETRIEVAL_DOCUMENT` for queries
- Using `RETRIEVAL_QUERY` for documents
- Not specifying task type at all

### Root Cause

Task types optimize embeddings for specific use cases.

### Prevention

```typescript
// ❌ BAD: Wrong task type for RAG
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // ← Wrong!
  // Should be RETRIEVAL_QUERY
});

// ✅ GOOD: Correct task types
// For user queries
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY', outputDimensionality: 768 }
});

// For documents to index
const docEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: documentText,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});
```

**Task Types Cheat Sheet**:

| Task Type | Use For | Example |
|-----------|---------|---------|
| `RETRIEVAL_QUERY` | User queries | "What is RAG?" |
| `RETRIEVAL_DOCUMENT` | Documents to index | Knowledge base articles |
| `SEMANTIC_SIMILARITY` | Comparing texts | Duplicate detection |
| `CLUSTERING` | Grouping texts | Topic modeling |
| `CLASSIFICATION` | Categorizing texts | Spam detection |

**Impact**: Using the correct task type improves search relevance by 10-30%.

**Sources**:
- https://ai.google.dev/gemini-api/docs/embeddings#task-types

---

## Error 7: Vector Storage Precision Loss

### Error Message

```
Warning: Similarity scores inconsistent after storage/retrieval
```

### Why It Happens

- Storing embeddings as integers instead of floats
- Rounding to fewer decimal places
- Using lossy compression

### Root Cause

Embeddings are high-precision floating-point numbers.

### Prevention

```typescript
// ❌ BAD: Rounding to integers
const embedding = response.embeddings[0].values;
const rounded = embedding.map(v => Math.round(v)); // Precision loss!
await db.insert({
  id: '1',
  embedding: rounded // ← Will degrade search quality
});

// ✅ GOOD: Store full precision
const embedding = response.embeddings[0].values; // Keep as-is
await db.insert({
  id: '1',
  embedding: embedding // ← Full float32 precision
});

// For JSON storage, use full precision
const json = JSON.stringify({
  id: '1',
  embedding: embedding // JavaScript numbers are float64
});
```

**Storage Recommendations**:
- **Vectorize**: Handles float32 automatically ✅
- **D1/SQLite**: Use BLOB for binary float32 array
- **KV**: Store as JSON (float64 precision)
- **R2**: Store as binary float32 array

**Sources**:
- Cloudflare Vectorize: https://developers.cloudflare.com/vectorize/

---

## Error 8: Model Version Confusion

### Error Message

```
Error: Model 'gemini-embedding-exp-03-07' is deprecated
```

### Why It Happens

- Using an experimental or deprecated model
- Mixing embeddings from different model versions
- Not keeping up with model updates

### Root Cause

Gemini has stable and experimental embedding models.

### Prevention

```typescript
// ❌ BAD: Using experimental/deprecated model
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-exp-03-07', // Deprecated October 2025
  contents: text
});

// ✅ GOOD: Use stable model
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001', // Stable production model
  contents: text,
  config: { taskType: 'SEMANTIC_SIMILARITY', outputDimensionality: 768 }
});
```

**Model Status**:

| Model | Status | Recommendation |
|-------|--------|----------------|
| `gemini-embedding-001` | ✅ Stable | Use this |
| `gemini-embedding-exp-03-07` | ❌ Deprecated (Oct 2025) | Migrate to gemini-embedding-001 |

**CRITICAL**: Never mix embeddings from different models. They use different vector spaces and are not comparable.

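One way to enforce the rule against mixing models is to store the model name alongside each vector and check it before any comparison. The sketch below is illustrative only: `TaggedEmbedding` and `assertComparable` are hypothetical names introduced here, not part of the Gemini SDK or Vectorize.

```typescript
// Hypothetical record shape: the model name travels with the vector it produced.
interface TaggedEmbedding {
  model: string;    // e.g. 'gemini-embedding-001'
  values: number[]; // the embedding vector itself
}

// Throws if two embeddings come from different models or have different lengths.
function assertComparable(a: TaggedEmbedding, b: TaggedEmbedding): void {
  if (a.model !== b.model) {
    throw new Error(`Cannot compare embeddings from '${a.model}' and '${b.model}'`);
  }
  if (a.values.length !== b.values.length) {
    throw new Error(`Dimension mismatch: ${a.values.length} vs ${b.values.length}`);
  }
}
```

Calling a check like this before computing similarity turns a silent quality problem into an immediate error, which is usually easier to debug after a model migration.
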
**Sources**:
- https://ai.google.dev/gemini-api/docs/models/gemini#text-embeddings

---

## Summary Checklist

Before deploying to production, verify:

- [ ] `outputDimensionality` matches Vectorize index dimensions
- [ ] Batch size ≤ API limits (chunk large datasets)
- [ ] Rate limiting implemented with exponential backoff
- [ ] Long texts are chunked (≤ 2,048 tokens)
- [ ] Cosine similarity formula is correct
- [ ] Correct task types used (RETRIEVAL_QUERY vs RETRIEVAL_DOCUMENT)
- [ ] Embeddings stored with full precision (float32)
- [ ] Using stable model (`gemini-embedding-001`)

**Following these guidelines addresses all 8 errors documented above.**

---

## Additional Resources

- **Official Docs**: https://ai.google.dev/gemini-api/docs/embeddings
- **Rate Limits**: https://ai.google.dev/gemini-api/docs/rate-limits
- **Vectorize Docs**: https://developers.cloudflare.com/vectorize/
- **Model Specs**: https://ai.google.dev/gemini-api/docs/models/gemini#gemini-embedding-001