Top 8 Embedding Errors (And How to Fix Them)
This document lists the 8 most common errors when working with Gemini embeddings, their root causes, and proven solutions.
Error 1: Dimension Mismatch
Error Message
Error: Vector dimensions do not match. Expected 768, got 3072
Why It Happens
- Generated embedding with default dimensions (3072) but Vectorize index expects 768
- Mixed embeddings from different dimension settings
Root Cause
Not specifying outputDimensionality parameter when generating embeddings.
Prevention
```typescript
// ❌ BAD: No outputDimensionality (defaults to 3072)
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text
});

// ✅ GOOD: Match Vectorize index dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text,
  config: { outputDimensionality: 768 } // ← Match your index
});
```
Fix
- Option A: Regenerate embeddings with the dimensions your index expects
- Option B: Recreate the Vectorize index to match your embedding dimensions

```bash
# Recreate index with correct dimensions
npx wrangler vectorize create my-index --dimensions 768 --metric cosine
```
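A cheap guard catches this class of bug before any upsert reaches Vectorize. A minimal sketch (the `validateDimensions` helper and the 768 figure are illustrative, not part of any SDK):

```typescript
// Throws early if an embedding's length does not match the index dimension,
// so the mismatch surfaces before Vectorize rejects the upsert.
function validateDimensions(vector: number[], expected: number): number[] {
  if (vector.length !== expected) {
    throw new Error(
      `Vector dimensions do not match. Expected ${expected}, got ${vector.length}`
    );
  }
  return vector;
}

// Passes: a 768-dimension vector against a 768-dimension index
validateDimensions(new Array(768).fill(0.1), 768);
```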
Sources:
- https://ai.google.dev/gemini-api/docs/embeddings#embedding-dimensions
- Cloudflare Vectorize Docs: https://developers.cloudflare.com/vectorize/
Error 2: Batch Size Limit Exceeded
Error Message
Error: Request contains too many texts. Maximum: 100
Why It Happens
- Tried to embed more texts than API allows in single request
- Different limits for single vs batch endpoints
Root Cause
Gemini API limits the number of texts per batch request.
Prevention
```typescript
// ❌ BAD: Trying to embed 500 texts at once
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: largeArray, // 500 texts
  config: { taskType: 'RETRIEVAL_DOCUMENT' }
});

// ✅ GOOD: Chunk into batches
async function batchEmbed(texts: string[], batchSize = 100) {
  const allEmbeddings: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    const response = await ai.models.embedContent({
      model: 'gemini-embedding-001',
      contents: batch,
      config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
    });
    allEmbeddings.push(...response.embeddings.map(e => e.values));
    // Rate limiting delay
    if (i + batchSize < texts.length) {
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
  }
  return allEmbeddings;
}
```
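The slicing step inside `batchEmbed` can be factored out and unit-tested on its own. A sketch, assuming the same 100-item limit as above (`toBatches` is a hypothetical helper, not an API function):

```typescript
// Splits an array into chunks of at most `batchSize` items, mirroring the
// slicing logic inside batchEmbed as a standalone, testable helper.
function toBatches<T>(items: T[], batchSize = 100): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 250 texts → batches of 100, 100, 50
const batches = toBatches(Array.from({ length: 250 }, (_, i) => `text ${i}`));
```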
Sources:
- Gemini API Limits: https://ai.google.dev/gemini-api/docs/rate-limits
Error 3: Rate Limiting (429 Too Many Requests)
Error Message
Error: 429 Too Many Requests - Rate limit exceeded
Why It Happens
- Exceeded 100 requests per minute (free tier)
- Exceeded tokens per minute limit
- No exponential backoff implemented
Root Cause
Free-tier rate limits: 100 requests per minute (RPM), 30,000 tokens per minute (TPM), 1,000 requests per day (RPD).
Prevention
```typescript
// ❌ BAD: No rate limiting
for (const text of texts) {
  await ai.models.embedContent({ /* ... */ }); // Will hit 429 after 100 requests
}

// ✅ GOOD: Exponential backoff
async function embedWithRetry(text: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await ai.models.embedContent({
        model: 'gemini-embedding-001',
        contents: text,
        config: { taskType: 'SEMANTIC_SIMILARITY', outputDimensionality: 768 }
      });
    } catch (error: any) {
      if (error.status === 429 && attempt < maxRetries - 1) {
        const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s
        console.log(`Rate limit hit. Retrying in ${delay / 1000}s...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
```
Rate Limits:
| Tier | RPM | TPM | RPD |
|---|---|---|---|
| Free | 100 | 30,000 | 1,000 |
| Tier 1 | 3,000 | 1,000,000 | - |
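The delay schedule in `embedWithRetry` can also be expressed as a standalone function, which makes it easy to cap the worst-case wait. A sketch (the 30-second cap is an illustrative choice, not an API requirement):

```typescript
// Delay before the nth retry under exponential backoff: 1s, 2s, 4s, ...
// capped at maxMs so late retries don't wait unboundedly long.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(Math.pow(2, attempt) * baseMs, maxMs);
}

backoffDelayMs(0); // → 1000
backoffDelayMs(2); // → 4000
```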
Error 4: Text Truncation (Input Length Limit)
Error Message
No error! Text is silently truncated at 2,048 tokens.
Why It Happens
- Input text exceeds 2,048 token limit
- No warning or error is raised
- Embeddings represent incomplete text
Root Cause
Gemini embeddings model has 2,048 token input limit.
Prevention
```typescript
// ❌ BAD: Long text (silently truncated)
const longText = "...".repeat(10000); // Very long
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: longText // Truncated to ~2,048 tokens
});

// ✅ GOOD: Chunk long texts
function chunkText(text: string, maxTokens = 2000): string[] {
  const words = text.split(/\s+/);
  const chunks: string[] = [];
  let currentChunk: string[] = [];
  for (const word of words) {
    currentChunk.push(word);
    // Rough estimate: 1 token ≈ 0.75 words, so tokens ≈ words / 0.75
    if (currentChunk.length / 0.75 >= maxTokens) {
      chunks.push(currentChunk.join(' '));
      currentChunk = [];
    }
  }
  if (currentChunk.length > 0) {
    chunks.push(currentChunk.join(' '));
  }
  return chunks;
}

const chunks = chunkText(longText, 2000);
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: chunks,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});
```
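The same 1 token ≈ 0.75 words heuristic can flag texts that would be silently truncated before you send them. A rough sketch; for exact counts you would use a real tokenizer rather than this word-based estimate:

```typescript
// Rough token estimate using the heuristic 1 token ≈ 0.75 words,
// i.e. tokens ≈ words / 0.75. This is an approximation, not a tokenizer.
function estimateTokens(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.ceil(words / 0.75);
}

// True when the estimate exceeds the 2,048-token input limit.
function willBeTruncated(text: string, limit = 2048): boolean {
  return estimateTokens(text) > limit;
}

estimateTokens(Array(75).fill('word').join(' ')); // → 100
```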
Error 5: Cosine Similarity Calculation Errors
Error Message
Error: Similarity values out of range (-1.5 to 1.2)
Why It Happens
- Incorrect formula (using dot product instead of cosine similarity)
- Not normalizing magnitudes
- Division by zero for zero vectors
Root Cause
Improper implementation of cosine similarity formula.
Prevention
```typescript
// ❌ BAD: Just dot product (not cosine similarity)
function badSimilarity(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum; // Wrong! This is unbounded
}

// ✅ GOOD: Proper cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vector dimensions must match');
  }
  let dotProduct = 0;
  let magnitudeA = 0;
  let magnitudeB = 0;
  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    magnitudeA += a[i] * a[i];
    magnitudeB += b[i] * b[i];
  }
  if (magnitudeA === 0 || magnitudeB === 0) {
    return 0; // Handle zero vectors
  }
  return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}
```
Formula:
cosine_similarity(A, B) = (A · B) / (||A|| × ||B||)
Where:
- `A · B` = dot product
- `||A||` = magnitude of vector A = √(a₁² + a₂² + ... + aₙ²)
- `||B||` = magnitude of vector B, computed the same way
Result Range: Always between -1 and 1
- 1 = identical direction
- 0 = perpendicular
- -1 = opposite direction
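To sanity-check an implementation against the range above, a few known cases are useful (the function is repeated here, in compact form, so the snippet runs standalone):

```typescript
// Cosine similarity with zero-vector guard, as in the ✅ version above.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  if (magA === 0 || magB === 0) return 0;
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Known cases from the result-range list:
cosineSimilarity([1, 0], [2, 0]);  // → 1  (identical direction)
cosineSimilarity([1, 0], [0, 1]);  // → 0  (perpendicular)
cosineSimilarity([1, 0], [-1, 0]); // → -1 (opposite direction)
```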
Error 6: Incorrect Task Type (Reduces Quality)
Error Message
No error, but search quality is poor (10-30% worse).
Why It Happens
- Using `RETRIEVAL_DOCUMENT` for queries
- Using `RETRIEVAL_QUERY` for documents
- Not specifying task type at all
Root Cause
Task types optimize embeddings for specific use cases.
Prevention
```typescript
// ❌ BAD: Wrong task type for RAG
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // ← Wrong! Should be RETRIEVAL_QUERY
});

// ✅ GOOD: Correct task types
// For user queries
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY', outputDimensionality: 768 }
});

// For documents to index
const docEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: documentText,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});
```
Task Types Cheat Sheet:
| Task Type | Use For | Example |
|---|---|---|
| `RETRIEVAL_QUERY` | User queries | "What is RAG?" |
| `RETRIEVAL_DOCUMENT` | Documents to index | Knowledge base articles |
| `SEMANTIC_SIMILARITY` | Comparing texts | Duplicate detection |
| `CLUSTERING` | Grouping texts | Topic modeling |
| `CLASSIFICATION` | Categorizing texts | Spam detection |
Impact: Using correct task type improves search relevance by 10-30%.
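One way to make the query/document split hard to get wrong is to derive the config from a role instead of typing task-type strings at each call site. A hypothetical helper mirroring the cheat sheet above (`embedConfigFor` is not an SDK function):

```typescript
// Restricting the role to a union type means a typo like 'qurey' fails to
// compile, and each role always pairs with its correct task type.
type RagRole = 'query' | 'document';

function embedConfigFor(role: RagRole, dimensions = 768) {
  return {
    taskType: role === 'query' ? 'RETRIEVAL_QUERY' : 'RETRIEVAL_DOCUMENT',
    outputDimensionality: dimensions,
  };
}

embedConfigFor('query');    // → { taskType: 'RETRIEVAL_QUERY', outputDimensionality: 768 }
embedConfigFor('document'); // → { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
```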
Error 7: Vector Storage Precision Loss
Error Message
Warning: Similarity scores inconsistent after storage/retrieval
Why It Happens
- Storing embeddings as integers instead of floats
- Rounding to fewer decimal places
- Using lossy compression
Root Cause
Embeddings are high-precision floating-point numbers.
Prevention
```typescript
// ❌ BAD: Rounding to integers
const embedding = response.embeddings[0].values;
const rounded = embedding.map(v => Math.round(v)); // Precision loss!
await db.insert({
  id: '1',
  embedding: rounded // ← Will degrade search quality
});

// ✅ GOOD: Store full precision
const embedding = response.embeddings[0].values; // Keep as-is
await db.insert({
  id: '1',
  embedding: embedding // ← Full float32 precision
});

// For JSON storage, use full precision
const json = JSON.stringify({
  id: '1',
  embedding: embedding // JavaScript numbers are float64
});
```
Storage Recommendations:
- Vectorize: Handles float32 automatically ✅
- D1/SQLite: Use BLOB for binary float32 array
- KV: Store as JSON (float64 precision)
- R2: Store as binary float32 array
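For the D1 and R2 binary options, `Float32Array` gives a round trip that is lossless for search purposes: converting float64 values to float32 and back changes each component by well under 1e-6, versus total signal loss when rounding to integers. A minimal sketch:

```typescript
// Serialize an embedding to a binary float32 buffer (for BLOB columns or R2
// objects) and restore it. The float64 → float32 conversion costs only
// ~1e-7 absolute error per component for values in the typical (-1, 1) range.
function toFloat32Bytes(embedding: number[]): ArrayBuffer {
  return Float32Array.from(embedding).buffer;
}

function fromFloat32Bytes(buffer: ArrayBuffer): number[] {
  return Array.from(new Float32Array(buffer));
}

const original = [0.123456789, -0.987654321, 0.000001234];
const restored = fromFloat32Bytes(toFloat32Bytes(original));
// restored[i] differs from original[i] by less than 1e-7
```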
Sources:
- Cloudflare Vectorize: https://developers.cloudflare.com/vectorize/
Error 8: Model Version Confusion
Error Message
Error: Model 'gemini-embedding-exp-03-07' is deprecated
Why It Happens
- Using experimental or deprecated model
- Mixing embeddings from different model versions
- Not keeping up with model updates
Root Cause
Gemini has stable and experimental embedding models.
Prevention
```typescript
// ❌ BAD: Using experimental/deprecated model
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-exp-03-07', // Deprecated October 2025
  contents: text
});

// ✅ GOOD: Use stable model
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001', // Stable production model
  contents: text,
  config: {
    taskType: 'SEMANTIC_SIMILARITY',
    outputDimensionality: 768
  }
});
```
Model Status:
| Model | Status | Recommendation |
|---|---|---|
| `gemini-embedding-001` | ✅ Stable | Use this |
| `gemini-embedding-exp-03-07` | ❌ Deprecated (Oct 2025) | Migrate to `gemini-embedding-001` |
CRITICAL: Never mix embeddings from different models. They use different vector spaces and are not comparable.
Summary Checklist
Before deploying to production, verify:
- `outputDimensionality` matches Vectorize index dimensions
- Batch size ≤ API limits (chunk large datasets)
- Rate limiting implemented with exponential backoff
- Long texts are chunked (≤ 2,048 tokens)
- Cosine similarity formula is correct
- Correct task types used (RETRIEVAL_QUERY vs RETRIEVAL_DOCUMENT)
- Embeddings stored with full precision (float32)
- Using stable model (`gemini-embedding-001`)
Following these guidelines prevents the vast majority of embedding errors seen in production.
Additional Resources
- Official Docs: https://ai.google.dev/gemini-api/docs/embeddings
- Rate Limits: https://ai.google.dev/gemini-api/docs/rate-limits
- Vectorize Docs: https://developers.cloudflare.com/vectorize/
- Model Specs: https://ai.google.dev/gemini-api/docs/models/gemini#gemini-embedding-001