# Top 8 Embedding Errors (And How to Fix Them)

This document lists the 8 most common errors when working with Gemini embeddings, their root causes, and proven solutions.

---
## Error 1: Dimension Mismatch

### Error Message

```
Error: Vector dimensions do not match. Expected 768, got 3072
```

### Why It Happens

- Embeddings were generated with the default dimensionality (3072), but the Vectorize index expects 768
- Embeddings generated with different dimension settings were mixed in one index

### Root Cause

The `outputDimensionality` parameter was not specified when generating embeddings.

### Prevention

```typescript
// ❌ BAD: No outputDimensionality (defaults to 3072)
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text
});

// ✅ GOOD: Match Vectorize index dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text,
  config: { outputDimensionality: 768 } // ← Match your index
});
```

### Fix

1. **Option A**: Regenerate embeddings with the dimensions your index expects
2. **Option B**: Recreate the Vectorize index to match your embedding dimensions

```bash
# Recreate the index to match your embedding dimensions
npx wrangler vectorize create my-index --dimensions 768 --metric cosine
```

**Sources**:
- https://ai.google.dev/gemini-api/docs/embeddings#embedding-dimensions
- Cloudflare Vectorize Docs: https://developers.cloudflare.com/vectorize/
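A cheap runtime guard catches mismatches before a vector ever reaches the index, failing fast with a readable message instead of the index's rejection. A minimal sketch (`assertDimensions` is a hypothetical helper, not part of any SDK):

```typescript
// Must match the value passed to `wrangler vectorize create --dimensions`.
const INDEX_DIMENSIONS = 768;

// Hypothetical guard: verify an embedding's length before upserting.
// Returns the vector unchanged so it can be used inline.
function assertDimensions(vector: number[], expected: number = INDEX_DIMENSIONS): number[] {
  if (vector.length !== expected) {
    throw new Error(
      `Dimension mismatch: index expects ${expected}, embedding has ${vector.length}`
    );
  }
  return vector;
}
```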
---

## Error 2: Batch Size Limit Exceeded

### Error Message

```
Error: Request contains too many texts. Maximum: 100
```

### Why It Happens

- Tried to embed more texts than the API allows in a single request
- Single and batch requests have different limits

### Root Cause

The Gemini API limits the number of texts per batch request.

### Prevention

```typescript
// ❌ BAD: Trying to embed 500 texts at once
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: largeArray, // 500 texts
  config: { taskType: 'RETRIEVAL_DOCUMENT' }
});

// ✅ GOOD: Chunk into batches
async function batchEmbed(texts: string[], batchSize = 100) {
  const allEmbeddings: number[][] = [];

  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    const response = await ai.models.embedContent({
      model: 'gemini-embedding-001',
      contents: batch,
      config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
    });
    allEmbeddings.push(...response.embeddings.map(e => e.values));

    // Rate-limiting delay between batches
    if (i + batchSize < texts.length) {
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
  }

  return allEmbeddings;
}
```

**Sources**:
- Gemini API Limits: https://ai.google.dev/gemini-api/docs/rate-limits
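The slicing logic inside `batchEmbed` can be factored into a standalone helper and unit-tested without touching the API. A sketch:

```typescript
// Split an array into chunks of at most `size` items.
// The final chunk may be smaller than `size`.
function chunkArray<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('chunk size must be >= 1');
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}
```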
---

## Error 3: Rate Limiting (429 Too Many Requests)

### Error Message

```
Error: 429 Too Many Requests - Rate limit exceeded
```

### Why It Happens

- Exceeded 100 requests per minute (free tier)
- Exceeded the tokens-per-minute limit
- No exponential backoff implemented

### Root Cause

Free tier rate limits: 100 RPM, 30,000 TPM, 1,000 RPD.

### Prevention

```typescript
// ❌ BAD: No rate limiting
for (const text of texts) {
  await ai.models.embedContent({ /* ... */ }); // Will hit 429 after 100 requests
}

// ✅ GOOD: Exponential backoff
async function embedWithRetry(text: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await ai.models.embedContent({
        model: 'gemini-embedding-001',
        contents: text,
        config: { taskType: 'SEMANTIC_SIMILARITY', outputDimensionality: 768 }
      });
    } catch (error: any) {
      if (error.status === 429 && attempt < maxRetries - 1) {
        const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s
        console.log(`Rate limit hit. Retrying in ${delay / 1000}s...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
  throw new Error('Unreachable: every attempt either returns or throws');
}
```

**Rate Limits**:

| Tier | RPM | TPM | RPD |
|------|-----|-----|-----|
| Free | 100 | 30,000 | 1,000 |
| Tier 1 | 3,000 | 1,000,000 | - |

**Sources**:
- https://ai.google.dev/gemini-api/docs/rate-limits
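Fixed exponential steps (1s, 2s, 4s) can synchronize retries across concurrent workers, so they all hammer the API again at the same moment. Adding random jitter spreads retries out. A sketch of the delay schedule only (the 30-second cap is an arbitrary assumption, not an API requirement):

```typescript
// Exponential backoff with "full jitter":
// delay is uniform in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}
```

Plug it into the retry loop above in place of `Math.pow(2, attempt) * 1000`.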
---

## Error 4: Text Truncation (Input Length Limit)

### Error Message

No error! Text is **silently truncated** at 2,048 tokens.

### Why It Happens

- Input text exceeds the 2,048-token limit
- No warning or error is raised
- Embeddings represent incomplete text

### Root Cause

The Gemini embedding model has a 2,048-token input limit.

### Prevention

```typescript
// ❌ BAD: Long text (silently truncated)
const longText = "...".repeat(10000); // Very long
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: longText // Truncated to ~2,048 tokens
});

// ✅ GOOD: Chunk long texts
function chunkText(text: string, maxTokens = 2000): string[] {
  const words = text.split(/\s+/);
  const chunks: string[] = [];
  let currentChunk: string[] = [];

  for (const word of words) {
    currentChunk.push(word);

    // Rough estimate: 1 token ≈ 0.75 words, so tokens ≈ words / 0.75
    if (currentChunk.length / 0.75 >= maxTokens) {
      chunks.push(currentChunk.join(' '));
      currentChunk = [];
    }
  }

  if (currentChunk.length > 0) {
    chunks.push(currentChunk.join(' '));
  }

  return chunks;
}

const chunks = chunkText(longText, 2000);
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: chunks,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});
```

**Sources**:
- https://ai.google.dev/gemini-api/docs/models/gemini#gemini-embedding-001
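An alternative heuristic estimates tokens from character count (~4 characters per token is a common rule of thumb for English, not a Gemini guarantee). A sketch of a pre-flight check that flags chunks likely to be truncated:

```typescript
// Rough token estimate: ~4 characters per token for English text.
// This is a heuristic, not the model's actual tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Warn before embedding when a chunk will probably exceed the input limit.
function likelyTruncated(text: string, limit = 2048): boolean {
  return estimateTokens(text) > limit;
}
```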
---

## Error 5: Cosine Similarity Calculation Errors

### Error Message

```
Error: Similarity values out of range (-1.5 to 1.2)
```

### Why It Happens

- Incorrect formula (using the raw dot product instead of cosine similarity)
- Not normalizing by vector magnitudes
- Division by zero for zero vectors

### Root Cause

Improper implementation of the cosine similarity formula.

### Prevention

```typescript
// ❌ BAD: Just dot product (not cosine similarity)
function badSimilarity(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum; // Wrong! This is unbounded
}

// ✅ GOOD: Proper cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vector dimensions must match');
  }

  let dotProduct = 0;
  let magnitudeA = 0;
  let magnitudeB = 0;

  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    magnitudeA += a[i] * a[i];
    magnitudeB += b[i] * b[i];
  }

  if (magnitudeA === 0 || magnitudeB === 0) {
    return 0; // Handle zero vectors
  }

  return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}
```

**Formula**:

```
cosine_similarity(A, B) = (A · B) / (||A|| × ||B||)
```

Where:
- `A · B` = dot product
- `||A||` = magnitude of vector A = √(a₁² + a₂² + ... + aₙ²)

**Result Range**: Always between -1 and 1
- 1 = identical direction
- 0 = perpendicular
- -1 = opposite direction

**Sources**:
- https://en.wikipedia.org/wiki/Cosine_similarity
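A few known-answer checks validate any implementation of the formula against the result range above. This sketch re-implements cosine similarity compactly so the checks are self-contained:

```typescript
// Compact cosine similarity for sanity-checking: (A · B) / (||A|| × ||B||).
function cosSim(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // Zero vectors have no direction; return 0 rather than divide by zero.
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

// Known answers: identical direction → 1, orthogonal → 0, opposite → -1,
// and scaling a vector ([1,1] vs [2,2]) must not change the score.
```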
---

## Error 6: Incorrect Task Type (Reduces Quality)

### Error Message

No error, but search quality is poor (10-30% worse).

### Why It Happens

- Using `RETRIEVAL_DOCUMENT` for queries
- Using `RETRIEVAL_QUERY` for documents
- Not specifying a task type at all

### Root Cause

Task types optimize embeddings for specific use cases.

### Prevention

```typescript
// ❌ BAD: Wrong task type for RAG
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // ← Wrong! Should be RETRIEVAL_QUERY
});

// ✅ GOOD: Correct task types
// For user queries
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY', outputDimensionality: 768 }
});

// For documents to index
const docEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: documentText,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});
```

**Task Types Cheat Sheet**:

| Task Type | Use For | Example |
|-----------|---------|---------|
| `RETRIEVAL_QUERY` | User queries | "What is RAG?" |
| `RETRIEVAL_DOCUMENT` | Documents to index | Knowledge base articles |
| `SEMANTIC_SIMILARITY` | Comparing texts | Duplicate detection |
| `CLUSTERING` | Grouping texts | Topic modeling |
| `CLASSIFICATION` | Categorizing texts | Spam detection |

**Impact**: Using the correct task type improves search relevance by 10-30%.

**Sources**:
- https://ai.google.dev/gemini-api/docs/embeddings#task-types
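One way to make the query/document distinction hard to get wrong is to centralize it in a tiny helper, so call sites state intent rather than repeating string constants. A sketch (the helper and its names are illustrative, not part of the SDK):

```typescript
type EmbedRole = 'query' | 'document';

// Map the caller's intent to the correct task type, so call sites
// can't accidentally swap RETRIEVAL_QUERY and RETRIEVAL_DOCUMENT.
function embedConfigFor(role: EmbedRole, outputDimensionality = 768) {
  return {
    taskType: role === 'query' ? 'RETRIEVAL_QUERY' : 'RETRIEVAL_DOCUMENT',
    outputDimensionality,
  };
}

// Usage sketch:
//   ai.models.embedContent({
//     model: 'gemini-embedding-001',
//     contents: userQuery,
//     config: embedConfigFor('query')
//   });
```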
---

## Error 7: Vector Storage Precision Loss

### Error Message

```
Warning: Similarity scores inconsistent after storage/retrieval
```

### Why It Happens

- Storing embeddings as integers instead of floats
- Rounding to fewer decimal places
- Using lossy compression

### Root Cause

Embeddings are high-precision floating-point numbers.

### Prevention

```typescript
// ❌ BAD: Rounding to integers
const embedding = response.embeddings[0].values;
const rounded = embedding.map(v => Math.round(v)); // Precision loss!

await db.insert({
  id: '1',
  embedding: rounded // ← Will degrade search quality
});

// ✅ GOOD: Store full precision
const embedding = response.embeddings[0].values; // Keep as-is

await db.insert({
  id: '1',
  embedding: embedding // ← Full float32 precision
});

// For JSON storage, use full precision
const json = JSON.stringify({
  id: '1',
  embedding: embedding // JavaScript numbers are float64
});
```

**Storage Recommendations**:
- **Vectorize**: Handles float32 automatically ✅
- **D1/SQLite**: Use BLOB for binary float32 array
- **KV**: Store as JSON (float64 precision)
- **R2**: Store as binary float32 array
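For the D1/SQLite BLOB recommendation, a round-trip through `Float32Array` shows the storage format. A sketch with standalone helpers, not tied to any particular database driver:

```typescript
// Encode an embedding as a binary float32 buffer for BLOB storage
// (4 bytes per value, little-endian on typical platforms).
function encodeF32(embedding: number[]): Uint8Array {
  return new Uint8Array(Float32Array.from(embedding).buffer);
}

// Decode a BLOB back into a number[] for similarity math.
function decodeF32(blob: Uint8Array): number[] {
  return Array.from(
    new Float32Array(blob.buffer, blob.byteOffset, blob.byteLength / 4)
  );
}
```

The round trip loses only the float64→float32 narrowing (about 7 significant decimal digits survive), which is the same precision Vectorize itself stores.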
---

**Sources**:
- Cloudflare Vectorize: https://developers.cloudflare.com/vectorize/

---

## Error 8: Model Version Confusion

### Error Message

```
Error: Model 'gemini-embedding-exp-03-07' is deprecated
```

### Why It Happens

- Using an experimental or deprecated model
- Mixing embeddings from different model versions
- Not keeping up with model updates

### Root Cause

Gemini has both stable and experimental embedding models.

### Prevention

```typescript
// ❌ BAD: Using experimental/deprecated model
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-exp-03-07', // Deprecated October 2025
  contents: text
});

// ✅ GOOD: Use stable model
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001', // Stable production model
  contents: text,
  config: {
    taskType: 'SEMANTIC_SIMILARITY',
    outputDimensionality: 768
  }
});
```

**Model Status**:

| Model | Status | Recommendation |
|-------|--------|----------------|
| `gemini-embedding-001` | ✅ Stable | Use this |
| `gemini-embedding-exp-03-07` | ❌ Deprecated (Oct 2025) | Migrate to gemini-embedding-001 |

**CRITICAL**: Never mix embeddings from different models. They use different vector spaces and are not comparable.

**Sources**:
- https://ai.google.dev/gemini-api/docs/models/gemini#text-embeddings
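The "never mix models" rule can be enforced mechanically by storing the model name alongside each vector and checking it before any comparison. A hypothetical pattern, not an SDK feature:

```typescript
interface StoredEmbedding {
  model: string;     // e.g. 'gemini-embedding-001'
  values: number[];
}

// Refuse to compare vectors from different embedding models:
// their vector spaces are unrelated, so similarity scores are meaningless.
function assertSameModel(a: StoredEmbedding, b: StoredEmbedding): void {
  if (a.model !== b.model) {
    throw new Error(`Cannot compare embeddings from '${a.model}' and '${b.model}'`);
  }
}
```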
---

## Summary Checklist

Before deploying to production, verify:

- [ ] `outputDimensionality` matches Vectorize index dimensions
- [ ] Batch size ≤ API limits (chunk large datasets)
- [ ] Rate limiting implemented with exponential backoff
- [ ] Long texts are chunked (≤ 2,048 tokens)
- [ ] Cosine similarity formula is correct
- [ ] Correct task types used (RETRIEVAL_QUERY vs RETRIEVAL_DOCUMENT)
- [ ] Embeddings stored with full precision (float32)
- [ ] Using the stable model (`gemini-embedding-001`)

**Following these guidelines prevents the errors documented above.**


---

## Additional Resources

- **Official Docs**: https://ai.google.dev/gemini-api/docs/embeddings
- **Rate Limits**: https://ai.google.dev/gemini-api/docs/rate-limits
- **Vectorize Docs**: https://developers.cloudflare.com/vectorize/
- **Model Specs**: https://ai.google.dev/gemini-api/docs/models/gemini#gemini-embedding-001