Top 8 Embedding Errors (And How to Fix Them)

This document lists the 8 most common errors when working with Gemini embeddings, their root causes, and proven solutions.


Error 1: Dimension Mismatch

Error Message

Error: Vector dimensions do not match. Expected 768, got 3072

Why It Happens

  • Generated embedding with default dimensions (3072) but Vectorize index expects 768
  • Mixed embeddings from different dimension settings

Root Cause

Not specifying outputDimensionality parameter when generating embeddings.

Prevention

// ❌ BAD: No outputDimensionality (defaults to 3072)
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text
});

// ✅ GOOD: Match Vectorize index dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text,
  config: { outputDimensionality: 768 } // ← Match your index
});

Fix

  1. Option A: Regenerate all embeddings with outputDimensionality set to match the existing index (see the sketch below)
  2. Option B: Recreate the Vectorize index to match the dimensions of your existing embeddings (3072 in this example)
# Option B: recreate the index at the embeddings' dimensionality
npx wrangler vectorize create my-index --dimensions 3072 --metric cosine
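
A sketch of Option A, assuming the @google/genai client (ai) used throughout this document and a Cloudflare Worker with a Vectorize binding named env.VECTORIZE; the binding name and document id are illustrative:

// Option A sketch: regenerate the embedding at the index's dimensionality (768)
// and upsert it over the mismatched vector under the same id
const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: documentText,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});

await env.VECTORIZE.upsert([
  { id: 'doc-1', values: response.embeddings[0].values } // 'doc-1' is a placeholder id
]);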



Error 2: Batch Size Limit Exceeded

Error Message

Error: Request contains too many texts. Maximum: 100

Why It Happens

  • Tried to embed more texts than API allows in single request
  • Different limits for single vs batch endpoints

Root Cause

The Gemini API limits how many texts can be embedded in a single request.

Prevention

// ❌ BAD: Trying to embed 500 texts at once
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: largeArray, // 500 texts
  config: { taskType: 'RETRIEVAL_DOCUMENT' }
});

// ✅ GOOD: Chunk into batches
async function batchEmbed(texts: string[], batchSize = 100) {
  const allEmbeddings: number[][] = [];

  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    const response = await ai.models.embedContent({
      model: 'gemini-embedding-001',
      contents: batch,
      config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
    });
    allEmbeddings.push(...response.embeddings.map(e => e.values));

    // Rate limiting delay
    if (i + batchSize < texts.length) {
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
  }

  return allEmbeddings;
}



Error 3: Rate Limiting (429 Too Many Requests)

Error Message

Error: 429 Too Many Requests - Rate limit exceeded

Why It Happens

  • Exceeded 100 requests per minute (free tier)
  • Exceeded tokens per minute limit
  • No exponential backoff implemented

Root Cause

Free tier rate limits: 100 RPM, 30k TPM, 1k RPD

Prevention

// ❌ BAD: No rate limiting
for (const text of texts) {
  await ai.models.embedContent({ /* ... */ }); // Will hit 429 after 100 requests
}

// ✅ GOOD: Exponential backoff
async function embedWithRetry(text: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await ai.models.embedContent({
        model: 'gemini-embedding-001',
        contents: text,
        config: { taskType: 'SEMANTIC_SIMILARITY', outputDimensionality: 768 }
      });
    } catch (error: any) {
      if (error.status === 429 && attempt < maxRetries - 1) {
        const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s
        console.log(`Rate limit hit. Retrying in ${delay / 1000}s...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}

Rate Limits:

| Tier   | RPM   | TPM       | RPD   |
|--------|-------|-----------|-------|
| Free   | 100   | 30,000    | 1,000 |
| Tier 1 | 3,000 | 1,000,000 | -     |
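
The backoff above is reactive. To avoid hitting 429 at all, you can also throttle proactively. A minimal single-process sketch that spaces out calls to fit an RPM budget (100 RPM matching the free-tier row above); parallel Workers would need shared state:

// Proactive throttle: allows at most rpm request starts per minute by spacing calls
function createThrottle(rpm: number) {
  const minIntervalMs = 60_000 / rpm;
  let nextSlot = 0;
  return async function throttle(): Promise<void> {
    const now = Date.now();
    const wait = Math.max(0, nextSlot - now);
    nextSlot = Math.max(now, nextSlot) + minIntervalMs;
    if (wait > 0) await new Promise(resolve => setTimeout(resolve, wait));
  };
}

// Usage: reserve a slot before each call
// const throttle = createThrottle(100); // free-tier RPM from the table above
// await throttle();
// await embedWithRetry(text);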



Error 4: Text Truncation (Input Length Limit)

Error Message

No error! Text is silently truncated at 2,048 tokens.

Why It Happens

  • Input text exceeds 2,048 token limit
  • No warning or error is raised
  • Embeddings represent incomplete text

Root Cause

The Gemini embedding model has a 2,048-token input limit.

Prevention

// ❌ BAD: Long text (silently truncated)
const longText = "...".repeat(10000); // Very long
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: longText // Truncated to ~2,048 tokens
});

// ✅ GOOD: Chunk long texts
function chunkText(text: string, maxTokens = 2000): string[] {
  const words = text.split(/\s+/);
  const chunks: string[] = [];
  let currentChunk: string[] = [];

  for (const word of words) {
    currentChunk.push(word);

    // Rough estimate: 1 token ≈ 0.75 words, so tokens ≈ words / 0.75
    if (currentChunk.length / 0.75 >= maxTokens) {
      chunks.push(currentChunk.join(' '));
      currentChunk = [];
    }
  }

  if (currentChunk.length > 0) {
    chunks.push(currentChunk.join(' '));
  }

  return chunks;
}

const chunks = chunkText(longText, 2000);
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: chunks,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});



Error 5: Cosine Similarity Calculation Errors

Error Message

Error: Similarity values out of range (-1.5 to 1.2)

Why It Happens

  • Incorrect formula (using dot product instead of cosine similarity)
  • Not normalizing magnitudes
  • Division by zero for zero vectors

Root Cause

Improper implementation of cosine similarity formula.

Prevention

// ❌ BAD: Just dot product (not cosine similarity)
function badSimilarity(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum; // Wrong! This is unbounded
}

// ✅ GOOD: Proper cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vector dimensions must match');
  }

  let dotProduct = 0;
  let magnitudeA = 0;
  let magnitudeB = 0;

  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    magnitudeA += a[i] * a[i];
    magnitudeB += b[i] * b[i];
  }

  if (magnitudeA === 0 || magnitudeB === 0) {
    return 0; // Handle zero vectors
  }

  return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}

Formula:

cosine_similarity(A, B) = (A · B) / (||A|| × ||B||)

Where:

  • A · B = dot product
  • ||A|| = magnitude of vector A = √(a₁² + a₂² + ... + aₙ²)

Result Range: Always between -1 and 1

  • 1 = identical direction
  • 0 = perpendicular
  • -1 = opposite direction
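
A related note: for unit-length vectors, cosine similarity reduces to a plain dot product. Truncated Gemini embeddings (e.g. outputDimensionality: 768) are not guaranteed to be unit-normalized, so normalize them first if you want that shortcut. A minimal sketch:

// L2-normalize a vector; afterwards cosineSimilarity(a, b) equals the dot product
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v.slice() : v.map(x => x / norm); // leave zero vectors unchanged
}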



Error 6: Incorrect Task Type (Reduces Quality)

Error Message

No error, but search quality is poor (10-30% worse).

Why It Happens

  • Using RETRIEVAL_DOCUMENT for queries
  • Using RETRIEVAL_QUERY for documents
  • Not specifying task type at all

Root Cause

Task types optimize embeddings for specific use cases.

Prevention

// ❌ BAD: Wrong task type for RAG
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // ← Wrong! Should be RETRIEVAL_QUERY
});

// ✅ GOOD: Correct task types
// For user queries
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY', outputDimensionality: 768 }
});

// For documents to index
const docEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: documentText,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});

Task Types Cheat Sheet:

| Task Type           | Use For            | Example                 |
|---------------------|--------------------|-------------------------|
| RETRIEVAL_QUERY     | User queries       | "What is RAG?"          |
| RETRIEVAL_DOCUMENT  | Documents to index | Knowledge base articles |
| SEMANTIC_SIMILARITY | Comparing texts    | Duplicate detection     |
| CLUSTERING          | Grouping texts     | Topic modeling          |
| CLASSIFICATION      | Categorizing texts | Spam detection          |

Impact: Using the correct task type improves search relevance by 10-30%.



Error 7: Vector Storage Precision Loss

Error Message

Warning: Similarity scores inconsistent after storage/retrieval

Why It Happens

  • Storing embeddings as integers instead of floats
  • Rounding to fewer decimal places
  • Using lossy compression

Root Cause

Embeddings are high-precision floating-point numbers.

Prevention

// ❌ BAD: Rounding to integers
const embedding = response.embeddings[0].values;
const rounded = embedding.map(v => Math.round(v)); // Precision loss!

await db.insert({
  id: '1',
  embedding: rounded // ← Will degrade search quality
});

// ✅ GOOD: Store full precision
const embedding = response.embeddings[0].values; // Keep as-is

await db.insert({
  id: '1',
  embedding: embedding // ← Full float32 precision
});

// For JSON storage, use full precision
const json = JSON.stringify({
  id: '1',
  embedding: embedding // JavaScript numbers are float64
});

Storage Recommendations:

  • Vectorize: Handles float32 automatically
  • D1/SQLite: Use BLOB for a binary float32 array (see the sketch after this list)
  • KV: Store as JSON (float64 precision)
  • R2: Store as binary float32 array
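
For the D1/SQLite and R2 rows, a sketch of packing an embedding into raw float32 bytes and back; the table and column names in the usage comment are hypothetical:

// Pack an embedding as float32 bytes for BLOB storage (half the size of JSON float64)
function toFloat32Blob(embedding: number[]): Uint8Array {
  return new Uint8Array(new Float32Array(embedding).buffer);
}

// Unpack a float32 BLOB back into a number[] for similarity math
function fromFloat32Blob(buffer: ArrayBuffer): number[] {
  return Array.from(new Float32Array(buffer));
}

// Hypothetical D1 usage:
// await env.DB.prepare('INSERT INTO vectors (id, embedding) VALUES (?, ?)')
//   .bind('1', toFloat32Blob(embedding)).run();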



Error 8: Model Version Confusion

Error Message

Error: Model 'gemini-embedding-exp-03-07' is deprecated

Why It Happens

  • Using experimental or deprecated model
  • Mixing embeddings from different model versions
  • Not keeping up with model updates

Root Cause

Gemini has stable and experimental embedding models.

Prevention

// ❌ BAD: Using experimental/deprecated model
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-exp-03-07', // Deprecated October 2025
  contents: text
});

// ✅ GOOD: Use stable model
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001', // Stable production model
  contents: text,
  config: {
    taskType: 'SEMANTIC_SIMILARITY',
    outputDimensionality: 768
  }
});

Model Status:

| Model                      | Status                | Recommendation                  |
|----------------------------|-----------------------|---------------------------------|
| gemini-embedding-001       | Stable                | Use this                        |
| gemini-embedding-exp-03-07 | Deprecated (Oct 2025) | Migrate to gemini-embedding-001 |

CRITICAL: Never mix embeddings from different models. They use different vector spaces and are not comparable.
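
One way to make mixing fail loudly instead of silently: record the producing model alongside every stored vector and check it before comparing. A minimal sketch; the StoredVector shape is an assumption for illustration, not a Vectorize type:

// Tag each vector with its provenance so cross-model comparisons fail loudly
interface StoredVector {
  id: string;
  values: number[];
  model: string; // e.g. 'gemini-embedding-001'
}

function assertComparable(a: StoredVector, b: StoredVector): void {
  if (a.model !== b.model || a.values.length !== b.values.length) {
    throw new Error(
      `Incompatible vectors: ${a.model} (${a.values.length}d) vs ${b.model} (${b.values.length}d)`
    );
  }
}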



Summary Checklist

Before deploying to production, verify:

  • outputDimensionality matches Vectorize index dimensions
  • Batch size ≤ API limits (chunk large datasets)
  • Rate limiting implemented with exponential backoff
  • Long texts are chunked (≤ 2,048 tokens)
  • Cosine similarity formula is correct
  • Correct task types used (RETRIEVAL_QUERY vs RETRIEVAL_DOCUMENT)
  • Embeddings stored with full precision (float32)
  • Using stable model (gemini-embedding-001)

Following this checklist addresses all eight of the errors documented above.
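
Several of these checks can run automatically before every upsert. A minimal preflight sketch; EXPECTED_DIMENSIONS is an assumed constant that must match your own index:

const EXPECTED_DIMENSIONS = 768; // assumption: set to your index's dimensionality

// Preflight check before upserting: catches errors 1 (dimensions) and 7 (precision/NaN) early
function validateEmbedding(values: number[]): void {
  if (values.length !== EXPECTED_DIMENSIONS) {
    throw new Error(`Dimension mismatch: expected ${EXPECTED_DIMENSIONS}, got ${values.length}`);
  }
  if (!values.every(Number.isFinite)) {
    throw new Error('Embedding contains non-finite values (NaN or Infinity)');
  }
}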


Additional Resources