# Cloudflare Vectorize Integration Complete guide for using Gemini embeddings with Cloudflare Vectorize. --- ## Quick Start ### 1. Create Vectorize Index ```bash # Create index with 768 dimensions (recommended for Gemini) npx wrangler vectorize create gemini-embeddings --dimensions 768 --metric cosine # Alternative: 3072 dimensions (Gemini default, more accurate but larger) npx wrangler vectorize create gemini-embeddings-large --dimensions 3072 --metric cosine ``` ### 2. Bind to Worker Add to `wrangler.jsonc`: ```jsonc { "name": "my-rag-worker", "main": "src/index.ts", "compatibility_date": "2025-10-25", "vectorize": { "bindings": [ { "binding": "VECTORIZE", "index_name": "gemini-embeddings" } ] } } ``` ### 3. Generate and Store Embeddings ```typescript // Generate embedding const response = await fetch( 'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent', { method: 'POST', headers: { 'x-goog-api-key': env.GEMINI_API_KEY, 'Content-Type': 'application/json' }, body: JSON.stringify({ content: { parts: [{ text: 'Your document text' }] }, taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 // MUST match index dimensions }) } ); const data = await response.json(); const embedding = data.embedding.values; // Insert into Vectorize await env.VECTORIZE.insert([{ id: 'doc-1', values: embedding, metadata: { text: 'Your document text', source: 'manual' } }]); ``` --- ## Dimension Configuration **CRITICAL**: Embedding dimensions MUST match Vectorize index dimensions. | Gemini Dimensions | Storage (per vector) | Recommended For | |-------------------|---------------------|-----------------| | 768 | 3 KB | Most use cases, cost-effective | | 1536 | 6 KB | Balance accuracy/storage | | 3072 | 12 KB | Maximum accuracy | **Create index to match your embeddings**: ```bash # For 768-dim embeddings npx wrangler vectorize create my-index --dimensions 768 --metric cosine # For 1536-dim embeddings npx wrangler vectorize create my-index --dimensions 1536 --metric cosine # For 3072-dim embeddings (Gemini default) npx wrangler vectorize create my-index --dimensions 3072 --metric cosine ``` --- ## Metric Selection Vectorize supports 3 distance metrics: ### Cosine (Recommended) ```bash npx wrangler vectorize create my-index --dimensions 768 --metric cosine ``` **When to use**: - ✅ Semantic search (most common) - ✅ Document similarity - ✅ RAG systems **Range**: 0 (different) to 1 (identical) ### Euclidean ```bash npx wrangler vectorize create my-index --dimensions 768 --metric euclidean ``` **When to use**: - ✅ Absolute distance matters - ✅ Magnitude is important **Range**: 0 (identical) to ∞ (very different) ### Dot Product ```bash npx wrangler vectorize create my-index --dimensions 768 --metric dot-product ``` **When to use**: - ✅ Pre-normalized vectors - ✅ Performance optimization **Range**: -1 to 1 (for normalized vectors) **Recommendation**: Use **cosine** for Gemini embeddings (most common and intuitive). --- ## Insert Patterns ### Single Insert ```typescript await env.VECTORIZE.insert([{ id: 'doc-1', values: embedding, metadata: { text: 'Document content', timestamp: Date.now(), category: 'documentation' } }]); ``` ### Batch Insert ```typescript const vectors = documents.map((doc, i) => ({ id: `doc-${i}`, values: doc.embedding, metadata: { text: doc.text } })); // Insert up to 100 vectors at once await env.VECTORIZE.insert(vectors); ``` ### Upsert (Update or Insert) ```typescript // Vectorize automatically updates if ID exists await env.VECTORIZE.insert([{ id: 'doc-1', // Existing ID values: newEmbedding, metadata: { text: 'Updated content' } }]); ``` --- ## Query Patterns ### Basic Query ```typescript const results = await env.VECTORIZE.query(queryEmbedding, { topK: 5 }); console.log(results.matches); // [{ id: 'doc-1', score: 0.95 }, ...] ``` ### Query with Metadata ```typescript const results = await env.VECTORIZE.query(queryEmbedding, { topK: 5, returnMetadata: true }); results.matches.forEach(match => { console.log(match.id); // 'doc-1' console.log(match.score); // 0.95 console.log(match.metadata.text); // 'Document content' }); ``` ### Query with Metadata Filtering (Future) ```typescript // Coming soon: Filter by metadata const results = await env.VECTORIZE.query(queryEmbedding, { topK: 5, filter: { category: 'documentation' } }); ``` --- ## Metadata Best Practices ### What to Store ```typescript await env.VECTORIZE.insert([{ id: 'doc-1', values: embedding, metadata: { // ✅ Store these text: 'The actual document content', // For retrieval title: 'Document title', url: 'https://example.com/doc', timestamp: Date.now(), category: 'product', // ❌ Don't store these embedding: embedding, // Already stored as values largeObject: { /* ... */ } // Keep metadata small } }]); ``` ### Metadata Limits - **Max size**: ~1 KB per vector - **Best practice**: Store only what you need for retrieval/display - **For large data**: Store minimal metadata, fetch full data from D1/KV using ID --- ## Complete RAG Example ```typescript interface Env { GEMINI_API_KEY: string; VECTORIZE: VectorizeIndex; } export default { async fetch(request: Request, env: Env): Promise { const url = new URL(request.url); // Ingest: POST /ingest with { text: "..." } if (url.pathname === '/ingest' && request.method === 'POST') { const { text } = await request.json(); // 1. Generate embedding const embeddingRes = await fetch( 'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent', { method: 'POST', headers: { 'x-goog-api-key': env.GEMINI_API_KEY, 'Content-Type': 'application/json' }, body: JSON.stringify({ content: { parts: [{ text }] }, taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }) } ); const embeddingData = await embeddingRes.json(); const embedding = embeddingData.embedding.values; // 2. Store in Vectorize await env.VECTORIZE.insert([{ id: `doc-${Date.now()}`, values: embedding, metadata: { text, timestamp: Date.now() } }]); return new Response(JSON.stringify({ success: true })); } // Query: POST /query with { query: "..." } if (url.pathname === '/query' && request.method === 'POST') { const { query } = await request.json(); // 1. Generate query embedding const embeddingRes = await fetch( 'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent', { method: 'POST', headers: { 'x-goog-api-key': env.GEMINI_API_KEY, 'Content-Type': 'application/json' }, body: JSON.stringify({ content: { parts: [{ text: query }] }, taskType: 'RETRIEVAL_QUERY', outputDimensionality: 768 }) } ); const embeddingData = await embeddingRes.json(); const embedding = embeddingData.embedding.values; // 2. Search Vectorize const results = await env.VECTORIZE.query(embedding, { topK: 5, returnMetadata: true }); return new Response(JSON.stringify({ query, results: results.matches.map(m => ({ id: m.id, score: m.score, text: m.metadata?.text })) })); } return new Response('Not found', { status: 404 }); } }; ``` --- ## Index Management ### List Indexes ```bash npx wrangler vectorize list ``` ### Get Index Info ```bash npx wrangler vectorize get gemini-embeddings ``` ### Delete Index ```bash npx wrangler vectorize delete gemini-embeddings ``` **CRITICAL**: Deleting an index deletes all vectors permanently. --- ## Limitations & Quotas | Feature | Free Plan | Paid Plans | |---------|-----------|------------| | Indexes per account | 100 | 100 | | Vectors per index | 200,000 | 5,000,000+ | | Queries per day | 30,000,000 | Unlimited | | Dimensions | Up to 1536 | Up to 3072 | **Source**: https://developers.cloudflare.com/vectorize/platform/pricing/ --- ## Best Practices ### 1. Choose Dimensions Wisely ```typescript // ✅ 768 dimensions (recommended) // - Good accuracy // - Low storage // - Fast queries // ⚠️ 3072 dimensions (if accuracy is critical) // - Best accuracy // - 4x storage // - Slower queries ``` ### 2. Use Metadata for Context ```typescript await env.VECTORIZE.insert([{ id: 'doc-1', values: embedding, metadata: { text: 'Store the actual text here for retrieval', url: 'https://...', timestamp: Date.now() } }]); ``` ### 3. Implement Caching ```typescript // Cache embeddings in KV const cached = await env.KV.get(`embedding:${textHash}`); if (cached) { return JSON.parse(cached); } const embedding = await generateEmbedding(text); await env.KV.put(`embedding:${textHash}`, JSON.stringify(embedding), { expirationTtl: 86400 // 24 hours }); ``` ### 4. Monitor Usage ```bash # Check index stats npx wrangler vectorize get gemini-embeddings # Shows: # - Total vectors # - Dimensions # - Metric type ``` --- ## Troubleshooting ### Dimension Mismatch Error ``` Error: Vector dimensions do not match. Expected 768, got 3072 ``` **Solution**: Ensure embedding `outputDimensionality` matches index dimensions. ### No Results Found **Possible causes**: 1. Index is empty (no vectors inserted) 2. Query embedding is wrong task type (use RETRIEVAL_QUERY) 3. Similarity threshold too high **Solution**: Check index has vectors, use correct task types. --- ## Official Documentation - **Vectorize Docs**: https://developers.cloudflare.com/vectorize/ - **Pricing**: https://developers.cloudflare.com/vectorize/platform/pricing/ - **Wrangler CLI**: https://developers.cloudflare.com/workers/wrangler/