470 lines
10 KiB
Markdown
470 lines
10 KiB
Markdown
# Cloudflare Vectorize Integration
|
|
|
|
Complete guide for using Gemini embeddings with Cloudflare Vectorize.
|
|
|
|
---
|
|
|
|
## Quick Start
|
|
|
|
### 1. Create Vectorize Index
|
|
|
|
```bash
|
|
# Create index with 768 dimensions (recommended for Gemini)
|
|
npx wrangler vectorize create gemini-embeddings --dimensions 768 --metric cosine
|
|
|
|
# Alternative: 3072 dimensions (Gemini default, more accurate but larger)
|
|
npx wrangler vectorize create gemini-embeddings-large --dimensions 3072 --metric cosine
|
|
```
|
|
|
|
### 2. Bind to Worker
|
|
|
|
Add to `wrangler.jsonc`:
|
|
|
|
```jsonc
|
|
{
|
|
"name": "my-rag-worker",
|
|
"main": "src/index.ts",
|
|
"compatibility_date": "2025-10-25",
|
|
"vectorize": {
|
|
"bindings": [
|
|
{
|
|
"binding": "VECTORIZE",
|
|
"index_name": "gemini-embeddings"
|
|
}
|
|
]
|
|
}
|
|
}
|
|
```
|
|
|
|
### 3. Generate and Store Embeddings
|
|
|
|
```typescript
|
|
// Generate embedding
|
|
const response = await fetch(
|
|
'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent',
|
|
{
|
|
method: 'POST',
|
|
headers: {
|
|
'x-goog-api-key': env.GEMINI_API_KEY,
|
|
'Content-Type': 'application/json'
|
|
},
|
|
body: JSON.stringify({
|
|
content: { parts: [{ text: 'Your document text' }] },
|
|
taskType: 'RETRIEVAL_DOCUMENT',
|
|
outputDimensionality: 768 // MUST match index dimensions
|
|
})
|
|
}
|
|
);
|
|
|
|
const data = await response.json();
|
|
const embedding = data.embedding.values;
|
|
|
|
// Insert into Vectorize
|
|
await env.VECTORIZE.insert([{
|
|
id: 'doc-1',
|
|
values: embedding,
|
|
metadata: { text: 'Your document text', source: 'manual' }
|
|
}]);
|
|
```
|
|
|
|
---
|
|
|
|
## Dimension Configuration
|
|
|
|
**CRITICAL**: Embedding dimensions MUST match Vectorize index dimensions.
|
|
|
|
| Gemini Dimensions | Storage (per vector) | Recommended For |
|
|
|-------------------|---------------------|-----------------|
|
|
| 768 | 3 KB | Most use cases, cost-effective |
|
|
| 1536 | 6 KB | Balance accuracy/storage |
|
|
| 3072 | 12 KB | Maximum accuracy |
|
|
|
|
**Create index to match your embeddings**:
|
|
|
|
```bash
|
|
# For 768-dim embeddings
|
|
npx wrangler vectorize create my-index --dimensions 768 --metric cosine
|
|
|
|
# For 1536-dim embeddings
|
|
npx wrangler vectorize create my-index --dimensions 1536 --metric cosine
|
|
|
|
# For 3072-dim embeddings (Gemini default)
|
|
npx wrangler vectorize create my-index --dimensions 3072 --metric cosine
|
|
```
|
|
|
|
---
|
|
|
|
## Metric Selection
|
|
|
|
Vectorize supports 3 distance metrics:
|
|
|
|
### Cosine (Recommended)
|
|
|
|
```bash
|
|
npx wrangler vectorize create my-index --dimensions 768 --metric cosine
|
|
```
|
|
|
|
**When to use**:
|
|
- ✅ Semantic search (most common)
|
|
- ✅ Document similarity
|
|
- ✅ RAG systems
|
|
|
|
**Range**: 0 (different) to 1 (identical)
|
|
|
|
### Euclidean
|
|
|
|
```bash
|
|
npx wrangler vectorize create my-index --dimensions 768 --metric euclidean
|
|
```
|
|
|
|
**When to use**:
|
|
- ✅ Absolute distance matters
|
|
- ✅ Magnitude is important
|
|
|
|
**Range**: 0 (identical) to ∞ (very different)
|
|
|
|
### Dot Product
|
|
|
|
```bash
|
|
npx wrangler vectorize create my-index --dimensions 768 --metric dot-product
|
|
```
|
|
|
|
**When to use**:
|
|
- ✅ Pre-normalized vectors
|
|
- ✅ Performance optimization
|
|
|
|
**Range**: -1 to 1 (for normalized vectors)
|
|
|
|
**Recommendation**: Use **cosine** for Gemini embeddings (most common and intuitive).
|
|
|
|
---
|
|
|
|
## Insert Patterns
|
|
|
|
### Single Insert
|
|
|
|
```typescript
|
|
await env.VECTORIZE.insert([{
|
|
id: 'doc-1',
|
|
values: embedding,
|
|
metadata: {
|
|
text: 'Document content',
|
|
timestamp: Date.now(),
|
|
category: 'documentation'
|
|
}
|
|
}]);
|
|
```
|
|
|
|
### Batch Insert
|
|
|
|
```typescript
|
|
const vectors = documents.map((doc, i) => ({
|
|
id: `doc-${i}`,
|
|
values: doc.embedding,
|
|
metadata: { text: doc.text }
|
|
}));
|
|
|
|
// Insert up to 100 vectors at once
|
|
await env.VECTORIZE.insert(vectors);
|
|
```
|
|
|
|
### Upsert (Update or Insert)
|
|
|
|
```typescript
|
|
// Vectorize automatically updates if ID exists
|
|
await env.VECTORIZE.insert([{
|
|
id: 'doc-1', // Existing ID
|
|
values: newEmbedding,
|
|
metadata: { text: 'Updated content' }
|
|
}]);
|
|
```
|
|
|
|
---
|
|
|
|
## Query Patterns
|
|
|
|
### Basic Query
|
|
|
|
```typescript
|
|
const results = await env.VECTORIZE.query(queryEmbedding, {
|
|
topK: 5
|
|
});
|
|
|
|
console.log(results.matches);
|
|
// [{ id: 'doc-1', score: 0.95 }, ...]
|
|
```
|
|
|
|
### Query with Metadata
|
|
|
|
```typescript
|
|
const results = await env.VECTORIZE.query(queryEmbedding, {
|
|
topK: 5,
|
|
returnMetadata: true
|
|
});
|
|
|
|
results.matches.forEach(match => {
|
|
console.log(match.id); // 'doc-1'
|
|
console.log(match.score); // 0.95
|
|
console.log(match.metadata.text); // 'Document content'
|
|
});
|
|
```
|
|
|
|
### Query with Metadata Filtering (Future)
|
|
|
|
```typescript
|
|
// Coming soon: Filter by metadata
|
|
const results = await env.VECTORIZE.query(queryEmbedding, {
|
|
topK: 5,
|
|
filter: { category: 'documentation' }
|
|
});
|
|
```
|
|
|
|
---
|
|
|
|
## Metadata Best Practices
|
|
|
|
### What to Store
|
|
|
|
```typescript
|
|
await env.VECTORIZE.insert([{
|
|
id: 'doc-1',
|
|
values: embedding,
|
|
metadata: {
|
|
// ✅ Store these
|
|
text: 'The actual document content', // For retrieval
|
|
title: 'Document title',
|
|
url: 'https://example.com/doc',
|
|
timestamp: Date.now(),
|
|
category: 'product',
|
|
|
|
// ❌ Don't store these
|
|
embedding: embedding, // Already stored as values
|
|
largeObject: { /* ... */ } // Keep metadata small
|
|
}
|
|
}]);
|
|
```
|
|
|
|
### Metadata Limits
|
|
|
|
- **Max size**: ~1 KB per vector
|
|
- **Best practice**: Store only what you need for retrieval/display
|
|
- **For large data**: Store minimal metadata, fetch full data from D1/KV using ID
|
|
|
|
---
|
|
|
|
## Complete RAG Example
|
|
|
|
```typescript
|
|
interface Env {
|
|
GEMINI_API_KEY: string;
|
|
VECTORIZE: VectorizeIndex;
|
|
}
|
|
|
|
export default {
|
|
async fetch(request: Request, env: Env): Promise<Response> {
|
|
const url = new URL(request.url);
|
|
|
|
// Ingest: POST /ingest with { text: "..." }
|
|
if (url.pathname === '/ingest' && request.method === 'POST') {
|
|
const { text } = await request.json();
|
|
|
|
// 1. Generate embedding
|
|
const embeddingRes = await fetch(
|
|
'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent',
|
|
{
|
|
method: 'POST',
|
|
headers: {
|
|
'x-goog-api-key': env.GEMINI_API_KEY,
|
|
'Content-Type': 'application/json'
|
|
},
|
|
body: JSON.stringify({
|
|
content: { parts: [{ text }] },
|
|
taskType: 'RETRIEVAL_DOCUMENT',
|
|
outputDimensionality: 768
|
|
})
|
|
}
|
|
);
|
|
|
|
const embeddingData = await embeddingRes.json();
|
|
const embedding = embeddingData.embedding.values;
|
|
|
|
// 2. Store in Vectorize
|
|
await env.VECTORIZE.insert([{
|
|
id: `doc-${Date.now()}`,
|
|
values: embedding,
|
|
metadata: { text, timestamp: Date.now() }
|
|
}]);
|
|
|
|
return new Response(JSON.stringify({ success: true }));
|
|
}
|
|
|
|
// Query: POST /query with { query: "..." }
|
|
if (url.pathname === '/query' && request.method === 'POST') {
|
|
const { query } = await request.json();
|
|
|
|
// 1. Generate query embedding
|
|
const embeddingRes = await fetch(
|
|
'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent',
|
|
{
|
|
method: 'POST',
|
|
headers: {
|
|
'x-goog-api-key': env.GEMINI_API_KEY,
|
|
'Content-Type': 'application/json'
|
|
},
|
|
body: JSON.stringify({
|
|
content: { parts: [{ text: query }] },
|
|
taskType: 'RETRIEVAL_QUERY',
|
|
outputDimensionality: 768
|
|
})
|
|
}
|
|
);
|
|
|
|
const embeddingData = await embeddingRes.json();
|
|
const embedding = embeddingData.embedding.values;
|
|
|
|
// 2. Search Vectorize
|
|
const results = await env.VECTORIZE.query(embedding, {
|
|
topK: 5,
|
|
returnMetadata: true
|
|
});
|
|
|
|
return new Response(JSON.stringify({
|
|
query,
|
|
results: results.matches.map(m => ({
|
|
id: m.id,
|
|
score: m.score,
|
|
text: m.metadata?.text
|
|
}))
|
|
}));
|
|
}
|
|
|
|
return new Response('Not found', { status: 404 });
|
|
}
|
|
};
|
|
```
|
|
|
|
---
|
|
|
|
## Index Management
|
|
|
|
### List Indexes
|
|
|
|
```bash
|
|
npx wrangler vectorize list
|
|
```
|
|
|
|
### Get Index Info
|
|
|
|
```bash
|
|
npx wrangler vectorize get gemini-embeddings
|
|
```
|
|
|
|
### Delete Index
|
|
|
|
```bash
|
|
npx wrangler vectorize delete gemini-embeddings
|
|
```
|
|
|
|
**CRITICAL**: Deleting an index deletes all vectors permanently.
|
|
|
|
---
|
|
|
|
## Limitations & Quotas
|
|
|
|
| Feature | Free Plan | Paid Plans |
|
|
|---------|-----------|------------|
|
|
| Indexes per account | 100 | 100 |
|
|
| Vectors per index | 200,000 | 5,000,000+ |
|
|
| Queries per day | 30,000,000 | Unlimited |
|
|
| Dimensions | Up to 1536 | Up to 3072 |
|
|
|
|
**Source**: https://developers.cloudflare.com/vectorize/platform/pricing/
|
|
|
|
---
|
|
|
|
## Best Practices
|
|
|
|
### 1. Choose Dimensions Wisely
|
|
|
|
```typescript
|
|
// ✅ 768 dimensions (recommended)
|
|
// - Good accuracy
|
|
// - Low storage
|
|
// - Fast queries
|
|
|
|
// ⚠️ 3072 dimensions (if accuracy is critical)
|
|
// - Best accuracy
|
|
// - 4x storage
|
|
// - Slower queries
|
|
```
|
|
|
|
### 2. Use Metadata for Context
|
|
|
|
```typescript
|
|
await env.VECTORIZE.insert([{
|
|
id: 'doc-1',
|
|
values: embedding,
|
|
metadata: {
|
|
text: 'Store the actual text here for retrieval',
|
|
url: 'https://...',
|
|
timestamp: Date.now()
|
|
}
|
|
}]);
|
|
```
|
|
|
|
### 3. Implement Caching
|
|
|
|
```typescript
|
|
// Cache embeddings in KV
|
|
const cached = await env.KV.get(`embedding:${textHash}`);
|
|
if (cached) {
|
|
return JSON.parse(cached);
|
|
}
|
|
|
|
const embedding = await generateEmbedding(text);
|
|
await env.KV.put(`embedding:${textHash}`, JSON.stringify(embedding), {
|
|
expirationTtl: 86400 // 24 hours
|
|
});
|
|
```
|
|
|
|
### 4. Monitor Usage
|
|
|
|
```bash
|
|
# Check index stats
|
|
npx wrangler vectorize get gemini-embeddings
|
|
|
|
# Shows:
|
|
# - Total vectors
|
|
# - Dimensions
|
|
# - Metric type
|
|
```
|
|
|
|
---
|
|
|
|
## Troubleshooting
|
|
|
|
### Dimension Mismatch Error
|
|
|
|
```
|
|
Error: Vector dimensions do not match. Expected 768, got 3072
|
|
```
|
|
|
|
**Solution**: Ensure embedding `outputDimensionality` matches index dimensions.
|
|
|
|
### No Results Found
|
|
|
|
**Possible causes**:
|
|
1. Index is empty (no vectors inserted)
|
|
2. Query embedding is wrong task type (use RETRIEVAL_QUERY)
|
|
3. Similarity threshold too high
|
|
|
|
**Solution**: Check index has vectors, use correct task types.
|
|
|
|
---
|
|
|
|
## Official Documentation
|
|
|
|
- **Vectorize Docs**: https://developers.cloudflare.com/vectorize/
|
|
- **Pricing**: https://developers.cloudflare.com/vectorize/platform/pricing/
|
|
- **Wrangler CLI**: https://developers.cloudflare.com/workers/wrangler/
|