# Cloudflare Vectorize Integration
Complete guide for using Gemini embeddings with Cloudflare Vectorize.
---
## Quick Start
### 1. Create Vectorize Index
```bash
# Create index with 768 dimensions (recommended for Gemini)
npx wrangler vectorize create gemini-embeddings --dimensions 768 --metric cosine
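# Alternative: 1536 dimensions (more accurate but larger). Gemini's 3072-dim default
# exceeds the 1536-dimension maximum listed on the Vectorize limits page at the time of writing.
npx wrangler vectorize create gemini-embeddings-large --dimensions 1536 --metric cosine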
```
### 2. Bind to Worker
Add to `wrangler.jsonc`:
```jsonc
{
  "name": "my-rag-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-10-25",
  // "vectorize" is an array of bindings
  "vectorize": [
    {
      "binding": "VECTORIZE",
      "index_name": "gemini-embeddings"
    }
  ]
}
```
### 3. Generate and Store Embeddings
```typescript
// Generate embedding
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent',
  {
    method: 'POST',
    headers: {
      'x-goog-api-key': env.GEMINI_API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      content: { parts: [{ text: 'Your document text' }] },
      taskType: 'RETRIEVAL_DOCUMENT',
      outputDimensionality: 768 // MUST match index dimensions
    })
  }
);
// Fail early instead of reading `embedding` from an error payload
if (!response.ok) {
  throw new Error(`Gemini API error: ${response.status}`);
}
const data = (await response.json()) as { embedding: { values: number[] } };
const embedding = data.embedding.values;
// Insert into Vectorize
await env.VECTORIZE.insert([{
  id: 'doc-1',
  values: embedding,
  metadata: { text: 'Your document text', source: 'manual' }
}]);
```
---
## Dimension Configuration
**CRITICAL**: Embedding dimensions MUST match Vectorize index dimensions.
| Gemini Dimensions | Storage (per vector) | Recommended For |
|-------------------|---------------------|-----------------|
| 768 | 3 KB | Most use cases, cost-effective |
| 1536 | 6 KB | Balance accuracy/storage |
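| 3072 | 12 KB | Maximum accuracy (Gemini default); exceeds the 1536-dimension maximum listed on the Vectorize limits page at the time of writing |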
**Create index to match your embeddings**:
```bash
# For 768-dim embeddings
npx wrangler vectorize create my-index --dimensions 768 --metric cosine
# For 1536-dim embeddings
npx wrangler vectorize create my-index --dimensions 1536 --metric cosine
# 3072-dim embeddings (Gemini default) exceed the 1536-dimension maximum listed on the
# Vectorize limits page at the time of writing; request 768 or 1536 via outputDimensionality instead
```
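The per-vector storage figures in the table above are consistent with each dimension being stored as a 32-bit float. That encoding is an assumption here, since the table does not state it. A minimal sketch of the arithmetic, with `estimateStorageBytes` as a hypothetical helper name:

```typescript
// Assumption: each dimension is stored as a 4-byte float32.
const BYTES_PER_FLOAT32 = 4;

// Hypothetical helper: raw vector storage, excluding IDs and metadata.
function estimateStorageBytes(dimensions: number, vectorCount: number): number {
  return dimensions * BYTES_PER_FLOAT32 * vectorCount;
}

// Per-vector size in KiB for the dimensions listed in the table.
for (const dims of [768, 1536, 3072]) {
  const kib = estimateStorageBytes(dims, 1) / 1024;
  console.log(`${dims} dims: ${kib} KiB per vector`);
}
```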
---
## Metric Selection
Vectorize supports 3 distance metrics:
### Cosine (Recommended)
```bash
npx wrangler vectorize create my-index --dimensions 768 --metric cosine
```
**When to use**:
- ✅ Semantic search (most common)
- ✅ Document similarity
- ✅ RAG systems
**Range**: -1 (opposite) to 1 (identical); 0 means the vectors are unrelated (orthogonal)
### Euclidean
```bash
npx wrangler vectorize create my-index --dimensions 768 --metric euclidean
```
**When to use**:
- ✅ Absolute distance matters
- ✅ Magnitude is important
**Range**: 0 (identical) to ∞ (very different)
### Dot Product
```bash
npx wrangler vectorize create my-index --dimensions 768 --metric dot-product
```
**When to use**:
- ✅ Pre-normalized vectors
- ✅ Performance optimization
**Range**: -1 to 1 (for normalized vectors)
**Recommendation**: Use **cosine** for Gemini embeddings (most common and intuitive).
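
Cosine similarity and dot product give the same score once vectors are scaled to unit length, which is why dot product is listed for pre-normalized vectors. Google's Gemini embedding documentation notes that only the full 3072-dimension output is normalized, so reduced 768- or 1536-dimension vectors would need normalizing before a dot-product index is used (worth confirming against the current docs). The helper names below are hypothetical and this is only a sketch of the math:

```typescript
// Scale a vector to unit length (L2 norm of 1).
function normalize(values: number[]): number[] {
  const norm = Math.sqrt(values.reduce((sum, v) => sum + v * v, 0));
  if (norm === 0) return values.slice(); // leave zero vectors unchanged
  return values.map(v => v / norm);
}

// Plain dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

// Cosine similarity is the dot product of the normalized vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  return dot(normalize(a), normalize(b));
}
```
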
---
## Insert Patterns
### Single Insert
```typescript
await env.VECTORIZE.insert([{
  id: 'doc-1',
  values: embedding,
  metadata: {
    text: 'Document content',
    timestamp: Date.now(),
    category: 'documentation'
  }
}]);
```
### Batch Insert
```typescript
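const vectors = documents.map((doc, i) => ({
  id: `doc-${i}`,
  values: doc.embedding,
  metadata: { text: doc.text }
}));
// Insert many vectors in one call; the Vectorize limits page gives the
// maximum batch size (1,000 per call for Workers bindings at the time of writing)
await env.VECTORIZE.insert(vectors);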
```
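When there are more vectors than fit in one call, the array can be split into batches first. A minimal sketch, where `chunk` is a hypothetical helper and the batch size is whatever the Vectorize limits page currently allows:

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage inside a Worker (BATCH_SIZE is an assumed constant, not a Vectorize API):
// for (const batch of chunk(vectors, BATCH_SIZE)) {
//   await env.VECTORIZE.insert(batch);
// }
```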
### Upsert (Update or Insert)
```typescript
// upsert overwrites the vector if the ID already exists;
// insert leaves vectors with existing IDs untouched
await env.VECTORIZE.upsert([{
  id: 'doc-1', // Existing ID
  values: newEmbedding,
  metadata: { text: 'Updated content' }
}]);
```
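Cloudflare's documentation describes `insert` as writing only vectors whose IDs are new, while `upsert` replaces any vector that already has the same ID. The toy in-memory class below is a hypothetical model of that difference for illustration only; it is not how Vectorize is implemented and none of its names are part of the Vectorize API.

```typescript
interface Vec {
  id: string;
  values: number[];
}

// Toy in-memory stand-in for an index; NOT the Vectorize implementation.
class ToyIndex {
  private store = new Map<string, Vec>();

  // insert: only IDs that are not present yet are written.
  insert(vectors: Vec[]): number {
    let written = 0;
    for (const v of vectors) {
      if (!this.store.has(v.id)) {
        this.store.set(v.id, v);
        written++;
      }
    }
    return written;
  }

  // upsert: every vector is written, replacing any existing entry.
  upsert(vectors: Vec[]): number {
    for (const v of vectors) this.store.set(v.id, v);
    return vectors.length;
  }

  get(id: string): Vec | undefined {
    return this.store.get(id);
  }
}

const toy = new ToyIndex();
toy.insert([{ id: 'doc-1', values: [1, 0] }]);
toy.insert([{ id: 'doc-1', values: [0, 1] }]); // skipped: ID already exists
const afterInsert = toy.get('doc-1')?.values;
toy.upsert([{ id: 'doc-1', values: [0, 1] }]); // replaces the stored vector
const afterUpsert = toy.get('doc-1')?.values;
```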
---
## Query Patterns
### Basic Query
```typescript
const results = await env.VECTORIZE.query(queryEmbedding, {
  topK: 5
});
console.log(results.matches);
// [{ id: 'doc-1', score: 0.95 }, ...]
```
### Query with Metadata
```typescript
const results = await env.VECTORIZE.query(queryEmbedding, {
  topK: 5,
  returnMetadata: true
});
results.matches.forEach(match => {
  console.log(match.id);             // 'doc-1'
  console.log(match.score);          // 0.95
  console.log(match.metadata?.text); // 'Document content' (metadata may be absent)
});
```
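A query always returns up to `topK` matches, however weak, so applications often drop low-scoring ones afterwards. A minimal sketch of that post-filtering step, assuming a cosine index where higher scores mean closer matches; `filterByScore` and the `ScoredMatch` shape are hypothetical names that mirror the fields used above:

```typescript
// Shape mirroring the match fields used in the query examples above.
interface ScoredMatch {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Keep only matches whose score is at least `minScore`.
function filterByScore<T extends ScoredMatch>(matches: T[], minScore: number): T[] {
  return matches.filter(m => m.score >= minScore);
}

// Usage with a query result (0.7 is an arbitrary example threshold):
// const relevant = filterByScore(results.matches, 0.7);
```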
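### Query with Metadata Filtering
```typescript
// Filter by metadata. The filtered property needs a metadata index that was
// created before the vectors were inserted, for example:
//   npx wrangler vectorize create-metadata-index gemini-embeddings --property-name=category --type=string
const results = await env.VECTORIZE.query(queryEmbedding, {
  topK: 5,
  filter: { category: 'documentation' }
});
```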
---
## Metadata Best Practices
### What to Store
```typescript
await env.VECTORIZE.insert([{
  id: 'doc-1',
  values: embedding,
  metadata: {
    // ✅ Store these
    text: 'The actual document content', // For retrieval
    title: 'Document title',
    url: 'https://example.com/doc',
    timestamp: Date.now(),
    category: 'product',
    // ❌ Don't store these
    embedding: embedding,      // Already stored as values
    largeObject: { /* ... */ } // Keep metadata small
  }
}]);
```
### Metadata Limits
- **Max size**: 10 KiB per vector at the time of writing (see the Vectorize limits page for the current value)
- **Best practice**: Store only what you need for retrieval/display
- **For large data**: Store minimal metadata, fetch full data from D1/KV using ID
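
One way to stay under the metadata limit is to measure the serialized metadata and shorten the `text` field until it fits. The sketch below approximates size as UTF-8 JSON bytes, which is an assumption about how the limit is measured; `metadataBytes` and `fitText` are hypothetical helpers.

```typescript
const encoder = new TextEncoder();

// Approximate size: bytes of the metadata serialized as UTF-8 JSON.
function metadataBytes(metadata: Record<string, unknown>): number {
  return encoder.encode(JSON.stringify(metadata)).length;
}

// Shorten `text` by ~10% per step until the whole object fits in `maxBytes`.
// If the other fields alone exceed the budget, `text` ends up empty.
function fitText<T extends { text: string }>(metadata: T, maxBytes: number): T {
  let text = metadata.text;
  while (text.length > 0 && metadataBytes({ ...metadata, text }) > maxBytes) {
    text = text.slice(0, Math.floor(text.length * 0.9));
  }
  return { ...metadata, text };
}

// Usage: const metadata = fitText({ text: longDocument, url }, MAX_METADATA_BYTES);
```
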
---
## Complete RAG Example
```typescript
interface Env {
  GEMINI_API_KEY: string;
  VECTORIZE: VectorizeIndex;
}

const EMBED_URL =
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent';

// Generate a 768-dim embedding; the task type differs for documents and queries
async function embed(
  text: string,
  taskType: 'RETRIEVAL_DOCUMENT' | 'RETRIEVAL_QUERY',
  env: Env
): Promise<number[]> {
  const res = await fetch(EMBED_URL, {
    method: 'POST',
    headers: {
      'x-goog-api-key': env.GEMINI_API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      content: { parts: [{ text }] },
      taskType,
      outputDimensionality: 768 // MUST match index dimensions
    })
  });
  if (!res.ok) {
    throw new Error(`Gemini API error: ${res.status}`);
  }
  const data = (await res.json()) as { embedding: { values: number[] } };
  return data.embedding.values;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);

    // Ingest: POST /ingest with { text: "..." }
    if (url.pathname === '/ingest' && request.method === 'POST') {
      const { text } = (await request.json()) as { text: string };
      // 1. Generate embedding
      const embedding = await embed(text, 'RETRIEVAL_DOCUMENT', env);
      // 2. Store in Vectorize
      await env.VECTORIZE.insert([{
        id: `doc-${crypto.randomUUID()}`, // unique even for concurrent requests
        values: embedding,
        metadata: { text, timestamp: Date.now() }
      }]);
      return Response.json({ success: true });
    }

    // Query: POST /query with { query: "..." }
    if (url.pathname === '/query' && request.method === 'POST') {
      const { query } = (await request.json()) as { query: string };
      // 1. Generate query embedding
      const embedding = await embed(query, 'RETRIEVAL_QUERY', env);
      // 2. Search Vectorize
      const results = await env.VECTORIZE.query(embedding, {
        topK: 5,
        returnMetadata: true
      });
      return Response.json({
        query,
        results: results.matches.map(m => ({
          id: m.id,
          score: m.score,
          text: m.metadata?.text
        }))
      });
    }

    return new Response('Not found', { status: 404 });
  }
};
```
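The example above stops at retrieval. If the matches are later passed to a language model, they usually need to be assembled into a single context string first. The sketch below shows one possible way, ordering by score and stopping at a character budget; `buildContext`, `RetrievedMatch`, and the budget are hypothetical and not part of the original example.

```typescript
// Shape of the items returned by the /query endpoint above.
interface RetrievedMatch {
  id: string;
  score: number;
  text?: string;
}

// Join the highest-scoring texts into one string of at most `maxChars` characters.
function buildContext(matches: RetrievedMatch[], maxChars: number): string {
  const parts: string[] = [];
  let used = 0;
  const byScore = [...matches].sort((a, b) => b.score - a.score);
  for (const m of byScore) {
    if (!m.text) continue; // skip matches without stored text
    const entry = `[${m.id}] ${m.text}`;
    const separator = parts.length > 0 ? 2 : 0; // length of '\n\n'
    if (used + separator + entry.length > maxChars) break;
    parts.push(entry);
    used += separator + entry.length;
  }
  return parts.join('\n\n');
}
```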
---
## Index Management
### List Indexes
```bash
npx wrangler vectorize list
```
### Get Index Info
```bash
npx wrangler vectorize get gemini-embeddings
```
### Delete Index
```bash
npx wrangler vectorize delete gemini-embeddings
```
**CRITICAL**: Deleting an index deletes all vectors permanently.
---
## Limitations & Quotas
| Feature | Free Plan | Paid Plans |
|---------|-----------|------------|
| Indexes per account | 100 | 100 |
| Vectors per index | 200,000 | 5,000,000+ |
| Queries per day | 30,000,000 | Unlimited |
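| Dimensions | Up to 1536 | Up to 1536 |

These figures are a snapshot and change over time. The Vectorize limits page lists 1536 as the maximum dimensions on every plan at the time of writing; verify the other values there before relying on them.

**Source**: https://developers.cloudflare.com/vectorize/platform/pricing/ and https://developers.cloudflare.com/vectorize/platform/limits/
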
---
## Best Practices
### 1. Choose Dimensions Wisely
```typescript
// ✅ 768 dimensions (recommended)
// - Good accuracy
// - Low storage
// - Fast queries
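// ⚠️ 1536 dimensions (if accuracy is critical)
// - Higher accuracy
// - 2x storage
// - Slower queries
// Gemini's 3072-dim default exceeds the 1536-dimension maximum listed
// on the Vectorize limits page at the time of writing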
```
### 2. Use Metadata for Context
```typescript
await env.VECTORIZE.insert([{
  id: 'doc-1',
  values: embedding,
  metadata: {
    text: 'Store the actual text here for retrieval',
    url: 'https://...',
    timestamp: Date.now()
  }
}]);
```
### 3. Implement Caching
```typescript
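// Cache embeddings in KV, keyed by a SHA-256 hash of the text.
// Assumes a KV namespace bound as `KV` and a generateEmbedding(text)
// helper that wraps the Gemini embedContent call.
async function getCachedEmbedding(text: string, env: Env): Promise<number[]> {
  const digest = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(text));
  const textHash = [...new Uint8Array(digest)]
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
  const cached = await env.KV.get(`embedding:${textHash}`);
  if (cached) {
    return JSON.parse(cached);
  }
  const embedding = await generateEmbedding(text);
  await env.KV.put(`embedding:${textHash}`, JSON.stringify(embedding), {
    expirationTtl: 86400 // 24 hours
  });
  return embedding;
}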
```
### 4. Monitor Usage
```bash
# Check index configuration (dimensions, metric type)
npx wrangler vectorize get gemini-embeddings
# Check the total vector count and processing status
npx wrangler vectorize info gemini-embeddings
```
---
## Troubleshooting
### Dimension Mismatch Error
```
Error: Vector dimensions do not match. Expected 768, got 3072
```
**Solution**: Ensure embedding `outputDimensionality` matches index dimensions.
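
A cheap guard before every insert or query makes this mismatch fail with a clearer message on the application side. A minimal sketch, where `INDEX_DIMENSIONS` and `assertDimensions` are hypothetical names and 768 is assumed to be the dimension count the index was created with:

```typescript
// Assumed to match the --dimensions value used when the index was created.
const INDEX_DIMENSIONS = 768;

// Throw if the embedding length differs from the expected dimension count.
function assertDimensions(values: number[], expected: number = INDEX_DIMENSIONS): number[] {
  if (values.length !== expected) {
    throw new Error(
      `Embedding has ${values.length} dimensions, index expects ${expected}`
    );
  }
  return values;
}

// Usage: await env.VECTORIZE.insert([{ id, values: assertDimensions(embedding) }]);
```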
### No Results Found
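**Possible causes**:
1. The index is empty (no vectors inserted)
2. The query was embedded with the wrong task type (queries need `RETRIEVAL_QUERY`, documents need `RETRIEVAL_DOCUMENT`)
3. Your application's similarity threshold is too high
4. Recently inserted vectors have not finished processing yet (writes are applied asynchronously)

**Solution**: Confirm the index contains vectors, use the correct task types, and retry a short while after inserting.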
---
## Official Documentation
- **Vectorize Docs**: https://developers.cloudflare.com/vectorize/
- **Pricing**: https://developers.cloudflare.com/vectorize/platform/pricing/
- **Wrangler CLI**: https://developers.cloudflare.com/workers/wrangler/