zhongwei/gh-jezweb-claude-skills-skills-openai-assistants

Zhongwei Li 0c577730d5 Initial commit

2025-11-30 08:25:15 +08:00

File Search & RAG Guide

Complete guide to implementing Retrieval-Augmented Generation (RAG) with the Assistants API.

What is File Search?

A built-in tool for semantic search over documents using vector stores:

Capacity: Up to 10,000 files per assistant (vs 20 in v1)
Technology: Vector + keyword search with reranking
Automatic: Chunking, embedding, and indexing handled by OpenAI
Pricing: $0.10/GB/day (first 1GB free)

Architecture

Documents (PDF, DOCX, MD, etc.)
    ↓
Vector Store (chunking + embeddings)
    ↓
Assistant with file_search tool
    ↓
Semantic Search + Reranking
    ↓
Retrieved Context + LLM Generation

Quick Setup

1. Create Vector Store

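import OpenAI from "openai";

// The client reads OPENAI_API_KEY from the environment by default
const openai = new OpenAI();

// Expire the store after 30 days without activity to limit storage costs
const vectorStore = await openai.beta.vectorStores.create({
  name: "Product Documentation",
  expires_after: {
    anchor: "last_active_at",
    days: 30,
  },
});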

2. Upload Documents

import fs from "node:fs";

// Upload each document to the Files API with purpose "assistants"
const files = await Promise.all([
  openai.files.create({ file: fs.createReadStream("doc1.pdf"), purpose: "assistants" }),
  openai.files.create({ file: fs.createReadStream("doc2.md"), purpose: "assistants" }),
]);

// Attach all uploaded files to the vector store as a single batch
const batch = await openai.beta.vectorStores.fileBatches.create(vectorStore.id, {
  file_ids: files.map(f => f.id),
});

3. Wait for Indexing

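// `batch` is the file batch created in step 2; a separate variable avoids redeclaring it
let current = await openai.beta.vectorStores.fileBatches.retrieve(vectorStore.id, batch.id);

// Poll every 2 seconds until the batch leaves the in_progress state
while (current.status === 'in_progress') {
  await new Promise(r => setTimeout(r, 2000));
  current = await openai.beta.vectorStores.fileBatches.retrieve(vectorStore.id, batch.id);
}

// Anything other than "completed" means indexing did not finish cleanly
if (current.status !== 'completed') {
  throw new Error(`File batch ended with status: ${current.status}`);
}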
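
A batch can finish with some files failed, so it may help to inspect the result rather than only waiting for `in_progress` to end. The sketch below is a hypothetical helper (`summarizeBatch` is not part of the SDK) and assumes the batch object exposes `status` and `file_counts` fields with `failed` and `total` counters.

```javascript
// Hypothetical helper: interprets a file batch object returned by fileBatches.retrieve.
// Assumes the batch carries `status` and `file_counts` ({ failed, total, ... }).
function summarizeBatch(batch) {
  const counts = batch.file_counts ?? {};
  const failed = counts.failed ?? 0;
  const total = counts.total ?? 0;
  const done = batch.status !== "in_progress";            // any terminal status
  const ok = batch.status === "completed" && failed === 0; // every file indexed
  return { done, ok, failed, total };
}
```

A caller could keep polling while `done` is false and log or retry the failed files when `done` is true but `ok` is false.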

4. Create Assistant

const assistant = await openai.beta.assistants.create({
  name: "Knowledge Base Assistant",
  instructions: "Answer questions using the file search tool. Always cite your sources.",
  tools: [{ type: "file_search" }],
  tool_resources: {
    file_search: {
      vector_store_ids: [vectorStore.id],
    },
  },
  model: "gpt-4o",
});

Supported File Formats

.pdf - PDFs (most common)
.docx - Word documents
.md, .txt - Plain text
.html - HTML documents
.json - JSON data
.py, .js, .ts, .cpp, .java - Code files

Size Limits:

Per file: 512 MB
Total per vector store: Limited by pricing ($0.10/GB/day)
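
Checking files locally before upload can catch unsupported formats and oversized files early. The following is a minimal sketch: `validateForUpload` is a hypothetical helper, the extension set mirrors only the formats listed above (the service may accept more), and the size cap is the 512 MB per-file limit stated above.

```javascript
// Extensions copied from the list above; this is not an exhaustive list of what the API accepts.
const SUPPORTED_EXTENSIONS = new Set([
  ".pdf", ".docx", ".md", ".txt", ".html", ".json",
  ".py", ".js", ".ts", ".cpp", ".java",
]);
const MAX_FILE_BYTES = 512 * 1024 * 1024; // 512 MB per-file limit

// Hypothetical helper: returns a list of problems, empty when the file looks uploadable.
function validateForUpload(fileName, sizeBytes) {
  const problems = [];
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.has(ext)) {
    problems.push(`unsupported extension: ${ext || "(none)"}`);
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    problems.push(`file exceeds 512 MB: ${sizeBytes} bytes`);
  }
  return problems;
}
```

In a Node script, `sizeBytes` could come from `fs.statSync(path).size` before calling `openai.files.create`.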

Chunking Strategy

OpenAI automatically chunks documents using:

Max chunk size: 800 tokens by default (adjustable through the chunking_strategy option when adding files)
Overlap: 400 tokens by default, so context carries across chunk boundaries
Hierarchy: Preserves document structure (headers, sections)
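
To make the interaction between chunk size and overlap concrete, here is a rough illustration of sliding-window chunking. It is not OpenAI's implementation: it treats each whitespace-separated word as one token, and the function name and the 800/400 defaults are assumptions chosen for the example.

```javascript
// Illustration only: each whitespace-separated word stands in for one token.
// Real tokenization differs, and this is not how OpenAI chunks files internally.
function chunkWithOverlap(text, maxTokens = 800, overlapTokens = 400) {
  if (overlapTokens >= maxTokens) {
    throw new Error("overlapTokens must be smaller than maxTokens");
  }
  const tokens = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = maxTokens - overlapTokens; // how far the window advances each time
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens).join(" "));
    if (start + maxTokens >= tokens.length) break; // this window reached the end
  }
  return chunks;
}
```

With `maxTokens = 4` and `overlapTokens = 2`, each chunk repeats the last two words of the previous one, which is why a sentence split across a boundary can still be retrieved whole.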

Optimize for Better Results

Document Structure:

# Main Topic

## Subtopic 1
Content here...

## Subtopic 2
Content here...

Clear Sections: Use headers to organize content
Concise Paragraphs: Avoid very long paragraphs (500+ words)
Self-Contained: Each section should make sense independently
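
The paragraph-length guideline can be checked automatically before upload. This is a small sketch, assuming paragraphs are separated by blank lines; `findLongParagraphs` is a hypothetical helper and the 500-word threshold comes from the guideline above.

```javascript
// Hypothetical pre-upload check: reports paragraphs longer than `maxWords`.
function findLongParagraphs(markdown, maxWords = 500) {
  return markdown
    .split(/\n\s*\n/) // blank lines separate paragraphs
    .map((paragraph, index) => ({
      index, // zero-based position of the paragraph in the document
      words: paragraph.split(/\s+/).filter(Boolean).length,
    }))
    .filter((entry) => entry.words > maxWords);
}
```

An empty result means no paragraph exceeds the threshold; otherwise each entry points at a paragraph worth splitting under its own header.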

Improving Search Quality

1. Better Instructions

const assistant = await openai.beta.assistants.create({
  instructions: `You are a support assistant. When answering:
1. Use file_search to find relevant information
2. Synthesize information from multiple sources
3. Always provide citations with file names
4. If information isn't found, say so clearly
5. Don't make up information not in the documents`,
  tools: [{ type: "file_search" }],
  // ...
});

2. Query Refinement

Encourage users to be specific:

❌ "How do I install?"
✅ "How do I install the product on Windows 10?"

3. Multi-Document Answers

File Search automatically retrieves from multiple documents and combines information.

Citations

Accessing Citations

const messages = await openai.beta.threads.messages.list(thread.id);
const response = messages.data[0];

for (const content of response.content) {
  if (content.type === 'text') {
    console.log('Answer:', content.text.value);

    // Citations
    if (content.text.annotations) {
      for (const annotation of content.text.annotations) {
        if (annotation.type === 'file_citation') {
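          console.log('Source:', annotation.file_citation.file_id);
          // annotation.text is the citation marker embedded in the answer;
          // the `quote` field from v1 may be absent in v2 responses
          console.log('Marker:', annotation.text);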
        }
      }
    }
  }
}

Displaying Citations

let answer = response.content[0].text.value;

// Replace each citation marker with a markdown-style reference to its source file
for (const annotation of response.content[0].text.annotations) {
  if (annotation.type === 'file_citation') {
    const citation = `[${annotation.text}](source: ${annotation.file_citation.file_id})`;
    answer = answer.replace(annotation.text, citation);
  }
}

console.log(answer);
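
File IDs are not very readable for end users, so one option is to turn the markers into numbered footnotes. The sketch below assumes the annotation shape used above (`type`, `text`, `file_citation.file_id`); `renderWithFootnotes` and the `fileNames` lookup are hypothetical, and the caller is expected to build that lookup (for example from the upload step).

```javascript
// Hypothetical helper: rewrites citation markers as numbered footnotes.
// `fileNames` maps file IDs to display names and is supplied by the caller.
function renderWithFootnotes(text, annotations, fileNames = {}) {
  let answer = text;
  const footnotes = []; // file IDs in order of first appearance
  for (const annotation of annotations) {
    if (annotation.type !== "file_citation") continue;
    const fileId = annotation.file_citation.file_id;
    let number = footnotes.indexOf(fileId) + 1;
    if (number === 0) {
      footnotes.push(fileId);
      number = footnotes.length;
    }
    // split/join replaces every occurrence of the marker, not just the first
    answer = answer.split(annotation.text).join(`[${number}]`);
  }
  const sources = footnotes.map((id, i) => `[${i + 1}] ${fileNames[id] ?? id}`);
  return sources.length ? `${answer}\n\n${sources.join("\n")}` : answer;
}
```

Citations that point at the same file share one footnote number, and any file missing from `fileNames` falls back to its raw ID.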

Cost Management

Pricing Structure

Storage: $0.10/GB/day
Free tier: First 1GB
Example: 5GB stored = 4GB billable after the free tier = $0.40/day ≈ $12/month (30 days)
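
The arithmetic behind that example can be wrapped in a small estimator. This is a sketch using only the prices quoted above; `estimateStorageCost` is a hypothetical name and the 30-day month is an assumption for rough budgeting.

```javascript
const PRICE_PER_GB_DAY = 0.10; // storage price quoted above
const FREE_GB = 1;             // free tier quoted above

// Hypothetical helper; the 30-day month is an assumption for rough budgeting.
function estimateStorageCost(totalGB, daysPerMonth = 30) {
  const billableGB = Math.max(0, totalGB - FREE_GB); // nothing is billed inside the free tier
  const perDay = billableGB * PRICE_PER_GB_DAY;
  return { billableGB, perDay, perMonth: perDay * daysPerMonth };
}
```

For 5 GB this gives 4 billable GB, about $0.40 per day and about $12 per month; anything at or below 1 GB comes out as zero.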

Optimization Strategies

Auto-Expiration:

const vectorStore = await openai.beta.vectorStores.create({
  expires_after: {
    anchor: "last_active_at",
    days: 7, // Delete after 7 days of inactivity
  },
});

Cleanup Old Stores:

async function cleanupOldVectorStores() {
  const stores = await openai.beta.vectorStores.list({ limit: 100 });

  for (const store of stores.data) {
    const ageDays = (Date.now() / 1000 - store.created_at) / (60 * 60 * 24);

    if (ageDays > 30) {
      await openai.beta.vectorStores.del(store.id);
    }
  }
}
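
Deleting by creation date can remove a store that is old but still in use. A possible refinement is to select by inactivity instead. The sketch below assumes each store object carries `created_at` and a possibly null `last_active_at`, both in Unix seconds; `selectIdleStores` is a hypothetical helper that only picks candidates and leaves the actual deletion to the caller.

```javascript
// Hypothetical helper: picks stores idle for more than `maxIdleDays`.
// Assumes `created_at` and (possibly null) `last_active_at` are Unix timestamps in seconds.
function selectIdleStores(stores, maxIdleDays, nowSeconds = Date.now() / 1000) {
  const cutoff = nowSeconds - maxIdleDays * 24 * 60 * 60;
  // Fall back to the creation time when a store has never been used
  return stores.filter((store) => (store.last_active_at ?? store.created_at) < cutoff);
}
```

The returned stores could then be passed one by one to the `del` call shown above.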

Monitor Usage:

const store = await openai.beta.vectorStores.retrieve(vectorStoreId);
const sizeGB = store.usage_bytes / (1024 * 1024 * 1024);
const costPerDay = Math.max(0, (sizeGB - 1) * 0.10);
console.log(`Daily cost: $${costPerDay.toFixed(4)}`);

Advanced Patterns

Pattern: Multi-Tenant Knowledge Bases

// Separate vector store per tenant
const tenantStore = await openai.beta.vectorStores.create({
  name: `Tenant ${tenantId} KB`,
  metadata: { tenant_id: tenantId },
});

// Or: Single store with namespace simulation via file metadata
await openai.files.create({
  file: fs.createReadStream("doc.pdf"),
  purpose: "assistants",
  // metadata: { tenant_id: tenantId }, // Not accepted by the Files API at the time of writing; shown as a possible future option
});

Pattern: Versioned Documentation

// Version 1.0
const v1Store = await openai.beta.vectorStores.create({
  name: "Docs v1.0",
  metadata: { version: "1.0" },
});

// Version 2.0
const v2Store = await openai.beta.vectorStores.create({
  name: "Docs v2.0",
  metadata: { version: "2.0" },
});

// Switch based on user preference
const storeId = userVersion === "1.0" ? v1Store.id : v2Store.id;
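
With more than two versions, a ternary stops scaling, and a lookup keyed on the `metadata.version` value is one alternative. This is a sketch only: `pickStoreForVersion` is a hypothetical helper, and it assumes the caller already has the list of store objects with the metadata set as above.

```javascript
// Hypothetical helper: resolves a vector store ID from the `version` metadata,
// falling back to another version when the requested one is missing.
function pickStoreForVersion(stores, version, fallbackVersion) {
  const byVersion = new Map(stores.map((store) => [store.metadata?.version, store.id]));
  return byVersion.get(version) ?? byVersion.get(fallbackVersion) ?? null;
}
```

Returning `null` when neither version exists lets the caller decide whether to fail or to run the assistant without file search.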

Pattern: Hybrid Search (File Search + Code Interpreter)

const assistant = await openai.beta.assistants.create({
  model: "gpt-4o", // model is required when creating an assistant
  tools: [
    { type: "file_search" },
    { type: "code_interpreter" },
  ],
  tool_resources: {
    file_search: {
      vector_store_ids: [docsVectorStoreId],
    },
  },
});

// Assistant can search docs AND analyze attached data files
await openai.beta.threads.messages.create(thread.id, {
  role: "user", // role is required when creating a message
  content: "Compare this sales data against the targets in our planning docs",
  attachments: [{
    file_id: salesDataFileId,
    tools: [{ type: "code_interpreter" }],
  }],
});

Troubleshooting

No Results Found

Causes:

Vector store not fully indexed
Poor query formulation
Documents lack relevant content

Solutions:

Wait for status: "completed"
Refine query to be more specific
Check document quality and structure

Irrelevant Results

Causes:

Poor document structure
Too much noise in documents
Vague queries

Solutions:

Add clear section headers
Remove boilerplate/repetitive content
Improve query specificity

High Costs

Causes:

Too many vector stores
Large files that don't expire
Duplicate content

Solutions:

Set auto-expiration
Deduplicate documents
Delete unused stores
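
Deduplication can happen before upload so that identical content is never stored twice. The following sketch catches exact duplicates only, after normalizing whitespace and case; `dedupeDocuments` and the `{ name, text }` document shape are assumptions for the example, and near-duplicates would need a fuzzier comparison.

```javascript
// Hypothetical helper: drops documents whose text matches an earlier one
// after whitespace and case normalization. Exact matches only.
function dedupeDocuments(documents) {
  const seen = new Set();
  const unique = [];
  for (const doc of documents) {
    const key = doc.text.replace(/\s+/g, " ").trim().toLowerCase();
    if (seen.has(key)) continue; // same content already kept under another name
    seen.add(key);
    unique.push(doc);
  }
  return unique;
}
```

The first copy of each document is kept, so ordering the input by preference (for example newest first) controls which file survives.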

Best Practices

Structure documents with clear headers and sections
Wait for indexing before using vector store
Set auto-expiration to manage costs
Monitor storage regularly
Provide citations in responses
Refine queries for better results
Clean up old vector stores

Last Updated: 2025-10-25
