---
name: langchain4j-rag-implementation-patterns
description: Implement Retrieval-Augmented Generation (RAG) systems with LangChain4j. Build document ingestion pipelines, embedding stores, vector search strategies, and knowledge-enhanced AI applications. Use when creating question-answering systems over document collections or AI assistants with external knowledge bases.
allowed-tools: Read, Write, Bash
category: ai-development
tags: [langchain4j, rag, retrieval-augmented-generation, embedding, vector-search, document-ingestion, java]
version: 1.1.0
---

# LangChain4j RAG Implementation Patterns

## When to Use This Skill

Use this skill when:
- Building knowledge-based AI applications requiring external document access
- Implementing question-answering systems over large document collections
- Creating AI assistants with access to company knowledge bases
- Building semantic search capabilities for document repositories
- Implementing chat systems that reference specific information sources
- Creating AI applications requiring source attribution
- Building domain-specific AI systems with curated knowledge
- Implementing hybrid search combining vector similarity with traditional search
- Creating AI applications requiring real-time document updates
- Building multi-modal RAG systems with text, images, and other content types

## Overview

Implement complete Retrieval-Augmented Generation (RAG) systems with LangChain4j. RAG enhances language models by providing relevant context from external knowledge sources, improving accuracy and reducing hallucinations.

## Instructions

### Initialize RAG Project

Create a new Spring Boot project with required dependencies:

**pom.xml**:
```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
    <version>1.8.0</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>1.8.0</version>
</dependency>
```

### Setup Document Ingestion

Configure document loading and processing:

```java
@Configuration
public class RAGConfiguration {

    @Bean
    public EmbeddingModel embeddingModel() {
        return OpenAiEmbeddingModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .modelName("text-embedding-3-small")
                .build();
    }

    @Bean
    public EmbeddingStore<TextSegment> embeddingStore() {
        return new InMemoryEmbeddingStore<>();
    }
}
```

Create a document ingestion service:

```java
@Service
@RequiredArgsConstructor
public class DocumentIngestionService {

    private final EmbeddingModel embeddingModel;
    private final EmbeddingStore<TextSegment> embeddingStore;

    public void ingestDocument(String filePath, Map<String, Object> metadata) {
        Document document = FileSystemDocumentLoader.loadDocument(filePath);
        document.metadata().putAll(metadata);

        DocumentSplitter splitter = DocumentSplitters.recursive(
                500, 50, new OpenAiTokenCountEstimator("text-embedding-3-small")
        );

        List<TextSegment> segments = splitter.split(document);
        List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
        embeddingStore.addAll(embeddings, segments);
    }
}
```

### Configure Content Retrieval

Set up content retrieval with filtering:

```java
@Configuration
public class ContentRetrieverConfiguration {

    @Bean
    public ContentRetriever contentRetriever(
            EmbeddingStore<TextSegment> embeddingStore,
            EmbeddingModel embeddingModel) {

        return EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(5)
                .minScore(0.7)
                .build();
    }
}
```

### Create RAG-Enabled AI Service

Define an AI service with context retrieval:

```java
interface KnowledgeAssistant {
    @SystemMessage("""
        You are a knowledgeable assistant with access to a comprehensive knowledge base.

        When answering questions:
        1. Use the provided context from the knowledge base
        2. If information is not in the context, clearly state this
        3. Provide accurate, helpful responses
        4. When possible, reference specific sources
        5. If the context is insufficient, ask for clarification
        """)
    String answerQuestion(String question);
}

@Service
public class KnowledgeService {

    private final KnowledgeAssistant assistant;

    // Explicit constructor builds the assistant, so no Lombok-generated constructor is needed
    public KnowledgeService(ChatModel chatModel, ContentRetriever contentRetriever) {
        this.assistant = AiServices.builder(KnowledgeAssistant.class)
                .chatModel(chatModel)
                .contentRetriever(contentRetriever)
                .build();
    }

    public String answerQuestion(String question) {
        return assistant.answerQuestion(question);
    }
}
```

## Examples

### Basic Document Processing

```java
public class BasicRAGExample {
    public static void main(String[] args) {
        var embeddingStore = new InMemoryEmbeddingStore<TextSegment>();

        var embeddingModel = OpenAiEmbeddingModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .modelName("text-embedding-3-small")
                .build();

        var ingestor = EmbeddingStoreIngestor.builder()
                .embeddingModel(embeddingModel)
                .embeddingStore(embeddingStore)
                .build();

        ingestor.ingest(Document.from("Spring Boot is a framework for building Java applications with minimal configuration."));

        var retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .build();
    }
}
```

### Multi-Domain Assistant

```java
interface MultiDomainAssistant {
    @SystemMessage("""
        You are an expert assistant with access to multiple knowledge domains:
        - Technical documentation
        - Company policies
        - Product information
        - Customer support guides

        Tailor your response based on the type of question and available context.
        Always indicate which domain the information comes from.
        """)
    String answerQuestion(@MemoryId String userId, @UserMessage String question);
}
```

### Hierarchical RAG

```java
@Service
@RequiredArgsConstructor
public class HierarchicalRAGService {

    private final EmbeddingStore<TextSegment> chunkStore;
    private final EmbeddingStore<TextSegment> summaryStore;
    private final EmbeddingModel embeddingModel;

    // searchSummaries, searchChunksInDocument and generateResponseWithChunks are
    // application-specific helpers (summary-level search, per-document chunk search,
    // and answer generation) omitted here for brevity.
    public String performHierarchicalRetrieval(String query) {
        List<EmbeddingMatch<TextSegment>> summaryMatches = searchSummaries(query);
        List<TextSegment> relevantChunks = new ArrayList<>();

        for (EmbeddingMatch<TextSegment> summaryMatch : summaryMatches) {
            String documentId = summaryMatch.embedded().metadata().getString("documentId");
            List<EmbeddingMatch<TextSegment>> chunkMatches = searchChunksInDocument(query, documentId);
            chunkMatches.stream()
                    .map(EmbeddingMatch::embedded)
                    .forEach(relevantChunks::add);
        }

        return generateResponseWithChunks(query, relevantChunks);
    }
}
```

## Best Practices

### Document Segmentation

- Use recursive splitting with 500-1000 token chunks for most applications
- Maintain 20-50 token overlap between chunks for context preservation
- Consider document structure (headings, paragraphs) when splitting
- Use token-aware splitters for optimal embedding generation

### Metadata Strategy

- Include rich metadata for filtering and attribution (see the sketch after this list):
  - User and tenant identifiers for multi-tenancy
  - Document type and category classification
  - Creation and modification timestamps
  - Version and author information
  - Confidentiality and access level tags
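
A minimal sketch of this metadata strategy, assuming illustrative key names (`tenantId`, `docType`, `confidentiality`) rather than any fixed LangChain4j schema:

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.Metadata;

import java.time.Instant;
import java.util.Map;

public class MetadataEnrichmentExample {

    // Attaches attribution and multi-tenancy metadata before ingestion so it can
    // later drive filtering and source citation.
    public static Document withStandardMetadata(String text, String tenantId, String userId) {
        Metadata metadata = Metadata.from(Map.of(
                "tenantId", tenantId,                   // tenant isolation
                "userId", userId,                       // per-user filtering
                "docType", "policy",                    // type/category classification
                "createdAt", Instant.now().toString(),  // timestamp for freshness filters
                "version", 1,                           // version tracking
                "confidentiality", "internal"           // access level tag
        ));
        return Document.from(text, metadata);
    }
}
```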

### Query Processing

- Implement query preprocessing and cleaning
- Consider query expansion for better recall (see the sketch after this list)
- Apply dynamic filtering based on user context
- Use re-ranking for improved result quality
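
A hedged sketch of query-side processing: normalize the raw question and expand it into several reformulations before retrieval. It assumes `ExpandingQueryTransformer` accepts a chat model in its constructor, mirroring the `CompressingQueryTransformer` usage shown later; verify against your LangChain4j version.

```java
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.RetrievalAugmentor;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.query.transformer.ExpandingQueryTransformer;

public class QueryProcessingExample {

    // Simple preprocessing applied before the query reaches the augmentor
    public static String preprocess(String rawQuestion) {
        return rawQuestion == null ? "" : rawQuestion.strip().replaceAll("\\s+", " ");
    }

    // Expansion generates multiple query variations, improving recall at the
    // cost of extra chat-model calls.
    public static RetrievalAugmentor buildAugmentor(ChatModel chatModel, ContentRetriever retriever) {
        return DefaultRetrievalAugmentor.builder()
                .queryTransformer(new ExpandingQueryTransformer(chatModel))
                .contentRetriever(retriever)
                .build();
    }
}
```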

### Performance Optimization

- Cache embeddings for repeated queries (see the sketch after this list)
- Use batch embedding generation for bulk operations
- Implement pagination for large result sets
- Consider asynchronous processing for long operations
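
A minimal sketch of query-embedding caching; the in-memory map and cache-by-query-text strategy are assumptions for illustration, not a LangChain4j feature:

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.embedding.EmbeddingModel;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachingEmbeddingService {

    private final EmbeddingModel embeddingModel;
    private final Map<String, Embedding> cache = new ConcurrentHashMap<>();

    public CachingEmbeddingService(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    // Repeated identical queries reuse the previously computed embedding
    // instead of triggering another embedding API call.
    public Embedding embedQuery(String query) {
        return cache.computeIfAbsent(query, q -> embeddingModel.embed(q).content());
    }
}
```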

## Common Patterns

### Simple RAG Pipeline

```java
@RequiredArgsConstructor
@Service
public class SimpleRAGPipeline {

    private final EmbeddingModel embeddingModel;
    private final EmbeddingStore<TextSegment> embeddingStore;
    private final ChatModel chatModel;

    public String answerQuestion(String question) {
        Embedding queryEmbedding = embeddingModel.embed(question).content();
        EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
                .queryEmbedding(queryEmbedding)
                .maxResults(3)
                .build();

        List<TextSegment> segments = embeddingStore.search(request).matches().stream()
                .map(EmbeddingMatch::embedded)
                .collect(Collectors.toList());

        String context = segments.stream()
                .map(TextSegment::text)
                .collect(Collectors.joining("\n\n"));

        return chatModel.chat(context + "\n\nQuestion: " + question + "\nAnswer:");
    }
}
```

### Hybrid Search (Vector + Keyword)

```java
@Service
@RequiredArgsConstructor
public class HybridSearchService {

    private final EmbeddingStore<TextSegment> vectorStore;
    private final FullTextSearchEngine keywordEngine;
    private final EmbeddingModel embeddingModel;

    // performVectorSearch, performKeywordSearch and combineResults are
    // application-specific helpers; a sketch of the RRF combination follows this class.
    public List<Content> hybridSearch(String query, int maxResults) {
        // Vector search
        List<Content> vectorResults = performVectorSearch(query, maxResults);

        // Keyword search
        List<Content> keywordResults = performKeywordSearch(query, maxResults);

        // Combine and re-rank using the RRF algorithm
        return combineResults(vectorResults, keywordResults, maxResults);
    }
}
```
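
A hedged sketch of the `combineResults(...)` helper referenced above, using Reciprocal Rank Fusion (RRF): each result contributes `1 / (k + rank)` from every list it appears in, and the fused list is sorted by the summed score. Keying on segment text is a simplification; a stable document id would be preferable in production.

```java
import dev.langchain4j.rag.content.Content;

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ReciprocalRankFusion {

    private static final int K = 60; // standard RRF damping constant

    static List<Content> combine(List<Content> vectorResults,
                                 List<Content> keywordResults,
                                 int maxResults) {
        Map<String, Double> scores = new LinkedHashMap<>();
        Map<String, Content> byKey = new LinkedHashMap<>();

        accumulate(vectorResults, scores, byKey);
        accumulate(keywordResults, scores, byKey);

        // Highest fused score first, truncated to the requested size
        return scores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(maxResults)
                .map(entry -> byKey.get(entry.getKey()))
                .collect(Collectors.toList());
    }

    private static void accumulate(List<Content> results,
                                   Map<String, Double> scores,
                                   Map<String, Content> byKey) {
        for (int rank = 0; rank < results.size(); rank++) {
            Content content = results.get(rank);
            String key = content.textSegment().text();
            byKey.putIfAbsent(key, content);
            scores.merge(key, 1.0 / (K + rank + 1), Double::sum);
        }
    }
}
```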

## Troubleshooting

### Common Issues

**Poor Retrieval Results**
- Check document chunk size and overlap settings
- Verify embedding model compatibility
- Ensure metadata filters are not too restrictive
- Consider adding a re-ranking step

**Slow Performance**
- Use cached embeddings for frequent queries
- Optimize database indexing for vector stores
- Implement pagination for large datasets
- Consider async processing for bulk operations

**High Memory Usage**
- Use disk-based embedding stores for large datasets
- Implement proper pagination and filtering
- Clean up unused embeddings periodically
- Monitor and optimize chunk sizes

## References

- [API Reference](references/references.md) - Complete API documentation and interfaces
- [Examples](references/examples.md) - Production-ready examples and patterns
- [Official LangChain4j Documentation](https://docs.langchain4j.dev/)

# LangChain4j RAG Implementation - Practical Examples

Production-ready examples for implementing Retrieval-Augmented Generation (RAG) systems with LangChain4j.

## 1. Simple In-Memory RAG

**Scenario**: Quick RAG setup with documents in memory for development/testing.

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;

interface DocumentAssistant {
    String answer(String question);
}

public class SimpleRagExample {
    public static void main(String[] args) {
        // Setup
        var embeddingStore = new InMemoryEmbeddingStore<TextSegment>();

        var embeddingModel = OpenAiEmbeddingModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .modelName("text-embedding-3-small")
                .build();

        var chatModel = OpenAiChatModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .modelName("gpt-4o-mini")
                .build();

        // Ingest documents
        var ingestor = EmbeddingStoreIngestor.builder()
                .embeddingModel(embeddingModel)
                .embeddingStore(embeddingStore)
                .build();

        ingestor.ingest(Document.from("Spring Boot is a framework for building Java applications with minimal configuration."));
        ingestor.ingest(Document.from("Spring Data JPA provides data access abstraction using repositories."));
        ingestor.ingest(Document.from("Spring Cloud enables building distributed systems and microservices."));

        // Create retriever and AI service
        var contentRetriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(3)
                .minScore(0.7)
                .build();

        var assistant = AiServices.builder(DocumentAssistant.class)
                .chatModel(chatModel)
                .contentRetriever(contentRetriever)
                .build();

        // Query with RAG
        System.out.println(assistant.answer("What is Spring Boot?"));
        System.out.println(assistant.answer("What does Spring Data JPA do?"));
    }
}
```

## 2. Vector Database RAG (Pinecone)

**Scenario**: Production RAG with persistent vector database.

```java
import dev.langchain4j.store.embedding.pinecone.PineconeEmbeddingStore;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.Metadata;

public class PineconeRagExample {
    public static void main(String[] args) {
        // Production vector store
        var embeddingStore = PineconeEmbeddingStore.builder()
                .apiKey(System.getenv("PINECONE_API_KEY"))
                .index("docs-index")
                .namespace("production")
                .build();

        var embeddingModel = OpenAiEmbeddingModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .build();

        // Ingest with metadata
        var ingestor = EmbeddingStoreIngestor.builder()
                .documentTransformer(doc -> {
                    doc.metadata().put("source", "documentation");
                    doc.metadata().put("date", LocalDate.now().toString());
                    return doc;
                })
                .documentSplitter(DocumentSplitters.recursive(1000, 200))
                .embeddingModel(embeddingModel)
                .embeddingStore(embeddingStore)
                .build();

        ingestor.ingest(Document.from("Your large document..."));

        // Retrieve with filters
        var retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(5)
                .dynamicFilter(query ->
                        new IsEqualTo("source", "documentation")
                )
                .build();
    }
}
```

## 3. Document Loading and Splitting

**Scenario**: Load documents from various sources and split intelligently.

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.DocumentSplitter;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.document.splitter.DocumentByParagraphSplitter;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.openai.OpenAiTokenCountEstimator;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class DocumentProcessingExample {
    public static void main(String[] args) {
        // Load from filesystem
        Path docPath = Paths.get("documents");
        List<Document> documents = FileSystemDocumentLoader.loadDocuments(docPath);

        // Smart recursive splitting with token counting
        DocumentSplitter splitter = DocumentSplitters.recursive(
                500, // Max tokens per segment
                50,  // Overlap tokens
                new OpenAiTokenCountEstimator("gpt-4o-mini")
        );

        // Process documents
        for (Document doc : documents) {
            List<TextSegment> segments = splitter.split(doc);
            System.out.println("Document split into " + segments.size() + " segments");

            segments.forEach(segment -> {
                System.out.println("Text: " + segment.text());
                System.out.println("Metadata: " + segment.metadata());
            });
        }

        // Alternative: Character-based splitting
        DocumentSplitter charSplitter = DocumentSplitters.recursive(
                1000, // Max characters
                100   // Overlap characters
        );

        // Alternative: Paragraph-based splitting
        DocumentSplitter paraSplitter = new DocumentByParagraphSplitter(500, 50);
    }
}
```

## 4. Metadata Filtering in RAG

**Scenario**: Search with complex metadata filters for multi-tenant RAG.

```java
import dev.langchain4j.store.embedding.filter.comparison.*;
import dev.langchain4j.store.embedding.filter.logical.*;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;

public class MetadataFilteringExample {
    public static void main(String[] args) {
        // Note: each filter(...) call replaces the previous one; the calls below
        // are shown together only to catalogue the available filter types.
        var retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)

                // Single filter: user isolation
                .filter(new IsEqualTo("userId", "user123"))

                // Complex AND filter
                .filter(new And(
                        new IsEqualTo("department", "engineering"),
                        new IsEqualTo("status", "active")
                ))

                // OR filter: multiple categories
                .filter(new Or(
                        new IsEqualTo("category", "tutorial"),
                        new IsEqualTo("category", "guide")
                ))

                // NOT filter: exclude deprecated
                .filter(new Not(
                        new IsEqualTo("deprecated", "true")
                ))

                // Numeric filters
                .filter(new IsGreaterThan("relevance", 0.8))
                .filter(new IsLessThanOrEqualTo("createdDaysAgo", 30))

                // Multiple conditions
                .dynamicFilter(query -> {
                    String userId = extractUserFromQuery(query);
                    return new And(
                            new IsEqualTo("userId", userId),
                            new IsGreaterThan("score", 0.7)
                    );
                })

                .build();
    }

    private static String extractUserFromQuery(Object query) {
        // Extract user context
        return "user123";
    }
}
```

## 5. Document Transformation Pipeline

**Scenario**: Transform documents with custom metadata before ingestion.

```java
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.data.document.Metadata;
import dev.langchain4j.data.segment.TextSegment;
import java.time.LocalDate;

public class DocumentTransformationExample {
    public static void main(String[] args) {
        var ingestor = EmbeddingStoreIngestor.builder()

                // Add metadata to each document
                .documentTransformer(doc -> {
                    doc.metadata().put("ingested_date", LocalDate.now().toString());
                    doc.metadata().put("source_system", "internal");
                    doc.metadata().put("version", "1.0");
                    return doc;
                })

                // Split documents intelligently
                .documentSplitter(DocumentSplitters.recursive(500, 50))

                // Transform each segment (e.g., add filename)
                .textSegmentTransformer(segment -> {
                    String fileName = segment.metadata().getString("file_name", "unknown");
                    String enrichedText = "File: " + fileName + "\n" + segment.text();
                    return TextSegment.from(enrichedText, segment.metadata());
                })

                .embeddingModel(embeddingModel)
                .embeddingStore(embeddingStore)
                .build();

        // Ingest with tracking
        IngestionResult result = ingestor.ingest(document);
        System.out.println("Tokens ingested: " + result.tokenUsage().totalTokenCount());
    }
}
```

## 6. Hybrid Search (Vector + Full-Text)

**Scenario**: Combine semantic search with keyword search for better recall.

```java
import dev.langchain4j.store.embedding.neo4j.Neo4jEmbeddingStore;

public class HybridSearchExample {
    public static void main(String[] args) {
        // Configure Neo4j for hybrid search
        var embeddingStore = Neo4jEmbeddingStore.builder()
                .withBasicAuth("bolt://localhost:7687", "neo4j", "password")
                .dimension(1536)

                // Enable full-text search
                .fullTextIndexName("documents_fulltext")
                .autoCreateFullText(true)

                // Query for full-text context
                .fullTextQuery("Spring OR Boot")

                .build();

        var retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(5)
                .build();

        // Search combines both vector similarity and full-text keywords
    }
}
```

## 7. Advanced RAG with Query Transformation

**Scenario**: Transform user queries before retrieval for better results.

```java
import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.query.transformer.CompressingQueryTransformer;
import dev.langchain4j.rag.content.aggregator.ReRankingContentAggregator;
import dev.langchain4j.model.cohere.CohereScoringModel;

public class AdvancedRagExample {
    public static void main(String[] args) {
        // Scoring model for re-ranking
        var scoringModel = CohereScoringModel.builder()
                .apiKey(System.getenv("COHERE_API_KEY"))
                .build();

        // Advanced retrieval augmentor
        var augmentor = DefaultRetrievalAugmentor.builder()

                // Transform query for better context
                .queryTransformer(new CompressingQueryTransformer(chatModel))

                // Retrieve relevant content
                .contentRetriever(EmbeddingStoreContentRetriever.builder()
                        .embeddingStore(embeddingStore)
                        .embeddingModel(embeddingModel)
                        .maxResults(10)
                        .minScore(0.6)
                        .build())

                // Re-rank results by relevance
                .contentAggregator(ReRankingContentAggregator.builder()
                        .scoringModel(scoringModel)
                        .minScore(0.8)
                        .build())

                .build();

        // Use with AI Service
        var assistant = AiServices.builder(QuestionAnswering.class)
                .chatModel(chatModel)
                .retrievalAugmentor(augmentor)
                .build();
    }
}
```

## 8. Multi-User RAG with Isolation

**Scenario**: Per-user vector stores for data isolation.

```java
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import java.util.HashMap;
import java.util.Map;

public class MultiUserRagExample {
    private final Map<String, EmbeddingStore<TextSegment>> userStores = new HashMap<>();

    public void ingestForUser(String userId, Document document) {
        var store = userStores.computeIfAbsent(userId,
                k -> new InMemoryEmbeddingStore<>());

        var ingestor = EmbeddingStoreIngestor.builder()
                .embeddingModel(embeddingModel)
                .embeddingStore(store)
                .build();

        ingestor.ingest(document);
    }

    public String askQuestion(String userId, String question) {
        var store = userStores.get(userId);

        var retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(store)
                .embeddingModel(embeddingModel)
                .maxResults(3)
                .build();

        var assistant = AiServices.builder(QuestionAnswering.class)
                .chatModel(chatModel)
                .contentRetriever(retriever)
                .build();

        return assistant.answer(question);
    }
}
```

## 9. Streaming RAG with Content Access

**Scenario**: Stream RAG responses while accessing retrieved content.

```java
import dev.langchain4j.service.TokenStream;

interface StreamingRagAssistant {
    TokenStream streamAnswer(String question);
}

public class StreamingRagExample {
    public static void main(String[] args) {
        var assistant = AiServices.builder(StreamingRagAssistant.class)
                .streamingChatModel(streamingModel)
                .contentRetriever(contentRetriever)
                .build();

        assistant.streamAnswer("What is Spring Boot?")
                .onRetrieved(contents -> {
                    System.out.println("=== Retrieved Content ===");
                    contents.forEach(content ->
                            System.out.println("Score: " + content.score() +
                                    ", Text: " + content.textSegment().text()));
                })
                .onPartialResponse(token -> System.out.print(token))
                .onCompleteResponse(response ->
                        System.out.println("\n=== Complete ==="))
                .onError(error -> System.err.println("Error: " + error))
                .start();

        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

## 10. Batch Document Ingestion

**Scenario**: Efficiently ingest large document collections.

```java
import dev.langchain4j.data.document.Document;
import java.util.List;
import java.util.ArrayList;

public class BatchIngestionExample {
    public static void main(String[] args) {
        var ingestor = EmbeddingStoreIngestor.builder()
                .embeddingModel(embeddingModel)
                .embeddingStore(embeddingStore)
                .documentSplitter(DocumentSplitters.recursive(500, 50))
                .build();

        // Load batch of documents
        List<Document> documents = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            documents.add(Document.from("Content " + i));
        }

        // Ingest all at once
        IngestionResult result = ingestor.ingest(documents);

        System.out.println("Documents ingested: " + documents.size());
        System.out.println("Total tokens: " + result.tokenUsage().totalTokenCount());

        // Track progress
        long tokensPerDoc = result.tokenUsage().totalTokenCount() / documents.size();
        System.out.println("Average tokens per document: " + tokensPerDoc);
    }
}
```

## Performance Considerations

1. **Batch Processing**: Ingest documents in batches to optimize embedding API calls
2. **Document Splitting**: Use recursive splitting for better semantic chunks
3. **Metadata**: Add minimal metadata to reduce embedding overhead
4. **Vector DB**: Choose appropriate vector DB based on scale (in-memory for dev, Pinecone/Weaviate for prod)
5. **Similarity Threshold**: Adjust minScore based on use case (0.7-0.85 typical)
6. **Max Results**: Return top 3-5 results unless specific needs require more
7. **Caching**: Cache frequently retrieved content to reduce API calls
8. **Async Ingestion**: Use async ingestion for large datasets (see the sketch after this list)
9. **Monitoring**: Track token usage and retrieval quality metrics
10. **Testing**: Use in-memory store for unit tests, external DB for integration tests
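
A minimal sketch of asynchronous batch ingestion (item 8), assuming an already-configured `EmbeddingStoreIngestor`; documents are ingested in fixed-size batches on a background executor so the caller is not blocked:

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncIngestionExample {

    private final EmbeddingStoreIngestor ingestor;
    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    public AsyncIngestionExample(EmbeddingStoreIngestor ingestor) {
        this.ingestor = ingestor;
    }

    public CompletableFuture<Void> ingestAsync(List<Document> documents, int batchSize) {
        CompletableFuture<Void> pipeline = CompletableFuture.completedFuture(null);
        for (int i = 0; i < documents.size(); i += batchSize) {
            List<Document> batch = documents.subList(i, Math.min(i + batchSize, documents.size()));
            // Each batch becomes one ingest(...) call, keeping per-request embedding payloads bounded
            pipeline = pipeline.thenRunAsync(() -> ingestor.ingest(batch), executor);
        }
        return pipeline;
    }
}
```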

# LangChain4j RAG Implementation - API References

Complete API reference for implementing RAG systems with LangChain4j.

## Document Loading

### Document Loaders

**FileSystemDocumentLoader**: Load from filesystem.
```java
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import java.nio.file.Path;

List<Document> documents = FileSystemDocumentLoader.loadDocuments("documents");
Document single = FileSystemDocumentLoader.loadDocument("document.pdf");
```

**ClassPathDocumentLoader**: Load from classpath resources.
```java
List<Document> resources = ClassPathDocumentLoader.loadDocuments("documents");
```

**UrlDocumentLoader**: Load from web URLs.
```java
Document webDoc = UrlDocumentLoader.load("https://example.com/doc.html");
```

## Document Splitting

### DocumentSplitter Interface

```java
interface DocumentSplitter {
    List<TextSegment> split(Document document);
    List<TextSegment> splitAll(Collection<Document> documents);
}
```

### DocumentSplitters Factory and Splitter Classes

**Recursive Split**: Smart recursive splitting by paragraphs, sentences, words.
```java
DocumentSplitter splitter = DocumentSplitters.recursive(
    500, // Max segment size (tokens or characters)
    50   // Overlap size
);

// With token counting
DocumentSplitter tokenAwareSplitter = DocumentSplitters.recursive(
    500,
    50,
    new OpenAiTokenCountEstimator("gpt-4o-mini")
);
```

**Paragraph Split**: Split by paragraphs.
```java
DocumentSplitter splitter = new DocumentByParagraphSplitter(500, 50);
```

**Sentence Split**: Split by sentences.
```java
DocumentSplitter splitter = new DocumentBySentenceSplitter(500, 50);
```

**Line Split**: Split by lines.
```java
DocumentSplitter splitter = new DocumentByLineSplitter(500, 50);
```

## Embedding Models

### EmbeddingModel Interface

```java
public interface EmbeddingModel {
    // Embed single text
    Response<Embedding> embed(String text);
    Response<Embedding> embed(TextSegment textSegment);

    // Batch embedding
    Response<List<Embedding>> embedAll(List<TextSegment> textSegments);

    // Model dimension
    int dimension();
}
```

### OpenAI Embedding Model

```java
EmbeddingModel model = OpenAiEmbeddingModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("text-embedding-3-small") // or text-embedding-3-large
    .dimensions(512)                     // Optional: reduce dimensions
    .timeout(Duration.ofSeconds(30))
    .logRequests(true)
    .logResponses(true)
    .build();
```

### Other Embedding Models

```java
// Google Vertex AI
EmbeddingModel google = VertexAiEmbeddingModel.builder()
    .project("PROJECT_ID")
    .location("us-central1")
    .modelName("textembedding-gecko")
    .build();

// Ollama (local)
EmbeddingModel ollama = OllamaEmbeddingModel.builder()
    .baseUrl("http://localhost:11434")
    .modelName("all-minilm")
    .build();

// AllMiniLmL6V2 (offline)
EmbeddingModel offline = new AllMiniLmL6V2EmbeddingModel();
```

## Vector Stores (EmbeddingStore)

### EmbeddingStore Interface

```java
public interface EmbeddingStore<Embedded> {
    // Add embeddings
    String add(Embedding embedding);
    void add(String id, Embedding embedding);
    String add(Embedding embedding, Embedded embedded);
    List<String> addAll(List<Embedding> embeddings);
    List<String> addAll(List<Embedding> embeddings, List<Embedded> embedded);
    void addAll(List<String> ids, List<Embedding> embeddings, List<Embedded> embedded);

    // Search embeddings
    EmbeddingSearchResult<Embedded> search(EmbeddingSearchRequest request);

    // Remove embeddings
    void remove(String id);
    void removeAll(Collection<String> ids);
    void removeAll(Filter filter);
    void removeAll();
}
```

### In-Memory Store

```java
EmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

// Merge stores
InMemoryEmbeddingStore<TextSegment> merged = InMemoryEmbeddingStore.merge(
    store1, store2, store3
);
```

### Pinecone

```java
EmbeddingStore<TextSegment> store = PineconeEmbeddingStore.builder()
    .apiKey(System.getenv("PINECONE_API_KEY"))
    .index("my-index")
    .namespace("production")
    .environment("gcp-starter") // or "aws-us-east-1"
    .build();
```

### Weaviate

```java
EmbeddingStore<TextSegment> store = WeaviateEmbeddingStore.builder()
    .host("localhost")
    .port(8080)
    .scheme("http")
    .collectionName("Documents")
    .build();
```

### Qdrant

```java
EmbeddingStore<TextSegment> store = QdrantEmbeddingStore.builder()
    .host("localhost")
    .port(6333)
    .collectionName("documents")
    .build();
```

### Chroma

```java
EmbeddingStore<TextSegment> store = ChromaEmbeddingStore.builder()
    .baseUrl("http://localhost:8000")
    .collectionName("my-collection")
    .build();
```

### Neo4j

```java
EmbeddingStore<TextSegment> store = Neo4jEmbeddingStore.builder()
    .withBasicAuth("bolt://localhost:7687", "neo4j", "password")
    .dimension(1536)
    .label("Document")
    .build();
```

### MongoDB Atlas

```java
EmbeddingStore<TextSegment> store = MongoDbEmbeddingStore.builder()
    .databaseName("search")
    .collectionName("documents")
    .indexName("vector_index")
    .createIndex(true)
    .fromClient(mongoClient)
    .build();
```

### PostgreSQL (pgvector)

```java
EmbeddingStore<TextSegment> store = PgVectorEmbeddingStore.builder()
    .host("localhost")
    .port(5432)
    .database("embeddings")
    .user("postgres")
    .password("password")
    .table("embeddings")
    .createTableIfNotExists(true)
    .build();
```

### Milvus

```java
EmbeddingStore<TextSegment> store = MilvusEmbeddingStore.builder()
    .host("localhost")
    .port(19530)
    .collectionName("documents")
    .dimension(1536)
    .build();
```

## Document Ingestion

### EmbeddingStoreIngestor

```java
public class EmbeddingStoreIngestor {
    public static Builder builder();

    public IngestionResult ingest(Document document);
    public IngestionResult ingest(Document... documents);
    public IngestionResult ingest(Collection<Document> documents);
}
```

### Building an Ingestor

```java
EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()

    // Document transformation
    .documentTransformer(doc -> {
        doc.metadata().put("source", "manual");
        return doc;
    })

    // Document splitting strategy
    .documentSplitter(DocumentSplitters.recursive(500, 50))

    // Text segment transformation
    .textSegmentTransformer(segment -> {
        String enhanced = "Category: Spring\n" + segment.text();
        return TextSegment.from(enhanced, segment.metadata());
    })

    // Embedding model (required)
    .embeddingModel(embeddingModel)

    // Embedding store (required)
    .embeddingStore(embeddingStore)

    .build();
```

### IngestionResult

```java
IngestionResult result = ingestor.ingest(documents);

// Access results
TokenUsage usage = result.tokenUsage();
long totalTokens = usage.totalTokenCount();
long inputTokens = usage.inputTokenCount();
```

## Content Retrieval

### EmbeddingSearchRequest

```java
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(embedding) // Required
    .maxResults(5)             // Default: 3
    .minScore(0.7)             // Threshold 0-1
    .filter(new IsEqualTo("category", "tutorial"))
    .build();
```

### EmbeddingSearchResult

```java
EmbeddingSearchResult<TextSegment> result = store.search(request);
List<EmbeddingMatch<TextSegment>> matches = result.matches();

for (EmbeddingMatch<TextSegment> match : matches) {
    double score = match.score();           // Relevance 0-1
    TextSegment segment = match.embedded(); // Retrieved content
    String id = match.embeddingId();        // Store ID
}
```

### ContentRetriever Interface

```java
public interface ContentRetriever {
    List<Content> retrieve(Query query);
}
```

### EmbeddingStoreContentRetriever

```java
ContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
    .embeddingStore(embeddingStore)
    .embeddingModel(embeddingModel)

    // Static configuration
    .maxResults(5)
    .minScore(0.7)

    // Dynamic configuration per query
    .dynamicMaxResults(query -> 10)
    .dynamicMinScore(query -> 0.8)
    .dynamicFilter(query ->
        new IsEqualTo("userId", extractUserId(query))
    )

    .build();
```

## Advanced RAG

### RetrievalAugmentor

```java
public interface RetrievalAugmentor {
    AugmentationResult augment(AugmentationRequest augmentationRequest);
}
```

### DefaultRetrievalAugmentor

```java
RetrievalAugmentor augmentor = DefaultRetrievalAugmentor.builder()

    // Query transformation
    .queryTransformer(new CompressingQueryTransformer(chatModel))

    // Content retrieval
    .contentRetriever(contentRetriever)

    // Content aggregation and re-ranking
    .contentAggregator(ReRankingContentAggregator.builder()
        .scoringModel(scoringModel)
        .minScore(0.8)
        .build())

    // Parallelization
    .executor(customExecutor)

    .build();
```

### Use with AI Services

```java
Assistant assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .retrievalAugmentor(augmentor)
    .build();
```

## Metadata and Filtering

### Metadata Object

```java
// Create from map
Metadata meta = Metadata.from(Map.of(
    "userId", "user123",
    "category", "tutorial",
    "score", 0.95
));

// Add entries
meta.put("status", "active");
meta.put("version", 2);

// Retrieve entries
String userId = meta.getString("userId");
int version = meta.getInteger("version");
double score = meta.getDouble("score");

// Check existence
boolean has = meta.containsKey("userId");

// Remove entry
meta.remove("userId");

// Merge
Metadata other = Metadata.from(Map.of("source", "db"));
meta.merge(other);
```

### Filter Operations

```java
import dev.langchain4j.store.embedding.filter.comparison.*;
import dev.langchain4j.store.embedding.filter.logical.*;

// Equality
Filter filter = new IsEqualTo("status", "active");
filter = new IsNotEqualTo("deprecated", "true");

// Comparison
filter = new IsGreaterThan("score", 0.8);
filter = new IsLessThanOrEqualTo("daysOld", 30);
filter = new IsGreaterThanOrEqualTo("priority", 5);
filter = new IsLessThan("errorRate", 0.01);

// Membership
filter = new IsIn("category", Arrays.asList("tech", "guide"));
filter = new IsNotIn("status", Arrays.asList("archived"));

// String operations
filter = new ContainsString("content", "Spring");

// Logical operations
filter = new And(
    new IsEqualTo("userId", "123"),
    new IsGreaterThan("score", 0.7)
);

filter = new Or(
    new IsEqualTo("type", "doc"),
    new IsEqualTo("type", "guide")
);

filter = new Not(new IsEqualTo("archived", "true"));
```

## TextSegment

### Creating TextSegments

```java
// Text only
TextSegment segment = TextSegment.from("This is the content");

// With metadata
Metadata metadata = Metadata.from(Map.of("source", "docs"));
TextSegment segmentWithMetadata = TextSegment.from("Content", metadata);

// Accessing
String text = segment.text();
Metadata meta = segment.metadata();
```

## Best Practices

1. **Chunk Size**: Use 300-500 tokens per chunk for optimal balance
2. **Overlap**: Use 10-50 token overlap for semantic continuity
3. **Metadata**: Include source and timestamp for traceability
4. **Batch Processing**: Ingest documents in batches when possible
5. **Similarity Threshold**: Adjust minScore (0.7-0.85) based on precision/recall needs
6. **Vector DB Selection**: In-memory for dev/test, Pinecone/Qdrant for production
7. **Filtering**: Pre-filter by metadata to reduce search space
8. **Re-ranking**: Use scoring models for better relevance in production
9. **Monitoring**: Track retrieval quality metrics
10. **Testing**: Use small in-memory stores for unit tests (see the sketch after this list)
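
A hedged sketch of item 10: a JUnit test that exercises retrieval against an in-memory store with the offline AllMiniLmL6V2 embedding model, so no external services are needed. It assumes the `langchain4j-embeddings-all-minilm-l6-v2` artifact is on the test classpath; the model's package name may differ between versions.

```java
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.embedding.onnx.allminilml6v2.AllMiniLmL6V2EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertFalse;

class InMemoryRetrievalTest {

    private final EmbeddingModel embeddingModel = new AllMiniLmL6V2EmbeddingModel();
    private final EmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

    @Test
    void retrievesIngestedSegment() {
        // Ingest one segment and check that a related query finds it
        TextSegment segment = TextSegment.from("Spring Boot simplifies Java application setup.");
        store.add(embeddingModel.embed(segment).content(), segment);

        EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
                .queryEmbedding(embeddingModel.embed("What is Spring Boot?").content())
                .maxResults(1)
                .build();

        assertFalse(store.search(request).matches().isEmpty());
    }
}
```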

## Performance Tips

- Use recursive splitting for semantic coherence
- Enable batch processing for large datasets
- Use dynamic max results based on query complexity
- Cache embedding model for frequently accessed content
- Implement async ingestion for large document collections
- Monitor token usage for cost optimization
- Use appropriate vector DB indexes for scale