# LangChain4j RAG Implementation Guide

## Overview

RAG (Retrieval-Augmented Generation) extends an LLM's knowledge by finding relevant information in your data and injecting it into the prompt before it is sent to the model.

## What is RAG?

RAG helps LLMs answer questions that require domain-specific knowledge: relevant information is retrieved at query time and supplied as context, grounding the answer in your data and reducing hallucinations.
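
To make the flow concrete, here is a minimal, framework-free sketch; `searchRelevantSegments`, `llm`, and `userQuestion` are hypothetical placeholders, not LangChain4j APIs:

```java
// Hypothetical sketch of retrieve-then-inject, independent of any framework
List<String> relevant = searchRelevantSegments(userQuestion, 3); // top-3 matching chunks
String prompt = "Answer using only the information below.\n\n"
        + String.join("\n---\n", relevant)
        + "\n\nQuestion: " + userQuestion;
String answer = llm.generate(prompt); // the model now sees the retrieved context
```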
## RAG Flavors in LangChain4j

### 1. Easy RAG

The simplest way to get started, with minimal setup: document loading, splitting, and embedding are handled automatically (see the implementation example at the end of this guide).
### 2. Core RAG APIs

Modular components that you can wire together yourself (sketched below), including:

- Document
- TextSegment
- EmbeddingModel
- EmbeddingStore
- DocumentSplitter
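
A minimal sketch of combining these pieces by hand; it assumes configured `embeddingModel` and `embeddingStore` instances, imports are omitted, and the path and segment sizes are illustrative:

```java
// Load one document and split it into overlapping segments
Document document = FileSystemDocumentLoader.loadDocument("/path/to/doc.txt");
DocumentSplitter splitter = DocumentSplitters.recursive(300, 30); // max size, overlap (chars)
List<TextSegment> segments = splitter.split(document);

// Embed the segments and store them for similarity search
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
embeddingStore.addAll(embeddings, segments);
```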
### 3. Advanced RAG

Complex pipelines built from components such as QueryTransformer, QueryRouter, ContentRetriever, and ContentAggregator (see the sketch below), supporting:

- Query transformation
- Multi-source retrieval
- Re-ranking of retrieved content
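
One possible advanced pipeline, sketched under the assumption that `chatModel`, `scoringModel`, and the two content retrievers are already configured (imports omitted):

```java
// Compress the query using chat history, route it to multiple retrievers,
// then re-rank the retrieved content with a scoring model before injection
RetrievalAugmentor retrievalAugmentor = DefaultRetrievalAugmentor.builder()
        .queryTransformer(new CompressingQueryTransformer(chatModel))
        .queryRouter(new DefaultQueryRouter(docsRetriever, faqRetriever))
        .contentAggregator(new ReRankingContentAggregator(scoringModel))
        .build();

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .retrievalAugmentor(retrievalAugmentor)
        .build();
```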
## RAG Stages

### 1. Indexing

Pre-process documents (load, split, embed, and store them) so they can be searched efficiently at query time.
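
For example, an EmbeddingStoreIngestor can be configured to control how documents are split and embedded during indexing (the splitter sizes are illustrative):

```java
EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
        .documentSplitter(DocumentSplitters.recursive(300, 30))
        .embeddingModel(embeddingModel)
        .embeddingStore(embeddingStore)
        .build();
ingestor.ingest(documents);
```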
### 2. Retrieval

Find relevant content based on the user's query and pass it to the LLM as context.
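
For example, an embedding-store-backed retriever can be built and queried like this (the `maxResults` and `minScore` values are illustrative):

```java
ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)
        .embeddingModel(embeddingModel)
        .maxResults(5)   // return at most 5 segments
        .minScore(0.6)   // drop weak matches
        .build();

List<Content> contents = contentRetriever.retrieve(Query.from("How do I enable RAG?"));
```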
## Core Components

### Documents with metadata

Structured representation of your content with associated metadata for filtering and context.
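
A small sketch (the metadata key and values are illustrative):

```java
Document document = Document.from(
        "LangChain4j supports RAG out of the box.",
        Metadata.from("source", "rag-guide.md"));
String source = document.metadata().getString("source"); // usable for filtering later
```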
### Text segments (chunks)

Smaller, manageable pieces of documents that are embedded and stored in vector databases.
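
Splitting the document above yields segments that carry the document's metadata along (the sizes are illustrative):

```java
List<TextSegment> segments = DocumentSplitters.recursive(200, 20).split(document);
TextSegment first = segments.get(0);
String text = first.text();                           // the chunk's text
String source = first.metadata().getString("source"); // carried over from the document
```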
### Embedding models

Convert text segments into numerical vectors for similarity search.
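
For example (any EmbeddingModel implementation exposes the same API):

```java
Embedding embedding = embeddingModel.embed("How do I enable RAG?").content();
float[] vector = embedding.vector();   // the raw vector
int dimension = embedding.dimension(); // vector length, model-specific (e.g. 384)
```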
### Embedding stores (vector databases)

Store and efficiently retrieve embedded text segments.
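
A sketch of adding to and searching the store directly; `embedding`, `segment`, and `queryEmbedding` are assumed to come from the previous steps:

```java
// Store one segment together with its embedding
embeddingStore.add(embedding, segment);

// Find the stored segments most similar to a query embedding
EmbeddingSearchResult<TextSegment> result = embeddingStore.search(
        EmbeddingSearchRequest.builder()
                .queryEmbedding(queryEmbedding)
                .maxResults(3)
                .build());
List<EmbeddingMatch<TextSegment>> matches = result.matches();
```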
### Content retrievers

Find relevant content based on user queries (see the retrieval sketch under RAG Stages above).
### Query transformers

Transform and optimize user queries for better retrieval.
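
For example, the built-in ExpandingQueryTransformer uses the chat model to expand one query into several alternative phrasings for broader retrieval:

```java
QueryTransformer transformer = new ExpandingQueryTransformer(chatModel);
Collection<Query> queries = transformer.transform(Query.from("vector store setup"));
```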
### Content aggregators

Combine and rank retrieved content (for example, the ReRankingContentAggregator used in the Advanced RAG sketch above).
## Advanced Features

- Query transformation and routing
- Multiple retrievers for different data sources
- Re-ranking models for improved relevance
- Metadata filtering for targeted retrieval (see the sketch below)
- Parallel processing for performance
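
As an example of metadata filtering, a retriever can be restricted to segments whose metadata matches a filter; `metadataKey` is the static factory from MetadataFilterBuilder, and the `userId` key is illustrative:

```java
ContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)
        .embeddingModel(embeddingModel)
        .filter(metadataKey("userId").isEqualTo("12345")) // only this user's segments
        .build();
```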
## Implementation Example (Easy RAG)

```java
import java.util.List;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

// Load documents from a directory
List<Document> documents = FileSystemDocumentLoader.loadDocuments("/path/to/docs");

// Create an in-memory embedding store (fine for prototyping)
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

// Ingest documents: split, embed, and store with Easy RAG defaults
// (requires the langchain4j-easy-rag module on the classpath)
EmbeddingStoreIngestor.ingest(documents, embeddingStore);

// Create an AI service; chatModel is any configured chat model
Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .contentRetriever(EmbeddingStoreContentRetriever.from(embeddingStore))
        .build();
```
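
For completeness, the Assistant type above is a plain interface that LangChain4j implements for you; calling it triggers retrieval automatically (the question text is illustrative):

```java
// The AI service contract; LangChain4j generates the implementation
interface Assistant {
    String chat(String userMessage);
}

// Elsewhere, after building the service:
String answer = assistant.chat("How do I add RAG to my application?");
```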

## Best Practices

1. **Document Preparation**: Clean and structure documents before ingestion.
2. **Chunk Size**: Balance context preservation against retrieval precision.
3. **Metadata Strategy**: Include relevant metadata for filtering and context.
4. **Embedding Model Selection**: Choose models appropriate for your domain.
5. **Retrieval Strategy**: Select appropriate k values (maxResults) and filtering criteria.
6. **Evaluation**: Continuously evaluate retrieval quality and answer accuracy.