# LangChain4j RAG Implementation Guide

## Overview

RAG (Retrieval-Augmented Generation) extends an LLM's knowledge by finding relevant information in your data and injecting it into the prompt before it is sent to the model.

## What is RAG?

RAG helps LLMs answer questions that require domain-specific knowledge: relevant information is retrieved at query time and supplied as context, grounding the answer in your data and reducing hallucinations.
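
To make the flow concrete, here is a minimal, framework-free sketch; `searchRelevantSegments`, `llm`, and `userQuestion` are hypothetical placeholders, not LangChain4j APIs:

```java
// Hypothetical sketch of retrieve-then-inject, independent of any framework
List<String> relevant = searchRelevantSegments(userQuestion, 3); // top-3 matching chunks
String prompt = "Answer using only the information below.\n\n"
        + String.join("\n---\n", relevant)
        + "\n\nQuestion: " + userQuestion;
String answer = llm.generate(prompt); // the model now sees the retrieved context
```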
## RAG Flavors in LangChain4j

### 1. Easy RAG

The simplest way to get started, with minimal setup: document loading, splitting, and embedding are handled automatically (see the implementation example at the end of this guide).
### 2. Core RAG APIs

Modular components that you can wire together yourself (sketched below), including:

- Document
- TextSegment
- EmbeddingModel
- EmbeddingStore
- DocumentSplitter
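
A minimal sketch of combining these pieces by hand; it assumes configured `embeddingModel` and `embeddingStore` instances, imports are omitted, and the path and segment sizes are illustrative:

```java
// Load one document and split it into overlapping segments
Document document = FileSystemDocumentLoader.loadDocument("/path/to/doc.txt");
DocumentSplitter splitter = DocumentSplitters.recursive(300, 30); // max size, overlap (chars)
List<TextSegment> segments = splitter.split(document);

// Embed the segments and store them for similarity search
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
embeddingStore.addAll(embeddings, segments);
```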
### 3. Advanced RAG

Complex pipelines built from components such as QueryTransformer, QueryRouter, ContentRetriever, and ContentAggregator (see the sketch below), supporting:

- Query transformation
- Multi-source retrieval
- Re-ranking of retrieved content
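
One possible advanced pipeline, sketched under the assumption that `chatModel`, `scoringModel`, and the two content retrievers are already configured (imports omitted):

```java
// Compress the query using chat history, route it to multiple retrievers,
// then re-rank the retrieved content with a scoring model before injection
RetrievalAugmentor retrievalAugmentor = DefaultRetrievalAugmentor.builder()
        .queryTransformer(new CompressingQueryTransformer(chatModel))
        .queryRouter(new DefaultQueryRouter(docsRetriever, faqRetriever))
        .contentAggregator(new ReRankingContentAggregator(scoringModel))
        .build();

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .retrievalAugmentor(retrievalAugmentor)
        .build();
```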
## RAG Stages

### 1. Indexing

Pre-process documents (load, split, embed, and store them) so they can be searched efficiently at query time.
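
For example, an EmbeddingStoreIngestor can be configured to control how documents are split and embedded during indexing (the splitter sizes are illustrative):

```java
EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
        .documentSplitter(DocumentSplitters.recursive(300, 30))
        .embeddingModel(embeddingModel)
        .embeddingStore(embeddingStore)
        .build();
ingestor.ingest(documents);
```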
### 2. Retrieval

Find relevant content based on the user's query and pass it to the LLM as context.
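
For example, an embedding-store-backed retriever can be built and queried like this (the `maxResults` and `minScore` values are illustrative):

```java
ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)
        .embeddingModel(embeddingModel)
        .maxResults(5)   // return at most 5 segments
        .minScore(0.6)   // drop weak matches
        .build();

List<Content> contents = contentRetriever.retrieve(Query.from("How do I enable RAG?"));
```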
## Core Components

### Documents with metadata

Structured representation of your content with associated metadata for filtering and context.
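
A small sketch (the metadata key and values are illustrative):

```java
Document document = Document.from(
        "LangChain4j supports RAG out of the box.",
        Metadata.from("source", "rag-guide.md"));
String source = document.metadata().getString("source"); // usable for filtering later
```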
### Text segments (chunks)

Smaller, manageable pieces of documents that are embedded and stored in vector databases.
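
Splitting the document above yields segments that carry the document's metadata along (the sizes are illustrative):

```java
List<TextSegment> segments = DocumentSplitters.recursive(200, 20).split(document);
TextSegment first = segments.get(0);
String text = first.text();                           // the chunk's text
String source = first.metadata().getString("source"); // carried over from the document
```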
### Embedding models

Convert text segments into numerical vectors for similarity search.
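
For example (any EmbeddingModel implementation exposes the same API):

```java
Embedding embedding = embeddingModel.embed("How do I enable RAG?").content();
float[] vector = embedding.vector();   // the raw vector
int dimension = embedding.dimension(); // vector length, model-specific (e.g. 384)
```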
### Embedding stores (vector databases)

Store and efficiently retrieve embedded text segments.
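
A sketch of adding to and searching the store directly; `embedding`, `segment`, and `queryEmbedding` are assumed to come from the previous steps:

```java
// Store one segment together with its embedding
embeddingStore.add(embedding, segment);

// Find the stored segments most similar to a query embedding
EmbeddingSearchResult<TextSegment> result = embeddingStore.search(
        EmbeddingSearchRequest.builder()
                .queryEmbedding(queryEmbedding)
                .maxResults(3)
                .build());
List<EmbeddingMatch<TextSegment>> matches = result.matches();
```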
### Content retrievers

Find relevant content based on user queries (see the retrieval sketch under RAG Stages above).
### Query transformers

Transform and optimize user queries for better retrieval.
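
For example, the built-in ExpandingQueryTransformer uses the chat model to expand one query into several alternative phrasings for broader retrieval:

```java
QueryTransformer transformer = new ExpandingQueryTransformer(chatModel);
Collection<Query> queries = transformer.transform(Query.from("vector store setup"));
```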
### Content aggregators

Combine and rank retrieved content (for example, the ReRankingContentAggregator used in the Advanced RAG sketch above).
## Advanced Features

- Query transformation and routing
- Multiple retrievers for different data sources
- Re-ranking models for improved relevance
- Metadata filtering for targeted retrieval (see the sketch below)
- Parallel processing for performance
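
As an example of metadata filtering, a retriever can be restricted to segments whose metadata matches a filter; `metadataKey` is the static factory from MetadataFilterBuilder, and the `userId` key is illustrative:

```java
ContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)
        .embeddingModel(embeddingModel)
        .filter(metadataKey("userId").isEqualTo("12345")) // only this user's segments
        .build();
```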
## Implementation Example (Easy RAG)

```java
import java.util.List;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

// Load documents from a directory
List<Document> documents = FileSystemDocumentLoader.loadDocuments("/path/to/docs");

// Create an in-memory embedding store (fine for prototyping)
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

// Ingest documents: split, embed, and store with Easy RAG defaults
// (requires the langchain4j-easy-rag module on the classpath)
EmbeddingStoreIngestor.ingest(documents, embeddingStore);

// Create an AI service; chatModel is any configured chat model
Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .contentRetriever(EmbeddingStoreContentRetriever.from(embeddingStore))
        .build();
```
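
For completeness, the Assistant type above is a plain interface that LangChain4j implements for you; calling it triggers retrieval automatically (the question text is illustrative):

```java
// The AI service contract; LangChain4j generates the implementation
interface Assistant {
    String chat(String userMessage);
}

// Elsewhere, after building the service:
String answer = assistant.chat("How do I add RAG to my application?");
```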

## Best Practices

1. **Document Preparation**: Clean and structure documents before ingestion.
2. **Chunk Size**: Balance context preservation against retrieval precision.
3. **Metadata Strategy**: Include relevant metadata for filtering and context.
4. **Embedding Model Selection**: Choose models appropriate for your domain.
5. **Retrieval Strategy**: Select appropriate k values (maxResults) and filtering criteria.
6. **Evaluation**: Continuously evaluate retrieval quality and answer accuracy.