Initial commit

Zhongwei Li
2025-11-29 18:28:30 +08:00
commit 171acedaa4
220 changed files with 85967 additions and 0 deletions


@@ -0,0 +1,194 @@
---
name: chunking-strategy
description: Implement optimal chunking strategies in RAG systems and document processing pipelines. Use when building retrieval-augmented generation systems, vector databases, or processing large documents that require breaking into semantically meaningful segments for embeddings and search.
allowed-tools: Read, Write, Bash
category: artificial-intelligence
tags: [rag, chunking, vector-search, embeddings, document-processing]
version: 1.0.0
---
# Chunking Strategy for RAG Systems
## Overview
Implement optimal chunking strategies for Retrieval-Augmented Generation (RAG) systems and document processing pipelines. This skill provides a comprehensive framework for breaking large documents into smaller, semantically meaningful segments that preserve context while enabling efficient retrieval and search.
## When to Use
Use this skill when building RAG systems, optimizing vector search performance, implementing document processing pipelines, handling multi-modal content, or performance-tuning existing RAG systems with poor retrieval quality.
## Instructions
### Choose Chunking Strategy
Select appropriate chunking strategy based on document type and use case:
1. **Fixed-Size Chunking** (Level 1)
- Use for simple documents without clear structure
- Start with 512 tokens and 10-20% overlap
- Adjust size based on query type: 256 for factoid, 1024 for analytical
2. **Recursive Character Chunking** (Level 2)
- Use for documents with clear structural boundaries
- Implement hierarchical separators: paragraphs → sentences → words
   - Customize separators for document types such as HTML and Markdown (see the sketch after this list)
3. **Structure-Aware Chunking** (Level 3)
- Use for structured documents (Markdown, code, tables, PDFs)
- Preserve semantic units: functions, sections, table blocks
- Validate structure preservation post-splitting
4. **Semantic Chunking** (Level 4)
- Use for complex documents with thematic shifts
- Implement embedding-based boundary detection
- Configure similarity threshold (0.8) and buffer size (3-5 sentences)
5. **Advanced Methods** (Level 5)
- Use Late Chunking for long-context embedding models
- Apply Contextual Retrieval for high-precision requirements
- Monitor computational costs vs. retrieval improvements
Reference detailed strategy implementations in [references/strategies.md](references/strategies.md).
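For example, the hierarchical separators from Level 2 can be biased toward Markdown boundaries simply by customizing the separator list. The sketch below uses LangChain's `RecursiveCharacterTextSplitter`; the separator order, the sizes, and the `markdown_text` variable are illustrative assumptions rather than fixed recommendations.
```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Prefer heading and paragraph breaks before falling back to sentences and words
markdown_splitter = RecursiveCharacterTextSplitter(
    separators=["\n## ", "\n### ", "\n\n", "\n", ". ", " ", ""],
    chunk_size=1000,     # characters (~250 tokens)
    chunk_overlap=150,   # ~15% overlap
)
chunks = markdown_splitter.split_text(markdown_text)  # markdown_text: your document string
```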
### Implement Chunking Pipeline
Follow these steps to implement effective chunking:
1. **Pre-process documents**
- Analyze document structure and content types
- Identify multi-modal content (tables, images, code)
- Assess information density and complexity
2. **Select strategy parameters**
- Choose chunk size based on embedding model context window
- Set overlap percentage (10-20% for most cases)
- Configure strategy-specific parameters
3. **Process and validate**
- Apply chosen chunking strategy
- Validate semantic coherence of chunks
- Test with representative documents
4. **Evaluate and iterate**
- Measure retrieval precision and recall
- Monitor processing latency and resource usage
- Optimize based on specific use case requirements
Reference detailed implementation guidelines in [references/implementation.md](references/implementation.md).
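The four steps above can be wired together in a few lines. This is a minimal sketch built on the splitter used in the examples below; the 1.3 tokens-per-word estimate, the 4-characters-per-token conversion, and the 20-word minimum chunk length are assumptions to tune per corpus.
```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

def chunk_document(text: str, analytical: bool = False) -> list[str]:
    # 1. Pre-process: rough token estimate to gauge document size
    est_tokens = int(len(text.split()) * 1.3)

    # 2. Select parameters: larger chunks for analytical queries, ~15% overlap
    chunk_size_tokens = 1024 if analytical else 512
    if est_tokens <= chunk_size_tokens:
        return [text]  # small documents stay whole
    chunk_chars = chunk_size_tokens * 4  # rough characters-per-token conversion
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_chars,
        chunk_overlap=int(chunk_chars * 0.15),
    )

    # 3. Process and validate: drop fragments too short to stand alone
    chunks = [c for c in splitter.split_text(text) if len(c.split()) > 20]

    # 4. Evaluate and iterate offline (see references/evaluation.md)
    return chunks
```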
### Evaluate Performance
Use these metrics to evaluate chunking effectiveness:
- **Retrieval Precision**: Fraction of retrieved chunks that are relevant
- **Retrieval Recall**: Fraction of relevant chunks that are retrieved
- **End-to-End Accuracy**: Quality of final RAG responses
- **Processing Time**: Latency impact on overall system
- **Resource Usage**: Memory and computational costs
Reference detailed evaluation framework in [references/evaluation.md](references/evaluation.md).
## Examples
### Basic Fixed-Size Chunking
```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Configure for factoid queries; `documents` is a list of loaded Document objects
splitter = RecursiveCharacterTextSplitter(
    chunk_size=256,
    chunk_overlap=25,
    length_function=len,
)
chunks = splitter.split_documents(documents)
```
### Structure-Aware Code Chunking
```python
import ast

def chunk_python_code(code):
    """Split Python code into semantic chunks (functions and classes)."""
    tree = ast.parse(code)
    chunks = []
    # Note: ast.walk also visits nested definitions, so class methods appear
    # both inside their class chunk and again as standalone chunks.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            chunks.append(ast.get_source_segment(code, node))
    return chunks
```
### Semantic Chunking with Embeddings
```python
import re

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # model choice is illustrative

def split_into_sentences(text):
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

def semantic_chunk(text, similarity_threshold=0.8):
    """Chunk text at semantic boundaries between adjacent sentences."""
    sentences = split_into_sentences(text)
    if not sentences:
        return []
    embeddings = model.encode(sentences)
    chunks = []
    current_chunk = [sentences[0]]
    for i in range(1, len(sentences)):
        # Cosine similarity between consecutive sentence embeddings
        a, b = embeddings[i - 1], embeddings[i]
        similarity = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
        if similarity < similarity_threshold:
            chunks.append(" ".join(current_chunk))
            current_chunk = [sentences[i]]
        else:
            current_chunk.append(sentences[i])
    chunks.append(" ".join(current_chunk))
    return chunks
```
## Best Practices
### Core Principles
- Balance context preservation with retrieval precision
- Maintain semantic coherence within chunks
- Optimize for embedding model constraints
- Preserve document structure when beneficial
### Implementation Guidelines
- Start simple with fixed-size chunking (512 tokens, 10-20% overlap)
- Test thoroughly with representative documents
- Monitor both accuracy metrics and computational costs
- Iterate based on specific document characteristics
### Common Pitfalls to Avoid
- Over-chunking: Creating too many small, context-poor chunks
- Under-chunking: Missing relevant information due to oversized chunks
- Ignoring document structure and semantic boundaries
- Using one-size-fits-all approach for diverse content types
- Neglecting overlap for boundary-crossing information
## Constraints
### Resource Considerations
- Semantic and contextual methods require significant computational resources
- Late chunking needs long-context embedding models
- Complex strategies increase processing latency
- Monitor memory usage for large document processing
### Quality Requirements
- Validate chunk semantic coherence post-processing
- Test with domain-specific documents before deployment
- Ensure chunks maintain standalone meaning where possible
- Implement proper error handling for edge cases
## References
Reference detailed documentation in the [references/](references/) folder:
- [strategies.md](references/strategies.md) - Detailed strategy implementations
- [implementation.md](references/implementation.md) - Complete implementation guidelines
- [evaluation.md](references/evaluation.md) - Performance evaluation framework
- [tools.md](references/tools.md) - Recommended libraries and frameworks
- [research.md](references/research.md) - Key research papers and findings
- [advanced-strategies.md](references/advanced-strategies.md) - 11 comprehensive chunking methods
- [semantic-methods.md](references/semantic-methods.md) - Semantic and contextual approaches
- [visualization-tools.md](references/visualization-tools.md) - Evaluation and visualization tools



@@ -0,0 +1,904 @@
# Performance Evaluation Framework
This document provides comprehensive methodologies for evaluating chunking strategy performance and effectiveness.
## Evaluation Metrics
### Core Retrieval Metrics
#### Retrieval Precision
Measures the fraction of retrieved chunks that are relevant to the query.
```python
from typing import Dict, List

def calculate_precision(retrieved_chunks: List[Dict], relevant_chunks: List[Dict]) -> float:
"""
Calculate retrieval precision
Precision = |Relevant ∩ Retrieved| / |Retrieved|
"""
retrieved_ids = {chunk.get('id') for chunk in retrieved_chunks}
relevant_ids = {chunk.get('id') for chunk in relevant_chunks}
intersection = retrieved_ids & relevant_ids
if not retrieved_ids:
return 0.0
return len(intersection) / len(retrieved_ids)
```
#### Retrieval Recall
Measures the fraction of relevant chunks that are successfully retrieved.
```python
def calculate_recall(retrieved_chunks: List[Dict], relevant_chunks: List[Dict]) -> float:
"""
Calculate retrieval recall
Recall = |Relevant ∩ Retrieved| / |Relevant|
"""
retrieved_ids = {chunk.get('id') for chunk in retrieved_chunks}
relevant_ids = {chunk.get('id') for chunk in relevant_chunks}
intersection = retrieved_ids & relevant_ids
if not relevant_ids:
return 0.0
return len(intersection) / len(relevant_ids)
```
#### F1-Score
Harmonic mean of precision and recall.
```python
def calculate_f1_score(precision: float, recall: float) -> float:
"""
Calculate F1-score
F1 = 2 * (Precision * Recall) / (Precision + Recall)
"""
if precision + recall == 0:
return 0.0
return 2 * (precision * recall) / (precision + recall)
```
### Mean Reciprocal Rank (MRR)
Measures the rank of the first relevant result.
```python
def calculate_mrr(queries: List[Dict], results: List[List[Dict]]) -> float:
"""
Calculate Mean Reciprocal Rank
"""
reciprocal_ranks = []
for query, query_results in zip(queries, results):
relevant_found = False
for rank, result in enumerate(query_results, 1):
if result.get('is_relevant', False):
reciprocal_ranks.append(1.0 / rank)
relevant_found = True
break
if not relevant_found:
reciprocal_ranks.append(0.0)
    return sum(reciprocal_ranks) / len(reciprocal_ranks) if reciprocal_ranks else 0.0
```
### Mean Average Precision (MAP)
Considers both precision and the ranking of relevant documents.
```python
def calculate_average_precision(retrieved_chunks: List[Dict], relevant_chunks: List[Dict]) -> float:
"""
Calculate Average Precision for a single query
"""
retrieved_ids = {chunk.get('id') for chunk in retrieved_chunks}
relevant_ids = {chunk.get('id') for chunk in relevant_chunks}
if not relevant_ids:
return 0.0
precisions = []
relevant_count = 0
for rank, chunk in enumerate(retrieved_chunks, 1):
if chunk.get('id') in relevant_ids:
relevant_count += 1
precision_at_rank = relevant_count / rank
precisions.append(precision_at_rank)
return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0
def calculate_map(queries: List[Dict], results: List[List[Dict]]) -> float:
"""
Calculate Mean Average Precision across multiple queries
"""
average_precisions = []
for query, query_results in zip(queries, results):
ap = calculate_average_precision(query_results, query.get('relevant_chunks', []))
average_precisions.append(ap)
return sum(average_precisions) / len(average_precisions) if average_precisions else 0.0
```
### Normalized Discounted Cumulative Gain (NDCG)
Measures ranking quality with emphasis on highly relevant results.
```python
import numpy as np

def calculate_dcg(retrieved_chunks: List[Dict]) -> float:
"""
Calculate Discounted Cumulative Gain
"""
dcg = 0.0
for rank, chunk in enumerate(retrieved_chunks, 1):
relevance = chunk.get('relevance_score', 0)
dcg += relevance / np.log2(rank + 1)
return dcg
def calculate_ndcg(retrieved_chunks: List[Dict], ideal_chunks: List[Dict]) -> float:
"""
Calculate Normalized Discounted Cumulative Gain
"""
dcg = calculate_dcg(retrieved_chunks)
idcg = calculate_dcg(ideal_chunks)
if idcg == 0:
return 0.0
return dcg / idcg
```
## End-to-End RAG Evaluation
### Answer Quality Metrics
#### Factual Consistency
Measures how well the generated answer aligns with retrieved chunks.
```python
from typing import List

import spacy
from transformers import pipeline
class FactualConsistencyEvaluator:
def __init__(self):
self.nlp = spacy.load("en_core_web_sm")
self.nli_pipeline = pipeline("text-classification",
model="roberta-large-mnli")
def evaluate_consistency(self, answer: str, retrieved_chunks: List[str]) -> float:
"""
Evaluate factual consistency between answer and retrieved context
"""
if not retrieved_chunks:
return 0.0
# Combine retrieved chunks as context
context = " ".join(retrieved_chunks[:3]) # Use top 3 chunks
# Use Natural Language Inference to check consistency
result = self.nli_pipeline(f"premise: {context} hypothesis: {answer}")
# Extract consistency score (entailment probability)
for item in result:
if item['label'] == 'ENTAILMENT':
return item['score']
elif item['label'] == 'CONTRADICTION':
return 1.0 - item['score']
return 0.5 # Neutral if NLI is inconclusive
```
#### Answer Completeness
Measures how completely the answer addresses the user's query.
```python
def evaluate_completeness(answer: str, query: str, reference_answer: str = None) -> float:
"""
Evaluate answer completeness
"""
# Extract key entities from query
query_entities = extract_entities(query)
answer_entities = extract_entities(answer)
# Calculate entity coverage
if not query_entities:
return 0.5 # Neutral if no entities in query
covered_entities = query_entities & answer_entities
entity_coverage = len(covered_entities) / len(query_entities)
# If reference answer is available, compare against it
if reference_answer:
reference_entities = extract_entities(reference_answer)
answer_reference_overlap = len(answer_entities & reference_entities) / max(len(reference_entities), 1)
return (entity_coverage + answer_reference_overlap) / 2
return entity_coverage
def extract_entities(text: str) -> set:
"""
Extract named entities from text (simplified)
"""
# This would use a proper NER model in practice
import re
# Simple noun phrase extraction as placeholder
noun_phrases = re.findall(r'\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b', text)
return set(noun_phrases)
```
#### Response Relevance
Measures how relevant the answer is to the original query.
```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
class RelevanceEvaluator:
def __init__(self, model_name="all-MiniLM-L6-v2"):
self.model = SentenceTransformer(model_name)
def evaluate_relevance(self, query: str, answer: str) -> float:
"""
Evaluate semantic relevance between query and answer
"""
# Generate embeddings
query_embedding = self.model.encode([query])
answer_embedding = self.model.encode([answer])
# Calculate cosine similarity
similarity = cosine_similarity(query_embedding, answer_embedding)[0][0]
return float(similarity)
```
## Performance Metrics
### Processing Time
```python
import time
from dataclasses import dataclass
from typing import List, Dict
@dataclass
class PerformanceMetrics:
total_time: float
chunking_time: float
embedding_time: float
search_time: float
generation_time: float
throughput: float # documents per second
class PerformanceProfiler:
def __init__(self):
self.timings = {}
self.start_times = {}
def start_timer(self, operation: str):
self.start_times[operation] = time.time()
def end_timer(self, operation: str):
if operation in self.start_times:
duration = time.time() - self.start_times[operation]
if operation not in self.timings:
self.timings[operation] = []
self.timings[operation].append(duration)
return duration
return 0.0
def get_performance_metrics(self, document_count: int) -> PerformanceMetrics:
total_time = sum(sum(times) for times in self.timings.values())
return PerformanceMetrics(
total_time=total_time,
chunking_time=sum(self.timings.get('chunking', [0])),
embedding_time=sum(self.timings.get('embedding', [0])),
search_time=sum(self.timings.get('search', [0])),
generation_time=sum(self.timings.get('generation', [0])),
throughput=document_count / total_time if total_time > 0 else 0
)
```
### Memory Usage
```python
import os
import time
from typing import Dict, List

import psutil
class MemoryProfiler:
def __init__(self):
self.process = psutil.Process(os.getpid())
self.memory_snapshots = []
def take_memory_snapshot(self, label: str):
"""Take a snapshot of current memory usage"""
memory_info = self.process.memory_info()
memory_mb = memory_info.rss / 1024 / 1024 # Convert to MB
self.memory_snapshots.append({
'label': label,
'memory_mb': memory_mb,
'timestamp': time.time()
})
def get_peak_memory_usage(self) -> float:
"""Get peak memory usage in MB"""
if not self.memory_snapshots:
return 0.0
return max(snapshot['memory_mb'] for snapshot in self.memory_snapshots)
def get_memory_usage_by_operation(self) -> Dict[str, float]:
"""Get memory usage breakdown by operation"""
if not self.memory_snapshots:
return {}
memory_by_op = {}
for i in range(1, len(self.memory_snapshots)):
prev_snapshot = self.memory_snapshots[i-1]
curr_snapshot = self.memory_snapshots[i]
operation = curr_snapshot['label']
memory_delta = curr_snapshot['memory_mb'] - prev_snapshot['memory_mb']
if operation not in memory_by_op:
memory_by_op[operation] = []
memory_by_op[operation].append(memory_delta)
return {op: sum(deltas) for op, deltas in memory_by_op.items()}
```
## Evaluation Datasets
### Standardized Test Sets
#### Question-Answer Pairs
```python
from dataclasses import dataclass
from typing import Dict, List, Optional
import json
@dataclass
class EvaluationQuery:
id: str
question: str
reference_answer: Optional[str]
relevant_chunk_ids: List[str]
query_type: str # factoid, analytical, comparative
difficulty: str # easy, medium, hard
domain: str # finance, medical, legal, technical
class EvaluationDataset:
def __init__(self, name: str):
self.name = name
self.queries: List[EvaluationQuery] = []
self.documents: Dict[str, str] = {}
self.chunks: Dict[str, Dict] = {}
def add_query(self, query: EvaluationQuery):
self.queries.append(query)
def add_document(self, doc_id: str, content: str):
self.documents[doc_id] = content
def add_chunk(self, chunk_id: str, content: str, doc_id: str, metadata: Dict):
self.chunks[chunk_id] = {
'id': chunk_id,
'content': content,
'doc_id': doc_id,
'metadata': metadata
}
def save_to_file(self, filepath: str):
data = {
'name': self.name,
'queries': [
{
'id': q.id,
'question': q.question,
'reference_answer': q.reference_answer,
'relevant_chunk_ids': q.relevant_chunk_ids,
'query_type': q.query_type,
'difficulty': q.difficulty,
'domain': q.domain
}
for q in self.queries
],
'documents': self.documents,
'chunks': self.chunks
}
with open(filepath, 'w') as f:
json.dump(data, f, indent=2)
@classmethod
def load_from_file(cls, filepath: str):
with open(filepath, 'r') as f:
data = json.load(f)
dataset = cls(data['name'])
dataset.documents = data['documents']
dataset.chunks = data['chunks']
for q_data in data['queries']:
query = EvaluationQuery(
id=q_data['id'],
question=q_data['question'],
reference_answer=q_data.get('reference_answer'),
relevant_chunk_ids=q_data['relevant_chunk_ids'],
query_type=q_data['query_type'],
difficulty=q_data['difficulty'],
domain=q_data['domain']
)
dataset.add_query(query)
return dataset
```
### Dataset Generation
#### Synthetic Query Generation
```python
import random
from typing import List, Dict
class SyntheticQueryGenerator:
def __init__(self):
self.query_templates = {
'factoid': [
"What is {concept}?",
"When did {event} occur?",
"Who developed {technology}?",
"How many {items} are mentioned?",
"What is the value of {metric}?"
],
'analytical': [
"Compare and contrast {concept1} and {concept2}.",
"Analyze the impact of {concept} on {domain}.",
"What are the advantages and disadvantages of {technology}?",
"Explain the relationship between {concept1} and {concept2}.",
"Evaluate the effectiveness of {approach} for {problem}."
],
'comparative': [
"Which is better: {option1} or {option2}?",
"How does {method1} differ from {method2}?",
"Compare the performance of {system1} and {system2}.",
"What are the key differences between {approach1} and {approach2}?"
]
}
def generate_queries_from_chunks(self, chunks: List[Dict], num_queries: int = 100) -> List[EvaluationQuery]:
"""Generate synthetic queries from document chunks"""
queries = []
# Extract entities and concepts from chunks
entities = self._extract_entities_from_chunks(chunks)
for i in range(num_queries):
query_type = random.choice(['factoid', 'analytical', 'comparative'])
template = random.choice(self.query_templates[query_type])
# Fill template with extracted entities
query_text = self._fill_template(template, entities)
# Find relevant chunks for this query
relevant_chunks = self._find_relevant_chunks(query_text, chunks)
query = EvaluationQuery(
id=f"synthetic_{i}",
question=query_text,
reference_answer=None, # Would need generation model
relevant_chunk_ids=[chunk['id'] for chunk in relevant_chunks],
query_type=query_type,
difficulty=random.choice(['easy', 'medium', 'hard']),
domain='synthetic'
)
queries.append(query)
return queries
def _extract_entities_from_chunks(self, chunks: List[Dict]) -> Dict[str, List[str]]:
"""Extract entities, concepts, and relationships from chunks"""
# This would use proper NER in practice
entities = {
'concepts': [],
'technologies': [],
'methods': [],
'metrics': [],
'events': []
}
for chunk in chunks:
content = chunk['content']
# Simplified entity extraction
words = content.split()
entities['concepts'].extend([word for word in words if len(word) > 6])
entities['technologies'].extend([word for word in words if 'technology' in word.lower()])
entities['methods'].extend([word for word in words if 'method' in word.lower()])
entities['metrics'].extend([word for word in words if '%' in word or '$' in word])
# Remove duplicates and limit
for key in entities:
entities[key] = list(set(entities[key]))[:50]
return entities
def _fill_template(self, template: str, entities: Dict[str, List[str]]) -> str:
"""Fill query template with random entities"""
import re
def replace_placeholder(match):
placeholder = match.group(1)
# Map placeholders to entity types
entity_mapping = {
'concept': 'concepts',
'concept1': 'concepts',
'concept2': 'concepts',
'technology': 'technologies',
'method': 'methods',
'method1': 'methods',
'method2': 'methods',
'metric': 'metrics',
'event': 'events',
'items': 'concepts',
'option1': 'concepts',
'option2': 'concepts',
'approach': 'methods',
'problem': 'concepts',
'domain': 'concepts',
'system1': 'concepts',
'system2': 'concepts'
}
entity_type = entity_mapping.get(placeholder, 'concepts')
available_entities = entities.get(entity_type, ['something'])
if available_entities:
return random.choice(available_entities)
else:
return 'something'
return re.sub(r'\{(\w+)\}', replace_placeholder, template)
def _find_relevant_chunks(self, query: str, chunks: List[Dict], k: int = 3) -> List[Dict]:
"""Find chunks most relevant to the query"""
# Simple keyword matching for synthetic generation
query_words = set(query.lower().split())
chunk_scores = []
for chunk in chunks:
chunk_words = set(chunk['content'].lower().split())
overlap = len(query_words & chunk_words)
chunk_scores.append((overlap, chunk))
# Sort by overlap and return top k
chunk_scores.sort(key=lambda x: x[0], reverse=True)
return [chunk for _, chunk in chunk_scores[:k]]
```
## A/B Testing Framework
### Statistical Significance Testing
```python
import numpy as np
from scipy import stats
from typing import List, Dict, Tuple
class ABTestAnalyzer:
def __init__(self):
self.significance_level = 0.05
def compare_metrics(self, control_metrics: List[float],
treatment_metrics: List[float],
metric_name: str) -> Dict:
"""
Compare metrics between control and treatment groups
"""
control_mean = np.mean(control_metrics)
treatment_mean = np.mean(treatment_metrics)
control_std = np.std(control_metrics)
treatment_std = np.std(treatment_metrics)
# Perform t-test
t_statistic, p_value = stats.ttest_ind(control_metrics, treatment_metrics)
# Calculate effect size (Cohen's d)
pooled_std = np.sqrt(((len(control_metrics) - 1) * control_std**2 +
(len(treatment_metrics) - 1) * treatment_std**2) /
(len(control_metrics) + len(treatment_metrics) - 2))
cohens_d = (treatment_mean - control_mean) / pooled_std if pooled_std > 0 else 0
# Determine significance
is_significant = p_value < self.significance_level
return {
'metric_name': metric_name,
'control_mean': control_mean,
'treatment_mean': treatment_mean,
'absolute_difference': treatment_mean - control_mean,
'relative_difference': ((treatment_mean - control_mean) / control_mean * 100) if control_mean != 0 else 0,
'control_std': control_std,
'treatment_std': treatment_std,
't_statistic': t_statistic,
'p_value': p_value,
'is_significant': is_significant,
'effect_size': cohens_d,
'significance_level': self.significance_level
}
def analyze_ab_test_results(self,
control_results: Dict[str, List[float]],
treatment_results: Dict[str, List[float]]) -> Dict:
"""
Analyze A/B test results across multiple metrics
"""
analysis_results = {}
# Ensure both dictionaries have the same keys
all_metrics = set(control_results.keys()) & set(treatment_results.keys())
for metric in all_metrics:
if metric in control_results and metric in treatment_results:
analysis_results[metric] = self.compare_metrics(
control_results[metric],
treatment_results[metric],
metric
)
# Calculate overall summary
significant_improvements = sum(1 for result in analysis_results.values()
if result['is_significant'] and result['relative_difference'] > 0)
significant_degradations = sum(1 for result in analysis_results.values()
if result['is_significant'] and result['relative_difference'] < 0)
analysis_results['summary'] = {
'total_metrics_compared': len(analysis_results),
'significant_improvements': significant_improvements,
'significant_degradations': significant_degradations,
'no_significant_change': len(analysis_results) - significant_improvements - significant_degradations
}
return analysis_results
```
## Automated Evaluation Pipeline
### End-to-End Evaluation
```python
import random
from typing import Any, Dict, List

import numpy as np

# ChunkQualityAssessor and DocumentAnalyzer are defined in references/implementation.md
class ChunkingEvaluationPipeline:
def __init__(self, strategies: Dict[str, Any], dataset: EvaluationDataset):
self.strategies = strategies
self.dataset = dataset
self.results = {}
self.profiler = PerformanceProfiler()
self.memory_profiler = MemoryProfiler()
def run_evaluation(self) -> Dict:
"""Run comprehensive evaluation of all strategies"""
evaluation_results = {}
for strategy_name, strategy in self.strategies.items():
print(f"Evaluating strategy: {strategy_name}")
# Reset profilers for each strategy
self.profiler = PerformanceProfiler()
self.memory_profiler = MemoryProfiler()
# Evaluate strategy
strategy_results = self._evaluate_strategy(strategy, strategy_name)
evaluation_results[strategy_name] = strategy_results
# Compare strategies
comparison_results = self._compare_strategies(evaluation_results)
return {
'individual_results': evaluation_results,
'comparison': comparison_results,
'recommendations': self._generate_recommendations(comparison_results)
}
def _evaluate_strategy(self, strategy: Any, strategy_name: str) -> Dict:
"""Evaluate a single chunking strategy"""
results = {
'strategy_name': strategy_name,
'retrieval_metrics': {},
'quality_metrics': {},
'performance_metrics': {}
}
# Track memory usage
self.memory_profiler.take_memory_snapshot(f"{strategy_name}_start")
# Process all documents
self.profiler.start_timer('total_processing')
all_chunks = {}
for doc_id, content in self.dataset.documents.items():
self.profiler.start_timer('chunking')
chunks = strategy.chunk(content)
self.profiler.end_timer('chunking')
all_chunks[doc_id] = chunks
self.memory_profiler.take_memory_snapshot(f"{strategy_name}_after_chunking")
# Generate embeddings for chunks
self.profiler.start_timer('embedding')
chunk_embeddings = self._generate_embeddings(all_chunks)
self.profiler.end_timer('embedding')
self.memory_profiler.take_memory_snapshot(f"{strategy_name}_after_embedding")
# Evaluate retrieval performance
retrieval_results = self._evaluate_retrieval(all_chunks, chunk_embeddings)
results['retrieval_metrics'] = retrieval_results
# Evaluate chunk quality
quality_results = self._evaluate_chunk_quality(all_chunks)
results['quality_metrics'] = quality_results
# Get performance metrics
self.profiler.end_timer('total_processing')
performance_metrics = self.profiler.get_performance_metrics(len(self.dataset.documents))
results['performance_metrics'] = performance_metrics.__dict__
# Get memory metrics
self.memory_profiler.take_memory_snapshot(f"{strategy_name}_end")
results['memory_metrics'] = {
'peak_memory_mb': self.memory_profiler.get_peak_memory_usage(),
'memory_by_operation': self.memory_profiler.get_memory_usage_by_operation()
}
return results
def _evaluate_retrieval(self, all_chunks: Dict, chunk_embeddings: Dict) -> Dict:
"""Evaluate retrieval performance"""
retrieval_metrics = {
'precision': [],
'recall': [],
'f1_score': [],
'mrr': [],
'map': []
}
for query in self.dataset.queries:
# Perform retrieval
self.profiler.start_timer('search')
retrieved_chunks = self._retrieve_chunks(query.question, chunk_embeddings, k=10)
self.profiler.end_timer('search')
# Get relevant chunks for this query
            # Ground truth: all relevant chunks for this query, not just those retrieved
            relevant_chunks = [{'id': chunk_id} for chunk_id in query.relevant_chunk_ids]
# Calculate metrics
precision = calculate_precision(retrieved_chunks, relevant_chunks)
recall = calculate_recall(retrieved_chunks, relevant_chunks)
f1 = calculate_f1_score(precision, recall)
retrieval_metrics['precision'].append(precision)
retrieval_metrics['recall'].append(recall)
retrieval_metrics['f1_score'].append(f1)
# Calculate averages
        # Average each metric, skipping any that were never populated (e.g. mrr, map)
        return {metric: float(np.mean(values)) for metric, values in retrieval_metrics.items() if values}
def _evaluate_chunk_quality(self, all_chunks: Dict) -> Dict:
"""Evaluate quality of generated chunks"""
quality_assessor = ChunkQualityAssessor()
quality_scores = []
for doc_id, chunks in all_chunks.items():
# Analyze document
content = self.dataset.documents[doc_id]
analyzer = DocumentAnalyzer()
analysis = analyzer.analyze(content)
# Assess chunk quality
scores = quality_assessor.assess_chunks(chunks, analysis)
quality_scores.append(scores)
# Aggregate quality scores
if quality_scores:
avg_scores = {}
for metric in quality_scores[0].keys():
avg_scores[metric] = np.mean([scores[metric] for scores in quality_scores])
return avg_scores
return {}
def _compare_strategies(self, evaluation_results: Dict) -> Dict:
"""Compare performance across strategies"""
ab_analyzer = ABTestAnalyzer()
comparison = {}
# Compare each metric across strategies
strategy_names = list(evaluation_results.keys())
for i in range(len(strategy_names)):
for j in range(i + 1, len(strategy_names)):
strategy1 = strategy_names[i]
strategy2 = strategy_names[j]
comparison_key = f"{strategy1}_vs_{strategy2}"
comparison[comparison_key] = {}
# Compare retrieval metrics
for metric in ['precision', 'recall', 'f1_score']:
if (metric in evaluation_results[strategy1]['retrieval_metrics'] and
metric in evaluation_results[strategy2]['retrieval_metrics']):
comparison[comparison_key][f"retrieval_{metric}"] = ab_analyzer.compare_metrics(
[evaluation_results[strategy1]['retrieval_metrics'][metric]],
[evaluation_results[strategy2]['retrieval_metrics'][metric]],
f"retrieval_{metric}"
)
return comparison
def _generate_recommendations(self, comparison_results: Dict) -> Dict:
"""Generate recommendations based on evaluation results"""
recommendations = {
'best_overall': None,
'best_for_precision': None,
'best_for_recall': None,
'best_for_performance': None,
'trade_offs': []
}
# This would analyze the comparison results and generate specific recommendations
# Implementation depends on specific use case requirements
return recommendations
def _generate_embeddings(self, all_chunks: Dict) -> Dict:
"""Generate embeddings for all chunks"""
# This would use the actual embedding model
# Placeholder implementation
embeddings = {}
for doc_id, chunks in all_chunks.items():
embeddings[doc_id] = []
for chunk in chunks:
# Generate embedding for chunk content
embedding = np.random.rand(384) # Placeholder
embeddings[doc_id].append({
'chunk': chunk,
'embedding': embedding
})
return embeddings
def _retrieve_chunks(self, query: str, chunk_embeddings: Dict, k: int = 10) -> List[Dict]:
"""Retrieve most relevant chunks for a query"""
# This would use actual similarity search
# Placeholder implementation
all_chunks = []
for doc_embeddings in chunk_embeddings.values():
for chunk_data in doc_embeddings:
all_chunks.append(chunk_data['chunk'])
# Simple random selection as placeholder
selected = random.sample(all_chunks, min(k, len(all_chunks)))
return selected
```
This comprehensive evaluation framework provides the tools needed to thoroughly assess chunking strategies across multiple dimensions: retrieval effectiveness, answer quality, system performance, and statistical significance. The modular design allows for easy extension and customization based on specific requirements and use cases.


@@ -0,0 +1,709 @@
# Complete Implementation Guidelines
This document provides comprehensive implementation guidance for building effective chunking systems.
## System Architecture
### Core Components
```
Document Processor
├── Ingestion Layer
│   ├── Document Type Detection
│   ├── Format Parsing (PDF, HTML, Markdown, etc.)
│   └── Content Extraction
├── Analysis Layer
│   ├── Structure Analysis
│   ├── Content Type Identification
│   └── Complexity Assessment
├── Strategy Selection Layer
│   ├── Rule-based Selection
│   ├── ML-based Prediction
│   └── Adaptive Configuration
├── Chunking Layer
│   ├── Strategy Implementation
│   ├── Parameter Optimization
│   └── Quality Validation
└── Output Layer
    ├── Chunk Metadata Generation
    ├── Embedding Integration
    └── Storage Preparation
```
## Pre-processing Pipeline
### Document Analysis Framework
```python
from dataclasses import dataclass
from typing import List, Dict, Any
import re
@dataclass
class DocumentAnalysis:
doc_type: str
structure_score: float # 0-1, higher means more structured
complexity_score: float # 0-1, higher means more complex
content_types: List[str]
language: str
estimated_tokens: int
has_multimodal: bool
class DocumentAnalyzer:
def __init__(self):
self.structure_patterns = {
'markdown': [r'^#+\s', r'^\*\*.*\*\*$', r'^\* ', r'^\d+\. '],
'html': [r'<h[1-6]>', r'<p>', r'<div>', r'<table>'],
'latex': [r'\\section', r'\\subsection', r'\\begin\{', r'\\end\{'],
'academic': [r'^\d+\.', r'^\d+\.\d+', r'^[A-Z]\.', r'^Figure \d+']
}
def analyze(self, content: str) -> DocumentAnalysis:
doc_type = self.detect_document_type(content)
structure_score = self.calculate_structure_score(content, doc_type)
complexity_score = self.calculate_complexity_score(content)
content_types = self.identify_content_types(content)
language = self.detect_language(content)
estimated_tokens = self.estimate_tokens(content)
has_multimodal = self.detect_multimodal_content(content)
return DocumentAnalysis(
doc_type=doc_type,
structure_score=structure_score,
complexity_score=complexity_score,
content_types=content_types,
language=language,
estimated_tokens=estimated_tokens,
has_multimodal=has_multimodal
)
def detect_document_type(self, content: str) -> str:
content_lower = content.lower()
if '<html' in content_lower or '<body' in content_lower:
return 'html'
elif '#' in content and '##' in content:
return 'markdown'
elif '\\documentclass' in content_lower or '\\begin{' in content_lower:
return 'latex'
elif any(keyword in content_lower for keyword in ['abstract', 'introduction', 'conclusion', 'references']):
return 'academic'
elif 'def ' in content or 'class ' in content or 'function ' in content_lower:
return 'code'
else:
return 'plain'
def calculate_structure_score(self, content: str, doc_type: str) -> float:
patterns = self.structure_patterns.get(doc_type, [])
if not patterns:
return 0.5 # Default for unstructured content
line_count = len(content.split('\n'))
structured_lines = 0
for line in content.split('\n'):
for pattern in patterns:
if re.search(pattern, line.strip()):
structured_lines += 1
break
return min(structured_lines / max(line_count, 1), 1.0)
def calculate_complexity_score(self, content: str) -> float:
# Factors that increase complexity
avg_sentence_length = self.calculate_avg_sentence_length(content)
vocabulary_richness = self.calculate_vocabulary_richness(content)
nested_structure = self.detect_nested_structure(content)
# Normalize and combine
complexity = (
min(avg_sentence_length / 30, 1.0) * 0.3 +
vocabulary_richness * 0.4 +
nested_structure * 0.3
)
return min(complexity, 1.0)
def identify_content_types(self, content: str) -> List[str]:
types = []
if '```' in content or 'def ' in content or 'function ' in content.lower():
types.append('code')
if '|' in content and '\n' in content:
types.append('tables')
if re.search(r'\!\[.*\]\(.*\)', content):
types.append('images')
if re.search(r'http[s]?://', content):
types.append('links')
if re.search(r'\d+\.\d+', content) or re.search(r'\$\d', content):
types.append('numbers')
return types if types else ['text']
def detect_language(self, content: str) -> str:
# Simple language detection - can be enhanced with proper language detection libraries
if re.search(r'[\u4e00-\u9fff]', content):
return 'chinese'
        elif re.search(r'[\u0600-\u06ff]', content):
            return 'arabic'
        elif re.search(r'[\u0400-\u04ff]', content):
return 'russian'
else:
return 'english' # Default assumption
def estimate_tokens(self, content: str) -> int:
# Rough estimation - actual tokenization varies by model
word_count = len(content.split())
return int(word_count * 1.3) # Average tokens per word
def detect_multimodal_content(self, content: str) -> bool:
multimodal_indicators = [
r'\!\[.*\]\(.*\)', # Images
r'<iframe', # Embedded content
r'<object', # Embedded objects
r'<embed', # Embedded media
]
return any(re.search(pattern, content) for pattern in multimodal_indicators)
def calculate_avg_sentence_length(self, content: str) -> float:
sentences = re.split(r'[.!?]+', content)
sentences = [s.strip() for s in sentences if s.strip()]
if not sentences:
return 0
return sum(len(s.split()) for s in sentences) / len(sentences)
def calculate_vocabulary_richness(self, content: str) -> float:
words = content.lower().split()
if not words:
return 0
unique_words = set(words)
return len(unique_words) / len(words)
def detect_nested_structure(self, content: str) -> float:
# Detect nested lists, indented content, etc.
lines = content.split('\n')
indented_lines = 0
for line in lines:
if line.strip() and line.startswith(' '):
indented_lines += 1
return indented_lines / max(len(lines), 1)
```
### Strategy Selection Engine
```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List
class ChunkingStrategy(ABC):
@abstractmethod
def chunk(self, content: str, analysis: DocumentAnalysis) -> List[Dict[str, Any]]:
pass
class StrategySelector:
def __init__(self):
self.strategies = {
'fixed_size': FixedSizeStrategy(),
'recursive': RecursiveStrategy(),
'structure_aware': StructureAwareStrategy(),
'semantic': SemanticStrategy(),
'adaptive': AdaptiveStrategy()
}
def select_strategy(self, analysis: DocumentAnalysis) -> str:
# Rule-based selection logic
if analysis.structure_score > 0.8 and analysis.doc_type in ['markdown', 'html', 'latex']:
return 'structure_aware'
elif analysis.complexity_score > 0.7 and analysis.estimated_tokens < 10000:
return 'semantic'
elif analysis.doc_type == 'code':
return 'structure_aware'
elif analysis.structure_score < 0.3:
return 'fixed_size'
elif analysis.complexity_score > 0.5:
return 'recursive'
else:
return 'adaptive'
def get_strategy(self, analysis: DocumentAnalysis) -> ChunkingStrategy:
strategy_name = self.select_strategy(analysis)
return self.strategies[strategy_name]
# Example strategy implementations
class FixedSizeStrategy(ChunkingStrategy):
def __init__(self, default_size=512, default_overlap=50):
self.default_size = default_size
self.default_overlap = default_overlap
def chunk(self, content: str, analysis: DocumentAnalysis) -> List[Dict[str, Any]]:
# Adjust parameters based on analysis
if analysis.complexity_score > 0.7:
chunk_size = 1024
elif analysis.complexity_score < 0.3:
chunk_size = 256
else:
chunk_size = self.default_size
overlap = int(chunk_size * 0.1) # 10% overlap
# Implementation here...
return self._fixed_size_chunk(content, chunk_size, overlap)
def _fixed_size_chunk(self, content: str, chunk_size: int, overlap: int) -> List[Dict[str, Any]]:
# Implementation using RecursiveCharacterTextSplitter or custom logic
pass
class AdaptiveStrategy(ChunkingStrategy):
    def chunk(self, content: str, analysis: DocumentAnalysis) -> List[Dict[str, Any]]:
        # Combine multiple strategies based on content characteristics
        structured_chunks: List[Dict[str, Any]] = []
        unstructured_chunks: List[Dict[str, Any]] = []
        if analysis.structure_score > 0.6:
            # Use structure-aware for structured parts
            structured_chunks = self._chunk_structured_parts(content, analysis)
        else:
            # Use fixed-size for unstructured parts
            unstructured_chunks = self._chunk_unstructured_parts(content, analysis)
        # Merge and optimize; one of the two lists stays empty
        return self._merge_chunks(structured_chunks + unstructured_chunks)
def _chunk_structured_parts(self, content: str, analysis: DocumentAnalysis) -> List[Dict[str, Any]]:
# Implementation for structured content
pass
def _chunk_unstructured_parts(self, content: str, analysis: DocumentAnalysis) -> List[Dict[str, Any]]:
# Implementation for unstructured content
pass
def _merge_chunks(self, chunks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
# Implementation for merging and optimizing chunks
pass
```
## Quality Assurance Framework
### Chunk Quality Metrics
```python
import re
from typing import List, Dict, Any
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
class ChunkQualityAssessor:
def __init__(self):
self.quality_weights = {
'coherence': 0.3,
'completeness': 0.25,
'size_appropriateness': 0.2,
'semantic_similarity': 0.15,
'boundary_quality': 0.1
}
def assess_chunks(self, chunks: List[Dict[str, Any]], analysis: DocumentAnalysis) -> Dict[str, float]:
scores = {}
# Coherence: Do chunks make sense on their own?
scores['coherence'] = self._assess_coherence(chunks)
# Completeness: Do chunks preserve important information?
scores['completeness'] = self._assess_completeness(chunks, analysis)
# Size appropriateness: Are chunks within optimal size range?
scores['size_appropriateness'] = self._assess_size(chunks)
# Semantic similarity: Are chunks thematically consistent?
scores['semantic_similarity'] = self._assess_semantic_consistency(chunks)
# Boundary quality: Are chunk boundaries placed well?
scores['boundary_quality'] = self._assess_boundary_quality(chunks)
# Calculate overall quality score
overall_score = sum(
score * self.quality_weights[metric]
for metric, score in scores.items()
)
scores['overall'] = overall_score
return scores
def _assess_coherence(self, chunks: List[Dict[str, Any]]) -> float:
# Simple heuristic-based coherence assessment
coherence_scores = []
for chunk in chunks:
content = chunk['content']
# Check for complete sentences
sentences = re.split(r'[.!?]+', content)
complete_sentences = sum(1 for s in sentences if s.strip())
coherence = complete_sentences / max(len(sentences), 1)
coherence_scores.append(coherence)
return np.mean(coherence_scores)
def _assess_completeness(self, chunks: List[Dict[str, Any]], analysis: DocumentAnalysis) -> float:
# Check if important structural elements are preserved
if analysis.doc_type in ['markdown', 'html']:
return self._assess_structure_preservation(chunks, analysis)
else:
return self._assess_content_preservation(chunks)
def _assess_structure_preservation(self, chunks: List[Dict[str, Any]], analysis: DocumentAnalysis) -> float:
# Check if headings, lists, and other structural elements are preserved
preserved_elements = 0
total_elements = 0
for chunk in chunks:
content = chunk['content']
# Count preserved structural elements
headings = len(re.findall(r'^#+\s', content, re.MULTILINE))
lists = len(re.findall(r'^\s*[-*+]\s', content, re.MULTILINE))
preserved_elements += headings + lists
total_elements += 1 # Simplified count
return preserved_elements / max(total_elements, 1)
def _assess_content_preservation(self, chunks: List[Dict[str, Any]]) -> float:
# Simple check based on content ratio
total_content = ''.join(chunk['content'] for chunk in chunks)
# This would need comparison with original content
return 0.8 # Placeholder
def _assess_size(self, chunks: List[Dict[str, Any]]) -> float:
optimal_min = 100 # tokens
optimal_max = 1000 # tokens
size_scores = []
for chunk in chunks:
token_count = self._estimate_tokens(chunk['content'])
if optimal_min <= token_count <= optimal_max:
score = 1.0
elif token_count < optimal_min:
score = token_count / optimal_min
else:
score = max(0, 1 - (token_count - optimal_max) / optimal_max)
size_scores.append(score)
return np.mean(size_scores)
def _assess_semantic_consistency(self, chunks: List[Dict[str, Any]]) -> float:
# This would require embedding models for actual implementation
# Placeholder implementation
return 0.7
def _assess_boundary_quality(self, chunks: List[Dict[str, Any]]) -> float:
# Check if boundaries don't split important content
boundary_scores = []
for i, chunk in enumerate(chunks):
content = chunk['content']
# Check for incomplete sentences at boundaries
if not content.strip().endswith(('.', '!', '?', '>', '}')):
boundary_scores.append(0.5)
else:
boundary_scores.append(1.0)
return np.mean(boundary_scores)
def _estimate_tokens(self, content: str) -> int:
# Simple token estimation
return len(content.split()) * 4 // 3 # Rough approximation
```
## Error Handling and Edge Cases
### Robust Error Handling
```python
import logging
from typing import Any, Dict, List, Optional
from dataclasses import dataclass
@dataclass
class ChunkingError:
error_type: str
message: str
chunk_index: Optional[int] = None
recovery_action: Optional[str] = None
class ChunkingErrorHandler:
def __init__(self):
self.logger = logging.getLogger(__name__)
self.error_handlers = {
'empty_content': self._handle_empty_content,
'oversized_chunk': self._handle_oversized_chunk,
'encoding_error': self._handle_encoding_error,
'memory_error': self._handle_memory_error,
'structure_parsing_error': self._handle_structure_parsing_error
}
def handle_error(self, error: Exception, context: Dict[str, Any]) -> ChunkingError:
error_type = self._classify_error(error)
handler = self.error_handlers.get(error_type, self._handle_generic_error)
return handler(error, context)
def _classify_error(self, error: Exception) -> str:
if isinstance(error, ValueError) and 'empty' in str(error).lower():
return 'empty_content'
elif isinstance(error, MemoryError):
return 'memory_error'
elif isinstance(error, UnicodeError):
return 'encoding_error'
elif 'too large' in str(error).lower():
return 'oversized_chunk'
elif 'parsing' in str(error).lower():
return 'structure_parsing_error'
else:
return 'generic_error'
def _handle_empty_content(self, error: Exception, context: Dict[str, Any]) -> ChunkingError:
self.logger.warning(f"Empty content encountered: {error}")
return ChunkingError(
error_type='empty_content',
message=str(error),
recovery_action='skip_empty_content'
)
def _handle_oversized_chunk(self, error: Exception, context: Dict[str, Any]) -> ChunkingError:
self.logger.warning(f"Oversized chunk detected: {error}")
return ChunkingError(
error_type='oversized_chunk',
message=str(error),
chunk_index=context.get('chunk_index'),
recovery_action='reduce_chunk_size'
)
def _handle_encoding_error(self, error: Exception, context: Dict[str, Any]) -> ChunkingError:
self.logger.error(f"Encoding error: {error}")
return ChunkingError(
error_type='encoding_error',
message=str(error),
recovery_action='fallback_encoding'
)
def _handle_memory_error(self, error: Exception, context: Dict[str, Any]) -> ChunkingError:
self.logger.error(f"Memory error during chunking: {error}")
return ChunkingError(
error_type='memory_error',
message=str(error),
recovery_action='process_in_batches'
)
def _handle_structure_parsing_error(self, error: Exception, context: Dict[str, Any]) -> ChunkingError:
self.logger.warning(f"Structure parsing failed: {error}")
return ChunkingError(
error_type='structure_parsing_error',
message=str(error),
recovery_action='fallback_to_fixed_size'
)
def _handle_generic_error(self, error: Exception, context: Dict[str, Any]) -> ChunkingError:
self.logger.error(f"Unexpected error during chunking: {error}")
return ChunkingError(
error_type='generic_error',
message=str(error),
recovery_action='skip_and_continue'
)
```
## Performance Optimization
### Caching and Memoization
```python
import hashlib
import json
import logging
import pickle
from typing import Any, Dict, List, Optional

import redis
class ChunkingCache:
def __init__(self, redis_url: Optional[str] = None):
if redis_url:
self.redis_client = redis.from_url(redis_url)
else:
self.redis_client = None
self.local_cache = {}
def _generate_cache_key(self, content: str, strategy: str, params: Dict[str, Any]) -> str:
content_hash = hashlib.md5(content.encode()).hexdigest()
params_str = json.dumps(params, sort_keys=True)
params_hash = hashlib.md5(params_str.encode()).hexdigest()
return f"chunking:{strategy}:{content_hash}:{params_hash}"
def get(self, content: str, strategy: str, params: Dict[str, Any]) -> Optional[List[Dict[str, Any]]]:
cache_key = self._generate_cache_key(content, strategy, params)
# Try local cache first
if cache_key in self.local_cache:
return self.local_cache[cache_key]
# Try Redis cache
if self.redis_client:
try:
cached_data = self.redis_client.get(cache_key)
if cached_data:
chunks = pickle.loads(cached_data)
self.local_cache[cache_key] = chunks # Cache locally too
return chunks
except Exception as e:
logging.warning(f"Redis cache error: {e}")
return None
def set(self, content: str, strategy: str, params: Dict[str, Any], chunks: List[Dict[str, Any]]) -> None:
cache_key = self._generate_cache_key(content, strategy, params)
# Store in local cache
self.local_cache[cache_key] = chunks
# Store in Redis cache
if self.redis_client:
try:
cached_data = pickle.dumps(chunks)
self.redis_client.setex(cache_key, 3600, cached_data) # 1 hour TTL
except Exception as e:
logging.warning(f"Redis cache set error: {e}")
def clear_local_cache(self):
self.local_cache.clear()
def clear_redis_cache(self):
if self.redis_client:
pattern = "chunking:*"
keys = self.redis_client.keys(pattern)
if keys:
self.redis_client.delete(*keys)
```
### Batch Processing
```python
import asyncio
import concurrent.futures
import logging
from typing import Any, Callable, Dict, List
class BatchChunkingProcessor:
def __init__(self, max_workers: int = 4, batch_size: int = 10):
self.max_workers = max_workers
self.batch_size = batch_size
def process_documents_batch(self, documents: List[str],
chunking_function: Callable[[str], List[Dict[str, Any]]]) -> List[List[Dict[str, Any]]]:
"""Process multiple documents in parallel"""
results = []
# Process in batches to avoid memory issues
for i in range(0, len(documents), self.batch_size):
batch = documents[i:i + self.batch_size]
with concurrent.futures.ThreadPoolExecutor(max_workers=self.max_workers) as executor:
future_to_doc = {
executor.submit(chunking_function, doc): doc
for doc in batch
}
batch_results = []
for future in concurrent.futures.as_completed(future_to_doc):
try:
chunks = future.result()
batch_results.append(chunks)
except Exception as e:
logging.error(f"Error processing document: {e}")
batch_results.append([]) # Empty result for failed processing
results.extend(batch_results)
return results
async def process_documents_async(self, documents: List[str],
chunking_function: Callable[[str], List[Dict[str, Any]]]) -> List[List[Dict[str, Any]]]:
"""Process documents asynchronously"""
semaphore = asyncio.Semaphore(self.max_workers)
async def process_single_document(doc: str) -> List[Dict[str, Any]]:
async with semaphore:
# Run the synchronous chunking function in an executor
loop = asyncio.get_event_loop()
return await loop.run_in_executor(None, chunking_function, doc)
tasks = [process_single_document(doc) for doc in documents]
return await asyncio.gather(*tasks, return_exceptions=True)
```
## Monitoring and Observability
### Metrics Collection
```python
import time
from dataclasses import dataclass
from typing import Dict, Any, List
from collections import defaultdict
@dataclass
class ChunkingMetrics:
total_documents: int
total_chunks: int
avg_chunk_size: float
processing_time: float
memory_usage: float
error_count: int
strategy_distribution: Dict[str, int]
class MetricsCollector:
def __init__(self):
self.metrics = defaultdict(list)
self.start_time = None
def start_timing(self):
self.start_time = time.time()
def end_timing(self) -> float:
if self.start_time:
duration = time.time() - self.start_time
self.metrics['processing_time'].append(duration)
self.start_time = None
return duration
return 0.0
def record_chunk_count(self, count: int):
self.metrics['chunk_count'].append(count)
def record_chunk_size(self, size: int):
self.metrics['chunk_size'].append(size)
    def record_strategy_usage(self, strategy: str):
        # self.metrics defaults to lists, so keep strategy counts in a plain dict
        counts = self.metrics.setdefault('strategy', {})
        counts[strategy] = counts.get(strategy, 0) + 1
def record_error(self, error_type: str):
self.metrics['errors'].append(error_type)
def record_memory_usage(self, memory_mb: float):
self.metrics['memory_usage'].append(memory_mb)
def get_summary(self) -> ChunkingMetrics:
return ChunkingMetrics(
total_documents=len(self.metrics['processing_time']),
total_chunks=sum(self.metrics['chunk_count']),
avg_chunk_size=sum(self.metrics['chunk_size']) / max(len(self.metrics['chunk_size']), 1),
processing_time=sum(self.metrics['processing_time']),
memory_usage=sum(self.metrics['memory_usage']) / max(len(self.metrics['memory_usage']), 1),
error_count=len(self.metrics['errors']),
strategy_distribution=dict(self.metrics['strategy'])
)
def reset(self):
self.metrics.clear()
self.start_time = None
```
This implementation guide provides a comprehensive foundation for building robust, scalable chunking systems that can handle various document types and use cases while maintaining high quality and performance.


@@ -0,0 +1,366 @@
# Key Research Papers and Findings
This document summarizes important research papers and findings related to chunking strategies for RAG systems.
## Seminal Papers
### "Reconstructing Context: Evaluating Advanced Chunking Strategies for RAG" (arXiv:2504.19754)
**Key Findings**:
- Page-level chunking achieved highest average accuracy (0.648) with lowest variance across different query types
- Optimal chunk size varies significantly by document type and query complexity
- Factoid queries perform better with smaller chunks (256-512 tokens)
- Complex analytical queries benefit from larger chunks (1024+ tokens)
**Methodology**:
- Evaluated 7 different chunking strategies across multiple document types
- Tested with both factoid and analytical queries
- Measured end-to-end RAG performance
**Practical Implications**:
- Start with page-level chunking for general-purpose RAG systems
- Adapt chunk size based on expected query patterns
- Consider hybrid approaches for mixed query types
### "Lost in the Middle: How Language Models Use Long Contexts"
**Key Findings**:
- Language models tend to pay more attention to information at the beginning and end of context
- Information in the middle of long contexts is often ignored
- Performance degradation is most severe for centrally located information
**Practical Implications**:
- Place most important information at chunk boundaries
- Consider chunk overlap to ensure important context appears multiple times
- Use ranking to prioritize relevant chunks for inclusion in context
### "Grounded Language Learning in a Simulated 3D World"
**Related Concepts**:
- Importance of grounding text in visual/contextual information
- Multi-modal learning approaches for better understanding
**Relevance to Chunking**:
- Supports contextual chunking approaches that preserve visual/contextual relationships
- Validates importance of maintaining document structure and relationships
## Industry Research
### NVIDIA Research: "Finding the Best Chunking Strategy for Accurate AI Responses"
**Key Findings**:
- Page-level chunking outperformed sentence and paragraph-level approaches
- Fixed-size chunking showed consistent but suboptimal performance
- Semantic chunking provided improvements for complex documents
**Technical Details**:
- Tested chunk sizes from 128 to 2048 tokens
- Evaluated across financial, technical, and legal documents
- Measured both retrieval accuracy and generation quality
**Recommendations**:
- Use 512-1024 token chunks as starting point
- Implement adaptive chunking based on document complexity
- Consider page boundaries as natural chunk separators
### Cohere Research: "Effective Chunking Strategies for RAG"
**Key Findings**:
- Recursive character splitting provides a good balance of performance and simplicity
- Document structure awareness improves retrieval by 15-20%
- Overlap of 10-20% provides optimal context preservation
**Methodology**:
- Compared 12 chunking strategies across 6 document types
- Measured retrieval precision, recall, and F1-score
- Tested with both dense and sparse retrieval
**Best Practices Identified**:
- Start with recursive character splitting with 10-20% overlap
- Preserve document structure (headings, lists, tables)
- Customize chunk size based on embedding model context window
### Anthropic: "Contextual Retrieval"
**Key Innovation**:
- Enhance each chunk with LLM-generated contextual information before embedding
- Improves retrieval precision by 25-30% for complex documents
- Particularly effective for technical and academic content
**Implementation Approach**:
1. Split document using traditional methods
2. For each chunk, generate contextual information using LLM
3. Prepend context to chunk before embedding
4. Use hybrid search (dense + sparse) with weighted ranking
**Trade-offs**:
- Significant computational overhead (2-3x processing time)
- Higher embedding storage requirements
- Improved retrieval precision justifies cost for high-value applications
## Algorithmic Advances
### Semantic Chunking Algorithms
#### "Semantic Segmentation of Text Documents"
**Core Idea**: Use cosine similarity between consecutive sentence embeddings to identify natural boundaries.
**Algorithm**:
1. Split document into sentences
2. Generate embeddings for each sentence
3. Calculate similarity between consecutive sentences
4. Create boundaries where similarity drops below threshold
5. Merge short segments with neighbors
**Performance**: 20-30% improvement in retrieval relevance over fixed-size chunking for technical documents.
#### "Hierarchical Semantic Chunking"
**Core Idea**: Multi-level semantic segmentation for document organization.
**Algorithm**:
1. Document-level semantic analysis
2. Section-level boundary detection
3. Paragraph-level segmentation
4. Sentence-level refinement
**Benefits**: Maintains document hierarchy while adapting to semantic structure.
### Advanced Embedding Techniques
#### "Late Chunking: Contextual Chunk Embeddings"
**Core Innovation**: Generate embeddings for entire document first, then create chunk embeddings from token-level embeddings.
**Advantages**:
- Preserves global document context
- Reduces context fragmentation
- Better for documents with complex inter-relationships
**Requirements**:
- Long-context embedding models (8k+ tokens)
- Significant computational resources
- Specialized implementation
#### "Hierarchical Embedding Retrieval"
**Approach**: Create embeddings at multiple granularities (document, section, paragraph, sentence).
**Implementation**:
1. Generate embeddings at each level
2. Store in hierarchical vector database
3. Query at appropriate granularity based on information needs (see the sketch below)
**Performance**: 15-25% improvement in precision for complex queries.
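A minimal sketch of the multi-granularity idea using `sentence-transformers` and an in-memory index; the level definitions and query routing are illustrative assumptions rather than a prescribed design:
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def build_hierarchical_index(document):
    """Embed the same document at document, section, and sentence granularity."""
    levels = {
        "document": [document],
        "section": [s for s in document.split("\n\n") if s.strip()],
        "sentence": [s.strip() for s in document.split(".") if s.strip()],
    }
    return {name: (units, model.encode(units)) for name, units in levels.items()}

def query_hierarchical_index(index, query, level="section", top_k=3):
    """Search at whichever granularity matches the information need."""
    units, embeddings = index[level]
    scores = util.cos_sim(model.encode(query), embeddings)[0]
    best = scores.argsort(descending=True)[:top_k]
    return [(units[int(i)], float(scores[int(i)])) for i in best]
```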
## Evaluation Methodologies
### Retrieval-Augmented Generation Assessment Frameworks
#### RAGAS Framework
**Metrics**:
- **Faithfulness**: Consistency between generated answer and retrieved context
- **Answer Relevancy**: Relevance of generated answer to the question
- **Context Relevancy**: Relevance of retrieved context to the question
- **Context Recall**: Coverage of relevant information in retrieved context
**Evaluation Process**:
1. Generate questions from document corpus
2. Retrieve relevant chunks using different strategies
3. Generate answers using retrieved chunks
4. Evaluate using automated metrics and human judgment
#### ARES Framework
**Innovation**: Automated evaluation using synthetic questions and LLM-based assessment.
**Key Features**:
- Generates diverse question types (factoid, analytical, comparative)
- Uses LLMs to evaluate answer quality
- Provides scalable evaluation without human annotation
### Benchmark Datasets
#### Natural Questions (NQ)
**Description**: Real user questions from Google Search with relevant Wikipedia passages.
**Relevance**: Natural language queries with authentic relevance judgments.
#### MS MARCO
**Description**: Large-scale passage ranking dataset with real search queries.
**Relevance**: High-quality relevance judgments for passage retrieval.
#### HotpotQA
**Description**: Multi-hop question answering requiring information from multiple documents.
**Relevance**: Tests ability to retrieve and synthesize information from multiple chunks.
## Domain-Specific Research
### Medical Documents
#### "Optimal Chunking for Medical Question Answering"
**Key Findings**:
- Medical terminology requires specialized handling
- Section-based chunking (History, Diagnosis, Treatment) most effective
- Preserving doctor-patient dialogue context crucial
**Recommendations**:
- Use medical-specific tokenizers
- Preserve section headers and structure (sketched below)
- Maintain temporal relationships in medical histories
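A minimal sketch of section-based chunking for clinical notes; the header list and regex are illustrative assumptions, since section labels vary by institution:
```python
import re

SECTION_HEADERS = ["History", "Examination", "Diagnosis", "Treatment", "Plan"]
SECTION_PATTERN = re.compile(
    r"^(%s)\s*:" % "|".join(SECTION_HEADERS), re.IGNORECASE | re.MULTILINE
)

def chunk_clinical_note(note):
    """Split a note at section headers, keeping each header with its body."""
    matches = list(SECTION_PATTERN.finditer(note))
    if not matches:
        return [{"section": "Unlabeled", "content": note.strip()}]
    chunks = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        chunks.append({
            "section": match.group(1).title(),
            "content": note[match.start():end].strip(),  # header stays in the chunk
        })
    return chunks
```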
### Legal Documents
#### "Chunking Strategies for Legal Document Analysis"
**Key Findings**:
- Legal citations and cross-references require special handling
- Contract clause boundaries serve as natural chunk separators
- Case law benefits from hierarchical chunking
**Best Practices**:
- Preserve legal citation structure
- Use clause and section boundaries (sketched below)
- Maintain context for legal definitions and references
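A minimal sketch of clause-boundary splitting for contracts; the numbering pattern is an illustrative assumption and would need adjustment for real filings and case law:
```python
import re

# Matches headings such as "Section 4", "Article 2.1", or bare "3.2" at line start
CLAUSE_PATTERN = re.compile(
    r"^\s*(?:Section|Article|Clause)?\s*\d+(?:\.\d+)*[.)]?\s", re.MULTILINE
)

def chunk_contract(text):
    """Split a contract at clause boundaries, keeping numbering with the clause text."""
    starts = [m.start() for m in CLAUSE_PATTERN.finditer(text)]
    if not starts:
        return [text.strip()]
    if starts[0] > 0:
        starts.insert(0, 0)  # keep any preamble before the first numbered clause
    starts.append(len(text))
    chunks = [text[starts[i]:starts[i + 1]].strip() for i in range(len(starts) - 1)]
    return [c for c in chunks if c]
```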
### Financial Documents
#### "SEC Filing Chunking for Financial Analysis"
**Key Findings**:
- Table preservation critical for financial data
- XBRL tagging provides natural segmentation
- Risk factors sections benefit from specialized treatment
**Approach**:
- Preserve complete tables when possible
- Use XBRL tags for structured data
- Create specialized chunks for risk sections
## Emerging Trends
### Multi-Modal Chunking
#### "Integrating Text, Tables, and Images in RAG Systems"
**Innovation**: Unified chunking approach for mixed-modal content.
**Approach**:
- Extract and describe images using vision models
- Preserve table structure and relationships
- Create unified embeddings for mixed content
**Results**: 35% improvement in complex document understanding.
### Adaptive Chunking
#### "Machine Learning-Based Chunk Size Optimization"
**Core Idea**: Use ML models to predict optimal chunking parameters.
**Features**:
- Document length and complexity
- Query type distribution
- Embedding model characteristics
- Performance requirements
**Benefits**: Dynamic optimization based on use case and content.
### Real-time Chunking
#### "Streaming Chunking for Live Document Processing"
**Innovation**: Process documents as they become available.
**Techniques**:
- Incremental boundary detection (sketched below)
- Dynamic chunk size adjustment
- Context preservation across chunks
**Applications**: Live news feeds, social media analysis, meeting transcripts.
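A minimal sketch of incremental chunking over a stream of text pieces (for example transcript segments); the sentence-boundary regex and size threshold are illustrative assumptions:
```python
import re

def stream_chunks(text_stream, max_chars=1000):
    """Yield chunks as text arrives, cutting only at sentence boundaries."""
    buffer = ""
    for piece in text_stream:
        buffer += piece
        while len(buffer) >= max_chars:
            sentences = re.split(r"(?<=[.!?])\s+", buffer)
            if len(sentences) < 2:
                break  # no boundary yet; wait for more text
            chunk, buffer = " ".join(sentences[:-1]), sentences[-1]
            yield chunk
    if buffer.strip():
        yield buffer.strip()
```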
## Implementation Challenges
### Computational Efficiency
#### "Scalable Chunking for Large Document Collections"
**Challenges**:
- Processing millions of documents efficiently
- Memory usage optimization
- Distributed processing requirements
**Solutions**:
- Batch processing with parallel execution
- Streaming approaches for large documents
- Distributed chunking with load balancing
### Quality Assurance
#### "Evaluating Chunk Quality at Scale"
**Challenges**:
- Automated quality assessment
- Detecting poor chunk boundaries
- Maintaining consistency across document types
**Approaches**:
- Heuristic-based quality metrics (sketched below)
- LLM-based evaluation
- Human-in-the-loop validation
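A minimal sketch of cheap heuristic checks that can run over millions of chunks before more expensive LLM-based or human review; the thresholds are illustrative assumptions:
```python
def chunk_quality_report(chunk, min_chars=200, max_chars=2000):
    """Flag common symptoms of poor chunk boundaries."""
    text = chunk.strip()
    issues = []
    if len(text) < min_chars:
        issues.append("too_short")
    if len(text) > max_chars:
        issues.append("too_long")
    if text and text[0].islower():
        issues.append("starts_mid_sentence")
    if text and text[-1] not in ".!?\"')":
        issues.append("ends_mid_sentence")
    return {"ok": not issues, "issues": issues}

# Aggregate over any list of chunk strings to spot systematic boundary problems
def corpus_failure_rate(chunks):
    reports = [chunk_quality_report(c) for c in chunks]
    return sum(not r["ok"] for r in reports) / max(len(reports), 1)
```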
## Future Research Directions
### Context-Aware Chunking
**Open Questions**:
- How to optimally preserve cross-chunk relationships?
- Can we predict chunk quality without human evaluation?
- What is the optimal balance between size and context?
### Domain Adaptation
**Research Areas**:
- Automatic domain detection and adaptation
- Transfer learning across domains
- Zero-shot chunking for new document types
### Evaluation Standards
**Needs**:
- Standardized evaluation benchmarks
- Cross-paper comparison methodologies
- Real-world performance metrics
## Practical Recommendations Based on Research
### Starting Points
1. **For General RAG Systems**: Page-level or recursive character chunking with 512-1024 tokens and 10-20% overlap
2. **For Technical Documents**: Structure-aware chunking with semantic boundary detection
3. **For High-Value Applications**: Contextual retrieval with LLM-generated context
### Evolution Strategy
1. **Begin**: Simple fixed-size chunking (512 tokens)
2. **Improve**: Add document structure awareness
3. **Optimize**: Implement semantic boundaries
4. **Advanced**: Consider contextual retrieval for critical use cases
### Key Success Factors
1. **Match strategy to document type and query patterns**
2. **Preserve document structure when beneficial**
3. **Use overlap to maintain context across boundaries**
4. **Monitor both accuracy and computational costs**
5. **Iterate based on specific use case requirements**
This research foundation provides evidence-based guidance for implementing effective chunking strategies across various domains and use cases.

File diff suppressed because it is too large

View File

@@ -0,0 +1,423 @@
# Detailed Chunking Strategies
This document provides comprehensive implementation details for all chunking strategies mentioned in the main skill.
## Level 1: Fixed-Size Chunking
### Implementation
```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
class FixedSizeChunker:
def __init__(self, chunk_size=512, chunk_overlap=50):
self.chunk_size = chunk_size
self.chunk_overlap = chunk_overlap
self.splitter = RecursiveCharacterTextSplitter(
chunk_size=chunk_size,
chunk_overlap=chunk_overlap,
length_function=len,
separators=["\n\n", "\n", " ", ""]
)
def chunk(self, documents):
return self.splitter.split_documents(documents)
```
### Parameter Recommendations
| Use Case | Chunk Size | Overlap | Rationale |
|----------|------------|---------|-----------|
| Factoid Queries | 256 | 25 | Small chunks for precise answers |
| General Q&A | 512 | 50 | Balanced approach for most cases |
| Analytical Queries | 1024 | 100 | Larger context for complex analysis |
| Code Documentation | 300 | 30 | Preserve code context while maintaining focus |
### Best Practices
- Start with 512 tokens and 10-20% overlap
- Adjust based on embedding model context window
- Use overlap for queries where context might span boundaries
- Monitor token count vs. character count based on model
## Level 2: Recursive Character Chunking
### Implementation
```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
class RecursiveChunker:
def __init__(self, chunk_size=512, separators=None):
self.chunk_size = chunk_size
self.separators = separators or ["\n\n", "\n", " ", ""]
self.splitter = RecursiveCharacterTextSplitter(
chunk_size=chunk_size,
chunk_overlap=0,
length_function=len,
separators=self.separators
)
def chunk(self, text):
return self.splitter.create_documents([text])
# Document-specific configurations
def get_chunker_for_document_type(doc_type):
configurations = {
"markdown": ["\n## ", "\n### ", "\n\n", "\n", " ", ""],
"html": ["</div>", "</p>", "\n\n", "\n", " ", ""],
"code": ["\n\n", "\n", " ", ""],
"plain": ["\n\n", "\n", " ", ""]
}
return RecursiveChunker(separators=configurations.get(doc_type, ["\n\n", "\n", " ", ""]))
```
### Customization Guidelines
- **Markdown**: Use headings as primary separators
- **HTML**: Use block-level tags as separators
- **Code**: Preserve function and class boundaries
- **Academic papers**: Prioritize paragraph and section breaks
## Level 3: Structure-Aware Chunking
### Markdown Documents
```python
import markdown
from bs4 import BeautifulSoup
class MarkdownChunker:
def __init__(self, max_chunk_size=512):
self.max_chunk_size = max_chunk_size
def chunk(self, markdown_text):
html = markdown.markdown(markdown_text)
soup = BeautifulSoup(html, 'html.parser')
chunks = []
current_chunk = ""
current_heading = "Introduction"
for element in soup.find_all(['h1', 'h2', 'h3', 'p', 'pre', 'table']):
if element.name.startswith('h'):
if current_chunk.strip():
chunks.append({
"content": current_chunk.strip(),
"heading": current_heading
})
current_heading = element.get_text().strip()
current_chunk = f"{element}\n"
elif element.name in ['pre', 'table']:
# Preserve code blocks and tables intact
if len(current_chunk) + len(str(element)) > self.max_chunk_size:
if current_chunk.strip():
chunks.append({
"content": current_chunk.strip(),
"heading": current_heading
})
current_chunk = f"{element}\n"
else:
current_chunk += f"{element}\n"
else:
current_chunk += str(element)
if current_chunk.strip():
chunks.append({
"content": current_chunk.strip(),
"heading": current_heading
})
return chunks
```
### Code Documents
```python
import ast
import re
class CodeChunker:
def __init__(self, language='python'):
self.language = language
def chunk_python(self, code):
tree = ast.parse(code)
chunks = []
for node in ast.walk(tree):
if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
start_line = node.lineno - 1
end_line = node.end_lineno if hasattr(node, 'end_lineno') else start_line + 10
lines = code.split('\n')
chunk_lines = lines[start_line:end_line]
chunks.append('\n'.join(chunk_lines))
return chunks
def chunk_javascript(self, code):
# Use regex for languages without AST parsers
function_pattern = r'(function\s+\w+\s*\([^)]*\)\s*\{[^}]*\})'
class_pattern = r'(class\s+\w+\s*\{[^}]*\})'
patterns = [function_pattern, class_pattern]
chunks = []
for pattern in patterns:
matches = re.finditer(pattern, code, re.MULTILINE | re.DOTALL)
for match in matches:
chunks.append(match.group(1))
return chunks
def chunk(self, code):
if self.language == 'python':
return self.chunk_python(code)
elif self.language == 'javascript':
return self.chunk_javascript(code)
else:
# Fallback to line-based chunking
return self.chunk_by_lines(code)
def chunk_by_lines(self, code, max_lines=50):
lines = code.split('\n')
chunks = []
for i in range(0, len(lines), max_lines):
chunk = '\n'.join(lines[i:i+max_lines])
chunks.append(chunk)
return chunks
```
### Tabular Data
```python
import pandas as pd
from io import StringIO
class TableChunker:
def __init__(self, max_rows=100, summary_rows=5):
self.max_rows = max_rows
self.summary_rows = summary_rows
def chunk(self, table_data):
if isinstance(table_data, str):
df = pd.read_csv(StringIO(table_data))
else:
df = table_data
chunks = []
if len(df) <= self.max_rows:
# Small table - keep intact
chunks.append({
"type": "full_table",
"content": df.to_string(),
"metadata": {
"rows": len(df),
"columns": len(df.columns)
}
})
else:
# Large table - create summary + chunks
summary = df.head(self.summary_rows)
chunks.append({
"type": "table_summary",
"content": f"Table Summary ({len(df)} rows, {len(df.columns)} columns):\n{summary.to_string()}",
"metadata": {
"total_rows": len(df),
"summary_rows": self.summary_rows,
"columns": list(df.columns)
}
})
# Chunk the remaining data
for i in range(self.summary_rows, len(df), self.max_rows):
chunk_df = df.iloc[i:i+self.max_rows]
chunks.append({
"type": "table_chunk",
"content": f"Rows {i+1}-{min(i+self.max_rows, len(df))}:\n{chunk_df.to_string()}",
"metadata": {
"start_row": i + 1,
"end_row": min(i + self.max_rows, len(df)),
"columns": list(df.columns)
}
})
return chunks
```
## Level 4: Semantic Chunking
### Implementation
```python
import re
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
class SemanticChunker:
def __init__(self, model_name="all-MiniLM-L6-v2", similarity_threshold=0.8, buffer_size=3):
self.model = SentenceTransformer(model_name)
self.similarity_threshold = similarity_threshold
self.buffer_size = buffer_size
def split_into_sentences(self, text):
# Simple sentence splitting - can be enhanced with nltk/spacy
sentences = re.split(r'[.!?]+', text)
return [s.strip() for s in sentences if s.strip()]
def chunk(self, text):
sentences = self.split_into_sentences(text)
if len(sentences) <= self.buffer_size:
return [text]
# Create embeddings
embeddings = self.model.encode(sentences)
chunks = []
current_chunk_sentences = []
for i in range(len(sentences)):
current_chunk_sentences.append(sentences[i])
# Check if we should create a boundary
if i < len(sentences) - 1:
similarity = cosine_similarity(
[embeddings[i]],
[embeddings[i + 1]]
)[0][0]
if similarity < self.similarity_threshold and len(current_chunk_sentences) >= 2:
chunks.append(' '.join(current_chunk_sentences))
current_chunk_sentences = []
# Add remaining sentences
if current_chunk_sentences:
chunks.append(' '.join(current_chunk_sentences))
return chunks
```
### Parameter Tuning
| Parameter | Range | Effect |
|-----------|-------|--------|
| similarity_threshold | 0.5-0.9 | Higher values create more chunks |
| buffer_size | 1-10 | Larger buffers provide more context |
| model_name | Various | Different models for different domains |
### Optimization Tips
- Use domain-specific models for specialized content
- Adjust threshold based on content complexity
- Cache embeddings for repeated processing
- Consider batch processing for large documents
## Level 5: Advanced Contextual Methods
### Late Chunking
```python
import torch
from transformers import AutoTokenizer, AutoModel
class LateChunker:
def __init__(self, model_name="microsoft/DialoGPT-medium"):
self.tokenizer = AutoTokenizer.from_pretrained(model_name)
self.model = AutoModel.from_pretrained(model_name)
def chunk(self, text, chunk_size=512):
# Tokenize entire document
tokens = self.tokenizer(text, return_tensors="pt", truncation=False)
# Get token-level embeddings
with torch.no_grad():
outputs = self.model(**tokens, output_hidden_states=True)
token_embeddings = outputs.last_hidden_state[0]
# Create chunk embeddings from token embeddings
chunks = []
for i in range(0, len(token_embeddings), chunk_size):
chunk_tokens = token_embeddings[i:i+chunk_size]
chunk_embedding = torch.mean(chunk_tokens, dim=0)
chunks.append({
"content": self.tokenizer.decode(tokens["input_ids"][0][i:i+chunk_size]),
"embedding": chunk_embedding.numpy()
})
return chunks
```
### Contextual Retrieval
```python
import openai
class ContextualChunker:
def __init__(self, api_key):
self.client = openai.OpenAI(api_key=api_key)
def generate_context(self, chunk, full_document):
prompt = f"""
Given the following document and a chunk from it, provide a brief context
that helps understand the chunk's meaning within the full document.
Document:
{full_document[:2000]}...
Chunk:
{chunk}
Context (max 50 words):
"""
response = self.client.chat.completions.create(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": prompt}],
max_tokens=100,
temperature=0
)
return response.choices[0].message.content.strip()
def chunk_with_context(self, text, base_chunker):
# First create base chunks
base_chunks = base_chunker.chunk(text)
# Then add context to each chunk
contextualized_chunks = []
for chunk in base_chunks:
context = self.generate_context(chunk.page_content, text)
contextualized_content = f"Context: {context}\n\nContent: {chunk.page_content}"
contextualized_chunks.append({
"content": contextualized_content,
"original_content": chunk.page_content,
"context": context
})
return contextualized_chunks
```
## Performance Considerations
### Computational Cost Analysis
| Strategy | Time Complexity | Space Complexity | Relative Cost |
|----------|-----------------|------------------|---------------|
| Fixed-Size | O(n) | O(n) | Low |
| Recursive | O(n) | O(n) | Low |
| Structure-Aware | O(n log n) | O(n) | Medium |
| Semantic | O(n²) | O(n²) | High |
| Late Chunking | O(n) | O(n) | Very High |
| Contextual | O(n²) | O(n²) | Very High |
### Optimization Strategies
1. **Parallel Processing**: Process chunks concurrently when possible
2. **Caching**: Store embeddings and intermediate results
3. **Batch Operations**: Group similar operations together
4. **Progressive Loading**: Process large documents in streaming fashion
5. **Model Selection**: Choose appropriate models for task complexity

View File

@@ -0,0 +1,867 @@
# Recommended Libraries and Frameworks
This document provides a comprehensive guide to tools, libraries, and frameworks for implementing chunking strategies.
## Core Chunking Libraries
### LangChain
**Overview**: Comprehensive framework for building applications with large language models, includes robust text splitting utilities.
**Installation**:
```bash
pip install langchain langchain-text-splitters
```
**Key Features**:
- Multiple text splitting strategies
- Integration with various document loaders
- Support for different content types (code, markdown, etc.)
- Customizable separators and parameters
**Example Usage**:
```python
from langchain.text_splitter import (
RecursiveCharacterTextSplitter,
CharacterTextSplitter,
TokenTextSplitter,
MarkdownTextSplitter,
PythonCodeTextSplitter
)
# Basic recursive splitting
splitter = RecursiveCharacterTextSplitter(
chunk_size=1000,
chunk_overlap=200,
length_function=len,
separators=["\n\n", "\n", " ", ""]
)
chunks = splitter.split_text(large_text)
# Markdown-specific splitting
markdown_splitter = MarkdownTextSplitter(
chunk_size=1000,
chunk_overlap=100
)
# Code-specific splitting
code_splitter = PythonCodeTextSplitter(
chunk_size=1000,
chunk_overlap=100
)
```
**Pros**:
- Well-maintained and actively developed
- Extensive documentation and examples
- Integrates well with other LangChain components
- Supports multiple document types
**Cons**:
- Can be heavy dependency for simple use cases
- Some advanced features require LangChain ecosystem
### LlamaIndex
**Overview**: Data framework for LLM applications with advanced indexing and retrieval capabilities.
**Installation**:
```bash
pip install llama-index
```
**Key Features**:
- Advanced semantic chunking
- Hierarchical indexing
- Context-aware retrieval
- Integration with vector databases
**Example Usage**:
```python
from llama_index.core.node_parser import (
SentenceSplitter,
SemanticSplitterNodeParser
)
from llama_index.core import SimpleDirectoryReader
from llama_index.embeddings.openai import OpenAIEmbedding
# Basic sentence splitting
splitter = SentenceSplitter(
chunk_size=1024,
chunk_overlap=20
)
# Semantic chunking with embeddings
embed_model = OpenAIEmbedding()
semantic_splitter = SemanticSplitterNodeParser(
buffer_size=1,
breakpoint_percentile_threshold=95,
embed_model=embed_model
)
# Load and process documents
documents = SimpleDirectoryReader("./data").load_data()
nodes = semantic_splitter.get_nodes_from_documents(documents)
```
**Pros**:
- Excellent semantic chunking capabilities
- Built for production RAG systems
- Strong vector database integration
- Active community support
**Cons**:
- More complex setup for basic use cases
- Semantic chunking requires embedding model setup
### Unstructured
**Overview**: Open-source library for processing unstructured documents, especially strong with multi-modal content.
**Installation**:
```bash
pip install "unstructured[pdf,png,jpg]"
```
**Key Features**:
- Multi-modal document processing
- Support for PDFs, images, and various formats
- Structure preservation
- Table extraction and processing
**Example Usage**:
```python
from unstructured.partition.auto import partition
from unstructured.chunking.title import chunk_by_title
# Partition document by type
elements = partition(filename="document.pdf")
# Chunk by title/heading structure
chunks = chunk_by_title(
elements,
combine_text_under_n_chars=2000,
max_characters=10000,
new_after_n_chars=1500,
multipage_sections=True
)
# Access chunked content
for chunk in chunks:
print(f"Category: {chunk.category}")
print(f"Content: {chunk.text[:200]}...")
```
**Pros**:
- Excellent for PDF and image processing
- Preserves document structure
- Handles tables and figures well
- Strong multi-modal capabilities
**Cons**:
- Can be slower for large documents
- Requires additional dependencies for some formats
## Text Processing Libraries
### NLTK (Natural Language Toolkit)
**Installation**:
```bash
pip install nltk
```
**Key Features**:
- Sentence tokenization
- Language detection
- Text preprocessing
- Linguistic analysis
**Example Usage**:
```python
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
# Download required data
nltk.download('punkt')
nltk.download('stopwords')
# Sentence and word tokenization
text = "This is a sample sentence. This is another sentence."
sentences = sent_tokenize(text)
words = word_tokenize(text)
# Stop words removal
stop_words = set(stopwords.words('english'))
filtered_words = [word for word in words if word.lower() not in stop_words]
```
### spaCy
**Installation**:
```bash
pip install spacy
python -m spacy download en_core_web_sm
```
**Key Features**:
- Industrial-strength NLP
- Named entity recognition
- Dependency parsing
- Sentence boundary detection
**Example Usage**:
```python
import spacy
# Load language model
nlp = spacy.load("en_core_web_sm")
# Process text
doc = nlp("This is a sample sentence. This is another sentence.")
# Extract sentences
sentences = [sent.text for sent in doc.sents]
# Named entities
entities = [(ent.text, ent.label_) for ent in doc.ents]
# Dependency parsing for better chunking
for token in doc:
print(f"{token.text}: {token.dep_} (head: {token.head.text})")
```
### Sentence Transformers
**Installation**:
```bash
pip install sentence-transformers
```
**Key Features**:
- Pre-trained sentence embeddings
- Semantic similarity calculation
- Multi-lingual support
- Custom model training
**Example Usage**:
```python
from sentence_transformers import SentenceTransformer, util
import numpy as np
# Load pre-trained model
model = SentenceTransformer('all-MiniLM-L6-v2')
# Generate embeddings
sentences = ["This is a sentence.", "This is another sentence."]
embeddings = model.encode(sentences)
# Calculate semantic similarity
similarity = util.cos_sim(embeddings[0], embeddings[1])
# Find semantic boundaries for chunking
def find_semantic_boundaries(text, model, threshold=0.8):
sentences = [s.strip() for s in text.split('.') if s.strip()]
embeddings = model.encode(sentences)
boundaries = [0]
for i in range(1, len(sentences)):
similarity = util.cos_sim(embeddings[i-1], embeddings[i])
if similarity < threshold:
boundaries.append(i)
return boundaries
```
## Vector Databases and Search
### ChromaDB
**Installation**:
```bash
pip install chromadb
```
**Key Features**:
- In-memory and persistent storage
- Built-in embedding functions
- Similarity search
- Metadata filtering
**Example Usage**:
```python
import chromadb
from chromadb.utils import embedding_functions
# Initialize client
client = chromadb.Client()
# Create collection
collection = client.create_collection(
name="document_chunks",
embedding_function=embedding_functions.DefaultEmbeddingFunction()
)
# Add chunks
collection.add(
documents=[chunk["content"] for chunk in chunks],
metadatas=[chunk.get("metadata", {}) for chunk in chunks],
ids=[chunk["id"] for chunk in chunks]
)
# Search
results = collection.query(
query_texts=["What is chunking?"],
n_results=5
)
```
### Pinecone
**Installation**:
```bash
pip install pinecone-client
```
**Key Features**:
- Managed vector database service
- High-performance similarity search
- Metadata filtering
- Scalable infrastructure
**Example Usage**:
```python
import pinecone
from sentence_transformers import SentenceTransformer
# Initialize
pinecone.init(api_key="your-api-key", environment="your-environment")
index_name = "document-chunks"
# Create index if it doesn't exist
if index_name not in pinecone.list_indexes():
pinecone.create_index(
name=index_name,
dimension=384, # Match embedding model
metric="cosine"
)
index = pinecone.Index(index_name)
# Generate embeddings and upsert
model = SentenceTransformer('all-MiniLM-L6-v2')
for chunk in chunks:
embedding = model.encode(chunk["content"])
index.upsert(
vectors=[{
"id": chunk["id"],
"values": embedding.tolist(),
"metadata": chunk.get("metadata", {})
}]
)
# Search
query_embedding = model.encode("search query")
results = index.query(
vector=query_embedding.tolist(),
top_k=5,
include_metadata=True
)
```
### Weaviate
**Installation**:
```bash
pip install weaviate-client
```
**Key Features**:
- GraphQL API
- Hybrid search (dense + sparse)
- Real-time updates
- Schema validation
**Example Usage**:
```python
import weaviate
# Connect to Weaviate
client = weaviate.Client("http://localhost:8080")
# Define schema
client.schema.create_class({
"class": "DocumentChunk",
"description": "A chunk of document content",
"properties": [
{
"name": "content",
"dataType": ["text"]
},
{
"name": "source",
"dataType": ["string"]
}
]
})
# Add data
for chunk in chunks:
client.data_object.create(
data_object={
"content": chunk["content"],
"source": chunk.get("source", "unknown")
},
class_name="DocumentChunk"
)
# Search
results = client.query.get(
"DocumentChunk",
["content", "source"]
).with_near_text({
"concepts": ["search query"]
}).with_limit(5).do()
```
## Evaluation and Testing
### RAGAS
**Installation**:
```bash
pip install ragas
```
**Key Features**:
- RAG evaluation metrics
- Answer quality assessment
- Context relevance measurement
- Faithfulness evaluation
**Example Usage**:
```python
from ragas import evaluate
from ragas.metrics import (
faithfulness,
answer_relevancy,
context_relevancy,
context_recall
)
from datasets import Dataset
# Prepare evaluation data
dataset = Dataset.from_dict({
"question": ["What is chunking?"],
"answer": ["Chunking is the process of breaking large documents into smaller segments"],
"contexts": [["Chunking involves dividing text into manageable pieces for better processing"]],
"ground_truth": ["Chunking is a document processing technique"]
})
# Evaluate
result = evaluate(
dataset=dataset,
metrics=[
faithfulness,
answer_relevancy,
context_relevancy,
context_recall
]
)
print(result)
```
### TruEra (TruLens)
**Installation**:
```bash
pip install trulens trulens-apps
```
**Key Features**:
- LLM application evaluation
- Feedback functions
- Hallucination detection
- Performance monitoring
**Example Usage**:
```python
from trulens.core import TruSession
from trulens.apps.custom import instrument
from trulens.feedback import GroundTruthAgreement
# Initialize session
session = TruSession()
# Define feedback functions
f_groundedness = GroundTruthAgreement(ground_truth)
# Evaluate chunks
@instrument
def chunk_and_query(text, query):
chunks = chunk_function(text)
relevant_chunks = search_function(chunks, query)
answer = generate_function(relevant_chunks, query)
return answer
# Record evaluation
with session:
chunk_and_query("large document text", "what is the main topic?")
```
## Document Processing
### PyPDF2
**Installation**:
```bash
pip install PyPDF2
```
**Key Features**:
- PDF text extraction
- Page manipulation
- Metadata extraction
- Form field processing
**Example Usage**:
```python
import PyPDF2
def extract_text_from_pdf(pdf_path):
text = ""
with open(pdf_path, 'rb') as file:
reader = PyPDF2.PdfReader(file)
for page in reader.pages:
text += page.extract_text()
return text
# Extract text by page for better chunking
def extract_pages(pdf_path):
pages = []
with open(pdf_path, 'rb') as file:
reader = PyPDF2.PdfReader(file)
for i, page in enumerate(reader.pages):
pages.append({
"page_number": i + 1,
"content": page.extract_text()
})
return pages
```
### python-docx
**Installation**:
```bash
pip install python-docx
```
**Key Features**:
- Microsoft Word document processing
- Paragraph and table extraction
- Style preservation
- Metadata access
**Example Usage**:
```python
from docx import Document
def extract_from_docx(docx_path):
doc = Document(docx_path)
content = []
for paragraph in doc.paragraphs:
if paragraph.text.strip():
content.append({
"type": "paragraph",
"text": paragraph.text,
"style": paragraph.style.name
})
for table in doc.tables:
table_text = []
for row in table.rows:
row_text = [cell.text for cell in row.cells]
table_text.append(" | ".join(row_text))
content.append({
"type": "table",
"text": "\n".join(table_text)
})
return content
```
## Specialized Libraries
### tiktoken (OpenAI)
**Installation**:
```bash
pip install tiktoken
```
**Key Features**:
- Accurate token counting for OpenAI models
- Fast encoding/decoding
- Multiple model support
- Language model specific tokenization
**Example Usage**:
```python
import tiktoken
# Get encoding for specific model
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
# Encode text
tokens = encoding.encode("This is a sample text")
print(f"Token count: {len(tokens)}")
# Decode tokens
text = encoding.decode(tokens)
# Count tokens without full encoding
def count_tokens(text, model="gpt-3.5-turbo"):
encoding = tiktoken.encoding_for_model(model)
return len(encoding.encode(text))
# Use in chunking
def chunk_by_tokens(text, max_tokens=1000):
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
tokens = encoding.encode(text)
chunks = []
for i in range(0, len(tokens), max_tokens):
chunk_tokens = tokens[i:i + max_tokens]
chunk_text = encoding.decode(chunk_tokens)
chunks.append(chunk_text)
return chunks
```
### PDFMiner
**Installation**:
```bash
pip install pdfminer.six
```
**Key Features**:
- Detailed PDF analysis
- Layout preservation
- Font and style information
- High-precision text extraction
**Example Usage**:
```python
from pdfminer.high_level import extract_pages
from pdfminer.layout import LTTextContainer, LTChar
def extract_structured_text(pdf_path):
structured_content = []
for page_layout in extract_pages(pdf_path):
page_content = []
for element in page_layout:
if isinstance(element, LTTextContainer):
text = element.get_text()
                # LTTextContainer has no fontname attribute, so collect
                # font names from its LTChar children instead
                font_names = {
                    char.fontname
                    for line in element
                    for char in line
                    if isinstance(char, LTChar)
                }
                font_info = {
                    "font_size": element.height,  # container height as a rough proxy
                    "is_bold": any("Bold" in name for name in font_names),
                    "x0": element.x0,
                    "y0": element.y0
                }
page_content.append({
"text": text.strip(),
"font_info": font_info
})
structured_content.append({
"page_number": page_layout.pageid,
"content": page_content
})
return structured_content
```
## Performance and Optimization
### Dask
**Installation**:
```bash
pip install dask[complete]
```
**Key Features**:
- Parallel processing
- Out-of-core computation
- Distributed computing
- Integration with pandas
**Example Usage**:
```python
import dask.bag as db
from dask.distributed import Client
# Setup distributed client
client = Client(n_workers=4)
# Parallel chunking of multiple documents
def chunk_document(document):
# Your chunking logic here
return chunk_function(document)
# Process documents in parallel
documents = ["doc1", "doc2", "doc3", ...] # List of document contents
document_bag = db.from_sequence(documents)
# Apply chunking function in parallel
chunked_documents = document_bag.map(chunk_document)
# Compute results
results = chunked_documents.compute()
```
### Ray
**Installation**:
```bash
pip install ray
```
**Key Features**:
- Distributed computing
- Actor model
- Autoscaling
- ML pipeline integration
**Example Usage**:
```python
import ray
# Initialize Ray
ray.init()
@ray.remote
class ChunkingWorker:
def __init__(self, strategy):
self.strategy = strategy
def chunk_documents(self, documents):
results = []
for doc in documents:
chunks = self.strategy.chunk(doc)
results.append(chunks)
return results
# Create workers
workers = [ChunkingWorker.remote(strategy) for _ in range(4)]
# Distribute work
documents_batch = [documents[i::4] for i in range(4)]
futures = [worker.chunk_documents.remote(batch)
for worker, batch in zip(workers, documents_batch)]
# Get results
results = ray.get(futures)
```
## Development and Testing
### pytest
**Installation**:
```bash
pip install pytest pytest-asyncio
```
**Example Tests**:
```python
import pytest
from your_chunking_module import FixedSizeChunker, SemanticChunker
class TestFixedSizeChunker:
def test_chunk_size_respect(self):
chunker = FixedSizeChunker(chunk_size=100, chunk_overlap=10)
text = "word " * 50 # 50 words
chunks = chunker.chunk(text)
        for chunk in chunks:
            assert len(chunk) <= 100  # chunk_size counts characters via length_function
def test_overlap_consistency(self):
chunker = FixedSizeChunker(chunk_size=50, chunk_overlap=10)
text = "word " * 30
chunks = chunker.chunk(text)
# Check overlap between consecutive chunks
for i in range(1, len(chunks)):
chunk1_words = set(chunks[i-1].split()[-10:])
chunk2_words = set(chunks[i].split()[:10])
overlap = len(chunk1_words & chunk2_words)
            assert overlap >= 1  # a 10-character overlap should share at least one word
@pytest.mark.asyncio
async def test_semantic_chunker():
chunker = SemanticChunker()
text = "First topic sentence. Another sentence about first topic. " \
"Now switching to second topic. More about second topic."
chunks = await chunker.chunk_async(text)
# Should detect topic change and create boundary
assert len(chunks) >= 2
assert "first topic" in chunks[0].lower()
assert "second topic" in chunks[1].lower()
```
### Memory Profiler
**Installation**:
```bash
pip install memory-profiler
```
**Example Usage**:
```python
from memory_profiler import profile
@profile
def chunk_large_document():
chunker = FixedSizeChunker(chunk_size=1000)
large_text = "word " * 100000 # Large document
chunks = chunker.chunk(large_text)
return chunks
# Run with: python -m memory_profiler your_script.py
```
This comprehensive toolset provides everything needed to implement, test, and optimize chunking strategies for various use cases, from simple text processing to production-grade RAG systems.

File diff suppressed because it is too large

View File

@@ -0,0 +1,302 @@
---
name: prompt-engineering
category: backend
tags: [prompt-engineering, few-shot-learning, chain-of-thought, optimization, templates, system-prompts, llm-performance, ai-patterns]
version: 1.0.0
description: This skill should be used when creating, optimizing, or implementing advanced prompt patterns including few-shot learning, chain-of-thought reasoning, prompt optimization workflows, template systems, and system prompt design. It provides comprehensive frameworks for building production-ready prompts with measurable performance improvements.
---
# Prompt Engineering
This skill provides comprehensive frameworks for creating, optimizing, and implementing advanced prompt patterns that significantly improve LLM performance across various tasks and models.
## When to Use This Skill
Use this skill when:
- Creating new prompts for complex reasoning or analytical tasks
- Optimizing existing prompts for better accuracy or efficiency
- Implementing few-shot learning with strategic example selection
- Designing chain-of-thought reasoning for multi-step problems
- Building reusable prompt templates and systems
- Developing system prompts for consistent model behavior
- Troubleshooting poor prompt performance or failure modes
- Scaling prompt systems for production use cases
## Core Prompt Engineering Patterns
### 1. Few-Shot Learning Implementation
Select examples using semantic similarity and diversity sampling to maximize learning within context window constraints; a minimal selection sketch follows the list below.
#### Example Selection Strategy
- Use `references/few-shot-patterns.md` for comprehensive selection frameworks
- Balance example count (3-5 optimal) with context window limitations
- Include edge cases and boundary conditions in example sets
- Prioritize diverse examples that cover problem space variations
- Order examples from simple to complex for progressive learning
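A minimal sketch of similarity-plus-diversity selection using `sentence-transformers`; the greedy loop, the 0.3 diversity weight, and the candidate dictionaries with input/output keys are illustrative assumptions:
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def select_examples(candidates, target_input, k=4, diversity=0.3):
    """Greedily pick examples similar to the target but dissimilar to each other."""
    cand_emb = model.encode([c["input"] for c in candidates])
    relevance = util.cos_sim(model.encode(target_input), cand_emb)[0]
    selected = []
    while len(selected) < min(k, len(candidates)):
        best_idx, best_score = None, float("-inf")
        for i in range(len(candidates)):
            if i in selected:
                continue
            redundancy = max(
                (float(util.cos_sim(cand_emb[i], cand_emb[j])[0][0]) for j in selected),
                default=0.0,
            )
            score = (1 - diversity) * float(relevance[i]) - diversity * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    # Order simple-to-complex, using input length as a rough complexity proxy
    return sorted((candidates[i] for i in selected), key=lambda c: len(c["input"]))
```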
#### Few-Shot Template Structure
```
Example 1 (Basic case):
Input: {representative_input}
Output: {expected_output}
Example 2 (Edge case):
Input: {challenging_input}
Output: {robust_output}
Example 3 (Error case):
Input: {problematic_input}
Output: {corrected_output}
Now handle: {target_input}
```
### 2. Chain-of-Thought Reasoning
Elicit step-by-step reasoning for complex problem-solving through structured thinking patterns.
#### Implementation Patterns
- Reference `references/cot-patterns.md` for detailed reasoning frameworks
- Use "Let's think step by step" for zero-shot CoT initiation
- Provide complete reasoning traces for few-shot CoT demonstrations
- Implement self-consistency by sampling multiple reasoning paths
- Include verification and validation steps in reasoning chains
#### CoT Template Structure
```
Let's approach this step-by-step:
Step 1: {break_down_the_problem}
Analysis: {detailed_reasoning}
Step 2: {identify_key_components}
Analysis: {component_analysis}
Step 3: {synthesize_solution}
Analysis: {solution_justification}
Final Answer: {conclusion_with_confidence}
```
### 3. Prompt Optimization Workflows
Implement iterative refinement processes with measurable performance metrics and systematic A/B testing; a minimal A/B harness is sketched after the process list below.
#### Optimization Process
- Use `references/optimization-frameworks.md` for comprehensive optimization strategies
- Measure baseline performance before optimization attempts
- Implement single-variable changes for accurate attribution
- Track metrics: accuracy, consistency, latency, token efficiency
- Use statistical significance testing for A/B validation
- Document optimization iterations and their impacts
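A minimal sketch of a single-variable A/B comparison over a labeled evaluation set; `run_prompt` stands in for the application's model call, and the exact-match scoring plus two-proportion z-test are illustrative choices:
```python
from math import sqrt
from statistics import NormalDist

def evaluate(prompt_template, eval_set, run_prompt):
    """Score one prompt variant with exact-match accuracy per example."""
    return [
        run_prompt(prompt_template.format(**ex["inputs"])).strip() == ex["expected"].strip()
        for ex in eval_set
    ]

def ab_test(prompt_a, prompt_b, eval_set, run_prompt):
    a, b = evaluate(prompt_a, eval_set, run_prompt), evaluate(prompt_b, eval_set, run_prompt)
    p_a, p_b, n = sum(a) / len(a), sum(b) / len(b), len(eval_set)
    pooled = (sum(a) + sum(b)) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n) or 1e-9  # two-proportion z-test
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"accuracy_a": p_a, "accuracy_b": p_b, "z": z, "p_value": p_value}
```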
#### Performance Metrics Framework
- **Accuracy**: Task completion rate and output correctness
- **Consistency**: Response stability across multiple runs
- **Efficiency**: Token usage and response time optimization
- **Robustness**: Performance across edge cases and variations
- **Safety**: Adherence to guidelines and harm prevention
### 4. Template Systems Architecture
Build modular, reusable prompt components with variable interpolation and conditional sections.
#### Template Design Principles
- Reference `references/template-systems.md` for modular template frameworks
- Use clear variable naming conventions (e.g., `{user_input}`, `{context}`)
- Implement conditional sections for different scenario handling
- Design role-based templates for specific use cases
- Create hierarchical template composition patterns
#### Template Structure Example
```
# System Context
You are a {role} with {expertise_level} expertise in {domain}.
# Task Context
{if background_information}
Background: {background_information}
{endif}
# Instructions
{task_instructions}
# Examples
{example_count}
# Output Format
{output_specification}
# Input
{user_query}
```
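A minimal renderer sketch for templates in this shape, supporting `{variable}` interpolation and `{if variable}...{endif}` blocks; the syntax handling is an illustrative assumption, not a full template engine:
```python
import re

def render_template(template, variables):
    """Drop {if var}...{endif} blocks whose variable is empty, then fill {var} slots."""
    def conditional(match):
        name, body = match.group(1), match.group(2)
        return body if variables.get(name) else ""
    rendered = re.sub(r"\{if (\w+)\}(.*?)\{endif\}", conditional, template, flags=re.DOTALL)
    return re.sub(r"\{(\w+)\}", lambda m: str(variables.get(m.group(1), "")), rendered)

prompt = render_template(
    "You are a {role}.\n{if context}Background: {context}\n{endif}Task: {task}",
    {"role": "support analyst", "context": "", "task": "summarize the ticket"},
)
```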
### 5. System Prompt Design
Design comprehensive system prompts that establish consistent model behavior, output formats, and safety constraints.
#### System Prompt Components
- Use `references/system-prompt-design.md` for detailed design guidelines
- Define clear role specification and expertise boundaries
- Establish output format requirements and structural constraints
- Include safety guidelines and content policy adherence
- Set context for background information and domain knowledge
#### System Prompt Framework
```
You are an expert {role} specializing in {domain} with {experience_level} of experience.
## Core Capabilities
- List specific capabilities and expertise areas
- Define scope of knowledge and limitations
## Behavioral Guidelines
- Specify interaction style and communication approach
- Define error handling and uncertainty protocols
- Establish quality standards and verification requirements
## Output Requirements
- Specify format expectations and structural requirements
- Define content inclusion and exclusion criteria
- Establish consistency and validation requirements
## Safety and Ethics
- Include content policy adherence
- Specify bias mitigation requirements
- Define harm prevention protocols
```
## Implementation Workflows
### Workflow 1: Create New Prompt from Requirements
1. **Analyze Requirements**
- Identify task complexity and reasoning requirements
- Determine target model capabilities and limitations
- Define success criteria and evaluation metrics
- Assess need for few-shot learning or CoT reasoning
2. **Select Pattern Strategy**
- Use few-shot learning for classification or transformation tasks
- Apply CoT for complex reasoning or multi-step problems
- Implement template systems for reusable prompt architecture
- Design system prompts for consistent behavior requirements
3. **Draft Initial Prompt**
- Structure prompt with clear sections and logical flow
- Include relevant examples or reasoning demonstrations
- Specify output format and quality requirements
- Incorporate safety guidelines and constraints
4. **Validate and Test**
- Test with diverse input scenarios including edge cases
- Measure performance against defined success criteria
- Iterate refinement based on testing results
- Document optimization decisions and their rationale
### Workflow 2: Optimize Existing Prompt
1. **Performance Analysis**
- Measure current prompt performance metrics
- Identify failure modes and error patterns
- Analyze token efficiency and response latency
- Assess consistency across multiple runs
2. **Optimization Strategy**
- Apply systematic A/B testing with single-variable changes
- Use few-shot learning to improve task adherence
- Implement CoT reasoning for complex task components
- Refine template structure for better clarity
3. **Implementation and Testing**
- Deploy optimized prompts with controlled rollout
- Monitor performance metrics in production environment
- Compare against baseline using statistical significance
- Document improvements and lessons learned
### Workflow 3: Scale Prompt Systems
1. **Modular Architecture Design**
- Decompose complex prompts into reusable components
- Create template inheritance hierarchies
- Implement dynamic example selection systems
- Build automated quality assurance frameworks
2. **Production Integration**
- Implement prompt versioning and rollback capabilities
- Create performance monitoring and alerting systems
- Build automated testing frameworks for prompt validation
- Establish update and deployment workflows
## Quality Assurance
### Validation Requirements
- Test prompts with at least 10 diverse scenarios
- Include edge cases, boundary conditions, and failure modes
- Verify output format compliance and structural consistency
- Validate safety guideline adherence and harm prevention
- Measure performance across multiple model runs
### Performance Standards
- Achieve >90% task completion for well-defined use cases
- Maintain <5% variance across multiple runs for consistency
- Optimize token usage without sacrificing accuracy
- Ensure response latency meets application requirements
- Demonstrate robust handling of edge cases and unexpected inputs
## Integration with Other Skills
This skill integrates seamlessly with:
- **langchain4j-ai-services-patterns**: Interface-based prompt design
- **langchain4j-rag-implementation-patterns**: Context-enhanced prompting
- **langchain4j-testing-strategies**: Prompt validation frameworks
- **unit-test-parameterized**: Systematic prompt testing approaches
## Resources and References
- `references/few-shot-patterns.md`: Comprehensive few-shot learning frameworks
- `references/cot-patterns.md`: Chain-of-thought reasoning patterns and examples
- `references/optimization-frameworks.md`: Systematic prompt optimization methodologies
- `references/template-systems.md`: Modular template design and implementation
- `references/system-prompt-design.md`: System prompt architecture and best practices
## Usage Examples
### Example 1: Classification Task with Few-Shot Learning
```
Classify customer feedback into categories using semantic similarity for example selection and diversity sampling for edge case coverage.
```
### Example 2: Complex Reasoning with Chain-of-Thought
```
Implement step-by-step reasoning for financial analysis with verification steps and confidence scoring.
```
### Example 3: Template System for Customer Service
```
Create modular templates with role-based components and conditional sections for different inquiry types.
```
### Example 4: System Prompt for Code Generation
```
Design comprehensive system prompt with behavioral guidelines, output requirements, and safety constraints.
```
## Common Pitfalls and Solutions
- **Overfitting examples**: Use diverse example sets with semantic variety
- **Context window overflow**: Implement strategic example selection and compression
- **Inconsistent outputs**: Specify clear output formats and validation requirements
- **Poor generalization**: Include edge cases and boundary conditions in training examples
- **Safety violations**: Incorporate comprehensive content policies and harm prevention
## Performance Optimization
- Monitor token usage and implement compression strategies
- Use caching for repeated prompt components
- Optimize example selection for maximum learning efficiency
- Implement progressive disclosure for complex prompt systems
- Balance prompt complexity with response quality requirements
This skill provides the foundational patterns and methodologies for building production-ready prompt systems that consistently deliver high performance across diverse use cases and model types.

View File

@@ -0,0 +1,426 @@
# Chain-of-Thought Reasoning Patterns
This reference provides comprehensive frameworks for implementing effective chain-of-thought (CoT) reasoning that improves model performance on complex, multi-step problems.
## Core Principles
### Step-by-Step Reasoning Elicitation
#### Problem Decomposition Strategy
- Break complex problems into manageable sub-problems
- Identify dependencies and relationships between components
- Establish logical flow and sequence of reasoning steps
- Define clear decision points and validation criteria
#### Verification and Validation Integration
- Include self-checking mechanisms at critical junctures
- Implement consistency checks across reasoning steps
- Add confidence scoring for uncertain conclusions
- Provide fallback strategies for ambiguous situations
## Zero-Shot Chain-of-Thought Patterns
### Basic CoT Initiation
```
Let's think step by step to solve this problem:
1. First, I need to understand what the question is asking for
2. Then, I'll identify the key information and constraints
3. Next, I'll consider different approaches to solve it
4. I'll work through the solution methodically
5. Finally, I'll verify my answer makes sense
Problem: {problem_statement}
Step 1: Understanding the question
{analysis}
Step 2: Key information and constraints
{information_analysis}
Step 3: Solution approach
{approach_analysis}
Step 4: Working through the solution
{detailed_solution}
Step 5: Verification
{verification}
Final Answer: {conclusion}
```
### Enhanced CoT with Confidence
```
Let me think through this systematically, breaking down the problem and checking my reasoning at each step.
**Problem**: {problem_description}
**Step 1: Problem Analysis**
- What am I being asked to solve?
- What information is provided?
- What are the constraints?
- My confidence in understanding: {score}/10
**Step 2: Strategy Selection**
- Possible approaches:
1. {approach_1}
2. {approach_2}
3. {approach_3}
- Selected approach: {chosen_approach}
- Rationale: {reasoning_for_choice}
**Step 3: Execution**
- {detailed_step_by_step_solution}
**Step 4: Verification**
- Does the answer make sense?
- Have I addressed all parts of the question?
- Confidence in final answer: {score}/10
**Final Answer**: {solution_with_confidence_score}
```
## Few-Shot Chain-of-Thought Patterns
### Mathematical Reasoning Template
```
Solve the following math problem step by step.
Example 1:
Problem: A store sells apples and oranges at whole-dollar prices. John buys 4 apples and 2 oranges and spends exactly $14. What could each fruit cost?
Step 1: Set up the equation
Let a = cost of apples, o = cost of oranges
4a + 2o = 14
Step 2: Simplify the equation
Divide both sides by 2: 2a + o = 7
Step 3: Test whole-dollar solutions
If a = 1, then 2(1) + o = 7 → o = 5
If a = 2, then 2(2) + o = 7 → o = 3
If a = 3, then 2(3) + o = 7 → o = 1
Step 4: Verify against the original problem
For a=1, o=5: 4(1) + 2(5) = 4 + 10 = 14 ✓
For a=2, o=3: 4(2) + 2(3) = 8 + 6 = 14 ✓
For a=3, o=1: 4(3) + 2(1) = 12 + 2 = 14 ✓
All three work, so the problem is under-determined: apples and oranges could cost ($1, $5), ($2, $3), or ($3, $1) without more information.
Example 2:
Problem: Sarah is twice as old as Tom was 3 years ago. In 5 years, Sarah will be 1.5 times Tom's current age. How old are they now?
Step 1: Define variables
Let S = Sarah's current age, T = Tom's current age
Step 2: Set up equations
"Sarah is twice as old as Tom was 3 years ago": S = 2(T - 3)
"In 5 years, Sarah will be 1.5 times Tom's current age": S + 5 = 1.5T
Step 3: Solve the system
From first equation: S = 2T - 6
Substitute into second: (2T - 6) + 5 = 1.5T
2T - 1 = 1.5T
0.5T = 1
T = 2
Step 4: Find Sarah's age
S = 2(2) - 6 = 4 - 6 = -2
Step 5: Check for validity
Negative age is impossible, so either I made an algebraic error or the problem is inconsistent.
Re-checking the algebra:
S = 2(T - 3) = 2T - 6
(2T - 6) + 5 = 1.5T → 2T - 1 = 1.5T → 0.5T = 1 → T = 2
S = 2(2 - 3) = -2
The algebra checks out, so the problem as stated has no valid (non-negative) solution; I should flag it rather than force an answer.
Now solve:
Problem: {math_problem}
Step 1: Define variables
{variable_definitions}
Step 2: Set up equations
{equation_setup}
Step 3: Solve the system
{solution_process}
Step 4: Verify the solution
{verification}
Final Answer: {answer}
```
### Logical Reasoning Template
```
Analyze the logical argument and determine if it's valid.
Example 1:
Premise 1: All birds can fly
Premise 2: Penguins are birds
Conclusion: Therefore, penguins can fly
Step 1: Analyze the structure
This is a syllogism with form:
All A are B
C is A
Therefore, C is B
Step 2: Evaluate premise validity
Premise 1: "All birds can fly" - This is false (penguins, ostriches cannot fly)
Premise 2: "Penguins are birds" - This is true
Step 3: Check logical validity
The logical structure is valid, but since Premise 1 is false, the conclusion may not be true
Step 4: Real-world verification
In reality, penguins cannot fly despite being birds
Conclusion: The argument is logically valid but soundness fails due to false premise
Example 2:
Premise 1: If it rains, then the ground gets wet
Premise 2: It is raining
Conclusion: Therefore, the ground gets wet
Step 1: Analyze the structure
This is modus ponens:
If P, then Q
P
Therefore, Q
Step 2: Evaluate premise validity
Premise 1: "If it rains, then the ground gets wet" - Generally true
Premise 2: "It is raining" - Given as true
Step 3: Check logical validity
Modus ponens is a valid argument form
Step 4: Verify the conclusion
Given the premises, the conclusion follows logically
Conclusion: The argument is both logically valid and sound
Now analyze:
Argument: {logical_argument}
Step 1: Analyze the argument structure
{structure_analysis}
Step 2: Evaluate premise validity
{premise_evaluation}
Step 3: Check logical validity
{validity_check}
Step 4: Verify the conclusion
{conclusion_verification}
Final Assessment: {argument_validity_assessment}
```
## Self-Consistency Techniques
### Multiple Reasoning Paths
```
I'll solve this problem using three different approaches and see which result is most reliable.
**Problem**: {complex_problem}
**Approach 1: Direct Calculation**
{first_approach_reasoning}
Result 1: {result_1}
**Approach 2: Logical Deduction**
{second_approach_reasoning}
Result 2: {result_2}
**Approach 3: Pattern Recognition**
{third_approach_reasoning}
Result 3: {result_3}
**Consistency Analysis:**
- Approach 1 and 2 agree: {yes/no}
- Approach 1 and 3 agree: {yes/no}
- Approach 2 and 3 agree: {yes/no}
**Final Decision:**
{majority_result} appears in {count} out of 3 approaches.
Confidence: {high/medium/low}
Most Likely Answer: {final_answer_with_confidence}
```
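A minimal sketch of running this pattern programmatically: sample several reasoning paths at non-zero temperature and keep the majority final answer. The `openai` client call, the model name, and the final-answer extraction rule are illustrative assumptions:
```python
from collections import Counter
import openai

client = openai.OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def self_consistent_answer(problem, n_paths=5, model="gpt-4o-mini"):
    """Sample several chain-of-thought paths and return the majority answer."""
    answers = []
    for _ in range(n_paths):
        response = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": f"{problem}\n\nThink step by step, then finish with "
                           "'Final Answer:' followed by the answer only.",
            }],
            temperature=0.7,  # encourages diverse reasoning paths
        )
        text = response.choices[0].message.content
        if "Final Answer:" in text:
            answers.append(text.rsplit("Final Answer:", 1)[1].strip())
    if not answers:
        return None, 0.0
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n_paths  # answer plus agreement rate as a confidence proxy
```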
### Verification Loop Pattern
```
Let me solve this step by step and verify each step.
**Problem**: {problem_description}
**Step 1: Initial Analysis**
{initial_analysis}
Verification: Does this make sense? {verification_1}
**Step 2: Solution Development**
{solution_development}
Verification: Does this logically follow from step 1? {verification_2}
**Step 3: Result Calculation**
{result_calculation}
Verification: Does this answer the original question? {verification_3}
**Step 4: Cross-Check**
Let me try a different approach to confirm:
{alternative_approach}
Results comparison: {comparison_analysis}
**Final Answer:**
{conclusion_with_verification_status}
```
## Specialized CoT Patterns
### Code Debugging CoT
```
Debug the following code by analyzing it step by step.
**Code:**
{code_snippet}
**Step 1: Understand the Code's Purpose**
{purpose_analysis}
**Step 2: Identify Expected Behavior**
{expected_behavior}
**Step 3: Trace the Execution**
{execution_trace}
**Step 4: Find the Error**
{error_identification}
**Step 5: Propose Fix**
{fix_proposal}
**Step 6: Verify the Fix**
{fix_verification}
**Fixed Code:**
{corrected_code}
```
### Data Analysis CoT
```
Analyze this data systematically to draw meaningful conclusions.
**Data:**
{dataset}
**Step 1: Understand the Data Structure**
{data_structure_analysis}
**Step 2: Identify Patterns and Trends**
{pattern_identification}
**Step 3: Calculate Key Metrics**
{metrics_calculation}
**Step 4: Compare with Benchmarks**
{benchmark_comparison}
**Step 5: Formulate Insights**
{insight_generation}
**Step 6: Validate Conclusions**
{conclusion_validation}
**Key Findings:**
{summary_of_insights}
```
### Creative Problem Solving CoT
```
Generate creative solutions to this challenging problem.
**Problem:**
{creative_problem}
**Step 1: Reframe the Problem**
{problem_reframing}
**Step 2: Brainstorm Multiple Angles**
- Technical approach: {technical_ideas}
- Business approach: {business_ideas}
- User experience approach: {ux_ideas}
- Unconventional approach: {unconventional_ideas}
**Step 3: Evaluate Each Approach**
{approach_evaluation}
**Step 4: Synthesize Best Elements**
{synthesis_process}
**Step 5: Develop Final Solution**
{solution_development}
**Step 6: Test for Feasibility**
{feasibility_testing}
**Recommended Solution:**
{final_creative_solution}
```
## Implementation Guidelines
### When to Use Chain-of-Thought
- **Multi-step problems**: Tasks requiring sequential reasoning
- **Complex calculations**: Mathematical or logical derivations
- **Problem decomposition**: Tasks that benefit from breaking down
- **Verification needs**: When accuracy is critical
- **Educational contexts**: When showing reasoning is valuable
### CoT Effectiveness Factors
- **Problem complexity**: Higher benefit for complex problems
- **Task type**: Mathematical, logical, and analytical tasks benefit most
- **Model capability**: Newer models handle CoT more effectively
- **Context window**: Ensure sufficient space for reasoning steps
- **Output requirements**: Detailed explanations benefit from CoT
### Common Pitfalls to Avoid
- **Over-explaining simple steps**: Keep proportional detail
- **Circular reasoning**: Ensure logical progression
- **Missing verification**: Always include validation steps
- **Inconsistent confidence**: Use realistic confidence scoring
- **Premature conclusions**: Don't jump to answers without full reasoning
## Integration with Other Techniques
### CoT + Few-Shot Learning
- Include reasoning traces in examples
- Show step-by-step problem-solving demonstrations
- Teach verification and self-checking patterns
### CoT + Template Systems
- Embed CoT patterns within structured templates
- Use conditional CoT based on problem complexity (see the sketch below)
- Implement adaptive reasoning depth
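One way to realize conditional CoT is to pick a reasoning depth from an estimated complexity score and expand the prompt accordingly. A minimal sketch; the thresholds and step wording are illustrative assumptions rather than fixed values:

```python
def build_cot_prompt(task: str, complexity: float) -> str:
    """Assemble a prompt whose reasoning depth scales with estimated task complexity (0..1)."""
    if complexity < 0.3:
        steps = ["Answer directly and briefly justify your answer."]
    elif complexity < 0.7:
        steps = [
            "Step 1: Restate the problem in your own words.",
            "Step 2: Work through the solution.",
            "Step 3: State the final answer.",
        ]
    else:
        steps = [
            "Step 1: Restate the problem and list what is known.",
            "Step 2: Break the problem into sub-problems.",
            "Step 3: Solve each sub-problem in order.",
            "Step 4: Verify the intermediate results.",
            "Step 5: State the final answer with a confidence level.",
        ]
    return "\n".join([f"Task: {task}", "Think through this carefully.", *steps])
```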
### CoT + Prompt Optimization
- Test different CoT formulations
- Optimize reasoning step granularity
- Balance detail with efficiency
This framework provides comprehensive patterns for implementing effective chain-of-thought reasoning across diverse problem types and applications.

View File

@@ -0,0 +1,273 @@
# Few-Shot Learning Patterns
This reference provides comprehensive frameworks for implementing effective few-shot learning strategies that maximize model performance within context window constraints.
## Core Principles
### Example Selection Strategy
#### Semantic Similarity Selection
- Use embedding similarity to find examples closest to target input
- Cluster similar examples to avoid redundancy
- Select diverse representatives from different semantic regions
- Prioritize examples that cover key variations in problem space
#### Diversity Sampling Approach
- Ensure coverage of different input types and patterns
- Include boundary cases and edge conditions
- Balance simple and complex examples
- Select examples that demonstrate different solution strategies
#### Progressive Complexity Ordering
- Start with simplest, most straightforward examples
- Progress to increasingly complex scenarios
- Include challenging edge cases last
- Use this ordering to build understanding incrementally
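A minimal sketch of how the three selection strategies above can be combined, assuming an `embed(text)` function that returns a vector (any sentence-embedding model would do) and an example bank of `{"input": ..., "output": ...}` dicts; input length is used here as a crude stand-in for complexity:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_few_shot_examples(query, example_bank, embed, k=3, min_diversity=0.15):
    """Pick k examples similar to the query while skipping near-duplicates, simplest first."""
    query_vec = embed(query)
    ranked = sorted(example_bank,
                    key=lambda ex: cosine(embed(ex["input"]), query_vec),
                    reverse=True)
    selected = []
    for ex in ranked:
        vec = embed(ex["input"])
        # Diversity check: skip examples too close to ones already chosen
        if all(1 - cosine(vec, embed(s["input"])) >= min_diversity for s in selected):
            selected.append(ex)
        if len(selected) == k:
            break
    return sorted(selected, key=lambda ex: len(ex["input"]))   # progressive complexity ordering
```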
## Example Templates
### Classification Tasks
#### Binary Classification Template
```
Classify if the text expresses positive or negative sentiment.
Example 1:
Text: "I love this product! It works exactly as advertised and exceeded my expectations."
Sentiment: Positive
Reasoning: Contains enthusiastic language, positive adjectives, and satisfaction indicators
Example 2:
Text: "The customer service was terrible and the product broke after one day of use."
Sentiment: Negative
Reasoning: Contains negative adjectives, complaint language, and dissatisfaction indicators
Example 3:
Text: "It's okay, nothing special but does the basic job."
Sentiment: Negative
Reasoning: Contains lukewarm language, lack of enthusiasm, minimal positive elements
Now classify:
Text: {input_text}
Sentiment:
Reasoning:
```
#### Multi-Class Classification Template
```
Categorize the customer inquiry into one of: Technical Support, Billing, Sales, or General.
Example 1:
Inquiry: "My account was charged twice for the same subscription this month"
Category: Billing
Key indicators: "charged twice", "subscription", "account", financial terms
Example 2:
Inquiry: "The app keeps crashing when I try to upload files larger than 10MB"
Category: Technical Support
Key indicators: "crashing", "upload files", "technical issue", "error report"
Example 3:
Inquiry: "What are your pricing plans for enterprise customers?"
Category: Sales
Key indicators: "pricing plans", "enterprise", business inquiry, sales question
Now categorize:
Inquiry: {inquiry_text}
Category:
Key indicators:
```
### Transformation Tasks
#### Text Transformation Template
```
Convert formal business text into casual, friendly language.
Example 1:
Formal: "We regret to inform you that your request cannot be processed at this time due to insufficient documentation."
Casual: "Sorry, but we can't process your request right now because some documents are missing."
Example 2:
Formal: "The aforementioned individual has demonstrated exceptional proficiency in the designated responsibilities."
Casual: "They've done a great job with their tasks and really know what they're doing."
Example 3:
Formal: "Please be advised that the scheduled meeting has been postponed pending further notice."
Casual: "Hey, just letting you know that we've put off the meeting for now and will let you know when it's rescheduled."
Now convert:
Formal: {formal_text}
Casual:
```
#### Data Extraction Template
```
Extract key information from the job posting into structured format.
Example 1:
Job Posting: "We are seeking a Senior Software Engineer with 5+ years of experience in Python and cloud technologies. This is a remote position offering $120k-$150k salary plus equity."
Extracted:
- Position: Senior Software Engineer
- Experience Required: 5+ years
- Skills: Python, cloud technologies
- Location: Remote
- Salary: $120k-$150k plus equity
Example 2:
Job Posting: "Marketing Manager needed for growing startup. Must have 3 years experience in digital marketing, social media management, and content creation. San Francisco office, competitive compensation."
Extracted:
- Position: Marketing Manager
- Experience Required: 3 years
- Skills: Digital marketing, social media management, content creation
- Location: San Francisco
- Salary: Competitive compensation
Now extract:
Job Posting: {job_posting_text}
Extracted:
```
### Generation Tasks
#### Creative Writing Template
```
Generate compelling product descriptions following the shown patterns.
Example 1:
Product: Wireless headphones with noise cancellation
Description: "Immerse yourself in crystal-clear audio with our premium wireless headphones. Advanced noise cancellation technology blocks out distractions while 30-hour battery life keeps you connected all day long."
Example 2:
Product: Smart home security camera
Description: "Protect what matters most with intelligent monitoring that alerts you to activity instantly. AI-powered detection distinguishes between people, pets, and vehicles for truly smart security."
Example 3:
Product: Portable espresso maker
Description: "Barista-quality espresso anywhere, anytime. Compact design meets professional-grade extraction in this revolutionary portable machine that delivers perfect shots in under 30 seconds."
Now generate:
Product: {product_description}
Description:
```
### Error Correction Patterns
#### Error Detection and Correction Template
```
Identify and correct errors in the given text.
Example 1:
Text with errors: "Their going to the park to play there new game with they're friends."
Correction: "They're going to the park to play their new game with their friends."
Errors fixed: "Their → They're", "there → their", "they're → their"
Example 2:
Text with errors: "The company's new policy effects every employee and there morale."
Correction: "The company's new policy affects every employee and their morale."
Errors fixed: "effects → affects", "there → their"
Example 3:
Text with errors: "Its important to review you're work carefully before submiting."
Correction: "It's important to review your work carefully before submitting."
Errors fixed: "Its → It's", "you're → your", "submiting → submitting"
Now correct:
Text with errors: {text_with_errors}
Correction:
Errors fixed:
```
## Advanced Strategies
### Dynamic Example Selection
#### Context-Aware Selection
```python
def select_examples(input_text, example_database, max_examples=3):
"""
Select most relevant examples based on semantic similarity and diversity.
"""
# 1. Calculate similarity scores
similarities = calculate_similarity(input_text, example_database)
# 2. Sort by similarity
sorted_examples = sort_by_similarity(similarities)
# 3. Apply diversity sampling
diverse_examples = diversity_sampling(sorted_examples, max_examples)
# 4. Order by complexity
final_examples = order_by_complexity(diverse_examples)
return final_examples
```
#### Adaptive Example Count
```python
def determine_example_count(input_complexity, context_limit):
"""
Adjust example count based on input complexity and available context.
"""
base_count = 3
# Complex inputs benefit from more examples
if input_complexity > 0.8:
return min(base_count + 2, context_limit)
elif input_complexity > 0.5:
return base_count + 1
else:
return max(base_count - 1, 2)
```
### Quality Metrics for Examples
#### Example Effectiveness Scoring
```python
def score_example_effectiveness(example, test_cases):
"""
Score how effectively an example teaches the desired pattern.
"""
metrics = {
'coverage': measure_pattern_coverage(example),
'clarity': measure_instructional_clarity(example),
'uniqueness': measure_uniqueness_from_other_examples(example),
'difficulty': measure_appropriateness_difficulty(example)
}
return weighted_average(metrics, weights=[0.3, 0.3, 0.2, 0.2])
```
## Best Practices
### Example Quality Guidelines
- **Clarity**: Examples should clearly demonstrate the desired pattern
- **Accuracy**: Input-output pairs must be correct and consistent
- **Relevance**: Examples should be representative of target task
- **Diversity**: Include variation in input types and complexity levels
- **Completeness**: Cover edge cases and boundary conditions
### Context Management
- **Token Efficiency**: Optimize example length while maintaining clarity (see the budgeting sketch after this list)
- **Progressive Disclosure**: Start simple, increase complexity gradually
- **Redundancy Elimination**: Remove overlapping or duplicate examples
- **Compression**: Use concise representations where possible
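For the token-efficiency point above, one practical approach is to count tokens before the prompt is assembled and stop adding examples once a budget is reached. A sketch assuming the `tiktoken` tokenizer and a pre-prioritized example list; any tokenizer that matches the target model would work:

```python
import tiktoken

def fit_examples_to_budget(examples, max_tokens=1500, encoding_name="cl100k_base"):
    """Keep examples (highest priority first) until the token budget would be exceeded."""
    enc = tiktoken.get_encoding(encoding_name)
    kept, used = [], 0
    for example in examples:          # assumed to be sorted by priority already
        cost = len(enc.encode(example))
        if used + cost > max_tokens:
            break
        kept.append(example)
        used += cost
    return kept, used
```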
### Common Pitfalls to Avoid
- **Overfitting**: Don't include too many examples from same pattern
- **Under-representation**: Ensure coverage of important variations
- **Ambiguity**: Examples should have clear, unambiguous solutions
- **Context Overflow**: Balance example count with window limitations
- **Poor Ordering**: Place examples in logical progression order
## Integration with Other Patterns
Few-shot learning combines effectively with:
- **Chain-of-Thought**: Add reasoning steps to examples
- **Template Systems**: Use few-shot within structured templates
- **Prompt Optimization**: Test different example selections
- **System Prompts**: Establish few-shot learning expectations in system prompts
This framework provides the foundation for implementing effective few-shot learning across diverse tasks and model types.

View File

@@ -0,0 +1,488 @@
# Prompt Optimization Frameworks
This reference provides systematic methodologies for iteratively improving prompt performance through structured testing, measurement, and refinement processes.
## Optimization Process Overview
### Iterative Improvement Cycle
```mermaid
graph TD
A[Baseline Measurement] --> B[Hypothesis Generation]
B --> C[Controlled Test]
C --> D[Performance Analysis]
D --> E[Statistical Validation]
E --> F[Implementation Decision]
F --> G[Monitor Impact]
G --> H[Learn & Iterate]
H --> B
```
### Core Optimization Principles
- **Single Variable Testing**: Change one element at a time for accurate attribution
- **Measurable Metrics**: Define quantitative success criteria
- **Statistical Significance**: Use proper sample sizes and validation methods
- **Controlled Environment**: Test conditions must be consistent
- **Baseline Comparison**: Always measure against established baseline
## Performance Metrics Framework
### Primary Metrics
#### Task Success Rate
```python
def calculate_success_rate(results, expected_outputs):
"""
Measure percentage of tasks completed correctly.
"""
correct = sum(1 for result, expected in zip(results, expected_outputs)
if result == expected)
return (correct / len(results)) * 100
```
#### Response Consistency
```python
def measure_consistency(prompt, test_cases, num_runs=5):
"""
Measure response stability across multiple runs.
"""
responses = {}
for test_case in test_cases:
test_responses = []
for _ in range(num_runs):
response = execute_prompt(prompt, test_case)
test_responses.append(response)
# Calculate similarity score for consistency
consistency = calculate_similarity(test_responses)
responses[test_case] = consistency
return sum(responses.values()) / len(responses)
```
#### Token Efficiency
```python
def calculate_token_efficiency(prompt, test_cases):
"""
Measure token usage per successful task completion.
"""
total_tokens = 0
successful_tasks = 0
for test_case in test_cases:
response = execute_prompt_with_metrics(prompt, test_case)
total_tokens += response.token_count
if response.is_successful:
successful_tasks += 1
return total_tokens / successful_tasks if successful_tasks > 0 else float('inf')
```
#### Response Latency
```python
import time

def measure_response_time(prompt, test_cases):
"""
Measure average response time.
"""
times = []
for test_case in test_cases:
start_time = time.time()
execute_prompt(prompt, test_case)
end_time = time.time()
times.append(end_time - start_time)
return sum(times) / len(times)
```
### Secondary Metrics
#### Output Quality Score
```python
def assess_output_quality(response, criteria):
"""
Multi-dimensional quality assessment.
"""
scores = {
'accuracy': measure_accuracy(response),
'completeness': measure_completeness(response),
'coherence': measure_coherence(response),
'relevance': measure_relevance(response),
'format_compliance': measure_format_compliance(response)
}
weights = [0.3, 0.2, 0.2, 0.2, 0.1]
return sum(score * weight for score, weight in zip(scores.values(), weights))
```
#### Safety Compliance
```python
def check_safety_compliance(response):
"""
Measure adherence to safety guidelines.
"""
violations = []
# Check for various safety issues
if contains_harmful_content(response):
violations.append('harmful_content')
if has_bias(response):
violations.append('bias')
if violates_privacy(response):
violations.append('privacy_violation')
safety_score = max(0, 100 - len(violations) * 25)
return safety_score, violations
```
## A/B Testing Methodology
### Controlled Test Design
```python
import random

def design_ab_test(baseline_prompt, variant_prompt, test_cases):
"""
Design controlled A/B test with proper statistical power.
"""
# Calculate required sample size
effect_size = estimate_effect_size(baseline_prompt, variant_prompt)
sample_size = calculate_sample_size(effect_size, power=0.8, alpha=0.05)
# Random assignment
randomized_cases = random.sample(test_cases, sample_size)
split_point = len(randomized_cases) // 2
group_a = randomized_cases[:split_point]
group_b = randomized_cases[split_point:]
return {
'baseline_group': group_a,
'variant_group': group_b,
'sample_size': sample_size,
'statistical_power': 0.8,
'significance_level': 0.05
}
```
### Statistical Analysis
```python
import numpy as np
from scipy import stats

def analyze_ab_results(baseline_results, variant_results):
"""
Perform statistical analysis of A/B test results.
"""
# Calculate means and standard deviations
baseline_mean = np.mean(baseline_results)
variant_mean = np.mean(variant_results)
baseline_std = np.std(baseline_results)
variant_std = np.std(variant_results)
# Perform t-test
t_statistic, p_value = stats.ttest_ind(baseline_results, variant_results)
# Calculate effect size (Cohen's d)
pooled_std = np.sqrt(((len(baseline_results) - 1) * baseline_std**2 +
(len(variant_results) - 1) * variant_std**2) /
(len(baseline_results) + len(variant_results) - 2))
cohens_d = (variant_mean - baseline_mean) / pooled_std
return {
'baseline_mean': baseline_mean,
'variant_mean': variant_mean,
'improvement': ((variant_mean - baseline_mean) / baseline_mean) * 100,
'p_value': p_value,
'statistical_significance': p_value < 0.05,
'effect_size': cohens_d,
'recommendation': 'implement_variant' if p_value < 0.05 and cohens_d > 0.2 else 'keep_baseline'
}
```
## Optimization Strategies
### Strategy 1: Progressive Enhancement
#### Stepwise Improvement Process
```python
def progressive_optimization(base_prompt, test_cases, max_iterations=10):
"""
Incrementally improve prompt through systematic testing.
"""
current_prompt = base_prompt
current_performance = evaluate_prompt(current_prompt, test_cases)
optimization_history = []
for iteration in range(max_iterations):
# Generate improvement hypotheses
hypotheses = generate_improvement_hypotheses(current_prompt, current_performance)
best_improvement = None
best_performance = current_performance
for hypothesis in hypotheses:
# Test hypothesis
test_prompt = apply_hypothesis(current_prompt, hypothesis)
test_performance = evaluate_prompt(test_prompt, test_cases)
# Validate improvement
if is_statistically_significant(current_performance, test_performance):
if test_performance.overall_score > best_performance.overall_score:
best_improvement = hypothesis
best_performance = test_performance
# Apply best improvement if found
if best_improvement:
current_prompt = apply_hypothesis(current_prompt, best_improvement)
optimization_history.append({
'iteration': iteration,
'hypothesis': best_improvement,
'performance_before': current_performance,
'performance_after': best_performance,
'improvement': best_performance.overall_score - current_performance.overall_score
})
current_performance = best_performance
else:
break # No further improvements found
return current_prompt, optimization_history
```
### Strategy 2: Multi-Objective Optimization
#### Pareto Optimization Framework
```python
def multi_objective_optimization(prompt_variants, objectives):
"""
Optimize for multiple competing objectives using Pareto efficiency.
"""
results = []
for variant in prompt_variants:
scores = {}
for objective in objectives:
scores[objective] = evaluate_objective(variant, objective)
results.append({
'prompt': variant,
'scores': scores,
'dominates': []
})
# Find Pareto optimal solutions
pareto_optimal = []
for i, result_i in enumerate(results):
is_dominated = False
for j, result_j in enumerate(results):
if i != j and dominates(result_j, result_i):
is_dominated = True
break
if not is_dominated:
pareto_optimal.append(result_i)
return pareto_optimal
def dominates(result_a, result_b):
    """
    Check if result_a Pareto-dominates result_b: at least as good in every objective
    and strictly better in at least one.
    """
    objectives = result_a['scores']
    at_least_as_good = all(result_a['scores'][obj] >= result_b['scores'][obj] for obj in objectives)
    strictly_better = any(result_a['scores'][obj] > result_b['scores'][obj] for obj in objectives)
    return at_least_as_good and strictly_better
```
### Strategy 3: Adaptive Testing
#### Dynamic Test Allocation
```python
def adaptive_testing(prompt_variants, initial_budget):
"""
Dynamically allocate testing budget to promising variants.
"""
# Initial exploration phase
exploration_results = {}
    budget_per_variant = initial_budget // len(prompt_variants)
    for variant in prompt_variants:
        exploration_results[variant] = test_prompt(variant, budget_per_variant)
    # Exploitation phase - allocate more budget to promising variants
    total_budget_spent = len(prompt_variants) * budget_per_variant
remaining_budget = initial_budget - total_budget_spent
# Sort by performance
sorted_variants = sorted(exploration_results.items(),
key=lambda x: x[1].overall_score, reverse=True)
# Allocate remaining budget proportionally to performance
final_results = {}
for i, (variant, initial_result) in enumerate(sorted_variants):
if remaining_budget > 0:
additional_budget = max(1, remaining_budget // (len(sorted_variants) - i))
final_results[variant] = test_prompt(variant, additional_budget)
remaining_budget -= additional_budget
else:
final_results[variant] = initial_result
return final_results
```
## Optimization Hypotheses
### Common Optimization Areas
#### Instruction Clarity
```python
instruction_clarity_hypotheses = [
"Add numbered steps to instructions",
"Include specific output format examples",
"Clarify role and expertise level",
"Add context and background information",
"Specify constraints and boundaries",
"Include success criteria and evaluation standards"
]
```
#### Example Quality
```python
example_optimization_hypotheses = [
"Increase number of examples from 3 to 5",
"Add edge case examples",
"Reorder examples by complexity",
"Include negative examples",
"Add reasoning traces to examples",
"Improve example diversity and coverage"
]
```
#### Structure Optimization
```python
structure_hypotheses = [
"Add clear section headings",
"Reorganize content flow",
"Include summary at the beginning",
"Add checklist for verification",
"Separate instructions from examples",
"Add troubleshooting section"
]
```
#### Model-Specific Optimization
```python
model_specific_hypotheses = {
'claude': [
"Use XML tags for structure",
"Add <thinking> sections for reasoning",
"Include constitutional AI principles",
"Use system message format",
"Add safety guidelines and constraints"
],
'gpt-4': [
"Use numbered sections with ### headers",
"Include JSON format specifications",
"Add function calling patterns",
"Use bullet points for clarity",
"Include error handling instructions"
],
'gemini': [
"Use bold headers with ** formatting",
"Include step-by-step process descriptions",
"Add validation checkpoints",
"Use conversational tone",
"Include confidence scoring"
]
}
```
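These hypothesis strings only become actionable once each maps to a concrete prompt transformation that `progressive_optimization` above can apply. A minimal sketch of such a dispatcher; the three transformations shown are illustrative placeholders for whatever edits your pipeline supports:

```python
def apply_hypothesis(prompt: str, hypothesis: str) -> str:
    """Apply a named optimization hypothesis to a prompt, or return it unchanged."""
    transformations = {
        "Add numbered steps to instructions":
            lambda p: p + "\n\nFollow these steps:\n1. Read the input.\n2. Apply the rules.\n3. Produce the output.",
        "Include specific output format examples":
            lambda p: p + '\n\nExample output:\n{"result": "..."}',
        "Add clear section headings":
            lambda p: "## Instructions\n" + p,
    }
    transform = transformations.get(hypothesis)
    return transform(prompt) if transform else prompt
```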
## Continuous Monitoring
### Production Performance Tracking
```python
def setup_monitoring(prompt, alert_thresholds):
"""
Set up continuous monitoring for deployed prompts.
"""
monitors = {
'success_rate': MetricMonitor('success_rate', alert_thresholds['success_rate']),
'response_time': MetricMonitor('response_time', alert_thresholds['response_time']),
'token_cost': MetricMonitor('token_cost', alert_thresholds['token_cost']),
'safety_score': MetricMonitor('safety_score', alert_thresholds['safety_score'])
}
def monitor_performance():
recent_data = collect_recent_performance(prompt)
alerts = []
for metric_name, monitor in monitors.items():
if metric_name in recent_data:
alert = monitor.check(recent_data[metric_name])
if alert:
alerts.append(alert)
return alerts
return monitor_performance
```
### Automated Rollback System
```python
def automated_rollback_system(prompts, monitoring_data):
"""
Automatically rollback to previous version if performance degrades.
"""
def check_and_rollback(current_prompt, baseline_prompt):
current_metrics = monitoring_data.get_metrics(current_prompt)
baseline_metrics = monitoring_data.get_metrics(baseline_prompt)
# Check if performance degradation exceeds threshold
degradation_threshold = 0.1 # 10% degradation
for metric in current_metrics:
if current_metrics[metric] < baseline_metrics[metric] * (1 - degradation_threshold):
return True, f"Performance degradation in {metric}"
return False, "Performance acceptable"
return check_and_rollback
```
## Optimization Tools and Utilities
### Prompt Variation Generator
```python
def generate_prompt_variations(base_prompt):
"""
Generate systematic variations for testing.
"""
variations = {}
# Instruction variations
variations['more_detailed'] = add_detailed_instructions(base_prompt)
variations['simplified'] = simplify_instructions(base_prompt)
variations['structured'] = add_structured_format(base_prompt)
# Example variations
variations['more_examples'] = add_examples(base_prompt)
variations['better_examples'] = improve_example_quality(base_prompt)
variations['diverse_examples'] = add_example_diversity(base_prompt)
# Format variations
variations['numbered_steps'] = add_numbered_steps(base_prompt)
variations['bullet_points'] = use_bullet_points(base_prompt)
variations['sections'] = add_section_headers(base_prompt)
return variations
```
### Performance Dashboard
```python
def create_performance_dashboard(optimization_history):
"""
Create visualization of optimization progress.
"""
# Generate performance metrics over time
metrics_over_time = {
'iterations': [h['iteration'] for h in optimization_history],
'success_rates': [h['performance_after'].success_rate for h in optimization_history],
'token_efficiency': [h['performance_after'].token_efficiency for h in optimization_history],
'response_times': [h['performance_after'].response_time for h in optimization_history]
}
return PerformanceDashboard(metrics_over_time)
```
This comprehensive framework provides systematic methodologies for continuous prompt improvement through data-driven optimization and rigorous testing processes.

View File

@@ -0,0 +1,494 @@
# System Prompt Design
This reference provides comprehensive frameworks for designing effective system prompts that establish consistent model behavior, define clear boundaries, and ensure reliable performance across diverse applications.
## System Prompt Architecture
### Core Components Structure
```
1. Role Definition & Expertise
2. Behavioral Guidelines & Constraints
3. Interaction Protocols
4. Output Format Specifications
5. Safety & Ethical Guidelines
6. Context & Background Information
7. Quality Standards & Verification
8. Error Handling & Uncertainty Protocols
```
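The eight components above can be assembled mechanically once each one has been written. A minimal sketch, where the per-component texts are placeholders supplied by the prompt author:

```python
SECTION_ORDER = [
    "Role Definition & Expertise",
    "Behavioral Guidelines & Constraints",
    "Interaction Protocols",
    "Output Format Specifications",
    "Safety & Ethical Guidelines",
    "Context & Background Information",
    "Quality Standards & Verification",
    "Error Handling & Uncertainty Protocols",
]

def assemble_system_prompt(sections: dict) -> str:
    """Join the provided component texts in the canonical order, skipping empty ones."""
    parts = []
    for name in SECTION_ORDER:
        body = sections.get(name, "").strip()
        if body:
            parts.append(f"## {name}\n{body}")
    return "\n\n".join(parts)
```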
## Component Design Patterns
### 1. Role Definition Framework
#### Comprehensive Role Specification
```markdown
## Role Definition
You are an expert {role} with {experience_level} of specialized experience in {domain}. Your expertise includes:
### Core Competencies
- {competency_1}
- {competency_2}
- {competency_3}
- {competency_4}
### Knowledge Boundaries
- You have deep knowledge of {strength_area_1} and {strength_area_2}
- Your knowledge is current as of {knowledge_cutoff_date}
- You should acknowledge limitations in {limitation_area}
- When uncertain about recent developments, state this explicitly
### Professional Standards
- Adhere to {industry_standard_1} guidelines
- Follow {industry_standard_2} best practices
- Maintain {professional_attribute} in all interactions
- Ensure compliance with {regulatory_framework}
```
#### Specialized Role Templates
##### Technical Expert Role
```markdown
## Technical Expert Role
You are a Senior {domain} Engineer with {years} years of experience in {specialization}. Your expertise encompasses:
### Technical Proficiency
- Deep understanding of {technology_stack}
- Experience with {specific_frameworks} and {tools}
- Knowledge of {design_patterns} and {architectures}
- Proficiency in {programming_languages} and {development_methodologies}
### Problem-Solving Approach
- Analyze problems systematically using {methodology}
- Consider multiple solution approaches before recommending
- Evaluate trade-offs between {criteria_1}, {criteria_2}, and {criteria_3}
- Provide scalable and maintainable solutions
### Communication Style
- Explain technical concepts clearly to both technical and non-technical audiences
- Use precise terminology when appropriate
- Provide concrete examples and code snippets when helpful
- Structure responses with clear sections and logical flow
```
##### Analyst Role
```markdown
## Analyst Role
You are a professional {analysis_type} Analyst with expertise in {data_domain} and {methodology}. Your analytical approach includes:
### Analytical Framework
- Apply {analytical_methodology} for systematic analysis
- Use {statistical_techniques} for data interpretation
- Consider {contextual_factors} in your analysis
- Validate findings through {verification_methods}
### Critical Thinking Process
- Question assumptions and identify potential biases
- Evaluate evidence quality and source reliability
- Consider alternative explanations and perspectives
- Synthesize information from multiple sources
### Reporting Standards
- Present findings with appropriate confidence levels
- Distinguish between facts, interpretations, and recommendations
- Provide evidence-based conclusions
- Acknowledge limitations and uncertainties
```
### 2. Behavioral Guidelines Design
#### Comprehensive Behavior Framework
```markdown
## Behavioral Guidelines
### Interaction Style
- Maintain {tone} tone throughout all interactions
- Use {communication_approach} when explaining complex concepts
- Be {responsiveness_level} in addressing user questions
- Demonstrate {empathy_level} when dealing with user challenges
### Response Standards
- Provide responses that are {length_preference} and {detail_preference}
- Structure information using {organization_pattern}
- Include {frequency} examples and illustrations
- Use {format_preference} formatting for clarity
### Quality Expectations
- Ensure all information is {accuracy_standard}
- Provide citations for {information_type} when available
- Cross-verify information using {verification_method}
- Update knowledge based on {update_criteria}
```
#### Model-Specific Behavior Patterns
##### Claude 3.5/4 Specific Guidelines
```markdown
## Claude-Specific Behavioral Guidelines
### Constitutional Alignment
- Follow constitutional AI principles in all responses
- Prioritize helpfulness while maintaining safety
- Consider multiple perspectives before concluding
- Avoid harmful content while remaining useful
### Output Formatting
- Use XML tags for structured information: <tag>content</tag>
- Include thinking blocks for complex reasoning: <thinking>...</thinking>
- Provide clear section headers with proper hierarchy
- Use markdown formatting for improved readability
### Safety Protocols
- Apply content policies consistently
- Identify and flag potentially harmful requests
- Provide safe alternatives when appropriate
- Maintain transparency about limitations
```
##### GPT-4 Specific Guidelines
```markdown
## GPT-4 Specific Behavioral Guidelines
### Structured Response Patterns
- Use numbered lists for step-by-step processes
- Implement clear section boundaries with ### headers
- Provide JSON formatted outputs when specified
- Use consistent indentation and formatting
### Function Calling Integration
- Recognize when function calling would be appropriate
- Structure responses to facilitate tool usage
- Provide clear parameter specifications
- Handle function results systematically
### Optimization Behaviors
- Balance conciseness with comprehensiveness
- Prioritize information relevance and importance
- Use efficient language patterns
- Minimize redundancy while maintaining clarity
```
### 3. Output Format Specifications
#### Comprehensive Format Framework
```markdown
## Output Format Requirements
### Structure Standards
- Begin responses with {opening_pattern}
- Use {section_pattern} for major sections
- Implement {hierarchy_pattern} for information organization
- Include {closing_pattern} for response completion
### Content Organization
- Present information in {presentation_order}
- Group related information using {grouping_method}
- Use {transition_pattern} between sections
- Include {summary_element} for complex responses
### Format Specifications
{if json_format_required}
- Provide responses in valid JSON format
- Use consistent key naming conventions
- Include all required fields
- Validate JSON syntax before output
{endif}
{if markdown_format_required}
- Use markdown for formatting and emphasis
- Include appropriate heading levels
- Use code blocks for technical content
- Implement tables for structured data
{endif}
```
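When the JSON branch above applies, it helps to validate the model's output before passing it downstream. A minimal sketch; the required field names are assumptions standing in for whatever schema your application defines:

```python
import json

REQUIRED_FIELDS = ["category", "confidence", "reasoning"]   # assumed schema

def validate_json_response(raw_response: str):
    """Return (is_valid, problems) for a response that is supposed to be JSON."""
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        return False, [f"invalid JSON: {exc}"]
    problems = [f"missing required field: {field}"
                for field in REQUIRED_FIELDS if field not in payload]
    return len(problems) == 0, problems
```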
### 4. Safety and Ethical Guidelines
#### Comprehensive Safety Framework
```markdown
## Safety and Ethical Guidelines
### Content Policies
- Avoid generating {prohibited_content_1}
- Do not provide {prohibited_content_2}
- Flag {sensitive_topics} for human review
- Provide {safe_alternatives} when appropriate
### Ethical Considerations
- Consider {ethical_principle_1} in all responses
- Evaluate potential {ethical_impact} of provided information
- Balance helpfulness with {safety_consideration}
- Maintain {transparency_standard} about limitations
### Bias Mitigation
- Actively identify and mitigate {bias_type_1}
- Present information {neutrality_standard}
- Include {diverse_perspectives} when appropriate
- Avoid {stereotype_patterns}
### Harm Prevention
- Identify potential {harm_type_1} in responses
- Implement {prevention_mechanism} for harmful content
- Provide {warning_system} for sensitive topics
- Include {escalation_protocol} for concerning requests
```
### 5. Error Handling and Uncertainty
#### Comprehensive Error Management
```markdown
## Error Handling and Uncertainty Protocols
### Uncertainty Management
- Explicitly state confidence levels for uncertain information
- Use phrases like "I believe," "It appears that," "Based on available information"
- Acknowledge when information may be {uncertainty_type}
- Provide {verification_method} for uncertain claims
### Error Recognition
- Identify when {error_pattern} might have occurred
- Implement {self_checking_mechanism} for accuracy
- Use {validation_process} for important information
- Provide {correction_protocol} when errors are identified
### Limitation Acknowledgment
- Clearly state {knowledge_limitation} when relevant
- Explain {limitation_reason} when unable to provide complete information
- Suggest {alternative_approach} when direct assistance isn't possible
- Provide {escalation_option} for complex scenarios
### Correction Procedures
- Implement {correction_workflow} for identified errors
- Provide {explanation_format} for corrections
- Use {acknowledgment_pattern} for mistakes
- Include {improvement_commitment} for future accuracy
```
## Specialized System Prompt Templates
### 1. Educational Assistant System Prompt
```markdown
# Educational Assistant System Prompt
## Role Definition
You are an expert educational assistant specializing in {subject_area} with {experience_level} of teaching experience. Your pedagogical approach emphasizes {teaching_philosophy} and adapts to different learning styles.
## Educational Philosophy
- Create inclusive and supportive learning environments
- Adapt explanations to match learner's comprehension level
- Use scaffolding techniques to build understanding progressively
- Encourage critical thinking and independent learning
## Teaching Standards
- Provide accurate, up-to-date information verified through {verification_sources}
- Use clear, accessible language appropriate for the target audience
- Include relevant examples and analogies to enhance understanding
- Structure learning objectives with clear progression
## Interaction Protocols
- Assess learner's current understanding before providing explanations
- Ask clarifying questions to tailor responses appropriately
- Provide opportunities for learner questions and feedback
- Offer additional resources for extended learning
## Output Format
- Begin with brief assessment of learner's needs
- Use clear headings and organized structure
- Include summary points for key takeaways
- Provide practice exercises when appropriate
- End with suggestions for further learning
## Safety Guidelines
- Create psychologically safe learning environments
- Avoid language that might discourage or intimidate learners
- Be patient and supportive when learners struggle with concepts
- Respect diverse backgrounds and learning abilities
## Uncertainty Handling
- Acknowledge when topics are beyond current expertise
- Suggest reliable resources for additional information
- Be transparent about the limits of available knowledge
- Encourage critical thinking and independent verification
```
### 2. Technical Documentation Generator System Prompt
```markdown
# Technical Documentation System Prompt
## Role Definition
You are a Senior Technical Writer with {years} of experience creating documentation for {technology_domain}. Your expertise encompasses {documentation_types} and you follow {industry_standards} for technical communication.
## Documentation Standards
- Follow {style_guide} for consistent formatting and terminology
- Ensure clarity and accuracy in all technical explanations
- Include practical examples and code snippets when helpful
- Structure content with clear hierarchy and logical flow
## Quality Requirements
- Maintain technical accuracy verified through {review_process}
- Use consistent terminology throughout documentation
- Provide comprehensive coverage of topics without overwhelming detail
- Include troubleshooting information for common issues
## Audience Considerations
- Target documentation at {audience_level} technical proficiency
- Define technical terms and concepts appropriately
- Provide progressive disclosure of complex information
- Include context and motivation for technical decisions
## Format Specifications
- Use markdown formatting for clear structure and readability
- Include code blocks with syntax highlighting
- Implement consistent section headings and numbering
- Provide navigation aids and cross-references
## Review Process
- Verify technical accuracy through {verification_method}
- Test all code examples and procedures
- Ensure completeness of coverage for documented features
- Validate clarity and comprehensibility with target audience
## Safety and Compliance
- Include security considerations where relevant
- Document potential risks and mitigation strategies
- Follow industry compliance requirements
- Maintain confidentiality for sensitive information
```
### 3. Data Analysis System Prompt
```markdown
# Data Analysis System Prompt
## Role Definition
You are an expert Data Analyst specializing in {data_domain} with {years} of experience in {analysis_methodologies}. Your analytical approach combines {technical_skills} with {business_acumen} to deliver actionable insights.
## Analytical Framework
- Apply {statistical_methods} for rigorous data analysis
- Use {visualization_techniques} for effective data communication
- Implement {quality_assurance} processes for data validation
- Follow {ethical_guidelines} for responsible data handling
## Analysis Standards
- Ensure methodological soundness in all analyses
- Provide clear documentation of analytical processes
- Include appropriate statistical measures and confidence intervals
- Validate findings through {validation_methods}
## Communication Requirements
- Present findings with appropriate technical depth for the audience
- Use clear visualizations and narrative explanations
- Highlight actionable insights and recommendations
- Acknowledge limitations and uncertainties in analyses
## Output Structure
```json
{
"executive_summary": "High-level overview of key findings",
"methodology": "Description of analytical approach and methods used",
"data_overview": "Summary of data sources, quality, and limitations",
"key_findings": [
{
"finding": "Specific discovery or insight",
"evidence": "Supporting data and statistical measures",
"confidence": "Confidence level in the finding",
"implications": "Business or operational implications"
}
],
"recommendations": [
{
"action": "Recommended action",
"priority": "High/Medium/Low",
"expected_impact": "Anticipated outcome",
"implementation_considerations": "Factors to consider"
}
],
"limitations": "Constraints and limitations of the analysis",
"next_steps": "Suggested follow-up analyses or actions"
}
```
## Ethical Considerations
- Protect privacy and confidentiality of data subjects
- Ensure unbiased analysis and interpretation
- Consider potential impact of findings on stakeholders
- Maintain transparency about analytical limitations
```
## System Prompt Testing and Validation
### Validation Framework
```python
class SystemPromptValidator:
def __init__(self):
self.validation_criteria = {
'role_clarity': 0.2,
'instruction_specificity': 0.2,
'safety_completeness': 0.15,
'output_format_clarity': 0.15,
'error_handling_coverage': 0.1,
'behavioral_consistency': 0.1,
'ethical_considerations': 0.1
}
def validate_prompt(self, system_prompt):
"""Validate system prompt against quality criteria."""
scores = {}
scores['role_clarity'] = self.assess_role_clarity(system_prompt)
scores['instruction_specificity'] = self.assess_instruction_specificity(system_prompt)
scores['safety_completeness'] = self.assess_safety_completeness(system_prompt)
scores['output_format_clarity'] = self.assess_output_format_clarity(system_prompt)
scores['error_handling_coverage'] = self.assess_error_handling(system_prompt)
scores['behavioral_consistency'] = self.assess_behavioral_consistency(system_prompt)
scores['ethical_considerations'] = self.assess_ethical_considerations(system_prompt)
# Calculate overall score
overall_score = sum(score * weight for score, weight in
zip(scores.values(), self.validation_criteria.values()))
return {
'overall_score': overall_score,
'individual_scores': scores,
'recommendations': self.generate_recommendations(scores)
}
def test_prompt_consistency(self, system_prompt, test_scenarios):
"""Test prompt behavior consistency across different scenarios."""
results = []
for scenario in test_scenarios:
response = execute_with_system_prompt(system_prompt, scenario)
# Analyze response consistency
consistency_score = self.analyze_response_consistency(response, system_prompt)
results.append({
'scenario': scenario,
'response': response,
'consistency_score': consistency_score
})
average_consistency = sum(r['consistency_score'] for r in results) / len(results)
return {
'average_consistency': average_consistency,
'scenario_results': results,
'recommendations': self.generate_consistency_recommendations(results)
}
```
## Best Practices Summary
### Design Principles
- **Clarity First**: Ensure role and instructions are unambiguous
- **Comprehensive Coverage**: Address all aspects of model behavior
- **Consistency Focus**: Maintain consistent behavior across scenarios
- **Safety Priority**: Include robust safety guidelines and constraints
- **Flexibility Built-in**: Allow for adaptation to different contexts
### Common Pitfalls to Avoid
- **Vague Instructions**: Be specific about expected behaviors
- **Over-constraining**: Allow room for intelligent adaptation
- **Missing Safety Guidelines**: Always include comprehensive safety measures
- **Inconsistent Formatting**: Use consistent structure throughout
- **Ignoring Model Capabilities**: Design prompts that leverage model strengths
This comprehensive system prompt design framework provides the foundation for creating effective, reliable, and safe AI system behaviors across diverse applications and use cases.

View File

@@ -0,0 +1,599 @@
# Template Systems Architecture
This reference provides comprehensive frameworks for building modular, reusable prompt templates with variable interpolation, conditional sections, and hierarchical composition.
## Template Design Principles
### Modularity and Reusability
- **Single Responsibility**: Each template handles one specific type of task
- **Composability**: Templates can be combined to create complex prompts
- **Parameterization**: Variables allow customization without core changes
- **Inheritance**: Base templates can be extended for specific use cases
### Clear Variable Naming Conventions
```
{user_input} - Direct input from user
{context} - Background information
{examples} - Few-shot learning examples
{constraints} - Task limitations and requirements
{output_format} - Desired output structure
{role} - AI role or persona
{expertise_level} - Level of expertise for the role
{domain} - Specific domain or field
{difficulty} - Task complexity level
{language} - Output language specification
```
## Core Template Components
### 1. Base Template Structure
```
# Template: Universal Task Framework
# Purpose: Base template for most task types
# Variables: {role}, {task_description}, {context}, {examples}, {output_format}
## System Instructions
You are a {role} with {expertise_level} expertise in {domain}.
## Context Information
{if context}
Background and relevant context:
{context}
{endif}
## Task Description
{task_description}
## Examples
{if examples}
Here are some examples to guide your response:
{examples}
{endif}
## Output Requirements
{output_format}
## Constraints and Guidelines
{constraints}
## User Input
{user_input}
```
### 2. Conditional Sections Framework
```python
def process_conditional_template(template, variables):
"""
Process template with conditional sections.
"""
# Process if/endif blocks
while '{if ' in template:
start = template.find('{if ')
end_condition = template.find('}', start)
condition = template[start+4:end_condition].strip()
start_endif = template.find('{endif}', end_condition)
if_content = template[end_condition+1:start_endif].strip()
# Evaluate condition
if evaluate_condition(condition, variables):
            template = template[:start] + if_content + template[start_endif + 7:]   # skip the full '{endif}' marker (7 characters)
        else:
            template = template[:start] + template[start_endif + 7:]
# Replace variables
for key, value in variables.items():
template = template.replace(f'{{{key}}}', str(value))
return template
```
### 3. Variable Interpolation System
```python
class TemplateEngine:
def __init__(self):
self.variables = {}
self.functions = {
'upper': str.upper,
'lower': str.lower,
'capitalize': str.capitalize,
'pluralize': self.pluralize,
'format_date': self.format_date,
'truncate': self.truncate
}
def set_variable(self, name, value):
"""Set a template variable."""
self.variables[name] = value
def render(self, template):
"""Render template with variable substitution."""
# Process function calls {variable|function}
template = self.process_functions(template)
# Replace variables
for key, value in self.variables.items():
template = template.replace(f'{{{key}}}', str(value))
return template
def process_functions(self, template):
"""Process template functions."""
import re
pattern = r'\{(\w+)\|(\w+)\}'
def replace_function(match):
var_name, func_name = match.groups()
value = self.variables.get(var_name, '')
if func_name in self.functions:
return self.functions[func_name](str(value))
return value
        return re.sub(pattern, replace_function, template)

    def pluralize(self, value):
        """Naive pluralization, included so the functions table above is usable."""
        return value if value.endswith("s") else value + "s"

    def format_date(self, value):
        """Normalize an ISO date string; pass the value through if it does not parse."""
        from datetime import datetime
        try:
            return datetime.fromisoformat(value).strftime("%Y-%m-%d")
        except ValueError:
            return value

    def truncate(self, value, length=80):
        """Trim long values so they stay within prompt budgets."""
        return value if len(value) <= length else value[:length - 3] + "..."
```
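A short usage example of the engine above (with the helper methods defined in the class):

```python
engine = TemplateEngine()
engine.set_variable("role", "data analyst")
engine.set_variable("item", "insight")
print(engine.render("You are a {role|capitalize}. Provide three {item|pluralize}."))
# -> You are a Data analyst. Provide three insights.
```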
## Specialized Template Types
### 1. Classification Template
```
# Template: Multi-Class Classification
# Purpose: Classify inputs into predefined categories
# Required Variables: {input_text}, {categories}, {role}
## Classification Framework
You are a {role} specializing in accurate text classification.
## Classification Categories
{categories}
## Classification Process
1. Analyze the input text carefully
2. Identify key indicators and features
3. Match against category definitions
4. Select the most appropriate category
5. Provide confidence score
## Input to Classify
{input_text}
## Output Format
```json
{{
"category": "selected_category",
"confidence": 0.95,
"reasoning": "Brief explanation of classification logic",
"key_indicators": ["indicator1", "indicator2"]
}}
```
```
### 2. Transformation Template
```
# Template: Text Transformation
# Purpose: Transform text from one format/style to another
# Required Variables: {source_text}, {target_format}, {transformation_rules}
## Transformation Task
Transform the given {source_format} text into {target_format} following these rules:
{transformation_rules}
## Source Text
{source_text}
## Transformation Process
1. Analyze the structure and content of the source text
2. Apply the specified transformation rules
3. Maintain the core meaning and intent
4. Ensure proper {target_format} formatting
5. Verify completeness and accuracy
## Transformed Output
```
### 3. Generation Template
```
# Template: Creative Generation
# Purpose: Generate creative content based on specifications
# Required Variables: {content_type}, {specifications}, {style_guidelines}
## Creative Generation Task
Generate {content_type} that meets the following specifications:
## Content Specifications
{specifications}
## Style Guidelines
{style_guidelines}
## Quality Requirements
- Originality and creativity
- Adherence to specifications
- Appropriate tone and style
- Clear structure and coherence
- Audience-appropriate language
## Generated Content
```
### 4. Analysis Template
```
# Template: Comprehensive Analysis
# Purpose: Perform detailed analysis of given input
# Required Variables: {input_data}, {analysis_framework}, {focus_areas}
## Analysis Framework
You are an expert analyst with deep expertise in {domain}.
## Analysis Scope
Focus on these key areas:
{focus_areas}
## Analysis Methodology
{analysis_framework}
## Input Data for Analysis
{input_data}
## Analysis Process
1. Initial assessment and context understanding
2. Detailed examination of each focus area
3. Pattern and trend identification
4. Comparative analysis with benchmarks
5. Insight generation and recommendation formulation
## Analysis Output Structure
```yaml
executive_summary:
key_findings: []
overall_assessment: ""
detailed_analysis:
{focus_area_1}:
observations: []
patterns: []
insights: []
{focus_area_2}:
observations: []
patterns: []
insights: []
recommendations:
immediate: []
short_term: []
long_term: []
```
## Advanced Template Patterns
### 1. Hierarchical Template Composition
```python
class HierarchicalTemplate:
def __init__(self, name, content, parent=None):
self.name = name
self.content = content
self.parent = parent
self.children = []
self.variables = {}
def add_child(self, child_template):
"""Add a child template."""
child_template.parent = self
self.children.append(child_template)
def render(self, variables=None):
"""Render template with inherited variables."""
# Combine variables from parent hierarchy
combined_vars = {}
# Collect variables from parents
current = self.parent
while current:
combined_vars.update(current.variables)
current = current.parent
# Add current variables
combined_vars.update(self.variables)
# Override with provided variables
if variables:
combined_vars.update(variables)
# Render content
rendered_content = self.render_content(self.content, combined_vars)
# Render children
for child in self.children:
child_rendered = child.render(combined_vars)
rendered_content = rendered_content.replace(
f'{{child:{child.name}}}', child_rendered
)
return rendered_content
```
### 2. Role-Based Template System
```python
class RoleBasedTemplate:
def __init__(self):
self.roles = {
'analyst': {
'persona': 'You are a professional analyst with expertise in data interpretation and pattern recognition.',
'approach': 'systematic',
'output_style': 'detailed and evidence-based',
'verification': 'Always cross-check findings and cite sources'
},
'creative_writer': {
'persona': 'You are a creative writer with a talent for engaging storytelling and vivid descriptions.',
'approach': 'imaginative',
'output_style': 'descriptive and engaging',
'verification': 'Ensure narrative consistency and flow'
},
'technical_expert': {
'persona': 'You are a technical expert with deep knowledge of {domain} and practical implementation experience.',
'approach': 'methodical',
'output_style': 'precise and technical',
'verification': 'Include technical accuracy and best practices'
}
}
def create_prompt(self, role, task, domain=None):
"""Create role-specific prompt template."""
role_config = self.roles.get(role, self.roles['analyst'])
template = f"""
## Role Definition
{role_config['persona']}
## Approach
Use a {role_config['approach']} approach to this task.
## Task
{task}
## Output Style
{role_config['output_style']}
## Verification
{role_config['verification']}
"""
if domain and '{domain}' in role_config['persona']:
template = template.replace('{domain}', domain)
return template
```
### 3. Dynamic Template Selection
```python
class DynamicTemplateSelector:
def __init__(self):
self.templates = {}
self.selection_rules = {}
def register_template(self, name, template, selection_criteria):
"""Register a template with selection criteria."""
self.templates[name] = template
self.selection_rules[name] = selection_criteria
def select_template(self, task_characteristics):
"""Select the most appropriate template based on task characteristics."""
best_template = None
best_score = 0
for name, criteria in self.selection_rules.items():
score = self.calculate_match_score(task_characteristics, criteria)
if score > best_score:
best_score = score
best_template = name
return self.templates[best_template] if best_template else None
def calculate_match_score(self, task_characteristics, criteria):
"""Calculate how well task matches template criteria."""
score = 0
total_weight = 0
for characteristic, weight in criteria.items():
if characteristic in task_characteristics:
if task_characteristics[characteristic] == weight['value']:
score += weight['weight']
total_weight += weight['weight']
return score / total_weight if total_weight > 0 else 0
```
## Template Implementation Examples
### Example 1: Customer Service Template
```python
customer_service_template = """
# Customer Service Response Template
## Role Definition
You are a {customer_service_role} with {experience_level} of customer service experience in {industry}.
## Context
{if customer_history}
Customer History:
{customer_history}
{endif}
{if issue_context}
Issue Context:
{issue_context}
{endif}
## Response Guidelines
- Maintain {tone} tone throughout
- Address all aspects of the customer's inquiry
- Provide {level_of_detail} explanation
- Include {additional_elements}
- Follow company {communication_style} style
## Customer Inquiry
{customer_inquiry}
## Response Structure
1. Greeting and acknowledgment
2. Understanding and empathy
3. Solution or explanation
4. Additional assistance offered
5. Professional closing
## Response
"""
```
### Example 2: Technical Documentation Template
```python
documentation_template = """
# Technical Documentation Generator
## Role Definition
You are a {technical_writer_role} specializing in {technology} documentation with {experience_level} of experience.
## Documentation Standards
- Target audience: {audience_level}
- Technical depth: {technical_depth}
- Include examples: {include_examples}
- Add troubleshooting: {add_troubleshooting}
- Version: {version}
## Content to Document
{content_to_document}
## Documentation Structure
```markdown
# {title}
## Overview
{overview}
## Prerequisites
{prerequisites}
## {main_sections}
## Examples
{if include_examples}
{examples}
{endif}
## Troubleshooting
{if add_troubleshooting}
{troubleshooting}
{endif}
## Additional Resources
{additional_resources}
```
## Generated Documentation
"""
```
## Template Management System
### Version Control Integration
```python
class TemplateVersionManager:
def __init__(self):
self.versions = {}
self.current_versions = {}
def create_version(self, template_name, template_content, author, description):
"""Create a new version of a template."""
import datetime
import hashlib
version_id = hashlib.md5(template_content.encode()).hexdigest()[:8]
timestamp = datetime.datetime.now().isoformat()
version_info = {
'version_id': version_id,
'content': template_content,
'author': author,
'description': description,
'timestamp': timestamp,
'parent_version': self.current_versions.get(template_name)
}
if template_name not in self.versions:
self.versions[template_name] = []
self.versions[template_name].append(version_info)
self.current_versions[template_name] = version_id
return version_id
def rollback(self, template_name, version_id):
"""Rollback to a specific version."""
if template_name in self.versions:
for version in self.versions[template_name]:
if version['version_id'] == version_id:
self.current_versions[template_name] = version_id
return version['content']
return None
```
### Performance Monitoring
```python
class TemplatePerformanceMonitor:
def __init__(self):
self.usage_stats = {}
self.performance_metrics = {}
def track_usage(self, template_name, execution_time, success):
"""Track template usage and performance."""
if template_name not in self.usage_stats:
self.usage_stats[template_name] = {
'usage_count': 0,
'total_time': 0,
'success_count': 0,
'failure_count': 0
}
stats = self.usage_stats[template_name]
stats['usage_count'] += 1
stats['total_time'] += execution_time
if success:
stats['success_count'] += 1
else:
stats['failure_count'] += 1
def get_performance_report(self, template_name):
"""Generate performance report for a template."""
if template_name not in self.usage_stats:
return None
stats = self.usage_stats[template_name]
avg_time = stats['total_time'] / stats['usage_count']
success_rate = stats['success_count'] / stats['usage_count']
return {
'template_name': template_name,
'total_usage': stats['usage_count'],
'average_execution_time': avg_time,
'success_rate': success_rate,
'failure_rate': 1 - success_rate
}
```
## Best Practices
### Template Quality Guidelines
- **Clear Documentation**: Include purpose, variables, and usage examples
- **Consistent Naming**: Use standardized variable naming conventions
- **Error Handling**: Include fallback mechanisms for missing variables
- **Performance Optimization**: Minimize template complexity and rendering time
- **Testing**: Implement comprehensive template testing frameworks
### Security Considerations
- **Input Validation**: Sanitize all template variables (a sketch follows this list)
- **Injection Prevention**: Prevent code injection in template rendering
- **Access Control**: Implement proper authorization for template modifications
- **Audit Trail**: Track template changes and usage
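A minimal sketch of that validation step for the simple `{variable}` interpolation used throughout this reference: it caps length, drops control characters, and replaces braces so user-supplied values cannot introduce new template directives. The length cap is an assumed value to tune per application:

```python
import re

MAX_VALUE_LENGTH = 4000   # assumed cap; tune per application

def sanitize_template_variable(value) -> str:
    """Make a user-supplied value safe to interpolate into a prompt template."""
    value = str(value)[:MAX_VALUE_LENGTH]
    value = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", value)   # strip control characters
    value = value.replace("{", "(").replace("}", ")")            # neutralize template braces
    return value

def sanitize_variables(variables: dict) -> dict:
    return {name: sanitize_template_variable(val) for name, val in variables.items()}
```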
This comprehensive template system architecture provides the foundation for building scalable, maintainable prompt templates that can be efficiently managed and optimized across diverse use cases.

286
skills/ai/rag/SKILL.md Normal file
View File

@@ -0,0 +1,286 @@
---
name: rag-implementation
description: Build Retrieval-Augmented Generation (RAG) systems for AI applications with vector databases and semantic search. Use when implementing knowledge-grounded AI, building document Q&A systems, or integrating LLMs with external knowledge bases.
allowed-tools: Read, Write, Bash
category: ai-engineering
tags: [rag, vector-databases, embeddings, retrieval, semantic-search]
version: 1.0.0
---
# RAG Implementation
Build Retrieval-Augmented Generation systems that extend AI capabilities with external knowledge sources.
## Overview
RAG (Retrieval-Augmented Generation) enhances AI applications by retrieving relevant information from knowledge bases and incorporating it into AI responses, reducing hallucinations and providing accurate, grounded answers.
## When to Use
Use this skill when:
- Building Q&A systems over proprietary documents
- Creating chatbots with current, factual information
- Implementing semantic search with natural language queries
- Reducing hallucinations with grounded responses
- Enabling AI systems to access domain-specific knowledge
- Building documentation assistants
- Creating research tools with source citation
- Developing knowledge management systems
## Core Components
### Vector Databases
Store and efficiently retrieve document embeddings for semantic search.
**Key Options:**
- **Pinecone**: Managed, scalable, production-ready
- **Weaviate**: Open-source, hybrid search capabilities
- **Milvus**: High performance, on-premise deployment
- **Chroma**: Lightweight, easy local development
- **Qdrant**: Fast, advanced filtering
- **FAISS**: Meta's library, full control
### Embedding Models
Convert text to numerical vectors for similarity search.
**Popular Models:**
- **text-embedding-ada-002** (OpenAI): General purpose, 1536 dimensions
- **all-MiniLM-L6-v2**: Fast, lightweight, 384 dimensions
- **e5-large-v2**: High quality; multilingual variants available (multilingual-e5-large)
- **bge-large-en-v1.5**: State-of-the-art performance
### Retrieval Strategies
Find relevant content based on user queries.
**Approaches:**
- **Dense Retrieval**: Semantic similarity via embeddings
- **Sparse Retrieval**: Keyword matching (BM25, TF-IDF)
- **Hybrid Search**: Combine dense + sparse for best results
- **Multi-Query**: Generate multiple query variations
- **Contextual Compression**: Extract only relevant parts
## Quick Implementation
### Basic RAG Setup
```java
// Load documents from file system
List<Document> documents = FileSystemDocumentLoader.loadDocuments("/path/to/docs");
// Create embedding store
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
// Ingest documents into the store
EmbeddingStoreIngestor.ingest(documents, embeddingStore);
// Create AI service with RAG capability
Assistant assistant = AiServices.builder(Assistant.class)
.chatModel(chatModel)
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.contentRetriever(EmbeddingStoreContentRetriever.from(embeddingStore))
.build();
```
### Document Processing Pipeline
```java
// Split documents into chunks
DocumentSplitter splitter = DocumentSplitters.recursive(
500, // max segment size in characters
100 // overlap in characters
);
// Create embedding model
EmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
.apiKey("your-api-key")
.build();
// Create embedding store
EmbeddingStore<TextSegment> embeddingStore = PgVectorEmbeddingStore.builder()
.host("localhost")
.port(5432)
.database("postgres")
.user("postgres")
.password("password")
.table("embeddings")
.dimension(1536)
.build();
// Process and store documents
for (Document document : documents) {
List<TextSegment> segments = splitter.split(document);
for (TextSegment segment : segments) {
Embedding embedding = embeddingModel.embed(segment).content();
embeddingStore.add(embedding, segment);
}
}
```
## Implementation Patterns
### Pattern 1: Simple Document Q&A
Create a basic Q&A system over your documents.
```java
public interface DocumentAssistant {
String answer(String question);
}
DocumentAssistant assistant = AiServices.builder(DocumentAssistant.class)
.chatModel(chatModel)
.contentRetriever(retriever)
.build();
```
### Pattern 2: Metadata-Filtered Retrieval
Filter results based on document metadata.
```java
// Add metadata during document loading
Document document = Document.builder()
.text("Content here")
.metadata("source", "technical-manual.pdf")
.metadata("category", "technical")
.metadata("date", "2024-01-15")
.build();
// Filter during retrieval
EmbeddingStoreContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
.embeddingStore(embeddingStore)
.embeddingModel(embeddingModel)
.maxResults(5)
.minScore(0.7)
.filter(metadataKey("category").isEqualTo("technical"))
.build();
```
### Pattern 3: Multi-Source Retrieval
Combine results from multiple knowledge sources.
```java
ContentRetriever webRetriever = EmbeddingStoreContentRetriever.from(webStore);
ContentRetriever documentRetriever = EmbeddingStoreContentRetriever.from(documentStore);
ContentRetriever databaseRetriever = EmbeddingStoreContentRetriever.from(databaseStore);
// Combine results
List<Content> allResults = new ArrayList<>();
allResults.addAll(webRetriever.retrieve(query));
allResults.addAll(documentRetriever.retrieve(query));
allResults.addAll(databaseRetriever.retrieve(query));
// Rerank combined results
List<Content> rerankedResults = reranker.reorder(query, allResults);
```
## Best Practices
### Document Preparation
- Clean and preprocess documents before ingestion
- Remove irrelevant content and formatting artifacts
- Standardize document structure for consistent processing
- Add relevant metadata for filtering and context
### Chunking Strategy
- Use 500-1000 tokens per chunk for optimal balance
- Include 10-20% overlap to preserve context at boundaries
- Consider document structure when determining chunk boundaries
- Test different chunk sizes for your specific use case
### Retrieval Optimization
- Start with high k values (10-20) then filter/rerank
- Use metadata filtering to improve relevance
- Combine multiple retrieval strategies for better coverage
- Monitor retrieval quality and user feedback
### Performance Considerations
- Cache embeddings for frequently accessed content
- Use batch processing for document ingestion
- Optimize vector store configuration for your scale
- Monitor query performance and system resources
## Common Issues and Solutions
### Poor Retrieval Quality
**Problem**: Retrieved documents don't match user queries
**Solutions**:
- Improve document preprocessing and cleaning
- Adjust chunk size and overlap parameters
- Try different embedding models
- Use hybrid search combining semantic and keyword matching
### Irrelevant Results
**Problem**: Retrieved documents are topically related but not specific enough to answer the query
**Solutions**:
- Add metadata filtering for domain-specific constraints
- Implement reranking with cross-encoder models
- Use contextual compression to extract relevant parts
- Fine-tune retrieval parameters (k values, similarity thresholds)
### Performance Issues
**Problem**: Slow response times during retrieval
**Solutions**:
- Optimize vector store configuration and indexing
- Implement caching for frequently retrieved content
- Use smaller embedding models for faster inference
- Consider approximate nearest neighbor algorithms
### Hallucination Prevention
**Problem**: AI generates information not present in retrieved documents
**Solutions**:
- Improve prompt engineering to emphasize grounding
- Add verification steps to check answer alignment (see the sketch below)
- Include confidence scoring for responses
- Implement fact-checking mechanisms
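One way to implement the verification step referenced above is to flag answer sentences that have no close match in the retrieved context. The sketch below is shown in Python for brevity; the model name, sentence splitting, and threshold are illustrative assumptions.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")


def flag_ungrounded_sentences(answer: str, retrieved_chunks: list[str], threshold: float = 0.6):
    """Return answer sentences whose best similarity to any retrieved chunk falls below the threshold."""
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    chunk_embeddings = model.encode(retrieved_chunks, convert_to_tensor=True)
    flagged = []
    for sentence in sentences:
        sentence_embedding = model.encode(sentence, convert_to_tensor=True)
        best_score = util.cos_sim(sentence_embedding, chunk_embeddings).max().item()
        if best_score < threshold:
            flagged.append((sentence, round(best_score, 3)))
    return flagged
```

Flagged sentences can be dropped, rewritten, or surfaced to the user with a lower confidence score.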
## Evaluation Framework
### Retrieval Metrics
- **Precision@k**: Percentage of relevant documents in top-k results
- **Recall@k**: Percentage of all relevant documents found in top-k results
- **Mean Reciprocal Rank (MRR)**: Average rank of first relevant result
- **Normalized Discounted Cumulative Gain (nDCG)**: Ranking quality metric
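The first three metrics above reduce to a few lines once each query is represented as a list of retrieved document IDs and a set of relevant IDs; a minimal sketch (function names are illustrative):

```python
def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids) / k


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top-k results."""
    return sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids) / len(relevant_ids)


def reciprocal_rank(retrieved_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved; average this over queries for MRR."""
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


print(precision_at_k(["d3", "d1", "d7"], {"d1", "d2"}, k=3))  # 0.33...
print(recall_at_k(["d3", "d1", "d7"], {"d1", "d2"}, k=3))     # 0.5
print(reciprocal_rank(["d3", "d1", "d7"], {"d1", "d2"}))      # 0.5
```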
### Answer Quality Metrics
- **Faithfulness**: Degree to which answers are grounded in retrieved documents
- **Answer Relevance**: How well answers address user questions
- **Context Recall**: Percentage of relevant context used in answers
- **Context Precision**: Percentage of retrieved context that is relevant
### User Experience Metrics
- **Response Time**: Time from query to answer
- **User Satisfaction**: Feedback ratings on answer quality
- **Task Completion**: Rate of successful task completion
- **Engagement**: User interaction patterns with the system
## Resources
### Reference Documentation
- [Vector Database Comparison](references/vector-databases.md) - Detailed comparison of vector database options
- [Embedding Models Guide](references/embedding-models.md) - Model selection and optimization
- [Retrieval Strategies](references/retrieval-strategies.md) - Advanced retrieval techniques
- [Document Chunking](references/document-chunking.md) - Chunking strategies and best practices
- [LangChain4j RAG Guide](references/langchain4j-rag-guide.md) - Official implementation patterns
### Assets
- `assets/vector-store-config.yaml` - Configuration templates for different vector stores
- `assets/retriever-pipeline.java` - Complete RAG pipeline implementation
- `assets/evaluation-metrics.java` - Evaluation framework code
## Constraints and Limitations
1. **Token Limits**: Respect model context window limitations
2. **API Rate Limits**: Manage external API rate limits and costs
3. **Data Privacy**: Ensure compliance with data protection regulations
4. **Resource Requirements**: Consider memory and computational requirements
5. **Maintenance**: Plan for regular updates and system monitoring
## Security Considerations
- Secure access to vector databases and embedding services
- Implement proper authentication and authorization
- Validate and sanitize user inputs
- Monitor for abuse and unusual usage patterns
- Regular security audits and penetration testing

View File

@@ -0,0 +1,307 @@
package com.example.rag;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.DocumentSplitter;
import dev.langchain4j.data.document.parser.TextDocumentParser;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import dev.langchain4j.store.embedding.pinecone.PineconeEmbeddingStore;
import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import dev.langchain4j.store.embedding.qdrant.QdrantEmbeddingStore;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.filter.Filter;
import dev.langchain4j.store.embedding.filter.MetadataFilterBuilder;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
/**
* Complete RAG Pipeline Implementation
*
* This class provides a comprehensive implementation of a RAG (Retrieval-Augmented Generation)
* system with support for multiple vector stores and advanced retrieval strategies.
*/
public class RAGPipeline {
private final EmbeddingModel embeddingModel;
private final EmbeddingStore<TextSegment> embeddingStore;
private final DocumentSplitter documentSplitter;
private final RAGConfig config;
/**
* Configuration class for RAG pipeline
*/
public static class RAGConfig {
private String vectorStoreType = "chroma";
private String openAiApiKey;
private String pineconeApiKey;
private String pineconeEnvironment;
private String pineconeIndex = "rag-documents";
private String chromaCollection = "rag-documents";
private String chromaPersistPath = "./chroma_db";
private String qdrantHost = "localhost";
private int qdrantPort = 6333;
private String qdrantCollection = "rag-documents";
private int chunkSize = 1000;
private int chunkOverlap = 200;
private int embeddingDimension = 1536;
// Getters and setters
public String getVectorStoreType() { return vectorStoreType; }
public void setVectorStoreType(String vectorStoreType) { this.vectorStoreType = vectorStoreType; }
public String getOpenAiApiKey() { return openAiApiKey; }
public void setOpenAiApiKey(String openAiApiKey) { this.openAiApiKey = openAiApiKey; }
public String getPineconeApiKey() { return pineconeApiKey; }
public void setPineconeApiKey(String pineconeApiKey) { this.pineconeApiKey = pineconeApiKey; }
public String getPineconeEnvironment() { return pineconeEnvironment; }
public void setPineconeEnvironment(String pineconeEnvironment) { this.pineconeEnvironment = pineconeEnvironment; }
public String getPineconeIndex() { return pineconeIndex; }
public void setPineconeIndex(String pineconeIndex) { this.pineconeIndex = pineconeIndex; }
public String getChromaCollection() { return chromaCollection; }
public void setChromaCollection(String chromaCollection) { this.chromaCollection = chromaCollection; }
public String getChromaPersistPath() { return chromaPersistPath; }
public void setChromaPersistPath(String chromaPersistPath) { this.chromaPersistPath = chromaPersistPath; }
public String getQdrantHost() { return qdrantHost; }
public void setQdrantHost(String qdrantHost) { this.qdrantHost = qdrantHost; }
public int getQdrantPort() { return qdrantPort; }
public void setQdrantPort(int qdrantPort) { this.qdrantPort = qdrantPort; }
public String getQdrantCollection() { return qdrantCollection; }
public void setQdrantCollection(String qdrantCollection) { this.qdrantCollection = qdrantCollection; }
public int getChunkSize() { return chunkSize; }
public void setChunkSize(int chunkSize) { this.chunkSize = chunkSize; }
public int getChunkOverlap() { return chunkOverlap; }
public void setChunkOverlap(int chunkOverlap) { this.chunkOverlap = chunkOverlap; }
public int getEmbeddingDimension() { return embeddingDimension; }
public void setEmbeddingDimension(int embeddingDimension) { this.embeddingDimension = embeddingDimension; }
}
/**
* Constructor
*/
public RAGPipeline(RAGConfig config) {
this.config = config;
this.embeddingModel = createEmbeddingModel();
this.embeddingStore = createEmbeddingStore();
this.documentSplitter = createDocumentSplitter();
}
/**
* Create embedding model based on configuration
*/
private EmbeddingModel createEmbeddingModel() {
return OpenAiEmbeddingModel.builder()
.apiKey(config.getOpenAiApiKey())
.modelName("text-embedding-ada-002")
.build();
}
/**
* Create embedding store based on configuration
*/
private EmbeddingStore<TextSegment> createEmbeddingStore() {
switch (config.getVectorStoreType().toLowerCase()) {
case "pinecone":
return PineconeEmbeddingStore.builder()
.apiKey(config.getPineconeApiKey())
.environment(config.getPineconeEnvironment())
.index(config.getPineconeIndex())
.dimension(config.getEmbeddingDimension())
.build();
case "chroma":
return ChromaEmbeddingStore.builder()
.collectionName(config.getChromaCollection())
.persistDirectory(config.getChromaPersistPath())
.build();
case "qdrant":
return QdrantEmbeddingStore.builder()
.host(config.getQdrantHost())
.port(config.getQdrantPort())
.collectionName(config.getQdrantCollection())
.dimension(config.getEmbeddingDimension())
.build();
case "memory":
default:
return new InMemoryEmbeddingStore<>();
}
}
/**
* Create document splitter
*/
private DocumentSplitter createDocumentSplitter() {
return DocumentSplitters.recursive(
config.getChunkSize(),
config.getChunkOverlap()
);
}
/**
* Load documents from directory
*/
public List<Document> loadDocuments(String directoryPath) {
try {
Path directory = Paths.get(directoryPath);
List<Document> documents = FileSystemDocumentLoader.loadDocuments(directory);
// Add metadata to documents (mutate each document's Metadata in place;
// reassigning the loop variable would not update the returned list)
for (Document document : documents) {
document.metadata().put("loaded_at", System.currentTimeMillis());
document.metadata().put("source_directory", directoryPath);
}
return documents;
} catch (Exception e) {
throw new RuntimeException("Failed to load documents from " + directoryPath, e);
}
}
/**
* Process and ingest documents
*/
public void ingestDocuments(List<Document> documents) {
// Split documents into segments
List<TextSegment> segments = documentSplitter.splitAll(documents);
// Add additional metadata to each segment
for (int i = 0; i < segments.size(); i++) {
TextSegment segment = segments.get(i);
segment.metadata().put("segment_index", i);
segment.metadata().put("total_segments", segments.size());
segment.metadata().put("processed_at", System.currentTimeMillis());
}
// Embed the segments and store them together with their text
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
embeddingStore.addAll(embeddings, segments);
System.out.println("Ingested " + documents.size() + " documents into " +
segments.size() + " segments");
}
/**
* Search documents with optional filtering
*/
public List<TextSegment> search(String query, int maxResults, Filter filter) {
Embedding queryEmbedding = embeddingModel.embed(query).content();
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(maxResults)
.filter(filter)
.build();
return embeddingStore.search(request).matches().stream()
.map(EmbeddingMatch::embedded)
.collect(Collectors.toList());
}
/**
* Search documents with metadata filtering
*/
public List<TextSegment> searchWithMetadataFilter(String query, int maxResults,
Map<String, Object> metadataFilters) {
Filter filter = null;
if (metadataFilters != null && !metadataFilters.isEmpty()) {
for (Map.Entry<String, Object> entry : metadataFilters.entrySet()) {
String key = entry.getKey();
Object value = entry.getValue();
Filter clause;
if (value instanceof Number) {
clause = MetadataFilterBuilder.metadataKey(key).isEqualTo(((Number) value).doubleValue());
} else {
clause = MetadataFilterBuilder.metadataKey(key).isEqualTo(String.valueOf(value));
}
// Combine multiple conditions with logical AND
filter = (filter == null) ? clause : filter.and(clause);
}
}
return search(query, maxResults, filter);
}
/**
* Get statistics about the stored documents
*/
public RAGStatistics getStatistics() {
// This is a simplified implementation
// In practice, you might want to track more detailed statistics
return new RAGStatistics(
embeddingStore.getClass().getSimpleName(),
config.getVectorStoreType()
);
}
/**
* Statistics holder class
*/
public static class RAGStatistics {
private final String storeType;
private final String implementation;
public RAGStatistics(String storeType, String implementation) {
this.storeType = storeType;
this.implementation = implementation;
}
public String getStoreType() { return storeType; }
public String getImplementation() { return implementation; }
@Override
public String toString() {
return "RAGStatistics{" +
"storeType='" + storeType + '\'' +
", implementation='" + implementation + '\'' +
'}';
}
}
/**
* Example usage
*/
public static void main(String[] args) {
// Configure the pipeline
RAGConfig config = new RAGConfig();
config.setVectorStoreType("chroma"); // or "pinecone", "qdrant", "memory"
config.setOpenAiApiKey("your-openai-api-key");
config.setChunkSize(1000);
config.setChunkOverlap(200);
// Create pipeline
RAGPipeline pipeline = new RAGPipeline(config);
// Load documents
List<Document> documents = pipeline.loadDocuments("./documents");
// Ingest documents
pipeline.ingestDocuments(documents);
// Search for relevant content
List<TextSegment> results = pipeline.search("What is machine learning?", 5, null);
// Print results
for (int i = 0; i < results.size(); i++) {
TextSegment segment = results.get(i);
System.out.println("Result " + (i + 1) + ":");
System.out.println("Content: " + segment.text().substring(0, Math.min(200, segment.text().length())) + "...");
System.out.println("Metadata: " + segment.metadata());
System.out.println();
}
// Print statistics
System.out.println("Pipeline Statistics: " + pipeline.getStatistics());
}
}

View File

@@ -0,0 +1,127 @@
# Vector Store Configuration Templates
# This file contains configuration templates for different vector databases
# Chroma (Local/Development)
chroma:
type: chroma
settings:
persist_directory: "./chroma_db"
collection_name: "rag_documents"
host: "localhost"
port: 8000
# Recommended for: Development, small-scale applications
# Pros: Easy setup, local deployment, free
# Cons: Limited scalability, single-node only
# Pinecone (Cloud/Production)
pinecone:
type: pinecone
settings:
api_key: "${PINECONE_API_KEY}"
environment: "us-west1-gcp"
index_name: "rag-documents"
dimension: 1536
metric: "cosine"
pods: 1
pod_type: "p1.x1"
# Recommended for: Production applications, large-scale
# Pros: Managed service, scalable, fast
# Cons: Cost, requires internet connection
# Weaviate (Open-source/Cloud)
weaviate:
type: weaviate
settings:
url: "http://localhost:8080"
api_key: "${WEAVIATE_API_KEY}"
class_name: "Document"
text_key: "content"
vectorizer: "text2vec-openai"
module_config:
text2vec-openai:
model: "ada"
modelVersion: "002"
type: "text"
baseUrl: "https://api.openai.com/v1"
# Recommended for: Hybrid search, GraphQL API
# Pros: Open-source, hybrid search, flexible
# Cons: More complex setup
# Qdrant (Performance-focused)
qdrant:
type: qdrant
settings:
host: "localhost"
port: 6333
collection_name: "rag_documents"
vector_size: 1536
distance: "Cosine"
api_key: "${QDRANT_API_KEY}"
# Recommended for: Performance, advanced filtering
# Pros: Fast, good filtering, open-source
# Cons: Newer project, smaller community
# Milvus (Enterprise/Scale)
milvus:
type: milvus
settings:
host: "localhost"
port: 19530
collection_name: "rag_documents"
dimension: 1536
index_type: "IVF_FLAT"
metric_type: "COSINE"
nlist: 1024
# Recommended for: Enterprise, large-scale deployments
# Pros: High performance, distributed
# Cons: Complex setup, resource intensive
# FAISS (Local/Research)
faiss:
type: faiss
settings:
index_type: "IndexFlatL2"
dimension: 1536
save_path: "./faiss_index"
# Recommended for: Research, local processing
# Pros: Fast, in-process, no external service required
# Cons: Library only (no server); persistence and metadata filtering must be handled by the application
# Common Configuration Parameters
common:
chunking:
chunk_size: 1000
chunk_overlap: 200
separators: ["\n\n", "\n", " ", ""]
embedding:
model: "text-embedding-ada-002"
batch_size: 100
max_retries: 3
timeout: 30
retrieval:
default_k: 5
similarity_threshold: 0.7
max_results: 20
performance:
cache_embeddings: true
cache_size: 1000
parallel_processing: true
batch_size: 50
# Environment Variables Template
# Copy this to .env file and fill in your values
environment:
OPENAI_API_KEY: "your-openai-api-key-here"
PINECONE_API_KEY: "your-pinecone-api-key-here"
PINECONE_ENVIRONMENT: "us-west1-gcp"
WEAVIATE_API_KEY: "your-weaviate-api-key-here"
QDRANT_API_KEY: "your-qdrant-api-key-here"

View File

@@ -0,0 +1,137 @@
# Document Chunking Strategies
## Overview
Document chunking is the process of breaking large documents into smaller, manageable pieces that can be effectively embedded and retrieved.
## Chunking Strategies
### 1. Recursive Character Text Splitter
**Method**: Split text based on character count, trying separators in order
**Use Case**: General purpose text splitting
**Advantages**: Preserves sentence and paragraph boundaries when possible
```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
chunk_size=1000,
chunk_overlap=200,
length_function=len,
separators=["\n\n", "\n", " ", ""] # Try these in order
)
chunks = splitter.split_documents(documents)
```
### 2. Token-Based Splitting
**Method**: Split based on token count rather than characters
**Use Case**: When working with token limits of language models
**Advantages**: Better control over context window usage
```python
from langchain_text_splitters import TokenTextSplitter
splitter = TokenTextSplitter(
chunk_size=512,
chunk_overlap=50
)
chunks = splitter.split_documents(documents)
```
### 3. Semantic Chunking
**Method**: Split based on semantic similarity
**Use Case**: When maintaining semantic coherence is important
**Advantages**: Chunks are more semantically meaningful
```python
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import OpenAIEmbeddings
splitter = SemanticChunker(
embeddings=OpenAIEmbeddings(),
breakpoint_threshold_type="percentile"
)
chunks = splitter.split_documents(documents)
```
### 4. Markdown Header Splitter
**Method**: Split based on markdown headers
**Use Case**: Structured documents with clear hierarchical organization
**Advantages**: Maintains document structure and context
```python
from langchain_text_splitters import MarkdownHeaderTextSplitter
headers_to_split_on = [
("#", "Header 1"),
("##", "Header 2"),
("###", "Header 3"),
]
splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on)
chunks = splitter.split_text(markdown_text)  # takes the raw markdown string, not Document objects
```
### 5. HTML Splitter
**Method**: Split based on HTML tags
**Use Case**: Web pages and HTML documents
**Advantages**: Preserves HTML structure and metadata
```python
from langchain_text_splitters import HTMLHeaderTextSplitter
headers_to_split_on = [
("h1", "Header 1"),
("h2", "Header 2"),
("h3", "Header 3"),
]
splitter = HTMLHeaderTextSplitter(headers_to_split_on=headers_to_split_on)
chunks = splitter.split_text(html_string)  # or split_text_from_url(url) for live pages
```
## Parameter Tuning
### Chunk Size
- **Small chunks (200-400 tokens)**: More precise retrieval, but may lose context
- **Medium chunks (500-1000 tokens)**: Good balance of precision and context
- **Large chunks (1000-2000 tokens)**: More context, but less precise retrieval
### Chunk Overlap
- **Purpose**: Preserve context at chunk boundaries
- **Typical range**: 10-20% of chunk size
- **Higher overlap**: Better context preservation, but more redundancy
- **Lower overlap**: Less redundancy, but may lose important context
### Separators
- **Hierarchical separators**: Start with larger boundaries (paragraphs), then smaller (sentences)
- **Custom separators**: Add domain-specific separators for better results
- **Language-specific**: Adjust for different languages and writing styles
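A small parameter sweep makes the trade-offs above concrete before committing to a configuration; the sizes and overlap ratio below are only example values:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter


def sweep_chunk_parameters(documents, sizes=(400, 800, 1200), overlap_ratio=0.15):
    """Report chunk count and average chunk length for several chunk-size settings."""
    for size in sizes:
        splitter = RecursiveCharacterTextSplitter(
            chunk_size=size,
            chunk_overlap=int(size * overlap_ratio),
        )
        chunks = splitter.split_documents(documents)
        avg_len = sum(len(c.page_content) for c in chunks) / max(len(chunks), 1)
        print(f"chunk_size={size}: {len(chunks)} chunks, avg {avg_len:.0f} characters")
```

Pair the counts with retrieval-quality spot checks on representative queries rather than choosing on chunk statistics alone.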
## Best Practices
1. **Preserve Context**: Ensure chunks contain enough surrounding context
2. **Maintain Coherence**: Keep semantically related content together
3. **Respect Boundaries**: Avoid breaking sentences or important phrases
4. **Consider Query Types**: Adapt chunking strategy to typical user queries
5. **Test and Iterate**: Evaluate different chunking strategies for your specific use case
## Evaluation Metrics
1. **Retrieval Quality**: How well chunks answer user queries
2. **Context Preservation**: Whether important context is maintained
3. **Chunk Distribution**: Evenness of chunk sizes
4. **Boundary Quality**: How natural chunk boundaries are
5. **Retrieval Efficiency**: Impact on retrieval speed and accuracy
## Advanced Techniques
### Adaptive Chunking
Adjust chunk size based on document structure and content density (see the sketch at the end of this section).
### Hierarchical Chunking
Create multiple levels of chunks for different retrieval scenarios.
### Query-Aware Chunking
Optimize chunk boundaries based on typical query patterns.
### Domain-Specific Splitting
Use specialized splitters for specific document types (legal, medical, technical).
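A rough sketch of the adaptive idea; the density heuristic and thresholds are illustrative assumptions rather than an established recipe:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter


def adaptive_chunk(document, dense_size=400, sparse_size=1200, density_threshold=0.05):
    """Pick a smaller chunk size for dense, fact-packed text and a larger one for sparse prose."""
    text = document.page_content
    # Heuristic: digits and structural punctuation as a proxy for tables, code, and dense facts
    density = sum(ch.isdigit() or ch in ",;:()[]{}" for ch in text) / max(len(text), 1)
    size = dense_size if density > density_threshold else sparse_size
    splitter = RecursiveCharacterTextSplitter(chunk_size=size, chunk_overlap=int(size * 0.15))
    return splitter.split_documents([document])
```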

View File

@@ -0,0 +1,88 @@
# Embedding Models Guide
## Overview
Embedding models convert text into numerical vectors that capture semantic meaning for similarity search in RAG systems.
## Popular Embedding Models
### 1. text-embedding-ada-002 (OpenAI)
- **Dimensions**: 1536
- **Type**: General purpose
- **Use Case**: Most applications requiring high quality embeddings
- **Performance**: Excellent balance of quality and speed
### 2. all-MiniLM-L6-v2 (Sentence Transformers)
- **Dimensions**: 384
- **Type**: Lightweight
- **Use Case**: Applications requiring fast inference
- **Performance**: Good quality, very fast
### 3. e5-large-v2
- **Dimensions**: 1024
- **Type**: High quality
- **Use Case**: Applications needing superior performance
- **Performance**: Excellent quality; multilingual support via the multilingual-e5-large variant
### 4. Instructor
- **Dimensions**: Variable (768)
- **Type**: Task-specific
- **Use Case**: Domain-specific applications
- **Performance**: Can be fine-tuned for specific tasks
### 5. bge-large-en-v1.5
- **Dimensions**: 1024
- **Type**: State-of-the-art
- **Use Case**: Applications requiring best possible quality
- **Performance**: SOTA performance on benchmarks
## Selection Criteria
1. **Quality vs Speed**: Balance between embedding quality and inference speed
2. **Dimension Size**: Impact on storage and retrieval performance
3. **Domain**: Specific language or domain requirements
4. **Cost**: API costs vs local deployment
5. **Batch Size**: Throughput requirements
6. **Language**: Multilingual support needs
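A quick side-by-side probe helps weigh quality against speed before committing to a model; the model names and sample texts below are arbitrary examples:

```python
import time

from sentence_transformers import SentenceTransformer, util


def probe_model(model_name, query, documents):
    """Print similarity scores and embedding latency for one candidate model."""
    model = SentenceTransformer(model_name)
    start = time.perf_counter()
    query_vec = model.encode(query, convert_to_tensor=True)
    doc_vecs = model.encode(documents, convert_to_tensor=True)
    elapsed = time.perf_counter() - start
    scores = util.cos_sim(query_vec, doc_vecs)[0].tolist()
    print(f"{model_name}: scores={[round(s, 3) for s in scores]}, {elapsed:.2f}s")


docs = ["RAG retrieves supporting passages.", "Bananas are rich in potassium."]
for name in ["all-MiniLM-L6-v2", "BAAI/bge-large-en-v1.5"]:
    probe_model(name, "How does retrieval-augmented generation work?", docs)
```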
## Usage Examples
### OpenAI Embeddings
```python
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
vector = embeddings.embed_query("Your text here")
```
### Sentence Transformers
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
vector = model.encode("Your text here")
```
### Hugging Face Models
```python
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(
model_name="sentence-transformers/all-MiniLM-L6-v2"
)
```
## Optimization Tips
1. **Batch Processing**: Process multiple texts together for efficiency
2. **Model Quantization**: Reduce model size for faster inference
3. **Caching**: Cache embeddings for frequently used texts (see the sketch below)
4. **GPU Acceleration**: Use GPU for faster processing when available
5. **Model Selection**: Choose appropriate model size for your use case
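A minimal sketch combining the batching and caching tips; the in-memory dict and model name are illustrative stand-ins for a real cache and your chosen model:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
_embedding_cache: dict[str, list[float]] = {}


def embed_with_cache(texts: list[str], batch_size: int = 64) -> list[list[float]]:
    """Embed only unseen texts, in batches, and reuse cached vectors for the rest."""
    missing = [text for text in texts if text not in _embedding_cache]
    if missing:
        vectors = model.encode(missing, batch_size=batch_size)
        for text, vector in zip(missing, vectors):
            _embedding_cache[text] = vector.tolist()
    return [_embedding_cache[text] for text in texts]
```

In production the dict would typically be replaced by a persistent key-value store keyed on a hash of the text.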
## Evaluation Metrics
1. **Semantic Similarity**: How well embeddings capture meaning
2. **Retrieval Performance**: Quality of retrieved documents
3. **Speed**: Inference time per document
4. **Memory Usage**: RAM requirements for the model
5. **Cost**: API costs or infrastructure requirements

View File

@@ -0,0 +1,94 @@
# LangChain4j RAG Implementation Guide
## Overview
RAG (Retrieval-Augmented Generation) extends LLM knowledge by finding and injecting relevant information from your data into prompts before sending to the LLM.
## What is RAG?
RAG helps LLMs answer questions using domain-specific knowledge by retrieving relevant information to reduce hallucinations.
## RAG Flavors in LangChain4j
### 1. Easy RAG
Simplest way to start with minimal setup. Handles document loading, splitting, and embedding automatically.
### 2. Core RAG APIs
Modular components including:
- Document
- TextSegment
- EmbeddingModel
- EmbeddingStore
- DocumentSplitter
### 3. Advanced RAG
Complex pipelines supporting:
- Query transformation
- Multi-source retrieval
- Re-ranking, assembled from components such as QueryTransformer, ContentRetriever, and ContentAggregator
## RAG Stages
### 1. Indexing
Pre-process documents for efficient search
### 2. Retrieval
Find relevant content based on user queries
## Core Components
### Documents with metadata
Structured representation of your content with associated metadata for filtering and context.
### Text segments (chunks)
Smaller, manageable pieces of documents that are embedded and stored in vector databases.
### Embedding models
Convert text segments into numerical vectors for similarity search.
### Embedding stores (vector databases)
Store and efficiently retrieve embedded text segments.
### Content retrievers
Find relevant content based on user queries.
### Query transformers
Transform and optimize user queries for better retrieval.
### Content aggregators
Combine and rank retrieved content.
## Advanced Features
- Query transformation and routing
- Multiple retrievers for different data sources
- Re-ranking models for improved relevance
- Metadata filtering for targeted retrieval
- Parallel processing for performance
## Implementation Example (Easy RAG)
```java
// Load documents
List<Document> documents = FileSystemDocumentLoader.loadDocuments("/path/to/docs");
// Create embedding store
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
// Ingest documents
EmbeddingStoreIngestor.ingest(documents, embeddingStore);
// Create AI service
Assistant assistant = AiServices.builder(Assistant.class)
.chatModel(chatModel)
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.contentRetriever(EmbeddingStoreContentRetriever.from(embeddingStore))
.build();
```
## Best Practices
1. **Document Preparation**: Clean and structure documents before ingestion
2. **Chunk Size**: Balance between context preservation and retrieval precision
3. **Metadata Strategy**: Include relevant metadata for filtering and context
4. **Embedding Model Selection**: Choose models appropriate for your domain
5. **Retrieval Strategy**: Select appropriate k values and filtering criteria
6. **Evaluation**: Continuously evaluate retrieval quality and answer accuracy

View File

@@ -0,0 +1,161 @@
# Advanced Retrieval Strategies
## Overview
Different retrieval approaches for finding relevant documents in RAG systems, each with specific strengths and use cases.
## Retrieval Approaches
### 1. Dense Retrieval
**Method**: Semantic similarity via embeddings
**Use Case**: Understanding meaning and context
**Example**: Finding documents about "machine learning" when query is "AI algorithms"
```python
from langchain.vectorstores import Chroma
vectorstore = Chroma.from_documents(chunks, embeddings)
results = vectorstore.similarity_search("query", k=5)
```
### 2. Sparse Retrieval
**Method**: Keyword matching (BM25, TF-IDF)
**Use Case**: Exact term matching and keyword-specific queries
**Example**: Finding documents containing specific technical terms
```python
from langchain.retrievers import BM25Retriever
bm25_retriever = BM25Retriever.from_documents(chunks)
bm25_retriever.k = 5
results = bm25_retriever.get_relevant_documents("query")
```
### 3. Hybrid Search
**Method**: Combine dense + sparse retrieval
**Use Case**: Balance between semantic understanding and keyword matching
```python
from langchain.retrievers import BM25Retriever, EnsembleRetriever
# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(chunks)
bm25_retriever.k = 5
# Dense retriever (embeddings)
embedding_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
# Combine with weights
ensemble_retriever = EnsembleRetriever(
retrievers=[bm25_retriever, embedding_retriever],
weights=[0.3, 0.7]
)
```
### 4. Multi-Query Retrieval
**Method**: Generate multiple query variations
**Use Case**: Complex queries that can be interpreted in multiple ways
```python
from langchain.retrievers.multi_query import MultiQueryRetriever
# Generate multiple query perspectives
retriever = MultiQueryRetriever.from_llm(
retriever=vectorstore.as_retriever(),
llm=OpenAI()
)
# Single query → multiple variations → combined results
results = retriever.get_relevant_documents("What is the main topic?")
```
### 5. HyDE (Hypothetical Document Embeddings)
**Method**: Generate hypothetical documents for better retrieval
**Use Case**: When queries are very different from document style
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
# Generate a hypothetical document that answers the query
hypothetical_doc = llm.invoke(f"Write a short passage that answers: {query}").content
# Use the hypothetical document, not the raw query, for retrieval
results = vectorstore.similarity_search(hypothetical_doc, k=5)
```
## Advanced Retrieval Patterns
### Contextual Compression
Compress retrieved documents to only include relevant parts
```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
compressor = LLMChainExtractor.from_llm(llm)
compression_retriever = ContextualCompressionRetriever(
base_compressor=compressor,
base_retriever=vectorstore.as_retriever()
)
```
### Parent Document Retriever
Store small chunks for retrieval, return larger chunks for context
```python
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
store = InMemoryStore()
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
retriever = ParentDocumentRetriever(
vectorstore=vectorstore,
docstore=store,
child_splitter=child_splitter,
parent_splitter=parent_splitter
)
```
## Retrieval Optimization Techniques
### 1. Metadata Filtering
Filter results based on document metadata
```python
results = vectorstore.similarity_search(
"query",
filter={"category": "technical", "date": {"$gte": "2023-01-01"}},
k=5
)
```
### 2. Maximal Marginal Relevance (MMR)
Balance relevance with diversity
```python
results = vectorstore.max_marginal_relevance_search(
"query",
k=5,
fetch_k=20,
lambda_mult=0.5 # 0=max diversity, 1=max relevance
)
```
### 3. Reranking
Improve top results with cross-encoder
```python
from sentence_transformers import CrossEncoder

query = "your question here"
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')
candidates = vectorstore.similarity_search(query, k=20)
pairs = [[query, doc.page_content] for doc in candidates]
scores = reranker.predict(pairs)
reranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)[:5]
```
## Selection Guidelines
1. **Query Type**: Choose strategy based on typical query patterns
2. **Document Type**: Consider document structure and content
3. **Performance Requirements**: Balance quality vs speed
4. **Domain Knowledge**: Leverage domain-specific patterns
5. **User Expectations**: Match retrieval behavior to user expectations
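To illustrate the query-type guideline, a crude router can send exact-term lookups to the sparse retriever and natural-language questions to the dense one; the keyword heuristic below is only an example, and `bm25_retriever` / `embedding_retriever` refer to the retrievers built earlier in this document:

```python
import re


def route_query(query, sparse_retriever, dense_retriever):
    """Heuristically pick sparse retrieval for quoted or code-like queries, dense retrieval otherwise."""
    looks_exact = bool(re.search(r'"[^"]+"|[A-Z]{2,}|\w+\(\)', query))
    retriever = sparse_retriever if looks_exact else dense_retriever
    return retriever.get_relevant_documents(query)


# results = route_query('What does "ECONNRESET" mean?', bm25_retriever, embedding_retriever)
```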

View File

@@ -0,0 +1,86 @@
# Vector Database Comparison and Configuration
## Overview
Vector databases store and efficiently retrieve document embeddings for semantic search in RAG systems.
## Popular Vector Database Options
### 1. Pinecone
- **Type**: Managed cloud service
- **Features**: Scalable, fast queries, managed infrastructure
- **Use Case**: Production applications requiring high availability
### 2. Weaviate
- **Type**: Open-source, hybrid search
- **Features**: Combines vector and keyword search, GraphQL API
- **Use Case**: Applications needing both semantic and traditional search
### 3. Milvus
- **Type**: High performance, on-premise
- **Features**: Distributed architecture, GPU acceleration
- **Use Case**: Large-scale deployments with custom infrastructure
### 4. Chroma
- **Type**: Lightweight, easy to use
- **Features**: Local deployment, simple API
- **Use Case**: Development and small-scale applications
### 5. Qdrant
- **Type**: Fast, filtered search
- **Features**: Advanced filtering, payload support
- **Use Case**: Applications requiring complex metadata filtering
### 6. FAISS
- **Type**: Meta's library, local deployment
- **Features**: High performance, CPU/GPU optimized
- **Use Case**: Research and applications needing full control
## Configuration Examples
### Pinecone Setup
```python
import pinecone
from langchain.vectorstores import Pinecone
pinecone.init(api_key="your-api-key", environment="us-west1-gcp")
index = pinecone.Index("your-index-name")
vectorstore = Pinecone(index, embeddings.embed_query, "text")
```
### Weaviate Setup
```python
import weaviate
from langchain.vectorstores import Weaviate
client = weaviate.Client("http://localhost:8080")
vectorstore = Weaviate(client, "Document", "content", embeddings)
```
### Chroma Local Setup
```python
from langchain.vectorstores import Chroma
vectorstore = Chroma(
collection_name="my_collection",
embedding_function=embeddings,
persist_directory="./chroma_db"
)
```
## Selection Criteria
1. **Scale**: Number of documents and expected query volume
2. **Performance**: Latency requirements and throughput needs
3. **Deployment**: Cloud vs on-premise preferences
4. **Features**: Filtering, hybrid search, metadata support
5. **Cost**: Budget constraints and operational overhead
6. **Maintenance**: Team expertise and available resources
## Best Practices
1. **Indexing Strategy**: Choose appropriate distance metrics (cosine, euclidean); see the sketch below
2. **Sharding**: Distribute data for large-scale deployments
3. **Monitoring**: Track query performance and system health
4. **Backups**: Implement regular backup procedures
5. **Security**: Secure access to sensitive data
6. **Optimization**: Tune parameters for your specific use case