Initial commit

2025-11-29 18:45:58 +08:00
commit 4b6db3349f
68 changed files with 15165 additions and 0 deletions
--- a/skills/fine-tune/SKILL.md
+++ b/skills/fine-tune/SKILL.md
@@ -0,0 +1,153 @@
+---
+name: fine-tune
+description: Use when you need to fine-tune(ファインチューニング) and optimize LangGraph applications based on evaluation criteria. This skill performs iterative prompt optimization for LangGraph nodes without changing the graph structure.
+---
+
+# LangGraph Application Fine-Tuning Skill
+
+A skill for iteratively optimizing prompts and processing logic in each node of a LangGraph application based on evaluation criteria.
+
+## 📋 Overview
+
+This skill executes the following process to improve the performance of existing LangGraph applications:
+
+1. **Load Objectives**: Retrieve optimization goals and evaluation criteria from `.langgraph-master/fine-tune.md` (if this file doesn't exist, help the user create it based on their requirements)
+2. **Identify Optimization Targets**: Extract nodes containing LLM prompts using Serena MCP (if Serena MCP is unavailable, investigate the codebase using ls, read, etc.)
+3. **Baseline Evaluation**: Measure current performance through multiple runs
+4. **Implement Improvements**: Identify the most effective improvement areas and optimize prompts and processing logic
+5. **Re-evaluation**: Measure performance after improvements
+6. **Iteration**: Repeat steps 4-5 until goals are achieved
+
+**Important Constraint**: Only optimize prompts and processing logic within each node without modifying the graph structure (nodes, edges configuration).
+
+## 🎯 When to Use This Skill
+
+Use this skill in the following situations:
+
+1. **When performance improvement of existing applications is needed**
+
+   - Want to improve LLM output quality
+   - Want to improve response speed
+   - Want to reduce error rate
+
+2. **When evaluation criteria are clear**
+
+   - Optimization goals are defined in `.langgraph-master/fine-tune.md`
+   - Quantitative evaluation methods are established
+
+3. **When improvements through prompt engineering are expected**
+   - Improvements are likely with clearer LLM instructions
+   - Adding few-shot examples would be effective
+   - Output format adjustment is needed
+
+## 📖 Fine-Tuning Workflow Overview
+
+### Phase 1: Preparation and Analysis
+
+**Purpose**: Understand optimization targets and current state
+
+**Main Steps**:
+
+1. Load objective setting file (`.langgraph-master/fine-tune.md`)
+2. Identify optimization targets (Serena MCP or manual code investigation)
+3. Create optimization target list (evaluate improvement potential for each node)
+
+→ See [workflow.md](workflow.md#phase-1-preparation-and-analysis) for details
+
+### Phase 2: Baseline Evaluation
+
+**Purpose**: Quantitatively measure current performance
+
+**Main Steps**: 4. Prepare evaluation environment (test cases, evaluation scripts) 5. Baseline measurement (recommended: 3-5 runs) 6. Analyze baseline results (identify problems)
+
+**Important**: When evaluation programs are needed, create evaluation code in a specific subdirectory (users may specify the directory).
+
+→ See [workflow.md](workflow.md#phase-2-baseline-evaluation) and [evaluation.md](evaluation.md) for details
+
+### Phase 3: Iterative Improvement
+
+**Purpose**: Data-driven incremental improvement
+
+**Main Steps**: 7. Prioritization (select the most impactful improvement area) 8. Implement improvements (prompt optimization, parameter tuning) 9. Post-improvement evaluation (re-evaluate under the same conditions) 10. Compare and analyze results (measure improvement effects) 11. Decide whether to continue iteration (repeat until goals are achieved)
+
+→ See [workflow.md](workflow.md#phase-3-iterative-improvement) and [prompt_optimization.md](prompt_optimization.md) for details
+
+### Phase 4: Completion and Documentation
+
+**Purpose**: Record achievements and provide future recommendations
+
+**Main Steps**: 12. Create final evaluation report (improvement content, results, recommendations) 13. Code commit and documentation update
+
+→ See [workflow.md](workflow.md#phase-4-completion-and-documentation) for details
+
+## 🔧 Tools and Technologies Used
+
+### MCP Server Utilization
+
+- **Serena MCP**: Codebase analysis and optimization target identification
+
+  - `find_symbol`: Search for LLM clients
+  - `find_referencing_symbols`: Identify prompt construction locations
+  - `get_symbols_overview`: Understand node structure
+
+- **Sequential MCP**: Complex analysis and decision making
+  - Determine improvement priorities
+  - Analyze evaluation results
+  - Plan next actions
+
+### Key Optimization Techniques
+
+1. **Few-Shot Examples**: Accuracy +10-20%
+2. **Structured Output Format**: Parsing errors -90%
+3. **Temperature/Max Tokens Adjustment**: Cost -20-40%
+4. **Model Selection Optimization**: Cost -40-60%
+5. **Prompt Caching**: Cost -50-90% (on cache hit)
+
+→ See [prompt_optimization.md](prompt_optimization.md) for details
+
+## 📚 Related Documentation
+
+Detailed guidelines and best practices:
+
+- **[workflow.md](workflow.md)** - Fine-tuning workflow details (execution procedures and code examples for each phase)
+- **[evaluation.md](evaluation.md)** - Evaluation methods and best practices (metric calculation, statistical analysis, test case design)
+- **[prompt_optimization.md](prompt_optimization.md)** - Prompt optimization techniques (10 practical methods and priorities)
+- **[examples.md](examples.md)** - Practical examples collection (copy-and-paste ready code examples and template collection)
+
+## ⚠️ Important Notes
+
+1. **Preserve Graph Structure**
+
+   - Do not add or remove nodes or edges
+   - Do not change data flow between nodes
+   - Maintain state schema
+
+2. **Evaluation Consistency**
+
+   - Use the same test cases
+   - Measure with the same evaluation metrics
+   - Run multiple times to confirm statistically significant improvements
+
+3. **Cost Management**
+
+   - Consider evaluation execution costs
+   - Adjust sample size as needed
+   - Be mindful of API rate limits
+
+4. **Version Control**
+   - Git commit each iteration's changes
+   - Maintain rollback-capable state
+   - Record evaluation results
+
+## 🎓 Fine-Tuning Best Practices
+
+1. **Start Small**: Optimize from the most impactful node
+2. **Measurement-Driven**: Always perform quantitative evaluation before and after improvements
+3. **Incremental Improvement**: Validate one change at a time, not multiple simultaneously
+4. **Documentation**: Record reasons and results for each change
+5. **Iteration**: Continuously improve until goals are achieved
+
+## 🔗 Reference Links
+
+- [LangGraph Official Documentation](https://docs.langchain.com/oss/python/langgraph/overview)
+- [Prompt Engineering Guide](https://www.promptingguide.ai/)