| name | description |
|---|---|
| fine-tune | Use when you need to fine-tune and optimize LangGraph applications based on evaluation criteria. This skill performs iterative prompt optimization for LangGraph nodes without changing the graph structure. |
# LangGraph Application Fine-Tuning Skill

A skill for iteratively optimizing the prompts and processing logic in each node of a LangGraph application based on evaluation criteria.
## 📋 Overview

This skill improves the performance of existing LangGraph applications through the following process:

1. **Load Objectives**: Retrieve optimization goals and evaluation criteria from `.langgraph-master/fine-tune.md` (if this file doesn't exist, help the user create it based on their requirements)
2. **Identify Optimization Targets**: Extract nodes containing LLM prompts using Serena MCP (if Serena MCP is unavailable, investigate the codebase with `ls`, `read`, etc.)
3. **Baseline Evaluation**: Measure current performance across multiple runs
4. **Implement Improvements**: Identify the most effective improvement areas and optimize prompts and processing logic
5. **Re-evaluation**: Measure performance after the improvements
6. **Iteration**: Repeat steps 4-5 until the goals are achieved

**Important Constraint**: Only optimize prompts and processing logic within each node; do not modify the graph structure (node and edge configuration).
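The iterate-and-keep-only-wins shape of steps 4-6 can be sketched as follows. Everything here is a placeholder standing in for project-specific logic (`score_fn` wraps your evaluation run; each `change` is one candidate prompt or parameter tweak); none of it is a real LangGraph API.

```python
# Sketch of the outer fine-tune loop (steps 4-6). All names are placeholders
# for project-specific logic, not part of any real API.

def fine_tune_loop(score_fn, improvements, target_score, max_iterations=5):
    """Apply candidate improvements one at a time, keeping only those
    that raise the evaluation score, until the target is reached."""
    best_score = score_fn({})               # baseline measurement (step 3)
    applied = {}
    for change in improvements[:max_iterations]:
        candidate = {**applied, **change}   # step 4: apply one change
        score = score_fn(candidate)         # step 5: re-evaluate
        if score > best_score:              # keep only measurable wins
            best_score, applied = score, candidate
        if best_score >= target_score:      # step 6: stop when goal met
            break
    return applied, best_score
```

The key design point is that a change which does not improve the score is discarded before the next candidate is tried, so each kept change is individually validated.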
## 🎯 When to Use This Skill

Use this skill in the following situations:

- **When performance improvement of an existing application is needed**
  - You want to improve LLM output quality
  - You want to improve response speed
  - You want to reduce the error rate
- **When evaluation criteria are clear**
  - Optimization goals are defined in `.langgraph-master/fine-tune.md`
  - Quantitative evaluation methods are established
- **When improvements through prompt engineering are expected**
  - Improvements are likely with clearer LLM instructions
  - Adding few-shot examples would be effective
  - The output format needs adjustment
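As a concrete illustration of the few-shot case above, a bare instruction can be turned into a few-shot prompt without touching the graph. The example pairs below are illustrative, not from a real dataset, and the helper name is hypothetical:

```python
# Sketch: prepend worked input/output pairs so the model sees the expected
# format. The instruction, examples, and query below are all illustrative.

def build_few_shot_prompt(instruction, examples, query):
    """Join (input, output) example pairs ahead of the real query."""
    shots = "\n\n".join(f"Input: {inp}\nOutput: {out}" for inp, out in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nOutput:"

prompt = build_few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("Great docs!", "positive"), ("Constant timeouts.", "negative")],
    "The new release fixed my issue.",
)
```

Because the prompt string is built inside the node, this change preserves the graph structure entirely.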
## 📖 Fine-Tuning Workflow Overview

### Phase 1: Preparation and Analysis

**Purpose**: Understand the optimization targets and current state

**Main Steps**:

1. Load the objective-setting file (`.langgraph-master/fine-tune.md`)
2. Identify optimization targets (Serena MCP or manual code investigation)
3. Create an optimization target list (evaluate the improvement potential of each node)

→ See workflow.md for details
### Phase 2: Baseline Evaluation

**Purpose**: Quantitatively measure current performance

**Main Steps**:

4. Prepare the evaluation environment (test cases, evaluation scripts)
5. Measure the baseline (recommended: 3-5 runs)
6. Analyze the baseline results (identify problems)

**Important**: When evaluation programs are needed, create the evaluation code in a dedicated subdirectory (users may specify the directory).

→ See workflow.md and evaluation.md for details
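A baseline measurement of this kind can be sketched with the standard library alone. Here `run_app` is a stand-in for invoking the actual graph and scoring its output; the test cases and run count are illustrative:

```python
# Sketch of step 5: run the same test cases several times and summarize the
# scores, so later improvements can be compared against a stable number.
from statistics import mean, stdev

def measure_baseline(run_app, test_cases, runs=3):
    """Score every test case `runs` times and report mean and spread."""
    scores = [run_app(case) for _ in range(runs) for case in test_cases]
    return {"runs": runs, "mean": mean(scores), "stdev": stdev(scores)}
```

Reporting the standard deviation alongside the mean matters: a later "improvement" smaller than the run-to-run spread is likely noise.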
### Phase 3: Iterative Improvement

**Purpose**: Data-driven incremental improvement

**Main Steps**:

7. Prioritize (select the most impactful improvement area)
8. Implement improvements (prompt optimization, parameter tuning)
9. Evaluate after the improvement (re-evaluate under the same conditions)
10. Compare and analyze the results (measure the improvement effect)
11. Decide whether to continue iterating (repeat until the goals are achieved)

→ See workflow.md and prompt_optimization.md for details
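The compare-and-decide step (step 10) can be sketched as a small helper over the evaluation summaries. The dictionary shape matches nothing prescribed by this skill; it is an illustrative convention:

```python
# Sketch of step 10: compare two evaluation summaries (dicts with a "mean"
# key, an illustrative convention) and decide whether to keep iterating.

def compare_runs(baseline, improved, goal):
    """Report absolute/relative change and whether the goal is met."""
    delta = improved["mean"] - baseline["mean"]
    return {
        "delta": delta,
        "relative": delta / baseline["mean"] if baseline["mean"] else float("inf"),
        "goal_met": improved["mean"] >= goal,
    }
```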
### Phase 4: Completion and Documentation

**Purpose**: Record achievements and provide future recommendations

**Main Steps**:

12. Create the final evaluation report (changes made, results, recommendations)
13. Commit the code and update the documentation

→ See workflow.md for details
## 🔧 Tools and Technologies Used

### MCP Server Utilization

- **Serena MCP**: Codebase analysis and optimization target identification
  - `find_symbol`: Search for LLM clients
  - `find_referencing_symbols`: Identify prompt construction locations
  - `get_symbols_overview`: Understand node structure
- **Sequential MCP**: Complex analysis and decision making
  - Determine improvement priorities
  - Analyze evaluation results
  - Plan next actions
### Key Optimization Techniques
- Few-Shot Examples: Accuracy +10-20%
- Structured Output Format: Parsing errors -90%
- Temperature/Max Tokens Adjustment: Cost -20-40%
- Model Selection Optimization: Cost -40-60%
- Prompt Caching: Cost -50-90% (on cache hit)
→ See prompt_optimization.md for details
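The temperature/max-tokens technique above can be applied per node without changing the graph. The node names and parameter table below are illustrative, not a specific LangGraph API; the values would come from your own measurements:

```python
# Sketch of per-node parameter tuning: lower temperature for a deterministic
# extraction step, cap max_tokens to cut cost. Node names are hypothetical.

NODE_PARAMS = {
    "extract": {"temperature": 0.0, "max_tokens": 256},    # deterministic, short
    "summarize": {"temperature": 0.3, "max_tokens": 512},  # mild variety
}

def params_for(node_name):
    """Fall back to conservative defaults for nodes not tuned yet."""
    return NODE_PARAMS.get(node_name, {"temperature": 0.7, "max_tokens": 1024})
```

Keeping the tuned values in one table makes each iteration's change easy to diff and roll back, which ties into the version-control notes below.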
## 📚 Related Documentation

Detailed guidelines and best practices:

- **workflow.md** - Fine-tuning workflow details (execution procedures and code examples for each phase)
- **evaluation.md** - Evaluation methods and best practices (metric calculation, statistical analysis, test case design)
- **prompt_optimization.md** - Prompt optimization techniques (10 practical methods and their priorities)
- **examples.md** - Practical examples (copy-and-paste-ready code and templates)
## ⚠️ Important Notes

- **Preserve the Graph Structure**
  - Do not add or remove nodes or edges
  - Do not change the data flow between nodes
  - Maintain the state schema
- **Evaluation Consistency**
  - Use the same test cases
  - Measure with the same evaluation metrics
  - Run multiple times to confirm statistically significant improvements
- **Cost Management**
  - Consider evaluation execution costs
  - Adjust the sample size as needed
  - Be mindful of API rate limits
- **Version Control**
  - Git-commit each iteration's changes
  - Keep the repository in a rollback-capable state
  - Record evaluation results
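The "statistically significant" check under Evaluation Consistency can be approximated with a rough effect-size test: only treat a change as real if the mean gain exceeds the run-to-run spread. This is a simplified stand-in, not a full t-test; the threshold `k` is an assumption to tune per project:

```python
# Rough significance check: mean gain vs. pooled standard deviation.
# A simplified sketch -- use a proper t-test for rigorous comparisons.
from statistics import mean, stdev

def is_meaningful_improvement(before, after, k=1.0):
    """True only if the mean gain exceeds k pooled standard deviations."""
    pooled = (stdev(before) + stdev(after)) / 2 or 1e-9
    return (mean(after) - mean(before)) / pooled > k
```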
## 🎓 Fine-Tuning Best Practices

- **Start Small**: Optimize the most impactful node first
- **Measurement-Driven**: Always evaluate quantitatively before and after each improvement
- **Incremental Improvement**: Validate one change at a time, not several simultaneously
- **Documentation**: Record the reason for and result of each change
- **Iteration**: Keep improving until the goals are achieved