2025-11-29 18:02:28 +08:00

# agent-benchmark-kit
Automated quality assurance for Claude Code agents using LLM-as-judge evaluation