# Anthropic Best Practices Checklist

Evaluation criteria for assessing Claude Skill quality based on official Anthropic guidelines.

## Purpose

Use this checklist to evaluate skills found on GitHub. Each criterion contributes up to its weight toward the overall quality score (0-10).

## Evaluation Criteria

### 1. Description Quality (Weight: 2.0)

**What to check:**
- [ ] Description is specific, not vague
- [ ] Includes what the skill does
- [ ] Includes when to use it (trigger conditions)
- [ ] Contains key terms users would mention
- [ ] Written in third person
- [ ] At most 1024 characters
- [ ] No XML tags

**Scoring:**
- 2.0: All criteria met, very clear and specific
- 1.5: Most criteria met, good clarity
- 1.0: Basic description, somewhat vague
- 0.5: Very vague or generic
- 0.0: Missing or completely unclear

**Examples:**

**Good (2.0):**
```yaml
description: Analyze Excel spreadsheets, create pivot tables, generate charts. Use when working with Excel files, spreadsheets, tabular data, or .xlsx files.
```

**Bad (0.5):**
```yaml
description: Helps with documents
```

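
The mechanical parts of the description check (length, XML tags, trigger phrasing, third person) can be pre-screened automatically before a reviewer assigns the 0.0-2.0 score. The following is a minimal sketch, not an official validator: `check_description`, the trigger phrases, and the pronoun heuristic are assumptions made for illustration, and the 1024-character limit is treated as inclusive.

```python
import re

MAX_DESCRIPTION_CHARS = 1024  # limit taken from the checklist; assumed inclusive


def check_description(description: str) -> dict[str, bool]:
    """Run the mechanical description checks; specificity still needs a human."""
    text = description.strip()
    lowered = text.lower()
    return {
        "non_empty": bool(text),
        "within_length_limit": len(text) <= MAX_DESCRIPTION_CHARS,
        # Rough heuristic: anything shaped like <tag> or </tag>.
        "no_xml_tags": re.search(r"</?[A-Za-z][^<>]*>", text) is None,
        # Hypothetical trigger phrases; extend the tuple as needed.
        "mentions_when_to_use": any(
            phrase in lowered for phrase in ("use when", "use for", "use this")
        ),
        # Third person: flag first- or second-person pronouns.
        "third_person": re.search(r"\b(i|we|you|my|your)\b", lowered) is None,
    }


good = (
    "Analyze Excel spreadsheets, create pivot tables, generate charts. "
    "Use when working with Excel files, spreadsheets, tabular data, or .xlsx files."
)
bad = "Helps with documents"

good_report = check_description(good)  # every check passes
bad_report = check_description(bad)    # fails "mentions_when_to_use"
```

A passing report only means the description clears the mechanical bar; whether it is specific enough for a 2.0 remains a judgment call.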
### 2. Name Convention (Weight: 0.5)

**What to check:**
- [ ] Uses lowercase letters, numbers, hyphens only
- [ ] At most 64 characters
- [ ] Follows naming pattern (gerund form preferred)
- [ ] Descriptive, not vague
- [ ] No reserved words ("anthropic", "claude")

**Scoring:**
- 0.5: Follows all conventions
- 0.25: Minor issues (e.g., not gerund but still clear)
- 0.0: Violates conventions or very vague

**Good:** `processing-pdfs`, `analyzing-spreadsheets`
**Bad:** `helper`, `utils`, `claude-tool`

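
The character, length, and reserved-word rules are easy to check with a regular expression. A minimal sketch follows, assuming the 64-character limit is inclusive; `check_name` is a hypothetical helper, and the gerund test (first hyphen-separated word ends in "ing") is only a rough heuristic.

```python
import re

NAME_PATTERN = re.compile(r"[a-z0-9-]+")  # lowercase letters, numbers, hyphens
RESERVED_WORDS = ("anthropic", "claude")
MAX_NAME_CHARS = 64  # assumed inclusive


def check_name(name: str) -> dict[str, bool]:
    """Check the mechanical naming rules; descriptiveness is judged manually."""
    return {
        "valid_characters": NAME_PATTERN.fullmatch(name) is not None,
        "within_length_limit": len(name) <= MAX_NAME_CHARS,
        "no_reserved_words": not any(word in name for word in RESERVED_WORDS),
        # Heuristic only: "processing-pdfs" passes, "pdf-processor" does not.
        "gerund_form": name.split("-")[0].endswith("ing"),
    }


report = check_name("claude-tool")  # fails "no_reserved_words" and "gerund_form"
```

A name such as `helper` passes the character and length checks but still scores 0.0 for being vague, so the report is an input to the score, not the score itself.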
### 3. Conciseness (Weight: 1.5)

**What to check:**
- [ ] SKILL.md body under 500 lines
- [ ] No unnecessary explanations
- [ ] Assumes Claude's intelligence
- [ ] Gets to the point quickly
- [ ] Additional content in separate files if needed

**Scoring:**
- 1.5: Very concise, well-edited, <300 lines
- 1.0: Reasonable length, <500 lines
- 0.5: Long but not excessive, 500-800 lines
- 0.0: Very verbose, >800 lines

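
The line-count thresholds in this scale translate directly into a lookup. The sketch below uses a hypothetical `conciseness_score` helper and gives the highest score the length alone allows; a reviewer may still lower it if the file is short but padded with unnecessary explanation.

```python
def conciseness_score(line_count: int) -> float:
    """Map a SKILL.md body line count to the length-based conciseness score."""
    if line_count < 0:
        raise ValueError("line_count must be non-negative")
    if line_count < 300:
        return 1.5
    if line_count < 500:
        return 1.0
    if line_count <= 800:
        return 0.5
    return 0.0


score = conciseness_score(420)  # 1.0: under 500 lines but not under 300
```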
### 4. Progressive Disclosure (Weight: 1.0)

**What to check:**
- [ ] SKILL.md serves as overview/table of contents
- [ ] Additional details in separate files
- [ ] Clear references to other files
- [ ] Files organized by domain/feature
- [ ] No deeply nested references (max 1 level deep)

**Scoring:**
- 1.0: Excellent use of progressive disclosure
- 0.75: Good organization with some references
- 0.5: Some separation, could be better
- 0.25: All content in SKILL.md, no references
- 0.0: Poorly organized or deeply nested

### 5. Examples and Workflows (Weight: 1.0)

**What to check:**
- [ ] Has concrete examples (not abstract)
- [ ] Includes code snippets
- [ ] Shows input/output pairs
- [ ] Has clear workflows for complex tasks
- [ ] Examples use real patterns, not placeholders

**Scoring:**
- 1.0: Excellent examples and clear workflows
- 0.75: Good examples, some workflows
- 0.5: Basic examples, no workflows
- 0.25: Few or abstract examples
- 0.0: No examples

### 6. Appropriate Degree of Freedom (Weight: 0.5)

**What to check:**
- [ ] Instructions match task fragility
- [ ] High freedom for flexible tasks (text instructions)
- [ ] Low freedom for fragile tasks (specific scripts)
- [ ] Clear when to use exact commands vs adapt

**Scoring:**
- 0.5: Perfect match of freedom to task type
- 0.25: Reasonable but could be better
- 0.0: Inappropriate level (too rigid or too loose)

### 7. Dependencies Documentation (Weight: 0.5)

**What to check:**
- [ ] Required packages listed
- [ ] Installation instructions provided
- [ ] Dependencies verified as available
- [ ] No assumption of pre-installed packages

**Scoring:**
- 0.5: All dependencies documented and verified
- 0.25: Dependencies mentioned but not fully documented
- 0.0: Dependencies assumed or not mentioned

### 8. Structure and Organization (Weight: 1.0)

**What to check:**
- [ ] Clear section headings
- [ ] Logical flow of information
- [ ] Table of contents for long files
- [ ] Consistent formatting
- [ ] Unix-style paths (forward slashes)

**Scoring:**
- 1.0: Excellently organized
- 0.75: Well organized with minor issues
- 0.5: Basic organization
- 0.25: Poor organization
- 0.0: No clear structure

### 9. Error Handling (Weight: 0.5)

**What to check (for skills with scripts):**
- [ ] Scripts handle errors explicitly
- [ ] Clear error messages
- [ ] Fallback strategies provided
- [ ] Validation loops for critical operations
- [ ] No "voodoo constants" (unexplained magic numbers)

**Scoring:**
- 0.5: Excellent error handling
- 0.25: Basic error handling
- 0.0: No error handling, or scripts punt errors to Claude

### 10. Avoids Anti-Patterns (Weight: 1.0)

**What to avoid (check each anti-pattern the skill is free of):**
- [ ] Time-sensitive information
- [ ] Inconsistent terminology
- [ ] Windows-style paths
- [ ] Offering too many options without guidance
- [ ] Deeply nested references
- [ ] Vague or generic content

**Scoring:**
- 1.0: No anti-patterns
- 0.75: 1-2 minor anti-patterns
- 0.5: Multiple anti-patterns
- 0.0: Severe anti-patterns

### 11. Testing and Validation (Weight: 0.5)

**What to check:**
- [ ] Evidence of testing mentioned
- [ ] Evaluation examples provided
- [ ] Clear success criteria
- [ ] Feedback loops for quality

**Scoring:**
- 0.5: Clear testing approach
- 0.25: Some testing mentioned
- 0.0: No testing mentioned

## Scoring System

**Total possible: 10.0 points**

Each criterion's scoring scale above already tops out at its weight (for example, description quality ranges from 0.0 to 2.0), so the overall score is the plain sum of the criterion scores. Multiplying by the weights again would push the maximum past 10.0.

```
quality_score = (
    description_score +             # max 2.0
    name_score +                    # max 0.5
    conciseness_score +             # max 1.5
    progressive_disclosure_score +  # max 1.0
    examples_score +                # max 1.0
    freedom_score +                 # max 0.5
    dependencies_score +            # max 0.5
    structure_score +               # max 1.0
    error_handling_score +          # max 0.5
    anti_patterns_score +           # max 1.0
    testing_score                   # max 0.5
)
```

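
One reading that keeps the total at 10.0 is to treat each criterion's score as already scaled to its weight, so the overall score is a bounded sum. The sketch below follows that reading; `MAX_POINTS` and `quality_score` are hypothetical names, and the per-criterion maxima are the weights listed in the section headings.

```python
# Maximum points per criterion, taken from the "Weight" in each heading.
MAX_POINTS = {
    "description": 2.0,
    "name": 0.5,
    "conciseness": 1.5,
    "progressive_disclosure": 1.0,
    "examples": 1.0,
    "freedom": 0.5,
    "dependencies": 0.5,
    "structure": 1.0,
    "error_handling": 0.5,
    "anti_patterns": 1.0,
    "testing": 0.5,
}


def quality_score(scores: dict[str, float]) -> float:
    """Sum criterion scores, rejecting any that exceed the criterion's maximum."""
    unknown = set(scores) - set(MAX_POINTS)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    total = 0.0
    for criterion, maximum in MAX_POINTS.items():
        value = scores.get(criterion, 0.0)  # unscored criteria count as 0.0
        if not 0.0 <= value <= maximum:
            raise ValueError(f"{criterion} must be between 0.0 and {maximum}")
        total += value
    return round(total, 2)


perfect = quality_score(MAX_POINTS)  # 10.0
```

Validating each value against its maximum catches the most likely mistake, which is entering a score on a 0-1 or 0-10 scale instead of the criterion's own scale.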
## Quality Tiers

**Excellent (8.0-10.0):**
- Follows all best practices
- Clearly professional
- Ready for production use
- **Recommendation:** Strongly recommended

**Good (6.0-7.9):**
- Follows most best practices
- Minor improvements needed
- Usable but not perfect
- **Recommendation:** Recommended with minor notes

**Fair (4.0-5.9):**
- Follows some best practices
- Several improvements needed
- May work but needs review
- **Recommendation:** Consider with caution

**Poor (0.0-3.9):**
- Violates many best practices
- Significant issues
- High risk of problems
- **Recommendation:** Not recommended

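
The tier boundaries can be applied as lower-bound thresholds, which also settles where a score such as 7.95 lands (it stays in Good until it reaches 8.0). A minimal sketch, with `quality_tier` as a hypothetical helper name:

```python
def quality_tier(score: float) -> tuple[str, str]:
    """Return (tier, recommendation) for an overall quality score from 0 to 10."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0.0 and 10.0")
    if score >= 8.0:
        return ("Excellent", "Strongly recommended")
    if score >= 6.0:
        return ("Good", "Recommended with minor notes")
    if score >= 4.0:
        return ("Fair", "Consider with caution")
    return ("Poor", "Not recommended")


tier, recommendation = quality_tier(7.25)  # ("Good", "Recommended with minor notes")
```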
## Quick Evaluation Process

For rapid assessment during search:

1. **Read SKILL.md frontmatter** (30 sec)
   - Check description quality (most important)
   - Check name convention

2. **Scan SKILL.md body** (1-2 min)
   - Check length (<500 lines?)
   - Look for examples
   - Check for references to other files
   - Note any obvious anti-patterns

3. **Check file structure** (30 sec)
   - Look for reference files
   - Check for scripts/utilities
   - Verify organization

4. **Calculate quick score** (30 sec)
   - Focus on the highest-weighted criteria (description, conciseness)
   - Estimate tier (Excellent/Good/Fair/Poor)

**Total time per skill: ~2.5-3.5 minutes**

## Automation Tips

When evaluating multiple skills:

```bash
# Check SKILL.md length
wc -l SKILL.md

# Count reference files
find . -name "*.md" -not -name "SKILL.md" | wc -l

# Flag first- or second-person phrasing (descriptions should be third person)
grep -inE "claude can help|I can help|you can use" SKILL.md

# Find Windows-style paths (lists every line containing a backslash; review hits manually)
grep -n '\\' SKILL.md

# Check description length in characters, excluding the "description:" key
grep -m1 '^description:' SKILL.md | sed 's/^description:[[:space:]]*//' | tr -d '\n' | wc -m
```

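
When many skill directories need the same quick checks, the shell one-liners can be gathered into a single function. The following is a rough sketch under stated assumptions: `quick_stats` is a hypothetical helper, each skill lives in its own directory containing a `SKILL.md`, and the frontmatter `description:` value sits on a single line.

```python
from pathlib import Path


def quick_stats(skill_dir: Path) -> dict[str, int]:
    """Collect the quick mechanical measurements for one skill directory."""
    lines = (skill_dir / "SKILL.md").read_text(encoding="utf-8").splitlines()

    # First "description:" line only; multi-line YAML values are not handled.
    description = ""
    for line in lines:
        if line.startswith("description:"):
            description = line[len("description:"):].strip()
            break

    # Every Markdown file other than SKILL.md counts as a reference file.
    reference_files = [p for p in skill_dir.rglob("*.md") if p.name != "SKILL.md"]

    return {
        "line_count": len(lines),
        "reference_files": len(reference_files),
        "description_chars": len(description),
        # Lines with a backslash may be Windows paths or ordinary escapes.
        "backslash_lines": sum("\\" in line for line in lines),
    }
```

Calling `quick_stats(Path("some-skill"))` for each candidate directory gives comparable numbers to feed into the conciseness and description checks; the directory name here is only a placeholder.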
## Reference

Based on official Anthropic documentation:
- [Agent Skills Overview](https://docs.anthropic.com/en/docs/agents-and-tools/agent-skills/overview)
- [Best Practices Guide](https://docs.anthropic.com/en/docs/agents-and-tools/agent-skills/best-practices)
- [Claude Code Skills](https://docs.anthropic.com/en/docs/claude-code/skills)

---

**Usage:** Use this checklist when evaluating skills found through skill-finder to provide quality scores and recommendations to users.