# Anthropic Best Practices Checklist

Evaluation criteria for assessing Claude Skill quality based on official Anthropic guidelines.

## Purpose

Use this checklist to evaluate skills found on GitHub. Each criterion contributes to the overall quality score (0-10).

## Evaluation Criteria

### 1. Description Quality (Weight: 2.0)

**What to check:**
- [ ] Description is specific, not vague
- [ ] Includes what the skill does
- [ ] Includes when to use it (trigger conditions)
- [ ] Contains key terms users would mention
- [ ] Written in third person
- [ ] Under 1024 characters
- [ ] No XML tags

**Scoring:**
- 2.0: All criteria met, very clear and specific
- 1.5: Most criteria met, good clarity
- 1.0: Basic description, somewhat vague
- 0.5: Very vague or generic
- 0.0: Missing or completely unclear

**Examples:**

**Good (2.0):**
```yaml
description: Analyze Excel spreadsheets, create pivot tables, generate charts. Use when working with Excel files, spreadsheets, tabular data, or .xlsx files.
```

**Bad (0.5):**
```yaml
description: Helps with documents
```

### 2. Name Convention (Weight: 0.5)

**What to check:**
- [ ] Uses lowercase letters, numbers, hyphens only
- [ ] Under 64 characters
- [ ] Follows naming pattern (gerund form preferred)
- [ ] Descriptive, not vague
- [ ] No reserved words ("anthropic", "claude")

**Scoring:**
- 0.5: Follows all conventions
- 0.25: Minor issues (e.g., not gerund but still clear)
- 0.0: Violates conventions or very vague

**Good:** `processing-pdfs`, `analyzing-spreadsheets`

**Bad:** `helper`, `utils`, `claude-tool`

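The hard constraints in criteria 1 and 2 (character set, length limits, reserved words, no XML tags) can be screened automatically before making the judgment calls. Below is a minimal sketch, assuming the frontmatter has already been parsed into `name` and `description` strings; the function name, regex, and messages are illustrative, not part of the official guidelines.

```python
import re

RESERVED_WORDS = {"anthropic", "claude"}
# Lowercase letters, digits, and single hyphens; no leading/trailing hyphen.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")

def check_frontmatter(name: str, description: str) -> list[str]:
    """Return violations of the machine-checkable parts of criteria 1 and 2."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append("name should use only lowercase letters, numbers, and hyphens")
    if len(name) >= 64:
        problems.append("name should be under 64 characters")
    if any(word in name.lower() for word in RESERVED_WORDS):
        problems.append("name should not contain reserved words ('anthropic', 'claude')")
    if not description.strip():
        problems.append("description is missing")
    if len(description) >= 1024:
        problems.append("description should be under 1024 characters")
    if re.search(r"<[^>]+>", description):  # rough heuristic for XML tags
        problems.append("description should not contain XML tags")
    return problems
```

Specificity, trigger conditions, and third-person phrasing still need human (or model) judgment; a script like this only screens the mechanical requirements.
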
### 3. Conciseness (Weight: 1.5)

**What to check:**
- [ ] SKILL.md body under 500 lines
- [ ] No unnecessary explanations
- [ ] Assumes Claude's intelligence
- [ ] Gets to the point quickly
- [ ] Additional content in separate files if needed

**Scoring:**
- 1.5: Very concise, well-edited, <300 lines
- 1.0: Reasonable length, <500 lines
- 0.5: Long but not excessive, 500-800 lines
- 0.0: Very verbose, >800 lines

### 4. Progressive Disclosure (Weight: 1.0)

**What to check:**
- [ ] SKILL.md serves as overview/table of contents
- [ ] Additional details in separate files
- [ ] Clear references to other files
- [ ] Files organized by domain/feature
- [ ] No deeply nested references (max 1 level deep)

**Scoring:**
- 1.0: Excellent use of progressive disclosure
- 0.75: Good organization with some references
- 0.5: Some separation, could be better
- 0.25: All content in SKILL.md, no references
- 0.0: Poorly organized or deeply nested

### 5. Examples and Workflows (Weight: 1.0)

**What to check:**
- [ ] Has concrete examples (not abstract)
- [ ] Includes code snippets
- [ ] Shows input/output pairs
- [ ] Has clear workflows for complex tasks
- [ ] Examples use real patterns, not placeholders

**Scoring:**
- 1.0: Excellent examples and clear workflows
- 0.75: Good examples, some workflows
- 0.5: Basic examples, no workflows
- 0.25: Few or abstract examples
- 0.0: No examples

### 6. Appropriate Degree of Freedom (Weight: 0.5)

**What to check:**
- [ ] Instructions match task fragility
- [ ] High freedom for flexible tasks (text instructions)
- [ ] Low freedom for fragile tasks (specific scripts)
- [ ] Clear when to use exact commands vs. adapt

**Scoring:**
- 0.5: Perfect match of freedom to task type
- 0.25: Reasonable but could be better
- 0.0: Inappropriate level (too rigid or too loose)

### 7. Dependencies Documentation (Weight: 0.5)

**What to check:**
- [ ] Required packages listed
- [ ] Installation instructions provided
- [ ] Dependencies verified as available
- [ ] No assumption of pre-installed packages

**Scoring:**
- 0.5: All dependencies documented and verified
- 0.25: Dependencies mentioned but not fully documented
- 0.0: Dependencies assumed or not mentioned

### 8. Structure and Organization (Weight: 1.0)

**What to check:**
- [ ] Clear section headings
- [ ] Logical flow of information
- [ ] Table of contents for long files
- [ ] Consistent formatting
- [ ] Unix-style paths (forward slashes)

**Scoring:**
- 1.0: Excellently organized
- 0.75: Well organized with minor issues
- 0.5: Basic organization
- 0.25: Poor organization
- 0.0: No clear structure

### 9. Error Handling (Weight: 0.5)

**What to check (for skills with scripts):**
- [ ] Scripts handle errors explicitly
- [ ] Clear error messages
- [ ] Fallback strategies provided
- [ ] Validation loops for critical operations
- [ ] No "voodoo constants"

**Scoring:**
- 0.5: Excellent error handling
- 0.25: Basic error handling
- 0.0: No error handling or punts to Claude

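As a reference point when reviewing bundled scripts, the sketch below illustrates the pattern this criterion rewards: an explicit check, a clear error message, and a documented fallback rather than a silent failure or an unexplained constant. The config file and defaults are hypothetical, purely for illustration.

```python
import json
from pathlib import Path

DEFAULT_CONFIG = {"max_rows": 1000}  # documented default, not a "voodoo constant"

def load_config(path: str = "config.json") -> dict:
    """Load an optional config file, falling back to defaults with a clear message."""
    config_path = Path(path)
    if not config_path.exists():
        print(f"Config file not found at {config_path}; using defaults {DEFAULT_CONFIG}")
        return dict(DEFAULT_CONFIG)
    try:
        return json.loads(config_path.read_text())
    except json.JSONDecodeError as err:
        # Explicit error handling: say what went wrong and what happens next.
        print(f"Could not parse {config_path} ({err}); using defaults {DEFAULT_CONFIG}")
        return dict(DEFAULT_CONFIG)
```
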
### 10. Avoids Anti-Patterns (Weight: 1.0)

**What to avoid:**
- [ ] Time-sensitive information
- [ ] Inconsistent terminology
- [ ] Windows-style paths
- [ ] Offering too many options without guidance
- [ ] Deeply nested references
- [ ] Vague or generic content

**Scoring:**
- 1.0: No anti-patterns
- 0.75: 1-2 minor anti-patterns
- 0.5: Multiple anti-patterns
- 0.0: Severe anti-patterns

### 11. Testing and Validation (Weight: 0.5)

**What to check:**
- [ ] Evidence of testing mentioned
- [ ] Evaluation examples provided
- [ ] Clear success criteria
- [ ] Feedback loops for quality

**Scoring:**
- 0.5: Clear testing approach
- 0.25: Some testing mentioned
- 0.0: No testing mentioned

## Scoring System

**Total possible: 10.0 points**

Each criterion's rubric above is already expressed on its weighted scale (for example, Description Quality tops out at 2.0 and Name Convention at 0.5), so the overall quality score is the straight sum of the per-criterion scores:

```
quality_score = (
    description_score +             # 0.0-2.0
    name_score +                    # 0.0-0.5
    conciseness_score +             # 0.0-1.5
    progressive_disclosure_score +  # 0.0-1.0
    examples_score +                # 0.0-1.0
    freedom_score +                 # 0.0-0.5
    dependencies_score +            # 0.0-0.5
    structure_score +               # 0.0-1.0
    error_handling_score +          # 0.0-0.5
    anti_patterns_score +           # 0.0-1.0
    testing_score                   # 0.0-0.5
)
```

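For batch evaluation, the same total can be computed from a dictionary of per-criterion scores. A minimal sketch, assuming callers supply scores on the weighted scales above; the `MAX_POINTS` table and function name are illustrative:

```python
MAX_POINTS = {
    "description": 2.0, "name": 0.5, "conciseness": 1.5,
    "progressive_disclosure": 1.0, "examples": 1.0, "freedom": 0.5,
    "dependencies": 0.5, "structure": 1.0, "error_handling": 0.5,
    "anti_patterns": 1.0, "testing": 0.5,
}  # maxima sum to 10.0

def quality_score(scores: dict[str, float]) -> float:
    """Sum per-criterion scores, checking each stays within its weighted range."""
    total = 0.0
    for criterion, maximum in MAX_POINTS.items():
        value = scores.get(criterion, 0.0)
        if not 0.0 <= value <= maximum:
            raise ValueError(f"{criterion} must be between 0.0 and {maximum}, got {value}")
        total += value
    return round(total, 2)
```
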
## Quality Tiers

**Excellent (8.0-10.0):**
- Follows all best practices
- Clearly professional
- Ready for production use
- **Recommendation:** Strongly recommended

**Good (6.0-7.9):**
- Follows most best practices
- Minor improvements needed
- Usable but not perfect
- **Recommendation:** Recommended with minor notes

**Fair (4.0-5.9):**
- Follows some best practices
- Several improvements needed
- May work but needs review
- **Recommendation:** Consider with caution

**Poor (0.0-3.9):**
- Violates many best practices
- Significant issues
- High risk of problems
- **Recommendation:** Not recommended

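The tier boundaries can be applied mechanically when scoring many skills. A minimal sketch; the function name and return values simply mirror the tiers above:

```python
def quality_tier(score: float) -> tuple[str, str]:
    """Map a 0-10 quality score to its tier and recommendation."""
    if score >= 8.0:
        return "Excellent", "Strongly recommended"
    if score >= 6.0:
        return "Good", "Recommended with minor notes"
    if score >= 4.0:
        return "Fair", "Consider with caution"
    return "Poor", "Not recommended"
```

For example, `quality_tier(7.2)` returns `("Good", "Recommended with minor notes")`.
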
## Quick Evaluation Process

For rapid assessment during search:

1. **Read SKILL.md frontmatter** (30 sec)
   - Check description quality (most important)
   - Check name convention

2. **Scan SKILL.md body** (1-2 min)
   - Check length (<500 lines?)
   - Look for examples
   - Check for references to other files
   - Note any obvious anti-patterns

3. **Check file structure** (30 sec)
   - Look for reference files
   - Check for scripts/utilities
   - Verify organization

4. **Calculate quick score** (30 sec)
   - Focus on weighted criteria
   - Estimate tier (Excellent/Good/Fair/Poor)

**Total time per skill: ~3-4 minutes**

## Automation Tips

When evaluating multiple skills:

```bash
# Check SKILL.md length
wc -l SKILL.md

# Count reference files
find . -name "*.md" -not -name "SKILL.md" | wc -l

# Check for common anti-patterns
grep -i "claude can help\|I can help\|you can use" SKILL.md

# Check for Windows-style paths (backslashes); should print nothing
grep -E '\\' SKILL.md

# Check description line length (the limit is 1024 characters)
head -10 SKILL.md | grep "description:" | wc -c
```

## Reference

Based on official Anthropic documentation:
- [Agent Skills Overview](https://docs.anthropic.com/en/docs/agents-and-tools/agent-skills/overview)
- [Best Practices Guide](https://docs.anthropic.com/en/docs/agents-and-tools/agent-skills/best-practices)
- [Claude Code Skills](https://docs.anthropic.com/en/docs/claude-code/skills)

---

**Usage:** Apply this checklist when evaluating skills found through skill-finder, and use the resulting score and tier to give users quality ratings and recommendations.