Anthropic Best Practices Checklist

Evaluation criteria for assessing Claude Skill quality based on official Anthropic guidelines.

Purpose

Use this checklist to evaluate skills found on GitHub. Each criterion contributes to the overall quality score (0-10).

Evaluation Criteria

1. Description Quality (Weight: 2.0)

What to check:

  • Description is specific, not vague
  • Includes what the skill does
  • Includes when to use it (trigger conditions)
  • Contains key terms users would mention
  • Written in third person
  • Under 1024 characters
  • No XML tags

Scoring:

  • 2.0: All criteria met, very clear and specific
  • 1.5: Most criteria met, good clarity
  • 1.0: Basic description, somewhat vague
  • 0.5: Very vague or generic
  • 0.0: Missing or completely unclear

Examples:

Good (2.0):

description: Analyze Excel spreadsheets, create pivot tables, generate charts. Use when working with Excel files, spreadsheets, tabular data, or .xlsx files.

Bad (0.5):

description: Helps with documents
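The mechanically verifiable parts of the description criteria (presence, length, XML tags, trigger wording) can be linted automatically; specificity and third-person voice still need human judgment. A minimal sketch (the function name and messages are hypothetical, not part of any official tooling):

```python
import re

def check_description(desc: str) -> list[str]:
    """Return a list of checklist violations for a skill description.

    Covers only the mechanical rules: non-empty, under 1024 characters,
    no XML tags, and some trigger wording ("... when ...").
    """
    issues = []
    if not desc.strip():
        issues.append("missing description")
    if len(desc) > 1024:
        issues.append("over 1024 characters")
    if re.search(r"<[^>]+>", desc):
        issues.append("contains XML tags")
    if not re.search(r"\bwhen\b", desc, re.IGNORECASE):
        issues.append("no trigger conditions ('use when ...')")
    return issues

good = ("Analyze Excel spreadsheets, create pivot tables, generate charts. "
        "Use when working with Excel files, spreadsheets, tabular data, or .xlsx files.")
print(check_description(good))                    # []
print(check_description("Helps with documents"))  # flags missing trigger conditions
```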

2. Name Convention (Weight: 0.5)

What to check:

  • Uses lowercase letters, numbers, hyphens only
  • Under 64 characters
  • Follows naming pattern (gerund form preferred)
  • Descriptive, not vague
  • No reserved words ("anthropic", "claude")

Scoring:

  • 0.5: Follows all conventions
  • 0.25: Minor issues (e.g., not gerund but still clear)
  • 0.0: Violates conventions or very vague

Good: processing-pdfs, analyzing-spreadsheets
Bad: helper, utils, claude-tool
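Everything here except the gerund preference is mechanical, so a single regex check covers it. A sketch (the function name is hypothetical):

```python
import re

RESERVED = ("anthropic", "claude")

def valid_skill_name(name: str) -> bool:
    """Check the mechanical naming rules: lowercase letters, digits, and
    hyphens only (no leading/trailing/double hyphens), under 64 characters,
    no reserved words. Gerund form is stylistic and not checked here."""
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        return False
    if len(name) >= 64:
        return False
    return not any(word in name for word in RESERVED)

print(valid_skill_name("processing-pdfs"))  # True
print(valid_skill_name("claude-tool"))      # False (reserved word)
print(valid_skill_name("Helper"))           # False (uppercase)
```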

3. Conciseness (Weight: 1.5)

What to check:

  • SKILL.md body under 500 lines
  • No unnecessary explanations
  • Assumes Claude's intelligence
  • Gets to the point quickly
  • Additional content in separate files if needed

Scoring:

  • 1.5: Very concise, well-edited, <300 lines
  • 1.0: Reasonable length, <500 lines
  • 0.5: Long but not excessive, 500-800 lines
  • 0.0: Very verbose, >800 lines
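The line-count bands above translate directly into a lookup; note that the top band also requires the file to actually be well-edited, which a line count alone cannot confirm. A minimal sketch:

```python
def conciseness_score(line_count: int) -> float:
    """Map SKILL.md body length to the conciseness score bands above.
    A sub-300-line file earns 1.5 only if it is also well-edited."""
    if line_count < 300:
        return 1.5
    if line_count < 500:
        return 1.0
    if line_count <= 800:
        return 0.5
    return 0.0

print(conciseness_score(250))  # 1.5
print(conciseness_score(600))  # 0.5
```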

4. Progressive Disclosure (Weight: 1.0)

What to check:

  • SKILL.md serves as overview/table of contents
  • Additional details in separate files
  • Clear references to other files
  • Files organized by domain/feature
  • No deeply nested references (max 1 level deep)

Scoring:

  • 1.0: Excellent use of progressive disclosure
  • 0.75: Good organization with some references
  • 0.5: Some separation, could be better
  • 0.25: All content in SKILL.md, no references
  • 0.0: Poorly organized or deeply nested

5. Examples and Workflows (Weight: 1.0)

What to check:

  • Has concrete examples (not abstract)
  • Includes code snippets
  • Shows input/output pairs
  • Has clear workflows for complex tasks
  • Examples use real patterns, not placeholders

Scoring:

  • 1.0: Excellent examples and clear workflows
  • 0.75: Good examples, some workflows
  • 0.5: Basic examples, no workflows
  • 0.25: Few or abstract examples
  • 0.0: No examples

6. Appropriate Degree of Freedom (Weight: 0.5)

What to check:

  • Instructions match task fragility
  • High freedom for flexible tasks (text instructions)
  • Low freedom for fragile tasks (specific scripts)
  • Clear when to use exact commands vs adapt

Scoring:

  • 0.5: Perfect match of freedom to task type
  • 0.25: Reasonable but could be better
  • 0.0: Inappropriate level (too rigid or too loose)

7. Dependencies Documentation (Weight: 0.5)

What to check:

  • Required packages listed
  • Installation instructions provided
  • Dependencies verified as available
  • No assumption of pre-installed packages

Scoring:

  • 0.5: All dependencies documented and verified
  • 0.25: Dependencies mentioned but not fully documented
  • 0.0: Dependencies assumed or not mentioned

8. Structure and Organization (Weight: 1.0)

What to check:

  • Clear section headings
  • Logical flow of information
  • Table of contents for long files
  • Consistent formatting
  • Unix-style paths (forward slashes)

Scoring:

  • 1.0: Excellently organized
  • 0.75: Well organized with minor issues
  • 0.5: Basic organization
  • 0.25: Poor organization
  • 0.0: No clear structure

9. Error Handling (Weight: 0.5)

What to check (for skills with scripts):

  • Scripts handle errors explicitly
  • Clear error messages
  • Fallback strategies provided
  • Validation loops for critical operations
  • No "voodoo constants"

Scoring:

  • 0.5: Excellent error handling
  • 0.25: Basic error handling
  • 0.0: No error handling or punts to Claude

10. Avoids Anti-Patterns (Weight: 1.0)

What to avoid:

  • Time-sensitive information
  • Inconsistent terminology
  • Windows-style paths
  • Offering too many options without guidance
  • Deeply nested references
  • Vague or generic content

Scoring:

  • 1.0: No anti-patterns
  • 0.75: 1-2 minor anti-patterns
  • 0.5: Multiple anti-patterns
  • 0.0: Severe anti-patterns

11. Testing and Validation (Weight: 0.5)

What to check:

  • Evidence of testing mentioned
  • Evaluation examples provided
  • Clear success criteria
  • Feedback loops for quality

Scoring:

  • 0.5: Clear testing approach
  • 0.25: Some testing mentioned
  • 0.0: No testing mentioned

Scoring System

Total possible: 10.0 points

Each criterion's banded score already incorporates its weight (a full-marks band equals the criterion's weight), so the overall score is the straight sum of the eleven banded scores:

quality_score = (
  description_score +             # 0.0-2.0
  name_score +                    # 0.0-0.5
  conciseness_score +             # 0.0-1.5
  progressive_disclosure_score +  # 0.0-1.0
  examples_score +                # 0.0-1.0
  freedom_score +                 # 0.0-0.5
  dependencies_score +            # 0.0-0.5
  structure_score +               # 0.0-1.0
  error_handling_score +          # 0.0-0.5
  anti_patterns_score +           # 0.0-1.0
  testing_score                   # 0.0-0.5
)
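Because a full-marks band equals its criterion's weight, the eleven maximums sum to exactly 10.0, which makes a range check cheap. A minimal sketch of the calculation (the example scores are hypothetical):

```python
# Criterion weights from the checklist; a criterion's top band equals its weight.
WEIGHTS = {
    "description": 2.0, "name": 0.5, "conciseness": 1.5,
    "progressive_disclosure": 1.0, "examples": 1.0, "freedom": 0.5,
    "dependencies": 0.5, "structure": 1.0, "error_handling": 0.5,
    "anti_patterns": 1.0, "testing": 0.5,
}

def quality_score(scores: dict[str, float]) -> float:
    """Sum the banded per-criterion scores, rejecting any score outside
    its 0..weight range."""
    for name, value in scores.items():
        if not 0.0 <= value <= WEIGHTS[name]:
            raise ValueError(f"{name} score {value} outside 0..{WEIGHTS[name]}")
    return sum(scores.values())

# Hypothetical evaluation of a reasonably well-documented skill:
example = {
    "description": 2.0, "name": 0.5, "conciseness": 1.0,
    "progressive_disclosure": 0.75, "examples": 0.75, "freedom": 0.5,
    "dependencies": 0.25, "structure": 1.0, "error_handling": 0.25,
    "anti_patterns": 0.75, "testing": 0.25,
}
print(quality_score(example))  # 8.0
```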

Quality Tiers

Excellent (8.0-10.0):

  • Follows all best practices
  • Clearly professional
  • Ready for production use
  • Recommendation: Strongly recommended

Good (6.0-7.9):

  • Follows most best practices
  • Minor improvements needed
  • Usable but not perfect
  • Recommendation: Recommended with minor notes

Fair (4.0-5.9):

  • Follows some best practices
  • Several improvements needed
  • May work but needs review
  • Recommendation: Consider with caution

Poor (0.0-3.9):

  • Violates many best practices
  • Significant issues
  • High risk of problems
  • Recommendation: Not recommended
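The tier boundaries above map to a simple threshold function (a sketch; the labels follow the tiers as listed):

```python
def quality_tier(score: float) -> str:
    """Map a 0-10 quality score to its tier label."""
    if score >= 8.0:
        return "Excellent"
    if score >= 6.0:
        return "Good"
    if score >= 4.0:
        return "Fair"
    return "Poor"

print(quality_tier(8.0))  # Excellent
print(quality_tier(5.2))  # Fair
```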

Quick Evaluation Process

For rapid assessment during search:

  1. Read SKILL.md frontmatter (30 sec)

    • Check description quality (most important)
    • Check name convention
  2. Scan SKILL.md body (1-2 min)

    • Check length (<500 lines?)
    • Look for examples
    • Check for references to other files
    • Note any obvious anti-patterns
  3. Check file structure (30 sec)

    • Look for reference files
    • Check for scripts/utilities
    • Verify organization
  4. Calculate quick score (30 sec)

    • Focus on weighted criteria
    • Estimate tier (Excellent/Good/Fair/Poor)

Total time per skill: ~3-4 minutes

Automation Tips

When evaluating multiple skills:

# Check SKILL.md length
wc -l SKILL.md

# Count reference files
find . -name "*.md" -not -name "SKILL.md" | wc -l

# Check for common anti-patterns
grep -i "claude can help\|I can help\|you can use" SKILL.md

# Flag Windows-style backslash paths (the checklist requires forward slashes)
grep -n '\\' SKILL.md

# Check description length (counts the whole frontmatter line; the limit is 1024)
grep -m1 "^description:" SKILL.md | wc -c

Reference

Based on the official Anthropic skill-authoring documentation.


Usage: Use this checklist when evaluating skills found through skill-finder to provide quality scores and recommendations to users.