# Quality Checklist - Detailed Criteria

Comprehensive quality criteria for skill review with examples and guidance.

## 1. Progressive Disclosure

**What to check**: Information is properly layered across metadata, instructions, and resources.

**Good example**:
````yaml
---
name: pdf-processor
description: Extract text and tables from PDF files, fill forms, merge documents. Use when working with PDF files.
---

# PDF Processor

## Quick Start

Use pdfplumber to extract text:

```python
import pdfplumber

with pdfplumber.open("doc.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```

For form filling, see [forms.md](references/forms.md).
For advanced table extraction, see [tables.md](references/tables.md).
````

**Bad example**:
```markdown
# PDF Processor

## Complete API Reference
[500 lines of pdfplumber API documentation inline...]

## All Possible Workflows
[50 different use cases with full code...]

## Configuration Options
[Every configuration parameter explained...]
```

**Why it matters**: Skills should load incrementally. Metadata is always loaded (tiny), SKILL.md loads when triggered (small), references load as needed (can be large).

**Review questions**:
- Is SKILL.md under 5k tokens?
- Are detailed references offloaded to separate files?
- Does SKILL.md link to references instead of duplicating content?

---

## 2. Mental Model Shift

**What to check**: Skill is described as the canonical way, not a "new" or "recommended" feature.

**Good example**:
```markdown
# Session Registry

Use the session registry for automatic session tracking. This eliminates manual socket management.

## Standard Workflow

1. Create a session: `create-session.sh -n my-session`
2. Use the session: `safe-send.sh -s my-session -c "command"`
```

**Bad example**:
```markdown
# Session Registry (NEW!)

The session registry is a new recommended feature that you can optionally use instead of manual socket management.

## Two Approaches

### Approach 1: Manual Socket Management (Traditional)
[old way...]
### Approach 2: Session Registry (Recommended!)
[new way...]
```

**Why it matters**: Mental model shift means the feature becomes "the way" things are done, not an alternative. Documentation should reflect this confidence.

**Red flags**:
- "New feature" or "recommended approach"
- Side-by-side comparisons of old vs new
- Hedging language ("you might want to", "consider using")
- "Traditional" or "legacy" alongside "new"

**Review questions**:
- Does the documentation present this as THE way to do the task?
- Is the old or alternative approach relegated to a "Manual Alternative" section?
- Does language convey confidence rather than optionality?

---

## 3. Degree of Freedom

**What to check**: Instructions match the declared autonomy level (high/medium/low).

**High Freedom** (principles and heuristics):
```markdown
## Analyzing Code Quality

Review code for:
- Readability and maintainability
- Performance implications
- Security concerns
- Test coverage

Consider the project's context and constraints when making recommendations.
```

**Medium Freedom** (preferred patterns with parameters):
````markdown
## Creating Tests

Use pytest with this structure:

```python
def test_feature_name():
    # Arrange: Setup test data
    # Act: Execute the feature
    # Assert: Verify results
```

Adjust assertion strictness based on feature criticality.
````

**Low Freedom** (specific steps):
```markdown
## Deploying to Production

Execute exactly in order:
1. Run: `make test` (must pass 100%)
2. Run: `make build`
3. Tag release: `git tag v1.x.x`
4. Push: `git push origin v1.x.x`
5. Run: `./deploy.sh production`
6. Monitor: `tail -f /var/log/app.log` for 5 minutes
```

**Why it matters**: Mismatch creates confusion. High freedom tasks shouldn't be over-specified. Low freedom tasks shouldn't be under-specified.

**Review questions**:
- Is the freedom level explicitly stated or clearly implied?
- Do instructions match that freedom level?
- Are fragile operations given low freedom with exact steps?
- Are creative/contextual tasks given high freedom?

---

## 4. SKILL.md Conciseness

**What to check**: SKILL.md is lean, actionable, and purpose-driven.

**Good example** (concise):
````markdown
# API Client

## Authentication

Set `API_KEY` environment variable before making requests.

## Making Requests

```python
import os

import requests

response = requests.get(
    "https://api.example.com/data",
    headers={"Authorization": f"Bearer {os.getenv('API_KEY')}"}
)
```

For all endpoints and parameters, see [api-reference.md](references/api-reference.md).
````

**Bad example** (verbose):
```markdown
# API Client

## Introduction

This skill helps you interact with the Example API. The API provides various endpoints for data access and manipulation. Founded in 2020, Example Corp offers...

## Why Use This Skill

Benefits of using this skill include...
- Consistent authentication
- Error handling
- Rate limiting
[more marketing copy...]

## Prerequisites

Before you begin, make sure you have:
1. An API key (see below for how to obtain)
2. Python 3.7+ installed
3. requests library (can be installed via pip)
4. A stable internet connection
...
```

**Why it matters**: Context window is expensive. Every word should earn its place.

**Conciseness checklist**:
- ❌ Marketing language or lengthy introductions
- ❌ Redundant explanations of obvious concepts
- ❌ Walls of text that could be examples
- ✅ Direct, actionable instructions
- ✅ Minimal but representative examples
- ✅ Links to references for depth

**Review questions**:
- Could any section be condensed by 50% without losing clarity?
- Are there marketing phrases or fluff?
- Do examples replace explanations where possible?
- Is depth offloaded to references/?

---

## 5. Safety & Failure Handling

**What to check**: Guardrails for dangerous actions, clear failure modes, recovery steps.

**Good example**:
````markdown
## Deploying Changes

**⚠️ WARNING**: This deploys to production. Ensure tests pass before proceeding.

```bash
# Verify tests first
make test || { echo "Tests failed - aborting"; exit 1; }

# Deploy
./deploy.sh production
```

**If deployment fails**:
1. Check logs: `tail -f /var/log/deploy.log`
2. Rollback: `./deploy.sh rollback`
3. Verify: `curl https://api.example.com/health`

**Rollback steps**:
```bash
git revert HEAD
./deploy.sh production
```
````

**Bad example**:
```markdown
## Deploying Changes

Run: `./deploy.sh production`
```

**Why it matters**: Skills often perform critical or destructive operations. Users need to know what can go wrong and how to recover.

**Safety elements**:
- **Warnings** for destructive operations
- **Validation** steps before critical actions
- **Failure modes** documented
- **Recovery procedures** provided
- **Assumptions** stated explicitly

**Review questions**:
- Are dangerous operations flagged with warnings?
- Are there validation steps before destructive actions?
- Are failure scenarios documented?
- Are rollback/recovery steps provided?

---

## 6. Resource Hygiene

**What to check**: References are current, minimal, discoverable, and properly linked.

**Good example**:
```
skill-name/
├── SKILL.md
└── references/
    ├── api-reference.md (current, focused)
    ├── examples.md (representative cases)
    └── troubleshooting.md (common issues)
```

SKILL.md properly links:
```markdown
See [API Reference](references/api-reference.md) for all endpoints.
For common issues, check [Troubleshooting](references/troubleshooting.md).
```

**Bad example**:
```
skill-name/
├── SKILL.md
└── references/
    ├── docs.md (duplicates SKILL.md)
    ├── api-v1.md (outdated)
    ├── api-v2.md (current but not clear)
    ├── examples-old.md (deprecated)
    ├── examples-new.md (current)
    ├── random-notes.md (unclear purpose)
    └── README.md (redundant)
```

**Resource hygiene checklist**:
- ✅ Each file has clear, unique purpose
- ✅ File names indicate content
- ✅ No duplicate information
- ✅ Links from SKILL.md resolve
- ✅ No outdated or deprecated content
- ✅ Secret handling documented if applicable

**Review questions**:
- Is each reference file's purpose clear from its name?
- Are all links from SKILL.md valid?
- Is there duplicate content between files?
- Are outdated resources removed?

---

## 7. Consistency & Clarity

**What to check**: Terminology consistent, flow logical, formatting readable.

**Good example**:
````markdown
# Database Migration Tool

## Running Migrations

Apply all pending migrations:
```bash
./migrate.sh apply
```

Rollback the last migration:
```bash
./migrate.sh rollback
```

## Migration Files

Create new migration:
```bash
./migrate.sh create add_users_table
```

This creates `migrations/001_add_users_table.sql`.
````

**Bad example**:
````markdown
# Database Migration Tool

## Executing Migrations

Run migrations using the migration runner:
```bash
./run-migrations.sh
```

## Reverting Changes

Undo schema modifications:
```bash
./rollback-db.sh
```

## Creating Migration Scripts

Generate new migration file:
```bash
./new-migration.sh
```
````

**Consistency issues** in bad example:
- Command names inconsistent (`./run-migrations.sh`, `./rollback-db.sh`, `./new-migration.sh` instead of one `./migrate.sh` with subcommands)
- Terminology varies ("migrations" vs "schema modifications")
- Section headings use different patterns

**Clarity checklist**:
- ✅ Consistent terminology throughout
- ✅ Logical section ordering
- ✅ Clear, unambiguous instructions
- ✅ Readable formatting and spacing
- ✅ No conflicting guidance

**Review questions**:
- Is the same concept called by the same name throughout?
- Do sections flow in logical order?
- Are commands/tools referenced consistently?
- Is formatting consistent?

---

## 8. Testing & Verification

**What to check**: Quick checks, expected outputs, or smoke tests included.

**Good example**:
````markdown
## Verification

Test the installation:
```bash
./health-check.sh
```

**Expected output**:
```
✓ API connection successful
✓ Database accessible
✓ Cache configured
All systems operational
```

**Quick smoke test**:
```bash
# Should return status 200
curl -I https://api.example.com/health
```
````

**Bad example**:
````markdown
## Usage

Run the tool:
```bash
./tool.sh
```
````

**Why it matters**: Users need to verify the skill works correctly and understand what success looks like.

**Testing elements**:
- **Smoke tests**: Quick checks that basic functionality works
- **Expected outputs**: What success looks like
- **Verification steps**: How to confirm it's working
- **Example runs**: Representative use cases with results

**Review questions**:
- Are there quick verification steps?
- Is expected output shown?
- Can users confirm the skill works?
- Are examples testable/reproducible?

---

## 9. Ownership & Maintenance (Optional)

**What to check**: Known limitations documented. Version/maintainer metadata optional but recommended for team/public skills.

**Note**: Marketplace-level changelogs (`changelogs/skill-name.md`) are required per marketplace standards and provide versioning at the marketplace level. The Version section within SKILL.md itself is optional.

**When to include version metadata in SKILL.md**:
- Public marketplace skills (helps users track updates)
- Team-shared skills (clarifies who maintains it)
- Skills with frequent breaking changes (version tracking important)

**When to skip version metadata in SKILL.md**:
- Personal skills for individual use
- Experimental/prototype skills
- Skills where marketplace changelogs provide sufficient tracking

**Example with optional version metadata**:
```markdown
# API Integration Skill

**Version**: 1.2.0
**Maintainer**: DevTools Team (devtools@example.com)

## Known Limitations

- Rate limited to 100 requests/minute
- Large file uploads (>10MB) not supported
- Requires Python 3.8+
```

**Minimal example (recommended for most skills)**:
```markdown
# Simple Helper Skill

## Known Limitations

- Works only on Linux/macOS
- Requires bash 4.0+
```

**Why it matters**: Known limitations help users understand constraints. Version metadata in SKILL.md is helpful for team coordination but optional.

**Core elements**:
- **Known Limitations** (recommended): Document constraints and requirements
- **Version** (optional): Current version number
- **Maintainer** (optional): Contact info for questions
- **Changelog** (optional in SKILL.md): Can reference marketplace changelog or references/

**Review questions**:
- Are known limitations or requirements documented?
- If version metadata is present, is it complete and current?
- If this is a team skill, is maintainer contact info provided?

---

## 10. Tight Scope & Minimalism

**What to check**: Focused purpose, no feature creep, no overlapping functionality.

**Good example** (focused):
```markdown
# PDF Text Extractor

Extract text content from PDF files using pdfplumber.

## Supported Operations

- Extract text from single page
- Extract text from all pages
- Extract text with layout preservation

**Not covered** (use pdf-form-filler skill):
- Form filling
- PDF editing
```

**Bad example** (scope creep):
```markdown
# PDF Swiss Army Knife

Complete PDF toolkit for all your document needs!

## Features

- Text extraction
- Image extraction
- Form filling
- PDF editing
- PDF merging
- PDF splitting
- Watermarking
- OCR processing
- Compression
- Encryption
- Digital signatures
- Conversion to Word/Excel
- Email integration
- Cloud storage sync
```

**Why it matters**: Focused skills are easier to maintain, understand, and use. Feature creep dilutes the skill's purpose and increases complexity.

**Scope checklist**:
- ✅ Solves one focused job well
- ✅ Clear boundaries (what's in, what's out)
- ✅ No overlapping functionality with other skills
- ✅ No unrelated features
- ✅ Complexity matches the actual need

**Review questions**:
- Does the skill do one thing well?
- Are there unrelated features that should be separate skills?
- Does functionality overlap with existing skills?
- Is complexity justified by the use case?

---

## Using This Checklist

### Quick Review (5-10 minutes)

Scan for obvious issues:
1. Check SKILL.md length (should be under 5k tokens)
2. Verify progressive disclosure (links to references/)
3. Look for mental model language ("new feature", "recommended")
4. Check for safety warnings on destructive operations
5. Verify examples are present and minimal

### Thorough Review (30-60 minutes)

Apply all 10 criteria systematically:
1. Read SKILL.md completely
2. Check frontmatter quality
3. Verify each criterion with examples
4. Review all reference files
5. Test examples if possible
6. Document findings in review report

### Common Review Patterns

**New Skill**:
- Focus on criteria 1, 2, 4, 10 (structure and scope)
- Verify progressive disclosure from the start
- Ensure mental model language is correct

**Updated Skill**:
- Focus on criteria 3, 7, 9 (consistency with changes)
- Check that updates didn't break existing patterns
- Verify changelog is updated

**Audit**:
- Apply all 10 criteria
- Compare against other skills for consistency
- Look for improvement opportunities
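
The first three Quick Review steps (length, links to references/, mental model language) are mechanical enough to script. The sketch below is one possible way to do that, not an existing tool: the `quick_review` function name, the 4-characters-per-token estimate, and the exact phrase list (drawn from the red flags in criterion 2) are all assumptions made for illustration. It also scans links inside code fences, so treat its findings as prompts for a human reviewer rather than verdicts.

```python
import re
import tempfile
from pathlib import Path

# Assumption: roughly 4 characters per token. Real tokenizers differ.
CHARS_PER_TOKEN = 4
TOKEN_BUDGET = 5_000

# Illustrative phrases based on the "Red flags" list in criterion 2.
RED_FLAGS = (
    "new feature",
    "recommended approach",
    "you might want to",
    "consider using",
    "legacy",
)

# Captures the target of a markdown link, ignoring any #anchor suffix.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+)(?:#[^)\s]*)?\)")


def quick_review(skill_dir):
    """Return a list of findings for <skill_dir>/SKILL.md (hypothetical helper)."""
    skill_dir = Path(skill_dir)
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    findings = []

    # Step 1: estimated length against the 5k-token guideline.
    estimated_tokens = len(text) // CHARS_PER_TOKEN
    if estimated_tokens >= TOKEN_BUDGET:
        findings.append(
            f"SKILL.md is about {estimated_tokens} tokens (budget {TOKEN_BUDGET})"
        )

    # Step 2: relative links should exist and should point into references/.
    links = [target for target in LINK_RE.findall(text) if ":" not in target]
    if not any(target.startswith("references/") for target in links):
        findings.append("no links to references/")
    for target in links:
        if not (skill_dir / target).exists():
            findings.append(f"broken link: {target}")

    # Step 3: mental model language that frames the skill as optional.
    lowered = text.lower()
    for phrase in RED_FLAGS:
        if phrase in lowered:
            findings.append(f"red-flag phrase: {phrase!r}")

    return findings


if __name__ == "__main__":
    # Build a throwaway skill with one missing reference and one red flag.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "references").mkdir()
        (root / "references" / "forms.md").write_text("# Forms\n", encoding="utf-8")
        (root / "SKILL.md").write_text(
            "# PDF Processor (NEW!)\n"
            "This new feature fills forms, see [forms.md](references/forms.md) "
            "and [tables.md](references/tables.md).\n",
            encoding="utf-8",
        )
        for finding in quick_review(root):
            print(finding)
```

Steps 4 and 5 of the Quick Review (safety warnings, example quality) still need a human read, since they depend on what the skill actually does.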