# Quality Checklist - Detailed Criteria

Comprehensive quality criteria for skill review with examples and guidance.

## 1. Progressive Disclosure

**What to check**: Information is properly layered across metadata, instructions, and resources.

**Good example**:
````markdown
---
name: pdf-processor
description: Extract text and tables from PDF files, fill forms, merge documents. Use when working with PDF files.
---

# PDF Processor

## Quick Start
Use pdfplumber to extract text:
```python
import pdfplumber
with pdfplumber.open("doc.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```

For form filling, see [forms.md](references/forms.md).
For advanced table extraction, see [tables.md](references/tables.md).
````

**Bad example**:
```markdown
# PDF Processor

## Complete API Reference
[500 lines of pdfplumber API documentation inline...]

## All Possible Workflows
[50 different use cases with full code...]

## Configuration Options
[Every configuration parameter explained...]
```

**Why it matters**: Skills should load incrementally: metadata is always loaded (tiny), SKILL.md loads when the skill is triggered (small), and references load only as needed (can be large).

**Review questions**:
- Is SKILL.md under 5k tokens?
- Are detailed references offloaded to separate files?
- Does SKILL.md link to references instead of duplicating content?
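
The token budget can be approximated without a tokenizer. The sketch below assumes roughly 4 characters per token, which is a rule of thumb for English prose and not an exact count; `estimate_tokens`, `within_budget`, and both constants are illustrative names, not part of any skill tooling.

```python
# Rough size check for SKILL.md content.
# ASSUMPTION: about 4 characters per token (a heuristic, not an exact count).
CHARS_PER_TOKEN = 4
TOKEN_BUDGET = 5000  # the "under 5k tokens" guideline


def estimate_tokens(text: str) -> int:
    """Return an approximate token count for the given text."""
    return len(text) // CHARS_PER_TOKEN


def within_budget(text: str, budget: int = TOKEN_BUDGET) -> bool:
    """True when the estimated token count is under the budget."""
    return estimate_tokens(text) < budget


# Example with inline text; in practice, pass the contents of SKILL.md.
sample = "Use pdfplumber to extract text. " * 100
print(estimate_tokens(sample), within_budget(sample))
```

Treat the result as a prompt for closer inspection: a file near the budget may still be fine if most of it is compact examples.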

---

## 2. Mental Model Shift

**What to check**: Skill is described as the canonical way, not a "new" or "recommended" feature.

**Good example**:
```markdown
# Session Registry

Use the session registry for automatic session tracking. This eliminates manual socket management.

## Standard Workflow
1. Create a session: `create-session.sh -n my-session`
2. Use the session: `safe-send.sh -s my-session -c "command"`
```

**Bad example**:
```markdown
# Session Registry (NEW!)

The session registry is a new recommended feature that you can optionally use instead of manual socket management.

## Two Approaches
### Approach 1: Manual Socket Management (Traditional)
[old way...]

### Approach 2: Session Registry (Recommended!)
[new way...]
```

**Why it matters**: Mental model shift means the feature becomes "the way" things are done, not an alternative. Documentation should reflect this confidence.

**Red flags**:
- "New feature" or "recommended approach"
- Side-by-side comparisons of old vs new
- Hedging language ("you might want to", "consider using")
- "Traditional" or "legacy" alongside "new"

**Review questions**:
- Does the documentation present this as THE way to do the task?
- Is the old or alternative approach relegated to a "Manual Alternative" section?
- Does language convey confidence rather than optionality?
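
A first pass over the red flags listed above can be automated with a phrase scan. This is a minimal sketch: the pattern list is derived from the red-flag bullets and is an assumption about what is worth flagging, and `find_red_flags` is a hypothetical helper name.

```python
import re

# Patterns derived from the red-flag list; extend or trim as needed.
RED_FLAG_PATTERNS = [
    r"\bnew feature\b",
    r"\brecommended approach\b",
    r"\byou might want to\b",
    r"\bconsider using\b",
    r"\btraditional\b",
    r"\blegacy\b",
    r"\(new!?\)",
]


def find_red_flags(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_phrase) pairs, with 1-based line numbers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern in RED_FLAG_PATTERNS:
            match = re.search(pattern, line, flags=re.IGNORECASE)
            if match:
                hits.append((number, match.group(0)))
    return hits


sample = "# Session Registry (NEW!)\nThe registry is a new feature you might want to try."
for number, phrase in find_red_flags(sample):
    print(f"line {number}: {phrase}")
```

Matches are candidates for review, not verdicts. A word such as "legacy" can be legitimate inside a "Manual Alternative" section.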

---

## 3. Degree of Freedom

**What to check**: Instructions match the declared autonomy level (high/medium/low).

**High Freedom** (principles and heuristics):
```markdown
## Analyzing Code Quality

Review code for:
- Readability and maintainability
- Performance implications
- Security concerns
- Test coverage

Consider the project's context and constraints when making recommendations.
```

**Medium Freedom** (preferred patterns with parameters):
````markdown
## Creating Tests

Use pytest with this structure:
```python
def test_feature_name():
    # Arrange: set up test data
    # Act: execute the feature
    # Assert: verify results
    ...  # replace with the real test body
```

Adjust assertion strictness based on feature criticality.
````

**Low Freedom** (specific steps):
```markdown
## Deploying to Production

Execute exactly in order:
1. Run: `make test` (must pass 100%)
2. Run: `make build`
3. Tag release: `git tag v1.x.x`
4. Push: `git push origin v1.x.x`
5. Run: `./deploy.sh production`
6. Monitor: `tail -f /var/log/app.log` for 5 minutes
```

**Why it matters**: A mismatch creates confusion. High-freedom tasks shouldn't be over-specified, and low-freedom tasks shouldn't be under-specified.

**Review questions**:
- Is the freedom level explicitly stated or clearly implied?
- Do instructions match that freedom level?
- Are fragile operations given low freedom with exact steps?
- Are creative/contextual tasks given high freedom?

---

## 4. SKILL.md Conciseness

**What to check**: SKILL.md is lean, actionable, and purpose-driven.

**Good example** (concise):
````markdown
# API Client

## Authentication
Set `API_KEY` environment variable before making requests.

## Making Requests
```python
import os
import requests
response = requests.get(
    "https://api.example.com/data",
    headers={"Authorization": f"Bearer {os.getenv('API_KEY')}"}
)
```

For all endpoints and parameters, see [api-reference.md](references/api-reference.md).
````

**Bad example** (verbose):
```markdown
# API Client

## Introduction
This skill helps you interact with the Example API. The API provides various endpoints for data access and manipulation. Founded in 2020, Example Corp offers...

## Why Use This Skill
Benefits of using this skill include...
- Consistent authentication
- Error handling
- Rate limiting
[more marketing copy...]

## Prerequisites
Before you begin, make sure you have:
1. An API key (see below for how to obtain)
2. Python 3.7+ installed
3. requests library (can be installed via pip)
4. A stable internet connection
...
```

**Why it matters**: Context window is expensive. Every word should earn its place.

**Conciseness checklist**:
- ❌ Marketing language or lengthy introductions
- ❌ Redundant explanations of obvious concepts
- ❌ Walls of text that could be examples
- ✅ Direct, actionable instructions
- ✅ Minimal but representative examples
- ✅ Links to references for depth

**Review questions**:
- Could any section be condensed by 50% without losing clarity?
- Are there marketing phrases or fluff?
- Do examples replace explanations where possible?
- Is depth offloaded to references/?
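
To find sections worth condensing, it can help to see where the words are. The sketch below counts words under each heading; `words_per_section` is a hypothetical helper, and its fence handling is deliberately naive (it toggles on every line that starts with three backticks, so nested fences are not tracked precisely).

```python
def words_per_section(markdown: str) -> dict[str, int]:
    """Map each heading to the number of words that follow it."""
    counts: dict[str, int] = {}
    current = "(preamble)"
    in_fence = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            in_fence = not in_fence  # naive toggle on every fence line
            continue
        if not in_fence and stripped.startswith("#"):
            current = stripped.lstrip("#").strip()
            counts.setdefault(current, 0)
            continue
        words = len(stripped.split())
        if words:
            counts[current] = counts.get(current, 0) + words
    return counts


doc = (
    "# API Client\n\n"
    "## Introduction\n"
    "This skill helps you interact with the Example API.\n\n"
    "## Authentication\n"
    "Set `API_KEY` first.\n"
)
for heading, count in words_per_section(doc).items():
    print(f"{count:5d}  {heading}")
```

Sections that dominate the count are the first places to look for marketing copy or content that belongs in references/.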

---

## 5. Safety & Failure Handling

**What to check**: Guardrails for dangerous actions, clear failure modes, recovery steps.

**Good example**:
````markdown
## Deploying Changes

**⚠️ WARNING**: This deploys to production. Ensure tests pass before proceeding.

```bash
# Verify tests first
make test || { echo "Tests failed - aborting"; exit 1; }

# Deploy
./deploy.sh production
```

**If deployment fails**:
1. Check logs: `tail -f /var/log/deploy.log`
2. Rollback: `./deploy.sh rollback`
3. Verify: `curl https://api.example.com/health`

**Rollback steps**:
```bash
git revert HEAD
./deploy.sh production
```
````

**Bad example**:
```markdown
## Deploying Changes

Run: `./deploy.sh production`
```

**Why it matters**: Skills often perform critical or destructive operations. Users need to know what can go wrong and how to recover.

**Safety elements**:
- **Warnings** for destructive operations
- **Validation** steps before critical actions
- **Failure modes** documented
- **Recovery procedures** provided
- **Assumptions** stated explicitly

**Review questions**:
- Are dangerous operations flagged with warnings?
- Are there validation steps before destructive actions?
- Are failure scenarios documented?
- Are rollback/recovery steps provided?
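
The validate, act, recover sequence from the good example can also be expressed as a small wrapper. This is a sketch under assumptions: the commands (`make test`, `./deploy.sh production`, `./deploy.sh rollback`) are taken from the example above, while `guarded_deploy`, `run_command`, and the returned status strings are hypothetical. The runner is injectable so the flow can be dry-run without touching production.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(command: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(command).returncode


def guarded_deploy(run: Runner = run_command) -> str:
    """Validate, deploy, and roll back on failure. Returns the outcome."""
    # Validation step: never deploy when the tests fail.
    if run(["make", "test"]) != 0:
        return "aborted: tests failed"
    # Critical action.
    if run(["./deploy.sh", "production"]) == 0:
        return "deployed"
    # Recovery procedure: roll back and report whether that worked.
    if run(["./deploy.sh", "rollback"]) == 0:
        return "rolled back"
    return "rollback failed: manual intervention required"


# Dry run with a stand-in runner that simulates failing tests.
print(guarded_deploy(lambda command: 1))  # prints "aborted: tests failed"
```

Each branch corresponds to one of the safety elements: a validation gate, a documented failure mode, and a recovery step with its own failure case.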

---

## 6. Resource Hygiene

**What to check**: References are current, minimal, discoverable, and properly linked.

**Good example**:
```
skill-name/
├── SKILL.md
└── references/
    ├── api-reference.md (current, focused)
    ├── examples.md (representative cases)
    └── troubleshooting.md (common issues)
```

SKILL.md properly links:
```markdown
See [API Reference](references/api-reference.md) for all endpoints.
For common issues, check [Troubleshooting](references/troubleshooting.md).
```

**Bad example**:
```
skill-name/
├── SKILL.md
└── references/
    ├── docs.md (duplicates SKILL.md)
    ├── api-v1.md (outdated)
    ├── api-v2.md (current but not clear)
    ├── examples-old.md (deprecated)
    ├── examples-new.md (current)
    ├── random-notes.md (unclear purpose)
    └── README.md (redundant)
```

**Resource hygiene checklist**:
- ✅ Each file has clear, unique purpose
- ✅ File names indicate content
- ✅ No duplicate information
- ✅ Links from SKILL.md resolve
- ✅ No outdated or deprecated content
- ✅ Secret handling documented if applicable

**Review questions**:
- Is each reference file's purpose clear from its name?
- Are all links from SKILL.md valid?
- Is there duplicate content between files?
- Are outdated resources removed?
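
Checking that links resolve is easy to script. The sketch below extracts inline Markdown links and reports relative targets that do not exist under a base directory; `broken_links` and `LINK_PATTERN` are hypothetical names, and the regular expression only covers the simple `[text](target)` form, not reference-style links.

```python
import re
from pathlib import Path

# Matches inline links such as [API Reference](references/api-reference.md).
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_links(markdown: str, base_dir: Path) -> list[str]:
    """Return relative link targets that do not exist under base_dir."""
    missing = []
    for target in LINK_PATTERN.findall(markdown):
        # Skip external URLs (scheme:...) and in-page anchors.
        if re.match(r"^[a-zA-Z][a-zA-Z0-9+.-]*:", target) or target.startswith("#"):
            continue
        path = target.split("#", 1)[0]  # drop any #fragment
        if not (base_dir / path).exists():
            missing.append(target)
    return missing


text = "See [API Reference](references/api-reference.md) and [docs](https://example.com)."
print(broken_links(text, Path("no-such-skill-dir")))
```

For a real skill, pass the contents of SKILL.md and the skill's root directory, for example `broken_links(Path("skill-name/SKILL.md").read_text(), Path("skill-name"))`.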

---

## 7. Consistency & Clarity

**What to check**: Terminology consistent, flow logical, formatting readable.

**Good example**:
````markdown
# Database Migration Tool

## Running Migrations

Apply all pending migrations:
```bash
./migrate.sh apply
```

Rollback the last migration:
```bash
./migrate.sh rollback
```

## Migration Files

Create new migration:
```bash
./migrate.sh create add_users_table
```

This creates `migrations/001_add_users_table.sql`.
````

**Bad example**:
````markdown
# Database Migration Tool

## Executing Migrations

Run migrations using the migration runner:
```bash
./run-migrations.sh
```

## Reverting Changes

Undo schema modifications:
```bash
./rollback-db.sh
```

## Creating Migration Scripts

Generate new migration file:
```bash
./new-migration.sh
```
````

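**Consistency issues** in the bad example:
- Command names are inconsistent: three unrelated scripts (`./run-migrations.sh`, `./rollback-db.sh`, `./new-migration.sh`) instead of subcommands of a single `./migrate.sh`
- Terminology varies ("migrations" vs "schema modifications", "migration scripts" vs "migration file")
- Section headings use different patterns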

**Clarity checklist**:
- ✅ Consistent terminology throughout
- ✅ Logical section ordering
- ✅ Clear, unambiguous instructions
- ✅ Readable formatting and spacing
- ✅ No conflicting guidance

**Review questions**:
- Is the same concept called by the same name throughout?
- Do sections flow in logical order?
- Are commands/tools referenced consistently?
- Is formatting consistent?
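
Inconsistent command naming can be surfaced by counting the scripts a document refers to. This is a rough sketch: `script_usage` is a hypothetical helper, and the pattern only recognizes `./name.sh` style invocations like those in the examples above.

```python
import re
from collections import Counter

# Matches shell script invocations such as ./migrate.sh
SCRIPT_PATTERN = re.compile(r"\./[\w-]+\.sh\b")


def script_usage(markdown: str) -> Counter:
    """Count how often each ./script.sh appears in the text."""
    return Counter(SCRIPT_PATTERN.findall(markdown))


good = "./migrate.sh apply\n./migrate.sh rollback\n./migrate.sh create add_users_table"
bad = "./run-migrations.sh\n./rollback-db.sh\n./new-migration.sh"

print(dict(script_usage(good)))
print(dict(script_usage(bad)))
```

Several distinct scripts for one tool is not necessarily wrong, but it is a signal to check whether the same operation is being named in more than one way.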

---

## 8. Testing & Verification

**What to check**: Quick checks, expected outputs, or smoke tests included.

**Good example**:
````markdown
## Verification

Test the installation:
```bash
./health-check.sh
```

**Expected output**:
```
✓ API connection successful
✓ Database accessible
✓ Cache configured
All systems operational
```

**Quick smoke test**:
```bash
# Should return status 200
curl -I https://api.example.com/health
```
````

**Bad example**:
````markdown
## Usage

Run the tool:
```bash
./tool.sh
```
````

**Why it matters**: Users need to verify the skill works correctly and understand what success looks like.

**Testing elements**:
- **Smoke tests**: Quick checks that basic functionality works
- **Expected outputs**: What success looks like
- **Verification steps**: How to confirm it's working
- **Example runs**: Representative use cases with results

**Review questions**:
- Are there quick verification steps?
- Is expected output shown?
- Can users confirm the skill works?
- Are examples testable/reproducible?
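
Documented expected output makes verification scriptable. The sketch below compares captured output against the expected lines from the good example; `missing_lines` and `EXPECTED` are hypothetical names, and how the output is captured (for instance with `subprocess.run(["./health-check.sh"], capture_output=True, text=True)`) is left to the caller.

```python
# Expected lines copied from the documented "Expected output" block.
EXPECTED = [
    "✓ API connection successful",
    "✓ Database accessible",
    "✓ Cache configured",
    "All systems operational",
]


def missing_lines(actual_output: str, expected_lines: list[str]) -> list[str]:
    """Return the expected lines that do not appear in the actual output."""
    actual = {line.strip() for line in actual_output.splitlines()}
    return [line for line in expected_lines if line.strip() not in actual]


# Simulated output in which the cache check did not report success.
captured = "✓ API connection successful\n✓ Database accessible\n"
print(missing_lines(captured, EXPECTED))
```

An empty result means every documented success line was seen; anything else names exactly which check to investigate.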

---

## 9. Ownership & Maintenance (Optional)

**What to check**: Known limitations documented. Version/maintainer metadata optional but recommended for team/public skills.

**Note**: Marketplace-level changelogs (`changelogs/skill-name.md`) are required per marketplace standards and already provide versioning. The Version section within SKILL.md itself is optional.

**When to include version metadata in SKILL.md**:
- Public marketplace skills (helps users track updates)
- Team-shared skills (clarifies who maintains it)
- Skills with frequent breaking changes (version tracking important)

**When to skip version metadata in SKILL.md**:
- Personal skills for individual use
- Experimental/prototype skills
- Skills where marketplace changelogs provide sufficient tracking

**Example with optional version metadata**:
```markdown
# API Integration Skill

**Version**: 1.2.0
**Maintainer**: DevTools Team (devtools@example.com)

## Known Limitations

- Rate limited to 100 requests/minute
- Large file uploads (>10MB) not supported
- Requires Python 3.8+
```

**Minimal example (recommended for most skills)**:
```markdown
# Simple Helper Skill

## Known Limitations

- Works only on Linux/macOS
- Requires bash 4.0+
```

**Why it matters**: Known limitations help users understand constraints. Version metadata in SKILL.md is helpful for team coordination but optional.

**Core elements**:
- **Known Limitations** (recommended): Document constraints and requirements
- **Version** (optional): Current version number
- **Maintainer** (optional): Contact info for questions
- **Changelog** (optional in SKILL.md): Can reference marketplace changelog or references/

**Review questions**:
- Are known limitations or requirements documented?
- If version metadata is present, is it complete and current?
- If this is a team skill, is maintainer contact info provided?
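
When version metadata is present, a reviewer can check it mechanically. The sketch below assumes the `**Key**: value` layout shown in the example above; `read_metadata` and `has_known_limitations` are hypothetical helpers, and skills that store this information elsewhere (for example in frontmatter) would need a different parser.

```python
import re


def read_metadata(markdown: str) -> dict[str, str]:
    """Collect lines of the form **Key**: value, such as **Version**: 1.2.0."""
    pairs = re.findall(r"^\*\*(\w+)\*\*:\s*(.+)$", markdown, flags=re.MULTILINE)
    return {key: value.strip() for key, value in pairs}


def has_known_limitations(markdown: str) -> bool:
    """True when the document contains a 'Known Limitations' heading."""
    pattern = r"^#+\s+Known Limitations\s*$"
    return re.search(pattern, markdown, flags=re.MULTILINE) is not None


skill = (
    "# API Integration Skill\n\n"
    "**Version**: 1.2.0\n"
    "**Maintainer**: DevTools Team (devtools@example.com)\n\n"
    "## Known Limitations\n\n"
    "- Rate limited to 100 requests/minute\n"
)
print(read_metadata(skill))
print(has_known_limitations(skill))
```

A missing Version or Maintainer key is only a finding for team or public skills; a missing Known Limitations section is worth raising in any review.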

---

## 10. Tight Scope & Minimalism

**What to check**: Focused purpose, no feature creep, no overlapping functionality.

**Good example** (focused):
```markdown
# PDF Text Extractor

Extract text content from PDF files using pdfplumber.

## Supported Operations
- Extract text from single page
- Extract text from all pages
- Extract text with layout preservation

**Not covered** (use pdf-form-filler skill):
- Form filling
- PDF editing
```

**Bad example** (scope creep):
```markdown
# PDF Swiss Army Knife

Complete PDF toolkit for all your document needs!

## Features
- Text extraction
- Image extraction
- Form filling
- PDF editing
- PDF merging
- PDF splitting
- Watermarking
- OCR processing
- Compression
- Encryption
- Digital signatures
- Conversion to Word/Excel
- Email integration
- Cloud storage sync
```

**Why it matters**: Focused skills are easier to maintain, understand, and use. Feature creep dilutes the skill's purpose and increases complexity.

**Scope checklist**:
- ✅ Solves one focused job well
- ✅ Clear boundaries (what's in, what's out)
- ✅ No overlapping functionality with other skills
- ✅ No unrelated features
- ✅ Complexity matches the actual need

**Review questions**:
- Does the skill do one thing well?
- Are there unrelated features that should be separate skills?
- Does functionality overlap with existing skills?
- Is complexity justified by the use case?

---

## Using This Checklist

### Quick Review (5-10 minutes)

Scan for obvious issues:
1. Check SKILL.md length (should be under 5k tokens)
2. Verify progressive disclosure (links to references/)
3. Look for language that undermines the mental model shift ("new feature", "recommended")
4. Check for safety warnings on destructive operations
5. Verify examples are present and minimal

### Thorough Review (30-60 minutes)

Apply all 10 criteria systematically:
1. Read SKILL.md completely
2. Check frontmatter quality
3. Verify each criterion with examples
4. Review all reference files
5. Test examples if possible
6. Document findings in review report

### Common Review Patterns

**New Skill**:
- Focus on criteria 1, 2, 4, 10 (structure and scope)
- Verify progressive disclosure from the start
- Ensure the skill is presented as the canonical way, with no "new" or "recommended" framing

**Updated Skill**:
- Focus on criteria 3, 7, 9 (consistency with changes)
- Check that updates didn't break existing patterns
- Verify the changelog is updated

**Audit**:
- Apply all 10 criteria
- Compare against other skills for consistency
- Look for improvement opportunities