# Activation Quality Checklist

**Version:** 1.0
**Purpose:** Ensure high-quality activation system for all created skills

---

## Overview

Use this checklist during Phase 4 (Detection) to ensure the skill has robust, reliable activation. **All items must be checked before proceeding to Phase 5.**

**Target:** 95%+ activation reliability with zero false positives

---

## ✅ Layer 1: Keywords Quality

### Quantity

- [ ] **Minimum 10 keywords defined**
- [ ] **Maximum 20 keywords** (more can dilute effectiveness)
- [ ] At least 3 categories covered (action, workflow, domain)

### Quality

- [ ] **All keywords are complete phrases** (not single words)
- [ ] No keywords shorter than 2 words
- [ ] **No overly generic keywords** (e.g., "data", "analysis" alone)
- [ ] Each keyword is unique and non-redundant

### Coverage

- [ ] Keywords cover main capability: {{capability-1}}
- [ ] Keywords cover secondary capability: {{capability-2}}
- [ ] Keywords cover tertiary capability: {{capability-3}}
- [ ] **At least 3 keywords per major capability**

### Specificity

- [ ] Keywords include action verbs (create, analyze, extract)
- [ ] Keywords include domain entities (agent, stock, crop)
- [ ] Keywords include context modifiers when appropriate

### Examples

- [ ] ✅ Good: "create an agent for"
- [ ] ✅ Good: "stock technical analysis"
- [ ] ✅ Good: "harvest progress data"
- [ ] ❌ Bad: "create" (single word)
- [ ] ❌ Bad: "data analysis" (too generic)
- [ ] ❌ Bad: "help me" (too vague)

---

## ✅ Layer 2: Patterns Quality

### Quantity

- [ ] **Minimum 5 patterns defined**
- [ ] **Maximum 10 patterns** (more can create conflicts)
- [ ] At least 3 pattern types covered (action, transformation, query)

### Structure

- [ ] **All patterns start with (?i)** for case-insensitivity
- [ ] All patterns include action verb group
- [ ] Patterns allow for flexible word order where appropriate
- [ ] **No patterns match single words only**

### Specificity vs Flexibility

- [ ] Patterns are specific enough (avoid false positives)
- [ ] Patterns are flexible enough (capture variations)
- [ ] Patterns require both verb AND entity/context
- [ ] **Tested each pattern independently**

### Quality Checks

- [ ] **Pattern 1: Action + Object pattern exists**
  - Example: `(?i)(create|build)\s+(an?\s+)?agent\s+for`
- [ ] **Pattern 2: Domain-specific pattern exists**
  - Example: `(?i)(analyze|monitor)\s+.*\s+(stock|crop)`
- [ ] **Pattern 3: Workflow pattern exists** (if applicable)
  - Example: `(?i)(every day|daily)\s+I\s+(have to|need)`
- [ ] **Pattern 4: Transformation pattern exists** (if applicable)
  - Example: `(?i)(convert|transform)\s+.*\s+into`
- [ ] Pattern 5-7: Additional patterns cover edge cases

### Testing

- [ ] **Each pattern tested with 5+ positive examples**
- [ ] Each pattern tested with 2+ negative examples
- [ ] No pattern has >20% false positive rate
- [ ] Combined patterns achieve >80% coverage

---

## ✅ Layer 3: Description Quality

### Content Requirements

- [ ] **60+ unique keywords included in description**
- [ ] All major capabilities explicitly mentioned
- [ ] **Each capability has synonyms** in parentheses
- [ ] Technology/API/data source names included
- [ ] 3-5 example use cases mentioned

### Structure

- [ ] Description starts with primary use case
- [ ] **"Activates for queries about:"** section included
- [ ] **"Does NOT activate for:"** section included
- [ ] Length is 300-500 characters (comprehensive but not excessive)

### Keyword Integration

- [ ] All Layer 1 keywords appear in description
- [ ] Domain-specific terms well-represented
- [ ] Action verbs prominently featured
- [ ] Geographic/temporal qualifiers included (if relevant)

### Clarity

- [ ] Description is readable and natural
- [ ] No keyword stuffing (keywords flow naturally)
- [ ] Technical terms explained where necessary
- [ ] **User can understand when to use skill**

---

## ✅ Usage Section Quality

### when_to_use

- [ ] **Minimum 5 use cases listed**
- [ ] Use cases are specific and actionable
- [ ] Use cases cover all major capabilities
- [ ] Use cases use natural language

### when_not_to_use

- [ ] **Minimum 3 counter-cases listed**
- [ ] Counter-cases prevent common false positives
- [ ] Counter-cases clearly distinguish from similar skills
- [ ] Each counter-case explains WHY not to use

### Example

- [ ] **Concrete example query provided**
- [ ] Example demonstrates typical usage
- [ ] Example would actually activate the skill

---

## ✅ Test Queries Quality

### Quantity

- [ ] **Minimum 10 test queries defined**
- [ ] At least 2 queries per major capability
- [ ] Mix of query types (direct, natural, edge cases)

### Coverage

- [ ] Tests cover Layer 1 (keywords)
- [ ] Tests cover Layer 2 (patterns)
- [ ] Tests cover Layer 3 (description/NLU)
- [ ] Tests cover all capabilities
- [ ] Tests include edge cases

### Quality

- [ ] Queries use natural language
- [ ] Queries are realistic user requests
- [ ] Queries vary in phrasing and structure
- [ ] **Each query documented with expected activation layer**

### Negative Tests

- [ ] **Minimum 3 negative test cases** (should NOT activate)
- [ ] Negative cases test counter-examples from when_not_to_use
- [ ] Negative cases documented separately

---

## ✅ Integration & Conflicts

### Conflict Check

- [ ] **Reviewed other existing skills in ecosystem**
- [ ] No keyword conflicts with other skills
- [ ] Patterns don't overlap significantly with other skills
- [ ] Clear differentiation from similar skills

### Priority

- [ ] Activation priority is appropriate
- [ ] More specific skills have higher priority if needed
- [ ] Domain-specific skills prioritized over general skills

---

## ✅ Documentation

### In marketplace.json

- [ ] **activation section complete**
- [ ] **usage section complete**
- [ ] **test_queries array populated**
- [ ] All JSON is valid (no syntax errors)

### In SKILL.md

- [ ] Keywords section included
- [ ] Activation examples (positive and negative)
- [ ] Use cases clearly documented

### In README.md

- [ ] **Activation section included** (see template)
- [ ] 10+ activation phrase examples
- [ ] Counter-examples documented
- [ ] Activation tips provided

---

## ✅ Testing Validation

### Layer Testing

- [ ] **Layer 1 (Keywords) tested individually**
  - Pass rate: ___% (target: 100%)
- [ ] **Layer 2 (Patterns) tested individually**
  - Pass rate: ___% (target: 100%)
- [ ] **Layer 3 (Description) tested with edge cases**
  - Pass rate: ___% (target: 90%+)

### Integration Testing

- [ ] **All test_queries tested in Claude Code**
  - Pass rate: ___% (target: 95%+)
- [ ] Negative tests verified (no false positives)
  - Pass rate: ___% (target: 100%)

### Results

- [ ] **Overall success rate: ____%** (target: >=95%)
- [ ] **False positive rate: ____%** (target: 0%)
- [ ] **False negative rate: ____%** (target: <5%)

---

## ✅ Final Verification

### Pre-Deployment

- [ ] All above checklists completed
- [ ] Test report documented
- [ ] Issues identified and fixed
- [ ] **Activation success rate >= 95%**

### Documentation Complete

- [ ] marketplace.json reviewed and validated
- [ ] SKILL.md includes activation section
- [ ] README.md includes activation examples
- [ ] TESTING.md created (if complex skill)

### Sign-Off

- [ ] Creator reviewed activation system
- [ ] Test results satisfactory
- [ ] Ready for Phase 5 (Implementation)

---

## 📊 Scoring System

### Minimum Requirements

| Layer | Minimum Score | Target Score |
|-------|---------------|--------------|
| Keywords (Layer 1) | 10 keywords | 12-15 keywords |
| Patterns (Layer 2) | 5 patterns | 7 patterns |
| Description (Layer 3) | 300 chars, 60+ keywords | 400 chars, 80+ keywords |
| Test Queries | 10 queries | 15+ queries |
| Success Rate | 90% | 95%+ |

### Grading

**A (Excellent):** 95%+ success rate, all requirements met
**B (Good):** 90-94% success rate, most requirements met
**C (Acceptable):** 85-89% success rate, minimum requirements met
**F (Needs Work):** <85% success rate, requirements not met

**Only Grade A skills should proceed to implementation.**

---

## 🚨 Common Issues Checklist

### Issue: Low Activation Rate (<90%)

**Check:**

- [ ] Are keywords too specific/narrow?
- [ ] Are patterns too restrictive?
- [ ] Is description missing key concepts?
- [ ] Are test queries realistic?

### Issue: False Positives

**Check:**

- [ ] Are keywords too generic?
- [ ] Are patterns too broad?
- [ ] Is description unclear about scope?
- [ ] Are when_not_to_use cases defined?

### Issue: Inconsistent Activation

**Check:**

- [ ] Are all 3 layers properly configured?
- [ ] Is JSON syntax valid?
- [ ] Are patterns properly escaped?
- [ ] Has testing been thorough?

---

## 📝 Quick Reference

### Minimum Requirements Summary

**Must Have:**

- ✅ 10+ keywords (complete phrases)
- ✅ 5+ patterns (with verbs + entities)
- ✅ 300+ char description (60+ keywords)
- ✅ 5+ when_to_use cases
- ✅ 3+ when_not_to_use cases
- ✅ 10+ test queries
- ✅ 95%+ success rate

**Should Have:**

- ⭐ 15 keywords
- ⭐ 7 patterns
- ⭐ 400+ char description (80+ keywords)
- ⭐ 15+ test queries
- ⭐ 98%+ success rate
- ⭐ Zero false positives

---

## 📚 Additional Resources

- `phase4-detection.md` - Complete detection methodology
- `activation-patterns-guide.md` - Pattern library
- `activation-testing-guide.md` - Testing procedures
- `marketplace-robust-template.json` - Template with placeholders
- `README-activation-template.md` - README template

---

**Status:** ___ (In Progress / Complete)
**Reviewer:** ___
**Date:** ___
**Success Rate:** ___%
**Grade:** ___ (A / B / C / F)

---

**Version:** 1.0
**Last Updated:** 2025-10-23
**Maintained By:** Agent-Skill-Creator Team
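
---

Several of the Layer 1 and Layer 2 items in this checklist are purely structural (keyword count, phrase length, uniqueness, the `(?i)` prefix, regex validity), so they can be checked mechanically before any manual review. The following is a minimal sketch of such a check, not part of the skill-creator tooling: the function names `check_keywords` and `check_patterns` are hypothetical, and it assumes the patterns are evaluated with Python's `re` module, which may differ from the regex engine actually used for activation.

```python
import re


def check_keywords(keywords):
    """Return a list of Layer 1 problems; an empty list means the checks passed."""
    problems = []
    # Quantity: minimum 10, maximum 20 keywords.
    if not 10 <= len(keywords) <= 20:
        problems.append(f"expected 10-20 keywords, got {len(keywords)}")
    # Normalize case and whitespace so near-identical phrases count as duplicates.
    normalized = [" ".join(k.lower().split()) for k in keywords]
    # Quality: every keyword must be a phrase of at least 2 words.
    for kw in normalized:
        if len(kw.split()) < 2:
            problems.append(f"keyword is not a phrase: {kw!r}")
    # Quality: keywords must be unique.
    if len(set(normalized)) != len(normalized):
        problems.append("duplicate keywords found")
    return problems


def check_patterns(patterns):
    """Return a list of Layer 2 problems; an empty list means the checks passed."""
    problems = []
    # Quantity: minimum 5, maximum 10 patterns.
    if not 5 <= len(patterns) <= 10:
        problems.append(f"expected 5-10 patterns, got {len(patterns)}")
    for pattern in patterns:
        # Structure: every pattern starts with (?i) for case-insensitivity.
        if not pattern.startswith("(?i)"):
            problems.append(f"pattern lacks (?i) prefix: {pattern!r}")
        # Escaping: the pattern must at least compile.
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append(f"pattern does not compile: {pattern!r} ({exc})")
    return problems


# The four example patterns from the Quality Checks section.
example_patterns = [
    r"(?i)(create|build)\s+(an?\s+)?agent\s+for",
    r"(?i)(analyze|monitor)\s+.*\s+(stock|crop)",
    r"(?i)(every day|daily)\s+I\s+(have to|need)",
    r"(?i)(convert|transform)\s+.*\s+into",
]

for problem in check_patterns(example_patterns):
    print(problem)
```

Run against only the four example patterns, the sketch reports that the pattern count is below the minimum of 5. It cannot judge whether a keyword is too generic or a pattern too broad; those items still need the manual positive and negative testing described under Testing.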
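
---

The Results and Grading sections ask for three rates and a letter grade but do not spell out how the rates are computed. One possible reading is sketched below, under stated assumptions: success rate is the share of should-activate queries that activated, false negative rate is its complement, and false positive rate is the share of should-not-activate queries that activated anyway. The `Result` class and the `summarize` and `grade` functions are hypothetical names, and the grade here looks only at the success rate, not at whether the other requirements are met.

```python
from dataclasses import dataclass


@dataclass
class Result:
    query: str
    expected: bool   # should the skill activate for this query?
    activated: bool  # did the skill actually activate?


def summarize(results):
    """Compute the three rates (in percent) under the assumed definitions."""
    positives = [r for r in results if r.expected]
    negatives = [r for r in results if not r.expected]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative test")
    success = 100 * sum(r.activated for r in positives) / len(positives)
    false_positive = 100 * sum(r.activated for r in negatives) / len(negatives)
    return {
        "success_rate": success,
        "false_negative_rate": 100 - success,
        "false_positive_rate": false_positive,
    }


def grade(success_rate):
    """Map a success rate (percent) to the letter grades in the Grading section."""
    if success_rate >= 95:
        return "A"
    if success_rate >= 90:
        return "B"
    if success_rate >= 85:
        return "C"
    return "F"


# Illustrative data only: 20 positive queries (19 activate), 3 negative (none activate).
results = [Result(f"positive query {i}", True, i != 0) for i in range(20)]
results += [Result(f"negative query {i}", False, False) for i in range(3)]

summary = summarize(results)
print(summary, grade(summary["success_rate"]))
```

The inputs are made-up placeholders; in practice `activated` would be recorded by hand while running each entry of `test_queries` in Claude Code.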