gh-epieczko-betty/skills/artifact.validate.types/SKILL.md
2025-11-29 18:26:08 +08:00

artifact.validate.types

Overview

Validates artifact type names against the Betty Framework registry and returns complete metadata for each type. For invalid types, it suggests similar registered types using fuzzy matching.

Version: 0.1.0
Status: active

Purpose

This skill ensures that skills reference valid artifact types before they are created. It validates artifact type names against registry/artifact_types.json, retrieves complete metadata (file_pattern, content_type, schema), and suggests alternatives for invalid types using three fuzzy matching strategies:

  1. Singular/Plural Detection (high confidence) - Detects "data-flow-diagram" vs "data-flow-diagrams"
  2. Generic vs Specific Variants (medium confidence) - Suggests "logical-data-model" for "data-model"
  3. Levenshtein Distance (low confidence) - Catches typos like "thret-model" → "threat-model"

This skill is specifically designed to be called by meta.skill during Step 2 (Validate Artifact Types) of the skill creation workflow.

Inputs

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| artifact_types | array | Yes | - | List of artifact type names to validate |
| check_schemas | boolean | No | true | Whether to verify schema files exist on filesystem |
| suggest_alternatives | boolean | No | true | Whether to suggest similar types for invalid ones |
| max_suggestions | number | No | 3 | Maximum number of suggestions per invalid type |
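
The parameters above map naturally onto command-line flags like those used in the Usage examples. The following is a minimal sketch of how such flags could be parsed with argparse; the helper names (parse_bool, parse_type_list, build_parser) are hypothetical and are not taken from the skill's actual source.

```python
import argparse
import json


def parse_bool(value):
    """Interpret 'true'/'false' strings as passed on the command line."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def parse_type_list(value):
    """Decode artifact_types, which must be a JSON array of strings."""
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError as exc:
        raise argparse.ArgumentTypeError(f"Invalid JSON: {exc}")
    if not isinstance(parsed, list) or not all(isinstance(t, str) for t in parsed):
        raise argparse.ArgumentTypeError("artifact_types must be a JSON array of strings")
    return parsed


def build_parser():
    """Hypothetical parser mirroring the documented inputs and defaults."""
    parser = argparse.ArgumentParser(description="Validate artifact type names")
    parser.add_argument("--artifact_types", type=parse_type_list, required=True)
    parser.add_argument("--check_schemas", type=parse_bool, default=True)
    parser.add_argument("--suggest_alternatives", type=parse_bool, default=True)
    parser.add_argument("--max_suggestions", type=int, default=3)
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--artifact_types", '["threat-model"]', "--check_schemas", "false"]
)
print(args.artifact_types, args.check_schemas, args.max_suggestions)
```

Parsing booleans from explicit "true"/"false" strings matches the way the Usage examples pass `--check_schemas true`, rather than using store_true flags.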

Outputs

| Output | Type | Description |
|---|---|---|
| validation_results | object | Validation results for each artifact type with complete metadata |
| all_valid | boolean | Whether all artifact types are valid |
| invalid_types | array | List of artifact types that don't exist in registry |
| suggestions | object | Suggested alternatives for each invalid type |
| warnings | array | List of warnings (e.g., schema file missing) |

Artifact Metadata

Produces

  • validation-report (*.validation.json) - Validation results with metadata and suggestions

Consumes

None - reads directly from registry files

Usage

Example 1: Validate Single Artifact Type

python artifact_validate_types.py \
  --artifact_types '["threat-model"]' \
  --check_schemas true

Output:

{
  "validation_results": {
    "threat-model": {
      "valid": true,
      "file_pattern": "*.threat-model.yaml",
      "content_type": "application/yaml",
      "schema": "schemas/artifacts/threat-model-schema.json",
      "description": "Threat model (STRIDE, attack trees)..."
    }
  },
  "all_valid": true,
  "invalid_types": [],
  "suggestions": {},
  "warnings": []
}

Example 2: Invalid Type with Suggestions

python artifact_validate_types.py \
  --artifact_types '["data-flow-diagram", "threat-model"]' \
  --suggest_alternatives true

Output:

{
  "validation_results": {
    "data-flow-diagram": {
      "valid": false
    },
    "threat-model": {
      "valid": true,
      "file_pattern": "*.threat-model.yaml",
      "content_type": "application/yaml",
      "schema": "schemas/artifacts/threat-model-schema.json"
    }
  },
  "all_valid": false,
  "invalid_types": ["data-flow-diagram"],
  "suggestions": {
    "data-flow-diagram": [
      {
        "type": "data-flow-diagrams",
        "reason": "Plural form",
        "confidence": "high"
      },
      {
        "type": "dataflow-diagram",
        "reason": "Similar spelling",
        "confidence": "low"
      }
    ]
  },
  "warnings": [],
  "ok": false,
  "status": "validation_failed"
}

Example 3: Multiple Invalid Types with Generic → Specific Suggestions

python artifact_validate_types.py \
  --artifact_types '["data-model", "api-spec", "test-result"]' \
  --max_suggestions 3

Output:

{
  "all_valid": false,
  "invalid_types": ["data-model", "api-spec"],
  "suggestions": {
    "data-model": [
      {
        "type": "logical-data-model",
        "reason": "Specific variant of model",
        "confidence": "medium"
      },
      {
        "type": "physical-data-model",
        "reason": "Specific variant of model",
        "confidence": "medium"
      },
      {
        "type": "enterprise-data-model",
        "reason": "Specific variant of model",
        "confidence": "medium"
      }
    ],
    "api-spec": [
      {
        "type": "openapi-spec",
        "reason": "Specific variant of spec",
        "confidence": "medium"
      },
      {
        "type": "asyncapi-spec",
        "reason": "Specific variant of spec",
        "confidence": "medium"
      }
    ]
  },
  "validation_results": {
    "test-result": {
      "valid": true,
      "file_pattern": "*.test-result.json",
      "content_type": "application/json"
    }
  }
}

Example 4: Save Validation Report to File

python artifact_validate_types.py \
  --artifact_types '["threat-model", "architecture-overview"]' \
  --output validation-results.validation.json

Creates validation-results.validation.json with complete validation report.
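
A saved report can be read back by any tool that understands the output fields listed under Outputs. The sketch below is one possible way to turn a report into a short summary; summarize_report is a hypothetical helper, and the sample report is hand-written in the shape shown in Example 2 rather than produced by the skill.

```python
import json
import tempfile
from pathlib import Path


def summarize_report(path):
    """Load a saved validation report and return summary lines."""
    report = json.loads(Path(path).read_text(encoding="utf-8"))
    invalid = report.get("invalid_types", [])
    if report.get("all_valid"):
        lines = ["all valid"]
    else:
        lines = [f"{len(invalid)} invalid type(s)"]
    for name in invalid:
        suggested = [s["type"] for s in report.get("suggestions", {}).get(name, [])]
        hint = ", ".join(suggested) if suggested else "no suggestions"
        lines.append(f"{name}: try {hint}")
    lines.extend(f"warning: {w}" for w in report.get("warnings", []))
    return lines


# Hand-written sample in the documented output shape (not real skill output).
sample = {
    "all_valid": False,
    "invalid_types": ["data-flow-diagram"],
    "suggestions": {
        "data-flow-diagram": [
            {"type": "data-flow-diagrams", "reason": "Plural form", "confidence": "high"}
        ]
    },
    "warnings": [],
}

with tempfile.TemporaryDirectory() as tmp:
    report_path = Path(tmp) / "validation-results.validation.json"
    report_path.write_text(json.dumps(sample), encoding="utf-8")
    summary = summarize_report(report_path)

print("\n".join(summary))
```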

Integration with meta.skill

The meta.skill agent calls this skill in Step 2 of its workflow:

# meta.skill workflow Step 2
2. **Validate Artifact Types**
   - Extract artifact types from skill description
   - Call artifact.validate.types with all types
   - If all_valid == false:
     → Display suggestions to user
     → Ask user to confirm correct types
     → HALT until types are validated
   - Store validated metadata for use in skill.yaml generation

Example Integration:

import json
import subprocess

# meta.skill calls artifact.validate.types
result = subprocess.run([
    'python', 'skills/artifact.validate.types/artifact_validate_types.py',
    '--artifact_types', json.dumps(["threat-model", "data-flow-diagrams"]),
    '--suggest_alternatives', 'true'
], capture_output=True, text=True)

validation = json.loads(result.stdout)

if not validation['all_valid']:
    print(f"❌ Invalid artifact types: {validation['invalid_types']}")
    for invalid_type, suggestions in validation['suggestions'].items():
        print(f"\n  Suggestions for '{invalid_type}':")
        for s in suggestions:
            print(f"    - {s['type']} ({s['confidence']} confidence): {s['reason']}")
    # HALT skill creation
else:
    print("✅ All artifact types validated")
    # Continue with skill.yaml generation using validated metadata

Fuzzy Matching Strategies

Strategy 1: Singular/Plural Detection (High Confidence)

Detects when a type name differs from a registered type only by a missing or extra trailing "s":

| Invalid Type | Suggested Type | Reason |
|---|---|---|
| data-flow-diagram | data-flow-diagrams | Plural form |
| threat-models | threat-model | Singular form |
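
A minimal sketch of the singular/plural check, assuming the set of registered type names is already in memory. The function name is hypothetical, and the check only handles a trailing "s"; the skill's real implementation may cover other plural forms.

```python
def singular_plural_suggestions(name, known_types):
    """Suggest registered types that differ from `name` only by a trailing 's'."""
    suggestions = []
    plural = name + "s"
    if plural in known_types:
        suggestions.append(
            {"type": plural, "reason": "Plural form", "confidence": "high"}
        )
    if name.endswith("s") and name[:-1] in known_types:
        suggestions.append(
            {"type": name[:-1], "reason": "Singular form", "confidence": "high"}
        )
    return suggestions


# Illustrative registry contents, taken from the table above.
known = {"data-flow-diagrams", "threat-model"}
missing_s = singular_plural_suggestions("data-flow-diagram", known)
extra_s = singular_plural_suggestions("threat-models", known)
print(missing_s)
print(extra_s)
```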

Strategy 2: Generic vs Specific Variants (Medium Confidence)

Suggests specific variants when a generic term is used:

| Invalid Type | Suggested Types |
|---|---|
| data-model | logical-data-model, physical-data-model, enterprise-data-model |
| api-spec | openapi-spec, asyncapi-spec, graphql-spec |
| architecture-diagram | system-architecture-diagram, component-architecture-diagram |
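
One way to produce suggestions like these is to match on the final hyphen-separated token of the generic name (for example "model" or "spec"). The sketch below assumes that approach and ranks candidates containing the whole generic name first; both the matching rule and the ranking are assumptions for illustration, not the skill's confirmed logic.

```python
def specific_variant_suggestions(name, known_types, max_suggestions=3):
    """Suggest registered types that share the final token of a generic name."""
    suffix = name.rsplit("-", 1)[-1]  # "data-model" -> "model"
    matches = [t for t in known_types if t != name and t.endswith("-" + suffix)]
    # Candidates ending with the full generic name sort first, then alphabetically.
    matches.sort(key=lambda t: (not t.endswith(name), t))
    return [
        {"type": t, "reason": f"Specific variant of {suffix}", "confidence": "medium"}
        for t in matches[:max_suggestions]
    ]


# Illustrative registry contents only.
known = {"logical-data-model", "physical-data-model", "threat-model", "openapi-spec"}
variants = specific_variant_suggestions("data-model", known, max_suggestions=2)
print([s["type"] for s in variants])
```

Sorting before truncating keeps the output stable regardless of set iteration order, which matters when max_suggestions cuts the list short.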

Strategy 3: Levenshtein Distance (Low Confidence)

Catches typos and misspellings (60%+ similarity):

| Invalid Type | Suggested Type | Similarity |
|---|---|---|
| thret-model | threat-model | ~90% |
| architecure-overview | architecture-overview | ~85% |
| api-specfication | api-specification | ~92% |
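
The Dependencies section lists difflib for fuzzy matching, so the sketch below uses difflib.SequenceMatcher with the 60% threshold. Note that SequenceMatcher computes a matching-block similarity ratio rather than a true Levenshtein edit distance, so its scores may differ from the approximate percentages in the table. The function name and the "similarity" field are hypothetical.

```python
from difflib import SequenceMatcher


def typo_suggestions(name, known_types, threshold=0.6, max_suggestions=3):
    """Suggest registered types whose similarity to `name` meets the threshold."""
    scored = []
    for candidate in known_types:
        ratio = SequenceMatcher(None, name, candidate).ratio()
        if ratio >= threshold:
            scored.append((ratio, candidate))
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [
        {
            "type": candidate,
            "reason": "Similar spelling",
            "confidence": "low",
            "similarity": round(ratio, 2),
        }
        for ratio, candidate in scored[:max_suggestions]
    ]


# Illustrative registry contents only.
known = ["threat-model", "architecture-overview", "test-result"]
typo_matches = typo_suggestions("thret-model", known)
print(typo_matches)
```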

Error Handling

Missing Registry File

{
  "ok": false,
  "status": "error",
  "error": "Artifact registry not found: registry/artifact_types.json"
}

Resolution: Ensure you're running from the Betty Framework root directory.

Invalid JSON in artifact_types Parameter

{
  "ok": false,
  "status": "error",
  "error": "Invalid JSON: Expecting ',' delimiter: line 1 column 15 (char 14)"
}

Resolution: Ensure artifact_types is a valid JSON array with proper quoting.

Corrupted Registry File

{
  "ok": false,
  "status": "error",
  "error": "Invalid JSON in registry file: ..."
}

Resolution: Validate and fix registry/artifact_types.json syntax.
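
The two registry errors above can be produced by a small loader that returns either the parsed registry or an error object. This is a sketch under the assumption that the registry is a JSON file at the documented path; load_registry is a hypothetical name.

```python
import json
from pathlib import Path


def load_registry(path="registry/artifact_types.json"):
    """Return (registry, None) on success or (None, error_dict) on failure."""
    registry_path = Path(path)
    if not registry_path.is_file():
        return None, {
            "ok": False,
            "status": "error",
            "error": f"Artifact registry not found: {path}",
        }
    try:
        with registry_path.open(encoding="utf-8") as handle:
            return json.load(handle), None
    except json.JSONDecodeError as exc:
        return None, {
            "ok": False,
            "status": "error",
            "error": f"Invalid JSON in registry file: {exc}",
        }


# Point at a path that does not exist to trigger the first error case.
registry, error = load_registry("does-not-exist/artifact_types.json")
print(json.dumps(error, indent=2))
```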

Performance

  • Single type validation: <100ms
  • 20 types validation: <1 second
  • All 409 types validation: <5 seconds

Memory usage is minimal because the registry is loaded once and indexed by name for O(1) lookups.
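
The name-indexed lookup can be sketched as a dictionary built once from the registry. The registry layout used here (a top-level "artifact_types" list whose entries have a "name" field) is inferred from the jq query in the Troubleshooting section; the function names are hypothetical.

```python
def index_by_name(registry):
    """Build a name -> entry dictionary so each lookup is O(1) on average."""
    return {entry["name"]: entry for entry in registry.get("artifact_types", [])}


def validate_types(names, index):
    """Return per-type results with metadata copied from the registry entry."""
    results = {}
    for name in names:
        entry = index.get(name)
        if entry is None:
            results[name] = {"valid": False}
            continue
        metadata = {
            key: entry[key]
            for key in ("file_pattern", "content_type", "schema")
            if key in entry
        }
        results[name] = {"valid": True, **metadata}
    return results


# Small in-memory registry for illustration only.
registry = {
    "artifact_types": [
        {
            "name": "threat-model",
            "file_pattern": "*.threat-model.yaml",
            "content_type": "application/yaml",
            "schema": "schemas/artifacts/threat-model-schema.json",
        }
    ]
}
index = index_by_name(registry)
results = validate_types(["threat-model", "data-flow-diagram"], index)
print(results)
```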

Dependencies

  • Python 3.7+
  • PyYAML - For reading registry
  • difflib - For fuzzy matching (Python stdlib)
  • jsonschema - For validation (optional)

Testing

Run the test suite:

cd skills/artifact.validate.types
python test_artifact_validate_types.py

Test Coverage:

  • Valid artifact type validation
  • Invalid artifact type detection
  • Singular/plural suggestion
  • Generic → specific suggestion
  • Typo detection with Levenshtein distance
  • Max suggestions limit
  • Schema file existence checking
  • Empty input handling
  • Mixed valid/invalid types

Quality Standards

  • Accuracy: 100% for exact matches in registry
  • Suggestion Quality: >80% relevant for common mistakes
  • Performance: <1s for 20 types, <100ms for single type
  • Schema Verification: 100% accurate file existence check
  • Error Handling: Graceful handling of corrupted registry files

Success Criteria

  • Validates all 409 artifact types correctly
  • Provides accurate suggestions for common mistakes (singular/plural)
  • Returns exact metadata from registry (file_pattern, content_type, schema)
  • Detects missing schema files and warns appropriately
  • Completes validation in <1 second for up to 20 types
  • Fuzzy matching handles typos within 40% character difference

Troubleshooting

Skill returns all_valid=false but I think types are correct

  1. Check the exact spelling in registry/artifact_types.json
  2. Look at suggestions - they often reveal plural/singular issues
  3. Use jq to search registry:
    jq '.artifact_types[] | select(.name | contains("your-search"))' registry/artifact_types.json
    

Fuzzy matching isn't suggesting the type I expect

  1. Check if the type name follows patterns (ending in common suffix like "-model", "-spec")
  2. Increase max_suggestions to see more options
  3. The type might be too dissimilar (< 60% match threshold)

Schema warnings appearing for valid types

This is normal if schema files haven't been created yet. Schema files are optional for many artifact types. Set check_schemas=false to suppress these warnings.
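
A sketch of how the schema check might produce these warnings, assuming each validated entry carries an optional "schema" path relative to the framework root. The function name and the warning wording are hypothetical.

```python
from pathlib import Path


def schema_warnings(validated, root=".", check_schemas=True):
    """Return a warning for each valid type whose schema file is missing."""
    if not check_schemas:
        return []
    warnings = []
    for name, meta in validated.items():
        schema = meta.get("schema")
        # Types without a schema entry are skipped, since schemas are optional.
        if schema and not (Path(root) / schema).is_file():
            warnings.append(f"Schema file missing for '{name}': {schema}")
    return warnings


validated = {
    "threat-model": {
        "valid": True,
        "schema": "schemas/artifacts/threat-model-schema.json",
    },
    "test-result": {"valid": True},
}
# A root directory that does not exist guarantees the schema file is absent.
found = schema_warnings(validated, root="nonexistent-root")
suppressed = schema_warnings(validated, root="nonexistent-root", check_schemas=False)
print(found)
print(suppressed)
```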

Related Skills

  • artifact.define - Define new artifact types
  • artifact.create - Create artifact files
  • skill.define - Validate skill manifests
  • registry.update - Update skill registry

References