Zhongwei Li 8f22ddf339 Initial commit

2025-11-29 18:26:08 +08:00

artifact.validate.types

Overview

Validates artifact type names against the Betty Framework registry and returns complete metadata for each type. For invalid types, it suggests likely alternatives using fuzzy matching.

Version: 0.1.0
Status: active

Purpose

This skill ensures that new skills reference valid artifact types before they are created. It validates artifact type names against registry/artifact_types.json, retrieves complete metadata (file_pattern, content_type, schema), and suggests alternatives for invalid types using three fuzzy matching strategies:

Singular/Plural Detection (high confidence) - Detects "data-flow-diagram" vs "data-flow-diagrams"
Generic vs Specific Variants (medium confidence) - Suggests "logical-data-model" for "data-model"
Levenshtein Distance (low confidence) - Catches typos like "thret-model" → "threat-model"

This skill is specifically designed to be called by meta.skill during Step 2 (Validate Artifact Types) of the skill creation workflow.

Inputs

Parameter	Type	Required	Default	Description
`artifact_types`	array	Yes	-	List of artifact type names to validate
`check_schemas`	boolean	No	`true`	Whether to verify schema files exist on filesystem
`suggest_alternatives`	boolean	No	`true`	Whether to suggest similar types for invalid ones
`max_suggestions`	number	No	`3`	Maximum number of suggestions per invalid type
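
The parameters above map naturally onto command-line flags. The following is a minimal sketch of how they could be parsed, not the skill's actual parser: the helper names `str_to_bool` and `parse_args` are hypothetical, and accepting only the literal strings "true" and "false" for booleans is an assumption based on the usage examples.

```python
import argparse
import json
from typing import List, Optional


def str_to_bool(value: str) -> bool:
    """Parse the 'true'/'false' strings passed on the command line (assumed format)."""
    lowered = value.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the four documented inputs, applying the documented defaults."""
    parser = argparse.ArgumentParser(description="Validate artifact type names")
    parser.add_argument("--artifact_types", required=True,
                        help="JSON array of artifact type names")
    parser.add_argument("--check_schemas", type=str_to_bool, default=True)
    parser.add_argument("--suggest_alternatives", type=str_to_bool, default=True)
    parser.add_argument("--max_suggestions", type=int, default=3)
    args = parser.parse_args(argv)
    # Decode the JSON array; json.JSONDecodeError propagates for malformed input.
    args.artifact_types = json.loads(args.artifact_types)
    if not isinstance(args.artifact_types, list):
        parser.error("--artifact_types must be a JSON array")
    return args


# Example: explicit argv so the sketch does not depend on the real command line.
args = parse_args(["--artifact_types", '["threat-model"]', "--check_schemas", "false"])
print(args.artifact_types, args.check_schemas, args.max_suggestions)
```

Passing the list as a single JSON string keeps the shell quoting simple, at the cost of a decode step that can fail on malformed input.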

Outputs

Output	Type	Description
`validation_results`	object	Validation results for each artifact type with complete metadata
`all_valid`	boolean	Whether all artifact types are valid
`invalid_types`	array	List of artifact types that don't exist in registry
`suggestions`	object	Suggested alternatives for each invalid type
`warnings`	array	List of warnings (e.g., schema file missing)
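
To make the shape of these outputs concrete, here is a rough sketch that assembles them from an in-memory lookup table. The function name `validate_types` and the `index` argument (a dict keyed by type name) are assumptions for illustration; suggestions and warnings are left empty because they come from separate steps.

```python
from typing import Any, Dict, List

# Metadata fields copied into each valid result (taken from the examples in this README).
METADATA_FIELDS = ("file_pattern", "content_type", "schema", "description")


def validate_types(artifact_types: List[str],
                   index: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Build the documented output object for a list of artifact type names."""
    results: Dict[str, Dict[str, Any]] = {}
    invalid: List[str] = []
    for name in artifact_types:
        meta = index.get(name)
        if meta is None:
            results[name] = {"valid": False}
            invalid.append(name)
        else:
            entry: Dict[str, Any] = {"valid": True}
            entry.update({k: meta[k] for k in METADATA_FIELDS if k in meta})
            results[name] = entry
    return {
        "validation_results": results,
        "all_valid": not invalid,
        "invalid_types": invalid,
        "suggestions": {},   # filled in by the fuzzy matching step
        "warnings": [],      # filled in by the schema check step
    }


index = {"threat-model": {"file_pattern": "*.threat-model.yaml",
                          "content_type": "application/yaml"}}
report = validate_types(["threat-model", "data-flow-diagram"], index)
print(report["all_valid"], report["invalid_types"])
```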

Artifact Metadata

Produces

validation-report (*.validation.json) - Validation results with metadata and suggestions

Consumes

None - reads directly from registry files

Usage

Example 1: Validate Single Artifact Type

python artifact_validate_types.py \
  --artifact_types '["threat-model"]' \
  --check_schemas true

Output:

{
  "validation_results": {
    "threat-model": {
      "valid": true,
      "file_pattern": "*.threat-model.yaml",
      "content_type": "application/yaml",
      "schema": "schemas/artifacts/threat-model-schema.json",
      "description": "Threat model (STRIDE, attack trees)..."
    }
  },
  "all_valid": true,
  "invalid_types": [],
  "suggestions": {},
  "warnings": []
}

Example 2: Invalid Type with Suggestions

python artifact_validate_types.py \
  --artifact_types '["data-flow-diagram", "threat-model"]' \
  --suggest_alternatives true

Output:

{
  "validation_results": {
    "data-flow-diagram": {
      "valid": false
    },
    "threat-model": {
      "valid": true,
      "file_pattern": "*.threat-model.yaml",
      "content_type": "application/yaml",
      "schema": "schemas/artifacts/threat-model-schema.json"
    }
  },
  "all_valid": false,
  "invalid_types": ["data-flow-diagram"],
  "suggestions": {
    "data-flow-diagram": [
      {
        "type": "data-flow-diagrams",
        "reason": "Plural form",
        "confidence": "high"
      },
      {
        "type": "dataflow-diagram",
        "reason": "Similar spelling",
        "confidence": "low"
      }
    ]
  },
  "warnings": [],
  "ok": false,
  "status": "validation_failed"
}

Example 3: Multiple Invalid Types with Generic → Specific Suggestions

python artifact_validate_types.py \
  --artifact_types '["data-model", "api-spec", "test-result"]' \
  --max_suggestions 3

Output:

{
  "all_valid": false,
  "invalid_types": ["data-model", "api-spec"],
  "suggestions": {
    "data-model": [
      {
        "type": "logical-data-model",
        "reason": "Specific variant of model",
        "confidence": "medium"
      },
      {
        "type": "physical-data-model",
        "reason": "Specific variant of model",
        "confidence": "medium"
      },
      {
        "type": "enterprise-data-model",
        "reason": "Specific variant of model",
        "confidence": "medium"
      }
    ],
    "api-spec": [
      {
        "type": "openapi-spec",
        "reason": "Specific variant of spec",
        "confidence": "medium"
      },
      {
        "type": "asyncapi-spec",
        "reason": "Specific variant of spec",
        "confidence": "medium"
      }
    ]
  },
  "validation_results": {
    "test-result": {
      "valid": true,
      "file_pattern": "*.test-result.json",
      "content_type": "application/json"
    }
  }
}

Example 4: Save Validation Report to File

python artifact_validate_types.py \
  --artifact_types '["threat-model", "architecture-overview"]' \
  --output validation-results.validation.json

Creates validation-results.validation.json containing the complete validation report.

Integration with meta.skill

The meta.skill agent calls this skill in Step 2 of its workflow:

# meta.skill workflow Step 2
2. **Validate Artifact Types**
   - Extract artifact types from skill description
   - Call artifact.validate.types with all types
   - If all_valid == false:
     → Display suggestions to user
     → Ask user to confirm correct types
     → HALT until types are validated
   - Store validated metadata for use in skill.yaml generation

Example Integration:

import json
import subprocess

# meta.skill calls artifact.validate.types
result = subprocess.run([
    'python', 'skills/artifact.validate.types/artifact_validate_types.py',
    '--artifact_types', json.dumps(["threat-model", "data-flow-diagrams"]),
    '--suggest_alternatives', 'true'
], capture_output=True, text=True)

# The skill prints its JSON report on stdout
validation = json.loads(result.stdout)

if not validation['all_valid']:
    print(f"❌ Invalid artifact types: {validation['invalid_types']}")
    for invalid_type, suggestions in validation['suggestions'].items():
        print(f"\n  Suggestions for '{invalid_type}':")
        for s in suggestions:
            print(f"    - {s['type']} ({s['confidence']} confidence): {s['reason']}")
    # HALT skill creation
else:
    print("✅ All artifact types validated")
    # Continue with skill.yaml generation using validated metadata

Fuzzy Matching Strategies

Strategy 1: Singular/Plural Detection (High Confidence)

Detects when a user omits or adds a trailing "s":

Invalid Type	Suggested Type	Reason
`data-flow-diagram`	`data-flow-diagrams`	Plural form
`threat-models`	`threat-model`	Singular form
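
A minimal sketch of this check, assuming the rule is simply "add or strip one trailing s and look the result up in the registry". The function name `suggest_singular_plural` and the `known_types` set are hypothetical; the reason and confidence strings follow the examples in this README.

```python
from typing import Dict, List, Set


def suggest_singular_plural(name: str, known_types: Set[str]) -> List[Dict[str, str]]:
    """Return high-confidence suggestions that differ from name only by a trailing 's'."""
    suggestions: List[Dict[str, str]] = []
    plural = name + "s"
    if plural in known_types:
        suggestions.append({"type": plural, "reason": "Plural form", "confidence": "high"})
    if name.endswith("s") and name[:-1] in known_types:
        suggestions.append({"type": name[:-1], "reason": "Singular form", "confidence": "high"})
    return suggestions


known = {"data-flow-diagrams", "threat-model"}
print(suggest_singular_plural("data-flow-diagram", known))
print(suggest_singular_plural("threat-models", known))
```

This sketch does not handle irregular plurals such as "-ies" or "-es"; whether the skill does is not stated here.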

Strategy 2: Generic vs Specific Variants (Medium Confidence)

Suggests specific variants when a generic term is used:

Invalid Type	Suggested Types
`data-model`	`logical-data-model`, `physical-data-model`, `enterprise-data-model`
`api-spec`	`openapi-spec`, `asyncapi-spec`, `graphql-spec`
`architecture-diagram`	`system-architecture-diagram`, `component-architecture-diagram`
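
One way such a lookup could work is sketched below. The matching rule is an assumption, not the skill's documented algorithm: a candidate qualifies if it ends with the same final hyphen-separated token as the invalid name and its remaining part contains every other token of that name. The function name `suggest_specific_variants` is hypothetical.

```python
from typing import Dict, List, Set


def suggest_specific_variants(name: str, known_types: Set[str],
                              max_suggestions: int = 3) -> List[Dict[str, str]]:
    """Suggest registered types that look like specific variants of a generic name."""
    *qualifiers, suffix = name.split("-")
    matches: List[Dict[str, str]] = []
    for candidate in sorted(known_types):
        if candidate == name or not candidate.endswith("-" + suffix):
            continue
        stem = candidate[: -(len(suffix) + 1)]  # candidate without the "-suffix" tail
        if all(q in stem for q in qualifiers):
            matches.append({
                "type": candidate,
                "reason": f"Specific variant of {suffix}",
                "confidence": "medium",
            })
    return matches[:max_suggestions]


known = {"logical-data-model", "physical-data-model", "enterprise-data-model",
         "threat-model", "openapi-spec", "asyncapi-spec"}
print([s["type"] for s in suggest_specific_variants("data-model", known)])
print([s["type"] for s in suggest_specific_variants("api-spec", known)])
```

Under this assumed rule, `threat-model` is not offered for `data-model` because its stem does not contain "data". A looser rule that matches on the suffix alone would return more candidates, including types like `graphql-spec` for `api-spec`.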

Strategy 3: Levenshtein Distance (Low Confidence)

Catches typos and misspellings (60%+ similarity):

Invalid Type	Suggested Type	Similarity
`thret-model`	`threat-model`	~90%
`architecure-overview`	`architecture-overview`	~85%
`api-specfication`	`api-specification`	~92%
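
Since difflib is listed as the fuzzy matching dependency, a typo check along these lines could be built on `difflib.get_close_matches` with a 0.6 cutoff. This is a sketch under that assumption: the function name `suggest_close_matches` is hypothetical, and difflib's similarity ratio is based on matching subsequences rather than a true Levenshtein edit distance, so its scores may differ slightly from the percentages in the table above.

```python
import difflib
from typing import Dict, List, Set


def suggest_close_matches(name: str, known_types: Set[str],
                          max_suggestions: int = 3,
                          cutoff: float = 0.6) -> List[Dict[str, str]]:
    """Return low-confidence suggestions whose similarity ratio is at least cutoff."""
    close = difflib.get_close_matches(name, sorted(known_types),
                                      n=max_suggestions, cutoff=cutoff)
    return [{"type": c, "reason": "Similar spelling", "confidence": "low"} for c in close]


known = {"threat-model", "architecture-overview", "openapi-spec"}
print(suggest_close_matches("thret-model", known))
print(suggest_close_matches("architecure-overview", known))
```

`get_close_matches` returns results ordered from most to least similar, so truncating to `max_suggestions` keeps the best candidates.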

Error Handling

Missing Registry File

{
  "ok": false,
  "status": "error",
  "error": "Artifact registry not found: registry/artifact_types.json"
}

Resolution: Ensure you're running from the Betty Framework root directory.

Invalid JSON in artifact_types Parameter

{
  "ok": false,
  "status": "error",
  "error": "Invalid JSON: Expecting ',' delimiter: line 1 column 15 (char 14)"
}

Resolution: Ensure artifact_types is a valid JSON array with proper quoting.

Corrupted Registry File

{
  "ok": false,
  "status": "error",
  "error": "Invalid JSON in registry file: ..."
}

Resolution: Validate and fix registry/artifact_types.json syntax.
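
The two registry errors above could be produced by a loader like the following sketch. The function name `load_registry` and its (registry, error) return convention are assumptions for illustration; the error envelopes mirror the JSON shown in this section, and the standard `json` module is used here because the registry file has a .json extension.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional, Tuple

ErrorEnvelope = Dict[str, Any]


def load_registry(path: str = "registry/artifact_types.json"
                  ) -> Tuple[Optional[Dict[str, Any]], Optional[ErrorEnvelope]]:
    """Return (registry, None) on success or (None, error_envelope) on failure."""
    registry_path = Path(path)
    if not registry_path.is_file():
        return None, {"ok": False, "status": "error",
                      "error": f"Artifact registry not found: {path}"}
    try:
        return json.loads(registry_path.read_text(encoding="utf-8")), None
    except json.JSONDecodeError as exc:
        return None, {"ok": False, "status": "error",
                      "error": f"Invalid JSON in registry file: {exc}"}


registry, error = load_registry()
if error is not None:
    # Printed when the script is not run from the framework root directory.
    print(json.dumps(error, indent=2))
```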

Performance

Single type validation: <100ms
20 types validation: <1 second
All 409 types validation: <5 seconds

Memory usage is minimal because the registry is loaded once and indexed by name for O(1) lookups.
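
The indexing step can be sketched as a single dictionary comprehension. This assumes the registry is a JSON object with a top-level `artifact_types` array whose entries carry a `name` field, which is the structure implied by the jq query in the Troubleshooting section; the function name `build_index` is hypothetical.

```python
import json
from typing import Any, Dict


def build_index(registry: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Index registry entries by name so each lookup is an average O(1) dict access."""
    return {entry["name"]: entry for entry in registry.get("artifact_types", [])}


# Inline sample data standing in for registry/artifact_types.json.
registry = json.loads("""
{"artifact_types": [
  {"name": "threat-model", "file_pattern": "*.threat-model.yaml"},
  {"name": "test-result", "file_pattern": "*.test-result.json"}
]}
""")
index = build_index(registry)
print("threat-model" in index, "data-model" in index)
```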

Dependencies

Python 3.7+
PyYAML - For reading registry
difflib - For fuzzy matching (Python stdlib)
jsonschema - For validation (optional)

Testing

Run the test suite:

cd skills/artifact.validate.types
python test_artifact_validate_types.py

Test Coverage:

✅ Valid artifact type validation
✅ Invalid artifact type detection
✅ Singular/plural suggestion
✅ Generic → specific suggestion
✅ Typo detection with Levenshtein distance
✅ Max suggestions limit
✅ Schema file existence checking
✅ Empty input handling
✅ Mixed valid/invalid types

Quality Standards

Accuracy: 100% for exact matches in registry
Suggestion Quality: >80% relevant for common mistakes
Performance: <1s for 20 types, <100ms for single type
Schema Verification: 100% accurate file existence check
Error Handling: Graceful handling of corrupted registry files

Success Criteria

✅ Validates all 409 artifact types correctly
✅ Provides accurate suggestions for common mistakes (singular/plural)
✅ Returns exact metadata from registry (file_pattern, content_type, schema)
✅ Detects missing schema files and warns appropriately
✅ Completes validation in <1 second for up to 20 types
✅ Fuzzy matching handles typos within 40% character difference

Troubleshooting

Skill returns all_valid=false but I think types are correct

Check the exact spelling in registry/artifact_types.json
Look at suggestions - they often reveal plural/singular issues

Use jq to search the registry:

jq '.artifact_types[] | select(.name | contains("your-search"))' registry/artifact_types.json

Fuzzy matching isn't suggesting the type I expect

Check whether the type name follows a common pattern (for example, ending in a suffix such as "-model" or "-spec")
Increase max_suggestions to see more options
The type might be too dissimilar (below the 60% similarity threshold)

Schema warnings appearing for valid types

This is normal if schema files haven't been created yet. Schema files are optional for many artifact types. Set check_schemas=false to suppress these warnings.
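
A sketch of how such warnings might be generated, assuming each registry entry stores its schema as a path relative to the framework root. The function name `find_missing_schemas` and the warning wording are hypothetical.

```python
from pathlib import Path
from typing import Any, Dict, List


def find_missing_schemas(entries: Dict[str, Dict[str, Any]], root: str = ".",
                         check_schemas: bool = True) -> List[str]:
    """Return one warning per entry whose declared schema file does not exist."""
    if not check_schemas:
        return []  # mirrors check_schemas=false: no filesystem access, no warnings
    warnings: List[str] = []
    for name, meta in entries.items():
        schema = meta.get("schema")
        # Entries without a schema field are skipped, since schemas are optional.
        if schema and not (Path(root) / schema).is_file():
            warnings.append(f"Schema file missing for '{name}': {schema}")
    return warnings


entries = {"threat-model": {"schema": "schemas/artifacts/threat-model-schema.json"}}
print(find_missing_schemas(entries))
print(find_missing_schemas(entries, check_schemas=False))
```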

Related Skills

artifact.define - Define new artifact types
artifact.create - Create artifact files
skill.define - Validate skill manifests
registry.update - Update skill registry

References

Python difflib - Fuzzy string matching
Betty Artifact Registry - Source of truth for artifact types
Levenshtein Distance - String similarity algorithm
meta.skill Agent - Primary consumer of this skill
