Initial commit
This commit is contained in:
378
skills/artifact.validate.types/SKILL.md
Normal file
378
skills/artifact.validate.types/SKILL.md
Normal file
@@ -0,0 +1,378 @@
|
||||
# artifact.validate.types
|
||||
|
||||
## Overview
|
||||
|
||||
Validates artifact type names against the Betty Framework registry and returns complete metadata for each type. Provides intelligent fuzzy matching and suggestions for invalid types.
|
||||
|
||||
**Version**: 0.1.0
|
||||
**Status**: active
|
||||
|
||||
## Purpose
|
||||
|
||||
This skill is critical for ensuring skills reference valid artifact types before creation. It validates artifact type names against `registry/artifact_types.json`, retrieves complete metadata (file_pattern, content_type, schema), and suggests alternatives for invalid types using three fuzzy matching strategies:
|
||||
|
||||
1. **Singular/Plural Detection** (high confidence) - Detects "data-flow-diagram" vs "data-flow-diagrams"
|
||||
2. **Generic vs Specific Variants** (medium confidence) - Suggests "logical-data-model" for "data-model"
|
||||
3. **Levenshtein Distance** (low confidence) - Catches typos like "thret-model" → "threat-model"
|
||||
|
||||
This skill is specifically designed to be called by `meta.skill` during Step 2 (Validate Artifact Types) of the skill creation workflow.
|
||||
|
||||
## Inputs
|
||||
|
||||
| Parameter | Type | Required | Default | Description |
|
||||
|-----------|------|----------|---------|-------------|
|
||||
| `artifact_types` | array | Yes | - | List of artifact type names to validate |
|
||||
| `check_schemas` | boolean | No | `true` | Whether to verify schema files exist on filesystem |
|
||||
| `suggest_alternatives` | boolean | No | `true` | Whether to suggest similar types for invalid ones |
|
||||
| `max_suggestions` | number | No | `3` | Maximum number of suggestions per invalid type |
|
||||
|
||||
## Outputs
|
||||
|
||||
| Output | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `validation_results` | object | Validation results for each artifact type with complete metadata |
|
||||
| `all_valid` | boolean | Whether all artifact types are valid |
|
||||
| `invalid_types` | array | List of artifact types that don't exist in registry |
|
||||
| `suggestions` | object | Suggested alternatives for each invalid type |
|
||||
| `warnings` | array | List of warnings (e.g., schema file missing) |
|
||||
|
||||
## Artifact Metadata
|
||||
|
||||
### Produces
|
||||
- **validation-report** (`*.validation.json`) - Validation results with metadata and suggestions
|
||||
|
||||
### Consumes
|
||||
None - reads directly from registry files
|
||||
|
||||
## Usage
|
||||
|
||||
### Example 1: Validate Single Artifact Type
|
||||
|
||||
```bash
|
||||
python artifact_validate_types.py \
|
||||
--artifact_types '["threat-model"]' \
|
||||
--check_schemas true
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```json
|
||||
{
|
||||
"validation_results": {
|
||||
"threat-model": {
|
||||
"valid": true,
|
||||
"file_pattern": "*.threat-model.yaml",
|
||||
"content_type": "application/yaml",
|
||||
"schema": "schemas/artifacts/threat-model-schema.json",
|
||||
"description": "Threat model (STRIDE, attack trees)..."
|
||||
}
|
||||
},
|
||||
"all_valid": true,
|
||||
"invalid_types": [],
|
||||
"suggestions": {},
|
||||
"warnings": []
|
||||
}
|
||||
```
|
||||
|
||||
### Example 2: Invalid Type with Suggestions
|
||||
|
||||
```bash
|
||||
python artifact_validate_types.py \
|
||||
--artifact_types '["data-flow-diagram", "threat-model"]' \
|
||||
--suggest_alternatives true
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```json
|
||||
{
|
||||
"validation_results": {
|
||||
"data-flow-diagram": {
|
||||
"valid": false
|
||||
},
|
||||
"threat-model": {
|
||||
"valid": true,
|
||||
"file_pattern": "*.threat-model.yaml",
|
||||
"content_type": "application/yaml",
|
||||
"schema": "schemas/artifacts/threat-model-schema.json"
|
||||
}
|
||||
},
|
||||
"all_valid": false,
|
||||
"invalid_types": ["data-flow-diagram"],
|
||||
"suggestions": {
|
||||
"data-flow-diagram": [
|
||||
{
|
||||
"type": "data-flow-diagrams",
|
||||
"reason": "Plural form",
|
||||
"confidence": "high"
|
||||
},
|
||||
{
|
||||
"type": "dataflow-diagram",
|
||||
"reason": "Similar spelling",
|
||||
"confidence": "low"
|
||||
}
|
||||
]
|
||||
},
|
||||
"warnings": [],
|
||||
"ok": false,
|
||||
"status": "validation_failed"
|
||||
}
|
||||
```
|
||||
|
||||
### Example 3: Multiple Invalid Types with Generic → Specific Suggestions
|
||||
|
||||
```bash
|
||||
python artifact_validate_types.py \
|
||||
--artifact_types '["data-model", "api-spec", "test-result"]' \
|
||||
--max_suggestions 3
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```json
|
||||
{
|
||||
"all_valid": false,
|
||||
"invalid_types": ["data-model", "api-spec"],
|
||||
"suggestions": {
|
||||
"data-model": [
|
||||
{
|
||||
"type": "logical-data-model",
|
||||
"reason": "Specific variant of model",
|
||||
"confidence": "medium"
|
||||
},
|
||||
{
|
||||
"type": "physical-data-model",
|
||||
"reason": "Specific variant of model",
|
||||
"confidence": "medium"
|
||||
},
|
||||
{
|
||||
"type": "enterprise-data-model",
|
||||
"reason": "Specific variant of model",
|
||||
"confidence": "medium"
|
||||
}
|
||||
],
|
||||
"api-spec": [
|
||||
{
|
||||
"type": "openapi-spec",
|
||||
"reason": "Specific variant of spec",
|
||||
"confidence": "medium"
|
||||
},
|
||||
{
|
||||
"type": "asyncapi-spec",
|
||||
"reason": "Specific variant of spec",
|
||||
"confidence": "medium"
|
||||
}
|
||||
]
|
||||
},
|
||||
"validation_results": {
|
||||
"test-result": {
|
||||
"valid": true,
|
||||
"file_pattern": "*.test-result.json",
|
||||
"content_type": "application/json"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Example 4: Save Validation Report to File
|
||||
|
||||
```bash
|
||||
python artifact_validate_types.py \
|
||||
--artifact_types '["threat-model", "architecture-overview"]' \
|
||||
--output validation-results.validation.json
|
||||
```
|
||||
|
||||
Creates `validation-results.validation.json` with complete validation report.
|
||||
|
||||
## Integration with meta.skill
|
||||
|
||||
The `meta.skill` agent calls this skill in Step 2 of its workflow:
|
||||
|
||||
```yaml
|
||||
# meta.skill workflow Step 2
|
||||
2. **Validate Artifact Types**
|
||||
- Extract artifact types from skill description
|
||||
- Call artifact.validate.types with all types
|
||||
- If all_valid == false:
|
||||
→ Display suggestions to user
|
||||
→ Ask user to confirm correct types
|
||||
→ HALT until types are validated
|
||||
- Store validated metadata for use in skill.yaml generation
|
||||
```
|
||||
|
||||
**Example Integration:**
|
||||
|
||||
```python
|
||||
# meta.skill calls artifact.validate.types
|
||||
result = subprocess.run([
|
||||
'python', 'skills/artifact.validate.types/artifact_validate_types.py',
|
||||
'--artifact_types', json.dumps(["threat-model", "data-flow-diagrams"]),
|
||||
'--suggest_alternatives', 'true'
|
||||
], capture_output=True, text=True)
|
||||
|
||||
validation = json.loads(result.stdout)
|
||||
|
||||
if not validation['all_valid']:
|
||||
print(f"❌ Invalid artifact types: {validation['invalid_types']}")
|
||||
for invalid_type, suggestions in validation['suggestions'].items():
|
||||
print(f"\n Suggestions for '{invalid_type}':")
|
||||
for s in suggestions:
|
||||
print(f" - {s['type']} ({s['confidence']} confidence): {s['reason']}")
|
||||
# HALT skill creation
|
||||
else:
|
||||
print("✅ All artifact types validated")
|
||||
# Continue with skill.yaml generation using validated metadata
|
||||
```
|
||||
|
||||
## Fuzzy Matching Strategies
|
||||
|
||||
### Strategy 1: Singular/Plural Detection (High Confidence)
|
||||
|
||||
Detects when a user forgets the "s":
|
||||
|
||||
| Invalid Type | Suggested Type | Reason |
|
||||
|-------------|----------------|--------|
|
||||
| `data-flow-diagram` | `data-flow-diagrams` | Plural form |
|
||||
| `threat-models` | `threat-model` | Singular form |
|
||||
|
||||
### Strategy 2: Generic vs Specific Variants (Medium Confidence)
|
||||
|
||||
Suggests specific variants when a generic term is used:
|
||||
|
||||
| Invalid Type | Suggested Types |
|
||||
|-------------|-----------------|
|
||||
| `data-model` | `logical-data-model`, `physical-data-model`, `enterprise-data-model` |
|
||||
| `api-spec` | `openapi-spec`, `asyncapi-spec`, `graphql-spec` |
|
||||
| `architecture-diagram` | `system-architecture-diagram`, `component-architecture-diagram` |
|
||||
|
||||
### Strategy 3: Levenshtein Distance (Low Confidence)
|
||||
|
||||
Catches typos and misspellings (60%+ similarity):
|
||||
|
||||
| Invalid Type | Suggested Type | Similarity |
|
||||
|-------------|----------------|------------|
|
||||
| `thret-model` | `threat-model` | ~90% |
|
||||
| `architecure-overview` | `architecture-overview` | ~85% |
|
||||
| `api-specfication` | `api-specification` | ~92% |
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Missing Registry File
|
||||
|
||||
```json
|
||||
{
|
||||
"ok": false,
|
||||
"status": "error",
|
||||
"error": "Artifact registry not found: registry/artifact_types.json"
|
||||
}
|
||||
```
|
||||
|
||||
**Resolution**: Ensure you're running from the Betty Framework root directory.
|
||||
|
||||
### Invalid JSON in artifact_types Parameter
|
||||
|
||||
```json
|
||||
{
|
||||
"ok": false,
|
||||
"status": "error",
|
||||
"error": "Invalid JSON: Expecting ',' delimiter: line 1 column 15 (char 14)"
|
||||
}
|
||||
```
|
||||
|
||||
**Resolution**: Ensure artifact_types is a valid JSON array with proper quoting.
|
||||
|
||||
### Corrupted Registry File
|
||||
|
||||
```json
|
||||
{
|
||||
"ok": false,
|
||||
"status": "error",
|
||||
"error": "Invalid JSON in registry file: ..."
|
||||
}
|
||||
```
|
||||
|
||||
**Resolution**: Validate and fix `registry/artifact_types.json` syntax.
|
||||
|
||||
## Performance
|
||||
|
||||
- **Single type validation**: <100ms
|
||||
- **20 types validation**: <1 second
|
||||
- **All 409 types validation**: <5 seconds
|
||||
|
||||
Memory usage is minimal as registry is loaded once and indexed by name for O(1) lookups.
|
||||
|
||||
## Dependencies
|
||||
|
||||
- **Python 3.7+**
|
||||
- **PyYAML** - For reading registry
|
||||
- **difflib** - For fuzzy matching (Python stdlib)
|
||||
- **jsonschema** - For validation (optional)
|
||||
|
||||
## Testing
|
||||
|
||||
Run the test suite:
|
||||
|
||||
```bash
|
||||
cd skills/artifact.validate.types
|
||||
python test_artifact_validate_types.py
|
||||
```
|
||||
|
||||
**Test Coverage:**
|
||||
- ✅ Valid artifact type validation
|
||||
- ✅ Invalid artifact type detection
|
||||
- ✅ Singular/plural suggestion
|
||||
- ✅ Generic → specific suggestion
|
||||
- ✅ Typo detection with Levenshtein distance
|
||||
- ✅ Max suggestions limit
|
||||
- ✅ Schema file existence checking
|
||||
- ✅ Empty input handling
|
||||
- ✅ Mixed valid/invalid types
|
||||
|
||||
## Quality Standards
|
||||
|
||||
- **Accuracy**: 100% for exact matches in registry
|
||||
- **Suggestion Quality**: >80% relevant for common mistakes
|
||||
- **Performance**: <1s for 20 types, <100ms for single type
|
||||
- **Schema Verification**: 100% accurate file existence check
|
||||
- **Error Handling**: Graceful handling of corrupted registry files
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- ✅ Validates all 409 artifact types correctly
|
||||
- ✅ Provides accurate suggestions for common mistakes (singular/plural)
|
||||
- ✅ Returns exact metadata from registry (file_pattern, content_type, schema)
|
||||
- ✅ Detects missing schema files and warns appropriately
|
||||
- ✅ Completes validation in <1 second for up to 20 types
|
||||
- ✅ Fuzzy matching handles typos within 40% character difference
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Skill returns all_valid=false but I think types are correct
|
||||
|
||||
1. Check the exact spelling in `registry/artifact_types.json`
|
||||
2. Look at suggestions - they often reveal plural/singular issues
|
||||
3. Use `jq` to search registry:
|
||||
```bash
|
||||
jq '.artifact_types[] | select(.name | contains("your-search"))' registry/artifact_types.json
|
||||
```
|
||||
|
||||
### Fuzzy matching isn't suggesting the type I expect
|
||||
|
||||
1. Check if the type name follows patterns (ending in common suffix like "-model", "-spec")
|
||||
2. Increase `max_suggestions` to see more options
|
||||
3. The type might be too dissimilar (< 60% match threshold)
|
||||
|
||||
### Schema warnings appearing for valid types
|
||||
|
||||
This is normal if schema files haven't been created yet. Schema files are optional for many artifact types. Set `check_schemas=false` to suppress these warnings.
|
||||
|
||||
## Related Skills
|
||||
|
||||
- **artifact.define** - Define new artifact types
|
||||
- **artifact.create** - Create artifact files
|
||||
- **skill.define** - Validate skill manifests
|
||||
- **registry.update** - Update skill registry
|
||||
|
||||
## References
|
||||
|
||||
- [Python difflib](https://docs.python.org/3/library/difflib.html) - Fuzzy string matching
|
||||
- [Betty Artifact Registry](../../registry/artifact_types.json) - Source of truth for artifact types
|
||||
- [Levenshtein Distance](https://en.wikipedia.org/wiki/Levenshtein_distance) - String similarity algorithm
|
||||
- [meta.skill Agent](../../agents/meta.skill/agent.yaml) - Primary consumer of this skill
|
||||
Reference in New Issue
Block a user