# docs.expand.glossary

**Version**: 0.1.0
**Status**: active

## Overview

The `docs.expand.glossary` skill automatically discovers undocumented terms from Betty manifests and documentation, then enriches `glossary.md` with auto-generated definitions. This ensures comprehensive documentation coverage and helps maintain consistency across the Betty ecosystem.

## Purpose

- Extract field names and values from `skill.yaml` and `agent.yaml` manifests
- Scan markdown documentation for capitalized terms that may need definitions
- Identify gaps in the existing glossary
- Auto-generate definitions for common technical terms
- Update `glossary.md` with new entries organized alphabetically
- Emit JSON summary of changes for auditing

## Inputs

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| `glossary_path` | string | No | `docs/glossary.md` | Path to the glossary file to expand |
| `base_dir` | string | No | Project root | Base directory to scan for manifests |
| `dry_run` | boolean | No | `false` | Preview changes without writing to file |
| `include_auto_generated` | boolean | No | `true` | Include auto-generated definitions |

## Outputs

| Name | Type | Description |
|------|------|-------------|
| `summary` | object | Summary with counts, file paths, and operation metadata |
| `new_definitions` | object | Dictionary mapping new terms to their definitions |
| `manifest_terms` | object | Categorized terms extracted from manifests |
| `skipped_terms` | array | Terms that were skipped (already documented or too common) |

## What Gets Scanned

### Manifest Files

**skill.yaml fields:**

- `status` values (active, draft, deprecated, archived)
- `runtime` values (python, javascript, bash)
- `permissions` (filesystem:read, filesystem:write, network:http)
- Input/output `type` values
- `entrypoints` parameters

**agent.yaml fields:**

- `reasoning_mode` (iterative, oneshot)
- `status` values
- `capabilities`
- Error handling strategies (on_validation_failure, etc.)
- Timeout and retry configurations

### Documentation

- Scans all `docs/*.md` files for capitalized multi-word phrases
- Identifies technical terms that may need glossary entries
- Filters out common words and overly generic terms

## How It Works

1. **Load Existing Glossary**: Parses `glossary.md` to identify already-documented terms
2. **Scan Manifests**: Recursively walks `skills/` and `agents/` directories for YAML files
3. **Extract Terms**: Collects field names, values, and configuration options from manifests
4. **Scan Docs**: Looks for capitalized terms in markdown documentation
5. **Generate Definitions**: Creates concise, accurate definitions for common technical terms
6. **Update Glossary**: Inserts new terms alphabetically into appropriate sections
7. **Report**: Returns JSON summary with all changes and statistics

## Auto-Generated Definitions

The skill includes predefined definitions for common terms:

- **Status values**: active, draft, deprecated, archived
- **Runtimes**: python, javascript, bash
- **Permissions**: filesystem:read, filesystem:write, network:http
- **Reasoning modes**: iterative, oneshot
- **Types**: string, boolean, integer, object, array
- **Configuration**: max_retries, timeout_seconds, blocking, fuzzy
- **Modes**: dry_run, strict, overwrite

For unknown terms, the skill can generate contextual definitions based on category and usage patterns.

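
The flow described in "How It Works" (load existing terms, collect candidates, skip known or common ones, look up predefined definitions) can be illustrated with a small sketch. This is not the skill's actual implementation: the function names, the `PREDEFINED` and `COMMON_WORDS` tables, and the assumption that glossary entries are `### Term` headings are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical subset of the predefined definitions (the real table lives in
# the skill). The wording follows the sample output shown in this document.
PREDEFINED = {
    "archived": "A status indicating that a component has been retired "
                "and is no longer maintained or available.",
    "dry_run": "A mode that previews an operation without actually "
               "executing it or making changes.",
}

# Hypothetical skip list of words too common to deserve a glossary entry.
COMMON_WORDS = {"name", "version", "description"}

# Assumption: each glossary entry is introduced by a "### Term" heading.
HEADING_RE = re.compile(r"^###\s+(.+?)\s*$", re.MULTILINE)
# Two or more consecutive capitalized words, e.g. "Dry Run".
PHRASE_RE = re.compile(r"\b[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)+\b")


def normalize(term):
    """Map 'Dry Run', 'dry-run' and 'dry_run' to the same lookup key."""
    return term.strip().lower().replace(" ", "_").replace("-", "_")


def display_name(key):
    """Turn a lookup key such as 'dry_run' into a heading such as 'Dry Run'."""
    return key.replace("_", " ").title()


def existing_terms(glossary_text):
    """Return the normalized keys of terms already present in the glossary."""
    return {normalize(heading) for heading in HEADING_RE.findall(glossary_text)}


def capitalized_phrases(markdown_text):
    """Collect capitalized multi-word phrases as glossary candidates."""
    return set(PHRASE_RE.findall(markdown_text))


def find_new_definitions(glossary_text, candidates):
    """Split candidates into (new_definitions, skipped_terms)."""
    known = existing_terms(glossary_text)
    new_definitions, skipped = {}, []
    for term in sorted(candidates, key=str.lower):
        key = normalize(term)
        if key in known or key in COMMON_WORDS or len(key) <= 1:
            skipped.append(term)  # already documented or too generic
        elif key in PREDEFINED:
            new_definitions[display_name(key)] = PREDEFINED[key]
        else:
            skipped.append(term)  # this sketch has no fallback generator
    return new_definitions, skipped
```

Calling `find_new_definitions(glossary_text, {"archived", "dry_run", "Handler", "name"})` against a glossary that already contains a `### Handler` entry would return definitions for "Archived" and "Dry Run" and report "Handler" and "name" as skipped. Note that the phrase pattern also picks up capitalized sentence openers (for example "Use Dry Run"), so a real scanner would need extra filtering.
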
## Usage Examples

### Basic Usage

```bash
# Expand glossary with all undocumented terms
python glossary_expand.py

# Preview changes without writing
python glossary_expand.py --dry-run

# Use custom glossary location
python glossary_expand.py --glossary-path /path/to/glossary.md
```

### Programmatic Usage

```python
from skills.docs.expand.glossary.glossary_expand import expand_glossary

# Expand glossary
result = expand_glossary(
    glossary_path="docs/glossary.md",
    dry_run=False,
    include_auto_generated=True
)

# Check results
if result['ok']:
    summary = result['details']['summary']
    print(f"Added {summary['new_terms_count']} new terms")
    print(f"New terms: {summary['new_terms']}")
```

### Output Format

#### Summary Mode (Default)

```
================================================================================
GLOSSARY EXPANSION SUMMARY
================================================================================

Glossary: /home/user/betty/docs/glossary.md
Existing terms: 45
New terms added: 8
Scanned: 25 skills, 2 agents

--------------------------------------------------------------------------------
NEW TERMS:
--------------------------------------------------------------------------------

### Archived
A status indicating that a component has been retired and is no longer maintained or available.

### Dry Run
A mode that previews an operation without actually executing it or making changes.

### Handler
The script or function that implements the core logic of a skill or operation.

...

--------------------------------------------------------------------------------

Glossary updated successfully!

================================================================================
```

#### JSON Mode

```json
{
  "ok": true,
  "status": "success",
  "timestamp": "2025-10-23T19:54:00Z",
  "details": {
    "summary": {
      "glossary_path": "docs/glossary.md",
      "existing_terms_count": 45,
      "new_terms_count": 8,
      "new_terms": ["Archived", "Dry Run", "Handler", ...],
      "scanned_files": {
        "skills": 25,
        "agents": 2
      }
    },
    "new_definitions": {
      "Archived": "A status indicating...",
      "Dry Run": "A mode that previews...",
      ...
    },
    "manifest_terms": {
      "status": ["active", "draft", "deprecated", "archived"],
      "runtime": ["python", "bash"],
      "permissions": ["filesystem:read", "filesystem:write"]
    }
  }
}
```

## Integration

### With CI/CD

```yaml
# .github/workflows/docs-check.yml
- name: Check glossary completeness
  run: |
    # Fail if new terms found.
    # Assumes the script exits non-zero when undocumented terms are detected.
    # The command is tested directly in the `if` so that the default
    # `bash -e` shell does not abort the step before the message is printed.
    if python skills/docs.expand.glossary/glossary_expand.py --dry-run; then
      echo "Glossary is complete"
    else
      echo "Missing glossary terms - run skill to update"
      exit 1
    fi
```

### As a Hook

The skill can be integrated as a pre-commit hook to keep the glossary current:

```yaml
# .claude/hooks.yaml
- name: glossary-completeness-check
  event: on_commit
  command: python skills/docs.expand.glossary/glossary_expand.py --dry-run
  blocking: false
```

## Skipped Terms

The skill automatically skips:

- Terms already in the glossary
- Common words (name, version, description, etc.)
- Generic types (string, boolean, file, path, etc.)
- Single-character or overly generic terms

## Limitations

- Auto-generated definitions may need manual refinement for domain-specific terms
- Complex or nuanced terms may require human review
- Alphabetical insertion may need manual adjustment for optimal organization
- Does not detect duplicate or inconsistent definitions

## Future Enhancements

- Detect and flag duplicate definitions
- Identify outdated or inconsistent glossary entries
- Generate contextual definitions using LLM analysis
- Support for multi-language glossaries
- Integration with documentation linting tools

## Dependencies

- `context.schema` - For validating manifest structure

## Tags

`documentation`, `glossary`, `automation`, `analysis`, `manifests`

## Related Skills

- `generate.docs` - Generate SKILL.md documentation from manifests
- `registry.query` - Query registries for specific terms and metadata
- `skill.define` - Define and register new skills

## See Also

- [Glossary](../../docs/glossary.md) - The Betty Framework glossary
- [Contributing](../../docs/contributing.md) - Documentation contribution guidelines
- [Developer Guide](../../docs/developer-guide.md) - Building and extending Betty