10 KiB
telemetry.capture
Version: 0.1.0 Status: Active Tags: telemetry, logging, observability, audit
Overview
The telemetry.capture skill provides comprehensive execution logging for Betty Framework components. It captures usage metrics, execution status, timing data, and contextual metadata in a structured, thread-safe manner.
All telemetry data is written to /registry/telemetry.json with ISO timestamps and validated JSON schema.
Features
- Thread-safe JSON logging using file locking (fcntl)
- ISO 8601 timestamp formatting with timezone support
- Structured telemetry entries with validation
- Query interface for telemetry analysis
- Decorator pattern for automatic capture (
@capture_telemetry) - Context manager pattern for manual capture
- CLI and programmatic interfaces
- Input sanitization (exclude secrets)
Purpose
This skill enables:
- Observability: Track execution patterns across Betty components
- Performance Monitoring: Measure duration of skill executions
- Error Tracking: Capture failures with detailed error messages
- Usage Analytics: Understand which skills are used most frequently
- Audit Trail: Maintain compliance and debugging history
- Workflow Analysis: Trace caller chains and dependencies
Usage
Basic CLI Usage
# Capture a successful execution
python skills/telemetry.capture/telemetry_capture.py plugin.build success 320 CLI
# Capture with inputs
python skills/telemetry.capture/telemetry_capture.py \
agent.run success 1500 API '{"agent": "api.designer", "task": "design_api"}'
# Capture a failure
python skills/telemetry.capture/telemetry_capture.py \
workflow.compose failure 2800 CLI '{"workflow": "api_first"}' "Validation failed at step 3"
As a Decorator (Recommended)
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent / "../telemetry.capture"))
from telemetry_utils import capture_telemetry
@capture_telemetry(skill_name="agent.run", caller="CLI", capture_inputs=True)
def run_agent(agent_path: str, task_context: str = None):
"""Execute a Betty agent."""
# ... implementation
return {"status": "success", "result": execution_result}
# Usage
result = run_agent("/agents/api.designer", "Design user authentication API")
# Telemetry is automatically captured
As a Context Manager
from telemetry_utils import TelemetryContext
def build_plugin(plugin_path: str):
with TelemetryContext(skill="plugin.build", caller="CLI") as ctx:
ctx.set_inputs({"plugin_path": plugin_path})
try:
# Perform build operations
result = create_plugin_archive(plugin_path)
ctx.set_status("success")
return result
except Exception as e:
ctx.set_error(str(e))
raise
Programmatic API
from telemetry_capture import TelemetryCapture
telemetry = TelemetryCapture()
# Capture an event
entry = telemetry.capture(
skill="plugin.build",
status="success",
duration_ms=320.5,
caller="CLI",
inputs={"plugin_path": "./plugin.yaml", "output_format": "tar.gz"},
metadata={"user": "developer@example.com", "environment": "production"}
)
print(f"Captured: {entry['timestamp']}")
Query Telemetry Data
from telemetry_capture import TelemetryCapture
telemetry = TelemetryCapture()
# Query recent failures
failures = telemetry.query(status="failure", limit=10)
# Query specific skill usage
agent_runs = telemetry.query(skill="agent.run", limit=50)
# Query by caller
cli_executions = telemetry.query(caller="CLI", limit=100)
for entry in failures:
print(f"{entry['timestamp']}: {entry['skill']} - {entry['error_message']}")
Parameters
Capture Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
skill |
string | Yes | Name of the skill/component (e.g., 'plugin.build') |
status |
string | Yes | Execution status: success, failure, timeout, error, pending |
duration_ms |
number | Yes | Execution duration in milliseconds |
caller |
string | Yes | Source of the call (CLI, API, workflow.compose) |
inputs |
object | No | Sanitized input parameters (default: {}) |
error_message |
string | No | Error message if status is failure/error |
metadata |
object | No | Additional context (user, session_id, environment) |
Decorator Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
skill_name |
string | function name | Override skill name |
caller |
string | "runtime" | Caller identifier |
capture_inputs |
boolean | False | Whether to capture function arguments |
sanitize_keys |
list | None | Parameter names to redact (e.g., ['password']) |
Output Format
Telemetry Entry Structure
{
"timestamp": "2025-10-24T14:30:45.123456+00:00",
"skill": "plugin.build",
"status": "success",
"duration_ms": 320.5,
"caller": "CLI",
"inputs": {
"plugin_path": "./plugin.yaml",
"output_format": "tar.gz"
},
"error_message": null,
"metadata": {
"user": "developer@example.com",
"environment": "production"
}
}
Telemetry File Structure
[
{
"timestamp": "2025-10-24T14:30:45.123456+00:00",
"skill": "plugin.build",
"status": "success",
"duration_ms": 320.5,
"caller": "CLI",
"inputs": {
"plugin_path": "./plugin.yaml",
"output_format": "tar.gz"
},
"error_message": null,
"metadata": {}
},
{
"timestamp": "2025-10-24T14:32:10.789012+00:00",
"skill": "agent.run",
"status": "success",
"duration_ms": 1500.0,
"caller": "API",
"inputs": {
"agent": "api.designer"
},
"error_message": null,
"metadata": {}
}
]
Note: The telemetry file is a simple JSON array for efficient querying and compatibility with existing Betty Framework tools.
Examples
Example 1: Capture Plugin Build
python skills/telemetry.capture/telemetry_capture.py \
plugin.build success 320 CLI '{"plugin_path": "./plugin.yaml", "output_format": "tar.gz"}'
Output:
{
"timestamp": "2025-10-24T14:30:45.123456+00:00",
"skill": "plugin.build",
"status": "success",
"duration_ms": 320.0,
"caller": "CLI",
"inputs": {
"plugin_path": "./plugin.yaml",
"output_format": "tar.gz"
},
"error_message": null,
"metadata": {}
}
✓ Telemetry captured to /home/user/betty/registry/telemetry.json
Example 2: Capture Agent Execution with Decorator
# In skills/agent.run/agent_run.py
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent / "../telemetry.capture"))
from telemetry_utils import capture_telemetry
@capture_telemetry(
skill_name="agent.run",
caller="CLI",
capture_inputs=True,
sanitize_keys=["api_key", "password"]
)
def main():
"""Execute agent with automatic telemetry capture."""
# ... existing implementation
return {"status": "success", "result": result}
if __name__ == "__main__":
main()
Example 3: Query Recent Failures
from telemetry_capture import TelemetryCapture
telemetry = TelemetryCapture()
failures = telemetry.query(status="failure", limit=10)
print("Recent Failures:")
for entry in failures:
print(f" [{entry['timestamp']}] {entry['skill']}")
print(f" Error: {entry['error_message']}")
print(f" Duration: {entry['duration_ms']}ms")
print()
Error Handling
Invalid Status Value
# Raises ValueError
telemetry.capture(
skill="test.skill",
status="invalid_status", # Must be: success, failure, timeout, error, pending
duration_ms=100,
caller="CLI"
)
Error: ValueError: Invalid status: invalid_status. Must be one of: success, failure, timeout, error, pending
Malformed Input JSON (CLI)
python skills/telemetry.capture/telemetry_capture.py \
plugin.build success 320 CLI '{invalid json}'
Error: Error: Invalid JSON for inputs: Expecting property name enclosed in double quotes
File Locking Contention
The implementation uses fcntl.flock for thread-safe writes. If multiple processes write simultaneously:
- Writes are serialized automatically
- No data loss occurs
- Performance may degrade under heavy contention
Dependencies
This skill has no external dependencies beyond Python standard library:
json- JSON parsing and serializationfcntl- File locking for thread safetydatetime- ISO 8601 timestamp generationpathlib- Path handlingtyping- Type annotations
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
BETTY_TELEMETRY_FILE |
/home/user/betty/registry/telemetry.json |
Path to telemetry file |
Custom Telemetry File
from telemetry_capture import TelemetryCapture
# Use custom location
telemetry = TelemetryCapture(telemetry_file="/custom/path/telemetry.json")
Troubleshooting
Q: Telemetry file doesn't exist
A: The skill automatically creates the telemetry file on first use. Ensure:
- The parent directory exists or can be created
- Write permissions are granted
- Path is absolute or correctly relative
Q: Decorator not capturing telemetry
A: Ensure you:
- Import the decorator correctly
- Add the parent path to sys.path if needed
- Check that the decorated function completes (doesn't hang)
- Verify file permissions on
/registry/telemetry.json
Q: How to exclude sensitive data?
A: Use sanitize_keys parameter:
@capture_telemetry(
skill_name="auth.login",
capture_inputs=True,
sanitize_keys=["password", "api_key", "secret_token"]
)
def login(username: str, password: str):
# password will be redacted as "***REDACTED***"
pass
Q: Performance impact of telemetry?
A: Minimal impact:
- Decorator adds <1ms overhead per call
- File I/O is buffered and atomic
- No network calls
- Consider async writes for high-throughput scenarios
Integration with Betty Framework
The telemetry.capture skill integrates with:
- agent.run: Logs agent executions with task context
- workflow.compose: Traces multi-step workflow chains
- plugin.build: Monitors build performance
- api.define: Tracks API creation events
- skill.define: Captures skill registration
- audit.log: Complements audit trail with performance metrics
All core Betty components should use the @capture_telemetry decorator for consistent observability.
License
Part of the Betty Framework. See repository LICENSE.