# Cache Structure Documentation

This directory contains the multi-tier caching system for the BAML Code Generation Skill.

## Cache Files

### patterns.json

**Purpose**: Stores discovered BAML patterns from the baml_Examples MCP server.

**Initial State**: Empty arrays and null values. After the first MCP query, the structure will be populated.

**Structure After First Query**:
```json
{
  "_comment": "Initial cache state. Timestamps and patterns will be updated automatically on first MCP query.",
  "version": "1.0.0",
  "last_updated": "2025-01-25T00:00:00Z",  // Updated on each refresh
  "repository_commit": "abc123def456...",  // Git commit hash from BoundaryML/baml-examples
  "patterns": {
    "extraction": [
      {
        "id": "uuid-v4",
        "name": "invoice_extraction",
        "description": "Extract structured data from invoices",
        "components": {
          "types": ["Invoice", "LineItem", "Address"],
          "functions": ["ExtractInvoice"],
          "clients": ["gpt-4o"]
        },
        "hit_count": 0,
        "last_used": "ISO-8601 timestamp"
      }
    ],
    "classification": [ /* Similar structure */ ],
    "rag": [ /* Similar structure */ ],
    "agents": [ /* Similar structure */ ]
  },
  "metadata": {
    "total_patterns": 150,  // Calculated dynamically
    "cache_size_bytes": 245678,
    "last_refresh": "2025-01-25T12:34:56Z"
  }
}
```

See `lib/pattern_library.md` lines 13-30 for the complete schema definition.
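
To illustrate how `metadata.total_patterns` could be derived from the `patterns` object, here is a minimal Python sketch. The helper name `summarize_patterns` is hypothetical and not part of the skill, and the inline sample assumes the file on disk is plain JSON without the explanatory comments shown above.

```python
import json


def summarize_patterns(cache: dict) -> dict:
    """Count cached patterns per category and in total (hypothetical helper)."""
    patterns = cache.get("patterns") or {}
    # In the initial state a category may be an empty list (or null).
    per_category = {name: len(entries or []) for name, entries in patterns.items()}
    return {
        "per_category": per_category,
        "total_patterns": sum(per_category.values()),
    }


# Inline sample shaped like the structure above. Comments are dropped because
# the standard json module rejects them.
SAMPLE = json.loads("""
{
  "version": "1.0.0",
  "patterns": {
    "extraction": [{"id": "uuid-v4", "name": "invoice_extraction", "hit_count": 0}],
    "classification": [],
    "rag": [],
    "agents": []
  },
  "metadata": {"total_patterns": 1}
}
""")

summary = summarize_patterns(SAMPLE)
print(summary["per_category"])
print(summary["total_patterns"])
```

Recomputing the total from the category arrays, rather than trusting the stored value, is one way to spot a cache file whose metadata has drifted from its contents.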
### query_cache.json

**Purpose**: Caches recent MCP query results to avoid redundant API calls.

**Initial State**: Empty queries array. Gets populated as MCP queries are made.

**Structure After Use**:
```json
{
  "version": "1.0.0",
  "queries": [
    {
      "query": "search_baml_documentation(\"enum syntax\")",
      "server": "baml_Docs",
      "result": { /* MCP response */ },
      "timestamp": "ISO-8601",
      "ttl": 900  // Time-to-live in seconds
    }
  ],
  "metadata": {
    "total_queries": 42,
    "cache_hits": 18,
    "cache_misses": 24,
    "hit_rate": 0.43
  }
}
```
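
The `timestamp`, `ttl`, and `hit_rate` fields can be interpreted as in the following Python sketch. The function names are hypothetical. It also assumes that an entry counts as expired exactly `ttl` seconds after its `timestamp`, and that `hit_rate` is `cache_hits` divided by hits plus misses, rounded to two decimals; neither rule is confirmed by this document.

```python
from datetime import datetime, timedelta, timezone


def is_expired(entry: dict, now: datetime) -> bool:
    """True once `ttl` seconds have elapsed since the entry was cached."""
    # Normalise a trailing "Z" so older Python versions can parse it.
    cached_at = datetime.fromisoformat(entry["timestamp"].replace("Z", "+00:00"))
    return now >= cached_at + timedelta(seconds=entry["ttl"])


def hit_rate(metadata: dict) -> float:
    """Share of lookups answered from the cache, rounded to two decimals."""
    total = metadata["cache_hits"] + metadata["cache_misses"]
    return round(metadata["cache_hits"] / total, 2) if total else 0.0


entry = {"timestamp": "2025-01-25T12:00:00Z", "ttl": 900}
before = datetime(2025, 1, 25, 12, 14, 59, tzinfo=timezone.utc)
after = datetime(2025, 1, 25, 12, 15, 0, tzinfo=timezone.utc)

print(is_expired(entry, before), is_expired(entry, after))
print(hit_rate({"cache_hits": 18, "cache_misses": 24}))
```

With the sample metadata above, 18 hits out of 42 queries gives the 0.43 shown in the JSON.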
### syntax.json

**Purpose**: Caches current BAML syntax specifications from baml_Docs MCP server.

**Initial State**: Empty syntax definitions. Populated on first validation query.

**Structure After Use**:
```json
{
  "version": "1.0.0",
  "last_updated": "2025-01-25T00:00:00Z",
  "syntax": {
    "types": {
      "class": { /* Syntax spec from BoundaryML/baml */ },
      "enum": { /* Syntax spec */ }
    },
    "functions": { /* Function syntax specs */ },
    "clients": { /* Client configuration specs */ },
    "tests": { /* Test syntax specs */ }
  },
  "repository_commit": "xyz789abc123...",  // Git commit hash from BoundaryML/baml
  "metadata": {
    "spec_version": "0.x.x",  // BAML language version
    "last_validated": "ISO-8601"
  }
}
```
## Cache Tiers

The caching system operates in 4 tiers (see `lib/cache_manager.md` for details):

- **Tier 1**: Session memory (loaded patterns, <1ms access)
- **Tier 2**: LRU memory cache (recent queries, <10ms access)
- **Tier 3**: File system (these JSON files, <50ms access)
- **Tier 4**: Live MCP queries (fallback, <500ms access)
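
The tier ordering above amounts to a fall-through lookup: each tier is consulted in turn and the first one holding the key answers. The Python sketch below illustrates that idea only. The dictionaries and `live_mcp_query` are hypothetical stand-ins, and the actual algorithm (including any promotion of results into faster tiers) is described in `lib/cache_manager.md`, not here.

```python
from typing import Callable, Optional

Lookup = Callable[[str], Optional[dict]]


def tiered_lookup(key: str, tiers: list[tuple[str, Lookup]]):
    """Return (tier_name, value) from the first tier that has the key."""
    for name, lookup in tiers:
        value = lookup(key)
        if value is not None:
            return name, value
    return None, None


# Hypothetical stand-ins for the four tiers.
session_memory: dict[str, dict] = {}
lru_cache: dict[str, dict] = {}
disk_cache = {"enum syntax": {"source": "syntax.json"}}


def live_mcp_query(key: str) -> dict:
    # Placeholder for a real MCP call; always "answers" in this sketch.
    return {"source": "mcp", "query": key}


tiers = [
    ("tier1_session", session_memory.get),
    ("tier2_lru", lru_cache.get),
    ("tier3_disk", disk_cache.get),
    ("tier4_mcp", live_mcp_query),
]

print(tiered_lookup("enum syntax", tiers))
print(tiered_lookup("class syntax", tiers))
```

Writing a Tier 4 result back into Tiers 1 to 3 is deliberately omitted; a real implementation would likely do so to keep later lookups fast.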
## Maintenance

- **Automatic Updates**: Repository monitor runs daily at 00:00 UTC
- **Manual Refresh**: Delete cache files to force fresh MCP queries
- **Size Limits**: Tier 3 (disk) capped at 100MB total
- **TTL**: Query cache entries expire after 900 seconds (15 minutes)
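
A manual refresh and a disk-usage check can be scripted along the lines of the Python sketch below. The helper names are hypothetical, the three file names are the ones documented above, and the cap is assumed to mean 100 MiB since the document does not say whether "100MB" is decimal or binary. The demo runs in a temporary directory so that no real cache files are touched.

```python
import tempfile
from pathlib import Path

CACHE_FILES = ("patterns.json", "query_cache.json", "syntax.json")
DISK_CAP_BYTES = 100 * 1024 * 1024  # assumption: "100MB" read as 100 MiB


def cache_size_bytes(cache_dir: Path) -> int:
    """Total size of the cache files that currently exist."""
    paths = (cache_dir / name for name in CACHE_FILES)
    return sum(p.stat().st_size for p in paths if p.exists())


def force_refresh(cache_dir: Path, names=CACHE_FILES) -> list[str]:
    """Delete the named cache files and return those actually removed."""
    removed = []
    for name in names:
        path = cache_dir / name
        if path.exists():
            path.unlink()
            removed.append(name)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    (cache_dir / "patterns.json").write_text("{}")
    (cache_dir / "query_cache.json").write_text('{"queries": []}')
    used = cache_size_bytes(cache_dir)
    print(f"{used} bytes used, over cap: {used > DISK_CAP_BYTES}")
    print(force_refresh(cache_dir, ["query_cache.json"]))
```

Passing an explicit list to `force_refresh` lets one file be refreshed without discarding the others.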
## Troubleshooting

**Cache not populating?**
- Ensure MCP servers are configured (baml_Examples and baml_Docs)
- Check MCP connectivity with a test query
- Review logs for MCP query errors

**Stale patterns?**
- Delete `patterns.json` to force refresh
- Check `last_updated` timestamp in the file
- Verify `repository_commit` matches latest BoundaryML repos

**Large cache files?**
- Review `metadata.cache_size_bytes` in `patterns.json`
- Oldest/least-used patterns are auto-pruned at 100MB
- Manually delete `query_cache.json` if it grows too large
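
For the stale-pattern check, the `last_updated` timestamp can be compared against the current time, as in this Python sketch. The helper `looks_stale` is hypothetical, and the one-day threshold is an assumption based on the daily repository monitor run; a null `last_updated` (the initial state) is treated as stale.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def looks_stale(
    last_updated: Optional[str],
    now: datetime,
    max_age: timedelta = timedelta(days=1),  # assumed threshold
) -> bool:
    """True if the cache was never populated or is older than max_age."""
    if last_updated is None:
        return True
    # Normalise a trailing "Z" so older Python versions can parse it.
    updated_at = datetime.fromisoformat(last_updated.replace("Z", "+00:00"))
    return now - updated_at > max_age


now = datetime(2025, 1, 26, 12, 0, tzinfo=timezone.utc)
print(looks_stale("2025-01-25T00:00:00Z", now))
print(looks_stale("2025-01-26T00:00:00Z", now))
print(looks_stale(None, now))
```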
## Related Documentation

- **Pattern Library**: `lib/pattern_library.md` - Schema definitions
- **Cache Manager**: `lib/cache_manager.md` - Caching algorithms
- **MCP Interface**: `lib/mcp_interface.md` - MCP server integration
- **Repository Monitor**: `lib/repository_monitor.md` - Auto-update system