# Progressive Disclosure Pattern (MANDATORY)

**Core Principle**: Find the smallest set of high-signal tokens first (index format), then drill down to full details only for relevant items.

## The 4-Step Workflow

### Step 1: Start with Index Format

**Action:**
- Use `format=index` (default in most operations)
- Set `limit` to 3-5 (not 20)
- Review titles and dates ONLY

**Token Cost:** ~50-100 tokens per result

**Why:** Minimal token investment for maximum signal. Get an overview before committing to full details.

**Example:**
```bash
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=index&limit=5"
```

**Response:**
```json
{
  "query": "authentication",
  "count": 5,
  "format": "index",
  "results": [
    {
      "id": 1234,
      "type": "feature",
      "title": "Implemented JWT authentication",
      "subtitle": "Added token-based auth with refresh tokens",
      "created_at_epoch": 1699564800000,
      "project": "api-server"
    }
  ]
}
```

### Step 2: Identify Relevant Items

**Cognitive Task:**
- Scan index results for relevance
- Note which items need full details
- Discard irrelevant items

**Why:** Human-in-the-loop filtering before expensive operations. Don't load full details for items you'll ignore.

### Step 3: Request Full Details (Selectively)

**Action:**
- Use `format=full` ONLY for specific items of interest
- Target by ID or use a refined search query

**Token Cost:** ~500-1000 tokens per result

**Principle:** Load only what you need

**Example:**
```bash
# After reviewing the index, fetch full details for one item only,
# e.g. observation #1234 at offset 2 in the results
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=full&limit=1&offset=2"
```

**Why:** Targeted token expenditure with high ROI. The 10x cost difference means selectivity matters.
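
The two phases above can be sketched as a small shell helper. This is a minimal sketch, not part of the API: `search_url` is a hypothetical function name, and it only assembles the `query`, `format`, `limit`, and `offset` parameters shown in the examples. It assumes the query string is already URL-safe.

```bash
# Hypothetical helper: build a search URL for the observations endpoint.
# Only query, format, limit, and offset come from the examples in this document;
# the function name and argument order are illustrative assumptions.
# The query is inserted as-is, so it must already be URL-safe.
search_url() {
  # $1=query  $2=format (index|full)  $3=limit  $4=offset (optional)
  url="http://localhost:37777/api/search/observations?query=$1&format=$2&limit=$3"
  if [ -n "${4:-}" ]; then
    url="$url&offset=$4"
  fi
  printf '%s\n' "$url"
}

# Phase 1: cheap overview of the top 5 results.
search_url authentication index 5

# Phase 2: full details for exactly one result, chosen after reviewing the index
# (here the result at offset 2).
search_url authentication full 1 2
```

Printing the URL keeps the sketch runnable without a server. Wrapping a call as `curl -s "$(search_url authentication index 5)"` would issue the request.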

### Step 4: Refine with Filters (If Needed)

**Techniques:**
- Use `type`, `dateRange`, `concepts`, `files` filters
- Narrow scope BEFORE requesting more results
- Use `offset` for pagination instead of large limits

**Why:** Reduce result set first, then expand selectively. Don't load 20 results when filters could narrow to 3.
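
One way to combine a filter with offset-based paging is sketched below. `page_urls` is a hypothetical helper, and two details are assumptions rather than documented behavior: that `type` is passed as a plain query parameter (`type=feature` mirrors the `type` field in the index response), and that `offset` counts results rather than pages.

```bash
# Hypothetical sketch: print index-format URLs for successive small pages.
# ASSUMPTIONS: `type` is a plain query parameter, and `offset` counts results.
# Neither is confirmed by this document, which names the filters without showing syntax.
page_urls() {
  # $1=query  $2=type filter  $3=page size  $4=number of pages
  page=0
  while [ "$page" -lt "$4" ]; do
    offset=$((page * $3))
    printf 'http://localhost:37777/api/search/observations?query=%s&type=%s&format=index&limit=%s&offset=%s\n' \
      "$1" "$2" "$3" "$offset"
    page=$((page + 1))
  done
}

# Two pages of 5 results each, restricted to observations of type "feature".
page_urls authentication feature 5 2
```

Keeping the page size small and advancing `offset` means each request stays in the index-format cost range, and paging can stop as soon as a relevant item turns up.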

## Token Budget Awareness

**Costs:**
- Index result: ~50-100 tokens
- Full result: ~500-1000 tokens
- 10x cost difference

**Starting Points:**
- Start with `limit` set to 3-5 (not 20)
- Reduce `limit` if you hit token errors

**Savings Example:**
- Naive: 10 items × 750 tokens (avg full) = 7,500 tokens
- Progressive: (5 items × 75 tokens index) + (2 items × 750 tokens full) = 1,875 tokens
- **Savings: 5,625 tokens (75% reduction)**
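
The same arithmetic can be checked with a short shell sketch. `token_cost` is a hypothetical helper, and the per-result figures (75 and 750 tokens) are the midpoints used in the example above, not measured values.

```bash
# Estimated token cost of loading search results.
# $1=index items  $2=tokens per index item  $3=full items  $4=tokens per full item
token_cost() {
  echo $(( $1 * $2 + $3 * $4 ))
}

naive=$(token_cost 0 75 10 750)        # 10 full results, no index pass
progressive=$(token_cost 5 75 2 750)   # 5 index results, then 2 full results
saved=$((naive - progressive))
percent=$((saved * 100 / naive))       # integer division

echo "naive=$naive progressive=$progressive saved=$saved (${percent}%)"
# prints: naive=7500 progressive=1875 saved=5625 (75%)
```

Changing the arguments shows how the savings shrink as more of the indexed items turn out to need full details.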

## What Problems This Solves

1. **Token exhaustion**: Without this, LLMs load everything in full format (up to 10,000 tokens for 10 items)
2. **Poor signal-to-noise**: Loading full details for irrelevant items wastes tokens
3. **MCP limits**: Large payloads hit protocol limits and cause system failures
4. **Inefficiency**: Loading 20 full results when only 2 are relevant

## How It Scales

**With 10 records:**
- Index (500 tokens) → Full (2,000 tokens for 2 relevant) = 2,500 tokens
- Without pattern: Full (10,000 tokens for all 10) = 4x more expensive

**With 1,000 records:**
- Index (500 tokens for top 5) → Full (1,000 tokens for 1 relevant) = 1,500 tokens
- Without pattern: Would hit MCP limits before seeing relevant data

## Context Engineering Alignment

This pattern implements core context engineering principles:

- **Just-in-time context**: Load data dynamically at runtime
- **Progressive disclosure**: Lightweight identifiers (index) → full details as needed
- **Token efficiency**: Minimal high-signal tokens first, expand selectively
- **Attention budget**: Treat context as a finite resource with diminishing returns

Always start with the smallest set of high-signal tokens that maximizes the likelihood of the desired outcome.