Initial commit

0    skills/mem-search/principles/.gitkeep    Normal file
176  skills/mem-search/principles/anti-patterns.md    Normal file

@@ -0,0 +1,176 @@
# Anti-Pattern Catalogue

Common mistakes to avoid when using the HTTP search API. These anti-patterns address LLM training biases and prevent token-wasting behaviors.

## Anti-Pattern 1: Skipping Index Format

**The Mistake:**
```bash
# ❌ Bad: Jump straight to full format
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=full&limit=20"
```

**Why It's Wrong:**
- 20 × 750 tokens = 15,000 tokens
- May hit MCP token limits
- 99% wasted on irrelevant results

**The Correction:**
```bash
# ✅ Good: Start with index, review, then request full selectively
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=index&limit=5"
# Review results, identify relevant items
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=full&limit=1&offset=2"
```

**What It Teaches:**
Progressive disclosure isn't optional - it's essential for scale.

**LLM Behavior Insight:**
LLMs trained on code examples may have seen `format=full` as "more complete" and default to it.

---

## Anti-Pattern 2: Over-Requesting Results

**The Mistake:**
```bash
# ❌ Bad: Request limit=20 without reviewing index first
curl -s "http://localhost:37777/api/search/observations?query=auth&format=index&limit=20"
```

**Why It's Wrong:**
- Most of 20 results will be irrelevant
- Wastes tokens and time
- Overwhelms review process

**The Correction:**
```bash
# ✅ Good: Start small, paginate if needed
curl -s "http://localhost:37777/api/search/observations?query=auth&format=index&limit=5"
# If needed, paginate:
curl -s "http://localhost:37777/api/search/observations?query=auth&format=index&limit=5&offset=5"
```

**What It Teaches:**
Start small (limit=3-5), review, paginate if needed.

**LLM Behavior Insight:**
LLMs may think "more results = more thorough" without considering relevance.

---

## Anti-Pattern 3: Ignoring Tool Specialization

**The Mistake:**
```bash
# ❌ Bad: Use generic search for everything
curl -s "http://localhost:37777/api/search/observations?query=bugfix&format=index&limit=10"
```

**Why It's Wrong:**
- Specialized tools (by-type, by-concept, by-file) are more efficient
- Generic search mixes all result types
- Misses filtering optimization

**The Correction:**
```bash
# ✅ Good: Use specialized endpoint when applicable
curl -s "http://localhost:37777/api/search/by-type?type=bugfix&format=index&limit=10"
```
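
The same pattern should extend to the other specialized tools. A hedged sketch, assuming `by-concept` and `by-file` mirror the `by-type` URL shape; the `concept` and `file` parameter names, and the file path used, are illustrative assumptions rather than confirmed API details:

```bash
# Assumed shape: filter by a tagged concept instead of free-text search
curl -s "http://localhost:37777/api/search/by-concept?concept=authentication&format=index&limit=5"

# Assumed shape: filter by observations touching a specific file (hypothetical path)
curl -s "http://localhost:37777/api/search/by-file?file=src/auth/jwt.ts&format=index&limit=5"
```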

**What It Teaches:**
The decision tree exists for a reason - follow it.

**LLM Behavior Insight:**
LLMs may gravitate toward "general purpose" tools to avoid decision-making.

---

## Anti-Pattern 4: Loading Full Context Prematurely

**The Mistake:**
```bash
# ❌ Bad: Request full format before understanding what's relevant
curl -s "http://localhost:37777/api/search/observations?query=database&format=full&limit=10"
```

**Why It's Wrong:**
- Can't filter for relevance without seeing the index first
- Wastes tokens on irrelevant full details
- 10 × 750 = 7,500 tokens for potentially zero useful results

**The Correction:**
```bash
# ✅ Good: Index first to identify relevance
curl -s "http://localhost:37777/api/search/observations?query=database&format=index&limit=10"
# Identify relevant: #1234 and #1250
curl -s "http://localhost:37777/api/search/observations?query=database+1234&format=full&limit=1"
curl -s "http://localhost:37777/api/search/observations?query=database+1250&format=full&limit=1"
```

**What It Teaches:**
Filtering is a prerequisite for expansion.

**LLM Behavior Insight:**
LLMs may try to "get everything at once" to avoid multiple tool calls.

---

## Anti-Pattern 5: Not Using Timeline Tools

**The Mistake:**
```bash
# ❌ Bad: Search for individual observations separately
curl -s "http://localhost:37777/api/search/observations?query=before+deployment"
curl -s "http://localhost:37777/api/search/observations?query=during+deployment"
curl -s "http://localhost:37777/api/search/observations?query=after+deployment"
```

**Why It's Wrong:**
- Misses context around events
- Inefficient (N searches vs 1 timeline)
- Temporal relationships lost

**The Correction:**
```bash
# ✅ Good: Use timeline tool for contextual investigation
curl -s "http://localhost:37777/api/timeline/by-query?query=deployment&depth_before=10&depth_after=10"
```

**What It Teaches:**
Tool composition - some tools are designed to work together.

**LLM Behavior Insight:**
LLMs may not naturally discover tool composition patterns.

---

## Why These Anti-Patterns Matter

**Addresses LLM Training Bias:**
LLMs default to "load everything" behavior, a bias picked up from web-scraped training data where thoroughness was rewarded.

**Teaches Protocol Awareness:**
HTTP APIs and MCP have real token limits that can break the system.

**Prevents User Frustration:**
Token limit errors confuse users and break workflows.

**Builds Good Habits:**
Anti-patterns teach the "why" behind best practices.

**Makes Implicit Explicit:**
Surfaces mental models that experienced users internalize but novices miss.

---

## What Happens If These Are Ignored

- **No progressive disclosure**: Every search loads limit=20 in full format → token exhaustion
- **Over-requesting**: 15,000-token searches for 2 relevant results
- **Wrong tool**: Generic search when specialized filters would be 10x faster
- **Premature expansion**: Full details loaded before relevance is known
- **Missing composition**: Single-tool thinking, missing powerful multi-step workflows

**Bottom Line:** Falling into these anti-patterns wastes 5-10x more tokens than necessary and frequently causes system failures.


120  skills/mem-search/principles/progressive-disclosure.md  Normal file

@@ -0,0 +1,120 @@
# Progressive Disclosure Pattern (MANDATORY)

**Core Principle**: Find the smallest set of high-signal tokens first (index format), then drill down to full details only for relevant items.

## The 4-Step Workflow

### Step 1: Start with Index Format

**Action:**
- Use `format=index` (default in most operations)
- Set `limit=3-5` (not 20)
- Review titles and dates ONLY

**Token Cost:** ~50-100 tokens per result

**Why:** Minimal token investment for maximum signal. Get overview before committing to full details.

**Example:**
```bash
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=index&limit=5"
```

**Response:**
```json
{
  "query": "authentication",
  "count": 5,
  "format": "index",
  "results": [
    {
      "id": 1234,
      "type": "feature",
      "title": "Implemented JWT authentication",
      "subtitle": "Added token-based auth with refresh tokens",
      "created_at_epoch": 1699564800000,
      "project": "api-server"
    }
  ]
}
```

### Step 2: Identify Relevant Items

**Cognitive Task:**
- Scan index results for relevance
- Note which items need full details
- Discard irrelevant items

**Why:** Human-in-the-loop filtering before expensive operations. Don't load full details for items you'll ignore.

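
A quick way to do this scan is to print just the identifying fields from the Step 1 response. A minimal sketch, assuming `jq` is available locally; the field names come straight from the index response shown above:

```bash
# List id, type, and title for each index result, one per line
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=index&limit=5" \
  | jq -r '.results[] | "\(.id)\t\(.type)\t\(.title)"'
```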

### Step 3: Request Full Details (Selectively)

**Action:**
- Use `format=full` ONLY for specific items of interest
- Target by ID or use a refined search query

**Token Cost:** ~500-1000 tokens per result

**Principle:** Load only what you need.

**Example:**
```bash
# After reviewing index, get full details for observation #1234
curl -s "http://localhost:37777/api/search/observations?query=authentication&format=full&limit=1&offset=2"
```

**Why:** Targeted token expenditure with high ROI. The 10x cost difference means selectivity matters.

### Step 4: Refine with Filters (If Needed)

**Techniques:**
- Use `type`, `dateRange`, `concepts`, `files` filters
- Narrow scope BEFORE requesting more results
- Use `offset` for pagination instead of large limits

**Why:** Reduce the result set first, then expand selectively. Don't load 20 results when filters could narrow to 3.

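
For instance, a narrowed query might look like the sketch below. This is illustrative only: the `type`, `dateRange`, `concepts`, and `files` filters are named here, but their exact query-string encoding is not shown in this document, so the parameter syntax and date values are assumptions:

```bash
# Assumed encoding: restrict by type and date range before asking for more results
curl -s "http://localhost:37777/api/search/observations?query=auth&type=bugfix&dateRange=2024-01-01..2024-03-31&format=index&limit=5"

# Still too many results? Paginate rather than raising the limit
curl -s "http://localhost:37777/api/search/observations?query=auth&type=bugfix&format=index&limit=5&offset=5"
```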
## Token Budget Awareness

**Costs:**
- Index result: ~50-100 tokens
- Full result: ~500-1000 tokens
- 10x cost difference

**Starting Points:**
- Start with `limit=3-5` (not 20)
- Reduce limit if hitting token errors

**Savings Example:**
- Naive: 10 items × 750 tokens (avg full) = 7,500 tokens
- Progressive: (5 items × 75 tokens index) + (2 items × 750 tokens full) = 1,875 tokens
- **Savings: 5,625 tokens (75% reduction)**

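
Concretely, the sequence behind this arithmetic is the workflow above applied end to end. A sketch reusing the endpoints already shown; the two offsets are illustrative placeholders for whichever index entries turn out to be relevant:

```bash
# Phase 1: cheap index scan (~5 × 75 tokens)
curl -s "http://localhost:37777/api/search/observations?query=auth&format=index&limit=5"

# Phase 2: full details for only the 2 relevant items (~2 × 750 tokens)
curl -s "http://localhost:37777/api/search/observations?query=auth&format=full&limit=1&offset=1"
curl -s "http://localhost:37777/api/search/observations?query=auth&format=full&limit=1&offset=3"
```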
## What Problems This Solves

1. **Token exhaustion**: Without this, LLMs load everything in full format (9,000+ tokens for 10 items)
2. **Poor signal-to-noise**: Loading full details for irrelevant items wastes tokens
3. **MCP limits**: Large payloads hit protocol limits (system failures)
4. **Inefficiency**: Loading 20 full results when only 2 are relevant

## How It Scales

**With 10 records:**
- Index (500 tokens) → Full (2,000 tokens for 2 relevant) = 2,500 tokens
- Without pattern: Full (10,000 tokens for all 10) = 4x more expensive

**With 1,000 records:**
- Index (500 tokens for top 5) → Full (1,000 tokens for 1 relevant) = 1,500 tokens
- Without pattern: Would hit MCP limits before seeing relevant data

## Context Engineering Alignment

This pattern implements core context engineering principles:

- **Just-in-time context**: Load data dynamically at runtime
- **Progressive disclosure**: Lightweight identifiers (index) → full details as needed
- **Token efficiency**: Minimal high-signal tokens first, expand selectively
- **Attention budget**: Treat context as a finite resource with diminishing returns

Always start with the smallest set of high-signal tokens that maximizes the likelihood of the desired outcome.