Initial commit

Zhongwei Li
2025-11-29 18:20:33 +08:00
commit 977fbf5872
27 changed files with 5714 additions and 0 deletions


@@ -0,0 +1,170 @@
---
name: research-synthesis
description: Guide when to use built-in tools (WebFetch, WebSearch) and MCP servers (Parallel Search, Perplexity, Context7) for research. Synthesize findings into narrative for braindump. Use when gathering data, examples, or citations for blog posts.
---
# Research Synthesis
## When to Use
Use when:
- User claims need supporting data
- Need recent examples/trends
- Looking for citations or authoritative sources
- Extracting info from specific URLs
- Checking technical docs or library APIs
- Filling knowledge gaps during brainstorming/drafting

Skip when:
- Info clearly from personal experience
- User says "don't research, just write"
- Topic is purely opinion-based
## Critical: Never Hallucinate
**Only use REAL research returned by the research tools (built-in or MCP). Never invent:**
- Statistics/percentages
- Study names/researchers
- Company examples/case studies
- Technical specs/benchmarks
- Quotes/citations

**If no data found:**
❌ BAD: "Research shows 70% of OKR implementations fail..."
✅ GOOD: "I don't have data on OKR failure rates. Should I research using Perplexity?"

**Before adding to braindump:**
- Verify it came from actual tool results (not training data)
- Always include source attribution
- If uncertain, say so
- Don't fill in missing details with assumptions
## Research Tool Selection (Priority Order)
### Priority 1: Built-in Tools (Always Try First)
| Tool | Use For | Examples |
|------|---------|----------|
| **WebFetch** | Specific URLs, extracting article content, user-mentioned sources | User: "Check this article: https://..." |
| **WebSearch** | Recent trends/news, statistical data, multiple perspectives, general knowledge gaps | "Recent research on OKR failures", "Companies that abandoned agile" |
### Priority 2: Parallel Search (Advanced Synthesis)
| Tool | Use For | Examples |
|------|---------|----------|
| **Parallel Search** | Advanced web search with agentic mode, fact-checking, competitive intelligence, multi-source synthesis, deep URL extraction | Complex queries needing synthesis, validation across sources, extracting full content from URLs |
### Priority 3: Perplexity (Broad Surveys)
| Tool | Use For | Examples |
|------|---------|----------|
| **Perplexity** | Broad surveys when WebSearch/Parallel insufficient | Industry consensus, statistical data, multiple perspectives |
### Priority 4: Context7 (Technical Docs)
| Tool | Use For | Examples |
|------|---------|----------|
| **Context7** | Library/framework docs, API references, technical specifications | "How does React useEffect work?", "Check latest API docs" |
**Decision tree:**
```
Need research?
├─ Specific URL? → WebFetch → Parallel Search
├─ Technical docs/APIs? → Context7
├─ General search? → WebSearch → Parallel Search → Perplexity
└─ Complex synthesis? → Parallel Search
```
**Rationale:** Built-in tools (WebFetch, WebSearch) are faster and always available. Parallel Search adds an advanced agentic mode for synthesis and deep content extraction. Perplexity offers broad surveys when needed. Use Context7 only for official docs.
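For readers who prefer code to trees, here is a minimal TypeScript sketch of the same priority order. The request shape and function name are hypothetical, introduced only for illustration:
```typescript
// Hypothetical request shape; not part of any actual tool API.
interface ResearchRequest {
  url?: string;               // user pointed at a specific URL
  technicalDocs?: boolean;    // library/framework/API documentation needed
  complexSynthesis?: boolean; // multi-source validation or deep extraction
}

type ResearchTool = 'WebFetch' | 'Context7' | 'WebSearch' | 'Parallel Search' | 'Perplexity';

// Returns tools in the order they should be tried, mirroring the decision tree above.
function selectResearchTools(req: ResearchRequest): ResearchTool[] {
  if (req.url) return ['WebFetch', 'Parallel Search'];   // specific URL
  if (req.technicalDocs) return ['Context7'];            // official docs/APIs
  if (req.complexSynthesis) return ['Parallel Search'];  // advanced synthesis
  return ['WebSearch', 'Parallel Search', 'Perplexity']; // general search, escalating
}
```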
## Synthesizing Findings
### Patterns, Not Lists
**Bad** (data dump):
```
Research shows:
- Stat 1
- Stat 2
- Stat 3
```
**Good** (synthesized narrative):
```
Found pattern: 3 recent studies show 60-70% OKR failure rates.
- HBR: 70% failure, metric gaming primary cause
- McKinsey: >100 OKRs correlate with diminishing returns
- Google: Shifted from strict OKRs to "goals and signals"
Key insight: Failure correlates with treating OKRs as compliance exercise.
```
### Look For
- **Patterns**: What do multiple sources agree on?
- **Contradictions**: Where do sources disagree?
- **Gaps**: What's missing?
- **Surprises**: What's unexpected or counterintuitive?
### Source Attribution Format
```markdown
## Research
### OKR Implementation Failures
60-70% failure rate (HBR, McKinsey). Primary causes: metric gaming, checkbox compliance.
**Sources:**
- HBR: "Why OKRs Don't Work" - 70% fail to improve performance
- McKinsey: Survey of 500 companies
- Google blog: Evolution of goals system
**Key Quote:**
> "When OKRs become performance evaluation, they stop being planning."
> - John Doerr, Measure What Matters
```
## Integration with Conversation
Research flows naturally into conversation:
**Proactive**: "That's a strong claim - let me check data... [uses tool] Good intuition! Found 3 confirming studies. Adding to braindump."
**Requested**: "Find X... [uses tool] Found several cases. Should I add all to braindump or focus on one approach?"
**During Drafting**: "Need citation... [uses tool] Found supporting research. Adding to draft with attribution."
## Adding to Braindump
Always ask before updating (unless context is clear): "Found X, Y, Z. Add to braindump under Research?"

Update sections:
- **Research**: Studies, data, citations
- **Examples**: Concrete cases and stories
- **Quotes**: Notable quotations with attribution
- **Sources**: Full references
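A minimal sketch of how an update might land in braindump.md, assuming one heading per section (the exact headings are an assumption; adapt to whatever structure the file already uses):
```markdown
## Research
### <Topic>
Synthesized narrative: the pattern across sources, with key numbers.
## Examples
- Concrete case, with one line on why it illustrates the argument
## Quotes
> "Quotation text."
> - Author, Source
## Sources
- Publication: "Title" - key finding
```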
## Quality Checklist
Before adding to braindump:
- [ ] Synthesized into narrative, not just listed
- [ ] Source attribution included
- [ ] Relevance to core argument clear
- [ ] Key insights/patterns highlighted
- [ ] Contradictions/gaps noted if relevant
## Common Pitfalls
1. **Information overload**: Synthesize 3-5 key findings, not 20 sources
2. **Missing attribution**: Always cite for later reference
3. **Irrelevant research**: Found ≠ useful
4. **Breaking flow**: Don't interrupt creative flow for minor fact-checks
5. **Uncritical acceptance**: Note when sources disagree/have limitations
## Integration with Other Skills
- **brainstorming**: Research validates/challenges ideas
- **blog-writing**: Provides citations and examples
- **Throughout**: Update braindump.md with structured findings

For detailed examples, see reference/examples.md


@@ -0,0 +1,106 @@
# Research Synthesis Examples
## Example: OKR Outputs vs Outcomes
This example shows how to use research-synthesis to validate a claim, synthesize findings into narrative, and integrate naturally into conversation.
```
[During brainstorming]
You: I want to argue that OKRs fail because they measure outputs not outcomes
AI: [invokes research-synthesis]
That's a common critique - let me see if there's data to support it...
[uses Perplexity: "OKR failures output vs outcome measurement"]
Found strong supporting evidence:
- Reforge 2024 study: 68% of failed OKR implementations measured
outputs (features shipped) vs outcomes (user behavior change)
- Amplitude research: Companies measuring outcomes had 3x higher
product-market fit scores
- Example: Spotify's "squad health" framework specifically avoided
output metrics for this reason
Key pattern: Output metrics are easier to measure but don't correlate
with business value. This ties directly to your Goodhart's Law point.
[updates braindump.md under Research section]
Added to braindump. This gives you concrete data to back the claim.
Want to explore specific examples of good outcome metrics?
You: Yes, give me 2-3 examples
AI: [uses Perplexity: "good OKR outcome metrics examples"]
Strong examples:
1. Netflix: "Hours of satisfied viewing" (outcome) vs "hours watched" (output)
2. Airbnb: "Nights booked by repeat guests" (outcome) vs "total bookings" (output)
3. Stripe: "Developer time to first API call" (outcome) vs "API calls made" (output)
Pattern: Outcome metrics measure behavior change or satisfaction,
outputs just measure activity.
[adds to braindump under Examples]
These work well as concrete illustrations. Should we outline the post now?
```
## What This Example Demonstrates
**MCP Tool Selection:**
- Used Perplexity for broad research (not a specific URL)
- Searched for both validation and examples
- Second query built on first findings
**Synthesis Techniques:**
- Identified pattern across sources (68% correlation)
- Connected findings to user's framework (Goodhart's Law)
- Provided concrete examples, not just statistics
- Noted implications (easier to measure ≠ more valuable)
**Integration with Conversation:**
- Research happened naturally when claim needed support
- Didn't interrupt flow—enhanced the argument
- Asked follow-up question to continue exploration
- Updated braindump.md in structured way
**Braindump Updates:**
Research section received:
```markdown
### Output vs Outcome Metrics
Reforge study: 68% of failed OKR implementations measured outputs
rather than outcomes. Companies measuring outcomes had 3x higher
product-market fit scores.
Pattern: Output metrics (features shipped, API calls) are easier to
measure but don't correlate with business value. Outcome metrics
(user satisfaction, behavior change) harder but more meaningful.
```
Examples section received:
```markdown
- Netflix: "Hours of satisfied viewing" vs "hours watched"
- Airbnb: "Nights booked by repeat guests" vs "total bookings"
- Stripe: "Developer time to first API call" vs "API calls made"
```
## Common Patterns
**Good Research Synthesis:**
- 3-5 sources, not 20
- Pattern identified across sources
- Connected to user's existing framework
- Concrete examples included
- Source attribution maintained
- Implications stated clearly
**Avoided Pitfalls:**
- No information overload (focused on key findings)
- Not just listing stats—synthesized into narrative
- Didn't break creative flow—enhanced it
- Asked before continuing (user control maintained)


@@ -0,0 +1,127 @@
# Multi-Agent Invocation Pattern
Guide for using specialized research agents in parallel for comprehensive investigation.
## Research Agents Overview
| Agent | Tool | Use Cases | Output |
|-------|------|-----------|--------|
| **research-breadth** (haiku, blue) | Perplexity | Industry trends, best practices, multiple perspectives, comparative analyses, "What are common patterns?" | Narrative patterns with consensus, confidence ratings, contradictions |
| **research-depth** (haiku, purple) | Firecrawl | Specific URLs, detailed implementations, code examples, gotchas, "How did X implement Y?" | Source-by-source analysis with code, tradeoffs, applicability |
| **research-technical** (haiku, green) | Context7 | Official docs, API signatures, TypeScript types, configs, migration guides, "What's the official API?" | Exact API specs with types, configurations, official examples |
## Agent Selection Decision Tree
| Question Type | Agent Combination | Rationale |
|--------------|-------------------|-----------|
| **New technology/framework** | breadth + technical | Industry patterns + Official API |
| **Specific error/bug** | depth + technical | Detailed solutions + API reference |
| **API integration** | technical + depth | Official docs + Real examples |
| **Best practices/patterns** | breadth + depth | Industry trends + Case studies |
| **Comparison/decision** | breadth + depth | Broad survey + Detailed experiences |
| **Official API only** | technical | Just need documentation |
**Default when unsure**: breadth + technical
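The same mapping, sketched as TypeScript. The question-type labels are invented for illustration; the table above remains the source of truth:
```typescript
// Question categories from the table above; label strings are illustrative.
type QuestionType =
  | 'new-technology'
  | 'error-or-bug'
  | 'api-integration'
  | 'best-practices'
  | 'comparison'
  | 'official-api-only';

type ResearchAgent = 'research-breadth' | 'research-depth' | 'research-technical';

function selectAgents(question: QuestionType): ResearchAgent[] {
  switch (question) {
    case 'new-technology':    return ['research-breadth', 'research-technical'];
    case 'error-or-bug':      return ['research-depth', 'research-technical'];
    case 'api-integration':   return ['research-technical', 'research-depth'];
    case 'best-practices':    return ['research-breadth', 'research-depth'];
    case 'comparison':        return ['research-breadth', 'research-depth'];
    case 'official-api-only': return ['research-technical'];
    default:                  return ['research-breadth', 'research-technical']; // default when unsure
  }
}
```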
## Parallel Invocation Syntax
**Always use Promise.all for parallel execution:**
```typescript
await Promise.all([
Task({
subagent_type: 'research-breadth', // or 'research-depth' or 'research-technical'
model: 'haiku',
description: 'Brief description',
prompt: `Specific research question with focus areas and MCP tool guidance`
}),
Task({
subagent_type: 'research-technical',
model: 'haiku',
description: 'Brief description',
prompt: `Specific research question with focus areas and MCP tool guidance`
})
]);
```
## Common Patterns
### Pattern 1: New Technology
**Scenario**: Learning a new framework
**Agents**: breadth + technical
**Focus**: breadth (architectural patterns, industry trends), technical (official API, configs)
**Consolidation**: Industry patterns → Official implementation
### Pattern 2: Specific Solution
**Scenario**: Debugging or implementing known solution
**Agents**: depth + technical
**Focus**: depth (blog posts, implementations, gotchas), technical (official API, types)
**Consolidation**: Real-world patterns → Official API usage
### Pattern 3: API Integration
**Scenario**: Integrating with library/API
**Agents**: technical + depth
**Focus**: technical (official API, error codes), depth (tutorials, testing approaches)
**Consolidation**: Official API first → Battle-tested patterns
### Pattern 4: Comparative Analysis
**Scenario**: Choosing between approaches
**Agents**: breadth + depth
**Focus**: breadth (comparisons, trends), depth (migration experiences, lessons)
**Consolidation**: Industry trends → Real experiences
## Synthesis Strategy
Use the **research-synthesis** skill to consolidate findings:
1. **Consolidate**: Group by theme, identify consensus, note contradictions
2. **Narrativize**: Weave findings into story (not bullet dumps): "Industry uses X (breadth), implemented via Y (technical), as shown by Z (depth)"
3. **Attribute**: Link claims to sources, note which agent provided insights
4. **Identify Gaps**: Unanswered questions, contradictions, disagreements
5. **Extract Actions**: Implementation path, code/configs, risks, constraints
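One possible shape for holding consolidated findings before writing the narrative. This is a sketch with assumed field names; the agents return prose, so a structure like this is filled in by hand during synthesis:
```typescript
// Assumed working structure for synthesis; not a format any agent emits.
interface ConsolidatedFindings {
  themes: {
    theme: string;
    consensus: string;       // what the sources agree on
    contradictions?: string; // where the sources disagree
    sources: string[];       // e.g. "[Perplexity] research-breadth"
  }[];
  gaps: string[];            // unanswered questions
  actions: string[];         // implementation path, code/configs, risks, constraints
}

// Filled in from the Complete Example below, purely for illustration.
const notifications: ConsolidatedFindings = {
  themes: [{
    theme: 'Real-time notifications in Next.js',
    consensus: 'SSE via route handlers is the common approach',
    contradictions: 'WebSockets require an external service on Vercel',
    sources: ['[Perplexity] research-breadth', '[Context7] research-technical'],
  }],
  gaps: ['How Vercel limitations affect long-lived connections'],
  actions: ['Create API route with ReadableStream', 'Client uses EventSource'],
};
```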
## Anti-Patterns vs Best Practices
| ❌ Anti-Pattern | ✅ Best Practice |
|----------------|------------------|
| Single agent for multi-faceted question | 2-3 agents for comprehensive coverage |
| Sequential: `await` each agent | Parallel: `Promise.all([...])` |
| Copy agent outputs verbatim in sections | Synthesize into narrative with attribution |
| Skip source attribution | Note which agent/source for each claim |
| List findings separately | Weave into coherent story |
## Complete Example
**User**: "How do I implement real-time notifications in Next.js?"
**Step 1: Analyze** → New technology + implementation
**Step 2: Launch** → breadth + technical in parallel
**Step 3: Synthesize**:
```markdown
## Findings
Industry research shows three approaches: SSE (most popular for Next.js), WebSockets
(bidirectional), Polling (fallback). Official Next.js docs indicate route handlers
support SSE via ReadableStream, but WebSockets require external service on Vercel.
**Recommendation**: Use SSE via Next.js route handlers - aligns with framework
capabilities and industry best practices.
**Implementation**: Create API route with ReadableStream → Client uses EventSource
→ Handle reconnection/errors → Consider Vercel limitations
**Sources**: [Perplexity] Next.js real-time patterns 2024-2025 | [Context7] Next.js Route Handlers
```
## Integration Points
**Used by**:
- `/research` command (essentials) - User-initiated research
- `implementing-tasks` skill (experimental) - Auto-launch when STUCK
- `planning` skill (experimental) - Uses exploration agents instead
**Other agent categories**:
- **Exploration** (codebase): architecture-explorer + codebase-analyzer (parallel)
- **Review** (code quality): test-coverage + error-handling + security (all 3 parallel)
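For the review category, the same Promise.all pattern applies. A sketch assuming the three review agents are addressable as subagent types under these names (unverified; check the actual agent definitions before use):
```typescript
// Subagent type names are assumptions taken from the list above.
await Promise.all([
  Task({
    subagent_type: 'test-coverage',
    model: 'haiku',
    description: 'Review test coverage',
    prompt: 'Assess test coverage gaps in the changed files'
  }),
  Task({
    subagent_type: 'error-handling',
    model: 'haiku',
    description: 'Review error handling',
    prompt: 'Check error paths, retries, and failure modes'
  }),
  Task({
    subagent_type: 'security',
    model: 'haiku',
    description: 'Security review',
    prompt: 'Look for injection, auth, and secrets-handling issues'
  })
]);
```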