# Research Skill Enhancement - Requirements

## Core Principles

1. **Discovery-driven, not list-driven** - Find the best sources wherever they are
2. **Context-aware** - Grounded in the user's domain preferences
3. **Finds unknowns** - The value is in discovering what the user doesn't know
4. **No time limits** - Can run anywhere from 30 minutes to 3 hours
5. **Exhaustive** - Don't stop until the topic is exhausted

## Domain Contexts

### What Context Provides

- **Relevance filtering** - what matters to this user
- **Constraint awareness** - their specific resources and limitations
- **Optimization targets** - what to maximize for them

### Domain: Travel

- Loyalty: Alaska MVP Gold 75K, Marriott Titanium Elite
- Points: current balances (fetch via browser-control)
- Credit cards: Alaska card, Marriott Amex, Chase Sapphire Preferred
- Memberships: Epic Pass
- Preferences: "Luxury at rock-bottom prices"; never book, only research

### Domain: Shopping

- Quality standards
- Budget ranges
- Brand preferences

### Domain: Work/Education

- PSD context (CIO role)
- UDL expertise
- Edtech landscape
- District priorities

### Domain: AI/Coding

- Tech stack preferences
- Languages/frameworks
- Architecture patterns

### Domain: Consulting

- Service offerings
- Client contexts

## Research Engine Architecture

### Phase 1: Query Understanding

- Detect the domain from the query
- Load the relevant context
- Decompose into sub-questions (5-10 angles)
- Identify which types of sources are needed (academic, forum, video, etc.)

### Phase 2: Parallel Multi-Source Discovery

- Launch multiple search strategies simultaneously:
  - Multiple search engines (Google, Bing, DuckDuckGo)
  - Multiple query formulations
  - Different source types (articles, forums, videos, podcasts)
- Use multiple LLMs for different perspectives:
  - Perplexity: current web info with citations
  - Gemini: multi-perspective synthesis
  - OpenAI: structured analysis
  - Claude: deep reasoning

### Phase 3: Source Evaluation & Expansion

- Evaluate each source for:
  - Credibility (author expertise, publication reputation)
  - Recency (when published/updated)
  - Depth (surface vs. comprehensive)
  - Citations (does it cite others? is it cited?)
- Follow promising leads:
  - Sources referenced by good sources
  - Authors who appear multiple times
  - Cross-referenced claims

### Phase 4: Deep Scraping

- Don't stop at the first page of results
- Go 5-10 pages deep on good queries
- Use browser-control for:
  - JS-rendered content (Reddit, forums)
  - Authenticated pages (user accounts)
  - Sites that block scrapers
- Follow internal links on valuable sources

### Phase 5: Multimedia Discovery

- YouTube videos
- TikTok content
- Podcasts (Spotify, Apple)
- Not just text articles

### Phase 6: Synthesis

- Organize by theme/question
- Cite every claim
- Note consensus vs. disagreement
- Include multimedia resources
- Provide actionable recommendations
- Flag what is still uncertain

## Output Format

```markdown
## Research: [Topic]

### Context Applied
- [Domain context that was loaded]
- [Dynamic data fetched - e.g., current point balances]

### Executive Summary
[2-3 paragraph overview of key findings]

### Detailed Findings

#### [Sub-topic 1]
[Deep analysis with inline citations]

#### [Sub-topic 2]
[Deep analysis with inline citations]

### Multimedia Resources
- [Video: Title](url) - description
- [Podcast: Title](url) - description
- [TikTok: @user](url) - description

### Recommendations
1. [Actionable recommendation based on user's context]
2. [Another recommendation]

### What I Discovered You Might Not Know
- [Surprising finding 1]
- [Surprising finding 2]

### Confidence Assessment
- High confidence: [topics]
- Needs verification: [topics]
- Conflicting information: [topics]

### All Sources
[Complete list of every URL referenced]
```
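To make the architecture concrete, here is a minimal sketch of how the six phases could be wired into a single orchestrator that emits a report in the format above. Everything in it is illustrative: the `ResearchOrchestrator` class, the injected `context_loader` / `providers` / `scraper` / `synthesizer` objects, and the `result.urls` attribute are assumed interfaces, not an existing API; a real implementation would plug in the Perplexity/Gemini/OpenAI clients and browser-control.

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class Source:
    url: str
    credibility: float  # 0-1, from author/publication checks (Phase 3)
    recency: str        # e.g., "2024-11"
    notes: str = ""

@dataclass
class ResearchRun:
    query: str
    domain: str = ""
    sub_questions: list[str] = field(default_factory=list)
    sources: list[Source] = field(default_factory=list)

class ResearchOrchestrator:
    def __init__(self, context_loader, providers, scraper, synthesizer):
        self.context_loader = context_loader  # Phase 1: domain detection + context
        self.providers = providers            # Phase 2: search engines and LLMs
        self.scraper = scraper                # Phases 4-5: browser-control wrapper
        self.synthesizer = synthesizer        # Phase 6: report builder

    async def run(self, query: str) -> str:
        run = ResearchRun(query=query)

        # Phase 1: detect domain, load its context, decompose into sub-questions
        run.domain = self.context_loader.detect_domain(query)
        context = self.context_loader.load(run.domain)
        run.sub_questions = self.context_loader.decompose(query, context)

        # Phase 2: fan every sub-question out to every provider in parallel
        tasks = [
            provider.search(sub_q, context)
            for sub_q in run.sub_questions
            for provider in self.providers
        ]
        raw_results = await asyncio.gather(*tasks, return_exceptions=True)

        # Phase 3: score sources, skipping failed providers without aborting the run
        for result in raw_results:
            if isinstance(result, Exception):
                continue
            run.sources.extend(self.evaluate(result))

        # Phases 4-5: deep scraping and multimedia discovery via browser-control
        enriched = await self.scraper.deep_fetch(run.sources)

        # Phase 6: themed, fully cited report in the output format above
        return self.synthesizer.build_report(run, enriched, context)

    def evaluate(self, result) -> list[Source]:
        # Placeholder scoring; the real evaluator checks credibility, recency,
        # depth, and the citation graph before keeping a source.
        return [Source(url=u, credibility=0.5, recency="unknown")
                for u in result.urls]
```

Passing `return_exceptions=True` to `asyncio.gather` matters for a run that can last 30 minutes to 3 hours: one failed search engine or LLM call degrades the result set instead of killing the whole job.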
## Technical Implementation

### Files to Create/Modify

- [ ] Update preferences.json with domain contexts
- [ ] Create domain context loader
- [ ] Create multi-LLM research orchestrator
- [ ] Create source evaluator
- [ ] Create deep scraper (uses browser-control)
- [ ] Create multimedia searcher
- [ ] Create synthesis engine
- [ ] Update SKILL.md with the new architecture

### API Keys Needed

- PERPLEXITY_API_KEY
- GEMINI_API_KEY
- OPENAI_API_KEY
- (Claude runs natively)

### Browser-Control Integration

- Fetch current balances (Marriott, Alaska, Chase)
- Scrape full forum threads
- Access authenticated content
- Handle JS-rendered sites

## Success Criteria

1. Can run a travel research query and get:
   - Current point balances fetched
   - Transfer bonus opportunities identified
   - 5-6 hotel options with points AND cash prices
   - Flight options with miles needed
   - Deep user reviews from forums
   - Video content discovered
   - Optimization recommendations
2. The report takes 30-60 minutes to generate (not 5 minutes)
3. Sources include things the user didn't know existed
4. Every claim is cited
5. Recommendations are specific to the user's context
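As a starting point for the "Update preferences.json with domain contexts" and "Create domain context loader" checklist items above, here is a hedged sketch of what that loader could look like. The file layout, the `DomainContextLoader` name, and the keyword-based domain routing are all assumptions; only the preferences.json filename and the environment variable names come from this document.

```python
import json
import os
from pathlib import Path

# Assumed location of preferences.json relative to the skill; adjust to the real layout.
PREFERENCES_PATH = Path("preferences.json")

# Provider keys listed under "API Keys Needed" (Claude runs natively and needs none).
REQUIRED_KEYS = ["PERPLEXITY_API_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY"]

class DomainContextLoader:
    """Loads per-domain context (travel, shopping, work/education, AI/coding, consulting)."""

    def __init__(self, path: Path = PREFERENCES_PATH):
        self.prefs = json.loads(path.read_text())

    def detect_domain(self, query: str) -> str:
        # Naive keyword routing for illustration; a real version would ask the LLM.
        keywords = {
            "travel": ["flight", "hotel", "points", "miles", "resort"],
            "shopping": ["buy", "price", "brand", "deal"],
            "ai_coding": ["framework", "library", "architecture", "api"],
        }
        q = query.lower()
        for domain, words in keywords.items():
            if any(w in q for w in words):
                return domain
        return "general"

    def load(self, domain: str) -> dict:
        # e.g., prefs["domains"]["travel"] would hold loyalty status, credit cards,
        # memberships, and the "luxury at rock-bottom prices" preference.
        return self.prefs.get("domains", {}).get(domain, {})

def check_api_keys() -> list[str]:
    """Return the names of any required provider keys missing from the environment."""
    return [key for key in REQUIRED_KEYS if not os.environ.get(key)]

if __name__ == "__main__":
    missing = check_api_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
```

Keeping domain detection next to the context load keeps Phase 1 self-contained; in practice the detection step would likely be handed to the LLM itself rather than a keyword table.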