Research Skill Enhancement - Requirements

Core Principles

  1. Discovery-driven, not list-driven - Find the best sources wherever they are
  2. Context-aware - Grounded in the user's domain preferences
  3. Finds unknowns - The value is in discovering what the user doesn't know
  4. No time limits - A run can take anywhere from 30 minutes to 3 hours
  5. Exhaustive - Don't stop until the topic is exhausted

Domain Contexts

What Context Provides

  • Relevance filtering - what matters to this user
  • Constraint awareness - their specific resources/limitations
  • Optimization targets - what to maximize for them

Domain: Travel

  • Loyalty: Alaska MVP Gold 75K, Marriott Titanium Elite
  • Points: Current balances (fetch via browser-control)
  • Credit cards: Alaska card, Marriott Amex, Chase Sapphire Preferred
  • Memberships: Epic Pass
  • Preferences: "Luxury at rock-bottom prices"; research only, never book

Domain: Shopping

  • Quality standards
  • Budget ranges
  • Brand preferences

Domain: Work/Education

  • PSD context (CIO role)
  • UDL expertise
  • Edtech landscape
  • District priorities

Domain: AI/Coding

  • Tech stack preferences
  • Languages/frameworks
  • Architecture patterns

Domain: Consulting

  • Service offerings
  • Client contexts
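
As a rough illustration, each domain context could live as a keyed object in preferences.json and be loaded on demand. The path, key names, and example fields in the sketch below are assumptions, not a fixed schema.

```python
import json
from pathlib import Path

# Assumed location of the preferences file; adjust to the skill's actual layout.
PREFERENCES_PATH = Path("preferences.json")

def load_domain_context(domain: str) -> dict:
    """Return the context block for one research domain (e.g., 'travel').

    Assumes preferences.json keeps domain contexts under a top-level
    "domains" key, for example:
        {"domains": {"travel": {"loyalty": ["Alaska MVP Gold 75K"],
                                "memberships": ["Epic Pass"],
                                "preferences": "Luxury at rock-bottom prices"}}}
    """
    prefs = json.loads(PREFERENCES_PATH.read_text())
    return prefs.get("domains", {}).get(domain, {})
```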

Research Engine Architecture

Phase 1: Query Understanding

  • Detect domain from query
  • Load relevant context
  • Decompose into sub-questions (5-10 angles)
  • Identify what types of sources are needed (academic, forum, video, etc.)
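
A minimal sketch of the decomposition step, assuming a generic `call_llm` helper (a stand-in for whatever model client the skill already uses) that takes a prompt string and returns the model's text:

```python
import json

def decompose_query(query: str, domain_context: dict, call_llm) -> list[dict]:
    """Break a research query into sub-questions plus suggested source types.

    `call_llm` is an assumed helper: prompt in, response text out.
    """
    prompt = (
        "Decompose this research query into 5-10 sub-questions. For each one, "
        "list the source types to search (academic, forum, video, podcast, news).\n"
        f"Query: {query}\n"
        f"User context: {json.dumps(domain_context)}\n"
        'Respond as JSON: [{"question": "...", "source_types": ["..."]}]'
    )
    return json.loads(call_llm(prompt))
```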

Phase 2: Parallel Multi-Source Discovery

  • Launch multiple search strategies simultaneously:
    • Multiple search engines (Google, Bing, DuckDuckGo)
    • Multiple query formulations
    • Different source types (articles, forums, videos, podcasts)
  • Use multiple LLMs for different perspectives:
    • Perplexity: Current web info with citations
    • Gemini: Multi-perspective synthesis
    • OpenAI: Structured analysis
    • Claude: Deep reasoning
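
The parallel fan-out itself can be a single `asyncio.gather` over one coroutine per provider. The provider wrappers (the actual Perplexity, Gemini, OpenAI, and Claude calls) are assumed to exist elsewhere; only the orchestration pattern is shown here.

```python
import asyncio

async def multi_llm_research(sub_question: str, providers: dict) -> dict:
    """Ask several providers the same sub-question in parallel.

    `providers` maps a provider name to an assumed async callable that takes
    the sub-question and returns answer text.
    """
    names = list(providers)
    results = await asyncio.gather(
        *(providers[name](sub_question) for name in names),
        return_exceptions=True,
    )
    # Keep successful answers; record failures so synthesis can note the gap.
    return {
        name: (r if not isinstance(r, Exception) else f"ERROR: {r}")
        for name, r in zip(names, results)
    }
```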

Phase 3: Source Evaluation & Expansion

  • Evaluate each source for:
    • Credibility (author expertise, publication reputation)
    • Recency (when published/updated)
    • Depth (surface vs comprehensive)
    • Citations (does it cite others? is it cited?)
  • Follow promising leads:
    • Sources referenced by good sources
    • Authors who appear multiple times
    • Cross-referenced claims
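
One way to make these criteria operational is a weighted score that decides which sources are worth expanding. The fields mirror the list above; the weights and threshold are illustrative defaults, not tuned values.

```python
from dataclasses import dataclass

@dataclass
class SourceScore:
    credibility: float  # 0-1: author expertise, publication reputation
    recency: float      # 0-1: how recently published or updated
    depth: float        # 0-1: surface summary vs comprehensive treatment
    citations: float    # 0-1: cites others / is cited by others

    def overall(self) -> float:
        # Illustrative weighting; credibility and depth count the most.
        return (0.35 * self.credibility + 0.15 * self.recency
                + 0.35 * self.depth + 0.15 * self.citations)

def should_expand(score: SourceScore, threshold: float = 0.6) -> bool:
    """Follow a source's references only if it clears the quality bar."""
    return score.overall() >= threshold
```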

Phase 4: Deep Scraping

  • Don't stop at the first page of results
  • Go 5-10 pages deep on good queries
  • Use browser-control for:
    • JS-rendered content (Reddit, forums)
    • Authenticated pages (user accounts)
    • Sites that block scrapers
  • Follow internal links on valuable sources
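
Going 5-10 pages deep is mostly a loop with a sensible stopping condition. `fetch_page` and `extract_results` below are assumed helpers backed by browser-control; only the pagination control flow is the point here.

```python
def deep_search(query: str, fetch_page, extract_results, max_pages: int = 10) -> list[dict]:
    """Paginate through search results until they run out or hit the page cap.

    `fetch_page(query, page)` returns rendered HTML and `extract_results(html)`
    returns a list of result dicts; both are assumed browser-control helpers.
    """
    collected: list[dict] = []
    for page in range(1, max_pages + 1):
        html = fetch_page(query, page)
        results = extract_results(html)
        if not results:
            break  # Ran out of results before reaching the page cap.
        collected.extend(results)
    return collected
```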

Phase 5: Multimedia Discovery

  • YouTube videos
  • TikTok content
  • Podcasts (Spotify, Apple)
  • Not just text articles
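
One low-cost way to pull multimedia into the same pipeline is to reuse the text search path with platform-scoped queries via the standard `site:` operator. The platform list below mirrors the bullets above and would normally live in configuration.

```python
def multimedia_queries(sub_question: str) -> list[str]:
    """Turn one sub-question into platform-scoped search queries."""
    platforms = [
        "youtube.com",
        "tiktok.com",
        "open.spotify.com",
        "podcasts.apple.com",
    ]
    return [f"site:{platform} {sub_question}" for platform in platforms]
```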

Phase 6: Synthesis

  • Organize by theme/question
  • Every claim cited
  • Note consensus vs disagreement
  • Include multimedia resources
  • Provide actionable recommendations
  • Flag what's still uncertain
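
Synthesis is easier if every claim carries its citations from the moment it is captured, rather than attaching them at the end. A minimal record shape, assuming nothing about the final report renderer:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    theme: str                 # sub-topic the claim belongs to
    claim: str                 # the statement being reported
    sources: list[str]         # URLs backing the claim (never empty)
    consensus: str = "agreed"  # "agreed" | "disputed" | "single-source"
    confidence: str = "high"   # "high" | "needs-verification"

def group_by_theme(findings: list[Finding]) -> dict[str, list[Finding]]:
    """Organize findings so the report can render one section per theme."""
    grouped: dict[str, list[Finding]] = {}
    for finding in findings:
        grouped.setdefault(finding.theme, []).append(finding)
    return grouped
```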

Output Format

## Research: [Topic]

### Context Applied
- [Domain context that was loaded]
- [Dynamic data fetched - e.g., current point balances]

### Executive Summary
[2-3 paragraph overview of key findings]

### Detailed Findings

#### [Sub-topic 1]
[Deep analysis with inline citations]

#### [Sub-topic 2]
[Deep analysis with inline citations]

### Multimedia Resources
- [Video: Title](url) - description
- [Podcast: Title](url) - description
- [TikTok: @user](url) - description

### Recommendations
1. [Actionable recommendation based on user's context]
2. [Another recommendation]

### What I Discovered You Might Not Know
- [Surprising finding 1]
- [Surprising finding 2]

### Confidence Assessment
- High confidence: [topics]
- Needs verification: [topics]
- Conflicting information: [topics]

### All Sources
[Complete list of every URL referenced]

Technical Implementation

Files to Create/Modify

  • Update preferences.json with domain contexts
  • Create domain context loader
  • Create multi-LLM research orchestrator
  • Create source evaluator
  • Create deep scraper (uses browser-control)
  • Create multimedia searcher
  • Create synthesis engine
  • Update SKILL.md with new architecture
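
The module names above map naturally onto a single orchestrator entry point. The sketch below only wires assumed callables together (decompose, search_all, evaluate, scrape_deep, synthesize); none of these names are final.

```python
import asyncio

async def run_research(query: str, components: dict) -> dict:
    """Compose the planned modules into one research run.

    `components` supplies the callables each module would export; `search_all`
    is assumed to be async (it fans out across engines and LLMs), the rest sync.
    """
    sub_questions = components["decompose"](query)
    raw_batches = await asyncio.gather(
        *(components["search_all"](q) for q in sub_questions)
    )
    good_sources = [s for batch in raw_batches for s in batch if components["evaluate"](s)]
    scraped = [components["scrape_deep"](source) for source in good_sources]
    return components["synthesize"](query, scraped)
```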

API Keys Needed

  • PERPLEXITY_API_KEY
  • GEMINI_API_KEY
  • OPENAI_API_KEY
  • (Claude runs natively)
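
A startup check keeps a long research run from failing an hour in because a key is missing; the variable names match the list above.

```python
import os

REQUIRED_KEYS = ["PERPLEXITY_API_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY"]

def missing_api_keys() -> list[str]:
    """Return the names of required API keys absent from the environment."""
    return [key for key in REQUIRED_KEYS if not os.environ.get(key)]
```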

Browser-Control Integration

  • Fetch current balances (Marriott, Alaska, Chase)
  • Scrape full forum threads
  • Access authenticated content
  • Handle JS-rendered sites
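
The exact browser-control interface isn't specified here, so the research code could depend on a narrow protocol like the one below and adapt whatever browser-control actually exposes behind it. Both method names are assumptions.

```python
from typing import Protocol

class BrowserFetcher(Protocol):
    """The minimal surface the research skill needs from browser-control."""

    def fetch(self, url: str, wait_for_js: bool = True) -> str:
        """Return fully rendered page HTML (covers JS-heavy sites and forums)."""
        ...

    def fetch_authenticated(self, url: str, account: str) -> str:
        """Return HTML for a page behind the user's login (e.g., loyalty balances)."""
        ...
```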

Success Criteria

  1. Can run a travel research query and get:
    • Current point balances fetched
    • Transfer bonus opportunities identified
    • 5-6 hotel options with points AND cash prices
    • Flight options with miles needed
    • Deep user reviews from forums
    • Video content discovered
    • Optimization recommendations
  2. Report takes 30-60 minutes to generate (not 5 minutes)
  3. Sources include things the user didn't know existed
  4. Every claim is cited
  5. Recommendations are specific to the user's context