Research Skill Enhancement - Requirements
Core Principles
- Discovery-driven, not list-driven - Find best sources wherever they are
- Context-aware - Grounded in user's domain preferences
- Finds unknowns - Value is in discovering what user doesn't know
- No time limits - Can run 30 min to 3 hours
- Exhaustive - Don't stop until topic is exhausted
Domain Contexts
What Context Provides
- Relevance filtering - what matters to this user
- Constraint awareness - their specific resources/limitations
- Optimization targets - what to maximize for them
Domain: Travel
- Loyalty: Alaska MVP Gold 75K, Marriott Titanium Elite
- Points: Current balances (fetch via browser-control)
- Credit cards: Alaska card, Marriott Amex, Chase Sapphire Preferred
- Memberships: Epic Pass
- Preferences: "Luxury at rock-bottom prices"; never book, only research
Domain: Shopping
- Quality standards
- Budget ranges
- Brand preferences
Domain: Work/Education
- PSD context (CIO role)
- UDL expertise
- Edtech landscape
- District priorities
Domain: AI/Coding
- Tech stack preferences
- Languages/frameworks
- Architecture patterns
Domain: Consulting
- Service offerings
- Client contexts
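The domain contexts above could be stored as one entry per domain and looked up by name. The following is a minimal sketch, assuming `preferences.json` has a top-level `domains` key. The source does not define that schema, so the key names and the functions `load_preferences` and `load_domain_context` are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical layout for preferences.json; the source names the file
# but does not specify its structure.
EXAMPLE_PREFERENCES = {
    "domains": {
        "travel": {
            "loyalty": ["Alaska MVP Gold 75K", "Marriott Titanium Elite"],
            "credit_cards": ["Alaska card", "Marriott Amex", "Chase Sapphire Preferred"],
            "memberships": ["Epic Pass"],
            "preferences": ["Luxury at rock-bottom prices", "never book, only research"],
            # Balances change, so only the need to fetch them is recorded here.
            "dynamic": ["point_balances"],
        },
        "consulting": {
            "static": ["service offerings", "client contexts"],
        },
    }
}


def load_preferences(path: Path) -> dict:
    """Read the preferences file from disk."""
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def load_domain_context(domain: str, preferences: dict) -> dict:
    """Return the context for one domain, or an empty dict if it is unknown."""
    return preferences.get("domains", {}).get(domain, {})
```

Returning an empty dict for an unknown domain lets a query that matches no domain still run, just without context filtering.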
Research Engine Architecture
Phase 1: Query Understanding
- Detect domain from query
- Load relevant context
- Decompose into sub-questions (5-10 angles)
- Identify which source types are needed (academic, forum, video, etc.)
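Phase 1 can be sketched as a small planning step. The keyword lists, the angle templates, and the names `detect_domain` and `plan_research` below are all assumptions made for illustration; a real implementation might use an LLM to classify the query and generate sub-questions.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical keyword lists, one per domain named in this document.
DOMAIN_KEYWORDS = {
    "travel": {"hotel", "flight", "points", "miles", "marriott", "alaska", "ski"},
    "shopping": {"buy", "price", "brand", "deal"},
    "work_education": {"district", "udl", "edtech", "school"},
    "ai_coding": {"python", "llm", "framework", "architecture"},
    "consulting": {"client", "proposal", "engagement"},
}

# Hypothetical angles used to decompose a query (six, within the 5-10 target).
ANGLES = [
    "best options",
    "costs and pricing",
    "common pitfalls",
    "expert opinions",
    "recent changes",
    "first-hand user experiences",
]


@dataclass
class ResearchPlan:
    query: str
    domain: Optional[str]
    sub_questions: List[str] = field(default_factory=list)
    source_types: List[str] = field(default_factory=list)


def detect_domain(query: str) -> Optional[str]:
    """Pick the domain whose keywords overlap the query most, if any do."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    scores = {name: len(words & kws) for name, kws in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def plan_research(query: str) -> ResearchPlan:
    """Build a plan: detected domain, sub-questions, and source types to search."""
    sub_questions = [f"{query}: {angle}" for angle in ANGLES]
    source_types = ["academic", "forum", "video"]
    return ResearchPlan(query, detect_domain(query), sub_questions, source_types)
```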
Phase 2: Parallel Multi-Source Discovery
- Launch multiple search strategies simultaneously:
  - Multiple search engines (Google, Bing, DuckDuckGo)
  - Multiple query formulations
  - Different source types (articles, forums, videos, podcasts)
- Use multiple LLMs for different perspectives:
  - Perplexity: Current web info with citations
  - Gemini: Multi-perspective synthesis
  - OpenAI: Structured analysis
  - Claude: Deep reasoning
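The parallel fan-out in Phase 2 could be sketched with `asyncio.gather`. The provider here is a stand-in that sleeps instead of calling a real search engine or LLM API, so `fake_provider`, `discover`, and the result shape are hypothetical.

```python
import asyncio
from typing import Dict, List


async def fake_provider(name: str, query: str, delay: float) -> dict:
    """Stand-in for a real search-engine or LLM call (hypothetical)."""
    await asyncio.sleep(delay)
    return {"provider": name, "query": query, "results": [f"{name} result for {query!r}"]}


async def discover(queries: List[str], providers: Dict[str, float]) -> List[dict]:
    """Run every query formulation against every provider concurrently."""
    tasks = [
        fake_provider(name, query, delay)
        for query in queries
        for name, delay in providers.items()
    ]
    # return_exceptions=True returns failures as values, so one failing
    # provider does not discard the results from the others.
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    return [o for o in outcomes if not isinstance(o, Exception)]
```

Calling `asyncio.run(discover(["query A", "query B"], {"perplexity": 0.1, "gemini": 0.1}))` would issue four calls that overlap in time rather than running back to back.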
Phase 3: Source Evaluation & Expansion
- Evaluate each source for:
  - Credibility (author expertise, publication reputation)
  - Recency (when published/updated)
  - Depth (surface vs comprehensive)
  - Citations (does it cite others? is it cited?)
- Follow promising leads:
  - Sources referenced by good sources
  - Authors who appear multiple times
  - Cross-referenced claims
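The four evaluation criteria above can be combined into a single score for ranking. This is a sketch under stated assumptions: the weights, the normalization constants (2000 words, 20 citations), and the `Source` fields are hypothetical and would need tuning against real results.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Source:
    url: str
    credibility: float  # 0..1, author expertise and publication reputation
    published: date     # publication or last-updated date
    word_count: int     # crude proxy for depth
    cites: int          # how many other sources it cites
    cited_by: int       # how many other sources cite it


# Hypothetical weights; they sum to 1 so the score stays within 0..1.
WEIGHTS = {"credibility": 0.4, "recency": 0.2, "depth": 0.2, "citations": 0.2}


def score_source(src: Source, today: date) -> float:
    """Weighted sum of credibility, recency, depth, and citation signals."""
    age_years = max((today - src.published).days, 0) / 365.0
    recency = 1.0 / (1.0 + age_years)  # 1.0 when new, 0.5 at one year old
    depth = min(src.word_count / 2000.0, 1.0)
    citations = min((src.cites + src.cited_by) / 20.0, 1.0)
    return (
        WEIGHTS["credibility"] * src.credibility
        + WEIGHTS["recency"] * recency
        + WEIGHTS["depth"] * depth
        + WEIGHTS["citations"] * citations
    )
```

Sources scoring above some threshold would then be the "good sources" whose references and recurring authors get followed.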
Phase 4: Deep Scraping
- Don't stop at first page of results
- Go 5-10 pages deep on good queries
- Use browser-control for:
  - JS-rendered content (Reddit, forums)
  - Authenticated pages (user accounts)
  - Sites that block scrapers
- Follow internal links on valuable sources
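Following internal links on a valuable source amounts to a bounded breadth-first crawl restricted to one host. The sketch below uses only the standard library and takes the page fetcher as a parameter, since the browser-control interface is not described in this document; `crawl_internal` and `LinkExtractor` are hypothetical names.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_internal(
    start_url: str, fetch: Callable[[str], str], max_pages: int = 10
) -> Dict[str, str]:
    """Breadth-first crawl of same-host links, stopping after max_pages pages."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages: Dict[str, str] = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)  # fetch is injected: plain HTTP or browser-control
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href).split("#")[0]  # drop fragments
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Injecting `fetch` keeps the crawl logic the same whether a page comes from a plain request or from a rendered, authenticated browser session.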
Phase 5: Multimedia Discovery
- YouTube videos
- TikTok content
- Podcasts (Spotify, Apple)
- Not just text articles
Phase 6: Synthesis
- Organize by theme/question
- Every claim cited
- Note consensus vs disagreement
- Include multimedia resources
- Provide actionable recommendations
- Flag what's still uncertain
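The synthesis rules above (organize by theme, cite every claim, note consensus vs disagreement) can be sketched as a grouping pass over extracted claims. The `Claim` fields, the stance labels, and the three status values are assumptions for illustration, not part of the requirements.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Claim:
    theme: str
    statement: str
    stance: str      # hypothetical labels, e.g. "supports" or "disputes"
    source_url: str


def synthesize(claims: List[Claim]) -> Dict[str, dict]:
    """Group claims by theme and label each theme's level of agreement."""
    by_theme: Dict[str, List[Claim]] = defaultdict(list)
    for claim in claims:
        if not claim.source_url:
            # Enforces the "every claim cited" rule.
            raise ValueError(f"uncited claim: {claim.statement!r}")
        by_theme[claim.theme].append(claim)

    report: Dict[str, dict] = {}
    for theme, items in by_theme.items():
        stances = {c.stance for c in items}
        sources = sorted({c.source_url for c in items})
        if len(stances) > 1:
            status = "disagreement"
        elif len(sources) > 1:
            status = "consensus"
        else:
            status = "single source"  # candidate for "needs verification"
        report[theme] = {"status": status, "sources": sources}
    return report
```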
Output Format
## Research: [Topic]
### Context Applied
- [Domain context that was loaded]
- [Dynamic data fetched - e.g., current point balances]
### Executive Summary
[2-3 paragraph overview of key findings]
### Detailed Findings
#### [Sub-topic 1]
[Deep analysis with inline citations]
#### [Sub-topic 2]
[Deep analysis with inline citations]
### Multimedia Resources
- [Video: Title](url) - description
- [Podcast: Title](url) - description
- [TikTok: @user](url) - description
### Recommendations
1. [Actionable recommendation based on user's context]
2. [Another recommendation]
### What I Discovered You Might Not Know
- [Surprising finding 1]
- [Surprising finding 2]
### Confidence Assessment
- High confidence: [topics]
- Needs verification: [topics]
- Conflicting information: [topics]
### All Sources
[Complete list of every URL referenced]
Technical Implementation
Files to Create/Modify
- Update preferences.json with domain contexts
- Create domain context loader
- Create multi-LLM research orchestrator
- Create source evaluator
- Create deep scraper (uses browser-control)
- Create multimedia searcher
- Create synthesis engine
- Update SKILL.md with new architecture
API Keys Needed
- PERPLEXITY_API_KEY
- GEMINI_API_KEY
- OPENAI_API_KEY
- (Claude runs natively)
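A startup check for the keys listed above might look like the following sketch. It assumes the keys are supplied as environment variables, which the source implies but does not state; `missing_api_keys` is a hypothetical helper.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_KEYS = ("PERPLEXITY_API_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY")


def missing_api_keys(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required keys that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

An orchestrator could call this once before Phase 2 and skip any provider whose key is missing instead of failing the whole run.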
Browser-Control Integration
- Fetch current balances (Marriott, Alaska, Chase)
- Scrape full forum threads
- Access authenticated content
- Handle JS-rendered sites
Success Criteria
- Can run a travel research query and get:
  - Current point balances fetched
  - Transfer bonus opportunities identified
  - 5-6 hotel options with points AND cash prices
  - Flight options with miles needed
  - Deep user reviews from forums
  - Video content discovered
  - Optimization recommendations
- Report takes 30-60 minutes to generate (not 5 minutes)
- Sources include things user didn't know existed
- Every claim is cited
- Recommendations are specific to user's context