zhongwei/gh-krishagel-geoffrey

Zhongwei Li 90883a4d25 Initial commit

2025-11-30 08:35:59 +08:00

Research Skill Enhancement - Requirements

Core Principles

Discovery-driven, not list-driven - Find best sources wherever they are
Context-aware - Grounded in user's domain preferences
Finds unknowns - Value is in discovering what user doesn't know
No fixed time limit - A run can take anywhere from 30 minutes to 3 hours
Exhaustive - Don't stop until topic is exhausted

Domain Contexts

What Context Provides

Relevance filtering - what matters to this user
Constraint awareness - their specific resources/limitations
Optimization targets - what to maximize for them

Domain: Travel

Loyalty: Alaska MVP Gold 75K, Marriott Titanium Elite
Points: Current balances (fetch via browser-control)
Credit cards: Alaska card, Marriott Amex, Chase Sapphire Preferred
Memberships: Epic Pass
Preferences: "Luxury at rock-bottom prices"; research only, never book

Domain: Shopping

Quality standards
Budget ranges
Brand preferences

Domain: Work/Education

PSD context (CIO role)
UDL expertise
Edtech landscape
District priorities

Domain: AI/Coding

Tech stack preferences
Languages/frameworks
Architecture patterns

Domain: Consulting

Service offerings
Client contexts
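
The domain contexts above could be stored as a mapping from domain name to static fields, plus a list of values that must be fetched at run time. This is a minimal sketch under assumptions: the key names (`domains`, `static`, `dynamic`) and the `load_context` helper are hypothetical, not the actual schema of preferences.json.

```python
import json

# Hypothetical layout for the domain section of preferences.json.
# Key names ("domains", "static", "dynamic") are illustrative assumptions.
PREFERENCES_JSON = """
{
  "domains": {
    "travel": {
      "static": {
        "loyalty": ["Alaska MVP Gold 75K", "Marriott Titanium Elite"],
        "credit_cards": ["Alaska card", "Marriott Amex", "Chase Sapphire Preferred"],
        "memberships": ["Epic Pass"],
        "preferences": ["Luxury at rock-bottom prices", "research only, never book"]
      },
      "dynamic": ["point_balances"]
    },
    "shopping": {
      "static": {"quality_standards": [], "budget_ranges": [], "brand_preferences": []},
      "dynamic": []
    }
  }
}
"""


def load_context(raw_json: str, domain: str) -> dict:
    """Return the context for one domain, or an empty context if it is unknown."""
    domains = json.loads(raw_json).get("domains", {})
    return domains.get(domain, {"static": {}, "dynamic": []})


travel = load_context(PREFERENCES_JSON, "travel")
print(travel["static"]["memberships"])
```

Separating `static` from `dynamic` keeps values such as point balances out of the file, since those are meant to be fetched through browser-control each time.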

Research Engine Architecture

Phase 1: Query Understanding

Detect domain from query
Load relevant context
Decompose into sub-questions (5-10 angles)
Identify which types of sources are needed (academic, forum, video, etc.)
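
The Phase 1 steps can be sketched as a small planning function. The keyword lists, the question templates, and the names `detect_domain` and `plan_query` are assumptions made for illustration; a real detector might use an LLM classifier instead of keyword overlap.

```python
from dataclasses import dataclass, field

# Hypothetical keyword lists, one per domain.
DOMAIN_KEYWORDS = {
    "travel": {"hotel", "flight", "points", "miles", "trip"},
    "shopping": {"buy", "price", "brand", "review"},
    "ai_coding": {"python", "framework", "api", "architecture"},
}

# Illustrative angles used to fan one query out into sub-questions.
ANGLES = [
    "What are the best options for {q}?",
    "What do experienced users say about {q}?",
    "What are common mistakes with {q}?",
    "What has changed recently about {q}?",
    "What lesser-known alternatives exist for {q}?",
]


@dataclass
class QueryPlan:
    query: str
    domain: str
    sub_questions: list = field(default_factory=list)


def detect_domain(query: str) -> str:
    """Pick the domain whose keywords overlap the query most; 'general' if none match."""
    words = set(query.lower().split())
    best, best_hits = "general", 0
    for domain, keywords in DOMAIN_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = domain, hits
    return best


def plan_query(query: str) -> QueryPlan:
    """Detect the domain and expand the query into one sub-question per angle."""
    sub_questions = [angle.format(q=query) for angle in ANGLES]
    return QueryPlan(query, detect_domain(query), sub_questions)


plan = plan_query("hotel points redemption in Tokyo")
print(plan.domain, len(plan.sub_questions))
```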

Phase 2: Parallel Multi-Source Discovery

Launch multiple search strategies simultaneously:
- Multiple search engines (Google, Bing, DuckDuckGo)
- Multiple query formulations
- Different source types (articles, forums, videos, podcasts)
Use multiple LLMs for different perspectives:
- Perplexity: Current web info with citations
- Gemini: Multi-perspective synthesis
- OpenAI: Structured analysis
- Claude: Deep reasoning
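
A minimal sketch of the parallel fan-out, assuming each search engine or LLM is wrapped in a function that takes a query and returns a list of results. The two providers here are offline stubs, not real API clients, and `discover` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


# Stand-in providers. Real ones would call a search engine or an LLM API;
# these stubs only echo the query so the sketch runs offline.
def stub_search_engine(query: str) -> list[str]:
    return [f"search result for: {query}"]


def stub_llm(query: str) -> list[str]:
    return [f"llm answer for: {query}"]


PROVIDERS: dict[str, Callable[[str], list[str]]] = {
    "search": stub_search_engine,
    "llm": stub_llm,
}


def discover(queries: list[str], max_workers: int = 8) -> dict[tuple[str, str], list[str]]:
    """Run every (provider, query) pair concurrently; a failure yields an empty list."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(fn, q): (name, q)
            for name, fn in PROVIDERS.items()
            for q in queries
        }
        for future in as_completed(futures):
            key = futures[future]
            try:
                results[key] = future.result()
            except Exception:
                # One failing provider should not abort the whole run.
                results[key] = []
    return results


print(len(discover(["query formulation A", "query formulation B"])))
```

Threads suit this step because the work is dominated by waiting on network responses; keying results by `(provider, query)` keeps track of which formulation produced which lead.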

Phase 3: Source Evaluation & Expansion

Evaluate each source for:
- Credibility (author expertise, publication reputation)
- Recency (when published/updated)
- Depth (surface vs comprehensive)
- Citations (does it cite others? is it cited?)
Follow promising leads:
- Sources referenced by good sources
- Authors who appear multiple times
- Cross-referenced claims
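
The four evaluation criteria can be combined into a single ranking score. The weights, the saturation points (2000 words, 10 citations), and the `Source` fields below are assumptions chosen for illustration and would need tuning.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Source:
    url: str
    credibility: float        # 0..1, judged from author and publication
    published: date
    word_count: int
    outbound_citations: int   # does it cite others?
    inbound_citations: int    # is it cited?


# Illustrative weights; they sum to 1 so the score stays in 0..1.
WEIGHTS = {"credibility": 0.4, "recency": 0.2, "depth": 0.2, "citations": 0.2}


def score(source: Source, today: date) -> float:
    """Combine credibility, recency, depth, and citations into one 0..1 score."""
    age_years = max((today - source.published).days, 0) / 365.0
    recency = 1.0 / (1.0 + age_years)               # 1.0 when new, decays with age
    depth = min(source.word_count / 2000.0, 1.0)    # saturates at 2000 words
    cites = source.outbound_citations + source.inbound_citations
    citations = min(cites / 10.0, 1.0)              # saturates at 10 citations
    return (WEIGHTS["credibility"] * source.credibility
            + WEIGHTS["recency"] * recency
            + WEIGHTS["depth"] * depth
            + WEIGHTS["citations"] * citations)
```

Sources scoring above some threshold would then be the ones whose references and recurring authors get followed.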

Phase 4: Deep Scraping

Don't stop at first page of results
Go 5-10 pages deep on good queries
Use browser-control for:
- JS-rendered content (Reddit, forums)
- Authenticated pages (user accounts)
- Sites that block scrapers
Follow internal links on valuable sources
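
Following internal links can be sketched as a bounded breadth-first crawl. `fetch_links` is a hypothetical injected function (for example a wrapper around browser-control or an HTTP client) that returns the hrefs found on a page; nothing here reflects the actual browser-control interface.

```python
from collections import deque
from typing import Callable
from urllib.parse import urljoin, urlparse


def crawl(start_url: str,
          fetch_links: Callable[[str], list[str]],
          max_pages: int = 10) -> list[str]:
    """Breadth-first walk of same-site links, visiting at most max_pages pages."""
    site = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in fetch_links(url):
            absolute = urljoin(url, href)
            # Stay on the same site and skip pages that are already queued.
            if urlparse(absolute).netloc == site and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Usage with an in-memory link graph instead of a live site.
pages = {
    "https://example.com/": ["/a", "/b", "https://other.example/x"],
    "https://example.com/a": ["/b", "/c"],
}
print(crawl("https://example.com/", lambda u: pages.get(u, [])))
```

The `max_pages` cap is what keeps "go 5-10 pages deep" from turning into an unbounded crawl.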

Phase 5: Multimedia Discovery

YouTube videos
TikTok content
Podcasts (Spotify, Apple)
Go beyond text articles

Phase 6: Synthesis

Organize by theme/question
Cite every claim
Note consensus vs disagreement
Include multimedia resources
Provide actionable recommendations
Flag what's still uncertain
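
One way to organize findings by theme while keeping citations and separating consensus from disagreement is sketched below. The `Finding` fields and the three status labels are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Finding:
    theme: str    # sub-question or theme the claim belongs to
    claim: str
    stance: str   # "supports" or "disputes"
    url: str      # citation backing this finding


def synthesize(findings: list[Finding]) -> dict[str, dict]:
    """Group findings by theme and claim, and label each claim's agreement level."""
    grouped = defaultdict(lambda: defaultdict(list))
    for f in findings:
        grouped[f.theme][f.claim].append(f)

    report = {}
    for theme, claims in grouped.items():
        report[theme] = {}
        for claim, items in claims.items():
            stances = {f.stance for f in items}
            urls = sorted({f.url for f in items})
            if len(stances) > 1:
                status = "conflicting"
            elif len(urls) > 1:
                status = "consensus"
            else:
                status = "single source"   # flag as still uncertain
            report[theme][claim] = {"status": status, "sources": urls}
    return report
```

Because every entry carries its `sources` list, a claim cannot reach the report without at least one citation.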

Output Format

## Research: [Topic]

### Context Applied
- [Domain context that was loaded]
- [Dynamic data fetched - e.g., current point balances]

### Executive Summary
[2-3 paragraph overview of key findings]

### Detailed Findings

#### [Sub-topic 1]
[Deep analysis with inline citations]

#### [Sub-topic 2]
[Deep analysis with inline citations]

### Multimedia Resources
- [Video: Title](url) - description
- [Podcast: Title](url) - description
- [TikTok: @user](url) - description

### Recommendations
1. [Actionable recommendation based on user's context]
2. [Another recommendation]

### What I Discovered You Might Not Know
- [Surprising finding 1]
- [Surprising finding 2]

### Confidence Assessment
- High confidence: [topics]
- Needs verification: [topics]
- Conflicting information: [topics]

### All Sources
[Complete list of every URL referenced]

Technical Implementation

Files to Create/Modify

Update preferences.json with domain contexts
Create domain context loader
Create multi-LLM research orchestrator
Create source evaluator
Create deep scraper (uses browser-control)
Create multimedia searcher
Create synthesis engine
Update SKILL.md with new architecture

API Keys Needed

PERPLEXITY_API_KEY
GEMINI_API_KEY
OPENAI_API_KEY
(Claude runs natively)
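
A small startup check can report which of these keys are missing before a long run begins. This sketch assumes the keys are supplied as environment variables; `missing_keys` is a hypothetical helper name.

```python
import os

REQUIRED_KEYS = ["PERPLEXITY_API_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY"]


def missing_keys(env=os.environ) -> list[str]:
    """Return the required key names that are unset or empty in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


missing = missing_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
```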

Browser-Control Integration

Fetch current balances (Marriott, Alaska, Chase)
Scrape full forum threads
Access authenticated content
Handle JS-rendered sites

Success Criteria

Can run a travel research query and get:
- Current point balances fetched
- Transfer bonus opportunities identified
- 5-6 hotel options with points AND cash prices
- Flight options with miles needed
- Deep user reviews from forums
- Video content discovered
- Optimization recommendations
Report takes 30-60 minutes to generate (not 5 minutes)
Sources include things user didn't know existed
Every claim is cited
Recommendations are specific to user's context
