Initial commit
skills/research-lookup/README.md (new file)
@@ -0,0 +1,116 @@
# Research Lookup Skill

This skill provides real-time research information lookup using Perplexity's Sonar Pro model through OpenRouter.

## Setup

1. **Get an OpenRouter API key:**
   - Visit [openrouter.ai](https://openrouter.ai)
   - Create an account and generate an API key
   - Add credits to your account

2. **Configure the environment:**

   ```bash
   export OPENROUTER_API_KEY="your_api_key_here"
   ```

3. **Test the setup:**

   ```bash
   python scripts/research_lookup.py --model-info
   ```

## Usage

### Command Line Usage

```bash
# Single research query
python scripts/research_lookup.py "Recent advances in CRISPR gene editing 2024"

# Multiple queries with delay
python scripts/research_lookup.py --batch "CRISPR applications" "gene therapy trials" "ethical considerations"

# Claude Code integration (called automatically)
python lookup.py "your research query here"
```
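### Python API Usage

The CLI wraps the `ResearchLookup` class, which can also be called directly. A minimal sketch, assuming you run from the skill directory (so `research_lookup.py` is importable) and `OPENROUTER_API_KEY` is set:

```python
from research_lookup import ResearchLookup

research = ResearchLookup()
result = research.lookup("Recent advances in CRISPR gene editing 2024")

if result["success"]:
    print(result["model"])                    # auto-selected model
    print(result["response"])                 # cited research summary
    print(len(result["citations"]), "citations extracted")
else:
    print("Lookup failed:", result["error"])
```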
### Claude Code Integration

The research lookup tool is automatically available in Claude Code when you:

1. **Ask research questions:** "Research recent advances in quantum computing"
2. **Request literature reviews:** "Find current studies on climate change impacts"
3. **Need citations:** "What are the latest papers on transformer attention mechanisms?"
4. **Want technical information:** "Standard protocols for flow cytometry"

## Features

- **Academic Focus:** Prioritizes peer-reviewed papers and reputable sources
- **Current Information:** Focuses on recent publications (2020-2024)
- **Complete Citations:** Provides full bibliographic information with DOIs
- **Multiple Formats:** Supports various query types and research needs
- **Cost Effective:** Typically $0.01-0.05 per research query

## Query Examples

### Academic Research
- "Recent systematic reviews on AI in medical diagnosis 2024"
- "Meta-analysis of randomized controlled trials for depression treatment"
- "Current state of quantum computing error correction research"

### Technical Methods
- "Standard protocols for immunohistochemistry in tissue samples"
- "Best practices for machine learning model validation"
- "Statistical methods for analyzing longitudinal data"

### Statistical Data
- "Global renewable energy adoption statistics 2024"
- "Prevalence of diabetes in different populations"
- "Market size for the autonomous vehicles industry"

## Response Format

Each research result includes:
- **Summary:** Brief overview of key findings
- **Key Studies:** 3-5 most relevant recent papers
- **Citations:** Complete bibliographic information
- **Usage Stats:** Token usage for cost tracking
- **Timestamp:** When the research was performed
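For programmatic use, each result is returned as a dictionary. A sketch of its shape, based on the `lookup()` return value (field values here are illustrative):

```python
result = {
    "success": True,
    "query": "Recent advances in CRISPR gene editing 2024",
    "response": "...cited research summary...",
    "citations": [{"authors": "Smith et al.", "year": "2024", "type": "extracted"}],
    "model": "perplexity/sonar-pro",
    "timestamp": "2024-06-01 12:00:00",
    "usage": {"total_tokens": 1500},  # used for cost tracking
}
```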
## Integration with Scientific Writing

This skill enhances the scientific writing process by providing:

1. **Literature Reviews:** Current research for introduction sections
2. **Methods Validation:** Verify protocols against current standards
3. **Results Context:** Compare findings with recent similar studies
4. **Discussion Support:** Latest evidence for arguments
5. **Citation Management:** Properly formatted references

## Troubleshooting

**"API key not found"**
- Ensure the `OPENROUTER_API_KEY` environment variable is set
- Check that you have credits in your OpenRouter account

**"Model not available"**
- Verify your API key has access to Perplexity models
- Check the OpenRouter status page for service issues

**"Rate limit exceeded"**
- Add delays between requests using the `--delay` option (see the sketch below)
- Check your OpenRouter account limits
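The Python API can also space requests out for you; `batch_lookup` accepts a `delay` (in seconds) between queries:

```python
from research_lookup import ResearchLookup

research = ResearchLookup()
queries = ["CRISPR applications", "gene therapy trials", "ethical considerations"]

# Sleeps `delay` seconds between consecutive requests to stay under rate limits
results = research.batch_lookup(queries, delay=2.0)
```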
**"No relevant results"**
- Make the query more specific, or broaden it if it is too narrow
- Include time frames (e.g., "2023-2024")
- Use academic keywords and technical terms

## Cost Management

- Monitor usage through the OpenRouter dashboard
- Typical costs: $0.01-0.05 per research query
- Batch processing available for multiple queries
- Consider query specificity to optimize token usage

This skill is designed for academic and research purposes, providing high-quality, cited information to support scientific writing and research activities.
skills/research-lookup/SKILL.md (new file)
@@ -0,0 +1,479 @@
---
name: research-lookup
description: "Look up current research information using Perplexity's Sonar Pro or Sonar Reasoning Pro models through OpenRouter. Automatically selects the best model based on query complexity. Search academic papers, recent studies, technical documentation, and general research information with citations."
allowed-tools: [Read, Write, Edit, Bash]
---

# Research Information Lookup

## Overview

This skill enables real-time research information lookup using Perplexity's Sonar models through OpenRouter. It intelligently selects between **Sonar Pro** (fast, efficient lookup) and **Sonar Reasoning Pro** (deep analytical reasoning) based on query complexity. The skill provides access to current academic literature, recent studies, technical documentation, and general research information with proper citations and source attribution.

## When to Use This Skill

Use this skill when you need:

- **Current Research Information**: Latest studies, papers, and findings in a specific field
- **Literature Verification**: Check facts, statistics, or claims against current research
- **Background Research**: Gather context and supporting evidence for scientific writing
- **Citation Sources**: Find relevant papers and studies to cite in manuscripts
- **Technical Documentation**: Look up specifications, protocols, or methodologies
- **Recent Developments**: Stay current with emerging trends and breakthroughs
- **Statistical Data**: Find recent statistics, survey results, or research findings
- **Expert Opinions**: Access insights from recent interviews, reviews, or commentary

## Visual Enhancement with Scientific Schematics

**When creating documents with this skill, always consider adding scientific diagrams and schematics to enhance visual communication.**

If your document does not already contain schematics or diagrams:
- Use the **scientific-schematics** skill to generate AI-powered publication-quality diagrams
- Simply describe your desired diagram in natural language
- Nano Banana Pro will automatically generate, review, and refine the schematic

**For new documents:** Scientific schematics should be generated by default to visually represent key concepts, workflows, architectures, or relationships described in the text.

**How to generate schematics:**
```bash
python scripts/generate_schematic.py "your diagram description" -o figures/output.png
```

The AI will automatically:
- Create publication-quality images with proper formatting
- Review and refine through multiple iterations
- Ensure accessibility (colorblind-friendly, high contrast)
- Save outputs in the figures/ directory

**When to add schematics:**
- Research information flow diagrams
- Query processing workflow illustrations
- Model selection decision trees
- System integration architecture diagrams
- Information retrieval pipeline visualizations
- Knowledge synthesis frameworks
- Any complex concept that benefits from visualization

For detailed guidance on creating schematics, refer to the scientific-schematics skill documentation.

---

## Core Capabilities

### 1. Academic Research Queries

**Search Academic Literature**: Query for recent papers, studies, and reviews in specific domains:

```
Query Examples:
- "Recent advances in CRISPR gene editing 2024"
- "Latest clinical trials for Alzheimer's disease treatment"
- "Machine learning applications in drug discovery systematic review"
- "Climate change impacts on biodiversity meta-analysis"
```

**Expected Response Format**:
- Summary of key findings from recent literature
- Citations for the 3-5 most relevant papers with authors, titles, journals, and years
- Key statistics or findings highlighted
- Identification of research gaps or controversies
- Links to full papers when available

### 2. Technical and Methodological Information

**Protocol and Method Lookups**: Find detailed procedures, specifications, and methodologies:

```
Query Examples:
- "Western blot protocol for protein detection"
- "RNA sequencing library preparation methods"
- "Statistical power analysis for clinical trials"
- "Machine learning model evaluation metrics"
```

**Expected Response Format**:
- Step-by-step procedures or protocols
- Required materials and equipment
- Critical parameters and considerations
- Troubleshooting for common issues
- References to standard protocols or seminal papers

### 3. Statistical and Data Information

**Research Statistics**: Look up current statistics, survey results, and research data:

```
Query Examples:
- "Prevalence of diabetes in US population 2024"
- "Global renewable energy adoption statistics"
- "COVID-19 vaccination rates by country"
- "AI adoption in healthcare industry survey"
```

**Expected Response Format**:
- Current statistics with dates and sources
- Methodology of data collection
- Confidence intervals or margins of error when available
- Comparison with previous years or benchmarks
- Citations to original surveys or studies

### 4. Citation and Reference Assistance

**Citation Finding**: Locate relevant papers and studies for citation in manuscripts:

```
Query Examples:
- "Foundational papers on transformer architecture"
- "Seminal works in quantum computing"
- "Key studies on climate change mitigation"
- "Landmark trials in cancer immunotherapy"
```

**Expected Response Format**:
- 5-10 most influential or relevant papers
- Complete citation information (authors, title, journal, year, DOI)
- Brief description of each paper's contribution
- Citation impact metrics when available (citation counts)
- Journal impact factors and rankings

## Automatic Model Selection

This skill features **intelligent model selection** based on query complexity:

### Model Types

**1. Sonar Pro** (`perplexity/sonar-pro`)
- **Use Case**: Straightforward information lookup
- **Best For**:
  - Simple fact-finding queries
  - Recent publication searches
  - Basic protocol lookups
  - Statistical data retrieval
- **Speed**: Fast responses
- **Cost**: Lower cost per query

**2. Sonar Reasoning Pro** (`perplexity/sonar-reasoning-pro`)
- **Use Case**: Complex analytical queries requiring deep reasoning
- **Best For**:
  - Comparative analysis ("compare X vs Y")
  - Synthesis of multiple studies
  - Evaluating trade-offs or controversies
  - Explaining mechanisms or relationships
  - Critical analysis and interpretation
- **Speed**: Slower but more thorough
- **Cost**: Higher cost per query, but provides deeper insights

### Complexity Assessment

The skill automatically detects query complexity using these indicators:

**Reasoning Keywords** (trigger Sonar Reasoning Pro):
- Analytical: `compare`, `contrast`, `analyze`, `analysis`, `evaluate`, `critique`
- Comparative: `versus`, `vs`, `vs.`, `compared to`, `differences between`, `similarities`
- Synthesis: `meta-analysis`, `systematic review`, `synthesis`, `integrate`
- Causal: `mechanism`, `why`, `how does`, `how do`, `explain`, `relationship`, `causal relationship`, `underlying mechanism`
- Theoretical: `theoretical framework`, `implications`, `interpret`, `reasoning`
- Debate: `controversy`, `conflicting`, `paradox`, `debate`, `reconcile`
- Trade-offs: `pros and cons`, `advantages and disadvantages`, `trade-off`, `tradeoff`, `trade offs`
- Complexity: `multifaceted`, `complex interaction`, `critical analysis`

**Complexity Scoring** (see the sketch after this list):
- Reasoning keywords: 3 points each (heavily weighted)
- Multiple questions: 2 points per question mark
- Complex sentence structures: 1.5 points per clause indicator (and, or, but, however, whereas, although)
- Very long queries: 1 point if >150 characters
- **Threshold**: Queries scoring ≥3 points trigger Sonar Reasoning Pro
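For reference, a condensed sketch of the scoring logic as implemented in `research_lookup.py` (the keyword list is abbreviated here):

```python
from typing import List

def assess_query_complexity(query: str, reasoning_keywords: List[str]) -> str:
    """Return 'reasoning' or 'pro' based on a weighted complexity score."""
    q = query.lower()
    keyword_hits = sum(1 for kw in reasoning_keywords if kw in q)
    question_count = query.count("?")
    clauses = [" and ", " or ", " but ", " however ", " whereas ", " although "]
    clause_hits = sum(1 for c in clauses if c in q)

    score = (
        keyword_hits * 3                    # reasoning keywords heavily weighted
        + question_count * 2                # multiple questions indicate complexity
        + clause_hits * 1.5                 # multiple clauses suggest nuance
        + (1 if len(query) > 150 else 0)    # very long queries
    )
    return "reasoning" if score >= 3 else "pro"

# A single reasoning keyword already crosses the threshold:
print(assess_query_complexity("Compare CRISPR vs TALENs", ["compare", "vs"]))  # reasoning
```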
**Practical Result**: Even a single strong reasoning keyword (compare, explain, analyze, etc.) will trigger the more powerful Sonar Reasoning Pro model, ensuring you get deep analysis when needed.

**Example Query Classification**:

✅ **Sonar Pro** (straightforward lookup):
- "Recent advances in CRISPR gene editing 2024"
- "Prevalence of diabetes in US population"
- "Western blot protocol for protein detection"

✅ **Sonar Reasoning Pro** (complex analysis):
- "Compare and contrast mRNA vaccines vs traditional vaccines for cancer treatment"
- "Explain the mechanism underlying the relationship between gut microbiome and depression"
- "Analyze the controversy surrounding AI in medical diagnosis and evaluate trade-offs"

### Manual Override

You can force a specific model using the `force_model` parameter:

```python
# Force Sonar Pro for fast lookup
research = ResearchLookup(force_model='pro')

# Force Sonar Reasoning Pro for deep analysis
research = ResearchLookup(force_model='reasoning')

# Automatic selection (default)
research = ResearchLookup()
```

Command-line usage:
```bash
# Force Sonar Pro
python research_lookup.py "your query" --force-model pro

# Force Sonar Reasoning Pro
python research_lookup.py "your query" --force-model reasoning

# Automatic (no flag)
python research_lookup.py "your query"
```

## Technical Integration

### OpenRouter API Configuration

This skill integrates with OpenRouter (openrouter.ai) to access Perplexity's Sonar models:

**Model Specifications**:
- **Models**:
  - `perplexity/sonar-pro` (fast lookup)
  - `perplexity/sonar-reasoning-pro` (deep analysis)
- **Search Mode**: Academic/scholarly mode (prioritizes peer-reviewed sources)
- **Context Window**: 200K+ tokens for comprehensive research
- **Capabilities**: Academic paper search, citation generation, scholarly analysis
- **Output**: Rich responses with citations and source links from academic databases
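Under the hood, the skill issues a standard OpenRouter chat-completions request. A minimal sketch of the call as made in `research_lookup.py` (the system prompt is abbreviated, and the shipped client additionally pins the provider to Perplexity and sets referer headers):

```python
import os
import requests

headers = {
    "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
    "Content-Type": "application/json",
}
payload = {
    "model": "perplexity/sonar-pro",
    "messages": [
        {"role": "system", "content": "You are an academic research assistant..."},
        {"role": "user", "content": "Recent advances in CRISPR gene editing 2024"},
    ],
    "max_tokens": 4000,
    "temperature": 0.1,  # low temperature for factual research
}

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers=headers,
    json=payload,
    timeout=90,
)
response.raise_for_status()
content = response.json()["choices"][0]["message"]["content"]
```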
**API Requirements**:
- OpenRouter API key (set as the `OPENROUTER_API_KEY` environment variable)
- Account with sufficient credits for research queries
- Proper attribution and citation of sources

**Academic Mode Configuration**:
- System message configured to prioritize scholarly sources
- Search focused on peer-reviewed journals and academic publications
- Enhanced citation extraction for academic references
- Preference for recent academic literature (2020-2024)
- Direct access to academic databases and repositories

### Response Quality and Reliability

**Source Verification**: The skill prioritizes:
- Peer-reviewed academic papers and journals
- Reputable institutional sources (universities, government agencies, NGOs)
- Recent publications (within the last 2-3 years preferred)
- High-impact journals and conferences
- Primary research over secondary sources

**Citation Standards**: All responses include:
- Complete bibliographic information
- DOIs or stable URLs when available
- Access dates for web sources
- Clear attribution of direct quotes or data

## Query Best Practices

### 1. Model Selection Strategy

**For Simple Lookups (Sonar Pro)**:
- Recent papers on a specific topic
- Statistical data or prevalence rates
- Standard protocols or methodologies
- Citation finding for specific papers
- Factual information retrieval

**For Complex Analysis (Sonar Reasoning Pro)**:
- Comparative studies and synthesis
- Mechanism explanations
- Controversy evaluation
- Trade-off analysis
- Theoretical frameworks
- Multi-faceted relationships

**Pro Tip**: The automatic selection is optimized for most use cases. Only use `force_model` if you have specific requirements or know the query needs deeper reasoning than detected.

### 2. Specific and Focused Queries

**Good Queries** (will trigger the appropriate model):
- "Randomized controlled trials of mRNA vaccines for cancer treatment 2023-2024" → Sonar Pro
- "Compare the efficacy and safety of mRNA vaccines vs traditional vaccines for cancer treatment" → Sonar Reasoning Pro
- "Explain the mechanism by which CRISPR off-target effects occur and strategies to minimize them" → Sonar Reasoning Pro

**Poor Queries**:
- "Tell me about AI" (too broad)
- "Cancer research" (lacks specificity)
- "Latest news" (too vague)

### 3. Structured Query Format

**Recommended Structure**:
```
[Topic] + [Specific Aspect] + [Time Frame] + [Type of Information]
```

**Examples**:
- "CRISPR gene editing + off-target effects + 2024 + clinical trials"
- "Quantum computing + error correction + recent advances + review papers"
- "Renewable energy + solar efficiency + 2023-2024 + statistical data"

### 4. Follow-up Queries

**Effective Follow-ups**:
- "Show me the full citation for the Smith et al. 2024 paper"
- "What are the limitations of this methodology?"
- "Find similar studies using different approaches"
- "What controversies exist in this research area?"

## Integration with Scientific Writing

This skill enhances scientific writing by providing:

1. **Literature Review Support**: Gather current research for introduction and discussion sections
2. **Methods Validation**: Verify protocols and procedures against current standards
3. **Results Contextualization**: Compare findings with recent similar studies
4. **Discussion Enhancement**: Support arguments with the latest evidence
5. **Citation Management**: Provide properly formatted citations in multiple styles

## Error Handling and Limitations

**Known Limitations**:
- Information cutoff: the model's built-in knowledge is bounded by its training data (typically 2023-2024), though live search mitigates this
- Paywall content: may not access full text behind paywalls
- Emerging research: may miss very recent papers not yet indexed
- Specialized databases: cannot access proprietary or restricted databases

**Error Conditions**:
- API rate limits or quota exceeded
- Network connectivity issues
- Malformed or ambiguous queries
- Model unavailability or maintenance

**Fallback Strategies** (see the sketch after this list):
- Rephrase queries for better clarity
- Break complex queries into simpler components
- Use broader time frames if recent data is unavailable
- Cross-reference with multiple query variations
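One way to apply these strategies programmatically is to walk a list of query variations; a sketch using the skill's own API (the variations here are hypothetical examples):

```python
from research_lookup import ResearchLookup

research = ResearchLookup()

# Try the precise query first, then progressively broader rephrasings
variations = [
    "CRISPR off-target effects clinical trials 2024",
    "CRISPR off-target effects clinical trials",
    "CRISPR off-target effects recent research",
]

result = None
for query in variations:
    result = research.lookup(query)
    if result["success"]:
        break

if result and result["success"]:
    print(result["response"])
```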
## Usage Examples

### Example 1: Simple Literature Search (Sonar Pro)

**Query**: "Recent advances in transformer attention mechanisms 2024"

**Model Selected**: Sonar Pro (straightforward lookup)

**Response Includes**:
- Summary of 5 key papers from 2024
- Complete citations with DOIs
- Key innovations and improvements
- Performance benchmarks
- Future research directions

### Example 2: Comparative Analysis (Sonar Reasoning Pro)

**Query**: "Compare and contrast the advantages and limitations of transformer-based models versus traditional RNNs for sequence modeling"

**Model Selected**: Sonar Reasoning Pro (complex analysis required)

**Response Includes**:
- Detailed comparison across multiple dimensions
- Analysis of architectural differences
- Trade-offs in computational efficiency vs performance
- Use case recommendations
- Synthesis of evidence from multiple studies
- Discussion of ongoing debates in the field

### Example 3: Method Verification (Sonar Pro)

**Query**: "Standard protocols for flow cytometry analysis"

**Model Selected**: Sonar Pro (protocol lookup)

**Response Includes**:
- Step-by-step protocol from a recent review
- Required controls and calibrations
- Common pitfalls and troubleshooting
- Reference to the definitive methodology paper
- Alternative approaches with pros/cons

### Example 4: Mechanism Explanation (Sonar Reasoning Pro)

**Query**: "Explain the underlying mechanism of how mRNA vaccines trigger immune responses and why they differ from traditional vaccines"

**Model Selected**: Sonar Reasoning Pro (requires causal reasoning)

**Response Includes**:
- Detailed mechanistic explanation
- Step-by-step biological processes
- Comparative analysis with traditional vaccines
- Molecular-level interactions
- Integration of immunology and pharmacology concepts
- Evidence from recent research

### Example 5: Statistical Data (Sonar Pro)

**Query**: "Global AI adoption in healthcare statistics 2024"

**Model Selected**: Sonar Pro (data lookup)

**Response Includes**:
- Current adoption rates by region
- Market size and growth projections
- Survey methodology and sample size
- Comparison with previous years
- Citations to market research reports

## Performance and Cost Considerations

### Response Times

**Sonar Pro**:
- Typical response time: 5-15 seconds
- Best for rapid information gathering
- Suitable for batch queries

**Sonar Reasoning Pro**:
- Typical response time: 15-45 seconds
- Worth the wait for complex analytical queries
- Provides more thorough reasoning and synthesis

### Cost Optimization

**Automatic Selection Benefits**:
- Saves costs by using Sonar Pro for straightforward queries
- Reserves Sonar Reasoning Pro for queries that truly benefit from deeper analysis
- Optimizes the balance between cost and quality

**Manual Override Use Cases**:
- Force Sonar Pro when the budget is constrained and speed is the priority
- Force Sonar Reasoning Pro when working on critical research requiring maximum depth
- Use for specific sections of papers (e.g., Pro for methods, Reasoning for discussion)

**Best Practices**:
1. Trust the automatic selection for most use cases
2. Review query results - if Sonar Pro doesn't provide sufficient depth, rephrase with reasoning keywords
3. Use batch queries strategically - combine simple lookups to minimize total query count
4. For literature reviews, start with Sonar Pro for breadth, then use Sonar Reasoning Pro for synthesis

## Security and Ethical Considerations

**Responsible Use**:
- Verify all information against primary sources when possible
- Clearly attribute all data and quotes to original sources
- Avoid presenting AI-generated summaries as original research
- Respect copyright and licensing restrictions
- Use for research assistance, not to bypass paywalls or subscriptions

**Academic Integrity**:
- Always cite original sources, not the AI tool
- Use as a starting point for literature searches
- Follow institutional guidelines for AI tool usage
- Maintain transparency about research methods

## Summary

This skill serves as a powerful research assistant with intelligent dual-model selection:

- **Automatic Intelligence**: Analyzes query complexity and selects the optimal model (Sonar Pro or Sonar Reasoning Pro)
- **Cost-Effective**: Uses the faster, cheaper Sonar Pro for straightforward lookups
- **Deep Analysis**: Automatically engages Sonar Reasoning Pro for complex comparative, analytical, and theoretical queries
- **Flexible Control**: Manual override available when you know exactly what level of analysis you need
- **Academic Focus**: Both models are configured to prioritize peer-reviewed sources and scholarly literature

Whether you need quick fact-finding or deep analytical synthesis, this skill automatically adapts to deliver the right level of research support for your scientific writing needs.
skills/research-lookup/examples.py (new file)
@@ -0,0 +1,174 @@
#!/usr/bin/env python3
"""
Example usage of the Research Lookup skill with automatic model selection.

This script demonstrates:
1. Automatic model selection based on query complexity
2. Manual model override options
3. Batch query processing
4. Integration with scientific writing workflows
"""

import os

from research_lookup import ResearchLookup


def example_automatic_selection():
    """Demonstrate automatic model selection."""
    print("=" * 80)
    print("EXAMPLE 1: Automatic Model Selection")
    print("=" * 80)
    print()

    research = ResearchLookup()

    # Simple lookup - will use Sonar Pro
    query1 = "Recent advances in CRISPR gene editing 2024"
    print(f"Query: {query1}")
    print("Expected model: Sonar Pro (fast lookup)")
    result1 = research.lookup(query1)
    print(f"Actual model: {result1.get('model')}")
    print()

    # Complex analysis - will use Sonar Reasoning Pro
    query2 = "Compare and contrast the efficacy of mRNA vaccines versus traditional vaccines"
    print(f"Query: {query2}")
    print("Expected model: Sonar Reasoning Pro (analytical)")
    result2 = research.lookup(query2)
    print(f"Actual model: {result2.get('model')}")
    print()


def example_manual_override():
    """Demonstrate manual model override."""
    print("=" * 80)
    print("EXAMPLE 2: Manual Model Override")
    print("=" * 80)
    print()

    # Force Sonar Pro for budget-constrained rapid lookup
    research_pro = ResearchLookup(force_model='pro')
    query = "Explain the mechanism of CRISPR-Cas9"
    print(f"Query: {query}")
    print("Forced model: Sonar Pro")
    result = research_pro.lookup(query)
    print(f"Model used: {result.get('model')}")
    print()

    # Force Sonar Reasoning Pro for critical analysis
    research_reasoning = ResearchLookup(force_model='reasoning')
    print(f"Query: {query}")
    print("Forced model: Sonar Reasoning Pro")
    result = research_reasoning.lookup(query)
    print(f"Model used: {result.get('model')}")
    print()


def example_batch_queries():
    """Demonstrate batch query processing."""
    print("=" * 80)
    print("EXAMPLE 3: Batch Query Processing")
    print("=" * 80)
    print()

    research = ResearchLookup()

    # Mix of simple and complex queries
    queries = [
        "Recent clinical trials for Alzheimer's disease",  # Sonar Pro
        "Compare deep learning vs traditional ML in drug discovery",  # Sonar Reasoning Pro
        "Statistical power analysis methods",  # Sonar Pro
    ]

    print("Processing batch queries...")
    print("Each query will automatically select the appropriate model")
    print()

    results = research.batch_lookup(queries, delay=1.0)

    for i, result in enumerate(results):
        print(f"Query {i+1}: {result['query'][:50]}...")
        print(f"  Model: {result.get('model')}")
        print(f"  Type: {result.get('model_type')}")
        print()


def example_scientific_writing_workflow():
    """Demonstrate integration with scientific writing workflow."""
    print("=" * 80)
    print("EXAMPLE 4: Scientific Writing Workflow")
    print("=" * 80)
    print()

    research = ResearchLookup()

    # Literature review phase - use Pro for breadth
    print("PHASE 1: Literature Review (Breadth)")
    lit_queries = [
        "Recent papers on machine learning in genomics 2024",
        "Clinical applications of AI in radiology",
        "RNA sequencing analysis methods"
    ]

    for query in lit_queries:
        print(f"  - {query}")
        # These will automatically use Sonar Pro
    print()

    # Discussion phase - use Reasoning Pro for synthesis
    print("PHASE 2: Discussion (Synthesis & Analysis)")
    discussion_queries = [
        "Compare the advantages and limitations of different ML approaches in genomics",
        "Explain the relationship between model interpretability and clinical adoption",
        "Analyze the ethical implications of AI in medical diagnosis"
    ]

    for query in discussion_queries:
        print(f"  - {query}")
        # These will automatically use Sonar Reasoning Pro
    print()


def main():
    """Run all examples (requires OPENROUTER_API_KEY to be set)."""

    if not os.getenv("OPENROUTER_API_KEY"):
        print("Note: Set OPENROUTER_API_KEY environment variable to run live queries")
        print("These examples show the structure without making actual API calls")
        print()

    # Uncomment to run examples (requires API key)
    # example_automatic_selection()
    # example_manual_override()
    # example_batch_queries()
    # example_scientific_writing_workflow()

    # Show complexity assessment without API calls
    print("=" * 80)
    print("COMPLEXITY ASSESSMENT EXAMPLES (No API calls required)")
    print("=" * 80)
    print()

    os.environ.setdefault("OPENROUTER_API_KEY", "test")
    research = ResearchLookup()

    test_queries = [
        ("Recent CRISPR studies", "pro"),
        ("Compare CRISPR vs TALENs", "reasoning"),
        ("Explain how CRISPR works", "reasoning"),
        ("Western blot protocol", "pro"),
        ("Pros and cons of different sequencing methods", "reasoning"),
    ]

    for query, expected in test_queries:
        complexity = research._assess_query_complexity(query)
        model_name = "Sonar Reasoning Pro" if complexity == "reasoning" else "Sonar Pro"
        status = "✓" if complexity == expected else "✗"
        print(f"{status} '{query}'")
        print(f"  → {model_name}")
        print()


if __name__ == "__main__":
    main()
skills/research-lookup/lookup.py (new executable file)
@@ -0,0 +1,93 @@
#!/usr/bin/env python3
"""
Research Lookup Tool for Claude Code
Performs research queries using Perplexity Sonar Pro via OpenRouter.
"""

import os
import sys
from typing import Dict

# Import the main research lookup class
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'scripts'))
from research_lookup import ResearchLookup


def format_response(result: Dict) -> str:
    """Format the research result for display."""
    if not result["success"]:
        return f"❌ Research lookup failed: {result['error']}"

    response = result["response"]
    citations = result["citations"]

    # Format the output for Claude Code
    output = f"""🔍 **Research Results**

**Query:** {result['query']}
**Model:** {result['model']}
**Timestamp:** {result['timestamp']}

---

{response}

"""

    if citations:
        output += f"\n**Extracted Citations ({len(citations)}):**\n"
        for i, citation in enumerate(citations, 1):
            if citation.get("doi"):
                output += f"{i}. DOI: {citation['doi']}\n"
            elif citation.get("authors") and citation.get("year"):
                output += f"{i}. {citation['authors']} ({citation['year']})\n"
            else:
                output += f"{i}. {citation}\n"

    if result.get("usage"):
        usage = result["usage"]
        output += f"\n**Usage:** {usage.get('total_tokens', 'N/A')} tokens"

    return output


def main():
    """Main entry point for the Claude Code tool."""
    # Check for API key
    if not os.getenv("OPENROUTER_API_KEY"):
        print("❌ Error: OPENROUTER_API_KEY environment variable not set")
        print("Please set it in your .env file or export it:")
        print("  export OPENROUTER_API_KEY='your_openrouter_api_key'")
        return 1

    # Get query from command line arguments
    if len(sys.argv) < 2:
        print("❌ Error: No query provided")
        print("Usage: python lookup.py 'your research query here'")
        return 1

    query = " ".join(sys.argv[1:])

    try:
        # Initialize research tool
        research = ResearchLookup()

        # Perform lookup
        print(f"🔍 Researching: {query}")
        result = research.lookup(query)

        # Format and output result
        formatted_output = format_response(result)
        print(formatted_output)

        # Return success code
        return 0 if result["success"] else 1

    except Exception as e:
        print(f"❌ Error: {str(e)}")
        return 1


if __name__ == "__main__":
    exit(main())
skills/research-lookup/research_lookup.py (new file)
@@ -0,0 +1,335 @@
#!/usr/bin/env python3
"""
Research Information Lookup Tool
Uses Perplexity's Sonar Pro or Sonar Reasoning Pro models through OpenRouter.
Automatically selects the appropriate model based on query complexity.
"""

import os
import time
from datetime import datetime
from typing import Dict, List, Optional, Any

import requests


class ResearchLookup:
    """Research information lookup using Perplexity Sonar models via OpenRouter."""

    # Complexity indicators for determining which model to use
    REASONING_KEYWORDS = [
        'compare', 'contrast', 'analyze', 'analysis', 'synthesis', 'meta-analysis',
        'systematic review', 'evaluate', 'critique', 'trade-off', 'tradeoff',
        'relationship', 'versus', 'vs', 'vs.', 'compared to',
        'mechanism', 'why', 'how does', 'how do', 'explain', 'theoretical framework',
        'implications', 'debate', 'controversy', 'conflicting', 'paradox',
        'reconcile', 'integrate', 'multifaceted', 'complex interaction',
        'causal relationship', 'underlying mechanism', 'interpret', 'reasoning',
        'pros and cons', 'advantages and disadvantages', 'critical analysis',
        'differences between', 'similarities', 'trade offs'
    ]

    def __init__(self, force_model: Optional[str] = None):
        """
        Initialize the research lookup tool.

        Args:
            force_model: Optional model override ('pro' or 'reasoning').
                If None, automatically selects based on query complexity.
        """
        self.api_key = os.getenv("OPENROUTER_API_KEY")
        if not self.api_key:
            raise ValueError("OPENROUTER_API_KEY environment variable not set")

        self.base_url = "https://openrouter.ai/api/v1"
        self.model_pro = "perplexity/sonar-pro"  # Fast, efficient lookup
        self.model_reasoning = "perplexity/sonar-reasoning-pro"  # Deep analysis
        self.force_model = force_model
        self.headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
            "HTTP-Referer": "https://scientific-writer.local",  # Replace with your domain
            "X-Title": "Scientific Writer Research Tool"
        }

    def _assess_query_complexity(self, query: str) -> str:
        """
        Assess query complexity to determine which model to use.

        Returns:
            'reasoning' for complex analytical queries, 'pro' for straightforward lookups
        """
        query_lower = query.lower()

        # Count reasoning keywords
        reasoning_count = sum(1 for keyword in self.REASONING_KEYWORDS if keyword in query_lower)

        # Count questions (multiple questions suggest complexity)
        question_count = query.count('?')

        # Check for multiple clauses (complexity indicators)
        clause_indicators = [' and ', ' or ', ' but ', ' however ', ' whereas ', ' although ']
        clause_count = sum(1 for indicator in clause_indicators if indicator in query_lower)

        # Complexity score
        complexity_score = (
            reasoning_count * 3 +  # Reasoning keywords heavily weighted
            question_count * 2 +  # Multiple questions indicate complexity
            clause_count * 1.5 +  # Multiple clauses suggest nuance
            (1 if len(query) > 150 else 0)  # Long queries often more complex
        )

        # Threshold for using reasoning model (lowered to 3 to catch single reasoning keywords)
        return 'reasoning' if complexity_score >= 3 else 'pro'

    def _select_model(self, query: str) -> str:
        """Select the appropriate model based on query complexity or force override."""
        if self.force_model:
            return self.model_reasoning if self.force_model == 'reasoning' else self.model_pro

        complexity_level = self._assess_query_complexity(query)
        return self.model_reasoning if complexity_level == 'reasoning' else self.model_pro

    def _make_request(self, messages: List[Dict[str, str]], model: str, **kwargs) -> Dict[str, Any]:
        """Make a request to the OpenRouter API."""
        data = {
            "model": model,
            "messages": messages,
            "max_tokens": 4000,
            "temperature": 0.1,  # Low temperature for factual research
            "provider": {
                "order": ["Perplexity"],
                "allow_fallbacks": False
            },
            **kwargs
        }

        try:
            response = requests.post(
                f"{self.base_url}/chat/completions",
                headers=self.headers,
                json=data,
                timeout=90  # Increased timeout for the reasoning model
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            raise Exception(f"API request failed: {str(e)}")

    def _format_research_prompt(self, query: str) -> str:
        """Format the query for optimal research results."""
        return f"""You are an expert research assistant. Please provide comprehensive, accurate research information for the following query: "{query}"

IMPORTANT INSTRUCTIONS:
1. Focus on ACADEMIC and SCIENTIFIC sources (peer-reviewed papers, reputable journals, institutional research)
2. Include RECENT information (prioritize 2020-2024 publications)
3. Provide COMPLETE citations with authors, title, journal/conference, year, and DOI when available
4. Structure your response with clear sections and proper attribution
5. Be comprehensive but concise - aim for 800-1200 words
6. Include key findings, methodologies, and implications when relevant
7. Note any controversies, limitations, or conflicting evidence

RESPONSE FORMAT:
- Start with a brief summary (2-3 sentences)
- Present key findings and studies in organized sections
- End with future directions or research gaps if applicable
- Include 5-8 high-quality citations at the end

Remember: This is for academic research purposes. Prioritize accuracy, completeness, and proper attribution."""

    def lookup(self, query: str) -> Dict[str, Any]:
        """Perform a research lookup for the given query."""
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

        # Select the appropriate model based on query complexity
        selected_model = self._select_model(query)
        model_type = "reasoning" if "reasoning" in selected_model else "standard"

        print(f"[Research] Using {selected_model} (detected complexity: {model_type})")

        # Format the research prompt
        research_prompt = self._format_research_prompt(query)

        # Prepare messages for the API with a system message for academic mode
        messages = [
            {
                "role": "system",
                "content": "You are an academic research assistant. Focus exclusively on scholarly sources: peer-reviewed journals, academic papers, research institutions, and reputable scientific publications. Prioritize recent academic literature (2020-2024) and provide complete citations with DOIs. Use academic/scholarly search mode."
            },
            {"role": "user", "content": research_prompt}
        ]

        try:
            # Make the API request with the selected model
            response = self._make_request(messages, model=selected_model)

            # Extract the response content
            if "choices" in response and len(response["choices"]) > 0:
                choice = response["choices"][0]
                if "message" in choice and "content" in choice["message"]:
                    content = choice["message"]["content"]

                    # Extract citations if present (basic regex extraction)
                    citations = self._extract_citations(content)

                    return {
                        "success": True,
                        "query": query,
                        "response": content,
                        "citations": citations,
                        "timestamp": timestamp,
                        "model": selected_model,
                        "model_type": model_type,
                        "usage": response.get("usage", {})
                    }
                else:
                    raise Exception("Invalid response format from API")
            else:
                raise Exception("No response choices received from API")

        except Exception as e:
            return {
                "success": False,
                "query": query,
                "error": str(e),
                "timestamp": timestamp,
                "model": selected_model,
                "model_type": model_type
            }

    def _extract_citations(self, text: str) -> List[Dict[str, str]]:
        """Extract potential citations from the response text."""
        # This is a simple citation extractor - in practice, you might want
        # to use a more sophisticated approach or rely on the model's structured output

        citations = []

        # Look for common citation patterns
        import re

        # Pattern for author et al. year
        author_pattern = r'([A-Z][a-z]+(?:\s+[A-Z]\.)*(?:\s+et\s+al\.)?)\s*\((\d{4})\)'
        matches = re.findall(author_pattern, text)

        for author, year in matches:
            citations.append({
                "authors": author,
                "year": year,
                "type": "extracted"
            })

        # Look for DOI patterns
        doi_pattern = r'doi:\s*([^\s\)\]]+)'
        doi_matches = re.findall(doi_pattern, text, re.IGNORECASE)

        for doi in doi_matches:
            citations.append({
                "doi": doi.strip(),
                "type": "doi"
            })

        return citations

    def batch_lookup(self, queries: List[str], delay: float = 1.0) -> List[Dict[str, Any]]:
        """Perform multiple research lookups with an optional delay between requests."""
        results = []

        for i, query in enumerate(queries):
            if i > 0 and delay > 0:
                time.sleep(delay)  # Rate limiting

            result = self.lookup(query)
            results.append(result)

            # Print progress
            print(f"[Research] Completed query {i+1}/{len(queries)}: {query[:50]}...")

        return results

    def get_model_info(self) -> Dict[str, Any]:
        """Get information about available models from OpenRouter."""
        try:
            response = requests.get(
                f"{self.base_url}/models",
                headers=self.headers,
                timeout=30
            )
            response.raise_for_status()
            return response.json()
        except Exception as e:
            return {"error": str(e)}


def main():
    """Command-line interface for testing the research lookup tool."""
    import argparse

    parser = argparse.ArgumentParser(description="Research Information Lookup Tool")
    parser.add_argument("query", nargs="?", help="Research query to look up")
    parser.add_argument("--model-info", action="store_true", help="Show available models")
    parser.add_argument("--batch", nargs="+", help="Run multiple queries")
    parser.add_argument("--force-model", choices=['pro', 'reasoning'],
                        help="Force use of a specific model (pro=fast lookup, reasoning=deep analysis)")

    args = parser.parse_args()

    # Check for API key
    if not os.getenv("OPENROUTER_API_KEY"):
        print("Error: OPENROUTER_API_KEY environment variable not set")
        print("Please set it in your .env file or export it:")
        print("  export OPENROUTER_API_KEY='your_openrouter_api_key'")
        return 1

    try:
        research = ResearchLookup(force_model=args.force_model)

        if args.model_info:
            print("Available models from OpenRouter:")
            models = research.get_model_info()
            if "data" in models:
                for model in models["data"]:
                    if "perplexity" in model["id"].lower():
                        print(f"  - {model['id']}: {model.get('name', 'N/A')}")
            return 0

        if not args.query and not args.batch:
            parser.print_help()
            return 1

        if args.batch:
            print(f"Running batch research for {len(args.batch)} queries...")
            results = research.batch_lookup(args.batch)
        else:
            print(f"Researching: {args.query}")
            results = [research.lookup(args.query)]

        # Display results
        for i, result in enumerate(results):
            if result["success"]:
                print(f"\n{'='*80}")
                print(f"Query {i+1}: {result['query']}")
                print(f"Timestamp: {result['timestamp']}")
                print(f"Model: {result['model']} ({result.get('model_type', 'unknown')})")
                print(f"{'='*80}")
                print(result["response"])

                if result["citations"]:
                    print(f"\nExtracted Citations ({len(result['citations'])}):")
                    for j, citation in enumerate(result["citations"]):
                        print(f"  {j+1}. {citation}")

                if result["usage"]:
                    print(f"\nUsage: {result['usage']}")
            else:
                print(f"\nError in query {i+1}: {result['error']}")

        return 0

    except Exception as e:
        print(f"Error: {str(e)}")
        return 1


if __name__ == "__main__":
    exit(main())
skills/research-lookup/scripts/research_lookup.py (new executable file)
@@ -0,0 +1,261 @@
#!/usr/bin/env python3
"""
Research Information Lookup Tool
Uses Perplexity's Sonar model through OpenRouter for academic research queries.
"""

import os
import time
from datetime import datetime
from typing import Dict, List, Any

import requests


class ResearchLookup:
    """Research information lookup using Perplexity Sonar via OpenRouter."""

    def __init__(self):
        """Initialize the research lookup tool."""
        self.api_key = os.getenv("OPENROUTER_API_KEY")
        if not self.api_key:
            raise ValueError("OPENROUTER_API_KEY environment variable not set")

        self.base_url = "https://openrouter.ai/api/v1"
        self.model = "perplexity/sonar-reasoning-pro"  # Perplexity Sonar Reasoning Pro with online search
        self.headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
            "HTTP-Referer": "https://scientific-writer.local",  # Replace with your domain
            "X-Title": "Scientific Writer Research Tool"
        }

    def _make_request(self, messages: List[Dict[str, str]], **kwargs) -> Dict[str, Any]:
        """Make a request to the OpenRouter API."""
        data = {
            "model": self.model,
            "messages": messages,
            "max_tokens": 8000,
            "temperature": 0.1,  # Low temperature for factual research
            **kwargs
        }

        try:
            response = requests.post(
                f"{self.base_url}/chat/completions",
                headers=self.headers,
                json=data,
                timeout=60
            )
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            raise Exception(f"API request failed: {str(e)}")

    def _format_research_prompt(self, query: str) -> str:
        """Format the query for optimal research results."""
        return f"""You are an expert research assistant. Please provide comprehensive, accurate research information for the following query: "{query}"

IMPORTANT INSTRUCTIONS:
1. Focus on ACADEMIC and SCIENTIFIC sources (peer-reviewed papers, reputable journals, institutional research)
2. Include RECENT information (prioritize 2020-2024 publications)
3. Provide COMPLETE citations with authors, title, journal/conference, year, and DOI when available
4. Structure your response with clear sections and proper attribution
5. Be comprehensive but concise - aim for 800-1200 words
6. Include key findings, methodologies, and implications when relevant
7. Note any controversies, limitations, or conflicting evidence

RESPONSE FORMAT:
- Start with a brief summary (2-3 sentences)
- Present key findings and studies in organized sections
- End with future directions or research gaps if applicable
- Include 5-8 high-quality citations at the end

Remember: This is for academic research purposes. Prioritize accuracy, completeness, and proper attribution."""

    def lookup(self, query: str) -> Dict[str, Any]:
        """Perform a research lookup for the given query."""
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

        # Format the research prompt
        research_prompt = self._format_research_prompt(query)

        # Prepare messages for the API with a system message for academic mode
        messages = [
            {
                "role": "system",
                "content": "You are an academic research assistant. Focus exclusively on scholarly sources: peer-reviewed journals, academic papers, research institutions, and reputable scientific publications. Prioritize recent academic literature (2020-2024) and provide complete citations with DOIs. Use academic/scholarly search mode."
            },
            {"role": "user", "content": research_prompt}
        ]

        try:
            # Make the API request
            response = self._make_request(messages)

            # Extract the response content
            if "choices" in response and len(response["choices"]) > 0:
                choice = response["choices"][0]
                if "message" in choice and "content" in choice["message"]:
                    content = choice["message"]["content"]

                    # Extract citations if present (basic regex extraction)
                    citations = self._extract_citations(content)

                    return {
                        "success": True,
                        "query": query,
                        "response": content,
                        "citations": citations,
                        "timestamp": timestamp,
                        "model": self.model,
                        "usage": response.get("usage", {})
                    }
                else:
                    raise Exception("Invalid response format from API")
            else:
                raise Exception("No response choices received from API")

        except Exception as e:
            return {
                "success": False,
                "query": query,
                "error": str(e),
                "timestamp": timestamp,
                "model": self.model
            }

    def _extract_citations(self, text: str) -> List[Dict[str, str]]:
        """Extract potential citations from the response text."""
        # This is a simple citation extractor - in practice, you might want
        # to use a more sophisticated approach or rely on the model's structured output

        citations = []

        # Look for common citation patterns
        import re

        # Pattern for author et al. year
        author_pattern = r'([A-Z][a-z]+(?:\s+[A-Z]\.)*(?:\s+et\s+al\.)?)\s*\((\d{4})\)'
        matches = re.findall(author_pattern, text)

        for author, year in matches:
            citations.append({
                "authors": author,
                "year": year,
                "type": "extracted"
            })

        # Look for DOI patterns
        doi_pattern = r'doi:\s*([^\s\)\]]+)'
        doi_matches = re.findall(doi_pattern, text, re.IGNORECASE)

        for doi in doi_matches:
            citations.append({
                "doi": doi.strip(),
                "type": "doi"
            })

        return citations

    def batch_lookup(self, queries: List[str], delay: float = 1.0) -> List[Dict[str, Any]]:
        """Perform multiple research lookups with an optional delay between requests."""
        results = []

        for i, query in enumerate(queries):
            if i > 0 and delay > 0:
                time.sleep(delay)  # Rate limiting

            result = self.lookup(query)
            results.append(result)

            # Print progress
            print(f"[Research] Completed query {i+1}/{len(queries)}: {query[:50]}...")

        return results

    def get_model_info(self) -> Dict[str, Any]:
        """Get information about available models from OpenRouter."""
        try:
            response = requests.get(
                f"{self.base_url}/models",
                headers=self.headers,
                timeout=30
            )
            response.raise_for_status()
            return response.json()
        except Exception as e:
            return {"error": str(e)}


def main():
    """Command-line interface for testing the research lookup tool."""
    import argparse

    parser = argparse.ArgumentParser(description="Research Information Lookup Tool")
    parser.add_argument("query", nargs="?", help="Research query to look up")
    parser.add_argument("--model-info", action="store_true", help="Show available models")
    parser.add_argument("--batch", nargs="+", help="Run multiple queries")

    args = parser.parse_args()

    # Check for API key
    if not os.getenv("OPENROUTER_API_KEY"):
        print("Error: OPENROUTER_API_KEY environment variable not set")
        print("Please set it in your .env file or export it:")
        print("  export OPENROUTER_API_KEY='your_openrouter_api_key'")
        return 1

    try:
        research = ResearchLookup()

        if args.model_info:
            print("Available models from OpenRouter:")
            models = research.get_model_info()
            if "data" in models:
                for model in models["data"]:
                    if "perplexity" in model["id"].lower():
                        print(f"  - {model['id']}: {model.get('name', 'N/A')}")
            return 0

        if not args.query and not args.batch:
            print("Error: No query provided. Use --model-info to see available models.")
            return 1

        if args.batch:
            print(f"Running batch research for {len(args.batch)} queries...")
            results = research.batch_lookup(args.batch)
        else:
            print(f"Researching: {args.query}")
            results = [research.lookup(args.query)]

        # Display results
        for i, result in enumerate(results):
            if result["success"]:
                print(f"\n{'='*80}")
                print(f"Query {i+1}: {result['query']}")
                print(f"Timestamp: {result['timestamp']}")
                print(f"Model: {result['model']}")
                print(f"{'='*80}")
                print(result["response"])

                if result["citations"]:
                    print(f"\nExtracted Citations ({len(result['citations'])}):")
                    for j, citation in enumerate(result["citations"]):
                        print(f"  {j+1}. {citation}")

                if result["usage"]:
                    print(f"\nUsage: {result['usage']}")
            else:
                print(f"\nError in query {i+1}: {result['error']}")

        return 0

    except Exception as e:
        print(f"Error: {str(e)}")
        return 1


if __name__ == "__main__":
    exit(main())