# Research Pipeline API Reference

## Core Classes

### Denario

The main class for orchestrating research workflows.

#### Initialization

```python
from denario import Denario

den = Denario(project_dir="path/to/project")
```

**Parameters:**
- `project_dir` (str): Path to the research project directory where all outputs will be stored

#### Methods

##### set_data_description()

Define the research context by describing available data and analytical tools.

```python
den.set_data_description(description: str)
```

**Parameters:**
- `description` (str): Text describing the dataset, available tools, research domain, and any relevant context

**Example:**
```python
den.set_data_description("""
Available data: Time-series temperature measurements from 2010-2023
Tools: pandas, scipy, sklearn, matplotlib
Domain: Climate science
Research interest: Identifying seasonal patterns and long-term trends
""")
```

**Purpose:** This establishes the foundation for automated idea generation by providing context about what data is available and what analyses are feasible.

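Because the description is free text, one option is to assemble it from structured fields so that no piece of context is forgotten. The sketch below is a minimal illustration: `build_data_description` and its parameter names are hypothetical helpers, not part of Denario, and the field labels simply mirror the example above.

```python
def build_data_description(dataset, tools, domain, interests, challenges=None):
    """Assemble a multi-line description string from structured fields.

    Hypothetical helper: not part of the denario package.
    """
    lines = [
        f"Available data: {dataset}",
        f"Tools: {', '.join(tools)}",
        f"Domain: {domain}",
        f"Research interest: {interests}",
    ]
    # Optional field, only included when provided.
    if challenges:
        lines.append(f"Known challenges: {challenges}")
    return "\n".join(lines)


description = build_data_description(
    dataset="Time-series temperature measurements from 2010-2023",
    tools=["pandas", "scipy", "sklearn", "matplotlib"],
    domain="Climate science",
    interests="Identifying seasonal patterns and long-term trends",
)
print(description)
```

The resulting string can then be passed to `den.set_data_description(description)`.
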
##### get_idea()

Generate research hypotheses based on the data description.

```python
den.get_idea()
```

**Returns:** Research idea/hypothesis (stored internally in project directory)

**Output:** Creates a file containing the generated research question or hypothesis

**Example:**
```python
den.get_idea()
# Generates ideas like: "Investigate the correlation between seasonal temperature
# variations and long-term warming trends using time-series decomposition"
```

##### set_idea()

Manually specify a research idea instead of generating one.

```python
den.set_idea(idea: str)
```

**Parameters:**
- `idea` (str): The research hypothesis or question to investigate

**Example:**
```python
den.set_idea("Analyze the impact of El Niño events on regional temperature anomalies")
```

**Use case:** When you have a specific research direction and want to skip automated idea generation.

##### get_method()

Develop a research methodology based on the idea and data description.

```python
den.get_method()
```

**Returns:** Methodology document (stored internally in project directory)

**Output:** Creates a structured methodology including:
- Analytical approach
- Statistical methods to apply
- Validation strategies
- Expected outputs

**Example:**
```python
den.get_method()
# Generates methodology: "Apply seasonal decomposition, compute correlation coefficients,
# perform statistical significance tests, generate visualization plots..."
```

##### set_method()

Provide a custom methodology instead of generating one.

```python
den.set_method(method: str)
den.set_method(method: Path)  # Can also accept file paths
```

**Parameters:**
- `method` (str or Path): Methodology description or path to markdown file containing methodology

**Example:**
```python
# From string
den.set_method("""
1. Apply seasonal decomposition using STL
2. Compute Pearson correlation coefficients
3. Perform Mann-Kendall trend test
4. Generate time-series plots with confidence intervals
""")

# From file
den.set_method("methodology.md")
```

##### get_results()

Execute the methodology, perform computations, and generate results.

```python
den.get_results()
```

**Returns:** Results document with analysis outputs (stored internally in project directory)

**Output:** Creates results including:
- Computed statistics
- Generated figures and visualizations
- Data tables
- Analysis findings

**Example:**
```python
den.get_results()
# Executes the methodology, runs analyses, creates plots, compiles findings
```

**Note:** This is where the actual computational work happens. The agent executes code to perform the analyses specified in the methodology.

##### set_results()

Provide pre-computed results instead of generating them.

```python
den.set_results(results: str)
den.set_results(results: Path)  # Can also accept file paths
```

**Parameters:**
- `results` (str or Path): Results description or path to markdown file containing results

**Example:**
```python
# From string
den.set_results("""
Analysis Results:
- Correlation coefficient: 0.78 (p < 0.001)
- Seasonal amplitude: 5.2°C
- Long-term trend: +0.15°C per decade
- Figure 1: Seasonal decomposition (see attached)
""")

# From file
den.set_results("results.md")
```

**Use case:** When analyses were performed externally or when iterating on paper writing without re-running computations.

##### get_paper()

Generate a publication-ready LaTeX paper with the research findings.

```python
den.get_paper(journal: Journal = None)
```

**Parameters:**
- `journal` (Journal, optional): Target journal for formatting. Defaults to generic format.

**Returns:** LaTeX paper with proper formatting (stored in project directory)

**Output:** Creates:
- Complete LaTeX source file
- Compiled PDF (if LaTeX is available)
- Integrated figures and tables
- Properly formatted bibliography

**Example:**
```python
from denario import Journal

den.get_paper(journal=Journal.APS)
# Generates paper.tex and paper.pdf formatted for APS journals
```

### Journal Enum

Enumeration of supported journal formats.

```python
from denario import Journal
```

#### Available Journals

- `Journal.APS` - American Physical Society format
  - Suitable for Physical Review, Physical Review Letters, etc.
  - Uses RevTeX document class

Additional journal formats may be available. Check the latest denario documentation for the complete list.

#### Usage

```python
from denario import Denario, Journal

den = Denario(project_dir="./research")
# ... complete workflow ...
den.get_paper(journal=Journal.APS)
```

## Workflow Patterns

### Fully Automated Pipeline

Let denario handle every stage:

```python
from denario import Denario, Journal

den = Denario(project_dir="./automated_research")

# Define context
den.set_data_description("""
Dataset: Sensor readings from IoT devices
Tools: pandas, numpy, sklearn, matplotlib
Goal: Anomaly detection in sensor networks
""")

# Automate entire pipeline
den.get_idea()        # Generate research idea
den.get_method()      # Develop methodology
den.get_results()     # Execute analysis
den.get_paper(journal=Journal.APS)  # Create paper
```

### Custom Idea, Automated Execution

Provide your research question, automate the rest:

```python
from denario import Denario, Journal

den = Denario(project_dir="./custom_idea")

den.set_data_description("Dataset: Financial time-series data...")

# Manual idea
den.set_idea("Investigate predictive models for stock market volatility using LSTM networks")

# Automated execution
den.get_method()
den.get_results()
den.get_paper(journal=Journal.APS)
```

### Fully Manual with Template Generation

Use denario only for paper formatting:

```python
from denario import Denario, Journal

den = Denario(project_dir="./manual_research")

# Provide everything manually
den.set_data_description("Pre-existing dataset description...")
den.set_idea("Pre-defined research hypothesis")
den.set_method("methodology.md")  # Load from file
den.set_results("results.md")     # Load from file

# Generate formatted paper
den.get_paper(journal=Journal.APS)
```

### Iterative Refinement

Refine specific stages without re-running everything:

```python
from denario import Denario, Journal

den = Denario(project_dir="./iterative")

# Initial run
den.set_data_description("Dataset description...")
den.get_idea()
den.get_method()
den.get_results()

# Refine methodology after reviewing results
den.set_method("""
Revised methodology:
- Use different statistical test
- Add sensitivity analysis
- Include cross-validation
""")

# Re-run only downstream stages
den.get_results()  # Re-execute with new method
den.get_paper(journal=Journal.APS)
```

## Project Directory Structure

After running a complete workflow, the project directory contains:

```
project_dir/
├── data_description.txt    # Input: data context
├── idea.md                 # Generated or provided research idea
├── methodology.md          # Generated or provided methodology
├── results.md              # Generated or provided results
├── figures/                # Generated visualizations
│   ├── figure_1.png
│   ├── figure_2.png
│   └── ...
├── paper.tex               # Generated LaTeX source
├── paper.pdf               # Compiled PDF (if LaTeX available)
└── logs/                   # Agent execution logs
    └── ...
```

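To see at a glance which stages have already produced output, the listing above can be checked with a few lines of standard-library Python. This is only a sketch: `completed_outputs` is a hypothetical helper, and it assumes the file names shown in the listing, which may differ between Denario versions.

```python
from pathlib import Path

# File names taken from the directory listing above (assumed, not verified
# against any particular denario release).
EXPECTED_OUTPUTS = {
    "data description": "data_description.txt",
    "idea": "idea.md",
    "methodology": "methodology.md",
    "results": "results.md",
    "paper source": "paper.tex",
    "paper PDF": "paper.pdf",
}


def completed_outputs(project_dir):
    """Map each expected output to True if its file exists in project_dir."""
    root = Path(project_dir)
    return {
        name: (root / filename).is_file()
        for name, filename in EXPECTED_OUTPUTS.items()
    }


if __name__ == "__main__":
    for name, present in completed_outputs("path/to/project").items():
        print(f"{name:18s} {'found' if present else 'missing'}")
```

A missing `paper.pdf` alongside an existing `paper.tex` would suggest that LaTeX compilation did not run.
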
## Advanced Features

### Multiagent Orchestration

Denario uses the AG2 and LangGraph frameworks to coordinate multiple specialized agents:

- **Idea Agent**: Generates research hypotheses from data descriptions
- **Method Agent**: Develops analytical methodologies
- **Execution Agent**: Runs computations and creates visualizations
- **Writing Agent**: Produces publication-ready manuscripts

These agents collaborate automatically, with each stage building on the outputs of the previous one.

### Integration with Scientific Tools

Denario integrates with common scientific Python libraries:

- **pandas**: Data manipulation and analysis
- **scikit-learn**: Machine learning algorithms
- **scipy**: Scientific computing and statistics
- **matplotlib/seaborn**: Visualization
- **numpy**: Numerical operations

When generating results, denario can automatically write and execute code using these libraries.

### Reproducibility

All stages produce structured outputs saved to the project directory:

- Version control friendly (markdown and LaTeX)
- Auditable (logs of agent decisions and code execution)
- Reproducible (saved methodologies can be re-run)

### Literature Search

Denario includes capabilities for literature searches to provide context for research ideas and methodology development. See `examples.md` for literature search workflows.

## Error Handling

### Common Issues

**Missing data description:**
```python
den = Denario(project_dir="./project")
den.get_idea()  # Error: must call set_data_description() first
```

**Solution:** Always set data description before generating ideas.

**Missing prerequisite stages:**
```python
den = Denario(project_dir="./project")
den.get_results()  # Error: must have idea and method first
```

**Solution:** Follow the workflow order or manually set prerequisite stages.

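One way to make a failure like this easier to diagnose is to run the stages through a small wrapper that records how far the pipeline got. The sketch below is an illustration under stated assumptions: `run_stages` is a hypothetical helper, the exception types Denario raises are not documented here (so it catches `Exception` broadly), and `FakePipeline` is a stand-in so the example runs without denario installed.

```python
def run_stages(den, stages=("get_idea", "get_method", "get_results")):
    """Call each stage method on den in order and stop at the first failure.

    Returns (completed, error): the names of the stages that finished and
    the exception raised by the failing stage, or None if all succeeded.
    """
    completed = []
    for name in stages:
        try:
            getattr(den, name)()
        except Exception as exc:  # exact exception types are an unknown here
            return completed, exc
        completed.append(name)
    return completed, None


class FakePipeline:
    """Stand-in for a Denario instance, used only for this demonstration."""

    def __init__(self):
        self.has_idea = False

    def get_idea(self):
        self.has_idea = True

    def get_method(self):
        if not self.has_idea:
            raise RuntimeError("idea missing")

    def get_results(self):
        pass


# Skipping get_idea makes get_method fail, so nothing completes.
completed, error = run_stages(FakePipeline(), stages=("get_method", "get_results"))
print(completed, repr(error))
```

With a real `Denario` instance, the returned list shows which stage to re-run or to supply manually through the corresponding `set_*` method.
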
**LaTeX compilation errors:**
```python
den.get_paper()  # May fail if LaTeX not installed
```

**Solution:** Install a LaTeX distribution or use a Docker image with LaTeX pre-installed.

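Before calling `get_paper()`, it can help to check whether a LaTeX engine is on the `PATH` at all. The sketch below uses only the standard library; `latex_available` is a hypothetical helper, and the list of engine names is an assumption, since which engine Denario invokes is not stated here.

```python
import shutil


def latex_available(engines=("pdflatex", "xelatex", "lualatex")):
    """Return the name of the first LaTeX engine found on PATH, or None."""
    for engine in engines:
        if shutil.which(engine):
            return engine
    return None


engine = latex_available()
if engine is None:
    print("No LaTeX engine found on PATH; PDF compilation will likely fail.")
else:
    print(f"Found LaTeX engine: {engine}")
```

Finding an engine does not guarantee a successful build, because required packages or document classes may still be missing.
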
## Best Practices

### Data Description Quality

Provide detailed context for better idea generation:

```python
# Good: Detailed and specific
den.set_data_description("""
Dataset: 10 years of daily temperature readings from 50 weather stations
Format: CSV with columns [date, station_id, temperature, humidity]
Tools available: pandas, scipy, sklearn, matplotlib, seaborn
Domain: Climatology
Research interests: Climate change, seasonal patterns, regional variations
Known challenges: Missing data in 2015, station 23 has calibration issues
""")

# Bad: Too vague
den.set_data_description("Temperature data from weather stations")
```

### Methodology Validation

Review generated methodologies before executing:

```python
den.get_method()
# Review the methodology.md file in project_dir
# If needed, refine with set_method()
```

### Incremental Development

Build the research pipeline incrementally:

```python
from denario import Denario, Journal

den = Denario(project_dir="./project")

# Stage 1: Validate idea generation
den.set_data_description("...")
den.get_idea()
# Review idea.md, adjust if needed

# Stage 2: Validate methodology
den.get_method()
# Review methodology.md, adjust if needed

# Stage 3: Execute and validate results
den.get_results()
# Review results.md and figures/

# Stage 4: Generate paper
den.get_paper(journal=Journal.APS)
```

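The review points in the stages above can also be made explicit with a small gate that pauses after each stage. This is a sketch, not a Denario feature: `run_with_review` and its `approve` callback are hypothetical, and the `SimpleNamespace` object stands in for a `Denario` instance so the example runs on its own.

```python
from types import SimpleNamespace


def run_with_review(den, stages, approve):
    """Run stage methods one at a time, calling approve(name) after each.

    Stops as soon as approve returns False. Returns the stages that ran.
    """
    ran = []
    for name in stages:
        getattr(den, name)()
        ran.append(name)
        if not approve(name):
            break
    return ran


# Stand-in object so the sketch runs without denario installed.
demo = SimpleNamespace(
    get_idea=lambda: None,
    get_method=lambda: None,
    get_results=lambda: None,
)

# Approve the idea, then stop after reviewing the methodology.
ran = run_with_review(
    demo,
    stages=("get_idea", "get_method", "get_results"),
    approve=lambda name: name != "get_method",
)
print(ran)  # ['get_idea', 'get_method']
```

In interactive use, `approve` could prompt the user, for example by returning `input(f"Continue after {name}? [y/n] ") == "y"`.
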
### Version Control Integration

Initialize Git in the project directory to track changes:

```bash
cd project_dir
git init
git add .
git commit -m "Initial research workflow"
```

Commit after each stage to track the evolution of your research.

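Committing after each stage can be scripted from the same Python session that drives the pipeline. The following is a minimal sketch using the standard `subprocess` module; `stage_commit_commands` and `commit_stage` are hypothetical helpers, and the commit message format is only a suggestion.

```python
import subprocess


def stage_commit_commands(stage):
    """Return the git commands that snapshot the project after a stage."""
    return [
        ["git", "add", "."],
        ["git", "commit", "-m", f"Complete stage: {stage}"],
    ]


def commit_stage(project_dir, stage):
    """Run the commands inside project_dir.

    Raises subprocess.CalledProcessError if a command exits non-zero.
    """
    for cmd in stage_commit_commands(stage):
        subprocess.run(cmd, cwd=project_dir, check=True)


# Show what would be executed after the idea stage.
for cmd in stage_commit_commands("idea"):
    print(" ".join(cmd))
```

Note that `git commit` exits with a non-zero status when there is nothing to commit, so `commit_stage` raises in that case; callers that re-run an unchanged stage may want to catch the exception.
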