Initial commit

Zhongwei Li
2025-11-30 08:30:10 +08:00
commit f0bd18fb4e
824 changed files with 331919 additions and 0 deletions

@@ -0,0 +1,494 @@
# Denario Examples
## Complete End-to-End Research Example
This example demonstrates a full research pipeline from data to publication.
### Setup
```python
from denario import Denario, Journal
import os
# Create project directory
os.makedirs("climate_research", exist_ok=True)
den = Denario(project_dir="./climate_research")
```
### Define Research Context
```python
den.set_data_description("""
Available data: Global temperature anomaly dataset (1880-2023)
- Monthly mean temperature deviations from 1951-1980 baseline
- Global coverage with land and ocean measurements
- Format: CSV with columns [year, month, temperature_anomaly]
Available tools:
- pandas for data manipulation
- scipy for statistical analysis
- sklearn for regression modeling
- matplotlib and seaborn for visualization
Research domain: Climate science
Research goal: Quantify and characterize long-term global warming trends
Data source: NASA GISTEMP
Known characteristics: Strong autocorrelation, seasonal patterns, missing data pre-1900
""")
```
### Execute Full Pipeline
```python
# Generate research idea
den.get_idea()
# Output: "Quantify the rate of global temperature increase using
# linear regression and assess acceleration in warming trends"
# Develop methodology
den.get_method()
# Output: Creates methodology including:
# - Time-series preprocessing
# - Linear trend analysis
# - Moving average smoothing
# - Statistical significance testing
# - Visualization of trends
# Execute analysis
den.get_results()
# Output: Runs the analysis, generates:
# - Computed trend: +0.18°C per decade
# - Statistical tests: p < 0.001
# - Figure 1: Temperature anomaly over time with trend line
# - Figure 2: Decadal averages
# - Figure 3: Acceleration analysis
# Generate publication
den.get_paper(journal=Journal.APS)
# Output: Creates formatted LaTeX paper with:
# - Title, abstract, introduction
# - Methods section
# - Results with embedded figures
# - Discussion and conclusions
# - References
```
### Review Outputs
```bash
tree climate_research/
# climate_research/
# ├── data_description.txt
# ├── idea.md
# ├── methodology.md
# ├── results.md
# ├── figures/
# │ ├── temperature_trend.png
# │ ├── decadal_averages.png
# │ └── acceleration_analysis.png
# ├── paper.tex
# └── paper.pdf
```
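A quick way to confirm the run produced everything is to check for the files in the listing above. The following is a minimal sketch; `EXPECTED_OUTPUTS` and `missing_outputs` are hypothetical names introduced here, and the file names are taken from the tree above, so they may differ between denario versions.

```python
from pathlib import Path

# File names copied from the directory listing above (assumption: your
# denario version writes the same names).
EXPECTED_OUTPUTS = [
    "data_description.txt",
    "idea.md",
    "methodology.md",
    "results.md",
    "paper.tex",
    "paper.pdf",
]


def missing_outputs(project_dir):
    """Return the expected output files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]


missing = missing_outputs("climate_research")
print("All outputs present" if not missing else f"Missing: {missing}")
```

A missing `paper.pdf` alone usually points at the LaTeX step rather than at the analysis stages.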
## Enhancing Input Descriptions
Improve data descriptions for better idea generation.
### Basic Description
```python
den = Denario(project_dir="./enhanced_input")
# Start with minimal description
den.set_data_description("Gene expression data from cancer patients")
```
### Enhanced Description
```python
# Enhance with specifics
den.set_data_description("""
Dataset: Gene expression microarray data from breast cancer patients
- Sample size: 500 patients (250 responders, 250 non-responders to therapy)
- Features: Expression levels of 20,000 genes
- Format: CSV matrix (samples × genes)
- Clinical metadata: Age, tumor stage, treatment response, survival time
Available analytical tools:
- pandas for data processing
- sklearn for machine learning (PCA, random forests, SVM)
- lifelines for survival analysis
- matplotlib/seaborn for visualization
Research objectives:
- Identify gene signatures predictive of treatment response
- Discover potential therapeutic targets
- Validate findings using cross-validation
Data characteristics:
- Normalized log2 expression values
- Some missing data (<5% of values)
- Batch effects corrected
""")
den.get_idea()
# Now generates more specific and relevant research ideas
```
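Long descriptions are easier to maintain when they are assembled from structured pieces. The sketch below shows one possible way to do that; `build_description` is a hypothetical helper, not part of denario, and the only denario call it feeds is `set_data_description()` with a plain string.

```python
from typing import Dict, List, Union


def build_description(sections: Dict[str, Union[str, List[str]]]) -> str:
    """Render titled sections as a plain-text data description.

    A string value becomes "Title: value"; a list becomes a bulleted block.
    """
    lines = []
    for title, content in sections.items():
        if isinstance(content, str):
            lines.append(f"{title}: {content}")
        else:
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in content)
    return "\n".join(lines)


description = build_description({
    "Dataset": "Gene expression microarray data from breast cancer patients",
    "Available analytical tools": [
        "pandas for data processing",
        "lifelines for survival analysis",
    ],
    "Research objectives": [
        "Identify gene signatures predictive of treatment response",
    ],
})
print(description)
# The string can then be passed on: den.set_data_description(description)
```

Keeping the sections in a dictionary makes it straightforward to add or drop context between runs and compare the ideas each version produces.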
## Literature Search Integration
Incorporate existing research into your workflow.
### Example: Finding Related Work
```python
den = Denario(project_dir="./literature_review")
# Define research area
den.set_data_description("""
Research area: Machine learning for protein structure prediction
Available data: Protein sequence database with known structures
Tools: Biopython, TensorFlow, scikit-learn
""")
# Set the research idea manually
den.set_idea("Develop a deep learning model for predicting protein secondary structure from amino acid sequences")
# NOTE: Literature search functionality would be integrated here
# The specific API for literature search should be checked in denario's documentation
# Example conceptual usage:
# den.search_literature(keywords=["protein structure prediction", "deep learning", "LSTM"])
# This would inform methodology and provide citations for the paper
```
## Generate Research Ideas from Data
Focus on idea generation without full pipeline execution.
### Example: Brainstorming Research Questions
```python
den = Denario(project_dir="./idea_generation")
# Provide comprehensive data description
den.set_data_description("""
Available datasets:
1. Social media sentiment data (1M tweets, 2020-2023)
2. Stock market prices (S&P 500, daily, 2020-2023)
3. Economic indicators (GDP, unemployment, inflation)
Tools: pandas, sklearn, statsmodels, Prophet, VADER sentiment analysis
Domain: Computational social science and finance
Research interests: Market prediction, sentiment analysis, causal inference
""")
# Generate an idea (whether repeated calls yield alternative ideas depends on the denario API)
den.get_idea()
# Review the generated idea in idea.md
# Decide whether to proceed or regenerate
```
## Writing a Paper from Existing Results
Use denario for paper generation when analysis is already complete.
### Example: Formatting Existing Research
```python
den = Denario(project_dir="./paper_generation")
# Provide all components manually
den.set_data_description("""
Completed analysis of traffic pattern data from urban sensors
Dataset: 6 months of traffic flow measurements from 100 intersections
Analysis completed using R and Python
""")
den.set_idea("""
Research question: Optimize traffic light timing using reinforcement learning
to reduce congestion and improve traffic flow efficiency
""")
den.set_method("""
# Methodology
## Data Collection
Traffic flow data collected from 100 intersections in downtown area from
January-June 2023. Measurements include vehicle counts, wait times, and
queue lengths at 1-minute intervals.
## Model Development
Developed a Deep Q-Network (DQN) reinforcement learning agent to optimize
traffic light timing. State space includes current queue lengths and
historical flow patterns. Actions correspond to light timing adjustments.
## Training
Trained the agent using historical data with a reward function based on
total wait time reduction. Used experience replay and target networks for
stable learning.
## Validation
Validated using held-out test data and compared against:
- Current fixed-timing system
- Actuated control system
- Alternative RL algorithms (A3C, PPO)
## Metrics
- Average wait time reduction
- Total throughput improvement
- Queue length distribution
- Computational efficiency
""")
den.set_results("""
# Results
## Training Performance
The DQN agent converged after 500,000 training episodes. Training time: 12 hours
on NVIDIA V100 GPU.
## Wait Time Reduction
- Current system: Average wait time 45.2 seconds
- DQN system: Average wait time 32.8 seconds
- Improvement: 27.4% reduction (p < 0.001)
## Throughput Analysis
- Vehicles processed per hour increased from 2,850 to 3,420 (+20%)
- Peak hour congestion reduced by 35%
## Comparison with Baselines
- Actuated control: 38.1 seconds average wait (DQN still 14% better)
- A3C: 34.5 seconds (DQN slightly better, 5%)
- PPO: 33.2 seconds (DQN marginally better, 1%)
## Queue Length Analysis
Maximum queue length reduced from 42 vehicles to 28 vehicles during peak hours.
## Figures
- Figure 1: Training curve showing convergence
- Figure 2: Wait time distribution comparison
- Figure 3: Throughput over time of day
- Figure 4: Heatmap of queue lengths across intersections
""")
# Generate publication-ready paper
den.get_paper(journal=Journal.APS)
```
## Fast Mode with Gemini
Use Google's Gemini models for faster execution.
### Example: Rapid Prototyping
```python
# Configure for fast mode (conceptual - check denario documentation)
# This would involve setting appropriate LLM backend
den = Denario(project_dir="./fast_research")
# Same workflow, optimized for speed
den.set_data_description("""
Quick analysis needed: Monthly sales data (2 years)
Goal: Identify seasonal patterns and forecast next quarter
Tools: pandas, Prophet
""")
# Fast execution
den.get_idea()
den.get_method()
den.get_results()
den.get_paper()
# Trade-off: Faster execution, potentially less detailed analysis
```
## Hybrid Workflow: Custom Idea + Automated Method
Combine manual and automated approaches.
### Example: Directed Research
```python
den = Denario(project_dir="./hybrid_workflow")
# Describe data
den.set_data_description("""
Medical imaging dataset: 10,000 chest X-rays
Labels: Normal, pneumonia, COVID-19
Format: 224x224 grayscale PNG files
Tools: TensorFlow, Keras, scikit-learn, OpenCV
""")
# Provide specific research direction
den.set_idea("""
Develop a transfer learning approach using pre-trained ResNet50 for multi-class
classification of chest X-rays, with focus on interpretability using Grad-CAM
to identify diagnostic regions
""")
# Let denario develop the methodology
den.get_method()
# Review methodology, then execute
den.get_results()
# Generate paper
den.get_paper(journal=Journal.APS)
```
## Time-Series Analysis Example
Specialized example for temporal data.
### Example: Economic Forecasting
```python
den = Denario(project_dir="./time_series_analysis")
den.set_data_description("""
Dataset: Monthly unemployment rates (US, 1950-2023)
Additional features: GDP growth, inflation, interest rates
Format: Multivariate time-series DataFrame
Tools: statsmodels, Prophet, pmdarima, sklearn
Analysis goals:
- Model unemployment trends
- Forecast next 12 months
- Identify leading indicators
- Assess forecast uncertainty
Data characteristics:
- Seasonal patterns (annual cycles)
- Structural breaks (recessions)
- Autocorrelation present
- Non-stationary (unit root)
""")
den.get_idea()
# Might generate: "Develop a SARIMAX model incorporating economic indicators
# as exogenous variables to forecast unemployment with confidence intervals"
den.get_method()
den.get_results()
den.get_paper(journal=Journal.APS)
```
## Machine Learning Pipeline Example
Complete ML workflow with validation.
### Example: Predictive Modeling
```python
den = Denario(project_dir="./ml_pipeline")
den.set_data_description("""
Dataset: Customer churn prediction
- 50,000 customers, 30 features (demographics, usage patterns, service history)
- Binary target: churned (1) or retained (0)
- Imbalanced: 20% churn rate
- Features: Numerical and categorical mixed
Available tools:
- pandas for preprocessing
- sklearn and xgboost for modeling (random forest, XGBoost, logistic regression)
- imblearn for handling imbalance
- SHAP for feature importance
Goals:
- Build predictive model for churn
- Identify key churn factors
- Provide actionable insights
- Achieve AUC-ROC > 0.85
""")
den.get_idea()
# Might generate: "Develop an ensemble model combining XGBoost and Random Forest
# with SMOTE oversampling, and use SHAP values to identify interpretable
# churn risk factors"
den.get_method()
# Will include: train/test split, cross-validation, hyperparameter tuning,
# performance metrics, feature importance analysis
den.get_results()
# Executes full ML pipeline, generates:
# - Model performance metrics
# - ROC curves
# - Feature importance plots
# - Confusion matrices
den.get_paper(journal=Journal.APS)
```
## Tips for Effective Usage
### Provide Rich Context
More context → better ideas and methodologies:
```python
# Include:
# - Data characteristics (size, format, quality issues)
# - Available tools and libraries
# - Domain-specific knowledge
# - Research objectives and constraints
# - Known challenges or considerations
```
### Iterate on Intermediate Outputs
Review and refine at each stage:
```python
# Generate
den.get_idea()
# Review idea.md
# If needed, refine:
den.set_idea("Refined version of the idea")
# Continue
den.get_method()
# Review methodology.md
# Refine if needed, then proceed
```
### Save Your Workflow
Document the complete pipeline:
```python
# Save workflow script
with open("research_workflow.py", "w") as f:
    f.write("""
from denario import Denario, Journal
den = Denario(project_dir="./project")
den.set_data_description("...")
den.get_idea()
den.get_method()
den.get_results()
den.get_paper(journal=Journal.APS)
""")
```
### Use Version Control
Track research evolution:
```bash
cd project_dir
git init
git add .
git commit -m "Initial data description"
# After each stage
git add .
git commit -m "Generated research idea"
# ... continue committing after each stage
```

@@ -0,0 +1,213 @@
# Installation Guide
## System Requirements
- **Python**: Version 3.12 or higher (required)
- **Operating System**: Linux, macOS, or Windows
- **Virtual Environment**: Recommended for isolation
- **LaTeX**: Required for paper generation (or use Docker)
## Installation Methods
### Method 1: Using uv (Recommended)
The uv package manager provides fast, reliable dependency resolution:
```bash
# Initialize a new project
uv init
# Add denario with app support
uv add "denario[app]"
```
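### Method 2: Alternative Installation
Alternative installation into a virtual environment, using uv's pip-compatible interface (plain `pip install "denario[app]"` also works inside the activated environment):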
```bash
# Create virtual environment (recommended)
python3 -m venv denario_env
source denario_env/bin/activate # On Windows: denario_env\Scripts\activate
# Install denario
uv pip install "denario[app]"
```
### Method 3: Building from Source
For development or customization:
```bash
# Clone the repository
git clone https://github.com/AstroPilot-AI/Denario.git
cd Denario
# Create virtual environment
python3 -m venv Denario_env
source Denario_env/bin/activate
# Install in editable mode
uv pip install -e .
```
### Method 4: Docker Deployment
Docker provides a complete environment with all dependencies including LaTeX:
```bash
# Pull the official image
docker pull pablovd/denario:latest
# Run the container with GUI
docker run -p 8501:8501 --rm pablovd/denario:latest
# Run with environment variables (for API keys)
docker run -p 8501:8501 --env-file .env --rm pablovd/denario:latest
```
Access the GUI at `http://localhost:8501` after the container starts.
## Verifying Installation
After installation, verify denario is available:
```bash
# Test import
python -c "from denario import Denario; print('Denario installed successfully')"
```
Or check the version:
```bash
python -c "import denario; print(denario.__version__)"
```
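The prerequisites listed in this guide can also be checked from Python before the first run. This is a minimal sketch under stated assumptions: `check_environment` is a hypothetical helper, and the presence of `pdflatex` on the PATH is used only as a rough proxy for a working LaTeX installation.

```python
import importlib.util
import shutil
import sys


def check_environment():
    """Report whether the documented prerequisites appear to be met."""
    return {
        # This guide states that Python 3.12 or higher is required.
        "python_3_12_plus": sys.version_info >= (3, 12),
        # find_spec locates the package without importing it.
        "denario_importable": importlib.util.find_spec("denario") is not None,
        # Assumption: pdflatex on PATH indicates a usable LaTeX install.
        "pdflatex_on_path": shutil.which("pdflatex") is not None,
    }


for name, ok in check_environment().items():
    print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If only `pdflatex_on_path` is reported as missing, the analysis stages may still work and the Docker image remains an option for paper generation.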
## Launching the Application
### Command-Line Interface
Run the graphical user interface:
```bash
denario run
```
This launches a web-based Streamlit application for interactive research workflow management.
### Programmatic Usage
Use denario directly in Python scripts:
```python
from denario import Denario
den = Denario(project_dir="./my_project")
# Continue with workflow...
```
## Dependencies
Denario automatically installs key dependencies:
- **AG2**: Agent orchestration framework
- **LangGraph**: Graph-based agent workflows
- **pandas**: Data manipulation
- **scikit-learn**: Machine learning tools
- **matplotlib/seaborn**: Visualization
- **streamlit**: GUI framework (with `[app]` extra)
## LaTeX Setup
For paper generation, LaTeX must be available:
### Linux
```bash
sudo apt-get install texlive-full
```
### macOS
```bash
brew install --cask mactex
```
### Windows
Download and install [MiKTeX](https://miktex.org/download) or [TeX Live](https://tug.org/texlive/).
### Docker Alternative
The Docker image includes a complete LaTeX installation, eliminating manual setup.
## Troubleshooting Installation
### Python Version Issues
Ensure Python 3.12+:
```bash
python --version
```
If older, install a newer version or use pyenv for version management.
### Virtual Environment Activation
**Linux/macOS:**
```bash
source venv/bin/activate
```
**Windows:**
```bat
venv\Scripts\activate
```
### Permission Errors
Use `--user` flag or virtual environments:
```bash
pip install --user "denario[app]"
```
### Docker Port Conflicts
If port 8501 is in use, map to a different port:
```bash
docker run -p 8502:8501 --rm pablovd/denario:latest
```
### Package Conflicts
Create a fresh virtual environment to avoid dependency conflicts.
## Updating Denario
### uv
```bash
uv add --upgrade denario
```
### pip
```bash
uv pip install --upgrade "denario[app]"
```
### Docker
```bash
docker pull pablovd/denario:latest
```
## Uninstallation
### uv
```bash
uv remove denario
```
### pip
```bash
uv pip uninstall denario
```
### Docker
```bash
docker rmi pablovd/denario:latest
```

@@ -0,0 +1,265 @@
# LLM API Configuration
## Overview
Denario requires API credentials from supported LLM providers to power its multiagent research system. The system is built on AG2 and LangGraph, which support multiple LLM backends.
## Supported LLM Providers
### Google Vertex AI
- Full integration with Google's Vertex AI platform
- Supports Gemini and PaLM models
- Requires Google Cloud project setup
### OpenAI
- GPT-4, GPT-3.5, and other OpenAI models
- Direct API integration
### Other Providers
- Any LLM compatible with AG2/LangGraph frameworks
- Anthropic Claude (via compatible interfaces)
- Azure OpenAI
- Custom model endpoints
## Obtaining API Keys
### Google Vertex AI
1. **Create Google Cloud Project**
- Navigate to [Google Cloud Console](https://console.cloud.google.com/)
- Create a new project or select existing
2. **Enable Vertex AI API**
- Go to "APIs & Services" → "Library"
- Search for "Vertex AI API"
- Click "Enable"
3. **Create Service Account**
- Navigate to "IAM & Admin" → "Service Accounts"
- Create service account with Vertex AI permissions
- Download JSON key file
4. **Set up authentication**
```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
```
### OpenAI
1. **Create OpenAI Account**
- Visit [platform.openai.com](https://platform.openai.com/)
- Sign up or log in
2. **Generate API Key**
- Navigate to API Keys section
- Click "Create new secret key"
- Copy and store securely
3. **Set environment variable**
```bash
export OPENAI_API_KEY="sk-..."
```
## Storing API Keys
### Method 1: Environment Variables (Recommended)
**Linux/macOS:**
```bash
export OPENAI_API_KEY="your-key-here"
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
```
Add to `~/.bashrc`, `~/.zshrc`, or `~/.bash_profile` for persistence.
**Windows:**
```bat
set OPENAI_API_KEY=your-key-here
```
Or use System Properties → Environment Variables for persistence.
### Method 2: .env Files
Create a `.env` file in your project directory:
```env
# OpenAI Configuration
OPENAI_API_KEY=sk-your-openai-key-here
OPENAI_MODEL=gpt-4
# Google Vertex AI Configuration
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
GOOGLE_CLOUD_PROJECT=your-project-id
# Optional: Model preferences
DEFAULT_MODEL=gpt-4
TEMPERATURE=0.7
```
Load the environment file in Python:
```python
from dotenv import load_dotenv
load_dotenv()
from denario import Denario
den = Denario(project_dir="./project")
```
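It can help to fail early when a credential is absent rather than partway through a pipeline run. The sketch below checks for required settings after the environment has been loaded; `missing_settings` is a hypothetical helper, and which variables are actually required depends on the provider you use.

```python
import os


def missing_settings(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Variable name taken from the .env example above; extend for other providers.
missing = missing_settings(["OPENAI_API_KEY"])
if missing:
    print(f"Missing settings: {', '.join(missing)}")
else:
    print("All required settings are present")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.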
### Method 3: Docker Environment Files
For Docker deployments, pass environment variables:
```bash
# Using --env-file flag
docker run -p 8501:8501 --env-file .env --rm pablovd/denario:latest
# Using -e flag for individual variables
docker run -p 8501:8501 \
-e OPENAI_API_KEY=sk-... \
-e GOOGLE_APPLICATION_CREDENTIALS=/credentials.json \
-v /local/path/to/creds.json:/credentials.json \
--rm pablovd/denario:latest
```
## Vertex AI Detailed Setup
### Prerequisites
- Google Cloud account with billing enabled
- gcloud CLI installed (optional but recommended)
### Step-by-Step Configuration
1. **Install Google Cloud SDK (if not using Docker)**
```bash
# Linux/macOS
curl https://sdk.cloud.google.com | bash
exec -l $SHELL
gcloud init
```
2. **Authenticate gcloud**
```bash
gcloud auth application-default login
```
3. **Set project**
```bash
gcloud config set project YOUR_PROJECT_ID
```
4. **Enable required APIs**
```bash
gcloud services enable aiplatform.googleapis.com
gcloud services enable compute.googleapis.com
```
5. **Create service account (alternative to gcloud auth)**
```bash
gcloud iam service-accounts create denario-service-account \
--display-name="Denario AI Service Account"
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
--member="serviceAccount:denario-service-account@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
gcloud iam service-accounts keys create credentials.json \
--iam-account=denario-service-account@YOUR_PROJECT_ID.iam.gserviceaccount.com
```
6. **Configure denario to use Vertex AI**
```python
import os
os.environ['GOOGLE_CLOUD_PROJECT'] = 'YOUR_PROJECT_ID'
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/path/to/credentials.json'
from denario import Denario
den = Denario(project_dir="./research")
```
## Model Selection
Configure which models denario uses for different tasks:
```python
# In your code
from denario import Denario
# Example configuration (if supported by denario API)
den = Denario(
project_dir="./project",
# Model configuration may vary based on denario version
)
```
Check denario's documentation for specific model selection APIs.
## Cost Management
### Monitoring Costs
- **OpenAI**: Track usage at [platform.openai.com/usage](https://platform.openai.com/usage)
- **Google Cloud**: Monitor in Cloud Console → Billing
- Set up billing alerts to avoid unexpected charges
### Cost Optimization Tips
1. **Use appropriate model tiers**
- GPT-3.5 for simpler tasks
- GPT-4 for complex reasoning
2. **Batch operations**
- Process multiple research tasks in single sessions
3. **Cache results**
- Reuse generated ideas, methods, and results when possible
4. **Set token limits**
- Configure maximum token usage for cost control
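The caching tip above can be approximated by skipping a stage whose output already exists. This is a sketch only: `run_if_missing` is a hypothetical helper, and it assumes each stage writes a known file (for example `idea.md` in the project directory), which should be confirmed for your denario version.

```python
from pathlib import Path


def run_if_missing(output_path, stage):
    """Call stage() only when output_path does not exist.

    Returns True if the stage ran, False if the existing file was reused.
    """
    if Path(output_path).is_file():
        return False
    stage()
    return True


# Hypothetical usage with a denario instance `den`:
# run_if_missing("./project/idea.md", den.get_idea)
```

Deleting the output file is then the explicit signal to regenerate that stage and pay for the corresponding LLM calls again.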
## Security Best Practices
### Do NOT commit API keys to version control
Add to `.gitignore`:
```gitignore
.env
# If storing credentials as JSON files
*.json
credentials.json
service-account-key.json
```
### Rotate keys regularly
- Generate new API keys periodically
- Revoke old keys after rotation
### Use least privilege access
- Grant only necessary permissions to service accounts
- Use separate keys for development and production
### Encrypt sensitive files
- Store credential files in encrypted volumes
- Use cloud secret management services for production
## Troubleshooting
### "API key not found" errors
- Verify environment variables are set: `echo $OPENAI_API_KEY`
- Check `.env` file is in correct directory
- Ensure `load_dotenv()` is called before importing denario
### Vertex AI authentication failures
- Verify `GOOGLE_APPLICATION_CREDENTIALS` points to valid JSON file
- Check service account has required permissions
- Ensure APIs are enabled in Google Cloud project
### Rate limiting issues
- Implement exponential backoff
- Reduce concurrent requests
- Upgrade API plan if needed
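Exponential backoff can be wrapped around any call that may hit a rate limit. The following is a generic sketch, not denario's own retry logic: `call_with_backoff` and its parameters are hypothetical, and it retries on any exception for brevity, whereas real code would catch only the provider's rate-limit error.

```python
import random
import time


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call func(), retrying on exceptions with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay doubles on each attempt and is capped at max_delay.
            # Up to 10% random jitter keeps parallel clients from retrying
            # in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


# Hypothetical usage with a denario instance `den`:
# call_with_backoff(den.get_idea)
```

The `sleep` parameter exists so the waiting behavior can be replaced in tests or in environments with their own scheduling.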
### Docker environment variable issues
- Use `docker run --env-file .env` to pass environment
- Mount credential files with `-v` flag
- Check environment inside container: `docker exec <container> env`

@@ -0,0 +1,471 @@
# Research Pipeline API Reference
## Core Classes
### Denario
The main class for orchestrating research workflows.
#### Initialization
```python
from denario import Denario
den = Denario(project_dir="path/to/project")
```
**Parameters:**
- `project_dir` (str): Path to the research project directory where all outputs will be stored
#### Methods
##### set_data_description()
Define the research context by describing available data and analytical tools.
```python
den.set_data_description(description: str)
```
**Parameters:**
- `description` (str): Text describing the dataset, available tools, research domain, and any relevant context
**Example:**
```python
den.set_data_description("""
Available data: Time-series temperature measurements from 2010-2023
Tools: pandas, scipy, sklearn, matplotlib
Domain: Climate science
Research interest: Identifying seasonal patterns and long-term trends
""")
```
**Purpose:** This establishes the foundation for automated idea generation by providing context about what data is available and what analyses are feasible.
##### get_idea()
Generate research hypotheses based on the data description.
```python
den.get_idea()
```
**Returns:** Research idea/hypothesis (stored internally in project directory)
**Output:** Creates a file containing the generated research question or hypothesis
**Example:**
```python
den.get_idea()
# Generates ideas like: "Investigate the correlation between seasonal temperature
# variations and long-term warming trends using time-series decomposition"
```
##### set_idea()
Manually specify a research idea instead of generating one.
```python
den.set_idea(idea: str)
```
**Parameters:**
- `idea` (str): The research hypothesis or question to investigate
**Example:**
```python
den.set_idea("Analyze the impact of El Niño events on regional temperature anomalies")
```
**Use case:** When you have a specific research direction and want to skip automated idea generation.
##### get_method()
Develop a research methodology based on the idea and data description.
```python
den.get_method()
```
**Returns:** Methodology document (stored internally in project directory)
**Output:** Creates a structured methodology including:
- Analytical approach
- Statistical methods to apply
- Validation strategies
- Expected outputs
**Example:**
```python
den.get_method()
# Generates methodology: "Apply seasonal decomposition, compute correlation coefficients,
# perform statistical significance tests, generate visualization plots..."
```
##### set_method()
Provide a custom methodology instead of generating one.
```python
den.set_method(method: str)
den.set_method(method: Path) # Can also accept file paths
```
**Parameters:**
- `method` (str or Path): Methodology description or path to markdown file containing methodology
**Example:**
```python
# From string
den.set_method("""
1. Apply seasonal decomposition using STL
2. Compute Pearson correlation coefficients
3. Perform Mann-Kendall trend test
4. Generate time-series plots with confidence intervals
""")
# From file
den.set_method("methodology.md")
```
##### get_results()
Execute the methodology, perform computations, and generate results.
```python
den.get_results()
```
**Returns:** Results document with analysis outputs (stored internally in project directory)
**Output:** Creates results including:
- Computed statistics
- Generated figures and visualizations
- Data tables
- Analysis findings
**Example:**
```python
den.get_results()
# Executes the methodology, runs analyses, creates plots, compiles findings
```
**Note:** This is where the actual computational work happens. The agent executes code to perform the analyses specified in the methodology.
##### set_results()
Provide pre-computed results instead of generating them.
```python
den.set_results(results: str)
den.set_results(results: Path) # Can also accept file paths
```
**Parameters:**
- `results` (str or Path): Results description or path to markdown file containing results
**Example:**
```python
# From string
den.set_results("""
Analysis Results:
- Correlation coefficient: 0.78 (p < 0.001)
- Seasonal amplitude: 5.2°C
- Long-term trend: +0.15°C per decade
- Figure 1: Seasonal decomposition (see attached)
""")
# From file
den.set_results("results.md")
```
**Use case:** When analyses were performed externally or when iterating on paper writing without re-running computations.
##### get_paper()
Generate a publication-ready LaTeX paper with the research findings.
```python
den.get_paper(journal: Journal = None)
```
**Parameters:**
- `journal` (Journal, optional): Target journal for formatting. Defaults to generic format.
**Returns:** LaTeX paper with proper formatting (stored in project directory)
**Output:** Creates:
- Complete LaTeX source file
- Compiled PDF (if LaTeX is available)
- Integrated figures and tables
- Properly formatted bibliography
**Example:**
```python
from denario import Journal
den.get_paper(journal=Journal.APS)
# Generates paper.tex and paper.pdf formatted for APS journals
```
### Journal Enum
Enumeration of supported journal formats.
```python
from denario import Journal
```
#### Available Journals
- `Journal.APS` - American Physical Society format
- Suitable for Physical Review, Physical Review Letters, etc.
- Uses RevTeX document class
Additional journal formats may be available. Check the latest denario documentation for the complete list.
#### Usage
```python
from denario import Denario, Journal
den = Denario(project_dir="./research")
# ... complete workflow ...
den.get_paper(journal=Journal.APS)
```
## Workflow Patterns
### Fully Automated Pipeline
Let denario handle every stage:
```python
from denario import Denario, Journal
den = Denario(project_dir="./automated_research")
# Define context
den.set_data_description("""
Dataset: Sensor readings from IoT devices
Tools: pandas, numpy, sklearn, matplotlib
Goal: Anomaly detection in sensor networks
""")
# Automate entire pipeline
den.get_idea() # Generate research idea
den.get_method() # Develop methodology
den.get_results() # Execute analysis
den.get_paper(journal=Journal.APS) # Create paper
```
### Custom Idea, Automated Execution
Provide your research question, automate the rest:
```python
den = Denario(project_dir="./custom_idea")
den.set_data_description("Dataset: Financial time-series data...")
# Manual idea
den.set_idea("Investigate predictive models for stock market volatility using LSTM networks")
# Automated execution
den.get_method()
den.get_results()
den.get_paper(journal=Journal.APS)
```
### Fully Manual with Template Generation
Use denario only for paper formatting:
```python
den = Denario(project_dir="./manual_research")
# Provide everything manually
den.set_data_description("Pre-existing dataset description...")
den.set_idea("Pre-defined research hypothesis")
den.set_method("methodology.md") # Load from file
den.set_results("results.md") # Load from file
# Generate formatted paper
den.get_paper(journal=Journal.APS)
```
### Iterative Refinement
Refine specific stages without re-running everything:
```python
den = Denario(project_dir="./iterative")
# Initial run
den.set_data_description("Dataset description...")
den.get_idea()
den.get_method()
den.get_results()
# Refine methodology after reviewing results
den.set_method("""
Revised methodology:
- Use different statistical test
- Add sensitivity analysis
- Include cross-validation
""")
# Re-run only downstream stages
den.get_results() # Re-execute with new method
den.get_paper(journal=Journal.APS)
```
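Because refining a stage replaces its earlier output, it may be worth keeping a copy of the previous version first. The sketch below is one possible approach; `backup_stage_file` is a hypothetical helper, and it assumes the stage outputs are ordinary files such as `methodology.md` inside the project directory.

```python
import shutil
import time
from pathlib import Path


def backup_stage_file(project_dir, filename, backup_dir="backups"):
    """Copy project_dir/filename into a backups folder with a timestamp suffix.

    Returns the backup path, or None if the file does not exist yet.
    """
    source = Path(project_dir) / filename
    if not source.is_file():
        return None
    target_dir = Path(project_dir) / backup_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = target_dir / f"{source.stem}-{stamp}{source.suffix}"
    shutil.copy2(source, target)
    return target


# Hypothetical usage before calling den.set_method(...):
# backup_stage_file("./iterative", "methodology.md")
```

Committing after each stage with git achieves the same goal; the copy is a lighter alternative for quick experiments.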
## Project Directory Structure
After running a complete workflow, the project directory contains:
```
project_dir/
├── data_description.txt # Input: data context
├── idea.md # Generated or provided research idea
├── methodology.md # Generated or provided methodology
├── results.md # Generated or provided results
├── figures/ # Generated visualizations
│ ├── figure_1.png
│ ├── figure_2.png
│ └── ...
├── paper.tex # Generated LaTeX source
├── paper.pdf # Compiled PDF (if LaTeX available)
└── logs/ # Agent execution logs
└── ...
```
## Advanced Features
### Multiagent Orchestration
Denario uses AG2 and LangGraph frameworks to coordinate multiple specialized agents:
- **Idea Agent**: Generates research hypotheses from data descriptions
- **Method Agent**: Develops analytical methodologies
- **Execution Agent**: Runs computations and creates visualizations
- **Writing Agent**: Produces publication-ready manuscripts
These agents collaborate automatically, with each stage building on previous outputs.
### Integration with Scientific Tools
Denario integrates with common scientific Python libraries:
- **pandas**: Data manipulation and analysis
- **scikit-learn**: Machine learning algorithms
- **scipy**: Scientific computing and statistics
- **matplotlib/seaborn**: Visualization
- **numpy**: Numerical operations
When generating results, denario can automatically write and execute code using these libraries.
### Reproducibility
All stages produce structured outputs saved to the project directory:
- Version control friendly (markdown and LaTeX)
- Auditable (logs of agent decisions and code execution)
- Reproducible (saved methodologies can be re-run)
### Literature Search
Denario includes capabilities for literature searches to provide context for research ideas and methodology development. See `examples.md` for literature search workflows.
## Error Handling
### Common Issues
**Missing data description:**
```python
den = Denario(project_dir="./project")
den.get_idea() # Error: must call set_data_description() first
```
**Solution:** Always set data description before generating ideas.
**Missing prerequisite stages:**
```python
den = Denario(project_dir="./project")
den.get_results() # Error: must have idea and method first
```
**Solution:** Follow the workflow order or manually set prerequisite stages.
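The ordering constraints can be written down as a small dependency map and checked before calling a stage. This is an illustrative sketch: `STAGE_REQUIRES` and `missing_stages` are hypothetical names, the entries for `idea` and `results` follow the errors described above, and the entries for `method` and `paper` are assumptions based on the method descriptions rather than denario's internal checks.

```python
# Stage -> stages that must already be generated or set manually.
STAGE_REQUIRES = {
    "idea": {"data_description"},
    "method": {"data_description", "idea"},   # assumption
    "results": {"idea", "method"},
    "paper": {"results"},                     # assumption
}


def missing_stages(target, completed):
    """Return the prerequisite stages of `target` not found in `completed`."""
    return sorted(STAGE_REQUIRES[target] - set(completed))


print(missing_stages("results", {"data_description"}))
```

An empty list means the target stage has everything it needs, whether the earlier stages came from `get_*` or `set_*` calls.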
**LaTeX compilation errors:**
```python
den.get_paper() # May fail if LaTeX not installed
```
**Solution:** Install LaTeX distribution or use Docker image with pre-installed LaTeX.
## Best Practices
### Data Description Quality
Provide detailed context for better idea generation:
```python
# Good: Detailed and specific
den.set_data_description("""
Dataset: 10 years of daily temperature readings from 50 weather stations
Format: CSV with columns [date, station_id, temperature, humidity]
Tools available: pandas, scipy, sklearn, matplotlib, seaborn
Domain: Climatology
Research interests: Climate change, seasonal patterns, regional variations
Known challenges: Missing data in 2015, station 23 has calibration issues
""")
# Bad: Too vague
den.set_data_description("Temperature data from weather stations")
```
### Methodology Validation
Review generated methodologies before executing:
```python
den.get_method()
# Review the methodology.md file in project_dir
# If needed, refine with set_method()
```
### Incremental Development
Build the research pipeline incrementally:
```python
# Stage 1: Validate idea generation
den.set_data_description("...")
den.get_idea()
# Review idea.md, adjust if needed
# Stage 2: Validate methodology
den.get_method()
# Review methodology.md, adjust if needed
# Stage 3: Execute and validate results
den.get_results()
# Review results.md and figures/
# Stage 4: Generate paper
den.get_paper(journal=Journal.APS)
```
### Version Control Integration
Initialize git in project directory for tracking:
```bash
cd project_dir
git init
git add .
git commit -m "Initial research workflow"
```
Commit after each stage to track the evolution of your research.