Initial commit
skills/tooluniverse/references/api_reference.md (new file, 298 lines)

# ToolUniverse Python API Reference

## Core Classes

### ToolUniverse

Main class for interacting with the ToolUniverse ecosystem.

```python
from tooluniverse import ToolUniverse

tu = ToolUniverse()
```

#### Methods

##### `load_tools()`

Load all available tools into the ToolUniverse instance.

```python
tu.load_tools()
```

**Returns:** None

**Side effects:** Loads 600+ tools into memory for discovery and execution.

---

##### `run(tool_config)`

Execute a tool with specified arguments.

**Parameters:**
- `tool_config` (dict): Configuration dictionary with keys:
  - `name` (str): Tool name to execute
  - `arguments` (dict): Tool-specific arguments

**Returns:** Tool-specific output (dict, list, str, or other types)

**Example:**
```python
result = tu.run({
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {
        "efoId": "EFO_0000537"
    }
})
```

---

##### `list_tools(limit=None)`

List all available tools or a subset.

**Parameters:**
- `limit` (int, optional): Maximum number of tools to return. If None, returns all tools.

**Returns:** List of tool dictionaries

**Example:**
```python
# List all tools
all_tools = tu.list_tools()

# List first 20 tools
tools = tu.list_tools(limit=20)
```

---

##### `get_tool_info(tool_name)`

Get detailed information about a specific tool.

**Parameters:**
- `tool_name` (str): Name of the tool

**Returns:** Dictionary containing tool metadata, parameters, and documentation

**Example:**
```python
info = tu.get_tool_info("AlphaFold_get_structure")
print(info['description'])
print(info['parameters'])
```

---

## Built-in Discovery Tools

These are special tools that help find other tools in the ecosystem.

### Tool_Finder

Embedding-based semantic search for tools. Requires a GPU for optimal performance.

```python
tools = tu.run({
    "name": "Tool_Finder",
    "arguments": {
        "description": "protein structure prediction",
        "limit": 10
    }
})
```

**Parameters:**
- `description` (str): Natural language description of desired functionality
- `limit` (int): Maximum number of tools to return

**Returns:** List of relevant tools with similarity scores

---

### Tool_Finder_LLM

LLM-based semantic search for tools. No GPU required.

```python
tools = tu.run({
    "name": "Tool_Finder_LLM",
    "arguments": {
        "description": "Find tools for RNA sequencing analysis",
        "limit": 10
    }
})
```

**Parameters:**
- `description` (str): Natural language query
- `limit` (int): Maximum number of tools to return

**Returns:** List of relevant tools

---

### Tool_Finder_Keyword

Fast keyword-based search through tool names and descriptions.

```python
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {
        "description": "pathway enrichment",
        "limit": 10
    }
})
```

**Parameters:**
- `description` (str): Keywords to search for
- `limit` (int): Maximum number of tools to return

**Returns:** List of matching tools

---

## Tool Output Hooks

Post-processing hooks for tool results.

### Summarization Hook
```python
result = tu.run({
    "name": "some_tool",
    "arguments": {"param": "value"}
},
hooks={
    "summarize": {
        "format": "brief"  # or "detailed"
    }
})
```

### File Saving Hook
```python
result = tu.run({
    "name": "some_tool",
    "arguments": {"param": "value"}
},
hooks={
    "save_to_file": {
        "filename": "output.json",
        "format": "json"  # or "csv", "txt"
    }
})
```

---

## Model Context Protocol (MCP)

### Starting MCP Server

Command-line interface:
```bash
tooluniverse-smcp
```

This launches an MCP server that exposes all ToolUniverse tools through the Model Context Protocol.

**Configuration:**
- Default port: Automatically assigned
- Protocol: MCP standard
- Authentication: None required for local use
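
For example, an MCP client such as Claude Desktop can be pointed at the server by registering the launch command; this mirrors the configuration shown in the installation reference:

```json
{
  "mcpServers": {
    "tooluniverse": {
      "command": "tooluniverse-smcp"
    }
  }
}
```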

---

## Integration Modules

### OpenRouter Integration

Access 100+ LLMs through the OpenRouter API:

```python
from tooluniverse import OpenRouterClient

client = OpenRouterClient(api_key="your_key")
response = client.chat("Analyze this protein sequence", model="anthropic/claude-3-5-sonnet")
```

---

## AI-Tool Interaction Protocol

ToolUniverse uses a standardized protocol for LLM-tool communication:

**Request Format:**
```json
{
  "name": "tool_name",
  "arguments": {
    "param1": "value1",
    "param2": "value2"
  }
}
```

**Response Format:**
```json
{
  "status": "success",
  "data": { ... },
  "metadata": {
    "execution_time": 1.23,
    "tool_version": "1.0.0"
  }
}
```
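
On the client side, a response in this envelope can be unpacked in a few lines of Python. This is a minimal sketch, assuming the response has already been parsed into a dict; the `unpack_response` helper is illustrative and not part of the ToolUniverse API:

```python
def unpack_response(response: dict):
    """Return the payload of a protocol response, raising on failure."""
    if response.get("status") != "success":
        # Non-success responses are assumed to carry an error description.
        raise RuntimeError(f"Tool call failed: {response}")
    # Metadata such as execution_time and tool_version is optional to inspect.
    elapsed = response.get("metadata", {}).get("execution_time")
    if elapsed is not None:
        print(f"Tool finished in {elapsed:.2f}s")
    return response["data"]
```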

---

## Error Handling

```python
# Exception classes as named in this reference; adjust the import path to
# match the tooluniverse version you have installed.
try:
    result = tu.run({
        "name": "some_tool",
        "arguments": {"param": "value"}
    })
except ToolNotFoundError as e:
    print(f"Tool not found: {e}")
except InvalidArgumentError as e:
    print(f"Invalid arguments: {e}")
except ToolExecutionError as e:
    print(f"Execution failed: {e}")
```

---

## Type Hints

```python
from typing import Any, Dict

def run_tool(
    tu: ToolUniverse,
    tool_name: str,
    arguments: Dict[str, Any]
) -> Any:
    """Execute a tool with type-safe arguments."""
    return tu.run({
        "name": tool_name,
        "arguments": arguments
    })
```

---

## Best Practices

1. **Initialize Once**: Create a single ToolUniverse instance and reuse it (see the sketch after this list)
2. **Load Tools Early**: Call `load_tools()` once at startup
3. **Cache Tool Info**: Store frequently used tool information
4. **Error Handling**: Always wrap tool execution in try-except blocks
5. **Type Validation**: Validate argument types before execution
6. **Resource Management**: Consider rate limits for remote APIs
7. **Logging**: Enable logging for production environments
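
The first four practices can be combined into a small helper module. This is an illustrative sketch built only on the methods documented above (`load_tools`, `get_tool_info`, `run`); the module layout and function names are hypothetical:

```python
from functools import lru_cache
from tooluniverse import ToolUniverse

# Practices 1 and 2: one shared instance, tools loaded once at startup.
_tu = ToolUniverse()
_tu.load_tools()

@lru_cache(maxsize=256)
def cached_tool_info(tool_name: str) -> dict:
    """Practice 3: memoize frequently used tool metadata."""
    return _tu.get_tool_info(tool_name)

def safe_run(tool_name: str, **arguments):
    """Practice 4: wrap execution so callers get None instead of an exception."""
    try:
        return _tu.run({"name": tool_name, "arguments": arguments})
    except Exception as exc:  # narrow to specific tooluniverse exceptions if available
        print(f"{tool_name} failed: {exc}")
        return None
```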

skills/tooluniverse/references/domains.md (new file, 272 lines)

# ToolUniverse Tool Domains and Categories

## Overview

ToolUniverse integrates 600+ scientific tools across multiple research domains. This document categorizes tools by scientific discipline and use case.

## Major Scientific Domains

### Bioinformatics

**Sequence Analysis:**
- Sequence alignment and comparison
- Multiple sequence alignment (MSA)
- BLAST and homology searches
- Motif finding and pattern matching

**Genomics:**
- Gene expression analysis
- RNA-seq data processing
- Variant calling and annotation
- Genome assembly and annotation
- Copy number variation analysis

**Functional Analysis:**
- Gene Ontology (GO) enrichment
- Pathway analysis (KEGG, Reactome)
- Gene set enrichment analysis (GSEA)
- Protein domain analysis

**Example Tools:**
- GEO data download and analysis
- DESeq2 differential expression
- KEGG pathway enrichment
- UniProt sequence retrieval
- VEP variant annotation

### Cheminformatics

**Molecular Descriptors:**
- Chemical property calculation
- Molecular fingerprints
- SMILES/InChI conversion
- 3D conformer generation

**Drug Discovery:**
- Virtual screening
- Molecular docking
- ADMET prediction
- Drug-likeness assessment (Lipinski's Rule of Five)
- Toxicity prediction

**Chemical Databases:**
- PubChem compound search
- ChEMBL bioactivity data
- ZINC compound libraries
- DrugBank drug information

**Example Tools:**
- RDKit molecular descriptors
- AutoDock molecular docking
- ZINC library screening
- ChEMBL target-compound associations

### Structural Biology

**Protein Structure:**
- AlphaFold structure prediction
- PDB structure retrieval
- Structure alignment and comparison
- Binding site prediction
- Protein-protein interaction prediction

**Structure Analysis:**
- Secondary structure prediction
- Solvent accessibility calculation
- Structure quality assessment
- Ramachandran plot analysis

**Example Tools:**
- AlphaFold structure prediction
- PDB structure download
- Fpocket binding site detection
- DSSP secondary structure assignment

### Proteomics

**Protein Analysis:**
- Mass spectrometry data analysis
- Protein identification
- Post-translational modification analysis
- Protein quantification

**Protein Databases:**
- UniProt protein information
- STRING protein interactions
- IntAct interaction databases

**Example Tools:**
- UniProt data retrieval
- STRING interaction networks
- Mass spec peak analysis

### Machine Learning

**Model Types:**
- Classification models
- Regression models
- Clustering algorithms
- Neural networks
- Deep learning models

**Applications:**
- Predictive modeling
- Feature selection
- Dimensionality reduction
- Pattern recognition
- Biomarker discovery

**Example Tools:**
- Scikit-learn models
- TensorFlow/PyTorch models
- XGBoost predictors
- Random forest classifiers

### Medical/Clinical

**Disease Databases:**
- OpenTargets disease-target associations
- OMIM genetic disorders
- ClinVar pathogenic variants
- DisGeNET disease-gene associations

**Clinical Data:**
- Electronic health records analysis
- Clinical trial data
- Diagnostic tools
- Treatment recommendations

**Example Tools:**
- OpenTargets disease queries
- ClinVar variant classification
- OMIM disease lookup
- FDA drug approval data

### Neuroscience

**Brain Imaging:**
- fMRI data analysis
- Brain atlas mapping
- Connectivity analysis
- Neuroimaging pipelines

**Neural Data:**
- Electrophysiology analysis
- Spike train analysis
- Neural network simulation

### Image Processing

**Biomedical Imaging:**
- Microscopy image analysis
- Cell segmentation
- Object detection
- Image enhancement
- Feature extraction

**Image Analysis:**
- ImageJ/Fiji tools
- CellProfiler pipelines
- Deep learning segmentation

### Systems Biology

**Network Analysis:**
- Biological network construction
- Network topology analysis
- Module identification
- Hub gene identification

**Modeling:**
- Systems biology models
- Metabolic network modeling
- Signaling pathway simulation

## Tool Categories by Use Case

### Literature and Knowledge

**Literature Search:**
- PubMed article search
- Article summarization
- Citation analysis
- Knowledge extraction

**Knowledge Bases:**
- Ontology queries (GO, DO, HPO)
- Database cross-referencing
- Entity recognition

### Data Access

**Public Repositories:**
- GEO (Gene Expression Omnibus)
- SRA (Sequence Read Archive)
- PDB (Protein Data Bank)
- ChEMBL (Bioactivity database)

**API Access:**
- RESTful API clients
- Database query tools
- Batch data retrieval

### Visualization

**Plot Generation:**
- Heatmaps
- Volcano plots
- Manhattan plots
- Network graphs
- Molecular structures

### Utilities

**Data Processing:**
- Format conversion
- Data normalization
- Statistical analysis
- Quality control

**Workflow Management:**
- Pipeline construction
- Task orchestration
- Result aggregation

## Finding Tools by Domain

Use domain-specific keywords with the Tool_Finder variants (Tool_Finder_Keyword is shown here):

```python
# Bioinformatics
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "RNA-seq genomics", "limit": 10}
})

# Cheminformatics
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "molecular docking SMILES", "limit": 10}
})

# Structural biology
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "protein structure PDB", "limit": 10}
})

# Clinical
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "disease clinical variants", "limit": 10}
})
```

## Cross-Domain Applications

Many scientific problems require tools from multiple domains (a combined search sketch follows the list):

- **Precision Medicine**: Genomics + Clinical + Proteomics
- **Drug Discovery**: Cheminformatics + Structural Biology + Machine Learning
- **Cancer Research**: Genomics + Pathways + Literature
- **Neurodegenerative Diseases**: Genomics + Proteomics + Imaging
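
As an illustration, a cross-domain problem can be approached by running one keyword search per domain and pooling the hits. The queries below are only examples; which tools are returned depends on the loaded tool set:

```python
# Precision medicine touches genomics, clinical data, and proteomics,
# so search each domain separately and merge the candidate tools.
queries = [
    "variant annotation genomics",
    "disease clinical variants",
    "protein interaction networks",
]

candidate_tools = []
for query in queries:
    hits = tu.run({
        "name": "Tool_Finder_Keyword",
        "arguments": {"description": query, "limit": 5}
    })
    candidate_tools.extend(hits)
```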

skills/tooluniverse/references/installation.md (new file, 83 lines)

# ToolUniverse Installation and Setup

## Installation

```bash
uv pip install tooluniverse
```

## Basic Setup

### Python SDK
```python
from tooluniverse import ToolUniverse

# Initialize ToolUniverse
tu = ToolUniverse()

# Load all available tools (600+ scientific tools)
tu.load_tools()
```

## Model Context Protocol (MCP) Setup

ToolUniverse provides native MCP support for integration with Claude Desktop, Claude Code, and other MCP-compatible systems.

### Starting MCP Server
```bash
tooluniverse-smcp
```

This launches an MCP server that exposes ToolUniverse's 600+ tools through the Model Context Protocol.

### Claude Desktop Integration

Add to the Claude Desktop configuration file, `claude_desktop_config.json` (typically `~/.config/Claude/` on Linux and `~/Library/Application Support/Claude/` on macOS):
```json
{
  "mcpServers": {
    "tooluniverse": {
      "command": "tooluniverse-smcp"
    }
  }
}
```

### Claude Code Integration

The ToolUniverse MCP server works natively with Claude Code through the MCP protocol.
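
If you use the Claude Code CLI, one way to register the server is with `claude mcp add`. This is a sketch; check `claude mcp add --help` for the exact syntax of your version:

```bash
# Register the ToolUniverse MCP server under the name "tooluniverse"
claude mcp add tooluniverse -- tooluniverse-smcp
```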

## Integration with Other Platforms

### OpenRouter Integration
ToolUniverse integrates with OpenRouter for access to 100+ LLMs through a single API:
- GPT-5, Claude, Gemini
- Qwen, Deepseek
- Open-source models

### Supported LLM Platforms
- Claude Desktop and Claude Code
- Gemini CLI
- Qwen Code
- ChatGPT API
- GPT Codex CLI

## Requirements

- Python 3.8+
- For Tool_Finder (embedding-based search): GPU recommended
- For Tool_Finder_LLM: No GPU required (uses LLM-based search)

## Verification

Test the installation:
```python
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# List the first 5 tools to verify the setup
tools = tu.list_tools(limit=5)
print(f"Setup verified: retrieved {len(tools)} sample tools")
```

skills/tooluniverse/references/tool-composition.md (new file, 249 lines)

# Tool Composition and Workflows in ToolUniverse

## Overview

ToolUniverse enables chaining multiple tools together to create complex scientific workflows. Tools can be composed sequentially or in parallel to solve multi-step research problems.

## Sequential Tool Composition

Execute tools in sequence where each tool's output feeds into the next tool.

### Basic Pattern
```python
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# Step 1: Get disease-associated targets
targets = tu.run({
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": "EFO_0000537"}  # Hypertension
})

# Step 2: For each target, get protein structure
structures = []
for target in targets[:5]:  # First 5 targets
    structure = tu.run({
        "name": "AlphaFold_get_structure",
        "arguments": {"uniprot_id": target['uniprot_id']}
    })
    structures.append(structure)

# Step 3: Analyze structures
for structure in structures:
    analysis = tu.run({
        "name": "ProteinAnalysis_calculate_properties",
        "arguments": {"structure": structure}
    })
```

## Complex Workflow Examples

### Drug Discovery Workflow

Complete workflow from disease to drug candidates:

```python
# 1. Find disease-associated targets
print("Finding disease targets...")
targets = tu.run({
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": "EFO_0000616"}  # Breast cancer
})

# 2. Get target protein sequences
print("Retrieving protein sequences...")
sequences = []
for target in targets[:10]:
    seq = tu.run({
        "name": "UniProt_get_sequence",
        "arguments": {"uniprot_id": target['uniprot_id']}
    })
    sequences.append(seq)

# 3. Predict protein structures
print("Predicting structures...")
structures = []
for seq in sequences:
    structure = tu.run({
        "name": "AlphaFold_get_structure",
        "arguments": {"sequence": seq}
    })
    structures.append(structure)

# 4. Find binding sites
print("Identifying binding sites...")
binding_sites = []
for structure in structures:
    sites = tu.run({
        "name": "Fpocket_find_binding_sites",
        "arguments": {"structure": structure}
    })
    binding_sites.append(sites)

# 5. Screen compound libraries
print("Screening compounds...")
hits = []
for site in binding_sites:
    compounds = tu.run({
        "name": "ZINC_virtual_screening",
        "arguments": {
            "binding_site": site,
            "library": "lead-like",
            "top_n": 100
        }
    })
    hits.extend(compounds)

# 6. Calculate drug-likeness
print("Evaluating drug-likeness...")
drug_candidates = []
for compound in hits:
    properties = tu.run({
        "name": "RDKit_calculate_drug_properties",
        "arguments": {"smiles": compound['smiles']}
    })
    if properties['lipinski_pass']:
        drug_candidates.append(compound)

print(f"Found {len(drug_candidates)} drug candidates")
```

### Genomics Analysis Workflow

```python
# 1. Download gene expression data
expression_data = tu.run({
    "name": "GEO_download_dataset",
    "arguments": {"geo_id": "GSE12345"}
})

# 2. Perform differential expression analysis
de_genes = tu.run({
    "name": "DESeq2_differential_expression",
    "arguments": {
        "data": expression_data,
        "condition1": "control",
        "condition2": "treated"
    }
})

# 3. Pathway enrichment analysis
pathways = tu.run({
    "name": "KEGG_pathway_enrichment",
    "arguments": {
        "gene_list": de_genes['significant_genes'],
        "organism": "hsa"
    }
})

# 4. Find relevant literature
papers = tu.run({
    "name": "PubMed_search",
    "arguments": {
        "query": f"{pathways[0]['pathway_name']} AND cancer",
        "max_results": 20
    }
})

# 5. Summarize findings
summary = tu.run({
    "name": "LLM_summarize",
    "arguments": {
        "text": papers,
        "focus": "therapeutic implications"
    }
})
```

### Clinical Genomics Workflow

```python
# 1. Load patient variants
variants = tu.run({
    "name": "VCF_parse",
    "arguments": {"vcf_file": "patient_001.vcf"}
})

# 2. Annotate variants
annotated = tu.run({
    "name": "VEP_annotate_variants",
    "arguments": {"variants": variants}
})

# 3. Filter pathogenic variants
pathogenic = tu.run({
    "name": "ClinVar_filter_pathogenic",
    "arguments": {"variants": annotated}
})

# 4. Find disease associations
diseases = tu.run({
    "name": "OMIM_disease_lookup",
    "arguments": {"genes": pathogenic['affected_genes']}
})

# 5. Generate clinical report
report = tu.run({
    "name": "Report_generator",
    "arguments": {
        "variants": pathogenic,
        "diseases": diseases,
        "format": "clinical"
    }
})
```

## Parallel Tool Execution

Execute multiple tools simultaneously when they don't depend on each other:

```python
import concurrent.futures

def run_tool(tu, tool_config):
    return tu.run(tool_config)

# Define parallel tasks
tasks = [
    {"name": "PubMed_search", "arguments": {"query": "cancer", "max_results": 10}},
    {"name": "OpenTargets_get_diseases", "arguments": {"therapeutic_area": "oncology"}},
    {"name": "ChEMBL_search_compounds", "arguments": {"target": "EGFR"}}
]

# Execute in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(run_tool, tu, task) for task in tasks]
    results = [future.result() for future in concurrent.futures.as_completed(futures)]
```

## Output Processing Hooks

ToolUniverse supports post-processing hooks for:
- Summarization
- File saving
- Data transformation
- Visualization

```python
# Example: Save results to file
result = tu.run({
    "name": "some_tool",
    "arguments": {"param": "value"}
},
hooks={
    "save_to_file": {"filename": "results.json"},
    "summarize": {"format": "brief"}
})
```

## Best Practices

1. **Error Handling**: Implement try-except blocks for each tool in the workflow
2. **Data Validation**: Verify the output of each step before passing it to the next tool
3. **Checkpointing**: Save intermediate results for long workflows (see the sketch after this list)
4. **Logging**: Track progress through complex workflows
5. **Resource Management**: Consider rate limits and computational resources
6. **Modularity**: Break complex workflows into reusable functions
7. **Testing**: Test each step individually before composing the full workflow
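
A lightweight way to follow practices 1-3 is to wrap each step so its result is validated, written to disk, and reused on reruns. This is an illustrative sketch; the checkpoint layout and helper name are not part of ToolUniverse:

```python
import json
from pathlib import Path

CHECKPOINT_DIR = Path("checkpoints")
CHECKPOINT_DIR.mkdir(exist_ok=True)

def run_step(tu, step_name, tool_config):
    """Run one workflow step with error handling and a JSON checkpoint."""
    checkpoint = CHECKPOINT_DIR / f"{step_name}.json"
    if checkpoint.exists():
        return json.loads(checkpoint.read_text())  # resume from a previous run
    try:
        result = tu.run(tool_config)
    except Exception as exc:
        raise RuntimeError(f"Step '{step_name}' failed") from exc
    if not result:
        raise ValueError(f"Step '{step_name}' returned no data")
    checkpoint.write_text(json.dumps(result, default=str))
    return result
```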

skills/tooluniverse/references/tool-discovery.md (new file, 126 lines)

# Tool Discovery in ToolUniverse

## Overview

ToolUniverse provides multiple methods to discover and search through 600+ scientific tools using natural language, keywords, or embeddings.

## Discovery Methods

### 1. Tool_Finder (Embedding-Based Search)

Uses semantic embeddings to find relevant tools. **Requires GPU** for optimal performance.

```python
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# Search by natural language description
tools = tu.run({
    "name": "Tool_Finder",
    "arguments": {
        "description": "protein structure prediction",
        "limit": 10
    }
})

print(tools)
```

**When to use:**
- Natural language queries
- Semantic similarity search
- When GPU is available

### 2. Tool_Finder_LLM (LLM-Based Search)

Alternative to embedding-based search that uses LLM reasoning. **No GPU required**.

```python
tools = tu.run({
    "name": "Tool_Finder_LLM",
    "arguments": {
        "description": "Find tools for analyzing gene expression data",
        "limit": 10
    }
})
```

**When to use:**
- When GPU is not available
- Complex queries requiring reasoning
- Semantic understanding needed

### 3. Tool_Finder_Keyword (Keyword Search)

Fast keyword-based search through tool names and descriptions.

```python
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {
        "description": "disease target associations",
        "limit": 10
    }
})
```

**When to use:**
- Fast searches
- Known keywords
- Exact term matching

## Listing Available Tools

### List All Tools
```python
all_tools = tu.list_tools()
print(f"Total tools available: {len(all_tools)}")
```

### List Tools with Limit
```python
tools = tu.list_tools(limit=20)
for tool in tools:
    print(f"{tool['name']}: {tool['description']}")
```

## Tool Information

### Get Tool Details
```python
# After finding a tool, inspect its details
tool_info = tu.get_tool_info("OpenTargets_get_associated_targets_by_disease_efoId")
print(tool_info)
```

## Search Strategies

### By Domain
Use domain-specific keywords:
- Bioinformatics: "sequence alignment", "genomics", "RNA-seq"
- Cheminformatics: "molecular dynamics", "drug design", "SMILES"
- Machine Learning: "classification", "prediction", "neural network"
- Structural Biology: "protein structure", "PDB", "crystallography"

### By Functionality
Search by what you want to accomplish:
- "Find disease-gene associations"
- "Predict protein interactions"
- "Analyze clinical trial data"
- "Generate molecular descriptors"

### By Data Source
Search for specific databases or APIs:
- "OpenTargets", "PubChem", "UniProt"
- "AlphaFold", "ChEMBL", "PDB"
- "KEGG", "Reactome", "STRING"

## Best Practices

1. **Start Broad**: Begin with general terms, then refine
2. **Use Multiple Methods**: Try different discovery methods if results aren't satisfactory (see the sketch after this list)
3. **Set Appropriate Limits**: Use the `limit` parameter to control result size (default: 10)
4. **Check Tool Descriptions**: Review returned tool descriptions to verify relevance
5. **Iterate**: Refine search terms based on initial results
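
One way to apply practices 2 and 5 is to start with the fast keyword finder and fall back to the LLM-based finder when too few hits come back. A minimal sketch, assuming `tu` is an initialized ToolUniverse instance with tools loaded:

```python
def discover_tools(tu, query, limit=10, min_hits=3):
    """Try keyword search first; fall back to LLM-based search if results are thin."""
    hits = tu.run({
        "name": "Tool_Finder_Keyword",
        "arguments": {"description": query, "limit": limit}
    })
    if not hits or len(hits) < min_hits:
        hits = tu.run({
            "name": "Tool_Finder_LLM",
            "arguments": {"description": query, "limit": limit}
        })
    return hits

tools = discover_tools(tu, "pathway enrichment")
```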

skills/tooluniverse/references/tool-execution.md (new file, 177 lines)

# Tool Execution in ToolUniverse

## Overview

Execute individual tools through ToolUniverse's standardized interface using the `run()` method.

## Basic Tool Execution

### Standard Pattern
```python
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# Execute a tool
result = tu.run({
    "name": "tool_name_here",
    "arguments": {
        "param1": "value1",
        "param2": "value2"
    }
})

print(result)
```

## Real-World Examples

### Example 1: Disease-Target Associations (OpenTargets)
```python
# Find targets associated with hypertension
result = tu.run({
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {
        "efoId": "EFO_0000537"  # Hypertension
    }
})

print(f"Found {len(result)} targets associated with hypertension")
```

### Example 2: Protein Structure Prediction
```python
# Get AlphaFold structure prediction
result = tu.run({
    "name": "AlphaFold_get_structure",
    "arguments": {
        "uniprot_id": "P12345"
    }
})
```

### Example 3: Chemical Property Calculation
```python
# Calculate molecular descriptors
result = tu.run({
    "name": "RDKit_calculate_descriptors",
    "arguments": {
        "smiles": "CCO"  # Ethanol
    }
})
```

### Example 4: Gene Expression Analysis
```python
# Analyze differential gene expression
result = tu.run({
    "name": "GeneExpression_differential_analysis",
    "arguments": {
        "dataset_id": "GSE12345",
        "condition1": "control",
        "condition2": "treatment"
    }
})
```

## Tool Execution Workflow

### 1. Discover the Tool
```python
# Find relevant tools
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {
        "description": "pathway enrichment",
        "limit": 5
    }
})

# Review available tools
for tool in tools:
    print(f"Name: {tool['name']}")
    print(f"Description: {tool['description']}")
    print(f"Parameters: {tool['parameters']}")
    print("---")
```

### 2. Check Tool Parameters
```python
# Get detailed tool information
tool_info = tu.get_tool_info("KEGG_pathway_enrichment")
print(tool_info['parameters'])
```

### 3. Execute with Proper Arguments
```python
# Execute the tool
result = tu.run({
    "name": "KEGG_pathway_enrichment",
    "arguments": {
        "gene_list": ["TP53", "BRCA1", "EGFR"],
        "organism": "hsa"  # Homo sapiens
    }
})
```

## Handling Tool Results

### Check Result Type
```python
result = tu.run({
    "name": "some_tool",
    "arguments": {"param": "value"}
})

# Results can be various types
if isinstance(result, dict):
    print("Dictionary result")
elif isinstance(result, list):
    print(f"List with {len(result)} items")
elif isinstance(result, str):
    print("String result")
```

### Process Results
```python
# Example: Processing multiple results
results = tu.run({
    "name": "PubMed_search",
    "arguments": {
        "query": "cancer immunotherapy",
        "max_results": 10
    }
})

for idx, paper in enumerate(results, 1):
    print(f"{idx}. {paper['title']}")
    print(f" PMID: {paper['pmid']}")
    print(f" Authors: {', '.join(paper['authors'][:3])}")
    print()
```

## Error Handling

```python
try:
    result = tu.run({
        "name": "some_tool",
        "arguments": {"param": "value"}
    })
except Exception as e:
    print(f"Tool execution failed: {e}")
    # Check if tool exists
    # Verify parameter names and types
    # Review tool documentation
```

## Best Practices

1. **Verify Tool Parameters**: Always check required parameters before execution
2. **Start Simple**: Test with simple cases before complex workflows
3. **Handle Results Appropriately**: Check result type and structure
4. **Error Recovery**: Implement try-except blocks for production code
5. **Documentation**: Review tool descriptions for parameter requirements and output formats
6. **Rate Limiting**: Be aware of API rate limits for remote tools (see the sketch after this list)
7. **Data Validation**: Validate input data format (e.g., SMILES, UniProt IDs, gene symbols)
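
For remote tools that may hit rate limits (practice 6), a small retry wrapper with exponential backoff is often enough. This is an illustrative sketch, not a ToolUniverse feature; adjust the retry policy to the API you are calling and reuse your existing `tu` instance:

```python
import time

def run_with_retry(tu, tool_config, max_attempts=3, base_delay=1.0):
    """Retry a tool call with exponential backoff between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return tu.run(tool_config)
        except Exception as exc:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)

result = run_with_retry(tu, {
    "name": "PubMed_search",
    "arguments": {"query": "cancer immunotherapy", "max_results": 10}
})
```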