Initial commit
skills/tooluniverse/references/domains.md
# ToolUniverse Tool Domains and Categories

## Overview

ToolUniverse integrates 600+ scientific tools across multiple research domains. This document categorizes tools by scientific discipline and use case.

## Major Scientific Domains

### Bioinformatics

**Sequence Analysis:**
- Sequence alignment and comparison
- Multiple sequence alignment (MSA)
- BLAST and homology searches
- Motif finding and pattern matching

**Genomics:**
- Gene expression analysis
- RNA-seq data processing
- Variant calling and annotation
- Genome assembly and annotation
- Copy number variation analysis

**Functional Analysis:**
- Gene Ontology (GO) enrichment
- Pathway analysis (KEGG, Reactome)
- Gene set enrichment analysis (GSEA)
- Protein domain analysis
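
GO and pathway enrichment are often framed as an over-representation test: given a gene list, how surprising is the number of genes annotated with a term? The sketch below computes a one-sided hypergeometric p-value using only the standard library. It illustrates the general statistic, not the method any particular ToolUniverse tool uses, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of background genes
    annotated:  background genes carrying the term
    sample:     size of the gene list being tested
    hits:       genes in the list that carry the term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")
    lower = max(hits, 0)
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(lower, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 4 annotated, a list of 3 genes with 2 hits.
p = enrichment_p_value(population=10, annotated=4, sample=3, hits=2)
print(f"p = {p:.4f}")
```

In practice the p-values for many terms would also need multiple-testing correction, which this sketch leaves out.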

**Example Tools:**
- GEO data download and analysis
- DESeq2 differential expression
- KEGG pathway enrichment
- UniProt sequence retrieval
- VEP variant annotation

### Cheminformatics

**Molecular Descriptors:**
- Chemical property calculation
- Molecular fingerprints
- SMILES/InChI conversion
- 3D conformer generation

**Drug Discovery:**
- Virtual screening
- Molecular docking
- ADMET prediction
- Drug-likeness assessment (Lipinski's Rule of Five)
- Toxicity prediction
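
As an illustration of the drug-likeness item above, Lipinski's Rule of Five can be checked from four precomputed descriptors. The sketch below is a minimal, dependency-free version; the `Descriptors` container and function names are hypothetical, and in a real workflow the descriptor values would come from a cheminformatics tool rather than being typed in by hand.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for this sketch)."""
    molecular_weight: float  # daltons
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the Rule of Five criteria that the compound violates."""
    violations = []
    if d.molecular_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("more than 5 H-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Treat a compound as drug-like if it breaks at most `max_violations` criteria."""
    return len(lipinski_violations(d)) <= max_violations


# Approximate, illustrative values for a small molecule such as aspirin.
small_molecule = Descriptors(molecular_weight=180.16, logp=1.2,
                             h_bond_donors=1, h_bond_acceptors=4)
print(lipinski_violations(small_molecule), is_drug_like(small_molecule))
```

Allowing one violation by default follows the common reading of the rule; pass `max_violations=0` for a stricter filter.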

**Chemical Databases:**
- PubChem compound search
- ChEMBL bioactivity data
- ZINC compound libraries
- DrugBank drug information

**Example Tools:**
- RDKit molecular descriptors
- AutoDock molecular docking
- ZINC library screening
- ChEMBL target-compound associations

### Structural Biology

**Protein Structure:**
- AlphaFold structure prediction
- PDB structure retrieval
- Structure alignment and comparison
- Binding site prediction
- Protein-protein interaction prediction

**Structure Analysis:**
- Secondary structure prediction
- Solvent accessibility calculation
- Structure quality assessment
- Ramachandran plot analysis

**Example Tools:**
- AlphaFold structure prediction
- PDB structure download
- Fpocket binding site detection
- DSSP secondary structure assignment

### Proteomics

**Protein Analysis:**
- Mass spectrometry data analysis
- Protein identification
- Post-translational modification analysis
- Protein quantification

**Protein Databases:**
- UniProt protein information
- STRING protein interactions
- IntAct molecular interaction data

**Example Tools:**
- UniProt data retrieval
- STRING interaction networks
- Mass spectrometry peak analysis

### Machine Learning

**Model Types:**
- Classification models
- Regression models
- Clustering algorithms
- Neural networks
- Deep learning models

**Applications:**
- Predictive modeling
- Feature selection
- Dimensionality reduction
- Pattern recognition
- Biomarker discovery

**Example Tools:**
- Scikit-learn models
- TensorFlow/PyTorch models
- XGBoost predictors
- Random forest classifiers

### Medical/Clinical

**Disease Databases:**
- OpenTargets disease-target associations
- OMIM genetic disorders
- ClinVar pathogenic variants
- DisGeNET disease-gene associations

**Clinical Data:**
- Electronic health records analysis
- Clinical trial data
- Diagnostic tools
- Treatment recommendations

**Example Tools:**
- OpenTargets disease queries
- ClinVar variant classification
- OMIM disease lookup
- FDA drug approval data

### Neuroscience

**Brain Imaging:**
- fMRI data analysis
- Brain atlas mapping
- Connectivity analysis
- Neuroimaging pipelines

**Neural Data:**
- Electrophysiology analysis
- Spike train analysis
- Neural network simulation

### Image Processing

**Biomedical Imaging:**
- Microscopy image analysis
- Cell segmentation
- Object detection
- Image enhancement
- Feature extraction

**Image Analysis:**
- ImageJ/Fiji tools
- CellProfiler pipelines
- Deep learning segmentation

### Systems Biology

**Network Analysis:**
- Biological network construction
- Network topology analysis
- Module identification
- Hub gene identification

**Modeling:**
- Systems biology models
- Metabolic network modeling
- Signaling pathway simulation

## Tool Categories by Use Case

### Literature and Knowledge

**Literature Search:**
- PubMed article search
- Article summarization
- Citation analysis
- Knowledge extraction

**Knowledge Bases:**
- Ontology queries (GO, DO, HPO)
- Database cross-referencing
- Entity recognition

### Data Access

**Public Repositories:**
- GEO (Gene Expression Omnibus)
- SRA (Sequence Read Archive)
- PDB (Protein Data Bank)
- ChEMBL (Bioactivity database)

**API Access:**
- RESTful API clients
- Database query tools
- Batch data retrieval

### Visualization

**Plot Generation:**
- Heatmaps
- Volcano plots
- Manhattan plots
- Network graphs
- Molecular structures

### Utilities

**Data Processing:**
- Format conversion
- Data normalization
- Statistical analysis
- Quality control

**Workflow Management:**
- Pipeline construction
- Task orchestration
- Result aggregation

## Finding Tools by Domain

Pass domain-specific keywords to the `Tool_Finder_Keyword` tool:

```python
# Setup (assumes the `tooluniverse` package is installed).
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# Bioinformatics
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "RNA-seq genomics", "limit": 10}
})

# Cheminformatics
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "molecular docking SMILES", "limit": 10}
})

# Structural biology
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "protein structure PDB", "limit": 10}
})

# Clinical
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "disease clinical variants", "limit": 10}
})
```
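
The four queries above differ only in their keyword string, so the payloads can be generated from a lookup table. The sketch below is one possible way to do that. `DOMAIN_KEYWORDS` and `build_finder_query` are hypothetical names introduced here, not part of ToolUniverse; the payload shape simply mirrors the calls shown above.

```python
# Keyword strings taken from the examples above; extend as needed.
DOMAIN_KEYWORDS = {
    "bioinformatics": "RNA-seq genomics",
    "cheminformatics": "molecular docking SMILES",
    "structural_biology": "protein structure PDB",
    "clinical": "disease clinical variants",
}


def build_finder_query(domain: str, limit: int = 10) -> dict:
    """Build a Tool_Finder_Keyword payload for a known domain (hypothetical helper)."""
    if domain not in DOMAIN_KEYWORDS:
        known = ", ".join(sorted(DOMAIN_KEYWORDS))
        raise ValueError(f"Unknown domain {domain!r}; expected one of: {known}")
    return {
        "name": "Tool_Finder_Keyword",
        "arguments": {"description": DOMAIN_KEYWORDS[domain], "limit": limit},
    }


# Building the payloads needs no ToolUniverse instance, so this runs standalone.
for name in DOMAIN_KEYWORDS:
    print(build_finder_query(name, limit=5))
```

Each payload can then be passed to `tu.run(...)` exactly as in the examples above.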

## Cross-Domain Applications

Many scientific problems require tools from multiple domains:

- **Precision Medicine**: Genomics + Clinical + Proteomics
- **Drug Discovery**: Cheminformatics + Structural Biology + Machine Learning
- **Cancer Research**: Genomics + Pathways + Literature
- **Neurodegenerative Diseases**: Genomics + Proteomics + Imaging
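
The combinations listed above can be kept as a small lookup table, which makes it easy to ask which applications draw on a given domain or which domains two applications share. This is only a sketch: the dictionary restates the list above, and the helper names are hypothetical.

```python
# Application -> contributing domains, copied from the list above.
CROSS_DOMAIN = {
    "Precision Medicine": ["Genomics", "Clinical", "Proteomics"],
    "Drug Discovery": ["Cheminformatics", "Structural Biology", "Machine Learning"],
    "Cancer Research": ["Genomics", "Pathways", "Literature"],
    "Neurodegenerative Diseases": ["Genomics", "Proteomics", "Imaging"],
}


def applications_using(domain: str) -> list:
    """Return, in alphabetical order, the applications that involve `domain`."""
    return sorted(app for app, domains in CROSS_DOMAIN.items() if domain in domains)


def shared_domains(app_a: str, app_b: str) -> list:
    """Return, in alphabetical order, the domains two applications have in common."""
    return sorted(set(CROSS_DOMAIN[app_a]) & set(CROSS_DOMAIN[app_b]))


print(applications_using("Genomics"))
print(shared_domains("Precision Medicine", "Neurodegenerative Diseases"))
```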