2025-11-30 08:30:10 +08:00

ToolUniverse Tool Domains and Categories

Overview

ToolUniverse integrates 600+ scientific tools across multiple research domains. This document categorizes tools by scientific discipline and use case.

Major Scientific Domains

Bioinformatics

Sequence Analysis:

  • Sequence alignment and comparison
  • Multiple sequence alignment (MSA)
  • BLAST and homology searches
  • Motif finding and pattern matching

Genomics:

  • Gene expression analysis
  • RNA-seq data processing
  • Variant calling and annotation
  • Genome assembly and annotation
  • Copy number variation analysis

Functional Analysis:

  • Gene Ontology (GO) enrichment
  • Pathway analysis (KEGG, Reactome)
  • Gene set enrichment analysis (GSEA)
  • Protein domain analysis

Example Tools:

  • GEO data download and analysis
  • DESeq2 differential expression
  • KEGG pathway enrichment
  • UniProt sequence retrieval
  • VEP variant annotation

Cheminformatics

Molecular Descriptors:

  • Chemical property calculation
  • Molecular fingerprints
  • SMILES/InChI conversion
  • 3D conformer generation

Drug Discovery:

  • Virtual screening
  • Molecular docking
  • ADMET prediction
  • Drug-likeness assessment (Lipinski's Rule of Five)
  • Toxicity prediction
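
The drug-likeness check named above can be illustrated with a minimal sketch of Lipinski's Rule of Five. The function names and input values here are hypothetical and not part of ToolUniverse; in practice the four properties would come from a descriptor tool such as RDKit. The "at most one violation" cutoff is a common convention, taken here as an assumption.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Return the list of Rule of Five criteria that a compound violates.

    Thresholds: molecular weight <= 500 Da, logP <= 5,
    hydrogen-bond donors <= 5, hydrogen-bond acceptors <= 10.
    """
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if logp > 5:
        violations.append("logP > 5")
    if h_donors > 5:
        violations.append("H-bond donors > 5")
    if h_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def is_drug_like(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Assumed convention: tolerate at most `max_violations` violations."""
    found = lipinski_violations(mol_weight, logp, h_donors, h_acceptors)
    return len(found) <= max_violations


# Illustrative property values only, not measurements of a real compound.
print(lipinski_violations(350.0, 2.5, 2, 5))
print(lipinski_violations(650.0, 6.2, 2, 5))
```

The first call reports no violations; the second reports the molecular weight and logP criteria.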

Chemical Databases:

  • PubChem compound search
  • ChEMBL bioactivity data
  • ZINC compound libraries
  • DrugBank drug information

Example Tools:

  • RDKit molecular descriptors
  • AutoDock molecular docking
  • ZINC library screening
  • ChEMBL target-compound associations

Structural Biology

Protein Structure:

  • AlphaFold structure prediction
  • PDB structure retrieval
  • Structure alignment and comparison
  • Binding site prediction
  • Protein-protein interaction prediction

Structure Analysis:

  • Secondary structure prediction
  • Solvent accessibility calculation
  • Structure quality assessment
  • Ramachandran plot analysis

Example Tools:

  • AlphaFold structure prediction
  • PDB structure download
  • Fpocket binding site detection
  • DSSP secondary structure assignment

Proteomics

Protein Analysis:

  • Mass spectrometry data analysis
  • Protein identification
  • Post-translational modification analysis
  • Protein quantification

Protein Databases:

  • UniProt protein information
  • STRING protein interactions
  • IntAct molecular interaction database

Example Tools:

  • UniProt data retrieval
  • STRING interaction networks
  • Mass spec peak analysis

Machine Learning

Model Types:

  • Classification models
  • Regression models
  • Clustering algorithms
  • Neural networks
  • Deep learning models

Applications:

  • Predictive modeling
  • Feature selection
  • Dimensionality reduction
  • Pattern recognition
  • Biomarker discovery

Example Tools:

  • Scikit-learn models
  • TensorFlow/PyTorch models
  • XGBoost predictors
  • Random forest classifiers

Medical/Clinical

Disease Databases:

  • OpenTargets disease-target associations
  • OMIM genetic disorders
  • ClinVar pathogenic variants
  • DisGeNET disease-gene associations

Clinical Data:

  • Electronic health records analysis
  • Clinical trial data
  • Diagnostic tools
  • Treatment recommendations

Example Tools:

  • OpenTargets disease queries
  • ClinVar variant classification
  • OMIM disease lookup
  • FDA drug approval data

Neuroscience

Brain Imaging:

  • fMRI data analysis
  • Brain atlas mapping
  • Connectivity analysis
  • Neuroimaging pipelines

Neural Data:

  • Electrophysiology analysis
  • Spike train analysis
  • Neural network simulation

Image Processing

Biomedical Imaging:

  • Microscopy image analysis
  • Cell segmentation
  • Object detection
  • Image enhancement
  • Feature extraction

Image Analysis:

  • ImageJ/Fiji tools
  • CellProfiler pipelines
  • Deep learning segmentation

Systems Biology

Network Analysis:

  • Biological network construction
  • Network topology analysis
  • Module identification
  • Hub gene identification

Modeling:

  • Systems biology models
  • Metabolic network modeling
  • Signaling pathway simulation

Tool Categories by Use Case

Literature and Knowledge

Literature Search:

  • PubMed article search
  • Article summarization
  • Citation analysis
  • Knowledge extraction

Knowledge Bases:

  • Ontology queries (GO, DO, HPO)
  • Database cross-referencing
  • Entity recognition

Data Access

Public Repositories:

  • GEO (Gene Expression Omnibus)
  • SRA (Sequence Read Archive)
  • PDB (Protein Data Bank)
  • ChEMBL (Bioactivity database)

API Access:

  • RESTful API clients
  • Database query tools
  • Batch data retrieval

Visualization

Plot Generation:

  • Heatmaps
  • Volcano plots
  • Manhattan plots
  • Network graphs
  • Molecular structures

Utilities

Data Processing:

  • Format conversion
  • Data normalization
  • Statistical analysis
  • Quality control

Workflow Management:

  • Pipeline construction
  • Task orchestration
  • Result aggregation

Finding Tools by Domain

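Pass domain-specific keywords to the Tool_Finder_Keyword tool:

# Initialize ToolUniverse once and load the tool registry
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# Bioinformatics
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "RNA-seq genomics", "limit": 10}
})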

# Cheminformatics
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "molecular docking SMILES", "limit": 10}
})

# Structural biology
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "protein structure PDB", "limit": 10}
})

# Clinical
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "disease clinical variants", "limit": 10}
})
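
The four searches above differ only in their keyword string, so the request can be built from a small lookup table. The following is a minimal sketch: `DOMAIN_KEYWORDS` and `build_finder_query` are hypothetical helpers, not part of ToolUniverse, and only the request shape (`name`, `arguments`, `description`, `limit`) is taken from the examples above.

```python
# Hypothetical keyword map; the phrases are copied from the examples above.
DOMAIN_KEYWORDS = {
    "bioinformatics": "RNA-seq genomics",
    "cheminformatics": "molecular docking SMILES",
    "structural_biology": "protein structure PDB",
    "clinical": "disease clinical variants",
}


def build_finder_query(domain, limit=10):
    """Return the request dict for a Tool_Finder_Keyword search in one domain."""
    try:
        description = DOMAIN_KEYWORDS[domain]
    except KeyError:
        raise ValueError(f"unknown domain: {domain!r}") from None
    return {
        "name": "Tool_Finder_Keyword",
        "arguments": {"description": description, "limit": limit},
    }


# One request per domain, ready to be passed to tu.run().
queries = {domain: build_finder_query(domain) for domain in DOMAIN_KEYWORDS}

# With an initialized ToolUniverse instance `tu`, the searches would run as:
# results = {domain: tu.run(query) for domain, query in queries.items()}
```

Keeping the keywords in one table makes it easy to tune the phrases per domain without touching the calling code.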

Cross-Domain Applications

Many scientific problems require tools from multiple domains:

  • Precision Medicine: Genomics + Clinical + Proteomics
  • Drug Discovery: Cheminformatics + Structural Biology + Machine Learning
  • Cancer Research: Genomics + Pathways + Literature
  • Neurodegenerative Diseases: Genomics + Proteomics + Imaging
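
One way to act on these combinations is to issue one Tool_Finder_Keyword search per contributing domain. The sketch below only builds the request dicts. `APPLICATIONS` mirrors the list above, while the keyword phrases in `DOMAIN_KEYWORDS` and the helper `queries_for` are illustrative assumptions rather than an official ToolUniverse vocabulary or API.

```python
# Search phrases per domain; illustrative assumptions, adjust as needed.
DOMAIN_KEYWORDS = {
    "Genomics": "gene expression variant calling",
    "Clinical": "disease clinical variants",
    "Proteomics": "protein identification mass spectrometry",
    "Cheminformatics": "molecular docking SMILES",
    "Structural Biology": "protein structure PDB",
    "Machine Learning": "classification regression model",
    "Pathways": "pathway enrichment KEGG Reactome",
    "Literature": "PubMed article search",
    "Imaging": "fMRI microscopy image analysis",
}

# Cross-domain applications exactly as listed above.
APPLICATIONS = {
    "Precision Medicine": ["Genomics", "Clinical", "Proteomics"],
    "Drug Discovery": ["Cheminformatics", "Structural Biology", "Machine Learning"],
    "Cancer Research": ["Genomics", "Pathways", "Literature"],
    "Neurodegenerative Diseases": ["Genomics", "Proteomics", "Imaging"],
}


def queries_for(application, limit=5):
    """Build one Tool_Finder_Keyword request per domain of an application."""
    return [
        {
            "name": "Tool_Finder_Keyword",
            "arguments": {"description": DOMAIN_KEYWORDS[domain], "limit": limit},
        }
        for domain in APPLICATIONS[application]
    ]


for query in queries_for("Drug Discovery"):
    print(query["arguments"]["description"])
```

Each returned dict can be passed to `tu.run()` on an initialized ToolUniverse instance, and the per-domain results merged afterwards.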