gh-k-dense-ai-claude-scient…/skills/tooluniverse/references/tool-discovery.md
2025-11-30 08:30:10 +08:00

Tool Discovery in ToolUniverse

Overview

ToolUniverse provides three ways to search its 600+ scientific tools: embedding-based semantic search, LLM-based search, and keyword search. All three accept a free-text description and a result limit.

Discovery Methods

Tool_Finder (Embedding-Based Search)

Uses semantic embeddings to find relevant tools. A GPU is recommended for optimal performance.

from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# Search by natural language description
tools = tu.run({
    "name": "Tool_Finder",
    "arguments": {
        "description": "protein structure prediction",
        "limit": 10
    }
})

print(tools)

When to use:

  • Natural language queries
  • Semantic similarity search
  • When GPU is available

Tool_Finder_LLM (LLM-Based Search)

An alternative to embedding-based search that uses LLM reasoning to match tools. No GPU required.

tools = tu.run({
    "name": "Tool_Finder_LLM",
    "arguments": {
        "description": "Find tools for analyzing gene expression data",
        "limit": 10
    }
})

When to use:

  • When GPU is not available
  • Complex queries requiring reasoning
  • Semantic understanding needed

Tool_Finder_Keyword (Keyword Search)

Fast keyword-based search through tool names and descriptions.

tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {
        "description": "disease target associations",
        "limit": 10
    }
})

When to use:

  • Fast searches
  • Known keywords
  • Exact term matching
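Because the three finders take the same request shape, they can be tried in sequence. The sketch below is illustrative and not part of ToolUniverse: `find_with_fallback` is a hypothetical helper that accepts any callable with the call shape of `tu.run`. It assumes that an empty or `None` result means "no matches" and that a finder that cannot run in the current environment raises an exception; neither behavior is confirmed here.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# Finder tool names as used in this document, cheapest method first.
DEFAULT_FINDERS = ("Tool_Finder_Keyword", "Tool_Finder_LLM", "Tool_Finder")


def find_with_fallback(
    run: Callable[[dict], Any],
    description: str,
    limit: int = 10,
    finders: Sequence[str] = DEFAULT_FINDERS,
) -> Tuple[Optional[str], Any]:
    """Try each finder in order and return (finder_name, result).

    Hypothetical helper. Returns (None, None) if every finder fails
    or returns an empty result.
    """
    for name in finders:
        request = {
            "name": name,
            "arguments": {"description": description, "limit": limit},
        }
        try:
            result = run(request)
        except Exception:
            # Assumption: an unavailable finder raises; move on to the next one.
            continue
        if result:  # Assumption: an empty or None result means no matches.
            return name, result
    return None, None
```

With a loaded instance this would be called as `finder, tools = find_with_fallback(tu.run, "protein structure prediction")`. Passing the callable rather than the `ToolUniverse` object keeps the helper easy to exercise with a stub.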

Listing Available Tools

List All Tools

all_tools = tu.list_tools()
print(f"Total tools available: {len(all_tools)}")

List Tools with Limit

tools = tu.list_tools(limit=20)
for tool in tools:
    print(f"{tool['name']}: {tool['description']}")
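Once tools are listed, they can also be narrowed locally without calling a finder. The following is a minimal sketch, assuming each entry is a dict with `name` and `description` keys, as the loop above implies; `filter_tools` is a hypothetical helper, not a ToolUniverse function.

```python
from typing import Dict, Iterable, List


def filter_tools(tools: Iterable[Dict], keyword: str) -> List[Dict]:
    """Return tools whose name or description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    for tool in tools:
        # Treat missing or None fields as empty strings.
        name = (tool.get("name") or "").lower()
        description = (tool.get("description") or "").lower()
        if needle in name or needle in description:
            matches.append(tool)
    return matches
```

For example, `filter_tools(tu.list_tools(), "uniprot")` would keep only entries that mention UniProt in either field.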

Tool Information

Get Tool Details

# After finding a tool, inspect its details
tool_info = tu.get_tool_info("OpenTargets_get_associated_targets_by_disease_efoId")
print(tool_info)

Search Strategies

By Domain

Use domain-specific keywords:

  • Bioinformatics: "sequence alignment", "genomics", "RNA-seq"
  • Cheminformatics: "molecular dynamics", "drug design", "SMILES"
  • Machine Learning: "classification", "prediction", "neural network"
  • Structural Biology: "protein structure", "PDB", "crystallography"

By Functionality

Search by what you want to accomplish:

  • "Find disease-gene associations"
  • "Predict protein interactions"
  • "Analyze clinical trial data"
  • "Generate molecular descriptors"

By Data Source

Search for specific databases or APIs:

  • "OpenTargets", "PubChem", "UniProt"
  • "AlphaFold", "ChEMBL", "PDB"
  • "KEGG", "Reactome", "STRING"
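The three strategies differ mainly in the description string and in which finder suits it. As one possible mapping, drawn from the "When to use" notes for each method, the sketch below sends domain and data-source keywords to `Tool_Finder_Keyword` and goal-style phrases to `Tool_Finder` or `Tool_Finder_LLM` depending on GPU availability. The strategy labels and both function names are hypothetical.

```python
from typing import Dict


def choose_finder(strategy: str, gpu_available: bool = False) -> str:
    """Pick a finder tool name for a search strategy (illustrative mapping only)."""
    if strategy in ("domain", "data_source"):
        # Known keywords and exact terms suit keyword search.
        return "Tool_Finder_Keyword"
    if strategy == "functionality":
        # Natural-language goals suit semantic search; use the LLM finder without a GPU.
        return "Tool_Finder" if gpu_available else "Tool_Finder_LLM"
    raise ValueError(f"unknown strategy: {strategy!r}")


def build_request(
    query: str, strategy: str, gpu_available: bool = False, limit: int = 10
) -> Dict:
    """Build a request dict in the shape passed to tu.run in this document."""
    return {
        "name": choose_finder(strategy, gpu_available),
        "arguments": {"description": query, "limit": limit},
    }
```

A request built this way, for example `build_request("ChEMBL", "data_source")`, can be passed directly to `tu.run`.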

Best Practices

  1. Start Broad: Begin with general terms, then refine
  2. Use Multiple Methods: Try different discovery methods if results aren't satisfactory
  3. Set Appropriate Limits: Use the limit parameter to control result size (default: 10)
  4. Check Tool Descriptions: Review returned tool descriptions to verify relevance
  5. Iterate: Refine search terms based on initial results
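When several queries or methods are tried in turn, the same tool often appears more than once. The sketch below merges result lists while keeping the first occurrence of each tool. It assumes each result is a list of dicts with a `name` key, which is not confirmed for the finder tools, so adapt the key access to the actual return shape; `merge_unique` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def merge_unique(result_lists: Iterable[Iterable[Dict]]) -> List[Dict]:
    """Merge result lists, keeping the first occurrence of each tool name."""
    seen = set()
    merged = []
    for results in result_lists:
        for tool in results:
            name = tool.get("name")
            # Skip entries without a name and names already collected.
            if name is None or name in seen:
                continue
            seen.add(name)
            merged.append(tool)
    return merged
```

Order is preserved, so results from the earliest (broadest) query stay at the top of the merged list.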