| description |
|---|
| Search for datasets in the National Data Platform |
NDP Dataset Search
Search for datasets across the National Data Platform ecosystem with advanced filtering options.
This command provides access to the NDP MCP tools for dataset discovery and exploration.
Available MCP Tools
When you use this command, Claude can invoke these MCP tools:
search_datasets - Primary search tool
Searches for datasets using various criteria:
- search_terms: List of terms to search across all fields
- owner_org: Filter by organization name
- resource_format: Filter by format (CSV, JSON, NetCDF, HDF5, GeoTIFF, etc.)
- dataset_description: Search in descriptions
- server: Query 'global' (default) or 'local' server
- limit: Maximum results (default: 20)
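As a rough illustration of how these parameters combine, the sketch below assembles an arguments object for search_datasets and leaves out any filter that was not set. The helper name `build_search_args` is hypothetical, and the assumption that the tool accepts a flat object keyed by the parameter names listed above is not confirmed by this document.

```python
from typing import Any, Optional

# Server values listed for search_datasets in this document.
VALID_SERVERS = {"global", "local"}


def build_search_args(
    search_terms: Optional[list[str]] = None,
    owner_org: Optional[str] = None,
    resource_format: Optional[str] = None,
    dataset_description: Optional[str] = None,
    server: str = "global",
    limit: int = 20,
) -> dict[str, Any]:
    """Hypothetical helper: build search_datasets arguments, omitting unset filters."""
    if server not in VALID_SERVERS:
        raise ValueError(f"server must be one of {sorted(VALID_SERVERS)}, got {server!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")

    args: dict[str, Any] = {"server": server, "limit": limit}
    optional_filters = {
        "search_terms": search_terms,
        "owner_org": owner_org,
        "resource_format": resource_format,
        "dataset_description": dataset_description,
    }
    # Only include filters the caller actually provided.
    args.update({key: value for key, value in optional_filters.items() if value is not None})
    return args


if __name__ == "__main__":
    print(build_search_args(search_terms=["oceanographic"], resource_format="NetCDF"))
```

Leaving unset filters out of the object, rather than sending empty values, keeps a broad first search broad; whether the real tool treats empty values differently is not stated here.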
list_organizations - Organization discovery
Lists available organizations:
- name_filter: Filter by name substring
- server: Query 'global' (default), 'local', or 'pre_ckan'
get_dataset_details - Detailed information
Retrieves complete metadata for a specific dataset:
- dataset_identifier: Dataset ID or name from search results
- identifier_type: 'id' (default) or 'name'
- server: 'global' (default) or 'local'
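The three tools accept slightly different server values: only list_organizations lists 'pre_ckan'. The sketch below captures that difference in a lookup table and uses it when building arguments for get_dataset_details. The names `ALLOWED_SERVERS`, `check_server`, and `build_details_args` are hypothetical, and the flat argument object is an assumption rather than a documented schema.

```python
from typing import Any

# Server values per tool, as listed in this document.
ALLOWED_SERVERS = {
    "search_datasets": {"global", "local"},
    "list_organizations": {"global", "local", "pre_ckan"},
    "get_dataset_details": {"global", "local"},
}


def check_server(tool: str, server: str) -> None:
    """Raise ValueError if the tool does not list this server value."""
    allowed = ALLOWED_SERVERS[tool]
    if server not in allowed:
        raise ValueError(f"{tool} accepts {sorted(allowed)}, got {server!r}")


def build_details_args(
    dataset_identifier: str,
    identifier_type: str = "id",
    server: str = "global",
) -> dict[str, Any]:
    """Hypothetical helper: build get_dataset_details arguments."""
    if identifier_type not in {"id", "name"}:
        raise ValueError("identifier_type must be 'id' or 'name'")
    check_server("get_dataset_details", server)
    return {
        "dataset_identifier": dataset_identifier,
        "identifier_type": identifier_type,
        "server": server,
    }


if __name__ == "__main__":
    # "example-dataset" is a made-up placeholder name.
    print(build_details_args("example-dataset", identifier_type="name"))
```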
Recommended Workflow
- Discover Organizations: Use list_organizations to find relevant data sources
- Search Datasets: Use search_datasets with appropriate filters
- Review Results: Claude will summarize matching datasets
- Get Details: Use get_dataset_details for datasets of interest
- Refine Search: Adjust filters based on results
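The workflow steps can be sketched as a small function that chains the three tools. Everything about the transport is an assumption here: `call_tool` stands in for whatever mechanism invokes an MCP tool, and the return shapes (a list of organization names, a list of dataset records with an "id" key) are hypothetical. The stub `fake_call_tool` returns made-up placeholder data so the sketch runs on its own.

```python
from typing import Any, Callable

# Hypothetical signature: (tool name, arguments) -> tool result.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def run_workflow(
    call_tool: ToolCaller,
    org_hint: str,
    terms: list[str],
    max_details: int = 3,
) -> list[dict[str, Any]]:
    """Discover an organization, search its datasets, then fetch details for a few."""
    # Step 1: verify the organization name before using it as a filter.
    orgs = call_tool("list_organizations", {"name_filter": org_hint})
    if not orgs:
        return []

    # Step 2: search with the verified name (first match, for simplicity).
    datasets = call_tool(
        "search_datasets",
        {"owner_org": orgs[0], "search_terms": terms},
    )

    # Step 4: request full metadata only for the first few results.
    return [
        call_tool(
            "get_dataset_details",
            {"dataset_identifier": d["id"], "identifier_type": "id"},
        )
        for d in datasets[:max_details]
    ]


def fake_call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Stand-in for a real MCP client; all returned data is invented."""
    if name == "list_organizations":
        needle = arguments["name_filter"].lower()
        return [org for org in ["NOAA", "NASA"] if needle in org.lower()]
    if name == "search_datasets":
        return [{"id": "ds-1", "title": "Example climate dataset"}]
    if name == "get_dataset_details":
        return {"id": arguments["dataset_identifier"], "owner_org": "NOAA"}
    raise ValueError(f"unknown tool: {name}")


if __name__ == "__main__":
    print(run_workflow(fake_call_tool, "noaa", ["climate"]))
```

Passing the caller in as a parameter keeps the sketch independent of any particular MCP client library; reviewing and refining (steps 3 and 5) remain interactive and are not modeled.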
Best Practices
- Always verify organization names with list_organizations before using them in a search
- Start broad, then refine: Begin with simple terms, add filters as needed
- Limit results appropriately: The default of 20 suits most searches; increase it only if needed
- Use format filters: Narrow to specific formats (NetCDF, CSV, etc.) when relevant
- Multi-server searches: Query both global and local for comprehensive coverage
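For the multi-server practice, one way to combine results is to merge the two result lists and record which server each dataset came from. This is only a sketch: it assumes each result is a record with an "id" key and that the same dataset carries the same id on both servers, neither of which this document confirms. The function name `merge_results` and the sample records are hypothetical.

```python
from typing import Any


def merge_results(
    global_results: list[dict[str, Any]],
    local_results: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Merge results from both servers, tagging each dataset with its sources."""
    merged: dict[str, dict[str, Any]] = {}
    for server, results in (("global", global_results), ("local", local_results)):
        for dataset in results:
            # First sighting copies the record; later sightings only add the server tag.
            entry = merged.setdefault(dataset["id"], {**dataset, "servers": []})
            entry["servers"].append(server)
    return list(merged.values())


if __name__ == "__main__":
    # Made-up placeholder records.
    global_hits = [{"id": "a", "title": "Ocean temperature"}, {"id": "b", "title": "Salinity"}]
    local_hits = [{"id": "b", "title": "Salinity"}, {"id": "c", "title": "Currents"}]
    for row in merge_results(global_hits, local_hits):
        print(row["id"], row["servers"])
```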
Example Queries
Basic Search
"Find climate datasets from NOAA"
Expected tools: list_organizations(name_filter="noaa"), then search_datasets(owner_org="NOAA", search_terms=["climate"])
Format-Specific Search
"Search for oceanographic data in NetCDF format"
Expected tools: search_datasets(search_terms=["oceanographic"], resource_format="NetCDF")
Organization-Based Search
"List all datasets from a specific research institution"
Expected tools: list_organizations(name_filter="<institution>"), then search_datasets(owner_org="<name>")
Refined Search with Limit
"Find CSV datasets about temperature monitoring, limit to 10 results"
Expected tools: search_datasets(search_terms=["temperature", "monitoring"], resource_format="CSV", limit=10)
Multi-Server Comparison
"Compare oceanographic datasets on global and local servers"
Expected tools: search_datasets(server="global", ...) and search_datasets(server="local", ...)
Tips for Effective Searching
- Use specific terminology: Scientific terms work better than generic ones
- Combine filters: Organization + format + terms = precise results
- Check multiple formats: Try CSV, NetCDF, HDF5 for scientific data
- Explore organizations first: Understanding data providers helps target searches
- Request details selectively: Fetch full metadata only for the most relevant datasets