---
description: Search for datasets in the National Data Platform
---

# NDP Dataset Search

Search for datasets across the National Data Platform (NDP) ecosystem with advanced filtering options.

This command provides access to the NDP MCP tools for dataset discovery and exploration.

## Available MCP Tools

When you use this command, Claude can invoke these MCP tools:

### `search_datasets` - Primary search tool
Searches for datasets using various criteria:
- **search_terms**: List of terms to search across all fields
- **owner_org**: Filter by organization name
- **resource_format**: Filter by format (CSV, JSON, NetCDF, HDF5, GeoTIFF, etc.)
- **dataset_description**: Search in descriptions
- **server**: Query 'global' (default) or 'local' server
- **limit**: Maximum results (default: 20)

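
To illustrate how these parameters combine, here is a minimal sketch that assembles the arguments for a `search_datasets` call. The helper `build_search_args` is hypothetical and not part of the NDP tools; it only mirrors the parameter names and defaults listed above, and its checks on `server` and `limit` are assumptions rather than documented behavior.

```python
from typing import Any, Optional

VALID_SERVERS = {"global", "local"}  # the two values listed for search_datasets


def build_search_args(
    search_terms: Optional[list[str]] = None,
    owner_org: Optional[str] = None,
    resource_format: Optional[str] = None,
    dataset_description: Optional[str] = None,
    server: str = "global",
    limit: int = 20,
) -> dict[str, Any]:
    """Assemble keyword arguments for search_datasets (hypothetical helper)."""
    if server not in VALID_SERVERS:
        raise ValueError(f"server must be one of {sorted(VALID_SERVERS)}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    args: dict[str, Any] = {"server": server, "limit": limit}
    # Include an optional filter only when the caller supplied a value.
    optional = {
        "search_terms": search_terms,
        "owner_org": owner_org,
        "resource_format": resource_format,
        "dataset_description": dataset_description,
    }
    args.update({key: value for key, value in optional.items() if value})
    return args


print(build_search_args(search_terms=["temperature", "monitoring"],
                        resource_format="CSV", limit=10))
```

Leaving unset filters out of the dictionary keeps a broad search broad, which matches the "start broad, then refine" advice later in this document.
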
### `list_organizations` - Organization discovery
Lists available organizations:
- **name_filter**: Filter by name substring
- **server**: Query 'global' (default), 'local', or 'pre_ckan'

### `get_dataset_details` - Detailed information
Retrieves complete metadata for a specific dataset:
- **dataset_identifier**: Dataset ID or name from search results
- **identifier_type**: 'id' (default) or 'name'
- **server**: 'global' (default) or 'local'

## Recommended Workflow

1. **Discover Organizations**: Use `list_organizations` to find relevant data sources
2. **Search Datasets**: Use `search_datasets` with appropriate filters
3. **Review Results**: Claude will summarize matching datasets
4. **Get Details**: Use `get_dataset_details` for datasets of interest
5. **Refine Search**: Adjust filters based on results

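
The first two steps of this workflow can be sketched in Python as follows. This is only an illustration: the tools are passed in as plain callables, the `fake_*` functions are stand-ins so the sketch runs without an MCP connection, and the record shapes (a `name` key on organizations, `id` and `title` keys on datasets) are assumptions, not the documented response format.

```python
from typing import Any, Callable

Tool = Callable[..., list[dict[str, Any]]]


def find_datasets(
    list_organizations: Tool,
    search_datasets: Tool,
    org_hint: str,
    terms: list[str],
    limit: int = 20,
) -> list[dict[str, Any]]:
    """Resolve the organization first, then search within it."""
    orgs = list_organizations(name_filter=org_hint)
    if not orgs:
        return []  # nothing matched; the caller should broaden the hint
    # Assumption: each organization record carries a "name" key.
    return search_datasets(owner_org=orgs[0]["name"], search_terms=terms, limit=limit)


# Stand-in tools with made-up data, for illustration only.
def fake_list_organizations(name_filter: str = "") -> list[dict[str, Any]]:
    orgs = [{"name": "noaa"}, {"name": "nasa"}]
    return [o for o in orgs if name_filter.lower() in o["name"]]


def fake_search_datasets(
    owner_org: str, search_terms: list[str], limit: int = 20
) -> list[dict[str, Any]]:
    catalog = [
        {"id": "d1", "owner_org": "noaa", "title": "Climate normals"},
        {"id": "d2", "owner_org": "nasa", "title": "Climate model output"},
    ]
    hits = [
        d for d in catalog
        if d["owner_org"] == owner_org
        and any(t.lower() in d["title"].lower() for t in search_terms)
    ]
    return hits[:limit]


results = find_datasets(fake_list_organizations, fake_search_datasets, "NOAA", ["climate"])
print([d["id"] for d in results])  # prints ['d1']
```
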
## Best Practices

- **Always verify organization names** with `list_organizations` before using them in a search
- **Start broad, then refine**: Begin with simple terms, add filters as needed
- **Limit results appropriately**: The default of 20 suits most queries; increase it if needed
- **Use format filters**: Narrow to specific formats (NetCDF, CSV, etc.) when relevant
- **Multi-server searches**: Query both the global and local servers for comprehensive coverage

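
One way to apply the first practice is to map a user-supplied name onto the spelling the catalog actually uses before passing it as `owner_org`. The sketch below is a hypothetical helper, not an NDP tool; it assumes you have already collected organization names (for example from `list_organizations`), and the sample names are made up.

```python
def resolve_org_name(requested: str, known_names: list[str]) -> str:
    """Return the catalog's spelling of an organization name, or raise if unclear."""
    wanted = requested.strip().lower()
    # Prefer an exact, case-insensitive match.
    exact = [name for name in known_names if name.lower() == wanted]
    if exact:
        return exact[0]
    # Otherwise accept a substring match only when it is unambiguous.
    partial = [name for name in known_names if wanted in name.lower()]
    if len(partial) == 1:
        return partial[0]
    if not partial:
        raise LookupError(f"no organization matches {requested!r}")
    raise LookupError(f"{requested!r} is ambiguous: {sorted(partial)}")


names = ["noaa", "noaa-ncei", "usgs"]  # made-up sample, not real catalog output
print(resolve_org_name("NOAA", names))  # exact match wins over "noaa-ncei"
print(resolve_org_name("usg", names))   # single partial match
```

Raising on ambiguous input, rather than picking the first candidate, avoids silently searching the wrong organization.
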
## Example Queries

### Basic Search
```
"Find climate datasets from NOAA"
```
Expected tools: `list_organizations(name_filter="noaa")`, then `search_datasets(owner_org="NOAA", search_terms=["climate"])`

### Format-Specific Search
```
"Search for oceanographic data in NetCDF format"
```
Expected tools: `search_datasets(search_terms=["oceanographic"], resource_format="NetCDF")`

### Organization-Based Search
```
"List all datasets from a specific research institution"
```
Expected tools: `list_organizations(name_filter="<institution>")`, then `search_datasets(owner_org="<name>")`

### Refined Search with Limit
```
"Find CSV datasets about temperature monitoring, limit to 10 results"
```
Expected tools: `search_datasets(search_terms=["temperature", "monitoring"], resource_format="CSV", limit=10)`

### Multi-Server Comparison
```
"Compare oceanographic datasets on global and local servers"
```
Expected tools: `search_datasets(server="global", ...)` and `search_datasets(server="local", ...)`

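
Once both searches return, the two result sets still need to be lined up. The following sketch shows one possible way to do that. It assumes each result is a dictionary with an `id` key and that the same `id` on both servers refers to the same dataset; neither assumption is confirmed by this document, and the sample records are invented.

```python
from typing import Any


def merge_by_id(
    results_by_server: dict[str, list[dict[str, Any]]]
) -> dict[str, dict[str, Any]]:
    """Combine per-server results, recording which servers returned each dataset."""
    merged: dict[str, dict[str, Any]] = {}
    for server, datasets in results_by_server.items():
        for ds in datasets:
            # The first sighting stores the record; later ones only add the server.
            entry = merged.setdefault(ds["id"], {**ds, "servers": []})
            entry["servers"].append(server)
    return merged


sample = {  # invented records standing in for two search_datasets responses
    "global": [
        {"id": "a1", "title": "Sea surface temperature"},
        {"id": "b2", "title": "Salinity profiles"},
    ],
    "local": [
        {"id": "b2", "title": "Salinity profiles"},
        {"id": "c3", "title": "Tide gauge records"},
    ],
}
for ds_id, entry in merge_by_id(sample).items():
    print(ds_id, entry["servers"])
```

Datasets listed under a single server are the ones unique to it, which is usually the interesting part of a comparison.
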
## Tips for Effective Searching

1. **Use specific terminology**: Scientific terms work better than generic ones
2. **Combine filters**: Organization + format + terms = precise results
3. **Check multiple formats**: Try CSV, NetCDF, and HDF5 for scientific data
4. **Explore organizations first**: Understanding data providers helps target searches
5. **Request details selectively**: Fetch full metadata only for the most relevant datasets