---
description: Search for datasets in the National Data Platform
---
# NDP Dataset Search
Search for datasets across the National Data Platform (NDP) ecosystem, filtering by search terms, organization, resource format, and description.
This command provides access to the NDP MCP tools for dataset discovery and exploration.
## Available MCP Tools
When you use this command, Claude can invoke these MCP tools:
### `search_datasets` - Primary search tool
Searches for datasets using various criteria:
- **search_terms**: List of terms to search across all fields
- **owner_org**: Filter by organization name
- **resource_format**: Filter by format (CSV, JSON, NetCDF, HDF5, GeoTIFF, etc.)
- **dataset_description**: Search in descriptions
- **server**: Query 'global' (default) or 'local' server
- **limit**: Maximum results (default: 20)
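
As a rough illustration of how these parameters fit together, the sketch below assembles an argument payload for `search_datasets`. The helper `build_search_args` is hypothetical and not part of the NDP MCP tools; it only mirrors the parameter names and defaults listed above and leaves out any filter that was not set.

```python
from typing import Any, Optional


def build_search_args(
    search_terms: Optional[list[str]] = None,
    owner_org: Optional[str] = None,
    resource_format: Optional[str] = None,
    dataset_description: Optional[str] = None,
    server: str = "global",
    limit: int = 20,
) -> dict[str, Any]:
    """Hypothetical helper: build a search_datasets payload, omitting unset filters."""
    # The parameter list above documents only these two server values for search_datasets.
    if server not in ("global", "local"):
        raise ValueError(f"server must be 'global' or 'local', got {server!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")

    candidate = {
        "search_terms": search_terms,
        "owner_org": owner_org,
        "resource_format": resource_format,
        "dataset_description": dataset_description,
    }
    # Keep only the filters the caller actually set.
    args: dict[str, Any] = {key: value for key, value in candidate.items() if value}
    args["server"] = server
    args["limit"] = limit
    return args


print(build_search_args(search_terms=["oceanographic"], resource_format="NetCDF"))
```

Dropping unset filters keeps the request minimal, which matches the "start broad, then refine" approach: add a filter only when the results call for it.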
### `list_organizations` - Organization discovery
Lists available organizations:
- **name_filter**: Filter by name substring
- **server**: Query 'global' (default), 'local', or 'pre_ckan'
### `get_dataset_details` - Detailed information
Retrieves complete metadata for a specific dataset:
- **dataset_identifier**: Dataset ID or name from search results
- **identifier_type**: 'id' (default) or 'name'
- **server**: 'global' (default) or 'local'
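
Choosing between `'id'` and `'name'` can be automated when dataset IDs are UUID-shaped, which is an assumption here (common in CKAN-backed catalogs, but not confirmed for every NDP server). The hypothetical helper below guesses `identifier_type` from the identifier string.

```python
import uuid


def guess_identifier_type(dataset_identifier: str) -> str:
    """Hypothetical helper: return 'id' for UUID-shaped strings, otherwise 'name'."""
    try:
        # uuid.UUID raises ValueError for strings that are not valid UUIDs.
        uuid.UUID(dataset_identifier)
    except ValueError:
        return "name"
    return "id"


print(guess_identifier_type("123e4567-e89b-12d3-a456-426614174000"))
print(guess_identifier_type("sea-surface-temperature"))
```

A dataset name that happens to be exactly 32 hexadecimal characters would be misclassified as an ID, so treat the result as a default rather than a guarantee.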
## Recommended Workflow
1. **Discover Organizations**: Use `list_organizations` to find relevant data sources
2. **Search Datasets**: Use `search_datasets` with appropriate filters
3. **Review Results**: Claude will summarize matching datasets
4. **Get Details**: Use `get_dataset_details` for datasets of interest
5. **Refine Search**: Adjust filters based on results
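
The steps above can be sketched as a small driver function. Everything beyond the tool and parameter names is an assumption for illustration: `call_tool` stands in for whatever mechanism invokes an MCP tool, the return shapes (a list of organization names, a list of result dicts with an `id` key) are guesses, and `fake_call_tool` serves made-up data so the sketch runs without a server.

```python
from typing import Any, Callable

# Assumed interface: call_tool(tool_name, arguments) -> tool result.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def discover_and_search(
    call_tool: ToolCaller,
    org_hint: str,
    terms: list[str],
    limit: int = 20,
    detail_count: int = 1,
) -> list[dict[str, Any]]:
    """Run steps 1, 2, and 4 of the workflow and return details for the top results."""
    # Step 1: resolve the exact organization name before filtering on it.
    orgs = call_tool("list_organizations", {"name_filter": org_hint})
    if not orgs:
        return []
    # Step 2: search with the verified organization name.
    results = call_tool(
        "search_datasets",
        {"owner_org": orgs[0], "search_terms": terms, "limit": limit},
    )
    # Step 4: fetch full metadata only for the first few results.
    return [
        call_tool(
            "get_dataset_details",
            {"dataset_identifier": result["id"], "identifier_type": "id"},
        )
        for result in results[:detail_count]
    ]


def fake_call_tool(name: str, args: dict[str, Any]) -> Any:
    """Stand-in for a real MCP client; all data here is invented."""
    if name == "list_organizations":
        return [org for org in ["example-org"] if args["name_filter"].lower() in org]
    if name == "search_datasets":
        return [{"id": "ds-1", "name": "example-dataset"}][: args["limit"]]
    if name == "get_dataset_details":
        return {"id": args["dataset_identifier"], "title": "Example dataset"}
    raise KeyError(name)


print(discover_and_search(fake_call_tool, "example", ["climate"]))
```

Passing the tool caller in as a parameter keeps the workflow logic separate from the transport, so the same function can be exercised against a stub or a live client.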
## Best Practices
- **Always verify organization names** with `list_organizations` before passing them as `owner_org` in a search
- **Start broad, then refine**: Begin with simple terms, add filters as needed
- **Limit results appropriately**: The default of 20 suits most queries; raise `limit` only when the results look truncated
- **Use format filters**: Narrow to specific formats (NetCDF, CSV, etc.) when relevant
- **Multi-server searches**: Query both global and local for comprehensive coverage
## Example Queries
### Basic Search
```
"Find climate datasets from NOAA"
```
Expected tools: `list_organizations(name_filter="noaa")`, then `search_datasets(owner_org="<name>", search_terms=["climate"])`, where `<name>` is the organization name returned by the first call
### Format-Specific Search
```
"Search for oceanographic data in NetCDF format"
```
Expected tools: `search_datasets(search_terms=["oceanographic"], resource_format="NetCDF")`
### Organization-Based Search
```
"List all datasets from a specific research institution"
```
Expected tools: `list_organizations(name_filter="<institution>")`, then `search_datasets(owner_org="<name>")`
### Refined Search with Limit
```
"Find CSV datasets about temperature monitoring, limit to 10 results"
```
Expected tools: `search_datasets(search_terms=["temperature", "monitoring"], resource_format="CSV", limit=10)`
### Multi-Server Comparison
```
"Compare oceanographic datasets on global and local servers"
```
Expected tools: `search_datasets(server="global", ...)` and `search_datasets(server="local", ...)`
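
Once both calls return, the two result sets can be compared by dataset name. The sketch below assumes each result is a dict with a `name` key, which is a guess at the result shape rather than a documented format; the sample data is invented.

```python
from typing import Any


def compare_results(
    global_results: list[dict[str, Any]],
    local_results: list[dict[str, Any]],
) -> dict[str, list[str]]:
    """Group dataset names by which server returned them."""
    global_names = {result["name"] for result in global_results}
    local_names = {result["name"] for result in local_results}
    return {
        "both": sorted(global_names & local_names),
        "global_only": sorted(global_names - local_names),
        "local_only": sorted(local_names - global_names),
    }


# Invented sample results for illustration only.
global_hits = [{"name": "ocean-temp"}, {"name": "sea-level"}]
local_hits = [{"name": "sea-level"}, {"name": "tide-gauge"}]
print(compare_results(global_hits, local_hits))
```

Comparing by name rather than by ID is a design choice for this sketch: IDs may differ between servers even when the same dataset is mirrored on both.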
## Tips for Effective Searching
1. **Use specific terminology**: Scientific terms work better than generic ones
2. **Combine filters**: Organization + format + terms = precise results
3. **Check multiple formats**: Try CSV, NetCDF, HDF5 for scientific data
4. **Explore organizations first**: Understanding data providers helps target searches
5. **Request details selectively**: Fetch full metadata only for the most relevant datasets