Initial commit

Zhongwei Li
2025-11-30 08:57:25 +08:00
commit 37ed95ddbf
10 changed files with 1187 additions and 0 deletions


@@ -0,0 +1,142 @@
---
description: Retrieve detailed information about a specific NDP dataset
---
# NDP Dataset Details
Get comprehensive metadata and resource information for a specific dataset.
This command provides access to detailed dataset metadata through the NDP MCP.
## Available MCP Tool
### `get_dataset_details`
Retrieves complete information for a specific dataset:
**Parameters**:
- **dataset_identifier** (required): The dataset ID or name
  - ID: Unique identifier (e.g., "a1b2c3d4-5678-90ef-ghij-klmnopqrstuv")
  - Name: Human-readable name (e.g., "noaa-climate-temp-2023")
- **identifier_type** (optional): Type of identifier
  - `'id'` (default) - Use when providing dataset ID
  - `'name'` - Use when providing dataset name/slug
- **server** (optional): Server to query
  - `'global'` (default) - Global NDP server
  - `'local'` - Local/institutional server
**Returns**: Comprehensive dataset information including:
- **Metadata**: Title, description, organization, tags, license
- **Resources**: All files/URLs with formats, sizes, descriptions
- **Temporal Info**: Creation date, last modified, temporal coverage
- **Spatial Info**: Geographic coverage (if applicable)
- **Contact Info**: Author, maintainer information
- **Additional Fields**: Custom metadata, processing info
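The parameter rules above can be sketched as a small validation helper. This is a minimal illustration, not the NDP MCP implementation: `build_dataset_details_args` is a hypothetical name, and only the parameter names, allowed values, and defaults are taken from the list above.

```python
ALLOWED_IDENTIFIER_TYPES = {"id", "name"}
ALLOWED_SERVERS = {"global", "local"}


def build_dataset_details_args(dataset_identifier, identifier_type="id", server="global"):
    """Return an arguments dict for a get_dataset_details call (hypothetical helper)."""
    # dataset_identifier is the only required parameter.
    if not dataset_identifier or not dataset_identifier.strip():
        raise ValueError("dataset_identifier is required")
    if identifier_type not in ALLOWED_IDENTIFIER_TYPES:
        raise ValueError(f"identifier_type must be one of {sorted(ALLOWED_IDENTIFIER_TYPES)}")
    if server not in ALLOWED_SERVERS:
        raise ValueError(f"server must be one of {sorted(ALLOWED_SERVERS)}")
    return {
        "dataset_identifier": dataset_identifier.strip(),
        "identifier_type": identifier_type,
        "server": server,
    }
```

Calling `build_dataset_details_args("noaa-climate-temp-2023", identifier_type="name")` would fill in `server="global"` by default, mirroring the defaults documented above.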
## Usage Patterns
### After Dataset Search
```
"Get details for dataset ID 'climate-temps-2023'"
```
Uses: `get_dataset_details(dataset_identifier="climate-temps-2023", identifier_type="id")`
### By Dataset Name
```
"Show me all information about the 'ocean-temperature-pacific' dataset"
```
Uses: `get_dataset_details(dataset_identifier="ocean-temperature-pacific", identifier_type="name")`
### Resource Information
```
"What formats are available for this dataset?" (after finding it in search)
```
Uses: `get_dataset_details(dataset_identifier="<from_search>")`
### Quality Assessment
```
"Review the metadata quality for dataset 'satellite-imagery-2024'"
```
Uses: `get_dataset_details(dataset_identifier="satellite-imagery-2024", identifier_type="name")`
## Information Retrieved
### Core Metadata
- **Title**: Dataset name
- **Description**: Detailed description with methodology
- **Organization**: Owner organization
- **Tags**: Keywords for discoverability
- **License**: Usage rights and restrictions
### Resource Details
For each resource (file/URL):
- **Format**: File format (CSV, JSON, NetCDF, HDF5, etc.)
- **URL**: Download link
- **Description**: Resource-specific description
- **Size**: File size (if available)
- **Created/Modified**: Timestamps
### Additional Information
- **Author/Maintainer**: Contact information
- **Temporal Coverage**: Date ranges
- **Spatial Coverage**: Geographic extent
- **Version**: Dataset version
- **Related Datasets**: Links to related data
- **Processing Info**: Data processing details
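To illustrate how the resource details above might be used when choosing a file, here is a minimal sketch that groups resource URLs by format. The response shape (a dict with a `resources` list whose entries carry `format` and `url` keys) is an assumption modeled on the fields listed above, not a documented schema, and `resources_by_format` is a hypothetical helper.

```python
from collections import defaultdict


def resources_by_format(details):
    """Group resource URLs by upper-cased format; missing formats go under 'UNKNOWN'."""
    grouped = defaultdict(list)
    for resource in details.get("resources", []):
        fmt = (resource.get("format") or "UNKNOWN").upper()
        grouped[fmt].append(resource.get("url"))
    return dict(grouped)


# Hypothetical response, shaped after the fields listed above.
sample_details = {
    "title": "Example dataset",
    "resources": [
        {"format": "CSV", "url": "https://example.org/a.csv"},
        {"format": "NetCDF", "url": "https://example.org/a.nc"},
        {"format": "csv", "url": "https://example.org/b.csv"},
    ],
}

formats = resources_by_format(sample_details)
```

Normalizing the format to upper case keeps "CSV" and "csv" in one group, which matters if providers label formats inconsistently.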
## When to Use
1. **After Search**: Follow up on interesting datasets from search results
2. **Before Download**: Verify dataset contents and formats
3. **Quality Review**: Check metadata completeness for curation
4. **Citation Info**: Get complete information for proper attribution
5. **Resource Selection**: Choose specific files/formats from dataset
6. **Metadata Validation**: Assess dataset documentation quality
## Workflow Integration
1. **Search First**: Use `/ndp-search` to find datasets
2. **Get IDs**: Note dataset IDs or names from search results
3. **Retrieve Details**: Use this command for complete information
4. **Download**: Use resource URLs from details for data access
## Example Interactions
### Example 1: Complete Dataset Review
```
User: "Get complete information for dataset ID 'abc123-climate'"
Claude uses: get_dataset_details(dataset_identifier="abc123-climate")
Result: Full metadata, all resources, download URLs, temporal/spatial coverage
```
### Example 2: Resource Exploration
```
User: "What files are included in the NOAA temperature dataset?"
Claude uses:
1. search_datasets(owner_org="NOAA", search_terms=["temperature"])
2. get_dataset_details(dataset_identifier="<id_from_search>")
Result: List of all resources with formats and descriptions
```
### Example 3: Metadata Quality Check
```
User: "Review the documentation for this oceanographic dataset"
Claude uses: get_dataset_details(dataset_identifier="<provided_id>")
Analysis: Evaluates description, tags, resource documentation, contact info
```
### Example 4: Multi-Dataset Comparison
```
User: "Compare the resources available in these three datasets"
Claude uses: get_dataset_details(dataset_identifier="<id>") once per dataset
Result: Side-by-side comparison of formats, sizes, documentation
```
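The multi-dataset comparison in Example 4 could be sketched roughly as follows. This assumes each details response is a dict with a `resources` list whose entries have a `format` key; that shape and the `compare_formats` helper are illustrative assumptions, not part of the NDP MCP.

```python
def compare_formats(details_by_dataset):
    """Return per-dataset format sets and the formats shared by all datasets."""
    per_dataset = {
        name: {(r.get("format") or "UNKNOWN").upper() for r in details.get("resources", [])}
        for name, details in details_by_dataset.items()
    }
    # An empty input has no shared formats; otherwise intersect every set.
    shared = set.intersection(*per_dataset.values()) if per_dataset else set()
    return per_dataset, shared
```

The shared set answers "which format can I use across all of these datasets?", while the per-dataset sets support a side-by-side view.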
## Tips
- **Use IDs when available**: IDs are unique, so they are more reliable than names
- **Check both servers**: Same dataset name might exist on multiple servers
- **Review all resources**: Datasets often have multiple files/formats
- **Note download URLs**: Save resource URLs for data access
- **Check temporal coverage**: Ensure data covers your time period of interest
- **Verify formats**: Confirm file formats are compatible with your tools
- **Read descriptions carefully**: Important processing details are often in descriptions


@@ -0,0 +1,110 @@
---
description: List and filter organizations in the National Data Platform
---
# NDP Organizations
List all organizations contributing data to the National Data Platform.
This command provides access to organization discovery functionality through the NDP MCP.
## Available MCP Tool
### `list_organizations`
Lists all organizations in NDP with optional filtering:
**Parameters**:
- **name_filter** (optional): Filter organizations by name substring match
  - Case-insensitive partial matching
  - Example: "climate" matches "Climate Research Center", "NOAA Climate Lab"
- **server** (optional): Server to query
  - `'global'` (default) - Public global NDP server
  - `'local'` - Local/institutional NDP server
  - `'pre_ckan'` - Pre-production server
**Returns**: List of organization names and metadata including:
- Total count of organizations
- Organization names matching filter
- Server queried
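The `name_filter` behavior described above (case-insensitive substring matching) can be illustrated with a short sketch. It shows the matching rule only; how the server actually implements the filter is not documented here, and `filter_organizations` is a hypothetical name.

```python
def filter_organizations(names, name_filter=None):
    """Keep names containing name_filter, ignoring case; no filter returns all names."""
    if not name_filter:
        return list(names)
    needle = name_filter.casefold()
    return [name for name in names if needle in name.casefold()]
```

With the example from the parameter list, filtering `["Climate Research Center", "NOAA Climate Lab", "Ocean Research Lab"]` by `"climate"` keeps the first two names and drops the third.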
## Usage Patterns
### Discover All Organizations
```
"List all organizations in the National Data Platform"
```
Uses: `list_organizations()` - No filter, returns all organizations
### Filter by Keyword
```
"Show me all organizations with 'climate' in their name"
```
Uses: `list_organizations(name_filter="climate")`
### Multi-Server Query
```
"Compare organizations on global and local servers"
```
Uses: `list_organizations(server="global")` and `list_organizations(server="local")`
### Research-Specific Discovery
```
"Find organizations related to oceanographic research"
```
Uses: `list_organizations(name_filter="ocean")` and `list_organizations(name_filter="marine")`
## Why Use This Command
1. **Verify Organization Names**: Get exact names before using in dataset searches
2. **Explore Data Sources**: Understand what organizations contribute to NDP
3. **Guide Searches**: Identify relevant organizations for your research domain
4. **Server Comparison**: See organizational differences between servers
5. **Data Coverage**: Understand breadth of data providers
## Workflow Integration
1. **Start Here**: Use this command before searching datasets
2. **Identify Providers**: Find organizations relevant to your research
3. **Use in Search**: Pass organization names to `search_datasets`
4. **Iterate**: Refine organization filters as needed
## Example Interactions
### Example 1: General Exploration
```
User: "List all organizations available on the local NDP server"
Claude uses: list_organizations(server="local")
Result: Complete list of local organizations with count
```
### Example 2: Targeted Discovery
```
User: "Find organizations related to satellite data"
Claude uses: list_organizations(name_filter="satellite")
Result: Organizations with "satellite" in their name
```
### Example 3: Multi-Keyword Search
```
User: "Show me organizations working on Earth observation"
Claude uses:
- list_organizations(name_filter="earth")
- list_organizations(name_filter="observation")
Result: Combined results from both searches
```
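Combining the results of several filtered calls, as in Example 3, needs a de-duplication step because one organization can match more than one keyword. A minimal sketch, assuming each call yields a list of organization name strings (`merge_organization_lists` is a hypothetical helper):

```python
def merge_organization_lists(*result_lists):
    """Combine result lists, dropping duplicates while keeping first-seen order."""
    seen = set()
    merged = []
    for names in result_lists:
        for name in names:
            key = name.casefold()  # treat differently-cased duplicates as one entry
            if key not in seen:
                seen.add(key)
                merged.append(name)
    return merged
```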
### Example 4: Before Dataset Search
```
User: "I want to search for NOAA climate data"
Claude uses: list_organizations(name_filter="noaa")
Result: Exact NOAA organization name(s)
Then: Can proceed with search_datasets(owner_org="<verified_name>")
```
## Tips
- **Use partial names**: "ocean" will match "Oceanographic Institute", "Ocean Research Lab", etc.
- **Try variations**: Search both "climate" and "atmospheric" to find all relevant organizations
- **Check both servers**: Global and local may have different organizations
- **Verify before searching**: Always confirm organization name before using in dataset searches
- **Multiple keywords**: Try related terms to discover all relevant providers

commands/ndp-search.md Normal file

@@ -0,0 +1,89 @@
---
description: Search for datasets in the National Data Platform
---
# NDP Dataset Search
Search for datasets across the National Data Platform ecosystem with advanced filtering options.
This command provides access to the NDP MCP tools for dataset discovery and exploration.
## Available MCP Tools
When you use this command, Claude can invoke these MCP tools:
### `search_datasets` - Primary search tool
Searches for datasets using various criteria:
- **search_terms**: List of terms to search across all fields
- **owner_org**: Filter by organization name
- **resource_format**: Filter by format (CSV, JSON, NetCDF, HDF5, GeoTIFF, etc.)
- **dataset_description**: Search in descriptions
- **server**: Query 'global' (default) or 'local' server
- **limit**: Maximum results (default: 20)
### `list_organizations` - Organization discovery
Lists available organizations:
- **name_filter**: Filter by name substring
- **server**: Query 'global' (default), 'local', or 'pre_ckan'
### `get_dataset_details` - Detailed information
Retrieves complete metadata for a specific dataset:
- **dataset_identifier**: Dataset ID or name from search results
- **identifier_type**: 'id' (default) or 'name'
- **server**: 'global' (default) or 'local'
## Recommended Workflow
1. **Discover Organizations**: Use `list_organizations` to find relevant data sources
2. **Search Datasets**: Use `search_datasets` with appropriate filters
3. **Review Results**: Claude will summarize matching datasets
4. **Get Details**: Use `get_dataset_details` for datasets of interest
5. **Refine Search**: Adjust filters based on results
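The first four workflow steps can be sketched as a single function that receives the three tools as callables. Everything about the return shapes is an assumption made for illustration: `list_organizations` is taken to return a list of name strings and `search_datasets` a list of dicts with an `id` key. `discover_datasets` and `detail_count` are hypothetical names.

```python
def discover_datasets(list_organizations, search_datasets, get_dataset_details,
                      org_filter, search_terms, limit=20, detail_count=3):
    """Run the organization -> search -> details sequence with injected tool callables."""
    # Step 1: verify the organization name before searching.
    organizations = list_organizations(name_filter=org_filter)
    if not organizations:
        return []
    owner_org = organizations[0]  # a real session would confirm this choice first

    # Step 2: search with the verified organization name.
    results = search_datasets(owner_org=owner_org, search_terms=search_terms, limit=limit)

    # Steps 3-4: fetch full details only for the first few results.
    return [
        get_dataset_details(dataset_identifier=result["id"])
        for result in results[:detail_count]
    ]
```

Passing the tools in as arguments keeps the sketch independent of how the MCP client is wired up, and makes it easy to try with stand-in functions.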
## Best Practices
- **Always verify organization names** with `list_organizations` before using in search
- **Start broad, then refine**: Begin with simple terms, add filters as needed
- **Limit results appropriately**: The default of 20 suits most searches; raise `limit` only when you need more results
- **Use format filters**: Narrow to specific formats (NetCDF, CSV, etc.) when relevant
- **Multi-server searches**: Query both global and local for comprehensive coverage
## Example Queries
### Basic Search
```
"Find climate datasets from NOAA"
```
Expected tools: `list_organizations(name_filter="noaa")`, then `search_datasets(owner_org="NOAA", search_terms=["climate"])`
### Format-Specific Search
```
"Search for oceanographic data in NetCDF format"
```
Expected tools: `search_datasets(search_terms=["oceanographic"], resource_format="NetCDF")`
### Organization-Based Search
```
"List all datasets from a specific research institution"
```
Expected tools: `list_organizations(name_filter="<institution>")`, then `search_datasets(owner_org="<name>")`
### Refined Search with Limit
```
"Find CSV datasets about temperature monitoring, limit to 10 results"
```
Expected tools: `search_datasets(search_terms=["temperature", "monitoring"], resource_format="CSV", limit=10)`
### Multi-Server Comparison
```
"Compare oceanographic datasets on global and local servers"
```
Expected tools: `search_datasets(server="global", ...)` and `search_datasets(server="local", ...)`
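Once both searches have returned, the comparison itself is a matter of set arithmetic. A minimal sketch, assuming each search has been reduced to a list of dataset names (`compare_servers` is a hypothetical helper, not an NDP MCP tool):

```python
def compare_servers(global_names, local_names):
    """Split dataset names into global-only, local-only, and shared groups."""
    global_set, local_set = set(global_names), set(local_names)
    return {
        "only_global": sorted(global_set - local_set),
        "only_local": sorted(local_set - global_set),
        "both": sorted(global_set & local_set),
    }
```

Note that a name appearing under `both` does not guarantee identical contents on the two servers; the details of each copy would still need to be checked.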
## Tips for Effective Searching
1. **Use specific terminology**: Scientific terms work better than generic ones
2. **Combine filters**: Organization + format + terms = precise results
3. **Check multiple formats**: Try CSV, NetCDF, HDF5 for scientific data
4. **Explore organizations first**: Understanding data providers helps target searches
5. **Request details selectively**: Request full metadata only for the most relevant datasets