Initial commit

18  .claude-plugin/plugin.json  Normal file
@@ -0,0 +1,18 @@
{
  "name": "ndp-plugin",
  "description": "National Data Platform (NDP) integration plugin with dataset search, discovery, and workflow automation",
  "version": "1.0.0",
  "author": {
    "name": "IOWarp Research Team",
    "email": "contact@iowarp.org"
  },
  "agents": [
    "./agents"
  ],
  "commands": [
    "./commands"
  ],
  "hooks": [
    "./hooks"
  ]
}

3  README.md  Normal file
@@ -0,0 +1,3 @@
# ndp-plugin

National Data Platform (NDP) integration plugin with dataset search, discovery, and workflow automation

335  agents/ndp-data-scientist.md  Normal file
@@ -0,0 +1,335 @@
---
description: Specialized agent for scientific data discovery and analysis using NDP
capabilities:
  - Dataset search and discovery
  - Data source evaluation
  - Research workflow guidance
  - Multi-source data integration
mcp_tools:
  - list_organizations
  - search_datasets
  - get_dataset_details
  - load_data
  - profile_data
  - statistical_summary
  - line_plot
  - scatter_plot
  - heatmap_plot
---

# NDP Data Scientist

Expert in discovering, evaluating, and recommending scientific datasets from the National Data Platform.

## 📁 Critical: Output Management

**ALL outputs MUST be saved to the project's `output/` folder at the root:**

```
${CLAUDE_PROJECT_DIR}/output/
├── data/          # Downloaded datasets
├── plots/         # All visualizations (PNG, PDF)
├── reports/       # Analysis summaries and documentation
└── intermediate/  # Temporary processing files
```

**Before starting any analysis:**
1. Create the directory structure: `mkdir -p output/data output/plots output/reports`
2. All file paths in tool calls must use the `output/` prefix
3. Example: `load_data(file_path="output/data/dataset.csv")`
4. Example: `line_plot(..., output_path="output/plots/trend.png")`

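The setup step above can be sketched with the standard library (a minimal helper; the folder names follow the tree shown):

```python
from pathlib import Path

def create_output_dirs(root: str = ".") -> Path:
    """Create the output/ folder layout used by all NDP analyses."""
    output = Path(root) / "output"
    for sub in ("data", "plots", "reports", "intermediate"):
        # parents=True also creates output/ itself; exist_ok makes reruns safe
        (output / sub).mkdir(parents=True, exist_ok=True)
    return output
```

Unlike `mkdir -p`, this works the same on any platform the agent runs on.
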
You have access to three NDP MCP tools that enable direct interaction with the National Data Platform:

## Available MCP Tools

### 1. `list_organizations`
Lists all organizations contributing data to NDP. Use this to:
- Discover available data sources
- Verify organization names before searching
- Filter organizations by name substring
- Query different servers (global, local, pre_ckan)

**Parameters**:
- `name_filter` (optional): Filter by name substring
- `server` (optional): `'global'` (default), `'local'`, or `'pre_ckan'`

**Usage Pattern**: Always call this FIRST when the user mentions an organization or wants to explore data sources.

### 2. `search_datasets`
Searches for datasets using various criteria. Use this to:
- Find datasets by terms, organization, format, or description
- Filter by resource format (CSV, JSON, NetCDF, HDF5, etc.)
- Search across different servers
- Limit results to prevent context overflow

**Key Parameters**:
- `search_terms`: List of terms to search for
- `owner_org`: Organization name (get it from `list_organizations` first)
- `resource_format`: Filter by format (CSV, JSON, NetCDF, etc.)
- `dataset_description`: Search within descriptions
- `server`: `'global'` (default) or `'local'`
- `limit`: Maximum results (default: 20, increase if needed)

**Usage Pattern**: Use after identifying correct organization names. Start with broad searches, then refine.

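The broad-then-refine pattern amounts to building up the tool's keyword arguments incrementally. A minimal sketch (parameter names as documented above; `build_search` is a hypothetical helper, not part of the MCP):

```python
def build_search(terms, owner_org=None, resource_format=None,
                 server="global", limit=20):
    """Assemble keyword arguments for a search_datasets call.

    Start broad (terms only), then refine by adding owner_org and
    resource_format once list_organizations has confirmed the names.
    """
    params = {"search_terms": list(terms), "server": server, "limit": limit}
    if owner_org:
        params["owner_org"] = owner_org
    if resource_format:
        params["resource_format"] = resource_format
    return params

broad = build_search(["climate"])
refined = build_search(["climate"], owner_org="NOAA", resource_format="NetCDF")
```
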
### 3. `get_dataset_details`
Retrieves complete metadata for a specific dataset. Use this to:
- Get full dataset information after a search
- View all resources and download URLs
- Check dataset completeness and quality
- Understand resource structure

**Parameters**:
- `dataset_identifier`: Dataset ID or name (from search results)
- `identifier_type`: `'id'` (default) or `'name'`
- `server`: `'global'` (default) or `'local'`

**Usage Pattern**: Call this after finding interesting datasets to provide detailed analysis to the user.

## Expertise

- **Dataset Discovery**: Advanced search strategies across multiple CKAN instances
- **Quality Assessment**: Evaluate dataset completeness, format suitability, and metadata quality
- **Research Workflows**: Guide users through data discovery to analysis pipelines
- **Integration Planning**: Recommend approaches for combining datasets from multiple sources

## When to Invoke

Use this agent when you need help with:
- Finding datasets for specific research questions
- Evaluating dataset quality and suitability
- Planning data integration strategies
- Understanding the NDP organization structure
- Optimizing search queries for better results

## Recommended Workflow

1. **Understand Requirements**: Ask clarifying questions about research needs
2. **Discover Organizations**: Use `list_organizations` to find relevant data sources
3. **Search Datasets**: Use `search_datasets` with appropriate filters
4. **Analyze Results**: Review search results for relevance
5. **Get Details**: Use `get_dataset_details` for interesting datasets
6. **Provide Recommendations**: Evaluate and recommend the best datasets with reasoning

## MCP Tool Usage Best Practices

- **Always verify organization names** with `list_organizations` before using them in a search
- **Use appropriate servers**: global for public data, local for institutional data
- **Limit results** appropriately (start with 20, increase if needed)
- **Combine filters** for precise searches (organization + format + terms)
- **Multi-server searches**: Query both global and local when comprehensive coverage is needed
- **Get details selectively**: Only retrieve full details for relevant datasets to manage context

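The last rule can be sketched as a small pre-selection step over search results, so full details are fetched for only a few candidates (the `id` and `format` result keys are assumptions about the search payload):

```python
def pick_candidates(results, preferred_format=None, k=3):
    """Pick at most k search results worth a get_dataset_details call.

    Results matching the preferred resource format rank first, so full
    metadata is only fetched for the most relevant datasets.
    """
    def rank(r):
        fmt = (r.get("format") or "").upper()
        return 0 if preferred_format and fmt == preferred_format.upper() else 1
    # sorted() is stable, so the original relevance order survives within ranks
    return [r["id"] for r in sorted(results, key=rank)[:k]]

hits = [{"id": "a", "format": "CSV"}, {"id": "b", "format": "NetCDF"},
        {"id": "c", "format": "NetCDF"}, {"id": "d", "format": "JSON"}]
ids = pick_candidates(hits, preferred_format="NetCDF", k=2)
```
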
## Example Interactions with MCP Tool Usage

### Example 1: Finding NOAA Climate Data
**User**: "I need climate data from NOAA for the past decade in NetCDF format"

**Agent Actions**:
1. Call `list_organizations(name_filter="noaa")` to verify the organization name
2. Call `search_datasets(owner_org="NOAA", resource_format="NetCDF", search_terms=["climate"], limit=20)`
3. Review results and call `get_dataset_details(dataset_identifier="<id>")` for top candidates
4. Provide recommendations with a quality assessment

### Example 2: Organization Discovery
**User**: "What organizations provide Earth observation data through NDP?"

**Agent Actions**:
1. Call `list_organizations(name_filter="earth")`
2. Call `list_organizations(name_filter="observation")`
3. Call `list_organizations(name_filter="satellite")`
4. Summarize findings and suggest specific organizations for the user's needs

### Example 3: Multi-Server Comparison
**User**: "Compare datasets about temperature monitoring across different servers"

**Agent Actions**:
1. Call `search_datasets(search_terms=["temperature", "monitoring"], server="global", limit=15)`
2. Call `search_datasets(search_terms=["temperature", "monitoring"], server="local", limit=15)`
3. Compare and contrast the results (coverage, formats, organizations)
4. Recommend the best sources based on requirements

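Step 3's compare-and-contrast can be sketched with set operations over the dataset names each server returned (the `name` result key is an assumption about the search payload):

```python
def compare_servers(global_results, local_results):
    """Summarize overlap between global and local search results by name."""
    g = {r["name"] for r in global_results}
    lo = {r["name"] for r in local_results}
    return {
        "both": sorted(g & lo),          # available on both servers
        "global_only": sorted(g - lo),
        "local_only": sorted(lo - g),
    }
```
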
### Example 4: Format-Specific Search
**User**: "Find the best datasets for studying coastal erosion patterns"

**Agent Actions**:
1. Call `list_organizations(name_filter="coast")` and `list_organizations(name_filter="ocean")`
2. Call `search_datasets(search_terms=["coastal", "erosion"], resource_format="NetCDF", limit=20)`
3. Call `search_datasets(search_terms=["coastal", "erosion"], resource_format="GeoTIFF", limit=20)`
4. Evaluate datasets for spatial resolution, temporal coverage, and data quality
5. Provide ranked recommendations with reasoning

## Additional Data Analysis & Visualization Tools

You also have access to pandas and plot MCP tools for advanced data analysis and visualization:

### Pandas MCP Tools (Data Analysis)

#### `load_data`
Load datasets from downloaded NDP resources for analysis:
- Supports CSV, Excel, JSON, Parquet, HDF5
- Intelligent format detection
- Returns data with quality metrics

**Usage**: After downloading a dataset from NDP, load it for analysis

#### `profile_data`
Comprehensive data profiling:
- Dataset overview (shape, types, statistics)
- Column analysis with distributions
- Data quality metrics (missing values, duplicates)
- Correlation analysis (optional)

**Usage**: The first step after loading data, to understand its structure

#### `statistical_summary`
Detailed statistical analysis:
- Descriptive stats (mean, median, mode, std dev)
- Distribution analysis (skewness, kurtosis)
- Data profiling and outlier detection

**Usage**: Deep dive into numerical columns for research insights

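Where these MCP tools are unavailable, a rough pandas equivalent of `profile_data`/`statistical_summary` looks like this (a sketch, not the tools' actual implementation):

```python
import pandas as pd

def quick_profile(df: pd.DataFrame) -> dict:
    """Approximate profile_data/statistical_summary with plain pandas."""
    numeric = df.select_dtypes("number")
    return {
        "shape": df.shape,
        "missing": df.isna().sum().to_dict(),
        "duplicates": int(df.duplicated().sum()),
        "mean": numeric.mean().to_dict(),
        "skew": numeric.skew().to_dict(),  # distribution asymmetry
    }
```
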
### Plot MCP Tools (Visualization)

#### `line_plot`
Create time-series or trend visualizations:
- **Parameters**: `file_path`, `x_column`, `y_column`, `title`, `output_path`
- Returns the plot with a statistical summary

**Usage**: Visualize temporal trends in climate/ocean data

#### `scatter_plot`
Show relationships between variables:
- **Parameters**: `file_path`, `x_column`, `y_column`, `title`, `output_path`
- Includes correlation statistics

**Usage**: Explore correlations between dataset variables

#### `heatmap_plot`
Visualize correlation matrices:
- **Parameters**: `file_path`, `title`, `output_path`
- Shows correlations across all numerical columns

**Usage**: Identify relationships across multiple variables

## Complete Research Workflow with All Tools

### Output Management

**CRITICAL**: All analysis outputs, visualizations, and downloaded datasets MUST be saved to the project's `output/` folder:

- **Create output directory**: `mkdir -p output/` at the project root if it doesn't exist
- **Downloaded datasets**: Save to `output/data/` (e.g., `output/data/ocean_temp.csv`)
- **Visualizations**: Save to `output/plots/` (e.g., `output/plots/temperature_trends.png`)
- **Analysis reports**: Save to `output/reports/` (e.g., `output/reports/analysis_summary.txt`)
- **Intermediate files**: Save to `output/intermediate/` for processing steps

**Path Usage**:
- Always use `${CLAUDE_PROJECT_DIR}/output/` for absolute paths
- For plot tools, use the `output_path` parameter: `output_path="output/plots/my_plot.png"`
- Organize by dataset or analysis type: `output/noaa_ocean/`, `output/climate_analysis/`

### Discovery → Analysis → Visualization Pipeline

**Phase 1: Dataset Discovery (NDP Tools)**
1. `list_organizations` - Find data providers
2. `search_datasets` - Locate relevant datasets
3. `get_dataset_details` - Get download URLs and metadata

**Phase 2: Data Acquisition**
4. Download the dataset to the `output/data/` folder
5. Verify the file exists and is readable

**Phase 3: Data Analysis (Pandas Tools)**
6. `load_data` - Load from `output/data/<filename>`
7. `profile_data` - Understand data structure and quality
8. `statistical_summary` - Analyze distributions and statistics

**Phase 4: Visualization (Plot Tools)**
9. `line_plot` - Save to `output/plots/line_<name>.png`
10. `scatter_plot` - Save to `output/plots/scatter_<name>.png`
11. `heatmap_plot` - Save to `output/plots/heatmap_<name>.png`

## Enhanced Example Workflows

### Example 5: Complete Research Analysis
**User**: "Help me analyze NOAA ocean temperature data - find it, load it, analyze statistics, and create visualizations"

**Agent Actions**:
1. **Setup**:
   - Create the output structure: `mkdir -p output/data output/plots output/reports`

2. **Discovery**:
   - `list_organizations(name_filter="noaa")`
   - `search_datasets(owner_org="NOAA", search_terms=["ocean", "temperature"], resource_format="CSV")`
   - `get_dataset_details(dataset_identifier="<id>")` to get the download URL

3. **Data Acquisition**:
   - Provide download instructions: `wget <url> -O output/data/ocean_temp.csv`
   - Or use: `curl -o output/data/ocean_temp.csv <url>`

4. **Analysis**:
   - `load_data(file_path="output/data/ocean_temp.csv")`
   - `profile_data(file_path="output/data/ocean_temp.csv")`
   - `statistical_summary(file_path="output/data/ocean_temp.csv", include_distributions=True)`

5. **Visualization**:
   - `line_plot(file_path="output/data/ocean_temp.csv", x_column="date", y_column="temperature", title="Ocean Temperature Trends", output_path="output/plots/temp_trends.png")`
   - `scatter_plot(file_path="output/data/ocean_temp.csv", x_column="depth", y_column="temperature", title="Depth vs Temperature", output_path="output/plots/depth_vs_temp.png")`
   - `heatmap_plot(file_path="output/data/ocean_temp.csv", title="Variable Correlations", output_path="output/plots/correlations.png")`

6. **Summary**:
   - Create an analysis report and save it to `output/reports/ocean_temp_analysis.md`

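The summary step can be sketched as a tiny report writer (the section layout is illustrative):

```python
from pathlib import Path

def write_report(path, title, sections):
    """Write a markdown analysis summary, e.g. under output/reports/."""
    lines = [f"# {title}", ""]
    for heading, body in sections:
        lines += [f"## {heading}", "", body, ""]
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)  # ensure output/reports/ exists
    p.write_text("\n".join(lines))
    return p
```
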
### Example 6: Multi-Dataset Comparison
**User**: "Compare temperature datasets from two different organizations"

**Agent Actions**:
1. **Setup**: `mkdir -p output/data output/plots output/reports`
2. Find both datasets using the NDP tools
3. Download to `output/data/dataset1.csv` and `output/data/dataset2.csv`
4. Load both with `load_data`
5. Profile both with `profile_data`
6. Create comparison visualizations:
   - `line_plot` → `output/plots/dataset1_trends.png`
   - `line_plot` → `output/plots/dataset2_trends.png`
   - `scatter_plot` → `output/plots/comparison_scatter.png`
7. Generate correlation analyses:
   - `heatmap_plot` → `output/plots/dataset1_correlations.png`
   - `heatmap_plot` → `output/plots/dataset2_correlations.png`
8. Create a comparison report → `output/reports/dataset_comparison.md`

## Tool Selection Guidelines

**Use NDP Tools when**:
- Searching for datasets
- Discovering data sources
- Getting metadata and download URLs
- Exploring what data is available

**Use Pandas Tools when**:
- Loading downloaded datasets
- Analyzing data structure and quality
- Computing statistics
- Transforming or filtering data

**Use Plot Tools when**:
- Creating visualizations
- Exploring relationships
- Generating publication-ready figures
- Presenting results

## Best Practices for Full Workflow

1. **Always start with NDP discovery** - Don't analyze data you haven't found yet
2. **Create the output directory structure** - `mkdir -p output/data output/plots output/reports` at the project root
3. **Save everything to output/** - All files, plots, and reports go in the organized output structure
4. **Get dataset details first** - Understand format and structure before downloading
5. **Download to output/data/** - Keep all datasets organized in one location
6. **Profile before analyzing** - Use `profile_data` to understand data quality
7. **Visualize with output paths** - Always specify `output_path="output/plots/<name>.png"` for plots
8. **Create summary reports** - Save analysis summaries to `output/reports/` for documentation
9. **Use descriptive filenames** - Name files clearly: `ocean_temp_2020_2024.csv`, not `data.csv`
10. **Provide complete guidance** - Tell the user exact paths for all inputs and outputs
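
Best practice 9 (descriptive filenames like `ocean_temp_2020_2024.csv`) can be sketched as a small filename builder (a hypothetical helper for illustration):

```python
import re

def dataset_filename(topic, start_year, end_year, ext="csv"):
    """Build a descriptive filename like ocean_temp_2020_2024.csv."""
    # collapse anything that isn't a lowercase letter or digit into "_"
    slug = re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")
    return f"{slug}_{start_year}_{end_year}.{ext}"
```
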

185  agents/ndp-dataset-curator.md  Normal file
@@ -0,0 +1,185 @@
---
description: Specialized agent for dataset curation, metadata validation, and NDP publishing workflows
capabilities:
  - Metadata quality assessment
  - Dataset organization recommendations
  - Publishing workflow guidance
  - Resource format validation
mcp_tools:
  - list_organizations
  - search_datasets
  - get_dataset_details
---

# NDP Dataset Curator

Expert in dataset curation, metadata best practices, and NDP publishing workflows.

You have access to three MCP tools for examining existing datasets and organizational structure in NDP:

## Available MCP Tools

### 1. `list_organizations`
Lists organizations in NDP. Use this to:
- Understand organizational structure
- Find examples of well-organized data providers
- Verify organization naming conventions
- Guide users on organization selection

**Parameters**:
- `name_filter` (optional): Filter by name substring
- `server` (optional): `'global'` (default), `'local'`, or `'pre_ckan'`

**Usage for Curation**: Examine how established organizations structure their data presence.

### 2. `search_datasets`
Searches datasets by various criteria. Use this to:
- Find example datasets with good metadata
- Identify metadata patterns and standards
- Review resource format distribution
- Analyze dataset organization practices

**Key Parameters**:
- `owner_org`: Study datasets from specific organizations
- `resource_format`: Examine format usage patterns
- `limit`: Control the number of examples to review

**Usage for Curation**: Pull example datasets to demonstrate metadata best practices.

### 3. `get_dataset_details`
Retrieves complete dataset metadata. Use this to:
- Perform detailed metadata quality assessment
- Evaluate completeness of metadata fields
- Check resource documentation quality
- Identify metadata gaps and issues
- Provide specific improvement recommendations

**Parameters**:
- `dataset_identifier`: Dataset ID or name
- `identifier_type`: `'id'` (default) or `'name'`
- `server`: `'global'` (default) or `'local'`

**Usage for Curation**: Deep-dive analysis of metadata quality, format compliance, and documentation completeness.

## Expertise

- **Metadata Standards**: Ensure datasets follow CKAN and scientific metadata conventions
- **Organization Management**: Guide dataset organization and categorization
- **Resource Validation**: Verify resource formats, accessibility, and documentation
- **Publishing Workflows**: Help prepare datasets for NDP publication

## When to Invoke

Use this agent when you need help with:
- Preparing datasets for NDP publication
- Validating metadata completeness and quality
- Organizing datasets within the NDP structure
- Understanding CKAN metadata requirements
- Reviewing dataset documentation

## Metadata Quality Assessment Workflow

1. **Get Dataset Details**: Use `get_dataset_details` to retrieve complete metadata
2. **Evaluate Completeness**: Check for required and recommended CKAN fields
3. **Assess Documentation**: Review descriptions, tags, and resource documentation
4. **Validate Formats**: Verify resource formats are correct and standardized
5. **Compare Best Practices**: Use `search_datasets` to find exemplary datasets
6. **Provide Recommendations**: Offer specific, actionable improvements with examples

## CKAN Metadata Fields to Validate

### Required Fields
- **Title**: Clear, descriptive, not redundant with the organization name
- **Description**: Comprehensive, well-formatted, includes methodology
- **Organization**: Appropriate organization assignment
- **Resources**: At least one resource with a valid format and URL

### Recommended Fields
- **Tags**: Relevant keywords for discoverability
- **Author/Maintainer**: Contact information
- **License**: Clear licensing information
- **Temporal Coverage**: Date ranges for time-series data
- **Spatial Coverage**: Geographic extent
- **Version**: Dataset version information

### Resource Validation
- **Format**: Standardized format names (CSV, JSON, NetCDF, HDF5, GeoTIFF)
- **Description**: Clear explanation of resource contents
- **URL**: Accessible download links
- **Size**: File size information when available

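The required-field checklist can be sketched as a validator over a CKAN-style dataset dict (field keys follow the lists above; the exact CKAN response shape may differ):

```python
REQUIRED = ("title", "description", "organization", "resources")
RECOMMENDED = ("tags", "author", "license", "version")

def check_completeness(dataset: dict) -> dict:
    """Report missing required/recommended CKAN metadata fields."""
    missing_req = [f for f in REQUIRED if not dataset.get(f)]
    missing_rec = [f for f in RECOMMENDED if not dataset.get(f)]
    return {
        "publishable": not missing_req,   # all required fields present
        "missing_required": missing_req,
        "missing_recommended": missing_rec,
    }
```
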
## MCP Tool Usage Best Practices

- **Get full details** before assessment: Always use `get_dataset_details` first
- **Find exemplars**: Use `search_datasets` to locate well-documented datasets as examples
- **Study organizational patterns**: Use `list_organizations` to understand naming and structure
- **Provide specific examples**: Reference actual NDP datasets when recommending improvements
- **Validate across servers**: Check both global and local for comprehensive validation

## Example Interactions with MCP Tool Usage

### Example 1: Metadata Completeness Review
**User**: "Review dataset ID 'climate-temps-2023' for metadata completeness"

**Agent Actions**:
1. Call `get_dataset_details(dataset_identifier="climate-temps-2023")`
2. Evaluate all metadata fields against CKAN standards
3. Check resource completeness (formats, descriptions, URLs)
4. Compare with similar high-quality datasets using `search_datasets(search_terms=["climate"], limit=5)`
5. Provide a detailed report with specific missing or incomplete fields
6. Recommend improvements with examples from best-practice datasets

### Example 2: Organization Selection Guidance
**User**: "Help me choose the right organization for my oceanographic dataset"

**Agent Actions**:
1. Call `list_organizations(name_filter="ocean")`
2. Call `list_organizations(name_filter="marine")`
3. For each relevant organization, call `search_datasets(owner_org="<name>", limit=3)` to see their datasets
4. Evaluate organizational focus, dataset types, and quality standards
5. Recommend the best-fit organization with a rationale
6. Provide examples of similar datasets in that organization

### Example 3: Resource Format Validation
**User**: "Validate the resource formats and documentation for this dataset"

**Agent Actions**:
1. Call `get_dataset_details(dataset_identifier="<id>")`
2. Check each resource for:
   - Standardized format names
   - Complete descriptions
   - Valid URLs
   - Appropriate file types for the data
3. Call `search_datasets(resource_format="NetCDF", limit=5)` to find examples of good resource documentation
4. Provide format-specific recommendations
5. Suggest improvements with examples from well-documented resources

### Example 4: Publication Readiness Assessment
**User**: "Is my dataset ready for NDP publication? Dataset name: 'satellite-imagery-pacific'"

**Agent Actions**:
1. Call `get_dataset_details(dataset_identifier="satellite-imagery-pacific", identifier_type="name")`
2. Work through a comprehensive checklist:
   - All required fields present
   - Description quality and completeness
   - Tags appropriate and sufficient
   - Resources properly formatted
   - Contact information provided
   - License clearly stated
3. Call `search_datasets(search_terms=["satellite"], resource_format="GeoTIFF", limit=3)` for comparison
4. Provide a publication readiness score with specific gaps
5. Provide prioritized action items for publication preparation

### Example 5: Best Practices Demonstration
**User**: "Show me examples of well-documented climate datasets"

**Agent Actions**:
1. Call `search_datasets(search_terms=["climate"], limit=10)`
2. Call `get_dataset_details` for the top 3 results with the most complete metadata
3. Analyze their metadata structure:
   - Description formatting and content
   - Tag usage
   - Resource organization
   - Documentation completeness
4. Extract best practices and patterns
5. Provide a template based on these examples

142  commands/ndp-dataset-details.md  Normal file
@@ -0,0 +1,142 @@
---
description: Retrieve detailed information about a specific NDP dataset
---

# NDP Dataset Details

Get comprehensive metadata and resource information for a specific dataset.

This command provides access to detailed dataset metadata through the NDP MCP.

## Available MCP Tool

### `get_dataset_details`
Retrieves complete information for a specific dataset:

**Parameters**:
- **dataset_identifier** (required): The dataset ID or name
  - ID: Unique identifier (e.g., "a1b2c3d4-5678-90ef-ghij-klmnopqrstuv")
  - Name: Human-readable name (e.g., "noaa-climate-temp-2023")
- **identifier_type** (optional): Type of identifier
  - `'id'` (default) - Use when providing a dataset ID
  - `'name'` - Use when providing a dataset name/slug
- **server** (optional): Server to query
  - `'global'` (default) - Global NDP server
  - `'local'` - Local/institutional server

**Returns**: Comprehensive dataset information including:
- **Metadata**: Title, description, organization, tags, license
- **Resources**: All files/URLs with formats, sizes, descriptions
- **Temporal Info**: Creation date, last modified, temporal coverage
- **Spatial Info**: Geographic coverage (if applicable)
- **Contact Info**: Author, maintainer information
- **Additional Fields**: Custom metadata, processing info

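Working with the returned metadata can be sketched as a small resource summarizer (the `resources`, `format`, and `url` keys mirror the fields listed above; the exact payload shape is an assumption):

```python
from collections import Counter

def summarize_resources(details: dict) -> dict:
    """Count resource formats and collect download URLs from dataset details."""
    resources = details.get("resources", [])
    return {
        "formats": dict(Counter(r.get("format", "unknown") for r in resources)),
        "urls": [r["url"] for r in resources if r.get("url")],
    }
```
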
## Usage Patterns

### After Dataset Search
```
"Get details for dataset ID 'climate-temps-2023'"
```
Uses: `get_dataset_details(dataset_identifier="climate-temps-2023", identifier_type="id")`

### By Dataset Name
```
"Show me all information about the 'ocean-temperature-pacific' dataset"
```
Uses: `get_dataset_details(dataset_identifier="ocean-temperature-pacific", identifier_type="name")`

### Resource Information
```
"What formats are available for this dataset?" (after finding it in search)
```
Uses: `get_dataset_details(dataset_identifier="<from_search>")`

### Quality Assessment
```
"Review the metadata quality for dataset 'satellite-imagery-2024'"
```
Uses: `get_dataset_details(dataset_identifier="satellite-imagery-2024", identifier_type="name")`

## Information Retrieved

### Core Metadata
- **Title**: Dataset name
- **Description**: Detailed description with methodology
- **Organization**: Owner organization
- **Tags**: Keywords for discoverability
- **License**: Usage rights and restrictions

### Resource Details
For each resource (file/URL):
- **Format**: File format (CSV, JSON, NetCDF, HDF5, etc.)
- **URL**: Download link
- **Description**: Resource-specific description
- **Size**: File size (if available)
- **Created/Modified**: Timestamps

### Additional Information
- **Author/Maintainer**: Contact information
- **Temporal Coverage**: Date ranges
- **Spatial Coverage**: Geographic extent
- **Version**: Dataset version
- **Related Datasets**: Links to related data
- **Processing Info**: Data processing details

## When to Use

1. **After Search**: Follow up on interesting datasets from search results
2. **Before Download**: Verify dataset contents and formats
3. **Quality Review**: Check metadata completeness for curation
4. **Citation Info**: Get complete information for proper attribution
5. **Resource Selection**: Choose specific files/formats from a dataset
6. **Metadata Validation**: Assess dataset documentation quality

## Workflow Integration

1. **Search First**: Use `/ndp-search` to find datasets
2. **Get IDs**: Note dataset IDs or names from the search results
3. **Retrieve Details**: Use this command for complete information
4. **Download**: Use the resource URLs from the details for data access

## Example Interactions

### Example 1: Complete Dataset Review
```
User: "Get complete information for dataset ID 'abc123-climate'"
Claude uses: get_dataset_details(dataset_identifier="abc123-climate")
Result: Full metadata, all resources, download URLs, temporal/spatial coverage
```

### Example 2: Resource Exploration
```
User: "What files are included in the NOAA temperature dataset?"
Claude uses:
1. search_datasets(owner_org="NOAA", search_terms=["temperature"])
2. get_dataset_details(dataset_identifier="<id_from_search>")
Result: List of all resources with formats and descriptions
```

### Example 3: Metadata Quality Check
```
User: "Review the documentation for this oceanographic dataset"
Claude uses: get_dataset_details(dataset_identifier="<provided_id>")
Analysis: Evaluates description, tags, resource documentation, contact info
```

### Example 4: Multi-Dataset Comparison
```
User: "Compare the resources available in these three datasets"
Claude uses: get_dataset_details() for each dataset
Result: Side-by-side comparison of formats, sizes, documentation
```
|
||||||
|
```
|
||||||
|
|
||||||
|
## Tips
|
||||||
|
|
||||||
|
- **Use IDs when available**: More reliable than names
|
||||||
|
- **Check both servers**: Same dataset name might exist on multiple servers
|
||||||
|
- **Review all resources**: Datasets often have multiple files/formats
|
||||||
|
- **Note download URLs**: Save resource URLs for data access
|
||||||
|
- **Check temporal coverage**: Ensure data covers your time period of interest
|
||||||
|
- **Verify formats**: Confirm file formats are compatible with your tools
|
||||||
|
- **Read descriptions carefully**: Important processing details often in descriptions
|
||||||
110
commands/ndp-organizations.md
Normal file
@@ -0,0 +1,110 @@
---
description: List and filter organizations in the National Data Platform
---

# NDP Organizations

List all organizations contributing data to the National Data Platform.

This command provides access to organization discovery functionality through the NDP MCP.

## Available MCP Tool

### `list_organizations`

Lists all organizations in NDP with optional filtering:

**Parameters**:

- **name_filter** (optional): Filter organizations by name substring match
  - Case-insensitive partial matching
  - Example: "climate" matches "Climate Research Center", "NOAA Climate Lab"
- **server** (optional): Server to query
  - `'global'` (default) - Public global NDP server
  - `'local'` - Local/institutional NDP server
  - `'pre_ckan'` - Pre-production server

**Returns**: List of organization names and metadata including:

- Total count of organizations
- Organization names matching filter
- Server queried
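The matching rule described above can be sketched in a few lines of Python. `filter_organizations` is a hypothetical illustration of the filter semantics (case-insensitive substring match), not part of the NDP API:

```python
# Hypothetical sketch of name_filter semantics: case-insensitive
# substring match against organization names (not the real NDP API).
def filter_organizations(names, name_filter=None):
    """Return names containing name_filter; all names when no filter is given."""
    if not name_filter:
        return list(names)
    needle = name_filter.lower()
    return [n for n in names if needle in n.lower()]

orgs = ["Climate Research Center", "NOAA Climate Lab", "Ocean Research Lab"]
print(filter_organizations(orgs, "climate"))
# → ['Climate Research Center', 'NOAA Climate Lab']
```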

## Usage Patterns

### Discover All Organizations

```
"List all organizations in the National Data Platform"
```

Uses: `list_organizations()` - No filter, returns all organizations

### Filter by Keyword

```
"Show me all organizations with 'climate' in their name"
```

Uses: `list_organizations(name_filter="climate")`

### Multi-Server Query

```
"Compare organizations on global and local servers"
```

Uses: `list_organizations(server="global")` and `list_organizations(server="local")`

### Research-Specific Discovery

```
"Find organizations related to oceanographic research"
```

Uses: `list_organizations(name_filter="ocean")` and `list_organizations(name_filter="marine")`

## Why Use This Command

1. **Verify Organization Names**: Get exact names before using in dataset searches
2. **Explore Data Sources**: Understand what organizations contribute to NDP
3. **Guide Searches**: Identify relevant organizations for your research domain
4. **Server Comparison**: See organizational differences between servers
5. **Data Coverage**: Understand breadth of data providers

## Workflow Integration

1. **Start Here**: Use this command before searching datasets
2. **Identify Providers**: Find organizations relevant to your research
3. **Use in Search**: Pass organization names to `search_datasets`
4. **Iterate**: Refine organization filters as needed

## Example Interactions

### Example 1: General Exploration

```
User: "List all organizations available on the local NDP server"
Claude uses: list_organizations(server="local")
Result: Complete list of local organizations with count
```

### Example 2: Targeted Discovery

```
User: "Find organizations related to satellite data"
Claude uses: list_organizations(name_filter="satellite")
Result: Organizations with "satellite" in their name
```

### Example 3: Multi-Keyword Search

```
User: "Show me organizations working on Earth observation"
Claude uses:
- list_organizations(name_filter="earth")
- list_organizations(name_filter="observation")
Result: Combined results from both searches
```

### Example 4: Before Dataset Search

```
User: "I want to search for NOAA climate data"
Claude uses: list_organizations(name_filter="noaa")
Result: Exact NOAA organization name(s)
Then: Can proceed with search_datasets(owner_org="<verified_name>")
```

## Tips

- **Use partial names**: "ocean" will match "Oceanographic Institute", "Ocean Research Lab", etc.
- **Try variations**: Search both "climate" and "atmospheric" to find all relevant organizations
- **Check both servers**: Global and local may have different organizations
- **Verify before searching**: Always confirm organization name before using in dataset searches
- **Multiple keywords**: Try related terms to discover all relevant providers
89
commands/ndp-search.md
Normal file
@@ -0,0 +1,89 @@
---
description: Search for datasets in the National Data Platform
---

# NDP Dataset Search

Search for datasets across the National Data Platform ecosystem with advanced filtering options.

This command provides access to the NDP MCP tools for dataset discovery and exploration.

## Available MCP Tools

When you use this command, Claude can invoke these MCP tools:

### `search_datasets` - Primary search tool

Searches for datasets using various criteria:

- **search_terms**: List of terms to search across all fields
- **owner_org**: Filter by organization name
- **resource_format**: Filter by format (CSV, JSON, NetCDF, HDF5, GeoTIFF, etc.)
- **dataset_description**: Search in descriptions
- **server**: Query 'global' (default) or 'local' server
- **limit**: Maximum results (default: 20)

### `list_organizations` - Organization discovery

Lists available organizations:

- **name_filter**: Filter by name substring
- **server**: Query 'global' (default), 'local', or 'pre_ckan'

### `get_dataset_details` - Detailed information

Retrieves complete metadata for a specific dataset:

- **dataset_identifier**: Dataset ID or name from search results
- **identifier_type**: 'id' (default) or 'name'
- **server**: 'global' (default) or 'local'

## Recommended Workflow

1. **Discover Organizations**: Use `list_organizations` to find relevant data sources
2. **Search Datasets**: Use `search_datasets` with appropriate filters
3. **Review Results**: Claude will summarize matching datasets
4. **Get Details**: Use `get_dataset_details` for datasets of interest
5. **Refine Search**: Adjust filters based on results
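Steps 1-2 of this workflow can be sketched against an in-memory catalog. The function names below mirror the MCP tools but are hypothetical stand-ins with made-up sample data, not the real NDP client:

```python
# Hypothetical in-memory stand-ins for the MCP tools (assumed names/data).
CATALOG = {
    "NOAA": [{"id": "abc123", "title": "Sea Surface Temperature", "format": "NetCDF"}],
    "USGS": [{"id": "def456", "title": "Stream Gauge Readings", "format": "CSV"}],
}

def list_organizations(name_filter=None):
    """Step 1: verify the exact organization name before searching."""
    return [o for o in CATALOG if not name_filter or name_filter.lower() in o.lower()]

def search_datasets(owner_org, resource_format=None):
    """Step 2: search with organization and format filters applied."""
    hits = CATALOG.get(owner_org, [])
    return [d for d in hits if resource_format is None or d["format"] == resource_format]

org = list_organizations("noaa")[0]  # exact name comes back as "NOAA"
results = search_datasets(org, resource_format="NetCDF")
print([d["id"] for d in results])
# → ['abc123']
```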

## Best Practices

- **Always verify organization names** with `list_organizations` before using in search
- **Start broad, then refine**: Begin with simple terms, add filters as needed
- **Limit results appropriately**: Default 20 is good, increase if needed
- **Use format filters**: Narrow to specific formats (NetCDF, CSV, etc.) when relevant
- **Multi-server searches**: Query both global and local for comprehensive coverage

## Example Queries

### Basic Search

```
"Find climate datasets from NOAA"
```

Expected tools: `list_organizations(name_filter="noaa")`, then `search_datasets(owner_org="NOAA", search_terms=["climate"])`

### Format-Specific Search

```
"Search for oceanographic data in NetCDF format"
```

Expected tools: `search_datasets(search_terms=["oceanographic"], resource_format="NetCDF")`

### Organization-Based Search

```
"List all datasets from a specific research institution"
```

Expected tools: `list_organizations(name_filter="<institution>")`, then `search_datasets(owner_org="<name>")`

### Refined Search with Limit

```
"Find CSV datasets about temperature monitoring, limit to 10 results"
```

Expected tools: `search_datasets(search_terms=["temperature", "monitoring"], resource_format="CSV", limit=10)`

### Multi-Server Comparison

```
"Compare oceanographic datasets on global and local servers"
```

Expected tools: `search_datasets(server="global", ...)` and `search_datasets(server="local", ...)`

## Tips for Effective Searching

1. **Use specific terminology**: Scientific terms work better than generic ones
2. **Combine filters**: Organization + format + terms = precise results
3. **Check multiple formats**: Try CSV, NetCDF, HDF5 for scientific data
4. **Explore organizations first**: Understanding data providers helps target searches
5. **Request details selectively**: Full metadata for only the most relevant datasets
77
hooks/hooks.json
Normal file
@@ -0,0 +1,77 @@
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type UserPromptSubmit"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "ndp",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type PreToolUse"
          }
        ]
      },
      {
        "matcher": "*",
        "hooks": [
          {
            "type": "command",
            "command": "echo \"$(date +%s.%N),$(ps -o %cpu= -p $$),$(ps -o rss= -p $$),$CLAUDE_TOOL_NAME,start\" >> ~/.claude/performance.csv"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "ndp",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type PostToolUse"
          }
        ]
      },
      {
        "matcher": "*",
        "hooks": [
          {
            "type": "command",
            "command": "echo \"$(date +%s.%N),$(ps -o %cpu= -p $$),$(ps -o rss= -p $$),$CLAUDE_TOOL_NAME,end\" >> ~/.claude/performance.csv; if [[ $(wc -l < ~/.claude/performance.csv) -gt 1000 ]]; then tail -n 500 ~/.claude/performance.csv > ~/.claude/performance.csv.tmp && mv ~/.claude/performance.csv.tmp ~/.claude/performance.csv; fi"
          }
        ]
      }
    ],
    "SessionStart": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type SessionStart"
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type Stop"
          }
        ]
      }
    ]
  }
}
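The `*` matchers above append `timestamp,cpu%,rss,tool,phase` rows to `~/.claude/performance.csv`. A hedged sketch of pairing those rows into per-tool durations (the field layout is read off the echo commands; `tool_durations` is a hypothetical helper, not shipped with the plugin):

```python
# Sketch: pair start/end rows from performance.csv into elapsed seconds.
# Field layout (timestamp,cpu%,rss,tool,phase) comes from the echo hooks above.
def tool_durations(lines):
    starts, durations = {}, {}
    for line in lines:
        ts, _cpu, _rss, tool, phase = line.strip().split(",")
        if phase == "start":
            starts[tool] = float(ts)
        elif phase == "end" and tool in starts:
            durations[tool] = float(ts) - starts.pop(tool)
    return durations

sample = [
    "100.0,1.5,2048,ndp_search,start",
    "101.5,1.6,2050,ndp_search,end",
]
print(tool_durations(sample))
# → {'ndp_search': 1.5}
```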
159
hooks/log_ndp_events.py
Executable file
@@ -0,0 +1,159 @@
#!/usr/bin/env -S uv run --python 3.10 --script
# /// script
# requires-python = ">=3.10"
# ///
"""
NDP Plugin Event Logger
Logs Claude Code events related to NDP plugin usage to a local file.
Enhanced to capture tool names, user input, and agent responses.
"""

import json
import sys
import os
import argparse
from datetime import datetime
from pathlib import Path


def get_log_file_path():
    """Get the log file path within plugin directory"""
    # Get plugin root directory
    plugin_root = Path(__file__).parent.parent
    logs_dir = plugin_root / "logs"

    # Create logs directory if it doesn't exist
    logs_dir.mkdir(exist_ok=True)

    return logs_dir / "ndp_events.log"


def extract_enhanced_data(event_type: str, event_data: dict) -> dict:
    """Extract enhanced information from event data"""
    enhanced = {
        "timestamp": datetime.now().isoformat(),
        "event_type": event_type,
        "session_id": event_data.get("session_id", "unknown"),
    }

    # Extract tool information for PreToolUse and PostToolUse
    if event_type in ["PreToolUse", "PostToolUse"]:
        tool_data = event_data.get('tool', {})
        if tool_data:
            enhanced['tool_name'] = tool_data.get('name', 'unknown')
            enhanced['tool_input'] = tool_data.get('input', {})

        # For PostToolUse, capture tool results
        if event_type == "PostToolUse":
            if 'result' in event_data:
                enhanced['tool_result'] = event_data['result']
            if 'output' in event_data:
                enhanced['tool_output'] = event_data['output']
            if 'error' in event_data:
                enhanced['tool_error'] = event_data['error']

    # Extract user input for UserPromptSubmit
    if event_type == "UserPromptSubmit":
        if 'text' in event_data:
            enhanced['user_prompt'] = event_data['text']
        if 'messages' in event_data:
            enhanced['conversation_messages'] = event_data['messages']

    # For PostToolUse, extract agent response from transcript
    if event_type == "PostToolUse" and 'transcript_path' in event_data:
        transcript_path = event_data['transcript_path']
        if os.path.exists(transcript_path):
            try:
                # Read last few messages to capture recent agent response
                recent_chat = []
                with open(transcript_path, 'r') as f:
                    lines = f.readlines()
                    # Get last 5 messages to capture context
                    for line in lines[-5:]:
                        line = line.strip()
                        if line:
                            try:
                                msg = json.loads(line)
                                recent_chat.append(msg)
                            except json.JSONDecodeError:
                                pass

                enhanced['recent_chat'] = recent_chat

                # Extract the latest agent response
                for msg in reversed(recent_chat):
                    if msg.get('role') == 'assistant':
                        enhanced['latest_agent_response'] = msg.get('content', [])
                        break

            except Exception as e:
                enhanced['transcript_read_error'] = str(e)

    # For Stop event, optionally include full chat if requested
    if event_type == "Stop" and 'transcript_path' in event_data:
        transcript_path = event_data['transcript_path']
        if os.path.exists(transcript_path):
            try:
                chat_data = []
                with open(transcript_path, 'r') as f:
                    for line in f:
                        line = line.strip()
                        if line:
                            try:
                                chat_data.append(json.loads(line))
                            except json.JSONDecodeError:
                                pass

                # Add summary statistics
                enhanced['chat_summary'] = {
                    'total_messages': len(chat_data),
                    'user_messages': sum(1 for msg in chat_data if msg.get('role') == 'user'),
                    'assistant_messages': sum(1 for msg in chat_data if msg.get('role') == 'assistant'),
                }
                # Optionally include last few messages
                enhanced['last_5_messages'] = chat_data[-5:] if chat_data else []

            except Exception as e:
                enhanced['chat_read_error'] = str(e)

    # Include raw event data for completeness
    enhanced['raw_data'] = event_data

    return enhanced


def log_event(event_type: str, event_data: dict):
    """Log event to file with enhanced data extraction"""
    try:
        log_file = get_log_file_path()

        # Prepare enhanced log entry
        log_entry = extract_enhanced_data(event_type, event_data)

        # Append to log file (one JSON object per line)
        with open(log_file, "a") as f:
            f.write(json.dumps(log_entry) + "\n")

        return True

    except Exception as e:
        # Fail silently to not block Claude Code
        print(f"Warning: Failed to log event: {e}", file=sys.stderr)
        return False


def main():
    parser = argparse.ArgumentParser(description='Log NDP plugin events with enhanced data capture')
    parser.add_argument('--event-type', required=True, help='Type of event')
    args = parser.parse_args()

    try:
        # Read event data from stdin
        event_data = json.load(sys.stdin)
    except json.JSONDecodeError:
        event_data = {}

    # Log the event with enhanced data
    log_event(args.event_type, event_data)

    # Always exit successfully to not block Claude Code
    sys.exit(0)


if __name__ == '__main__':
    main()
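Since the logger writes one JSON object per line, the resulting `ndp_events.log` can be summarized with a few lines of Python. This is a hedged sketch assuming the entry shape produced by `extract_enhanced_data`; `count_events` is a hypothetical helper, not part of the plugin:

```python
import json
from collections import Counter

# Sketch: count logged events by type from JSONL lines as written by log_event.
def count_events(jsonl_lines):
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip corrupt lines, mirroring the logger's tolerance
        counts[entry.get("event_type", "unknown")] += 1
    return dict(counts)

sample = [
    '{"event_type": "PreToolUse", "session_id": "s1"}',
    '{"event_type": "PostToolUse", "session_id": "s1"}',
    '{"event_type": "PreToolUse", "session_id": "s2"}',
]
print(count_events(sample))
# → {'PreToolUse': 2, 'PostToolUse': 1}
```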
69
plugin.lock.json
Normal file
@@ -0,0 +1,69 @@
{
  "$schema": "internal://schemas/plugin.lock.v1.json",
  "pluginId": "gh:SIslamMun/iowarp-plugin:ndp-plugin",
  "normalized": {
    "repo": null,
    "ref": "refs/tags/v20251128.0",
    "commit": "8fc6a0e4bdb652d7f29cb6ccd20d1a937260e394",
    "treeHash": "1e28125943e9edc7d798abae2f5b4311368a860bb39396c14f0a4f6a82ade6de",
    "generatedAt": "2025-11-28T10:12:43.157888Z",
    "toolVersion": "publish_plugins.py@0.2.0"
  },
  "origin": {
    "remote": "git@github.com:zhongweili/42plugin-data.git",
    "branch": "master",
    "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
    "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
  },
  "manifest": {
    "name": "ndp-plugin",
    "description": "National Data Platform (NDP) integration plugin with dataset search, discovery, and workflow automation",
    "version": "1.0.0"
  },
  "content": {
    "files": [
      {
        "path": "README.md",
        "sha256": "7a4168ad797d1f80a4b4380b374cfee7ea463ae21e3e894d96cc2fb3ce8f9522"
      },
      {
        "path": "agents/ndp-dataset-curator.md",
        "sha256": "80537e47871ff2af4efcec669b72532bc9a79b31574e6a3021eeec8deb6d16d0"
      },
      {
        "path": "agents/ndp-data-scientist.md",
        "sha256": "93c78b552db86ad8fa28fd9f1301d999ee925dda064dc6f0b7b85a697f007ac5"
      },
      {
        "path": "hooks/hooks.json",
        "sha256": "330b9d07eb8a2a01671ac7c68320e3400ec7a890202ffd30741069f0acb94e83"
      },
      {
        "path": "hooks/log_ndp_events.py",
        "sha256": "35c11a3727b98c423e7644083b7d57b8adaf855f747d13fdb687cd59cb96de24"
      },
      {
        "path": ".claude-plugin/plugin.json",
        "sha256": "9ed40f25eeffd93581d259506be104669cdbc316bb0e34414ff5c391bcbaaaf3"
      },
      {
        "path": "commands/ndp-organizations.md",
        "sha256": "8453847b408366cebcc933ea9d16d6121aaa9ba6e6c57e557e52502e0ec636ce"
      },
      {
        "path": "commands/ndp-dataset-details.md",
        "sha256": "b8ec4903d08ed8cbd61b16ac66a8c4daf5caf5dad37b5eb9f62e45ac04136531"
      },
      {
        "path": "commands/ndp-search.md",
        "sha256": "07061ee414c1dbb8d354c9ab4fd2248cdbdbd5955f0196a4ce4ef012f38de610"
      }
    ],
    "dirSha256": "1e28125943e9edc7d798abae2f5b4311368a860bb39396c14f0a4f6a82ade6de"
  },
  "security": {
    "scannedAt": null,
    "scannerVersion": null,
    "flags": []
  }
}