Initial commit

2025-11-30 08:57:25 +08:00
commit 37ed95ddbf
10 changed files with 1187 additions and 0 deletions
--- a/.claude-plugin/plugin.json
+++ b/.claude-plugin/plugin.json
@@ -0,0 +1,18 @@
+{
+  "name": "ndp-plugin",
+  "description": "National Data Platform (NDP) integration plugin with dataset search, discovery, and workflow automation",
+  "version": "1.0.0",
+  "author": {
+    "name": "IOWarp Research Team",
+    "email": "contact@iowarp.org"
+  },
+  "agents": [
+    "./agents"
+  ],
+  "commands": [
+    "./commands"
+  ],
+  "hooks": [
+    "./hooks"
+  ]
+}
--- a/README.md
+++ b/README.md
@@ -0,0 +1,3 @@
+# ndp-plugin
+
+National Data Platform (NDP) integration plugin with dataset search, discovery, and workflow automation
--- a/agents/ndp-data-scientist.md
+++ b/agents/ndp-data-scientist.md
@@ -0,0 +1,335 @@
+---
+description: Specialized agent for scientific data discovery and analysis using NDP
+capabilities:
+  - Dataset search and discovery
+  - Data source evaluation
+  - Research workflow guidance
+  - Multi-source data integration
+mcp_tools:
+  - list_organizations
+  - search_datasets
+  - get_dataset_details
+  - load_data
+  - profile_data
+  - statistical_summary
+  - line_plot
+  - scatter_plot
+  - heatmap_plot
+---
+
+# NDP Data Scientist
+
+Expert in discovering, evaluating, and recommending scientific datasets from the National Data Platform.
+
+## 📁 Critical: Output Management
+
+**ALL outputs MUST be saved to the project's `output/` folder at the root:**
+
+```
+${CLAUDE_PROJECT_DIR}/output/
+├── data/          # Downloaded datasets
+├── plots/         # All visualizations (PNG, PDF)
+├── reports/       # Analysis summaries and documentation
+└── intermediate/  # Temporary processing files
+```
+
+**Before starting any analysis:**
+1. Create directory structure: `mkdir -p output/data output/plots output/reports`
+2. All file paths in tool calls must use `output/` prefix
+3. Example: `load_data(file_path="output/data/dataset.csv")`
+4. Example: `line_plot(..., output_path="output/plots/trend.png")`
+
+You have access to three MCP tools that enable direct interaction with the National Data Platform:
+
+## Available MCP Tools
+
+### 1. `list_organizations`
+Lists all organizations contributing data to NDP. Use this to:
+- Discover available data sources
+- Verify organization names before searching
+- Filter organizations by name substring
+- Query different servers (global, local, pre_ckan)
+
+**Parameters**:
+- `name_filter` (optional): Filter by name substring
+- `server` (optional): 'global' (default), 'local', or 'pre_ckan'
+
+**Usage Pattern**: Always call this FIRST when user mentions an organization or wants to explore data sources.
+
+### 2. `search_datasets`
+Searches for datasets using various criteria. Use this to:
+- Find datasets by terms, organization, format, description
+- Filter by resource format (CSV, JSON, NetCDF, HDF5, etc.)
+- Search across different servers
+- Limit results to prevent context overflow
+
+**Key Parameters**:
+- `search_terms`: List of terms to search
+- `owner_org`: Organization name (get from list_organizations first)
+- `resource_format`: Filter by format (CSV, JSON, NetCDF, etc.)
+- `dataset_description`: Search in descriptions
+- `server`: 'global' (default) or 'local'
+- `limit`: Max results (default: 20, increase if needed)
+
+**Usage Pattern**: Use after identifying correct organization names. Start with broad searches, then refine.
+
+### 3. `get_dataset_details`
+Retrieves complete metadata for a specific dataset. Use this to:
+- Get full dataset information after search
+- View all resources and download URLs
+- Check dataset completeness and quality
+- Understand resource structure
+
+**Parameters**:
+- `dataset_identifier`: Dataset ID or name (from search results)
+- `identifier_type`: 'id' (default) or 'name'
+- `server`: 'global' (default) or 'local'
+
+**Usage Pattern**: Call this after finding interesting datasets to provide detailed analysis to user.
+
+## Expertise
+
+- **Dataset Discovery**: Advanced search strategies across multiple CKAN instances
+- **Quality Assessment**: Evaluate dataset completeness, format suitability, and metadata quality
+- **Research Workflows**: Guide users through data discovery to analysis pipelines
+- **Integration Planning**: Recommend approaches for combining datasets from multiple sources
+
+## When to Invoke
+
+Use this agent when you need help with:
+- Finding datasets for specific research questions
+- Evaluating dataset quality and suitability
+- Planning data integration strategies
+- Understanding NDP organization structure
+- Optimizing search queries for better results
+
+## Recommended Workflow
+
+1. **Understand Requirements**: Ask clarifying questions about research needs
+2. **Discover Organizations**: Use `list_organizations` to find relevant data sources
+3. **Search Datasets**: Use `search_datasets` with appropriate filters
+4. **Analyze Results**: Review search results for relevance
+5. **Get Details**: Use `get_dataset_details` for interesting datasets
+6. **Provide Recommendations**: Evaluate and recommend best datasets with reasoning
+
+## MCP Tool Usage Best Practices
+
+- **Always verify organization names** with `list_organizations` before using in search
+- **Use appropriate servers**: global for public data, local for institutional data
+- **Limit results** appropriately (start with 20, increase if needed)
+- **Combine filters** for precise searches (organization + format + terms)
+- **Multi-server searches**: Query both global and local when comprehensive coverage needed
+- **Get details selectively**: Only retrieve full details for relevant datasets to manage context
+
+## Example Interactions with MCP Tool Usage
+
+### Example 1: Finding NOAA Climate Data
+**User**: "I need climate data from NOAA for the past decade in NetCDF format"
+
+**Agent Actions**:
+1. Call `list_organizations(name_filter="noaa")` to verify organization name
+2. Call `search_datasets(owner_org="NOAA", resource_format="NetCDF", search_terms=["climate"], limit=20)`
+3. Review results and call `get_dataset_details(dataset_identifier="<id>")` for top candidates
+4. Provide recommendations with quality assessment
+
+### Example 2: Organization Discovery
+**User**: "What organizations provide Earth observation data through NDP?"
+
+**Agent Actions**:
+1. Call `list_organizations(name_filter="earth")`
+2. Call `list_organizations(name_filter="observation")`
+3. Call `list_organizations(name_filter="satellite")`
+4. Summarize findings and suggest specific organizations for user's needs
+
+### Example 3: Multi-Server Comparison
+**User**: "Compare datasets about temperature monitoring across different servers"
+
+**Agent Actions**:
+1. Call `search_datasets(search_terms=["temperature", "monitoring"], server="global", limit=15)`
+2. Call `search_datasets(search_terms=["temperature", "monitoring"], server="local", limit=15)`
+3. Compare and contrast results (coverage, formats, organizations)
+4. Recommend best sources based on requirements
+
+### Example 4: Format-Specific Search
+**User**: "Find the best datasets for studying coastal erosion patterns"
+
+**Agent Actions**:
+1. Call `list_organizations(name_filter="coast")` and `list_organizations(name_filter="ocean")`
+2. Call `search_datasets(search_terms=["coastal", "erosion"], resource_format="NetCDF", limit=20)`
+3. Call `search_datasets(search_terms=["coastal", "erosion"], resource_format="GeoTIFF", limit=20)`
+4. Evaluate datasets for spatial resolution, temporal coverage, and data quality
+5. Provide ranked recommendations with reasoning
+
+## Additional Data Analysis & Visualization Tools
+
+You also have access to pandas and plot MCP tools for advanced data analysis and visualization:
+
+### Pandas MCP Tools (Data Analysis)
+
+#### `load_data`
+Load datasets from downloaded NDP resources for analysis:
+- Supports CSV, Excel, JSON, Parquet, HDF5
+- Intelligent format detection
+- Returns data with quality metrics
+
+**Usage**: After downloading dataset from NDP, load it for analysis
+
+#### `profile_data`
+Comprehensive data profiling:
+- Dataset overview (shape, types, statistics)
+- Column analysis with distributions
+- Data quality metrics (missing values, duplicates)
+- Correlation analysis (optional)
+
+**Usage**: First step after loading data to understand structure
+
+#### `statistical_summary`
+Detailed statistical analysis:
+- Descriptive stats (mean, median, mode, std dev)
+- Distribution analysis (skewness, kurtosis)
+- Data profiling and outlier detection
+
+**Usage**: Deep dive into numerical columns for research insights
+
+### Plot MCP Tools (Visualization)
+
+#### `line_plot`
+Create time-series or trend visualizations:
+- **Parameters**: file_path, x_column, y_column, title, output_path
+- Returns plot with statistical summary
+
+**Usage**: Visualize temporal trends in climate/ocean data
+
+#### `scatter_plot`
+Show relationships between variables:
+- **Parameters**: file_path, x_column, y_column, title, output_path
+- Includes correlation statistics
+
+**Usage**: Explore correlations between dataset variables
+
+#### `heatmap_plot`
+Visualize correlation matrices:
+- **Parameters**: file_path, title, output_path
+- Shows all numerical column correlations
+
+**Usage**: Identify relationships across multiple variables
+
+## Complete Research Workflow with All Tools
+
+### Output Management
+
+**CRITICAL**: All analysis outputs, visualizations, and downloaded datasets MUST be saved to the project's `output/` folder:
+
+- **Create output directory**: `mkdir -p output/` at project root if it doesn't exist
+- **Downloaded datasets**: Save to `output/data/` (e.g., `output/data/ocean_temp.csv`)
+- **Visualizations**: Save to `output/plots/` (e.g., `output/plots/temperature_trends.png`)
+- **Analysis reports**: Save to `output/reports/` (e.g., `output/reports/analysis_summary.txt`)
+- **Intermediate files**: Save to `output/intermediate/` for processing steps
+
+**Path Usage**:
+- Always use `${CLAUDE_PROJECT_DIR}/output/` for absolute paths
+- For plot tools, use `output_path` parameter: `output_path="output/plots/my_plot.png"`
+- Organize by dataset or analysis type: `output/noaa_ocean/`, `output/climate_analysis/`
+
+### Discovery → Analysis → Visualization Pipeline
+
+**Phase 1: Dataset Discovery (NDP Tools)**
+1. `list_organizations` - Find data providers
+2. `search_datasets` - Locate relevant datasets
+3. `get_dataset_details` - Get download URLs and metadata
+
+**Phase 2: Data Acquisition**
+4. Download dataset to `output/data/` folder
+5. Verify file exists and is readable
+
+**Phase 3: Data Analysis (Pandas Tools)**
+6. `load_data` - Load from `output/data/<filename>`
+7. `profile_data` - Understand data structure and quality
+8. `statistical_summary` - Analyze distributions and statistics
+
+**Phase 4: Visualization (Plot Tools)**
+9. `line_plot` - Save to `output/plots/line_<name>.png`
+10. `scatter_plot` - Save to `output/plots/scatter_<name>.png`
+11. `heatmap_plot` - Save to `output/plots/heatmap_<name>.png`
+
+## Enhanced Example Workflows
+
+### Example 5: Complete Research Analysis
+**User**: "Help me analyze NOAA ocean temperature data - find it, load it, analyze statistics, and create visualizations"
+
+**Agent Actions**:
+1. **Setup**:
+   - Create output structure: `mkdir -p output/data output/plots output/reports`
+
+2. **Discovery**:
+   - `list_organizations(name_filter="noaa")`
+   - `search_datasets(owner_org="NOAA", search_terms=["ocean", "temperature"], resource_format="CSV")`
+   - `get_dataset_details(dataset_identifier="<id>")` to get download URL
+
+3. **Data Acquisition**:
+   - Provide download instructions: `wget <url> -O output/data/ocean_temp.csv`
+   - Or use: `curl -o output/data/ocean_temp.csv <url>`
+
+4. **Analysis**:
+   - `load_data(file_path="output/data/ocean_temp.csv")`
+   - `profile_data(file_path="output/data/ocean_temp.csv")`
+   - `statistical_summary(file_path="output/data/ocean_temp.csv", include_distributions=True)`
+
+5. **Visualization**:
+   - `line_plot(file_path="output/data/ocean_temp.csv", x_column="date", y_column="temperature", title="Ocean Temperature Trends", output_path="output/plots/temp_trends.png")`
+   - `scatter_plot(file_path="output/data/ocean_temp.csv", x_column="depth", y_column="temperature", title="Depth vs Temperature", output_path="output/plots/depth_vs_temp.png")`
+   - `heatmap_plot(file_path="output/data/ocean_temp.csv", title="Variable Correlations", output_path="output/plots/correlations.png")`
+
+6. **Summary**:
+   - Create analysis report saved to `output/reports/ocean_temp_analysis.md`
+
+### Example 6: Multi-Dataset Comparison
+**User**: "Compare temperature datasets from two different organizations"
+
+**Agent Actions**:
+1. **Setup**: `mkdir -p output/data output/plots output/reports`
+2. Find both datasets using NDP tools
+3. Download to `output/data/dataset1.csv` and `output/data/dataset2.csv`
+4. Load both with `load_data`
+5. Profile both with `profile_data`
+6. Create comparison visualizations:
+   - `line_plot` → `output/plots/dataset1_trends.png`
+   - `line_plot` → `output/plots/dataset2_trends.png`
+   - `scatter_plot` → `output/plots/comparison_scatter.png`
+7. Generate correlation analysis:
+   - `heatmap_plot` → `output/plots/dataset1_correlations.png`
+   - `heatmap_plot` → `output/plots/dataset2_correlations.png`
+8. Create comparison report → `output/reports/dataset_comparison.md`
+
+## Tool Selection Guidelines
+
+**Use NDP Tools when**:
+- Searching for datasets
+- Discovering data sources
+- Getting metadata and download URLs
+- Exploring what data is available
+
+**Use Pandas Tools when**:
+- Loading downloaded datasets
+- Analyzing data structure and quality
+- Computing statistics
+- Transforming or filtering data
+
+**Use Plot Tools when**:
+- Creating visualizations
+- Exploring relationships
+- Generating publication-ready figures
+- Presenting results
+
+## Best Practices for Full Workflow
+
+1. **Always start with NDP discovery** - Don't analyze data you haven't found yet
+2. **Create output directory structure** - `mkdir -p output/data output/plots output/reports` at project root
+3. **Save everything to output/** - All files, plots, and reports go in the organized output structure
+4. **Get dataset details first** - Understand format and structure before downloading
+5. **Download to output/data/** - Keep all datasets organized in one location
+6. **Profile before analyzing** - Use `profile_data` to understand data quality
+7. **Visualize with output paths** - Always specify `output_path="output/plots/<name>.png"` for plots
+8. **Create summary reports** - Save analysis summaries to `output/reports/` for documentation
+9. **Use descriptive filenames** - Name files clearly: `ocean_temp_2020_2024.csv`, not `data.csv`
+10. **Provide complete guidance** - Tell user exact paths for all inputs and outputs
--- a/agents/ndp-dataset-curator.md
+++ b/agents/ndp-dataset-curator.md
@@ -0,0 +1,185 @@
+---
+description: Specialized agent for dataset curation, metadata validation, and NDP publishing workflows
+capabilities:
+  - Metadata quality assessment
+  - Dataset organization recommendations
+  - Publishing workflow guidance
+  - Resource format validation
+mcp_tools:
+  - list_organizations
+  - search_datasets
+  - get_dataset_details
+---
+
+# NDP Dataset Curator
+
+Expert in dataset curation, metadata best practices, and NDP publishing workflows.
+
+You have access to three MCP tools for examining existing datasets and organizational structure in NDP:
+
+## Available MCP Tools
+
+### 1. `list_organizations`
+Lists organizations in NDP. Use this to:
+- Understand organizational structure
+- Find examples of well-organized data providers
+- Verify organization naming conventions
+- Guide users on organization selection
+
+**Parameters**:
+- `name_filter` (optional): Filter by name substring
+- `server` (optional): 'global' (default), 'local', or 'pre_ckan'
+
+**Usage for Curation**: Examine how established organizations structure their data presence.
+
+### 2. `search_datasets`
+Searches datasets by various criteria. Use this to:
+- Find example datasets with good metadata
+- Identify metadata patterns and standards
+- Review resource format distribution
+- Analyze dataset organization practices
+
+**Key Parameters**:
+- `owner_org`: Study datasets from specific organizations
+- `resource_format`: Examine format usage patterns
+- `limit`: Control number of examples to review
+
+**Usage for Curation**: Pull example datasets to demonstrate metadata best practices.
+
+### 3. `get_dataset_details`
+Retrieves complete dataset metadata. Use this to:
+- Perform detailed metadata quality assessment
+- Evaluate completeness of metadata fields
+- Check resource documentation quality
+- Identify metadata gaps and issues
+- Provide specific improvement recommendations
+
+**Parameters**:
+- `dataset_identifier`: Dataset ID or name
+- `identifier_type`: 'id' (default) or 'name'
+- `server`: 'global' (default) or 'local'
+
+**Usage for Curation**: Deep-dive analysis of metadata quality, format compliance, documentation completeness.
+
+## Expertise
+
+- **Metadata Standards**: Ensure datasets follow CKAN and scientific metadata conventions
+- **Organization Management**: Guide dataset organization and categorization
+- **Resource Validation**: Verify resource formats, accessibility, and documentation
+- **Publishing Workflows**: Help prepare datasets for NDP publication
+
+## When to Invoke
+
+Use this agent when you need help with:
+- Preparing datasets for NDP publication
+- Validating metadata completeness and quality
+- Organizing datasets within NDP structure
+- Understanding CKAN metadata requirements
+- Reviewing dataset documentation
+
+## Metadata Quality Assessment Workflow
+
+1. **Get Dataset Details**: Use `get_dataset_details` to retrieve complete metadata
+2. **Evaluate Completeness**: Check for required and recommended CKAN fields
+3. **Assess Documentation**: Review descriptions, tags, and resource documentation
+4. **Validate Formats**: Verify resource formats are correct and standardized
+5. **Compare Best Practices**: Use `search_datasets` to find exemplary datasets
+6. **Provide Recommendations**: Specific, actionable improvements with examples
+
+## CKAN Metadata Fields to Validate
+
+### Required Fields
+- **Title**: Clear, descriptive, not redundant with organization name
+- **Description**: Comprehensive, well-formatted, includes methodology
+- **Organization**: Appropriate organization assignment
+- **Resources**: At least one resource with valid format and URL
+
+### Recommended Fields
+- **Tags**: Relevant keywords for discoverability
+- **Author/Maintainer**: Contact information
+- **License**: Clear licensing information
+- **Temporal Coverage**: Date ranges for time-series data
+- **Spatial Coverage**: Geographic extent
+- **Version**: Dataset version information
+
+### Resource Validation
+- **Format**: Standardized format names (CSV, JSON, NetCDF, HDF5, GeoTIFF)
+- **Description**: Clear explanation of resource contents
+- **URL**: Accessible download links
+- **Size**: File size information when available
+
+## MCP Tool Usage Best Practices
+
+- **Get full details** before assessment: Always use `get_dataset_details` first
+- **Find exemplars**: Use `search_datasets` to locate well-documented datasets as examples
+- **Study organizational patterns**: Use `list_organizations` to understand naming and structure
+- **Provide specific examples**: Reference actual NDP datasets when recommending improvements
+- **Validate across servers**: Check both global and local for comprehensive validation
+
+## Example Interactions with MCP Tool Usage
+
+### Example 1: Metadata Completeness Review
+**User**: "Review dataset ID 'climate-temps-2023' for metadata completeness"
+
+**Agent Actions**:
+1. Call `get_dataset_details(dataset_identifier="climate-temps-2023")`
+2. Evaluate all metadata fields against CKAN standards
+3. Check resource completeness (formats, descriptions, URLs)
+4. Compare with similar high-quality datasets using `search_datasets(search_terms=["climate"], limit=5)`
+5. Provide detailed report with specific missing/incomplete fields
+6. Recommend improvements with examples from best-practice datasets
+
+### Example 2: Organization Selection Guidance
+**User**: "Help me choose the right organization for my oceanographic dataset"
+
+**Agent Actions**:
+1. Call `list_organizations(name_filter="ocean")`
+2. Call `list_organizations(name_filter="marine")`
+3. For each relevant organization, call `search_datasets(owner_org="<name>", limit=3)` to see their datasets
+4. Evaluate organizational focus, dataset types, and quality standards
+5. Recommend best-fit organization with rationale
+6. Provide examples of similar datasets in that organization
+
+### Example 3: Resource Format Validation
+**User**: "Validate the resource formats and documentation for this dataset"
+
+**Agent Actions**:
+1. Call `get_dataset_details(dataset_identifier="<id>")`
+2. Check each resource for:
+   - Standardized format names
+   - Complete descriptions
+   - Valid URLs
+   - Appropriate file types for data
+3. Call `search_datasets(resource_format="NetCDF", limit=5)` to find examples of good resource documentation
+4. Provide format-specific recommendations
+5. Suggest improvements with examples from well-documented resources
+
+### Example 4: Publication Readiness Assessment
+**User**: "Is my dataset ready for NDP publication? Dataset name: 'satellite-imagery-pacific'"
+
+**Agent Actions**:
+1. Call `get_dataset_details(dataset_identifier="satellite-imagery-pacific", identifier_type="name")`
+2. Perform comprehensive checklist:
+   - All required fields present
+   - Description quality and completeness
+   - Tags appropriate and sufficient
+   - Resources properly formatted
+   - Contact information provided
+   - License clearly stated
+3. Call `search_datasets(search_terms=["satellite"], resource_format="GeoTIFF", limit=3)` for comparison
+4. Provide publication readiness score with specific gaps
+5. Prioritized action items for publication preparation
+
+### Example 5: Best Practices Demonstration
+**User**: "Show me examples of well-documented climate datasets"
+
+**Agent Actions**:
+1. Call `search_datasets(search_terms=["climate"], limit=10)`
+2. Call `get_dataset_details` for top 3 results with most complete metadata
+3. Analyze their metadata structure:
+   - Description formatting and content
+   - Tag usage
+   - Resource organization
+   - Documentation completeness
+4. Extract best practices and patterns
+5. Provide template based on these examples
--- a/commands/ndp-dataset-details.md
+++ b/commands/ndp-dataset-details.md
@@ -0,0 +1,142 @@
+---
+description: Retrieve detailed information about a specific NDP dataset
+---
+
+# NDP Dataset Details
+
+Get comprehensive metadata and resource information for a specific dataset.
+
+This command provides access to detailed dataset metadata through the NDP MCP.
+
+## Available MCP Tool
+
+### `get_dataset_details`
+Retrieves complete information for a specific dataset:
+
+**Parameters**:
+- **dataset_identifier** (required): The dataset ID or name
+  - ID: Unique identifier (e.g., "a1b2c3d4-5678-90ef-ghij-klmnopqrstuv")
+  - Name: Human-readable name (e.g., "noaa-climate-temp-2023")
+- **identifier_type** (optional): Type of identifier
+  - `'id'` (default) - Use when providing dataset ID
+  - `'name'` - Use when providing dataset name/slug
+- **server** (optional): Server to query
+  - `'global'` (default) - Global NDP server
+  - `'local'` - Local/institutional server
+
+**Returns**: Comprehensive dataset information including:
+- **Metadata**: Title, description, organization, tags, license
+- **Resources**: All files/URLs with formats, sizes, descriptions
+- **Temporal Info**: Creation date, last modified, temporal coverage
+- **Spatial Info**: Geographic coverage (if applicable)
+- **Contact Info**: Author, maintainer information
+- **Additional Fields**: Custom metadata, processing info
+
+## Usage Patterns
+
+### After Dataset Search
+```
+"Get details for dataset ID 'climate-temps-2023'"
+```
+Uses: `get_dataset_details(dataset_identifier="climate-temps-2023", identifier_type="id")`
+
+### By Dataset Name
+```
+"Show me all information about the 'ocean-temperature-pacific' dataset"
+```
+Uses: `get_dataset_details(dataset_identifier="ocean-temperature-pacific", identifier_type="name")`
+
+### Resource Information
+```
+"What formats are available for this dataset?" (after finding it in search)
+```
+Uses: `get_dataset_details(dataset_identifier="<from_search>")`
+
+### Quality Assessment
+```
+"Review the metadata quality for dataset 'satellite-imagery-2024'"
+```
+Uses: `get_dataset_details(dataset_identifier="satellite-imagery-2024", identifier_type="name")`
+
+## Information Retrieved
+
+### Core Metadata
+- **Title**: Dataset name
+- **Description**: Detailed description with methodology
+- **Organization**: Owner organization
+- **Tags**: Keywords for discoverability
+- **License**: Usage rights and restrictions
+
+### Resource Details
+For each resource (file/URL):
+- **Format**: File format (CSV, JSON, NetCDF, HDF5, etc.)
+- **URL**: Download link
+- **Description**: Resource-specific description
+- **Size**: File size (if available)
+- **Created/Modified**: Timestamps
+
+### Additional Information
+- **Author/Maintainer**: Contact information
+- **Temporal Coverage**: Date ranges
+- **Spatial Coverage**: Geographic extent
+- **Version**: Dataset version
+- **Related Datasets**: Links to related data
+- **Processing Info**: Data processing details
+
+## When to Use
+
+1. **After Search**: Follow up on interesting datasets from search results
+2. **Before Download**: Verify dataset contents and formats
+3. **Quality Review**: Check metadata completeness for curation
+4. **Citation Info**: Get complete information for proper attribution
+5. **Resource Selection**: Choose specific files/formats from dataset
+6. **Metadata Validation**: Assess dataset documentation quality
+
+## Workflow Integration
+
+1. **Search First**: Use `/ndp-search` to find datasets
+2. **Get IDs**: Note dataset IDs or names from search results
+3. **Retrieve Details**: Use this command for complete information
+4. **Download**: Use resource URLs from details for data access
+
+## Example Interactions
+
+### Example 1: Complete Dataset Review
+```
+User: "Get complete information for dataset ID 'abc123-climate'"
+Claude uses: get_dataset_details(dataset_identifier="abc123-climate")
+Result: Full metadata, all resources, download URLs, temporal/spatial coverage
+```
+
+### Example 2: Resource Exploration
+```
+User: "What files are included in the NOAA temperature dataset?"
+Claude uses:
+  1. search_datasets(owner_org="NOAA", search_terms=["temperature"])
+  2. get_dataset_details(dataset_identifier="<id_from_search>")
+Result: List of all resources with formats and descriptions
+```
+
+### Example 3: Metadata Quality Check
+```
+User: "Review the documentation for this oceanographic dataset"
+Claude uses: get_dataset_details(dataset_identifier="<provided_id>")
+Analysis: Evaluates description, tags, resource documentation, contact info
+```
+
+### Example 4: Multi-Dataset Comparison
+```
+User: "Compare the resources available in these three datasets"
+Claude uses: get_dataset_details() for each dataset
+Result: Side-by-side comparison of formats, sizes, documentation
+```
+
+## Tips
+
+- **Use IDs when available**: More reliable than names
+- **Check both servers**: Same dataset name might exist on multiple servers
+- **Review all resources**: Datasets often have multiple files/formats
+- **Note download URLs**: Save resource URLs for data access
+- **Check temporal coverage**: Ensure data covers your time period of interest
+- **Verify formats**: Confirm file formats are compatible with your tools
+- **Read descriptions carefully**: Important processing details often in descriptions
--- a/commands/ndp-organizations.md
+++ b/commands/ndp-organizations.md
@@ -0,0 +1,110 @@
+---
+description: List and filter organizations in the National Data Platform
+---
+
+# NDP Organizations
+
+List all organizations contributing data to the National Data Platform.
+
+This command provides access to organization discovery functionality through the NDP MCP.
+
+## Available MCP Tool
+
+### `list_organizations`
+Lists all organizations in NDP with optional filtering:
+
+**Parameters**:
+- **name_filter** (optional): Filter organizations by name substring match
+  - Case-insensitive partial matching
+  - Example: "climate" matches "Climate Research Center", "NOAA Climate Lab"
+- **server** (optional): Server to query
+  - `'global'` (default) - Public global NDP server
+  - `'local'` - Local/institutional NDP server
+  - `'pre_ckan'` - Pre-production server
+
+**Returns**: List of organization names and metadata including:
+- Total count of organizations
+- Organization names matching filter
+- Server queried
+
+## Usage Patterns
+
+### Discover All Organizations
+```
+"List all organizations in the National Data Platform"
+```
+Uses: `list_organizations()` - No filter, returns all organizations
+
+### Filter by Keyword
+```
+"Show me all organizations with 'climate' in their name"
+```
+Uses: `list_organizations(name_filter="climate")`
+
+### Multi-Server Query
+```
+"Compare organizations on global and local servers"
+```
+Uses: `list_organizations(server="global")` and `list_organizations(server="local")`
+
+### Research-Specific Discovery
+```
+"Find organizations related to oceanographic research"
+```
+Uses: `list_organizations(name_filter="ocean")` and `list_organizations(name_filter="marine")`
+
+## Why Use This Command
+
+1. **Verify Organization Names**: Get exact names before using in dataset searches
+2. **Explore Data Sources**: Understand what organizations contribute to NDP
+3. **Guide Searches**: Identify relevant organizations for your research domain
+4. **Server Comparison**: See organizational differences between servers
+5. **Data Coverage**: Understand breadth of data providers
+
+## Workflow Integration
+
+1. **Start Here**: Use this command before searching datasets
+2. **Identify Providers**: Find organizations relevant to your research
+3. **Use in Search**: Pass organization names to `search_datasets`
+4. **Iterate**: Refine organization filters as needed
+
+## Example Interactions
+
+### Example 1: General Exploration
+```
+User: "List all organizations available on the local NDP server"
+Claude uses: list_organizations(server="local")
+Result: Complete list of local organizations with count
+```
+
+### Example 2: Targeted Discovery
+```
+User: "Find organizations related to satellite data"
+Claude uses: list_organizations(name_filter="satellite")
+Result: Organizations with "satellite" in their name
+```
+
+### Example 3: Multi-Keyword Search
+```
+User: "Show me organizations working on Earth observation"
+Claude uses:
+  - list_organizations(name_filter="earth")
+  - list_organizations(name_filter="observation")
+Result: Combined results from both searches
+```
+
+### Example 4: Before Dataset Search
+```
+User: "I want to search for NOAA climate data"
+Claude uses: list_organizations(name_filter="noaa")
+Result: Exact NOAA organization name(s)
+Then: Can proceed with search_datasets(owner_org="<verified_name>")
+```
+
+## Tips
+
+- **Use partial names**: "ocean" will match "Oceanographic Institute", "Ocean Research Lab", etc.
+- **Try variations**: Search both "climate" and "atmospheric" to find all relevant organizations
+- **Check both servers**: Global and local may have different organizations
+- **Verify before searching**: Always confirm organization name before using in dataset searches
+- **Multiple keywords**: Try related terms to discover all relevant providers
--- a/commands/ndp-search.md
+++ b/commands/ndp-search.md
@@ -0,0 +1,89 @@
+---
+description: Search for datasets in the National Data Platform
+---
+
+# NDP Dataset Search
+
+Search for datasets across the National Data Platform ecosystem with advanced filtering options.
+
+This command provides access to the NDP MCP tools for dataset discovery and exploration.
+
+## Available MCP Tools
+
+When you use this command, Claude can invoke these MCP tools:
+
+### `search_datasets` - Primary search tool
+Searches for datasets using various criteria:
+- **search_terms**: List of terms to search across all fields
+- **owner_org**: Filter by organization name
+- **resource_format**: Filter by format (CSV, JSON, NetCDF, HDF5, GeoTIFF, etc.)
+- **dataset_description**: Search in descriptions
+- **server**: Query 'global' (default) or 'local' server
+- **limit**: Maximum results (default: 20)
+
+### `list_organizations` - Organization discovery
+Lists available organizations:
+- **name_filter**: Filter by name substring
+- **server**: Query 'global' (default), 'local', or 'pre_ckan'
+
+### `get_dataset_details` - Detailed information
+Retrieves complete metadata for a specific dataset:
+- **dataset_identifier**: Dataset ID or name from search results
+- **identifier_type**: 'id' (default) or 'name'
+- **server**: 'global' (default) or 'local'
+
+## Recommended Workflow
+
+1. **Discover Organizations**: Use `list_organizations` to find relevant data sources
+2. **Search Datasets**: Use `search_datasets` with appropriate filters
+3. **Review Results**: Claude will summarize matching datasets
+4. **Get Details**: Use `get_dataset_details` for datasets of interest
+5. **Refine Search**: Adjust filters based on results
+
+## Best Practices
+
+- **Always verify organization names** with `list_organizations` before using in search
+- **Start broad, then refine**: Begin with simple terms, add filters as needed
+- **Limit results appropriately**: Default 20 is good, increase if needed
+- **Use format filters**: Narrow to specific formats (NetCDF, CSV, etc.) when relevant
+- **Multi-server searches**: Query both global and local for comprehensive coverage
+
+## Example Queries
+
+### Basic Search
+```
+"Find climate datasets from NOAA"
+```
+Expected tools: `list_organizations(name_filter="noaa")`, then `search_datasets(owner_org="NOAA", search_terms=["climate"])`
+
+### Format-Specific Search
+```
+"Search for oceanographic data in NetCDF format"
+```
+Expected tools: `search_datasets(search_terms=["oceanographic"], resource_format="NetCDF")`
+
+### Organization-Based Search
+```
+"List all datasets from a specific research institution"
+```
+Expected tools: `list_organizations(name_filter="<institution>")`, then `search_datasets(owner_org="<name>")`
+
+### Refined Search with Limit
+```
+"Find CSV datasets about temperature monitoring, limit to 10 results"
+```
+Expected tools: `search_datasets(search_terms=["temperature", "monitoring"], resource_format="CSV", limit=10)`
+
+### Multi-Server Comparison
+```
+"Compare oceanographic datasets on global and local servers"
+```
+Expected tools: `search_datasets(server="global", ...)` and `search_datasets(server="local", ...)`
+
+## Tips for Effective Searching
+
+1. **Use specific terminology**: Scientific terms work better than generic ones
+2. **Combine filters**: Organization + format + terms = precise results
+3. **Check multiple formats**: Try CSV, NetCDF, HDF5 for scientific data
+4. **Explore organizations first**: Understanding data providers helps target searches
+5. **Request details selectively**: Full metadata for only the most relevant datasets
--- a/hooks/hooks.json
+++ b/hooks/hooks.json
@@ -0,0 +1,77 @@
+{
+  "hooks": {
+    "UserPromptSubmit": [
+      {
+        "matcher": "",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type UserPromptSubmit"
+          }
+        ]
+      }
+    ],
+    "PreToolUse": [
+      {
+        "matcher": "ndp",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type PreToolUse"
+          }
+        ]
+      },
+      {
+        "matcher": "*",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "echo \"$(date +%s.%N),$(ps -o %cpu= -p $$),$(ps -o rss= -p $$),$CLAUDE_TOOL_NAME,start\" >> ~/.claude/performance.csv"
+          }
+        ]
+      }
+    ],
+    "PostToolUse": [
+      {
+        "matcher": "ndp",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type PostToolUse"
+          }
+        ]
+      },
+      {
+        "matcher": "*",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "echo \"$(date +%s.%N),$(ps -o %cpu= -p $$),$(ps -o rss= -p $$),$CLAUDE_TOOL_NAME,end\" >> ~/.claude/performance.csv; if [[ $(wc -l < ~/.claude/performance.csv) -gt 1000 ]]; then tail -n 500 ~/.claude/performance.csv > ~/.claude/performance.csv.tmp && mv ~/.claude/performance.csv.tmp ~/.claude/performance.csv; fi"
+          }
+        ]
+      }
+    ],
+    "SessionStart": [
+      {
+        "matcher": "",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type SessionStart"
+          }
+        ]
+      }
+    ],
+    "Stop": [
+      {
+        "matcher": "",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/log_ndp_events.py --event-type Stop"
+          }
+        ]
+      }
+    ]
+  }
+}
--- a/hooks/log_ndp_events.py
+++ b/hooks/log_ndp_events.py
@@ -0,0 +1,159 @@
+#!/usr/bin/env -S uv run --python 3.10 --script
+# /// script
+# requires-python = ">=3.10"
+# ///
+"""
+NDP Plugin Event Logger
+Logs Claude Code events related to NDP plugin usage to a local file.
+Enhanced to capture tool names, user input, and agent responses.
+"""
+
+import json
+import sys
+import os
+import argparse
+from datetime import datetime
+from pathlib import Path
+
+def get_log_file_path():
+    """Get the log file path within plugin directory"""
+    # Get plugin root directory
+    plugin_root = Path(__file__).parent.parent
+    logs_dir = plugin_root / "logs"
+
+    # Create logs directory if it doesn't exist
+    logs_dir.mkdir(exist_ok=True)
+
+    return logs_dir / "ndp_events.log"
+
+def extract_enhanced_data(event_type: str, event_data: dict) -> dict:
+    """Extract enhanced information from event data"""
+    enhanced = {
+        "timestamp": datetime.now().isoformat(),
+        "event_type": event_type,
+        "session_id": event_data.get("session_id", "unknown"),
+    }
+    
+    # Extract tool information for PreToolUse and PostToolUse
+    if event_type in ["PreToolUse", "PostToolUse"]:
+        tool_data = event_data.get('tool', {})
+        if tool_data:
+            enhanced['tool_name'] = tool_data.get('name', 'unknown')
+            enhanced['tool_input'] = tool_data.get('input', {})
+            
+            # For PostToolUse, capture tool results
+            if event_type == "PostToolUse":
+                if 'result' in event_data:
+                    enhanced['tool_result'] = event_data['result']
+                if 'output' in event_data:
+                    enhanced['tool_output'] = event_data['output']
+                if 'error' in event_data:
+                    enhanced['tool_error'] = event_data['error']
+    
+    # Extract user input for UserPromptSubmit
+    if event_type == "UserPromptSubmit":
+        if 'text' in event_data:
+            enhanced['user_prompt'] = event_data['text']
+        if 'messages' in event_data:
+            enhanced['conversation_messages'] = event_data['messages']
+    
+    # For PostToolUse, extract agent response from transcript
+    if event_type == "PostToolUse" and 'transcript_path' in event_data:
+        transcript_path = event_data['transcript_path']
+        if os.path.exists(transcript_path):
+            try:
+                # Read last few messages to capture recent agent response
+                recent_chat = []
+                with open(transcript_path, 'r') as f:
+                    lines = f.readlines()
+                    # Get last 5 messages to capture context
+                    for line in lines[-5:]:
+                        line = line.strip()
+                        if line:
+                            try:
+                                msg = json.loads(line)
+                                recent_chat.append(msg)
+                            except json.JSONDecodeError:
+                                pass
+                
+                enhanced['recent_chat'] = recent_chat
+                
+                # Extract the latest agent response
+                for msg in reversed(recent_chat):
+                    if msg.get('role') == 'assistant':
+                        enhanced['latest_agent_response'] = msg.get('content', [])
+                        break
+                        
+            except Exception as e:
+                enhanced['transcript_read_error'] = str(e)
+    
+    # For Stop event, optionally include full chat if requested
+    if event_type == "Stop" and 'transcript_path' in event_data:
+        transcript_path = event_data['transcript_path']
+        if os.path.exists(transcript_path):
+            try:
+                chat_data = []
+                with open(transcript_path, 'r') as f:
+                    for line in f:
+                        line = line.strip()
+                        if line:
+                            try:
+                                chat_data.append(json.loads(line))
+                            except json.JSONDecodeError:
+                                pass
+                
+                # Add summary statistics
+                enhanced['chat_summary'] = {
+                    'total_messages': len(chat_data),
+                    'user_messages': sum(1 for msg in chat_data if msg.get('role') == 'user'),
+                    'assistant_messages': sum(1 for msg in chat_data if msg.get('role') == 'assistant'),
+                }
+                # Optionally include last few messages
+                enhanced['last_5_messages'] = chat_data[-5:] if chat_data else []
+                
+            except Exception as e:
+                enhanced['chat_read_error'] = str(e)
+    
+    # Include raw event data for completeness
+    enhanced['raw_data'] = event_data
+    
+    return enhanced
+
+def log_event(event_type: str, event_data: dict):
+    """Log event to file with enhanced data extraction"""
+    try:
+        log_file = get_log_file_path()
+
+        # Prepare enhanced log entry
+        log_entry = extract_enhanced_data(event_type, event_data)
+
+        # Append to log file (one JSON object per line)
+        with open(log_file, "a") as f:
+            f.write(json.dumps(log_entry) + "\n")
+
+        return True
+
+    except Exception as e:
+        # Fail silently to not block Claude Code
+        print(f"Warning: Failed to log event: {e}", file=sys.stderr)
+        return False
+
+def main():
+    parser = argparse.ArgumentParser(description='Log NDP plugin events with enhanced data capture')
+    parser.add_argument('--event-type', required=True, help='Type of event')
+    args = parser.parse_args()
+
+    try:
+        # Read event data from stdin
+        event_data = json.load(sys.stdin)
+    except json.JSONDecodeError:
+        event_data = {}
+
+    # Log the event with enhanced data
+    log_event(args.event_type, event_data)
+
+    # Always exit successfully to not block Claude Code
+    sys.exit(0)
+
+if __name__ == '__main__':
+    main()
--- a/plugin.lock.json
+++ b/plugin.lock.json
@@ -0,0 +1,69 @@
+{
+  "$schema": "internal://schemas/plugin.lock.v1.json",
+  "pluginId": "gh:SIslamMun/iowarp-plugin:ndp-plugin",
+  "normalized": {
+    "repo": null,
+    "ref": "refs/tags/v20251128.0",
+    "commit": "8fc6a0e4bdb652d7f29cb6ccd20d1a937260e394",
+    "treeHash": "1e28125943e9edc7d798abae2f5b4311368a860bb39396c14f0a4f6a82ade6de",
+    "generatedAt": "2025-11-28T10:12:43.157888Z",
+    "toolVersion": "publish_plugins.py@0.2.0"
+  },
+  "origin": {
+    "remote": "git@github.com:zhongweili/42plugin-data.git",
+    "branch": "master",
+    "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
+    "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
+  },
+  "manifest": {
+    "name": "ndp-plugin",
+    "description": "National Data Platform (NDP) integration plugin with dataset search, discovery, and workflow automation",
+    "version": "1.0.0"
+  },
+  "content": {
+    "files": [
+      {
+        "path": "README.md",
+        "sha256": "7a4168ad797d1f80a4b4380b374cfee7ea463ae21e3e894d96cc2fb3ce8f9522"
+      },
+      {
+        "path": "agents/ndp-dataset-curator.md",
+        "sha256": "80537e47871ff2af4efcec669b72532bc9a79b31574e6a3021eeec8deb6d16d0"
+      },
+      {
+        "path": "agents/ndp-data-scientist.md",
+        "sha256": "93c78b552db86ad8fa28fd9f1301d999ee925dda064dc6f0b7b85a697f007ac5"
+      },
+      {
+        "path": "hooks/hooks.json",
+        "sha256": "330b9d07eb8a2a01671ac7c68320e3400ec7a890202ffd30741069f0acb94e83"
+      },
+      {
+        "path": "hooks/log_ndp_events.py",
+        "sha256": "35c11a3727b98c423e7644083b7d57b8adaf855f747d13fdb687cd59cb96de24"
+      },
+      {
+        "path": ".claude-plugin/plugin.json",
+        "sha256": "9ed40f25eeffd93581d259506be104669cdbc316bb0e34414ff5c391bcbaaaf3"
+      },
+      {
+        "path": "commands/ndp-organizations.md",
+        "sha256": "8453847b408366cebcc933ea9d16d6121aaa9ba6e6c57e557e52502e0ec636ce"
+      },
+      {
+        "path": "commands/ndp-dataset-details.md",
+        "sha256": "b8ec4903d08ed8cbd61b16ac66a8c4daf5caf5dad37b5eb9f62e45ac04136531"
+      },
+      {
+        "path": "commands/ndp-search.md",
+        "sha256": "07061ee414c1dbb8d354c9ab4fd2248cdbdbd5955f0196a4ce4ef012f38de610"
+      }
+    ],
+    "dirSha256": "1e28125943e9edc7d798abae2f5b4311368a860bb39396c14f0a4f6a82ade6de"
+  },
+  "security": {
+    "scannedAt": null,
+    "scannerVersion": null,
+    "flags": []
+  }
+}