Initial commit

Zhongwei Li
2025-11-29 18:50:58 +08:00
commit 3fb2d73fdf
11 changed files with 488 additions and 0 deletions

15
.claude-plugin/plugin.json Normal file

@@ -0,0 +1,15 @@
{
"name": "automl-pipeline-builder",
"description": "Build AutoML pipelines",
"version": "1.0.0",
"author": {
"name": "Claude Code Plugins",
"email": "[email protected]"
},
"skills": [
"./skills"
],
"commands": [
"./commands"
]
}

3
README.md Normal file

@@ -0,0 +1,3 @@
# automl-pipeline-builder
Build AutoML pipelines

15
commands/build-automl.md Normal file

@@ -0,0 +1,15 @@
---
description: Build and run an AutoML pipeline for the current task
---
# AI/ML Task Executor
You are an AI/ML specialist. When this command is invoked:
1. Analyze the current context and requirements
2. Generate appropriate code for the ML task
3. Include data validation and error handling
4. Provide performance metrics and insights
5. Save artifacts and generate documentation
Use modern ML frameworks and follow established best practices.

73
plugin.lock.json Normal file

@@ -0,0 +1,73 @@
{
"$schema": "internal://schemas/plugin.lock.v1.json",
"pluginId": "gh:jeremylongshore/claude-code-plugins-plus:plugins/ai-ml/automl-pipeline-builder",
"normalized": {
"repo": null,
"ref": "refs/tags/v20251128.0",
"commit": "fb34009766a8bc2e9399fffd0376914fd9c97ab3",
"treeHash": "c31c6acb6671d79f92b4e28acb851a76d56e34bb91fe52739417ac754698c660",
"generatedAt": "2025-11-28T10:18:10.783298Z",
"toolVersion": "publish_plugins.py@0.2.0"
},
"origin": {
"remote": "git@github.com:zhongweili/42plugin-data.git",
"branch": "master",
"commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
"repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
},
"manifest": {
"name": "automl-pipeline-builder",
"description": "Build AutoML pipelines",
"version": "1.0.0"
},
"content": {
"files": [
{
"path": "README.md",
"sha256": "65ddad12a34abea594620cef98ef80e73e8f5fe9e4b51f6de2748cef28f5b33c"
},
{
"path": ".claude-plugin/plugin.json",
"sha256": "29894f57198e94aa48e120e7a39a2ad0a38259cd66f7a7c6aaf4b865722d4248"
},
{
"path": "commands/build-automl.md",
"sha256": "043efb83e2f02fc6d0869c8a3a7388d6e49f6c809292b93dd6a97a1b142e5647"
},
{
"path": "skills/automl-pipeline-builder/SKILL.md",
"sha256": "95dcd51280f06617680d939bb4b53db46e97e18bd51ceaf5eaecd600a5def7b2"
},
{
"path": "skills/automl-pipeline-builder/references/README.md",
"sha256": "cc008a892b439b86749b67023a04bb02ceaa11207207b5a267a4c161afa0a73d"
},
{
"path": "skills/automl-pipeline-builder/scripts/README.md",
"sha256": "06961fee5bb99c5bd43e86c2cef8c010c7c93a20cd23f6125d8575f356c0b0a2"
},
{
"path": "skills/automl-pipeline-builder/assets/example_dataset.csv",
"sha256": "df2165de808d5ffc740008e372a6dcf72d7851aa0a4f9b087cacfd47e6adc9d3"
},
{
"path": "skills/automl-pipeline-builder/assets/pipeline_template.yaml",
"sha256": "ce61db53c569e9945a1bc79fc05ba5757395dfec0cad306039eb6927b6c0a6c3"
},
{
"path": "skills/automl-pipeline-builder/assets/README.md",
"sha256": "392baecc536382ba5f53272f95a232d90e39b683e1f678bdc174b49282b13f29"
},
{
"path": "skills/automl-pipeline-builder/assets/evaluation_report_template.html",
"sha256": "bc27b226c4c65d0567721be60a65e799b7511ef72c545277a68cc3a0729f7a8f"
}
],
"dirSha256": "c31c6acb6671d79f92b4e28acb851a76d56e34bb91fe52739417ac754698c660"
},
"security": {
"scannedAt": null,
"scannerVersion": null,
"flags": []
}
}

53
skills/automl-pipeline-builder/SKILL.md Normal file

@@ -0,0 +1,53 @@
---
name: building-automl-pipelines
description: |
This skill empowers Claude to build AutoML pipelines using the automl-pipeline-builder plugin. It is triggered when the user requests the creation of an automated machine learning pipeline, specifies the use of AutoML techniques, or asks for assistance in automating the machine learning model building process. The skill analyzes the context, generates code for the ML task, includes data validation and error handling, provides performance metrics, and saves artifacts with documentation. Use this skill when the user explicitly asks to "build automl pipeline", "create automated ml pipeline", or needs help with "automating machine learning workflows".
allowed-tools: Read, Write, Edit, Grep, Glob, Bash
version: 1.0.0
---
## Overview
This skill automates the creation of machine learning pipelines using the automl-pipeline-builder plugin. It simplifies the process of building, training, and evaluating machine learning models by automating feature engineering, model selection, and hyperparameter tuning.
## How It Works
1. **Analyze Requirements**: The skill analyzes the user's request and identifies the specific machine learning task and data requirements.
2. **Generate Code**: Based on the analysis, the skill generates the necessary code to build an AutoML pipeline using appropriate libraries.
3. **Implement Best Practices**: The skill incorporates data validation, error handling, and performance optimization techniques into the generated code.
4. **Provide Insights**: After execution, the skill provides performance metrics, insights, and documentation for the created pipeline.
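The four steps above can be sketched as code. This is a minimal illustration using scikit-learn; the plugin does not mandate a specific library, and the column names and search ranges here are placeholders, not actual plugin output:

```python
# Illustrative sketch of the kind of pipeline code this skill generates.
# Assumes scikit-learn; column names and hyperparameter ranges are placeholders.
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def build_pipeline(numeric_cols, categorical_cols):
    """Assemble preprocessing and model into one tunable estimator."""
    preprocess = ColumnTransformer([
        ("num", Pipeline([("impute", SimpleImputer(strategy="mean")),
                          ("scale", StandardScaler())]), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    return Pipeline([("prep", preprocess),
                     ("model", RandomForestClassifier(random_state=0))])

def tune(pipeline, X, y):
    """Random search over a small illustrative hyperparameter space."""
    search = RandomizedSearchCV(
        pipeline,
        {"model__n_estimators": [100, 200, 300],
         "model__max_depth": [3, 5, 7]},
        n_iter=5, scoring="roc_auc", cv=3, random_state=0)
    return search.fit(X, y)
```

The `Pipeline` wrapper keeps preprocessing inside the cross-validation loop, which avoids leaking validation data into imputation and scaling statistics.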
## When to Use This Skill
This skill activates when you need to:
- Build an automated machine learning pipeline.
- Automate the process of model selection and hyperparameter tuning.
- Generate code for a complete AutoML workflow.
## Examples
### Example 1: Creating a Classification Pipeline
User request: "Build an AutoML pipeline for classifying customer churn."
The skill will:
1. Generate code to load and preprocess customer data.
2. Create an AutoML pipeline that automatically selects and tunes a classification model.
### Example 2: Optimizing a Regression Model
User request: "Create an automated ml pipeline to predict house prices."
The skill will:
1. Generate code to build a regression model using AutoML techniques.
2. Automatically select the best performing model and provide performance metrics.
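For either example, the performance metrics the skill reports can be computed as in this sketch (scikit-learn metric functions; the inputs shown are illustrative, and the keys match the bundled evaluation report template):

```python
# Sketch of the metrics block reported after training. Assumes scikit-learn;
# y_true / y_pred / y_prob are illustrative inputs, not plugin output.
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

def evaluation_metrics(y_true, y_pred, y_prob):
    """Compute the metrics used in the bundled evaluation report template."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1_score": f1_score(y_true, y_pred),
        "auc_roc": roc_auc_score(y_true, y_prob),
    }
```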
## Best Practices
- **Data Preparation**: Ensure data is clean, properly formatted, and relevant to the machine learning task.
- **Performance Monitoring**: Continuously monitor the performance of the AutoML pipeline and retrain the model as needed.
- **Error Handling**: Implement robust error handling to gracefully handle unexpected issues during pipeline execution.
## Integration
This skill can be integrated with other data processing and visualization plugins to create end-to-end machine learning workflows. It can also be used in conjunction with deployment plugins to automate the deployment of trained models.

7
skills/automl-pipeline-builder/assets/README.md Normal file

@@ -0,0 +1,7 @@
# Assets
Bundled resources for automl-pipeline-builder skill
- [ ] pipeline_template.yaml: YAML template for defining the structure and configuration of the AutoML pipeline.
- [ ] example_dataset.csv: Sample dataset that can be used as input for the AutoML pipeline.
- [ ] evaluation_report_template.html: HTML template for generating the model evaluation report.

203
skills/automl-pipeline-builder/assets/evaluation_report_template.html Normal file

@@ -0,0 +1,203 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AutoML Model Evaluation Report</title>
<style>
/* Basic Reset */
body, h1, h2, h3, p, table, th, td {
margin: 0;
padding: 0;
border: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
}
/* General Styles */
body {
font-family: sans-serif;
line-height: 1.6;
background-color: #f4f4f4;
color: #333;
padding: 20px;
}
.container {
max-width: 960px;
margin: 0 auto;
background-color: #fff;
padding: 20px;
border-radius: 8px;
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
h1, h2, h3 {
margin-bottom: 15px;
color: #0056b3;
}
h1 {
font-size: 2.5em;
}
h2 {
font-size: 2em;
}
h3 {
font-size: 1.5em;
}
p {
margin-bottom: 15px;
}
/* Table Styles */
table {
width: 100%;
border-collapse: collapse;
margin-bottom: 20px;
}
th, td {
padding: 12px 15px;
text-align: left;
border-bottom: 1px solid #ddd;
}
th {
background-color: #f0f0f0;
font-weight: bold;
}
/* Responsive Design */
@media (max-width: 768px) {
.container {
padding: 15px;
}
h1 {
font-size: 2em;
}
h2 {
font-size: 1.6em;
}
h3 {
font-size: 1.3em;
}
table {
display: block;
overflow-x: auto;
}
}
/* Specific Styles */
.model-summary {
margin-bottom: 30px;
}
.evaluation-metrics {
margin-bottom: 30px;
}
.visualizations {
margin-bottom: 30px;
}
.conclusion {
margin-bottom: 20px;
}
.visualization-image {
max-width: 100%;
height: auto;
border: 1px solid #ccc;
border-radius: 5px;
margin-bottom: 10px;
}
</style>
</head>
<body>
<div class="container">
<!-- Report Header -->
<h1>AutoML Model Evaluation Report</h1>
<p>Generated on: {{generation_date}}</p>
<!-- Model Summary -->
<section class="model-summary">
<h2>Model Summary</h2>
<p><strong>Model Name:</strong> {{model_name}}</p>
<p><strong>Algorithm:</strong> {{algorithm}}</p>
<p><strong>Dataset:</strong> {{dataset_name}}</p>
<p><strong>Features Used:</strong> {{features_used}}</p>
</section>
<!-- Evaluation Metrics -->
<section class="evaluation-metrics">
<h2>Evaluation Metrics</h2>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Accuracy</td>
<td>{{accuracy}}</td>
</tr>
<tr>
<td>Precision</td>
<td>{{precision}}</td>
</tr>
<tr>
<td>Recall</td>
<td>{{recall}}</td>
</tr>
<tr>
<td>F1-Score</td>
<td>{{f1_score}}</td>
</tr>
<tr>
<td>AUC-ROC</td>
<td>{{auc_roc}}</td>
</tr>
</tbody>
</table>
</section>
<!-- Visualizations -->
<section class="visualizations">
<h2>Visualizations</h2>
<h3>Confusion Matrix</h3>
<img src="{{confusion_matrix_image}}" alt="Confusion Matrix" class="visualization-image">
<h3>ROC Curve</h3>
<img src="{{roc_curve_image}}" alt="ROC Curve" class="visualization-image">
<h3>Feature Importance</h3>
<img src="{{feature_importance_image}}" alt="Feature Importance" class="visualization-image">
</section>
<!-- Conclusion -->
<section class="conclusion">
<h2>Conclusion</h2>
<p>{{conclusion_text}}</p>
</section>
<!-- Additional Notes -->
<section class="notes">
<h3>Additional Notes</h3>
<p>{{additional_notes}}</p>
</section>
</div>
</body>
</html>
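The `{{placeholder}}` slots in the template above can be filled by any Mustache/Jinja-style renderer. A minimal stdlib sketch (the regex approach is illustrative; the plugin does not specify a template engine):

```python
# Minimal {{placeholder}} renderer for the evaluation report template.
# Stdlib-only sketch; unknown keys are left intact rather than erased.
import re

def render_template(template, values):
    """Replace each {{key}} with str(values[key]); keep unknown keys as-is."""
    def sub(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))
    return re.sub(r"\{\{(\w+)\}\}", sub, template)
```

Leaving unknown keys untouched makes partially filled reports easy to spot during debugging.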

36
skills/automl-pipeline-builder/assets/example_dataset.csv Normal file

@@ -0,0 +1,36 @@
# Sample dataset for AutoML pipeline builder plugin
# This dataset is a simplified example and may not be suitable for all AutoML tasks.
# Replace this with your actual dataset for optimal results.
#
# Columns:
# feature1: Numerical feature (e.g., age, income)
# feature2: Categorical feature (e.g., city, product type) - encoded as strings
# target: Target variable (e.g., churn, conversion) - binary (0 or 1)
feature1,feature2,target
25,New York,0
30,Los Angeles,1
40,Chicago,0
22,Houston,0
35,Phoenix,1
48,Philadelphia,1
28,San Antonio,0
32,San Diego,1
45,Dallas,0
27,San Jose,0
31,Austin,1
38,Jacksonville,0
24,Fort Worth,0
41,Columbus,1
29,Charlotte,0
33,San Francisco,1
46,Indianapolis,1
23,Seattle,0
36,Denver,1
49,Washington,1
# Add more data rows here. Aim for a larger dataset (hundreds or thousands of rows) for better AutoML performance.
# Example:
# 52,Miami,0
# 39,Boston,1
# Consider adding missing values (e.g., empty strings) to test the pipeline's handling of missing data.
# For categorical features with many unique values, consider using techniques like one-hot encoding or target encoding.

69
skills/automl-pipeline-builder/assets/pipeline_template.yaml Normal file

@@ -0,0 +1,69 @@
# pipeline_template.yaml
# --- General Pipeline Configuration ---
pipeline_name: "AutoML Pipeline - REPLACE_ME" # Name of the pipeline (e.g., Customer Churn Prediction)
description: "Automated Machine Learning pipeline for REPLACE_ME." # Short description of the pipeline's purpose
version: "1.0.0" # Pipeline version
# --- Data Source Configuration ---
data_source:
type: "csv" # Type of data source (e.g., csv, database, api)
location: "data/YOUR_DATASET.csv" # Path to the data file or connection string
target_column: "target" # Name of the target variable column
index_column: null # Name of the index column (optional)
delimiter: "," # Delimiter for CSV files (e.g., ",", ";", "\t")
quotechar: '"' # Quote character for CSV files
encoding: "utf-8" # Encoding of the data file
# --- Feature Engineering Configuration ---
feature_engineering:
enabled: true # Enable or disable feature engineering
numeric_imputation: "mean" # Strategy for handling missing numerical values (e.g., mean, median, most_frequent, constant)
categorical_encoding: "onehot" # Method for encoding categorical features (e.g., onehot, ordinal, target)
feature_scaling: "standard" # Scaling method for numeric features (e.g., standard, minmax, robust)
feature_selection:
enabled: false # Enable or disable feature selection
method: "variance_threshold" # Feature selection method (e.g., variance_threshold, selectkbest)
threshold: 0.01 # Threshold for feature selection (depends on the method)
# --- Model Training Configuration ---
model_training:
algorithm: "xgboost" # Machine learning algorithm to use (e.g., xgboost, lightgbm, randomforest, logisticregression)
hyperparameter_tuning:
enabled: true # Enable or disable hyperparameter tuning
method: "random_search" # Hyperparameter tuning method (e.g., random_search, grid_search, bayesian_optimization)
n_trials: 50 # Number of trials for hyperparameter tuning
scoring_metric: "roc_auc" # Metric to optimize for (e.g., roc_auc, accuracy, f1, precision, recall)
hyperparameter_space: # Define hyperparameter ranges for each algorithm
xgboost: # Example for XGBoost
n_estimators: [100, 200, 300]
learning_rate: [0.01, 0.1, 0.2]
max_depth: [3, 5, 7]
# Add hyperparameter spaces for other algorithms as needed
# --- Model Evaluation Configuration ---
model_evaluation:
split_ratio: 0.2 # Ratio for splitting data into training and validation sets
scoring_metrics: ["roc_auc", "accuracy", "f1", "precision", "recall"] # List of metrics to evaluate the model
cross_validation:
enabled: true # Enable or disable cross-validation
n_folds: 5 # Number of folds for cross-validation
# --- Model Deployment Configuration ---
model_deployment:
enabled: false # Enable or disable model deployment
environment: "staging" # Target deployment environment (e.g., staging, production)
model_registry: "local" # Location to store the trained model (e.g., local, s3, gcp)
model_path: "models/YOUR_MODEL.pkl" # Path to save the trained model
api_endpoint: "YOUR_API_ENDPOINT" # API endpoint for model deployment (if applicable)
# --- Logging Configuration ---
logging:
level: "INFO" # Logging level (e.g., DEBUG, INFO, WARNING, ERROR)
format: "%(asctime)s - %(levelname)s - %(message)s" # Logging format
file_path: "logs/pipeline.log" # Path to the log file
# --- Error Handling Configuration ---
error_handling:
on_failure: "email_notification" # Action to take on pipeline failure (e.g., email_notification, retry, stop)
email_recipients: ["YOUR_EMAIL@example.com"] # List of email addresses to notify on failure
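The template above can be consumed like this — a minimal sketch using PyYAML, where the required-section list mirrors the template's top-level keys (the function name and validation policy are illustrative, not part of the plugin):

```python
# Hedged sketch of loading pipeline_template.yaml with PyYAML and checking
# that the template's main sections are present before running the pipeline.
import yaml

REQUIRED_TOP_LEVEL = ("data_source", "feature_engineering",
                      "model_training", "model_evaluation")

def load_config(path):
    """Load the pipeline template and check required sections exist."""
    with open(path, encoding="utf-8") as fh:
        cfg = yaml.safe_load(fh)
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in cfg]
    if missing:
        raise ValueError(f"template missing sections: {missing}")
    return cfg
```

Failing fast on a missing section gives a clearer error than a `KeyError` deep inside model training.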

7
skills/automl-pipeline-builder/references/README.md Normal file

@@ -0,0 +1,7 @@
# References
Bundled resources for automl-pipeline-builder skill
- [ ] automl_best_practices.md: Document outlining best practices for building and deploying AutoML pipelines, including data preprocessing, feature engineering, and model selection.
- [ ] supported_algorithms.md: Document listing the supported machine learning algorithms within the AutoML pipeline builder plugin, along with their parameters and usage.
- [ ] error_handling_guide.md: Guide on how to handle errors and exceptions that may occur during the AutoML pipeline building process.

7
skills/automl-pipeline-builder/scripts/README.md Normal file

@@ -0,0 +1,7 @@
# Scripts
Bundled resources for automl-pipeline-builder skill
- [ ] data_validation.py: Script to validate input data for the AutoML pipeline, ensuring data quality and preventing errors.
- [ ] model_evaluation.py: Script to evaluate the performance of the trained AutoML model using various metrics and generate a report.
- [ ] pipeline_deployment.py: Script to deploy the trained AutoML pipeline to a production environment.
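A hedged sketch of what `data_validation.py` might check, using only the stdlib `csv` module (the actual bundled script may differ; the binary-target rule matches the example dataset's `target` column):

```python
# Illustrative data validation for the example dataset format: comment lines
# starting with '#' are skipped, the target column must exist, and every
# target value must be binary. Stdlib-only sketch; not the bundled script.
import csv

def validate_dataset(path, target_column="target"):
    """Return a list of problems found; an empty list means the file passed."""
    problems = []
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(line for line in fh if not line.startswith("#"))
        if target_column not in (reader.fieldnames or []):
            return [f"missing target column '{target_column}'"]
        for i, row in enumerate(reader, start=2):
            value = row.get(target_column, "")
            if value not in {"0", "1"}:
                problems.append(f"row {i}: non-binary target {value!r}")
            if any(v == "" for v in row.values()):
                problems.append(f"row {i}: empty field")
    return problems
```

Returning a problem list instead of raising on the first issue lets the pipeline report all data-quality failures in one pass.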