zhongwei/gh-anton-abyzov-specweave-plugins-specweave-ml

Zhongwei Li 468d045de7 Initial commit

2025-11-29 17:56:53 +08:00

name	description
specweave-ml:ml-deploy	Generate deployment artifacts (API, Docker, monitoring)

Deploy ML Model

You are preparing an ML model for production deployment. Generate all necessary deployment artifacts following MLOps best practices.

Your Task

Generate API: FastAPI endpoint for model serving
Containerize: Dockerfile for model deployment
Setup Monitoring: Prometheus/Grafana configuration
Create A/B Test: Traffic splitting infrastructure
Load Test: Check that the API sustains the target request rate
Document Deployment: Deployment runbook

Deployment Steps

Step 1: Generate FastAPI App

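# Assumption: the helpers used in Steps 2-5 are exported from the same package.
from specweave import (
    create_model_api, containerize_model, setup_monitoring,
    create_ab_test, load_test_model,
)

api = create_model_api(
    model_path="models/model.pkl",
    framework="fastapi"
)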

Creates: api/main.py, api/models.py, api/predict.py
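
The source does not show the contents of the generated files. As a rough illustration of the kind of logic api/predict.py might hold, here is a framework-agnostic sketch that uses only the standard library. `PredictionResponse`, `MeanModel`, `validate_features`, and `predict_one` are hypothetical names chosen for this example, not part of specweave or FastAPI.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class PredictionResponse:
    prediction: Any
    model_version: str


class MeanModel:
    """Stand-in for a trained model: predicts the mean of each input row."""

    def predict(self, rows: Sequence[Sequence[float]]) -> list[float]:
        return [sum(row) / len(row) for row in rows]


def validate_features(features: Sequence[Any], expected_len: int) -> list[float]:
    """Reject malformed input before it reaches the model."""
    if len(features) != expected_len:
        raise ValueError(f"expected {expected_len} features, got {len(features)}")
    cleaned = []
    for value in features:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric feature: {value!r}")
        cleaned.append(float(value))
    return cleaned


def predict_one(model: Any, features: Sequence[Any], expected_len: int,
                model_version: str) -> PredictionResponse:
    """Validate one feature vector and return the model's prediction for it."""
    row = validate_features(features, expected_len)
    # Models with a batch-style predict() expect a list of rows.
    prediction = model.predict([row])[0]
    return PredictionResponse(prediction=prediction, model_version=model_version)
```

A FastAPI route could wrap `predict_one` and translate the `ValueError` into an HTTP 4xx response; keeping validation separate from the route makes it testable without starting a server.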

Step 2: Create Dockerfile

dockerfile = containerize_model(
    model_path="models/model.pkl",
    python_version="3.10"
)

Creates: Dockerfile, requirements.txt
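
The generated Dockerfile itself is not shown in the source. The sketch below renders one plausible layout as a string so it can be inspected or written to disk. `render_dockerfile` is a hypothetical helper; the `api.main:app` entry point, the uvicorn server, and port 8000 are assumptions inferred from Step 1 and the load-test URL in Step 5.

```python
def render_dockerfile(python_version: str = "3.10",
                      model_path: str = "models/model.pkl",
                      port: int = 8000) -> str:
    """Return Dockerfile text for serving the model API (illustrative only)."""
    lines = [
        f"FROM python:{python_version}-slim",
        "WORKDIR /app",
        "# Install dependencies first so this layer is cached across code changes.",
        "COPY requirements.txt .",
        "RUN pip install --no-cache-dir -r requirements.txt",
        "COPY api/ ./api/",
        f"COPY {model_path} ./{model_path}",
        f"EXPOSE {port}",
        # Assumed entry point: a FastAPI app object named `app` in api/main.py.
        f'CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "{port}"]',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile())
```

Copying requirements.txt before the application code is a common layering choice: dependency installation is rebuilt only when the requirements change.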

Step 3: Setup Monitoring

monitoring = setup_monitoring(
    model_name="recommendation-model",
    metrics=["latency", "throughput", "error_rate", "drift"]
)

Creates: monitoring/prometheus.yaml, monitoring/grafana-dashboard.json
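
Of the four metrics, drift is the least self-explanatory. The source does not say which drift measure is used, so the sketch below swaps in the Population Stability Index (PSI), one common choice, computed over fixed bin edges. `bin_fractions` and `psi` are hypothetical helper names.

```python
from __future__ import annotations

import math
from bisect import bisect_right
from typing import Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float],
                  eps: float = 1e-6) -> list[float]:
    """Fraction of values falling in each bin defined by sorted edges."""
    if not values:
        raise ValueError("need at least one value")
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    # Floor each fraction at eps so the logarithm below never sees zero.
    return [max(c / len(values), eps) for c in counts]


def psi(expected: Sequence[float], actual: Sequence[float],
        edges: Sequence[float]) -> float:
    """Population Stability Index between a baseline and a live sample."""
    e = bin_fractions(expected, edges)
    a = bin_fractions(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A PSI of zero means the binned distributions match; larger values mean the live feature or score distribution has moved away from the training baseline. Alert thresholds are application-specific and should be tuned against historical data.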

Step 4: A/B Testing Infrastructure

ab_test = create_ab_test(
    control_model="model-v2.pkl",
    treatment_model="model-v3.pkl",
    traffic_split=0.1  # assumed: fraction of traffic sent to the treatment model
)

Creates: ab-test/router.py, ab-test/metrics.py
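
The generated router is not shown in the source. A minimal sketch of one way to split traffic is deterministic hash-based bucketing, which keeps each user on the same variant across requests. `assign_variant` and the `salt` parameter are hypothetical, and `traffic_split` is assumed to be the fraction of users routed to the treatment model.

```python
import hashlib


def assign_variant(user_id: str, traffic_split: float = 0.1,
                   salt: str = "ab-test") -> str:
    """Deterministically map a user to 'control' or 'treatment'."""
    if not 0.0 <= traffic_split <= 1.0:
        raise ValueError("traffic_split must be between 0 and 1")
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it into [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if bucket < traffic_split else "control"


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_variant(uid))
```

Changing the salt reshuffles assignments, which is useful when starting a new experiment so that the same users are not always the ones exposed to the treatment.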

Step 5: Load Testing

load_test_results = load_test_model(
    api_url="http://localhost:8000/predict",
    target_rps=100,
    duration=60  # assumed to be seconds
)

Creates: load-tests/results.md
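
The format of load-tests/results.md is not specified in the source. The sketch below shows one way raw measurements could be reduced to a pass/fail summary. `LoadTestSummary`, `percentile`, and `summarize` are hypothetical names, and the default thresholds (`max_p95_ms`, `max_error_rate`) are placeholder assumptions, not values from specweave.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LoadTestSummary:
    achieved_rps: float
    error_rate: float
    p95_latency_ms: float
    passed: bool


def percentile(sorted_values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty sequence."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_ms: Sequence[float], errors: int, duration_s: float,
              target_rps: float, max_p95_ms: float = 200.0,
              max_error_rate: float = 0.01) -> LoadTestSummary:
    """Summarize successful-request latencies and the failed-request count."""
    if not latencies_ms or duration_s <= 0:
        raise ValueError("need at least one successful request and a positive duration")
    total = len(latencies_ms) + errors
    achieved_rps = len(latencies_ms) / duration_s
    error_rate = errors / total
    p95 = percentile(sorted(latencies_ms), 95)
    passed = (achieved_rps >= target_rps
              and p95 <= max_p95_ms
              and error_rate <= max_error_rate)
    return LoadTestSummary(achieved_rps, error_rate, p95, passed)
```

Reporting a tail percentile alongside throughput matters because a service can reach the target request rate while a small share of requests is still unacceptably slow.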

Step 6: Deployment Runbook

Create DEPLOYMENT.md:

# Deployment Runbook

## Pre-Deployment Checklist
- [ ] Model versioned
- [ ] API tested locally
- [ ] Load testing passed
- [ ] Monitoring configured
- [ ] Rollback plan documented

## Deployment Steps
1. Build Docker image
2. Push to registry
3. Deploy to staging
4. Validate staging
5. Deploy to production (1% traffic)
6. Monitor for 24 hours
7. Ramp to 100% if stable

## Rollback Procedure
[Steps to roll back to the previous version]

## Monitoring
[Grafana dashboard URL]
[Key metrics to watch]

Output

Report:

All deployment artifacts generated
Load test results (can the API sustain the target RPS?)
Deployment recommendation (ready/not ready)
Next steps for deployment
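
The ready/not ready recommendation can be derived mechanically from the pre-deployment checklist in the runbook. A minimal sketch, assuming the checklist is tracked as a mapping from item to completion status; `deployment_recommendation` is a hypothetical helper.

```python
from __future__ import annotations


def deployment_recommendation(checklist: dict[str, bool]) -> tuple[str, list[str]]:
    """Return ('ready' | 'not ready', list of unfinished checklist items)."""
    missing = [item for item, done in checklist.items() if not done]
    return ("ready" if not missing else "not ready", missing)


if __name__ == "__main__":
    # Items mirror the Pre-Deployment Checklist in DEPLOYMENT.md.
    status, missing = deployment_recommendation({
        "Model versioned": True,
        "API tested locally": True,
        "Load testing passed": False,
        "Monitoring configured": True,
        "Rollback plan documented": True,
    })
    print(status, missing)
```

Returning the unfinished items along with the verdict gives the report its "next steps" section for free.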
