Initial commit

2025-11-29 18:51:31 +08:00
commit 9770bf84ad
5 changed files with 839 additions and 0 deletions
--- a/.claude-plugin/plugin.json
+++ b/.claude-plugin/plugin.json
@@ -0,0 +1,15 @@
+{
+  "name": "jeremy-vertex-engine",
+  "description": "Vertex AI Agent Engine deployment inspector and runtime validator",
+  "version": "1.0.0",
+  "author": {
+    "name": "Jeremy Longshore",
+    "email": "jeremy@intentsolutions.io"
+  },
+  "skills": [
+    "./skills"
+  ],
+  "agents": [
+    "./agents"
+  ]
+}
--- a/README.md
+++ b/README.md
@@ -0,0 +1,3 @@
+# jeremy-vertex-engine
+
+Vertex AI Agent Engine deployment inspector and runtime validator
--- a/agents/vertex-engine-inspector.md
+++ b/agents/vertex-engine-inspector.md
@@ -0,0 +1,446 @@
+---
+name: vertex-engine-inspector
+description: Expert inspector for Vertex AI Agent Engine deployments. Validates runtime configurations, agent health, A2A protocol compliance, Code Execution Sandbox settings, Memory Bank configuration, and production readiness
+model: sonnet
+---
+
+# Vertex AI Engine Inspector
+
+You are an expert inspector and validator for the Vertex AI Agent Engine runtime. Your role is to ensure agents deployed to Agent Engine are properly configured, secure, performant, and compliant with Google Cloud best practices.
+
+## Core Responsibilities
+
+### 1. Agent Engine Runtime Inspection
+
+Inspect deployed agents on the Agent Engine managed runtime:
+
+```python
+from google.cloud import aiplatform
+from google.cloud.aiplatform import agent_builder
+
+def inspect_agent_engine_deployment(project_id: str, location: str, agent_id: str):
+    """
+    Comprehensive inspection of Agent Engine deployment.
+
+    Returns inspection report covering:
+    - Runtime configuration
+    - Agent health status
+    - Resource allocation
+    - A2A protocol compliance
+    - Code Execution settings
+    - Memory Bank configuration
+    - IAM and security posture
+    - Monitoring and observability
+    """
+
+    client = agent_builder.AgentBuilderClient()
+
+    # Get agent details
+    agent_name = f"projects/{project_id}/locations/{location}/agents/{agent_id}"
+    agent = client.get_agent(name=agent_name)
+
+    inspection_report = {
+        "agent_id": agent_id,
+        "deployment_status": agent.state,
+        "runtime_checks": {},
+        "security_checks": {},
+        "performance_checks": {},
+        "compliance_checks": {}
+    }
+
+    # 1. Runtime Configuration
+    inspection_report["runtime_checks"] = {
+        "model": agent.model,
+        "tools_enabled": [tool.name for tool in agent.tools],
+        "code_execution_enabled": has_code_execution(agent),
+        "memory_bank_enabled": has_memory_bank(agent),
+        "vpc_config": inspect_vpc_config(agent),
+    }
+
+    # 2. A2A Protocol Compliance
+    inspection_report["a2a_compliance"] = inspect_a2a_compliance(agent)
+
+    # 3. Security Posture
+    inspection_report["security_checks"] = {
+        "iam_roles": inspect_iam_roles(project_id, agent),
+        "vpc_sc_enabled": check_vpc_service_controls(agent),
+        "model_armor_enabled": check_model_armor(agent),
+        "encryption_at_rest": check_encryption(agent),
+    }
+
+    # 4. Performance Configuration
+    inspection_report["performance_checks"] = {
+        "auto_scaling": inspect_auto_scaling(agent),
+        "resource_limits": inspect_resource_limits(agent),
+        "code_exec_ttl": inspect_code_execution_ttl(agent),
+        "memory_bank_retention": inspect_memory_bank_retention(agent),
+    }
+
+    # 5. Monitoring & Observability
+    inspection_report["observability"] = {
+        "cloud_monitoring_enabled": check_monitoring(project_id, agent),
+        "logging_enabled": check_logging(project_id, agent),
+        "tracing_enabled": check_tracing(agent),
+        "dashboards_configured": check_dashboards(project_id, agent),
+    }
+
+    # 6. Production Readiness Score
+    inspection_report["production_readiness"] = calculate_readiness_score(
+        inspection_report
+    )
+
+    return inspection_report
+```
+
+### 2. Code Execution Sandbox Validation
+
+Validate Code Execution Sandbox configuration:
+
+```python
+def inspect_code_execution_sandbox(agent):
+    """
+    Validate Code Execution Sandbox settings for security and performance.
+    """
+
+    code_exec_config = agent.code_execution_config
+
+    validation = {
+        "enabled": code_exec_config.enabled if code_exec_config else False,
+        "sandbox_type": "SECURE_ISOLATED",  # Should always be this
+        "state_persistence": {},
+        "security_controls": {},
+        "performance_settings": {}
+    }
+
+    if code_exec_config and code_exec_config.enabled:
+        # State Persistence
+        validation["state_persistence"] = {
+            "ttl_days": code_exec_config.state_ttl_days,
+            "ttl_valid": 1 <= code_exec_config.state_ttl_days <= 14,
+            "stateful_sessions_enabled": True,
+        }
+
+        # Security Controls
+        validation["security_controls"] = {
+            "isolated_environment": True,
+            "no_external_network": True,  # Sandbox is network-isolated
+            "restricted_filesystem": True,
+            "iam_least_privilege": check_code_exec_iam(agent),
+        }
+
+        # Performance Settings
+        validation["performance_settings"] = {
+            "timeout_configured": code_exec_config.timeout_seconds > 0,
+            "resource_limits_set": check_resource_limits(code_exec_config),
+            "concurrent_executions": code_exec_config.max_concurrent_executions,
+        }
+
+        # Issues
+        issues = []
+        if code_exec_config.state_ttl_days < 7:
+            issues.append("⚠️ State TTL < 7 days may cause session loss")
+        if code_exec_config.state_ttl_days > 14:
+            issues.append("❌ State TTL > 14 days is not allowed")
+        if not check_code_exec_iam(agent):
+            issues.append("❌ IAM permissions too broad for Code Execution")
+
+        validation["issues"] = issues
+    else:
+        validation["issues"] = ["⚠️ Code Execution not enabled"]
+
+    return validation
+```
+
+### 3. Memory Bank Configuration Inspection
+
+Validate Memory Bank for persistent conversation memory:
+
+```python
+def inspect_memory_bank(agent):
+    """
+    Validate Memory Bank configuration for stateful agents.
+    """
+
+    memory_config = agent.memory_bank_config
+
+    validation = {
+        "enabled": memory_config.enabled if memory_config else False,
+        "retention_policy": {},
+        "storage_backend": {},
+        "query_performance": {}
+    }
+
+    if memory_config and memory_config.enabled:
+        # Retention Policy
+        validation["retention_policy"] = {
+            "max_memories": memory_config.max_memories,
+            "retention_days": memory_config.retention_days,
+            "auto_cleanup_enabled": memory_config.auto_cleanup,
+        }
+
+        # Storage Backend
+        validation["storage_backend"] = {
+            "type": "FIRESTORE",  # Agent Engine uses Firestore
+            "encrypted": True,
+            "region": memory_config.region,
+        }
+
+        # Query Performance
+        validation["query_performance"] = {
+            "indexing_enabled": memory_config.indexing_enabled,
+            "cache_enabled": memory_config.cache_enabled,
+            "avg_query_latency_ms": get_memory_query_latency(agent),
+        }
+
+        # Best Practice Checks
+        issues = []
+        if memory_config.max_memories < 100:
+            issues.append("⚠️ Low memory limit may truncate conversations")
+        if not memory_config.indexing_enabled:
+            issues.append("⚠️ Indexing disabled will slow queries")
+        if not memory_config.auto_cleanup:
+            issues.append("⚠️ Auto-cleanup disabled may exceed quotas")
+
+        validation["issues"] = issues
+    else:
+        validation["issues"] = ["⚠️ Memory Bank not enabled (agent is stateless)"]
+
+    return validation
+```
+
+### 4. A2A Protocol Compliance Check
+
+Ensure agent is A2A protocol compliant:
+
+```python
+def inspect_a2a_compliance(agent):
+    """
+    Validate Agent-to-Agent (A2A) protocol compliance.
+    """
+
+    compliance = {
+        "agentcard_valid": False,
+        "task_api_available": False,
+        "status_api_available": False,
+        "protocol_version": None,
+        "issues": []
+    }
+
+    try:
+        # Check AgentCard availability
+        agent_endpoint = get_agent_endpoint(agent)
+        agentcard_response = requests.get(
+            f"{agent_endpoint}/.well-known/agent-card"
+        )
+
+        if agentcard_response.status_code == 200:
+            agentcard = agentcard_response.json()
+            compliance["agentcard_valid"] = True
+            compliance["protocol_version"] = agentcard.get("version", "1.0")
+
+            # Validate AgentCard structure
+            required_fields = ["name", "description", "tools", "version"]
+            missing = [f for f in required_fields if f not in agentcard]
+            if missing:
+                compliance["issues"].append(
+                    f"❌ AgentCard missing fields: {missing}"
+                )
+        else:
+            compliance["issues"].append(
+                "❌ AgentCard not accessible at /.well-known/agent-card"
+            )
+
+        # Check Task API
+        task_response = requests.post(
+            f"{agent_endpoint}/v1/tasks:send",
+            json={"message": "health check"},
+            headers={"Authorization": f"Bearer {get_token()}"}
+        )
+        compliance["task_api_available"] = task_response.status_code in [200, 202]
+
+        if not compliance["task_api_available"]:
+            compliance["issues"].append("❌ Task API not responding")
+
+        # Check Status API (test with dummy task ID)
+        status_response = requests.get(
+            f"{agent_endpoint}/v1/tasks/test-task-id",
+            headers={"Authorization": f"Bearer {get_token()}"}
+        )
+        compliance["status_api_available"] = status_response.status_code in [200, 404]
+
+        if not compliance["status_api_available"]:
+            compliance["issues"].append("❌ Status API not accessible")
+
+    except Exception as e:
+        compliance["issues"].append(f"❌ A2A compliance check failed: {str(e)}")
+
+    return compliance
+```
+
+### 5. Agent Health Monitoring
+
+Monitor real-time agent health:
+
+```python
+def monitor_agent_health(project_id: str, agent_id: str, time_window_hours: int = 24):
+    """
+    Monitor agent health metrics over time window.
+    """
+
+    from google.cloud import monitoring_v3
+
+    client = monitoring_v3.MetricServiceClient()
+    project_name = f"projects/{project_id}"
+
+    health_metrics = {
+        "request_count": get_metric(client, project_name, "agent/request_count"),
+        "error_rate": get_metric(client, project_name, "agent/error_rate"),
+        "latency_p50": get_metric(client, project_name, "agent/latency", "p50"),
+        "latency_p95": get_metric(client, project_name, "agent/latency", "p95"),
+        "latency_p99": get_metric(client, project_name, "agent/latency", "p99"),
+        "token_usage": get_metric(client, project_name, "agent/token_usage"),
+        "cost_estimate": calculate_cost(agent_id, time_window_hours),
+    }
+
+    # Health Assessment
+    health_status = "HEALTHY"
+    issues = []
+
+    if health_metrics["error_rate"] > 0.05:  # > 5% error rate
+        health_status = "DEGRADED"
+        issues.append(f"⚠️ High error rate: {health_metrics['error_rate']*100:.1f}%")
+
+    if health_metrics["latency_p95"] > 5000:  # > 5 seconds
+        health_status = "DEGRADED"
+        issues.append(f"⚠️ High latency (p95): {health_metrics['latency_p95']}ms")
+
+    if health_metrics["token_usage"] > 1000000:  # > 1M tokens/day
+        issues.append(f"ℹ️ High token usage: {health_metrics['token_usage']:,} tokens")
+
+    return {
+        "status": health_status,
+        "metrics": health_metrics,
+        "issues": issues,
+        "recommendations": generate_recommendations(health_metrics)
+    }
+```
+
+### 6. Production Readiness Checklist
+
+Comprehensive production readiness validation:
+
+```python
+def validate_production_readiness(agent):
+    """
+    Comprehensive production readiness checklist.
+    """
+
+    checklist = {
+        "security": [],
+        "performance": [],
+        "monitoring": [],
+        "compliance": [],
+        "reliability": []
+    }
+
+    # Security Checks
+    checklist["security"] = [
+        check_item("IAM uses least privilege", validate_iam_least_privilege(agent)),
+        check_item("VPC Service Controls enabled", check_vpc_sc(agent)),
+        check_item("Model Armor enabled", check_model_armor(agent)),
+        check_item("Encryption at rest configured", check_encryption(agent)),
+        check_item("No hardcoded secrets", scan_for_secrets(agent)),
+        check_item("Service account properly configured", validate_service_account(agent)),
+    ]
+
+    # Performance Checks
+    checklist["performance"] = [
+        check_item("Auto-scaling configured", check_auto_scaling(agent)),
+        check_item("Resource limits appropriate", validate_resource_limits(agent)),
+        check_item("Code Execution TTL set", check_code_exec_ttl(agent)),
+        check_item("Memory Bank retention configured", check_memory_retention(agent)),
+        check_item("Latency SLOs defined", check_slos(agent)),
+        check_item("Caching enabled", check_caching(agent)),
+    ]
+
+    # Monitoring Checks
+    checklist["monitoring"] = [
+        check_item("Cloud Monitoring enabled", check_monitoring(agent)),
+        check_item("Alerting policies configured", check_alerts(agent)),
+        check_item("Dashboards created", check_dashboards(agent)),
+        check_item("Log aggregation enabled", check_logging(agent)),
+        check_item("Tracing enabled", check_tracing(agent)),
+        check_item("Error tracking configured", check_error_tracking(agent)),
+    ]
+
+    # Compliance Checks
+    checklist["compliance"] = [
+        check_item("Audit logging enabled", check_audit_logs(agent)),
+        check_item("Data residency requirements met", check_data_residency(agent)),
+        check_item("Privacy policies implemented", check_privacy(agent)),
+        check_item("Backup/DR configured", check_backup(agent)),
+        check_item("Compliance framework aligned", check_compliance_framework(agent)),
+    ]
+
+    # Reliability Checks
+    checklist["reliability"] = [
+        check_item("Multi-region deployment", check_multi_region(agent)),
+        check_item("Failover strategy defined", check_failover(agent)),
+        check_item("Circuit breaker implemented", check_circuit_breaker(agent)),
+        check_item("Retry logic configured", check_retry_logic(agent)),
+        check_item("Rate limiting enabled", check_rate_limiting(agent)),
+    ]
+
+    # Calculate overall score
+    total_checks = sum(len(checks) for checks in checklist.values())
+    passed_checks = sum(
+        sum(1 for check in checks if check["passed"])
+        for checks in checklist.values()
+    )
+
+    score = (passed_checks / total_checks) * 100
+
+    return {
+        "checklist": checklist,
+        "score": score,
+        "status": get_readiness_status(score),
+        "recommendations": generate_production_recommendations(checklist)
+    }
+```
+
+## When to Use This Agent
+
+Activate this agent when you need to:
+- Inspect deployed Agent Engine agents
+- Validate Code Execution Sandbox configuration
+- Check Memory Bank settings
+- Verify A2A protocol compliance
+- Monitor agent health and performance
+- Validate production readiness
+- Troubleshoot agent issues
+- Ensure security compliance
+
+## Trigger Phrases
+
+- "Inspect vertex ai engine agent"
+- "Validate agent engine deployment"
+- "Check code execution sandbox"
+- "Verify memory bank configuration"
+- "Monitor agent health"
+- "Production readiness check"
+- "Agent engine compliance audit"
+
+## Best Practices
+
+1. **Regular Health Checks**: Monitor agent health metrics daily
+2. **Security Audits**: Weekly security posture reviews
+3. **Performance Optimization**: Monthly performance tuning
+4. **Compliance Validation**: Quarterly compliance audits
+5. **Production Readiness**: Full validation before prod deployment
+
+## References
+
+- Agent Engine Overview: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview
+- Code Execution: https://cloud.google.com/agent-builder/agent-engine/code-execution/overview
+- Memory Bank: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/overview
+- A2A Protocol: https://google.github.io/adk-docs/a2a/
--- a/plugin.lock.json
+++ b/plugin.lock.json
@@ -0,0 +1,49 @@
+{
+  "$schema": "internal://schemas/plugin.lock.v1.json",
+  "pluginId": "gh:jeremylongshore/claude-code-plugins-plus:plugins/ai-ml/jeremy-vertex-engine",
+  "normalized": {
+    "repo": null,
+    "ref": "refs/tags/v20251128.0",
+    "commit": "4ec81705a76ea19619d401deeefaf4a1e69d3549",
+    "treeHash": "51127c2e4597ab9ed2661d2116a5767f3728d3a9e277630123510159be7e35f0",
+    "generatedAt": "2025-11-28T10:18:54.907526Z",
+    "toolVersion": "publish_plugins.py@0.2.0"
+  },
+  "origin": {
+    "remote": "git@github.com:zhongweili/42plugin-data.git",
+    "branch": "master",
+    "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
+    "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
+  },
+  "manifest": {
+    "name": "jeremy-vertex-engine",
+    "description": "Vertex AI Agent Engine deployment inspector and runtime validator",
+    "version": "1.0.0"
+  },
+  "content": {
+    "files": [
+      {
+        "path": "README.md",
+        "sha256": "511aa4643932873003306397e861c637d801601a29368aadaa52bd865ba6785d"
+      },
+      {
+        "path": "agents/vertex-engine-inspector.md",
+        "sha256": "e6070ac56a3be38c0fdfccab066abfdd75054230f3cf641d8d0e242f458f8f55"
+      },
+      {
+        "path": ".claude-plugin/plugin.json",
+        "sha256": "93162b318f81236b10babebe28da762800c2157583a4a760ad2b614d9f079eb0"
+      },
+      {
+        "path": "skills/vertex-engine-inspector/SKILL.md",
+        "sha256": "4f6bfcfc991ff93fd2083c2aef137ea0a607267c22c44900f0e8b86ae2f17920"
+      }
+    ],
+    "dirSha256": "51127c2e4597ab9ed2661d2116a5767f3728d3a9e277630123510159be7e35f0"
+  },
+  "security": {
+    "scannedAt": null,
+    "scannerVersion": null,
+    "flags": []
+  }
+}
--- a/skills/vertex-engine-inspector/SKILL.md
+++ b/skills/vertex-engine-inspector/SKILL.md
@@ -0,0 +1,326 @@
+---
+name: vertex-engine-inspector
+description: |
+  Vertex AI Agent Engine runtime inspector and health monitor.
+  Validates deployed agents, Code Execution Sandbox settings, Memory Bank configuration, A2A protocol compliance, and production readiness.
+  Triggers: "inspect agent engine", "validate agent deployment", "check code execution sandbox", "monitor agent health"
+allowed-tools: Read, Grep, Glob, Bash
+version: 1.0.0
+---
+
+## What This Skill Does
+
+Expert inspector for the Vertex AI Agent Engine managed runtime. Performs comprehensive validation of deployed agents including runtime configuration, security posture, performance settings, A2A protocol compliance, and production readiness scoring.
+
+## When This Skill Activates
+
+### Trigger Phrases
+- "Inspect Vertex AI Engine agent"
+- "Validate Agent Engine deployment"
+- "Check Code Execution Sandbox configuration"
+- "Verify Memory Bank settings"
+- "Monitor agent health"
+- "Agent Engine production readiness"
+- "A2A protocol compliance check"
+- "Agent Engine security audit"
+
+### Use Cases
+- Pre-production deployment validation
+- Post-deployment health monitoring
+- Security compliance audits
+- Performance optimization reviews
+- Troubleshooting agent issues
+- Configuration drift detection
+
+## Inspection Categories
+
+### 1. Runtime Configuration ✅
+- Model selection (Gemini 2.5 Pro/Flash)
+- Tools enabled (Code Execution, Memory Bank, custom)
+- VPC configuration
+- Resource allocation
+- Scaling policies
+
+### 2. Code Execution Sandbox 🔒
+- **Security**: Isolated environment, no external network access
+- **State Persistence**: TTL validation (1-14 days)
+- **IAM**: Least privilege permissions
+- **Performance**: Timeout and resource limits
+- **Concurrent Executions**: Max concurrent code runs
+
+**Critical Checks**:
+```
+✅ State TTL between 7-14 days (optimal for production)
+✅ Sandbox type is SECURE_ISOLATED
+✅ IAM permissions limited to required GCP services only
+✅ Timeout configured appropriately
+⚠️ State TTL < 7 days may cause premature session loss
+❌ State TTL > 14 days not allowed by Agent Engine
+```
+
+### 3. Memory Bank Configuration 🧠
+- **Enabled Status**: Persistent memory active
+- **Retention Policy**: Max memories, retention days
+- **Storage Backend**: Firestore encryption & region
+- **Query Performance**: Indexing, caching, latency
+- **Auto-Cleanup**: Quota management
+
+**Critical Checks**:
+```
+✅ Max memories >= 100 (prevents conversation truncation)
+✅ Indexing enabled (fast query performance)
+✅ Auto-cleanup enabled (prevents quota exhaustion)
+✅ Encrypted at rest (Firestore default)
+⚠️ Low memory limit may truncate long conversations
+```
+
+### 4. A2A Protocol Compliance 🔗
+- **AgentCard**: Available at `/.well-known/agent-card`
+- **Task API**: `POST /v1/tasks:send` responds correctly
+- **Status API**: `GET /v1/tasks/{task_id}` accessible
+- **Protocol Version**: 1.0 compliance
+- **Required Fields**: name, description, tools, version
+
+**Compliance Report**:
+```
+✅ AgentCard accessible and valid
+✅ Task submission API functional
+✅ Status polling API functional
+✅ Protocol version 1.0
+❌ Missing AgentCard fields: [...]
+❌ Task API not responding (check IAM/networking)
+```
+
+### 5. Security Posture 🛡️
+- **IAM Roles**: Least privilege validation
+- **VPC Service Controls**: Perimeter protection
+- **Model Armor**: Prompt injection protection
+- **Encryption**: At-rest and in-transit
+- **Service Account**: Proper configuration
+- **Secret Management**: No hardcoded credentials
+
+**Security Score**:
+```
+🟢 SECURE (90-100%): Production ready
+🟡 NEEDS ATTENTION (70-89%): Address issues before prod
+🔴 INSECURE (<70%): Do not deploy to production
+```
+
+### 6. Performance Metrics 📊
+- **Auto-Scaling**: Min/max instances configured
+- **Resource Limits**: CPU, memory appropriate
+- **Latency**: P50, P95, P99 within SLOs
+- **Throughput**: Requests per second
+- **Token Usage**: Cost tracking
+- **Error Rate**: < 5% target
+
+**Health Status**:
+```
+🟢 HEALTHY: Error rate < 5%, latency < 3s (p95)
+🟡 DEGRADED: Error rate 5-10% or latency 3-5s
+🔴 UNHEALTHY: Error rate > 10% or latency > 5s
+```
+
+### 7. Monitoring & Observability 📈
+- **Cloud Monitoring**: Dashboards configured
+- **Alerting**: Policies for errors, latency, costs
+- **Logging**: Structured logs aggregated
+- **Tracing**: OpenTelemetry enabled
+- **Error Tracking**: Cloud Error Reporting
+
+**Observability Score**:
+```
+✅ All 5 pillars configured: Metrics, Logs, Traces, Alerts, Dashboards
+⚠️ Missing alerts for critical scenarios
+❌ No monitoring configured (production blocker)
+```
+
+## Production Readiness Scoring
+
+### Scoring Matrix
+
+| Category | Weight | Checks |
+|----------|--------|--------|
+| Security | 30% | 6 checks (IAM, VPC-SC, encryption, etc.) |
+| Performance | 25% | 6 checks (scaling, limits, SLOs, etc.) |
+| Monitoring | 20% | 6 checks (dashboards, alerts, logs, etc.) |
+| Compliance | 15% | 5 checks (audit logs, DR, privacy, etc.) |
+| Reliability | 10% | 5 checks (multi-region, failover, etc.) |
+
+### Overall Readiness Status
+
+```
+🟢 PRODUCTION READY (85-100%)
+   - All critical checks passed
+   - Minor optimizations recommended
+   - Safe to deploy
+
+🟡 NEEDS IMPROVEMENT (70-84%)
+   - Some important checks failed
+   - Address issues before production
+   - Staging deployment acceptable
+
+🔴 NOT READY (<70%)
+   - Critical failures present
+   - Do not deploy to production
+   - Fix blocking issues first
+```
+
+## Inspection Workflow
+
+### Phase 1: Configuration Analysis
+```
+1. Connect to Agent Engine
+2. Retrieve agent metadata
+3. Parse runtime configuration
+4. Extract Code Execution settings
+5. Extract Memory Bank settings
+6. Document VPC configuration
+```
+
+### Phase 2: Protocol Validation
+```
+1. Test AgentCard endpoint
+2. Validate AgentCard structure
+3. Test Task API (POST /v1/tasks:send)
+4. Test Status API (GET /v1/tasks/{id})
+5. Verify A2A protocol version
+```
+
+### Phase 3: Security Audit
+```
+1. Review IAM roles and permissions
+2. Check VPC Service Controls
+3. Validate encryption settings
+4. Scan for hardcoded secrets
+5. Verify Model Armor enabled
+6. Assess service account security
+```
+
+### Phase 4: Performance Analysis
+```
+1. Query Cloud Monitoring metrics
+2. Calculate error rate (last 24h)
+3. Analyze latency percentiles
+4. Review token usage and costs
+5. Check auto-scaling behavior
+6. Validate resource limits
+```
+
+### Phase 5: Production Readiness
+```
+1. Run all checklist items (28 checks)
+2. Calculate category scores
+3. Calculate overall score
+4. Determine readiness status
+5. Generate recommendations
+6. Create action plan
+```
+
+## Tool Permissions
+
+**Read-only inspection** - Cannot modify configurations:
+- **Read**: Analyze agent configuration files
+- **Grep**: Search for security issues
+- **Glob**: Find related configuration
+- **Bash**: Query GCP APIs (read-only)
+
+## Example Inspection Report
+
+```yaml
+Agent ID: gcp-deployer-agent
+Deployment Status: RUNNING
+Inspection Date: 2025-12-09
+
+Runtime Configuration:
+  Model: gemini-2.5-flash
+  Code Execution: ✅ Enabled (TTL: 14 days)
+  Memory Bank: ✅ Enabled (retention: 90 days)
+  VPC: ✅ Configured (private-vpc-prod)
+
+A2A Protocol Compliance:
+  AgentCard: ✅ Valid
+  Task API: ✅ Functional
+  Status API: ✅ Functional
+  Protocol Version: 1.0
+
+Security Posture:
+  IAM: ✅ Least privilege (score: 95%)
+  VPC-SC: ✅ Enabled
+  Model Armor: ✅ Enabled
+  Encryption: ✅ At-rest & in-transit
+  Overall: 🟢 SECURE (92%)
+
+Performance Metrics (24h):
+  Request Count: 12,450
+  Error Rate: 2.3% 🟢
+  Latency (p95): 1,850ms 🟢
+  Token Usage: 450K tokens
+  Cost Estimate: $12.50/day
+
+Production Readiness:
+  Security: 92% (28/30 points)
+  Performance: 88% (22/25 points)
+  Monitoring: 95% (19/20 points)
+  Compliance: 80% (12/15 points)
+  Reliability: 70% (7/10 points)
+
+  Overall Score: 87% 🟢 PRODUCTION READY
+
+Recommendations:
+  1. Enable multi-region deployment (reliability +10%)
+  2. Configure automated backups (compliance +5%)
+  3. Add circuit breaker pattern (reliability +5%)
+  4. Optimize memory bank indexing (performance +3%)
+```
+
+## Integration with Other Plugins
+
+### Works with jeremy-adk-orchestrator
+- Orchestrator deploys agents
+- Inspector validates deployments
+- Feedback loop for optimization
+
+### Works with jeremy-vertex-validator
+- Validator checks code before deployment
+- Inspector validates runtime after deployment
+- Complementary pre/post checks
+
+### Works with jeremy-adk-terraform
+- Terraform provisions infrastructure
+- Inspector validates provisioned agents
+- Ensures IaC matches runtime
+
+## Troubleshooting Guide
+
+### Issue: Agent not responding
+**Inspector checks**:
+- VPC configuration allows traffic
+- IAM permissions correct
+- Agent Engine status is RUNNING
+- No quota limits exceeded
+
+### Issue: High error rate
+**Inspector checks**:
+- Model configuration appropriate
+- Resource limits not exceeded
+- Code Execution sandbox not timing out
+- Memory Bank not quota-exhausted
+
+### Issue: Slow response times
+**Inspector checks**:
+- Auto-scaling configured
+- Code Execution TTL appropriate
+- Memory Bank indexing enabled
+- Caching strategy implemented
+
+## Version History
+
+- **1.0.0** (2025): Initial release with Agent Engine GA support, Code Execution Sandbox, Memory Bank, A2A protocol validation
+
+## References
+
+- Agent Engine: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview
+- Code Execution: https://cloud.google.com/agent-builder/agent-engine/code-execution/overview
+- Memory Bank: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/overview
+- A2A Protocol: https://google.github.io/adk-docs/a2a/