Initial commit

2025-11-29 18:42:29 +08:00
commit 5d9c5c1010
21 changed files with 5694 additions and 0 deletions
--- a/skills/k8s-manifest-generator/SKILL.md
+++ b/skills/k8s-manifest-generator/SKILL.md
@@ -0,0 +1,511 @@
+---
+name: k8s-manifest-generator
+description: Create production-ready Kubernetes manifests for Deployments, Services, ConfigMaps, and Secrets following best practices and security standards. Use when generating Kubernetes YAML manifests, creating K8s resources, or implementing production-grade Kubernetes configurations.
+---
+
+# Kubernetes Manifest Generator
+
+Step-by-step guidance for creating production-ready Kubernetes manifests including Deployments, Services, ConfigMaps, Secrets, and PersistentVolumeClaims.
+
+## Purpose
+
+This skill provides comprehensive guidance for generating well-structured, secure, and production-ready Kubernetes manifests following cloud-native best practices and Kubernetes conventions.
+
+## When to Use This Skill
+
+Use this skill when you need to:
+- Create new Kubernetes Deployment manifests
+- Define Service resources for network connectivity
+- Generate ConfigMap and Secret resources for configuration management
+- Create PersistentVolumeClaim manifests for stateful workloads
+- Follow Kubernetes best practices and naming conventions
+- Implement resource limits, health checks, and security contexts
+- Design manifests for multi-environment deployments
+
+## Step-by-Step Workflow
+
+### 1. Gather Requirements
+
+**Understand the workload:**
+- Application type (stateless/stateful)
+- Container image and version
+- Environment variables and configuration needs
+- Storage requirements
+- Network exposure requirements (internal/external)
+- Resource requirements (CPU, memory)
+- Scaling requirements
+- Health check endpoints
+
+**Questions to ask:**
+- What is the application name and purpose?
+- What container image and tag will be used?
+- Does the application need persistent storage?
+- What ports does the application expose?
+- Are there any secrets or configuration files needed?
+- What are the CPU and memory requirements?
+- Does the application need to be exposed externally?
+
+### 2. Create Deployment Manifest
+
+**Follow this structure:**
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: <app-name>
+  namespace: <namespace>
+  labels:
+    app: <app-name>
+    version: <version>
+spec:
+  replicas: 3
+  selector:
+    matchLabels:
+      app: <app-name>
+  template:
+    metadata:
+      labels:
+        app: <app-name>
+        version: <version>
+    spec:
+      containers:
+      - name: <container-name>
+        image: <image>:<tag>
+        ports:
+        - containerPort: <port>
+          name: http
+        resources:
+          requests:
+            memory: "256Mi"
+            cpu: "250m"
+          limits:
+            memory: "512Mi"
+            cpu: "500m"
+        livenessProbe:
+          httpGet:
+            path: /health
+            port: http
+          initialDelaySeconds: 30
+          periodSeconds: 10
+        readinessProbe:
+          httpGet:
+            path: /ready
+            port: http
+          initialDelaySeconds: 5
+          periodSeconds: 5
+        env:
+        - name: ENV_VAR
+          value: "value"
+        envFrom:
+        - configMapRef:
+            name: <app-name>-config
+        - secretRef:
+            name: <app-name>-secret
+```
+
+**Best practices to apply:**
+- Always set resource requests and limits
+- Implement both liveness and readiness probes
+- Use specific image tags (never `:latest`)
+- Apply a security context so containers run as a non-root user
+- Use labels for organization and selection
+- Set appropriate replica count based on availability needs
+
+**Reference:** See `references/deployment-spec.md` for detailed deployment options
+
+### 3. Create Service Manifest
+
+**Choose the appropriate Service type:**
+
+**ClusterIP (internal only):**
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: <app-name>
+  namespace: <namespace>
+  labels:
+    app: <app-name>
+spec:
+  type: ClusterIP
+  selector:
+    app: <app-name>
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+    protocol: TCP
+```
+
+**LoadBalancer (external access):**
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: <app-name>
+  namespace: <namespace>
+  labels:
+    app: <app-name>
+  annotations:
+    service.beta.kubernetes.io/aws-load-balancer-type: nlb
+spec:
+  type: LoadBalancer
+  selector:
+    app: <app-name>
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+    protocol: TCP
+```
+
+**Reference:** See `references/service-spec.md` for service types and networking
+
+### 4. Create ConfigMap
+
+**For application configuration:**
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: <app-name>-config
+  namespace: <namespace>
+data:
+  APP_MODE: production
+  LOG_LEVEL: info
+  DATABASE_HOST: db.example.com
+  # For config files
+  app.properties: |
+    server.port=8080
+    server.host=0.0.0.0
+    logging.level=INFO
+```
+
+**Best practices:**
+- Use ConfigMaps for non-sensitive data only
+- Organize related configuration together
+- Use meaningful names for keys
+- Consider using one ConfigMap per component
+- Version ConfigMaps when making changes (for example, with a name suffix), because running pods are not restarted when a ConfigMap changes
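+
+One way to version a ConfigMap is sketched below, assuming a hypothetical `-v2` name suffix together with the `immutable` field. Because the name changes, the Deployment's `configMapRef` has to point at the new name, which triggers a normal rollout.
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: <app-name>-config-v2  # Hypothetical suffix; bump it on every change
+  namespace: <namespace>
+immutable: true  # Rejects in-place edits; create a new version instead
+data:
+  APP_MODE: production
+  LOG_LEVEL: debug
+```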
+
+**Reference:** See `assets/configmap-template.yaml` for examples
+
+### 5. Create Secret
+
+**For sensitive data:**
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: <app-name>-secret
+  namespace: <namespace>
+type: Opaque
+stringData:
+  DATABASE_PASSWORD: "changeme"
+  API_KEY: "secret-api-key"
+  # For certificate files
+  tls.crt: |
+    -----BEGIN CERTIFICATE-----
+    ...
+    -----END CERTIFICATE-----
+  tls.key: |
+    -----BEGIN PRIVATE KEY-----
+    ...
+    -----END PRIVATE KEY-----
+```
+
+**Security considerations:**
+- Never commit secrets to Git in plain text
+- Use Sealed Secrets, External Secrets Operator, or Vault
+- Rotate secrets regularly
+- Use RBAC to limit secret access
+- Consider using Secret type: `kubernetes.io/tls` for TLS secrets
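+
+A minimal sketch of a `kubernetes.io/tls` Secret follows. The `-tls` name suffix is an assumption for illustration; this Secret type requires both the `tls.crt` and `tls.key` keys.
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: <app-name>-tls  # Hypothetical name
+  namespace: <namespace>
+type: kubernetes.io/tls
+stringData:
+  tls.crt: |
+    -----BEGIN CERTIFICATE-----
+    ...
+    -----END CERTIFICATE-----
+  tls.key: |
+    -----BEGIN PRIVATE KEY-----
+    ...
+    -----END PRIVATE KEY-----
+```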
+
+### 6. Create PersistentVolumeClaim (if needed)
+
+**For stateful applications:**
+
+```yaml
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: <app-name>-data
+  namespace: <namespace>
+spec:
+  accessModes:
+  - ReadWriteOnce
+  storageClassName: gp3  # Example name; use a StorageClass that exists in your cluster
+  resources:
+    requests:
+      storage: 10Gi
+```
+
+**Mount in Deployment:**
+```yaml
+spec:
+  template:
+    spec:
+      containers:
+      - name: app
+        volumeMounts:
+        - name: data
+          mountPath: /var/lib/app
+      volumes:
+      - name: data
+        persistentVolumeClaim:
+          claimName: <app-name>-data
+```
+
+**Storage considerations:**
+- Choose appropriate StorageClass for performance needs
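+- Use ReadWriteOnce when the volume is mounted by a single node; use ReadWriteOncePod to restrict access to a single pod
+- Use ReadWriteMany for storage shared by pods on multiple nodes (the StorageClass must support it)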
+- Consider backup strategies
+- Set appropriate retention policies
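+
+Retention is usually controlled by the `reclaimPolicy` of the StorageClass. The sketch below assumes an AWS cluster running the EBS CSI driver (`ebs.csi.aws.com`) and a hypothetical class name `gp3-retain`; substitute the provisioner and parameters for your platform.
+
+```yaml
+apiVersion: storage.k8s.io/v1
+kind: StorageClass
+metadata:
+  name: gp3-retain  # Hypothetical name
+provisioner: ebs.csi.aws.com  # Assumes the AWS EBS CSI driver is installed
+parameters:
+  type: gp3
+reclaimPolicy: Retain  # Keep the underlying volume after the PVC is deleted
+volumeBindingMode: WaitForFirstConsumer  # Provision in the zone where the pod is scheduled
+allowVolumeExpansion: true
+```
+
+With `Retain`, released volumes must be cleaned up manually, so pair this setting with a backup and cleanup process.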
+
+### 7. Apply Security Best Practices
+
+**Add security context to Deployment:**
+
+```yaml
+spec:
+  template:
+    spec:
+      securityContext:
+        runAsNonRoot: true
+        runAsUser: 1000
+        fsGroup: 1000
+        seccompProfile:
+          type: RuntimeDefault
+      containers:
+      - name: app
+        securityContext:
+          allowPrivilegeEscalation: false
+          readOnlyRootFilesystem: true
+          capabilities:
+            drop:
+            - ALL
+```
+
+**Security checklist:**
+- [ ] Run as non-root user
+- [ ] Drop all capabilities
+- [ ] Use read-only root filesystem
+- [ ] Disable privilege escalation
+- [ ] Set seccomp profile
+- [ ] Use Pod Security Standards
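+
+Pod Security Standards can be applied per namespace through Pod Security Admission labels. The sketch below assumes the `restricted` profile is the target; `baseline` is a less strict alternative.
+
+```yaml
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: <namespace>
+  labels:
+    pod-security.kubernetes.io/enforce: restricted  # Reject pods that violate the profile
+    pod-security.kubernetes.io/audit: restricted    # Record violations in the audit log
+    pod-security.kubernetes.io/warn: restricted     # Show warnings to the client on apply
+```
+
+The security context shown above in this step is intended to satisfy the `restricted` profile; confirm with a server-side dry run against the labeled namespace.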
+
+### 8. Add Labels and Annotations
+
+**Standard labels (recommended):**
+
+```yaml
+metadata:
+  labels:
+    app.kubernetes.io/name: <app-name>
+    app.kubernetes.io/instance: <instance-name>
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/component: backend
+    app.kubernetes.io/part-of: <system-name>
+    app.kubernetes.io/managed-by: kubectl
+```
+
+**Useful annotations:**
+
+```yaml
+metadata:
+  annotations:
+    description: "Application description"
+    contact: "team@example.com"
+    prometheus.io/scrape: "true"
+    prometheus.io/port: "9090"
+    prometheus.io/path: "/metrics"
+```
+
+### 9. Organize Multi-Resource Manifests
+
+**File organization options:**
+
+**Option 1: Single file with `---` separator**
+```yaml
+# app-name.yaml
+---
+apiVersion: v1
+kind: ConfigMap
+...
+---
+apiVersion: v1
+kind: Secret
+...
+---
+apiVersion: apps/v1
+kind: Deployment
+...
+---
+apiVersion: v1
+kind: Service
+...
+```
+
+**Option 2: Separate files**
+```
+manifests/
+├── configmap.yaml
+├── secret.yaml
+├── deployment.yaml
+├── service.yaml
+└── pvc.yaml
+```
+
+**Option 3: Kustomize structure**
+```
+base/
+├── kustomization.yaml
+├── deployment.yaml
+├── service.yaml
+└── configmap.yaml
+overlays/
+├── dev/
+│   └── kustomization.yaml
+└── prod/
+    └── kustomization.yaml
+```
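+
+For Option 3, the two `kustomization.yaml` files might look like the sketch below. The `production` namespace and the replica count of 5 are illustrative assumptions, and the file name comments mark where each document would live.
+
+```yaml
+# base/kustomization.yaml
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+resources:
+- deployment.yaml
+- service.yaml
+- configmap.yaml
+---
+# overlays/prod/kustomization.yaml
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+namespace: production  # Assumed namespace for this overlay
+resources:
+- ../../base
+replicas:
+- name: <app-name>  # Must match the Deployment name in the base
+  count: 5
+```
+
+Render an overlay with `kubectl kustomize overlays/prod`, or apply it with `kubectl apply -k overlays/prod`.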
+
+### 10. Validate and Test
+
+**Validation steps:**
+
+```bash
+# Dry-run validation
+kubectl apply -f manifest.yaml --dry-run=client
+
+# Server-side validation
+kubectl apply -f manifest.yaml --dry-run=server
+
+# Validate with kubeval (no longer maintained; kubeconform is a maintained alternative)
+kubeval manifest.yaml
+
+# Validate with kube-score
+kube-score score manifest.yaml
+
+# Check with kube-linter
+kube-linter lint manifest.yaml
+```
+
+**Testing checklist:**
+- [ ] Manifest passes dry-run validation
+- [ ] All required fields are present
+- [ ] Resource limits are reasonable
+- [ ] Health checks are configured
+- [ ] Security context is set
+- [ ] Labels follow conventions
+- [ ] Namespace exists or is created
+
+## Common Patterns
+
+### Pattern 1: Simple Stateless Web Application
+
+**Use case:** Standard web API or microservice
+
+**Components needed:**
+- Deployment (3 replicas for HA)
+- ClusterIP Service
+- ConfigMap for configuration
+- Secret for API keys
+- HorizontalPodAutoscaler (optional)
+
+**Reference:** See `assets/deployment-template.yaml`
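+
+A minimal sketch of the optional HorizontalPodAutoscaler follows. The replica bounds and the 70% CPU target are illustrative assumptions. CPU utilization is measured against the container's CPU request, and the cluster needs a metrics source such as metrics-server.
+
+```yaml
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: <app-name>
+  namespace: <namespace>
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: <app-name>
+  minReplicas: 3   # Keeps the HA baseline from the Deployment
+  maxReplicas: 10  # Illustrative upper bound
+  metrics:
+  - type: Resource
+    resource:
+      name: cpu
+      target:
+        type: Utilization
+        averageUtilization: 70  # Percent of the requested CPU
+```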
+
+### Pattern 2: Stateful Database Application
+
+**Use case:** Database or persistent storage application
+
+**Components needed:**
+- StatefulSet (not Deployment)
+- Headless Service
+- PersistentVolumeClaim template
+- ConfigMap for DB configuration
+- Secret for credentials
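+
+The StatefulSet and its PersistentVolumeClaim template could be sketched as follows, assuming a headless Service named `<app-name>-headless` and reusing port 9042 and the 10Gi size from the other examples as placeholders.
+
+```yaml
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: <app-name>
+  namespace: <namespace>
+spec:
+  serviceName: <app-name>-headless  # Headless Service that gives each pod a stable DNS name
+  replicas: 3
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: <app-name>
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: <app-name>
+    spec:
+      containers:
+      - name: <container-name>
+        image: <image>:<tag>
+        ports:
+        - name: client
+          containerPort: 9042
+        volumeMounts:
+        - name: data
+          mountPath: /var/lib/app
+  # One PVC is created per pod, named data-<app-name>-0, data-<app-name>-1, ...
+  volumeClaimTemplates:
+  - metadata:
+      name: data
+    spec:
+      accessModes:
+      - ReadWriteOnce
+      resources:
+        requests:
+          storage: 10Gi
+```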
+
+### Pattern 3: Background Job or Cron
+
+**Use case:** Scheduled tasks or batch processing
+
+**Components needed:**
+- CronJob or Job
+- ConfigMap for job parameters
+- Secret for credentials
+- ServiceAccount with RBAC
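+
+A CronJob for this pattern might look like the sketch below. The schedule, history limits, and the `<job-name>` placeholder are assumptions for illustration; the ServiceAccount and its RBAC bindings are defined separately.
+
+```yaml
+apiVersion: batch/v1
+kind: CronJob
+metadata:
+  name: <job-name>
+  namespace: <namespace>
+spec:
+  schedule: "0 2 * * *"      # Every day at 02:00
+  concurrencyPolicy: Forbid  # Skip a run if the previous one is still active
+  successfulJobsHistoryLimit: 3
+  failedJobsHistoryLimit: 1
+  jobTemplate:
+    spec:
+      backoffLimit: 2  # Retries before the Job is marked as failed
+      template:
+        spec:
+          serviceAccountName: <job-name>
+          restartPolicy: OnFailure
+          containers:
+          - name: <container-name>
+            image: <image>:<tag>
+            envFrom:
+            - configMapRef:
+                name: <job-name>-config
+            - secretRef:
+                name: <job-name>-secret
+```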
+
+### Pattern 4: Multi-Container Pod
+
+**Use case:** Application with sidecar containers
+
+**Components needed:**
+- Deployment with multiple containers
+- Shared volumes between containers
+- Init containers for setup
+- Service (if needed)
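+
+The pod template fragment below sketches this pattern with an init container, the main container, and a sidecar sharing one `emptyDir` volume. The container names, the `/shared` path, and the `<sidecar-image>` placeholder are hypothetical.
+
+```yaml
+spec:
+  template:
+    spec:
+      initContainers:
+      - name: init-setup
+        image: busybox:1.36
+        # Runs to completion before the main containers start
+        command: ['sh', '-c', 'echo ready > /shared/ready']
+        volumeMounts:
+        - name: shared
+          mountPath: /shared
+      containers:
+      - name: app
+        image: <image>:<tag>
+        volumeMounts:
+        - name: shared
+          mountPath: /shared
+      - name: log-shipper  # Hypothetical sidecar
+        image: <sidecar-image>:<tag>
+        volumeMounts:
+        - name: shared
+          mountPath: /shared
+          readOnly: true
+      volumes:
+      - name: shared
+        emptyDir: {}  # Lives as long as the pod; not persistent
+```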
+
+## Templates
+
+The following templates are available in the `assets/` directory:
+
+- `deployment-template.yaml` - Standard deployment with best practices
+- `service-template.yaml` - Service configurations (ClusterIP, LoadBalancer, NodePort)
+- `configmap-template.yaml` - ConfigMap examples with different data types
+- `secret-template.yaml` - Secret examples (to be generated, not committed)
+- `pvc-template.yaml` - PersistentVolumeClaim templates
+
+## Reference Documentation
+
+- `references/deployment-spec.md` - Detailed Deployment specification
+- `references/service-spec.md` - Service types and networking details
+
+## Best Practices Summary
+
+1. **Always set resource requests and limits** - Prevents resource starvation
+2. **Implement health checks** - Ensures Kubernetes can manage your application
+3. **Use specific image tags** - Avoid unpredictable deployments
+4. **Apply security contexts** - Run as non-root, drop capabilities
+5. **Use ConfigMaps and Secrets** - Separate config from code
+6. **Label everything** - Enables filtering and organization
+7. **Follow naming conventions** - Use standard Kubernetes labels
+8. **Validate before applying** - Use dry-run and validation tools
+9. **Version your manifests** - Keep them under version control in Git
+10. **Document with annotations** - Add context for other developers
+
+## Troubleshooting
+
+**Pods not starting:**
+- Check image pull errors: `kubectl describe pod <pod-name>`
+- Verify resource availability: `kubectl describe nodes` (shows allocatable and allocated resources)
+- Check events: `kubectl get events --sort-by='.lastTimestamp'`
+
+**Service not accessible:**
+- Verify selector matches pod labels: `kubectl get endpoints <service-name>`
+- Check service type and port configuration
+- Test from within cluster: `kubectl run debug --rm -it --image=busybox -- sh`
+
+**ConfigMap/Secret not loading:**
+- Verify names match in Deployment
+- Check namespace
+- Ensure resources exist: `kubectl get configmap,secret`
+
+## Next Steps
+
+After creating manifests:
+1. Store in Git repository
+2. Set up CI/CD pipeline for deployment
+3. Consider using Helm or Kustomize for templating
+4. Implement GitOps with ArgoCD or Flux
+5. Add monitoring and observability
+
+## Related Skills
+
+- `helm-chart-scaffolding` - For templating and packaging
+- `gitops-workflow` - For automated deployments
+- `k8s-security-policies` - For advanced security configurations
--- a/skills/k8s-manifest-generator/assets/configmap-template.yaml
+++ b/skills/k8s-manifest-generator/assets/configmap-template.yaml
@@ -0,0 +1,296 @@
+# Kubernetes ConfigMap Templates
+
+---
+# Template 1: Simple Key-Value Configuration
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: <app-name>-config
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+    app.kubernetes.io/instance: <instance-name>
+data:
+  # Simple key-value pairs
+  APP_ENV: "production"
+  LOG_LEVEL: "info"
+  DATABASE_HOST: "db.example.com"
+  DATABASE_PORT: "5432"
+  CACHE_TTL: "3600"
+  MAX_CONNECTIONS: "100"
+
+---
+# Template 2: Configuration File
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: <app-name>-config-file
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+data:
+  # Application configuration file
+  application.yaml: |
+    server:
+      port: 8080
+      host: 0.0.0.0
+
+    logging:
+      level: INFO
+      format: json
+
+    database:
+      host: db.example.com
+      port: 5432
+      pool_size: 20
+      timeout: 30
+
+    cache:
+      enabled: true
+      ttl: 3600
+      max_entries: 10000
+
+    features:
+      new_ui: true
+      beta_features: false
+
+---
+# Template 3: Multiple Configuration Files
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: <app-name>-multi-config
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+data:
+  # Nginx configuration
+  nginx.conf: |
+    user nginx;
+    worker_processes auto;
+    error_log /var/log/nginx/error.log warn;
+    pid /var/run/nginx.pid;
+
+    events {
+      worker_connections 1024;
+    }
+
+    http {
+      include /etc/nginx/mime.types;
+      default_type application/octet-stream;
+
+      log_format main '$remote_addr - $remote_user [$time_local] "$request" '
+                      '$status $body_bytes_sent "$http_referer" '
+                      '"$http_user_agent" "$http_x_forwarded_for"';
+
+      access_log /var/log/nginx/access.log main;
+      sendfile on;
+      keepalive_timeout 65;
+
+      include /etc/nginx/conf.d/*.conf;
+    }
+
+  # Default site configuration
+  default.conf: |
+    server {
+      listen 80;
+      server_name _;
+
+      location / {
+        proxy_pass http://backend:8080;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Proto $scheme;
+      }
+
+      location /health {
+        access_log off;
+        return 200 "healthy\n";
+      }
+    }
+
+---
+# Template 4: JSON Configuration
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: <app-name>-json-config
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+data:
+  config.json: |
+    {
+      "server": {
+        "port": 8080,
+        "host": "0.0.0.0",
+        "timeout": 30
+      },
+      "database": {
+        "host": "postgres.example.com",
+        "port": 5432,
+        "database": "myapp",
+        "pool": {
+          "min": 2,
+          "max": 20
+        }
+      },
+      "redis": {
+        "host": "redis.example.com",
+        "port": 6379,
+        "db": 0
+      },
+      "features": {
+        "auth": true,
+        "metrics": true,
+        "tracing": true
+      }
+    }
+
+---
+# Template 5: Environment-Specific Configuration
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: <app-name>-prod-config
+  namespace: production
+  labels:
+    app.kubernetes.io/name: <app-name>
+    environment: production
+data:
+  APP_ENV: "production"
+  LOG_LEVEL: "warn"
+  DEBUG: "false"
+  RATE_LIMIT: "1000"
+  CACHE_TTL: "3600"
+  DATABASE_POOL_SIZE: "50"
+  FEATURE_FLAG_NEW_UI: "true"
+  FEATURE_FLAG_BETA: "false"
+
+---
+# Template 6: Script Configuration
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: <app-name>-scripts
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+data:
+  # Initialization script
+  init.sh: |
+    #!/bin/bash
+    set -e
+
+    echo "Running initialization..."
+
+    # Wait for database
+    until nc -z "$DATABASE_HOST" "$DATABASE_PORT"; do
+      echo "Waiting for database..."
+      sleep 2
+    done
+
+    echo "Database is ready!"
+
+    # Run migrations
+    if [ "$RUN_MIGRATIONS" = "true" ]; then
+      echo "Running database migrations..."
+      ./migrate up
+    fi
+
+    echo "Initialization complete!"
+
+  # Health check script
+  healthcheck.sh: |
+    #!/bin/bash
+
+    # Check application health endpoint
+    response=$(curl -sf http://localhost:8080/health)
+
+    if [ $? -eq 0 ]; then
+      echo "Health check passed"
+      exit 0
+    else
+      echo "Health check failed"
+      exit 1
+    fi
+
+---
+# Template 7: Prometheus Configuration
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: prometheus-config
+  namespace: monitoring
+  labels:
+    app.kubernetes.io/name: prometheus
+data:
+  prometheus.yml: |
+    global:
+      scrape_interval: 15s
+      evaluation_interval: 15s
+      external_labels:
+        cluster: 'production'
+        region: 'us-west-2'
+
+    alerting:
+      alertmanagers:
+      - static_configs:
+        - targets:
+          - alertmanager:9093
+
+    rule_files:
+    - /etc/prometheus/rules/*.yml
+
+    scrape_configs:
+    - job_name: 'kubernetes-pods'
+      kubernetes_sd_configs:
+      - role: pod
+      relabel_configs:
+      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
+        action: keep
+        regex: true
+      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
+        action: replace
+        target_label: __metrics_path__
+        regex: (.+)
+      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
+        action: replace
+        target_label: __address__
+        regex: ([^:]+)(?::\d+)?;(\d+)
+        replacement: $1:$2
+
+---
+# Usage Examples:
+#
+# 1. Mount as environment variables:
+#   envFrom:
+#   - configMapRef:
+#       name: <app-name>-config
+#
+# 2. Mount as files:
+#   volumeMounts:
+#   - name: config
+#     mountPath: /etc/app
+#   volumes:
+#   - name: config
+#     configMap:
+#       name: <app-name>-config-file
+#
+# 3. Mount specific keys as files:
+#   volumes:
+#   - name: nginx-config
+#     configMap:
+#       name: <app-name>-multi-config
+#       items:
+#       - key: nginx.conf
+#         path: nginx.conf
+#
+# 4. Use individual environment variables:
+#   env:
+#   - name: LOG_LEVEL
+#     valueFrom:
+#       configMapKeyRef:
+#         name: <app-name>-config
+#         key: LOG_LEVEL
--- a/skills/k8s-manifest-generator/assets/deployment-template.yaml
+++ b/skills/k8s-manifest-generator/assets/deployment-template.yaml
@@ -0,0 +1,203 @@
+# Production-Ready Kubernetes Deployment Template
+# Replace all <placeholders> with actual values
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: <app-name>
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+    app.kubernetes.io/instance: <instance-name>
+    app.kubernetes.io/version: "<version>"
+    app.kubernetes.io/component: <component>  # backend, frontend, database, cache
+    app.kubernetes.io/part-of: <system-name>
+    app.kubernetes.io/managed-by: kubectl
+  annotations:
+    description: "<application description>"
+    contact: "<team-email>"
+spec:
+  replicas: 3  # Minimum 3 for production HA
+  revisionHistoryLimit: 10
+
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: <app-name>
+      app.kubernetes.io/instance: <instance-name>
+
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxSurge: 1
+      maxUnavailable: 0  # Zero-downtime deployment
+
+  minReadySeconds: 10
+  progressDeadlineSeconds: 600
+
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: <app-name>
+        app.kubernetes.io/instance: <instance-name>
+        app.kubernetes.io/version: "<version>"
+      annotations:
+        prometheus.io/scrape: "true"
+        prometheus.io/port: "9090"
+        prometheus.io/path: "/metrics"
+
+    spec:
+      serviceAccountName: <app-name>
+
+      # Pod-level security context
+      securityContext:
+        runAsNonRoot: true
+        runAsUser: 1000
+        runAsGroup: 1000
+        fsGroup: 1000
+        seccompProfile:
+          type: RuntimeDefault
+
+      # Init containers (optional)
+      initContainers:
+      - name: init-wait
+        image: busybox:1.36
+        command: ['sh', '-c', 'echo "Initializing..."']
+        securityContext:
+          allowPrivilegeEscalation: false
+          runAsNonRoot: true
+          runAsUser: 1000
+
+      containers:
+      - name: <container-name>
+        image: <registry>/<image>:<tag>  # Never use :latest
+        imagePullPolicy: IfNotPresent
+
+        ports:
+        - name: http
+          containerPort: 8080
+          protocol: TCP
+        - name: metrics
+          containerPort: 9090
+          protocol: TCP
+
+        # Environment variables
+        env:
+        - name: POD_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.name
+        - name: POD_NAMESPACE
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.namespace
+        - name: POD_IP
+          valueFrom:
+            fieldRef:
+              fieldPath: status.podIP
+
+        # Load from ConfigMap and Secret
+        envFrom:
+        - configMapRef:
+            name: <app-name>-config
+        - secretRef:
+            name: <app-name>-secret
+
+        # Resource limits
+        resources:
+          requests:
+            memory: "256Mi"
+            cpu: "250m"
+          limits:
+            memory: "512Mi"
+            cpu: "500m"
+
+        # Startup probe (for slow-starting apps)
+        startupProbe:
+          httpGet:
+            path: /health/startup
+            port: http
+          initialDelaySeconds: 0
+          periodSeconds: 10
+          timeoutSeconds: 3
+          failureThreshold: 30  # 5 minutes to start
+
+        # Liveness probe
+        livenessProbe:
+          httpGet:
+            path: /health/live
+            port: http
+          initialDelaySeconds: 30
+          periodSeconds: 10
+          timeoutSeconds: 5
+          failureThreshold: 3
+
+        # Readiness probe
+        readinessProbe:
+          httpGet:
+            path: /health/ready
+            port: http
+          initialDelaySeconds: 5
+          periodSeconds: 5
+          timeoutSeconds: 3
+          failureThreshold: 3
+
+        # Volume mounts
+        volumeMounts:
+        - name: tmp
+          mountPath: /tmp
+        - name: cache
+          mountPath: /app/cache
+        # - name: data
+        #   mountPath: /var/lib/app
+
+        # Container security context
+        securityContext:
+          allowPrivilegeEscalation: false
+          readOnlyRootFilesystem: true
+          runAsNonRoot: true
+          runAsUser: 1000
+          capabilities:
+            drop:
+            - ALL
+
+        # Lifecycle hooks
+        lifecycle:
+          preStop:
+            exec:
+              command: ["/bin/sh", "-c", "sleep 15"]  # Graceful shutdown
+
+      # Volumes
+      volumes:
+      - name: tmp
+        emptyDir: {}
+      - name: cache
+        emptyDir:
+          sizeLimit: 1Gi
+      # - name: data
+      #   persistentVolumeClaim:
+      #     claimName: <app-name>-data
+
+      # Scheduling
+      affinity:
+        podAntiAffinity:
+          preferredDuringSchedulingIgnoredDuringExecution:
+          - weight: 100
+            podAffinityTerm:
+              labelSelector:
+                matchLabels:
+                  app.kubernetes.io/name: <app-name>
+              topologyKey: kubernetes.io/hostname
+
+      topologySpreadConstraints:
+      - maxSkew: 1
+        topologyKey: topology.kubernetes.io/zone
+        whenUnsatisfiable: ScheduleAnyway
+        labelSelector:
+          matchLabels:
+            app.kubernetes.io/name: <app-name>
+
+      terminationGracePeriodSeconds: 30
+
+      # Image pull secrets (if using private registry)
+      # imagePullSecrets:
+      # - name: regcred
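+
+---
+# ServiceAccount referenced by serviceAccountName above (illustrative sketch).
+# Pods for the Deployment cannot be created unless this ServiceAccount exists.
+# Assumption: the application does not call the Kubernetes API, so token
+# automounting is disabled; remove that field if API access is needed.
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: <app-name>
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+    app.kubernetes.io/instance: <instance-name>
+automountServiceAccountToken: false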
--- a/skills/k8s-manifest-generator/assets/service-template.yaml
+++ b/skills/k8s-manifest-generator/assets/service-template.yaml
@@ -0,0 +1,171 @@
+# Kubernetes Service Templates
+
+---
+# Template 1: ClusterIP Service (Internal Only)
+apiVersion: v1
+kind: Service
+metadata:
+  name: <app-name>
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+    app.kubernetes.io/instance: <instance-name>
+  annotations:
+    description: "Internal service for <app-name>"
+spec:
+  type: ClusterIP
+  selector:
+    app.kubernetes.io/name: <app-name>
+    app.kubernetes.io/instance: <instance-name>
+  ports:
+  - name: http
+    port: 80
+    targetPort: http  # Named port from container
+    protocol: TCP
+  sessionAffinity: None
+
+---
+# Template 2: LoadBalancer Service (External Access)
+apiVersion: v1
+kind: Service
+metadata:
+  name: <app-name>-lb
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+  annotations:
+    # AWS NLB annotations
+    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
+    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
+    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
+    # SSL certificate (optional)
+    # service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:..."
+spec:
+  type: LoadBalancer
+  externalTrafficPolicy: Local  # Preserves client IP
+  selector:
+    app.kubernetes.io/name: <app-name>
+  ports:
+  - name: http
+    port: 80
+    targetPort: http
+    protocol: TCP
+  - name: https
+    port: 443
+    targetPort: https  # Requires a container port named https
+    protocol: TCP
+  # Restrict access to specific IPs (optional)
+  # loadBalancerSourceRanges:
+  # - 203.0.113.0/24
+
+---
+# Template 3: NodePort Service (Direct Node Access)
+apiVersion: v1
+kind: Service
+metadata:
+  name: <app-name>-np
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+spec:
+  type: NodePort
+  selector:
+    app.kubernetes.io/name: <app-name>
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+    nodePort: 30080  # Optional; must be in the NodePort range (30000-32767 by default)
+    protocol: TCP
+
+---
+# Template 4: Headless Service (StatefulSet)
+apiVersion: v1
+kind: Service
+metadata:
+  name: <app-name>-headless
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+spec:
+  clusterIP: None  # Headless
+  selector:
+    app.kubernetes.io/name: <app-name>
+  ports:
+  - name: client
+    port: 9042
+    targetPort: 9042
+  publishNotReadyAddresses: true  # Include not-ready pods in DNS
+
+---
+# Template 5: Multi-Port Service with Metrics
+apiVersion: v1
+kind: Service
+metadata:
+  name: <app-name>-multi
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+  annotations:
+    prometheus.io/scrape: "true"
+    prometheus.io/port: "9091"  # Matches the metrics port below
+    prometheus.io/path: "/metrics"
+spec:
+  type: ClusterIP
+  selector:
+    app.kubernetes.io/name: <app-name>
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+    protocol: TCP
+  - name: https
+    port: 443
+    targetPort: 8443
+    protocol: TCP
+  - name: grpc
+    port: 9090
+    targetPort: 9090
+    protocol: TCP
+  - name: metrics
+    port: 9091
+    targetPort: 9091
+    protocol: TCP
+
+---
+# Template 6: Service with Session Affinity
+apiVersion: v1
+kind: Service
+metadata:
+  name: <app-name>-sticky
+  namespace: <namespace>
+  labels:
+    app.kubernetes.io/name: <app-name>
+spec:
+  type: ClusterIP
+  selector:
+    app.kubernetes.io/name: <app-name>
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+    protocol: TCP
+  sessionAffinity: ClientIP
+  sessionAffinityConfig:
+    clientIP:
+      timeoutSeconds: 10800  # 3 hours
+
+---
+# Template 7: ExternalName Service (External Service Mapping)
+apiVersion: v1
+kind: Service
+metadata:
+  name: external-db
+  namespace: <namespace>
+spec:
+  type: ExternalName
+  externalName: db.example.com
+  ports:
+  - port: 5432
+    targetPort: 5432
+    protocol: TCP
--- a/skills/k8s-manifest-generator/references/deployment-spec.md
+++ b/skills/k8s-manifest-generator/references/deployment-spec.md
@@ -0,0 +1,753 @@
+# Kubernetes Deployment Specification Reference
+
+Comprehensive reference for Kubernetes Deployment resources, covering all key fields, best practices, and common patterns.
+
+## Overview
+
+A Deployment provides declarative updates for Pods and ReplicaSets. It manages the desired state of your application, handling rollouts, rollbacks, and scaling operations.
+
+## Complete Deployment Specification
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: my-app
+  namespace: production
+  labels:
+    app.kubernetes.io/name: my-app
+    app.kubernetes.io/version: "1.0.0"
+    app.kubernetes.io/component: backend
+    app.kubernetes.io/part-of: my-system
+  annotations:
+    description: "Main application deployment"
+    contact: "backend-team@example.com"
+spec:
+  # Replica management
+  replicas: 3
+  revisionHistoryLimit: 10
+
+  # Pod selection (immutable after creation; must match the pod template labels)
+  selector:
+    matchLabels:
+      app: my-app
+      version: v1
+
+  # Update strategy
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxSurge: 1
+      maxUnavailable: 0
+
+  # Seconds a new pod must stay ready before it counts as available
+  minReadySeconds: 10
+
+  # Seconds without rollout progress before the Deployment is reported as failed (no automatic rollback)
+  progressDeadlineSeconds: 600
+
+  # Pod template
+  template:
+    metadata:
+      labels:
+        app: my-app
+        version: v1
+      annotations:
+        prometheus.io/scrape: "true"
+        prometheus.io/port: "9090"
+    spec:
+      # Service account for RBAC
+      serviceAccountName: my-app
+
+      # Security context for the pod
+      securityContext:
+        runAsNonRoot: true
+        runAsUser: 1000
+        fsGroup: 1000
+        seccompProfile:
+          type: RuntimeDefault
+
+      # Init containers run before main containers
+      initContainers:
+      - name: init-db
+        image: busybox:1.36
+        command: ['sh', '-c', 'until nc -z db-service 5432; do sleep 1; done']
+        securityContext:
+          allowPrivilegeEscalation: false
+          runAsNonRoot: true
+          runAsUser: 1000
+
+      # Main containers
+      containers:
+      - name: app
+        image: myapp:1.0.0
+        imagePullPolicy: IfNotPresent
+
+        # Container ports
+        ports:
+        - name: http
+          containerPort: 8080
+          protocol: TCP
+        - name: metrics
+          containerPort: 9090
+          protocol: TCP
+
+        # Environment variables
+        env:
+        - name: POD_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.name
+        - name: POD_NAMESPACE
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.namespace
+        - name: DATABASE_URL
+          valueFrom:
+            secretKeyRef:
+              name: db-credentials
+              key: url
+
+        # ConfigMap and Secret references
+        envFrom:
+        - configMapRef:
+            name: app-config
+        - secretRef:
+            name: app-secrets
+
+        # Resource requests and limits
+        resources:
+          requests:
+            memory: "256Mi"
+            cpu: "250m"
+          limits:
+            memory: "512Mi"
+            cpu: "500m"
+
+        # Liveness probe
+        livenessProbe:
+          httpGet:
+            path: /health/live
+            port: http
+            httpHeaders:
+            - name: Custom-Header
+              value: Awesome
+          initialDelaySeconds: 30
+          periodSeconds: 10
+          timeoutSeconds: 5
+          successThreshold: 1
+          failureThreshold: 3
+
+        # Readiness probe
+        readinessProbe:
+          httpGet:
+            path: /health/ready
+            port: http
+          initialDelaySeconds: 5
+          periodSeconds: 5
+          timeoutSeconds: 3
+          successThreshold: 1
+          failureThreshold: 3
+
+        # Startup probe (for slow-starting containers)
+        startupProbe:
+          httpGet:
+            path: /health/startup
+            port: http
+          initialDelaySeconds: 0
+          periodSeconds: 10
+          timeoutSeconds: 3
+          successThreshold: 1
+          failureThreshold: 30
+
+        # Volume mounts
+        volumeMounts:
+        - name: data
+          mountPath: /var/lib/app
+        - name: config
+          mountPath: /etc/app
+          readOnly: true
+        - name: tmp
+          mountPath: /tmp
+
+        # Security context for container
+        securityContext:
+          allowPrivilegeEscalation: false
+          readOnlyRootFilesystem: true
+          runAsNonRoot: true
+          runAsUser: 1000
+          capabilities:
+            drop:
+            - ALL
+
+        # Lifecycle hooks
+        lifecycle:
+          postStart:
+            exec:
+              command: ["/bin/sh", "-c", "echo Container started > /tmp/started"]
+          preStop:
+            exec:
+              command: ["/bin/sh", "-c", "sleep 15"]
+
+      # Volumes
+      volumes:
+      - name: data
+        persistentVolumeClaim:
+          claimName: app-data
+      - name: config
+        configMap:
+          name: app-config
+      - name: tmp
+        emptyDir: {}
+
+      # DNS configuration
+      dnsPolicy: ClusterFirst
+      dnsConfig:
+        options:
+        - name: ndots
+          value: "2"
+
+      # Scheduling
+      nodeSelector:
+        disktype: ssd
+
+      affinity:
+        podAntiAffinity:
+          preferredDuringSchedulingIgnoredDuringExecution:
+          - weight: 100
+            podAffinityTerm:
+              labelSelector:
+                matchExpressions:
+                - key: app
+                  operator: In
+                  values:
+                  - my-app
+              topologyKey: kubernetes.io/hostname
+
+      tolerations:
+      - key: "app"
+        operator: "Equal"
+        value: "my-app"
+        effect: "NoSchedule"
+
+      # Termination
+      terminationGracePeriodSeconds: 30
+
+      # Image pull secrets
+      imagePullSecrets:
+      - name: regcred
+```
+
+## Field Reference
+
+### Metadata Fields
+
+#### Required Fields
+- `apiVersion`: `apps/v1` (current stable version)
+- `kind`: `Deployment`
+- `metadata.name`: Unique name within namespace
+
+#### Recommended Metadata
+- `metadata.namespace`: Target namespace (defaults to `default`)
+- `metadata.labels`: Key-value pairs for organization
+- `metadata.annotations`: Non-identifying metadata
+
+### Spec Fields
+
+#### Replica Management
+
+**`replicas`** (integer, default: 1)
+- Number of desired pod instances
+- Best practice: Use 3+ for production high availability
+- Can be scaled manually or via HorizontalPodAutoscaler
+
+**`revisionHistoryLimit`** (integer, default: 10)
+- Number of old ReplicaSets to retain for rollback
+- Set to 0 to disable rollback capability
+- Lower values reduce the number of old ReplicaSet objects stored for long-running deployments
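+
+As a sketch of the autoscaling option mentioned above, a HorizontalPodAutoscaler can manage the replica count instead of a fixed `replicas` value. The target name `my-app`, the 3-10 replica range, and the 70% CPU target are illustrative assumptions, not recommendations. The sketch also assumes a metrics source such as metrics-server is installed and that the containers set CPU requests, since utilization is measured against them.
+
+```yaml
+# Hypothetical HPA for the my-app Deployment (all values are placeholders)
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: my-app
+  namespace: production
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: my-app
+  minReplicas: 3
+  maxReplicas: 10
+  metrics:
+  - type: Resource
+    resource:
+      name: cpu
+      target:
+        type: Utilization
+        averageUtilization: 70  # Percent of the CPU request, averaged across pods
+```
+
+When an autoscaler owns the replica count, consider omitting `spec.replicas` from the Deployment manifest so that re-applying the manifest does not reset the count.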
+
+#### Update Strategy
+
+**`strategy.type`** (string)
+- `RollingUpdate` (default): Gradual pod replacement
+- `Recreate`: Delete all pods before creating new ones
+
+**`strategy.rollingUpdate.maxSurge`** (int or percent, default: 25%)
+- Maximum pods above desired replicas during update
+- Example: With 3 replicas and maxSurge=1, up to 4 pods during update
+
+**`strategy.rollingUpdate.maxUnavailable`** (int or percent, default: 25%)
+- Maximum pods below desired replicas during update
+- Set to 0 for zero-downtime deployments
+- Cannot be 0 if maxSurge is 0
+
+**Best practices:**
+```yaml
+# Zero-downtime deployment
+strategy:
+  type: RollingUpdate
+  rollingUpdate:
+    maxSurge: 1
+    maxUnavailable: 0
+
+# Faster rollout (capacity may dip by one pod during the update)
+strategy:
+  type: RollingUpdate
+  rollingUpdate:
+    maxSurge: 2
+    maxUnavailable: 1
+
+# Complete replacement
+strategy:
+  type: Recreate
+```
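+
+Both rolling-update fields also accept percentages, which are resolved against `replicas`: `maxSurge` rounds up and `maxUnavailable` rounds down. A sketch with an assumed replica count of 10:
+
+```yaml
+spec:
+  replicas: 10  # Assumed value for the arithmetic below
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxSurge: 25%        # ceil(10 * 0.25) = 3, so at most 13 pods exist during the update
+      maxUnavailable: 25%  # floor(10 * 0.25) = 2, so at least 8 pods stay available
+```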
+
+#### Pod Template
+
+**`template.metadata.labels`**
+- Must include labels matching `spec.selector.matchLabels`
+- Add version labels for blue/green deployments
+- Include standard Kubernetes labels
+
+**`template.spec.containers`** (required)
+- Array of container specifications
+- At least one container required
+- Each container needs unique name
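+
+A minimal sketch of the label-matching rule, reusing the `app: my-app` label from the complete example: every key in `spec.selector.matchLabels` must appear with the same value in the pod template, while the template may carry extra labels.
+
+```yaml
+spec:
+  selector:
+    matchLabels:
+      app: my-app      # The selector is immutable after creation in apps/v1
+  template:
+    metadata:
+      labels:
+        app: my-app    # Must match the selector
+        version: v1    # Extra label, not part of the selector
+    spec:
+      containers:
+      - name: app
+        image: myapp:1.0.0
+```
+
+Keeping the version label out of the selector is one possible design choice: it lets the label change between releases without recreating the Deployment.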
+
+#### Container Configuration
+
+**Image Management:**
+```yaml
+containers:
+- name: app
+  image: registry.example.com/myapp:1.0.0
+  imagePullPolicy: IfNotPresent  # or Always, Never
+```
+
+Image pull policies:
+- `IfNotPresent`: Pull if not cached (default for tagged images)
+- `Always`: Always pull (default for :latest)
+- `Never`: Never pull, fail if not cached
+
+**Port Declarations:**
+```yaml
+ports:
+- name: http      # Named for referencing in Service
+  containerPort: 8080
+  protocol: TCP   # TCP (default), UDP, or SCTP
+  hostPort: 8080  # Optional: Bind to host port (rarely used)
+```
+
+#### Resource Management
+
+**Requests vs Limits:**
+
+```yaml
+resources:
+  requests:
+    memory: "256Mi"  # Guaranteed resources
+    cpu: "250m"      # 0.25 CPU cores
+  limits:
+    memory: "512Mi"  # Maximum allowed
+    cpu: "500m"      # 0.5 CPU cores
+```
+
+**QoS Classes (determined automatically):**
+
+1. **Guaranteed**: every container sets CPU and memory requests equal to its limits
+   - Highest priority
+   - Last to be evicted
+
+2. **Burstable**: at least one container sets a request or limit, but the pod does not qualify as Guaranteed
+   - Medium priority
+   - Evicted before Guaranteed
+
+3. **BestEffort**: No requests or limits set
+   - Lowest priority
+   - First to be evicted
+
+**Best practices:**
+- Always set requests in production
+- Set limits to prevent resource monopolization
+- A common starting point is memory limits of 1.5-2x requests; set them equal to requests when the pod needs Guaranteed QoS
+- CPU limits can be higher for bursty workloads
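+
+For comparison with the Burstable example above, the following sketch shows a resources block that would place a single-container pod in the Guaranteed class. The values are placeholders.
+
+```yaml
+resources:
+  requests:
+    memory: "512Mi"
+    cpu: "500m"
+  limits:
+    memory: "512Mi"  # Equal to the memory request
+    cpu: "500m"      # Equal to the CPU request
+```
+
+The class Kubernetes assigned can be read from the pod's `status.qosClass` field.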
+
+#### Health Checks
+
+**Probe Types:**
+
+1. **startupProbe** - For slow-starting applications
+   ```yaml
+   startupProbe:
+     httpGet:
+       path: /health/startup
+       port: 8080
+     initialDelaySeconds: 0
+     periodSeconds: 10
+     failureThreshold: 30  # 5 minutes to start (10s * 30)
+   ```
+
+2. **livenessProbe** - Restarts unhealthy containers
+   ```yaml
+   livenessProbe:
+     httpGet:
+       path: /health/live
+       port: 8080
+     initialDelaySeconds: 30
+     periodSeconds: 10
+     timeoutSeconds: 5
+     failureThreshold: 3  # Restart after 3 failures
+   ```
+
+3. **readinessProbe** - Controls traffic routing
+   ```yaml
+   readinessProbe:
+     httpGet:
+       path: /health/ready
+       port: 8080
+     initialDelaySeconds: 5
+     periodSeconds: 5
+     failureThreshold: 3  # Remove from service after 3 failures
+   ```
+
+**Probe Mechanisms:**
+
+```yaml
+# HTTP GET
+httpGet:
+  path: /health
+  port: 8080
+  httpHeaders:
+  - name: Authorization
+    value: Bearer token
+
+# TCP Socket
+tcpSocket:
+  port: 3306
+
+# Command execution
+exec:
+  command:
+  - cat
+  - /tmp/healthy
+
+# gRPC (Kubernetes 1.24+)
+grpc:
+  port: 9090
+  service: my.service.health.v1.Health
+```
+
+**Probe Timing Parameters:**
+
+- `initialDelaySeconds`: Wait before first probe
+- `periodSeconds`: How often to probe
+- `timeoutSeconds`: Probe timeout
+- `successThreshold`: Consecutive successes needed to mark healthy (must be 1 for liveness and startup probes)
+- `failureThreshold`: Failures before taking action
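+
+To show how the timing parameters combine, the sketch below annotates a liveness probe with the approximate time before a restart. The numbers reuse the values from the liveness example above; the arithmetic is an estimate, not an exact guarantee.
+
+```yaml
+livenessProbe:
+  httpGet:
+    path: /health/live
+    port: 8080
+  initialDelaySeconds: 30  # No probes during the first 30s
+  periodSeconds: 10        # One probe every 10s afterwards
+  timeoutSeconds: 5        # No response within 5s counts as a failure
+  failureThreshold: 3      # Restart after 3 consecutive failures
+  # A container that stops responding is restarted after roughly
+  # periodSeconds * failureThreshold = 10s * 3 = 30s.
+```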
+
+#### Security Context
+
+**Pod-level security context:**
+```yaml
+spec:
+  securityContext:
+    runAsNonRoot: true
+    runAsUser: 1000
+    runAsGroup: 1000
+    fsGroup: 1000
+    fsGroupChangePolicy: OnRootMismatch
+    seccompProfile:
+      type: RuntimeDefault
+```
+
+**Container-level security context:**
+```yaml
+containers:
+- name: app
+  securityContext:
+    allowPrivilegeEscalation: false
+    readOnlyRootFilesystem: true
+    runAsNonRoot: true
+    runAsUser: 1000
+    capabilities:
+      drop:
+      - ALL
+      add:
+      - NET_BIND_SERVICE  # Only if needed
+```
+
+**Security best practices:**
+- Always run as non-root (`runAsNonRoot: true`)
+- Drop all capabilities and add only needed ones
+- Use read-only root filesystem when possible
+- Enable seccomp profile
+- Disable privilege escalation
+
+#### Volumes
+
+**Volume Types:**
+
+```yaml
+volumes:
+# PersistentVolumeClaim
+- name: data
+  persistentVolumeClaim:
+    claimName: app-data
+
+# ConfigMap
+- name: config
+  configMap:
+    name: app-config
+    items:
+    - key: app.properties
+      path: application.properties
+
+# Secret
+- name: secrets
+  secret:
+    secretName: app-secrets
+    defaultMode: 0400
+
+# EmptyDir (ephemeral)
+- name: cache
+  emptyDir:
+    sizeLimit: 1Gi
+
+# HostPath (avoid in production)
+- name: host-data
+  hostPath:
+    path: /data
+    type: DirectoryOrCreate
+```
+
+#### Scheduling
+
+**Node Selection:**
+
+```yaml
+# Simple node selector
+nodeSelector:
+  disktype: ssd
+  zone: us-west-1a
+
+# Node affinity (more expressive)
+affinity:
+  nodeAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      nodeSelectorTerms:
+      - matchExpressions:
+        - key: kubernetes.io/arch
+          operator: In
+          values:
+          - amd64
+          - arm64
+```
+
+**Pod Affinity/Anti-Affinity:**
+
+```yaml
+# Spread pods across nodes
+affinity:
+  podAntiAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+    - labelSelector:
+        matchLabels:
+          app: my-app
+      topologyKey: kubernetes.io/hostname
+
+# Co-locate with database
+affinity:
+  podAffinity:
+    preferredDuringSchedulingIgnoredDuringExecution:
+    - weight: 100
+      podAffinityTerm:
+        labelSelector:
+          matchLabels:
+            app: database
+        topologyKey: kubernetes.io/hostname
+```
+
+**Tolerations:**
+
+```yaml
+tolerations:
+- key: "node.kubernetes.io/unreachable"
+  operator: "Exists"
+  effect: "NoExecute"
+  tolerationSeconds: 30
+- key: "dedicated"
+  operator: "Equal"
+  value: "database"
+  effect: "NoSchedule"
+```
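+
+A toleration only has an effect when a node carries a matching taint. As an illustration, the `dedicated=database:NoSchedule` toleration above would match a node whose spec contains the taint below; the node name `db-node-1` is hypothetical.
+
+```yaml
+apiVersion: v1
+kind: Node
+metadata:
+  name: db-node-1  # Hypothetical node name
+spec:
+  taints:
+  - key: dedicated
+    value: database
+    effect: NoSchedule
+```
+
+Taints are usually applied to existing nodes rather than by creating Node objects, so the manifest shows only the resulting fields. A toleration permits scheduling onto the tainted node but does not require it, so it is typically paired with a `nodeSelector` or node affinity.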
+
+## Common Patterns
+
+### High Availability Deployment
+
+```yaml
+spec:
+  replicas: 3
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxSurge: 1
+      maxUnavailable: 0
+  template:
+    spec:
+      affinity:
+        podAntiAffinity:
+          # Hard rule: needs at least replicas + maxSurge nodes, or the surge pod stays Pending
+          requiredDuringSchedulingIgnoredDuringExecution:
+          - labelSelector:
+              matchLabels:
+                app: my-app
+            topologyKey: kubernetes.io/hostname
+      topologySpreadConstraints:
+      - maxSkew: 1
+        topologyKey: topology.kubernetes.io/zone
+        whenUnsatisfiable: DoNotSchedule
+        labelSelector:
+          matchLabels:
+            app: my-app
+```
+
+### Sidecar Container Pattern
+
+```yaml
+spec:
+  template:
+    spec:
+      containers:
+      - name: app
+        image: myapp:1.0.0
+        volumeMounts:
+        - name: shared-logs
+          mountPath: /var/log
+      - name: log-forwarder
+        image: fluent/fluent-bit:2.0
+        volumeMounts:
+        - name: shared-logs
+          mountPath: /var/log
+          readOnly: true
+      volumes:
+      - name: shared-logs
+        emptyDir: {}
+```
+
+### Init Container for Dependencies
+
+```yaml
+spec:
+  template:
+    spec:
+      initContainers:
+      - name: wait-for-db
+        image: busybox:1.36
+        command:
+        - sh
+        - -c
+        - |
+          until nc -z database-service 5432; do
+            echo "Waiting for database..."
+            sleep 2
+          done
+      - name: run-migrations
+        image: myapp:1.0.0
+        command: ["./migrate", "up"]
+        env:
+        - name: DATABASE_URL
+          valueFrom:
+            secretKeyRef:
+              name: db-credentials
+              key: url
+      containers:
+      - name: app
+        image: myapp:1.0.0
+```
+
+## Best Practices
+
+### Production Checklist
+
+- [ ] Set resource requests and limits
+- [ ] Implement all three probe types (startup, liveness, readiness)
+- [ ] Use specific image tags (not :latest)
+- [ ] Configure security context (non-root, read-only filesystem)
+- [ ] Set replica count >= 3 for HA
+- [ ] Configure pod anti-affinity for spread
+- [ ] Set appropriate update strategy (maxUnavailable: 0 for zero-downtime)
+- [ ] Use ConfigMaps and Secrets for configuration
+- [ ] Add standard labels and annotations
+- [ ] Configure graceful shutdown (preStop hook, terminationGracePeriodSeconds)
+- [ ] Set revisionHistoryLimit for rollback capability
+- [ ] Use ServiceAccount with minimal RBAC permissions
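+
+For the last checklist item, the following sketch shows a dedicated ServiceAccount bound to a narrowly scoped Role. The Role name and the single permission granted (read access to one ConfigMap) are assumptions for illustration; grant only what the application actually calls.
+
+```yaml
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: my-app
+  namespace: production
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: my-app-config-reader  # Hypothetical name
+  namespace: production
+rules:
+- apiGroups: [""]              # Core API group
+  resources: ["configmaps"]
+  resourceNames: ["app-config"]
+  verbs: ["get"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: my-app-config-reader
+  namespace: production
+subjects:
+- kind: ServiceAccount
+  name: my-app
+  namespace: production
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: Role
+  name: my-app-config-reader
+```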
+
+### Performance Tuning
+
+**Fast startup:**
+```yaml
+spec:
+  minReadySeconds: 5
+  strategy:
+    rollingUpdate:
+      maxSurge: 2
+      maxUnavailable: 1
+```
+
+**Zero-downtime updates:**
+```yaml
+spec:
+  minReadySeconds: 10
+  strategy:
+    rollingUpdate:
+      maxSurge: 1
+      maxUnavailable: 0
+```
+
+**Graceful shutdown:**
+```yaml
+spec:
+  template:
+    spec:
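+      terminationGracePeriodSeconds: 60  # Budget covers the preStop hook plus application shutdown
+      containers:
+      - name: app
+        lifecycle:
+          preStop:
+            exec:
+              # Delay so endpoint removal can propagate; the kubelet sends SIGTERM after the hook returns
+              command: ["/bin/sh", "-c", "sleep 15"]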
+```
+
+## Troubleshooting
+
+### Common Issues
+
+**Pods not starting:**
+```bash
+kubectl describe deployment <name>
+kubectl get pods -l app=<app-name>
+kubectl describe pod <pod-name>
+kubectl logs <pod-name>
+```
+
+**ImagePullBackOff:**
+- Check image name and tag
+- Verify imagePullSecrets
+- Check registry credentials
+
+**CrashLoopBackOff:**
+- Check container logs
+- Verify liveness probe is not too aggressive
+- Check resource limits
+- Verify application dependencies
+
+**Deployment stuck in progress:**
+- Check progressDeadlineSeconds
+- Verify readiness probes
+- Check resource availability
+
+## Related Resources
+
+- [Kubernetes Deployment API Reference](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#deployment-v1-apps)
+- [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/)
+- [Resource Management](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/)
--- a/skills/k8s-manifest-generator/references/service-spec.md
+++ b/skills/k8s-manifest-generator/references/service-spec.md
@@ -0,0 +1,724 @@
+# Kubernetes Service Specification Reference
+
+Comprehensive reference for Kubernetes Service resources, covering service types, networking, load balancing, and service discovery patterns.
+
+## Overview
+
+A Service provides stable network endpoints for accessing Pods. Services enable loose coupling between microservices by providing service discovery and load balancing.
+
+## Service Types
+
+### 1. ClusterIP (Default)
+
+Exposes the service on an internal cluster IP. Only reachable from within the cluster.
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: backend-service
+  namespace: production
+spec:
+  type: ClusterIP
+  selector:
+    app: backend
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+    protocol: TCP
+  sessionAffinity: None
+```
+
+**Use cases:**
+- Internal microservice communication
+- Database services
+- Internal APIs
+- Message queues
+
+### 2. NodePort
+
+Exposes the service on each Node's IP at a static port (30000-32767 range).
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: frontend-service
+spec:
+  type: NodePort
+  selector:
+    app: frontend
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+    nodePort: 30080  # Optional, auto-assigned if omitted
+    protocol: TCP
+```
+
+**Use cases:**
+- Development/testing external access
+- Small deployments without load balancer
+- Direct node access requirements
+
+**Limitations:**
+- Limited port range (30000-32767)
+- Must handle node failures
+- Clients must choose which node to contact; there is no built-in load balancing across nodes
+
+### 3. LoadBalancer
+
+Exposes the service using a cloud provider's load balancer.
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: public-api
+  annotations:
+    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
+    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
+spec:
+  type: LoadBalancer
+  selector:
+    app: api
+  ports:
+  - name: https
+    port: 443
+    targetPort: 8443
+    protocol: TCP
+  loadBalancerSourceRanges:
+  - 203.0.113.0/24
+```
+
+**Cloud-specific annotations:**
+
+**AWS:**
+```yaml
+annotations:
+  service.beta.kubernetes.io/aws-load-balancer-type: "nlb"  # or "external"
+  service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
+  service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
+  service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:..."
+  service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
+```
+
+**Azure:**
+```yaml
+annotations:
+  service.beta.kubernetes.io/azure-load-balancer-internal: "true"
+  service.beta.kubernetes.io/azure-pip-name: "my-public-ip"
+```
+
+**GCP:**
+```yaml
+annotations:
+  cloud.google.com/load-balancer-type: "Internal"
+  cloud.google.com/backend-config: '{"default": "my-backend-config"}'
+```
+
+### 4. ExternalName
+
+Maps service to external DNS name (CNAME record).
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: external-db
+spec:
+  type: ExternalName
+  externalName: db.external.example.com
+  ports:
+  - port: 5432
+```
+
+**Use cases:**
+- Accessing external services
+- Service migration scenarios
+- Multi-cluster service references
+
+## Complete Service Specification
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: my-service
+  namespace: production
+  labels:
+    app: my-app
+    tier: backend
+  annotations:
+    description: "Main application service"
+    prometheus.io/scrape: "true"
+spec:
+  # Service type (LoadBalancer here so the external-traffic and load-balancer fields below are valid)
+  type: LoadBalancer
+
+  # Pod selector
+  selector:
+    app: my-app
+    version: v1
+
+  # Ports configuration
+  ports:
+  - name: http
+    port: 80           # Service port
+    targetPort: 8080   # Container port (or named port)
+    protocol: TCP      # TCP, UDP, or SCTP
+
+  # Session affinity
+  sessionAffinity: ClientIP
+  sessionAffinityConfig:
+    clientIP:
+      timeoutSeconds: 10800
+
+  # IP configuration
+  clusterIP: 10.0.0.10  # Optional: specific IP
+  clusterIPs:
+  - 10.0.0.10
+  ipFamilies:
+  - IPv4
+  ipFamilyPolicy: SingleStack
+
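+  # External traffic policy (type: NodePort or LoadBalancer only)
+  externalTrafficPolicy: Local
+
+  # Internal traffic policy
+  internalTrafficPolicy: Local
+
+  # Health check node port (requires type: LoadBalancer with externalTrafficPolicy: Local)
+  healthCheckNodePort: 30000
+
+  # Load balancer config (for type: LoadBalancer)
+  loadBalancerIP: 203.0.113.100  # Deprecated since Kubernetes 1.24; prefer provider-specific annotations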
+  loadBalancerSourceRanges:
+  - 203.0.113.0/24
+
+  # External IPs
+  externalIPs:
+  - 80.11.12.10
+
+  # Publishing strategy
+  publishNotReadyAddresses: false
+```
+
+## Port Configuration
+
+### Named Ports
+
+Use named ports in Pods for flexibility:
+
+**Deployment:**
+```yaml
+spec:
+  template:
+    spec:
+      containers:
+      - name: app
+        ports:
+        - name: http
+          containerPort: 8080
+        - name: metrics
+          containerPort: 9090
+```
+
+**Service:**
+```yaml
+spec:
+  ports:
+  - name: http
+    port: 80
+    targetPort: http  # References named port
+  - name: metrics
+    port: 9090
+    targetPort: metrics
+```
+
+### Multiple Ports
+
+```yaml
+spec:
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+    protocol: TCP
+  - name: https
+    port: 443
+    targetPort: 8443
+    protocol: TCP
+  - name: grpc
+    port: 9090
+    targetPort: 9090
+    protocol: TCP
+```
+
+## Session Affinity
+
+### None (Default)
+
+Distributes each new connection across ready pods without regard to the client (kube-proxy in iptables mode picks a backend at random).
+
+```yaml
+spec:
+  sessionAffinity: None
+```
+
+### ClientIP
+
+Routes requests from same client IP to same pod.
+
+```yaml
+spec:
+  sessionAffinity: ClientIP
+  sessionAffinityConfig:
+    clientIP:
+      timeoutSeconds: 10800  # 3 hours
+```
+
+**Use cases:**
+- Stateful applications
+- Session-based applications
+- WebSocket connections
+
+## Traffic Policies
+
+### External Traffic Policy
+
+**Cluster (Default):**
+```yaml
+spec:
+  externalTrafficPolicy: Cluster
+```
+- Load balances across all nodes
+- May add extra network hop
+- Source IP is masked
+
+**Local:**
+```yaml
+spec:
+  externalTrafficPolicy: Local
+```
+- Traffic goes only to pods on receiving node
+- Preserves client source IP
+- Better performance (no extra hop)
+- May cause imbalanced load
+
+### Internal Traffic Policy
+
+```yaml
+spec:
+  internalTrafficPolicy: Local  # or Cluster
+```
+
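+Controls how traffic from clients inside the cluster is routed: `Cluster` (default) considers all ready endpoints, while `Local` uses only endpoints on the same node as the client.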
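+
+As an illustration, `internalTrafficPolicy: Local` suits a node-local agent that pods should reach only on their own node. The Service name, label, and port below are hypothetical, and the sketch assumes the backing pods run on every node (for example as a DaemonSet). With `Local`, traffic from a node without a ready endpoint is dropped rather than forwarded elsewhere.
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: node-local-agent  # Hypothetical name
+spec:
+  type: ClusterIP
+  internalTrafficPolicy: Local  # Only endpoints on the client's node are used
+  selector:
+    app: node-local-agent
+  ports:
+  - name: http
+    port: 8125
+    targetPort: 8125
+```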
+
+## Headless Services
+
+Service without cluster IP for direct pod access.
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: database
+spec:
+  clusterIP: None  # Headless
+  selector:
+    app: database
+  ports:
+  - port: 5432
+    targetPort: 5432
+```
+
+**Use cases:**
+- StatefulSet pod discovery
+- Direct pod-to-pod communication
+- Custom load balancing
+- Database clusters
+
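+**DNS returns:**
+- Individual pod IPs (one A/AAAA record per ready pod) instead of a single service IP
+- Per-pod names for pods with a stable hostname, such as StatefulSet pods: `<pod-name>.<service-name>.<namespace>.svc.cluster.local`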
+
+## Service Discovery
+
+### DNS
+
+**ClusterIP Service:**
+```
+<service-name>.<namespace>.svc.cluster.local
+```
+
+Example:
+```bash
+curl http://backend-service.production.svc.cluster.local
+```
+
+**Within same namespace:**
+```bash
+curl http://backend-service
+```
+
+**Headless Service (per-pod record, for example for StatefulSet pods):**
+```
+<pod-name>.<service-name>.<namespace>.svc.cluster.local
+```
+
+### Environment Variables
+
+Kubernetes injects service info into pods:
+
+```bash
+# Service host and port
+BACKEND_SERVICE_SERVICE_HOST=10.0.0.100
+BACKEND_SERVICE_SERVICE_PORT=80
+
+# For named ports
+BACKEND_SERVICE_SERVICE_PORT_HTTP=80
+```
+
+**Note:** Pods must be created after the service for env vars to be injected.
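+
+If these variables are not wanted (for example, in namespaces with many Services, or where they collide with application settings), the pod-level `enableServiceLinks` field turns the injection off. A sketch, with the container reused from the earlier examples:
+
+```yaml
+spec:
+  template:
+    spec:
+      enableServiceLinks: false  # Defaults to true
+      containers:
+      - name: app
+        image: myapp:1.0.0
+```
+
+DNS-based discovery is unaffected by this setting.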
+
+## Load Balancing
+
+### Algorithms
+
+With kube-proxy in its default iptables mode, Kubernetes selects a backend at random for each new connection. For advanced load balancing:
+
+**Service Mesh (Istio example):**
+```yaml
+apiVersion: networking.istio.io/v1beta1
+kind: DestinationRule
+metadata:
+  name: my-destination-rule
+spec:
+  host: my-service
+  trafficPolicy:
+    loadBalancer:
+      simple: LEAST_REQUEST  # or ROUND_ROBIN, RANDOM, PASSTHROUGH
+    connectionPool:
+      tcp:
+        maxConnections: 100
+```
+
+### Connection Limits
+
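+A Service has no connection-limit field; per-connection limits come from a service mesh (such as `connectionPool` above) or from the application itself. To keep enough backends available during voluntary disruptions such as node drains, add a PodDisruptionBudget: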
+
+```yaml
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: my-app-pdb
+spec:
+  minAvailable: 2
+  selector:
+    matchLabels:
+      app: my-app
+```
+
+## Service Mesh Integration
+
+### Istio Virtual Service
+
+```yaml
+apiVersion: networking.istio.io/v1beta1
+kind: VirtualService
+metadata:
+  name: my-service
+spec:
+  hosts:
+  - my-service
+  http:
+  - match:
+    - headers:
+        version:
+          exact: v2
+    route:
+    - destination:
+        host: my-service
+        subset: v2
+  - route:
+    - destination:
+        host: my-service
+        subset: v1
+      weight: 90
+    - destination:
+        host: my-service
+        subset: v2
+      weight: 10
+```
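+
+The `subset` names referenced above must be defined in a DestinationRule for the same host. A sketch, assuming the pods carry a `version` label with the values `v1` and `v2` (the rule name is hypothetical):
+
+```yaml
+apiVersion: networking.istio.io/v1beta1
+kind: DestinationRule
+metadata:
+  name: my-service-subsets  # Hypothetical name
+spec:
+  host: my-service
+  subsets:
+  - name: v1
+    labels:
+      version: v1  # Selects pods labeled version=v1
+  - name: v2
+    labels:
+      version: v2  # Selects pods labeled version=v2
+```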
+
+## Common Patterns
+
+### Pattern 1: Internal Microservice
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: user-service
+  namespace: backend
+  labels:
+    app: user-service
+    tier: backend
+spec:
+  type: ClusterIP
+  selector:
+    app: user-service
+  ports:
+  - name: http
+    port: 8080
+    targetPort: http
+    protocol: TCP
+  - name: grpc
+    port: 9090
+    targetPort: grpc
+    protocol: TCP
+```
+
+### Pattern 2: Public API with Load Balancer
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: api-gateway
+  annotations:
+    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
+    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:..."
+spec:
+  type: LoadBalancer
+  externalTrafficPolicy: Local
+  selector:
+    app: api-gateway
+  ports:
+  - name: https
+    port: 443
+    targetPort: 8443
+    protocol: TCP
+  loadBalancerSourceRanges:
+  - 0.0.0.0/0  # Open to all clients; narrow to known CIDRs where possible
+```
+
+### Pattern 3: StatefulSet with Headless Service
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: cassandra
+spec:
+  clusterIP: None
+  selector:
+    app: cassandra
+  ports:
+  - port: 9042
+    targetPort: 9042
+---
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: cassandra
+spec:
+  serviceName: cassandra
+  replicas: 3
+  selector:
+    matchLabels:
+      app: cassandra
+  template:
+    metadata:
+      labels:
+        app: cassandra
+    spec:
+      containers:
+      - name: cassandra
+        image: cassandra:4.0
+```
+
+### Pattern 4: External Service Mapping
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: external-database
+spec:
+  type: ExternalName
+  externalName: prod-db.cxyz.us-west-2.rds.amazonaws.com
+---
+# Or with Endpoints for IP-based external service
+apiVersion: v1
+kind: Service
+metadata:
+  name: external-api
+spec:
+  ports:
+  - port: 443
+    targetPort: 443
+    protocol: TCP
+---
+apiVersion: v1
+kind: Endpoints
+metadata:
+  name: external-api
+subsets:
+- addresses:
+  - ip: 203.0.113.100
+  ports:
+  - port: 443
+```
+
+### Pattern 5: Multi-Port Service with Metrics
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: web-app
+  annotations:
+    prometheus.io/scrape: "true"
+    prometheus.io/port: "9090"
+    prometheus.io/path: "/metrics"
+spec:
+  type: ClusterIP
+  selector:
+    app: web-app
+  ports:
+  - name: http
+    port: 80
+    targetPort: 8080
+  - name: metrics
+    port: 9090
+    targetPort: 9090
+```
+
+## Network Policies
+
+Control traffic to services:
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: allow-frontend-to-backend
+spec:
+  podSelector:
+    matchLabels:
+      app: backend
+  policyTypes:
+  - Ingress
+  ingress:
+  - from:
+    - podSelector:
+        matchLabels:
+          app: frontend
+    ports:
+    - protocol: TCP
+      port: 8080
+```
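+
+Allow rules such as the one above are typically paired with a default-deny policy, because pods not selected by any NetworkPolicy accept all traffic. A sketch for the namespace that holds the backend pods (the policy name is an assumption); enforcement requires a network plugin that implements NetworkPolicy.
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: default-deny-ingress  # Hypothetical name
+spec:
+  podSelector: {}  # Empty selector matches every pod in the namespace
+  policyTypes:
+  - Ingress        # No ingress rules listed, so all inbound traffic is denied
+```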
+
+## Best Practices
+
+### Service Configuration
+
+1. **Use named ports** for flexibility
+2. **Set appropriate service type** based on exposure needs
+3. **Use labels and selectors consistently** across Deployments and Services
+4. **Configure session affinity** for stateful apps
+5. **Set external traffic policy to Local** for IP preservation
+6. **Use headless services** for StatefulSets
+7. **Implement network policies** for security
+8. **Add monitoring annotations** for observability
+
+### Production Checklist
+
+- [ ] Service type appropriate for use case
+- [ ] Selector matches pod labels
+- [ ] Named ports used for clarity
+- [ ] Session affinity configured if needed
+- [ ] Traffic policy set appropriately
+- [ ] Load balancer annotations configured (if applicable)
+- [ ] Source IP ranges restricted (for public services)
+- [ ] Health check configuration validated
+- [ ] Monitoring annotations added
+- [ ] Network policies defined
+
+### Performance Tuning
+
+**For high traffic:**
+```yaml
+spec:
+  externalTrafficPolicy: Local
+  sessionAffinity: ClientIP
+  sessionAffinityConfig:
+    clientIP:
+      timeoutSeconds: 3600
+```
+
+**For WebSocket/long connections:**
+```yaml
+spec:
+  sessionAffinity: ClientIP
+  sessionAffinityConfig:
+    clientIP:
+      timeoutSeconds: 86400  # 24 hours
+```
+
+## Troubleshooting
+
+### Service not accessible
+
+```bash
+# Check service exists
+kubectl get service <service-name>
+
+# Check endpoints (should show pod IPs)
+kubectl get endpoints <service-name>
+
+# Describe service
+kubectl describe service <service-name>
+
+# Check if pods match selector
+kubectl get pods -l app=<app-name>
+```
+
+**Common issues:**
+- Selector doesn't match pod labels
+- No pods running (endpoints empty)
+- Ports misconfigured
+- Network policy blocking traffic
+
+### DNS resolution failing
+
+```bash
+# Test DNS from pod
+kubectl run debug --rm -it --restart=Never --image=busybox:1.36 -- nslookup <service-name>
+
+# Check CoreDNS
+kubectl get pods -n kube-system -l k8s-app=kube-dns
+kubectl logs -n kube-system -l k8s-app=kube-dns
+```
+
+### Load balancer issues
+
+```bash
+# Check load balancer status
+kubectl describe service <service-name>
+
+# Check events
+kubectl get events --sort-by='.lastTimestamp'
+
+# Verify cloud provider configuration
+kubectl describe node
+```
+
+## Related Resources
+
+- [Kubernetes Service API Reference](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#service-v1-core)
+- [Service Networking](https://kubernetes.io/docs/concepts/services-networking/service/)
+- [DNS for Services and Pods](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/)