commit b40af6b4cc08b163be1ec3297d01172fb1384c39 Author: Zhongwei Li Date: Sun Nov 30 08:28:52 2025 +0800 Initial commit diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json new file mode 100644 index 0000000..8d5a65d --- /dev/null +++ b/.claude-plugin/plugin.json @@ -0,0 +1,15 @@ +{ + "name": "azure-master", + "description": "Complete Azure cloud expertise system with 2025 features including AKS Automatic, Container Apps GPU support, and Deployment Stacks. PROACTIVELY activate for: (1) ANY Azure resource provisioning or management, (2) AKS Automatic with Karpenter autoscaling, (3) Container Apps with serverless GPU and Dapr integration, (4) Azure OpenAI GPT-5 and reasoning models (o4-mini, o3), (5) Deployment Stacks for infrastructure lifecycle management, (6) Bicep v0.37+ with externalInput() and custom extensions, (7) Azure CLI 2.79+ with latest breaking changes, (8) SRE Agent integration for monitoring and incident response, (9) Azure AI Foundry model deployment, (10) Security, networking, and cost optimization. Provides: AKS Automatic GA features, Container Apps GPU workloads, Deployment Stacks best practices, latest Azure OpenAI models, Bicep 2025 patterns, Azure CLI expertise, comprehensive service configurations (compute, networking, storage, databases, AI/ML), Well-Architected Framework guidance, high availability patterns, security hardening, cost optimization strategies, and production-ready configurations. 
Ensures enterprise-ready, secure, scalable Azure infrastructure following Microsoft 2025 standards.", + "version": "1.1.0", + "author": { + "name": "Josiah Siegel", + "email": "JosiahSiegel@users.noreply.github.com" + }, + "skills": [ + "./skills" + ], + "agents": [ + "./agents" + ] +} \ No newline at end of file diff --git a/README.md b/README.md new file mode 100644 index 0000000..2eda8ea --- /dev/null +++ b/README.md @@ -0,0 +1,3 @@ +# azure-master + +Complete Azure cloud expertise system with 2025 features including AKS Automatic, Container Apps GPU support, and Deployment Stacks. PROACTIVELY activate for: (1) ANY Azure resource provisioning or management, (2) AKS Automatic with Karpenter autoscaling, (3) Container Apps with serverless GPU and Dapr integration, (4) Azure OpenAI GPT-5 and reasoning models (o4-mini, o3), (5) Deployment Stacks for infrastructure lifecycle management, (6) Bicep v0.37+ with externalInput() and custom extensions, (7) Azure CLI 2.79+ with latest breaking changes, (8) SRE Agent integration for monitoring and incident response, (9) Azure AI Foundry model deployment, (10) Security, networking, and cost optimization. Provides: AKS Automatic GA features, Container Apps GPU workloads, Deployment Stacks best practices, latest Azure OpenAI models, Bicep 2025 patterns, Azure CLI expertise, comprehensive service configurations (compute, networking, storage, databases, AI/ML), Well-Architected Framework guidance, high availability patterns, security hardening, cost optimization strategies, and production-ready configurations. Ensures enterprise-ready, secure, scalable Azure infrastructure following Microsoft 2025 standards. 
diff --git a/agents/azure-expert.md b/agents/azure-expert.md new file mode 100644 index 0000000..b69713d --- /dev/null +++ b/agents/azure-expert.md @@ -0,0 +1,669 @@ +## 🚨 CRITICAL GUIDELINES + +### Windows File Path Requirements + +**MANDATORY: Always Use Backslashes on Windows for File Paths** + +When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`). + +**Examples:** +- ❌ WRONG: `D:/repos/project/file.tsx` +- ✅ CORRECT: `D:\repos\project\file.tsx` + +This applies to: +- Edit tool file_path parameter +- Write tool file_path parameter +- All file operations on Windows systems + +### Documentation Guidelines + +**NEVER create new documentation files unless explicitly requested by the user.** + +- **Priority**: Update existing README.md files rather than creating new documentation +- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise +- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone +- **User preference**: Only create additional .md files when user specifically asks for documentation + +--- + + +# Azure Cloud Expert Agent + +## 🚨 CRITICAL GUIDELINES + +### Windows File Path Requirements + +**MANDATORY: Always Use Backslashes on Windows for File Paths** + +When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`). 
+ +**Examples:** +- ❌ WRONG: `D:/repos/project/file.tsx` +- ✅ CORRECT: `D:\repos\project\file.tsx` + +This applies to: +- Edit tool file_path parameter +- Write tool file_path parameter +- All file operations on Windows systems + +### Documentation Guidelines + +**Never CREATE additional documentation unless explicitly requested by the user.** + +- If documentation updates are needed, modify the appropriate existing README.md file +- Do not proactively create new .md files for documentation +- Only create documentation files when the user specifically requests it + +--- + +You are a comprehensive Azure cloud expert with deep knowledge of all Azure services, 2025 features, and production-ready configuration patterns. + +## Core Responsibilities + +### 1. ALWAYS Fetch Latest Documentation First + +**CRITICAL**: Before any Azure task, fetch the latest documentation: + +```bash +# Use WebSearch for latest features +web_search: "Azure [service-name] latest features 2025" + +# Use Context7 for library documentation +resolve-library-id: "@azure/cli" or "azure-bicep" +get-library-docs: with specific topic +``` + +### 2. 
2025 Azure Feature Expertise + +**AKS Automatic (GA - October 2025)** +- Fully-managed Kubernetes with zero operational overhead +- Karpenter integration for dynamic node provisioning +- HPA, VPA, and KEDA enabled by default +- Entra ID, network policies, automatic patching built-in +- New billing: $0.16/hour cluster + compute costs +- Ubuntu 24.04 on Kubernetes 1.34+ + +**Azure Container Apps 2025 Updates** +- Serverless GPU (GA): Auto-scaling AI workloads with per-second billing +- Dedicated GPU (GA): Simplified AI deployment +- Foundry Models integration: Deploy AI models during container creation +- Workflow with Durable task scheduler (Preview) +- Native Azure Functions support +- Dynamic Sessions with GPU for untrusted code execution + +**Azure OpenAI Service Models (2025)** +- GPT-5 series: gpt-5-pro, gpt-5, gpt-5-codex (registration required) +- GPT-4.1 series: 1M token context, 4.1-mini, 4.1-nano +- Reasoning models: o4-mini, o3, o1, o1-mini +- Image generation: GPT-image-1 (2025-04-15) +- Video generation: Sora (2025-05-02) +- Audio models: gpt-4o-transcribe, gpt-4o-mini-transcribe + +**Azure AI Foundry (Build 2025)** +- Model router for optimal model selection (cost + quality) +- Agentic retrieval: 40% better on multi-part questions +- Foundry Observability (Preview): End-to-end monitoring +- SRE Agent: 24/7 monitoring, autonomous incident response +- New models: Grok 3 (xAI), Flux Pro 1.1, Sora, Hugging Face models +- ND H200 V5 VMs: NVIDIA H200 GPUs, 2x performance gains + +**Deployment Stacks (GA)** +- Manage Azure resources as unified entities +- Deny settings: DenyDelete, DenyWriteAndDelete +- ActionOnUnmanage: Detach or delete orphaned resources +- Scopes: Resource group, subscription, management group +- Replaces Azure Blueprints (deprecated July 2026) +- Built-in RBAC roles: Stack Contributor, Stack Owner + +**Bicep 2025 Updates (v0.37.4)** +- externalInput() function (GA) +- C# authoring for custom Bicep extensions +- Experimental capabilities 
+- Enhanced parameter validation +- Improved module lifecycle management + +**Azure CLI 2025 (v2.79.0)** +- Breaking changes in November 2025 release +- ACR Helm 2 support removed (March 2025) +- Role assignment delete behavior changed +- New regions and availability zones +- Enhanced Azure Container Storage support + +### 3. Production-Ready Service Patterns + +**Compute Services** + +```bash +# AKS Automatic (2025 GA) +az aks create \ + --resource-group MyRG \ + --name MyAKSAutomatic \ + --sku automatic \ + --enable-karpenter \ + --network-plugin azure \ + --network-plugin-mode overlay \ + --network-dataplane cilium \ + --os-sku AzureLinux \ + --kubernetes-version 1.34 \ + --zones 1 2 3 + +# Container Apps with GPU (2025) +az containerapp create \ + --name myapp \ + --resource-group MyRG \ + --environment myenv \ + --image myregistry.azurecr.io/myimage:latest \ + --cpu 2 \ + --memory 4Gi \ + --gpu-type nvidia-a100 \ + --gpu-count 1 \ + --min-replicas 0 \ + --max-replicas 10 \ + --scale-rule-name gpu-scaling \ + --scale-rule-type custom + +# Container Apps with Dapr +az containerapp create \ + --name myapp \ + --resource-group MyRG \ + --environment myenv \ + --enable-dapr true \ + --dapr-app-id myapp \ + --dapr-app-port 8080 \ + --dapr-app-protocol http + +# App Service with latest runtime +az webapp create \ + --resource-group MyRG \ + --plan MyPlan \ + --name MyUniqueAppName \ + --runtime "NODE|20-lts" +# --runtime and a container image (--deployment-container-image-name) are mutually exclusive; pass only one +``` + +**AI and ML Services** + +```bash +# Azure OpenAI with GPT-5 +az cognitiveservices account create \ + --name myopenai \ + --resource-group MyRG \ + --kind OpenAI \ + --sku S0 \ + --location eastus \ + --custom-domain myopenai + +az cognitiveservices account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name gpt-5 \ + --model-name gpt-5 \ + --model-version latest \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 100 + +# Deploy 
reasoning model (o3) +az cognitiveservices account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name o3-reasoning \ + --model-name o3 \ + --model-version latest \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 50 + +# AI Foundry workspace +az ml workspace create \ + --name myworkspace \ + --resource-group MyRG \ + --location eastus \ + --storage-account mystorage \ + --key-vault mykeyvault \ + --app-insights myappinsights \ + --container-registry myacr \ + --enable-data-isolation true +``` + +**Deployment Stacks (Bicep)** + +```bash +# Create deployment stack at subscription scope +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --deny-settings-mode DenyWriteAndDelete \ + --deny-settings-excluded-principals \ + --action-on-unmanage deleteAll \ + --description "Production infrastructure stack" + +# Update stack with new template (re-run create on the existing stack; `az stack sub` has no separate update command) +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --parameters @parameters.json \ + --deny-settings-mode DenyWriteAndDelete \ + --action-on-unmanage deleteAll + +# Delete stack and managed resources +az stack sub delete \ + --name MyStack \ + --action-on-unmanage deleteAll + +# List deployment stacks +az stack sub list --output table +``` + +**Bicep 2025 Patterns** + +```bicep +// main.bicep - Using externalInput() (GA in v0.37+) + +@description('External configuration source') +param configUri string + +// Load external configuration +var config = externalInput('json', configUri) + +resource storageAccount 'Microsoft.Storage/storageAccounts@2023-05-01' = { + name: config.storageAccountName + location: config.location + sku: { + name: config.sku + } + kind: 'StorageV2' + properties: { + accessTier: config.accessTier + minimumTlsVersion: 'TLS1_2' + supportsHttpsTrafficOnly: true + allowBlobPublicAccess: false + networkAcls: { + defaultAction: 'Deny' + bypass: 'AzureServices' + } + } +} + +// AKS Automatic cluster +resource aksCluster 'Microsoft.ContainerService/managedClusters@2025-01-01' = { 
+ name: 'myaksautomatic' + location: resourceGroup().location + sku: { + name: 'Automatic' + tier: 'Standard' + } + properties: { + kubernetesVersion: '1.34' + enableRBAC: true + aadProfile: { + managed: true + enableAzureRBAC: true + } + networkProfile: { + networkPlugin: 'azure' + networkPluginMode: 'overlay' + networkDataplane: 'cilium' + serviceCidr: '10.0.0.0/16' + dnsServiceIP: '10.0.0.10' + } + autoScalerProfile: { + 'balance-similar-node-groups': 'true' + expander: 'least-waste' + 'skip-nodes-with-system-pods': 'false' + } + autoUpgradeProfile: { + upgradeChannel: 'stable' + } + securityProfile: { + defender: { + securityMonitoring: { + enabled: true + } + } + } + } +} + +// Container App with GPU +resource containerApp 'Microsoft.App/containerApps@2025-02-01' = { + name: 'myapp' + location: resourceGroup().location + properties: { + environmentId: containerAppEnv.id + configuration: { + dapr: { + enabled: true + appId: 'myapp' + appPort: 8080 + appProtocol: 'http' + } + ingress: { + external: true + targetPort: 8080 + traffic: [ + { + latestRevision: true + weight: 100 + } + ] + } + } + template: { + containers: [ + { + name: 'main' + image: 'myregistry.azurecr.io/myimage:latest' + resources: { + cpu: json('2') + memory: '4Gi' + gpu: { + type: 'nvidia-a100' + count: 1 + } + } + } + ] + scale: { + minReplicas: 0 + maxReplicas: 10 + rules: [ + { + name: 'gpu-scaling' + custom: { + type: 'prometheus' + metadata: { + serverAddress: 'http://prometheus.monitoring.svc.cluster.local:9090' + metricName: 'gpu_utilization' + threshold: '80' + query: 'avg(gpu_utilization)' + } + } + } + ] + } + } + } +} +``` + +### 4. 
Well-Architected Framework Principles + +**Reliability** +- Deploy across availability zones (3 zones for 99.99% SLA) +- Use AKS Automatic with Karpenter for dynamic scaling +- Implement health probes and liveness checks +- Enable automatic OS patching and upgrades +- Use Deployment Stacks for consistent deployments + +**Security** +- Enable Microsoft Defender for Cloud +- Use managed identities (workload identity for AKS) +- Implement network policies and private endpoints +- Enable encryption at rest and in transit (TLS 1.2+) +- Use Key Vault for secrets management +- Apply deny settings in Deployment Stacks + +**Cost Optimization** +- Use AKS Automatic for efficient resource allocation +- Container Apps scale-to-zero for serverless workloads +- Purchase Azure reservations (1-year or 3-year terms) +- Enable Azure Hybrid Benefit +- Implement autoscaling policies +- Use spot instances for non-critical workloads + +**Performance** +- Use premium storage tiers for production +- Enable accelerated networking +- Use proximity placement groups +- Implement CDN for static content +- Use Azure Front Door for global routing +- Container Apps GPU for AI workloads + +**Operational Excellence** +- Use Azure Monitor and Application Insights +- Enable Foundry Observability for AI workloads +- Implement Infrastructure as Code (Bicep/Terraform) +- Use Deployment Stacks for lifecycle management +- Configure alerts and action groups +- Enable SRE Agent for autonomous monitoring + +### 5. 
Networking Best Practices + +**Hub-Spoke Topology** +```bash +# Hub VNet +az network vnet create \ + --resource-group Hub-RG \ + --name Hub-VNet \ + --address-prefix 10.0.0.0/16 \ + --subnet-name AzureFirewallSubnet \ + --subnet-prefix 10.0.1.0/24 + +# Spoke VNet +az network vnet create \ + --resource-group Spoke-RG \ + --name Spoke-VNet \ + --address-prefix 10.1.0.0/16 \ + --subnet-name WorkloadSubnet \ + --subnet-prefix 10.1.1.0/24 + +# VNet Peering +az network vnet peering create \ + --name Hub-to-Spoke \ + --resource-group Hub-RG \ + --vnet-name Hub-VNet \ + --remote-vnet /subscriptions//resourceGroups/Spoke-RG/providers/Microsoft.Network/virtualNetworks/Spoke-VNet \ + --allow-vnet-access \ + --allow-forwarded-traffic \ + --allow-gateway-transit + +# Private DNS Zone +az network private-dns zone create \ + --resource-group Hub-RG \ + --name privatelink.azurecr.io + +az network private-dns link vnet create \ + --resource-group Hub-RG \ + --zone-name privatelink.azurecr.io \ + --name hub-vnet-link \ + --virtual-network Hub-VNet \ + --registration-enabled false +``` + +### 6. 
Storage and Database Patterns + +**Storage Account with lifecycle management** +```bash +az storage account create \ + --name mystorageaccount \ + --resource-group MyRG \ + --location eastus \ + --sku Standard_ZRS \ + --kind StorageV2 \ + --access-tier Hot \ + --https-only true \ + --min-tls-version TLS1_2 \ + --allow-blob-public-access false \ + --enable-hierarchical-namespace true + +# Lifecycle management policy +az storage account management-policy create \ + --account-name mystorageaccount \ + --resource-group MyRG \ + --policy '{ + "rules": [ + { + "name": "moveToArchive", + "enabled": true, + "type": "Lifecycle", + "definition": { + "filters": { + "blobTypes": ["blockBlob"], + "prefixMatch": ["archive/"] + }, + "actions": { + "baseBlob": { + "tierToCool": {"daysAfterModificationGreaterThan": 30}, + "tierToArchive": {"daysAfterModificationGreaterThan": 90} + } + } + } + } + ] + }' +``` + +**SQL Database with zone redundancy** +```bash +az sql server create \ + --name myserver \ + --resource-group MyRG \ + --location eastus \ + --admin-user myadmin \ + --admin-password \ + --enable-public-network false \ + --restrict-outbound-network-access enabled + +# GP_S_Gen5_2 is the serverless General Purpose objective; GP_Gen5_2 is provisioned compute and does not support auto-pause +az sql db create \ + --resource-group MyRG \ + --server myserver \ + --name mydb \ + --service-objective GP_S_Gen5_2 \ + --backup-storage-redundancy Zone \ + --zone-redundant true \ + --compute-model Serverless \ + --auto-pause-delay 60 \ + --min-capacity 0.5 \ + --max-size 32GB + +# Private endpoint +az network private-endpoint create \ + --name sql-private-endpoint \ + --resource-group MyRG \ + --vnet-name MyVNet \ + --subnet PrivateEndpointSubnet \ + --private-connection-resource-id $(az sql server show -g MyRG -n myserver --query id -o tsv) \ + --group-id sqlServer \ + --connection-name sql-connection +``` + +### 7. 
Monitoring and Observability + +**Azure Monitor with Container Insights** +```bash +# Log Analytics workspace +az monitor log-analytics workspace create \ + --resource-group MyRG \ + --workspace-name MyWorkspace \ + --location eastus \ + --retention-time 90 \ + --sku PerGB2018 + +# Enable Container Insights for AKS +az aks enable-addons \ + --resource-group MyRG \ + --name MyAKS \ + --addons monitoring \ + --workspace-resource-id $(az monitor log-analytics workspace show -g MyRG -n MyWorkspace --query id -o tsv) + +# Application Insights for Container Apps +az monitor app-insights component create \ + --app MyAppInsights \ + --location eastus \ + --resource-group MyRG \ + --application-type web \ + --workspace $(az monitor log-analytics workspace show -g MyRG -n MyWorkspace --query id -o tsv) + +# Foundry Observability (Preview) +az ml workspace update \ + --name myworkspace \ + --resource-group MyRG \ + --enable-observability true + +# Alert rules +az monitor metrics alert create \ + --name high-cpu-alert \ + --resource-group MyRG \ + --scopes $(az aks show -g MyRG -n MyAKS --query id -o tsv) \ + --condition "avg Percentage CPU > 80" \ + --window-size 5m \ + --evaluation-frequency 1m \ + --action +``` + +### 8. 
Security Hardening + +**Microsoft Defender for Cloud** +```bash +# Enable Defender plans +az security pricing create --name VirtualMachines --tier Standard +az security pricing create --name SqlServers --tier Standard +az security pricing create --name AppServices --tier Standard +az security pricing create --name StorageAccounts --tier Standard +az security pricing create --name KubernetesService --tier Standard +az security pricing create --name ContainerRegistry --tier Standard +az security pricing create --name KeyVaults --tier Standard +az security pricing create --name Dns --tier Standard +az security pricing create --name Arm --tier Standard + +# Key Vault with RBAC and purge protection +az keyvault create \ + --name mykeyvault \ + --resource-group MyRG \ + --location eastus \ + --enable-rbac-authorization true \ + --enable-purge-protection true \ + --enable-soft-delete true \ + --retention-days 90 \ + --network-acls-default-action Deny + +# Managed Identity +az identity create \ + --name myidentity \ + --resource-group MyRG + +# Assign role +az role assignment create \ + --assignee \ + --role "Key Vault Secrets User" \ + --scope $(az keyvault show -g MyRG -n mykeyvault --query id -o tsv) +``` + +## Key Decision Criteria + +**Choose AKS Automatic when:** +- You want zero operational overhead +- Dynamic node provisioning is critical +- You need built-in security and compliance +- Auto-scaling across HPA, VPA, KEDA is required + +**Choose Container Apps when:** +- Serverless with scale-to-zero is needed +- Event-driven architecture with Dapr +- GPU workloads for AI/ML inference +- Simpler deployment model than Kubernetes + +**Choose App Service when:** +- Traditional web apps or APIs +- Integrated deployment slots +- Built-in authentication +- Auto-scaling without Kubernetes complexity + +**Choose VMs when:** +- Legacy applications with specific OS requirements +- Full control over OS and middleware +- Lift-and-shift migrations +- Specialized workloads + +## 
Response Guidelines + +1. **Research First**: Always fetch latest Azure documentation +2. **Production-Ready**: Provide complete, secure configurations +3. **2025 Features**: Prioritize latest GA features +4. **Best Practices**: Follow Well-Architected Framework +5. **Explain Trade-offs**: Compare options with clear decision criteria +6. **Complete Examples**: Include all required parameters +7. **Security First**: Enable encryption, RBAC, private endpoints +8. **Cost-Aware**: Suggest cost optimization strategies + +Your goal is to deliver enterprise-ready Azure solutions using 2025 best practices. diff --git a/plugin.lock.json b/plugin.lock.json new file mode 100644 index 0000000..12b2900 --- /dev/null +++ b/plugin.lock.json @@ -0,0 +1,65 @@ +{ + "$schema": "internal://schemas/plugin.lock.v1.json", + "pluginId": "gh:JosiahSiegel/claude-code-marketplace:plugins/azure-master", + "normalized": { + "repo": null, + "ref": "refs/tags/v20251128.0", + "commit": "578b0f23124804384e64d4370b4202b34fc2033d", + "treeHash": "c749337293c5a0b303e5f8e297dc52cb5c678293b9d81dda8baae4a4d8ef0e22", + "generatedAt": "2025-11-28T10:11:51.861096Z", + "toolVersion": "publish_plugins.py@0.2.0" + }, + "origin": { + "remote": "git@github.com:zhongweili/42plugin-data.git", + "branch": "master", + "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390", + "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data" + }, + "manifest": { + "name": "azure-master", + "description": "Complete Azure cloud expertise system with 2025 features including AKS Automatic, Container Apps GPU support, and Deployment Stacks. 
PROACTIVELY activate for: (1) ANY Azure resource provisioning or management, (2) AKS Automatic with Karpenter autoscaling, (3) Container Apps with serverless GPU and Dapr integration, (4) Azure OpenAI GPT-5 and reasoning models (o4-mini, o3), (5) Deployment Stacks for infrastructure lifecycle management, (6) Bicep v0.37+ with externalInput() and custom extensions, (7) Azure CLI 2.79+ with latest breaking changes, (8) SRE Agent integration for monitoring and incident response, (9) Azure AI Foundry model deployment, (10) Security, networking, and cost optimization. Provides: AKS Automatic GA features, Container Apps GPU workloads, Deployment Stacks best practices, latest Azure OpenAI models, Bicep 2025 patterns, Azure CLI expertise, comprehensive service configurations (compute, networking, storage, databases, AI/ML), Well-Architected Framework guidance, high availability patterns, security hardening, cost optimization strategies, and production-ready configurations. Ensures enterprise-ready, secure, scalable Azure infrastructure following Microsoft 2025 standards.", + "version": "1.1.0" + }, + "content": { + "files": [ + { + "path": "README.md", + "sha256": "78c0ce977f2150baca0f182dbcd6efe57ede0719438e41238efff325933716e0" + }, + { + "path": "agents/azure-expert.md", + "sha256": "9c49c91c80b8a398b99177c978d7c22e5469a053b13d6677a1da8c98bfe92278" + }, + { + "path": ".claude-plugin/plugin.json", + "sha256": "2306520f188f8b53fc7925d230f553256ab88ea83e9fe552c516ab60175be11c" + }, + { + "path": "skills/azure-openai-2025.md", + "sha256": "2cafd7b9019a5e9d99f2db0a10b821c565f45895977becb5cbc4e8adf51605de" + }, + { + "path": "skills/azure-well-architected-framework.md", + "sha256": "5cb44d98f56310779dbb76670cbb6041a7309c458bc642c4bc0f9224c2bcfaaf" + }, + { + "path": "skills/deployment-stacks-2025.md", + "sha256": "d2ca648a8ea8cea0d1fc2ee08a9cd071332bbfed0f47ebaa2b0e8fc86d05bccf" + }, + { + "path": "skills/container-apps-gpu-2025.md", + "sha256": 
"4c52b7b9c81cb03538ca655463eccefd292c0aaf4fe72599c73f287636d329fe" + }, + { + "path": "skills/aks-automatic-2025.md", + "sha256": "296d3a9d9eab641774604908513c674cb086fe4b45e25aa75919772cd89bc733" + } + ], + "dirSha256": "c749337293c5a0b303e5f8e297dc52cb5c678293b9d81dda8baae4a4d8ef0e22" + }, + "security": { + "scannedAt": null, + "scannerVersion": null, + "flags": [] + } +} \ No newline at end of file diff --git a/skills/aks-automatic-2025.md b/skills/aks-automatic-2025.md new file mode 100644 index 0000000..8e91397 --- /dev/null +++ b/skills/aks-automatic-2025.md @@ -0,0 +1,620 @@ +## 🚨 CRITICAL GUIDELINES + +### Windows File Path Requirements + +**MANDATORY: Always Use Backslashes on Windows for File Paths** + +When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`). + +**Examples:** +- ❌ WRONG: `D:/repos/project/file.tsx` +- ✅ CORRECT: `D:\repos\project\file.tsx` + +This applies to: +- Edit tool file_path parameter +- Write tool file_path parameter +- All file operations on Windows systems + +### Documentation Guidelines + +**NEVER create new documentation files unless explicitly requested by the user.** + +- **Priority**: Update existing README.md files rather than creating new documentation +- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise +- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone +- **User preference**: Only create additional .md files when user specifically asks for documentation + +--- + + +# AKS Automatic - 2025 GA Features + +Complete knowledge base for Azure Kubernetes Service Automatic mode (GA October 2025). + +## Overview + +AKS Automatic is a fully-managed Kubernetes offering that eliminates operational overhead through intelligent automation and built-in best practices. + +## Key Features (GA October 2025) + +### 1. 
Zero Operational Overhead +- Fully-managed control plane and worker nodes +- Automatic OS patching and security updates +- Built-in monitoring and diagnostics +- Integrated security and compliance + +### 2. Karpenter Integration +- Dynamic node provisioning based on real-time demand +- Intelligent bin-packing for cost optimization +- Automatic node consolidation and deprovisioning +- Support for multiple node pools and instance types + +### 3. Auto-Scaling (Enabled by Default) +- **Horizontal Pod Autoscaler (HPA)**: Scale pods based on CPU/memory +- **Vertical Pod Autoscaler (VPA)**: Adjust pod resource requests/limits +- **KEDA**: Event-driven autoscaling for external triggers + +### 4. Enhanced Security +- Microsoft Entra ID integration for authentication +- Azure RBAC for Kubernetes authorization +- Network policies enabled by default +- Automatic security patches +- Workload identity for pod-level authentication + +### 5. Advanced Networking +- Azure CNI Overlay for efficient IP usage +- Cilium dataplane for high-performance networking +- Network policies for microsegmentation +- Private clusters supported + +### 6. New Billing Model (Effective October 19, 2025) +- Hosted control plane fee: **$0.16/cluster/hour** +- Compute charges based on actual node usage +- No separate cluster management fee +- Cost savings from Karpenter optimization + +### 7. 
Node Operating System +- Ubuntu 22.04 for Kubernetes < 1.34 +- Ubuntu 24.04 for Kubernetes >= 1.34 +- Automatic OS upgrades with node image channel + +## Creating AKS Automatic Cluster + +### Basic Creation + +```bash +az aks create \ + --resource-group MyRG \ + --name MyAKSAutomatic \ + --sku automatic \ + --kubernetes-version 1.34 \ + --location eastus +``` + +### Production-Ready Configuration + +```bash +# Flag groups, in order: Kubernetes version, Karpenter (default in automatic mode), +# networking, custom VNet (optional), availability zones, authentication and +# authorization, auto-upgrade, security, monitoring, tags. +# A comment placed between backslash-continued lines ends the command early, so the notes live here. +az aks create \ + --resource-group MyRG \ + --name MyAKSAutomatic \ + --location eastus \ + --sku automatic \ + --tier standard \ + --kubernetes-version 1.34 \ + --enable-karpenter \ + --network-plugin azure \ + --network-plugin-mode overlay \ + --network-dataplane cilium \ + --service-cidr 10.0.0.0/16 \ + --dns-service-ip 10.0.0.10 \ + --load-balancer-sku standard \ + --vnet-subnet-id /subscriptions//resourceGroups/MyRG/providers/Microsoft.Network/virtualNetworks/MyVNet/subnets/AKSSubnet \ + --zones 1 2 3 \ + --enable-managed-identity \ + --enable-aad \ + --enable-azure-rbac \ + --aad-admin-group-object-ids \ + --auto-upgrade-channel stable \ + --node-os-upgrade-channel NodeImage \ + --enable-defender \ + --enable-workload-identity \ + --enable-oidc-issuer \ + --enable-addons monitoring \ + --workspace-resource-id /subscriptions//resourceGroups/MyRG/providers/Microsoft.OperationalInsights/workspaces/MyWorkspace \ + --tags Environment=Production ManagedBy=AKSAutomatic +``` + +### With Azure Policy Add-on + +```bash +az aks create \ + --resource-group MyRG \ + --name MyAKSAutomatic \ + --sku automatic \ + --enable-addons azure-policy \ + --kubernetes-version 1.34 +``` + +## Karpenter Configuration + +AKS Automatic uses Karpenter for intelligent node provisioning. 
Customize node provisioning with AKSNodeClass and NodePool CRDs. + +### Default AKSNodeClass + +```yaml +apiVersion: karpenter.azure.com/v1alpha1 +kind: AKSNodeClass +metadata: + name: default +spec: + # OS Image - Ubuntu 24.04 for K8s 1.34+ + osImage: + sku: Ubuntu + version: "24.04" + + # VM Series + vmSeries: + - Standard_D + - Standard_E + + # Max pods per node + maxPodsPerNode: 110 + + # Security + securityProfile: + sshAccess: Disabled + securityType: Standard +``` + +### Custom NodePool + +```yaml +apiVersion: karpenter.sh/v1 +kind: NodePool +metadata: + name: general-purpose +spec: + # Constraints + template: + spec: + requirements: + - key: kubernetes.io/arch + operator: In + values: ["amd64"] + - key: karpenter.sh/capacity-type + operator: In + values: ["on-demand"] + - key: kubernetes.azure.com/agentpool + operator: In + values: ["general"] + + # Node labels + labels: + workload-type: general + + # Taints (optional) + taints: + - key: "dedicated" + value: "general" + effect: "NoSchedule" + + # NodeClass reference + nodeClassRef: + group: karpenter.azure.com + kind: AKSNodeClass + name: default + + # Limits + limits: + cpu: "1000" + memory: 4000Gi + + # Disruption budget + disruption: + consolidationPolicy: WhenEmpty + consolidateAfter: 30s + expireAfter: 720h # 30 days + budgets: + - nodes: "10%" + duration: 5m +``` + +### GPU NodePool for AI Workloads + +```yaml +apiVersion: karpenter.sh/v1 +kind: NodePool +metadata: + name: gpu-workloads +spec: + template: + spec: + requirements: + - key: kubernetes.io/arch + operator: In + values: ["amd64"] + - key: karpenter.sh/capacity-type + operator: In + values: ["on-demand"] + - key: node.kubernetes.io/instance-type + operator: In + values: ["Standard_NC6s_v3", "Standard_NC12s_v3", "Standard_NC24s_v3"] + + labels: + workload-type: gpu + gpu-type: nvidia-v100 + + taints: + - key: "nvidia.com/gpu" + value: "true" + effect: "NoSchedule" + + nodeClassRef: + group: karpenter.azure.com + kind: AKSNodeClass + name: 
gpu-nodeclass + + limits: + cpu: "200" + memory: 800Gi + nvidia.com/gpu: "16" + + disruption: + consolidationPolicy: WhenEmpty + consolidateAfter: 300s +``` + +## Autoscaling with HPA, VPA, and KEDA + +### Horizontal Pod Autoscaler (HPA) + +```yaml +apiVersion: autoscaling/v2 +kind: HorizontalPodAutoscaler +metadata: + name: myapp-hpa +spec: + scaleTargetRef: + apiVersion: apps/v1 + kind: Deployment + name: myapp + minReplicas: 2 + maxReplicas: 50 + metrics: + - type: Resource + resource: + name: cpu + target: + type: Utilization + averageUtilization: 70 + - type: Resource + resource: + name: memory + target: + type: Utilization + averageUtilization: 80 + behavior: + scaleUp: + stabilizationWindowSeconds: 0 + policies: + - type: Percent + value: 100 + periodSeconds: 15 + - type: Pods + value: 4 + periodSeconds: 15 + selectPolicy: Max + scaleDown: + stabilizationWindowSeconds: 300 + policies: + - type: Percent + value: 50 + periodSeconds: 15 +``` + +### Vertical Pod Autoscaler (VPA) + +```yaml +apiVersion: autoscaling.k8s.io/v1 +kind: VerticalPodAutoscaler +metadata: + name: myapp-vpa +spec: + targetRef: + apiVersion: apps/v1 + kind: Deployment + name: myapp + updatePolicy: + updateMode: "Auto" # Auto, Recreate, Initial, Off + resourcePolicy: + containerPolicies: + - containerName: "*" + minAllowed: + cpu: 100m + memory: 128Mi + maxAllowed: + cpu: 4 + memory: 8Gi + controlledResources: ["cpu", "memory"] + controlledValues: RequestsAndLimits +``` + +### KEDA ScaledObject (Event-Driven) + +```yaml +apiVersion: keda.sh/v1alpha1 +kind: ScaledObject +metadata: + name: myapp-queue-scaler +spec: + scaleTargetRef: + name: myapp + minReplicaCount: 0 # Scale to zero + maxReplicaCount: 100 + pollingInterval: 30 + cooldownPeriod: 300 + triggers: + # Azure Service Bus Queue + - type: azure-servicebus + metadata: + queueName: myqueue + namespace: myservicebus + messageCount: "5" + authenticationRef: + name: azure-servicebus-auth + + # Azure Storage Queue + - type: azure-queue + 
metadata: + queueName: myqueue + queueLength: "10" + accountName: mystorageaccount + authenticationRef: + name: azure-storage-auth + + # Prometheus metrics + - type: prometheus + metadata: + serverAddress: http://prometheus.monitoring.svc.cluster.local:9090 + metricName: http_requests_per_second + threshold: "100" + query: sum(rate(http_requests_total[2m])) +``` + +## Workload Identity (Replaces AAD Pod Identity) + +### Setup + +```bash +# Workload identity is enabled by default in AKS Automatic + +# Create managed identity +az identity create \ + --name myapp-identity \ + --resource-group MyRG + +# Get identity details +export IDENTITY_CLIENT_ID=$(az identity show -g MyRG -n myapp-identity --query clientId -o tsv) +export IDENTITY_OBJECT_ID=$(az identity show -g MyRG -n myapp-identity --query principalId -o tsv) + +# Assign role to identity +az role assignment create \ + --assignee $IDENTITY_OBJECT_ID \ + --role "Storage Blob Data Contributor" \ + --scope /subscriptions//resourceGroups/MyRG/providers/Microsoft.Storage/storageAccounts/mystorage + +# Create federated credential +export AKS_OIDC_ISSUER=$(az aks show -g MyRG -n MyAKSAutomatic --query oidcIssuerProfile.issuerUrl -o tsv) + +az identity federated-credential create \ + --name myapp-federated-credential \ + --identity-name myapp-identity \ + --resource-group MyRG \ + --issuer $AKS_OIDC_ISSUER \ + --subject system:serviceaccount:default:myapp-sa +``` + +### Kubernetes Resources + +```yaml +# Service Account +apiVersion: v1 +kind: ServiceAccount +metadata: + name: myapp-sa + namespace: default + annotations: + azure.workload.identity/client-id: "" + +--- +# Deployment using workload identity +apiVersion: apps/v1 +kind: Deployment +metadata: + name: myapp +spec: + replicas: 2 + selector: + matchLabels: + app: myapp + template: + metadata: + labels: + app: myapp + azure.workload.identity/use: "true" # Enable workload identity + spec: + serviceAccountName: myapp-sa + containers: + - name: myapp + image: 
myregistry.azurecr.io/myapp:latest + env: + - name: AZURE_CLIENT_ID + value: "" + - name: AZURE_TENANT_ID + value: "" + - name: AZURE_FEDERATED_TOKEN_FILE + value: /var/run/secrets/azure/tokens/azure-identity-token + volumeMounts: + - name: azure-identity-token + mountPath: /var/run/secrets/azure/tokens + readOnly: true + volumes: + - name: azure-identity-token + projected: + sources: + - serviceAccountToken: + path: azure-identity-token + expirationSeconds: 3600 + audience: api://AzureADTokenExchange +``` + +## Monitoring and Observability + +### Enable Container Insights + +```bash +# Already enabled with --enable-addons monitoring +# Query logs using Azure Monitor + +# Get cluster logs +az monitor log-analytics query \ + --workspace \ + --analytics-query "KubePodInventory | where ClusterName == 'MyAKSAutomatic' | take 10" \ + --output table + +# Get Karpenter logs +kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter +``` + +### Prometheus and Grafana + +```bash +# Enable managed Prometheus +az aks update \ + --resource-group MyRG \ + --name MyAKSAutomatic \ + --enable-azure-monitor-metrics + +# Access Grafana dashboards through Azure Portal +``` + +## Cost Optimization + +### Billing Model (October 2025) +- **Control plane**: $0.16/hour per cluster +- **Compute**: Pay for actual node usage +- **Karpenter**: Automatic bin-packing and consolidation +- **Scale-to-zero**: Possible with KEDA and Karpenter + +### Cost-Saving Tips + +1. **Use Spot Instances for Non-Critical Workloads** +```yaml +- key: karpenter.sh/capacity-type + operator: In + values: ["spot"] +``` + +2. **Configure Aggressive Consolidation** +```yaml +# karpenter.sh/v1 accepts WhenEmpty or WhenEmptyOrUnderutilized +disruption: + consolidationPolicy: WhenEmptyOrUnderutilized + consolidateAfter: 30s +``` + +3. **Implement Pod Disruption Budgets** +```yaml +apiVersion: policy/v1 +kind: PodDisruptionBudget +metadata: + name: myapp-pdb +spec: + minAvailable: 1 + selector: + matchLabels: + app: myapp +``` + +4. 
**Use VPA for Right-Sizing** +- VPA automatically adjusts resource requests based on actual usage + +## Migration from Standard AKS to Automatic + +AKS Automatic is a new cluster mode - in-place migration is not supported. Follow these steps: + +1. **Create new AKS Automatic cluster** +2. **Install workloads in new cluster** +3. **Validate functionality** +4. **Switch traffic** (DNS, load balancer) +5. **Decommission old cluster** + +## Best Practices + +✓ Use AKS Automatic for new production clusters +✓ Enable workload identity for pod authentication +✓ Configure custom NodePools for specific workload types +✓ Implement HPA, VPA, and KEDA for comprehensive scaling +✓ Use spot instances for batch and fault-tolerant workloads +✓ Enable Container Insights and Managed Prometheus +✓ Configure Pod Disruption Budgets for critical apps +✓ Use network policies for microsegmentation +✓ Enable Azure Policy add-on for compliance +✓ Implement GitOps with Flux or Argo CD + +## Troubleshooting + +### Check Karpenter Status +```bash +kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter --tail=100 +kubectl get nodepools +kubectl get nodeclaims +``` + +### View Node Provisioning Events +```bash +kubectl get events --field-selector involvedObject.kind=NodePool -A +``` + +### Debug Workload Identity Issues +```bash +# Check service account annotation +kubectl get sa myapp-sa -o yaml + +# Check pod labels +kubectl get pod -o yaml | grep azure.workload.identity + +# Check federated credential +az identity federated-credential show \ + --identity-name myapp-identity \ + --resource-group MyRG \ + --name myapp-federated-credential +``` + +## References + +- [AKS Automatic Documentation](https://learn.microsoft.com/en-us/azure/aks/automatic) +- [Karpenter on Azure](https://karpenter.sh) +- [Workload Identity](https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview) +- [AKS Release Notes](https://github.com/Azure/AKS/releases) + +AKS Automatic represents the future 
of managed Kubernetes on Azure - zero operational overhead with maximum automation! diff --git a/skills/azure-openai-2025.md b/skills/azure-openai-2025.md new file mode 100644 index 0000000..8f44320 --- /dev/null +++ b/skills/azure-openai-2025.md @@ -0,0 +1,718 @@ +## 🚨 CRITICAL GUIDELINES + +### Windows File Path Requirements + +**MANDATORY: Always Use Backslashes on Windows for File Paths** + +When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`). + +**Examples:** +- ❌ WRONG: `D:/repos/project/file.tsx` +- ✅ CORRECT: `D:\repos\project\file.tsx` + +This applies to: +- Edit tool file_path parameter +- Write tool file_path parameter +- All file operations on Windows systems + +### Documentation Guidelines + +**NEVER create new documentation files unless explicitly requested by the user.** + +- **Priority**: Update existing README.md files rather than creating new documentation +- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise +- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone +- **User preference**: Only create additional .md files when user specifically asks for documentation + +--- + + +# Azure OpenAI Service - 2025 Models and Features + +Complete knowledge base for Azure OpenAI Service with latest 2025 models including GPT-5, GPT-4.1, reasoning models, and Azure AI Foundry integration. + +## Overview + +Azure OpenAI Service provides REST API access to OpenAI's most powerful models with enterprise-grade security, compliance, and regional availability. 
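
The REST access described in the overview can be sketched without any SDK. This is a minimal illustration, not a definitive client: it only builds the URL, headers, and body for a chat completions call and sends nothing. The helper name `build_chat_request`, the resource name `myopenai`, and the placeholder key are assumptions made for this example; the API version is the one used elsewhere in this file.

```python
import json
from urllib.parse import quote, urlencode


def build_chat_request(endpoint, deployment, api_key, messages,
                       api_version="2025-02-01-preview"):
    """Return (url, headers, body) for an Azure OpenAI chat completions call.

    Hypothetical helper for illustration only; it performs no network I/O.
    """
    base = endpoint.rstrip("/")
    # Requests are addressed to a deployment name, not to a model name.
    url = (
        f"{base}/openai/deployments/{quote(deployment, safe='')}"
        f"/chat/completions?{urlencode({'api-version': api_version})}"
    )
    # Key-based auth uses the api-key header.
    headers = {"Content-Type": "application/json", "api-key": api_key}
    body = json.dumps({"messages": messages})
    return url, headers, body


url, headers, body = build_chat_request(
    endpoint="https://myopenai.openai.azure.com/",  # assumed resource name
    deployment="gpt-5",
    api_key="<your-key>",  # placeholder, never hard-code real keys
    messages=[{"role": "user", "content": "Hello"}],
)
print(url)
```

The resulting tuple can be passed to any HTTP client. With managed identity, an `Authorization: Bearer` token would replace the `api-key` header.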
+ +## Latest Models (2025) + +### GPT-5 Series (GA August 2025) + +**Registration Required Models:** +- `gpt-5-pro`: Highest capability, complex reasoning +- `gpt-5`: Balanced performance and cost +- `gpt-5-codex`: Optimized for code generation + +**No Registration Required:** +- `gpt-5-mini`: Faster, more affordable +- `gpt-5-nano`: Ultra-fast for simple tasks +- `gpt-5-chat`: Optimized for conversational use + +### GPT-4.1 Series + +- `gpt-4.1`: 1 million token context window +- `gpt-4.1-mini`: Efficient version with 1M context +- `gpt-4.1-nano`: Fastest variant + +**Key Improvements:** +- 1,000,000 token context (vs 128K in GPT-4 Turbo) +- Better instruction following +- Reduced hallucinations +- Improved multilingual support + +### Reasoning Models + +**o4-mini**: Lightweight reasoning model +- Faster inference +- Lower cost +- Suitable for structured reasoning tasks + +**o3**: Advanced reasoning model +- Complex problem solving +- Mathematical reasoning +- Scientific analysis + +**o1**: Original reasoning model +- General-purpose reasoning +- Step-by-step explanations + +**o1-mini**: Efficient reasoning +- Balanced cost and performance + +### Image Generation + +**GPT-image-1 (2025-04-15)** +- DALL-E 3 successor +- Higher quality images +- Better prompt understanding +- Improved safety filters + +### Video Generation + +**Sora (2025-05-02)** +- Text-to-video generation +- Realistic and imaginative scenes +- Up to 60 seconds of video +- Multiple camera angles and styles + +### Audio Models + +**gpt-4o-transcribe**: Speech-to-text powered by GPT-4o +- High accuracy transcription +- Multiple languages +- Speaker diarization + +**gpt-4o-mini-transcribe**: Faster, more affordable transcription +- Good accuracy +- Lower latency +- Cost-effective + +## Deploying Azure OpenAI + +### Create Azure OpenAI Resource + +```bash +# Create OpenAI account +az cognitiveservices account create \ + --name myopenai \ + --resource-group MyRG \ + --kind OpenAI \ + --sku S0 \ + 
--location eastus \ + --custom-domain myopenai \ + --public-network-access Disabled \ + --identity-type SystemAssigned + +# Get endpoint and key +az cognitiveservices account show \ + --name myopenai \ + --resource-group MyRG \ + --query "properties.endpoint" \ + --output tsv + +az cognitiveservices account keys list \ + --name myopenai \ + --resource-group MyRG \ + --query "key1" \ + --output tsv +``` + +### Deploy GPT-5 Model + +```bash +# Deploy gpt-5 +az cognitiveservices account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name gpt-5 \ + --model-name gpt-5 \ + --model-version latest \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 100 \ + --scale-type Standard + +# Deploy gpt-5-pro (requires registration) +az cognitiveservices account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name gpt-5-pro \ + --model-name gpt-5-pro \ + --model-version latest \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 50 +``` + +### Deploy Reasoning Models + +```bash +# Deploy o3 reasoning model +az cognitiveservices account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name o3-reasoning \ + --model-name o3 \ + --model-version latest \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 50 + +# Deploy o4-mini +az cognitiveservices account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name o4-mini \ + --model-name o4-mini \ + --model-version latest \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 100 +``` + +### Deploy GPT-4.1 with 1M Context + +```bash +az cognitiveservices account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name gpt-4-1 \ + --model-name gpt-4.1 \ + --model-version latest \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 100 +``` + +### Deploy Image Generation Model + +```bash +az cognitiveservices 
account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name image-gen \ + --model-name gpt-image-1 \ + --model-version 2025-04-15 \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 10 +``` + +### Deploy Sora Video Generation + +```bash +az cognitiveservices account deployment create \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name sora \ + --model-name sora \ + --model-version 2025-05-02 \ + --model-format OpenAI \ + --sku-name Standard \ + --sku-capacity 5 +``` + +## Using Azure OpenAI Models + +### Python SDK (GPT-5) + +```python +from openai import AzureOpenAI +import os + +# Initialize client +client = AzureOpenAI( + api_key=os.getenv("AZURE_OPENAI_API_KEY"), + api_version="2025-02-01-preview", + azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT") +) + +# GPT-5 completion +response = client.chat.completions.create( + model="gpt-5", # deployment name + messages=[ + {"role": "system", "content": "You are a helpful AI assistant."}, + {"role": "user", "content": "Explain quantum computing in simple terms."} + ], + max_completion_tokens=1000 # gpt-5 is a reasoning model: max_tokens, temperature, and top_p are not supported +) + +print(response.choices[0].message.content) +``` + +### Python SDK (o3 Reasoning Model) + +```python +# o3 reasoning with chain-of-thought +response = client.chat.completions.create( + model="o3-reasoning", + messages=[ + {"role": "system", "content": "You are an expert problem solver. 
Show your reasoning step-by-step."}, + {"role": "user", "content": "If a train travels 120 km in 2 hours, then speeds up to travel 180 km in the next 2 hours, what is the average speed for the entire journey?"} + ], + max_completion_tokens=2000 # o-series models take max_completion_tokens; temperature is not supported +) + +print(response.choices[0].message.content) +``` + +### Python SDK (GPT-4.1 with 1M Context) + +```python +# Read a large document +with open('large_document.txt', 'r') as f: + document = f.read() + +# GPT-4.1 can handle up to 1M tokens +response = client.chat.completions.create( + model="gpt-4-1", + messages=[ + {"role": "system", "content": "You are a document analysis expert."}, + {"role": "user", "content": f"Analyze this document and provide key insights:\n\n{document}"} + ], + max_tokens=4000 +) + +print(response.choices[0].message.content) +``` + +### Image Generation (GPT-image-1) + +```python +# Generate image with DALL-E 3 successor +response = client.images.generate( + model="image-gen", + prompt="A futuristic city with flying cars and vertical gardens, cyberpunk style, highly detailed, 4K", + size="1024x1024", + quality="hd", + n=1 +) + +image_url = response.data[0].url +print(f"Generated image: {image_url}") +``` + +### Video Generation (Sora) + +```python +# Generate video with Sora +response = client.videos.generate( + model="sora", + prompt="A serene lakeside at sunset with birds flying overhead and gentle waves on the shore", + duration=10, # seconds + resolution="1080p", + fps=30 +) + +video_url = response.data[0].url +print(f"Generated video: {video_url}") +``` + +### Audio Transcription + +```python +# Transcribe audio file +audio_file = open("meeting_recording.mp3", "rb") + +response = client.audio.transcriptions.create( + model="gpt-4o-transcribe", + file=audio_file, + language="en", + response_format="verbose_json" +) + +print(f"Transcription: {response.text}") +print(f"Duration: {response.duration}s") + +# Speaker diarization +for segment in 
response.segments: + print(f"[{segment.start}s - {segment.end}s] {segment.text}") +``` + +## Azure AI Foundry Integration + +### Model Router (Automatic Model Selection) + +```python +from azure.ai.foundry import ModelRouter + +# Initialize model router +router = ModelRouter( + endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"), + credential=os.getenv("AZURE_OPENAI_API_KEY") +) + +# Automatically select optimal model +response = router.complete( + prompt="Analyze this complex scientific paper...", + optimization_goals=["quality", "cost"], + available_models=["gpt-5", "gpt-5-mini", "gpt-4-1"] +) + +print(f"Selected model: {response.model_used}") +print(f"Response: {response.content}") +print(f"Cost: ${response.cost}") +``` + +**Benefits:** +- Automatic model selection based on prompt complexity +- Balance quality vs cost +- Reduce costs by up to 40% while maintaining quality + +### Agentic Retrieval (Azure AI Search Integration) + +```python +from azure.search.documents import SearchClient +from azure.core.credentials import AzureKeyCredential + +# Initialize search client +search_client = SearchClient( + endpoint=os.getenv("SEARCH_ENDPOINT"), + index_name="documents", + credential=AzureKeyCredential(os.getenv("SEARCH_KEY")) +) + +# Agentic retrieval with Azure OpenAI +response = client.chat.completions.create( + model="gpt-5", + messages=[ + {"role": "system", "content": "You have access to a document search system."}, + {"role": "user", "content": "What are the company's revenue projections for Q3?"} + ], + tools=[{ + "type": "function", + "function": { + "name": "search_documents", + "description": "Search company documents", + "parameters": { + "type": "object", + "properties": { + "query": {"type": "string", "description": "Search query"} + }, + "required": ["query"] + } + } + }], + tool_choice="auto" +) + +# Process tool calls +if response.choices[0].message.tool_calls: + for tool_call in response.choices[0].message.tool_calls: + if tool_call.function.name == 
"search_documents": + query = json.loads(tool_call.function.arguments)["query"] + results = search_client.search(query) + # Feed results back to model for final answer +``` + +**Improvements:** +- 40% better on complex, multi-part questions +- Automatic query decomposition +- Relevance ranking +- Citation generation + +### Foundry Observability (Preview) + +```python +from azure.ai.foundry import FoundryObservability + +# Enable observability +observability = FoundryObservability( + workspace_id=os.getenv("AI_FOUNDRY_WORKSPACE_ID"), + enable_tracing=True, + enable_metrics=True +) + +# Monitor agent execution +with observability.trace_agent("customer_support_agent") as trace: + response = client.chat.completions.create( + model="gpt-5", + messages=messages + ) + + trace.log_tool_call("search_kb", {"query": "refund policy"}) + trace.log_reasoning_step("Retrieved refund policy document") + trace.log_token_usage(response.usage.total_tokens) + +# View in Azure AI Foundry portal: +# - End-to-end trace logs +# - Reasoning steps and tool calls +# - Performance metrics +# - Cost analysis +``` + +## Capacity and Quota Management + +### Check Quota + +```bash +# List deployments with usage +az cognitiveservices account deployment list \ + --resource-group MyRG \ + --name myopenai \ + --output table + +# Check usage metrics +az monitor metrics list \ + --resource $(az cognitiveservices account show -g MyRG -n myopenai --query id -o tsv) \ + --metric "TokenTransaction" \ + --start-time 2025-01-01T00:00:00Z \ + --end-time 2025-01-31T23:59:59Z \ + --interval PT1H \ + --aggregation Total +``` + +### Update Capacity + +```bash +# Scale up deployment capacity +az cognitiveservices account deployment update \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name gpt-5 \ + --sku-capacity 200 + +# Scale down during off-peak +az cognitiveservices account deployment update \ + --resource-group MyRG \ + --name myopenai \ + --deployment-name gpt-5 \ + --sku-capacity 50 +``` + 
+### Request Quota Increase + +1. Navigate to Azure Portal → Azure OpenAI resource +2. Go to "Quotas" blade +3. Select model and region +4. Click "Request quota increase" +5. Provide justification and target capacity + +## Security and Networking + +### Private Endpoint + +```bash +# Create private endpoint +az network private-endpoint create \ + --name openai-private-endpoint \ + --resource-group MyRG \ + --vnet-name MyVNet \ + --subnet PrivateEndpointSubnet \ + --private-connection-resource-id $(az cognitiveservices account show -g MyRG -n myopenai --query id -o tsv) \ + --group-id account \ + --connection-name openai-connection + +# Create private DNS zone +az network private-dns zone create \ + --resource-group MyRG \ + --name privatelink.openai.azure.com + +# Link to VNet +az network private-dns link vnet create \ + --resource-group MyRG \ + --zone-name privatelink.openai.azure.com \ + --name openai-dns-link \ + --virtual-network MyVNet \ + --registration-enabled false + +# Create DNS zone group +az network private-endpoint dns-zone-group create \ + --resource-group MyRG \ + --endpoint-name openai-private-endpoint \ + --name default \ + --private-dns-zone privatelink.openai.azure.com \ + --zone-name privatelink.openai.azure.com +``` + +### Managed Identity Access + +```bash +# Enable system-assigned identity +az cognitiveservices account identity assign \ + --name myopenai \ + --resource-group MyRG + +# Grant role to managed identity +PRINCIPAL_ID=$(az cognitiveservices account show -g MyRG -n myopenai --query identity.principalId -o tsv) + +az role assignment create \ + --assignee $PRINCIPAL_ID \ + --role "Cognitive Services OpenAI User" \ + --scope /subscriptions//resourceGroups/MyRG +``` + +### Content Filtering + +```bash +# Configure content filtering +az cognitiveservices account update \ + --name myopenai \ + --resource-group MyRG \ + --set properties.customContentFilter='{ + "hate": {"severity": "medium", "enabled": true}, + "violence": {"severity": 
"medium", "enabled": true}, + "sexual": {"severity": "medium", "enabled": true}, + "selfHarm": {"severity": "high", "enabled": true} + }' +``` + +## Cost Optimization + +### Model Selection Strategy + +**Use GPT-5-mini or GPT-5-nano for:** +- Simple questions +- Classification tasks +- Content moderation +- Summarization + +**Use GPT-5 or GPT-4.1 for:** +- Complex reasoning +- Long-form content generation +- Document analysis +- Code generation + +**Use Reasoning Models (o3, o4-mini) for:** +- Mathematical problems +- Scientific analysis +- Step-by-step reasoning +- Logic puzzles + +### Implement Caching + +```python +# Use semantic cache to reduce duplicate requests +from azure.ai.cache import SemanticCache + +cache = SemanticCache( + similarity_threshold=0.95, + ttl_seconds=3600 +) + +# Check cache before API call +cached_response = cache.get(user_query) +if cached_response: + return cached_response + +response = client.chat.completions.create( + model="gpt-5", + messages=messages +) + +cache.set(user_query, response) +``` + +### Token Management + +```python +import tiktoken + +# Count tokens before API call +encoding = tiktoken.get_encoding("cl100k_base") +tokens = len(encoding.encode(prompt)) + +if tokens > 100000: + print(f"Warning: Prompt has {tokens} tokens, this will be expensive!") + +# Use shorter max_tokens when appropriate +response = client.chat.completions.create( + model="gpt-5", + messages=messages, + max_tokens=500 # Limit output tokens +) +``` + +## Monitoring and Alerts + +### Set Up Cost Alerts + +```bash +# Create budget alert +az consumption budget create \ + --budget-name openai-monthly-budget \ + --resource-group MyRG \ + --amount 1000 \ + --category Cost \ + --time-grain Monthly \ + --start-date 2025-01-01 \ + --end-date 2025-12-31 \ + --notifications '{ + "actual_GreaterThan_80_Percent": { + "enabled": true, + "operator": "GreaterThan", + "threshold": 80, + "contactEmails": ["billing@example.com"] + } + }' +``` + +### Application Insights 
Integration + +```python +from opencensus.ext.azure.log_exporter import AzureLogHandler +import logging + +# Configure logging +logger = logging.getLogger(__name__) +logger.addHandler(AzureLogHandler( + connection_string=os.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING") +)) + +# Log API calls +logger.info("OpenAI API call", extra={ + "custom_dimensions": { + "model": "gpt-5", + "tokens": response.usage.total_tokens, + "cost": calculate_cost(response.usage.total_tokens), + "latency_ms": response.response_ms + } +}) +``` + +## Best Practices + +✓ **Use Model Router** for automatic cost optimization +✓ **Implement caching** to reduce duplicate requests +✓ **Monitor token usage** and set budgets +✓ **Use private endpoints** for production workloads +✓ **Enable managed identity** instead of API keys +✓ **Configure content filtering** for safety +✓ **Right-size capacity** based on actual demand +✓ **Use Foundry Observability** for monitoring +✓ **Implement retry logic** with exponential backoff +✓ **Choose appropriate models** for task complexity + +## References + +- [Azure OpenAI Documentation](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/) +- [What's New in Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/whats-new) +- [GPT-5 Announcement](https://azure.microsoft.com/en-us/blog/gpt-5-azure/) +- [Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-foundry/) +- [Model Pricing](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/) + +Azure OpenAI Service with GPT-5 and reasoning models brings enterprise-grade AI to your applications! 
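
The "Implement retry logic with exponential backoff" practice listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SDK's own mechanism: `retry_with_backoff` and its parameters are hypothetical names, and the delays use full jitter (a random wait between zero and the capped exponential delay).

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retryable=(Exception,), sleep=time.sleep):
    """Invoke call() until it succeeds or max_attempts is exhausted.

    Hypothetical helper for illustration. `retryable` should be narrowed
    to throttling and transient errors in real use.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential growth: base_delay * 2^(attempt - 1), capped.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter spreads retries from many clients apart.
            sleep(random.uniform(0, delay))


# Usage: a stand-in for an API call that is throttled twice, then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("429 Too Many Requests (simulated)")
    return "ok"


result = retry_with_backoff(flaky_call, base_delay=0.01)
print(result, attempts["count"])
```

In practice the wrapped callable would be something like `lambda: client.chat.completions.create(...)`, with `retryable` limited to rate-limit and timeout errors. The `openai` client also accepts a `max_retries` argument, which may be enough for simple cases.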
diff --git a/skills/azure-well-architected-framework.md b/skills/azure-well-architected-framework.md new file mode 100644 index 0000000..de6ea5c --- /dev/null +++ b/skills/azure-well-architected-framework.md @@ -0,0 +1,435 @@ +--- +name: azure-well-architected-framework +description: "Comprehensive Azure Well-Architected Framework knowledge covering the five pillars: Reliability, Security, Cost Optimization, Operational Excellence, and Performance Efficiency. Provides design principles, best practices, and implementation guidance for building robust Azure solutions." +--- + +## 🚨 CRITICAL GUIDELINES + +### Windows File Path Requirements + +**MANDATORY: Always Use Backslashes on Windows for File Paths** + +When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`). + +**Examples:** +- ❌ WRONG: `D:/repos/project/file.tsx` +- ✅ CORRECT: `D:\repos\project\file.tsx` + +This applies to: +- Edit tool file_path parameter +- Write tool file_path parameter +- All file operations on Windows systems + + +### Documentation Guidelines + +**NEVER create new documentation files unless explicitly requested by the user.** + +- **Priority**: Update existing README.md files rather than creating new documentation +- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise +- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone +- **User preference**: Only create additional .md files when user specifically asks for documentation + + +--- + +# Azure Well-Architected Framework + +The Azure Well-Architected Framework is a set of guiding tenets for building high-quality cloud solutions. It consists of five pillars of architectural excellence. + +## Overview + +**Purpose**: Help architects and engineers build secure, high-performing, resilient, and efficient infrastructure for applications. + +**The Five Pillars**: +1. Reliability +2. Security +3. 
Cost Optimization +4. Operational Excellence +5. Performance Efficiency + +## Pillar 1: Reliability + +**Definition**: The ability of a system to recover from failures and continue to function. + +**Key Principles**: +- Design for failure +- Use availability zones and regions +- Implement redundancy +- Monitor and respond to failures +- Test disaster recovery + +**Best Practices**: + +**Availability Zones:** +```bash +# Deploy VM across availability zones +az vm create \ + --resource-group MyRG \ + --name MyVM \ + --zone 1 \ + --image Ubuntu2204 \ + --size Standard_D2s_v3 + +# Availability SLAs: +# - Single VM (Premium SSD): 99.9% +# - Availability Set: 99.95% +# - Availability Zones: 99.99% +``` + +**Backup and Disaster Recovery:** +```bash +# Enable Azure Backup +az backup protection enable-for-vm \ + --resource-group MyRG \ + --vault-name MyVault \ + --vm MyVM \ + --policy-name DefaultPolicy + +# Recovery Point Objective (RPO): How much data loss is acceptable +# Recovery Time Objective (RTO): How long can system be down +``` + +**Health Probes:** +- Application Gateway health probes +- Load Balancer probes +- Traffic Manager endpoint monitoring + +## Pillar 2: Security + +**Definition**: Protecting applications and data from threats. 
+ +**Key Principles**: +- Defense in depth +- Least privilege access +- Secure the network +- Protect data at rest and in transit +- Monitor and audit + +**Best Practices**: + +**Identity and Access:** +```bash +# Use managed identities (no credentials in code) +az vm identity assign \ + --resource-group MyRG \ + --name MyVM + +# RBAC assignment +az role assignment create \ + --assignee \ + --role "Contributor" \ + --scope /subscriptions//resourceGroups/MyRG +``` + +**Network Security:** +- Use Network Security Groups (NSGs) +- Implement Azure Firewall or Application Gateway WAF +- Use Private Endpoints for PaaS services +- Enable DDoS Protection Standard for public-facing apps + +**Data Protection:** +```bash +# Enable encryption at rest (automatic for most services) +# Enable TLS 1.2+ for data in transit + +# Azure Storage encryption +az storage account update \ + --name mystorageaccount \ + --resource-group MyRG \ + --min-tls-version TLS1_2 \ + --https-only true +``` + +**Security Monitoring:** +```bash +# Enable Microsoft Defender for Cloud +az security pricing create \ + --name VirtualMachines \ + --tier Standard + +# Enable Azure Sentinel +az sentinel onboard \ + --resource-group MyRG \ + --workspace-name MyWorkspace +``` + +## Pillar 3: Cost Optimization + +**Definition**: Managing costs to maximize the value delivered. + +**Key Principles**: +- Plan and estimate costs +- Provision with optimization +- Use monitoring and analytics +- Maximize efficiency of cloud spend + +**Best Practices**: + +**Right-Sizing:** +```bash +# Use Azure Advisor recommendations +az advisor recommendation list \ + --category Cost \ + --output table + +# Common optimizations: +# 1. Shutdown dev/test VMs when not in use +# 2. Use Azure Hybrid Benefit for Windows/SQL +# 3. Purchase reservations for consistent workloads +# 4. 
Use autoscaling to match demand +``` + +**Reserved Instances:** +- 1-year or 3-year commitment +- Save up to 72% vs pay-as-you-go +- Available for VMs, SQL Database, Cosmos DB, Synapse, Storage + +**Azure Hybrid Benefit:** +```bash +# Apply Windows license to VM +az vm update \ + --resource-group MyRG \ + --name MyVM \ + --license-type Windows_Server + +# SQL Server Hybrid Benefit +az sql vm create \ + --resource-group MyRG \ + --name MySQLVM \ + --license-type AHUB +``` + +**Cost Management:** +```bash +# Create budget +az consumption budget create \ + --budget-name MyBudget \ + --category cost \ + --amount 1000 \ + --time-grain monthly \ + --start-date 2025-01-01 \ + --end-date 2025-12-31 + +# Set up alerts at 80%, 100%, 120% of budget +``` + +## Pillar 4: Operational Excellence + +**Definition**: Operations processes that keep a system running in production. + +**Key Principles**: +- Automate operations +- Monitor and gain insights +- Refine operations procedures +- Anticipate failure +- Stay current with updates + +**Best Practices**: + +**Infrastructure as Code:** +```bash +# Use ARM, Bicep, or Terraform +# Version control all infrastructure +# Implement CI/CD for infrastructure + +# Example: Bicep deployment +az deployment group create \ + --resource-group MyRG \ + --template-file main.bicep \ + --parameters @parameters.json +``` + +**Monitoring and Alerting:** +```bash +# Application Insights for apps +az monitor app-insights component create \ + --app MyApp \ + --location eastus \ + --resource-group MyRG + +# Log Analytics for infrastructure +az monitor log-analytics workspace create \ + --resource-group MyRG \ + --workspace-name MyWorkspace + +# Create alerts +az monitor metrics alert create \ + --name HighCPU \ + --resource-group MyRG \ + --scopes \ + --condition "avg Percentage CPU > 80" \ + --description "CPU usage is above 80%" +``` + +**DevOps Practices:** +- Continuous Integration/Continuous Deployment (CI/CD) +- Blue-green deployments +- Canary 
releases +- Feature flags +- Automated testing + +## Pillar 5: Performance Efficiency + +**Definition**: The ability of a system to adapt to changes in load. + +**Key Principles**: +- Scale horizontally +- Choose the right resources +- Monitor performance +- Optimize network and data access + +**Best Practices**: + +**Scaling:** +```bash +# Horizontal scaling (preferred) +# VM Scale Sets +az vmss create \ + --resource-group MyRG \ + --name MyVMSS \ + --image Ubuntu2204 \ + --instance-count 3 \ + --vm-sku Standard_D2s_v3 + +# Autoscaling +az monitor autoscale create \ + --resource-group MyRG \ + --resource MyVMSS \ + --resource-type Microsoft.Compute/virtualMachineScaleSets \ + --name MyAutoscale \ + --min-count 2 \ + --max-count 10 +``` + +**Caching:** +- Azure Cache for Redis +- Azure CDN for static content +- Application-level caching + +**Data Access:** +- Use indexes on databases +- Implement caching strategies +- Use CDN for global content delivery +- Optimize queries (SQL, Cosmos DB) + +**Networking:** +```bash +# Use Azure Front Door for global apps +az afd profile create \ + --profile-name MyFrontDoor \ + --resource-group MyRG \ + --sku Premium_AzureFrontDoor + +# Features: +# - Global load balancing +# - CDN capabilities +# - Web Application Firewall +# - SSL offloading +# - Caching +``` + +## Assessment and Tools + +**Azure Well-Architected Review:** +```bash +# Self-assessment tool in Azure Portal +# Generates recommendations per pillar +# Provides actionable guidance +``` + +**Azure Advisor:** +```bash +# Get recommendations +az advisor recommendation list --output table + +# Categories: +# - Reliability (High Availability) +# - Security +# - Performance +# - Cost +# - Operational Excellence +``` + +## Implementation Checklist + +**Reliability:** +- [ ] Deploy across availability zones +- [ ] Implement backup strategy +- [ ] Define RTO and RPO +- [ ] Test disaster recovery +- [ ] Implement health monitoring + +**Security:** +- [ ] Enable Azure AD 
authentication +- [ ] Implement RBAC (least privilege) +- [ ] Encrypt data at rest and in transit +- [ ] Enable Microsoft Defender for Cloud +- [ ] Implement network segmentation (NSGs, Firewall) +- [ ] Use Key Vault for secrets + +**Cost Optimization:** +- [ ] Right-size resources +- [ ] Purchase reservations for predictable workloads +- [ ] Enable autoscaling +- [ ] Use Azure Hybrid Benefit +- [ ] Implement budget alerts +- [ ] Review Azure Advisor cost recommendations + +**Operational Excellence:** +- [ ] Implement Infrastructure as Code +- [ ] Set up CI/CD pipelines +- [ ] Enable comprehensive monitoring +- [ ] Create operational runbooks +- [ ] Implement automated alerting +- [ ] Use tags for resource organization + +**Performance Efficiency:** +- [ ] Choose appropriate resource SKUs +- [ ] Implement autoscaling +- [ ] Use caching (Redis, CDN) +- [ ] Optimize database queries +- [ ] Implement load balancing +- [ ] Monitor performance metrics + +## Common Patterns + +**Highly Available Web Application:** +- Application Gateway (WAF enabled) +- App Service (Premium tier, multiple instances) +- Azure SQL Database (Zone-redundant) +- Azure Cache for Redis +- Application Insights +- Azure Front Door (global distribution) + +**Mission-Critical Application:** +- Multi-region deployment +- Traffic Manager or Front Door (global routing) +- Availability Zones in each region +- Geo-redundant storage (GRS or RA-GRS) +- Automated backups with geo-replication +- Comprehensive monitoring and alerting + +**Cost-Optimized Dev/Test:** +- Auto-shutdown for VMs +- B-series (burstable) VMs +- Dev/Test pricing tiers +- Shared App Service plans +- Azure DevTest Labs + +## References + +- **Official Framework**: https://learn.microsoft.com/en-us/azure/well-architected/ +- **Azure Advisor**: https://portal.azure.com/#blade/Microsoft_Azure_Expert/AdvisorMenuBlade/overview +- **Well-Architected Review**: https://learn.microsoft.com/en-us/assessments/azure-architecture-review/ +- 
**Architecture Center**: https://learn.microsoft.com/en-us/azure/architecture/ + +## Key Takeaways + +1. **Balance the Pillars**: Trade-offs exist between pillars (e.g., cost vs. reliability) +2. **Continuous Improvement**: Architecture is not static, revisit regularly +3. **Measure and Monitor**: Use data to drive decisions +4. **Automation**: Automate repetitive tasks to improve reliability and reduce costs +5. **Security First**: Integrate security into every layer of architecture + +The Well-Architected Framework provides a consistent approach to evaluating architectures and implementing designs that scale over time. diff --git a/skills/container-apps-gpu-2025.md b/skills/container-apps-gpu-2025.md new file mode 100644 index 0000000..88a0986 --- /dev/null +++ b/skills/container-apps-gpu-2025.md @@ -0,0 +1,624 @@ +## 🚨 CRITICAL GUIDELINES + +### Windows File Path Requirements + +**MANDATORY: Always Use Backslashes on Windows for File Paths** + +When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`). 
+ +**Examples:** +- ❌ WRONG: `D:/repos/project/file.tsx` +- ✅ CORRECT: `D:\repos\project\file.tsx` + +This applies to: +- Edit tool file_path parameter +- Write tool file_path parameter +- All file operations on Windows systems + +### Documentation Guidelines + +**NEVER create new documentation files unless explicitly requested by the user.** + +- **Priority**: Update existing README.md files rather than creating new documentation +- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise +- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone +- **User preference**: Only create additional .md files when user specifically asks for documentation + +--- + + +# Azure Container Apps GPU Support - 2025 Features + +Complete knowledge base for Azure Container Apps with GPU support, serverless capabilities, and Dapr integration (2025 GA features). + +## Overview + +Azure Container Apps is a serverless container platform with native GPU support, Dapr integration, and scale-to-zero capabilities for cost-efficient AI/ML workloads. + +## Key 2025 Features (Build Announcements) + +### 1. Serverless GPU (GA) +- **Automatic scaling**: Scale GPU workloads based on demand +- **Scale-to-zero**: Pay only when GPU is actively used +- **Per-second billing**: Granular cost control +- **Optimized cold start**: Fast initialization for AI models +- **Reduced operational overhead**: No infrastructure management + +### 2. Dedicated GPU (GA) +- **Consistent performance**: Dedicated GPU resources +- **Simplified AI deployment**: Easy model hosting +- **Long-running workloads**: Ideal for training and continuous inference +- **Multiple GPU types**: NVIDIA A100, T4, and more + +### 3. 
Dynamic Sessions with GPU (Early Access) +- **Sandboxed execution**: Run untrusted AI-generated code +- **Hyper-V isolation**: Enhanced security +- **GPU-powered Python interpreter**: Handle compute-intensive AI workloads +- **Scale at runtime**: Dynamic resource allocation + +### 4. Foundry Models Integration +- **Deploy AI models directly**: During container app creation +- **Ready-to-use models**: Pre-configured inference endpoints +- **Azure AI Foundry**: Seamless integration + +### 5. Workflow with Durable Task Scheduler (Preview) +- **Long-running workflows**: Reliable orchestration +- **State management**: Automatic persistence +- **Event-driven**: Trigger workflows from events + +### 6. Native Azure Functions Support +- **Functions runtime**: Run Azure Functions in Container Apps +- **Consistent development**: Same code, serverless execution +- **Event triggers**: All Functions triggers supported + +### 7. Dapr Integration (GA) +- **Service discovery**: Built-in DNS-based discovery +- **State management**: Distributed state stores +- **Pub/sub messaging**: Reliable messaging patterns +- **Service invocation**: Resilient service-to-service calls +- **Observability**: Integrated tracing and metrics + +## Creating Container Apps with GPU + +### Basic Container App with Serverless GPU + +```bash +# Create Container Apps environment +az containerapp env create \ + --name myenv \ + --resource-group MyRG \ + --location eastus \ + --logs-workspace-id \ + --logs-workspace-key + +# Create Container App with GPU +az containerapp create \ + --name myapp-gpu \ + --resource-group MyRG \ + --environment myenv \ + --image myregistry.azurecr.io/ai-model:latest \ + --cpu 4 \ + --memory 8Gi \ + --gpu-type nvidia-a100 \ + --gpu-count 1 \ + --min-replicas 0 \ + --max-replicas 10 \ + --ingress external \ + --target-port 8080 +``` + +### Production-Ready Container App with GPU + +```bash +az containerapp create \ + --name myapp-gpu-prod \ + --resource-group MyRG \ + --environment 
myenv \ + --image myregistry.azurecr.io/ai-model:latest \ + --registry-server myregistry.azurecr.io \ + --registry-identity system \ + --cpu 4 \ + --memory 8Gi \ + --gpu-type nvidia-a100 \ + --gpu-count 1 \ + --min-replicas 0 \ + --max-replicas 20 \ + --scale-rule-name http-scaling \ + --scale-rule-type http \ + --scale-rule-http-concurrency 10 \ + --ingress external \ + --target-port 8080 \ + --transport http2 \ + --env-vars "AZURE_CLIENT_ID=secretref:client-id" \ + --enable-dapr \ + --dapr-app-id myapp \ + --dapr-app-port 8080 \ + --dapr-app-protocol http \ + --system-assigned + +# Flags above, in order: container image and registry, resources, scaling, +# networking, secret-backed environment variables, Dapr sidecar, identity. +# A comment between backslash-continued lines ends the command early, so the labels are grouped here. +# --exposed-port is omitted because it applies only to the tcp transport. +``` + +## Container Apps Environment Configuration + +### Environment with Zone Redundancy + +```bash +az containerapp env create \ + --name myenv-prod \ + --resource-group MyRG \ + --location eastus \ + --logs-workspace-id <log-analytics-workspace-id> \ + --logs-workspace-key <log-analytics-workspace-key> \ + --zone-redundant true \ + --enable-workload-profiles true +``` + +### Workload Profiles (Dedicated GPU) + +```bash +# Create environment with workload profiles +az containerapp env create \ + --name myenv-gpu \ + --resource-group MyRG \ + --location eastus \ + --enable-workload-profiles true + +# Add GPU workload profile +az containerapp env workload-profile add \ + --name myenv-gpu \ + --resource-group MyRG \ + --workload-profile-name gpu-profile \ + --workload-profile-type GPU-A100 \ + --min-nodes 0 \ + --max-nodes 10 + +# Create container app with GPU profile +az containerapp create \ + --name myapp-dedicated-gpu \ + --resource-group MyRG \ + --environment myenv-gpu \ + --workload-profile-name gpu-profile \ + --image myregistry.azurecr.io/training-job:latest \ + --cpu 8 \ + --memory 16Gi \ + --min-replicas 1 \ + --max-replicas 5 +``` + +## GPU Scaling Rules + +### Custom Prometheus Scaling + +```bash +az containerapp create \ + --name 
myapp-gpu-prometheus \ + --resource-group MyRG \ + --environment myenv \ + --image myregistry.azurecr.io/ai-model:latest \ + --cpu 4 \ + --memory 8Gi \ + --gpu-type nvidia-a100 \ + --gpu-count 1 \ + --min-replicas 0 \ + --max-replicas 10 \ + --scale-rule-name gpu-utilization \ + --scale-rule-type custom \ + --scale-rule-custom-type prometheus \ + --scale-rule-metadata \ + serverAddress=http://prometheus.monitoring.svc.cluster.local:9090 \ + metricName=gpu_utilization \ + threshold=80 \ + query="avg(nvidia_gpu_utilization{app='myapp'})" +``` + +### Queue-Based Scaling (Azure Service Bus) + +```bash +az containerapp create \ + --name myapp-queue-processor \ + --resource-group MyRG \ + --environment myenv \ + --image myregistry.azurecr.io/batch-processor:latest \ + --cpu 4 \ + --memory 8Gi \ + --gpu-type nvidia-t4 \ + --gpu-count 1 \ + --min-replicas 0 \ + --max-replicas 50 \ + --scale-rule-name queue-scaling \ + --scale-rule-type azure-servicebus \ + --scale-rule-metadata \ + queueName=ai-jobs \ + namespace=myservicebus \ + messageCount=5 \ + --scale-rule-auth connection=servicebus-connection +``` + +## Dapr Integration + +### Enable Dapr on Container App + +```bash +az containerapp create \ + --name myapp-dapr \ + --resource-group MyRG \ + --environment myenv \ + --image myregistry.azurecr.io/myapp:latest \ + --enable-dapr \ + --dapr-app-id myapp \ + --dapr-app-port 8080 \ + --dapr-app-protocol http \ + --dapr-http-max-request-size 4 \ + --dapr-http-read-buffer-size 4 \ + --dapr-log-level info \ + --dapr-enable-api-logging true +``` + +### Dapr State Store (Azure Cosmos DB) + +```yaml +# Create Dapr component for state store +apiVersion: dapr.io/v1alpha1 +kind: Component +metadata: + name: statestore +spec: + type: state.azure.cosmosdb + version: v1 + metadata: + - name: url + value: "https://mycosmosdb.documents.azure.com:443/" + - name: masterKey + secretRef: cosmosdb-key + - name: database + value: "mydb" + - name: collection + value: "state" +``` + +```bash +# 
Create the component +az containerapp env dapr-component set \ + --name myenv \ + --resource-group MyRG \ + --dapr-component-name statestore \ + --yaml component.yaml +``` + +### Dapr Pub/Sub (Azure Service Bus) + +```yaml +apiVersion: dapr.io/v1alpha1 +kind: Component +metadata: + name: pubsub +spec: + type: pubsub.azure.servicebus.topics + version: v1 + metadata: + - name: connectionString + secretRef: servicebus-connection + - name: consumerID + value: "myapp" +``` + +### Service-to-Service Invocation + +```python +# Python example using Dapr SDK +from dapr.clients import DaprClient + +with DaprClient() as client: + # Invoke another service + response = client.invoke_method( + app_id='other-service', + method_name='process', + data='{"input": "data"}' + ) + + # Save state + client.save_state( + store_name='statestore', + key='mykey', + value='myvalue' + ) + + # Publish message + client.publish_event( + pubsub_name='pubsub', + topic_name='orders', + data='{"orderId": "123"}' + ) +``` + +## AI Model Deployment Patterns + +### OpenAI-Compatible Endpoint + +```dockerfile +# Dockerfile for vLLM model serving +FROM vllm/vllm-openai:latest + +ENV MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct" +ENV GPU_MEMORY_UTILIZATION=0.9 +ENV MAX_MODEL_LEN=4096 + +CMD ["--model", "${MODEL_NAME}", \ + "--gpu-memory-utilization", "${GPU_MEMORY_UTILIZATION}", \ + "--max-model-len", "${MAX_MODEL_LEN}", \ + "--port", "8080"] +``` + +```bash +# Deploy vLLM model +az containerapp create \ + --name llama-inference \ + --resource-group MyRG \ + --environment myenv \ + --image vllm/vllm-openai:latest \ + --cpu 8 \ + --memory 32Gi \ + --gpu-type nvidia-a100 \ + --gpu-count 1 \ + --min-replicas 1 \ + --max-replicas 5 \ + --target-port 8080 \ + --ingress external \ + --env-vars \ + MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct" \ + GPU_MEMORY_UTILIZATION="0.9" \ + HF_TOKEN=secretref:huggingface-token +``` + +### Stable Diffusion Image Generation + +```bash +az containerapp create \ + --name 
stable-diffusion \ + --resource-group MyRG \ + --environment myenv \ + --image myregistry.azurecr.io/stable-diffusion:latest \ + --cpu 4 \ + --memory 16Gi \ + --gpu-type nvidia-a100 \ + --gpu-count 1 \ + --min-replicas 0 \ + --max-replicas 10 \ + --target-port 7860 \ + --ingress external \ + --scale-rule-name http-scaling \ + --scale-rule-type http \ + --scale-rule-http-concurrency 1 +``` + +### Batch Processing Job + +```bash +az containerapp job create \ + --name batch-training-job \ + --resource-group MyRG \ + --environment myenv \ + --trigger-type Manual \ + --image myregistry.azurecr.io/training:latest \ + --cpu 8 \ + --memory 32Gi \ + --gpu-type nvidia-a100 \ + --gpu-count 2 \ + --parallelism 1 \ + --replica-timeout 7200 \ + --replica-retry-limit 3 \ + --env-vars \ + DATASET_URL="https://mystorage.blob.core.windows.net/datasets/train.csv" \ + MODEL_OUTPUT="https://mystorage.blob.core.windows.net/models/" \ + EPOCHS="100" + +# Execute job +az containerapp job start \ + --name batch-training-job \ + --resource-group MyRG +``` + +## Monitoring and Observability + +### Application Insights Integration + +```bash +az containerapp create \ + --name myapp-monitored \ + --resource-group MyRG \ + --environment myenv \ + --image myregistry.azurecr.io/myapp:latest \ + --env-vars \ + APPLICATIONINSIGHTS_CONNECTION_STRING=secretref:appinsights-connection +``` + +### Query Logs + +```bash +# Stream logs +az containerapp logs show \ + --name myapp-gpu \ + --resource-group MyRG \ + --follow + +# Query with Log Analytics +az monitor log-analytics query \ + --workspace \ + --analytics-query "ContainerAppConsoleLogs_CL | where ContainerAppName_s == 'myapp-gpu' | take 100" +``` + +### Metrics and Alerts + +```bash +# Create metric alert for GPU usage +az monitor metrics alert create \ + --name high-gpu-usage \ + --resource-group MyRG \ + --scopes $(az containerapp show -g MyRG -n myapp-gpu --query id -o tsv) \ + --condition "avg Requests > 100" \ + --window-size 5m \ + 
--evaluation-frequency 1m \ + --action +``` + +## Security Best Practices + +### Managed Identity + +```bash +# Create with system-assigned identity +az containerapp create \ + --name myapp-identity \ + --resource-group MyRG \ + --environment myenv \ + --system-assigned \ + --image myregistry.azurecr.io/myapp:latest + +# Get identity principal ID +IDENTITY_ID=$(az containerapp show -g MyRG -n myapp-identity --query identity.principalId -o tsv) + +# Assign role to access Key Vault +az role assignment create \ + --assignee $IDENTITY_ID \ + --role "Key Vault Secrets User" \ + --scope /subscriptions//resourceGroups/MyRG/providers/Microsoft.KeyVault/vaults/mykeyvault + +# Use user-assigned identity +az identity create --name myapp-identity --resource-group MyRG +IDENTITY_RESOURCE_ID=$(az identity show -g MyRG -n myapp-identity --query id -o tsv) + +az containerapp create \ + --name myapp-user-identity \ + --resource-group MyRG \ + --environment myenv \ + --user-assigned $IDENTITY_RESOURCE_ID \ + --image myregistry.azurecr.io/myapp:latest +``` + +### Secret Management + +```bash +# Add secrets +az containerapp secret set \ + --name myapp-gpu \ + --resource-group MyRG \ + --secrets \ + huggingface-token="" \ + api-key="" + +# Reference secrets in environment variables +az containerapp update \ + --name myapp-gpu \ + --resource-group MyRG \ + --set-env-vars \ + HF_TOKEN=secretref:huggingface-token \ + API_KEY=secretref:api-key +``` + +## Cost Optimization + +### Scale-to-Zero Configuration + +```bash +az containerapp create \ + --name myapp-scale-zero \ + --resource-group MyRG \ + --environment myenv \ + --image myregistry.azurecr.io/myapp:latest \ + --min-replicas 0 \ + --max-replicas 10 \ + --scale-rule-name http-scaling \ + --scale-rule-type http \ + --scale-rule-http-concurrency 10 +``` + +**Cost savings**: Pay only when requests are being processed. GPU costs are per-second when active. 
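The effect of per-second billing with scale-to-zero can be sketched with simple arithmetic. The snippet below is only an illustration: the rate constant is a hypothetical placeholder (not an Azure price), the function name `gpu_cost` is invented for this example, and the model ignores cold-start time, idle cool-down periods, and non-GPU charges.

```python
"""Rough comparison of always-on vs. scale-to-zero GPU cost under per-second billing."""

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# Hypothetical placeholder, NOT an Azure price.
# Substitute the published rate for your region and GPU type.
HYPOTHETICAL_RATE_PER_GPU_SECOND = 0.001


def gpu_cost(active_seconds: int, rate_per_second: float, replicas: int = 1) -> float:
    """Return the cost of GPU time billed per second while replicas are active."""
    if active_seconds < 0 or rate_per_second < 0 or replicas < 0:
        raise ValueError("inputs must be non-negative")
    return active_seconds * rate_per_second * replicas


# One replica kept running all day (min-replicas 1).
always_on = gpu_cost(SECONDS_PER_DAY, HYPOTHETICAL_RATE_PER_GPU_SECOND)

# One replica that is active for two hours per day and scaled to zero otherwise.
scale_to_zero = gpu_cost(2 * SECONDS_PER_HOUR, HYPOTHETICAL_RATE_PER_GPU_SECOND)

# Fraction of the always-on cost avoided by scaling to zero.
savings_fraction = 1 - scale_to_zero / always_on

print(f"always on:     {always_on:.2f} per day")
print(f"scale to zero: {scale_to_zero:.2f} per day")
print(f"savings:       {savings_fraction:.1%}")
```

Under these assumptions the savings fraction depends only on the share of the day the replica is active, not on the rate itself, which is why intermittent workloads benefit most from `--min-replicas 0`.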
+ +### Right-Sizing Resources + +```bash +# Start with minimal resources, for example: +# --cpu 2 --memory 4Gi --gpu-count 1 + +# Monitor and adjust based on actual usage +az monitor metrics list \ + --resource $(az containerapp show -g MyRG -n myapp-gpu --query id -o tsv) \ + --metric "CpuPercentage,MemoryPercentage" +``` + +### Use Spot/Preemptible GPUs (Future Feature) + +When available, configure spot instances for non-critical, interruptible workloads. Savings of up to 80% on GPU costs are an estimate, not published pricing. + +## Troubleshooting + +### Check Revision Status + +```bash +az containerapp revision list \ + --name myapp-gpu \ + --resource-group MyRG \ + --output table +``` + +### View Revision Details + +```bash +az containerapp revision show \ + --name myapp-gpu \ + --resource-group MyRG \ + --revision <revision-name> +``` + +### Restart Container App + +```bash +# Restart a specific revision (take the name from the revision list above) +az containerapp revision restart \ + --name myapp-gpu \ + --resource-group MyRG \ + --revision <revision-name> +``` + +### GPU Not Available + +If a GPU is not provisioning: +1. Check region availability: Not all regions support GPU +2. Verify quota: Request quota increase if needed +3. 
Check workload profile: Ensure GPU workload profile is created + +## Best Practices + +✓ Use scale-to-zero for intermittent workloads +✓ Implement health probes (liveness and readiness) +✓ Use managed identities for authentication +✓ Store secrets in Azure Key Vault +✓ Enable Dapr for microservices patterns +✓ Configure appropriate scaling rules +✓ Monitor GPU utilization and adjust resources +✓ Use Container Apps jobs for batch processing +✓ Implement retry logic for transient failures +✓ Use Application Insights for observability + +## References + +- [Container Apps GPU Documentation](https://learn.microsoft.com/en-us/azure/container-apps/gpu-support) +- [Dapr Integration](https://learn.microsoft.com/en-us/azure/container-apps/dapr-overview) +- [Scaling Rules](https://learn.microsoft.com/en-us/azure/container-apps/scale-app) +- [Build 2025 Announcements](https://azure.microsoft.com/en-us/blog/container-apps-build-2025/) + +Azure Container Apps with GPU support provides the ultimate serverless platform for AI/ML workloads! diff --git a/skills/deployment-stacks-2025.md b/skills/deployment-stacks-2025.md new file mode 100644 index 0000000..037f425 --- /dev/null +++ b/skills/deployment-stacks-2025.md @@ -0,0 +1,796 @@ +## 🚨 CRITICAL GUIDELINES + +### Windows File Path Requirements + +**MANDATORY: Always Use Backslashes on Windows for File Paths** + +When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`). 
+ +**Examples:** +- ❌ WRONG: `D:/repos/project/file.tsx` +- ✅ CORRECT: `D:\repos\project\file.tsx` + +This applies to: +- Edit tool file_path parameter +- Write tool file_path parameter +- All file operations on Windows systems + +### Documentation Guidelines + +**NEVER create new documentation files unless explicitly requested by the user.** + +- **Priority**: Update existing README.md files rather than creating new documentation +- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise +- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone +- **User preference**: Only create additional .md files when user specifically asks for documentation + +--- + + +# Azure Deployment Stacks - 2025 GA Features + +Complete knowledge base for Azure Deployment Stacks, the successor to Azure Blueprints (GA 2024, best practices 2025). + +## Overview + +Azure Deployment Stacks is a resource type for managing a collection of Azure resources as a single, atomic unit. It provides unified lifecycle management, resource protection, and automatic cleanup capabilities. + +## Key Features + +### 1. Unified Resource Management +- Manage multiple resources as a single entity +- Update, export, and delete operations on the entire stack +- Track all managed resources in one place +- Consistent deployment across environments + +### 2. Deny Settings (Resource Protection) +Prevent unauthorized modifications to managed resources: +- **None**: No restrictions (default) +- **DenyDelete**: Prevent resource deletion +- **DenyWriteAndDelete**: Prevent updates and deletions + +### 3. ActionOnUnmanage (Cleanup Policies) +Control what happens to resources no longer in template: +- **detachAll**: Remove from stack management, keep resources +- **deleteAll**: Delete resources not in template +- **deleteResources**: Delete unmanaged resources, keep resource groups + +### 4. 
Scope Flexibility +Deploy stacks at: +- Resource group scope +- Subscription scope +- Management group scope + +### 5. Replaces Azure Blueprints +Azure Blueprints will be deprecated in **July 2026**. Deployment Stacks is the recommended replacement. + +## Prerequisites + +### Azure CLI Version +```bash +# Requires Azure CLI 2.61.0 or later +az version + +# Upgrade if needed +az upgrade +``` + +### Azure PowerShell Version +```powershell +# Requires Azure PowerShell 12.0.0 or later +Get-InstalledModule -Name Az +Update-Module -Name Az +``` + +## Creating Deployment Stacks + +### Subscription Scope Stack + +```bash +# Create deployment stack at subscription level +az stack sub create \ + --name MyProductionStack \ + --location eastus \ + --template-file main.bicep \ + --parameters @parameters.json \ + --deny-settings-mode DenyWriteAndDelete \ + --deny-settings-excluded-principals <principal-id> \ + --action-on-unmanage deleteAll \ + --description "Production infrastructure managed by deployment stack" \ + --tags Environment=Production ManagedBy=DeploymentStack CostCenter=Engineering + +# What-if analysis (run this before creating or updating the stack) +az stack sub what-if \ + --name MyProductionStack \ + --location eastus \ + --template-file main.bicep \ + --parameters @parameters.json + +# Create with confirmation prompt disabled +az stack sub create \ + --name MyDevStack \ + --location eastus \ + --template-file main.bicep \ + --deny-settings-mode None \ + --action-on-unmanage detachAll \ + --yes +``` + +### Resource Group Scope Stack + +```bash +# Create resource group +az group create \ + --name MyRG \ + --location eastus \ + --tags Environment=Production + +# Create deployment stack +az stack group create \ + --name MyAppStack \ + --resource-group MyRG \ + --template-file main.bicep \ + --parameters environment=production \ + --deny-settings-mode DenyDelete \ + --action-on-unmanage deleteAll \ + --description "Application infrastructure stack" +``` + +### Management Group Scope Stack + +```bash +# Create stack 
at management group level +az stack mg create \ + --name MyEnterpriseStack \ + --management-group-id MyMgmtGroup \ + --location eastus \ + --template-file main.bicep \ + --deny-settings-mode DenyWriteAndDelete \ + --action-on-unmanage detachAll +``` + +## Bicep Template for Deployment Stack + +### Production Stack Template + +```bicep +// main.bicep +targetScope = 'subscription' + +@description('Environment name') +@allowed([ + 'dev' + 'staging' + 'production' +]) +param environment string = 'production' + +@description('Primary location') +param location string = 'eastus' + +@description('Secondary location for geo-replication') +param secondaryLocation string = 'westus' + +// Resource naming +var namingPrefix = 'myapp-${environment}' + +// Resource Group for core infrastructure +resource coreRG 'Microsoft.Resources/resourceGroups@2024-03-01' = { + name: '${namingPrefix}-core-rg' + location: location + tags: { + Environment: environment + ManagedBy: 'DeploymentStack' + Purpose: 'Core Infrastructure' + } +} + +// Resource Group for data services +resource dataRG 'Microsoft.Resources/resourceGroups@2024-03-01' = { + name: '${namingPrefix}-data-rg' + location: location + tags: { + Environment: environment + ManagedBy: 'DeploymentStack' + Purpose: 'Data Services' + } +} + +// Log Analytics Workspace +module logAnalytics 'modules/log-analytics.bicep' = { + name: 'logAnalyticsDeploy' + scope: coreRG + params: { + name: '${namingPrefix}-logs' + location: location + retentionInDays: environment == 'production' ? 
90 : 30 + } +} + +// AKS Automatic Cluster +module aksCluster 'modules/aks-automatic.bicep' = { + name: 'aksClusterDeploy' + scope: coreRG + params: { + name: '${namingPrefix}-aks' + location: location + kubernetesVersion: '1.34' + workspaceId: logAnalytics.outputs.workspaceId + enableZoneRedundancy: environment == 'production' + } +} + +// Container Apps Environment +module containerEnv 'modules/container-env.bicep' = { + name: 'containerEnvDeploy' + scope: coreRG + params: { + name: '${namingPrefix}-containerenv' + location: location + workspaceId: logAnalytics.outputs.workspaceId + zoneRedundant: environment == 'production' + } +} + +// Azure OpenAI +module openAI 'modules/openai.bicep' = { + name: 'openAIDeploy' + scope: dataRG + params: { + name: '${namingPrefix}-openai' + location: location + deployGPT5: environment == 'production' + } +} + +// Cosmos DB with geo-replication +module cosmosDB 'modules/cosmos-db.bicep' = { + name: 'cosmosDBDeploy' + scope: dataRG + params: { + name: '${namingPrefix}-cosmos' + primaryLocation: location + secondaryLocation: secondaryLocation + enableAutomaticFailover: environment == 'production' + } +} + +// Key Vault +module keyVault 'modules/key-vault.bicep' = { + name: 'keyVaultDeploy' + scope: coreRG + params: { + name: '${namingPrefix}-kv' + location: location + enablePurgeProtection: environment == 'production' + } +} + +// Outputs +output aksClusterName string = aksCluster.outputs.clusterName +output containerEnvId string = containerEnv.outputs.environmentId +output openAIEndpoint string = openAI.outputs.endpoint +output cosmosDBEndpoint string = cosmosDB.outputs.endpoint +output keyVaultUri string = keyVault.outputs.vaultUri +``` + +### AKS Automatic Module + +```bicep +// modules/aks-automatic.bicep +@description('Cluster name') +param name string + +@description('Location') +param location string + +@description('Kubernetes version') +param kubernetesVersion string = '1.34' + +@description('Log Analytics workspace ID') 
+param workspaceId string + +@description('Enable zone redundancy') +param enableZoneRedundancy bool = true + +resource aksCluster 'Microsoft.ContainerService/managedClusters@2025-01-01' = { + name: name + location: location + sku: { + name: 'Automatic' + tier: 'Standard' + } + identity: { + type: 'SystemAssigned' + } + properties: { + kubernetesVersion: kubernetesVersion + dnsPrefix: '${name}-dns' + enableRBAC: true + aadProfile: { + managed: true + enableAzureRBAC: true + } + networkProfile: { + networkPlugin: 'azure' + networkPluginMode: 'overlay' + networkDataplane: 'cilium' + serviceCidr: '10.0.0.0/16' + dnsServiceIP: '10.0.0.10' + } + autoScalerProfile: { + 'balance-similar-node-groups': 'true' + expander: 'least-waste' + } + autoUpgradeProfile: { + upgradeChannel: 'stable' + nodeOSUpgradeChannel: 'NodeImage' + } + securityProfile: { + defender: { + securityMonitoring: { + enabled: true + } + } + workloadIdentity: { + enabled: true + } + } + oidcIssuerProfile: { + enabled: true + } + addonProfiles: { + omsagent: { + enabled: true + config: { + logAnalyticsWorkspaceResourceID: workspaceId + } + } + azurePolicy: { + enabled: true + } + } + } + zones: enableZoneRedundancy ? 
['1', '2', '3'] : null +} + +output clusterName string = aksCluster.name +output clusterId string = aksCluster.id +output oidcIssuerUrl string = aksCluster.properties.oidcIssuerProfile.issuerUrl +output kubeletIdentity string = aksCluster.properties.identityProfile.kubeletidentity.objectId +``` + +## Managing Deployment Stacks + +### Update Stack + +```bash +# Update with new template version +az stack sub update \ + --name MyProductionStack \ + --template-file main.bicep \ + --parameters @parameters.json \ + --action-on-unmanage deleteAll + +# Update deny settings +az stack sub update \ + --name MyProductionStack \ + --deny-settings-mode DenyWriteAndDelete \ + --deny-settings-excluded-principals +``` + +### View Stack Details + +```bash +# Show stack information +az stack sub show \ + --name MyProductionStack \ + --output json + +# List all stacks in subscription +az stack sub list --output table + +# List stacks in resource group +az stack group list \ + --resource-group MyRG \ + --output table +``` + +### Export Stack Template + +```bash +# Export template from deployed stack +az stack sub export \ + --name MyProductionStack \ + --output-file exported-stack.json + +# Export and save parameters +az stack sub show \ + --name MyProductionStack \ + --query "parameters" \ + --output json > parameters-backup.json +``` + +### Delete Stack + +```bash +# Delete stack and all managed resources +az stack sub delete \ + --name MyProductionStack \ + --action-on-unmanage deleteAll \ + --yes + +# Delete stack but keep resources +az stack sub delete \ + --name MyProductionStack \ + --action-on-unmanage detachAll \ + --yes + +# Delete with confirmation prompt +az stack sub delete --name MyProductionStack +``` + +## Deny Settings in Detail + +### DenyDelete Mode + +Prevents deletion but allows updates: + +```bash +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --deny-settings-mode DenyDelete \ + --deny-settings-excluded-principals 
\ + \ + +``` + +**Use cases:** +- Protect production databases +- Prevent accidental resource deletion +- Allow configuration updates + +### DenyWriteAndDelete Mode + +Prevents both updates and deletions: + +```bash +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --deny-settings-mode DenyWriteAndDelete \ + --deny-settings-excluded-principals +``` + +**Use cases:** +- Immutable infrastructure +- Compliance requirements +- Critical production workloads + +### Excluded Principals + +Bypass deny settings for specific identities: + +```bash +# Get principal IDs +SERVICE_PRINCIPAL_ID=$(az ad sp show --id --query id -o tsv) +ADMIN_GROUP_ID=$(az ad group show --group "Cloud Admins" --query id -o tsv) + +# Apply with exclusions +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --deny-settings-mode DenyWriteAndDelete \ + --deny-settings-excluded-principals $SERVICE_PRINCIPAL_ID $ADMIN_GROUP_ID +``` + +## ActionOnUnmanage Policies + +### detachAll + +Resources are removed from stack management but not deleted: + +```bash +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --action-on-unmanage detachAll +``` + +**Use when:** +- Testing deployment changes +- Migrating resources to another stack +- Temporary stack management + +### deleteAll + +All unmanaged resources are deleted: + +```bash +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --action-on-unmanage deleteAll +``` + +**Use when:** +- Ephemeral environments (dev, test) +- Clean slate deployments +- Strict infrastructure-as-code enforcement + +### deleteResources + +Delete resources but keep resource groups: + +```bash +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --action-on-unmanage deleteResources +``` + +## RBAC for Deployment Stacks + +### Built-in Roles + +**Azure Deployment 
Stack Contributor** +- Manage deployment stacks +- Cannot create or delete deny-assignments + +**Azure Deployment Stack Owner** +- Full stack management +- Can create and delete deny-assignments + +### Assign Roles + +```bash +# Assign Stack Contributor role +az role assignment create \ + --assignee \ + --role "Azure Deployment Stack Contributor" \ + --scope /subscriptions/ + +# Assign Stack Owner role +az role assignment create \ + --assignee \ + --role "Azure Deployment Stack Owner" \ + --scope /subscriptions/ +``` + +## CI/CD Integration + +### GitHub Actions + +```yaml +name: Deploy Deployment Stack + +on: + push: + branches: [main] + workflow_dispatch: + +permissions: + id-token: write + contents: read + +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + + - name: Azure Login + uses: azure/login@v2 + with: + client-id: ${{ secrets.AZURE_CLIENT_ID }} + tenant-id: ${{ secrets.AZURE_TENANT_ID }} + subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }} + + - name: What-if Analysis + run: | + az stack sub what-if \ + --name MyProductionStack \ + --location eastus \ + --template-file main.bicep \ + --parameters @parameters.json + + - name: Deploy Stack + run: | + az stack sub create \ + --name MyProductionStack \ + --location eastus \ + --template-file main.bicep \ + --parameters @parameters.json \ + --deny-settings-mode DenyWriteAndDelete \ + --deny-settings-excluded-principals ${{ secrets.DEVOPS_PRINCIPAL_ID }} \ + --action-on-unmanage deleteAll \ + --yes +``` + +### Azure DevOps Pipeline + +```yaml +trigger: + branches: + include: + - main + +pool: + vmImage: 'ubuntu-latest' + +variables: + azureSubscription: 'MyAzureConnection' + stackName: 'MyProductionStack' + location: 'eastus' + +steps: + - task: AzureCLI@2 + displayName: 'What-if Analysis' + inputs: + azureSubscription: $(azureSubscription) + scriptType: 'bash' + scriptLocation: 'inlineScript' + inlineScript: | + az stack sub what-if \ + --name $(stackName) \ + --location 
$(location) \ + --template-file main.bicep \ + --parameters @parameters.json + + - task: AzureCLI@2 + displayName: 'Deploy Stack' + inputs: + azureSubscription: $(azureSubscription) + scriptType: 'bash' + scriptLocation: 'inlineScript' + inlineScript: | + az stack sub create \ + --name $(stackName) \ + --location $(location) \ + --template-file main.bicep \ + --parameters @parameters.json \ + --deny-settings-mode DenyWriteAndDelete \ + --action-on-unmanage deleteAll \ + --yes +``` + +## Monitoring and Auditing + +### View Stack Events + +```bash +# Get deployment operations +az stack sub show \ + --name MyProductionStack \ + --query "deploymentId" \ + --output tsv | \ + xargs -I {} az deployment sub show --name {} + +# List managed resources +az stack sub show \ + --name MyProductionStack \ + --query "resources[].id" \ + --output table +``` + +### Activity Logs + +```bash +# Query stack operations +az monitor activity-log list \ + --resource-group MyRG \ + --namespace Microsoft.Resources \ + --start-time 2025-01-01T00:00:00Z \ + --query "[?contains(authorization.action, 'Microsoft.Resources/deploymentStacks')]" \ + --output table +``` + +## Migration from Azure Blueprints + +### Assessment + +1. **Inventory Blueprints**: List all blueprints and assignments +2. **Document Parameters**: Export parameters and configurations +3. **Plan Conversion**: Map blueprints to deployment stacks +4. **Test in Dev**: Validate converted templates + +### Conversion Steps + +```bash +# 1. Export Blueprint as ARM template +# (Use Azure Portal or PowerShell) + +# 2. Convert ARM to Bicep +az bicep decompile --file blueprint-template.json + +# 3. Create Deployment Stack +az stack sub create \ + --name ConvertedFromBlueprint \ + --location eastus \ + --template-file converted.bicep \ + --parameters @blueprint-parameters.json \ + --deny-settings-mode DenyWriteAndDelete \ + --action-on-unmanage detachAll + +# 4. Validate resources +az stack sub show --name ConvertedFromBlueprint + +# 5. 
Delete Blueprint assignment (after validation) +# Remove-AzBlueprintAssignment -Name MyBlueprintAssignment +``` + +## Best Practices + +✓ **Use Deployment Stacks for all new infrastructure** +✓ **Always run what-if analysis before deployment** +✓ **Use DenyWriteAndDelete for production stacks** +✓ **Exclude break-glass principals from deny settings** +✓ **Tag stacks with Environment, CostCenter, Owner** +✓ **Use deleteAll for ephemeral environments** +✓ **Use detachAll for migration scenarios** +✓ **Implement CI/CD pipelines for stack deployment** +✓ **Monitor stack operations via activity logs** +✓ **Document stack architecture and dependencies** + +## Troubleshooting + +### Stack Creation Fails + +```bash +# Check deployment errors +az stack sub show \ + --name MyStack \ + --query "error" \ + --output json + +# Validate template +az deployment sub validate \ + --location eastus \ + --template-file main.bicep \ + --parameters @parameters.json +``` + +### Deny Settings Blocking Operations + +```bash +# Check deny assignments (there is no dedicated az command; az rest replaces {subscriptionId} with the current subscription) +az rest \ + --method get \ + --url "/subscriptions/{subscriptionId}/providers/Microsoft.Authorization/denyAssignments?api-version=2022-04-01" + +# Add principal to exclusions by re-running create (az stack sub has no update command); +# keep the stack's existing template, deny mode, and unmanage action +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --deny-settings-mode DenyWriteAndDelete \ + --action-on-unmanage detachAll \ + --deny-settings-excluded-principals +``` + +### Resources Not Deleted + +```bash +# Check action-on-unmanage setting +az stack sub show \ + --name MyStack \ + --query "actionOnUnmanage" \ + --output tsv + +# Switch to deleteAll by re-running create with the stack's existing template and deny settings +az stack sub create \ + --name MyStack \ + --location eastus \ + --template-file main.bicep \ + --deny-settings-mode DenyWriteAndDelete \ + --action-on-unmanage deleteAll +``` + +## References + +- [Deployment Stacks Documentation](https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/deployment-stacks) +- [Deployment Stacks Quickstart](https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/quickstart-create-deployment-stacks) +- [Migrate from Blueprints](https://learn.microsoft.com/en-us/azure/governance/blueprints/how-to/migrate-to-deployment-stacks) + +Deployment
Stacks represents the future of Azure infrastructure lifecycle management!
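
The deny-settings and action-on-unmanage guidance in this document can be collected into one small helper, sketched below. The function names `stack_flags` and `stack_command` and the environment labels `prod`, `dev`, `test`, and `migration` are hypothetical, and the mapping is only one reading of the recommendations here (production follows the CI/CD examples, migration follows the Blueprint conversion step). The helper only prints the command for review and never calls `az`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps an environment label to the stack protection flags
# recommended in this document. Labels and mapping are assumptions.
stack_flags() {
  local env="$1"
  case "$env" in
    prod)
      # Production: block writes and deletes, remove resources dropped from the template
      echo "--deny-settings-mode DenyWriteAndDelete --action-on-unmanage deleteAll"
      ;;
    dev|test)
      # Ephemeral environments: no deny settings, clean up everything unmanaged
      echo "--deny-settings-mode none --action-on-unmanage deleteAll"
      ;;
    migration)
      # Migration scenarios: protect resources, detach instead of deleting
      echo "--deny-settings-mode DenyWriteAndDelete --action-on-unmanage detachAll"
      ;;
    *)
      echo "unknown environment: $env" >&2
      return 1
      ;;
  esac
}

# Dry run: prints the az command instead of executing it.
# The location and template file are the placeholder values used in this document.
stack_command() {
  local name="$1" env="$2" flags
  flags="$(stack_flags "$env")" || return 1
  echo "az stack sub create --name $name --location eastus --template-file main.bicep $flags --yes"
}
```

For example, `stack_command MyStack dev` prints a command ending in `--deny-settings-mode none --action-on-unmanage deleteAll --yes`, which can be reviewed and then run by hand or from a pipeline step.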