Initial commit

Commit b40af6b4cc by Zhongwei Li, 2025-11-30 08:28:52 +08:00
9 changed files with 3945 additions and 0 deletions

.claude-plugin/plugin.json
{
"name": "azure-master",
"description": "Complete Azure cloud expertise system with 2025 features including AKS Automatic, Container Apps GPU support, and Deployment Stacks. PROACTIVELY activate for: (1) ANY Azure resource provisioning or management, (2) AKS Automatic with Karpenter autoscaling, (3) Container Apps with serverless GPU and Dapr integration, (4) Azure OpenAI GPT-5 and reasoning models (o4-mini, o3), (5) Deployment Stacks for infrastructure lifecycle management, (6) Bicep v0.37+ with externalInput() and custom extensions, (7) Azure CLI 2.79+ with latest breaking changes, (8) SRE Agent integration for monitoring and incident response, (9) Azure AI Foundry model deployment, (10) Security, networking, and cost optimization. Provides: AKS Automatic GA features, Container Apps GPU workloads, Deployment Stacks best practices, latest Azure OpenAI models, Bicep 2025 patterns, Azure CLI expertise, comprehensive service configurations (compute, networking, storage, databases, AI/ML), Well-Architected Framework guidance, high availability patterns, security hardening, cost optimization strategies, and production-ready configurations. Ensures enterprise-ready, secure, scalable Azure infrastructure following Microsoft 2025 standards.",
"version": "1.1.0",
"author": {
"name": "Josiah Siegel",
"email": "JosiahSiegel@users.noreply.github.com"
},
"skills": [
"./skills"
],
"agents": [
"./agents"
]
}

README.md
# azure-master
Complete Azure cloud expertise system with 2025 features including AKS Automatic, Container Apps GPU support, and Deployment Stacks. PROACTIVELY activate for: (1) ANY Azure resource provisioning or management, (2) AKS Automatic with Karpenter autoscaling, (3) Container Apps with serverless GPU and Dapr integration, (4) Azure OpenAI GPT-5 and reasoning models (o4-mini, o3), (5) Deployment Stacks for infrastructure lifecycle management, (6) Bicep v0.37+ with externalInput() and custom extensions, (7) Azure CLI 2.79+ with latest breaking changes, (8) SRE Agent integration for monitoring and incident response, (9) Azure AI Foundry model deployment, (10) Security, networking, and cost optimization. Provides: AKS Automatic GA features, Container Apps GPU workloads, Deployment Stacks best practices, latest Azure OpenAI models, Bicep 2025 patterns, Azure CLI expertise, comprehensive service configurations (compute, networking, storage, databases, AI/ML), Well-Architected Framework guidance, high availability patterns, security hardening, cost optimization strategies, and production-ready configurations. Ensures enterprise-ready, secure, scalable Azure infrastructure following Microsoft 2025 standards.

agents/azure-expert.md
## 🚨 CRITICAL GUIDELINES
### Windows File Path Requirements
**MANDATORY: Always Use Backslashes on Windows for File Paths**
When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`).
**Examples:**
- ❌ WRONG: `D:/repos/project/file.tsx`
- ✅ CORRECT: `D:\repos\project\file.tsx`
This applies to:
- Edit tool file_path parameter
- Write tool file_path parameter
- All file operations on Windows systems
### Documentation Guidelines
**NEVER create new documentation files unless explicitly requested by the user.**
- **Priority**: Update existing README.md files rather than creating new documentation
- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise
- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone
- **User preference**: Only create additional .md files when user specifically asks for documentation
---
# Azure Cloud Expert Agent
You are a comprehensive Azure cloud expert with deep knowledge of all Azure services, 2025 features, and production-ready configuration patterns.
## Core Responsibilities
### 1. ALWAYS Fetch Latest Documentation First
**CRITICAL**: Before any Azure task, fetch the latest documentation:
```bash
# Use WebSearch for latest features
web_search: "Azure [service-name] latest features 2025"
# Use Context7 for library documentation
resolve-library-id: "@azure/cli" or "azure-bicep"
get-library-docs: with specific topic
```
### 2. 2025 Azure Feature Expertise
**AKS Automatic (GA - October 2025)**
- Fully-managed Kubernetes with zero operational overhead
- Karpenter integration for dynamic node provisioning
- HPA, VPA, and KEDA enabled by default
- Entra ID, network policies, automatic patching built-in
- New billing: $0.16/hour cluster + compute costs
- Ubuntu 24.04 on Kubernetes 1.34+
**Azure Container Apps 2025 Updates**
- Serverless GPU (GA): Auto-scaling AI workloads with per-second billing
- Dedicated GPU (GA): Simplified AI deployment
- Foundry Models integration: Deploy AI models during container creation
- Workflow with Durable task scheduler (Preview)
- Native Azure Functions support
- Dynamic Sessions with GPU for untrusted code execution
**Azure OpenAI Service Models (2025)**
- GPT-5 series: gpt-5-pro, gpt-5, gpt-5-codex (registration required)
- GPT-4.1 series: 1M token context, 4.1-mini, 4.1-nano
- Reasoning models: o4-mini, o3, o1, o1-mini
- Image generation: GPT-image-1 (2025-04-15)
- Video generation: Sora (2025-05-02)
- Audio models: gpt-4o-transcribe, gpt-4o-mini-transcribe
**Azure AI Foundry (Build 2025)**
- Model router for optimal model selection (cost + quality)
- Agentic retrieval: 40% better on multi-part questions
- Foundry Observability (Preview): End-to-end monitoring
- SRE Agent: 24/7 monitoring, autonomous incident response
- New models: Grok 3 (xAI), Flux Pro 1.1, Sora, Hugging Face models
- ND H200 V5 VMs: NVIDIA H200 GPUs, 2x performance gains
**Deployment Stacks (GA)**
- Manage Azure resources as unified entities
- Deny settings: DenyDelete, DenyWriteAndDelete
- ActionOnUnmanage: Detach or delete orphaned resources
- Scopes: Resource group, subscription, management group
- Replaces Azure Blueprints (deprecated July 2026)
- Built-in RBAC roles: Stack Contributor, Stack Owner
**Bicep 2025 Updates (v0.37.4)**
- externalInput() function (GA)
- C# authoring for custom Bicep extensions
- Experimental capabilities
- Enhanced parameter validation
- Improved module lifecycle management
**Azure CLI 2025 (v2.79.0)**
- Breaking changes in November 2025 release
- ACR Helm 2 support removed (March 2025)
- Role assignment delete behavior changed
- New regions and availability zones
- Enhanced Azure Container Storage support
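Scripts can guard against these breaking changes by comparing the installed CLI version (from `az version`, which prints JSON) against the 2.79.0 threshold. A minimal sketch in Python, assuming the standard `az version` output shape:

```python
import json

def needs_upgrade(az_version_json: str, minimum: str = "2.79.0") -> bool:
    # Flag Azure CLI installs older than 2.79.0, where the November 2025
    # breaking changes (role assignment delete behavior, etc.) landed.
    # Input matches `az version` output: {"azure-cli": "2.78.0", ...}
    installed = json.loads(az_version_json)["azure-cli"]
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) < as_tuple(minimum)

print(needs_upgrade('{"azure-cli": "2.78.0"}'))  # True
print(needs_upgrade('{"azure-cli": "2.79.0"}'))  # False
```

Pipe the real output through it with `az version | python check_cli.py` or similar.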
### 3. Production-Ready Service Patterns
**Compute Services**
```bash
# AKS Automatic (2025 GA)
az aks create \
--resource-group MyRG \
--name MyAKSAutomatic \
--sku automatic \
--enable-karpenter \
--network-plugin azure \
--network-plugin-mode overlay \
--network-dataplane cilium \
--os-sku AzureLinux \
--kubernetes-version 1.34 \
--zones 1 2 3
# Container Apps with GPU (2025)
az containerapp create \
--name myapp \
--resource-group MyRG \
--environment myenv \
--image myregistry.azurecr.io/myimage:latest \
--cpu 2 \
--memory 4Gi \
--gpu-type nvidia-a100 \
--gpu-count 1 \
--min-replicas 0 \
--max-replicas 10 \
--scale-rule-name gpu-scaling \
--scale-rule-type custom
# Container Apps with Dapr
az containerapp create \
--name myapp \
--resource-group MyRG \
--environment myenv \
--enable-dapr true \
--dapr-app-id myapp \
--dapr-app-port 8080 \
--dapr-app-protocol http
# App Service with latest runtime
# (use --runtime for code-based apps; a custom container image would
# use --deployment-container-image-name instead, not both)
az webapp create \
--resource-group MyRG \
--plan MyPlan \
--name MyUniqueAppName \
--runtime "NODE|20-lts"
```
**AI and ML Services**
```bash
# Azure OpenAI with GPT-5
az cognitiveservices account create \
--name myopenai \
--resource-group MyRG \
--kind OpenAI \
--sku S0 \
--location eastus \
--custom-domain myopenai
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name gpt-5 \
--model-name gpt-5 \
--model-version latest \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 100
# Deploy reasoning model (o3)
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name o3-reasoning \
--model-name o3 \
--model-version latest \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 50
# AI Foundry workspace
az ml workspace create \
--name myworkspace \
--resource-group MyRG \
--location eastus \
--storage-account mystorage \
--key-vault mykeyvault \
--app-insights myappinsights \
--container-registry myacr \
--enable-data-isolation true
```
**Deployment Stacks (Bicep)**
```bash
# Create deployment stack at subscription scope
az stack sub create \
--name MyStack \
--location eastus \
--template-file main.bicep \
--deny-settings-mode DenyWriteAndDelete \
--deny-settings-excluded-principals <service-principal-id> \
--action-on-unmanage deleteAll \
--description "Production infrastructure stack"
# Update stack with new template
az stack sub update \
--name MyStack \
--template-file main.bicep \
--parameters @parameters.json
# Delete stack and managed resources
az stack sub delete \
--name MyStack \
--action-on-unmanage deleteAll
# List deployment stacks
az stack sub list --output table
```
**Bicep 2025 Patterns**
```bicep
// main.bicep - Using externalInput() (GA in v0.37+)
@description('External configuration source')
param configUri string
// Load external configuration
var config = externalInput('json', configUri)
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-05-01' = {
  name: config.storageAccountName
  location: config.location
  sku: {
    name: config.sku
  }
  kind: 'StorageV2'
  properties: {
    accessTier: config.accessTier
    minimumTlsVersion: 'TLS1_2'
    supportsHttpsTrafficOnly: true
    allowBlobPublicAccess: false
    networkAcls: {
      defaultAction: 'Deny'
      bypass: 'AzureServices'
    }
  }
}
// AKS Automatic cluster
resource aksCluster 'Microsoft.ContainerService/managedClusters@2025-01-01' = {
  name: 'myaksautomatic'
  location: resourceGroup().location
  sku: {
    name: 'Automatic'
    tier: 'Standard'
  }
  properties: {
    kubernetesVersion: '1.34'
    enableRBAC: true
    aadProfile: {
      managed: true
      enableAzureRBAC: true
    }
    networkProfile: {
      networkPlugin: 'azure'
      networkPluginMode: 'overlay'
      networkDataplane: 'cilium'
      serviceCidr: '10.0.0.0/16'
      dnsServiceIP: '10.0.0.10'
    }
    autoScalerProfile: {
      'balance-similar-node-groups': 'true'
      expander: 'least-waste'
      'skip-nodes-with-system-pods': 'false'
    }
    autoUpgradeProfile: {
      upgradeChannel: 'stable'
    }
    securityProfile: {
      defender: {
        securityMonitoring: {
          enabled: true
        }
      }
    }
  }
}
// Container App with GPU
// Reference the existing Container Apps environment (name assumed)
resource containerAppEnv 'Microsoft.App/managedEnvironments@2025-02-01' existing = {
  name: 'myenv'
}

resource containerApp 'Microsoft.App/containerApps@2025-02-01' = {
  name: 'myapp'
  location: resourceGroup().location
  properties: {
    environmentId: containerAppEnv.id
    configuration: {
      dapr: {
        enabled: true
        appId: 'myapp'
        appPort: 8080
        appProtocol: 'http'
      }
      ingress: {
        external: true
        targetPort: 8080
        traffic: [
          {
            latestRevision: true
            weight: 100
          }
        ]
      }
    }
    template: {
      containers: [
        {
          name: 'main'
          image: 'myregistry.azurecr.io/myimage:latest'
          resources: {
            cpu: json('2')
            memory: '4Gi'
            gpu: {
              type: 'nvidia-a100'
              count: 1
            }
          }
        }
      ]
      scale: {
        minReplicas: 0
        maxReplicas: 10
        rules: [
          {
            name: 'gpu-scaling'
            custom: {
              type: 'prometheus'
              metadata: {
                serverAddress: 'http://prometheus.monitoring.svc.cluster.local:9090'
                metricName: 'gpu_utilization'
                threshold: '80'
                query: 'avg(gpu_utilization)'
              }
            }
          }
        ]
      }
    }
  }
}
```
### 4. Well-Architected Framework Principles
**Reliability**
- Deploy across availability zones (3 zones for 99.99% SLA)
- Use AKS Automatic with Karpenter for dynamic scaling
- Implement health probes and liveness checks
- Enable automatic OS patching and upgrades
- Use Deployment Stacks for consistent deployments
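The zone math behind figures like 99.99%: if zone outages were independent, the availability of a workload spread over n zones would compound as 1 − (1 − a)^n. An illustrative, deliberately simplified calculation:

```python
def multi_zone_availability(zone_availability: float, zones: int) -> float:
    """Chance at least one zone is serving, assuming independent zone failures."""
    return 1 - (1 - zone_availability) ** zones

# Three zones at 99% each already compound to "six nines" under this model;
# real SLAs are lower because zone failures are not fully independent.
print(round(multi_zone_availability(0.99, 3), 6))  # 0.999999
```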
**Security**
- Enable Microsoft Defender for Cloud
- Use managed identities (workload identity for AKS)
- Implement network policies and private endpoints
- Enable encryption at rest and in transit (TLS 1.2+)
- Use Key Vault for secrets management
- Apply deny settings in Deployment Stacks
**Cost Optimization**
- Use AKS Automatic for efficient resource allocation
- Container Apps scale-to-zero for serverless workloads
- Purchase Azure reservations (1-3 years)
- Enable Azure Hybrid Benefit
- Implement autoscaling policies
- Use spot instances for non-critical workloads
**Performance**
- Use premium storage tiers for production
- Enable accelerated networking
- Use proximity placement groups
- Implement CDN for static content
- Use Azure Front Door for global routing
- Container Apps GPU for AI workloads
**Operational Excellence**
- Use Azure Monitor and Application Insights
- Enable Foundry Observability for AI workloads
- Implement Infrastructure as Code (Bicep/Terraform)
- Use Deployment Stacks for lifecycle management
- Configure alerts and action groups
- Enable SRE Agent for autonomous monitoring
### 5. Networking Best Practices
**Hub-Spoke Topology**
```bash
# Hub VNet
az network vnet create \
--resource-group Hub-RG \
--name Hub-VNet \
--address-prefix 10.0.0.0/16 \
--subnet-name AzureFirewallSubnet \
--subnet-prefix 10.0.1.0/24
# Spoke VNet
az network vnet create \
--resource-group Spoke-RG \
--name Spoke-VNet \
--address-prefix 10.1.0.0/16 \
--subnet-name WorkloadSubnet \
--subnet-prefix 10.1.1.0/24
# VNet Peering
az network vnet peering create \
--name Hub-to-Spoke \
--resource-group Hub-RG \
--vnet-name Hub-VNet \
--remote-vnet /subscriptions/<sub-id>/resourceGroups/Spoke-RG/providers/Microsoft.Network/virtualNetworks/Spoke-VNet \
--allow-vnet-access \
--allow-forwarded-traffic \
--allow-gateway-transit
# Private DNS Zone
az network private-dns zone create \
--resource-group Hub-RG \
--name privatelink.azurecr.io
az network private-dns link vnet create \
--resource-group Hub-RG \
--zone-name privatelink.azurecr.io \
--name hub-vnet-link \
--virtual-network Hub-VNet \
--registration-enabled false
```
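Before creating the peering, verify that the two address spaces do not overlap (Azure rejects peering between overlapping VNets) and that each subnet sits inside its VNet's prefix. Python's standard `ipaddress` module makes both checks one-liners:

```python
import ipaddress

hub = ipaddress.ip_network("10.0.0.0/16")
spoke = ipaddress.ip_network("10.1.0.0/16")

# Peered VNets must have disjoint address spaces.
print(hub.overlaps(spoke))  # False

# Each subnet must fall inside its VNet's prefix.
firewall_subnet = ipaddress.ip_network("10.0.1.0/24")
workload_subnet = ipaddress.ip_network("10.1.1.0/24")
print(firewall_subnet.subnet_of(hub))    # True
print(workload_subnet.subnet_of(spoke))  # True
```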
### 6. Storage and Database Patterns
**Storage Account with lifecycle management**
```bash
az storage account create \
--name mystorageaccount \
--resource-group MyRG \
--location eastus \
--sku Standard_ZRS \
--kind StorageV2 \
--access-tier Hot \
--https-only true \
--min-tls-version TLS1_2 \
--allow-blob-public-access false \
--enable-hierarchical-namespace true
# Lifecycle management policy
az storage account management-policy create \
--account-name mystorageaccount \
--resource-group MyRG \
--policy '{
"rules": [
{
"name": "moveToArchive",
"enabled": true,
"type": "Lifecycle",
"definition": {
"filters": {
"blobTypes": ["blockBlob"],
"prefixMatch": ["archive/"]
},
"actions": {
"baseBlob": {
"tierToCool": {"daysAfterModificationGreaterThan": 30},
"tierToArchive": {"daysAfterModificationGreaterThan": 90}
}
}
}
}
]
}'
```
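To sanity-check what the policy above does: `daysAfterModificationGreaterThan` is a strict comparison, so a blob under the `archive/` prefix moves to Cool after day 30 and to Archive after day 90. A small simulation of the tiering rule (hypothetical helper, not an Azure API):

```python
def blob_tier(days_since_modified: int) -> str:
    # Mirrors the policy above: tierToArchive after >90 days, tierToCool after >30.
    if days_since_modified > 90:
        return "Archive"
    if days_since_modified > 30:
        return "Cool"
    return "Hot"

print(blob_tier(30))   # Hot  (the comparison is strictly greater-than)
print(blob_tier(45))   # Cool
print(blob_tier(120))  # Archive
```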
**SQL Database with zone redundancy**
```bash
az sql server create \
--name myserver \
--resource-group MyRG \
--location eastus \
--admin-user myadmin \
--admin-password <strong-password> \
--enable-public-network false \
--restrict-outbound-network-access enabled
az sql db create \
--resource-group MyRG \
--server myserver \
--name mydb \
--service-objective GP_Gen5_2 \
--backup-storage-redundancy Zone \
--zone-redundant true \
--compute-model Serverless \
--auto-pause-delay 60 \
--min-capacity 0.5 \
--max-size 32GB
# Private endpoint
az network private-endpoint create \
--name sql-private-endpoint \
--resource-group MyRG \
--vnet-name MyVNet \
--subnet PrivateEndpointSubnet \
--private-connection-resource-id $(az sql server show -g MyRG -n myserver --query id -o tsv) \
--group-id sqlServer \
--connection-name sql-connection
```
### 7. Monitoring and Observability
**Azure Monitor with Container Insights**
```bash
# Log Analytics workspace
az monitor log-analytics workspace create \
--resource-group MyRG \
--workspace-name MyWorkspace \
--location eastus \
--retention-time 90 \
--sku PerGB2018
# Enable Container Insights for AKS
az aks enable-addons \
--resource-group MyRG \
--name MyAKS \
--addons monitoring \
--workspace-resource-id $(az monitor log-analytics workspace show -g MyRG -n MyWorkspace --query id -o tsv)
# Application Insights for Container Apps
az monitor app-insights component create \
--app MyAppInsights \
--location eastus \
--resource-group MyRG \
--application-type web \
--workspace $(az monitor log-analytics workspace show -g MyRG -n MyWorkspace --query id -o tsv)
# Foundry Observability (Preview)
az ml workspace update \
--name myworkspace \
--resource-group MyRG \
--enable-observability true
# Alert rules
az monitor metrics alert create \
--name high-cpu-alert \
--resource-group MyRG \
--scopes $(az aks show -g MyRG -n MyAKS --query id -o tsv) \
--condition "avg Percentage CPU > 80" \
--window-size 5m \
--evaluation-frequency 1m \
--action <action-group-id>
```
### 8. Security Hardening
**Microsoft Defender for Cloud**
```bash
# Enable Defender plans
az security pricing create --name VirtualMachines --tier Standard
az security pricing create --name SqlServers --tier Standard
az security pricing create --name AppServices --tier Standard
az security pricing create --name StorageAccounts --tier Standard
az security pricing create --name KubernetesService --tier Standard
az security pricing create --name ContainerRegistry --tier Standard
az security pricing create --name KeyVaults --tier Standard
az security pricing create --name Dns --tier Standard
az security pricing create --name Arm --tier Standard
# Key Vault with RBAC and purge protection
az keyvault create \
--name mykeyvault \
--resource-group MyRG \
--location eastus \
--enable-rbac-authorization true \
--enable-purge-protection true \
--enable-soft-delete true \
--retention-days 90 \
--network-acls-default-action Deny
# Managed Identity
az identity create \
--name myidentity \
--resource-group MyRG
# Assign role
az role assignment create \
--assignee <identity-principal-id> \
--role "Key Vault Secrets User" \
--scope $(az keyvault show -g MyRG -n mykeyvault --query id -o tsv)
```
## Key Decision Criteria
**Choose AKS Automatic when:**
- You want zero operational overhead
- Dynamic node provisioning is critical
- You need built-in security and compliance
- Auto-scaling across HPA, VPA, KEDA is required
**Choose Container Apps when:**
- Serverless with scale-to-zero is needed
- Event-driven architecture with Dapr
- GPU workloads for AI/ML inference
- Simpler deployment model than Kubernetes
**Choose App Service when:**
- Traditional web apps or APIs
- Integrated deployment slots
- Built-in authentication
- Auto-scaling without Kubernetes complexity
**Choose VMs when:**
- Legacy applications with specific OS requirements
- Full control over OS and middleware
- Lift-and-shift migrations
- Specialized workloads
## Response Guidelines
1. **Research First**: Always fetch latest Azure documentation
2. **Production-Ready**: Provide complete, secure configurations
3. **2025 Features**: Prioritize latest GA features
4. **Best Practices**: Follow Well-Architected Framework
5. **Explain Trade-offs**: Compare options with clear decision criteria
6. **Complete Examples**: Include all required parameters
7. **Security First**: Enable encryption, RBAC, private endpoints
8. **Cost-Aware**: Suggest cost optimization strategies
Your goal is to deliver enterprise-ready Azure solutions using 2025 best practices.

plugin.lock.json
{
"$schema": "internal://schemas/plugin.lock.v1.json",
"pluginId": "gh:JosiahSiegel/claude-code-marketplace:plugins/azure-master",
"normalized": {
"repo": null,
"ref": "refs/tags/v20251128.0",
"commit": "578b0f23124804384e64d4370b4202b34fc2033d",
"treeHash": "c749337293c5a0b303e5f8e297dc52cb5c678293b9d81dda8baae4a4d8ef0e22",
"generatedAt": "2025-11-28T10:11:51.861096Z",
"toolVersion": "publish_plugins.py@0.2.0"
},
"origin": {
"remote": "git@github.com:zhongweili/42plugin-data.git",
"branch": "master",
"commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
"repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
},
"manifest": {
"name": "azure-master",
"description": "Complete Azure cloud expertise system with 2025 features including AKS Automatic, Container Apps GPU support, and Deployment Stacks. PROACTIVELY activate for: (1) ANY Azure resource provisioning or management, (2) AKS Automatic with Karpenter autoscaling, (3) Container Apps with serverless GPU and Dapr integration, (4) Azure OpenAI GPT-5 and reasoning models (o4-mini, o3), (5) Deployment Stacks for infrastructure lifecycle management, (6) Bicep v0.37+ with externalInput() and custom extensions, (7) Azure CLI 2.79+ with latest breaking changes, (8) SRE Agent integration for monitoring and incident response, (9) Azure AI Foundry model deployment, (10) Security, networking, and cost optimization. Provides: AKS Automatic GA features, Container Apps GPU workloads, Deployment Stacks best practices, latest Azure OpenAI models, Bicep 2025 patterns, Azure CLI expertise, comprehensive service configurations (compute, networking, storage, databases, AI/ML), Well-Architected Framework guidance, high availability patterns, security hardening, cost optimization strategies, and production-ready configurations. Ensures enterprise-ready, secure, scalable Azure infrastructure following Microsoft 2025 standards.",
"version": "1.1.0"
},
"content": {
"files": [
{
"path": "README.md",
"sha256": "78c0ce977f2150baca0f182dbcd6efe57ede0719438e41238efff325933716e0"
},
{
"path": "agents/azure-expert.md",
"sha256": "9c49c91c80b8a398b99177c978d7c22e5469a053b13d6677a1da8c98bfe92278"
},
{
"path": ".claude-plugin/plugin.json",
"sha256": "2306520f188f8b53fc7925d230f553256ab88ea83e9fe552c516ab60175be11c"
},
{
"path": "skills/azure-openai-2025.md",
"sha256": "2cafd7b9019a5e9d99f2db0a10b821c565f45895977becb5cbc4e8adf51605de"
},
{
"path": "skills/azure-well-architected-framework.md",
"sha256": "5cb44d98f56310779dbb76670cbb6041a7309c458bc642c4bc0f9224c2bcfaaf"
},
{
"path": "skills/deployment-stacks-2025.md",
"sha256": "d2ca648a8ea8cea0d1fc2ee08a9cd071332bbfed0f47ebaa2b0e8fc86d05bccf"
},
{
"path": "skills/container-apps-gpu-2025.md",
"sha256": "4c52b7b9c81cb03538ca655463eccefd292c0aaf4fe72599c73f287636d329fe"
},
{
"path": "skills/aks-automatic-2025.md",
"sha256": "296d3a9d9eab641774604908513c674cb086fe4b45e25aa75919772cd89bc733"
}
],
"dirSha256": "c749337293c5a0b303e5f8e297dc52cb5c678293b9d81dda8baae4a4d8ef0e22"
},
"security": {
"scannedAt": null,
"scannerVersion": null,
"flags": []
}
}

skills/aks-automatic-2025.md
## 🚨 CRITICAL GUIDELINES
### Windows File Path Requirements
**MANDATORY: Always Use Backslashes on Windows for File Paths**
When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`).
**Examples:**
- ❌ WRONG: `D:/repos/project/file.tsx`
- ✅ CORRECT: `D:\repos\project\file.tsx`
This applies to:
- Edit tool file_path parameter
- Write tool file_path parameter
- All file operations on Windows systems
### Documentation Guidelines
**NEVER create new documentation files unless explicitly requested by the user.**
- **Priority**: Update existing README.md files rather than creating new documentation
- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise
- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone
- **User preference**: Only create additional .md files when user specifically asks for documentation
---
# AKS Automatic - 2025 GA Features
Complete knowledge base for Azure Kubernetes Service Automatic mode (GA October 2025).
## Overview
AKS Automatic is a fully-managed Kubernetes offering that eliminates operational overhead through intelligent automation and built-in best practices.
## Key Features (GA October 2025)
### 1. Zero Operational Overhead
- Fully-managed control plane and worker nodes
- Automatic OS patching and security updates
- Built-in monitoring and diagnostics
- Integrated security and compliance
### 2. Karpenter Integration
- Dynamic node provisioning based on real-time demand
- Intelligent bin-packing for cost optimization
- Automatic node consolidation and deprovisioning
- Support for multiple node pools and instance types
### 3. Auto-Scaling (Enabled by Default)
- **Horizontal Pod Autoscaler (HPA)**: Scale pods based on CPU/memory
- **Vertical Pod Autoscaler (VPA)**: Adjust pod resource requests/limits
- **KEDA**: Event-driven autoscaling for external triggers
### 4. Enhanced Security
- Microsoft Entra ID integration for authentication
- Azure RBAC for Kubernetes authorization
- Network policies enabled by default
- Automatic security patches
- Workload identity for pod-level authentication
### 5. Advanced Networking
- Azure CNI Overlay for efficient IP usage
- Cilium dataplane for high-performance networking
- Network policies for microsegmentation
- Private clusters supported
### 6. New Billing Model (Effective October 19, 2025)
- Hosted control plane fee: **$0.16/cluster/hour**
- Compute charges based on actual node usage
- No separate cluster management fee
- Cost savings from Karpenter optimization
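At Azure's usual 730-hour billing month, the control-plane fee alone works out as follows (compute charges come on top and dominate in practice):

```python
CONTROL_PLANE_RATE = 0.16  # USD per cluster per hour (AKS Automatic, Oct 2025)
HOURS_PER_MONTH = 730      # Azure's standard monthly hour count

def monthly_control_plane_cost(clusters: int = 1) -> float:
    return round(clusters * CONTROL_PLANE_RATE * HOURS_PER_MONTH, 2)

print(monthly_control_plane_cost())   # 116.8 USD per cluster
print(monthly_control_plane_cost(5))  # 584.0 USD for five clusters
```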
### 7. Node Operating System
- Ubuntu 22.04 for Kubernetes < 1.34
- Ubuntu 24.04 for Kubernetes >= 1.34
- Automatic OS upgrades with node image channel
## Creating AKS Automatic Cluster
### Basic Creation
```bash
az aks create \
--resource-group MyRG \
--name MyAKSAutomatic \
--sku automatic \
--kubernetes-version 1.34 \
--location eastus
```
### Production-Ready Configuration
```bash
# Note: bash does not allow comments inside a line continuation, so the
# flags are grouped in order: version, Karpenter, networking, custom VNet
# (optional), availability zones, auth/RBAC, auto-upgrade, security,
# monitoring, tags.
az aks create \
--resource-group MyRG \
--name MyAKSAutomatic \
--location eastus \
--sku automatic \
--tier standard \
--kubernetes-version 1.34 \
--enable-karpenter \
--network-plugin azure \
--network-plugin-mode overlay \
--network-dataplane cilium \
--service-cidr 10.0.0.0/16 \
--dns-service-ip 10.0.0.10 \
--load-balancer-sku standard \
--vnet-subnet-id /subscriptions/<sub-id>/resourceGroups/MyRG/providers/Microsoft.Network/virtualNetworks/MyVNet/subnets/AKSSubnet \
--zones 1 2 3 \
--enable-managed-identity \
--enable-aad \
--enable-azure-rbac \
--aad-admin-group-object-ids <group-object-id> \
--auto-upgrade-channel stable \
--node-os-upgrade-channel NodeImage \
--enable-defender \
--enable-workload-identity \
--enable-oidc-issuer \
--enable-addons monitoring \
--workspace-resource-id /subscriptions/<sub-id>/resourceGroups/MyRG/providers/Microsoft.OperationalInsights/workspaces/MyWorkspace \
--tags Environment=Production ManagedBy=AKSAutomatic
```
### With Azure Policy Add-on
```bash
az aks create \
--resource-group MyRG \
--name MyAKSAutomatic \
--sku automatic \
--enable-addons azure-policy \
--kubernetes-version 1.34
```
## Karpenter Configuration
AKS Automatic uses Karpenter for intelligent node provisioning. Customize node provisioning with AKSNodeClass and NodePool CRDs.
### Default AKSNodeClass
```yaml
apiVersion: karpenter.azure.com/v1alpha1
kind: AKSNodeClass
metadata:
  name: default
spec:
  # OS image - Ubuntu 24.04 for K8s 1.34+
  osImage:
    sku: Ubuntu
    version: "24.04"
  # VM series
  vmSeries:
    - Standard_D
    - Standard_E
  # Max pods per node
  maxPodsPerNode: 110
  # Security
  securityProfile:
    sshAccess: Disabled
    securityType: Standard
```
### Custom NodePool
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
spec:
  # Constraints
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: kubernetes.azure.com/agentpool
          operator: In
          values: ["general"]
      # Node labels
      labels:
        workload-type: general
      # Taints (optional)
      taints:
        - key: "dedicated"
          value: "general"
          effect: "NoSchedule"
      # NodeClass reference
      nodeClassRef:
        group: karpenter.azure.com
        kind: AKSNodeClass
        name: default
  # Limits
  limits:
    cpu: "1000"
    memory: 4000Gi
  # Disruption budget
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s
    expireAfter: 720h # 30 days
    budgets:
      - nodes: "10%"
        duration: 5m
```
### GPU NodePool for AI Workloads
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-workloads
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["Standard_NC6s_v3", "Standard_NC12s_v3", "Standard_NC24s_v3"]
      labels:
        workload-type: gpu
        gpu-type: nvidia-v100
      taints:
        - key: "nvidia.com/gpu"
          value: "true"
          effect: "NoSchedule"
      nodeClassRef:
        group: karpenter.azure.com
        kind: AKSNodeClass
        name: gpu-nodeclass
  limits:
    cpu: "200"
    memory: 800Gi
    nvidia.com/gpu: "16"
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 300s
```
## Autoscaling with HPA, VPA, and KEDA
### Horizontal Pod Autoscaler (HPA)
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 15
```
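The HPA controller picks replica counts with a simple ratio formula, `desired = ceil(current × currentMetric / targetMetric)`, clamped to the min/max bounds. A sketch of that calculation:

```python
import math

def hpa_desired_replicas(current: int, current_value: float, target_value: float,
                         min_replicas: int = 2, max_replicas: int = 50) -> int:
    # Core HPA formula: desired = ceil(current * currentMetric / targetMetric),
    # then clamped into [minReplicas, maxReplicas].
    desired = math.ceil(current * current_value / target_value)
    return max(min_replicas, min(max_replicas, desired))

print(hpa_desired_replicas(4, 90, 70))   # ceil(5.14) -> 6
print(hpa_desired_replicas(10, 35, 70))  # ceil(5.0)  -> 5 (scale down)
```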
### Vertical Pod Autoscaler (VPA)
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto" # Auto, Recreate, Initial, Off
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 4
          memory: 8Gi
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits
```
### KEDA ScaledObject (Event-Driven)
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: myapp-queue-scaler
spec:
  scaleTargetRef:
    name: myapp
  minReplicaCount: 0 # Scale to zero
  maxReplicaCount: 100
  pollingInterval: 30
  cooldownPeriod: 300
  triggers:
    # Azure Service Bus Queue
    - type: azure-servicebus
      metadata:
        queueName: myqueue
        namespace: myservicebus
        messageCount: "5"
      authenticationRef:
        name: azure-servicebus-auth
    # Azure Storage Queue
    - type: azure-queue
      metadata:
        queueName: myqueue
        queueLength: "10"
        accountName: mystorageaccount
      authenticationRef:
        name: azure-storage-auth
    # Prometheus metrics
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        metricName: http_requests_per_second
        threshold: "100"
        query: sum(rate(http_requests_total[2m]))
```
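For the queue triggers, KEDA sizes the deployment at roughly `ceil(queueLength / messageCount)` replicas, bounded by `minReplicaCount` and `maxReplicaCount`. A back-of-envelope model of that behavior:

```python
import math

def keda_queue_replicas(queue_length: int, messages_per_replica: int,
                        min_replicas: int = 0, max_replicas: int = 100) -> int:
    # Approximates KEDA's queue scaling: one replica per `messageCount` messages.
    if queue_length == 0:
        return min_replicas  # scale to zero when the queue is empty
    desired = math.ceil(queue_length / messages_per_replica)
    return max(min_replicas, min(max_replicas, desired))

print(keda_queue_replicas(0, 5))      # 0 (scale to zero)
print(keda_queue_replicas(42, 5))     # 9
print(keda_queue_replicas(10000, 5))  # 100 (capped at maxReplicaCount)
```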
## Workload Identity (Replaces AAD Pod Identity)
### Setup
```bash
# Workload identity is enabled by default in AKS Automatic
# Create managed identity
az identity create \
--name myapp-identity \
--resource-group MyRG
# Get identity details
export IDENTITY_CLIENT_ID=$(az identity show -g MyRG -n myapp-identity --query clientId -o tsv)
export IDENTITY_OBJECT_ID=$(az identity show -g MyRG -n myapp-identity --query principalId -o tsv)
# Assign role to identity
az role assignment create \
--assignee $IDENTITY_OBJECT_ID \
--role "Storage Blob Data Contributor" \
--scope /subscriptions/<sub-id>/resourceGroups/MyRG/providers/Microsoft.Storage/storageAccounts/mystorage
# Create federated credential
export AKS_OIDC_ISSUER=$(az aks show -g MyRG -n MyAKSAutomatic --query oidcIssuerProfile.issuerUrl -o tsv)
az identity federated-credential create \
--name myapp-federated-credential \
--identity-name myapp-identity \
--resource-group MyRG \
--issuer $AKS_OIDC_ISSUER \
--subject system:serviceaccount:default:myapp-sa
```
### Kubernetes Resources
```yaml
# Service Account
apiVersion: v1
kind: ServiceAccount
metadata:
name: myapp-sa
namespace: default
annotations:
azure.workload.identity/client-id: "<IDENTITY_CLIENT_ID>"
---
# Deployment using workload identity
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
replicas: 2
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
azure.workload.identity/use: "true" # Enable workload identity
spec:
serviceAccountName: myapp-sa
containers:
- name: myapp
image: myregistry.azurecr.io/myapp:latest
env:
- name: AZURE_CLIENT_ID
value: "<IDENTITY_CLIENT_ID>"
- name: AZURE_TENANT_ID
value: "<TENANT_ID>"
- name: AZURE_FEDERATED_TOKEN_FILE
value: /var/run/secrets/azure/tokens/azure-identity-token
volumeMounts:
- name: azure-identity-token
mountPath: /var/run/secrets/azure/tokens
readOnly: true
volumes:
- name: azure-identity-token
projected:
sources:
- serviceAccountToken:
path: azure-identity-token
expirationSeconds: 3600
audience: api://AzureADTokenExchange
```
## Monitoring and Observability
### Enable Container Insights
```bash
# Already enabled with --enable-addons monitoring
# Query logs using Azure Monitor
# Get cluster logs
az monitor log-analytics query \
--workspace <workspace-id> \
--analytics-query "KubePodInventory | where ClusterName == 'MyAKSAutomatic' | take 10" \
--output table
# Get Karpenter logs
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter
```
### Prometheus and Grafana
```bash
# Enable managed Prometheus
az aks update \
--resource-group MyRG \
--name MyAKSAutomatic \
--enable-azure-monitor-metrics
# Access Grafana dashboards through Azure Portal
```
## Cost Optimization
### Billing Model (October 2025)
- **Control plane**: $0.16/hour per cluster
- **Compute**: Pay for actual node usage
- **Karpenter**: Automatic bin-packing and consolidation
- **Scale-to-zero**: Possible with KEDA and Karpenter
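As a quick sanity check on the figures above, the fixed control-plane charge can be estimated directly (a sketch; the hourly rate is the one quoted here and should be verified against current regional pricing):

```python
# Rough monthly estimate for the AKS Automatic control-plane charge,
# using the $0.16/hour figure quoted above (verify against current pricing).
HOURS_PER_MONTH = 730  # average hours in a month

def control_plane_monthly_cost(rate_per_hour: float = 0.16,
                               clusters: int = 1) -> float:
    """Fixed control-plane cost only; node compute is billed separately."""
    return round(rate_per_hour * HOURS_PER_MONTH * clusters, 2)

print(control_plane_monthly_cost())            # single cluster
print(control_plane_monthly_cost(clusters=3))  # three clusters
```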
### Cost-Saving Tips
1. **Use Spot Instances for Non-Critical Workloads**
```yaml
- key: karpenter.sh/capacity-type
operator: In
values: ["spot"]
```
2. **Configure Aggressive Consolidation**
```yaml
disruption:
consolidationPolicy: WhenUnderutilized
consolidateAfter: 30s
```
3. **Implement Pod Disruption Budgets**
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: myapp-pdb
spec:
minAvailable: 1
selector:
matchLabels:
app: myapp
```
4. **Use VPA for Right-Sizing**
- VPA automatically adjusts resource requests based on actual usage
## Migration from Standard AKS to Automatic
AKS Automatic is a distinct cluster mode; in-place migration from a Standard cluster is not supported. Follow these steps:
1. **Create new AKS Automatic cluster**
2. **Install workloads in new cluster**
3. **Validate functionality**
4. **Switch traffic** (DNS, load balancer)
5. **Decommission old cluster**
## Best Practices
✓ Use AKS Automatic for new production clusters
✓ Enable workload identity for pod authentication
✓ Configure custom NodePools for specific workload types
✓ Implement HPA, VPA, and KEDA for comprehensive scaling
✓ Use spot instances for batch and fault-tolerant workloads
✓ Enable Container Insights and Managed Prometheus
✓ Configure Pod Disruption Budgets for critical apps
✓ Use network policies for microsegmentation
✓ Enable Azure Policy add-on for compliance
✓ Implement GitOps with Flux or Argo CD
## Troubleshooting
### Check Karpenter Status
```bash
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter --tail=100
kubectl get nodepools
kubectl get nodeclaims
```
### View Node Provisioning Events
```bash
kubectl get events --field-selector involvedObject.kind=NodePool -A
```
### Debug Workload Identity Issues
```bash
# Check service account annotation
kubectl get sa myapp-sa -o yaml
# Check pod labels
kubectl get pod <pod-name> -o yaml | grep azure.workload.identity
# Check federated credential
az identity federated-credential show \
--identity-name myapp-identity \
--resource-group MyRG \
--name myapp-federated-credential
```
## References
- [AKS Automatic Documentation](https://learn.microsoft.com/en-us/azure/aks/automatic)
- [Karpenter on Azure](https://karpenter.sh)
- [Workload Identity](https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview)
- [AKS Release Notes](https://github.com/Azure/AKS/releases)
AKS Automatic represents the future of managed Kubernetes on Azure - zero operational overhead with maximum automation!

## 🚨 CRITICAL GUIDELINES
### Windows File Path Requirements
**MANDATORY: Always Use Backslashes on Windows for File Paths**
When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`).
**Examples:**
- ❌ WRONG: `D:/repos/project/file.tsx`
- ✅ CORRECT: `D:\repos\project\file.tsx`
This applies to:
- Edit tool file_path parameter
- Write tool file_path parameter
- All file operations on Windows systems
### Documentation Guidelines
**NEVER create new documentation files unless explicitly requested by the user.**
- **Priority**: Update existing README.md files rather than creating new documentation
- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise
- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone
- **User preference**: Only create additional .md files when user specifically asks for documentation
---
# Azure OpenAI Service - 2025 Models and Features
Complete knowledge base for Azure OpenAI Service with latest 2025 models including GPT-5, GPT-4.1, reasoning models, and Azure AI Foundry integration.
## Overview
Azure OpenAI Service provides REST API access to OpenAI's most powerful models with enterprise-grade security, compliance, and regional availability.
## Latest Models (2025)
### GPT-5 Series (GA August 2025)
**Registration Required Models:**
- `gpt-5-pro`: Highest capability, complex reasoning
- `gpt-5`: Balanced performance and cost
- `gpt-5-codex`: Optimized for code generation
**No Registration Required:**
- `gpt-5-mini`: Faster, more affordable
- `gpt-5-nano`: Ultra-fast for simple tasks
- `gpt-5-chat`: Optimized for conversational use
### GPT-4.1 Series
- `gpt-4.1`: 1 million token context window
- `gpt-4.1-mini`: Efficient version with 1M context
- `gpt-4.1-nano`: Fastest variant
**Key Improvements:**
- 1,000,000 token context (vs 128K in GPT-4 Turbo)
- Better instruction following
- Reduced hallucinations
- Improved multilingual support
### Reasoning Models
**o4-mini**: Lightweight reasoning model
- Faster inference
- Lower cost
- Suitable for structured reasoning tasks
**o3**: Advanced reasoning model
- Complex problem solving
- Mathematical reasoning
- Scientific analysis
**o1**: Original reasoning model
- General-purpose reasoning
- Step-by-step explanations
**o1-mini**: Efficient reasoning
- Balanced cost and performance
### Image Generation
**GPT-image-1 (2025-04-15)**
- DALL-E 3 successor
- Higher quality images
- Better prompt understanding
- Improved safety filters
### Video Generation
**Sora (2025-05-02)**
- Text-to-video generation
- Realistic and imaginative scenes
- Up to 60 seconds of video
- Multiple camera angles and styles
### Audio Models
**gpt-4o-transcribe**: Speech-to-text powered by GPT-4o
- High accuracy transcription
- Multiple languages
- Speaker diarization
**gpt-4o-mini-transcribe**: Faster, more affordable transcription
- Good accuracy
- Lower latency
- Cost-effective
## Deploying Azure OpenAI
### Create Azure OpenAI Resource
```bash
# Create OpenAI account
az cognitiveservices account create \
--name myopenai \
--resource-group MyRG \
--kind OpenAI \
--sku S0 \
--location eastus \
--custom-domain myopenai \
--public-network-access Disabled \
--identity-type SystemAssigned
# Get endpoint and key
az cognitiveservices account show \
--name myopenai \
--resource-group MyRG \
--query "properties.endpoint" \
--output tsv
az cognitiveservices account keys list \
--name myopenai \
--resource-group MyRG \
--query "key1" \
--output tsv
```
### Deploy GPT-5 Model
```bash
# Deploy gpt-5
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name gpt-5 \
--model-name gpt-5 \
--model-version latest \
--model-format OpenAI \
  --sku-name Standard \
  --sku-capacity 100
# Deploy gpt-5-pro (requires registration)
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name gpt-5-pro \
--model-name gpt-5-pro \
--model-version latest \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 50
```
### Deploy Reasoning Models
```bash
# Deploy o3 reasoning model
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name o3-reasoning \
--model-name o3 \
--model-version latest \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 50
# Deploy o4-mini
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name o4-mini \
--model-name o4-mini \
--model-version latest \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 100
```
### Deploy GPT-4.1 with 1M Context
```bash
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name gpt-4-1 \
--model-name gpt-4.1 \
--model-version latest \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 100
```
### Deploy Image Generation Model
```bash
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name image-gen \
--model-name gpt-image-1 \
--model-version 2025-04-15 \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 10
```
### Deploy Sora Video Generation
```bash
az cognitiveservices account deployment create \
--resource-group MyRG \
--name myopenai \
--deployment-name sora \
--model-name sora \
--model-version 2025-05-02 \
--model-format OpenAI \
--sku-name Standard \
--sku-capacity 5
```
## Using Azure OpenAI Models
### Python SDK (GPT-5)
```python
from openai import AzureOpenAI
import os
# Initialize client
client = AzureOpenAI(
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
api_version="2025-02-01-preview",
azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
)
# GPT-5 completion
response = client.chat.completions.create(
model="gpt-5", # deployment name
messages=[
{"role": "system", "content": "You are a helpful AI assistant."},
{"role": "user", "content": "Explain quantum computing in simple terms."}
],
    max_completion_tokens=1000  # GPT-5 models use max_completion_tokens
    # (not max_tokens) and run at the default temperature
)
print(response.choices[0].message.content)
```
### Python SDK (o3 Reasoning Model)
```python
# o3 reasoning with chain-of-thought
response = client.chat.completions.create(
model="o3-reasoning",
messages=[
{"role": "system", "content": "You are an expert problem solver. Show your reasoning step-by-step."},
{"role": "user", "content": "If a train travels 120 km in 2 hours, then speeds up to travel 180 km in the next 2 hours, what is the average speed for the entire journey?"}
],
    max_completion_tokens=2000  # reasoning models require max_completion_tokens
    # and do not accept sampling parameters such as temperature
)
print(response.choices[0].message.content)
```
### Python SDK (GPT-4.1 with 1M Context)
```python
# Read a large document
with open('large_document.txt', 'r') as f:
document = f.read()
# GPT-4.1 can handle up to 1M tokens
response = client.chat.completions.create(
model="gpt-4-1",
messages=[
{"role": "system", "content": "You are a document analysis expert."},
{"role": "user", "content": f"Analyze this document and provide key insights:\n\n{document}"}
],
max_tokens=4000
)
print(response.choices[0].message.content)
```
### Image Generation (GPT-image-1)
```python
# Generate image with DALL-E 3 successor
response = client.images.generate(
model="image-gen",
prompt="A futuristic city with flying cars and vertical gardens, cyberpunk style, highly detailed, 4K",
size="1024x1024",
    quality="high",  # gpt-image-1 accepts low/medium/high ("hd" was a DALL-E 3 value)
n=1
)
# gpt-image-1 returns base64-encoded image data rather than a hosted URL
image_b64 = response.data[0].b64_json
print(f"Generated image: {len(image_b64)} base64 chars")
```
### Video Generation (Sora)
```python
# Generate video with Sora (preview). The exact SDK surface for video
# generation may differ from this sketch; check the current API reference.
response = client.videos.generate(
model="sora",
prompt="A serene lakeside at sunset with birds flying overhead and gentle waves on the shore",
duration=10, # seconds
resolution="1080p",
fps=30
)
video_url = response.data[0].url
print(f"Generated video: {video_url}")
```
### Audio Transcription
```python
# Transcribe audio file
audio_file = open("meeting_recording.mp3", "rb")
response = client.audio.transcriptions.create(
model="gpt-4o-transcribe",
file=audio_file,
language="en",
    response_format="verbose_json"  # segment-level output may not be supported by every transcribe model
)
print(f"Transcription: {response.text}")
print(f"Duration: {response.duration}s")
# Speaker diarization
for segment in response.segments:
print(f"[{segment.start}s - {segment.end}s] {segment.text}")
```
## Azure AI Foundry Integration
### Model Router (Automatic Model Selection)
```python
from azure.ai.foundry import ModelRouter  # illustrative import; verify the current Foundry SDK package name
# Initialize model router
router = ModelRouter(
endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
credential=os.getenv("AZURE_OPENAI_API_KEY")
)
# Automatically select optimal model
response = router.complete(
prompt="Analyze this complex scientific paper...",
optimization_goals=["quality", "cost"],
available_models=["gpt-5", "gpt-5-mini", "gpt-4-1"]
)
print(f"Selected model: {response.model_used}")
print(f"Response: {response.content}")
print(f"Cost: ${response.cost}")
```
**Benefits:**
- Automatic model selection based on prompt complexity
- Balance quality vs cost
- Reduce costs by up to 40% while maintaining quality
### Agentic Retrieval (Azure AI Search Integration)
```python
import json  # needed below to parse tool-call arguments

from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential
# Initialize search client
search_client = SearchClient(
endpoint=os.getenv("SEARCH_ENDPOINT"),
index_name="documents",
credential=AzureKeyCredential(os.getenv("SEARCH_KEY"))
)
# Agentic retrieval with Azure OpenAI
response = client.chat.completions.create(
model="gpt-5",
messages=[
{"role": "system", "content": "You have access to a document search system."},
{"role": "user", "content": "What are the company's revenue projections for Q3?"}
],
tools=[{
"type": "function",
"function": {
"name": "search_documents",
"description": "Search company documents",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query"}
},
"required": ["query"]
}
}
}],
tool_choice="auto"
)
# Process tool calls
if response.choices[0].message.tool_calls:
for tool_call in response.choices[0].message.tool_calls:
if tool_call.function.name == "search_documents":
query = json.loads(tool_call.function.arguments)["query"]
results = search_client.search(query)
# Feed results back to model for final answer
```
**Improvements:**
- 40% better on complex, multi-part questions
- Automatic query decomposition
- Relevance ranking
- Citation generation
### Foundry Observability (Preview)
```python
from azure.ai.foundry import FoundryObservability  # illustrative import; verify the current Foundry SDK package name
# Enable observability
observability = FoundryObservability(
workspace_id=os.getenv("AI_FOUNDRY_WORKSPACE_ID"),
enable_tracing=True,
enable_metrics=True
)
# Monitor agent execution
with observability.trace_agent("customer_support_agent") as trace:
response = client.chat.completions.create(
model="gpt-5",
messages=messages
)
trace.log_tool_call("search_kb", {"query": "refund policy"})
trace.log_reasoning_step("Retrieved refund policy document")
trace.log_token_usage(response.usage.total_tokens)
# View in Azure AI Foundry portal:
# - End-to-end trace logs
# - Reasoning steps and tool calls
# - Performance metrics
# - Cost analysis
```
## Capacity and Quota Management
### Check Quota
```bash
# List deployments with usage
az cognitiveservices account deployment list \
--resource-group MyRG \
--name myopenai \
--output table
# Check usage metrics
az monitor metrics list \
--resource $(az cognitiveservices account show -g MyRG -n myopenai --query id -o tsv) \
--metric "TokenTransaction" \
--start-time 2025-01-01T00:00:00Z \
--end-time 2025-01-31T23:59:59Z \
--interval PT1H \
--aggregation Total
```
### Update Capacity
```bash
# Scale up deployment capacity
az cognitiveservices account deployment update \
--resource-group MyRG \
--name myopenai \
--deployment-name gpt-5 \
--sku-capacity 200
# Scale down during off-peak
az cognitiveservices account deployment update \
--resource-group MyRG \
--name myopenai \
--deployment-name gpt-5 \
--sku-capacity 50
```
### Request Quota Increase
1. Navigate to Azure Portal → Azure OpenAI resource
2. Go to "Quotas" blade
3. Select model and region
4. Click "Request quota increase"
5. Provide justification and target capacity
## Security and Networking
### Private Endpoint
```bash
# Create private endpoint
az network private-endpoint create \
--name openai-private-endpoint \
--resource-group MyRG \
--vnet-name MyVNet \
--subnet PrivateEndpointSubnet \
--private-connection-resource-id $(az cognitiveservices account show -g MyRG -n myopenai --query id -o tsv) \
--group-id account \
--connection-name openai-connection
# Create private DNS zone
az network private-dns zone create \
--resource-group MyRG \
--name privatelink.openai.azure.com
# Link to VNet
az network private-dns link vnet create \
--resource-group MyRG \
--zone-name privatelink.openai.azure.com \
--name openai-dns-link \
--virtual-network MyVNet \
--registration-enabled false
# Create DNS zone group
az network private-endpoint dns-zone-group create \
--resource-group MyRG \
--endpoint-name openai-private-endpoint \
--name default \
--private-dns-zone privatelink.openai.azure.com \
--zone-name privatelink.openai.azure.com
```
### Managed Identity Access
```bash
# Enable system-assigned identity
az cognitiveservices account identity assign \
--name myopenai \
--resource-group MyRG
# Grant role to managed identity
PRINCIPAL_ID=$(az cognitiveservices account show -g MyRG -n myopenai --query identity.principalId -o tsv)
az role assignment create \
--assignee $PRINCIPAL_ID \
--role "Cognitive Services OpenAI User" \
--scope /subscriptions/<sub-id>/resourceGroups/MyRG
```
### Content Filtering
```bash
# Configure content filtering (illustrative; content filters are typically
# managed as RAI policies in Azure AI Foundry rather than via this property)
az cognitiveservices account update \
--name myopenai \
--resource-group MyRG \
--set properties.customContentFilter='{
"hate": {"severity": "medium", "enabled": true},
"violence": {"severity": "medium", "enabled": true},
"sexual": {"severity": "medium", "enabled": true},
"selfHarm": {"severity": "high", "enabled": true}
}'
```
## Cost Optimization
### Model Selection Strategy
**Use GPT-5-mini or GPT-5-nano for:**
- Simple questions
- Classification tasks
- Content moderation
- Summarization
**Use GPT-5 or GPT-4.1 for:**
- Complex reasoning
- Long-form content generation
- Document analysis
- Code generation
**Use Reasoning Models (o3, o4-mini) for:**
- Mathematical problems
- Scientific analysis
- Step-by-step reasoning
- Logic puzzles
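The routing guidance above can be captured in a small dispatch helper (a sketch; the deployment names are the examples used in this document, and the task categories are assumptions you would tune for your own workload):

```python
# Illustrative model-selection helper; deployment names must match
# the deployments you actually created ("gpt-5-mini", "o3-reasoning",
# "gpt-5" are the examples used in this document).
SIMPLE_TASKS = {"classification", "moderation", "summarization", "faq"}
REASONING_TASKS = {"math", "science", "logic"}

def pick_deployment(task_type: str) -> str:
    """Map a coarse task category to a cost-appropriate deployment."""
    if task_type in SIMPLE_TASKS:
        return "gpt-5-mini"      # cheap and fast
    if task_type in REASONING_TASKS:
        return "o3-reasoning"    # step-by-step reasoning
    return "gpt-5"               # default: complex generation/analysis

print(pick_deployment("moderation"))  # gpt-5-mini
```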
### Implement Caching
```python
# Use semantic cache to reduce duplicate requests
from azure.ai.cache import SemanticCache  # illustrative package; substitute your semantic-cache implementation
cache = SemanticCache(
similarity_threshold=0.95,
ttl_seconds=3600
)
# Check cache before API call
cached_response = cache.get(user_query)
if cached_response:
return cached_response
response = client.chat.completions.create(
model="gpt-5",
messages=messages
)
cache.set(user_query, response)
```
### Token Management
```python
import tiktoken
# Count tokens before API call
encoding = tiktoken.get_encoding("o200k_base")  # GPT-4o/GPT-5-era models; older models use cl100k_base
tokens = len(encoding.encode(prompt))
if tokens > 100000:
print(f"Warning: Prompt has {tokens} tokens, this will be expensive!")
# Use shorter max_tokens when appropriate
response = client.chat.completions.create(
model="gpt-5",
messages=messages,
max_tokens=500 # Limit output tokens
)
```
## Monitoring and Alerts
### Set Up Cost Alerts
```bash
# Create budget alert
az consumption budget create \
--budget-name openai-monthly-budget \
--resource-group MyRG \
--amount 1000 \
--category Cost \
--time-grain Monthly \
--start-date 2025-01-01 \
--end-date 2025-12-31 \
--notifications '{
"actual_GreaterThan_80_Percent": {
"enabled": true,
"operator": "GreaterThan",
"threshold": 80,
"contactEmails": ["billing@example.com"]
}
}'
```
### Application Insights Integration
```python
from opencensus.ext.azure.log_exporter import AzureLogHandler
import logging
# Configure logging
logger = logging.getLogger(__name__)
logger.addHandler(AzureLogHandler(
connection_string=os.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING")
))
# Log API calls
logger.info("OpenAI API call", extra={
"custom_dimensions": {
"model": "gpt-5",
"tokens": response.usage.total_tokens,
"cost": calculate_cost(response.usage.total_tokens),
"latency_ms": response.response_ms
}
})
```
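The `calculate_cost` helper used in the logging example above is left undefined; a minimal sketch follows (the per-token rates are placeholders, not published prices, and real billing charges prompt and completion tokens at different rates):

```python
# Hypothetical per-1K-token blended rates; substitute current published pricing.
PRICE_PER_1K_TOKENS = {
    "gpt-5": 0.01,
    "gpt-5-mini": 0.002,
}

def calculate_cost(total_tokens: int, model: str = "gpt-5") -> float:
    """Blended USD cost estimate for a given token count."""
    rate = PRICE_PER_1K_TOKENS.get(model, PRICE_PER_1K_TOKENS["gpt-5"])
    return round(total_tokens / 1000 * rate, 6)

print(calculate_cost(12_500))  # 0.125
```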
## Best Practices
✓ **Use Model Router** for automatic cost optimization
✓ **Implement caching** to reduce duplicate requests
✓ **Monitor token usage** and set budgets
✓ **Use private endpoints** for production workloads
✓ **Enable managed identity** instead of API keys
✓ **Configure content filtering** for safety
✓ **Right-size capacity** based on actual demand
✓ **Use Foundry Observability** for monitoring
✓ **Implement retry logic** with exponential backoff
✓ **Choose appropriate models** for task complexity
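The retry recommendation can be sketched as a small wrapper (an illustration, not an SDK feature; in real code you would catch the SDK's rate-limit and transient error types rather than bare `Exception`):

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Invoke `call` with exponential backoff plus jitter.

    Retries on any exception here for brevity; restrict this to
    throttling/transient errors (HTTP 429/5xx) in production code.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay * 0.1))

# usage: with_retries(lambda: client.chat.completions.create(...))
```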
## References
- [Azure OpenAI Documentation](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/)
- [What's New in Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/whats-new)
- [GPT-5 Announcement](https://azure.microsoft.com/en-us/blog/gpt-5-azure/)
- [Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-foundry/)
- [Model Pricing](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/)
Azure OpenAI Service with GPT-5 and reasoning models brings enterprise-grade AI to your applications!

---
name: azure-well-architected-framework
description: "Comprehensive Azure Well-Architected Framework knowledge covering the five pillars: Reliability, Security, Cost Optimization, Operational Excellence, and Performance Efficiency. Provides design principles, best practices, and implementation guidance for building robust Azure solutions."
---
# Azure Well-Architected Framework
The Azure Well-Architected Framework is a set of guiding tenets for building high-quality cloud solutions. It consists of five pillars of architectural excellence.
## Overview
**Purpose**: Help architects and engineers build secure, high-performing, resilient, and efficient infrastructure for applications.
**The Five Pillars**:
1. Reliability
2. Security
3. Cost Optimization
4. Operational Excellence
5. Performance Efficiency
## Pillar 1: Reliability
**Definition**: The ability of a system to recover from failures and continue to function.
**Key Principles**:
- Design for failure
- Use availability zones and regions
- Implement redundancy
- Monitor and respond to failures
- Test disaster recovery
**Best Practices**:
**Availability Zones:**
```bash
# Deploy VM across availability zones
az vm create \
--resource-group MyRG \
--name MyVM \
--zone 1 \
--image Ubuntu2204 \
--size Standard_D2s_v3
# Availability SLAs:
# - Single VM (Premium SSD): 99.9%
# - Availability Set: 99.95%
# - Availability Zones: 99.99%
```
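When several components sit in series in a request path, their SLAs multiply; a quick sketch (the 99.99% load-balancer figure is an assumption for illustration):

```python
# Composite availability for components in series: multiply individual SLAs.
def composite_sla(*slas: float) -> float:
    result = 1.0
    for s in slas:
        result *= s
    return result

# Zone-redundant VMs (99.99%) behind a load balancer (assumed 99.99%):
print(round(composite_sla(0.9999, 0.9999) * 100, 4))
```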
**Backup and Disaster Recovery:**
```bash
# Enable Azure Backup
az backup protection enable-for-vm \
--resource-group MyRG \
--vault-name MyVault \
--vm MyVM \
--policy-name DefaultPolicy
# Recovery Point Objective (RPO): How much data loss is acceptable
# Recovery Time Objective (RTO): How long can system be down
```
**Health Probes:**
- Application Gateway health probes
- Load Balancer probes
- Traffic Manager endpoint monitoring
## Pillar 2: Security
**Definition**: Protecting applications and data from threats.
**Key Principles**:
- Defense in depth
- Least privilege access
- Secure the network
- Protect data at rest and in transit
- Monitor and audit
**Best Practices**:
**Identity and Access:**
```bash
# Use managed identities (no credentials in code)
az vm identity assign \
--resource-group MyRG \
--name MyVM
# RBAC assignment
az role assignment create \
--assignee <principal-id> \
--role "Contributor" \
--scope /subscriptions/<subscription-id>/resourceGroups/MyRG
```
**Network Security:**
- Use Network Security Groups (NSGs)
- Implement Azure Firewall or Application Gateway WAF
- Use Private Endpoints for PaaS services
- Enable DDoS Protection Standard for public-facing apps
**Data Protection:**
```bash
# Enable encryption at rest (automatic for most services)
# Enable TLS 1.2+ for data in transit
# Azure Storage encryption
az storage account update \
--name mystorageaccount \
--resource-group MyRG \
--min-tls-version TLS1_2 \
--https-only true
```
**Security Monitoring:**
```bash
# Enable Microsoft Defender for Cloud
az security pricing create \
--name VirtualMachines \
--tier Standard
# Enable Microsoft Sentinel (requires the sentinel CLI extension;
# subcommand names vary between extension versions)
az sentinel onboard \
  --resource-group MyRG \
  --workspace-name MyWorkspace
```
## Pillar 3: Cost Optimization
**Definition**: Managing costs to maximize the value delivered.
**Key Principles**:
- Plan and estimate costs
- Provision with optimization
- Use monitoring and analytics
- Maximize efficiency of cloud spend
**Best Practices**:
**Right-Sizing:**
```bash
# Use Azure Advisor recommendations
az advisor recommendation list \
--category Cost \
--output table
# Common optimizations:
# 1. Shutdown dev/test VMs when not in use
# 2. Use Azure Hybrid Benefit for Windows/SQL
# 3. Purchase reservations for consistent workloads
# 4. Use autoscaling to match demand
```
**Reserved Instances:**
- 1-year or 3-year commitment
- Save up to 72% vs pay-as-you-go
- Available for VMs, SQL Database, Cosmos DB, Synapse, Storage
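The headline figure above translates into a simple comparison (a sketch; the 72% discount is the best case, and actual savings depend on VM family, region, and term):

```python
# Illustrative reservation-savings check; the discount rate is an
# assumption, not a quote for any specific SKU.
def reservation_savings(payg_monthly: float, discount: float = 0.72) -> dict:
    """Monthly cost under a reservation at the stated discount vs pay-as-you-go."""
    reserved = payg_monthly * (1 - discount)
    return {"reserved_monthly": round(reserved, 2),
            "monthly_savings": round(payg_monthly - reserved, 2)}

print(reservation_savings(1000.0))
```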
**Azure Hybrid Benefit:**
```bash
# Apply Windows license to VM
az vm update \
--resource-group MyRG \
--name MyVM \
--license-type Windows_Server
# SQL Server Hybrid Benefit
az sql vm create \
--resource-group MyRG \
--name MySQLVM \
--license-type AHUB
```
**Cost Management:**
```bash
# Create budget
az consumption budget create \
--budget-name MyBudget \
--category cost \
--amount 1000 \
--time-grain monthly \
--start-date 2025-01-01 \
--end-date 2025-12-31
# Set up alerts at 80%, 100%, 120% of budget
```
## Pillar 4: Operational Excellence
**Definition**: Operations processes that keep a system running in production.
**Key Principles**:
- Automate operations
- Monitor and gain insights
- Refine operations procedures
- Anticipate failure
- Stay current with updates
**Best Practices**:
**Infrastructure as Code:**
```bash
# Use ARM, Bicep, or Terraform
# Version control all infrastructure
# Implement CI/CD for infrastructure
# Example: Bicep deployment
az deployment group create \
--resource-group MyRG \
--template-file main.bicep \
--parameters @parameters.json
```
**Monitoring and Alerting:**
```bash
# Application Insights for apps
az monitor app-insights component create \
--app MyApp \
--location eastus \
--resource-group MyRG
# Log Analytics for infrastructure
az monitor log-analytics workspace create \
--resource-group MyRG \
--workspace-name MyWorkspace
# Create alerts
az monitor metrics alert create \
--name HighCPU \
--resource-group MyRG \
--scopes <vm-id> \
--condition "avg Percentage CPU > 80" \
--description "CPU usage is above 80%"
```
**DevOps Practices:**
- Continuous Integration/Continuous Deployment (CI/CD)
- Blue-green deployments
- Canary releases
- Feature flags
- Automated testing
## Pillar 5: Performance Efficiency
**Definition**: The ability of a system to adapt to changes in load.
**Key Principles**:
- Scale horizontally
- Choose the right resources
- Monitor performance
- Optimize network and data access
**Best Practices**:
**Scaling:**
```bash
# Horizontal scaling (preferred)
# VM Scale Sets
az vmss create \
--resource-group MyRG \
--name MyVMSS \
--image Ubuntu2204 \
--instance-count 3 \
--vm-sku Standard_D2s_v3
# Autoscaling
az monitor autoscale create \
--resource-group MyRG \
--resource MyVMSS \
--resource-type Microsoft.Compute/virtualMachineScaleSets \
--name MyAutoscale \
--min-count 2 \
--max-count 10
```
**Caching:**
- Azure Cache for Redis
- Azure CDN for static content
- Application-level caching
**Data Access:**
- Use indexes on databases
- Implement caching strategies
- Use CDN for global content delivery
- Optimize queries (SQL, Cosmos DB)
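A minimal cache-aside sketch illustrating the strategy above (in-process and illustrative only; production workloads on Azure would typically use Azure Cache for Redis):

```python
import time

class TTLCache:
    """Minimal in-process TTL cache showing the cache-aside pattern."""
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[0] < time.monotonic():
            self._store.pop(key, None)  # evict expired entries lazily
            return None
        return entry[1]

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=300)
cache.set("user:42", {"name": "Ada"})
print(cache.get("user:42"))  # {'name': 'Ada'}
```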
**Networking:**
```bash
# Use Azure Front Door for global apps
az afd profile create \
--profile-name MyFrontDoor \
--resource-group MyRG \
--sku Premium_AzureFrontDoor
# Features:
# - Global load balancing
# - CDN capabilities
# - Web Application Firewall
# - SSL offloading
# - Caching
```
## Assessment and Tools
**Azure Well-Architected Review:**
```bash
# Self-assessment tool in Azure Portal
# Generates recommendations per pillar
# Provides actionable guidance
```
**Azure Advisor:**
```bash
# Get recommendations
az advisor recommendation list --output table
# Categories:
# - Reliability (High Availability)
# - Security
# - Performance
# - Cost
# - Operational Excellence
```
## Implementation Checklist
**Reliability:**
- [ ] Deploy across availability zones
- [ ] Implement backup strategy
- [ ] Define RTO and RPO
- [ ] Test disaster recovery
- [ ] Implement health monitoring
**Security:**
- [ ] Enable Azure AD authentication
- [ ] Implement RBAC (least privilege)
- [ ] Encrypt data at rest and in transit
- [ ] Enable Microsoft Defender for Cloud
- [ ] Implement network segmentation (NSGs, Firewall)
- [ ] Use Key Vault for secrets
**Cost Optimization:**
- [ ] Right-size resources
- [ ] Purchase reservations for predictable workloads
- [ ] Enable autoscaling
- [ ] Use Azure Hybrid Benefit
- [ ] Implement budget alerts
- [ ] Review Azure Advisor cost recommendations
**Operational Excellence:**
- [ ] Implement Infrastructure as Code
- [ ] Set up CI/CD pipelines
- [ ] Enable comprehensive monitoring
- [ ] Create operational runbooks
- [ ] Implement automated alerting
- [ ] Use tags for resource organization
**Performance Efficiency:**
- [ ] Choose appropriate resource SKUs
- [ ] Implement autoscaling
- [ ] Use caching (Redis, CDN)
- [ ] Optimize database queries
- [ ] Implement load balancing
- [ ] Monitor performance metrics
## Common Patterns
**Highly Available Web Application:**
- Application Gateway (WAF enabled)
- App Service (Premium tier, multiple instances)
- Azure SQL Database (Zone-redundant)
- Azure Cache for Redis
- Application Insights
- Azure Front Door (global distribution)
**Mission-Critical Application:**
- Multi-region deployment
- Traffic Manager or Front Door (global routing)
- Availability Zones in each region
- Geo-redundant storage (GRS or RA-GRS)
- Automated backups with geo-replication
- Comprehensive monitoring and alerting
**Cost-Optimized Dev/Test:**
- Auto-shutdown for VMs
- B-series (burstable) VMs
- Dev/Test pricing tiers
- Shared App Service plans
- Azure DevTest Labs
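As a rough illustration of why auto-shutdown is listed first, a back-of-the-envelope sketch (the hourly rate is a made-up placeholder, not a published Azure price):

```python
def monthly_vm_cost(hourly_rate: float, hours_per_day: float, days: int = 30) -> float:
    """Estimate monthly VM spend for a given daily runtime."""
    return round(hourly_rate * hours_per_day * days, 2)

RATE = 0.20  # hypothetical $/hour for a small dev VM

always_on = monthly_vm_cost(RATE, 24)   # runs around the clock
work_hours = monthly_vm_cost(RATE, 10)  # auto-shutdown outside 10 working hours
print(always_on, work_hours)  # 144.0 60.0 -> roughly 58% saved
```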
## References
- **Official Framework**: https://learn.microsoft.com/en-us/azure/well-architected/
- **Azure Advisor**: https://portal.azure.com/#blade/Microsoft_Azure_Expert/AdvisorMenuBlade/overview
- **Well-Architected Review**: https://learn.microsoft.com/en-us/assessments/azure-architecture-review/
- **Architecture Center**: https://learn.microsoft.com/en-us/azure/architecture/
## Key Takeaways
1. **Balance the Pillars**: Trade-offs exist between pillars (e.g., cost vs. reliability)
2. **Continuous Improvement**: Architecture is not static, revisit regularly
3. **Measure and Monitor**: Use data to drive decisions
4. **Automation**: Automate repetitive tasks to improve reliability and reduce costs
5. **Security First**: Integrate security into every layer of architecture
The Well-Architected Framework provides a consistent approach to evaluating architectures and implementing designs that scale over time.


## 🚨 CRITICAL GUIDELINES
### Windows File Path Requirements
**MANDATORY: Always Use Backslashes on Windows for File Paths**
When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`).
**Examples:**
- ❌ WRONG: `D:/repos/project/file.tsx`
- ✅ CORRECT: `D:\repos\project\file.tsx`
This applies to:
- Edit tool file_path parameter
- Write tool file_path parameter
- All file operations on Windows systems
### Documentation Guidelines
**NEVER create new documentation files unless explicitly requested by the user.**
- **Priority**: Update existing README.md files rather than creating new documentation
- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise
- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone
- **User preference**: Only create additional .md files when user specifically asks for documentation
---
# Azure Container Apps GPU Support - 2025 Features
Complete knowledge base for Azure Container Apps with GPU support, serverless capabilities, and Dapr integration (2025 GA features).
## Overview
Azure Container Apps is a serverless container platform with native GPU support, Dapr integration, and scale-to-zero capabilities for cost-efficient AI/ML workloads.
## Key 2025 Features (Build Announcements)
### 1. Serverless GPU (GA)
- **Automatic scaling**: Scale GPU workloads based on demand
- **Scale-to-zero**: Pay only when GPU is actively used
- **Per-second billing**: Granular cost control
- **Optimized cold start**: Fast initialization for AI models
- **Reduced operational overhead**: No infrastructure management
### 2. Dedicated GPU (GA)
- **Consistent performance**: Dedicated GPU resources
- **Simplified AI deployment**: Easy model hosting
- **Long-running workloads**: Ideal for training and continuous inference
- **Multiple GPU types**: NVIDIA A100, T4, and more
### 3. Dynamic Sessions with GPU (Early Access)
- **Sandboxed execution**: Run untrusted AI-generated code
- **Hyper-V isolation**: Enhanced security
- **GPU-powered Python interpreter**: Handle compute-intensive AI workloads
- **Scale at runtime**: Dynamic resource allocation
### 4. Foundry Models Integration
- **Deploy AI models directly**: During container app creation
- **Ready-to-use models**: Pre-configured inference endpoints
- **Azure AI Foundry**: Seamless integration
### 5. Workflow with Durable Task Scheduler (Preview)
- **Long-running workflows**: Reliable orchestration
- **State management**: Automatic persistence
- **Event-driven**: Trigger workflows from events
### 6. Native Azure Functions Support
- **Functions runtime**: Run Azure Functions in Container Apps
- **Consistent development**: Same code, serverless execution
- **Event triggers**: All Functions triggers supported
### 7. Dapr Integration (GA)
- **Service discovery**: Built-in DNS-based discovery
- **State management**: Distributed state stores
- **Pub/sub messaging**: Reliable messaging patterns
- **Service invocation**: Resilient service-to-service calls
- **Observability**: Integrated tracing and metrics
## Creating Container Apps with GPU
### Basic Container App with Serverless GPU
```bash
# Create Container Apps environment
az containerapp env create \
--name myenv \
--resource-group MyRG \
--location eastus \
--logs-workspace-id <workspace-id> \
--logs-workspace-key <workspace-key>
# Create Container App with GPU
az containerapp create \
--name myapp-gpu \
--resource-group MyRG \
--environment myenv \
--image myregistry.azurecr.io/ai-model:latest \
--cpu 4 \
--memory 8Gi \
--gpu-type nvidia-a100 \
--gpu-count 1 \
--min-replicas 0 \
--max-replicas 10 \
--ingress external \
--target-port 8080
```
### Production-Ready Container App with GPU
```bash
# Note: bash does not allow comments between line continuations,
# so all flags are grouped in one continuous command
az containerapp create \
  --name myapp-gpu-prod \
  --resource-group MyRG \
  --environment myenv \
  --image myregistry.azurecr.io/ai-model:latest \
  --registry-server myregistry.azurecr.io \
  --registry-identity system \
  --cpu 4 \
  --memory 8Gi \
  --gpu-type nvidia-a100 \
  --gpu-count 1 \
  --min-replicas 0 \
  --max-replicas 20 \
  --scale-rule-name http-scaling \
  --scale-rule-type http \
  --scale-rule-http-concurrency 10 \
  --ingress external \
  --target-port 8080 \
  --transport http2 \
  --env-vars "AZURE_CLIENT_ID=secretref:client-id" \
  --enable-dapr \
  --dapr-app-id myapp \
  --dapr-app-port 8080 \
  --dapr-app-protocol http \
  --system-assigned
```
## Container Apps Environment Configuration
### Environment with Zone Redundancy
```bash
az containerapp env create \
--name myenv-prod \
--resource-group MyRG \
--location eastus \
--logs-workspace-id <workspace-id> \
--logs-workspace-key <workspace-key> \
--zone-redundant true \
--enable-workload-profiles true
```
### Workload Profiles (Dedicated GPU)
```bash
# Create environment with workload profiles
az containerapp env create \
--name myenv-gpu \
--resource-group MyRG \
--location eastus \
--enable-workload-profiles true
# Add GPU workload profile
az containerapp env workload-profile add \
--name myenv-gpu \
--resource-group MyRG \
--workload-profile-name gpu-profile \
--workload-profile-type GPU-A100 \
--min-nodes 0 \
--max-nodes 10
# Create container app with GPU profile
az containerapp create \
--name myapp-dedicated-gpu \
--resource-group MyRG \
--environment myenv-gpu \
--workload-profile-name gpu-profile \
--image myregistry.azurecr.io/training-job:latest \
--cpu 8 \
--memory 16Gi \
--min-replicas 1 \
--max-replicas 5
```
## GPU Scaling Rules
### Custom Prometheus Scaling
```bash
az containerapp create \
--name myapp-gpu-prometheus \
--resource-group MyRG \
--environment myenv \
--image myregistry.azurecr.io/ai-model:latest \
--cpu 4 \
--memory 8Gi \
--gpu-type nvidia-a100 \
--gpu-count 1 \
--min-replicas 0 \
--max-replicas 10 \
--scale-rule-name gpu-utilization \
--scale-rule-type custom \
--scale-rule-custom-type prometheus \
--scale-rule-metadata \
serverAddress=http://prometheus.monitoring.svc.cluster.local:9090 \
metricName=gpu_utilization \
threshold=80 \
query="avg(nvidia_gpu_utilization{app='myapp'})"
```
### Queue-Based Scaling (Azure Service Bus)
```bash
az containerapp create \
--name myapp-queue-processor \
--resource-group MyRG \
--environment myenv \
--image myregistry.azurecr.io/batch-processor:latest \
--cpu 4 \
--memory 8Gi \
--gpu-type nvidia-t4 \
--gpu-count 1 \
--min-replicas 0 \
--max-replicas 50 \
--scale-rule-name queue-scaling \
--scale-rule-type azure-servicebus \
--scale-rule-metadata \
queueName=ai-jobs \
namespace=myservicebus \
messageCount=5 \
--scale-rule-auth connection=servicebus-connection
```
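The `messageCount=5` metadata is the target number of queued messages per replica: the underlying KEDA scaler roughly computes desired replicas as the queue length divided by that target, rounded up and clamped to the configured bounds. A small sketch of that arithmetic (an approximation of KEDA's behavior, not its exact implementation):

```python
import math

def desired_replicas(queue_length: int, messages_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Approximate KEDA-style scaling: ceil(queue / target), clamped to bounds."""
    if queue_length == 0:
        return min_replicas  # scale-to-zero when min_replicas is 0
    raw = math.ceil(queue_length / messages_per_replica)
    return max(min_replicas, min(raw, max_replicas))

print(desired_replicas(0, 5, 0, 50))     # 0 (scaled to zero)
print(desired_replicas(12, 5, 0, 50))    # 3
print(desired_replicas(1000, 5, 0, 50))  # 50 (capped at max-replicas)
```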
## Dapr Integration
### Enable Dapr on Container App
```bash
az containerapp create \
--name myapp-dapr \
--resource-group MyRG \
--environment myenv \
--image myregistry.azurecr.io/myapp:latest \
--enable-dapr \
--dapr-app-id myapp \
--dapr-app-port 8080 \
--dapr-app-protocol http \
--dapr-http-max-request-size 4 \
--dapr-http-read-buffer-size 4 \
--dapr-log-level info \
--dapr-enable-api-logging true
```
### Dapr State Store (Azure Cosmos DB)
```yaml
# Create Dapr component for state store
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
name: statestore
spec:
type: state.azure.cosmosdb
version: v1
metadata:
- name: url
value: "https://mycosmosdb.documents.azure.com:443/"
- name: masterKey
secretRef: cosmosdb-key
- name: database
value: "mydb"
- name: collection
value: "state"
```
```bash
# Create the component
az containerapp env dapr-component set \
--name myenv \
--resource-group MyRG \
--dapr-component-name statestore \
--yaml component.yaml
```
### Dapr Pub/Sub (Azure Service Bus)
```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
name: pubsub
spec:
type: pubsub.azure.servicebus.topics
version: v1
metadata:
- name: connectionString
secretRef: servicebus-connection
- name: consumerID
value: "myapp"
```
### Service-to-Service Invocation
```python
# Python example using Dapr SDK
from dapr.clients import DaprClient
with DaprClient() as client:
# Invoke another service
response = client.invoke_method(
app_id='other-service',
method_name='process',
data='{"input": "data"}'
)
# Save state
client.save_state(
store_name='statestore',
key='mykey',
value='myvalue'
)
# Publish message
client.publish_event(
pubsub_name='pubsub',
topic_name='orders',
data='{"orderId": "123"}'
)
```
## AI Model Deployment Patterns
### OpenAI-Compatible Endpoint
```dockerfile
# Dockerfile for vLLM model serving
FROM vllm/vllm-openai:latest
# Note: exec-form CMD does not expand environment variables,
# so pass literal values (or wrap the command in an entrypoint script)
CMD ["--model", "meta-llama/Llama-3.1-8B-Instruct", \
     "--gpu-memory-utilization", "0.9", \
     "--max-model-len", "4096", \
     "--port", "8080"]
```
```bash
# Deploy vLLM model
az containerapp create \
--name llama-inference \
--resource-group MyRG \
--environment myenv \
--image vllm/vllm-openai:latest \
--cpu 8 \
--memory 32Gi \
--gpu-type nvidia-a100 \
--gpu-count 1 \
--min-replicas 1 \
--max-replicas 5 \
--target-port 8080 \
--ingress external \
--env-vars \
MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct" \
GPU_MEMORY_UTILIZATION="0.9" \
HF_TOKEN=secretref:huggingface-token
```
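Once deployed, the app serves an OpenAI-compatible API at its ingress FQDN. A minimal client sketch using only the standard library (the base URL is a placeholder — substitute your Container App's actual FQDN):

```python
import json
import urllib.request

# Placeholder: replace with your Container App ingress FQDN
BASE_URL = "https://llama-inference.example.eastus.azurecontainerapps.io"

def build_chat_request(prompt: str,
                       model: str = "meta-llama/Llama-3.1-8B-Instruct") -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize availability zones in one sentence.")
# response = urllib.request.urlopen(req)  # uncomment against a live deployment
print(req.full_url.endswith("/v1/chat/completions"))  # True
```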
### Stable Diffusion Image Generation
```bash
az containerapp create \
--name stable-diffusion \
--resource-group MyRG \
--environment myenv \
--image myregistry.azurecr.io/stable-diffusion:latest \
--cpu 4 \
--memory 16Gi \
--gpu-type nvidia-a100 \
--gpu-count 1 \
--min-replicas 0 \
--max-replicas 10 \
--target-port 7860 \
--ingress external \
--scale-rule-name http-scaling \
--scale-rule-type http \
--scale-rule-http-concurrency 1
```
### Batch Processing Job
```bash
az containerapp job create \
--name batch-training-job \
--resource-group MyRG \
--environment myenv \
--trigger-type Manual \
--image myregistry.azurecr.io/training:latest \
--cpu 8 \
--memory 32Gi \
--gpu-type nvidia-a100 \
--gpu-count 2 \
--parallelism 1 \
--replica-timeout 7200 \
--replica-retry-limit 3 \
--env-vars \
DATASET_URL="https://mystorage.blob.core.windows.net/datasets/train.csv" \
MODEL_OUTPUT="https://mystorage.blob.core.windows.net/models/" \
EPOCHS="100"
# Execute job
az containerapp job start \
--name batch-training-job \
--resource-group MyRG
```
## Monitoring and Observability
### Application Insights Integration
```bash
az containerapp create \
--name myapp-monitored \
--resource-group MyRG \
--environment myenv \
--image myregistry.azurecr.io/myapp:latest \
--env-vars \
APPLICATIONINSIGHTS_CONNECTION_STRING=secretref:appinsights-connection
```
### Query Logs
```bash
# Stream logs
az containerapp logs show \
--name myapp-gpu \
--resource-group MyRG \
--follow
# Query with Log Analytics
az monitor log-analytics query \
--workspace <workspace-id> \
--analytics-query "ContainerAppConsoleLogs_CL | where ContainerAppName_s == 'myapp-gpu' | take 100"
```
### Metrics and Alerts
```bash
# Alert on request volume (GPU utilization is not a built-in platform metric;
# scrape it via Prometheus if you need GPU-based alerts)
az monitor metrics alert create \
  --name high-request-volume \
--resource-group MyRG \
--scopes $(az containerapp show -g MyRG -n myapp-gpu --query id -o tsv) \
--condition "avg Requests > 100" \
--window-size 5m \
--evaluation-frequency 1m \
--action <action-group-id>
```
## Security Best Practices
### Managed Identity
```bash
# Create with system-assigned identity
az containerapp create \
--name myapp-identity \
--resource-group MyRG \
--environment myenv \
--system-assigned \
--image myregistry.azurecr.io/myapp:latest
# Get identity principal ID
IDENTITY_ID=$(az containerapp show -g MyRG -n myapp-identity --query identity.principalId -o tsv)
# Assign role to access Key Vault
az role assignment create \
--assignee $IDENTITY_ID \
--role "Key Vault Secrets User" \
--scope /subscriptions/<sub-id>/resourceGroups/MyRG/providers/Microsoft.KeyVault/vaults/mykeyvault
# Use user-assigned identity
az identity create --name myapp-identity --resource-group MyRG
IDENTITY_RESOURCE_ID=$(az identity show -g MyRG -n myapp-identity --query id -o tsv)
az containerapp create \
--name myapp-user-identity \
--resource-group MyRG \
--environment myenv \
--user-assigned $IDENTITY_RESOURCE_ID \
--image myregistry.azurecr.io/myapp:latest
```
### Secret Management
```bash
# Add secrets
az containerapp secret set \
--name myapp-gpu \
--resource-group MyRG \
--secrets \
huggingface-token="<token>" \
api-key="<key>"
# Reference secrets in environment variables
az containerapp update \
--name myapp-gpu \
--resource-group MyRG \
--set-env-vars \
HF_TOKEN=secretref:huggingface-token \
API_KEY=secretref:api-key
```
## Cost Optimization
### Scale-to-Zero Configuration
```bash
az containerapp create \
--name myapp-scale-zero \
--resource-group MyRG \
--environment myenv \
--image myregistry.azurecr.io/myapp:latest \
--min-replicas 0 \
--max-replicas 10 \
--scale-rule-name http-scaling \
--scale-rule-type http \
--scale-rule-http-concurrency 10
```
**Cost savings**: Pay only when requests are being processed. GPU costs are per-second when active.
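The per-second model is easy to reason about with a quick sketch (the hourly rate below is a hypothetical placeholder, not a published Azure price — check the pricing page for your GPU SKU and region):

```python
def gpu_cost(active_seconds: int, hourly_rate: float) -> float:
    """Per-second GPU billing: pay only for seconds a replica is active."""
    return round(active_seconds * hourly_rate / 3600, 2)

HOURLY_RATE = 3.50  # hypothetical $/GPU-hour

always_on = gpu_cost(24 * 3600, HOURLY_RATE)     # replica pinned up all day
scale_to_zero = gpu_cost(2 * 3600, HOURLY_RATE)  # 2 hours of actual traffic
print(always_on, scale_to_zero)  # 84.0 7.0
```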
### Right-Sizing Resources
```bash
# Start with minimal resources
--cpu 2 --memory 4Gi --gpu-count 1
# Monitor and adjust based on actual usage
az monitor metrics list \
--resource $(az containerapp show -g MyRG -n myapp-gpu --query id -o tsv) \
--metric "CpuPercentage,MemoryPercentage"
```
### Use Spot/Preemptible GPUs (Future Feature)
When available, configure spot instances for non-critical workloads to save up to 80% on GPU costs.
## Troubleshooting
### Check Revision Status
```bash
az containerapp revision list \
--name myapp-gpu \
--resource-group MyRG \
--output table
```
### View Revision Details
```bash
az containerapp revision show \
  --name myapp-gpu \
  --revision <revision-name> \
  --resource-group MyRG
```
### Restart Container App
```bash
az containerapp revision restart \
  --name myapp-gpu \
  --resource-group MyRG \
  --revision <revision-name>
```
### GPU Not Available
If GPU is not provisioning:
1. Check region availability: Not all regions support GPU
2. Verify quota: Request quota increase if needed
3. Check workload profile: Ensure GPU workload profile is created
## Best Practices
✓ Use scale-to-zero for intermittent workloads
✓ Implement health probes (liveness and readiness)
✓ Use managed identities for authentication
✓ Store secrets in Azure Key Vault
✓ Enable Dapr for microservices patterns
✓ Configure appropriate scaling rules
✓ Monitor GPU utilization and adjust resources
✓ Use Container Apps jobs for batch processing
✓ Implement retry logic for transient failures
✓ Use Application Insights for observability
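The retry guidance above can be sketched as a small exponential-backoff helper (a generic pattern, not an Azure SDK API — the Azure SDKs ship their own configurable retry policies):

```python
import time
import random

def with_retries(fn, max_attempts: int = 5, base_delay: float = 0.5):
    """Call fn(), retrying transient failures with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                raise  # out of attempts: surface the error
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.1)
            time.sleep(delay)

# Example: a flaky call that succeeds on the third attempt
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # ok
```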
## References
- [Container Apps GPU Documentation](https://learn.microsoft.com/en-us/azure/container-apps/gpu-support)
- [Dapr Integration](https://learn.microsoft.com/en-us/azure/container-apps/dapr-overview)
- [Scaling Rules](https://learn.microsoft.com/en-us/azure/container-apps/scale-app)
- [Build 2025 Announcements](https://azure.microsoft.com/en-us/blog/container-apps-build-2025/)
Azure Container Apps with GPU support provides a serverless platform well suited to AI/ML workloads.

# Azure Deployment Stacks - 2025 GA Features
Complete knowledge base for Azure Deployment Stacks, the successor to Azure Blueprints (GA 2024, best practices 2025).
## Overview
Azure Deployment Stacks is a resource type for managing a collection of Azure resources as a single, atomic unit. It provides unified lifecycle management, resource protection, and automatic cleanup capabilities.
## Key Features
### 1. Unified Resource Management
- Manage multiple resources as a single entity
- Update, export, and delete operations on the entire stack
- Track all managed resources in one place
- Consistent deployment across environments
### 2. Deny Settings (Resource Protection)
Prevent unauthorized modifications to managed resources:
- **None**: No restrictions (default)
- **DenyDelete**: Prevent resource deletion
- **DenyWriteAndDelete**: Prevent updates and deletions
### 3. ActionOnUnmanage (Cleanup Policies)
Control what happens to resources that are no longer in the template:
- **detachAll**: Remove from stack management, keep resources
- **deleteAll**: Delete resources not in template
- **deleteResources**: Delete unmanaged resources, keep resource groups
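One common convention (an illustrative sketch, not an official mapping) is to pair deny settings and cleanup policy by environment, matching the examples later in this document:

```python
def stack_settings(environment: str) -> dict:
    """Illustrative pairing of deny settings and actionOnUnmanage per environment."""
    if environment == "production":
        # Immutable infrastructure with strict IaC enforcement
        return {"deny_settings_mode": "DenyWriteAndDelete",
                "action_on_unmanage": "deleteAll"}
    if environment == "staging":
        # Protect against deletion but allow configuration updates
        return {"deny_settings_mode": "DenyDelete",
                "action_on_unmanage": "deleteResources"}
    # Ephemeral dev/test environments: unrestricted, cleaned up automatically
    return {"deny_settings_mode": "None",
            "action_on_unmanage": "deleteAll"}

print(stack_settings("production")["deny_settings_mode"])  # DenyWriteAndDelete
```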
### 4. Scope Flexibility
Deploy stacks at:
- Resource group scope
- Subscription scope
- Management group scope
### 5. Replaces Azure Blueprints
Azure Blueprints will be deprecated in **July 2026**. Deployment Stacks is the recommended replacement.
## Prerequisites
### Azure CLI Version
```bash
# Requires Azure CLI 2.61.0 or later
az version
# Upgrade if needed
az upgrade
```
### Azure PowerShell Version
```powershell
# Requires Azure PowerShell 12.0.0 or later
Get-InstalledModule -Name Az
Update-Module -Name Az
```
## Creating Deployment Stacks
### Subscription Scope Stack
```bash
# Create deployment stack at subscription level
az stack sub create \
--name MyProductionStack \
--location eastus \
--template-file main.bicep \
--parameters @parameters.json \
--deny-settings-mode DenyWriteAndDelete \
--deny-settings-excluded-principals <devops-service-principal-id> <admin-group-id> \
--action-on-unmanage deleteAll \
--description "Production infrastructure managed by deployment stack" \
--tags Environment=Production ManagedBy=DeploymentStack CostCenter=Engineering
# What-if analysis before deployment
az stack sub what-if \
--name MyProductionStack \
--location eastus \
--template-file main.bicep \
--parameters @parameters.json
# Create with confirmation prompt disabled
az stack sub create \
--name MyDevStack \
--location eastus \
--template-file main.bicep \
--deny-settings-mode None \
--action-on-unmanage detachAll \
--yes
```
### Resource Group Scope Stack
```bash
# Create resource group
az group create \
--name MyRG \
--location eastus \
--tags Environment=Production
# Create deployment stack
az stack group create \
--name MyAppStack \
--resource-group MyRG \
--template-file main.bicep \
--parameters environment=production \
--deny-settings-mode DenyDelete \
--action-on-unmanage deleteAll \
--description "Application infrastructure stack"
```
### Management Group Scope Stack
```bash
# Create stack at management group level
az stack mg create \
--name MyEnterpriseStack \
--management-group-id MyMgmtGroup \
--location eastus \
--template-file main.bicep \
--deny-settings-mode DenyWriteAndDelete \
--action-on-unmanage detachAll
```
## Bicep Template for Deployment Stack
### Production Stack Template
```bicep
// main.bicep
targetScope = 'subscription'
@description('Environment name')
@allowed([
'dev'
'staging'
'production'
])
param environment string = 'production'
@description('Primary location')
param location string = 'eastus'
@description('Secondary location for geo-replication')
param secondaryLocation string = 'westus'
// Resource naming
var namingPrefix = 'myapp-${environment}'
// Resource Group for core infrastructure
resource coreRG 'Microsoft.Resources/resourceGroups@2024-03-01' = {
name: '${namingPrefix}-core-rg'
location: location
tags: {
Environment: environment
ManagedBy: 'DeploymentStack'
Purpose: 'Core Infrastructure'
}
}
// Resource Group for data services
resource dataRG 'Microsoft.Resources/resourceGroups@2024-03-01' = {
name: '${namingPrefix}-data-rg'
location: location
tags: {
Environment: environment
ManagedBy: 'DeploymentStack'
Purpose: 'Data Services'
}
}
// Log Analytics Workspace
module logAnalytics 'modules/log-analytics.bicep' = {
name: 'logAnalyticsDeploy'
scope: coreRG
params: {
name: '${namingPrefix}-logs'
location: location
retentionInDays: environment == 'production' ? 90 : 30
}
}
// AKS Automatic Cluster
module aksCluster 'modules/aks-automatic.bicep' = {
name: 'aksClusterDeploy'
scope: coreRG
params: {
name: '${namingPrefix}-aks'
location: location
kubernetesVersion: '1.34'
workspaceId: logAnalytics.outputs.workspaceId
enableZoneRedundancy: environment == 'production'
}
}
// Container Apps Environment
module containerEnv 'modules/container-env.bicep' = {
name: 'containerEnvDeploy'
scope: coreRG
params: {
name: '${namingPrefix}-containerenv'
location: location
workspaceId: logAnalytics.outputs.workspaceId
zoneRedundant: environment == 'production'
}
}
// Azure OpenAI
module openAI 'modules/openai.bicep' = {
name: 'openAIDeploy'
scope: dataRG
params: {
name: '${namingPrefix}-openai'
location: location
deployGPT5: environment == 'production'
}
}
// Cosmos DB with geo-replication
module cosmosDB 'modules/cosmos-db.bicep' = {
name: 'cosmosDBDeploy'
scope: dataRG
params: {
name: '${namingPrefix}-cosmos'
primaryLocation: location
secondaryLocation: secondaryLocation
enableAutomaticFailover: environment == 'production'
}
}
// Key Vault
module keyVault 'modules/key-vault.bicep' = {
name: 'keyVaultDeploy'
scope: coreRG
params: {
name: '${namingPrefix}-kv'
location: location
enablePurgeProtection: environment == 'production'
}
}
// Outputs
output aksClusterName string = aksCluster.outputs.clusterName
output containerEnvId string = containerEnv.outputs.environmentId
output openAIEndpoint string = openAI.outputs.endpoint
output cosmosDBEndpoint string = cosmosDB.outputs.endpoint
output keyVaultUri string = keyVault.outputs.vaultUri
```
### AKS Automatic Module
```bicep
// modules/aks-automatic.bicep
@description('Cluster name')
param name string
@description('Location')
param location string
@description('Kubernetes version')
param kubernetesVersion string = '1.34'
@description('Log Analytics workspace ID')
param workspaceId string
@description('Enable zone redundancy')
param enableZoneRedundancy bool = true
resource aksCluster 'Microsoft.ContainerService/managedClusters@2025-01-01' = {
name: name
location: location
sku: {
name: 'Automatic'
tier: 'Standard'
}
identity: {
type: 'SystemAssigned'
}
properties: {
kubernetesVersion: kubernetesVersion
dnsPrefix: '${name}-dns'
enableRBAC: true
aadProfile: {
managed: true
enableAzureRBAC: true
}
networkProfile: {
networkPlugin: 'azure'
networkPluginMode: 'overlay'
networkDataplane: 'cilium'
serviceCidr: '10.0.0.0/16'
dnsServiceIP: '10.0.0.10'
}
autoScalerProfile: {
'balance-similar-node-groups': 'true'
expander: 'least-waste'
}
autoUpgradeProfile: {
upgradeChannel: 'stable'
nodeOSUpgradeChannel: 'NodeImage'
}
securityProfile: {
defender: {
securityMonitoring: {
enabled: true
}
}
workloadIdentity: {
enabled: true
}
}
oidcIssuerProfile: {
enabled: true
}
addonProfiles: {
omsagent: {
enabled: true
config: {
logAnalyticsWorkspaceResourceID: workspaceId
}
}
azurePolicy: {
enabled: true
}
}
}
zones: enableZoneRedundancy ? ['1', '2', '3'] : null
}
output clusterName string = aksCluster.name
output clusterId string = aksCluster.id
output oidcIssuerUrl string = aksCluster.properties.oidcIssuerProfile.issuerUrl
output kubeletIdentity string = aksCluster.properties.identityProfile.kubeletidentity.objectId
```
## Managing Deployment Stacks
### Update Stack
```bash
# Update with new template version
az stack sub update \
--name MyProductionStack \
--template-file main.bicep \
--parameters @parameters.json \
--action-on-unmanage deleteAll
# Update deny settings
az stack sub update \
--name MyProductionStack \
--deny-settings-mode DenyWriteAndDelete \
--deny-settings-excluded-principals <new-principal-id>
```
### View Stack Details
```bash
# Show stack information
az stack sub show \
--name MyProductionStack \
--output json
# List all stacks in subscription
az stack sub list --output table
# List stacks in resource group
az stack group list \
--resource-group MyRG \
--output table
```
### Export Stack Template
```bash
# Export template from deployed stack
az stack sub export \
--name MyProductionStack \
--output-file exported-stack.json
# Export and save parameters
az stack sub show \
--name MyProductionStack \
--query "parameters" \
--output json > parameters-backup.json
```
### Delete Stack
```bash
# Delete stack and all managed resources
az stack sub delete \
--name MyProductionStack \
--action-on-unmanage deleteAll \
--yes
# Delete stack but keep resources
az stack sub delete \
--name MyProductionStack \
--action-on-unmanage detachAll \
--yes
# Delete with confirmation prompt
az stack sub delete --name MyProductionStack
```
## Deny Settings in Detail
### DenyDelete Mode
Prevents deletion but allows updates:
```bash
az stack sub create \
--name MyStack \
--location eastus \
--template-file main.bicep \
--deny-settings-mode DenyDelete \
--deny-settings-excluded-principals \
<emergency-access-principal-id> \
<devops-service-principal-id>
```
**Use cases:**
- Protect production databases
- Prevent accidental resource deletion
- Allow configuration updates
### DenyWriteAndDelete Mode
Prevents both updates and deletions:
```bash
az stack sub create \
--name MyStack \
--location eastus \
--template-file main.bicep \
--deny-settings-mode DenyWriteAndDelete \
--deny-settings-excluded-principals <break-glass-principal-id>
```
**Use cases:**
- Immutable infrastructure
- Compliance requirements
- Critical production workloads
### Excluded Principals
Bypass deny settings for specific identities:
```bash
# Get principal IDs
SERVICE_PRINCIPAL_ID=$(az ad sp show --id <app-id> --query id -o tsv)
ADMIN_GROUP_ID=$(az ad group show --group "Cloud Admins" --query id -o tsv)
# Apply with exclusions
az stack sub create \
--name MyStack \
--location eastus \
--template-file main.bicep \
--deny-settings-mode DenyWriteAndDelete \
--deny-settings-excluded-principals $SERVICE_PRINCIPAL_ID $ADMIN_GROUP_ID
```
## ActionOnUnmanage Policies
### detachAll
Resources are removed from stack management but not deleted:
```bash
az stack sub create \
--name MyStack \
--location eastus \
--template-file main.bicep \
--action-on-unmanage detachAll
```
**Use when:**
- Testing deployment changes
- Migrating resources to another stack
- Temporary stack management
### deleteAll
All unmanaged resources are deleted:
```bash
az stack sub create \
--name MyStack \
--location eastus \
--template-file main.bicep \
--action-on-unmanage deleteAll
```
**Use when:**
- Ephemeral environments (dev, test)
- Clean slate deployments
- Strict infrastructure-as-code enforcement
### deleteResources
Delete resources but keep resource groups:
```bash
az stack sub create \
--name MyStack \
--location eastus \
--template-file main.bicep \
--action-on-unmanage deleteResources
```
## RBAC for Deployment Stacks
### Built-in Roles
**Azure Deployment Stack Contributor**
- Manage deployment stacks
- Cannot create or delete deny-assignments
**Azure Deployment Stack Owner**
- Full stack management
- Can create and delete deny-assignments
### Assign Roles
```bash
# Assign Stack Contributor role
az role assignment create \
--assignee <user-or-service-principal-id> \
--role "Azure Deployment Stack Contributor" \
--scope /subscriptions/<subscription-id>
# Assign Stack Owner role
az role assignment create \
--assignee <admin-principal-id> \
--role "Azure Deployment Stack Owner" \
--scope /subscriptions/<subscription-id>
```
## CI/CD Integration
### GitHub Actions
```yaml
name: Deploy Deployment Stack
on:
push:
branches: [main]
workflow_dispatch:
permissions:
id-token: write
contents: read
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Azure Login
uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- name: What-if Analysis
run: |
az stack sub what-if \
--name MyProductionStack \
--location eastus \
--template-file main.bicep \
--parameters @parameters.json
- name: Deploy Stack
run: |
az stack sub create \
--name MyProductionStack \
--location eastus \
--template-file main.bicep \
--parameters @parameters.json \
--deny-settings-mode DenyWriteAndDelete \
--deny-settings-excluded-principals ${{ secrets.DEVOPS_PRINCIPAL_ID }} \
--action-on-unmanage deleteAll \
--yes
```
### Azure DevOps Pipeline
```yaml
trigger:
branches:
include:
- main
pool:
vmImage: 'ubuntu-latest'
variables:
azureSubscription: 'MyAzureConnection'
stackName: 'MyProductionStack'
location: 'eastus'
steps:
- task: AzureCLI@2
displayName: 'What-if Analysis'
inputs:
azureSubscription: $(azureSubscription)
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
az stack sub what-if \
--name $(stackName) \
--location $(location) \
--template-file main.bicep \
--parameters @parameters.json
- task: AzureCLI@2
displayName: 'Deploy Stack'
inputs:
azureSubscription: $(azureSubscription)
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
az stack sub create \
--name $(stackName) \
--location $(location) \
--template-file main.bicep \
--parameters @parameters.json \
--deny-settings-mode DenyWriteAndDelete \
--action-on-unmanage deleteAll \
--yes
```
## Monitoring and Auditing
### View Stack Events
```bash
# Get deployment operations
az stack sub show \
--name MyProductionStack \
--query "deploymentId" \
  --output tsv | \
  awk -F/ '{print $NF}' | \
  xargs -I {} az deployment sub show --name {}
# List managed resources
az stack sub show \
--name MyProductionStack \
--query "resources[].id" \
--output table
```
### Activity Logs
```bash
# Query stack operations
az monitor activity-log list \
--resource-group MyRG \
--namespace Microsoft.Resources \
--start-time 2025-01-01T00:00:00Z \
--query "[?contains(authorization.action, 'Microsoft.Resources/deploymentStacks')]" \
--output table
```
## Migration from Azure Blueprints
### Assessment
1. **Inventory Blueprints**: List all blueprints and assignments
2. **Document Parameters**: Export parameters and configurations
3. **Plan Conversion**: Map blueprints to deployment stacks
4. **Test in Dev**: Validate converted templates
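
As a sketch of step 1, the inventory commands can be assembled up front and reviewed before running them against a real subscription (this assumes the `blueprint` Azure CLI extension is installed; `<subscription-id>` is a placeholder):

```bash
#!/usr/bin/env bash
# Print the inventory commands for the existing Blueprint estate rather than
# executing them, so they can be reviewed first.
subscription="<subscription-id>"
list_cmd="az blueprint list --subscription ${subscription} --output table"
assignments_cmd="az blueprint assignment list --subscription ${subscription} --output table"
printf '%s\n%s\n' "${list_cmd}" "${assignments_cmd}"
```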
### Conversion Steps
```bash
# 1. Export Blueprint as ARM template
# (Use Azure Portal or PowerShell)
# 2. Convert ARM to Bicep
az bicep decompile --file blueprint-template.json
# 3. Create Deployment Stack
az stack sub create \
--name ConvertedFromBlueprint \
--location eastus \
--template-file converted.bicep \
--parameters @blueprint-parameters.json \
--deny-settings-mode DenyWriteAndDelete \
--action-on-unmanage detachAll
# 4. Validate resources
az stack sub show --name ConvertedFromBlueprint
# 5. Delete Blueprint assignment (after validation)
# Remove-AzBlueprintAssignment -Name MyBlueprintAssignment
```
## Best Practices
- **Use Deployment Stacks for all new infrastructure**
- **Always run what-if analysis before deployment**
- **Use denyWriteAndDelete for production stacks**
- **Exclude break-glass principals from deny settings**
- **Tag stacks with Environment, CostCenter, Owner**
- **Use deleteAll for ephemeral environments**
- **Use detachAll for migration scenarios**
- **Implement CI/CD pipelines for stack deployment**
- **Monitor stack operations via activity logs**
- **Document stack architecture and dependencies**
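
The tagging recommendation above can be sketched as follows. The tag names match the practice list; the values are illustrative assumptions, and the resulting string is what would be passed to `az stack sub create ... --tags`:

```bash
#!/usr/bin/env bash
# Assemble governance tags (values are illustrative) and print the flag as it
# would appear on the az stack sub create command line.
environment="prod"
cost_center="CC-1234"
owner="platform-team"
tags="Environment=${environment} CostCenter=${cost_center} Owner=${owner}"
echo "--tags ${tags}"
```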
## Troubleshooting
### Stack Creation Fails
```bash
# Check deployment errors
az stack sub show \
--name MyStack \
--query "error" \
--output json
# Validate template
az deployment sub validate \
--location eastus \
--template-file main.bicep \
--parameters @parameters.json
```
### Deny Settings Blocking Operations
```bash
# Inspect the stack's deny settings and excluded principals
az stack sub show \
  --name MyStack \
  --query "denySettings" \
  --output json

# There is no dedicated update command: add a principal to the exclusions by
# re-running create with the same stack name, template, and settings
az stack sub create \
  --name MyStack \
  --location eastus \
  --template-file main.bicep \
  --deny-settings-mode denyWriteAndDelete \
  --deny-settings-excluded-principals <new-principal-id> \
  --action-on-unmanage detachAll \
  --yes
```
### Resources Not Deleted
```bash
# Check the action-on-unmanage setting
az stack sub show \
  --name MyStack \
  --query "actionOnUnmanage" \
  --output json

# Change it to deleteAll by re-running create with the same stack name and template
az stack sub create \
  --name MyStack \
  --location eastus \
  --template-file main.bicep \
  --action-on-unmanage deleteAll \
  --yes
```
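
When the stack itself should be removed along with everything it manages, deletion takes the same setting. A minimal sketch (the stack name is illustrative; the command is printed for review rather than executed):

```bash
#!/usr/bin/env bash
# Build the teardown command: deleteAll removes the stack AND every resource it
# manages, while detachAll would delete the stack but leave the resources behind.
stack_name="MyStack"
delete_cmd="az stack sub delete --name ${stack_name} --action-on-unmanage deleteAll --yes"
echo "${delete_cmd}"  # review, then run against a real subscription
```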
## References
- [Deployment Stacks Documentation](https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/deployment-stacks)
- [Deployment Stacks Quickstart](https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/quickstart-create-deployment-stacks)
- [Migrate from Blueprints](https://learn.microsoft.com/en-us/azure/governance/blueprints/how-to/migrate-to-deployment-stacks)
Deployment Stacks are the future of Azure infrastructure lifecycle management!