Initial commit

Zhongwei Li
2025-11-30 08:47:38 +08:00
commit 18faa0569e
47 changed files with 7969 additions and 0 deletions

@@ -0,0 +1,66 @@
# External Resources
Pointers to official documentation and community resources.
## Official HashiCorp Documentation
| Resource | URL | Use For |
|----------|-----|---------|
| Terraform Docs | https://developer.hashicorp.com/terraform/docs | Language reference, CLI commands |
| Terraform Tutorials | https://developer.hashicorp.com/terraform/tutorials | Step-by-step learning paths |
| Language Reference | https://developer.hashicorp.com/terraform/language | HCL syntax, expressions, functions |
| CLI Reference | https://developer.hashicorp.com/terraform/cli | Command options and usage |
| Best Practices | https://developer.hashicorp.com/terraform/cloud-docs/recommended-practices | Official workflow recommendations |
## Terraform Registry
| Resource | URL | Use For |
|----------|-----|---------|
| Provider Registry | https://registry.terraform.io/browse/providers | Find and explore providers |
| Module Registry | https://registry.terraform.io/browse/modules | Pre-built modules |
| Telmate Proxmox | https://registry.terraform.io/providers/Telmate/proxmox/latest/docs | Proxmox provider docs |
| AWS Provider | https://registry.terraform.io/providers/hashicorp/aws/latest/docs | AWS resource reference |
## Proxmox Resources
| Resource | URL | Use For |
|----------|-----|---------|
| Telmate Provider Docs | https://registry.terraform.io/providers/Telmate/proxmox/latest/docs | Resource configuration |
| Telmate GitHub | https://github.com/Telmate/terraform-provider-proxmox | Source, issues, examples |
| Proxmox VE API | https://pve.proxmox.com/pve-docs/api-viewer/ | Understanding API calls |
| Proxmox Wiki | https://pve.proxmox.com/wiki/Main_Page | Proxmox concepts and setup |
## Community Resources
| Resource | URL | Use For |
|----------|-----|---------|
| Terraform Best Practices | https://www.terraform-best-practices.com | Community-maintained guide |
| Awesome Terraform | https://github.com/shuaibiyy/awesome-terraform | Curated list of resources |
| Terraform Weekly | https://www.yourdevopsmentor.com/terraform-weekly | News and updates |
## Learning Resources
| Resource | URL | Use For |
|----------|-----|---------|
| HashiCorp Learn | https://developer.hashicorp.com/terraform/tutorials | Official tutorials |
| Terraform Up & Running | https://www.terraformupandrunning.com/ | Comprehensive book |
## Tools
| Tool | URL | Use For |
|------|-----|---------|
| TFLint | https://github.com/terraform-linters/tflint | Linting and best practices |
| Checkov | https://github.com/bridgecrewio/checkov | Security scanning |
| Infracost | https://github.com/infracost/infracost | Cost estimation |
| Terragrunt | https://terragrunt.gruntwork.io/ | DRY Terraform configurations |
| tfenv | https://github.com/tfutils/tfenv | Terraform version management |
## Quick Links
**Most commonly needed:**
1. **HCL Syntax**: https://developer.hashicorp.com/terraform/language/syntax/configuration
2. **Functions**: https://developer.hashicorp.com/terraform/language/functions
3. **Expressions**: https://developer.hashicorp.com/terraform/language/expressions
4. **Backend Configuration**: https://developer.hashicorp.com/terraform/language/settings/backends
5. **Proxmox VM Resource**: https://registry.terraform.io/providers/Telmate/proxmox/latest/docs/resources/vm_qemu

@@ -0,0 +1,165 @@
# Module Design
## Standard Structure
```
modules/<name>/
├── main.tf       # Resources
├── variables.tf  # Inputs
├── outputs.tf    # Outputs
└── versions.tf   # Provider constraints
```
## Module Example
```hcl
# modules/vm/variables.tf
variable "name" {
  description = "VM name"
  type        = string
}

variable "target_node" {
  description = "Proxmox node"
  type        = string
}

variable "specs" {
  type = object({
    cores  = number
    memory = number
    disk   = optional(string, "50G")
  })
}
```
```hcl
# modules/vm/main.tf
resource "proxmox_vm_qemu" "vm" {
  name        = var.name
  target_node = var.target_node
  cores       = var.specs.cores
  memory      = var.specs.memory
}
```
```hcl
# modules/vm/outputs.tf
output "ip" {
  value = proxmox_vm_qemu.vm.default_ipv4_address
}
```
```hcl
# Usage
module "web" {
  source      = "./modules/vm"
  name        = "web-01"
  target_node = "pve1"
  specs       = { cores = 4, memory = 8192 }
}
```
## Complex Variable Types
```hcl
# Map of objects
variable "vms" {
  type = map(object({
    node   = string
    cores  = number
    memory = number
  }))
}

# Object with optional fields
variable "network" {
  type = object({
    bridge = string
    vlan   = optional(number)
    ip     = optional(string, "dhcp")
  })
}
```
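As a rough illustration of how values for these two types might be supplied, the sketch below assigns both variables in a `terraform.tfvars`-style file. The VM names, node names, and bridge are made-up placeholders, and `optional()` with a default assumes Terraform 1.3 or later.

```hcl
# terraform.tfvars (hypothetical values for illustration only)
vms = {
  web = { node = "pve1", cores = 2, memory = 4096 }
  db  = { node = "pve2", cores = 4, memory = 8192 }
}

# vlan is omitted, so it resolves to null; ip falls back to its "dhcp" default
network = {
  bridge = "vmbr0"
}
```

Inside the configuration, `var.network.ip` would then evaluate to `"dhcp"` and `var.network.vlan` to `null`.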
## Variable Validation
```hcl
variable "environment" {
  type = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Must be dev, staging, or prod."
  }
}

variable "cores" {
  type = number

  validation {
    condition     = var.cores >= 1 && var.cores <= 32
    error_message = "Cores must be 1-32."
  }
}
```
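The same pattern extends to string formats. The sketch below is one possible way to check a name against a pattern; the variable name `vm_name` and the pattern itself are assumptions for illustration, not rules taken from the provider.

```hcl
# Hypothetical variable: validates a hostname-like VM name
variable "vm_name" {
  type = string

  validation {
    # regex() raises an error when nothing matches; can() turns that into false
    condition     = can(regex("^[a-z][a-z0-9-]{0,62}$", var.vm_name))
    error_message = "Name must start with a lowercase letter and contain only lowercase letters, digits, and hyphens."
  }
}
```

Wrapping `regex()` in `can()` keeps the condition a plain boolean, which is what a `validation` block expects.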
## Module Composition
```hcl
module "network" {
  source = "../../modules/network"
  # ...
}

module "web" {
  source     = "../../modules/vm"
  network_id = module.network.id # Implicit dependency
}

module "database" {
  source     = "../../modules/vm"
  depends_on = [module.network] # Explicit dependency
}
```
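For `module.network.id` to resolve, the network module has to declare an output named `id`, and the VM module has to accept a `network_id` input. A minimal sketch of both sides follows; the file contents and the `bridge` variable are hypothetical, since the network module's internals are not shown here.

```hcl
# modules/network/variables.tf (hypothetical)
variable "bridge" {
  type    = string
  default = "vmbr0"
}

# modules/network/outputs.tf (hypothetical)
output "id" {
  description = "Value other modules reference as module.network.id"
  value       = var.bridge # placeholder: expose whatever identifies the network
}

# modules/vm/variables.tf (hypothetical addition)
variable "network_id" {
  description = "Network identifier passed in by the caller"
  type        = string
}
```

Referencing the output is usually enough to order the modules; `depends_on` is only needed when no value flows between them.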
## for_each vs count
```hcl
# count - index-based (0, 1, 2)
module "worker" {
  source = "./modules/vm"
  count  = 3
  name   = "worker-${count.index}"
  # ...
}
# Access: module.worker[0]

# for_each - key-based (preferred)
module "vm" {
  source      = "./modules/vm"
  for_each    = var.vms
  name        = each.key
  target_node = each.value.node
  specs       = each.value
}
# Access: module.vm["web"]
```
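When the input is a list rather than a map, `toset()` is one way to keep key-based addressing. The sketch below assumes the `./modules/vm` module shown earlier; the worker names, node, and sizes are illustrative placeholders.

```hcl
# Hypothetical list of instance names
variable "worker_names" {
  type    = list(string)
  default = ["worker-a", "worker-b", "worker-c"]
}

module "worker" {
  source   = "./modules/vm"
  for_each = toset(var.worker_names) # for_each accepts a map or a set of strings

  name        = each.key
  target_node = "pve1"
  specs       = { cores = 2, memory = 4096 }
}
# Access: module.worker["worker-b"]
```

Removing `"worker-b"` from the list affects only `module.worker["worker-b"]`. With `count`, the later instances would shift to new indexes and be planned for changes as well.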
## Version Constraints
```hcl
# modules/vm/versions.tf
terraform {
  required_version = ">= 1.0"

  required_providers {
    proxmox = {
      source  = "telmate/proxmox"
      version = "~> 3.0"
    }
  }
}
```
```hcl
# Pin module version
module "vm" {
  source = "git::https://github.com/org/modules.git//vm?ref=v2.1.0"
}
```

@@ -0,0 +1,44 @@
# Proxmox Provider Authentication
## Provider Configuration
```hcl
terraform {
  required_providers {
    proxmox = {
      source  = "telmate/proxmox"
      version = "~> 3.0"
    }
  }
}

provider "proxmox" {
  pm_api_url          = "https://proxmox.example.com:8006/api2/json"
  pm_api_token_id     = "terraform@pve!mytoken"
  pm_api_token_secret = var.pm_api_token_secret
  pm_tls_insecure     = false # true for self-signed certs
  pm_parallel         = 4     # concurrent operations
  pm_timeout          = 600   # API timeout seconds
}
```
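The provider block above references `var.pm_api_token_secret`, so that variable must be declared somewhere in the root module. A minimal declaration might look like this; the description text is illustrative.

```hcl
# Declaration for the secret used by the provider block
variable "pm_api_token_secret" {
  description = "Secret for the Proxmox API token"
  type        = string
  sensitive   = true # redacted in plan and apply output
}
```

Terraform reads the value from the environment when `TF_VAR_pm_api_token_secret` is exported, which keeps the secret out of `.tf` and `.tfvars` files.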
## Create API Token
```bash
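pveum user add terraform@pve
pveum aclmod / -user terraform@pve -role PVEAdmin
# Tokens use privilege separation by default and start with no permissions;
# --privsep 0 lets the token inherit the user's ACLs
pveum user token add terraform@pve mytoken --privsep 0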
```
## Environment Variables
```bash
export PM_API_TOKEN_ID="terraform@pve!mytoken"
export PM_API_TOKEN_SECRET="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
```
## Official Resources
- [Provider Docs](https://registry.terraform.io/providers/Telmate/proxmox/latest/docs)
- [GitHub](https://github.com/Telmate/terraform-provider-proxmox)
- [Proxmox API](https://pve.proxmox.com/pve-docs/api-viewer/)

@@ -0,0 +1,86 @@
# Proxmox Provider Gotchas
Critical issues to watch for when using the Telmate Proxmox provider with Terraform.
## 1. Cloud-Init Changes Not Tracked
Terraform does **not** detect changes to cloud-init snippet file contents.
```hcl
# PROBLEM: Changing vendor-data.yml won't trigger replacement
resource "proxmox_vm_qemu" "vm" {
  cicustom = "vendor=local:snippets/vendor-data.yml"
}

# SOLUTION: Use replace_triggered_by
resource "local_file" "vendor_data" {
  filename = "vendor-data.yml"
  content  = templatefile("vendor-data.yml.tftpl", { ... })
}

resource "proxmox_vm_qemu" "vm" {
  cicustom = "vendor=local:snippets/vendor-data.yml"

  lifecycle {
    replace_triggered_by = [
      local_file.vendor_data.content_base64sha256
    ]
  }
}
```
## 2. Storage Type vs Storage Pool
The storage pool and the disk or controller type are separate settings. Do not confuse them:
```hcl
disks {
  scsi {
    scsi0 {
      disk {
        storage = "local-lvm" # Pool NAME (from Proxmox datacenter)
        size    = "50G"
      }
    }
  }
}
scsihw = "virtio-scsi-single" # Controller TYPE
```
- **Storage pool** = Where the data is stored (local-lvm, ceph-pool, nfs-share)
- **Disk type** = Interface the guest sees (scsi, virtio, ide, sata)
## 3. Network Interface Naming
Proxmox VMs get predictable names by device order:
| NIC Order | Guest Name |
|-----------|------------|
| First | ens18 |
| Second | ens19 |
| Third | ens20 |
**NOT** eth0, eth1. Configure cloud-init netplan matching `ens*`.
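One way to produce a matching network config from Terraform is to build it with `yamlencode`. The sketch below is an assumption-laden example, not the provider's own mechanism: the netplan v2 layout, the `all-ens` label, and the use of DHCP are illustrative choices.

```hcl
locals {
  # Netplan v2 style config; "all-ens" is an arbitrary label for the match rule
  network_config = yamlencode({
    network = {
      version = 2
      ethernets = {
        "all-ens" = {
          match = { name = "ens*" } # matches ens18, ens19, ens20, ...
          dhcp4 = true
        }
      }
    }
  })
}

# Inspect the rendered YAML with `terraform output network_config`
output "network_config" {
  value = local.network_config
}
```

The rendered string could then be written to a snippet file and referenced from `cicustom`, in the same way as the vendor data in section 1.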
## 4. API Token Expiration
Long operations (20+ VMs) can outlast the login ticket that password authentication relies on, or exceed the provider's API timeout.
```hcl
provider "proxmox" {
  pm_api_token_id     = "terraform@pve!mytoken"
  pm_api_token_secret = var.pm_api_token_secret
  pm_timeout          = 1200 # 20 minutes for large operations
}
```
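Use API tokens rather than passwords: a token stays valid until its configured expiry (if any), while a password login yields a short-lived ticket.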
## 5. Full Clone vs Linked Clone
```hcl
full_clone = true # Independent copy - safe, slower, more storage
full_clone = false # References template - BREAKS if template modified
```
**Always use `full_clone = true` for production.** Linked clones only for disposable test VMs.

@@ -0,0 +1,66 @@
# Proxmox Troubleshooting
## VM Creation Stuck
```
Timeout waiting for VM to be created
```
**Causes**: Template missing, storage full, network unreachable
**Debug**: Check Proxmox task log in web UI
## Clone Failed
```
VM template not found
```
**Check**: `qm list | grep template-name`
**Causes**: Template doesn't exist, wrong node, permission issue
## SSH Timeout
```
Timeout waiting for SSH
```
**Debug**:
1. VM console in Proxmox UI
2. `cloud-init status` on VM
3. `ip addr` to verify network
**Causes**: Cloud-init failed, network misconfigured, firewall
## State Drift
```
Plan shows changes for unchanged resources
```
**Causes**: Manual changes in Proxmox UI, provider bug
**Fix**:
```bash
terraform apply -refresh-only # Sync state with real infrastructure (`terraform refresh` is deprecated)
terraform plan                # Verify
```
## API Errors
```
500 Internal Server Error
```
**Causes**: Invalid config, resource constraints, API timeout
**Debug**: Check `/var/log/pveproxy/access.log` on Proxmox node
## Permission Denied
```
Permission check failed
```
**Fix**: Verify API token has required permissions:
```bash
pveum acl list
pveum user permissions terraform@pve
```

@@ -0,0 +1,86 @@
# proxmox_vm_qemu Resource
## Basic VM from Template
```hcl
resource "proxmox_vm_qemu" "vm" {
  name        = "my-vm"
  target_node = "pve1"
  clone       = "ubuntu-template"
  full_clone  = true

  cores   = 4
  sockets = 1
  memory  = 8192
  cpu     = "host"
  onboot  = true
  agent   = 1 # QEMU guest agent

  scsihw = "virtio-scsi-single"

  disks {
    scsi {
      scsi0 {
        disk {
          storage = "local-lvm"
          size    = "50G"
        }
      }
    }
  }

  network {
    bridge = "vmbr0"
    model  = "virtio"
  }

  # Cloud-init
  os_type   = "cloud-init"
  ciuser    = "ubuntu"
  sshkeys   = var.ssh_public_key
  ipconfig0 = "ip=dhcp"
  # Static: ipconfig0 = "ip=192.168.1.10/24,gw=192.168.1.1"

  # Custom cloud-init
  cicustom = "vendor=local:snippets/vendor-data.yml"
}
```
## Lifecycle Management
```hcl
lifecycle {
  prevent_destroy = true # Block accidental deletion (also blocks replacement)

  ignore_changes = [
    network, # Ignore manual changes
  ]

  replace_triggered_by = [
    local_file.cloud_init.content_base64sha256
  ]

  create_before_destroy = true # Blue-green deployment
}
```
## Multiple VMs with for_each
```hcl
variable "vms" {
  type = map(object({
    node   = string
    cores  = number
    memory = number
  }))
}

resource "proxmox_vm_qemu" "vm" {
  for_each = var.vms

  name        = each.key
  target_node = each.value.node
  cores       = each.value.cores
  memory      = each.value.memory
  # ...
}
```
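Because `for_each` turns the resource into a map of instances, outputs are usually built with a `for` expression. The sketch below assumes the `proxmox_vm_qemu.vm` resource defined above and reuses the `default_ipv4_address` attribute from the module example; the output name is an arbitrary choice.

```hcl
# Map of VM name => reported IPv4 address, one entry per key in var.vms
output "vm_ips" {
  description = "IPv4 address of each VM, keyed by VM name"
  value = {
    for name, vm in proxmox_vm_qemu.vm : name => vm.default_ipv4_address
  }
}
```

A single instance is addressed as `proxmox_vm_qemu.vm["web"]`, where `"web"` is one of the keys in `var.vms`.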

@@ -0,0 +1,92 @@
# Security
## Secrets Management
### Environment Variables (Recommended)
```bash
export TF_VAR_proxmox_password="secret"
export TF_VAR_api_token="xxxxx"
terraform apply
```
### Sensitive Variables
```hcl
variable "database_password" {
type = string
sensitive = true # Hidden in logs/plan
}
```
### External Secrets Managers
**HashiCorp Vault**:
```hcl
data "vault_generic_secret" "db" {
  path = "secret/database"
}

resource "some_resource" "x" {
  password = data.vault_generic_secret.db.data["password"]
}
```
**1Password CLI**:
```bash
export TF_VAR_password="$(op read 'op://vault/item/password')"
terraform apply
```
## State Security
**CRITICAL**: State contains secrets in plaintext.
### Encrypt at Rest
```hcl
terraform {
  backend "s3" {
    # bucket, key, and region omitted here
    encrypt    = true
    kms_key_id = "arn:aws:kms:..." # Optional KMS
  }
}
```
### Restrict Access
- IAM/RBAC on backend storage
- Enable state locking
- Never commit state to git
## Provider Credentials
```hcl
provider "proxmox" {
  pm_api_token_id     = "terraform@pve!mytoken"
  pm_api_token_secret = var.pm_api_token_secret # From env
}
```
Create minimal-permission API user:
```bash
pveum user add terraform@pve
pveum aclmod / -user terraform@pve -role PVEVMAdmin
# Token name must match pm_api_token_id; --privsep 0 lets the token inherit the user's ACLs
pveum user token add terraform@pve mytoken --privsep 0
```
## Sensitive Outputs
```hcl
output "db_password" {
  value     = random_password.db.result
  sensitive = true
}
```
## Checklist
- [ ] Sensitive vars marked `sensitive = true`
- [ ] Secrets via env vars or secrets manager
- [ ] State backend encryption enabled
- [ ] State locking enabled
- [ ] No credentials in .tf files
- [ ] Provider credentials minimal permissions

@@ -0,0 +1,112 @@
# State Management
## Remote Backend (Recommended)
```hcl
terraform {
  backend "s3" {
    bucket         = "terraform-state"
    key            = "project/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks" # State locking
  }
}
```
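Backend blocks cannot reference variables, so environment-specific settings are often moved into a separate file and passed at init time. The sketch below shows that partial-configuration approach; the file name `backend.hcl` is an arbitrary choice and the values simply repeat the example above.

```hcl
# backend.hcl (hypothetical file name)
bucket         = "terraform-state"
key            = "project/terraform.tfstate"
region         = "us-east-1"
encrypt        = true
dynamodb_table = "terraform-locks"
```

With this file in place, the configuration would keep an empty `backend "s3" {}` block and be initialized with `terraform init -backend-config=backend.hcl`.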
### S3-Compatible (MinIO, Ceph)
```hcl
terraform {
  backend "s3" {
    bucket = "terraform-state"
    key    = "project/terraform.tfstate"
    region = "us-east-1" # Required but ignored

    endpoint                    = "https://minio.example.com" # Terraform 1.6+: endpoints = { s3 = "..." }
    skip_credentials_validation = true
    skip_metadata_api_check     = true
    skip_region_validation      = true
    force_path_style            = true # Terraform 1.6+: use_path_style
  }
}
```
## State Operations
```bash
# List resources
terraform state list
terraform state list | grep '^proxmox_vm_qemu\.' # Filter by type (addresses take no wildcards)
# Show resource details
terraform state show proxmox_vm_qemu.web
# Rename resource
terraform state mv proxmox_vm_qemu.old proxmox_vm_qemu.new
# Move to module
terraform state mv proxmox_vm_qemu.web module.web.proxmox_vm_qemu.main
# Remove from state (doesn't destroy)
terraform state rm proxmox_vm_qemu.orphaned
# Import existing resource
terraform import proxmox_vm_qemu.web pve1/qemu/100
# Update state from infrastructure (`terraform refresh` is deprecated)
terraform apply -refresh-only
```
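Several of these operations also have declarative equivalents that live in the configuration and go through the normal plan and apply cycle. The sketch below reuses the resource addresses and import ID from the commands above; the minimum Terraform versions are noted in the comments.

```hcl
# Rename a resource without editing state by hand (Terraform 1.1+)
moved {
  from = proxmox_vm_qemu.old
  to   = proxmox_vm_qemu.new
}

# Import an existing VM on the next apply (Terraform 1.5+)
import {
  to = proxmox_vm_qemu.web
  id = "pve1/qemu/100"
}

# Drop a resource from state without destroying it (Terraform 1.7+)
removed {
  from = proxmox_vm_qemu.orphaned

  lifecycle {
    destroy = false
  }
}
```

The `removed` block expects the matching `resource` block to be deleted from the configuration first. The `import` block needs a `resource "proxmox_vm_qemu" "web"` block to import into.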
## State Migration
```bash
# Change backend: update the backend block in the terraform block, then:
terraform init -migrate-state
# Reinitialize without migration
terraform init -reconfigure
```
## State Locking
Prevents concurrent modifications. Enable via backend config:
- S3: `dynamodb_table` (newer Terraform releases also offer `use_lockfile = true`)
- Consul: Built-in
- HTTP: `lock_address`
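For the HTTP backend, locking is only active when a lock endpoint is configured. The sketch below shows the relevant arguments; the URLs are placeholders for whatever state server is in use.

```hcl
terraform {
  backend "http" {
    address        = "https://state.example.com/project"      # hypothetical endpoint
    lock_address   = "https://state.example.com/project/lock" # enables locking
    unlock_address = "https://state.example.com/project/lock"
    lock_method    = "LOCK"   # backend default
    unlock_method  = "UNLOCK" # backend default
  }
}
```

Some state servers expect `POST` and `DELETE` instead of the default `LOCK` and `UNLOCK` verbs, so these two method arguments may need adjusting.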
### Force Unlock (Emergency)
```bash
# Only when you are certain no other operation is running
terraform force-unlock LOCK_ID
```
## Troubleshooting
### State Lock Timeout
```
Error: Error acquiring state lock
```
1. Wait for the other operation to finish
2. Verify that no Terraform process is still running
3. `terraform force-unlock LOCK_ID` if safe
### State Drift
```
Plan shows unexpected changes
```
```bash
terraform apply -refresh-only # Update state from real infra (`terraform refresh` is deprecated)
terraform plan                # Review changes
```
### Corrupted State
1. Save the current state before changing anything: `terraform state pull > backup.tfstate`
2. Restore from a known-good backup
3. Last resort: `terraform state rm` and re-import