Initial commit

skills/proxmox-infrastructure/reference/api-reference.md

# Proxmox API Reference

## Overview

The Proxmox API enables programmatic management of the cluster via REST. This reference focuses on common patterns for Python (proxmoxer) and Terraform/Ansible usage.

## Authentication Methods

### API Tokens (Recommended)

**Create API token via CLI:**

```bash
pveum user token add <user>@<realm> <token-id> --privsep 0
```

**Environment variables:**

```bash
export PROXMOX_VE_API_TOKEN="user@realm!token-id=secret"
export PROXMOX_VE_ENDPOINT="https://192.168.3.5:8006"
```
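
A quick way to confirm a token works is a raw REST call. A minimal sketch using `requests` against the version endpoint (the `PVEAPIToken` header format is the documented token scheme; `verify=False` matches the self-signed-certificate setup used throughout this guide):

```python
#!/usr/bin/env python3
"""Verify an API token by querying the cluster version."""
import os

import requests

endpoint = os.environ["PROXMOX_VE_ENDPOINT"]   # e.g. https://192.168.3.5:8006
token = os.environ["PROXMOX_VE_API_TOKEN"]     # user@realm!token-id=secret

resp = requests.get(
    f"{endpoint}/api2/json/version",
    headers={"Authorization": f"PVEAPIToken={token}"},
    verify=False,  # self-signed cert; use a proper CA bundle in production
)
resp.raise_for_status()
print(resp.json()["data"])
```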

### Password Authentication

```bash
export PROXMOX_VE_USERNAME="root@pam"
export PROXMOX_VE_PASSWORD="password"
export PROXMOX_VE_ENDPOINT="https://192.168.3.5:8006"
```

## Python API Usage (proxmoxer)

### Installation

Install with `pip install proxmoxer requests`, or declare the dependencies with uv inline script metadata as the scripts below do:

```python
# /// script
# dependencies = ["proxmoxer", "requests"]
# ///
```

### Basic Connection

```python
#!/usr/bin/env python3
# /// script
# dependencies = ["proxmoxer", "requests"]
# ///

import os

from proxmoxer import ProxmoxAPI

# Connect using an API token. proxmoxer wants a bare hostname, so strip the
# scheme and port from the endpoint URL. PROXMOX_VE_TOKEN_NAME and
# PROXMOX_VE_TOKEN_VALUE are the token-id and secret components of the
# combined PROXMOX_VE_API_TOKEN string shown above.
proxmox = ProxmoxAPI(
    os.getenv("PROXMOX_VE_ENDPOINT").replace("https://", "").replace(":8006", ""),
    user=os.getenv("PROXMOX_VE_USERNAME"),
    token_name=os.getenv("PROXMOX_VE_TOKEN_NAME"),
    token_value=os.getenv("PROXMOX_VE_TOKEN_VALUE"),
    verify_ssl=False
)

# OR using password
proxmox = ProxmoxAPI(
    '192.168.3.5',
    user='root@pam',
    password=os.getenv("PROXMOX_VE_PASSWORD"),
    verify_ssl=False
)
```

### Common Operations

**List VMs:**

```python
# Get all VMs across cluster
for node in proxmox.nodes.get():
    node_name = node['node']
    for vm in proxmox.nodes(node_name).qemu.get():
        print(f"VM {vm['vmid']}: {vm['name']} on {node_name} - {vm['status']}")
```

**Get VM Configuration:**

```python
vmid = 101
node = "foxtrot"

vm_config = proxmox.nodes(node).qemu(vmid).config.get()
print(f"VM {vmid} config: {vm_config}")
```

**Clone Template:**

```python
import time

template_id = 9000
new_vmid = 101
node = "foxtrot"

# Clone template; the API returns the task's UPID
upid = proxmox.nodes(node).qemu(template_id).clone.post(
    newid=new_vmid,
    name="docker-01-nexus",
    full=1,  # Full clone (not linked)
    storage="local-lvm"
)

# Wait for the clone task to complete. Scanning the task list by VMID is
# racy (the task may not be listed yet); polling the returned UPID is reliable.
while True:
    status = proxmox.nodes(node).tasks(upid).status.get()
    if status['status'] == 'stopped':
        if status.get('exitstatus') != 'OK':
            raise RuntimeError(f"Clone failed: {status.get('exitstatus')}")
        break
    time.sleep(2)
```

**Update VM Configuration:**

```python
from urllib.parse import quote

# Set cloud-init parameters. SSH keys must be URL-encoded when set via the API.
proxmox.nodes(node).qemu(vmid).config.put(
    ipconfig0="ip=192.168.3.100/24,gw=192.168.3.1",
    nameserver="192.168.3.1",
    searchdomain="spaceships.work",
    sshkeys=quote("ssh-rsa AAAA...", safe="")
)
```

**Start/Stop VM:**

```python
# Start VM
proxmox.nodes(node).qemu(vmid).status.start.post()

# Stop VM (graceful)
proxmox.nodes(node).qemu(vmid).status.shutdown.post()

# Force stop
proxmox.nodes(node).qemu(vmid).status.stop.post()
```

**Delete VM:**

```python
proxmox.nodes(node).qemu(vmid).delete()
```

### Cluster Operations

**Get Cluster Status:**

```python
cluster_status = proxmox.cluster.status.get()
for node in cluster_status:
    if node['type'] == 'node':
        print(f"Node: {node['name']} - online={node['online']}")
```

**Get Node Resources:**

```python
node_status = proxmox.nodes(node).status.get()
print(f"CPU: {node_status['cpu']*100:.1f}%")
print(f"Memory: {node_status['memory']['used']/1024**3:.1f}GB / {node_status['memory']['total']/1024**3:.1f}GB")
```
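
A common follow-on is choosing a placement target. A small sketch using the `mem`/`maxmem` fields from the node list (a simple heuristic, not a Proxmox scheduler feature):

```python
# Pick the online node with the most free memory
nodes = [n for n in proxmox.nodes.get() if n.get('status') == 'online']
target = max(nodes, key=lambda n: n.get('maxmem', 0) - n.get('mem', 0))
print(f"Least loaded node: {target['node']}")
```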

### Storage Operations

**List Storage:**

```python
# Datacenter-level storage definitions (configuration only)
for storage in proxmox.storage.get():
    print(f"Storage: {storage['storage']} - Type: {storage['type']}")

# Node-level storage status includes usage and the active flag
for storage in proxmox.nodes(node).storage.get():
    print(f"{storage['storage']}: active={storage.get('active')}")
```

**Get Storage Content:**

```python
storage = "local-lvm"

# Content listings are node-scoped: /nodes/{node}/storage/{storage}/content
content = proxmox.nodes(node).storage(storage).content.get()
for item in content:
    print(f"{item['volid']} - {item.get('vmid', 'N/A')} - {item['size']/1024**3:.1f}GB")
```

## Terraform Provider Patterns

### Basic Resource (VM from Clone)

```hcl
resource "proxmox_vm_qemu" "docker_host" {
  name        = "docker-01-nexus"
  target_node = "foxtrot"
  vmid        = 101

  clone      = "ubuntu-template"
  full_clone = true

  cores   = 4
  memory  = 8192
  sockets = 1

  network {
    bridge = "vmbr0"
    model  = "virtio"
    tag    = 30 # VLAN 30
  }

  disk {
    storage = "local-lvm"
    type    = "scsi"
    size    = "50G"
  }

  ipconfig0 = "ip=192.168.3.100/24,gw=192.168.3.1"

  # file() does not expand "~", so wrap the path in pathexpand()
  sshkeys = file(pathexpand("~/.ssh/id_rsa.pub"))
}
```

### Data Sources

Data source names and availability differ between the community providers (e.g. Telmate/proxmox vs bpg/proxmox); check your provider's documentation.

```hcl
# Get template information
data "proxmox_vm_qemu" "template" {
  name        = "ubuntu-template"
  target_node = "foxtrot"
}

# Get storage information
data "proxmox_storage" "local_lvm" {
  node    = "foxtrot"
  storage = "local-lvm"
}
```

## Ansible Module Patterns

### Create VM from Template

```yaml
- name: Clone template to create VM
  community.proxmox.proxmox_kvm:
    api_host: "{{ proxmox_api_host }}"
    api_user: "{{ proxmox_api_user }}"
    api_token_id: "{{ proxmox_token_id }}"
    api_token_secret: "{{ proxmox_token_secret }}"
    node: foxtrot
    vmid: 101
    name: docker-01-nexus
    clone: ubuntu-template
    full: true
    storage: local-lvm
    net:
      net0: 'virtio,bridge=vmbr0,tag=30'
    ipconfig:
      ipconfig0: 'ip=192.168.3.100/24,gw=192.168.3.1'
    cores: 4
    memory: 8192
    agent: 1
    state: present
```

### Start VM

```yaml
- name: Start VM
  community.proxmox.proxmox_kvm:
    api_host: "{{ proxmox_api_host }}"
    api_user: "{{ proxmox_api_user }}"
    api_token_id: "{{ proxmox_token_id }}"
    api_token_secret: "{{ proxmox_token_secret }}"
    node: foxtrot
    vmid: 101
    state: started
```

## Matrix Cluster Specifics

### Node IP Addresses

```python
MATRIX_NODES = {
    "foxtrot": "192.168.3.5",
    "golf": "192.168.3.6",
    "hotel": "192.168.3.7"
}
```

### Storage Pools

```python
STORAGE_POOLS = {
    "local": "dir",          # Local directory
    "local-lvm": "lvmthin",  # LVM thin on boot disk
    "ceph-pool": "rbd"       # CEPH RBD (when configured)
}
```

### Network Bridges

```python
BRIDGES = {
    "vmbr0": "192.168.3.0/24",  # Management + VLAN 9 (Corosync)
    "vmbr1": "192.168.5.0/24",  # CEPH Public (MTU 9000)
    "vmbr2": "192.168.7.0/24"   # CEPH Private (MTU 9000)
}
```
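
A small helper tying these constants to the connection pattern from earlier (a sketch; it assumes the token environment variables defined in the authentication section):

```python
import os

from proxmoxer import ProxmoxAPI

def connect(node_name: str) -> ProxmoxAPI:
    """Connect to a specific Matrix node by name."""
    return ProxmoxAPI(
        MATRIX_NODES[node_name],
        user=os.getenv("PROXMOX_VE_USERNAME"),
        token_name=os.getenv("PROXMOX_VE_TOKEN_NAME"),
        token_value=os.getenv("PROXMOX_VE_TOKEN_VALUE"),
        verify_ssl=False,
    )

proxmox = connect("foxtrot")
```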

## Error Handling

### Python Example

```python
import sys

from proxmoxer import ProxmoxAPI, ResourceException

try:
    proxmox = ProxmoxAPI('192.168.3.5', user='root@pam', password='pass', verify_ssl=False)
    vm_config = proxmox.nodes('foxtrot').qemu(101).config.get()
except ResourceException as e:
    print(f"API Error: {e}", file=sys.stderr)
    sys.exit(1)
except Exception as e:
    print(f"Unexpected error: {e}", file=sys.stderr)
    sys.exit(1)
```

### Ansible Example

```yaml
- name: Clone VM with error handling
  community.proxmox.proxmox_kvm:
    api_host: "{{ proxmox_api_host }}"
    # ... config ...
  register: clone_result
  ignore_errors: true  # failed_when: false would mask the failure entirely

- name: Check clone result
  ansible.builtin.fail:
    msg: "Failed to clone VM: {{ clone_result.msg }}"
  when: clone_result.failed
```

## API Endpoints Reference

### Common Endpoints

```text
GET    /api2/json/nodes                                     # List nodes
GET    /api2/json/nodes/{node}/qemu                         # List VMs on node
GET    /api2/json/nodes/{node}/qemu/{vmid}/status/current   # Get VM status
POST   /api2/json/nodes/{node}/qemu/{vmid}/clone            # Clone VM
PUT    /api2/json/nodes/{node}/qemu/{vmid}/config           # Update config
POST   /api2/json/nodes/{node}/qemu/{vmid}/status/start     # Start VM
POST   /api2/json/nodes/{node}/qemu/{vmid}/status/shutdown  # Shut down VM
DELETE /api2/json/nodes/{node}/qemu/{vmid}                  # Delete VM

GET    /api2/json/cluster/status                            # Cluster status
GET    /api2/json/storage                                   # List storage
```

## Best Practices

1. **Use API tokens** - More secure than password authentication
2. **Handle SSL properly** - Use `verify_ssl=True` with a proper CA cert in production
3. **Check task completion** - Clone/migrate operations are async; poll the returned UPID for completion
4. **Error handling** - Always catch `ResourceException` and provide meaningful errors
5. **Rate limiting** - Don't hammer the API; add delays in loops
6. **Idempotency** - Check if a resource exists before creating it (see the sketch below)
7. **Use VMID ranges** - Reserve ranges for different purposes (templates: 9000-9999, VMs: 100-999)
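
A minimal existence check for the idempotency point above, using the cluster resources endpoint (a sketch, not a substitute for proper state management):

```python
def vm_exists(proxmox, vmid: int) -> bool:
    """Return True if the VMID is already in use anywhere in the cluster."""
    return any(int(r['vmid']) == vmid
               for r in proxmox.cluster.resources.get(type='vm'))

if not vm_exists(proxmox, 101):
    # safe to clone/create VM 101 here
    pass
```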

## Further Reading

- [Proxmox VE API Documentation](https://pve.proxmox.com/pve-docs/api-viewer/)
- [proxmoxer GitHub](https://github.com/proxmoxer/proxmoxer)
- [community.proxmox Collection](https://docs.ansible.com/ansible/latest/collections/community/proxmox/)

skills/proxmox-infrastructure/reference/cloud-init-patterns.md

# Cloud-Init Patterns for Proxmox VE

*Source: <https://pve.proxmox.com/wiki/Cloud-Init_Support>*

## Overview

Cloud-Init is the de facto multi-distribution package that handles early initialization of virtual machines. When a VM starts for the first time, Cloud-Init applies the network and SSH key settings configured on the hypervisor.

## Template Creation Workflow

### Download and Import Cloud Image

```bash
# Download Ubuntu cloud image
wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img

# Create VM with VirtIO SCSI controller
qm create 9000 --memory 2048 --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci

# Import disk to storage
qm set 9000 --scsi0 local-lvm:0,import-from=/path/to/bionic-server-cloudimg-amd64.img
```

**Important**: Ubuntu Cloud-Init images require the `virtio-scsi-pci` controller type for SCSI drives.

### Configure Cloud-Init Components

```bash
# Add Cloud-Init CD-ROM drive
qm set 9000 --ide2 local-lvm:cloudinit

# Set boot order (speeds up boot)
qm set 9000 --boot order=scsi0

# Configure serial console (required for many cloud images)
qm set 9000 --serial0 socket --vga serial0

# Convert to template
qm template 9000
```

## Deploying from Templates

### Clone Template

```bash
# Clone template to new VM
qm clone 9000 123 --name ubuntu2
```

### Configure VM

```bash
# Set SSH public key (the option is plural: --sshkeys)
qm set 123 --sshkeys ~/.ssh/id_rsa.pub

# Configure network
qm set 123 --ipconfig0 ip=10.0.10.123/24,gw=10.0.10.1
```
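
The same deployment can be scripted with proxmoxer. A minimal sketch (connection details as in the API reference; note that the API expects the SSH key material URL-encoded):

```python
from urllib.parse import quote

from proxmoxer import ProxmoxAPI

proxmox = ProxmoxAPI('192.168.3.5', user='root@pam', password='...', verify_ssl=False)

with open('/root/.ssh/id_rsa.pub') as f:
    pubkey = f.read().strip()

proxmox.nodes('foxtrot').qemu(123).config.put(
    ipconfig0='ip=10.0.10.123/24,gw=10.0.10.1',
    sshkeys=quote(pubkey, safe=''),
)
```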

## Custom Cloud-Init Configuration

### Using Custom Config Files

Proxmox allows custom cloud-init configurations via the `cicustom` option:

```bash
qm set 9000 --cicustom "user=<volume>,network=<volume>,meta=<volume>"
```

Example using local snippets storage:

```bash
qm set 9000 --cicustom "user=local:snippets/userconfig.yaml"
```

### Dump Generated Config

Use the generated config as a base for custom configurations:

```bash
qm cloudinit dump 9000 user
qm cloudinit dump 9000 network
qm cloudinit dump 9000 meta
```

## Cloud-Init Options Reference

### cicustom

Specify custom files to replace the automatically generated ones:

- `meta=<volume>` - Meta data (provider specific)
- `network=<volume>` - Network data
- `user=<volume>` - User data
- `vendor=<volume>` - Vendor data

### cipassword

Password for the user. **Not recommended** - use SSH keys instead.

### citype

Configuration format: `configdrive2 | nocloud | opennebula`

- Default: `nocloud` for Linux, `configdrive2` for Windows

### ciupgrade

Automatic package upgrade after first boot (default: `true`)

### ciuser

Username to configure (instead of the image's default user)

### ipconfig[n]

IP addresses and gateways for network interfaces.

Format: `[gw=<GatewayIPv4>] [,gw6=<GatewayIPv6>] [,ip=<IPv4Format/CIDR>] [,ip6=<IPv6Format/CIDR>]`

Special values:

- `ip=dhcp` - Use DHCP for IPv4
- `ip6=auto` - Use stateless autoconfiguration (requires cloud-init 19.4+)

### sshkeys

Public SSH keys (one per line, OpenSSH format)

### nameserver

DNS server IP address

### searchdomain

DNS search domains

## Best Practices

1. **Use SSH keys** instead of passwords for authentication
2. **Configure serial console** for cloud images (many require it)
3. **Set boot order** to speed up the boot process
4. **Convert to template** for fast linked clone deployment
5. **Store custom configs in snippets** storage (must be available on all nodes for migration)
6. **Test with a clone** before modifying a template

## Troubleshooting

### Template Won't Boot

- Check if serial console is configured: `qm set <vmid> --serial0 socket --vga serial0`
- Verify boot order: `qm set <vmid> --boot order=scsi0`

### Network Not Configured

- Ensure cloud-init CD-ROM is attached: `qm set <vmid> --ide2 local-lvm:cloudinit`
- Check IP configuration: `qm config <vmid> | grep ipconfig`

### SSH Keys Not Working

- Verify sshkeys format (OpenSSH format, one per line)
- Check cloud-init logs in VM: `cat /var/log/cloud-init.log`

skills/proxmox-infrastructure/reference/networking.md

# Proxmox Network Configuration

*Source: <https://pve.proxmox.com/wiki/Network_Configuration>*

## Key Concepts

### Configuration File

All network configuration lives in `/etc/network/interfaces`. GUI changes are written to `/etc/network/interfaces.new` for safety.

### Applying Changes

**ifupdown2 (recommended):**

```bash
# Apply from the GUI or run:
ifreload -a
```

**Reboot method:**
The `pvenetcommit` service activates the staged file before the `networking` service applies it.

## Naming Conventions

### Current (Proxmox VE 5.0+)

- Ethernet: `en*` (systemd predictable names)
  - `eno1` - first on-board NIC
  - `enp3s0f1` - function 1 of the NIC on PCI bus 3, slot 0
- Bridges: `vmbr[0-4094]`
- Bonds: `bond[N]`
- VLANs: add the VLAN number after a period: `eno1.50`, `bond1.30`

### Legacy (pre-5.0)

- Ethernet: `eth[N]` (eth0, eth1, ...)

### Pinning Naming Scheme Version

Add to the kernel command line to prevent name changes:

```bash
net.naming-scheme=v252
```

### Overriding Device Names

**Automatic tool:**

```bash
# Generate .link files for all interfaces
pve-network-interface-pinning generate

# With custom prefix
pve-network-interface-pinning generate --prefix myprefix

# Pin specific interface
pve-network-interface-pinning generate --interface enp1s0 --target-name if42
```

**Manual method** (`/etc/systemd/network/10-enwan0.link`):

```ini
[Match]
MACAddress=aa:bb:cc:dd:ee:ff
Type=ether

[Link]
Name=enwan0
```

After creating link files:

```bash
update-initramfs -u -k all
# Then reboot
```

## Network Setups

### Default Bridged Configuration

```bash
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.10.2/24
    gateway 192.168.10.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
```

VMs behave as if directly connected to the physical network.

### Routed Configuration

For hosting providers that block multiple MACs:

```bash
auto lo
iface lo inet loopback

auto eno0
iface eno0 inet static
    address 198.51.100.5/29
    gateway 198.51.100.1
    post-up echo 1 > /proc/sys/net/ipv4/ip_forward
    post-up echo 1 > /proc/sys/net/ipv4/conf/eno0/proxy_arp

auto vmbr0
iface vmbr0 inet static
    address 203.0.113.17/28
    bridge-ports none
    bridge-stp off
    bridge-fd 0
```

### Masquerading (NAT)

For VMs with private IPs:

```bash
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet static
    address 198.51.100.5/24
    gateway 198.51.100.1

auto vmbr0
iface vmbr0 inet static
    address 10.10.10.1/24
    bridge-ports none
    bridge-stp off
    bridge-fd 0
    post-up echo 1 > /proc/sys/net/ipv4/ip_forward
    post-up iptables -t nat -A POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
    post-down iptables -t nat -D POSTROUTING -s '10.10.10.0/24' -o eno1 -j MASQUERADE
```

**Conntrack zones fix** (if the firewall blocks outgoing traffic):

```bash
post-up iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1
```

## Linux Bonding

### Bond Modes

1. **balance-rr** - Round-robin (load balancing + fault tolerance)
2. **active-backup** - Only one active NIC (fault tolerance only)
3. **balance-xor** - XOR selection (load balancing + fault tolerance)
4. **broadcast** - Transmit on all slaves (fault tolerance)
5. **802.3ad (LACP)** - IEEE 802.3ad Dynamic link aggregation (requires switch support)
6. **balance-tlb** - Adaptive transmit load balancing
7. **balance-alb** - Adaptive load balancing (balance-tlb + receive balancing)

**Recommendation:**

- If switch supports LACP → use 802.3ad
- Otherwise → use active-backup

### Bond with Fixed IP

```bash
auto lo
iface lo inet loopback

iface eno1 inet manual
iface eno2 inet manual

auto bond0
iface bond0 inet static
    bond-slaves eno1 eno2
    address 192.168.1.2/24
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
    address 10.10.10.2/24
    gateway 10.10.10.1
    bridge-ports eno3
    bridge-stp off
    bridge-fd 0
```

### Bond as Bridge Port

For a fault-tolerant guest network:

```bash
auto lo
iface lo inet loopback

iface eno1 inet manual
iface eno2 inet manual

auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
    address 10.10.10.2/24
    gateway 10.10.10.1
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
```

## VLAN Configuration (802.1Q)

### VLAN Awareness on Bridge

**Guest VLANs** - Configure the VLAN tag in the VM settings; the bridge handles tagging transparently.

**Bridge with VLAN awareness:**

```bash
auto vmbr0
iface vmbr0 inet manual
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
```

### Host Management on VLAN

**With VLAN-aware bridge:**

```bash
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0.5
iface vmbr0.5 inet static
    address 10.10.10.2/24
    gateway 10.10.10.1

auto vmbr0
iface vmbr0 inet manual
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
```

**Traditional VLAN:**

```bash
auto lo
iface lo inet loopback

iface eno1 inet manual
iface eno1.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
    address 10.10.10.2/24
    gateway 10.10.10.1
    bridge-ports eno1.5
    bridge-stp off
    bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
```

### VLAN with Bonding

```bash
auto lo
iface lo inet loopback

iface eno1 inet manual
iface eno2 inet manual

auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3

iface bond0.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
    address 10.10.10.2/24
    gateway 10.10.10.1
    bridge-ports bond0.5
    bridge-stp off
    bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
```

## Advanced Features

### Disable MAC Learning

Available since Proxmox VE 7.3:

```bash
auto vmbr0
iface vmbr0 inet static
    address 10.10.10.2/24
    gateway 10.10.10.1
    bridge-ports ens18
    bridge-stp off
    bridge-fd 0
    bridge-disable-mac-learning 1
```

Proxmox VE manually adds VM/CT MAC addresses to the forwarding database.

### Disable IPv6

Create `/etc/sysctl.d/disable-ipv6.conf`:

```ini
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
```

Then: `sysctl -p /etc/sysctl.d/disable-ipv6.conf`

## Troubleshooting

### Avoid ifup/ifdown

**Don't use** `ifup`/`ifdown` on bridges: they interrupt guest traffic without reconnecting it.

**Use instead:**

- The GUI "Apply Configuration" button
- `ifreload -a`
- A reboot

### Network Changes Not Applied

1. Check that `/etc/network/interfaces.new` exists
2. Click "Apply Configuration" in the GUI or run `ifreload -a`
3. If issues persist, reboot

### Bond Not Working with Corosync

Some bond modes are problematic for Corosync. If the cluster link must be bonded, active-backup is the safest mode; otherwise prefer multiple Corosync links over bonding for cluster traffic.

skills/proxmox-infrastructure/reference/qemu-guest-agent.md

# QEMU Guest Agent Integration

## Overview

The QEMU Guest Agent (`qemu-guest-agent`) is a service running inside VMs that enables communication between Proxmox and the guest OS. It provides IP address detection, graceful shutdowns, filesystem freezing for snapshots, and more.

## Why Use QEMU Guest Agent?

**Without Guest Agent:**

- VM IP address unknown to Proxmox
- Shutdown = hard power off
- Snapshots don't freeze the filesystem (risk of corruption)
- No guest-level monitoring

**With Guest Agent:**

- Automatic IP address detection
- Graceful shutdown/reboot
- Consistent snapshots with filesystem freeze
- Execute commands inside the VM
- Query guest information (hostname, users, OS details)

## Installation in Guest VM

### Ubuntu/Debian

```bash
sudo apt update
sudo apt install qemu-guest-agent
sudo systemctl enable qemu-guest-agent
sudo systemctl start qemu-guest-agent
```

### RHEL/Rocky/AlmaLinux

```bash
sudo dnf install qemu-guest-agent
sudo systemctl enable qemu-guest-agent
sudo systemctl start qemu-guest-agent
```

### Verify Installation

```bash
systemctl status qemu-guest-agent
```

**Expected output:**

```text
● qemu-guest-agent.service - QEMU Guest Agent
   Loaded: loaded (/lib/systemd/system/qemu-guest-agent.service; enabled)
   Active: active (running)
```

## Enable in VM Configuration

### Via Proxmox Web UI

**VM → Hardware → Add → QEMU Guest Agent**

OR edit VM options:

**VM → Options → QEMU Guest Agent → Edit → Check "Use QEMU Guest Agent"**

### Via CLI

```bash
qm set <vmid> --agent 1
```

**With custom options:**

```bash
# Enable the agent and run fstrim after cloning a disk
qm set <vmid> --agent enabled=1,fstrim_cloned_disks=1
```

### Via Terraform

```hcl
resource "proxmox_vm_qemu" "vm" {
  name = "my-vm"
  # ... other config ...

  agent = 1 # Enable guest agent
}
```

### Via Ansible

```yaml
- name: Enable QEMU guest agent
  community.proxmox.proxmox_kvm:
    api_host: "{{ proxmox_api_host }}"
    api_user: "{{ proxmox_api_user }}"
    api_token_id: "{{ proxmox_token_id }}"
    api_token_secret: "{{ proxmox_token_secret }}"
    node: foxtrot
    vmid: 101
    agent: 1
    update: true
```

## Using Guest Agent

### Check Agent Status

**Via CLI:**

```bash
# Test if agent is responding
qm agent 101 ping

# Get guest info
qm agent 101 info

# Get network interfaces
qm agent 101 network-get-interfaces

# Get OS information
qm agent 101 get-osinfo
```

**Example output:**

```json
{
  "result": {
    "id": "ubuntu",
    "kernel-release": "5.15.0-91-generic",
    "kernel-version": "#101-Ubuntu SMP",
    "machine": "x86_64",
    "name": "Ubuntu",
    "pretty-name": "Ubuntu 22.04.3 LTS",
    "version": "22.04",
    "version-id": "22.04"
  }
}
```

### Execute Commands

**Via CLI:**

```bash
# Execute command in guest
qm guest exec 101 -- whoami

# With arguments
qm guest exec 101 -- ls -la /tmp
```

**Via Python API:**

```python
from proxmoxer import ProxmoxAPI

proxmox = ProxmoxAPI('192.168.3.5', user='root@pam', password='pass')

# Execute command
result = proxmox.nodes('foxtrot').qemu(101).agent.exec.post(
    command=['whoami']
)

# Get execution result
pid = result['pid']
exec_status = proxmox.nodes('foxtrot').qemu(101).agent('exec-status').get(pid=pid)
print(exec_status)
```
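
`exec` runs asynchronously, so the status may not be final on the first query. A small polling sketch (field names follow the agent exec-status response: `exited`, `exitcode`, `out-data`):

```python
import time

# Poll until the guest reports the process has exited
while True:
    exec_status = proxmox.nodes('foxtrot').qemu(101).agent('exec-status').get(pid=pid)
    if exec_status.get('exited'):
        break
    time.sleep(1)

print(f"exit code: {exec_status.get('exitcode')}")
print(exec_status.get('out-data', ''))
```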

### Graceful Shutdown/Reboot

**Shutdown (graceful with agent):**

```bash
# Sends ACPI shutdown to the guest and waits for the agent to shut down the OS
qm shutdown 101

# Force shutdown if it doesn't complete in 60s
qm shutdown 101 --timeout 60 --forceStop 1
```

**Reboot:**

```bash
qm reboot 101
```
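
The equivalents via proxmoxer (a sketch; `timeout` and `forceStop` are the documented shutdown parameters):

```python
# Graceful shutdown with a 60 s deadline, then hard stop
proxmox.nodes('foxtrot').qemu(101).status.shutdown.post(timeout=60, forceStop=1)

# Reboot
proxmox.nodes('foxtrot').qemu(101).status.reboot.post()
```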

## Snapshot Integration

### Filesystem Freeze for Consistent Snapshots

When the guest agent is enabled, Proxmox can freeze the filesystem before taking a snapshot, ensuring consistency.

**Create snapshot with FS freeze:**

```bash
# Guest agent automatically freezes the filesystem
qm snapshot 101 before-upgrade --vmstate 0 --description "Before upgrade"
```

**Rollback to snapshot:**

```bash
qm rollback 101 before-upgrade
```

**Delete snapshot:**

```bash
qm delsnapshot 101 before-upgrade
```

## IP Address Detection

### Automatic IP Assignment

With the guest agent, Proxmox automatically detects VM IP addresses.

**View in Web UI:**

VM → Summary → the IPs section shows detected IPs

**Via CLI:**

```bash
qm agent 101 network-get-interfaces | jq '.result[] | select(.name=="eth0") | ."ip-addresses"'
```

**Via Python:**

```python
interfaces = proxmox.nodes('foxtrot').qemu(101).agent('network-get-interfaces').get()

for iface in interfaces['result']:
    if iface['name'] == 'eth0':
        for ip in iface.get('ip-addresses', []):
            if ip['ip-address-type'] == 'ipv4':
                print(f"IPv4: {ip['ip-address']}")
```

## Advanced Configuration

### Guest Agent Options

**Full options syntax:**

```bash
qm set <vmid> --agent [enabled=]<1|0>[,freeze-fs-on-backup=<1|0>][,fstrim_cloned_disks=<1|0>][,type=<virtio|isa>]
```

**Parameters:**

- `enabled` - Enable/disable guest agent (default: 1)
- `freeze-fs-on-backup` - Freeze guest filesystems during backup (default: 1)
- `fstrim_cloned_disks` - Run fstrim after cloning a disk (default: 0)
- `type` - Agent communication type: virtio or isa (default: virtio)

**Example:**

```bash
# Enable with fstrim on cloned disks
qm set 101 --agent enabled=1,fstrim_cloned_disks=1
```

### Filesystem Trim (fstrim)

For VMs on thin-provisioned storage (LVM-thin, CEPH), fstrim helps reclaim unused space.

**Manual fstrim:**

```bash
# Inside VM
sudo fstrim -av
```

**Automatic on clone:**

```bash
qm set <vmid> --agent enabled=1,fstrim_cloned_disks=1
```

**Scheduled fstrim (inside VM):**

```bash
# Enable weekly fstrim timer
sudo systemctl enable fstrim.timer
sudo systemctl start fstrim.timer
```

## Cloud-Init Integration

### Include in Cloud-Init Template

**During template creation:**

```bash
# Install agent package
virt-customize -a ubuntu-22.04.img \
  --install qemu-guest-agent \
  --run-command "systemctl enable qemu-guest-agent"

# Create VM from image
qm create 9000 --name ubuntu-template --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
qm importdisk 9000 ubuntu-22.04.img local-lvm
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
qm set 9000 --agent 1 # Enable guest agent
qm set 9000 --ide2 local-lvm:cloudinit
qm template 9000
```

### Cloud-Init User Data

**Include in cloud-init config:**

```yaml
#cloud-config
packages:
  - qemu-guest-agent

runcmd:
  - systemctl enable qemu-guest-agent
  - systemctl start qemu-guest-agent
```

## Troubleshooting

### Guest Agent Not Responding

**1. Check if service is running in guest:**

```bash
# Inside VM
systemctl status qemu-guest-agent
journalctl -u qemu-guest-agent
```

**2. Check if agent is enabled in VM config:**

```bash
# On Proxmox host
qm config 101 | grep agent
```

**3. Check virtio serial device:**

```bash
# Inside VM
ls -l /dev/virtio-ports/
# Should show: org.qemu.guest_agent.0
```

**4. Restart agent:**

```bash
# Inside VM
sudo systemctl restart qemu-guest-agent
```

**5. Check Proxmox can communicate:**

```bash
# On Proxmox host
qm agent 101 ping
```

### IP Address Not Detected

**Possible causes:**

1. Guest agent not running
2. Network interface not configured
3. DHCP not assigning IP
4. Firewall blocking communication

**Debug:**

```bash
# Check all interfaces
qm agent 101 network-get-interfaces | jq

# Verify cloud-init completed
# Inside VM
cloud-init status
```

### Filesystem Freeze Timeout

**Symptoms:**

Snapshot creation hangs or times out.

**Solution:**

```bash
# Skip the automatic filesystem freeze during backups
qm set 101 --agent enabled=1,freeze-fs-on-backup=0

# For snapshots, the freeze itself is what times out: check for hung I/O
# or a stuck qemu-guest-agent inside the guest, then retry
qm snapshot 101 test --vmstate 0
```

### Agent Installed but Not Enabled

**Check VM config:**

```bash
qm config 101 | grep agent
```

**If missing, enable:**

```bash
qm set 101 --agent 1
```

**Power-cycle the VM so the new virtio device appears (a guest-initiated reboot is not enough):**

```bash
qm shutdown 101 && qm start 101
```

## Best Practices

1. **Always install in templates** - Include qemu-guest-agent in VM templates
2. **Enable during provisioning** - Set `--agent 1` when creating VMs
3. **Use for production VMs** - Critical for graceful shutdowns and monitoring
4. **Enable fstrim for thin storage** - Helps reclaim space on LVM-thin and CEPH
5. **Test before snapshots** - Verify the agent works: `qm agent <vmid> ping`
6. **Cloud-init integration** - Automate installation via cloud-init packages
7. **Monitor agent status** - Check the agent is running in monitoring tools

## Ansible Automation Example

```yaml
---
- name: Ensure QEMU guest agent is configured
  hosts: proxmox_vms
  become: true
  tasks:
    - name: Install qemu-guest-agent
      ansible.builtin.apt:
        name: qemu-guest-agent
        state: present
      when: ansible_os_family == "Debian"

    - name: Enable and start qemu-guest-agent
      ansible.builtin.systemd:
        name: qemu-guest-agent
        enabled: true
        state: started

    - name: Verify agent is running
      ansible.builtin.systemd:
        name: qemu-guest-agent
      register: agent_status

    - name: Report agent status
      ansible.builtin.debug:
        msg: "Guest agent is {{ agent_status.status.ActiveState }}"
```

## Further Reading

- [Proxmox QEMU Guest Agent Documentation](https://pve.proxmox.com/wiki/Qemu-guest-agent)
- [QEMU Guest Agent Protocol](https://www.qemu.org/docs/master/interop/qemu-ga.html)

skills/proxmox-infrastructure/reference/storage-management.md

# Proxmox Storage Management

## Overview

Proxmox VE supports multiple storage backends. This guide focuses on the storage architecture of the Matrix cluster: LVM-thin for boot disks and CEPH for distributed storage.

## Matrix Cluster Storage Architecture

### Hardware Configuration

**Per Node (Foxtrot, Golf, Hotel):**

```text
nvme0n1 - 1TB Crucial P3       → Boot disk + LVM
nvme1n1 - 4TB Samsung 990 PRO  → CEPH OSD (2 OSDs)
nvme2n1 - 4TB Samsung 990 PRO  → CEPH OSD (2 OSDs)
```

**Total Cluster:**

- 3× 1TB boot disks (LVM local storage)
- 6× 4TB NVMe drives (24TB raw CEPH capacity)
- 12 CEPH OSDs total (2 per NVMe drive)

### Storage Pools

```text
Storage Pool   Type      Backend     Purpose
------------   -------   ---------   -------
local          dir       Directory   ISO images, templates, backups
local-lvm      lvmthin   LVM-thin    VM disks (local)
ceph-pool      rbd       CEPH RBD    VM disks (distributed, HA)
ceph-fs        cephfs    CephFS      Shared filesystem
```

## LVM Storage

### LVM-thin Configuration

**Advantages:**

- Thin provisioning (overcommit storage)
- Fast snapshots
- Local to each node (low latency)
- No network overhead

**Disadvantages:**

- No HA (tied to a single node)
- No live migration with storage
- Limited to the node's local disk size

**Check LVM usage:**

```bash
# View volume groups
vgs

# View logical volumes
lvs

# View thin pool usage
lvs -a | grep thin
```

**Example output:**

```text
LV             VG   Attr        LSize    Pool  Origin  Data%
data           pve  twi-aotz--  850.00g                45.23
vm-101-disk-0  pve  Vwi-aotz--   50.00g  data          12.45
```

### Managing LVM Storage

**Extend thin pool (if the boot disk has space):**

```bash
# Check free space in VG
vgs pve

# Extend thin pool
lvextend -L +100G pve/data
```

**Create VM disk manually:**

```bash
# Create 50GB disk for VM 101
lvcreate -V 50G -T pve/data -n vm-101-disk-0
```

## CEPH Storage

### CEPH Architecture for Matrix

**Network Configuration:**

```text
vmbr1 (192.168.5.0/24, MTU 9000) → CEPH Public Network
vmbr2 (192.168.7.0/24, MTU 9000) → CEPH Private Network
```

**OSD Distribution:**

```text
Node     NVMe     OSDs  Capacity
-------  -------  ----  --------
foxtrot  nvme1n1  2     4TB
foxtrot  nvme2n1  2     4TB
golf     nvme1n1  2     4TB
golf     nvme2n1  2     4TB
hotel    nvme1n1  2     4TB
hotel    nvme2n1  2     4TB
-------  -------  ----  --------
Total             12    24TB raw
```

**Usable capacity (replica 3):** ~8TB

### CEPH Deployment Commands

**Install CEPH:**

```bash
# On first node (foxtrot)
pveceph install --version reef

# Initialize cluster
pveceph init --network 192.168.5.0/24 --cluster-network 192.168.7.0/24
```

**Create Monitors (3 for quorum):**

```bash
# On each node
pveceph mon create
```

**Create Manager (on each node):**

```bash
pveceph mgr create
```

**Create OSDs:**

```bash
# On each node. pveceph creates one OSD per device; splitting a drive into
# two OSDs (as in the layout above) needs ceph-volume instead, e.g.:
#   ceph-volume lvm batch --osds-per-device 2 /dev/nvme1n1

# For nvme1n1 (4TB)
pveceph osd create /dev/nvme1n1 --crush-device-class nvme

# For nvme2n1 (4TB)
pveceph osd create /dev/nvme2n1 --crush-device-class nvme
```

**Create CEPH Pool:**

```bash
# Create RBD pool for VMs
pveceph pool create ceph-pool --add_storages

# Create CephFS for shared storage
pveceph fs create --name cephfs --add-storage
```

### CEPH Configuration Best Practices

**Optimize for NVMe** (`/etc/pve/ceph.conf`):

```ini
[global]
public_network = 192.168.5.0/24
cluster_network = 192.168.7.0/24
osd_pool_default_size = 3
osd_pool_default_min_size = 2

[osd]
osd_memory_target = 4294967296  # 4GB per OSD
osd_max_backfills = 1
osd_recovery_max_active = 1
```

**Restart CEPH services after a config change:**

```bash
# Quote the glob so the shell doesn't expand it
systemctl restart 'ceph-osd@*.service'
```

### CEPH Monitoring

**Check cluster health:**

```bash
ceph status
ceph health detail
```

**Example healthy output:**

```text
cluster:
  id:     a1b2c3d4-e5f6-7890-abcd-ef1234567890
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum foxtrot,golf,hotel
  mgr: foxtrot(active), standbys: golf, hotel
  osd: 12 osds: 12 up, 12 in

data:
  pools:   2 pools, 128 pgs
  objects: 1.23k objects, 45 GiB
  usage:   135 GiB used, 23.8 TiB / 24 TiB avail
  pgs:     128 active+clean
```

**Check OSD performance:**

```bash
ceph osd df
ceph osd perf
```

**Check pool usage:**

```bash
ceph df
rados df
```

## Storage Configuration in Proxmox

### Add Storage via Web UI

**Datacenter → Storage → Add:**

1. **Directory** - For ISOs and backups
2. **LVM-Thin** - For local VM disks
3. **RBD** - For CEPH VM disks
4. **CephFS** - For shared files

### Add Storage via CLI

**CEPH RBD:**

```bash
pvesm add rbd ceph-pool \
  --pool ceph-pool \
  --content images,rootdir \
  --nodes foxtrot,golf,hotel
```

**CephFS:**

```bash
pvesm add cephfs cephfs \
  --path /mnt/pve/cephfs \
  --content backup,iso,vztmpl \
  --nodes foxtrot,golf,hotel
```

**NFS (if using an external NAS):**

```bash
pvesm add nfs nas-storage \
  --server 192.168.3.10 \
  --export /mnt/tank/proxmox \
  --content images,backup,iso \
  --nodes foxtrot,golf,hotel
```

## VM Disk Management

### Create VM Disk on CEPH

**Via CLI:**

```bash
# Create 100GB disk for VM 101 on CEPH
qm set 101 --scsi1 ceph-pool:100
```

**Via API (Python):**

```python
from proxmoxer import ProxmoxAPI

proxmox = ProxmoxAPI('192.168.3.5', user='root@pam', password='pass')
proxmox.nodes('foxtrot').qemu(101).config.put(scsi1='ceph-pool:100')
```

### Move VM Disk Between Storage

**Move from local-lvm to CEPH:**

```bash
qm move-disk 101 scsi0 ceph-pool --delete 1
```

The same command works while the VM is running: Proxmox mirrors the disk live (QEMU drive-mirror) and switches over once the copy is complete.
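
Via the API, the corresponding endpoint is `move_disk` (a proxmoxer sketch reusing the connection above; it returns a task UPID):

```python
upid = proxmox.nodes('foxtrot').qemu(101).move_disk.post(
    disk='scsi0',
    storage='ceph-pool',
    delete=1,  # remove the source volume after a successful move
)
print(f"Move task: {upid}")
```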

### Resize VM Disk

**Grow disk (can't shrink):**

```bash
# Grow VM 101's scsi0 by 50GB
qm resize 101 scsi0 +50G
```

**Inside VM (expand filesystem):**

```bash
# If the disk is partitioned, grow the partition first
# (growpart is in the cloud-guest-utils package)
sudo growpart /dev/sda 1

# For ext4
sudo resize2fs /dev/sda1

# For XFS
sudo xfs_growfs /
```

## Backup and Restore

### Backup to Storage

**Create backup:**

```bash
# Backup VM 101 to local storage
vzdump 101 --storage local --mode snapshot --compress zstd

# Backup to CephFS
vzdump 101 --storage cephfs --mode snapshot --compress zstd
```
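
Backups can also be triggered through the API. A proxmoxer sketch (POST to the node's `vzdump` endpoint; it returns a task UPID that can be polled as shown in the API reference):

```python
upid = proxmox.nodes('foxtrot').vzdump.post(
    vmid=101,
    storage='local',
    mode='snapshot',
    compress='zstd',
)
print(f"Backup task: {upid}")
```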

**Scheduled backups (via Web UI):**

Datacenter → Backup → Add:

- Schedule: Daily at 2 AM
- Storage: cephfs
- Mode: Snapshot
- Compression: ZSTD
- Retention: Keep last 7

### Restore from Backup

**List backups:**

```bash
ls /var/lib/vz/dump/
# OR
ls /mnt/pve/cephfs/dump/
```

**Restore:**

```bash
# Restore to same VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-2024_01_15-02_00_00.vma.zst 101

# Restore to new VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-2024_01_15-02_00_00.vma.zst 102 --storage ceph-pool
```

## Performance Tuning

### CEPH Performance

**For NVMe OSDs:**

```bash
# Set the proper device class (if a class is already assigned,
# clear it first with: ceph osd crush rm-device-class osd.N)
ceph osd crush set-device-class nvme osd.0
ceph osd crush set-device-class nvme osd.1
# ... repeat for all OSDs
```

**Create performance pool:**

```bash
ceph osd pool create fast-pool 128 128
ceph osd pool application enable fast-pool rbd
```

**Enable RBD cache** (`/etc/pve/ceph.conf`):

```ini
[client]
rbd_cache = true
rbd_cache_size = 134217728  # 128MB
# false favors performance over safety; keep the default (true) if unsure
rbd_cache_writethrough_until_flush = false
```

### LVM Performance

**Use SSD discard:**

```bash
# Enable discard on VM disk
qm set 101 --scsi0 local-lvm:vm-101-disk-0,discard=on,ssd=1
```

## Troubleshooting

### CEPH Not Healthy

**Check OSD status:**

```bash
ceph osd tree
ceph osd stat
```

**Restart stuck OSD:**

```bash
systemctl restart ceph-osd@0.service
```

**Check network connectivity:**

```bash
# From one node to another
ping -c 3 -M do -s 8972 192.168.5.6  # Test MTU 9000 (8972 + 28 header bytes)
```

### LVM Out of Space

**Check thin pool usage:**

```bash
lvs pve/data -o lv_name,data_percent,metadata_percent
```

**If thin pool > 90% full:**

```bash
# Extend if VG has space
lvextend -L +100G pve/data

# OR delete unused VM disks
lvremove pve/vm-XXX-disk-0
```

### Storage Performance Issues

**Test disk I/O:**

```bash
# Test sequential write
dd if=/dev/zero of=/tmp/test bs=1M count=1024 oflag=direct

# Test CEPH RBD performance
rbd bench --io-type write ceph-pool/test-image
```

**Monitor CEPH latency:**

```bash
ceph osd perf
```

## Best Practices

1. **Use CEPH for HA VMs** - Store critical VM disks on CEPH for live migration
2. **Use LVM for performance** - Non-critical VMs get better performance on local LVM
3. **MTU 9000 for CEPH** - Always use jumbo frames on CEPH networks
4. **Separate networks** - Public and private CEPH networks on different interfaces
5. **Monitor CEPH health** - Set up alerts for HEALTH_WARN/HEALTH_ERR
6. **Regular backups** - Automated daily backups to CephFS or an external NAS
7. **Plan for growth** - Leave 20% free space in CEPH for rebalancing
8. **Use replica 3** - Essential for data safety, especially with only 3 nodes

## Further Reading

- [Proxmox VE Storage Documentation](https://pve.proxmox.com/wiki/Storage)
- [CEPH Documentation](https://docs.ceph.com/)
- [Proxmox CEPH Guide](https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster)