# Proxmox Storage Management

## Overview

Proxmox VE supports multiple storage backends. This guide focuses on the storage architecture of the Matrix cluster: LVM-thin for boot disks and CEPH for distributed storage.

## Matrix Cluster Storage Architecture

### Hardware Configuration

**Per Node (Foxtrot, Golf, Hotel):**

```text
nvme0n1 - 1TB Crucial P3      → Boot disk + LVM
nvme1n1 - 4TB Samsung 990 PRO → CEPH OSD (2 OSDs)
nvme2n1 - 4TB Samsung 990 PRO → CEPH OSD (2 OSDs)
```

**Total Cluster:**

- 3× 1TB boot disks (LVM local storage)
- 6× 4TB NVMe drives (24TB raw CEPH capacity)
- 12 CEPH OSDs total (2 per NVMe drive)

### Storage Pools

```text
Storage Pool  Type     Backend    Purpose
------------  -------  ---------  ------------------------------
local         dir      Directory  ISO images, templates, backups
local-lvm     lvmthin  LVM-thin   VM disks (local)
ceph-pool     rbd      CEPH RBD   VM disks (distributed, HA)
ceph-fs       cephfs   CephFS     Shared filesystem
```
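
Once the pools are defined, `pvesm status` gives a quick sanity check that each one is active and sized as expected (the storage IDs above follow this cluster's naming; adjust to your own):

```bash
# List all configured storage pools with type, status, and usage
pvesm status

# The underlying definitions live in the cluster-wide config
cat /etc/pve/storage.cfg
```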

## LVM Storage

### LVM-thin Configuration

**Advantages:**

- Thin provisioning (overcommit storage)
- Fast snapshots
- Local to each node (low latency)
- No network overhead

**Disadvantages:**

- No HA (tied to single node)
- No live migration with storage
- Limited to node's local disk size

**Check LVM usage:**

```bash
# View volume groups
vgs

# View logical volumes
lvs

# View thin pool usage
lvs -a | grep thin
```

**Example output:**

```text
LV             VG   Attr        LSize    Pool  Origin  Data%
data           pve  twi-aotz--  850.00g                45.23
vm-101-disk-0  pve  Vwi-aotz--   50.00g  data          12.45
```

### Managing LVM Storage

**Extend thin pool (if boot disk has space):**

```bash
# Check free space in VG
vgs pve

# Extend thin pool
lvextend -L +100G pve/data
```
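
When a thin pool is extended repeatedly, its metadata volume can fill up before the data volume does. A quick check, and an extension if needed (sizes here are illustrative):

```bash
# Data% and Meta% for the thin pool
lvs pve/data -o lv_name,data_percent,metadata_percent

# Grow the pool's metadata LV if Meta% is running high
lvextend --poolmetadatasize +1G pve/data
```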

**Create VM disk manually:**

```bash
# Create 50GB disk for VM 101
lvcreate -V 50G -T pve/data -n vm-101-disk-0
```
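
Creating the logical volume only allocates the space; the disk still has to be referenced in the VM's configuration. A minimal follow-up, assuming the `local-lvm` storage ID from the pool table above and a free `scsi1` slot:

```bash
# Attach the existing volume to VM 101 as scsi1
qm set 101 --scsi1 local-lvm:vm-101-disk-0

# Confirm it appears in the VM config
qm config 101 | grep scsi1
```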

## CEPH Storage

### CEPH Architecture for Matrix

**Network Configuration:**

```text
vmbr1 (192.168.5.0/24, MTU 9000) → CEPH Public Network
vmbr2 (192.168.7.0/24, MTU 9000) → CEPH Private Network
```
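
Since CEPH throughput depends on jumbo frames actually being in effect end to end, it is worth confirming the bridge MTUs before deployment (interface names as used on this cluster):

```bash
# Both bridges should report mtu 9000
ip link show vmbr1 | grep -o 'mtu [0-9]*'
ip link show vmbr2 | grep -o 'mtu [0-9]*'
```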

**OSD Distribution:**

```text
Node     NVMe     OSDs  Capacity
-------  -------  ----  --------
foxtrot  nvme1n1  2     4TB
foxtrot  nvme2n1  2     4TB
golf     nvme1n1  2     4TB
golf     nvme2n1  2     4TB
hotel    nvme1n1  2     4TB
hotel    nvme2n1  2     4TB
-------  -------  ----  --------
Total             12    24TB raw
```

**Usable capacity (replica 3):** ~8TB (24TB raw ÷ 3 replicas; keeping ~20% free for rebalancing, per the best practices below, leaves roughly 6.4TB of comfortable working space)

### CEPH Deployment Commands

**Install CEPH:**

```bash
# Install the CEPH packages on every node (foxtrot, golf, hotel)
pveceph install --version reef

# Initialize the cluster once, on the first node (foxtrot)
pveceph init --network 192.168.5.0/24 --cluster-network 192.168.7.0/24
```

**Create Monitors (3 for quorum):**

```bash
# On each node
pveceph mon create
```

**Create Manager (on each node):**

```bash
pveceph mgr create
```
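
With the monitors and managers up, a quick check that all three nodes have joined quorum and that a manager is active avoids chasing problems later:

```bash
# Expect 3 monitors in quorum (foxtrot, golf, hotel)
ceph mon stat

# Overall view: mon quorum, active/standby mgr daemons
ceph status
```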

**Create OSDs:**

```bash
# On each node. Note: each invocation below creates ONE OSD per drive;
# see the variant after this block for splitting a drive into 2 OSDs
# to reach the 12-OSD layout described above.

# For nvme1n1 (4TB)
pveceph osd create /dev/nvme1n1 --crush-device-class nvme

# For nvme2n1 (4TB)
pveceph osd create /dev/nvme2n1 --crush-device-class nvme
```
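
To actually get two OSDs per NVMe drive (12 OSDs cluster-wide), the device has to be split at creation time. A sketch of the two common approaches; the `--osds-per-device` parameter only exists on newer Proxmox VE releases, so treat the exact flags as something to verify against your `pveceph` version:

```bash
# Newer pveceph releases: create 2 OSDs on one physical NVMe device
pveceph osd create /dev/nvme1n1 --crush-device-class nvme --osds-per-device 2

# Alternative via ceph-volume, which pveceph wraps under the hood
ceph-volume lvm batch --osds-per-device 2 --crush-device-class nvme /dev/nvme1n1
```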

**Create CEPH Pool:**

```bash
# Create RBD pool for VMs
pveceph pool create ceph-pool --add_storages

# Create CephFS for shared storage
pveceph fs create --name cephfs --add-storage
```
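
A brief verification that the pool picked up the intended replication settings and that both storage entries were registered in Proxmox:

```bash
# size should be 3 and min_size 2, matching the ceph.conf defaults below
ceph osd pool get ceph-pool size
ceph osd pool get ceph-pool min_size

# ceph-pool (RBD) and cephfs should now show up as active storages
pvesm status
```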

### CEPH Configuration Best Practices

**Optimize for NVMe:**

```ini
# /etc/pve/ceph.conf
[global]
public_network = 192.168.5.0/24
cluster_network = 192.168.7.0/24
osd_pool_default_size = 3
osd_pool_default_min_size = 2

[osd]
osd_memory_target = 4294967296  # 4GB per OSD
osd_max_backfills = 1
osd_recovery_max_active = 1
```

**Restart CEPH services after config change:**

```bash
# Restart OSDs one node at a time so the pool stays above min_size
systemctl restart ceph-osd@*.service
```
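
To confirm an OSD actually picked up the new value after the restart, the admin socket can be queried on the node that hosts it (osd.0 here is just an example ID):

```bash
# Run on the node hosting osd.0; expect 4294967296
ceph daemon osd.0 config get osd_memory_target
```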

### CEPH Monitoring

**Check cluster health:**

```bash
ceph status
ceph health detail
```

**Example healthy output:**

```text
cluster:
  id:     a1b2c3d4-e5f6-7890-abcd-ef1234567890
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum foxtrot,golf,hotel
  mgr: foxtrot(active), standbys: golf, hotel
  osd: 12 osds: 12 up, 12 in

data:
  pools:   2 pools, 128 pgs
  objects: 1.23k objects, 45 GiB
  usage:   135 GiB used, 23.8 TiB / 24 TiB avail
  pgs:     128 active+clean
```

**Check OSD performance:**

```bash
ceph osd df
ceph osd perf
```

**Check pool usage:**

```bash
ceph df
rados df
```

## Storage Configuration in Proxmox

### Add Storage via Web UI

**Datacenter → Storage → Add:**

1. **Directory** - For ISOs and backups
2. **LVM-Thin** - For local VM disks
3. **RBD** - For CEPH VM disks
4. **CephFS** - For shared files

### Add Storage via CLI

**CEPH RBD:**

```bash
pvesm add rbd ceph-pool \
  --pool ceph-pool \
  --content images,rootdir \
  --nodes foxtrot,golf,hotel
```

**CephFS:**

```bash
pvesm add cephfs cephfs \
  --path /mnt/pve/cephfs \
  --content backup,iso,vztmpl \
  --nodes foxtrot,golf,hotel
```

**NFS (if using external NAS):**

```bash
pvesm add nfs nas-storage \
  --server 192.168.3.10 \
  --export /mnt/tank/proxmox \
  --content images,backup,iso \
  --nodes foxtrot,golf,hotel
```

## VM Disk Management

### Create VM Disk on CEPH

**Via CLI:**

```bash
# Create 100GB disk for VM 101 on CEPH
qm set 101 --scsi1 ceph-pool:100
```

**Via API (Python):**

```python
from proxmoxer import ProxmoxAPI

# verify_ssl=False is typical for the default self-signed PVE certificate
proxmox = ProxmoxAPI('192.168.3.5', user='root@pam', password='pass', verify_ssl=False)

# Allocate a new 100GB scsi1 disk for VM 101 on the ceph-pool storage
proxmox.nodes('foxtrot').qemu(101).config.put(scsi1='ceph-pool:100')
```

### Move VM Disk Between Storage

**Move from local-lvm to CEPH:**

```bash
qm move-disk 101 scsi0 ceph-pool --delete 1
```

**Move while the VM is running:**

No separate flag is needed: `qm move-disk` copies the disk live and switches a running VM over to the new volume once the copy completes.

```bash
# Same command; the storage migration happens online for a running VM
qm move-disk 101 scsi0 ceph-pool --delete 1
```

### Resize VM Disk

**Grow disk (can't shrink):**

```bash
# Grow VM 101's scsi0 by 50GB
qm resize 101 scsi0 +50G
```

**Inside VM (expand filesystem):**

```bash
# For ext4
sudo resize2fs /dev/sda1

# For XFS
sudo xfs_growfs /
```
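
If the guest uses a partition table rather than the bare disk, the partition usually has to be grown before the filesystem can see the new space. A common approach inside the guest, assuming `cloud-guest-utils` (or equivalent) is installed and the root filesystem sits on partition 1 of /dev/sda:

```bash
# Extend partition 1 of /dev/sda to fill the enlarged disk
sudo growpart /dev/sda 1

# Then run resize2fs / xfs_growfs as above
```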

## Backup and Restore

### Backup to Storage

**Create backup:**

```bash
# Backup VM 101 to local storage
vzdump 101 --storage local --mode snapshot --compress zstd

# Backup to CephFS
vzdump 101 --storage cephfs --mode snapshot --compress zstd
```

**Scheduled backups (via Web UI):**

Datacenter → Backup → Add:

- Schedule: Daily at 2 AM
- Storage: cephfs
- Mode: Snapshot
- Compression: ZSTD
- Retention: Keep last 7

### Restore from Backup

**List backups:**

```bash
ls /var/lib/vz/dump/
# OR
ls /mnt/pve/cephfs/dump/
```

**Restore:**

```bash
# Restore to same VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-2024_01_15-02_00_00.vma.zst 101

# Restore to new VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-2024_01_15-02_00_00.vma.zst 102 --storage ceph-pool
```

## Performance Tuning

### CEPH Performance

**For NVMe OSDs:**

```bash
# Set proper device class (only needed if it was not already set at OSD
# creation; clear an existing class first with `ceph osd crush rm-device-class osd.N`)
ceph osd crush set-device-class nvme osd.0
ceph osd crush set-device-class nvme osd.1
# ... repeat for all OSDs
```

**Create performance pool:**

```bash
ceph osd pool create fast-pool 128 128
ceph osd pool application enable fast-pool rbd
```
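
Device classes only pay off once a pool is bound to a CRUSH rule that uses them. A sketch of tying the new pool to NVMe-class OSDs (the rule name `fast-nvme` is arbitrary; on this all-NVMe cluster it is mostly future-proofing for when slower devices are added):

```bash
# Replicated rule that selects only nvme-class OSDs, one copy per host
ceph osd crush rule create-replicated fast-nvme default host nvme

# Point the pool at the new rule
ceph osd pool set fast-pool crush_rule fast-nvme
```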

**Enable RBD cache:**

```ini
# /etc/pve/ceph.conf
[client]
rbd_cache = true
rbd_cache_size = 134217728  # 128MB
# false skips the writethrough safety phase; only do this if every guest
# reliably sends flushes (virtio/SCSI disks with a modern OS)
rbd_cache_writethrough_until_flush = false
```

### LVM Performance

**Use SSD discard:**

```bash
# Enable discard on VM disk
qm set 101 --scsi0 local-lvm:vm-101-disk-0,discard=on,ssd=1
```
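
With `discard=on` set, freed space is only returned to the thin pool when the guest actually issues discards. A quick way to trigger that on demand inside the guest (most distributions also ship a weekly `fstrim.timer`):

```bash
# Inside the VM: trim all mounted filesystems that support it
sudo fstrim -av
```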

## Troubleshooting

### CEPH Not Healthy

**Check OSD status:**

```bash
ceph osd tree
ceph osd stat
```

**Restart stuck OSD:**

```bash
systemctl restart ceph-osd@0.service
```

**Check network connectivity:**

```bash
# From one node to another
ping -c 3 -M do -s 8972 192.168.5.6   # Test MTU 9000 (8972 payload + 28 bytes of headers)
```

### LVM Out of Space

**Check thin pool usage:**

```bash
lvs pve/data -o lv_name,data_percent,metadata_percent
```

**If thin pool > 90% full:**

```bash
# Extend if VG has space
lvextend -L +100G pve/data

# OR delete unused VM disks
lvremove pve/vm-XXX-disk-0
```

### Storage Performance Issues

**Test disk I/O:**

```bash
# Test sequential write
dd if=/dev/zero of=/tmp/test bs=1M count=1024 oflag=direct

# Test CEPH RBD performance (the bench needs an existing image)
rbd create ceph-pool/test-image --size 10G
rbd bench --io-type write ceph-pool/test-image
rbd rm ceph-pool/test-image
```

**Monitor CEPH latency:**

```bash
ceph osd perf
```

## Best Practices

1. **Use CEPH for HA VMs** - Store critical VM disks on CEPH for live migration
2. **Use LVM for performance** - Non-critical VMs get better performance on local LVM
3. **MTU 9000 for CEPH** - Always use jumbo frames on CEPH networks
4. **Separate networks** - Public and private CEPH networks on different interfaces
5. **Monitor CEPH health** - Set up alerts for HEALTH_WARN/HEALTH_ERR
6. **Regular backups** - Automated daily backups to CephFS or external NAS
7. **Plan for growth** - Leave 20% free space in CEPH for rebalancing
8. **Use replica 3** - Essential for data safety, especially with only 3 nodes

## Further Reading

- [Proxmox VE Storage Documentation](https://pve.proxmox.com/wiki/Storage)
- [CEPH Documentation](https://docs.ceph.com/)
- [Proxmox CEPH Guide](https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster)