Proxmox Storage Management

Overview

Proxmox VE supports multiple storage backends. This guide covers the storage architecture of the Matrix cluster: LVM-thin on each node's boot disk for local VM storage, and CEPH for distributed, highly available storage.

Matrix Cluster Storage Architecture

Hardware Configuration

Per Node (Foxtrot, Golf, Hotel):

nvme0n1  - 1TB Crucial P3        → Boot disk + LVM
nvme1n1  - 4TB Samsung 990 PRO   → CEPH OSD (2 OSDs)
nvme2n1  - 4TB Samsung 990 PRO   → CEPH OSD (2 OSDs)

Total Cluster:

  • 3× 1TB boot disks (LVM local storage)
  • 6× 4TB NVMe drives (24TB raw CEPH capacity)
  • 12 CEPH OSDs total (2 per NVMe drive)
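
To confirm the layout matches on a node, list the block devices (a quick check; the model strings follow the table above):

# Verify per-node disk layout
lsblk -o NAME,SIZE,MODEL,MOUNTPOINT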

Storage Pools

Storage Pool     Type       Backend    Purpose
-------------    ----       -------    -------
local            dir        Directory  ISO images, templates, backups
local-lvm        lvmthin    LVM-thin   VM disks (local)
ceph-pool        rbd        CEPH RBD   VM disks (distributed, HA)
ceph-fs          cephfs     CephFS     Shared filesystem
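
To list these pools with their current usage, and to inspect the backing configuration:

# Show all configured storage pools and usage
pvesm status

# Raw storage definitions
cat /etc/pve/storage.cfg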

LVM Storage

LVM-thin Configuration

Advantages:

  • Thin provisioning (overcommit storage)
  • Fast snapshots
  • Local to each node (low latency)
  • No network overhead

Disadvantages:

  • No HA (tied to single node)
  • No live migration with storage
  • Limited to node's local disk size

Check LVM usage:

# View volume groups
vgs

# View logical volumes
lvs

# View thin pool usage
lvs -a | grep thin

Example output:

  LV            VG  Attr       LSize   Pool Origin Data%
  data          pve twi-aotz-- 850.00g             45.23
  vm-101-disk-0 pve Vwi-aotz--  50.00g data        12.45

Managing LVM Storage

Extend thin pool (if boot disk has space):

# Check free space in VG
vgs pve

# Extend thin pool
lvextend -L +100G pve/data

Create VM disk manually:

# Create 50GB disk for VM 101
lvcreate -V 50G -T pve/data -n vm-101-disk-0
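
A manually created volume is not visible to the VM until it is referenced in the VM config, e.g.:

# Attach the new volume as scsi1
qm set 101 --scsi1 local-lvm:vm-101-disk-0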

CEPH Storage

CEPH Architecture for Matrix

Network Configuration:

vmbr1 (192.168.5.0/24, MTU 9000) → CEPH public network
vmbr2 (192.168.7.0/24, MTU 9000) → CEPH cluster (private) network
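
The bridges themselves are defined in /etc/network/interfaces; a minimal sketch of the public-network bridge, assuming the underlying NIC is enp1s0 and the node address ends in .5 (adjust both to your hardware):

# /etc/network/interfaces (excerpt)
auto vmbr1
iface vmbr1 inet static
    address 192.168.5.5/24
    bridge-ports enp1s0
    bridge-stp off
    bridge-fd 0
    mtu 9000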

OSD Distribution:

Node      NVMe       OSDs    Capacity
-------   ------     ----    --------
foxtrot   nvme1n1    2       4TB
foxtrot   nvme2n1    2       4TB
golf      nvme1n1    2       4TB
golf      nvme2n1    2       4TB
hotel     nvme1n1    2       4TB
hotel     nvme2n1    2       4TB
-------   ------     ----    --------
Total                12      24TB raw

Usable capacity (replica 3): ~8TB

CEPH Deployment Commands

Install CEPH:

# On every node (CEPH packages are needed cluster-wide)
pveceph install --version reef

# Initialize the cluster (first node only, e.g. foxtrot)
pveceph init --network 192.168.5.0/24 --cluster-network 192.168.7.0/24

Create Monitors (3 for quorum):

# On each node
pveceph mon create

Create Manager (on each node):

pveceph mgr create

Create OSDs:

# On each node - one OSD is created per device by default

# For nvme1n1 (4TB)
pveceph osd create /dev/nvme1n1 --crush-device-class nvme

# For nvme2n1 (4TB)
pveceph osd create /dev/nvme2n1 --crush-device-class nvme

# To match the 2-OSDs-per-drive layout above, recent pveceph versions
# accept --osds-per-device 2; otherwise split a drive with
# ceph-volume lvm batch --osds-per-device 2 /dev/nvme1n1

Create CEPH Pool:

# Create RBD pool for VMs
pveceph pool create ceph-pool --add_storages

# Create CephFS for shared storage
pveceph fs create --name cephfs --add-storage
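
Verify the pools and their replication settings:

ceph osd pool ls detail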

CEPH Configuration Best Practices

Optimize for NVMe:

# /etc/pve/ceph.conf
[global]
    public_network = 192.168.5.0/24
    cluster_network = 192.168.7.0/24
    osd_pool_default_size = 3
    osd_pool_default_min_size = 2

[osd]
    osd_memory_target = 4294967296  # 4GB per OSD
    osd_max_backfills = 1
    osd_recovery_max_active = 1

Restart CEPH OSDs after a config change (one node at a time so the cluster stays available):

systemctl restart ceph-osd.target
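
To confirm a setting is live, query the admin socket of an OSD on its node (osd.0 as an example):

# Read the running value from an OSD
ceph daemon osd.0 config get osd_memory_target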

CEPH Monitoring

Check cluster health:

ceph status
ceph health detail

Example healthy output:

cluster:
  id:     a1b2c3d4-e5f6-7890-abcd-ef1234567890
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum foxtrot,golf,hotel
  mgr: foxtrot(active), standbys: golf, hotel
  osd: 12 osds: 12 up, 12 in

data:
  pools:   2 pools, 128 pgs
  objects: 1.23k objects, 45 GiB
  usage:   135 GiB used, 21.7 TiB / 21.8 TiB avail
  pgs:     128 active+clean

Check OSD performance:

ceph osd df
ceph osd perf

Check pool usage:

ceph df
rados df

Storage Configuration in Proxmox

Add Storage via Web UI

Datacenter → Storage → Add:

  1. Directory - For ISOs and backups
  2. LVM-Thin - For local VM disks
  3. RBD - For CEPH VM disks
  4. CephFS - For shared files

Add Storage via CLI

CEPH RBD:

pvesm add rbd ceph-pool \
  --pool ceph-pool \
  --content images,rootdir \
  --nodes foxtrot,golf,hotel

CephFS:

pvesm add cephfs cephfs \
  --path /mnt/pve/cephfs \
  --content backup,iso,vztmpl \
  --nodes foxtrot,golf,hotel

NFS (if using external NAS):

pvesm add nfs nas-storage \
  --server 192.168.3.10 \
  --export /mnt/tank/proxmox \
  --content images,backup,iso \
  --nodes foxtrot,golf,hotel

VM Disk Management

Create VM Disk on CEPH

Via CLI:

# Create 100GB disk for VM 101 on CEPH
qm set 101 --scsi1 ceph-pool:100

Via API (Python):

from proxmoxer import ProxmoxAPI

# Connect to a cluster node's API (verify_ssl=False assumed for self-signed lab certs)
proxmox = ProxmoxAPI('192.168.3.5', user='root@pam', password='pass', verify_ssl=False)
# Allocate a new 100GB disk on ceph-pool as scsi1
proxmox.nodes('foxtrot').qemu(101).config.put(scsi1='ceph-pool:100')

Move VM Disk Between Storage

Move from local-lvm to CEPH (works on a running VM too; the move happens live via QEMU block mirroring, no extra flag needed):

qm move-disk 101 scsi0 ceph-pool --delete 1

Resize VM Disk

Grow disk (can't shrink):

# Grow VM 101's scsi0 by 50GB
qm resize 101 scsi0 +50G

Inside VM (expand filesystem):

# For ext4
sudo resize2fs /dev/sda1

# For XFS
sudo xfs_growfs /
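
If the filesystem sits on a partition, grow the partition first; one common approach, assuming growpart (from cloud-guest-utils) is installed in the guest:

# Grow partition 1 of /dev/sda to fill the disk, then the filesystem
sudo growpart /dev/sda 1
sudo resize2fs /dev/sda1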

Backup and Restore

Backup to Storage

Create backup:

# Backup VM 101 to local storage
vzdump 101 --storage local --mode snapshot --compress zstd

# Backup to CephFS
vzdump 101 --storage cephfs --mode snapshot --compress zstd

Scheduled backups (via Web UI):

Datacenter → Backup → Add:

  • Schedule: Daily at 2 AM
  • Storage: cephfs
  • Mode: Snapshot
  • Compression: ZSTD
  • Retention: Keep last 7
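
The same job can be created from the CLI via the cluster API; a sketch using pvesh, assuming the /cluster/backup parameters of recent PVE releases:

# Daily 02:00 snapshot backup of VM 101 to cephfs, keep last 7
pvesh create /cluster/backup \
  --schedule "02:00" \
  --storage cephfs \
  --mode snapshot \
  --compress zstd \
  --vmid 101 \
  --prune-backups keep-last=7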

Restore from Backup

List backups:

ls /var/lib/vz/dump/
# OR
ls /mnt/pve/cephfs/dump/

Restore:

# Restore to same VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-2024_01_15-02_00_00.vma.zst 101

# Restore to new VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-2024_01_15-02_00_00.vma.zst 102 --storage ceph-pool

Performance Tuning

CEPH Performance

For NVMe OSDs:

# Set the device class (OSDs created with --crush-device-class already
# have it; an existing class must be removed before it can be changed)
ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class nvme osd.0
# ... repeat for all OSDs

Create performance pool:

ceph osd pool create fast-pool 128 128
ceph osd pool application enable fast-pool rbd
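
On its own the pool is not pinned to NVMe OSDs; a CRUSH rule restricted to the nvme device class does that (the rule name here is an example):

# Replicate across hosts using nvme-class OSDs only
ceph osd crush rule create-replicated nvme-rule default host nvme
ceph osd pool set fast-pool crush_rule nvme-rule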

Enable RBD cache:

# /etc/pve/ceph.conf
[client]
    rbd_cache = true
    rbd_cache_size = 134217728  # 128MB
    rbd_cache_writethrough_until_flush = false

LVM Performance

Use SSD discard:

# Enable discard on VM disk
qm set 101 --scsi0 local-lvm:vm-101-disk-0,discard=on,ssd=1
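
With discard enabled, space returns to the thin pool only when the guest trims; inside the VM:

# Trim all mounted filesystems that support it
sudo fstrim -av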

Troubleshooting

CEPH Not Healthy

Check OSD status:

ceph osd tree
ceph osd stat

Restart stuck OSD:

systemctl restart ceph-osd@0.service

Check network connectivity:

# From one node to another
ping -c 3 -M do -s 8972 192.168.5.6  # Test MTU 9000

LVM Out of Space

Check thin pool usage:

lvs pve/data -o lv_name,data_percent,metadata_percent

If thin pool > 90% full:

# Extend if VG has space
lvextend -L +100G pve/data

# OR delete unused VM disks
lvremove pve/vm-XXX-disk-0

Storage Performance Issues

Test disk I/O:

# Test sequential write
dd if=/dev/zero of=/tmp/test bs=1M count=1024 oflag=direct

# Test CEPH RBD performance (bench needs an existing image)
rbd create ceph-pool/test-image --size 10G
rbd bench --io-type write ceph-pool/test-image
rbd rm ceph-pool/test-image

Monitor CEPH latency:

ceph osd perf

Best Practices

  1. Use CEPH for HA VMs - Store critical VM disks on CEPH for live migration
  2. Use LVM for performance - Non-critical VMs get better performance on local LVM
  3. MTU 9000 for CEPH - Always use jumbo frames on CEPH networks
  4. Separate networks - Public and private CEPH networks on different interfaces
  5. Monitor CEPH health - Set up alerts for HEALTH_WARN/HEALTH_ERR (see the sketch after this list)
  6. Regular backups - Automated daily backups to CephFS or external NAS
  7. Plan for growth - Leave 20% free space in CEPH for rebalancing
  8. Use replica 3 - Essential for data safety, especially with only 3 nodes
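
A minimal cron-able check for practice 5, assuming outbound mail works on the node and the address is a placeholder:

#!/bin/bash
# Alert when the cluster leaves HEALTH_OK
STATUS=$(ceph health)
if [ "$STATUS" != "HEALTH_OK" ]; then
    echo "$STATUS" | mail -s "CEPH alert on $(hostname)" admin@example.com
fi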

Further Reading