# Proxmox Storage Management
## Overview
Proxmox VE supports multiple storage backends. This guide focuses on the storage architecture of the Matrix cluster: LVM-thin on each node's boot disk for local VM storage, and CEPH on the dedicated NVMe drives for distributed, highly available storage.
## Matrix Cluster Storage Architecture
### Hardware Configuration
**Per Node (Foxtrot, Golf, Hotel):**
```text
nvme0n1 - 1TB Crucial P3 → Boot disk + LVM
nvme1n1 - 4TB Samsung 990 PRO → CEPH OSD (2 OSDs)
nvme2n1 - 4TB Samsung 990 PRO → CEPH OSD (2 OSDs)
```
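To confirm this layout on a node, a quick check of the block devices (device names as in the table above) could be:
```bash
# List the three NVMe devices without their partitions
lsblk -d -o NAME,SIZE,MODEL /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1
```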
**Total Cluster:**
- 3× 1TB boot disks (LVM local storage)
- 6× 4TB NVMe drives (24TB raw CEPH capacity)
- 12 CEPH OSDs total (2 per NVMe drive)
### Storage Pools
```text
Storage ID    Type      Backend     Purpose
----------    -------   ---------   ------------------------------
local         dir       Directory   ISO images, templates, backups
local-lvm     lvmthin   LVM-thin    VM disks (local)
ceph-pool     rbd       CEPH RBD    VM disks (distributed, HA)
cephfs        cephfs    CephFS      Shared filesystem
```
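The storage IDs above are what `pvesm` reports once everything is configured; to see what a node currently has:
```bash
# List all configured storages with type, status, and usage
pvesm status
```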
## LVM Storage
### LVM-thin Configuration
**Advantages:**
- Thin provisioning (overcommit storage)
- Fast snapshots
- Local to each node (low latency)
- No network overhead

**Disadvantages:**
- No HA (tied to single node)
- No live migration with storage
- Limited to node's local disk size

**Check LVM usage:**
```bash
# View volume groups
vgs
# View logical volumes
lvs
# View thin pool usage
lvs -a | grep thin
```
**Example output:**
```text
LV              VG   Attr       LSize    Pool   Origin   Data%
data            pve  twi-aotz-- 850.00g                  45.23
vm-101-disk-0   pve  Vwi-aotz--  50.00g  data            12.45
```
### Managing LVM Storage
**Extend thin pool (if boot disk has space):**
```bash
# Check free space in VG
vgs pve
# Extend thin pool
lvextend -L +100G pve/data
```
**Create VM disk manually:**
```bash
# Create 50GB disk for VM 101
lvcreate -V 50G -T pve/data -n vm-101-disk-0
```
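A volume created this way is not yet referenced by any VM. Assuming the default `local-lvm` storage ID, it could then be attached explicitly, or picked up as an unused disk:
```bash
# Attach the existing volume to VM 101 as scsi1
qm set 101 --scsi1 local-lvm:vm-101-disk-0
# Or let Proxmox detect it as an unused disk in the VM config
qm rescan --vmid 101
```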
## CEPH Storage
### CEPH Architecture for Matrix
**Network Configuration:**
```text
vmbr1 (192.168.5.0/24, MTU 9000) → CEPH Public Network
vmbr2 (192.168.7.0/24, MTU 9000) → CEPH Private Network
```
**OSD Distribution:**
```text
Node      NVMe      OSDs   Capacity
-------   -------   ----   --------
foxtrot   nvme1n1   2      4TB
foxtrot   nvme2n1   2      4TB
golf      nvme1n1   2      4TB
golf      nvme2n1   2      4TB
hotel     nvme1n1   2      4TB
hotel     nvme2n1   2      4TB
-------   -------   ----   --------
Total               12     24TB raw
```
**Usable capacity (replica 3):** ~8TB (24TB raw ÷ 3 replicas)
### CEPH Deployment Commands
**Install CEPH:**
```bash
# On first node (foxtrot)
pveceph install --version reef
# Initialize cluster
pveceph init --network 192.168.5.0/24 --cluster-network 192.168.7.0/24
```
**Create Monitors (3 for quorum):**
```bash
# On each node
pveceph mon create
```
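Once monitors exist on all three nodes, quorum membership can be checked:
```bash
ceph mon stat
```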
**Create Manager (on each node):**
```bash
pveceph mgr create
```
**Create OSDs:**
```bash
# On each node - target is 2 OSDs per NVMe drive
# Note: as written, each command creates a single OSD per device; splitting a
# drive into 2 OSDs needs the --osds-per-device option (if your pveceph version
# supports it) or a manual ceph-volume lvm batch run.
# For nvme1n1 (4TB)
pveceph osd create /dev/nvme1n1 --crush-device-class nvme
# For nvme2n1 (4TB)
pveceph osd create /dev/nvme2n1 --crush-device-class nvme
```
**Create CEPH Pool:**
```bash
# Create RBD pool for VMs
pveceph pool create ceph-pool --add_storages
# Create CephFS for shared storage
pveceph fs create --name cephfs --add-storage
```
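To verify the pools and their replication settings:
```bash
ceph osd pool ls detail
```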
### CEPH Configuration Best Practices
**Optimize for NVMe:**
```bash
# /etc/pve/ceph.conf
[global]
public_network = 192.168.5.0/24
cluster_network = 192.168.7.0/24
osd_pool_default_size = 3
osd_pool_default_min_size = 2
[osd]
osd_memory_target = 4294967296 # 4GB per OSD
osd_max_backfills = 1
osd_recovery_max_active = 1
```
**Restart CEPH services after config change:**
```bash
# Restarts all OSD daemons on this node
systemctl restart ceph-osd.target
```
### CEPH Monitoring
**Check cluster health:**
```bash
ceph status
ceph health detail
```
**Example healthy output:**
```text
cluster:
  id:     a1b2c3d4-e5f6-7890-abcd-ef1234567890
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum foxtrot,golf,hotel
  mgr: foxtrot(active), standbys: golf, hotel
  osd: 12 osds: 12 up, 12 in

data:
  pools:   2 pools, 128 pgs
  objects: 1.23k objects, 45 GiB
  usage:   135 GiB used, 23.8 TiB / 24 TiB avail
  pgs:     128 active+clean
```
**Check OSD performance:**
```bash
ceph osd df
ceph osd perf
```
**Check pool usage:**
```bash
ceph df
rados df
```
## Storage Configuration in Proxmox
### Add Storage via Web UI
**Datacenter → Storage → Add** (the resulting `/etc/pve/storage.cfg` entries are sketched after this list):
1. **Directory** - For ISOs and backups
2. **LVM-Thin** - For local VM disks
3. **RBD** - For CEPH VM disks
4. **CephFS** - For shared files
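For reference, this is roughly what `/etc/pve/storage.cfg` looks like with these four entries (exact keys depend on the options chosen in the UI):
```bash
# /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

rbd: ceph-pool
        pool ceph-pool
        content images,rootdir
        krbd 0

cephfs: cephfs
        path /mnt/pve/cephfs
        content backup,iso,vztmpl
```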
### Add Storage via CLI
**CEPH RBD:**
```bash
pvesm add rbd ceph-pool \
--pool ceph-pool \
--content images,rootdir \
--nodes foxtrot,golf,hotel
```
**CephFS:**
```bash
pvesm add cephfs cephfs \
--path /mnt/pve/cephfs \
--content backup,iso,vztmpl \
--nodes foxtrot,golf,hotel
```
**NFS (if using external NAS):**
```bash
pvesm add nfs nas-storage \
--server 192.168.3.10 \
--export /mnt/tank/proxmox \
--content images,backup,iso \
--nodes foxtrot,golf,hotel
```
## VM Disk Management
### Create VM Disk on CEPH
**Via CLI:**
```bash
# Create 100GB disk for VM 101 on CEPH
qm set 101 --scsi1 ceph-pool:100
```
**Via API (Python):**
```python
from proxmoxer import ProxmoxAPI
proxmox = ProxmoxAPI('192.168.3.5', user='root@pam', password='pass')
proxmox.nodes('foxtrot').qemu(101).config.put(scsi1='ceph-pool:100')
```
### Move VM Disk Between Storage
**Move from local-lvm to CEPH:**
```bash
qm move-disk 101 scsi0 ceph-pool --delete 1
```
**Move the disk of a running VM:** the same command also works while the VM is online; `qm move-disk` performs a live storage migration automatically, so no extra flag is needed.
### Resize VM Disk
**Grow disk (can't shrink):**
```bash
# Grow VM 101's scsi0 by 50GB
qm resize 101 scsi0 +50G
```
**Inside VM (expand filesystem):**
```bash
# If the partition itself must grow first: growpart /dev/sda 1 (cloud-guest-utils)
# For ext4
sudo resize2fs /dev/sda1
# For XFS
sudo xfs_growfs /
```
## Backup and Restore
### Backup to Storage
**Create backup:**
```bash
# Backup VM 101 to local storage
vzdump 101 --storage local --mode snapshot --compress zstd
# Backup to CephFS
vzdump 101 --storage cephfs --mode snapshot --compress zstd
```
**Scheduled backups (via Web UI):**
Datacenter → Backup → Add (a CLI equivalent is sketched after this list):
- Schedule: Daily at 2 AM
- Storage: cephfs
- Mode: Snapshot
- Compression: ZSTD
- Retention: Keep last 7
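For reference, a roughly equivalent one-off invocation from the CLI (recent PVE releases store the UI-defined job under `/etc/pve/jobs.cfg`):
```bash
# Snapshot-mode backup of VM 101 to cephfs, keeping the last 7 backups
vzdump 101 --storage cephfs --mode snapshot --compress zstd --prune-backups keep-last=7
```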
### Restore from Backup
**List backups:**
```bash
ls /var/lib/vz/dump/
# OR
ls /mnt/pve/cephfs/dump/
```
**Restore:**
```bash
# Restore to same VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-2024_01_15-02_00_00.vma.zst 101
# Restore to new VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-101-2024_01_15-02_00_00.vma.zst 102 --storage ceph-pool
```
## Performance Tuning
### CEPH Performance
**For NVMe OSDs:**
```bash
# Set proper device class (OSDs created with --crush-device-class already have it;
# clear an existing class first with: ceph osd crush rm-device-class osd.0)
ceph osd crush set-device-class nvme osd.0
ceph osd crush set-device-class nvme osd.1
# ... repeat for all OSDs
```
**Create performance pool:**
```bash
ceph osd pool create fast-pool 128 128
ceph osd pool application enable fast-pool rbd
```
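A device class only takes effect once a CRUSH rule selects it. A minimal sketch, using the default `default` root and host failure domain (`fast-rule` is an illustrative name):
```bash
# Replicated rule that only places data on OSDs with device class "nvme"
ceph osd crush rule create-replicated fast-rule default host nvme
# Bind the pool to that rule
ceph osd pool set fast-pool crush_rule fast-rule
```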
**Enable RBD cache:**
```bash
# /etc/pve/ceph.conf
[client]
rbd_cache = true
rbd_cache_size = 134217728 # 128MB
rbd_cache_writethrough_until_flush = false
```
### LVM Performance
**Use SSD discard:**
```bash
# Enable discard on VM disk
qm set 101 --scsi0 local-lvm:vm-101-disk-0,discard=on,ssd=1
```
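With `discard=on`, space is only reclaimed when the guest actually issues TRIM. Inside a Linux guest, periodic trimming can be enabled, for example:
```bash
# One-off trim of all mounted filesystems
sudo fstrim -av
# Or enable the weekly systemd timer
sudo systemctl enable --now fstrim.timer
```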
## Troubleshooting
### CEPH Not Healthy
**Check OSD status:**
```bash
ceph osd tree
ceph osd stat
```
**Restart stuck OSD:**
```bash
systemctl restart ceph-osd@0.service
```
**Check network connectivity:**
```bash
# From one node to another
ping -c 3 -M do -s 8972 192.168.5.6 # Test MTU 9000
```
### LVM Out of Space
**Check thin pool usage:**
```bash
lvs pve/data -o lv_name,data_percent,metadata_percent
```
**If thin pool > 90% full:**
```bash
# Extend if VG has space
lvextend -L +100G pve/data
# OR delete unused VM disks
lvremove pve/vm-XXX-disk-0
```
### Storage Performance Issues
**Test disk I/O:**
```bash
# Test sequential write (writing zeros is only a rough indicator)
dd if=/dev/zero of=/tmp/test bs=1M count=1024 oflag=direct
# Test CEPH RBD performance (create a scratch image first, remove it afterwards)
rbd create --size 10G ceph-pool/test-image
rbd bench --io-type write ceph-pool/test-image
rbd rm ceph-pool/test-image
```
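Writing zeros with dd only gives a rough sequential figure. For a more representative random-write test, fio (if installed) could be run along these lines:
```bash
# 4k random writes for 30s with direct I/O against a scratch file
fio --name=randwrite --filename=/tmp/fio-test --size=1G --rw=randwrite \
    --bs=4k --iodepth=32 --ioengine=libaio --direct=1 --runtime=30 --time_based
```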
**Monitor CEPH latency:**
```bash
ceph osd perf
```
## Best Practices
1. **Use CEPH for HA VMs** - Store critical VM disks on CEPH for live migration
2. **Use LVM for performance** - Non-critical VMs get better performance on local LVM
3. **MTU 9000 for CEPH** - Always use jumbo frames on CEPH networks
4. **Separate networks** - Public and private CEPH networks on different interfaces
5. **Monitor CEPH health** - Set up alerts for HEALTH_WARN/HEALTH_ERR
6. **Regular backups** - Automated daily backups to CephFS or external NAS
7. **Plan for growth** - Leave 20% free space in CEPH for rebalancing
8. **Use replica 3** - Essential for data safety, especially with only 3 nodes
## Further Reading
- [Proxmox VE Storage Documentation](https://pve.proxmox.com/wiki/Storage)
- [CEPH Documentation](https://docs.ceph.com/)
- [Proxmox CEPH Guide](https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster)