# CEPH Storage Deployment Workflow

Complete guide to deploying CEPH storage on a Proxmox VE cluster with automated OSD creation, pool configuration, and health verification.

## Overview

This workflow automates CEPH deployment with:

- CEPH package installation
- Cluster initialization with proper network configuration
- Monitor and manager creation across all nodes
- Automated OSD creation with partition support
- Pool configuration with replication and compression
- Comprehensive health verification

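The examples below assume a `proxmox_ceph` role whose entry point runs each phase in order. A minimal sketch of that wiring (the `main.yml` itself is assumed, not shown in the source; the phase file names match the steps below):

```yaml
# roles/proxmox_ceph/tasks/main.yml (assumed wiring for the phase files below)
---
- name: Install CEPH packages
  ansible.builtin.include_tasks: install.yml

- name: Initialize CEPH cluster
  ansible.builtin.include_tasks: init.yml

- name: Create monitors
  ansible.builtin.include_tasks: monitors.yml

- name: Create managers
  ansible.builtin.include_tasks: managers.yml

- name: Create OSDs
  ansible.builtin.include_tasks: osd_create.yml

- name: Create pools
  ansible.builtin.include_tasks: pools.yml

- name: Verify health
  ansible.builtin.include_tasks: verify.yml
```
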
## Prerequisites

Before deploying CEPH:

1. **Cluster must be formed:**
   - Proxmox cluster already initialized and healthy
   - All nodes showing quorum
   - See [Cluster Formation](cluster-formation.md) first

2. **Network requirements:**
   - Dedicated CEPH public network (192.168.5.0/24 for Matrix)
   - Dedicated CEPH private/cluster network (192.168.7.0/24 for Matrix)
   - MTU 9000 (jumbo frames) configured on CEPH networks
   - Bridges configured: vmbr1 (public), vmbr2 (private)

3. **Storage requirements:**
   - Dedicated disks for OSDs (not boot disks)
   - All OSD disks should be the same type (SSD/NVMe)
   - Matrix: 2× 4TB Samsung 990 PRO NVMe per node = 24TB raw

4. **System requirements:**
   - Minimum 3 nodes for production (replication factor 3)
   - At least 4GB RAM per OSD
   - Fast network (10GbE recommended for CEPH networks)

## Phase 1: Install CEPH Packages

### Step 1: Install CEPH

```yaml
# roles/proxmox_ceph/tasks/install.yml
---
- name: Check if CEPH is already installed
  ansible.builtin.stat:
    path: /etc/pve/ceph.conf
  register: ceph_conf_check

- name: Check CEPH packages
  ansible.builtin.command:
    cmd: dpkg -l ceph-common
  register: ceph_package_check
  failed_when: false
  changed_when: false

- name: Install CEPH packages via pveceph
  ansible.builtin.command:
    cmd: "pveceph install --repository {{ ceph_repository }}"
  when: ceph_package_check.rc != 0
  register: ceph_install
  changed_when: "'installed' in ceph_install.stdout | default('')"

- name: Verify CEPH installation
  ansible.builtin.command:
    cmd: ceph --version
  register: ceph_version
  changed_when: false
  failed_when: ceph_version.rc != 0

- name: Display CEPH version
  ansible.builtin.debug:
    msg: "Installed CEPH version: {{ ceph_version.stdout }}"
```

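A role shaped like this usually ships safe defaults for the variables its tasks reference. A sketch, assuming a standard `defaults/main.yml` (values mirror the Matrix example later in this document):

```yaml
# roles/proxmox_ceph/defaults/main.yml (assumed; override in group_vars)
---
ceph_repository: no-subscription
ceph_network: 192.168.5.0/24
ceph_cluster_network: 192.168.7.0/24
ceph_wipe_disks: false  # destructive when true; keep disabled by default
ceph_osds: {}
ceph_pools: []
```
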
## Phase 2: Initialize CEPH Cluster

### Step 2: Initialize CEPH (First Node Only)

```yaml
# roles/proxmox_ceph/tasks/init.yml
---
- name: Check if CEPH cluster is initialized
  ansible.builtin.command:
    cmd: ceph status
  register: ceph_status_check
  failed_when: false
  changed_when: false

- name: Set CEPH initialization facts
  ansible.builtin.set_fact:
    ceph_initialized: "{{ ceph_status_check.rc == 0 }}"
    is_ceph_first_node: "{{ inventory_hostname == groups[cluster_group | default('matrix_cluster')][0] }}"

- name: Initialize CEPH cluster on first node
  ansible.builtin.command:
    cmd: >
      pveceph init
      --network {{ ceph_network }}
      --cluster-network {{ ceph_cluster_network }}
  when:
    - is_ceph_first_node
    - not ceph_initialized
  register: ceph_init
  changed_when: ceph_init.rc == 0

- name: Wait for CEPH cluster to initialize
  ansible.builtin.pause:
    seconds: 15
  when: ceph_init.changed

- name: Verify CEPH initialization
  ansible.builtin.command:
    cmd: ceph status
  register: ceph_init_verify
  changed_when: false
  when:
    - is_ceph_first_node
  failed_when:
    - ceph_init_verify.rc != 0

- name: Display initial CEPH status
  ansible.builtin.debug:
    var: ceph_init_verify.stdout_lines
  when:
    - is_ceph_first_node
    - ceph_init.changed or ansible_verbosity > 0
```

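`pveceph init` writes the cluster-wide `/etc/pve/ceph.conf` that Step 1 checks for, so re-running the workflow against an already-initialized cluster skips initialization cleanly.
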
## Phase 3: Create Monitors and Managers

### Step 3: Create CEPH Monitors

```yaml
# roles/proxmox_ceph/tasks/monitors.yml
---
- name: Check existing CEPH monitors
  ansible.builtin.command:
    cmd: ceph mon dump --format json
  register: mon_dump
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true
  failed_when: false
  changed_when: false

- name: Parse monitor list
  ansible.builtin.set_fact:
    existing_monitors: "{{ (mon_dump.stdout | from_json).mons | map(attribute='name') | list }}"
  when: mon_dump.rc == 0

- name: Set monitor facts
  ansible.builtin.set_fact:
    has_monitor: "{{ inventory_hostname_short in existing_monitors | default([]) }}"

- name: Create CEPH monitor on first node
  ansible.builtin.command:
    cmd: pveceph mon create
  when:
    - is_ceph_first_node
    - not has_monitor
  register: mon_create_first
  changed_when: mon_create_first.rc == 0

- name: Wait for first monitor to stabilize
  ansible.builtin.pause:
    seconds: 10
  when: mon_create_first.changed

- name: Create CEPH monitors on other nodes
  ansible.builtin.command:
    cmd: pveceph mon create
  when:
    - not is_ceph_first_node
    - not has_monitor
  register: mon_create_others
  changed_when: mon_create_others.rc == 0

- name: Verify monitor quorum
  ansible.builtin.command:
    cmd: ceph quorum_status --format json
  register: quorum_status
  changed_when: false
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true

- name: Check monitor quorum size
  ansible.builtin.assert:
    that:
      - (quorum_status.stdout | from_json).quorum | length >= ((groups[cluster_group | default('matrix_cluster')] | length // 2) + 1)
    fail_msg: "Monitor quorum not established"
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true
```

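The assertion encodes strict-majority quorum: `(n // 2) + 1` monitors must respond, so a 3-node cluster needs 2 monitors in quorum and a 5-node cluster needs 3. Don't proceed to OSD creation until this holds.
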
### Step 4: Create CEPH Managers

```yaml
# roles/proxmox_ceph/tasks/managers.yml
---
- name: Check existing CEPH managers
  ansible.builtin.command:
    cmd: ceph mgr dump --format json
  register: mgr_dump
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true
  failed_when: false
  changed_when: false

- name: Parse manager list
  ansible.builtin.set_fact:
    existing_managers: "{{ [(mgr_dump.stdout | from_json).active_name] + ((mgr_dump.stdout | from_json).standbys | map(attribute='name') | list) }}"
  when: mgr_dump.rc == 0

- name: Initialize empty manager list if check failed
  ansible.builtin.set_fact:
    existing_managers: []
  when: mgr_dump.rc != 0

- name: Set manager facts
  ansible.builtin.set_fact:
    has_manager: "{{ inventory_hostname_short in (existing_managers | default([])) }}"

- name: Create CEPH manager
  ansible.builtin.command:
    cmd: pveceph mgr create
  when: not has_manager
  register: mgr_create
  changed_when: mgr_create.rc == 0

- name: Wait for managers to stabilize
  ansible.builtin.pause:
    seconds: 5
  when: mgr_create.changed

- name: Enable CEPH dashboard module
  ansible.builtin.command:
    cmd: ceph mgr module enable dashboard
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true
  register: dashboard_enable
  changed_when: "'already enabled' not in dashboard_enable.stderr"
  failed_when:
    - dashboard_enable.rc != 0
    - "'already enabled' not in dashboard_enable.stderr"

- name: Enable Prometheus module
  ansible.builtin.command:
    cmd: ceph mgr module enable prometheus
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true
  register: prometheus_enable
  changed_when: "'already enabled' not in prometheus_enable.stderr"
  failed_when:
    - prometheus_enable.rc != 0
    - "'already enabled' not in prometheus_enable.stderr"
```

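Note: on Debian-based systems such as Proxmox VE, the dashboard module typically ships in a separate `ceph-mgr-dashboard` package; if enabling it reports the module as missing, install that package on every manager node first.
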
## Phase 4: Create OSDs

### Step 5: Prepare and Create OSDs

```yaml
# roles/proxmox_ceph/tasks/osd_create.yml
---
- name: Get list of existing OSDs
  ansible.builtin.command:
    cmd: ceph osd ls
  register: existing_osds
  changed_when: false
  failed_when: false
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true

- name: Check OSD devices availability
  ansible.builtin.command:
    cmd: "lsblk -ndo NAME,SIZE,TYPE {{ item.device }}"
  register: device_check
  failed_when: device_check.rc != 0
  changed_when: false
  loop: "{{ ceph_osds[inventory_hostname_short] | default([]) }}"
  loop_control:
    label: "{{ item.device }}"

- name: Display device information
  ansible.builtin.debug:
    msg: "Device {{ item.item.device }}: {{ item.stdout }}"
  loop: "{{ device_check.results }}"
  loop_control:
    label: "{{ item.item.device }}"
  when: ansible_verbosity > 0

- name: Wipe existing partitions on OSD devices
  ansible.builtin.command:
    cmd: "wipefs -a {{ item.device }}"
  when:
    - ceph_wipe_disks | default(false)
  loop: "{{ ceph_osds[inventory_hostname_short] | default([]) }}"
  loop_control:
    label: "{{ item.device }}"
  register: wipe_result
  changed_when: wipe_result.rc == 0

- name: Create OSDs from whole devices (no partitioning)
  ansible.builtin.command:
    cmd: >
      pveceph osd create {{ item.device }}
      {% if item.db_device is defined and item.db_device %}--db_dev {{ item.db_device }}{% endif %}
      {% if item.wal_device is defined and item.wal_device %}--wal_dev {{ item.wal_device }}{% endif %}
  when:
    - item.partitions | default(1) == 1
  loop: "{{ ceph_osds[inventory_hostname_short] | default([]) }}"
  loop_control:
    label: "{{ item.device }}"
  register: osd_create_whole
  changed_when: "'successfully created' in osd_create_whole.stdout | default('')"
  failed_when:
    - osd_create_whole.rc != 0
    - "'already in use' not in osd_create_whole.stderr | default('')"
    - "'ceph-volume' not in osd_create_whole.stderr | default('')"

- name: Create multiple OSDs per device (with partitioning)
  ansible.builtin.command:
    cmd: >
      pveceph osd create {{ item.0.device }}
      --size {{ (item.0.device_size_gb | default(4000) / item.0.partitions) | int }}G
      {% if item.0.db_device is defined and item.0.db_device %}--db_dev {{ item.0.db_device }}{% endif %}
      {% if item.0.wal_device is defined and item.0.wal_device %}--wal_dev {{ item.0.wal_device }}{% endif %}
  when:
    - item.0.partitions > 1
  with_subelements:
    - "{{ ceph_osds[inventory_hostname_short] | default([]) }}"
    - partition_indices
    - skip_missing: true
  loop_control:
    label: "{{ item.0.device }} partition {{ item.1 }}"
  register: osd_create_partition
  changed_when: "'successfully created' in osd_create_partition.stdout | default('')"
  failed_when:
    - osd_create_partition.rc != 0
    - "'already in use' not in osd_create_partition.stderr | default('')"

- name: Wait for OSDs to come up
  ansible.builtin.command:
    cmd: ceph osd tree --format json
  register: osd_tree
  changed_when: false
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true
  until: >
    (osd_tree.stdout | from_json).nodes
    | selectattr('type', 'equalto', 'osd')
    | selectattr('status', 'equalto', 'up')
    | list | length >= expected_osd_count | int
  retries: 20
  delay: 10
  vars:
    # Flatten the per-node device lists, default partitions to 1, and sum
    expected_osd_count: >-
      {{ ceph_osds.values() | flatten | map(attribute='partitions', default=1) | sum }}
```

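For the Matrix layout (3 nodes × 2 NVMe devices × 2 partitions), the `expected_osd_count` expression flattens the per-node device lists, defaults `partitions` to 1 where unset, and sums the result: 3 × 2 × 2 = 12 OSDs.
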
## Phase 5: Create and Configure Pools

### Step 6: Create CEPH Pools

```yaml
# roles/proxmox_ceph/tasks/pools.yml
---
- name: Get existing CEPH pools
  ansible.builtin.command:
    cmd: ceph osd pool ls
  register: existing_pools
  changed_when: false

- name: Create CEPH pools
  ansible.builtin.command:
    cmd: >
      ceph osd pool create {{ item.name }}
      {{ item.pg_num }}
      {{ item.pgp_num | default(item.pg_num) }}
  when: item.name not in existing_pools.stdout_lines
  loop: "{{ ceph_pools }}"
  loop_control:
    label: "{{ item.name }}"
  register: pool_create
  changed_when: pool_create.rc == 0

- name: Set pool replication size
  ansible.builtin.command:
    cmd: "ceph osd pool set {{ item.name }} size {{ item.size }}"
  loop: "{{ ceph_pools }}"
  loop_control:
    label: "{{ item.name }}"
  register: pool_size
  changed_when: "'set pool' in pool_size.stdout"

- name: Set pool minimum replication size
  ansible.builtin.command:
    cmd: "ceph osd pool set {{ item.name }} min_size {{ item.min_size }}"
  loop: "{{ ceph_pools }}"
  loop_control:
    label: "{{ item.name }}"
  register: pool_min_size
  changed_when: "'set pool' in pool_min_size.stdout"

- name: Set pool application
  ansible.builtin.command:
    cmd: "ceph osd pool application enable {{ item.name }} {{ item.application }}"
  when: item.application is defined
  loop: "{{ ceph_pools }}"
  loop_control:
    label: "{{ item.name }}"
  register: pool_app
  changed_when: "'enabled application' in pool_app.stdout"
  failed_when:
    - pool_app.rc != 0
    - "'already enabled' not in pool_app.stderr"

- name: Enable compression on pools
  ansible.builtin.command:
    cmd: "ceph osd pool set {{ item.name }} compression_mode aggressive"
  when: item.compression | default(false)
  loop: "{{ ceph_pools }}"
  loop_control:
    label: "{{ item.name }}"
  register: pool_compression
  changed_when: "'set pool' in pool_compression.stdout"

- name: Set compression algorithm
  ansible.builtin.command:
    cmd: "ceph osd pool set {{ item.name }} compression_algorithm {{ item.compression_algorithm | default('zstd') }}"
  when: item.compression | default(false)
  loop: "{{ ceph_pools }}"
  loop_control:
    label: "{{ item.name }}"
  register: pool_compression_algo
  changed_when: "'set pool' in pool_compression_algo.stdout"
```

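A common rule of thumb for `pg_num` is to budget roughly 100 PGs per OSD divided by the replication size, rounded to a power of two: with 12 OSDs and `size: 3`, that is about 12 × 100 / 3 = 400 PGs across all pools, so the 128 + 64 configured in the Matrix example below leave headroom. Recent CEPH releases can also adjust this automatically via the `pg_autoscaler` manager module.
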
## Phase 6: Verify CEPH Health

### Step 7: Health Verification

```yaml
# roles/proxmox_ceph/tasks/verify.yml
---
- name: Wait for CEPH to stabilize
  ansible.builtin.pause:
    seconds: 30

- name: Check CEPH cluster health
  ansible.builtin.command:
    cmd: ceph health
  register: ceph_health
  changed_when: false
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true

- name: Get CEPH status
  ansible.builtin.command:
    cmd: ceph status --format json
  register: ceph_status
  changed_when: false
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true

- name: Parse CEPH status
  ansible.builtin.set_fact:
    ceph_status_data: "{{ ceph_status.stdout | from_json }}"

- name: Calculate expected OSD count
  ansible.builtin.set_fact:
    expected_osd_count: >-
      {{ ceph_osds.values() | flatten | map(attribute='partitions', default=1) | sum }}
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true

- name: Verify OSD count
  ansible.builtin.assert:
    that:
      - ceph_status_data.osdmap.num_osds | int == expected_osd_count | int
    fail_msg: "Expected {{ expected_osd_count }} OSDs but found {{ ceph_status_data.osdmap.num_osds }}"
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true

- name: Verify all OSDs are up
  ansible.builtin.assert:
    that:
      - ceph_status_data.osdmap.num_up_osds == ceph_status_data.osdmap.num_osds
    fail_msg: "Not all OSDs are up: {{ ceph_status_data.osdmap.num_up_osds }}/{{ ceph_status_data.osdmap.num_osds }}"
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true

- name: Verify all OSDs are in
  ansible.builtin.assert:
    that:
      - ceph_status_data.osdmap.num_in_osds == ceph_status_data.osdmap.num_osds
    fail_msg: "Not all OSDs are in cluster: {{ ceph_status_data.osdmap.num_in_osds }}/{{ ceph_status_data.osdmap.num_osds }}"
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true

- name: Wait for PGs to become active+clean
  ansible.builtin.command:
    cmd: ceph pg stat --format json
  register: pg_stat
  changed_when: false
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true
  until: >
    (pg_stat.stdout | from_json).num_pg_by_state
    | selectattr('name', 'equalto', 'active+clean')
    | map(attribute='num')
    | sum == (pg_stat.stdout | from_json).num_pgs
  retries: 60
  delay: 10

- name: Display CEPH cluster summary
  ansible.builtin.debug:
    msg: |
      CEPH Cluster Health: {{ ceph_health.stdout }}
      Total OSDs: {{ ceph_status_data.osdmap.num_osds }}
      OSDs Up: {{ ceph_status_data.osdmap.num_up_osds }}
      OSDs In: {{ ceph_status_data.osdmap.num_in_osds }}
      PGs: {{ ceph_status_data.pgmap.num_pgs }}
      Data: {{ ceph_status_data.pgmap.bytes_used | default(0) | human_readable }}
      Available: {{ ceph_status_data.pgmap.bytes_avail | default(0) | human_readable }}
  delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
  run_once: true
```

## Matrix Cluster Configuration Example

```yaml
# group_vars/matrix_cluster.yml (CEPH section)
---
# CEPH configuration
ceph_enabled: true
ceph_repository: "no-subscription"      # or "enterprise" with subscription
ceph_network: "192.168.5.0/24"          # vmbr1 - public network
ceph_cluster_network: "192.168.7.0/24"  # vmbr2 - private network

# OSD configuration (4 OSDs per node = 12 total)
ceph_osds:
  foxtrot:
    - device: /dev/nvme1n1
      partitions: 2  # Create 2 OSDs per 4TB NVMe
      device_size_gb: 4000
      partition_indices: [0, 1]
      db_device: null
      wal_device: null
      crush_device_class: nvme
    - device: /dev/nvme2n1
      partitions: 2
      device_size_gb: 4000
      partition_indices: [0, 1]
      db_device: null
      wal_device: null
      crush_device_class: nvme

  golf:
    - device: /dev/nvme1n1
      partitions: 2
      device_size_gb: 4000
      partition_indices: [0, 1]
      crush_device_class: nvme
    - device: /dev/nvme2n1
      partitions: 2
      device_size_gb: 4000
      partition_indices: [0, 1]
      crush_device_class: nvme

  hotel:
    - device: /dev/nvme1n1
      partitions: 2
      device_size_gb: 4000
      partition_indices: [0, 1]
      crush_device_class: nvme
    - device: /dev/nvme2n1
      partitions: 2
      device_size_gb: 4000
      partition_indices: [0, 1]
      crush_device_class: nvme

# Pool configuration
ceph_pools:
  - name: vm_ssd
    pg_num: 128
    pgp_num: 128
    size: 3      # Replicate across 3 nodes
    min_size: 2  # Minimum 2 replicas required
    application: rbd
    compression: false

  - name: vm_containers
    pg_num: 64
    pgp_num: 64
    size: 3
    min_size: 2
    application: rbd
    compression: true
    compression_algorithm: zstd

# Safety flags
ceph_wipe_disks: false  # Set to true for fresh deployment (DESTRUCTIVE!)
```

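Usable capacity with this layout is raw capacity divided by the replication size: 24 TB / 3 = 8 TB, before BlueStore overhead and the free-space margin CEPH needs to rebalance after a node failure.
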
## Complete Playbook Example

```yaml
# playbooks/ceph-deploy.yml
---
- name: Deploy CEPH Storage on Proxmox Cluster
  hosts: "{{ cluster_group | default('matrix_cluster') }}"
  become: true
  serial: 1  # Deploy one node at a time

  pre_tasks:
    - name: Verify cluster is healthy
      ansible.builtin.command:
        cmd: pvecm status
      register: cluster_check
      changed_when: false
      failed_when: "'Quorate: Yes' not in cluster_check.stdout"

    - name: Verify CEPH networks MTU
      ansible.builtin.command:
        cmd: "ip link show {{ item }}"
      register: mtu_check
      changed_when: false
      failed_when: "'mtu 9000' not in mtu_check.stdout"
      loop:
        - vmbr1  # CEPH public
        - vmbr2  # CEPH private

    - name: Display CEPH configuration
      ansible.builtin.debug:
        msg: |
          Deploying CEPH to cluster: {{ cluster_name }}
          Public network: {{ ceph_network }}
          Cluster network: {{ ceph_cluster_network }}
          Expected OSDs: {{ ceph_osds.values() | flatten | map(attribute='partitions', default=1) | sum }}
      run_once: true

  roles:
    - role: proxmox_ceph

  post_tasks:
    - name: Display CEPH OSD tree
      ansible.builtin.command:
        cmd: ceph osd tree
      register: osd_tree_final
      changed_when: false
      delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
      run_once: true

    - name: Show OSD tree
      ansible.builtin.debug:
        var: osd_tree_final.stdout_lines
      run_once: true

    - name: Display pool information
      ansible.builtin.command:
        cmd: ceph osd pool ls detail
      register: pool_info
      changed_when: false
      delegate_to: "{{ groups[cluster_group | default('matrix_cluster')][0] }}"
      run_once: true

    - name: Show pool details
      ansible.builtin.debug:
        var: pool_info.stdout_lines
      run_once: true
```

## Usage

### Deploy CEPH to Matrix Cluster

```bash
# Check syntax
ansible-playbook playbooks/ceph-deploy.yml --syntax-check

# Deploy CEPH
ansible-playbook playbooks/ceph-deploy.yml --limit matrix_cluster

# Verify CEPH status
ansible -i inventory/proxmox.yml foxtrot -m shell -a "ceph status"
ansible -i inventory/proxmox.yml foxtrot -m shell -a "ceph osd tree"
ansible -i inventory/proxmox.yml foxtrot -m shell -a "ceph df"
```

### Add mise Tasks

```toml
# .mise.toml
[tasks."ceph:deploy"]
description = "Deploy CEPH storage on cluster"
run = """
cd ansible
uv run ansible-playbook playbooks/ceph-deploy.yml
"""

[tasks."ceph:status"]
description = "Show CEPH cluster status"
run = """
ansible -i ansible/inventory/proxmox.yml foxtrot -m shell -a "ceph -s"
"""

[tasks."ceph:health"]
description = "Show CEPH health detail"
run = """
ansible -i ansible/inventory/proxmox.yml foxtrot -m shell -a "ceph health detail"
"""
```

## Troubleshooting

### OSDs Won't Create

**Symptoms:**

- `pveceph osd create` fails with an "already in use" error

**Solutions:**

1. Check if the disk has existing partitions: `lsblk /dev/nvme1n1`
2. Wipe the disk: `wipefs -a /dev/nvme1n1` (DESTRUCTIVE!)
3. Set `ceph_wipe_disks: true` in group_vars
4. Check for existing LVM: `pvdisplay`, `lvdisplay`

### PGs Stuck in Creating

**Symptoms:**

- PGs stay in the "creating" state for an extended period

**Solutions:**

1. Check OSD status: `ceph osd tree`
2. Verify all OSDs are up and in: `ceph osd stat`
3. Check mon/mgr status: `ceph mon stat`, `ceph mgr stat`
4. Review logs: `journalctl -u ceph-osd@*.service -n 100`

### Poor CEPH Performance

**Symptoms:**

- Slow VM disk I/O

**Solutions:**

1. Verify MTU 9000: `ip link show vmbr1 | grep mtu`
2. Test network throughput with `iperf3` between nodes
3. Check OSD utilization: `ceph osd df`
4. Verify SSD/NVMe is being used: `ceph osd tree`
5. Check for rebalancing: `ceph -s` (look for "recovery")

## Related Workflows

- [Cluster Formation](cluster-formation.md) - Form the cluster before CEPH
- [Network Configuration](../reference/networking.md) - Configure CEPH networks
- [Storage Management](../reference/storage-management.md) - Manage CEPH pools and OSDs

## References

- ProxSpray analysis: `docs/proxspray-analysis.md` (lines 1431-1562)
- Proxmox VE CEPH documentation
- CEPH deployment best practices
- [Ansible CEPH automation pattern](../../.claude/skills/ansible-best-practices/patterns/ceph-automation.md)

# Proxmox Cluster Formation Workflow

Complete guide to forming a Proxmox VE cluster using Ansible automation with idempotent patterns.

## Overview

This workflow automates the creation of a Proxmox VE cluster with:

- Hostname resolution configuration
- SSH key distribution for cluster operations
- Idempotent cluster initialization
- Corosync network configuration
- Quorum and health verification

## Prerequisites

Before forming a cluster:

1. **All nodes must have:**
   - Proxmox VE 8.x or 9.x installed
   - Network connectivity on the management network
   - Dedicated corosync network configured (VLAN 9 for Matrix)
   - Unique hostnames
   - Synchronized time (NTP configured)

2. **Minimum requirements:**
   - At least 3 nodes for quorum (production)
   - 1 node for development/testing (not recommended)

3. **Network requirements:**
   - All nodes must be able to resolve each other's hostnames
   - Corosync network must be isolated (no VM traffic)
   - Low latency between nodes (<2 ms recommended)
   - MTU 1500 on the management network

## Phase 1: Prepare Cluster Nodes

### Step 1: Verify Prerequisites

```yaml
# roles/proxmox_cluster/tasks/prerequisites.yml
---
- name: Check Proxmox VE is installed
  ansible.builtin.stat:
    path: /usr/bin/pvecm
  register: pvecm_binary
  failed_when: not pvecm_binary.stat.exists

- name: Get Proxmox VE version
  ansible.builtin.command:
    cmd: pveversion
  register: pve_version
  changed_when: false

- name: Verify minimum Proxmox VE version
  ansible.builtin.assert:
    that:
      - "'pve-manager/9' in pve_version.stdout or 'pve-manager/8' in pve_version.stdout"
    fail_msg: "Proxmox VE 8.x or 9.x required"

- name: Verify minimum node count for production
  ansible.builtin.assert:
    that:
      - groups[cluster_group] | length >= 3
    fail_msg: "Production cluster requires at least 3 nodes for quorum"
  when: cluster_environment == 'production'

- name: Check no existing cluster membership
  ansible.builtin.command:
    cmd: pvecm status
  register: existing_cluster
  failed_when: false
  changed_when: false

- name: Display cluster warning if already member
  ansible.builtin.debug:
    msg: |
      WARNING: Node {{ inventory_hostname }} is already a cluster member.
      Current cluster: {{ existing_cluster.stdout }}
      This playbook will attempt to join the target cluster.
  when:
    - existing_cluster.rc == 0
    - cluster_name not in existing_cluster.stdout
```

### Step 2: Configure Hostname Resolution

```yaml
# roles/proxmox_cluster/tasks/hosts_config.yml
---
- name: Ensure cluster nodes in /etc/hosts (management IP)
  ansible.builtin.lineinfile:
    path: /etc/hosts
    regexp: "^{{ item.management_ip }}\\s+"
    line: "{{ item.management_ip }} {{ item.fqdn }} {{ item.short_name }}"
    state: present
  loop: "{{ cluster_nodes }}"
  loop_control:
    label: "{{ item.short_name }}"

- name: Ensure corosync IPs in /etc/hosts
  ansible.builtin.lineinfile:
    path: /etc/hosts
    regexp: "^{{ item.corosync_ip }}\\s+"
    line: "{{ item.corosync_ip }} {{ item.short_name }}-corosync"
    state: present
  loop: "{{ cluster_nodes }}"
  loop_control:
    label: "{{ item.short_name }}"

- name: Verify hostname resolution (forward)
  ansible.builtin.command:
    cmd: "getent hosts {{ item.fqdn }}"
  register: host_lookup
  failed_when: host_lookup.rc != 0
  changed_when: false
  loop: "{{ cluster_nodes }}"
  loop_control:
    label: "{{ item.fqdn }}"

- name: Verify hostname resolution (reverse)
  ansible.builtin.command:
    cmd: "getent hosts {{ item.management_ip }}"
  register: reverse_lookup
  failed_when:
    - reverse_lookup.rc != 0
  changed_when: false
  loop: "{{ cluster_nodes }}"
  loop_control:
    label: "{{ item.management_ip }}"

- name: Test corosync network connectivity
  ansible.builtin.command:
    cmd: "ping -c 3 -W 2 {{ item.corosync_ip }}"
  register: corosync_ping
  changed_when: false
  when: item.short_name != inventory_hostname_short
  loop: "{{ cluster_nodes }}"
  loop_control:
    label: "{{ item.short_name }}"
```

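For the Matrix values shown later in this document, the two `lineinfile` tasks above would render entries like these on every node (illustrative):

```text
192.168.3.5 foxtrot.matrix.spaceships.work foxtrot
192.168.8.5 foxtrot-corosync
```
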
### Step 3: Distribute SSH Keys

```yaml
# roles/proxmox_cluster/tasks/ssh_keys.yml
---
- name: Generate SSH key for root (if not exists)
  ansible.builtin.user:
    name: root
    generate_ssh_key: true
    ssh_key_type: ed25519
    ssh_key_comment: "root@{{ inventory_hostname }}"
  register: root_ssh_key

- name: Fetch public keys from all nodes
  ansible.builtin.slurp:
    src: /root/.ssh/id_ed25519.pub
  register: node_public_keys

- name: Distribute SSH keys to all nodes
  ansible.posix.authorized_key:
    user: root
    state: present
    key: "{{ hostvars[item].node_public_keys.content | b64decode }}"
    comment: "cluster-{{ item }}"
  loop: "{{ groups[cluster_group] }}"
  when: item != inventory_hostname

- name: Populate known_hosts with node SSH keys
  ansible.builtin.shell:
    cmd: "ssh-keyscan -H {{ item }} >> /root/.ssh/known_hosts"
  when: item != inventory_hostname
  loop: "{{ groups[cluster_group] }}"
  loop_control:
    label: "{{ item }}"
  changed_when: true  # appends on every run; entries are not deduplicated

- name: Test SSH connectivity to all nodes
  ansible.builtin.command:
    cmd: "ssh -o ConnectTimeout=5 {{ item }} hostname"
  register: ssh_test
  changed_when: false
  when: item != inventory_hostname
  loop: "{{ groups[cluster_group] }}"
  loop_control:
    label: "{{ item }}"
```

## Phase 2: Initialize Cluster

### Step 4: Create Cluster (First Node Only)

```yaml
# roles/proxmox_cluster/tasks/cluster_init.yml
---
- name: Check existing cluster status
  ansible.builtin.command:
    cmd: pvecm status
  register: cluster_status
  failed_when: false
  changed_when: false

- name: Get cluster nodes list
  ansible.builtin.command:
    cmd: pvecm nodes
  register: cluster_nodes_check
  failed_when: false
  changed_when: false

- name: Set cluster facts
  ansible.builtin.set_fact:
    in_target_cluster: "{{ cluster_status.rc == 0 and cluster_name in cluster_status.stdout }}"

- name: Create new cluster on first node
  ansible.builtin.command:
    cmd: "pvecm create {{ cluster_name }} --link0 {{ corosync_link0_address }}"
  when: not in_target_cluster
  register: cluster_create
  changed_when: cluster_create.rc == 0

- name: Wait for cluster to initialize
  ansible.builtin.pause:
    seconds: 10
  when: cluster_create.changed

- name: Verify cluster creation
  ansible.builtin.command:
    cmd: pvecm status
  register: cluster_verify
  changed_when: false
  failed_when: cluster_name not in cluster_verify.stdout

- name: Display cluster status
  ansible.builtin.debug:
    var: cluster_verify.stdout_lines
  when: cluster_create.changed or ansible_verbosity > 0
```

### Step 5: Join Nodes to Cluster

```yaml
# roles/proxmox_cluster/tasks/cluster_join.yml
---
- name: Check if already in cluster
  ansible.builtin.command:
    cmd: pvecm status
  register: cluster_status
  failed_when: false
  changed_when: false

- name: Set membership facts
  ansible.builtin.set_fact:
    is_cluster_member: "{{ cluster_status.rc == 0 }}"
    in_target_cluster: "{{ cluster_status.rc == 0 and cluster_name in cluster_status.stdout }}"

- name: Get first node hostname
  ansible.builtin.set_fact:
    first_node_hostname: "{{ hostvars[groups[cluster_group][0]].inventory_hostname }}"

- name: Join cluster
  ansible.builtin.command:
    cmd: >
      pvecm add {{ first_node_hostname }}
      --link0 {{ corosync_link0_address }}
  when:
    - not is_cluster_member or not in_target_cluster
  register: cluster_join
  changed_when: cluster_join.rc == 0
  failed_when:
    - cluster_join.rc != 0
    - "'already in a cluster' not in cluster_join.stderr"

- name: Wait for node to join cluster
  ansible.builtin.pause:
    seconds: 10
  when: cluster_join.changed

- name: Verify cluster membership
  ansible.builtin.command:
    cmd: pvecm status
  register: join_verify
  changed_when: false
  failed_when:
    - "'Quorate: Yes' not in join_verify.stdout"
```

## Phase 3: Configure Corosync

### Step 6: Corosync Network Configuration

```yaml
# roles/proxmox_cluster/tasks/corosync.yml
---
- name: Get current corosync configuration
  ansible.builtin.slurp:
    src: /etc/pve/corosync.conf
  register: corosync_conf_current

- name: Parse current corosync config
  ansible.builtin.set_fact:
    current_corosync: "{{ corosync_conf_current.content | b64decode }}"

# Update only when a node's corosync IP is missing from the running config;
# checking for the bare network CIDR would never match and always trigger
- name: Check if corosync config needs update
  ansible.builtin.set_fact:
    corosync_needs_update: "{{ cluster_nodes | map(attribute='corosync_ip') | reject('in', current_corosync) | list | length > 0 }}"

- name: Backup corosync.conf
  ansible.builtin.copy:
    src: /etc/pve/corosync.conf
    dest: "/etc/pve/corosync.conf.{{ ansible_date_time.epoch }}.bak"
    remote_src: true
    mode: '0640'
  when: corosync_needs_update
  delegate_to: "{{ groups[cluster_group][0] }}"
  run_once: true

- name: Update corosync configuration
  ansible.builtin.template:
    src: corosync.conf.j2
    dest: /etc/pve/corosync.conf.new
    validate: corosync-cfgtool -c %s
    mode: '0640'
  when: corosync_needs_update
  delegate_to: "{{ groups[cluster_group][0] }}"
  run_once: true

- name: Apply new corosync configuration
  ansible.builtin.copy:
    src: /etc/pve/corosync.conf.new
    dest: /etc/pve/corosync.conf
    remote_src: true
    mode: '0640'
  when: corosync_needs_update
  notify:
    - reload corosync
  delegate_to: "{{ groups[cluster_group][0] }}"
  run_once: true
```

**Corosync Template Example:**

```jinja2
# templates/corosync.conf.j2
totem {
  version: 2
  cluster_name: {{ cluster_name }}
  transport: knet
  crypto_cipher: aes256
  crypto_hash: sha256

  interface {
    linknumber: 0
    knet_link_priority: 255
  }
}

nodelist {
{% for node in cluster_nodes %}
  node {
    name: {{ node.short_name }}
    nodeid: {{ node.node_id }}
    quorum_votes: 1
    ring0_addr: {{ node.corosync_ip }}
  }
{% endfor %}
}

quorum {
  provider: corosync_votequorum
{% if cluster_nodes | length == 2 %}
  two_node: 1
{% endif %}
}

logging {
  to_logfile: yes
  logfile: /var/log/corosync/corosync.log
  to_syslog: yes
  timestamp: on
}
```

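Note: Proxmox VE only propagates an edited `/etc/pve/corosync.conf` when the `config_version` value in the `totem` section is higher than the running one. The template above omits it, so add and increment a `config_version` line before using this against a live cluster; the verify step below reads that same value via `corosync-cmapctl`.
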
## Phase 4: Verify Cluster Health

### Step 7: Health Checks

```yaml
# roles/proxmox_cluster/tasks/verify.yml
---
- name: Wait for cluster to stabilize
  ansible.builtin.pause:
    seconds: 15

- name: Check cluster quorum
  ansible.builtin.command:
    cmd: pvecm status
  register: cluster_health
  changed_when: false
  failed_when: "'Quorate: Yes' not in cluster_health.stdout"

- name: Get cluster node count
  ansible.builtin.command:
    cmd: pvecm nodes
  register: cluster_nodes_final
  changed_when: false

- name: Verify expected node count
  ansible.builtin.assert:
    that:
      - cluster_nodes_final.stdout_lines | length >= groups[cluster_group] | length
    fail_msg: "Expected {{ groups[cluster_group] | length }} nodes but found {{ cluster_nodes_final.stdout_lines | length }}"

- name: Check corosync ring status
  ansible.builtin.command:
    cmd: corosync-cfgtool -s
  register: corosync_status
  changed_when: false

- name: Verify all nodes in corosync
  ansible.builtin.assert:
    that:
      - "'online' in corosync_status.stdout"
    fail_msg: "Corosync ring issues detected"

- name: Get cluster configuration version
  ansible.builtin.command:
    cmd: corosync-cmapctl -b totem.config_version
  register: config_version
  changed_when: false

- name: Display cluster health summary
  ansible.builtin.debug:
    msg: |
      Cluster: {{ cluster_name }}
      Quorum: {{ 'Yes' if 'Quorate: Yes' in cluster_health.stdout else 'No' }}
      Nodes: {{ cluster_nodes_final.stdout_lines | length }}
      Config Version: {{ config_version.stdout }}
```

## Matrix Cluster Example Configuration

```yaml
# group_vars/matrix_cluster.yml
---
cluster_name: "Matrix"
cluster_group: "matrix_cluster"
cluster_environment: "production"

# Corosync configuration
corosync_network: "192.168.8.0/24"  # VLAN 9

# Node configuration
cluster_nodes:
  - short_name: foxtrot
    fqdn: foxtrot.matrix.spaceships.work
    management_ip: 192.168.3.5
    corosync_ip: 192.168.8.5
    node_id: 1

  - short_name: golf
    fqdn: golf.matrix.spaceships.work
    management_ip: 192.168.3.6
    corosync_ip: 192.168.8.6
    node_id: 2

  - short_name: hotel
    fqdn: hotel.matrix.spaceships.work
    management_ip: 192.168.3.7
    corosync_ip: 192.168.8.7
    node_id: 3

# Set per-node corosync address
corosync_link0_address: "{{ cluster_nodes | selectattr('short_name', 'equalto', inventory_hostname_short) | map(attribute='corosync_ip') | first }}"
```

## Complete Playbook Example

```yaml
# playbooks/cluster-init.yml
---
- name: Initialize Proxmox Cluster
  hosts: "{{ cluster_group | default('matrix_cluster') }}"
  become: true
  serial: 1  # One node at a time for safety

  pre_tasks:
    - name: Validate cluster group is defined
      ansible.builtin.assert:
        that:
          - cluster_group is defined
          - cluster_name is defined
          - cluster_nodes is defined
        fail_msg: "Required variables not defined in group_vars"

    - name: Display cluster configuration
      ansible.builtin.debug:
        msg: |
          Forming cluster: {{ cluster_name }}
          Nodes: {{ cluster_nodes | map(attribute='short_name') | join(', ') }}
          Corosync network: {{ corosync_network }}
      run_once: true

  tasks:
    # role_path is only defined while a role is executing, so pull the role's
    # task files in via include_role with tasks_from rather than include_tasks
    - name: Verify prerequisites
      ansible.builtin.include_role:
        name: proxmox_cluster
        tasks_from: prerequisites

    - name: Configure /etc/hosts
      ansible.builtin.include_role:
        name: proxmox_cluster
        tasks_from: hosts_config

    - name: Distribute SSH keys
      ansible.builtin.include_role:
        name: proxmox_cluster
        tasks_from: ssh_keys

    # First node creates the cluster
    - name: Initialize cluster on first node
      ansible.builtin.include_role:
        name: proxmox_cluster
        tasks_from: cluster_init
      when: inventory_hostname == groups[cluster_group][0]

    # Wait for the first node
    - name: Wait for first node to complete
      ansible.builtin.pause:
        seconds: 20
      when: inventory_hostname != groups[cluster_group][0]

    # Other nodes join
    - name: Join cluster on other nodes
      ansible.builtin.include_role:
        name: proxmox_cluster
        tasks_from: cluster_join
      when: inventory_hostname != groups[cluster_group][0]

    - name: Configure corosync
      ansible.builtin.include_role:
        name: proxmox_cluster
        tasks_from: corosync

    - name: Verify cluster health
      ansible.builtin.include_role:
        name: proxmox_cluster
        tasks_from: verify

  post_tasks:
    - name: Display final cluster status
      ansible.builtin.command:
        cmd: pvecm status
      register: final_status
      changed_when: false
      delegate_to: "{{ groups[cluster_group][0] }}"
      run_once: true

    - name: Show cluster status
      ansible.builtin.debug:
        var: final_status.stdout_lines
      run_once: true

  handlers:
    - name: reload corosync
      ansible.builtin.systemd:
        name: corosync
        state: reloaded
      throttle: 1
```

## Usage

### Initialize Matrix Cluster

```bash
# Check syntax
ansible-playbook playbooks/cluster-init.yml --syntax-check

# Dry run (limited functionality)
ansible-playbook playbooks/cluster-init.yml --check --diff

# Initialize cluster
ansible-playbook playbooks/cluster-init.yml --limit matrix_cluster

# Verify cluster status
ansible -i inventory/proxmox.yml foxtrot -m shell -a "pvecm status"
ansible -i inventory/proxmox.yml foxtrot -m shell -a "pvecm nodes"
```

### Add mise Task

```toml
# .mise.toml
[tasks."cluster:init"]
description = "Initialize Proxmox cluster"
run = """
cd ansible
uv run ansible-playbook playbooks/cluster-init.yml
"""

[tasks."cluster:status"]
description = "Show cluster status"
run = """
ansible -i ansible/inventory/proxmox.yml foxtrot -m shell -a "pvecm status"
"""
```

## Troubleshooting

### Node Won't Join Cluster

**Symptoms:**

- `pvecm add` fails with a timeout or connection error

**Solutions:**

1. Verify SSH connectivity: `ssh root@first-node hostname`
2. Check /etc/hosts: `getent hosts first-node`
3. Verify the corosync network: `ping -c 3 192.168.8.5`
4. Check the firewall: `iptables -L | grep 5404`

### Cluster Shows No Quorum

**Symptoms:**

- `pvecm status` shows `Quorate: No`

**Solutions:**

1. Check node count: a majority must be online (2 of 3, 3 of 5, etc.)
2. Verify corosync: `systemctl status corosync`
3. Check the corosync ring: `corosync-cfgtool -s`
4. Review logs: `journalctl -u corosync -n 50`

### Configuration Sync Issues

**Symptoms:**

- Changes on one node don't appear on others

**Solutions:**

1. Verify pmxcfs: `systemctl status pve-cluster`
2. Check the cluster filesystem: `pvecm status | grep -i cluster`
3. Restart the cluster filesystem: `systemctl restart pve-cluster`

## Related Workflows

- [CEPH Deployment](ceph-deployment.md) - Deploy CEPH after cluster formation
- [Network Configuration](../reference/networking.md) - Configure cluster networking
- [Cluster Maintenance](cluster-maintenance.md) - Add/remove nodes, upgrades

## References

- ProxSpray analysis: `docs/proxspray-analysis.md` (lines 1318-1428)
- Proxmox VE Cluster Manager documentation
- Corosync configuration guide
- [Ansible cluster automation pattern](../../ansible-best-practices/patterns/cluster-automation.md)