Initial commit
This commit is contained in:
116
commands/analyze-node-tuning.md
Normal file
116
commands/analyze-node-tuning.md
Normal file
@@ -0,0 +1,116 @@
|
||||
---
|
||||
description: Analyze kernel/sysctl tuning from a live node or sosreport snapshot and propose NTO recommendations
|
||||
argument-hint: "[--sosreport PATH] [--format json|markdown] [--max-irq-samples N]"
|
||||
---
|
||||
|
||||
## Name
|
||||
node-tuning:analyze-node-tuning
|
||||
|
||||
## Synopsis
|
||||
```text
|
||||
/node-tuning:analyze-node-tuning [--sosreport PATH] [--collect-sosreport|--no-collect-sosreport] [--sosreport-output PATH] [--node NODE] [--kubeconfig PATH] [--oc-binary PATH] [--format json|markdown] [--max-irq-samples N] [--keep-snapshot]
|
||||
```
|
||||
|
||||
## Description
|
||||
The `node-tuning:analyze-node-tuning` command inspects kernel tuning signals gathered from either a live OpenShift node (`/proc`, `/sys`), an `oc debug node/<name>` snapshot captured via KUBECONFIG, or an extracted sosreport directory. It parses CPU isolation parameters, IRQ affinity, huge page allocation, critical sysctl settings, and networking counters before compiling actionable recommendations that can be enforced through Tuned profiles or MachineConfig updates.
|
||||
|
||||
Use this command when you need to:
|
||||
- Audit a node for tuning regressions after upgrades or configuration changes.
|
||||
- Translate findings into remediation steps for the Node Tuning Operator.
|
||||
- Produce JSON or Markdown reports suitable for incident response, CI gates, or documentation.
|
||||
|
||||
## Implementation
|
||||
1. **Establish data source**
|
||||
- Live (local) analysis: the helper script defaults to `/proc` and `/sys`. Ensure the command runs on the target node (or within an SSH session / debug pod).
|
||||
- Remote analysis via `oc debug`: provide `--node <name>` (plus optional `--kubeconfig` and `--oc-binary`). The helper defaults to entering the RHCOS `toolbox` (backed by the `registry.redhat.io/rhel9/support-tools` image) via `oc debug node/<name>`, running `sosreport --batch --quiet -e openshift -e openshift_ovn -e openvswitch -e podman -e crio -k crio.all=on -k crio.logs=on -k podman.all=on -k podman.logs=on -k networking.ethtool-namespaces=off --all-logs --plugin-timeout=600`, streaming the archive locally (respecting `--sosreport-output` when set), and analyzing the extracted data. Use `--toolbox-image` (or `TOOLBOX_IMAGE`) to point at a mirrored support-tools image, `--sosreport-arg` to append extra flags (repeat per flag), or `--skip-default-sosreport-flags` to take full control. Host HTTP(S) proxy variables are forwarded when present but entirely optional. Add `--no-collect-sosreport` to skip sosreport generation entirely, and `--keep-snapshot` if you want to retain the downloaded files.
|
||||
- Offline analysis: provide `--sosreport /path/to/sosreport-<timestamp>` pointing to an extracted sosreport directory; the script auto-discovers embedded `proc/` and `sys/` trees.
|
||||
- Override non-standard layouts with `--proc-root` or `--sys-root` as needed.
|
||||
|
||||
2. **Prepare workspace**
|
||||
- Create `.work/node-tuning/<hostname>/` to store generated reports (remote snapshots and sosreport captures may reuse this path or default to a temporary directory).
|
||||
- Decide whether you want Markdown (human-readable) or JSON (automation-ready) output. Set `--format json` and `--output` for machine consumption.
|
||||
|
||||
3. **Invoke the analysis helper**
|
||||
```bash
|
||||
python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
|
||||
--sosreport "$SOS_DIR" \
|
||||
--format markdown \
|
||||
--max-irq-samples 10 \
|
||||
--output ".work/node-tuning/${HOSTNAME}/analysis.md"
|
||||
```
|
||||
- Omit `--sosreport` and `--node` to evaluate the local environment.
|
||||
- Lower `--max-irq-samples` to cap the number of IRQ affinity overlaps listed in the report.
|
||||
|
||||
4. **Interpret results**
|
||||
- **System Overview**: Validates kernel release, NUMA nodes, and kernel cmdline flags (isolcpus, nohz_full, tuned.non_isolcpus).
|
||||
- **CPU & Isolation**: Highlights SMT detection, isolated CPU masks, and mismatches between default IRQ affinity and isolated cores.
|
||||
- **Huge Pages**: Summarizes global and per-NUMA huge page pools, reserved counts, and sysctl targets.
|
||||
- **Sysctl Highlights**: Surfaces values for tuning-critical keys (e.g., `net.core.netdev_max_backlog`, `vm.swappiness`, THP state) with recommendations when thresholds are missed.
|
||||
- **Network Signals**: Examines `TcpExt` counters and sockstat data for backlog drops, syncookie failures, or orphaned sockets.
|
||||
- **IRQ Affinity**: Lists IRQs overlapping isolated CPUs so you can adjust tuned profiles or irqbalance policies.
|
||||
- **Process Snapshot**: When available in sosreport snapshots, shows top CPU consumers and flags irqbalance presence.
|
||||
|
||||
5. **Apply remediation**
|
||||
- Feed the recommendations into `/node-tuning:generate-tuned-profile` or MachineConfig workflows.
|
||||
- For immediate live tuning, adjust sysctls or interrupt affinities manually, then rerun the analysis to confirm remediation.
|
||||
|
||||
## Return Value
|
||||
- **Success**: Returns a Markdown or JSON report summarizing findings and recommended actions.
|
||||
- **Failure**: Reports descriptive errors (e.g., missing `proc/` or `sys/` directories, unreadable sosreport path) and exits non-zero.
|
||||
|
||||
## Examples
|
||||
|
||||
1. **Analyze a live node and print Markdown**
|
||||
```text
|
||||
/node-tuning:analyze-node-tuning --format markdown
|
||||
```
|
||||
|
||||
2. **Capture `/proc` and `/sys` via `oc debug` (sosreport by default) and analyze remotely**
|
||||
```text
|
||||
/node-tuning:analyze-node-tuning \
|
||||
--node worker-rt-0 \
|
||||
--kubeconfig ~/.kube/prod \
|
||||
--format markdown
|
||||
```
|
||||
|
||||
3. **Collect a sosreport via `oc debug` (custom image + flags) and analyze it locally**
|
||||
```text
|
||||
/node-tuning:analyze-node-tuning \
|
||||
--node worker-rt-0 \
|
||||
--toolbox-image registry.example.com/support-tools:latest \
|
||||
--sosreport-arg "--case-id=01234567" \
|
||||
--sosreport-output .work/node-tuning/sosreports \
|
||||
--format json
|
||||
```
|
||||
|
||||
4. **Inspect an extracted sosreport and save JSON to disk**
|
||||
```text
|
||||
/node-tuning:analyze-node-tuning \
|
||||
--sosreport ~/Downloads/sosreport-worker-001 \
|
||||
--format json \
|
||||
--max-irq-samples 20
|
||||
```
|
||||
|
||||
5. **Limit the recommendation set to a handful of IRQ overlaps**
|
||||
```text
|
||||
/node-tuning:analyze-node-tuning --sosreport /tmp/sosreport --max-irq-samples 5
|
||||
```
|
||||
|
||||
## Arguments:
|
||||
- **--sosreport**: Path to an extracted sosreport directory to analyze instead of the live filesystem.
|
||||
- **--format**: Output format (`markdown` default or `json` for structured data).
|
||||
- **--output**: Optional file path where the helper writes the report.
|
||||
- **--max-irq-samples**: Maximum number of IRQ affinity overlaps to include in the output (default 15).
|
||||
- **--proc-root**: Override path to the procfs tree when auto-detection is insufficient.
|
||||
- **--sys-root**: Override path to the sysfs tree when auto-detection is insufficient.
|
||||
- **--node**: OpenShift node name to analyze via `oc debug node/<name>` when direct access is not possible.
|
||||
- **--kubeconfig**: Path to the kubeconfig file used for `oc debug`; relies on the current oc context when omitted.
|
||||
- **--oc-binary**: Path to the `oc` binary (defaults to `$OC_BIN` or `oc`).
|
||||
- **--keep-snapshot**: Preserve the temporary directory produced from `oc debug` (snapshots or sosreports) for later inspection.
|
||||
- **--collect-sosreport**: Trigger `sosreport` via `oc debug node/<name>`, download the archive, and analyze the extracted contents automatically (default behavior whenever `--node` is supplied and no other source is chosen).
|
||||
- **--no-collect-sosreport**: Disable the default sosreport workflow when `--node` is supplied, falling back to the raw `/proc`/`/sys` snapshot.
|
||||
- **--sosreport-output**: Directory where downloaded sosreport archives and their extraction should be placed (defaults to a temporary directory).
|
||||
- **--toolbox-image**: Override the container image that toolbox pulls when collecting sosreport (defaults to `registry.redhat.io/rhel9/support-tools:latest` or `TOOLBOX_IMAGE` env).
|
||||
- **--sosreport-arg**: Append an additional argument to the sosreport command (repeatable).
|
||||
- **--skip-default-sosreport-flags**: Do not include the default OpenShift-focused sosreport plugins/collectors; only use values supplied via `--sosreport-arg`.
|
||||
|
||||
200
commands/generate-tuned-profile.md
Normal file
200
commands/generate-tuned-profile.md
Normal file
@@ -0,0 +1,200 @@
|
||||
---
|
||||
description: Generate a Tuned (tuned.openshift.io/v1) profile manifest for the Node Tuning Operator
|
||||
argument-hint: "[profile-name] [--summary ...] [--sysctl ...] [options]"
|
||||
---
|
||||
|
||||
## Name
|
||||
node-tuning:generate-tuned-profile
|
||||
|
||||
## Synopsis
|
||||
```text
|
||||
/node-tuning:generate-tuned-profile [profile-name] [--summary TEXT] [--include VALUE ...] [--sysctl KEY=VALUE ...] [--match-label KEY[=VALUE] ...] [options]
|
||||
```
|
||||
|
||||
## Description
|
||||
The `node-tuning:generate-tuned-profile` command streamlines creation of `tuned.openshift.io/v1` manifests for the OpenShift Node Tuning Operator. It captures the desired Tuned profile metadata, tuned daemon configuration blocks (e.g. `[sysctl]`, `[variables]`, `[bootloader]`), and recommendation rules, then invokes the helper script at `plugins/node-tuning/skills/scripts/generate_tuned_profile.py` to render a ready-to-apply YAML file.
|
||||
|
||||
Use this command whenever you need to:
|
||||
- Bootstrap a new Tuned custom profile targeting selected nodes or machine config pools
|
||||
- Generate manifests that can be version-controlled alongside other automation
|
||||
- Iterate on sysctl, bootloader, or service parameters without hand-editing multi-line YAML
|
||||
|
||||
The generated manifest follows the structure expected by the cluster Node Tuning Operator:
|
||||
```
|
||||
apiVersion: tuned.openshift.io/v1
|
||||
kind: Tuned
|
||||
metadata:
|
||||
name: <profile-name>
|
||||
namespace: openshift-cluster-node-tuning-operator
|
||||
spec:
|
||||
profile:
|
||||
- data: |
|
||||
[main]
|
||||
summary=...
|
||||
include=...
|
||||
...
|
||||
name: <profile-name>
|
||||
recommend:
|
||||
- machineConfigLabels: {...}
|
||||
match:
|
||||
- label: ...
|
||||
value: ...
|
||||
priority: <priority>
|
||||
profile: <profile-name>
|
||||
```
|
||||
|
||||
## Implementation
|
||||
1. **Collect inputs**
|
||||
- Confirm Python 3.8+ is available (`python3 --version`).
|
||||
- Gather the Tuned profile name, summary, optional include chain, sysctl values, variables, and any additional section lines (e.g. `[bootloader]`, `[service]`).
|
||||
- Determine targeting rules: either `--match-label` entries (node labels) or `--machine-config-label` entries (MachineConfigPool selectors).
|
||||
- Decide whether an accompanying MachineConfigPool (MCP) workflow is required for kernel boot arguments (see **Advanced Workflow** below).
|
||||
- Use the helper's `--list-nodes` and `--label-node` flags when you need to inspect or label nodes prior to manifest generation.
|
||||
|
||||
2. **Build execution workspace**
|
||||
- Create or reuse `.work/node-tuning/<profile-name>/`.
|
||||
- Decide on the manifest filename (default `tuned.yaml` inside the workspace) or provide `--output` to override.
|
||||
|
||||
3. **Invoke the generator script**
|
||||
- Run the helper with the collected switches:
|
||||
```text
|
||||
bash
|
||||
python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
|
||||
--profile-name "$PROFILE_NAME" \
|
||||
--summary "$SUMMARY" \
|
||||
--include openshift-node \
|
||||
--sysctl net.core.netdev_max_backlog=16384 \
|
||||
--variable isolated_cores=1 \
|
||||
--section bootloader:cmdline_ocp_realtime=+systemd.cpu_affinity=${not_isolated_cores_expanded} \
|
||||
--machine-config-label machineconfiguration.openshift.io/role=worker-rt \
|
||||
--match-label tuned.openshift.io/elasticsearch="" \
|
||||
--priority 25 \
|
||||
--output ".work/node-tuning/$PROFILE_NAME/tuned.yaml"
|
||||
```
|
||||
- Use `--dry-run` to print the manifest to stdout before writing, if desired.
|
||||
|
||||
4. **Validate output**
|
||||
- Inspect the generated YAML (`yq e . .work/node-tuning/$PROFILE_NAME/tuned.yaml` or open in an editor).
|
||||
- Optionally run `oc apply --server-dry-run=client -f .work/node-tuning/$PROFILE_NAME/tuned.yaml` to confirm schema compatibility.
|
||||
|
||||
5. **Apply or distribute**
|
||||
- Apply to a cluster with `oc apply -f .work/node-tuning/$PROFILE_NAME/tuned.yaml`.
|
||||
- Commit the manifest to Git or attach to automated pipelines as needed.
|
||||
|
||||
## Advanced Workflow: Huge Pages with a Dedicated MachineConfigPool
|
||||
Use this workflow when enabling huge pages or other kernel boot parameters that require coordinating the Node Tuning Operator with the Machine Config Operator while minimizing reboots.
|
||||
|
||||
1. **Label target nodes**
|
||||
- Preview candidates: `python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py --list-nodes --node-selector "node-role.kubernetes.io/worker" --skip-manifest`.
|
||||
- Label workers with the helper (repeat per node):
|
||||
```text
|
||||
bash
|
||||
python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
|
||||
--label-node ip-10-0-1-23.ec2.internal:node-role.kubernetes.io/worker-hp= \
|
||||
--overwrite-labels \
|
||||
--skip-manifest
|
||||
```
|
||||
- Alternatively run `oc label node <node> node-role.kubernetes.io/worker-hp=` directly if you prefer the CLI.
|
||||
|
||||
2. **Generate the Tuned manifest**
|
||||
- Include bootloader arguments via the helper script:
|
||||
```text
|
||||
bash
|
||||
python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
|
||||
--profile-name "openshift-node-hugepages" \
|
||||
--summary "Boot time configuration for hugepages" \
|
||||
--include openshift-node \
|
||||
--section bootloader:cmdline_openshift_node_hugepages="hugepagesz=2M hugepages=50" \
|
||||
--machine-config-label machineconfiguration.openshift.io/role=worker-hp \
|
||||
--priority 30 \
|
||||
--output .work/node-tuning/openshift-node-hugepages/hugepages-tuned-boottime.yaml
|
||||
```
|
||||
- Review the `[bootloader]` section to ensure the kernel arguments match the desired configuration (e.g. `kernel-rt`, huge pages, additional sysctls).
|
||||
|
||||
3. **Author the MachineConfigPool manifest**
|
||||
- Create `.work/node-tuning/openshift-node-hugepages/hugepages-mcp.yaml` with:
|
||||
```yaml
|
||||
apiVersion: machineconfiguration.openshift.io/v1
|
||||
kind: MachineConfigPool
|
||||
metadata:
|
||||
name: worker-hp
|
||||
labels:
|
||||
worker-hp: ""
|
||||
spec:
|
||||
machineConfigSelector:
|
||||
matchExpressions:
|
||||
- key: machineconfiguration.openshift.io/role
|
||||
operator: In
|
||||
values:
|
||||
- worker
|
||||
- worker-hp
|
||||
nodeSelector:
|
||||
matchLabels:
|
||||
node-role.kubernetes.io/worker-hp: ""
|
||||
```
|
||||
|
||||
4. **Apply manifests (optional `--dry-run`)**
|
||||
- `oc apply -f .work/node-tuning/openshift-node-hugepages/hugepages-tuned-boottime.yaml`
|
||||
- `oc apply -f .work/node-tuning/openshift-node-hugepages/hugepages-mcp.yaml`
|
||||
- Watch progress: `oc get mcp worker-hp -w`
|
||||
|
||||
5. **Verify results**
|
||||
- Confirm huge page allocation after the reboot: `oc get node <node> -o jsonpath="{.status.allocatable.hugepages-2Mi}"`
|
||||
- Inspect kernel arguments: `oc debug node/<node> -q -- chroot /host cat /proc/cmdline`
|
||||
|
||||
## Return Value
|
||||
- **Success**: Path to the generated manifest and the profile name are returned to the caller.
|
||||
- **Failure**: Script exits non-zero with stderr diagnostics (e.g. invalid `KEY=VALUE` pair, missing labels, unwritable output path).
|
||||
|
||||
## Examples
|
||||
|
||||
1. **Realtime worker profile targeting worker-rt MCP**
|
||||
```text
|
||||
/node-tuning:generate-tuned-profile openshift-realtime \
|
||||
--summary "Custom realtime tuned profile" \
|
||||
--include openshift-node --include realtime \
|
||||
--variable isolated_cores=1 \
|
||||
--section bootloader:cmdline_ocp_realtime=+systemd.cpu_affinity=${not_isolated_cores_expanded} \
|
||||
--machine-config-label machineconfiguration.openshift.io/role=worker-rt \
|
||||
--output .work/node-tuning/openshift-realtime/realtime.yaml
|
||||
```
|
||||
|
||||
2. **Sysctl-only profile matched by node label**
|
||||
```text
|
||||
/node-tuning:generate-tuned-profile custom-net-tuned \
|
||||
--summary "Increase conntrack table" \
|
||||
--sysctl net.netfilter.nf_conntrack_max=262144 \
|
||||
--match-label tuned.openshift.io/custom-net \
|
||||
--priority 18
|
||||
```
|
||||
|
||||
3. **Preview manifest without writing to disk**
|
||||
```text
|
||||
/node-tuning:generate-tuned-profile pidmax-test \
|
||||
--summary "Raise pid max" \
|
||||
--sysctl kernel.pid_max=131072 \
|
||||
--match-label tuned.openshift.io/pidmax="" \
|
||||
--dry-run
|
||||
```
|
||||
|
||||
## Arguments:
|
||||
- **$1** (`profile-name`): Name for the Tuned profile and manifest resource.
|
||||
- **--summary**: Required summary string placed in the `[main]` section.
|
||||
- **--include**: Optional include chain entries (multiple allowed).
|
||||
- **--main-option**: Additional `[main]` section key/value pairs (`KEY=VALUE`).
|
||||
- **--variable**: Add entries to the `[variables]` section (`KEY=VALUE`).
|
||||
- **--sysctl**: Add sysctl settings to the `[sysctl]` section (`KEY=VALUE`).
|
||||
- **--section**: Add lines to arbitrary sections using `SECTION:KEY=VALUE`.
|
||||
- **--machine-config-label**: MachineConfigPool selector labels (`key=value`) applied under `machineConfigLabels`.
|
||||
- **--match-label**: Node selector labels for the `recommend[].match[]` block; omit `=value` to match existence only.
|
||||
- **--priority**: Recommendation priority (integer, default 20).
|
||||
- **--namespace**: Override the manifest namespace (default `openshift-cluster-node-tuning-operator`).
|
||||
- **--output**: Destination file path; defaults to `<profile-name>.yaml` in the current directory.
|
||||
- **--dry-run**: Print manifest to stdout instead of writing to a file.
|
||||
- **--skip-manifest**: Skip manifest generation; useful when only listing or labeling nodes.
|
||||
- **--list-nodes**: List nodes via `oc get nodes` (works with `--node-selector`).
|
||||
- **--node-selector**: Label selector applied when `--list-nodes` is used.
|
||||
- **--label-node**: Apply labels to nodes using `NODE:KEY[=VALUE]` notation; repeatable.
|
||||
- **--overwrite-labels**: Allow overwriting existing labels when labeling nodes.
|
||||
- **--oc-binary**: Path to the `oc` executable (defaults to `$OC_BIN` or `oc`).
|
||||
|
||||
Reference in New Issue
Block a user