Initial commit

Zhongwei Li
2025-11-30 08:46:13 +08:00
commit 31d7c4f4b6
12 changed files with 2671 additions and 0 deletions


@@ -0,0 +1,11 @@
{
  "name": "openshift",
  "description": "OpenShift development utilities and helpers",
  "version": "0.0.1",
  "author": {
    "name": "github.com/openshift-eng"
  },
  "commands": [
    "./commands"
  ]
}

README.md

@@ -0,0 +1,3 @@
# openshift
OpenShift development utilities and helpers

commands/bump-deps.md

@@ -0,0 +1,422 @@
---
description: Bump dependencies in OpenShift projects with automated analysis and PR creation
argument-hint: <dependency> [version] [--create-jira] [--create-pr]
---
## Name
openshift:bump-deps
## Synopsis
```
/openshift:bump-deps <dependency> [version] [--create-jira] [--create-pr]
```
## Description
The `openshift:bump-deps` command automates the process of bumping dependencies in OpenShift organization projects. It analyzes the dependency, determines the appropriate version to bump to, updates the necessary files (go.mod, go.sum, package.json, etc.), runs tests, and optionally creates Jira tickets and pull requests.
This command significantly reduces the manual effort required for dependency updates by automating:
- Dependency version discovery and analysis
- Compatibility checking with current codebase
- File updates (go.mod, package.json, Dockerfile, etc.)
- Test execution to verify the update
- Jira ticket creation with comprehensive details
- Pull request creation with proper formatting
- Release notes generation
The command intelligently handles different dependency types (Go modules, npm packages, container images, etc.) and can process single or multiple dependencies at once.
## Implementation
The command executes the following workflow:
### 1. Repository Analysis
- Detects repository type (Go, Node.js, Python, etc.)
- Identifies dependency management files (go.mod, package.json, requirements.txt, etc.)
- Determines current project structure and conventions
- Checks for existing CI/CD configuration
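A minimal detection sketch; the files probed and the variable names are illustrative rather than a fixed contract:
```bash
# Detect the repository type from the dependency-management files present
REPO_TYPE="unknown"
DEP_FILES=""
if [ -f go.mod ]; then
    REPO_TYPE="go"
    DEP_FILES="go.mod go.sum"
elif [ -f package.json ]; then
    REPO_TYPE="node"
    DEP_FILES="package.json"
    [ -f package-lock.json ] && DEP_FILES="$DEP_FILES package-lock.json"
elif [ -f requirements.txt ] || [ -f pyproject.toml ]; then
    REPO_TYPE="python"
fi

# Note container and CI files for later image/CI updates
DOCKERFILES=$(find . -maxdepth 2 -name 'Dockerfile*' -type f 2>/dev/null)
CI_CONFIG=$(ls .github/workflows/*.yml 2>/dev/null)

echo "Repository type: $REPO_TYPE"
echo "Dependency files: $DEP_FILES"
[ -n "$DOCKERFILES" ] && echo "Container files: $DOCKERFILES"
[ -n "$CI_CONFIG" ] && echo "CI configuration: $CI_CONFIG"
```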
### 2. Dependency Discovery
**For Go Projects:**
- Parses go.mod to find current version
- Uses `go list -m -versions <module>` to list available versions
- Checks for major version compatibility (v0, v1, v2+)
- Identifies if dependency is direct or indirect
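For the Go case, a sketch of what this discovery could look like, assuming the dependency name is in `$DEP`:
```bash
DEP="k8s.io/api"   # illustrative dependency

# Current version as recorded in go.mod (empty if the module is not required)
CURRENT=$(go list -m -f '{{.Version}}' "$DEP" 2>/dev/null)

# Published versions known to the module proxy; the last one is the newest
AVAILABLE=$(go list -m -versions "$DEP" 2>/dev/null | tr ' ' '\n' | tail -n +2)
LATEST=$(echo "$AVAILABLE" | tail -1)

# Direct requirements appear in go.mod without a "// indirect" comment
if grep -E "^\s*${DEP} " go.mod | grep -qv '// indirect'; then
    KIND="direct"
else
    KIND="indirect"
fi

echo "$DEP: current=$CURRENT latest=$LATEST ($KIND)"
```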
**For Node.js Projects:**
- Parses package.json for current version
- Uses npm/yarn to find latest versions
- Checks semantic versioning constraints
- Identifies devDependencies vs dependencies
**For Container Images:**
- Parses Dockerfile and related files
- Checks registry for available tags
- Verifies image digest and signatures
- Identifies base images and tool images
**For Python Projects:**
- Parses requirements.txt or pyproject.toml
- Uses pip to find available versions
- Checks for version constraints
### 3. Version Selection
If no version is specified:
- Suggests latest stable version
- Considers semantic versioning (patch, minor, major)
- Checks for breaking changes in release notes
- Validates against project's minimum version requirements
If version is specified:
- Validates version exists
- Checks compatibility with current project version
- Warns about major version jumps
### 4. Impact Analysis
- Searches codebase for usage of the dependency
- Identifies files importing/using the dependency
- Analyzes API changes between versions
- Checks for deprecated features being used
- Reviews upstream changelog and release notes
- Identifies potential breaking changes
### 5. File Updates
**Go Projects:**
- Updates go.mod with new version
- Runs `go mod tidy` to update go.sum
- Runs `go mod vendor` if vendor directory exists
- Updates any version constraints in comments
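A sketch of the Go update path, assuming `$DEP` and `$NEW_VERSION` were resolved in the earlier steps:
```bash
# Pin the new version and refresh go.sum
go get "${DEP}@${NEW_VERSION}"
go mod tidy

# Keep the vendor directory in sync when the project vendors dependencies
if [ -d vendor ]; then
    go mod vendor
fi

# Show what actually changed before committing
git diff --stat go.mod go.sum
[ -d vendor ] && git diff --stat vendor/
```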
**Node.js Projects:**
- Updates package.json
- Runs `npm install` or `yarn install`
- Updates package-lock.json or yarn.lock
**Container Images:**
- Updates Dockerfile(s)
- Updates related manifests (kubernetes, etc.)
- Updates any CI configuration using the image
**Python Projects:**
- Updates requirements.txt or pyproject.toml
- Generates new lock file if applicable
### 6. Testing Strategy
- Identifies relevant test suites
- Runs unit tests: `make test` or equivalent
- Runs integration tests if available
- Runs e2e tests for critical dependencies
- Checks for test failures and analyzes logs
- Verifies build succeeds: `make build`
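A sketch of this verification, assuming the repository exposes the conventional `make` targets and falling back to the Go toolchain otherwise (the `.work/` log path is illustrative):
```bash
set -o pipefail
TEST_LOG=".work/bump-deps/test.log"
mkdir -p "$(dirname "$TEST_LOG")"

# Prefer the project's own test target; fall back to the toolchain default
if make -n test >/dev/null 2>&1; then
    make test 2>&1 | tee "$TEST_LOG"
else
    go test ./... 2>&1 | tee "$TEST_LOG"
fi
TEST_STATUS=$?

# Verify the build still succeeds with the bumped dependency
if make -n build >/dev/null 2>&1; then
    make build
else
    go build ./...
fi
BUILD_STATUS=$?

echo "tests: $([ "$TEST_STATUS" -eq 0 ] && echo passed || echo failed)"
echo "build: $([ "$BUILD_STATUS" -eq 0 ] && echo passed || echo failed)"
```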
### 7. Jira Ticket Creation (if --create-jira)
Creates a Jira ticket with:
- **Summary**: `Bump {dependency} from {old_version} to {new_version}`
- **Type**: Task or Bug (if security update)
- **Components**: Auto-detected from repository
- **Labels**: ["dependencies", "automated-update", "ai-generated"]
- Adds "security" if CVE-related
- Adds "breaking-change" if major version bump
- **Description**: Includes:
- Dependency information and type
- Current and new versions
- Changelog summary
- Breaking changes (if any)
- Files modified
- Test results
- Migration steps (if needed)
- Links to upstream release notes
- **Target Version**: Auto-detected from release branches
### 8. Pull Request Creation (if --create-pr)
Creates a pull request with:
- **Title**: `[{JIRA-ID}] Bump {dependency} from {old_version} to {new_version}`
- **Body**: Includes:
- Link to Jira ticket
- Summary of changes
- Breaking changes callout
- Testing performed
- Checklist for reviewers
- Release notes snippet
- **Labels**: Auto-applied based on change type
- **Branch naming**: `deps/{dependency}-{new_version}` or `{jira-id}-bump-{dependency}`
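A sketch of the branch and PR creation using the GitHub CLI (`gh`); the Jira ID, dependency, and versions below are illustrative placeholders carried over from the earlier steps:
```bash
JIRA_ID="OCPBUGS-12345"            # illustrative values
DEP="github.com/spf13/cobra"
OLD_VERSION="v1.7.0"
NEW_VERSION="v1.8.0"

BRANCH="${JIRA_ID}-bump-$(basename "$DEP")"
git checkout -b "$BRANCH"
git add go.mod go.sum
[ -d vendor ] && git add vendor
git commit -m "${JIRA_ID}: Bump ${DEP} from ${OLD_VERSION} to ${NEW_VERSION}"
git push -u origin "$BRANCH"

gh pr create \
    --title "[${JIRA_ID}] Bump ${DEP} from ${OLD_VERSION} to ${NEW_VERSION}" \
    --body "Jira: https://issues.redhat.com/browse/${JIRA_ID}

Bumps ${DEP} from ${OLD_VERSION} to ${NEW_VERSION}. See the Jira ticket for the changelog summary and test results."
```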
### 9. Conflict Resolution
If updates cause issues:
- Identifies conflicting dependencies
- Suggests resolution strategies
- Can attempt automatic resolution for common cases
- Provides manual resolution steps for complex scenarios
## Return Value
- **Claude agent text**: Processing status, test results, and summary
- **Side effects**:
- Modified dependency files (go.mod, package.json, etc.)
- Updated lock files
- Jira ticket created (if --create-jira)
- Pull request created (if --create-pr)
- Git branch created with changes
## Examples
1. **Bump a Go dependency to latest**:
```
/openshift:bump-deps k8s.io/api
```
Output:
```
Analyzing dependency: k8s.io/api
Current version: v0.28.0
Latest version: v0.29.1
Checking compatibility...
✅ No breaking changes detected
Updating go.mod...
Running go mod tidy...
Running tests...
✅ All tests passed
Summary:
- Dependency: k8s.io/api
- Old version: v0.28.0
- New version: v0.29.1
- Files modified: go.mod, go.sum
- Tests: ✅ Passed
Changes are ready. Use --create-pr to create a pull request.
```
2. **Bump to a specific version with Jira ticket**:
```
/openshift:bump-deps golang.org/x/net v0.20.0 --create-jira
```
Output:
```
Analyzing dependency: golang.org/x/net
Current version: v0.19.0
Target version: v0.20.0
Reviewing changes...
⚠️ Breaking changes detected in v0.20.0:
- http2: Server.IdleTimeout applies to idle h2 connections
Updating go.mod...
Running tests...
✅ All tests passed
Creating Jira ticket...
✅ Created: OCPBUGS-12345
Summary:
- Jira: https://issues.redhat.com/browse/OCPBUGS-12345
- Dependency: golang.org/x/net
- Version: v0.19.0 → v0.20.0
- Breaking changes: Yes
```
3. **Bump and create PR in one step**:
```
/openshift:bump-deps github.com/spf13/cobra --create-jira --create-pr
```
Output:
```
Processing dependency bump for github.com/spf13/cobra...
[1/7] Analyzing dependency...
Current: v1.7.0
Latest: v1.8.0
[2/7] Checking changelog...
Changes include:
- New features: Enhanced shell completion
- Bug fixes: 5 issues resolved
- No breaking changes
[3/7] Updating files...
✅ go.mod updated
✅ go.sum updated
[4/7] Running tests...
✅ Unit tests: 156/156 passed
✅ Integration tests: 23/23 passed
[5/7] Creating Jira ticket...
✅ Created: OCPBUGS-12346
[6/7] Creating git branch...
✅ Branch: OCPBUGS-12346-bump-cobra
[7/7] Creating pull request...
✅ PR created: #1234
Summary:
- Jira: https://issues.redhat.com/browse/OCPBUGS-12346
- PR: https://github.com/openshift/repo/pull/1234
- Dependency: github.com/spf13/cobra
- Version: v1.7.0 → v1.8.0
- Tests: All passed
Next steps:
1. Review the PR at the link above
2. Address any reviewer comments
3. Merge when approved
```
4. **Bump multiple related dependencies**:
```
/openshift:bump-deps "k8s.io/*"
```
Output:
```
Found 8 Kubernetes dependencies to update:
[1/8] k8s.io/api: v0.28.0 → v0.29.1
[2/8] k8s.io/apimachinery: v0.28.0 → v0.29.1
[3/8] k8s.io/client-go: v0.28.0 → v0.29.1
[4/8] k8s.io/kubectl: v0.28.0 → v0.29.1
...
These should be updated together to maintain compatibility.
Proceed with batch update? [y/N]
```
5. **Bump a container base image**:
```
/openshift:bump-deps registry.access.redhat.com/ubi9/ubi-minimal
```
Output:
```
Analyzing container image: ubi9/ubi-minimal
Current: 9.3-1361
Latest: 9.4-1194
Checking for security updates...
✅ 3 CVEs fixed in new version
Updating Dockerfile...
Building test image...
Running container tests...
✅ All tests passed
Files modified:
- Dockerfile
- .github/workflows/build.yml
```
## Arguments
- **$1** (required): Dependency identifier
- Go module: `github.com/org/repo` or `golang.org/x/net`
- npm package: `@types/node` or `react`
- Container image: `registry.access.redhat.com/ubi9/ubi-minimal`
- Wildcard for batch: `k8s.io/*` (requires confirmation)
- **$2** (optional): Target version
- Semantic version: `v1.2.3`, `1.2.3`
- Version range: `^1.2.0`, `~1.2.0`
- Special: `latest`, `latest-stable`
- If omitted: suggests latest stable version
- **--create-jira** (flag): Create a Jira ticket for the update
- Auto-detects project from repository
- Can be configured with JIRA_PROJECT env var
- Ticket includes full change analysis
- **--create-pr** (flag): Create a pull request with the changes
- Implies creating a git branch
- Includes --create-jira automatically
- PR is created as draft if tests fail
- **--jira-project** (option): Specify Jira project (default: auto-detect)
- Example: `--jira-project OCPBUGS`
- **--component** (option): Specify Jira component (default: auto-detect)
- Example: `--component "Control Plane"`
- **--branch** (option): Specify git branch name (default: auto-generate)
- Example: `--branch feature/update-deps`
- **--skip-tests** (flag): Skip running tests (not recommended)
- Use only for non-critical updates
- PR will be marked as draft
- **--force** (flag): Force update even if tests fail
- Creates PR as draft
- Includes test failure details in PR
## Error Handling
The command handles common error cases:
- **Dependency not found**: Lists similar dependencies in project
- **Version not found**: Shows available versions
- **Test failures**:
- Provides detailed error logs
- Suggests potential fixes
- Asks whether to create draft PR anyway
- **Conflicting dependencies**:
- Identifies conflicts
- Suggests resolution order
- Can attempt batch update
- **Breaking changes**:
- Highlights breaking changes
- Links to migration guides
- Requires explicit confirmation for major bumps
- **Network failures**: Retries with exponential backoff (see the retry sketch after this list)
- **Permission errors**: Checks git/GitHub authentication
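For the network-failure case, a minimal retry-with-exponential-backoff sketch that can wrap any flaky network command:
```bash
# Retry a command up to 5 times, doubling the wait between attempts
retry() {
    local attempt=1 max_attempts=5 delay=2
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Command failed after ${max_attempts} attempts: $*" >&2
            return 1
        fi
        echo "Attempt ${attempt} failed; retrying in ${delay}s..." >&2
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}

# Example: query available module versions with retries
retry go list -m -versions k8s.io/api
```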
## Notes
- Repository name and organization are auto-detected from `git remote -v`
- For Go dependencies, supports both versioned (v2+) and unversioned modules
- Automatically detects if running in a fork vs upstream repository
- Respects `.gitignore` and doesn't commit generated/vendored files unnecessarily
- Can handle dependencies with replace directives in go.mod
- Supports monorepos with multiple go.mod files
- All Jira tickets are labeled with "ai-generated" for tracking
- PR creation requires GitHub CLI (gh) to be installed and authenticated
- For security updates (CVEs), automatically prioritizes and labels appropriately
- Compatible with Renovate - can be used to customize/enhance Renovate PRs
## Environment Variables
- **JIRA_PROJECT**: Default Jira project for ticket creation
- **JIRA_COMPONENT**: Default component for Jira tickets
- **GITHUB_TOKEN**: GitHub authentication (if not using gh auth)
- **DEFAULT_BRANCH**: Override default branch detection (default: main)
## See Also
- `utils:process-renovate-pr` - Process existing Renovate dependency PRs
- `git:create-pr` - General PR creation command
- `jira:create` - Manual Jira ticket creation


@@ -0,0 +1,542 @@
---
description: Perform comprehensive health check on OpenShift cluster and report issues
argument-hint: "[--verbose] [--output-format]"
---
## Name
openshift:cluster-health-check
## Synopsis
```
/openshift:cluster-health-check [--verbose] [--output-format json|text]
```
## Description
The `cluster-health-check` command performs a comprehensive health analysis of an OpenShift/Kubernetes cluster and reports any detected issues. It examines cluster operators, nodes, deployments, pods, persistent volumes, and other critical resources to identify problems that may affect cluster stability or workload availability.
This command is useful for:
- Quick cluster status assessment
- Troubleshooting cluster issues
- Pre-deployment validation
- Regular health monitoring
- Identifying degraded components
## Prerequisites
Before using this command, ensure you have:
1. **Kubernetes/OpenShift CLI**: Either `oc` (OpenShift) or `kubectl` (Kubernetes)
- Install `oc` from: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/
- Or install `kubectl` from: https://kubernetes.io/docs/tasks/tools/
- Verify with: `oc version` or `kubectl version`
2. **Active cluster connection**: Must be connected to a running cluster
- Verify with: `oc whoami` or `kubectl cluster-info`
- Ensure KUBECONFIG is set if needed
3. **Sufficient permissions**: Must have read access to cluster resources
- Cluster-admin or monitoring role recommended for comprehensive checks
- Minimum: ability to view nodes, pods, and cluster operators
## Arguments
- **--verbose** (optional): Enable detailed output with additional context
- Shows resource-level details
- Includes warning conditions
- Provides remediation suggestions
- **--output-format** (optional): Output format for results
- `text` (default): Human-readable text format
- `json`: Machine-readable JSON format for automation
## Implementation
The command performs the following health checks:
### 1. Determine CLI Tool
Detect which Kubernetes CLI is available:
```bash
if command -v oc &> /dev/null; then
CLI="oc"
CLUSTER_TYPE="OpenShift"
elif command -v kubectl &> /dev/null; then
CLI="kubectl"
CLUSTER_TYPE="Kubernetes"
else
echo "Error: Neither 'oc' nor 'kubectl' CLI found. Please install one of them."
exit 1
fi
```
### 2. Verify Cluster Connectivity
Check if connected to a cluster:
```bash
if ! $CLI cluster-info &> /dev/null; then
echo "Error: Not connected to a cluster. Please configure your KUBECONFIG."
exit 1
fi
# Get cluster version info
if [ "$CLUSTER_TYPE" = "OpenShift" ]; then
CLUSTER_VERSION=$($CLI version -o json 2>/dev/null | jq -r '.openshiftVersion // "unknown"')
else
CLUSTER_VERSION=$($CLI version --short 2>/dev/null | grep -i server | awk '{print $3}')
fi
```
### 3. Initialize Health Check Report
Create a report structure to collect findings:
```bash
REPORT_FILE=".work/cluster-health-check/report-$(date +%Y%m%d-%H%M%S).txt"
mkdir -p .work/cluster-health-check
# Initialize counters
CRITICAL_ISSUES=0
WARNING_ISSUES=0
INFO_MESSAGES=0
```
### 4. Check Cluster Operators (OpenShift only)
For OpenShift clusters, check cluster operator health:
```bash
if [ "$CLUSTER_TYPE" = "OpenShift" ]; then
echo "Checking Cluster Operators..."
# Get all cluster operators
DEGRADED_COs=$($CLI get clusteroperators -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Degraded" and .status=="True")) | .metadata.name')
UNAVAILABLE_COs=$($CLI get clusteroperators -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Available" and .status=="False")) | .metadata.name')
PROGRESSING_COs=$($CLI get clusteroperators -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Progressing" and .status=="True")) | .metadata.name')
if [ -n "$DEGRADED_COs" ]; then
CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$DEGRADED_COs" | wc -l)))
echo "❌ CRITICAL: Degraded cluster operators found:"
echo "$DEGRADED_COs" | while read co; do
echo " - $co"
# Get degraded message
$CLI get clusteroperator "$co" -o json | jq -r '.status.conditions[] | select(.type=="Degraded") | " Reason: \(.reason)\n Message: \(.message)"'
done
fi
if [ -n "$UNAVAILABLE_COs" ]; then
CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$UNAVAILABLE_COs" | wc -l)))
echo "❌ CRITICAL: Unavailable cluster operators found:"
echo "$UNAVAILABLE_COs" | while read co; do
echo " - $co"
done
fi
if [ -n "$PROGRESSING_COs" ]; then
WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$PROGRESSING_COs" | wc -l)))
echo "⚠️ WARNING: Cluster operators in progress:"
echo "$PROGRESSING_COs" | while read co; do
echo " - $co"
done
fi
fi
```
### 5. Check Node Health
Examine all cluster nodes for issues:
```bash
echo "Checking Node Health..."
# Get nodes that are not Ready
NOT_READY_NODES=$($CLI get nodes -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Ready" and .status!="True")) | .metadata.name')
if [ -n "$NOT_READY_NODES" ]; then
CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$NOT_READY_NODES" | wc -l)))
echo "❌ CRITICAL: Nodes not in Ready state:"
echo "$NOT_READY_NODES" | while read node; do
echo " - $node"
# Get node conditions
$CLI get node "$node" -o json | jq -r '.status.conditions[] | " \(.type): \(.status) - \(.message // "N/A")"'
done
fi
# Check for SchedulingDisabled nodes
DISABLED_NODES=$($CLI get nodes -o json | jq -r '.items[] | select(.spec.unschedulable==true) | .metadata.name')
if [ -n "$DISABLED_NODES" ]; then
WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$DISABLED_NODES" | wc -l)))
echo "⚠️ WARNING: Nodes with scheduling disabled:"
echo "$DISABLED_NODES" | while read node; do
echo " - $node"
done
fi
# Check for node pressure conditions (MemoryPressure, DiskPressure, PIDPressure)
PRESSURE_NODES=$($CLI get nodes -o json | jq -r '.items[] | select(.status.conditions[] | select((.type=="MemoryPressure" or .type=="DiskPressure" or .type=="PIDPressure") and .status=="True")) | .metadata.name')
if [ -n "$PRESSURE_NODES" ]; then
WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$PRESSURE_NODES" | wc -l)))
echo "⚠️ WARNING: Nodes under resource pressure:"
echo "$PRESSURE_NODES" | while read node; do
echo " - $node"
$CLI get node "$node" -o json | jq -r '.status.conditions[] | select((.type=="MemoryPressure" or .type=="DiskPressure" or .type=="PIDPressure") and .status=="True") | " \(.type): \(.message // "N/A")"'
done
fi
# Check node resource utilization if metrics-server is available
if $CLI top nodes &> /dev/null; then
echo "Node Resource Utilization:"
$CLI top nodes
fi
```
### 6. Check Pod Health Across All Namespaces
Identify problematic pods:
```bash
echo "Checking Pod Health..."
# Get pods that are not Running or Completed
FAILED_PODS=$($CLI get pods --all-namespaces -o json | jq -r '.items[] | select(.status.phase != "Running" and .status.phase != "Succeeded") | "\(.metadata.namespace)/\(.metadata.name) [\(.status.phase)]"')
if [ -n "$FAILED_PODS" ]; then
CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$FAILED_PODS" | wc -l)))
echo "❌ CRITICAL: Pods in failed/pending state:"
echo "$FAILED_PODS"
fi
# Check for pods with restarts
HIGH_RESTART_PODS=$($CLI get pods --all-namespaces -o json | jq -r '.items[] | select(.status.containerStatuses[]? | .restartCount > 5) | "\(.metadata.namespace)/\(.metadata.name) [Restarts: \(.status.containerStatuses[0].restartCount)]"')
if [ -n "$HIGH_RESTART_PODS" ]; then
WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$HIGH_RESTART_PODS" | wc -l)))
echo "⚠️ WARNING: Pods with high restart count (>5):"
echo "$HIGH_RESTART_PODS"
fi
# Check for CrashLoopBackOff pods
CRASHLOOP_PODS=$($CLI get pods --all-namespaces -o json | jq -r '.items[] | select(.status.containerStatuses[]? | .state.waiting?.reason == "CrashLoopBackOff") | "\(.metadata.namespace)/\(.metadata.name)"')
if [ -n "$CRASHLOOP_PODS" ]; then
CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$CRASHLOOP_PODS" | wc -l)))
echo "❌ CRITICAL: Pods in CrashLoopBackOff:"
echo "$CRASHLOOP_PODS"
fi
# Check for ImagePullBackOff pods
IMAGE_PULL_PODS=$($CLI get pods --all-namespaces -o json | jq -r '.items[] | select(.status.containerStatuses[]? | .state.waiting?.reason == "ImagePullBackOff" or .state.waiting?.reason == "ErrImagePull") | "\(.metadata.namespace)/\(.metadata.name)"')
if [ -n "$IMAGE_PULL_PODS" ]; then
CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$IMAGE_PULL_PODS" | wc -l)))
echo "❌ CRITICAL: Pods with image pull errors:"
echo "$IMAGE_PULL_PODS"
fi
```
### 7. Check Deployment/StatefulSet/DaemonSet Health
Verify workload controllers:
```bash
echo "Checking Deployments..."
# Check deployments with unavailable replicas
UNHEALTHY_DEPLOYMENTS=$($CLI get deployments --all-namespaces -o json | jq -r '.items[] | select(.status.unavailableReplicas > 0 or .status.replicas != .status.readyReplicas) | "\(.metadata.namespace)/\(.metadata.name) [Ready: \(.status.readyReplicas // 0)/\(.spec.replicas)]"')
if [ -n "$UNHEALTHY_DEPLOYMENTS" ]; then
WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$UNHEALTHY_DEPLOYMENTS" | wc -l)))
echo "⚠️ WARNING: Deployments with unavailable replicas:"
echo "$UNHEALTHY_DEPLOYMENTS"
fi
echo "Checking StatefulSets..."
UNHEALTHY_STATEFULSETS=$($CLI get statefulsets --all-namespaces -o json | jq -r '.items[] | select(.status.replicas != .status.readyReplicas) | "\(.metadata.namespace)/\(.metadata.name) [Ready: \(.status.readyReplicas // 0)/\(.spec.replicas)]"')
if [ -n "$UNHEALTHY_STATEFULSETS" ]; then
WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$UNHEALTHY_STATEFULSETS" | wc -l)))
echo "⚠️ WARNING: StatefulSets with unavailable replicas:"
echo "$UNHEALTHY_STATEFULSETS"
fi
echo "Checking DaemonSets..."
UNHEALTHY_DAEMONSETS=$($CLI get daemonsets --all-namespaces -o json | jq -r '.items[] | select(.status.numberReady != .status.desiredNumberScheduled) | "\(.metadata.namespace)/\(.metadata.name) [Ready: \(.status.numberReady)/\(.status.desiredNumberScheduled)]"')
if [ -n "$UNHEALTHY_DAEMONSETS" ]; then
WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$UNHEALTHY_DAEMONSETS" | wc -l)))
echo "⚠️ WARNING: DaemonSets with unavailable pods:"
echo "$UNHEALTHY_DAEMONSETS"
fi
```
### 8. Check Persistent Volume Claims
Check for storage issues:
```bash
echo "Checking Persistent Volume Claims..."
# Get PVCs that are not Bound
PENDING_PVCS=$($CLI get pvc --all-namespaces -o json | jq -r '.items[] | select(.status.phase != "Bound") | "\(.metadata.namespace)/\(.metadata.name) [\(.status.phase)]"')
if [ -n "$PENDING_PVCS" ]; then
WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$PENDING_PVCS" | wc -l)))
echo "⚠️ WARNING: PVCs not in Bound state:"
echo "$PENDING_PVCS"
fi
```
### 9. Check Critical Namespace Health
For OpenShift, check critical namespaces:
```bash
if [ "$CLUSTER_TYPE" = "OpenShift" ]; then
echo "Checking Critical Namespaces..."
CRITICAL_NAMESPACES="openshift-kube-apiserver openshift-etcd openshift-authentication openshift-console openshift-monitoring"
for ns in $CRITICAL_NAMESPACES; do
# Check if namespace exists
if ! $CLI get namespace "$ns" &> /dev/null; then
CRITICAL_ISSUES=$((CRITICAL_ISSUES + 1))
echo "❌ CRITICAL: Critical namespace missing: $ns"
continue
fi
# Check for failed pods in critical namespace
FAILED_IN_NS=$($CLI get pods -n "$ns" -o json | jq -r '.items[] | select(.status.phase != "Running" and .status.phase != "Succeeded") | .metadata.name')
if [ -n "$FAILED_IN_NS" ]; then
CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$FAILED_IN_NS" | wc -l)))
echo "❌ CRITICAL: Failed pods in critical namespace $ns:"
echo "$FAILED_IN_NS" | while read pod; do
echo " - $pod"
done
fi
done
fi
```
### 10. Check Events for Recent Errors
Look for recent warning/error events:
```bash
echo "Checking Recent Events..."
# Get events from last 30 minutes with Warning or Error type
RECENT_WARNINGS=$($CLI get events --all-namespaces --field-selector type=Warning -o json | jq -r --arg since "$(date -u -d '30 minutes ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -v-30M +%Y-%m-%dT%H:%M:%SZ)" '.items[] | select(.lastTimestamp > $since) | "\(.lastTimestamp) [\(.involvedObject.namespace)/\(.involvedObject.name)]: \(.message)"' | head -20)
if [ -n "$RECENT_WARNINGS" ]; then
echo "⚠️ Recent Warning Events (last 30 minutes):"
echo "$RECENT_WARNINGS"
fi
```
### 11. Generate Summary Report
Create a summary of findings:
```bash
echo ""
echo "==============================================="
echo "Cluster Health Check Summary"
echo "==============================================="
echo "Cluster Type: $CLUSTER_TYPE"
echo "Cluster Version: $CLUSTER_VERSION"
echo "Check Time: $(date)"
echo ""
echo "Results:"
echo " Critical Issues: $CRITICAL_ISSUES"
echo " Warnings: $WARNING_ISSUES"
echo ""
if [ $CRITICAL_ISSUES -eq 0 ] && [ $WARNING_ISSUES -eq 0 ]; then
echo "✅ Cluster is healthy - no issues detected"
exit 0
elif [ $CRITICAL_ISSUES -gt 0 ]; then
echo "❌ Cluster has CRITICAL issues requiring immediate attention"
exit 1
else
echo "⚠️ Cluster has warnings - monitoring recommended"
exit 0
fi
```
### 12. Optional: Export to JSON Format
If `--output-format json` is specified, export findings as JSON:
```json
{
  "cluster": {
    "type": "OpenShift",
    "version": "4.21.0",
    "checkTime": "2025-10-31T12:00:00Z"
  },
  "summary": {
    "criticalIssues": 2,
    "warnings": 5,
    "healthy": false
  },
  "findings": {
    "clusterOperators": {
      "degraded": ["authentication", "monitoring"],
      "unavailable": [],
      "progressing": ["network"]
    },
    "nodes": {
      "notReady": ["worker-1"],
      "schedulingDisabled": ["worker-2"],
      "underPressure": []
    },
    "pods": {
      "failed": ["namespace/pod-1", "namespace/pod-2"],
      "crashLooping": [],
      "imagePullErrors": ["namespace/pod-3"]
    },
    "workloads": {
      "unhealthyDeployments": [],
      "unhealthyStatefulSets": [],
      "unhealthyDaemonSets": []
    },
    "storage": {
      "pendingPVCs": []
    }
  }
}
```
## Examples
### Example 1: Basic health check
```
/openshift:cluster-health-check
```
Output:
```
Checking Cluster Operators...
✅ All cluster operators healthy
Checking Node Health...
⚠️ WARNING: Nodes with scheduling disabled:
- ip-10-0-51-201.us-east-2.compute.internal
Checking Pod Health...
✅ All pods healthy
...
===============================================
Cluster Health Check Summary
===============================================
Cluster Type: OpenShift
Cluster Version: 4.21.0
Check Time: 2025-10-31 12:00:00
Results:
Critical Issues: 0
Warnings: 1
⚠️ Cluster has warnings - monitoring recommended
```
### Example 2: Verbose health check
```
/openshift:cluster-health-check --verbose
```
### Example 3: JSON output for automation
```
/openshift:cluster-health-check --output-format json
```
## Return Value
The command returns different exit codes based on findings:
- **Exit 0**: No critical issues found (cluster is healthy or has only warnings)
- **Exit 1**: Critical issues detected requiring immediate attention
**Output Format**:
- **Text** (default): Human-readable report with emoji indicators
- **JSON**: Structured data suitable for parsing/automation
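Assuming the JSON structure shown in step 12 and an illustrative report path, automation can gate on the summary fields, for example:
```bash
# Assume the JSON report was written to a file by the command
REPORT=".work/cluster-health-check/report.json"   # illustrative path

CRITICAL=$(jq -r '.summary.criticalIssues' "$REPORT")
WARNINGS=$(jq -r '.summary.warnings' "$REPORT")
echo "critical=${CRITICAL} warnings=${WARNINGS}"

# List any degraded cluster operators for the job log
jq -r '.findings.clusterOperators.degraded[]?' "$REPORT"

# Fail the pipeline when critical issues are present
[ "$CRITICAL" -eq 0 ] || exit 1
```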
## Common Issues and Remediation
### Degraded Cluster Operators
**Symptoms**: Cluster operators showing Degraded=True or Available=False
**Investigation**:
```bash
oc get clusteroperator <operator-name> -o yaml
oc logs -n openshift-<operator-namespace> -l app=<operator-name>
```
**Remediation**: Check operator logs and events for specific errors
### Nodes Not Ready
**Symptoms**: Nodes in NotReady state
**Investigation**:
```bash
oc describe node <node-name>
oc get events --field-selector involvedObject.name=<node-name>
```
**Remediation**: Common causes include network issues, disk pressure, or kubelet problems
### Pods in CrashLoopBackOff
**Symptoms**: Pods continuously restarting
**Investigation**:
```bash
oc logs <pod-name> -n <namespace> --previous
oc describe pod <pod-name> -n <namespace>
```
**Remediation**: Check application logs, resource limits, and configuration
### ImagePullBackOff Errors
**Symptoms**: Pods unable to pull container images
**Investigation**:
```bash
oc describe pod <pod-name> -n <namespace>
```
**Remediation**: Verify image name, registry credentials, and network connectivity
## Security Considerations
- **Read-only access**: This command only reads cluster state, no modifications
- **Sensitive data**: Be cautious when sharing reports as they may contain cluster topology information
- **RBAC requirements**: Ensure user has appropriate permissions for all resource types checked
## See Also
- OpenShift Documentation: https://docs.openshift.com/container-platform/latest/support/troubleshooting/
- Kubernetes Troubleshooting: https://kubernetes.io/docs/tasks/debug/
- Related commands: `/prow-job:analyze-test-failure`, `/must-gather:analyze`
## Notes
- The command checks cluster state at a point in time; transient issues may not be detected
- For OpenShift clusters, cluster operator checks are performed
- For vanilla Kubernetes, cluster operator checks are skipped
- Resource utilization checks require metrics-server to be installed
- Some checks may be skipped if user lacks sufficient permissions

commands/crd-review.md

@@ -0,0 +1,278 @@
---
description: Review Kubernetes CRDs against Kubernetes and OpenShift API conventions
argument-hint: [repository-path]
---
## Name
openshift:crd-review
## Synopsis
```
/openshift:crd-review [repository-path]
```
## Description
The `openshift:crd-review` command analyzes Go Kubernetes Custom Resource Definitions (CRDs) in a repository against both:
- **Kubernetes API Conventions** as defined in the [Kubernetes community guidelines](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md)
- **OpenShift API Conventions** as defined in the [OpenShift development guide](https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md)
This command helps ensure CRDs follow best practices for:
- API naming conventions and patterns
- Resource structure and field organization
- Status field design and patterns
- Field types and validation
- Documentation standards
- OpenShift-specific requirements
The review covers Go API type definitions, providing actionable feedback to improve API design.
## Key Convention Checks
### Kubernetes API Conventions
#### Naming Conventions
- **Resource Names**: Must follow DNS label format (lowercase, alphanumeric, hyphens)
- **Field Names**: PascalCase for Go, camelCase for JSON
- **Avoid**: Abbreviations, underscores, ambiguous names
- **Include**: Units/types in field names when needed (e.g., `timeoutSeconds`)
#### API Structure
- **Required Fields**: Every API object must embed a `k8s.io/apimachinery/pkg/apis/meta/v1` `TypeMeta` struct
- **Metadata**: Every API object must include a `k8s.io/apimachinery/pkg/apis/meta/v1` `ObjectMeta` struct called `metadata`
- **Spec/Status Separation**: Clear separation between desired state (spec) and observed state (status)
#### Status Field Design
- **Conditions**: Must include conditions array with:
- `type`: Clear, human-readable condition type
- `status`: `True`, `False`, or `Unknown`
- `reason`: Machine-readable reason code
- `message`: Human-readable message
- `lastTransitionTime`: RFC 3339 timestamp
#### Field Types
- **Integers**: Prefer `int32` over `int64`
- **Avoid**: Unsigned integers, floating-point values
- **Enums**: Use string constants, not numeric values
- **Optional Fields**: Use pointers in Go
#### Versioning
- **Group Names**: Use domain format (e.g., `myapp.example.com`)
- **Version Strings**: Must match DNS label format (e.g., `v1`, `v1beta1`)
- **Migration**: Provide clear paths between versions
### OpenShift API Conventions
#### Configuration vs Workload APIs
- **Configuration APIs**: Typically cluster-scoped, manage cluster behavior
- **Workload APIs**: Usually namespaced, user-facing resources
#### Field Design
- **Avoid Boolean Fields**: Use enumerations that describe end-user behavior instead of binary true/false
- ❌ Bad: `paused: true`
- ✅ Good: `lifecycle: "Paused"` with enum values `["Paused", "Active"]`
- **Object References**: Use specific types, omit "Ref" suffix
- **Clear Semantics**: Each field should have one clear purpose
#### Documentation Requirements
- **Godoc Comments**: Comprehensive documentation for all exported types and fields
- **JSON Field Names**: Use JSON names in documentation (not Go names)
- **User-Facing**: Write for users, not just developers
- **Explain Interactions**: Document how fields interact with each other
#### Validation
- **Kubebuilder Tags**: Use validation markers (`+kubebuilder:validation:*`)
- **Enum Values**: Explicitly define allowed values
- **Field Constraints**: Define minimums, maximums, patterns
- **Meaningful Errors**: Validation messages should guide users
#### Union Types
- **Discriminated Unions**: Use a discriminator field to select variant
- **Optional Pointers**: All union members should be optional pointers
- **Validation**: Ensure exactly one union member is set
## Implementation
The command performs the following analysis workflow:
1. **Repository Discovery**
- Find Go API types (typically in `api/`, `pkg/apis/` directories)
- Identify CRD generation markers (`+kubebuilder` comments); see the discovery sketch after this list
2. **Kubernetes Convention Validation**
- **Naming validation**: Check resource names, field names, condition types
- **Structure validation**: Verify required fields, metadata, spec/status separation
- **Status validation**: Ensure conditions array, proper condition structure
- **Field type validation**: Check integer types, avoid floats, validate enums
- **Versioning validation**: Verify group names and version strings
3. **OpenShift Convention Validation**
- **API classification**: Identify configuration vs workload APIs
- **Field design**: Flag boolean fields, check enumerations
- **Documentation**: Verify Godoc comments, user-facing descriptions
- **Validation markers**: Check kubebuilder validation tags
- **Union types**: Validate discriminated union patterns
4. **Report Generation**
- List all findings with severity levels (Critical, Warning, Info)
- Provide specific file and line references
- Include remediation suggestions
- Highlight whether a suggested change might lead to breaking API changes
- Link to relevant convention documentation
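A minimal sketch of the discovery in step 1; the directory layout and marker names follow common operator-project conventions and are not guaranteed:
```bash
REPO="${1:-.}"

# Find Go files that define API types (conventional locations first, then by marker)
API_FILES=$(find "$REPO/api" "$REPO/pkg/apis" -name '*_types.go' -type f 2>/dev/null)
if [ -z "$API_FILES" ]; then
    API_FILES=$(grep -rl --include='*.go' '+kubebuilder:object:root=true' "$REPO" 2>/dev/null)
fi

echo "API type files:"
echo "$API_FILES"

# Surface the kubebuilder markers that drive CRD generation and validation
if [ -n "$API_FILES" ]; then
    # Word splitting over the file list is intended here
    grep -n '+kubebuilder:' $API_FILES | head -40
fi
```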
## Output Format
The command generates a structured report with:
- **Summary**: Overview of findings by severity
- **Kubernetes Findings**: Issues related to upstream conventions
- **OpenShift Findings**: Issues related to OpenShift-specific patterns
- **Recommendations**: Actionable steps to improve API design
- **openshift/api api-review reference**: A prominent note pointing the user to the openshift/api repository's `api-review` command (https://github.com/openshift/api/blob/master/.claude/commands/api-review.md) for PR reviews against that repository.
Each finding includes:
- Severity level (❌ Critical, ⚠️ Warning, 💡 Info)
- File location and line number
- Description of the issue
- Remediation suggestion
- Link to relevant documentation
## Examples
### Example 1: Review current repository
```
/openshift:crd-review
```
Analyzes CRDs in the current working directory.
### Example 2: Review specific repository
```
/openshift:crd-review /path/to/operator-project
```
Analyzes CRDs in the specified directory.
### Example 3: Review with detailed output
The command automatically provides detailed output including:
- All CRD files found
- Go API type definitions
- Compliance summary
- Specific violations with file references
## Common Findings
### Kubernetes Convention Issues
#### Boolean vs Enum Fields
**Issue**: Using boolean where enum is better
```go
// ❌ Bad
type MySpec struct {
    Enabled bool `json:"enabled"`
}

// ✅ Good
type MySpec struct {
    // State defines the operational state
    // Valid values are: "Enabled", "Disabled", "Auto"
    // +kubebuilder:validation:Enum=Enabled;Disabled;Auto
    State string `json:"state"`
}
```
#### Missing Status Conditions
**Issue**: Status without conditions array
```go
// ❌ Bad
type MyStatus struct {
    Ready bool `json:"ready"`
}

// ✅ Good
type MyStatus struct {
    // Conditions represent the latest available observations
    // +listType=map
    // +listMapKey=type
    Conditions []metav1.Condition `json:"conditions,omitempty"`
}
```
#### Improper Field Naming
**Issue**: Ambiguous or abbreviated names
```go
// ❌ Bad
type MySpec struct {
    Timeout int `json:"timeout"` // Ambiguous unit
    Cnt     int `json:"cnt"`     // Abbreviation
}

// ✅ Good
type MySpec struct {
    // TimeoutSeconds is the timeout in seconds
    // +kubebuilder:validation:Minimum=1
    TimeoutSeconds int32 `json:"timeoutSeconds"`

    // Count is the number of replicas
    // +kubebuilder:validation:Minimum=0
    Count int32 `json:"count"`
}
```
### OpenShift Convention Issues
#### Missing Documentation
**Issue**: Exported fields without Godoc
```go
// ❌ Bad
type MySpec struct {
    Field string `json:"field"`
}

// ✅ Good
type MySpec struct {
    // field specifies the configuration field for...
    // This value determines how the operator will...
    // Valid values include...
    Field string `json:"field"`
}
```
#### Missing Validation
**Issue**: Fields without kubebuilder validation
```go
// ❌ Bad
type MySpec struct {
    Mode string `json:"mode"`
}

// ✅ Good
type MySpec struct {
    // mode defines the operational mode
    // +kubebuilder:validation:Enum=Standard;Advanced;Debug
    // +kubebuilder:validation:Required
    Mode string `json:"mode"`
}
```
## Best Practices
1. **Start with Conventions**: Review conventions before writing APIs
2. **Use Code Generation**: Leverage controller-gen and kubebuilder markers
3. **Document Early**: Write Godoc comments as you define types
4. **Validate Everything**: Add validation markers for all fields
5. **Review Regularly**: Run this command during development and before PRs
6. **Follow Examples**: Study well-designed APIs in OpenShift core
## Arguments
- **repository-path** (optional): Path to repository containing CRDs. Defaults to current working directory.
## Exit Codes
- **0**: Analysis completed successfully
- **1**: Error during analysis (e.g., invalid path, no CRDs found)
## See Also
- [Kubernetes API Conventions](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md)
- [OpenShift API Conventions](https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md)
- [Kubebuilder Documentation](https://book.kubebuilder.io/)
- [Controller Runtime API](https://pkg.go.dev/sigs.k8s.io/controller-runtime)

commands/create-cluster.md

@@ -0,0 +1,580 @@
---
description: Extract OpenShift installer from release image and create an OCP cluster
argument-hint: "[release-image] [platform] [options]"
---
## Name
openshift:create-cluster
## Synopsis
```
/openshift:create-cluster [release-image] [platform] [options]
```
## Description
The `create-cluster` command automates the process of extracting the OpenShift installer from a release image (if not already present) and creating a new OpenShift Container Platform (OCP) cluster. It handles installer extraction from OCP release images, configuration preparation, and cluster creation in a streamlined workflow.
This command is useful for:
- Setting up development/test clusters quickly
## ⚠️ When to Use This Tool
**IMPORTANT**: This is a last-resort tool for advanced use cases. For most development workflows, you should use one of these better alternatives:
### Recommended Alternatives
1. **Cluster Bot**: Request ephemeral test clusters without managing infrastructure
- No cloud credentials needed
- Supports dependent PR testing
- Automatically cleaned up
2. **Gangway**
3. **Multi-PR Testing in CI**: Test multiple dependent PRs together using `/test-with` commands
### When to Use create-cluster
Only use this command when:
- You need full control over cluster configuration
- You're testing installer changes that aren't suitable for CI
- You need a long-lived development cluster on your own cloud account
- The alternatives don't meet your specific requirements
**Note**: This command requires significant setup (cloud credentials, pull secrets, DNS configuration, understanding of OCP versions). If you're new to OpenShift development, start with Cluster Bot or Gangway instead.
## Prerequisites
Before using this command, ensure you have:
1. **OpenShift CLI (`oc`)**: Required to extract the installer from the release image
- Install from: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/
- Or use your package manager: `brew install openshift-cli` (macOS)
- Verify with: `oc version`
2. **Cloud Provider Credentials** configured for your chosen platform:
- **AWS**: `~/.aws/credentials` configured with appropriate permissions
- **Azure**: Azure CLI authenticated (`az login`)
- **GCP**: The command will guide you through service account setup (either using an existing service account JSON or creating a new one)
- **vSphere**: vCenter credentials
- **OpenStack**: clouds.yaml configured
3. **Pull Secret**: Download from [Red Hat Console](https://console.redhat.com/openshift/install/pull-secret)
4. **Domain/DNS Configuration**:
- AWS: Route53 hosted zone
- Other platforms: Appropriate DNS setup
## Arguments
The command accepts arguments in multiple ways:
### Positional Arguments
```
/openshift:create-cluster [release-image] [platform]
```
### Interactive Mode
If arguments are not provided, the command will interactively prompt for:
- OpenShift release image
- Platform (aws, azure, gcp, vsphere, openstack, none/baremetal)
- Cluster name
- Base domain
- Pull secret location
### Argument Details
- **release-image** (required): OpenShift release image to extract the installer from
- Production release: `quay.io/openshift-release-dev/ocp-release:4.21.0-ec.2-x86_64`
- CI build: `registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-10-27-031915`
- Stable release: `quay.io/openshift-release-dev/ocp-release:4.20.1-x86_64`
- The command will prompt for this if not provided
- **platform** (optional): Target platform for the cluster
- `aws`: Amazon Web Services
- `azure`: Microsoft Azure
- `gcp`: Google Cloud Platform
- `vsphere`: VMware vSphere
- `openstack`: OpenStack
- `none`: Bare metal / platform-agnostic
- Default: Prompts user to select
- **cluster-name** (optional): Name for the cluster
- Default: `ocp-cluster`
- Must be DNS-compatible
- **base-domain** (required): Base domain for the cluster
- Example: `example.com` → Cluster API will be `api.{cluster-name}.{base-domain}`
- **pull-secret** (required): Path to pull secret file
- User will be prompted to provide the path
- **installer-dir** (optional): Directory to store/find installer binaries
- Default: `~/.openshift-installers`
## Implementation
The command performs the following steps:
### 1. Validate Prerequisites
Check that required tools and credentials are available:
- Verify `oc` CLI is installed and available
- Verify cloud provider credentials are configured (if applicable)
- Confirm domain/DNS requirements
If any prerequisites are missing, provide clear instructions on how to configure them.
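A minimal sketch of these checks for the AWS case; other platforms would substitute their own credential probes:
```bash
# oc is required to extract the installer from the release image
if ! command -v oc &> /dev/null; then
    echo "Error: 'oc' CLI not found. Install it from https://mirror.openshift.com/pub/openshift-v4/clients/ocp/"
    exit 1
fi

# Example credential probe for AWS; adjust for the chosen platform
if [ "$PLATFORM" = "aws" ]; then
    if ! aws sts get-caller-identity &> /dev/null; then
        echo "Error: AWS credentials are not configured. Run 'aws configure' or set AWS_PROFILE."
        exit 1
    fi
fi
```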
### 2. Get Release Image from User
If not provided as an argument, **prompt the user** for the OpenShift release image:
```
Please provide the OpenShift release image:
Examples:
- Production release: quay.io/openshift-release-dev/ocp-release:4.21.0-ec.2-x86_64
- CI build: registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-10-27-031915
- Stable release: quay.io/openshift-release-dev/ocp-release:4.20.1-x86_64
Release image:
```
Store the user's input as `$RELEASE_IMAGE`.
**Extract version from image** for naming:
```bash
# Parse version from image tag (e.g., "4.21.0-ec.2" or "4.21.0-0.ci-2025-10-27-031915")
VERSION=$(echo "$RELEASE_IMAGE" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+[^"]*' | head -1)
```
### 3. Determine Installer Location and Extract if Needed
```bash
INSTALLER_DIR="${installer-dir:-$HOME/.openshift-installers}"
INSTALLER_PATH="$INSTALLER_DIR/openshift-install-${VERSION}"
```
**Check if installer directory exists**:
- If `$INSTALLER_DIR` does not exist:
- **Ask user for confirmation**: "The installer directory `$INSTALLER_DIR` does not exist. Would you like to create it?"
- If user confirms (yes): Create the directory with `mkdir -p "$INSTALLER_DIR"`
- If user declines (no): Exit with error message suggesting an alternative path
**Check if the installer already exists** at `$INSTALLER_PATH`:
- If present: Verify it works with `"$INSTALLER_PATH" version`
- If version matches the release image: Skip extraction
- If different or fails: Proceed with extraction
- If not present: Proceed with extraction
**Extract installer from release image**:
1. **Verify `oc` CLI is available**:
```bash
if ! command -v oc &> /dev/null; then
echo "Error: 'oc' CLI not found. Please install the OpenShift CLI."
exit 1
fi
```
2. **Extract the installer binary**:
```bash
oc adm release extract \
--tools \
--from="$RELEASE_IMAGE" \
--to="$INSTALLER_DIR"
```
This extracts the `openshift-install` binary and other tools from the release image.
3. **Locate and rename the extracted installer**:
```bash
# The extract command creates a tar.gz with the tools
# Find the most recently extracted openshift-install tar (compatible with both GNU and BSD find)
INSTALLER_TAR=$(find "$INSTALLER_DIR" -name "openshift-install-*.tar.gz" -type f -exec ls -t {} + | head -1)
# Extract from tar and rename
cd "$INSTALLER_DIR"
tar -xzf "$INSTALLER_TAR" openshift-install
mv openshift-install "openshift-install-${VERSION}"
chmod +x "openshift-install-${VERSION}"
# Clean up the tar file
rm "$INSTALLER_TAR"
```
4. **Verify the installer**:
```bash
"$INSTALLER_PATH" version
```
Expected output should show the version matching `$VERSION`.
### 4. Prepare Installation Directory
Create a clean installation directory:
```bash
INSTALL_DIR="${cluster-name}-install-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$INSTALL_DIR"
cd "$INSTALL_DIR"
```
### 5. Collect Required Information and Generate install-config.yaml
**IMPORTANT**: Do NOT run the installer interactively. Instead, collect all required information from the user and generate the install-config.yaml programmatically.
**Step 5.1: Collect Information**
Prompt the user for the following information (if not already provided as arguments):
1. **SSH Public Key**:
- Check for existing SSH keys: `ls -la ~/.ssh/*.pub`
- Ask user to select from available keys or specify path
- Default: `~/.ssh/id_rsa.pub`
2. **Platform** (if not provided as argument):
- Ask user to select: aws, azure, gcp, vsphere, openstack, none
3. **Platform-specific details**:
- For AWS:
- Region (e.g., us-east-1, us-west-2)
- For Azure:
- Region (e.g., centralus, eastus)
- Cloud name (e.g., AzurePublicCloud)
- For GCP:
- Follow the **GCP Service Account Setup** (see Step 5.2a below)
- Project ID
- Region (e.g., us-central1)
- For other platforms: collect required platform-specific info
4. **Base Domain**:
- Ask for base domain (e.g., example.com, devcluster.openshift.com)
- Validate that domain is configured (e.g., Route53 hosted zone for AWS)
5. **Cluster Name**:
- Ask for cluster name or use default: `ocp-cluster`
- Validate DNS compatibility (lowercase, hyphens only)
6. **Pull Secret**:
- **IMPORTANT**: Always ask user to provide the path to their pull secret file
- Do NOT use default paths like `~/pull-secret.txt` or `~/Downloads/pull-secret.txt`
- Prompt: "Please provide the path to your pull secret file (download from https://console.redhat.com/openshift/install/pull-secret):"
- Read contents of pull secret file from the provided path
**Step 5.2a: GCP Service Account Setup** (Only for GCP platform)
If the platform is GCP, the installer requires a service account JSON file with appropriate permissions. Present the user with two options:
1. **Use an existing service account JSON file**
2. **Create a new service account**
**Ask the user**: "Do you want to use an existing service account JSON file or create a new one?"
**Option 1: Use Existing Service Account**
If the user chooses to use an existing service account:
- Prompt: "Please provide the path to your GCP service account JSON file:"
- Store the path as `$GCP_SERVICE_ACCOUNT_PATH`
- Verify the file exists and is valid JSON
- Set the environment variable:
```bash
export GOOGLE_APPLICATION_CREDENTIALS="$GCP_SERVICE_ACCOUNT_PATH"
```
**Option 2: Create New Service Account**
If the user chooses to create a new service account:
1. **Verify gcloud CLI is installed**:
```bash
if ! command -v gcloud &> /dev/null; then
echo "Error: 'gcloud' CLI not found. Please install the Google Cloud SDK."
echo "Visit: https://cloud.google.com/sdk/docs/install"
exit 1
fi
```
2. **Prompt for Kerberos ID**:
- Ask: "Please provide your Kerberos ID (e.g., jsmith):"
- Store as `$KERBEROS_ID`
- Validate it's not empty
3. **Set service account name**:
```bash
SERVICE_ACCOUNT_NAME="${KERBEROS_ID}-development"
```
4. **Create the service account**:
```bash
echo "Creating service account: $SERVICE_ACCOUNT_NAME"
gcloud iam service-accounts create "$SERVICE_ACCOUNT_NAME" --display-name="$SERVICE_ACCOUNT_NAME"
```
5. **Extract service account details**:
```bash
# Get service account information
SERVICE_ACCOUNT_JSON="$(gcloud iam service-accounts list --format json | jq -r --arg name "$SERVICE_ACCOUNT_NAME" '.[] | select(.email | startswith($name + "@"))')"
SERVICE_ACCOUNT_EMAIL="$(jq -r .email <<< "$SERVICE_ACCOUNT_JSON")"
PROJECT_ID="$(jq -r .projectId <<< "$SERVICE_ACCOUNT_JSON")"
echo "Service Account Email: $SERVICE_ACCOUNT_EMAIL"
echo "Project ID: $PROJECT_ID"
```
6. **Grant required permissions**:
```bash
echo "Granting IAM roles to service account..."
while IFS= read -r ROLE_TO_ADD ; do
echo "Adding role: $ROLE_TO_ADD"
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--condition="None" \
--member="serviceAccount:$SERVICE_ACCOUNT_EMAIL" \
--role="$ROLE_TO_ADD"
done << 'END_OF_ROLES'
roles/compute.admin
roles/iam.securityAdmin
roles/iam.serviceAccountAdmin
roles/iam.serviceAccountKeyAdmin
roles/iam.serviceAccountUser
roles/storage.admin
roles/dns.admin
roles/compute.loadBalancerAdmin
roles/iam.roleAdmin
END_OF_ROLES
echo "All roles granted successfully."
```
7. **Create and download service account key**:
```bash
KEY_FILE="${HOME}/.gcp/${SERVICE_ACCOUNT_NAME}-key.json"
mkdir -p "$(dirname "$KEY_FILE")"
echo "Creating service account key..."
gcloud iam service-accounts keys create "$KEY_FILE" \
--iam-account="$SERVICE_ACCOUNT_EMAIL"
echo "Service account key saved to: $KEY_FILE"
```
8. **Set environment variable**:
```bash
export GOOGLE_APPLICATION_CREDENTIALS="$KEY_FILE"
echo "GOOGLE_APPLICATION_CREDENTIALS set to: $KEY_FILE"
```
9. **Store PROJECT_ID for later use** in install-config.yaml generation.
**Step 5.2: Generate install-config.yaml**
Create the install-config.yaml file programmatically based on collected information:
```bash
# Read SSH public key
SSH_KEY=$(cat "$SSH_KEY_PATH")
# Read pull secret
PULL_SECRET=$(cat "$PULL_SECRET_PATH")
# Generate install-config.yaml
cat > install-config.yaml <<EOF
apiVersion: v1
baseDomain: ${BASE_DOMAIN}
metadata:
  name: ${CLUSTER_NAME}
compute:
- name: worker
  replicas: 3
controlPlane:
  name: master
  replicas: 3
networking:
  networkType: OVNKubernetes
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/16
platform:
  ${PLATFORM}:
    region: ${REGION}
pullSecret: '${PULL_SECRET}'
sshKey: '${SSH_KEY}'
EOF
```
**Platform-specific configurations**:
For **AWS**:
```yaml
platform:
  aws:
    region: us-east-1
```
For **Azure**:
```yaml
platform:
  azure:
    region: centralus
    baseDomainResourceGroupName: ${RESOURCE_GROUP_NAME}
    cloudName: AzurePublicCloud
```
For **GCP**:
```yaml
platform:
  gcp:
    projectID: ${PROJECT_ID}
    region: us-central1
```
For **None/Baremetal**:
```yaml
platform:
  none: {}
```
**IMPORTANT**: Always backup install-config.yaml after creation:
```bash
cp install-config.yaml install-config.yaml.backup
```
The installer consumes this file, so the backup is essential for reference.
### 6. Create the Cluster
Run the installer:
```bash
"$INSTALLER_PATH" create cluster --dir=.
```
Monitor the installation progress. This typically takes 30-45 minutes.
### 7. Post-Installation
Once installation completes:
1. **Display kubeconfig location**:
```
Kubeconfig: $INSTALL_DIR/auth/kubeconfig
```
2. **Display cluster credentials**:
```
Console URL: https://console-openshift-console.apps.${cluster-name}.${base-domain}
Username: kubeadmin
Password: (from $INSTALL_DIR/auth/kubeadmin-password)
```
3. **Export KUBECONFIG** (offer to add to shell profile):
```bash
export KUBECONFIG="$PWD/auth/kubeconfig"
```
4. **Verify cluster access**:
```bash
oc get nodes
oc get co # cluster operators
```
5. **Save cluster information** to a summary file:
```
Cluster: ${cluster-name}
Version: ${VERSION}
Release Image: ${RELEASE_IMAGE}
Platform: ${platform}
Console: https://console-openshift-console.apps.${cluster-name}.${base-domain}
API: https://api.${cluster-name}.${base-domain}:6443
Kubeconfig: $INSTALL_DIR/auth/kubeconfig
Created: $(date)
```
### 8. Error Handling
If installation fails:
1. **Capture logs**: Installation logs are in `.openshift_install.log`
2. **Provide diagnostics**: Check common failure points:
- Quota limits on cloud provider
- DNS configuration issues
- Invalid pull secret
- Network/firewall issues
3. **Cleanup guidance**: Inform user about cleanup:
```bash
"$INSTALLER_PATH" destroy cluster --dir=.
```
## Examples
### Example 1: Basic cluster creation (interactive)
```
/openshift:create-cluster
```
The command will prompt for release image and all necessary information.
### Example 2: Create AWS cluster with production release
```
/openshift:create-cluster quay.io/openshift-release-dev/ocp-release:4.21.0-ec.2-x86_64 aws
```
### Example 3: Create cluster with CI build
```
/openshift:create-cluster registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-10-27-031915 gcp
```
## Cleanup
To destroy the cluster after testing:
```bash
cd $INSTALL_DIR
"$INSTALLER_PATH" destroy cluster --dir=.
```
**WARNING**: This will permanently delete all cluster resources.
## Common Issues
1. **Pull secret not found**:
- Download from https://console.redhat.com/openshift/install/pull-secret
- Save to a secure location of your choice
- Provide the path when prompted during cluster creation
2. **Insufficient cloud quotas**:
- Check cloud provider quota limits
- Request quota increase if needed
3. **DNS issues**:
- Ensure base domain is properly configured
- For AWS, verify Route53 hosted zone exists
4. **SSH key not found**:
- Generate with `ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa`
5. **Unauthorized access to release image**:
- Error: `error: unable to read image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...: unauthorized: access to the requested resource is not authorized`
- For `quay.io/openshift-release-dev/ocp-v4.0-art-dev`, download the pull secret from https://console.redhat.com/openshift/install/pull-secret, save it to a file, and provide that path when prompted.
## Security Considerations
- **Pull secret**: Contains authentication for Red Hat registries. Keep secure.
- **kubeadmin password**: Stored in plaintext in auth directory. Rotate after cluster creation.
- **kubeconfig**: Contains cluster admin credentials. Protect appropriately.
- **Cloud credentials**: Never commit to version control.
## Return Value
- **Success**: Returns 0 and displays cluster information including kubeconfig path
- **Failure**: Returns non-zero and displays error diagnostics
## See Also
- OpenShift Documentation: https://docs.openshift.com/container-platform/latest/installing/
- OpenShift Install: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/
- Platform-specific installation guides
## Arguments:
- **$1** (release-image): OpenShift release image to extract the installer from (e.g., `quay.io/openshift-release-dev/ocp-release:4.21.0-ec.2-x86_64`)
- **$2** (platform): Target cloud platform for cluster deployment (aws, azure, gcp, vsphere, openstack, none)

commands/destroy-cluster.md

@@ -0,0 +1,360 @@
---
description: Destroy an OpenShift cluster created by create-cluster command
argument-hint: "[install-dir]"
---
## Name
openshift:destroy-cluster
## Synopsis
```
/openshift:destroy-cluster [install-dir]
```
## Description
The `destroy-cluster` command safely destroys an OpenShift Container Platform (OCP) cluster that was previously created using the `/openshift:create-cluster` command. It locates the appropriate installer binary, verifies the cluster information, and performs cleanup of all cloud resources.
This command is useful for:
- Cleaning up development/test clusters after testing
- Removing failed cluster installations
- Freeing up cloud resources and quotas
**⚠️ WARNING**: This operation is **irreversible** and will permanently delete:
- All cluster resources (VMs, load balancers, storage, etc.)
- All data stored in the cluster
- All configuration and credentials
- DNS records (if managed by the installer)
## Prerequisites
Before using this command, ensure you have:
1. **Installation directory** from the original cluster creation
- Contains the cluster metadata and terraform state
- Located at `{cluster-name}-install-{timestamp}` by default
2. **OpenShift installer binary** that matches the cluster version
- Should be available at `~/.openshift-installers/openshift-install-{version}`
- Same version used to create the cluster
3. **Cloud Provider Credentials** still configured and valid
- Same credentials used during cluster creation
- Must have permissions to delete resources
4. **Network connectivity** to the cloud provider
- Required to communicate with cloud APIs
## Arguments
- **install-dir** (optional): Path to the cluster installation directory
- Default: Interactive prompt to select from available installation directories
- Must contain cluster metadata files (metadata.json, terraform.tfstate, etc.)
- Example: `./my-cluster-install-20251028-120000`
## Implementation
The command performs the following steps:
### 1. Locate Installation Directory
If `install-dir` is not provided:
- Search for installation directories in the current directory
- Look for directories matching pattern `*-install-*` or containing `.openshift_install_state.json`
- Present a list of found directories to the user for selection
- Allow user to manually enter a path if directory not found
If `install-dir` is provided:
- Validate the directory exists
- Verify it contains cluster metadata files
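A minimal sketch of the discovery logic, assuming a POSIX shell and `find`:
```bash
# Collect candidate installation directories: either named *-install-*
# or containing the installer state file.
CANDIDATES=$(
  {
    find . -maxdepth 1 -type d -name '*-install-*'
    find . -maxdepth 2 -type f -name '.openshift_install_state.json' -exec dirname {} \;
  } | sort -u
)

if [ -z "$CANDIDATES" ]; then
  printf "No installation directories found. Enter a path manually: "
  read -r INSTALL_DIR
else
  echo "Found installation directories:"
  echo "$CANDIDATES"
fi
```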
### 2. Extract Cluster Information
Read cluster details from the installation directory:
```bash
# Read cluster metadata
if [ -f "$INSTALL_DIR/metadata.json" ]; then
CLUSTER_NAME=$(jq -r '.clusterName' "$INSTALL_DIR/metadata.json")
INFRA_ID=$(jq -r '.infraID' "$INSTALL_DIR/metadata.json")
PLATFORM=$(jq -r '.platform' "$INSTALL_DIR/metadata.json")
fi
# Try to extract version from cluster-info or log files
VERSION=$(grep -oE 'openshift-install.*v[0-9]+\.[0-9]+\.[0-9]+' "$INSTALL_DIR/.openshift_install.log" | head -1 | grep -oE '[0-9]+\.[0-9]+\.[0-9]+[^"]*' | head -1)
```
### 3. Display Cluster Information and Confirm
Show the user what will be destroyed:
```
Cluster Information:
Name: ${CLUSTER_NAME}
Infrastructure ID: ${INFRA_ID}
Platform: ${PLATFORM}
Installation Directory: ${INSTALL_DIR}
Version: ${VERSION}
⚠️ WARNING: This will permanently destroy the cluster and all its resources!
This action will delete:
- All cluster VMs and compute resources
- Load balancers and networking resources
- Storage volumes and persistent data
- DNS records
- All cluster configuration
Are you sure you want to destroy this cluster? (yes/no):
```
**Important**: Require the user to type "yes" (not just "y") to confirm destruction.
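A minimal confirmation check, as a sketch:
```bash
read -r -p "Are you sure you want to destroy this cluster? (yes/no): " CONFIRM
if [ "$CONFIRM" != "yes" ]; then
  echo "Aborted: you must type 'yes' (exactly) to destroy the cluster."
  exit 1
fi
```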
### 4. Locate the Correct Installer
Find the installer binary that matches the cluster version:
```bash
INSTALLER_DIR="${HOME}/.openshift-installers"
INSTALLER_PATH="$INSTALLER_DIR/openshift-install-${VERSION}"
# Check if the version-specific installer exists
if [ ! -f "$INSTALLER_PATH" ]; then
echo "Warning: Installer for version ${VERSION} not found at ${INSTALLER_PATH}"
echo "Searching for alternative installers..."
# Look for any installer in the installers directory
AVAILABLE_INSTALLERS=$(find "$INSTALLER_DIR" -name "openshift-install-*" -type f 2>/dev/null)
if [ -n "$AVAILABLE_INSTALLERS" ]; then
echo "Found installers:"
echo "$AVAILABLE_INSTALLERS"
echo ""
echo "You may use a different version installer, but this may cause issues."
echo "Would you like to:"
echo " 1. Use an available installer from the list above"
echo " 2. Extract the correct installer from the release image"
echo " 3. Cancel the operation"
else
echo "No installers found. Would you like to extract the installer? (yes/no):"
fi
fi
# Verify installer works
"$INSTALLER_PATH" version
```
### 5. Backup Important Files (Optional)
Offer to backup key files before destruction:
```
Would you like to backup cluster information before destroying? (yes/no):
```
If yes, create a backup:
```bash
BACKUP_DIR="${INSTALL_DIR}-backup-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP_DIR"
# Backup key files
cp "$INSTALL_DIR/metadata.json" "$BACKUP_DIR/" 2>/dev/null
cp "$INSTALL_DIR/auth/kubeconfig" "$BACKUP_DIR/" 2>/dev/null
cp "$INSTALL_DIR/auth/kubeadmin-password" "$BACKUP_DIR/" 2>/dev/null
cp "$INSTALL_DIR/.openshift_install.log" "$BACKUP_DIR/" 2>/dev/null
cp "$INSTALL_DIR/install-config.yaml.backup" "$BACKUP_DIR/" 2>/dev/null
echo "Backup created at: $BACKUP_DIR"
```
### 6. Run Cluster Destroy
Execute the destroy command:
```bash
cd "$INSTALL_DIR"
echo "Starting cluster destruction..."
echo "This may take 10-15 minutes..."
"$INSTALLER_PATH" destroy cluster --dir=. --log-level=debug
DESTROY_EXIT_CODE=$?
```
Monitor the destruction progress and display status updates.
### 7. Verify Cleanup
After the destroy command completes:
1. **Check exit code**:
```bash
if [ $DESTROY_EXIT_CODE -eq 0 ]; then
echo "✅ Cluster destroyed successfully"
else
echo "❌ Cluster destruction failed with exit code: $DESTROY_EXIT_CODE"
echo "Check logs at: $INSTALL_DIR/.openshift_install.log"
fi
```
2. **Verify cloud resources** (platform-specific):
   - AWS: Check for lingering resources with tag `kubernetes.io/cluster/${INFRA_ID}` (see the sketch after this list)
- Azure: Verify resource group deletion
- GCP: Check project for remaining resources
3. **List any remaining resources**:
```
If any resources remain, provide commands to manually clean them up.
```
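For the AWS check referenced above, one option is the Resource Groups Tagging API; this is a sketch that assumes the AWS CLI is installed and authenticated:
```bash
# List any AWS resources still carrying the cluster's ownership tag.
aws resourcegroupstaggingapi get-resources \
  --tag-filters "Key=kubernetes.io/cluster/${INFRA_ID},Values=owned" \
  --query 'ResourceTagMappingList[].ResourceARN' \
  --output text
```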
### 8. Cleanup Installation Directory (Optional)
Ask the user if they want to remove the installation directory:
```
The cluster has been destroyed. Would you like to delete the installation directory? (yes/no):
Directory: $INSTALL_DIR
Size: $(du -sh "$INSTALL_DIR" | cut -f1)
```
If yes:
```bash
rm -rf "$INSTALL_DIR"
echo "Installation directory removed"
```
If no:
```bash
echo "Installation directory preserved at: $INSTALL_DIR"
echo "You can manually remove it later with: rm -rf $INSTALL_DIR"
```
### 9. Display Summary
Show final summary:
```
Cluster Destruction Summary:
Cluster Name: ${CLUSTER_NAME}
Status: Successfully destroyed
Platform: ${PLATFORM}
Duration: ${DURATION}
Backup: ${BACKUP_DIR} (if created)
Next steps:
- Verify your cloud console for any lingering resources
- Check your cloud billing to ensure resources are no longer incurring charges
- Remove installation directory if not already deleted: ${INSTALL_DIR}
```
## Error Handling
If destruction fails, the command should:
1. **Capture error logs** from `.openshift_install.log`
2. **Identify the failure point**:
- Timeout waiting for resource deletion
- Permission errors
- API rate limiting
- Network connectivity issues
- Resources locked or in use
3. **Provide recovery options**:
- Retry the destroy operation
- Manual cleanup instructions for specific resources
- Contact support if critical errors occur
Common failure scenarios:
**Timeout errors**:
```bash
# Some resources may take longer to delete
# Retry the destroy command:
"$INSTALLER_PATH" destroy cluster --dir="$INSTALL_DIR"
```
**Permission errors**:
```
Error: Cloud credentials may have expired or lack permissions
Solution:
1. Verify cloud credentials are still valid
2. Check IAM permissions for resource deletion
3. Re-run the destroy command after fixing credentials
```
**Partial destruction**:
```
Warning: Some resources could not be deleted automatically.
Remaining resources:
- Load balancer: ${LB_NAME}
- Security group: ${SG_NAME}
- S3 bucket: ${BUCKET_NAME}
Manual cleanup commands:
[Platform-specific commands to delete remaining resources]
```
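The exact commands depend on the platform and resource type; as an illustrative AWS-only sketch (identifiers are placeholders taken from the list above, not real values):
```bash
# Illustrative cleanup of leftovers reported by the destroy run (AWS example).
aws elbv2 delete-load-balancer --load-balancer-arn "${LB_ARN}"
aws ec2 delete-security-group --group-id "${SG_ID}"
aws s3 rb "s3://${BUCKET_NAME}" --force
```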
## Examples
### Example 1: Destroy cluster with interactive directory selection
```
/openshift:destroy-cluster
```
The command will search for installation directories and prompt you to select one.
### Example 2: Destroy cluster with specific directory
```
/openshift:destroy-cluster ./my-cluster-install-20251028-120000
```
### Example 3: Destroy cluster with full path
```
/openshift:destroy-cluster /home/user/clusters/test-cluster-install-20251028-120000
```
## Common Issues
1. **Installation directory not found**:
- Ensure you're in the correct directory
- Provide the full path to the installation directory
- Check if the directory was moved or renamed
2. **Installer binary not found**:
- The command will help you extract the correct installer
- Alternatively, manually place the installer in `~/.openshift-installers/`
3. **Cloud credentials expired**:
- Refresh your cloud credentials
- Re-authenticate with the cloud provider CLI
- Re-run the destroy command
4. **Resources already deleted manually**:
- The destroy command may fail if resources were manually deleted
- Check the logs and manually clean up any remaining resources
- Remove the installation directory manually
5. **Destroy hangs or times out**:
- Some resources may take longer to delete (especially load balancers)
- Wait for the operation to complete (can take 15-30 minutes)
- If truly stuck, cancel and retry
- Check cloud console for resource status
## Safety Features
This command includes several safety measures:
1. **Confirmation required**: Must type "yes" to proceed
2. **Cluster information displayed**: Shows what will be destroyed before proceeding
3. **Backup option**: Offers to backup important files
4. **Validation checks**: Verifies installation directory and metadata
5. **Detailed logging**: All operations logged for troubleshooting
6. **Error recovery**: Provides manual cleanup instructions if automated cleanup fails
## Return Value
- **Success**: Returns 0 and displays destruction summary
- **Failure**: Returns non-zero and displays error diagnostics with recovery instructions
## See Also
- `/openshift:create-cluster` - Create a new OCP cluster
- OpenShift Documentation: https://docs.openshift.com/container-platform/latest/installing/
- Platform-specific cleanup guides
## Arguments:
- **$1** (install-dir): Path to the cluster installation directory created by create-cluster (optional, interactive if not provided)

79
commands/expand-test-case.md Normal file
View File

@@ -0,0 +1,79 @@
---
description: Expand basic test ideas or existing oc commands into comprehensive test scenarios with edge cases in oc CLI or Ginkgo format
argument-hint: [test-idea-or-file-or-commands] [format]
---
## Name
openshift:expand-test-case
## Synopsis
```
/openshift:expand-test-case [test-idea-or-file-or-commands] [format]
```
## Description
The `expand-test-case` command transforms basic test ideas or existing oc commands into comprehensive test scenarios. It accepts three types of input:
1. **Test idea**: Simple description of what to test (e.g., "verify pod deployment")
2. **File path**: Path to existing test file to expand (e.g., `/path/to/test.sh` or `/path/to/test.go`)
3. **oc commands**: Direct oc CLI commands to analyze and expand (e.g., `oc create pod nginx`)
The command expands the input to cover positive flows, negative scenarios, edge cases, and boundary conditions, helping QE engineers ensure thorough test coverage.
Supports two output formats:
- **oc CLI**: Shell scripts with oc commands for manual or automated execution
- **Ginkgo**: Go test code using Ginkgo/Gomega framework for E2E tests
## Implementation
The command analyzes the input and generates comprehensive scenarios:
1. **Parse Input**: Determine if input is a test idea, file path, or oc commands
- If file path: Read and analyze existing test code
- If oc commands: Parse commands to understand what's being tested
- If test idea: Understand the core feature or behavior
2. **Identify Test Dimensions**: Determine coverage aspects (functionality, security, performance, edge cases)
3. **Generate Positive Tests**: Happy path scenarios where everything works
4. **Generate Negative Tests**: Error handling, invalid inputs, permission issues
5. **Add Edge Cases**: Boundary values, race conditions, resource limits
6. **Define Validation**: Clear success criteria and assertions
7. **Format Output**: Generate in requested format (oc CLI or Ginkgo) - **MUST follow the standards in "Test Coverage Guidelines" section below**
**CRITICAL**: All generated test scenarios MUST adhere to the coverage dimensions, best practices, and standards defined in the **"Test Coverage Guidelines"** section below. Use the referenced examples and patterns from OpenShift origin repository.
## Test Coverage Guidelines
The command generates comprehensive test scenarios following industry best practices:
**Test Coverage Dimensions:**
- **Positive Tests**: Valid inputs and expected workflows
- **Negative Tests**: Invalid inputs, permission errors, missing dependencies
- **Edge Cases**: Boundary values (0, max values, empty inputs, special characters)
- **Security Tests**: RBAC validation, security context enforcement, privilege escalation
- **Resource Tests**: Low memory, disk pressure, network issues, rate limiting
- **Concurrency**: Multiple operations happening simultaneously
- **Failure Recovery**: Restart behavior, cleanup on failure
**References:**
- OpenShift Test Examples: https://github.com/openshift/origin/tree/master/test/extended
- Ginkgo BDD Framework: https://onsi.github.io/ginkgo/
- Test Pattern Catalog: https://github.com/openshift/origin/blob/master/test/extended/README.md
- oc CLI Reference: https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/developer-cli-commands.html
**Best Practices Applied:**
- Use stable, descriptive test names (no dynamic IDs or timestamps)
- Ensure proper resource cleanup (prevent resource leaks)
- Include meaningful assertions with clear failure messages
- Isolate tests (each test creates its own resources)
- Add appropriate timeouts to prevent hanging tests
- Follow Ginkgo patterns: Describe/Context/It hierarchy
- Use framework helpers: e2epod, e2enode, e2enamespace
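As a rough illustration of the oc CLI output format, an expanded negative-path scenario might look like the following sketch (namespace and image are placeholder choices, not required values):
```bash
#!/usr/bin/env bash
# Scenario: pod requesting more memory than the namespace quota allows.
# Expected: the API server rejects the pod with a quota error.
set -euo pipefail

NAMESPACE="expand-test-quota"
trap 'oc delete namespace "$NAMESPACE" --ignore-not-found' EXIT

oc create namespace "$NAMESPACE"
oc create quota small-quota --hard=requests.memory=256Mi -n "$NAMESPACE"
sleep 5   # give the quota controller a moment to initialize usage

if oc apply -n "$NAMESPACE" -f - <<'EOF' 2>err.log
apiVersion: v1
kind: Pod
metadata:
  name: over-quota
spec:
  containers:
  - name: c
    image: registry.access.redhat.com/ubi9/ubi-minimal
    resources:
      requests:
        memory: 1Gi
EOF
then
  echo "FAIL: pod creation should have been rejected by the quota"
  exit 1
fi
if grep -qi "exceeded quota" err.log; then
  echo "PASS: quota enforced as expected"
else
  echo "FAIL: unexpected error message:"; cat err.log
  exit 1
fi
```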
## Arguments
- **$1** (test-idea-or-file-or-commands): One of:
- **Test idea**: Description of what to test
- **File path**: Path to existing test file
- **oc commands**: Set of oc CLI commands to analyze and expand
- **$2** (format): Output format - "oc CLI" or "Ginkgo" (optional, will prompt if not provided)

104
commands/new-e2e-test.md Normal file
View File

@@ -0,0 +1,104 @@
---
description: Write and validate new OpenShift E2E tests using Ginkgo framework
argument-hint: [test-specification]
---
## Name
openshift:new-e2e-test
## Synopsis
```
/openshift:new-e2e-test [test-specification]
```
## Description
The `new-e2e-test` command assists in writing and validating
new tests for the OpenShift test suite. It follows best practices for
Ginkgo-based testing and ensures test reliability through automated
validation.
This command handles the complete lifecycle of test development:
- Writes tests following Ginkgo patterns and OpenShift conventions
- Validates tests for reliability through multiple test runs
- Ensures proper test naming and structure
- Handles both origin repository and extension tests appropriately
## Test Framework Guidelines
### Ginkgo Framework
- OpenShift-tests uses **Ginkgo** as its testing framework
- Tests are organized in a BDD (Behavior-Driven Development) style with Describe/Context/It blocks
- All tests should follow Ginkgo patterns and conventions, with the following exceptions:
  - You MUST NOT use BeforeAll or AfterAll hooks
  - You MUST NOT use ginkgo.Serial; instead, add the [Serial] annotation to the test name if non-parallel execution is required
### Repository-Specific Guidelines
#### Origin Repository Tests
If working in the "origin" code repository:
- All tests should go into the `test/extended` directory
- If creating a new package, import it into `test/extended/include.go`
- After writing your test, **MUST** rebuild the openshift-tests binary using `make openshift-tests`
#### Other repositories
Other repositories have different conventions for where tests live
and how they are imported. Examine the codebase and follow the
conventions it defines.
## Critical Test Requirements
### Test Names
**CRITICAL**: Test names must be stable and deterministic.
#### ❌ NEVER Include Dynamic Information:
- Pod names (e.g., "test-pod-abc123")
- Timestamps
- Random UUIDs or generated identifiers
- Node names
- Namespace names with random suffixes
- Limits that may change later
#### ✅ ALWAYS Use Descriptive, Static Names:
- **Good example**: "should create a pod with custom security context"
- **Bad example**: "should create pod test-pod-xyz123 with custom security context"
- **Good example**: "should create a pod within a reasonable timeframe"
- **Bad example**: "should create a pod within 15 seconds"
### Results
**CRITICAL**: Tests must always produce a pass, fail, or skip result.
Do not create tests that can only ever pass or only ever fail.
## Test Structure Guidelines
### Best Practices
- Tests should be focused and test one specific behavior
- Use proper setup and cleanup in BeforeEach/AfterEach blocks
- Include appropriate timeouts for operations
- Add meaningful assertions with clear failure messages
- Follow existing patterns in the codebase for consistency
## Implementation
The command performs the following steps:
1. **Analyze Specification**: Parse the test specification provided by the user
2. **Write Test**: Create a new test file following Ginkgo and OpenShift conventions
- Determine correct location
- Follow proper test structure
- Use stable, descriptive naming
- Implement proper setup/cleanup
3. **Build Binary**: Rebuild the appropriate test binary (openshift-tests or a test extension)
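For origin-based tests, a reliability check might look like this sketch (it assumes `KUBECONFIG` points at a running cluster and that the rebuilt binary lands at `./openshift-tests`; adjust the path and test name to your repository):
```bash
# Rebuild the test binary, then run the new test several times to catch flakes.
make openshift-tests

TEST_NAME='[sig-node] pods should create a pod with a custom security context'  # placeholder name
for i in 1 2 3 4 5; do
  echo "Run ${i}/5"
  ./openshift-tests run-test "${TEST_NAME}" || { echo "Run ${i} failed"; exit 1; }
done
echo "Test passed 5/5 runs"
```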
## Arguments
- **$1** (test-specification): Description of the test behavior to validate. Should clearly specify:
- What feature/behavior to test
- Expected outcomes
- Any specific conditions or configurations

146
commands/rebase.md Normal file
View File

@@ -0,0 +1,146 @@
---
argument-hint: <tag>
description: Rebase OpenShift fork of an upstream repository to a new upstream release.
---
## Name
openshift:rebase
## Synopsis
```
/openshift:rebase [tag]
```
## Description
The `/openshift:rebase` command rebases the git repository in the current working directory
to a new upstream release specified by `[tag]`. If no `[tag]` is specified, the command
tries to find the latest stable upstream release.
The repository must follow the rules described in https://github.com/openshift/kubernetes/blob/master/REBASE.openshift.md;
in particular, all OpenShift-specific commits must carry the `UPSTREAM:` prefix.
## Implementation
### Pre-requisites
The local clone should have three remotes configured: `origin`
tracking the user's fork of this repository, `openshift` tracking this
repository, and `upstream` tracking the upstream repository.
To verify the correct setup, use
```bash
git remote -v
```
Fail if the `upstream`, `origin`, or `openshift` remote is missing.
### Rebase to the new upstream version
1. Fetch all the remote repositories including tags
```bash
git fetch --all
```
2. Find the main branch of the repository. It's either `master` or `main`. In the following steps, we will use `master`, but replace it with the main branch.
3. If the user did not specify an upstream tag to rebase to as `<tag>`, find the greatest upstream tag that is not an alpha, beta, or rc release.
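One possible way to pick that tag, as a sketch (assumes semantic-version tags and that pre-release tags contain `alpha`, `beta`, or `rc`):
```bash
# Latest non-pre-release upstream tag by semantic version order.
tag=$(git tag --list 'v*' | grep -viE 'alpha|beta|rc' | sort -V | tail -1)
echo "Latest stable upstream tag: ${tag}"
```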
4. Create a new branch based on the upstream tag determined above (the user-supplied tag or the latest stable tag) and name it after the tag.
```bash
git checkout -b rebase-<tag> <tag>
```
5. Merge `openshift/master` branch into the `rebase-$1` branch with merge strategy `ours`:
```bash
git merge -s ours openshift/master
```
6. Find the last rebase that has been done to `openshift/master`. We will use the upstream tag used for this rebase as `$previous_tag`.
7. Find the merge base of the `openshift/master` and `$previous_tag` by running `git merge-base openshift/master $previous_tag`. We will use this merge base as `$mergebase`.
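A heuristic for locating `$previous_tag` and `$mergebase`, offered only as a sketch (it assumes the previous rebase shows up as a merge commit on `openshift/master` whose subject names the upstream tag):
```bash
# The previous rebase usually appears as a merge commit on openshift/master
# whose subject mentions the upstream tag it was based on.
git log openshift/master --merges --oneline -n 20

previous_tag=v1.2.3   # placeholder: set from the merge commit found above
mergebase=$(git merge-base openshift/master "$previous_tag")
```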
8. Prepare a `commits.tsv` tab-separated values file containing the set of carry
commits in the openshift/master branch that need to be considered for picking:
Create the commits file:
```
echo -e 'Sha\tMessage\tDecision' > commits.tsv
git log ${mergebase}..openshift/master --ancestry-path --reverse --no-merges --pretty="tformat:%h%x09%s%x09" | grep "UPSTREAM:" >> commits.tsv
```
9. Go through the commits in the `commits.tsv` file and for each of them decide
whether to pick, drop or squash it. Commits carried on rebase branches have commit
messages prefixed as follows:
* `UPSTREAM: <carry>: Add OpenShift files`:
ALWAYS carry this commit and mark it as "cherry-pick".
This is a persistent carry that contains all OpenShift-specific files and should be present in every rebase.
* Other `UPSTREAM: <carry>` commit:
A persistent carry that needs to be considered for squashing.
Examine what files it modifies using `git show --stat <commit-sha>`.
If it modifies ONLY OpenShift-specific files (Dockerfile, OWNERS, .ci-operator.yaml, .snyk, etc.), mark it as "squash",
otherwise mark it as "cherry-pick".
* `UPSTREAM: <drop>`:
A carry that should probably not be picked for the subsequent rebase branch.
In general, these commits are used to maintain the codebase in ways that are branch-specific,
like the update of generated files or dependencies.
Mark such commit as "drop".
* `UPSTREAM: (upstream PR number)`:
The number identifies a PR in the upstream repository (e.g. https://github.com/<upstream project>/<upstream repository>/pull/<pr id>).
A commit with this message should only be picked into the subsequent rebase branch if the commits
of the referenced PR are not included in the upstream branch. To check if a given commit is included
in the upstream branch, open the referenced upstream PR and check any of its commits for the release tag.
For each commit:
- Print the decision you made and why.
- Update commits.tsv with the decision ("cherry-pick", "drop", or "squash").
10. Cherry-pick all commits marked as "cherry-pick" in commits.tsv.
Then squash ALL commits marked as "squash" into a single commit named "UPSTREAM: <carry>: Add OpenShift files"
to keep the number of <carry> commits as low as possible.
Use `git reset --soft` to squash multiple commits together, then create a single commit with all the changes.
The commit message should list what was included (e.g., "Additional changes: remove .github files, add .snyk file, update Dockerfile and .ci-operator.yaml").
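A sketch of one way to perform the squash, assuming the commits to be folded together (those marked "squash", plus the original carry commit if it is adjacent) are the N most recent commits on the branch:
```bash
# Squash the last N commits into a single carry commit.
N=3   # placeholder: number of consecutive commits at the branch tip to fold together
git reset --soft "HEAD~${N}"
git commit -m 'UPSTREAM: <carry>: Add OpenShift files' \
  -m 'Additional changes: remove .github files, add .snyk file, update Dockerfile and .ci-operator.yaml'
```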
11. If the upstream repository DOES NOT include a `vendor/` directory and the OpenShift fork DOES, update the vendor directory with `go mod tidy` and `go mod vendor`.
Amend these vendor updates into the "UPSTREAM: <carry>: Add OpenShift files" commit using `git commit --amend --no-edit`.
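For the vendor refresh, the amendment amounts to roughly the following (assuming the "UPSTREAM: <carry>: Add OpenShift files" commit is currently at the tip of the branch):
```bash
go mod tidy
go mod vendor
git add go.mod go.sum vendor/
git commit --amend --no-edit   # fold the vendor update into the carry commit at HEAD
```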
12. As a verification step, review the previous rebase and confirm that every change it carried is present in the current one, either as a cherry-picked patch or because it is already included in the new upstream tag.
List all of these commits, together with the checks you made and their results.
13. Verify the changes by running `make` and `make test` (or similar commands such as `go build ./...` and `go test ./...`).
Stop here if there are compilation errors or test failures that indicate real code issues.
If you make any new commits to fix compilation or tests, let the user review these changes and then squash them into the "UPSTREAM: <carry>: Add OpenShift files" commit as well.
14. Find links to upstream changelogs between `$previous_tag` and $1.
Make sure they are links to changelogs, not tags.
Print the list of links.
15. Create a GitHub pull request against the OpenShift repository (openshift/<repo-name>).
IMPORTANT: Use `--repo openshift/<repo-name>` to ensure the PR is created against the correct OpenShift repository, not the upstream.
The PR title should be "Rebase to $1 for OCP <current OCP version>".
Follow the repository .github/PULL_REQUEST_TEMPLATE.md, if it exists.
Description of the PR must look like:
```
## Upstream changelogs
<List links to all upstream changelogs, as composed in the previous step.>
## Summary of changes
<List all new major features and breaking changes that happened between $previous_tag and $1.
Do not list upstream commits or PRs, make a human readable summary of them.
Do not include small bug fixes, small updates, or dependency bumps.>
## Carried commits
<List of commits from commits.tsv. For each commit print a decision you made - either "drop", "cherry-pick", or "squash".>
Diff to upstream: <link to a diff between the upstream project/upstream repository/tag $1 and this PR (i.e. my personal fork with branch `rebase-$1`)>
Previous rebase: <link to the previous rebase PR on github>
```
When opening the PR, ALWAYS use `gh pr create --web --repo openshift/<repo-name>` so the user can review and edit the PR before it is created.
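The final invocation might look like this sketch (`pr-description.md` is a placeholder file prepared from the template above):
```bash
# Replace <repo-name>, <tag>, and the OCP version with real values before running.
gh pr create --web \
  --repo openshift/<repo-name> \
  --title "Rebase to <tag> for OCP <current OCP version>" \
  --body-file pr-description.md
```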

69
commands/review-test-cases.md Normal file
View File

@@ -0,0 +1,69 @@
---
description: Review test cases for completeness, quality, and best practices - accepts file path or direct oc commands/test code
argument-hint: [file-path-or-test-code-or-commands]
---
## Name
openshift:review-test-cases
## Synopsis
```
/openshift:review-test-cases [file-path-or-test-code-or-commands]
```
## Description
The `review-test-cases` command provides comprehensive review of OpenShift test cases to ensure quality, completeness, and adherence to best practices. It accepts three types of input:
1. **File path**: Path to test file (e.g., `/path/to/test.sh` or `/path/to/test.go`)
2. **oc commands**: Direct oc CLI commands to review (e.g., paste a set of oc commands)
3. **Test code**: Pasted Ginkgo test code to analyze
The command analyzes test code in both oc CLI shell scripts and Ginkgo Go tests, helping QE engineers identify gaps in test coverage, improve test reliability, and ensure tests follow OpenShift testing standards.
## Implementation
The command analyzes test cases and provides structured feedback:
1. **Parse Test Input**: Determine if input is a file path, oc commands, or test code
- If file path: Read and analyze the test file
- If oc commands: Parse command sequence
- If test code: Analyze pasted Ginkgo/test code
2. **Identify Test Format**: Detect if it's oc CLI shell script or Ginkgo Go code
3. **Analyze Test Structure**: Review organization, naming, and patterns
4. **Check Coverage**: Verify positive, negative, and edge case coverage
5. **Review Assertions**: Ensure proper validation and error checking
6. **Evaluate Cleanup**: Verify resource cleanup and namespace management
7. **Assess Best Practices**: **MUST follow the standards defined in "Testing Guidelines and References" section below**
8. **Generate Recommendations**: Provide actionable improvement suggestions based on the guidelines
**CRITICAL**: All reviews MUST be evaluated against the specific standards, references, and best practices listed in the **"Testing Guidelines and References"** section below. Do not use generic testing advice - follow the OpenShift-specific guidelines provided.
## Testing Guidelines and References
The review follows established testing best practices from:
**For Ginkgo/E2E Tests:**
- OpenShift Origin Test Extended: https://github.com/openshift/origin/tree/master/test/extended
- Ginkgo Testing Framework: https://onsi.github.io/ginkgo/
- OpenShift Test Best Practices: https://github.com/openshift/origin/blob/master/test/extended/README.md
**For oc CLI Tests:**
- OpenShift CLI Documentation: https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/developer-cli-commands.html
- Bash Best Practices: https://google.github.io/styleguide/shellguide.html
**Key Testing Standards:**
- Use descriptive, stable test names (no timestamps, random IDs)
- Proper resource cleanup (AfterEach, defer, trap)
- Meaningful assertions with clear failure messages
- Test isolation (each test creates own resources)
- Appropriate timeouts and waits
- No BeforeAll/AfterAll in Ginkgo tests
- Use framework helpers (e2epod, e2enode) when available
## Arguments
- **$1** (file-path-or-test-code-or-commands): One of:
- **File path**: Path to test file (shell script or Go test file)
- **oc commands**: Set of oc CLI commands to review
- **Test code**: Pasted test code (Ginkgo or shell script)

77
plugin.lock.json Normal file
View File

@@ -0,0 +1,77 @@
{
"$schema": "internal://schemas/plugin.lock.v1.json",
"pluginId": "gh:openshift-eng/ai-helpers:plugins/openshift",
"normalized": {
"repo": null,
"ref": "refs/tags/v20251128.0",
"commit": "2bae18158cc58ddaaeefbc685689899b7685b679",
"treeHash": "e1cbd8160922270ea583839ddf542ecd6b05e585ce18e35b2b3a77afcffb600e",
"generatedAt": "2025-11-28T10:27:30.115036Z",
"toolVersion": "publish_plugins.py@0.2.0"
},
"origin": {
"remote": "git@github.com:zhongweili/42plugin-data.git",
"branch": "master",
"commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
"repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
},
"manifest": {
"name": "openshift",
"description": "OpenShift development utilities and helpers",
"version": "0.0.1"
},
"content": {
"files": [
{
"path": "README.md",
"sha256": "f431e975cfb1245dd1f40539ffa5244a4eba7924b86f1b45624385936463f42d"
},
{
"path": ".claude-plugin/plugin.json",
"sha256": "3ddc8d96e7dc034d6a1d71a3474127af79a9a18c0492be31c855b59c3d0ddf26"
},
{
"path": "commands/expand-test-case.md",
"sha256": "a2760ebd7b1865bf9f28a72152a2f82d9e1cf5f5e6d63ac4844589a6a30deb75"
},
{
"path": "commands/destroy-cluster.md",
"sha256": "9fd1d82c76120b20264b7f0d57e5e5b219b33ee6887afed621b7a61d696716ca"
},
{
"path": "commands/create-cluster.md",
"sha256": "e0aa12787e7c24f1f8dfc975ac5ac2b3998fe87e5fc83a0a85be475860aea8f6"
},
{
"path": "commands/new-e2e-test.md",
"sha256": "bbd0795dbbf8928456bf854fe97ec2becc6e6b24803e5e457ce145b657350ee5"
},
{
"path": "commands/rebase.md",
"sha256": "31f0983f9531cd42384f17fc4d298528a44a57cc1e5c4128911f8408428b4ab7"
},
{
"path": "commands/review-test-cases.md",
"sha256": "b4dcfd668cec760448e672f02026dd51cd148e0ad6df4de73a6ef4cdd80080d4"
},
{
"path": "commands/cluster-health-check.md",
"sha256": "bfba16ddedce875ca3968a9bbb30a3bad5f8e4943eddcd807e59631d864c2e0e"
},
{
"path": "commands/bump-deps.md",
"sha256": "f29f7359a1bc0e44395c268139958a105a3f6b0952fed7d1b3515deb33bcacee"
},
{
"path": "commands/crd-review.md",
"sha256": "dd56f8e8c7c29384e91251817b02ccfda2b7bfa3d75006c16a364cd2c34a3bcb"
}
],
"dirSha256": "e1cbd8160922270ea583839ddf542ecd6b05e585ce18e35b2b3a77afcffb600e"
},
"security": {
"scannedAt": null,
"scannerVersion": null,
"flags": []
}
}