Initial commit
.claude-plugin/plugin.json (new file, 11 lines)
@@ -0,0 +1,11 @@
{
  "name": "openshift",
  "description": "OpenShift development utilities and helpers",
  "version": "0.0.1",
  "author": {
    "name": "github.com/openshift-eng"
  },
  "commands": [
    "./commands"
  ]
}
README.md (new file, 3 lines)
@@ -0,0 +1,3 @@
# openshift

OpenShift development utilities and helpers
commands/bump-deps.md (new file, 422 lines)
@@ -0,0 +1,422 @@
---
description: Bump dependencies in OpenShift projects with automated analysis and PR creation
argument-hint: <dependency> [version] [--create-jira] [--create-pr]
---

## Name

openshift:bump-deps

## Synopsis

```
/openshift:bump-deps <dependency> [version] [--create-jira] [--create-pr]
```

## Description

The `openshift:bump-deps` command automates the process of bumping dependencies in OpenShift organization projects. It analyzes the dependency, determines the appropriate version to bump to, updates the necessary files (go.mod, go.sum, package.json, etc.), runs tests, and optionally creates Jira tickets and pull requests.

This command significantly reduces the manual effort required for dependency updates by automating:

- Dependency version discovery and analysis
- Compatibility checking with current codebase
- File updates (go.mod, package.json, Dockerfile, etc.)
- Test execution to verify the update
- Jira ticket creation with comprehensive details
- Pull request creation with proper formatting
- Release notes generation

The command intelligently handles different dependency types (Go modules, npm packages, container images, etc.) and can process single or multiple dependencies at once.

## Implementation

The command executes the following workflow:

### 1. Repository Analysis

- Detects repository type (Go, Node.js, Python, etc.)
- Identifies dependency management files (go.mod, package.json, requirements.txt, etc.)
- Determines current project structure and conventions
- Checks for existing CI/CD configuration

### 2. Dependency Discovery

**For Go Projects:**
- Parses go.mod to find current version
- Uses `go list -m -versions <module>` to list available versions
- Checks for major version compatibility (v0, v1, v2+)
- Identifies if dependency is direct or indirect
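
For Go projects, the current version and direct/indirect status can be read straight from go.mod before querying the module proxy. A minimal sketch, with an illustrative go.mod and hypothetical helper names:

```bash
# Parse the pinned version of a module out of go.mod.
# The sample file and helper names below are illustrative only.
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
module example.com/demo

go 1.21

require (
	k8s.io/api v0.28.0
	github.com/spf13/cobra v1.7.0 // indirect
)
EOF

current_version() {
	# $1 = module path, $2 = path to go.mod
	awk -v mod="$1" '$1 == mod { print $2; exit }' "$2"
}

dependency_kind() {
	# Reports whether a module is a direct or indirect dependency
	if grep "^[[:space:]]*$1 " "$2" | grep -q '// indirect'; then
		echo indirect
	else
		echo direct
	fi
}

echo "k8s.io/api: $(current_version k8s.io/api "$SAMPLE") ($(dependency_kind k8s.io/api "$SAMPLE"))"
```

In a real repository the file would be `go.mod` at the module root, and `go list -m -versions <module>` supplies the candidate versions.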

**For Node.js Projects:**
- Parses package.json for current version
- Uses npm/yarn to find latest versions
- Checks semantic versioning constraints
- Identifies devDependencies vs dependencies

**For Container Images:**
- Parses Dockerfile and related files
- Checks registry for available tags
- Verifies image digest and signatures
- Identifies base images and tool images

**For Python Projects:**
- Parses requirements.txt or pyproject.toml
- Uses pip to find available versions
- Checks for version constraints

### 3. Version Selection

If no version is specified:
- Suggests latest stable version
- Considers semantic versioning (patch, minor, major)
- Checks for breaking changes in release notes
- Validates against project's minimum version requirements

If version is specified:
- Validates version exists
- Checks compatibility with current project version
- Warns about major version jumps
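
The semantic-versioning check above can be sketched as a small classifier; `classify_bump` is a hypothetical helper, not part of the command:

```bash
# Classify a version jump so major bumps can trigger an explicit warning.
classify_bump() {
	# $1 = old version, $2 = new version ("v" prefix optional)
	old=${1#v}; new=${2#v}
	if [ "${old%%.*}" != "${new%%.*}" ]; then
		echo major
	elif [ "$(echo "$old" | cut -d. -f2)" != "$(echo "$new" | cut -d. -f2)" ]; then
		echo minor
	else
		echo patch
	fi
}

classify_bump v0.28.0 v0.29.1   # minor
classify_bump v1.9.3 v2.0.0     # major
```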

### 4. Impact Analysis

- Searches codebase for usage of the dependency
- Identifies files importing/using the dependency
- Analyzes API changes between versions
- Checks for deprecated features being used
- Reviews upstream changelog and release notes
- Identifies potential breaking changes

### 5. File Updates

**Go Projects:**
- Updates go.mod with new version
- Runs `go mod tidy` to update go.sum
- Runs `go mod vendor` if vendor directory exists
- Updates any version constraints in comments

**Node.js Projects:**
- Updates package.json
- Runs `npm install` or `yarn install`
- Updates package-lock.json or yarn.lock

**Container Images:**
- Updates Dockerfile(s)
- Updates related manifests (Kubernetes, etc.)
- Updates any CI configuration using the image

**Python Projects:**
- Updates requirements.txt or pyproject.toml
- Generates new lock file if applicable

### 6. Testing Strategy

- Identifies relevant test suites
- Runs unit tests: `make test` or equivalent
- Runs integration tests if available
- Runs e2e tests for critical dependencies
- Checks for test failures and analyzes logs
- Verifies build succeeds: `make build`
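
One way the test command might be chosen, sketched under the assumption that a Makefile `test` target takes priority (`detect_test_cmd` is a hypothetical helper):

```bash
# Pick a test command based on what the repository provides.
detect_test_cmd() {
	# $1 = repository root
	if [ -f "$1/Makefile" ] && grep -qE '^test:' "$1/Makefile"; then
		echo "make test"
	elif [ -f "$1/go.mod" ]; then
		echo "go test ./..."
	elif [ -f "$1/package.json" ]; then
		echo "npm test"
	fi
}

# Demonstrate against a throwaway Go-style layout
repo=$(mktemp -d)
touch "$repo/go.mod"
detect_test_cmd "$repo"
```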

### 7. Jira Ticket Creation (if --create-jira)

Creates a Jira ticket with:
- **Summary**: `Bump {dependency} from {old_version} to {new_version}`
- **Type**: Task or Bug (if security update)
- **Components**: Auto-detected from repository
- **Labels**: ["dependencies", "automated-update", "ai-generated"]
  - Adds "security" if CVE-related
  - Adds "breaking-change" if major version bump
- **Description**: Includes:
  - Dependency information and type
  - Current and new versions
  - Changelog summary
  - Breaking changes (if any)
  - Files modified
  - Test results
  - Migration steps (if needed)
  - Links to upstream release notes
- **Target Version**: Auto-detected from release branches
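
Assembling the summary and label set described above is straightforward string work; the variable names and values here are illustrative:

```bash
# Compose the Jira summary and labels for a bump (values are examples).
DEP="golang.org/x/net"
OLD_VERSION="v0.19.0"
NEW_VERSION="v0.20.0"
IS_SECURITY=true
IS_MAJOR=false

SUMMARY="Bump ${DEP} from ${OLD_VERSION} to ${NEW_VERSION}"

LABELS="dependencies automated-update ai-generated"
if [ "$IS_SECURITY" = true ]; then LABELS="$LABELS security"; fi
if [ "$IS_MAJOR" = true ]; then LABELS="$LABELS breaking-change"; fi

echo "$SUMMARY"
echo "$LABELS"
```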

### 8. Pull Request Creation (if --create-pr)

Creates a pull request with:
- **Title**: `[{JIRA-ID}] Bump {dependency} from {old_version} to {new_version}`
- **Body**: Includes:
  - Link to Jira ticket
  - Summary of changes
  - Breaking changes callout
  - Testing performed
  - Checklist for reviewers
  - Release notes snippet
- **Labels**: Auto-applied based on change type
- **Branch naming**: `deps/{dependency}-{new_version}` or `{jira-id}-bump-{dependency}`
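
The two branch-name patterns can be derived from the dependency path. This sketch shortens the dependency to its last path segment, which matches the `OCPBUGS-12346-bump-cobra` style used in the examples; `branch_name` is a hypothetical helper:

```bash
# Derive a branch name for the bump.
branch_name() {
	# $1 = dependency, $2 = new version, $3 = optional Jira ID
	short="${1##*/}"   # last path segment, e.g. "cobra"
	if [ -n "${3:-}" ]; then
		echo "${3}-bump-${short}"
	else
		echo "deps/${short}-${2}"
	fi
}

branch_name github.com/spf13/cobra v1.8.0 OCPBUGS-12346   # OCPBUGS-12346-bump-cobra
branch_name k8s.io/api v0.29.1                            # deps/api-v0.29.1
```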

### 9. Conflict Resolution

If updates cause issues:
- Identifies conflicting dependencies
- Suggests resolution strategies
- Can attempt automatic resolution for common cases
- Provides manual resolution steps for complex scenarios

## Return Value

- **Claude agent text**: Processing status, test results, and summary
- **Side effects**:
  - Modified dependency files (go.mod, package.json, etc.)
  - Updated lock files
  - Jira ticket created (if --create-jira)
  - Pull request created (if --create-pr)
  - Git branch created with changes

## Examples

1. **Bump a Go dependency to latest**:

   ```
   /openshift:bump-deps k8s.io/api
   ```

   Output:

   ```
   Analyzing dependency: k8s.io/api
   Current version: v0.28.0
   Latest version: v0.29.1

   Checking compatibility...
   ✅ No breaking changes detected

   Updating go.mod...
   Running go mod tidy...

   Running tests...
   ✅ All tests passed

   Summary:
   - Dependency: k8s.io/api
   - Old version: v0.28.0
   - New version: v0.29.1
   - Files modified: go.mod, go.sum
   - Tests: ✅ Passed

   Changes are ready. Use --create-pr to create a pull request.
   ```

2. **Bump to a specific version with Jira ticket**:

   ```
   /openshift:bump-deps golang.org/x/net v0.20.0 --create-jira
   ```

   Output:

   ```
   Analyzing dependency: golang.org/x/net
   Current version: v0.19.0
   Target version: v0.20.0

   Reviewing changes...
   ⚠️ Breaking changes detected in v0.20.0:
   - http2: Server.IdleTimeout applies to idle h2 connections

   Updating go.mod...
   Running tests...
   ✅ All tests passed

   Creating Jira ticket...
   ✅ Created: OCPBUGS-12345

   Summary:
   - Jira: https://issues.redhat.com/browse/OCPBUGS-12345
   - Dependency: golang.org/x/net
   - Version: v0.19.0 → v0.20.0
   - Breaking changes: Yes
   ```

3. **Bump and create PR in one step**:

   ```
   /openshift:bump-deps github.com/spf13/cobra --create-jira --create-pr
   ```

   Output:

   ```
   Processing dependency bump for github.com/spf13/cobra...

   [1/7] Analyzing dependency...
   Current: v1.7.0
   Latest: v1.8.0

   [2/7] Checking changelog...
   Changes include:
   - New features: Enhanced shell completion
   - Bug fixes: 5 issues resolved
   - No breaking changes

   [3/7] Updating files...
   ✅ go.mod updated
   ✅ go.sum updated

   [4/7] Running tests...
   ✅ Unit tests: 156/156 passed
   ✅ Integration tests: 23/23 passed

   [5/7] Creating Jira ticket...
   ✅ Created: OCPBUGS-12346

   [6/7] Creating git branch...
   ✅ Branch: OCPBUGS-12346-bump-cobra

   [7/7] Creating pull request...
   ✅ PR created: #1234

   Summary:
   - Jira: https://issues.redhat.com/browse/OCPBUGS-12346
   - PR: https://github.com/openshift/repo/pull/1234
   - Dependency: github.com/spf13/cobra
   - Version: v1.7.0 → v1.8.0
   - Tests: All passed

   Next steps:
   1. Review the PR at the link above
   2. Address any reviewer comments
   3. Merge when approved
   ```

4. **Bump multiple related dependencies**:

   ```
   /openshift:bump-deps "k8s.io/*"
   ```

   Output:

   ```
   Found 8 Kubernetes dependencies to update:

   [1/8] k8s.io/api: v0.28.0 → v0.29.1
   [2/8] k8s.io/apimachinery: v0.28.0 → v0.29.1
   [3/8] k8s.io/client-go: v0.28.0 → v0.29.1
   [4/8] k8s.io/kubectl: v0.28.0 → v0.29.1
   ...

   These should be updated together to maintain compatibility.
   Proceed with batch update? [y/N]
   ```

5. **Bump a container base image**:

   ```
   /openshift:bump-deps registry.access.redhat.com/ubi9/ubi-minimal
   ```

   Output:

   ```
   Analyzing container image: ubi9/ubi-minimal
   Current: 9.3-1361
   Latest: 9.4-1194

   Checking for security updates...
   ✅ 3 CVEs fixed in new version

   Updating Dockerfile...
   Building test image...
   Running container tests...
   ✅ All tests passed

   Files modified:
   - Dockerfile
   - .github/workflows/build.yml
   ```
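
A wildcard like the `k8s.io/*` in example 4 can be expanded against the require block of go.mod. A sketch, with illustrative go.mod content and a hypothetical helper name:

```bash
# Expand a wildcard pattern against the modules in a go.mod require block.
GOMOD=$(mktemp)
cat > "$GOMOD" <<'EOF'
require (
	k8s.io/api v0.28.0
	k8s.io/apimachinery v0.28.0
	k8s.io/client-go v0.28.0
	github.com/spf13/cobra v1.7.0
)
EOF

match_modules() {
	# $1 = shell glob (e.g. "k8s.io/*"), $2 = go.mod path
	awk '/^require \(/,/^\)/ { if ($1 != "require" && $1 != ")" && NF >= 2) print $1 }' "$2" |
	while read -r mod; do
		case "$mod" in
			$1) echo "$mod" ;;
		esac
	done
}

match_modules 'k8s.io/*' "$GOMOD"
```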

## Arguments

- **$1** (required): Dependency identifier
  - Go module: `github.com/org/repo` or `golang.org/x/net`
  - npm package: `@types/node` or `react`
  - Container image: `registry.access.redhat.com/ubi9/ubi-minimal`
  - Wildcard for batch: `k8s.io/*` (requires confirmation)

- **$2** (optional): Target version
  - Semantic version: `v1.2.3`, `1.2.3`
  - Version range: `^1.2.0`, `~1.2.0`
  - Special: `latest`, `latest-stable`
  - If omitted: suggests latest stable version

- **--create-jira** (flag): Create a Jira ticket for the update
  - Auto-detects project from repository
  - Can be configured with JIRA_PROJECT env var
  - Ticket includes full change analysis

- **--create-pr** (flag): Create a pull request with the changes
  - Implies creating a git branch
  - Includes --create-jira automatically
  - PR is created as draft if tests fail

- **--jira-project** (option): Specify Jira project (default: auto-detect)
  - Example: `--jira-project OCPBUGS`

- **--component** (option): Specify Jira component (default: auto-detect)
  - Example: `--component "Control Plane"`

- **--branch** (option): Specify git branch name (default: auto-generate)
  - Example: `--branch feature/update-deps`

- **--skip-tests** (flag): Skip running tests (not recommended)
  - Use only for non-critical updates
  - PR will be marked as draft

- **--force** (flag): Force update even if tests fail
  - Creates PR as draft
  - Includes test failure details in PR

## Error Handling

The command handles common error cases:

- **Dependency not found**: Lists similar dependencies in project
- **Version not found**: Shows available versions
- **Test failures**:
  - Provides detailed error logs
  - Suggests potential fixes
  - Asks whether to create draft PR anyway
- **Conflicting dependencies**:
  - Identifies conflicts
  - Suggests resolution order
  - Can attempt batch update
- **Breaking changes**:
  - Highlights breaking changes
  - Links to migration guides
  - Requires explicit confirmation for major bumps
- **Network failures**: Retries with exponential backoff
- **Permission errors**: Checks git/GitHub authentication
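
The network-failure retry can be sketched as a generic wrapper. The delay is scaled to zero here so the sketch runs instantly; in real use it would start at a couple of seconds:

```bash
# Retry a command with exponential backoff.
SLEEP_SCALE=0   # set to 1 for real delays; 0 keeps this sketch instant

retry() {
	# $1 = max attempts, remaining args = command to run
	max=$1; shift
	attempt=1
	delay=2
	while ! "$@"; do
		if [ "$attempt" -ge "$max" ]; then
			return 1
		fi
		sleep $((delay * SLEEP_SCALE))
		delay=$((delay * 2))   # 2s, 4s, 8s, ...
		attempt=$((attempt + 1))
	done
}

# Demo: a command that fails twice, then succeeds
STATE=$(mktemp)
echo 0 > "$STATE"
flaky() {
	n=$(($(cat "$STATE") + 1))
	echo "$n" > "$STATE"
	[ "$n" -ge 3 ]
}

retry 5 flaky && echo "succeeded after $(cat "$STATE") attempts"
```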

## Notes

- Repository name and organization are auto-detected from `git remote -v`
- For Go dependencies, supports both versioned (v2+) and unversioned modules
- Automatically detects if running in a fork vs upstream repository
- Respects `.gitignore` and doesn't commit generated/vendored files unnecessarily
- Can handle dependencies with replace directives in go.mod
- Supports monorepos with multiple go.mod files
- All Jira tickets are labeled with "ai-generated" for tracking
- PR creation requires GitHub CLI (gh) to be installed and authenticated
- For security updates (CVEs), automatically prioritizes and labels appropriately
- Compatible with Renovate - can be used to customize/enhance Renovate PRs

## Environment Variables

- **JIRA_PROJECT**: Default Jira project for ticket creation
- **JIRA_COMPONENT**: Default component for Jira tickets
- **GITHUB_TOKEN**: GitHub authentication (if not using gh auth)
- **DEFAULT_BRANCH**: Override default branch detection (default: main)

## See Also

- `utils:process-renovate-pr` - Process existing Renovate dependency PRs
- `git:create-pr` - General PR creation command
- `jira:create` - Manual Jira ticket creation
commands/cluster-health-check.md (new file, 542 lines)
@@ -0,0 +1,542 @@
---
description: Perform comprehensive health check on OpenShift cluster and report issues
argument-hint: "[--verbose] [--output-format]"
---

## Name
openshift:cluster-health-check

## Synopsis
```
/openshift:cluster-health-check [--verbose] [--output-format json|text]
```

## Description

The `cluster-health-check` command performs a comprehensive health analysis of an OpenShift/Kubernetes cluster and reports any detected issues. It examines cluster operators, nodes, deployments, pods, persistent volumes, and other critical resources to identify problems that may affect cluster stability or workload availability.

This command is useful for:
- Quick cluster status assessment
- Troubleshooting cluster issues
- Pre-deployment validation
- Regular health monitoring
- Identifying degraded components

## Prerequisites

Before using this command, ensure you have:

1. **Kubernetes/OpenShift CLI**: Either `oc` (OpenShift) or `kubectl` (Kubernetes)
   - Install `oc` from: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/
   - Or install `kubectl` from: https://kubernetes.io/docs/tasks/tools/
   - Verify with: `oc version` or `kubectl version`

2. **Active cluster connection**: Must be connected to a running cluster
   - Verify with: `oc whoami` or `kubectl cluster-info`
   - Ensure KUBECONFIG is set if needed

3. **Sufficient permissions**: Must have read access to cluster resources
   - Cluster-admin or monitoring role recommended for comprehensive checks
   - Minimum: ability to view nodes, pods, and cluster operators

## Arguments

- **--verbose** (optional): Enable detailed output with additional context
  - Shows resource-level details
  - Includes warning conditions
  - Provides remediation suggestions

- **--output-format** (optional): Output format for results
  - `text` (default): Human-readable text format
  - `json`: Machine-readable JSON format for automation

## Implementation

The command performs the following health checks:

### 1. Determine CLI Tool

Detect which Kubernetes CLI is available:

```bash
if command -v oc &> /dev/null; then
  CLI="oc"
  CLUSTER_TYPE="OpenShift"
elif command -v kubectl &> /dev/null; then
  CLI="kubectl"
  CLUSTER_TYPE="Kubernetes"
else
  echo "Error: Neither 'oc' nor 'kubectl' CLI found. Please install one of them."
  exit 1
fi
```

### 2. Verify Cluster Connectivity

Check if connected to a cluster:

```bash
if ! $CLI cluster-info &> /dev/null; then
  echo "Error: Not connected to a cluster. Please configure your KUBECONFIG."
  exit 1
fi

# Get cluster version info
if [ "$CLUSTER_TYPE" = "OpenShift" ]; then
  CLUSTER_VERSION=$($CLI version -o json 2>/dev/null | jq -r '.openshiftVersion // "unknown"')
else
  # "kubectl version --short" was removed in kubectl 1.28; parse the default output instead
  CLUSTER_VERSION=$($CLI version 2>/dev/null | grep -i '^server version' | awk '{print $3}')
fi
```

### 3. Initialize Health Check Report

Create a report structure to collect findings:

```bash
REPORT_FILE=".work/cluster-health-check/report-$(date +%Y%m%d-%H%M%S).txt"
mkdir -p .work/cluster-health-check

# Initialize counters
CRITICAL_ISSUES=0
WARNING_ISSUES=0
INFO_MESSAGES=0
```

### 4. Check Cluster Operators (OpenShift only)

For OpenShift clusters, check cluster operator health:

```bash
if [ "$CLUSTER_TYPE" = "OpenShift" ]; then
  echo "Checking Cluster Operators..."

  # Get all cluster operators
  DEGRADED_COs=$($CLI get clusteroperators -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Degraded" and .status=="True")) | .metadata.name')

  UNAVAILABLE_COs=$($CLI get clusteroperators -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Available" and .status=="False")) | .metadata.name')

  PROGRESSING_COs=$($CLI get clusteroperators -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Progressing" and .status=="True")) | .metadata.name')

  if [ -n "$DEGRADED_COs" ]; then
    CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$DEGRADED_COs" | wc -l)))
    echo "❌ CRITICAL: Degraded cluster operators found:"
    echo "$DEGRADED_COs" | while read co; do
      echo " - $co"
      # Get degraded message
      $CLI get clusteroperator "$co" -o json | jq -r '.status.conditions[] | select(.type=="Degraded") | "   Reason: \(.reason)\n   Message: \(.message)"'
    done
  fi

  if [ -n "$UNAVAILABLE_COs" ]; then
    CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$UNAVAILABLE_COs" | wc -l)))
    echo "❌ CRITICAL: Unavailable cluster operators found:"
    echo "$UNAVAILABLE_COs" | while read co; do
      echo " - $co"
    done
  fi

  if [ -n "$PROGRESSING_COs" ]; then
    WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$PROGRESSING_COs" | wc -l)))
    echo "⚠️ WARNING: Cluster operators in progress:"
    echo "$PROGRESSING_COs" | while read co; do
      echo " - $co"
    done
  fi
fi
```

### 5. Check Node Health

Examine all cluster nodes for issues:

```bash
echo "Checking Node Health..."

# Get nodes that are not Ready
NOT_READY_NODES=$($CLI get nodes -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Ready" and .status!="True")) | .metadata.name')

if [ -n "$NOT_READY_NODES" ]; then
  CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$NOT_READY_NODES" | wc -l)))
  echo "❌ CRITICAL: Nodes not in Ready state:"
  echo "$NOT_READY_NODES" | while read node; do
    echo " - $node"
    # Get node conditions
    $CLI get node "$node" -o json | jq -r '.status.conditions[] | "   \(.type): \(.status) - \(.message // "N/A")"'
  done
fi

# Check for SchedulingDisabled nodes
DISABLED_NODES=$($CLI get nodes -o json | jq -r '.items[] | select(.spec.unschedulable==true) | .metadata.name')

if [ -n "$DISABLED_NODES" ]; then
  WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$DISABLED_NODES" | wc -l)))
  echo "⚠️ WARNING: Nodes with scheduling disabled:"
  echo "$DISABLED_NODES" | while read node; do
    echo " - $node"
  done
fi

# Check for node pressure conditions (MemoryPressure, DiskPressure, PIDPressure)
PRESSURE_NODES=$($CLI get nodes -o json | jq -r '.items[] | select(.status.conditions[] | select((.type=="MemoryPressure" or .type=="DiskPressure" or .type=="PIDPressure") and .status=="True")) | .metadata.name')

if [ -n "$PRESSURE_NODES" ]; then
  WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$PRESSURE_NODES" | wc -l)))
  echo "⚠️ WARNING: Nodes under resource pressure:"
  echo "$PRESSURE_NODES" | while read node; do
    echo " - $node"
    $CLI get node "$node" -o json | jq -r '.status.conditions[] | select((.type=="MemoryPressure" or .type=="DiskPressure" or .type=="PIDPressure") and .status=="True") | "   \(.type): \(.message // "N/A")"'
  done
fi

# Check node resource utilization if metrics-server is available
if $CLI top nodes &> /dev/null; then
  echo "Node Resource Utilization:"
  $CLI top nodes
fi
```

### 6. Check Pod Health Across All Namespaces

Identify problematic pods:

```bash
echo "Checking Pod Health..."

# Get pods that are not Running or Completed
FAILED_PODS=$($CLI get pods --all-namespaces -o json | jq -r '.items[] | select(.status.phase != "Running" and .status.phase != "Succeeded") | "\(.metadata.namespace)/\(.metadata.name) [\(.status.phase)]"')

if [ -n "$FAILED_PODS" ]; then
  CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$FAILED_PODS" | wc -l)))
  echo "❌ CRITICAL: Pods in failed/pending state:"
  echo "$FAILED_PODS"
fi

# Check for pods with restarts (report the highest restart count across containers)
HIGH_RESTART_PODS=$($CLI get pods --all-namespaces -o json | jq -r '.items[] | select(.status.containerStatuses[]? | .restartCount > 5) | "\(.metadata.namespace)/\(.metadata.name) [Restarts: \([.status.containerStatuses[].restartCount] | max)]"')

if [ -n "$HIGH_RESTART_PODS" ]; then
  WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$HIGH_RESTART_PODS" | wc -l)))
  echo "⚠️ WARNING: Pods with high restart count (>5):"
  echo "$HIGH_RESTART_PODS"
fi

# Check for CrashLoopBackOff pods
CRASHLOOP_PODS=$($CLI get pods --all-namespaces -o json | jq -r '.items[] | select(.status.containerStatuses[]? | .state.waiting?.reason == "CrashLoopBackOff") | "\(.metadata.namespace)/\(.metadata.name)"')

if [ -n "$CRASHLOOP_PODS" ]; then
  CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$CRASHLOOP_PODS" | wc -l)))
  echo "❌ CRITICAL: Pods in CrashLoopBackOff:"
  echo "$CRASHLOOP_PODS"
fi

# Check for ImagePullBackOff pods
IMAGE_PULL_PODS=$($CLI get pods --all-namespaces -o json | jq -r '.items[] | select(.status.containerStatuses[]? | .state.waiting?.reason == "ImagePullBackOff" or .state.waiting?.reason == "ErrImagePull") | "\(.metadata.namespace)/\(.metadata.name)"')

if [ -n "$IMAGE_PULL_PODS" ]; then
  CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$IMAGE_PULL_PODS" | wc -l)))
  echo "❌ CRITICAL: Pods with image pull errors:"
  echo "$IMAGE_PULL_PODS"
fi
```

### 7. Check Deployment/StatefulSet/DaemonSet Health

Verify workload controllers:

```bash
echo "Checking Deployments..."

# Check deployments with unavailable replicas
UNHEALTHY_DEPLOYMENTS=$($CLI get deployments --all-namespaces -o json | jq -r '.items[] | select(.status.unavailableReplicas > 0 or .status.replicas != .status.readyReplicas) | "\(.metadata.namespace)/\(.metadata.name) [Ready: \(.status.readyReplicas // 0)/\(.spec.replicas)]"')

if [ -n "$UNHEALTHY_DEPLOYMENTS" ]; then
  WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$UNHEALTHY_DEPLOYMENTS" | wc -l)))
  echo "⚠️ WARNING: Deployments with unavailable replicas:"
  echo "$UNHEALTHY_DEPLOYMENTS"
fi

echo "Checking StatefulSets..."

UNHEALTHY_STATEFULSETS=$($CLI get statefulsets --all-namespaces -o json | jq -r '.items[] | select(.status.replicas != .status.readyReplicas) | "\(.metadata.namespace)/\(.metadata.name) [Ready: \(.status.readyReplicas // 0)/\(.spec.replicas)]"')

if [ -n "$UNHEALTHY_STATEFULSETS" ]; then
  WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$UNHEALTHY_STATEFULSETS" | wc -l)))
  echo "⚠️ WARNING: StatefulSets with unavailable replicas:"
  echo "$UNHEALTHY_STATEFULSETS"
fi

echo "Checking DaemonSets..."

UNHEALTHY_DAEMONSETS=$($CLI get daemonsets --all-namespaces -o json | jq -r '.items[] | select(.status.numberReady != .status.desiredNumberScheduled) | "\(.metadata.namespace)/\(.metadata.name) [Ready: \(.status.numberReady)/\(.status.desiredNumberScheduled)]"')

if [ -n "$UNHEALTHY_DAEMONSETS" ]; then
  WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$UNHEALTHY_DAEMONSETS" | wc -l)))
  echo "⚠️ WARNING: DaemonSets with unavailable pods:"
  echo "$UNHEALTHY_DAEMONSETS"
fi
```

### 8. Check Persistent Volume Claims

Check for storage issues:

```bash
echo "Checking Persistent Volume Claims..."

# Get PVCs that are not Bound
PENDING_PVCS=$($CLI get pvc --all-namespaces -o json | jq -r '.items[] | select(.status.phase != "Bound") | "\(.metadata.namespace)/\(.metadata.name) [\(.status.phase)]"')

if [ -n "$PENDING_PVCS" ]; then
  WARNING_ISSUES=$((WARNING_ISSUES + $(echo "$PENDING_PVCS" | wc -l)))
  echo "⚠️ WARNING: PVCs not in Bound state:"
  echo "$PENDING_PVCS"
fi
```

### 9. Check Critical Namespace Health

For OpenShift, check critical namespaces:

```bash
if [ "$CLUSTER_TYPE" = "OpenShift" ]; then
  echo "Checking Critical Namespaces..."

  CRITICAL_NAMESPACES="openshift-kube-apiserver openshift-etcd openshift-authentication openshift-console openshift-monitoring"

  for ns in $CRITICAL_NAMESPACES; do
    # Check if namespace exists
    if ! $CLI get namespace "$ns" &> /dev/null; then
      CRITICAL_ISSUES=$((CRITICAL_ISSUES + 1))
      echo "❌ CRITICAL: Critical namespace missing: $ns"
      continue
    fi

    # Check for failed pods in critical namespace
    FAILED_IN_NS=$($CLI get pods -n "$ns" -o json | jq -r '.items[] | select(.status.phase != "Running" and .status.phase != "Succeeded") | .metadata.name')

    if [ -n "$FAILED_IN_NS" ]; then
      CRITICAL_ISSUES=$((CRITICAL_ISSUES + $(echo "$FAILED_IN_NS" | wc -l)))
      echo "❌ CRITICAL: Failed pods in critical namespace $ns:"
      echo "$FAILED_IN_NS" | while read pod; do
        echo " - $pod"
      done
    fi
  done
fi
```

### 10. Check Events for Recent Errors

Look for recent warning/error events:

```bash
echo "Checking Recent Events..."

# Get events from last 30 minutes with Warning or Error type
RECENT_WARNINGS=$($CLI get events --all-namespaces --field-selector type=Warning -o json | jq -r --arg since "$(date -u -d '30 minutes ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -v-30M +%Y-%m-%dT%H:%M:%SZ)" '.items[] | select(.lastTimestamp > $since) | "\(.lastTimestamp) [\(.involvedObject.namespace)/\(.involvedObject.name)]: \(.message)"' | head -20)

if [ -n "$RECENT_WARNINGS" ]; then
  echo "⚠️ Recent Warning Events (last 30 minutes):"
  echo "$RECENT_WARNINGS"
fi
```

### 11. Generate Summary Report

Create a summary of findings:

```bash
echo ""
echo "==============================================="
echo "Cluster Health Check Summary"
echo "==============================================="
echo "Cluster Type: $CLUSTER_TYPE"
echo "Cluster Version: $CLUSTER_VERSION"
echo "Check Time: $(date)"
echo ""
echo "Results:"
echo "  Critical Issues: $CRITICAL_ISSUES"
echo "  Warnings: $WARNING_ISSUES"
echo ""

if [ $CRITICAL_ISSUES -eq 0 ] && [ $WARNING_ISSUES -eq 0 ]; then
  echo "✅ Cluster is healthy - no issues detected"
  exit 0
elif [ $CRITICAL_ISSUES -gt 0 ]; then
  echo "❌ Cluster has CRITICAL issues requiring immediate attention"
  exit 1
else
  echo "⚠️ Cluster has warnings - monitoring recommended"
  exit 0
fi
```

### 12. Optional: Export to JSON Format

If `--output-format json` is specified, export findings as JSON:

```json
{
  "cluster": {
    "type": "OpenShift",
    "version": "4.21.0",
    "checkTime": "2025-10-31T12:00:00Z"
  },
  "summary": {
    "criticalIssues": 2,
    "warnings": 5,
    "healthy": false
  },
  "findings": {
    "clusterOperators": {
      "degraded": ["authentication", "monitoring"],
      "unavailable": [],
      "progressing": ["network"]
    },
    "nodes": {
      "notReady": ["worker-1"],
      "schedulingDisabled": ["worker-2"],
      "underPressure": []
    },
    "pods": {
      "failed": ["namespace/pod-1", "namespace/pod-2"],
      "crashLooping": [],
      "imagePullErrors": ["namespace/pod-3"]
    },
    "workloads": {
      "unhealthyDeployments": [],
      "unhealthyStatefulSets": [],
      "unhealthyDaemonSets": []
    },
    "storage": {
      "pendingPVCs": []
    }
  }
}
```
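
The JSON report is convenient for gating automation. A sketch using `jq`; the report content below mirrors the structure above with illustrative values:

```bash
# Fail a pipeline step when the health check reports the cluster unhealthy.
REPORT=$(mktemp)
cat > "$REPORT" <<'EOF'
{"summary": {"criticalIssues": 2, "warnings": 5, "healthy": false}}
EOF

if jq -e '.summary.healthy == true' "$REPORT" > /dev/null; then
  echo "cluster healthy"
else
  echo "cluster unhealthy: $(jq -r '.summary.criticalIssues' "$REPORT") critical issue(s)"
fi
```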

## Examples

### Example 1: Basic health check
```
/openshift:cluster-health-check
```

Output:
```
Checking Cluster Operators...
✅ All cluster operators healthy

Checking Node Health...
⚠️  WARNING: Nodes with scheduling disabled:
  - ip-10-0-51-201.us-east-2.compute.internal

Checking Pod Health...
✅ All pods healthy

...

===============================================
Cluster Health Check Summary
===============================================
Cluster Type: OpenShift
Cluster Version: 4.21.0
Check Time: 2025-10-31 12:00:00

Results:
  Critical Issues: 0
  Warnings: 1

⚠️  Cluster has warnings - monitoring recommended
```

### Example 2: Verbose health check
```
/openshift:cluster-health-check --verbose
```

### Example 3: JSON output for automation
```
/openshift:cluster-health-check --output-format json
```

## Return Value

The command returns different exit codes based on findings:

- **Exit 0**: No critical issues found (the cluster is healthy or has only warnings)
- **Exit 1**: Critical issues detected requiring immediate attention

**Output Format**:
- **Text** (default): Human-readable report with emoji indicators
- **JSON**: Structured data suitable for parsing/automation
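
These exit codes make the check easy to gate on in scripts. A minimal sketch with the health-check invocation stubbed out (substitute the real command for the `health_check` stub):

```shell
# Hedged sketch: gate a deployment step on the health check's exit code.
# health_check is a stub standing in for the real invocation.
health_check() { return "${SIMULATED_EXIT:-0}"; }

if health_check; then
  STATUS="proceed"       # exit 0: healthy, or warnings only
else
  STATUS="block-deploy"  # exit 1: critical issues
fi
echo "$STATUS"
```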

## Common Issues and Remediation

### Degraded Cluster Operators

**Symptoms**: Cluster operators showing Degraded=True or Available=False

**Investigation**:
```bash
oc get clusteroperator <operator-name> -o yaml
oc logs -n openshift-<operator-namespace> -l app=<operator-name>
```

**Remediation**: Check operator logs and events for specific errors

### Nodes Not Ready

**Symptoms**: Nodes in NotReady state

**Investigation**:
```bash
oc describe node <node-name>
oc get events --field-selector involvedObject.name=<node-name>
```

**Remediation**: Common causes include network issues, disk pressure, or kubelet problems

### Pods in CrashLoopBackOff

**Symptoms**: Pods continuously restarting

**Investigation**:
```bash
oc logs <pod-name> -n <namespace> --previous
oc describe pod <pod-name> -n <namespace>
```

**Remediation**: Check application logs, resource limits, and configuration

### ImagePullBackOff Errors

**Symptoms**: Pods unable to pull container images

**Investigation**:
```bash
oc describe pod <pod-name> -n <namespace>
```

**Remediation**: Verify the image name, registry credentials, and network connectivity

## Security Considerations

- **Read-only access**: This command only reads cluster state; it makes no modifications
- **Sensitive data**: Be cautious when sharing reports, as they may contain cluster topology information
- **RBAC requirements**: Ensure the user has appropriate permissions for all resource types checked

## See Also

- OpenShift Documentation: https://docs.openshift.com/container-platform/latest/support/troubleshooting/
- Kubernetes Troubleshooting: https://kubernetes.io/docs/tasks/debug/
- Related commands: `/prow-job:analyze-test-failure`, `/must-gather:analyze`

## Notes

- The command checks cluster state at a point in time; transient issues may not be detected
- For OpenShift clusters, cluster operator checks are performed
- For vanilla Kubernetes, cluster operator checks are skipped
- Resource utilization checks require metrics-server to be installed
- Some checks may be skipped if the user lacks sufficient permissions
278
commands/crd-review.md
Normal file
@@ -0,0 +1,278 @@
---
description: Review Kubernetes CRDs against Kubernetes and OpenShift API conventions
argument-hint: [repository-path]
---

## Name

openshift:crd-review

## Synopsis
```
/openshift:crd-review [repository-path]
```

## Description

The `openshift:crd-review` command analyzes Kubernetes Custom Resource Definitions (CRDs) defined in Go in a repository against both:
- **Kubernetes API Conventions** as defined in the [Kubernetes community guidelines](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md)
- **OpenShift API Conventions** as defined in the [OpenShift development guide](https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md)

This command helps ensure CRDs follow best practices for:
- API naming conventions and patterns
- Resource structure and field organization
- Status field design and patterns
- Field types and validation
- Documentation standards
- OpenShift-specific requirements

The review covers Go API type definitions, providing actionable feedback to improve API design.

## Key Convention Checks

### Kubernetes API Conventions

#### Naming Conventions
- **Resource Names**: Must follow DNS label format (lowercase, alphanumeric, hyphens)
- **Field Names**: PascalCase for Go, camelCase for JSON
- **Avoid**: Abbreviations, underscores, ambiguous names
- **Include**: Units/types in field names when needed (e.g., `timeoutSeconds`)

#### API Structure
- **Required Fields**: Every API object must embed a `k8s.io/apimachinery/pkg/apis/meta/v1` `TypeMeta` struct
- **Metadata**: Every API object must include a `k8s.io/apimachinery/pkg/apis/meta/v1` `ObjectMeta` struct called `metadata`
- **Spec/Status Separation**: Clear separation between desired state (spec) and observed state (status)

#### Status Field Design
- **Conditions**: Must include a conditions array with:
  - `type`: Clear, human-readable condition type
  - `status`: `True`, `False`, or `Unknown`
  - `reason`: Machine-readable reason code
  - `message`: Human-readable message
  - `lastTransitionTime`: RFC 3339 timestamp

#### Field Types
- **Integers**: Prefer `int32` over `int64`
- **Avoid**: Unsigned integers, floating-point values
- **Enums**: Use string constants, not numeric values
- **Optional Fields**: Use pointers in Go

#### Versioning
- **Group Names**: Use domain format (e.g., `myapp.example.com`)
- **Version Strings**: Must match DNS label format (e.g., `v1`, `v1beta1`)
- **Migration**: Provide clear paths between versions

### OpenShift API Conventions

#### Configuration vs Workload APIs
- **Configuration APIs**: Typically cluster-scoped, manage cluster behavior
- **Workload APIs**: Usually namespaced, user-facing resources

#### Field Design
- **Avoid Boolean Fields**: Use enumerations that describe end-user behavior instead of binary true/false
  - ❌ Bad: `paused: true`
  - ✅ Good: `lifecycle: "Paused"` with enum values `["Paused", "Active"]`
- **Object References**: Use specific types, omit the "Ref" suffix
- **Clear Semantics**: Each field should have one clear purpose

#### Documentation Requirements
- **Godoc Comments**: Comprehensive documentation for all exported types and fields
- **JSON Field Names**: Use JSON names in documentation (not Go names)
- **User-Facing**: Write for users, not just developers
- **Explain Interactions**: Document how fields interact with each other

#### Validation
- **Kubebuilder Tags**: Use validation markers (`+kubebuilder:validation:*`)
- **Enum Values**: Explicitly define allowed values
- **Field Constraints**: Define minimums, maximums, patterns
- **Meaningful Errors**: Validation messages should guide users

#### Union Types
- **Discriminated Unions**: Use a discriminator field to select the variant
- **Optional Pointers**: All union members should be optional pointers
- **Validation**: Ensure exactly one union member is set

## Implementation

The command performs the following analysis workflow:

1. **Repository Discovery**
   - Find Go API types (typically in `api/`, `pkg/apis/` directories)
   - Identify CRD generation markers (`+kubebuilder` comments)

2. **Kubernetes Convention Validation**
   - **Naming validation**: Check resource names, field names, condition types
   - **Structure validation**: Verify required fields, metadata, spec/status separation
   - **Status validation**: Ensure a conditions array with proper condition structure
   - **Field type validation**: Check integer types, avoid floats, validate enums
   - **Versioning validation**: Verify group names and version strings

3. **OpenShift Convention Validation**
   - **API classification**: Identify configuration vs workload APIs
   - **Field design**: Flag boolean fields, check enumerations
   - **Documentation**: Verify Godoc comments, user-facing descriptions
   - **Validation markers**: Check kubebuilder validation tags
   - **Union types**: Validate discriminated union patterns

4. **Report Generation**
   - List all findings with severity levels (Critical, Warning, Info)
   - Provide specific file and line references
   - Include remediation suggestions
   - Highlight whether a suggested change might lead to breaking API changes
   - Link to relevant convention documentation

## Output Format

The command generates a structured report with:
- **Summary**: Overview of findings by severity
- **Kubernetes Findings**: Issues related to upstream conventions
- **OpenShift Findings**: Issues related to OpenShift-specific patterns
- **Recommendations**: Actionable steps to improve API design
- **openshift/api api-review reference**: Add a prominent note notifying the user of the existence of the openshift/api repository's api-review command (https://github.com/openshift/api/blob/master/.claude/commands/api-review.md) for PR reviews against that repository.

Each finding includes:
- Severity level (❌ Critical, ⚠️ Warning, 💡 Info)
- File location and line number
- Description of the issue
- Remediation suggestion
- Link to relevant documentation
## Examples

### Example 1: Review current repository
```
/openshift:crd-review
```
Analyzes CRDs in the current working directory.

### Example 2: Review specific repository
```
/openshift:crd-review /path/to/operator-project
```
Analyzes CRDs in the specified directory.

### Example 3: Review with detailed output
The command automatically provides detailed output including:
- All CRD files found
- Go API type definitions
- Compliance summary
- Specific violations with file references

## Common Findings

### Kubernetes Convention Issues

#### Boolean vs Enum Fields
**Issue**: Using a boolean where an enum is better
```go
// ❌ Bad
type MySpec struct {
    Enabled bool `json:"enabled"`
}

// ✅ Good
type MySpec struct {
    // State defines the operational state
    // Valid values are: "Enabled", "Disabled", "Auto"
    // +kubebuilder:validation:Enum=Enabled;Disabled;Auto
    State string `json:"state"`
}
```

#### Missing Status Conditions
**Issue**: Status without a conditions array
```go
// ❌ Bad
type MyStatus struct {
    Ready bool `json:"ready"`
}

// ✅ Good
type MyStatus struct {
    // Conditions represent the latest available observations
    // +listType=map
    // +listMapKey=type
    Conditions []metav1.Condition `json:"conditions,omitempty"`
}
```

#### Improper Field Naming
**Issue**: Ambiguous or abbreviated names
```go
// ❌ Bad
type MySpec struct {
    Timeout int `json:"timeout"` // Ambiguous unit
    Cnt     int `json:"cnt"`     // Abbreviation
}

// ✅ Good
type MySpec struct {
    // TimeoutSeconds is the timeout in seconds
    // +kubebuilder:validation:Minimum=1
    TimeoutSeconds int32 `json:"timeoutSeconds"`

    // Count is the number of replicas
    // +kubebuilder:validation:Minimum=0
    Count int32 `json:"count"`
}
```

### OpenShift Convention Issues

#### Missing Documentation
**Issue**: Exported fields without Godoc
```go
// ❌ Bad
type MySpec struct {
    Field string `json:"field"`
}

// ✅ Good
type MySpec struct {
    // field specifies the configuration field for...
    // This value determines how the operator will...
    // Valid values include...
    Field string `json:"field"`
}
```

#### Missing Validation
**Issue**: Fields without kubebuilder validation
```go
// ❌ Bad
type MySpec struct {
    Mode string `json:"mode"`
}

// ✅ Good
type MySpec struct {
    // mode defines the operational mode
    // +kubebuilder:validation:Enum=Standard;Advanced;Debug
    // +kubebuilder:validation:Required
    Mode string `json:"mode"`
}
```

## Best Practices

1. **Start with Conventions**: Review the conventions before writing APIs
2. **Use Code Generation**: Leverage controller-gen and kubebuilder markers
3. **Document Early**: Write Godoc comments as you define types
4. **Validate Everything**: Add validation markers for all fields
5. **Review Regularly**: Run this command during development and before PRs
6. **Follow Examples**: Study well-designed APIs in OpenShift core

## Arguments

- **repository-path** (optional): Path to the repository containing CRDs. Defaults to the current working directory.

## Exit Codes

- **0**: Analysis completed successfully
- **1**: Error during analysis (e.g., invalid path, no CRDs found)

## See Also

- [Kubernetes API Conventions](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md)
- [OpenShift API Conventions](https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md)
- [Kubebuilder Documentation](https://book.kubebuilder.io/)
- [Controller Runtime API](https://pkg.go.dev/sigs.k8s.io/controller-runtime)
580
commands/create-cluster.md
Normal file
@@ -0,0 +1,580 @@
---
description: Extract OpenShift installer from release image and create an OCP cluster
argument-hint: "[release-image] [platform] [options]"
---

## Name

openshift:create-cluster

## Synopsis
```
/openshift:create-cluster [release-image] [platform] [options]
```

## Description

The `create-cluster` command automates the process of extracting the OpenShift installer from a release image (if not already present) and creating a new OpenShift Container Platform (OCP) cluster. It handles installer extraction from OCP release images, configuration preparation, and cluster creation in a streamlined workflow.

This command is useful for:
- Setting up development/test clusters quickly

## ⚠️ When to Use This Tool

**IMPORTANT**: This is a last-resort tool for advanced use cases. For most development workflows, you should use one of these better alternatives:

### Recommended Alternatives

1. **Cluster Bot**: Request ephemeral test clusters without managing infrastructure
   - No cloud credentials needed
   - Supports dependent PR testing
   - Automatically cleaned up

2. **Gangway**

3. **Multi-PR Testing in CI**: Test multiple dependent PRs together using `/test-with` commands

### When to Use create-cluster

Only use this command when:
- You need full control over cluster configuration
- You're testing installer changes that aren't suitable for CI
- You need a long-lived development cluster on your own cloud account
- The alternatives don't meet your specific requirements

**Note**: This command requires significant setup (cloud credentials, pull secrets, DNS configuration, understanding of OCP versions). If you're new to OpenShift development, start with Cluster Bot or Gangway instead.

## Prerequisites

Before using this command, ensure you have:

1. **OpenShift CLI (`oc`)**: Required to extract the installer from the release image
   - Install from: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/
   - Or use your package manager: `brew install openshift-cli` (macOS)
   - Verify with: `oc version`

2. **Cloud Provider Credentials** configured for your chosen platform:
   - **AWS**: `~/.aws/credentials` configured with appropriate permissions
   - **Azure**: Azure CLI authenticated (`az login`)
   - **GCP**: The command will guide you through service account setup (either using an existing service account JSON or creating a new one)
   - **vSphere**: vCenter credentials
   - **OpenStack**: `clouds.yaml` configured

3. **Pull Secret**: Download from the [Red Hat Console](https://console.redhat.com/openshift/install/pull-secret)

4. **Domain/DNS Configuration**:
   - AWS: Route53 hosted zone
   - Other platforms: Appropriate DNS setup
## Arguments

The command accepts arguments in multiple ways:

### Positional Arguments
```
/openshift:create-cluster [release-image] [platform]
```

### Interactive Mode
If arguments are not provided, the command will interactively prompt for:
- OpenShift release image
- Platform (aws, azure, gcp, vsphere, openstack, none/baremetal)
- Cluster name
- Base domain
- Pull secret location

### Argument Details

- **release-image** (required): OpenShift release image to extract the installer from
  - Production release: `quay.io/openshift-release-dev/ocp-release:4.21.0-ec.2-x86_64`
  - CI build: `registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-10-27-031915`
  - Stable release: `quay.io/openshift-release-dev/ocp-release:4.20.1-x86_64`
  - The command will prompt for this if not provided

- **platform** (optional): Target platform for the cluster
  - `aws`: Amazon Web Services
  - `azure`: Microsoft Azure
  - `gcp`: Google Cloud Platform
  - `vsphere`: VMware vSphere
  - `openstack`: OpenStack
  - `none`: Bare metal / platform-agnostic
  - Default: Prompts the user to select

- **cluster-name** (optional): Name for the cluster
  - Default: `ocp-cluster`
  - Must be DNS-compatible

- **base-domain** (required): Base domain for the cluster
  - Example: `example.com` → the cluster API will be `api.{cluster-name}.{base-domain}`

- **pull-secret** (required): Path to the pull secret file
  - The user will be prompted to provide the path

- **installer-dir** (optional): Directory to store/find installer binaries
  - Default: `~/.openshift-installers`
## Implementation

The command performs the following steps:

### 1. Validate Prerequisites

Check that required tools and credentials are available:
- Verify the `oc` CLI is installed and available
- Verify cloud provider credentials are configured (if applicable)
- Confirm domain/DNS requirements

If any prerequisites are missing, provide clear instructions on how to configure them.

### 2. Get Release Image from User

If not provided as an argument, **prompt the user** for the OpenShift release image:

```
Please provide the OpenShift release image:

Examples:
- Production release: quay.io/openshift-release-dev/ocp-release:4.21.0-ec.2-x86_64
- CI build: registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-10-27-031915
- Stable release: quay.io/openshift-release-dev/ocp-release:4.20.1-x86_64

Release image:
```

Store the user's input as `$RELEASE_IMAGE`.

**Extract version from image** for naming:
```bash
# Parse version from image tag (e.g., "4.21.0-ec.2" or "4.21.0-0.ci-2025-10-27-031915")
VERSION=$(echo "$RELEASE_IMAGE" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+[^"]*' | head -1)
```
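
Note that the pattern above keeps everything after the first version digits, so for production release images the architecture suffix (e.g., `-x86_64`) is retained in `$VERSION`. A runnable sketch of the same parsing logic:

```shell
# Sketch: the version-parsing line above, exercised on the example images.
parse_version() {
  echo "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+[^"]*' | head -1
}

V=$(parse_version "registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-10-27-031915")
echo "$V"   # → 4.21.0-0.ci-2025-10-27-031915
```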

### 3. Determine Installer Location and Extract if Needed

```bash
INSTALLER_DIR="${installer-dir:-$HOME/.openshift-installers}"
INSTALLER_PATH="$INSTALLER_DIR/openshift-install-${VERSION}"
```

**Check if the installer directory exists**:
- If `$INSTALLER_DIR` does not exist:
  - **Ask the user for confirmation**: "The installer directory `$INSTALLER_DIR` does not exist. Would you like to create it?"
  - If the user confirms (yes): Create the directory with `mkdir -p "$INSTALLER_DIR"`
  - If the user declines (no): Exit with an error message suggesting an alternative path

**Check if the installer already exists** at `$INSTALLER_PATH`:
- If present: Verify it works with `"$INSTALLER_PATH" version`
  - If the version matches the release image: Skip extraction
  - If it differs or the check fails: Proceed with extraction
- If not present: Proceed with extraction
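
The cache check above can be sketched as follows (a temporary directory stands in for `~/.openshift-installers` so the sketch is self-contained):

```shell
# Hedged sketch of the installer-cache check: reuse a cached binary only if
# it exists, is executable, and reports a version successfully.
INSTALLER_DIR="$(mktemp -d)"   # stand-in for ~/.openshift-installers
VERSION="4.21.0"
INSTALLER_PATH="$INSTALLER_DIR/openshift-install-$VERSION"

if [ -x "$INSTALLER_PATH" ] && "$INSTALLER_PATH" version >/dev/null 2>&1; then
  ACTION="reuse-cached"
else
  ACTION="extract-from-release-image"
fi
echo "$ACTION"
```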

**Extract installer from release image**:

1. **Verify the `oc` CLI is available**:
   ```bash
   if ! command -v oc &> /dev/null; then
       echo "Error: 'oc' CLI not found. Please install the OpenShift CLI."
       exit 1
   fi
   ```

2. **Extract the installer binary**:
   ```bash
   oc adm release extract \
       --tools \
       --from="$RELEASE_IMAGE" \
       --to="$INSTALLER_DIR"
   ```

   This extracts the `openshift-install` binary and other tools from the release image.

3. **Locate and rename the extracted installer**:
   ```bash
   # The extract command creates a tar.gz with the tools
   # Find the most recently extracted openshift-install tar (compatible with both GNU and BSD find)
   INSTALLER_TAR=$(find "$INSTALLER_DIR" -name "openshift-install-*.tar.gz" -type f -exec ls -t {} + | head -1)

   # Extract from the tar and rename
   cd "$INSTALLER_DIR"
   tar -xzf "$INSTALLER_TAR" openshift-install
   mv openshift-install "openshift-install-${VERSION}"
   chmod +x "openshift-install-${VERSION}"

   # Clean up the tar file
   rm "$INSTALLER_TAR"
   ```

4. **Verify the installer**:
   ```bash
   "$INSTALLER_PATH" version
   ```

Expected output should show the version matching `$VERSION`.

### 4. Prepare Installation Directory

Create a clean installation directory:
```bash
INSTALL_DIR="${cluster-name}-install-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$INSTALL_DIR"
cd "$INSTALL_DIR"
```

### 5. Collect Required Information and Generate install-config.yaml

**IMPORTANT**: Do NOT run the installer interactively. Instead, collect all required information from the user and generate the install-config.yaml programmatically.

**Step 5.1: Collect Information**

Prompt the user for the following information (if not already provided as arguments):

1. **SSH Public Key**:
   - Check for existing SSH keys: `ls -la ~/.ssh/*.pub`
   - Ask the user to select from available keys or specify a path
   - Default: `~/.ssh/id_rsa.pub`

2. **Platform** (if not provided as an argument):
   - Ask the user to select: aws, azure, gcp, vsphere, openstack, none

3. **Platform-specific details**:
   - For AWS:
     - Region (e.g., us-east-1, us-west-2)
   - For Azure:
     - Region (e.g., centralus, eastus)
     - Cloud name (e.g., AzurePublicCloud)
   - For GCP:
     - Follow the **GCP Service Account Setup** (see Step 5.2a below)
     - Project ID
     - Region (e.g., us-central1)
   - For other platforms: collect the required platform-specific info

4. **Base Domain**:
   - Ask for the base domain (e.g., example.com, devcluster.openshift.com)
   - Validate that the domain is configured (e.g., a Route53 hosted zone for AWS)

5. **Cluster Name**:
   - Ask for a cluster name or use the default: `ocp-cluster`
   - Validate DNS compatibility (lowercase, hyphens only)

6. **Pull Secret**:
   - **IMPORTANT**: Always ask the user to provide the path to their pull secret file
   - Do NOT use default paths like `~/pull-secret.txt` or `~/Downloads/pull-secret.txt`
   - Prompt: "Please provide the path to your pull secret file (download from https://console.redhat.com/openshift/install/pull-secret):"
   - Read the contents of the pull secret file from the provided path
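
The two local validations above, item 5's DNS-compatibility rule (an RFC 1123 label: lowercase alphanumerics and hyphens, starting and ending with an alphanumeric, at most 63 characters) and item 6's pull-secret sanity check, can be sketched in a few lines of shell. The pull-secret content below is a synthetic stand-in for a real secret:

```shell
# Hedged sketch: validate a proposed cluster name as a DNS-1123 label.
validate_cluster_name() {
  name="$1"
  [ "${#name}" -ge 1 ] && [ "${#name}" -le 63 ] || return 1
  printf '%s' "$name" | grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}

validate_cluster_name "ocp-cluster" && echo "name ok"

# Hedged sketch: basic pull-secret sanity check. A real pull secret is a
# JSON document with a top-level "auths" map; the file here is synthetic.
PULL_SECRET_PATH="$(mktemp)"
printf '%s' '{"auths":{"quay.io":{"auth":"<redacted>"}}}' > "$PULL_SECRET_PATH"

if [ -r "$PULL_SECRET_PATH" ] && grep -q '"auths"' "$PULL_SECRET_PATH"; then
  echo "pull secret ok"
else
  echo "pull secret check failed" >&2
fi
```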

**Step 5.2a: GCP Service Account Setup** (only for the GCP platform)

If the platform is GCP, the installer requires a service account JSON file with appropriate permissions. Present the user with two options:

1. **Use an existing service account JSON file**
2. **Create a new service account**

**Ask the user**: "Do you want to use an existing service account JSON file or create a new one?"

**Option 1: Use Existing Service Account**

If the user chooses to use an existing service account:
- Prompt: "Please provide the path to your GCP service account JSON file:"
- Store the path as `$GCP_SERVICE_ACCOUNT_PATH`
- Verify the file exists and is valid JSON
- Set the environment variable:
  ```bash
  export GOOGLE_APPLICATION_CREDENTIALS="$GCP_SERVICE_ACCOUNT_PATH"
  ```

**Option 2: Create New Service Account**

If the user chooses to create a new service account:

1. **Verify the gcloud CLI is installed**:
   ```bash
   if ! command -v gcloud &> /dev/null; then
       echo "Error: 'gcloud' CLI not found. Please install the Google Cloud SDK."
       echo "Visit: https://cloud.google.com/sdk/docs/install"
       exit 1
   fi
   ```

2. **Prompt for the Kerberos ID**:
   - Ask: "Please provide your Kerberos ID (e.g., jsmith):"
   - Store as `$KERBEROS_ID`
   - Validate that it is not empty

3. **Set the service account name** (exported so the jq filter below can read it via `env`):
   ```bash
   export SERVICE_ACCOUNT_NAME="${KERBEROS_ID}-development"
   ```

4. **Create the service account**:
   ```bash
   echo "Creating service account: $SERVICE_ACCOUNT_NAME"
   gcloud iam service-accounts create "$SERVICE_ACCOUNT_NAME" --display-name="$SERVICE_ACCOUNT_NAME"
   ```

5. **Extract service account details**:
   ```bash
   # Get service account information
   SERVICE_ACCOUNT_JSON="$(gcloud iam service-accounts list --format json | jq -r '.[] | select(.name | match("/\(env.SERVICE_ACCOUNT_NAME)@"))')"
   SERVICE_ACCOUNT_EMAIL="$(jq -r .email <<< "$SERVICE_ACCOUNT_JSON")"
   PROJECT_ID="$(jq -r .projectId <<< "$SERVICE_ACCOUNT_JSON")"

   echo "Service Account Email: $SERVICE_ACCOUNT_EMAIL"
   echo "Project ID: $PROJECT_ID"
   ```

6. **Grant required permissions**:
   ```bash
   echo "Granting IAM roles to service account..."

   while IFS= read -r ROLE_TO_ADD ; do
       echo "Adding role: $ROLE_TO_ADD"
       gcloud projects add-iam-policy-binding "$PROJECT_ID" \
           --condition="None" \
           --member="serviceAccount:$SERVICE_ACCOUNT_EMAIL" \
           --role="$ROLE_TO_ADD"
   done << 'END_OF_ROLES'
   roles/compute.admin
   roles/iam.securityAdmin
   roles/iam.serviceAccountAdmin
   roles/iam.serviceAccountKeyAdmin
   roles/iam.serviceAccountUser
   roles/storage.admin
   roles/dns.admin
   roles/compute.loadBalancerAdmin
   roles/iam.roleAdmin
   END_OF_ROLES

   echo "All roles granted successfully."
   ```

7. **Create and download a service account key**:
   ```bash
   KEY_FILE="${HOME}/.gcp/${SERVICE_ACCOUNT_NAME}-key.json"
   mkdir -p "$(dirname "$KEY_FILE")"

   echo "Creating service account key..."
   gcloud iam service-accounts keys create "$KEY_FILE" \
       --iam-account="$SERVICE_ACCOUNT_EMAIL"

   echo "Service account key saved to: $KEY_FILE"
   ```

8. **Set the environment variable**:
   ```bash
   export GOOGLE_APPLICATION_CREDENTIALS="$KEY_FILE"
   echo "GOOGLE_APPLICATION_CREDENTIALS set to: $KEY_FILE"
   ```

9. **Store `PROJECT_ID` for later use** in install-config.yaml generation.

**Step 5.2: Generate install-config.yaml**

Create the install-config.yaml file programmatically based on the collected information:

```bash
# Read the SSH public key
SSH_KEY=$(cat "$SSH_KEY_PATH")

# Read the pull secret
PULL_SECRET=$(cat "$PULL_SECRET_PATH")

# Generate install-config.yaml
cat > install-config.yaml <<EOF
apiVersion: v1
baseDomain: ${BASE_DOMAIN}
metadata:
  name: ${CLUSTER_NAME}
compute:
- name: worker
  replicas: 3
controlPlane:
  name: master
  replicas: 3
networking:
  networkType: OVNKubernetes
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/16
platform:
  ${PLATFORM}:
    region: ${REGION}
pullSecret: '${PULL_SECRET}'
sshKey: '${SSH_KEY}'
EOF
```

**Platform-specific configurations**:

For **AWS**:
```yaml
platform:
  aws:
    region: us-east-1
```

For **Azure**:
```yaml
platform:
  azure:
    region: centralus
    baseDomainResourceGroupName: ${RESOURCE_GROUP_NAME}
    cloudName: AzurePublicCloud
```

For **GCP**:
```yaml
platform:
  gcp:
    projectID: ${PROJECT_ID}
    region: us-central1
```

For **None/Baremetal**:
```yaml
platform:
  none: {}
```

**IMPORTANT**: Always back up install-config.yaml after creating it:
```bash
cp install-config.yaml install-config.yaml.backup
```

The installer consumes this file, so the backup is essential for reference.
|
||||
|
||||
### 6. Create the Cluster

Run the installer:
```bash
"$INSTALLER_PATH" create cluster --dir=.
```

Monitor the installation progress. This typically takes 30-45 minutes.

### 7. Post-Installation

Once installation completes:

1. **Display kubeconfig location**:
   ```
   Kubeconfig: $INSTALL_DIR/auth/kubeconfig
   ```

2. **Display cluster credentials**:
   ```
   Console URL: https://console-openshift-console.apps.${cluster-name}.${base-domain}
   Username: kubeadmin
   Password: (from $INSTALL_DIR/auth/kubeadmin-password)
   ```

3. **Export KUBECONFIG** (offer to add to shell profile):
   ```bash
   export KUBECONFIG="$PWD/auth/kubeconfig"
   ```

4. **Verify cluster access**:
   ```bash
   oc get nodes
   oc get co  # cluster operators
   ```
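
   If the operators are still settling, the checks above can be replaced by a blocking wait; a sketch using `oc wait` condition syntax (the 30-minute timeout is an arbitrary choice):

   ```bash
   # Block until every cluster operator reports Available=True
   oc wait clusteroperators --all --for=condition=Available=True --timeout=30m

   # Then review the operator status table for anything degraded or progressing
   oc get clusteroperators
   ```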

5. **Save cluster information** to a summary file:
   ```
   Cluster: ${cluster-name}
   Version: ${VERSION}
   Release Image: ${RELEASE_IMAGE}
   Platform: ${platform}
   Console: https://console-openshift-console.apps.${cluster-name}.${base-domain}
   API: https://api.${cluster-name}.${base-domain}:6443
   Kubeconfig: $INSTALL_DIR/auth/kubeconfig
   Created: $(date)
   ```
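
   The template above can be written out with a heredoc; a sketch (the file name and the `:-unknown` fallbacks are illustrative):

   ```bash
   # Write the cluster summary shown above to a file in the install directory
   SUMMARY_FILE="${INSTALL_DIR:-.}/cluster-summary.txt"
   cat > "$SUMMARY_FILE" <<EOF
   Cluster: ${CLUSTER_NAME:-unknown}
   Version: ${VERSION:-unknown}
   Release Image: ${RELEASE_IMAGE:-unknown}
   Platform: ${PLATFORM:-unknown}
   Console: https://console-openshift-console.apps.${CLUSTER_NAME:-unknown}.${BASE_DOMAIN:-example.com}
   API: https://api.${CLUSTER_NAME:-unknown}.${BASE_DOMAIN:-example.com}:6443
   Kubeconfig: ${INSTALL_DIR:-.}/auth/kubeconfig
   Created: $(date)
   EOF
   echo "Summary written to $SUMMARY_FILE"
   ```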

### 8. Error Handling

If installation fails:

1. **Capture logs**: Installation logs are in `.openshift_install.log`
2. **Provide diagnostics**: Check common failure points:
   - Quota limits on cloud provider
   - DNS configuration issues
   - Invalid pull secret
   - Network/firewall issues
3. **Cleanup guidance**: Inform user about cleanup:
   ```bash
   "$INSTALLER_PATH" destroy cluster --dir=.
   ```

## Examples

### Example 1: Basic cluster creation (interactive)
```
/openshift:create-cluster
```
The command will prompt for the release image and all necessary information.

### Example 2: Create AWS cluster with production release
```
/openshift:create-cluster quay.io/openshift-release-dev/ocp-release:4.21.0-ec.2-x86_64 aws
```

### Example 3: Create cluster with CI build
```
/openshift:create-cluster registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-10-27-031915 gcp
```

## Cleanup

To destroy the cluster after testing:
```bash
cd "$INSTALL_DIR"
"$INSTALLER_PATH" destroy cluster --dir=.
```

**WARNING**: This will permanently delete all cluster resources.

## Common Issues

1. **Pull secret not found**:
   - Download from https://console.redhat.com/openshift/install/pull-secret
   - Save to a secure location of your choice
   - Provide the path when prompted during cluster creation

2. **Insufficient cloud quotas**:
   - Check cloud provider quota limits
   - Request quota increase if needed

3. **DNS issues**:
   - Ensure base domain is properly configured
   - For AWS, verify Route53 hosted zone exists

4. **SSH key not found**:
   - Generate with `ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa`

5. **Unauthorized access to release image**:
   - Error: `error: unable to read image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:...: unauthorized: access to the requested resource is not authorized`
   - For `quay.io/openshift-release-dev/ocp-v4.0-art-dev`, download the pull secret from https://console.redhat.com/openshift/install/pull-secret, save it to a file, and provide that path when prompted.

## Security Considerations

- **Pull secret**: Contains authentication for Red Hat registries. Keep secure.
- **kubeadmin password**: Stored in plaintext in the auth directory. Rotate after cluster creation.
- **kubeconfig**: Contains cluster admin credentials. Protect appropriately.
- **Cloud credentials**: Never commit to version control.

## Return Value

- **Success**: Returns 0 and displays cluster information including kubeconfig path
- **Failure**: Returns non-zero and displays error diagnostics

## See Also

- OpenShift Documentation: https://docs.openshift.com/container-platform/latest/installing/
- OpenShift Install: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/
- Platform-specific installation guides

## Arguments:

- **$1** (release-image): OpenShift release image to extract the installer from (e.g., `quay.io/openshift-release-dev/ocp-release:4.21.0-ec.2-x86_64`)
- **$2** (platform): Target cloud platform for cluster deployment (aws, azure, gcp, vsphere, openstack, none)

360
commands/destroy-cluster.md
Normal file

---
description: Destroy an OpenShift cluster created by create-cluster command
argument-hint: "[install-dir]"
---

## Name
openshift:destroy-cluster

## Synopsis
```
/openshift:destroy-cluster [install-dir]
```

## Description

The `destroy-cluster` command safely destroys an OpenShift Container Platform (OCP) cluster that was previously created using the `/openshift:create-cluster` command. It locates the appropriate installer binary, verifies the cluster information, and performs cleanup of all cloud resources.

This command is useful for:
- Cleaning up development/test clusters after testing
- Removing failed cluster installations
- Freeing up cloud resources and quotas

**⚠️ WARNING**: This operation is **irreversible** and will permanently delete:
- All cluster resources (VMs, load balancers, storage, etc.)
- All data stored in the cluster
- All configuration and credentials
- DNS records (if managed by the installer)

## Prerequisites

Before using this command, ensure you have:

1. **Installation directory** from the original cluster creation
   - Contains the cluster metadata and terraform state
   - Located at `{cluster-name}-install-{timestamp}` by default

2. **OpenShift installer binary** that matches the cluster version
   - Should be available at `~/.openshift-installers/openshift-install-{version}`
   - Same version used to create the cluster

3. **Cloud Provider Credentials** still configured and valid
   - Same credentials used during cluster creation
   - Must have permissions to delete resources

4. **Network connectivity** to the cloud provider
   - Required to communicate with cloud APIs

## Arguments

- **install-dir** (optional): Path to the cluster installation directory
  - Default: Interactive prompt to select from available installation directories
  - Must contain cluster metadata files (metadata.json, terraform.tfstate, etc.)
  - Example: `./my-cluster-install-20251028-120000`

## Implementation

The command performs the following steps:

### 1. Locate Installation Directory

If `install-dir` is not provided:
- Search for installation directories in the current directory
- Look for directories matching pattern `*-install-*` or containing `.openshift_install_state.json`
- Present a list of found directories to the user for selection
- Allow user to manually enter a path if directory not found

If `install-dir` is provided:
- Validate the directory exists
- Verify it contains cluster metadata files
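
The search described above can be sketched as a helper (the patterns are the ones listed; the `-maxdepth` limits are assumptions to keep the scan fast):

```bash
# List candidate installation directories under a root (default: current directory)
find_install_dirs() {
  local root="${1:-.}"
  {
    # Directories matching the default naming pattern
    find "$root" -maxdepth 2 -type d -name '*-install-*' 2>/dev/null
    # Directories containing the installer state file
    find "$root" -maxdepth 3 -name '.openshift_install_state.json' -exec dirname {} \; 2>/dev/null
  } | sort -u
}
```

The resulting list can then be presented to the user for selection, e.g. `find_install_dirs .`.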

### 2. Extract Cluster Information

Read cluster details from the installation directory:
```bash
# Read cluster metadata
if [ -f "$INSTALL_DIR/metadata.json" ]; then
  CLUSTER_NAME=$(jq -r '.clusterName' "$INSTALL_DIR/metadata.json")
  INFRA_ID=$(jq -r '.infraID' "$INSTALL_DIR/metadata.json")
  PLATFORM=$(jq -r '.platform' "$INSTALL_DIR/metadata.json")
fi

# Try to extract version from cluster-info or log files
VERSION=$(grep -oE 'openshift-install.*v[0-9]+\.[0-9]+\.[0-9]+' "$INSTALL_DIR/.openshift_install.log" | head -1 | grep -oE '[0-9]+\.[0-9]+\.[0-9]+[^"]*' | head -1)
```

### 3. Display Cluster Information and Confirm

Show the user what will be destroyed:
```
Cluster Information:
  Name: ${CLUSTER_NAME}
  Infrastructure ID: ${INFRA_ID}
  Platform: ${PLATFORM}
  Installation Directory: ${INSTALL_DIR}
  Version: ${VERSION}

⚠️ WARNING: This will permanently destroy the cluster and all its resources!

This action will delete:
  - All cluster VMs and compute resources
  - Load balancers and networking resources
  - Storage volumes and persistent data
  - DNS records
  - All cluster configuration

Are you sure you want to destroy this cluster? (yes/no):
```

**Important**: Require the user to type "yes" (not just "y") to confirm destruction.

### 4. Locate the Correct Installer

Find the installer binary that matches the cluster version:
```bash
INSTALLER_DIR="${HOME}/.openshift-installers"
INSTALLER_PATH="$INSTALLER_DIR/openshift-install-${VERSION}"

# Check if the version-specific installer exists
if [ ! -f "$INSTALLER_PATH" ]; then
  echo "Warning: Installer for version ${VERSION} not found at ${INSTALLER_PATH}"
  echo "Searching for alternative installers..."

  # Look for any installer in the installers directory
  AVAILABLE_INSTALLERS=$(find "$INSTALLER_DIR" -name "openshift-install-*" -type f 2>/dev/null)

  if [ -n "$AVAILABLE_INSTALLERS" ]; then
    echo "Found installers:"
    echo "$AVAILABLE_INSTALLERS"
    echo ""
    echo "You may use a different version installer, but this may cause issues."
    echo "Would you like to:"
    echo "  1. Use an available installer from the list above"
    echo "  2. Extract the correct installer from the release image"
    echo "  3. Cancel the operation"
  else
    echo "No installers found. Would you like to extract the installer? (yes/no):"
  fi
fi

# Verify installer works
"$INSTALLER_PATH" version
```

### 5. Backup Important Files (Optional)

Offer to backup key files before destruction:
```
Would you like to backup cluster information before destroying? (yes/no):
```

If yes, create a backup:
```bash
BACKUP_DIR="${INSTALL_DIR}-backup-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP_DIR"

# Backup key files
cp "$INSTALL_DIR/metadata.json" "$BACKUP_DIR/" 2>/dev/null
cp "$INSTALL_DIR/auth/kubeconfig" "$BACKUP_DIR/" 2>/dev/null
cp "$INSTALL_DIR/auth/kubeadmin-password" "$BACKUP_DIR/" 2>/dev/null
cp "$INSTALL_DIR/.openshift_install.log" "$BACKUP_DIR/" 2>/dev/null
cp "$INSTALL_DIR/install-config.yaml.backup" "$BACKUP_DIR/" 2>/dev/null

echo "Backup created at: $BACKUP_DIR"
```

### 6. Run Cluster Destroy

Execute the destroy command:
```bash
cd "$INSTALL_DIR"

echo "Starting cluster destruction..."
echo "This may take 10-15 minutes..."

"$INSTALLER_PATH" destroy cluster --dir=. --log-level=debug

DESTROY_EXIT_CODE=$?
```

Monitor the destruction progress and display status updates.

### 7. Verify Cleanup

After the destroy command completes:

1. **Check exit code**:
   ```bash
   if [ $DESTROY_EXIT_CODE -eq 0 ]; then
     echo "✅ Cluster destroyed successfully"
   else
     echo "❌ Cluster destruction failed with exit code: $DESTROY_EXIT_CODE"
     echo "Check logs at: $INSTALL_DIR/.openshift_install.log"
   fi
   ```

2. **Verify cloud resources** (platform-specific):
   - AWS: Check for lingering resources with tag `kubernetes.io/cluster/${INFRA_ID}`
   - Azure: Verify resource group deletion
   - GCP: Check project for remaining resources

3. **List any remaining resources**:
   ```
   If any resources remain, provide commands to manually clean them up.
   ```
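
   For AWS, the ownership-tag check can use the Resource Groups Tagging API; a sketch (assumes the AWS CLI is installed and configured for the cluster's region):

   ```bash
   # List any resources still carrying the cluster's ownership tag
   aws resourcegroupstaggingapi get-resources \
     --tag-filters "Key=kubernetes.io/cluster/${INFRA_ID},Values=owned" \
     --query 'ResourceTagMappingList[].ResourceARN' --output text
   ```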

### 8. Cleanup Installation Directory (Optional)

Ask the user if they want to remove the installation directory:
```
The cluster has been destroyed. Would you like to delete the installation directory? (yes/no):
Directory: $INSTALL_DIR
Size: $(du -sh "$INSTALL_DIR" | cut -f1)
```

If yes:
```bash
rm -rf "$INSTALL_DIR"
echo "Installation directory removed"
```

If no:
```bash
echo "Installation directory preserved at: $INSTALL_DIR"
echo "You can manually remove it later with: rm -rf $INSTALL_DIR"
```

### 9. Display Summary

Show final summary:
```
Cluster Destruction Summary:
  Cluster Name: ${CLUSTER_NAME}
  Status: Successfully destroyed
  Platform: ${PLATFORM}
  Duration: ${DURATION}
  Backup: ${BACKUP_DIR} (if created)

Next steps:
  - Verify your cloud console for any lingering resources
  - Check your cloud billing to ensure resources are no longer incurring charges
  - Remove installation directory if not already deleted: ${INSTALL_DIR}
```
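
The `DURATION` field above can be filled in by timestamping the destroy run; a sketch:

```bash
# Record wall-clock duration of the destroy step
START_TS=$(date +%s)
# ... "$INSTALLER_PATH" destroy cluster --dir=. would run here ...
END_TS=$(date +%s)
ELAPSED=$((END_TS - START_TS))
DURATION=$(printf '%dm%ds' $((ELAPSED / 60)) $((ELAPSED % 60)))
```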

## Error Handling

If destruction fails, the command should:

1. **Capture error logs** from `.openshift_install.log`
2. **Identify the failure point**:
   - Timeout waiting for resource deletion
   - Permission errors
   - API rate limiting
   - Network connectivity issues
   - Resources locked or in use
3. **Provide recovery options**:
   - Retry the destroy operation
   - Manual cleanup instructions for specific resources
   - Contact support if critical errors occur

Common failure scenarios:

**Timeout errors**:
```bash
# Some resources may take longer to delete
# Retry the destroy command:
"$INSTALLER_PATH" destroy cluster --dir="$INSTALL_DIR"
```

**Permission errors**:
```
Error: Cloud credentials may have expired or lack permissions
Solution:
1. Verify cloud credentials are still valid
2. Check IAM permissions for resource deletion
3. Re-run the destroy command after fixing credentials
```

**Partial destruction**:
```
Warning: Some resources could not be deleted automatically.

Remaining resources:
- Load balancer: ${LB_NAME}
- Security group: ${SG_NAME}
- S3 bucket: ${BUCKET_NAME}

Manual cleanup commands:
[Platform-specific commands to delete remaining resources]
```

## Examples

### Example 1: Destroy cluster with interactive directory selection
```
/openshift:destroy-cluster
```
The command will search for installation directories and prompt you to select one.

### Example 2: Destroy cluster with specific directory
```
/openshift:destroy-cluster ./my-cluster-install-20251028-120000
```

### Example 3: Destroy cluster with full path
```
/openshift:destroy-cluster /home/user/clusters/test-cluster-install-20251028-120000
```

## Common Issues

1. **Installation directory not found**:
   - Ensure you're in the correct directory
   - Provide the full path to the installation directory
   - Check if the directory was moved or renamed

2. **Installer binary not found**:
   - The command will help you extract the correct installer
   - Alternatively, manually place the installer in `~/.openshift-installers/`

3. **Cloud credentials expired**:
   - Refresh your cloud credentials
   - Re-authenticate with the cloud provider CLI
   - Re-run the destroy command

4. **Resources already deleted manually**:
   - The destroy command may fail if resources were manually deleted
   - Check the logs and manually clean up any remaining resources
   - Remove the installation directory manually

5. **Destroy hangs or times out**:
   - Some resources may take longer to delete (especially load balancers)
   - Wait for the operation to complete (can take 15-30 minutes)
   - If truly stuck, cancel and retry
   - Check cloud console for resource status

## Safety Features

This command includes several safety measures:

1. **Confirmation required**: Must type "yes" to proceed
2. **Cluster information displayed**: Shows what will be destroyed before proceeding
3. **Backup option**: Offers to backup important files
4. **Validation checks**: Verifies installation directory and metadata
5. **Detailed logging**: All operations logged for troubleshooting
6. **Error recovery**: Provides manual cleanup instructions if automated cleanup fails

## Return Value

- **Success**: Returns 0 and displays destruction summary
- **Failure**: Returns non-zero and displays error diagnostics with recovery instructions

## See Also

- `/openshift:create-cluster` - Create a new OCP cluster
- OpenShift Documentation: https://docs.openshift.com/container-platform/latest/installing/
- Platform-specific cleanup guides

## Arguments:

- **$1** (install-dir): Path to the cluster installation directory created by create-cluster (optional, interactive if not provided)

79
commands/expand-test-case.md
Normal file

---
description: Expand basic test ideas or existing oc commands into comprehensive test scenarios with edge cases in oc CLI or Ginkgo format
argument-hint: [test-idea-or-file-or-commands] [format]
---

## Name
openshift:expand-test-case

## Synopsis
```
/openshift:expand-test-case [test-idea-or-file-or-commands] [format]
```

## Description

The `expand-test-case` command transforms basic test ideas or existing oc commands into comprehensive test scenarios. It accepts three types of input:

1. **Test idea**: Simple description of what to test (e.g., "verify pod deployment")
2. **File path**: Path to existing test file to expand (e.g., `/path/to/test.sh` or `/path/to/test.go`)
3. **oc commands**: Direct oc CLI commands to analyze and expand (e.g., `oc create pod nginx`)

The command expands the input to cover positive flows, negative scenarios, edge cases, and boundary conditions, helping QE engineers ensure thorough test coverage.

Supports two output formats:
- **oc CLI**: Shell scripts with oc commands for manual or automated execution
- **Ginkgo**: Go test code using Ginkgo/Gomega framework for E2E tests

## Implementation

The command analyzes the input and generates comprehensive scenarios:

1. **Parse Input**: Determine if input is a test idea, file path, or oc commands
   - If file path: Read and analyze existing test code
   - If oc commands: Parse commands to understand what's being tested
   - If test idea: Understand the core feature or behavior
2. **Identify Test Dimensions**: Determine coverage aspects (functionality, security, performance, edge cases)
3. **Generate Positive Tests**: Happy path scenarios where everything works
4. **Generate Negative Tests**: Error handling, invalid inputs, permission issues
5. **Add Edge Cases**: Boundary values, race conditions, resource limits
6. **Define Validation**: Clear success criteria and assertions
7. **Format Output**: Generate in requested format (oc CLI or Ginkgo) - **MUST follow the standards in "Test Coverage Guidelines" section below**

**CRITICAL**: All generated test scenarios MUST adhere to the coverage dimensions, best practices, and standards defined in the **"Test Coverage Guidelines"** section below. Use the referenced examples and patterns from the OpenShift origin repository.

## Test Coverage Guidelines

The command generates comprehensive test scenarios following industry best practices:

**Test Coverage Dimensions:**
- **Positive Tests**: Valid inputs and expected workflows
- **Negative Tests**: Invalid inputs, permission errors, missing dependencies
- **Edge Cases**: Boundary values (0, max values, empty inputs, special characters)
- **Security Tests**: RBAC validation, security context enforcement, privilege escalation
- **Resource Tests**: Low memory, disk pressure, network issues, rate limiting
- **Concurrency**: Multiple operations happening simultaneously
- **Failure Recovery**: Restart behavior, cleanup on failure

**References:**
- OpenShift Test Examples: https://github.com/openshift/origin/tree/master/test/extended
- Ginkgo BDD Framework: https://onsi.github.io/ginkgo/
- Test Pattern Catalog: https://github.com/openshift/origin/blob/master/test/extended/README.md
- oc CLI Reference: https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/developer-cli-commands.html

**Best Practices Applied:**
- Use stable, descriptive test names (no dynamic IDs or timestamps)
- Ensure proper resource cleanup (prevent resource leaks)
- Include meaningful assertions with clear failure messages
- Isolate tests (each test creates its own resources)
- Add appropriate timeouts to prevent hanging tests
- Follow Ginkgo patterns: Describe/Context/It hierarchy
- Use framework helpers: e2epod, e2enode, e2enamespace
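
As an illustration of these dimensions in the oc CLI format, a minimal expansion of the idea "verify pod deployment" might look like this (the namespace and image are placeholders):

```bash
#!/usr/bin/env bash
set -euo pipefail

NS="expand-demo-$$"
oc create namespace "$NS"
trap 'oc delete namespace "$NS" --ignore-not-found' EXIT  # cleanup, even on failure

# Positive: a valid pod becomes Ready within a bounded time
oc run web --image=registry.access.redhat.com/ubi9/httpd-24 -n "$NS"
oc wait --for=condition=Ready pod/web -n "$NS" --timeout=120s

# Negative: creating a pod with an empty image name must be rejected
if oc run bad --image="" -n "$NS" 2>/dev/null; then
  echo "FAIL: empty image name was accepted"
  exit 1
fi
echo "PASS"
```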

## Arguments

- **$1** (test-idea-or-file-or-commands): One of:
  - **Test idea**: Description of what to test
  - **File path**: Path to existing test file
  - **oc commands**: Set of oc CLI commands to analyze and expand
- **$2** (format): Output format - "oc CLI" or "Ginkgo" (optional, will prompt if not provided)

104
commands/new-e2e-test.md
Normal file

---
description: Write and validate new OpenShift E2E tests using Ginkgo framework
argument-hint: [test-specification]
---

## Name
openshift:new-e2e-test

## Synopsis
```
/openshift:new-e2e-test [test-specification]
```

## Description

The `new-e2e-test` command assists in writing and validating new tests for the OpenShift test suite. It follows best practices for Ginkgo-based testing and ensures test reliability through automated validation.

This command handles the complete lifecycle of test development:
- Writes tests following Ginkgo patterns and OpenShift conventions
- Validates tests for reliability through multiple test runs
- Ensures proper test naming and structure
- Handles both origin repository and extension tests appropriately

## Test Framework Guidelines

### Ginkgo Framework
- OpenShift-tests uses **Ginkgo** as its testing framework
- Tests are organized in a BDD (Behavior-Driven Development) style with Describe/Context/It blocks
- All tests should follow Ginkgo patterns and conventions, with two exceptions:
  - You MUST NOT use BeforeAll, AfterAll hooks
  - You MUST NOT use ginkgo.Serial; instead, use the [Serial] annotation in the test name if non-parallel execution is required

### Repository-Specific Guidelines

#### Origin Repository Tests

If working in the "origin" code repository:
- All tests should go into the `test/extended` directory
- If creating a new package, import it into `test/extended/include.go`
- After writing your test, **MUST** rebuild the openshift-tests binary using `make openshift-tests`

#### Other repositories

Other repositories have different conventions for the locations of tests and how they get imported. Examine the code base and follow the conventions defined there.

## Critical Test Requirements

### Test Names

**CRITICAL**: Test names must be stable and deterministic.

#### ❌ NEVER Include Dynamic Information:
- Pod names (e.g., "test-pod-abc123")
- Timestamps
- Random UUIDs or generated identifiers
- Node names
- Namespace names with random suffixes
- Limits that may change later

#### ✅ ALWAYS Use Descriptive, Static Names:
- **Good example**: "should create a pod with custom security context"
- **Bad example**: "should create pod test-pod-xyz123 with custom security context"

- **Good example**: "should create a pod within a reasonable timeframe"
- **Bad example**: "should create a pod within 15 seconds"

### Results

**CRITICAL**: Tests must always produce a pass, fail, or skip result. Do not create tests that can only ever pass or only ever fail.

## Test Structure Guidelines

### Best Practices

- Tests should be focused and test one specific behavior
- Use proper setup and cleanup in BeforeEach/AfterEach blocks
- Include appropriate timeouts for operations
- Add meaningful assertions with clear failure messages
- Follow existing patterns in the codebase for consistency

## Implementation

The command performs the following steps:

1. **Analyze Specification**: Parse the test specification provided by the user
2. **Write Test**: Create a new test file following Ginkgo and OpenShift conventions
   - Determine correct location
   - Follow proper test structure
   - Use stable, descriptive naming
   - Implement proper setup/cleanup
3. **Build Binary**: Rebuild the appropriate test binary (openshift-tests or a test extension)
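
For origin-repository tests, step 3 plus a quick stability check might look like this (the test name is a placeholder; repeated runs help catch flakes before review):

```bash
# Rebuild the test binary so the new test is included
make openshift-tests

# Run the new test several times to check it passes reliably
for i in 1 2 3; do
  ./openshift-tests run-test "[sig-node] my feature should create a pod with custom security context" \
    || { echo "Run $i failed"; exit 1; }
done
```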

## Arguments

- **$1** (test-specification): Description of the test behavior to validate. Should clearly specify:
  - What feature/behavior to test
  - Expected outcomes
  - Any specific conditions or configurations

146
commands/rebase.md
Normal file

---
argument-hint: <tag>
description: Rebase OpenShift fork of an upstream repository to a new upstream release.
---

## Name
openshift:rebase

## Synopsis
```
/openshift:rebase [tag]
```

## Description

The `/openshift:rebase` command rebases the git repository in the current working directory to a new upstream release specified by `[tag]`. If no `[tag]` is specified, the command tries to find the latest stable upstream release.

The repository must follow the rules described in https://github.com/openshift/kubernetes/blob/master/REBASE.openshift.md, namely that all OpenShift-specific commits must have the `UPSTREAM:` prefix.

## Implementation

### Pre-requisites
Three remote repositories should be configured locally: `origin` tracking the user's fork of this repository, `openshift` tracking this repository, and `upstream` tracking the upstream repository.

To verify the correct setup, use
```bash
git remote -v
```

Fail if the `upstream`, `origin`, or `openshift` remote is missing.

### Rebase to the new upstream version

1. Fetch all the remote repositories, including tags
   ```bash
   git fetch --all --tags
   ```

2. Find the main branch of the repository. It's either `master` or `main`. The following steps use `master`; replace it with the actual main branch.

3. If the user did not specify an upstream tag `<tag>` to rebase to, find the greatest upstream tag that is not an alpha, beta, or rc pre-release.
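
One way to implement the tag selection in step 3, assuming the upstream uses `v`-prefixed semver tags:

```bash
# Greatest upstream tag that is not an alpha/beta/rc pre-release
latest_stable_tag() {
  git tag --list 'v*' --sort=-v:refname | grep -Ev -- '-(alpha|beta|rc)' | head -n 1
}
```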

4. Create a new branch based on the upstream tag `<tag>` and name it after the tag.
   ```bash
   git checkout -b rebase-<tag> <tag>
   ```

5. Merge the `openshift/master` branch into the `rebase-<tag>` branch with merge strategy `ours`:
   ```bash
   git merge -s ours openshift/master
   ```

6. Find the last rebase that has been done to `openshift/master`. We will use the upstream tag used for this rebase as `$previous_tag`.
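
A heuristic for step 6, assuming the previous rebase branched from an upstream tag that is reachable from `openshift/master`:

```bash
# Nearest upstream tag in the history of the given ref
previous_rebase_tag() {
  git describe --tags --abbrev=0 "${1:-openshift/master}"
}
# previous_tag=$(previous_rebase_tag)  # requires the remotes fetched in step 1
```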

7. Find the merge base of `openshift/master` and `$previous_tag` by running `git merge-base openshift/master $previous_tag`. We will use this merge base as `$mergebase`.

8. Prepare a `commits.tsv` tab-separated values file containing the set of carry commits in the openshift/master branch that need to be considered for picking:

   Create the commits file:
   ```bash
   echo -e 'Sha\tMessage\tDecision' > commits.tsv
   git log ${mergebase}..openshift/master --ancestry-path --reverse --no-merges --pretty="tformat:%h%x09%s%x09" | grep "UPSTREAM:" >> commits.tsv
   ```
   Note that the second command appends (`>>`) so that the header row is preserved.

9. Go through the commits in the `commits.tsv` file and for each of them decide
whether to pick, drop, or squash it. Commits carried on rebase branches have commit
messages prefixed as follows:

* `UPSTREAM: <carry>: Add OpenShift files`:
ALWAYS carry this commit and mark it as "cherry-pick".
This is a persistent carry that contains all OpenShift-specific files and should be present in every rebase.

* Other `UPSTREAM: <carry>` commits:
A persistent carry that needs to be considered for squashing.
Examine what files it modifies using `git show --stat <commit-sha>`.
If it modifies ONLY OpenShift-specific files (Dockerfile, OWNERS, .ci-operator.yaml, .snyk, etc.), mark it as "squash";
otherwise mark it as "cherry-pick".

* `UPSTREAM: <drop>`:
A carry that should generally not be picked for the subsequent rebase branch.
These commits are typically used to maintain the codebase in branch-specific ways,
such as updating generated files or dependencies.
Mark such commits as "drop".

* `UPSTREAM: (upstream PR number)`:
The number identifies a PR in the upstream repository (e.g. https://github.com/<upstream project>/<upstream repository>/pull/<pr id>).
A commit with this message should only be picked into the subsequent rebase branch if the commits
of the referenced PR are not included in the upstream branch. To check whether a given commit is included
in the upstream branch, open the referenced upstream PR and check any of its commits for the release tag.

For each commit:
- Print the decision you made and why.
- Update commits.tsv with the decision ("cherry-pick", "drop", or "squash").

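For the `UPSTREAM: (upstream PR number)` case, containment can also be checked locally with `git merge-base --is-ancestor`. The sketch below runs in a throwaway repo; the tag name `v2` stands in for the new upstream tag, and the commit messages are hypothetical:

```shell
# Sketch: decide whether a commit is already contained in the new upstream
# tag. A throwaway repo is used so this is safe to run anywhere.
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
gc() { git -c user.email=a@b -c user.name=t commit -q --allow-empty -m "$1"; }
gc "base"
gc "fix included upstream"; included=$(git rev-parse HEAD)
git tag v2                          # the new upstream tag contains $included
gc "carry not yet upstream"; carried=$(git rev-parse HEAD)

check() {
  # --is-ancestor treats a commit as an ancestor of itself
  if git merge-base --is-ancestor "$1" v2; then
    echo "in tag: drop"
  else
    echo "not in tag: cherry-pick"
  fi
}
check "$included"                   # -> in tag: drop
check "$carried"                    # -> not in tag: cherry-pick
```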
10. Cherry-pick all commits marked as "cherry-pick" in commits.tsv.
Then squash ALL commits marked as "squash" into a single commit named "UPSTREAM: <carry>: Add OpenShift files"
to keep the number of <carry> commits as low as possible.

Use `git reset --soft` to squash multiple commits together, then create a single commit with all the changes.
The commit message should list what was included (e.g., "Additional changes: remove .github files, add .snyk file, update Dockerfile and .ci-operator.yaml").

11. If the upstream repository DOES NOT include a `vendor/` directory and the OpenShift fork DOES, update the vendor directory with `go mod tidy` and `go mod vendor`.
Amend these vendor updates into the "UPSTREAM: <carry>: Add OpenShift files" commit using `git commit --amend --no-edit`.

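The `git reset --soft` squash in step 10 can be sketched as follows. This is a minimal, self-contained demonstration in a throwaway repo, assuming the commits marked "squash" sit contiguously at the branch tip (the count of 3 is hypothetical; adjust per commits.tsv):

```shell
# Sketch of the step 10 squash, run in a throwaway repo so it is safe to try.
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
gc() { git -c user.email=a@b -c user.name=t commit -q "$@"; }
gc --allow-empty -m "base"
for i in 1 2 3; do
  echo "$i" > "file$i"; git add "file$i"
  gc -m "UPSTREAM: <carry>: change $i"
done
git reset --soft HEAD~3             # drop the 3 commits, keep changes staged
gc -m "UPSTREAM: <carry>: Add OpenShift files" \
   -m "Additional changes: update Dockerfile and .ci-operator.yaml"
git log --oneline                   # base + one squashed carry commit
```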
12. As a verification step, review the previous rebase and ensure that all changes made in it are present in the current one,
either as a cherry-picked patch or because they were included in the new upstream tag.
List all these commits, together with the checks you made and their results.

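One way to mechanize this check is `git cherry`, which lists commits on the rebase branch that have no patch-equivalent commit in the upstream tag. A sketch in a throwaway repo, where `v2` stands in for the new upstream tag and `rebase` for the rebase branch (both names hypothetical):

```shell
# Sketch: list carry commits that are NOT patch-equivalent to anything in the
# upstream tag, for cross-checking against the previous rebase.
set -e
repo=$(mktemp -d); cd "$repo"; git init -q -b main
gc() { git -c user.email=a@b -c user.name=t commit -q "$@"; }
gc --allow-empty -m "base"
git tag v2                                  # the new upstream tag
git checkout -q -b rebase
echo carry > carry.txt; git add carry.txt
gc -m "UPSTREAM: <carry>: Add OpenShift files"
# Lines prefixed "+" are commits present on the branch but not in v2.
git cherry -v v2 rebase
```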
13. Verify the changes by running `make` and `make test` (or similar commands such as `go build ./...` and `go test ./...`).
Stop here if there are compilation errors or test failures that indicate real code issues.
If you make any new commits to fix compilation or tests, let the user review these changes and then squash them into the "UPSTREAM: <carry>: Add OpenShift files" commit as well.

14. Find links to upstream changelogs between `$previous_tag` and $1.
Make sure they are links to changelogs, not tags.
Print the list of links.

15. Create a GitHub pull request against the OpenShift GitHub repository (openshift/<repo-name>).
IMPORTANT: Use `--repo openshift/<repo-name>` to ensure the PR is created against the correct OpenShift repository, not the upstream.
The PR title should be "Rebase to $1 for OCP <current OCP version>".
Follow the repository's .github/PULL_REQUEST_TEMPLATE.md, if it exists.
The description of the PR must look like:
```
## Upstream changelogs
<List links to all upstream changelogs, as composed in the previous step.>

## Summary of changes
<List all new major features and breaking changes that happened between $previous_tag and $1.
Do not list upstream commits or PRs; write a human-readable summary of them.
Do not include small bug fixes, small updates, or dependency bumps.>

## Carried commits
<List of commits from commits.tsv. For each commit, print the decision you made - either "drop", "cherry-pick", or "squash".>

Diff to upstream: <link to a diff between the upstream project/upstream repository/tag $1 and this PR (i.e. my personal fork with branch `rebase-$1`)>

Previous rebase: <link to the previous rebase PR on GitHub>
```
When opening the PR, ALWAYS use `gh pr create --web --repo openshift/<repo-name>` to allow the user to edit the PR before creation.
69
commands/review-test-cases.md
Normal file
@@ -0,0 +1,69 @@
---
description: Review test cases for completeness, quality, and best practices - accepts file path or direct oc commands/test code
argument-hint: [file-path-or-test-code-or-commands]
---

## Name

openshift:review-test-cases

## Synopsis

```
/openshift:review-test-cases [file-path-or-test-code-or-commands]
```

## Description

The `review-test-cases` command provides a comprehensive review of OpenShift test cases to ensure quality, completeness, and adherence to best practices. It accepts three types of input:

1. **File path**: Path to a test file (e.g., `/path/to/test.sh` or `/path/to/test.go`)
2. **oc commands**: Direct oc CLI commands to review (e.g., paste a set of oc commands)
3. **Test code**: Pasted Ginkgo test code to analyze

The command analyzes test code in both oc CLI shell scripts and Ginkgo Go tests, helping QE engineers identify gaps in test coverage, improve test reliability, and ensure tests follow OpenShift testing standards.

## Implementation

The command analyzes test cases and provides structured feedback:

1. **Parse Test Input**: Determine whether the input is a file path, oc commands, or test code
   - If file path: Read and analyze the test file
   - If oc commands: Parse the command sequence
   - If test code: Analyze the pasted Ginkgo/test code
2. **Identify Test Format**: Detect whether it is an oc CLI shell script or Ginkgo Go code
3. **Analyze Test Structure**: Review organization, naming, and patterns
4. **Check Coverage**: Verify positive, negative, and edge case coverage
5. **Review Assertions**: Ensure proper validation and error checking
6. **Evaluate Cleanup**: Verify resource cleanup and namespace management
7. **Assess Best Practices**: **MUST follow the standards defined in the "Testing Guidelines and References" section below**
8. **Generate Recommendations**: Provide actionable improvement suggestions based on the guidelines

**CRITICAL**: All reviews MUST be evaluated against the specific standards, references, and best practices listed in the **"Testing Guidelines and References"** section below. Do not use generic testing advice - follow the OpenShift-specific guidelines provided.

## Testing Guidelines and References

The review follows established testing best practices from:

**For Ginkgo/E2E Tests:**
- OpenShift Origin Test Extended: https://github.com/openshift/origin/tree/master/test/extended
- Ginkgo Testing Framework: https://onsi.github.io/ginkgo/
- OpenShift Test Best Practices: https://github.com/openshift/origin/blob/master/test/extended/README.md

**For oc CLI Tests:**
- OpenShift CLI Documentation: https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/developer-cli-commands.html
- Bash Best Practices: https://google.github.io/styleguide/shellguide.html

**Key Testing Standards:**
- Use descriptive, stable test names (no timestamps, random IDs)
- Proper resource cleanup (AfterEach, defer, trap)
- Meaningful assertions with clear failure messages
- Test isolation (each test creates its own resources)
- Appropriate timeouts and waits
- No BeforeAll/AfterAll in Ginkgo tests
- Use framework helpers (e2epod, e2enode) when available

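The "proper resource cleanup" standard for oc CLI shell tests can be sketched with `trap`. The `oc` calls below are illustrative comments only; a temp file stands in for a namespace so the sketch runs without a cluster, and `TEST_NS` is a hypothetical variable name:

```shell
# Sketch of trap-based cleanup for an oc CLI shell test. The cleanup handler
# runs on success, failure, or interrupt, so the test never leaks resources.
set -euo pipefail

ns_marker=$(mktemp)     # stands in for: oc create namespace "$TEST_NS"
cleanup() {
  rm -f "$ns_marker"    # stands in for: oc delete namespace "$TEST_NS" --ignore-not-found
}
trap cleanup EXIT

# ... test body: run oc commands and assertions here ...
test -f "$ns_marker"    # the resource exists while the test is running
```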
## Arguments

- **$1** (file-path-or-test-code-or-commands): One of:
  - **File path**: Path to a test file (shell script or Go test file)
  - **oc commands**: Set of oc CLI commands to review
  - **Test code**: Pasted test code (Ginkgo or shell script)
77
plugin.lock.json
Normal file
@@ -0,0 +1,77 @@
{
  "$schema": "internal://schemas/plugin.lock.v1.json",
  "pluginId": "gh:openshift-eng/ai-helpers:plugins/openshift",
  "normalized": {
    "repo": null,
    "ref": "refs/tags/v20251128.0",
    "commit": "2bae18158cc58ddaaeefbc685689899b7685b679",
    "treeHash": "e1cbd8160922270ea583839ddf542ecd6b05e585ce18e35b2b3a77afcffb600e",
    "generatedAt": "2025-11-28T10:27:30.115036Z",
    "toolVersion": "publish_plugins.py@0.2.0"
  },
  "origin": {
    "remote": "git@github.com:zhongweili/42plugin-data.git",
    "branch": "master",
    "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
    "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
  },
  "manifest": {
    "name": "openshift",
    "description": "OpenShift development utilities and helpers",
    "version": "0.0.1"
  },
  "content": {
    "files": [
      {
        "path": "README.md",
        "sha256": "f431e975cfb1245dd1f40539ffa5244a4eba7924b86f1b45624385936463f42d"
      },
      {
        "path": ".claude-plugin/plugin.json",
        "sha256": "3ddc8d96e7dc034d6a1d71a3474127af79a9a18c0492be31c855b59c3d0ddf26"
      },
      {
        "path": "commands/expand-test-case.md",
        "sha256": "a2760ebd7b1865bf9f28a72152a2f82d9e1cf5f5e6d63ac4844589a6a30deb75"
      },
      {
        "path": "commands/destroy-cluster.md",
        "sha256": "9fd1d82c76120b20264b7f0d57e5e5b219b33ee6887afed621b7a61d696716ca"
      },
      {
        "path": "commands/create-cluster.md",
        "sha256": "e0aa12787e7c24f1f8dfc975ac5ac2b3998fe87e5fc83a0a85be475860aea8f6"
      },
      {
        "path": "commands/new-e2e-test.md",
        "sha256": "bbd0795dbbf8928456bf854fe97ec2becc6e6b24803e5e457ce145b657350ee5"
      },
      {
        "path": "commands/rebase.md",
        "sha256": "31f0983f9531cd42384f17fc4d298528a44a57cc1e5c4128911f8408428b4ab7"
      },
      {
        "path": "commands/review-test-cases.md",
        "sha256": "b4dcfd668cec760448e672f02026dd51cd148e0ad6df4de73a6ef4cdd80080d4"
      },
      {
        "path": "commands/cluster-health-check.md",
        "sha256": "bfba16ddedce875ca3968a9bbb30a3bad5f8e4943eddcd807e59631d864c2e0e"
      },
      {
        "path": "commands/bump-deps.md",
        "sha256": "f29f7359a1bc0e44395c268139958a105a3f6b0952fed7d1b3515deb33bcacee"
      },
      {
        "path": "commands/crd-review.md",
        "sha256": "dd56f8e8c7c29384e91251817b02ccfda2b7bfa3d75006c16a364cd2c34a3bcb"
      }
    ],
    "dirSha256": "e1cbd8160922270ea583839ddf542ecd6b05e585ce18e35b2b3a77afcffb600e"
  },
  "security": {
    "scannedAt": null,
    "scannerVersion": null,
    "flags": []
  }
}