---
description: Analyze and grade component health based on regression and JIRA bug metrics
argument-hint: <release> [--components comp1 comp2 ...] [--project JIRAPROJECT]
---

## Name

component-health:analyze

## Synopsis

```
/component-health:analyze <release> [--components comp1 comp2 ...] [--project JIRAPROJECT]
```

## Description

The `component-health:analyze` command provides comprehensive component health analysis for a specified OpenShift release by **automatically combining** regression management metrics with JIRA bug backlog data.

**CRITICAL**: This command REQUIRES and AUTOMATICALLY fetches BOTH data sources:

1. Regression data (via summarize-regressions)
2. JIRA bug data (via summarize-jiras)

The analysis is INCOMPLETE without both data sources. Both are fetched automatically without user prompting.

The command evaluates component health based on:

1. **Regression Management** (ALWAYS fetched automatically): How well components are managing test regressions
   - Triage coverage (% of regressions triaged to JIRA bugs)
   - Triage timeliness (average time from detection to triage)
   - Resolution speed (average time from detection to closure)

2. **Bug Backlog Health** (ALWAYS fetched automatically): Current state of open bugs for components
   - Open bug counts by component
   - Bug age distribution
   - Bug priority breakdown
   - Recent bug flow (opened vs closed in last 30 days)

This command is useful for:

- **Grading overall component health** using multiple quality metrics
- **Identifying components** that need help with regression or bug management
- **Tracking quality trends** across releases
- **Generating comprehensive quality scorecards** for stakeholders
- **Prioritizing engineering investment** based on data-driven insights

Grading is subjective and is not meant as a critique of team performance. It is intended to help identify where help is needed and to track progress as we improve our quality practices.

## Implementation

**CRITICAL WORKFLOW**: The analyze command MUST execute steps 3 and 4 (fetch regression data AND fetch JIRA data) automatically without waiting for user prompting. Both data sources are required for a complete analysis.

1. **Parse Arguments**: Extract release version and optional filters from arguments
   - Release format: "X.Y" (e.g., "4.17", "4.21")
   - Optional filters:
     - `--components`: Space-separated list of component search strings (fuzzy match)
     - `--project`: JIRA project key (default: "OCPBUGS")

2. **Resolve Component Names**: Use fuzzy matching to find actual component names
   - Run list_components.py to get all available components:
     ```bash
     python3 plugins/component-health/skills/list-components/list_components.py --release <release>
     ```
   - If `--components` was provided:
     - For each search string, find all components containing that string (case-insensitive)
     - Combine all matches into a single list
     - Remove duplicates
     - If no matches are found for a search string, warn the user and show available components
   - If `--components` was NOT provided:
     - Use all available components from the list

3. **Fetch Regression Summary**: REQUIRED - Always call the summarize-regressions command

   **IMPORTANT**: This step is REQUIRED for the analyze command. Regression data must ALWAYS be fetched automatically without user prompting. The analyze command combines both regression and bug metrics - it is incomplete without both data sources.

   - **ALWAYS execute this step** - do not skip or wait for user to request it
   - Execute: `/component-health:summarize-regressions <release> [--components ...]`
   - Pass resolved component names
   - Extract regression metrics:
     - Total regressions, triage percentages, timing metrics
     - Per-component breakdowns
     - Open vs closed regression counts
   - Note development window dates for context
   - If the regression API is unreachable, inform the user and note this in the report, but continue with bug-only analysis

4. **Fetch JIRA Bug Summary**: REQUIRED - Always call the summarize-jiras command

   **IMPORTANT**: This step is REQUIRED for the analyze command. JIRA bug data must ALWAYS be fetched automatically without user prompting. The analyze command combines both regression and bug metrics - it is incomplete without both data sources.

   - **ALWAYS execute this step** - do not skip or wait for user to request it
   - For each resolved component name:
     - Execute: `/component-health:summarize-jiras --project <project> --component "<component>" --limit 1000`
     - Note: Must iterate over components because JIRA queries can be too large otherwise
   - Aggregate bug metrics across all components:
     - Total open bugs by component
     - Bug age distribution
     - Opened vs closed in last 30 days
     - Priority breakdowns
   - If JIRA authentication is not configured, inform the user and provide setup instructions
   - If JIRA queries fail, note this in the report but continue with regression-only analysis

5. **Calculate Combined Health Grades**: REQUIRED - Analyze BOTH regression and bug data

   **IMPORTANT**: This step requires data from BOTH step 3 (regressions) AND step 4 (JIRA bugs). Do not perform analysis with only one data source unless the other failed to fetch.

   **For each component, grade based on:**

   a. **Regression Health** (from step 3: summarize-regressions):
      - Triage Coverage: % of regressions triaged
        - 90-100%: Excellent ✅
        - 70-89%: Good ⚠️
        - 50-69%: Needs Improvement ⚠️
        - <50%: Poor ❌
      - Triage Timeliness: Average hours to triage
        - <24 hours: Excellent ✅
        - 24-72 hours: Good ⚠️
        - 72-168 hours (1 week): Needs Improvement ⚠️
        - >168 hours: Poor ❌
      - Resolution Speed: Average hours to close
        - <168 hours (1 week): Excellent ✅
        - 168-336 hours (1-2 weeks): Good ⚠️
        - 336-720 hours (2-4 weeks): Needs Improvement ⚠️
        - >720 hours (4+ weeks): Poor ❌

   b. **Bug Backlog Health** (from step 4: summarize-jiras):
      - Open Bug Count: Total open bugs
        - Component-relative thresholds (compare across components)
      - Bug Age: Average/maximum age of open bugs
        - <30 days average: Excellent ✅
        - 30-90 days: Good ⚠️
        - 90-180 days: Needs Improvement ⚠️
        - >180 days: Poor ❌
      - Bug Flow: Opened vs closed in last 30 days
        - More closed than opened: Positive trend ✅
        - Equal: Stable ⚠️
        - More opened than closed: Growing backlog ❌

   c. **Combined Health Score**: Weighted average of regression and bug health
      - Weight regression health more heavily (e.g., 60%) as it's more actionable
      - Bug backlog provides context (40%)

6. **Display Overall Health Report**: Present comprehensive analysis combining BOTH data sources

   **IMPORTANT**: The report MUST include BOTH regression metrics AND JIRA bug metrics. Do not present regression-only analysis unless JIRA data fetch failed.

   - Show which components were matched (if fuzzy search was used)
   - Inform user that both regression and bug data were analyzed

   **Section 1: Overall Release Health**
   - Release version and development window
   - Overall regression metrics (from summarize-regressions):
     - Total regressions, triage %, timing metrics
   - Overall bug metrics (from summarize-jiras):
     - Total open bugs, opened/closed last 30 days, priority breakdown
   - High-level combined health grade

   **Section 2: Per-Component Health Scorecard**
   - Ranked table of components from best to worst combined health
   - Key metrics per component (BOTH regression AND bug data):
     - Regression triage coverage
     - Average triage time
     - Average resolution time
     - Open bug count (from JIRA)
     - Bug age metrics (from JIRA)
     - Bug flow (opened vs closed, from JIRA)
     - Combined health grade
   - Visual indicators (✅ ⚠️ ❌) for quick assessment

   **Section 3: Components Needing Attention**
   - Prioritized list of components with specific issues from BOTH sources
   - Actionable recommendations for each component:
     - "X open untriaged regressions need triage" (only OPEN, not closed)
     - "High bug backlog: X open bugs (Y older than 90 days)" (from JIRA)
     - "Growing bug backlog: +X net bugs in last 30 days" (from JIRA)
     - "Slow regression triage: X hours average"
   - Context for each issue

7. **Offer HTML Report Generation** (AFTER displaying the text report):
   - Ask the user if they would like an interactive HTML report
   - If yes, generate an HTML report combining both data sources
   - Use template from: `plugins/component-health/skills/analyze-regressions/report_template.html`
   - Enhance template to include bug backlog metrics
   - Save report to: `.work/component-health-{release}/health-report.html`
   - Open the report in the user's default browser
   - Display the file path to the user

8. **Error Handling**: Handle common error scenarios
   - Network connectivity issues
   - Invalid release format
   - Missing regression or JIRA data
   - API errors
   - No matches for component filter
   - JIRA authentication issues

## Return Value

The command outputs a **Comprehensive Component Health Report**:

### Overall Health Grade

From combined regression and bug data:

- **Release**: OpenShift version and development window
- **Regression Metrics**:
  - Total regressions: X (Y% triaged)
  - Average triage time: X hours
  - Average resolution time: X hours
  - Open vs closed breakdown
- **Bug Backlog Metrics**:
  - Total open bugs: X across all components
  - Bugs opened/closed in last 30 days
  - Priority distribution
- **Overall Health**: Combined grade (Excellent/Good/Needs Improvement/Poor)

### Per-Component Health Scorecard

Ranked table combining both metrics:

| Component | Regression Triage | Triage Time | Resolution Time | Open Bugs | Bug Age | Health Grade |
|-----------|-------------------|-------------|-----------------|-----------|---------|--------------|
| kube-apiserver | 100.0% | 58 hrs | 144 hrs | 15 | 45d avg | ✅ Excellent |
| etcd | 95.0% | 84 hrs | 192 hrs | 8 | 30d avg | ✅ Good |
| Monitoring | 86.7% | 68 hrs | 156 hrs | 23 | 120d avg | ⚠️ Needs Improvement |

### Components Needing Attention

Prioritized list with actionable items:

```
1. Monitoring (Needs Improvement):
   - 1 open untriaged regression (needs triage)
   - High bug backlog: 23 open bugs (8 older than 90 days)
   - Growing backlog: +5 net bugs in last 30 days
   - Recommendation: Focus on triaging open regression and addressing oldest bugs

2. Example-Component (Poor):
   - 5 open untriaged regressions (urgent triage needed)
   - Slow triage response: 120 hours average
   - Very high bug backlog: 45 open bugs (15 older than 180 days)
   - Recommendation: Immediate triage sprint needed; consider bug backlog cleanup initiative
```

**IMPORTANT**: When listing untriaged regressions:

- **Only list OPEN untriaged regressions** - these are actionable
- **Do NOT recommend triaging closed regressions** - tooling doesn't support retroactive triage
- Calculate actionable count as: `open.total - open.triaged`

### Additional Sections

If requested:

- Detailed regression metrics by component
- Detailed bug breakdowns by status and priority
- Links to Sippy dashboards for regression analysis
- Links to JIRA queries for bug investigation
- Trends compared to previous releases (if available)

## Examples

1. **Analyze overall component health for a release**:

   ```
   /component-health:analyze 4.17
   ```

   Automatically fetches and analyzes BOTH data sources for release 4.17:
   - Regression management metrics (via summarize-regressions)
   - JIRA bug backlog metrics (via summarize-jiras)
   - Combined health grades based on both sources
   - Prioritized recommendations using both regression and bug data

2. **Analyze specific components (full names)**:

   ```
   /component-health:analyze 4.21 --components Monitoring Etcd
   ```

   Automatically fetches BOTH regression and bug data for Monitoring and Etcd:
   - Compares combined health between the two components
   - Shows regression metrics AND bug backlog for each
   - Identifies which component needs more attention
   - Provides targeted recommendations based on both data sources

3. **Analyze by fuzzy search**:

   ```
   /component-health:analyze 4.21 --components network
   ```

   Automatically fetches BOTH data sources for all components containing "network":
   - Finds all networking components (e.g., "Networking / ovn-kubernetes", "Networking / DNS", etc.)
   - Compares combined health across all networking components
   - Shows regression metrics AND bug backlog for each
   - Identifies networking-related quality issues from both sources
   - Provides targeted recommendations

4. **Analyze with custom JIRA project**:

   ```
   /component-health:analyze 4.21 --project OCPSTRAT
   ```

   Analyzes health using bugs from the OCPSTRAT project instead of the default OCPBUGS.

5. **In-development release analysis**:

   ```
   /component-health:analyze 4.21
   ```

   Automatically fetches BOTH data sources for an in-development release:
   - Shows current regression management state
   - Shows current bug backlog state
   - Tracks bug flow trends (opened vs closed)
   - Identifies areas to focus on before GA based on both regression and bug metrics

## Arguments

- `$1` (required): Release version
  - Format: "X.Y" (e.g., "4.17", "4.21")
  - Must be a valid OpenShift release number

- `$2+` (optional): Filter flags
  - `--components <search1> [search2 ...]`: Filter by component names using fuzzy search
    - Space-separated list of component search strings
    - Case-insensitive substring matching
    - Each search string matches all components containing that substring
    - If no components provided, all components are analyzed
    - Applied to both regression and bug queries
    - Example: "network" matches "Networking / ovn-kubernetes", "Networking / DNS", etc.
    - Example: "kube-" matches "kube-apiserver", "kube-controller-manager", etc.
  - `--project <JIRAPROJECT>`: JIRA project key
    - Default: "OCPBUGS"
    - Use alternative project if component bugs are tracked elsewhere
    - Examples: "OCPSTRAT", "OCPQE"

## Prerequisites

1. **Python 3**: Required to run the underlying data fetching scripts
   - Check: `which python3`
   - Version: 3.6 or later

2. **JIRA Authentication**: Environment variables must be configured for bug data
   - `JIRA_URL`: Your JIRA instance URL
   - `JIRA_PERSONAL_TOKEN`: Your JIRA bearer token or personal access token
   - See `/component-health:summarize-jiras` for setup instructions

3. **Network Access**: Must be able to reach both the component health API and JIRA
   - Ensure HTTPS requests can be made to both services
   - Check firewall and VPN settings if needed

## Notes

- **CRITICAL**: This command AUTOMATICALLY fetches data from TWO sources:
  1. Regression API (via `/component-health:summarize-regressions`)
  2. JIRA API (via `/component-health:summarize-jiras`)
- Both data sources are REQUIRED and fetched automatically without user prompting
- The analysis is incomplete without both regression and bug data
- Health grades are subjective and intended as guidance, not criticism
- Recommendations focus on actionable items (open untriaged regressions, not closed)
- Infrastructure regressions are automatically filtered from regression counts
- JIRA queries default to open bugs + bugs closed in last 30 days
- HTML reports provide interactive visualizations combining both data sources
- If one data source fails, the command continues with the available data and notes the failure
- For detailed regression data only, use `/component-health:list-regressions`
- For detailed JIRA data only, use `/component-health:list-jiras`
- This command provides the most comprehensive view by combining both sources

## See Also

- Related Command: `/component-health:summarize-regressions` (regression metrics)
- Related Command: `/component-health:summarize-jiras` (bug backlog metrics)
- Related Command: `/component-health:list-regressions` (raw regression data)
- Related Command: `/component-health:list-jiras` (raw JIRA data)
- Skill Documentation: `plugins/component-health/skills/analyze-regressions/SKILL.md`
- Script: `plugins/component-health/skills/list-regressions/list_regressions.py`
- Script: `plugins/component-health/skills/summarize-jiras/summarize_jiras.py`
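
## Grading Sketch

The threshold bands in step 5 of the Implementation section can be expressed compactly in code. The following is a minimal Python sketch, not the plugin's actual implementation. Only the threshold bands and the 60/40 weighting come from this document. Everything else is an assumption made for illustration: the function names, the numeric points per grade (`GRADE_POINTS`), the cutoffs that turn a weighted score back into a label, the choice to assign shared band endpoints (such as exactly 72 hours) to the better band, and the use of average bug age alone as the bug backlog signal (open bug count and bug flow are left out). Because that mapping is assumed, the sketch will not necessarily reproduce the sample grades shown in the scorecard table.

```python
from statistics import mean
from typing import Tuple

# Points per grade label. Hypothetical mapping: the command text does not
# define how labels are converted to numbers.
GRADE_POINTS = {"Excellent": 4, "Good": 3, "Needs Improvement": 2, "Poor": 1}


def grade_triage_coverage(percent_triaged: float) -> str:
    """Grade triage coverage using the percentage bands from step 5."""
    if percent_triaged >= 90:
        return "Excellent"
    if percent_triaged >= 70:
        return "Good"
    if percent_triaged >= 50:
        return "Needs Improvement"
    return "Poor"


def grade_lower_is_better(value: float, excellent_below: float,
                          good_up_to: float, needs_up_to: float) -> str:
    """Grade a metric where smaller values are healthier.

    The bands in step 5 share endpoints (for example 24-72 and 72-168 hours).
    Assigning a shared endpoint to the better band is an assumption.
    """
    if value < excellent_below:
        return "Excellent"
    if value <= good_up_to:
        return "Good"
    if value <= needs_up_to:
        return "Needs Improvement"
    return "Poor"


def grade_triage_hours(hours: float) -> str:
    return grade_lower_is_better(hours, 24, 72, 168)


def grade_resolution_hours(hours: float) -> str:
    return grade_lower_is_better(hours, 168, 336, 720)


def grade_bug_age_days(days: float) -> str:
    return grade_lower_is_better(days, 30, 90, 180)


def combined_grade(percent_triaged: float, triage_hours: float,
                   resolution_hours: float,
                   avg_bug_age_days: float) -> Tuple[float, str]:
    """Return (weighted score, label) using 60% regression and 40% bug health."""
    regression_points = mean([
        GRADE_POINTS[grade_triage_coverage(percent_triaged)],
        GRADE_POINTS[grade_triage_hours(triage_hours)],
        GRADE_POINTS[grade_resolution_hours(resolution_hours)],
    ])
    bug_points = GRADE_POINTS[grade_bug_age_days(avg_bug_age_days)]
    score = 0.6 * regression_points + 0.4 * bug_points

    # Hypothetical cutoffs: the midpoints between adjacent grade points.
    if score >= 3.5:
        label = "Excellent"
    elif score >= 2.5:
        label = "Good"
    elif score >= 1.5:
        label = "Needs Improvement"
    else:
        label = "Poor"
    return round(score, 2), label
```

Calling `combined_grade(95.0, 20, 100, 25)` returns the rounded weighted score together with its label for a component whose four metrics all fall in the best band. Keeping the per-metric graders separate from the weighting makes it straightforward to adjust either the bands or the weights without touching the other.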