---
description: Analyze sosreport archive for system diagnostics and issues
argument-hint: <path-to-sosreport> [--only <areas>] [--skip <areas>]
---

## Name
sosreport:analyze

## Synopsis
```
/sosreport:analyze <path-to-sosreport> [--only <areas>] [--skip <areas>]
```

**Analysis Areas:**

- **`logs`**: Analyze system and application logs (journald, syslog, dmesg, application logs)
  - Identifies errors, warnings, critical messages
  - Detects OOM killer events, kernel panics, segfaults
  - Counts and categorizes errors by severity
  - Provides timeline of critical events

- **`resources`**: Analyze system resource usage (memory, CPU, disk, processes)
  - Memory usage, swap, and pressure indicators
  - CPU information and load averages
  - Disk usage and filesystem capacity
  - Top resource consumers and zombie processes

- **`network`**: Analyze network configuration and connectivity
  - Network interface status and IP addresses
  - Routing table and default gateway
  - Active connections and listening services
  - Firewall rules (firewalld/iptables)
  - DNS configuration and hostname resolution

- **`system-config`**: Analyze system configuration (packages, services, security)
  - OS version and kernel information
  - Installed package versions
  - Systemd service status and failures
  - SELinux/AppArmor configuration and denials
  - Kernel parameters and resource limits

## Description
The `sosreport:analyze` command performs comprehensive analysis of a sosreport archive (from <https://github.com/sosreport/sos>) to identify system issues, configuration problems, and potential causes of failures. It examines system logs, resource usage, network configuration, installed packages, and other diagnostic data collected by sosreport.

By default, all analysis areas are executed. Use `--only` to run specific areas or `--skip` to exclude areas from analysis.

## Arguments
- `$1` (required): Path to the sosreport archive file (`.tar.gz` or `.tar.xz`) or extracted directory
- `--only <areas>` (optional): Comma-separated list of analysis areas to run. Valid areas: `logs`, `resources`, `network`, `system-config`. If not specified, all areas are analyzed.
- `--skip <areas>` (optional): Comma-separated list of analysis areas to skip. Valid areas: `logs`, `resources`, `network`, `system-config`. Cannot be used with `--only`.

## Implementation

The sosreport analysis is organized into several specialized phases, each with detailed implementation guidance in separate skill documents. The command supports selective analysis through optional arguments.

### 1. Parse Arguments and Determine Analysis Scope

1. **Parse command-line arguments**
   - Extract the sosreport path (required first argument)
   - Check for `--only` flag and parse comma-separated areas
   - Check for `--skip` flag and parse comma-separated areas
   - Validate that `--only` and `--skip` are not used together

2. **Validate analysis areas**
   - Valid areas: `logs`, `resources`, `network`, `system-config`
   - If an invalid area is specified, return an error with the list of valid areas
   - Normalize area names (case-insensitive, accept variations like `system` for `system-config`)

3. **Determine which skills to run**
   - If no flags specified: Run all skills (default comprehensive analysis)
   - If `--only` specified: Run only the specified skills
   - If `--skip` specified: Run all skills except the specified ones
   - Store the list of skills to execute for later phases

4. **Example argument parsing**:
   ```bash
   # Parse: /sosreport:analyze /path/sos.tar.gz --only logs,network
   # Result: Run only logs-analysis and network-analysis skills

   # Parse: /sosreport:analyze /path/sos.tar.gz --skip resources
   # Result: Run logs, network, and system-config (skip resources)

   # Parse: /sosreport:analyze /path/sos.tar.gz
   # Result: Run all skills (comprehensive analysis)
   ```

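The parsing and scope rules above can be sketched in Python as follows. This is a minimal illustration, not the command's actual implementation: the function names and the `ALIASES` table are hypothetical, and only the `system` alias comes from the text.

```python
VALID_AREAS = ("logs", "resources", "network", "system-config")

# Hypothetical alias table; the text only gives `system` as an example variation.
ALIASES = {"system": "system-config"}


def normalize_area(name: str) -> str:
    """Lower-case an area name, apply aliases, and reject unknown areas."""
    key = name.strip().lower()
    key = ALIASES.get(key, key)
    if key not in VALID_AREAS:
        raise ValueError(
            f"invalid area {name!r}; valid areas: {', '.join(VALID_AREAS)}"
        )
    return key


def parse_areas(value: str) -> list:
    """Split a comma-separated area list and normalize each entry."""
    areas = [normalize_area(part) for part in value.split(",") if part.strip()]
    if not areas:
        raise ValueError("expected at least one analysis area")
    return areas


def resolve_scope(argv: list) -> tuple:
    """Return (sosreport_path, areas_to_run) for the given argument list."""
    path, only, skip = None, None, None
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in ("--only", "--skip"):
            if i + 1 >= len(argv):
                raise ValueError(f"{arg} requires a comma-separated list of areas")
            areas = parse_areas(argv[i + 1])
            if arg == "--only":
                only = areas
            else:
                skip = areas
            i += 2
        elif path is None:
            path = arg
            i += 1
        else:
            raise ValueError(f"unexpected argument: {arg}")
    if path is None:
        raise ValueError("missing required sosreport path")
    if only is not None and skip is not None:
        raise ValueError("--only and --skip cannot be used together")
    if only is not None:
        selected = [area for area in VALID_AREAS if area in only]
    elif skip is not None:
        selected = [area for area in VALID_AREAS if area not in skip]
    else:
        selected = list(VALID_AREAS)
    return path, selected
```

Returning the areas in the fixed `VALID_AREAS` order keeps later phases deterministic regardless of the order the user typed them in.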
### 2. Extract and Validate Sosreport

1. **Check if path exists**
   - Verify the provided path points to a valid file or directory
   - If the path doesn't exist, return an error with a helpful message

2. **Extract archive if needed**
   - If path is a `.tar.gz` or `.tar.xz` file:
     - Create extraction directory: `.work/sosreport-analyze/{timestamp}/`
     - Extract archive: `tar -xf <path> -C .work/sosreport-analyze/{timestamp}/`
     - Store extracted directory path for analysis
   - If path is already a directory:
     - Verify it's a valid sosreport directory (check for `sos_commands/`, `sos_logs/`, etc.)
     - Use the directory directly

3. **Identify sosreport structure**
   - Locate the root directory (usually has format `sosreport-{hostname}-{date}/`)
   - Verify expected directories exist: `sos_commands/`, `sos_logs/`, `sos_reports/`

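As a rough sketch of the extraction and validation steps above, the following uses Python's standard `tarfile` module in place of the `tar` command. The function names and the timestamp format are assumptions made for illustration; the text does not specify how `{timestamp}` is formatted.

```python
import tarfile
import time
from pathlib import Path

# Directories the text says a valid sosreport should contain.
EXPECTED_DIRS = ("sos_commands", "sos_logs", "sos_reports")


def find_sosreport_root(base: Path) -> Path:
    """Return `base` or the first child directory that has the expected layout."""
    candidates = [base] + sorted(p for p in base.iterdir() if p.is_dir())
    for candidate in candidates:
        if all((candidate / name).is_dir() for name in EXPECTED_DIRS):
            return candidate
    raise ValueError(
        f"{base} does not look like a sosreport "
        f"(expected {', '.join(EXPECTED_DIRS)})"
    )


def prepare_sosreport(path: str, work_dir: str = ".work/sosreport-analyze") -> Path:
    """Validate the path, extract it if it is an archive, and return the root."""
    source = Path(path)
    if not source.exists():
        raise FileNotFoundError(f"sosreport path not found: {source}")
    if source.is_dir():
        return find_sosreport_root(source)
    if not tarfile.is_tarfile(source):
        raise ValueError(f"{source} is not a readable tar archive")

    # Assumed timestamp format; any unique directory name would do.
    target = Path(work_dir) / time.strftime("%Y%m%d-%H%M%S")
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(source) as archive:  # default mode detects gz and xz
        if hasattr(tarfile, "tar_filter"):
            # Newer Python versions support extraction filters.
            archive.extractall(target, filter="tar")
        else:
            archive.extractall(target)
    return find_sosreport_root(target)
```

Searching one level below the extraction directory covers the usual case where the archive unpacks into a single `sosreport-{hostname}-{date}/` directory.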
### 3. Analyze System Logs

**Run condition**: Only if `logs` is in the resolved list of areas (selected by `--only`, not excluded by `--skip`, or no filters specified)
**Detailed implementation**: See `plugins/sosreport/skills/logs-analysis/SKILL.md`

Perform comprehensive log analysis including:
- Journald logs (journalctl output)
- System logs (messages, dmesg, secure)
- Application-specific logs
- Error counting and categorization
- Timeline of critical events
- OOM killer events, kernel panics, segfaults

**Key outputs**:
- Error statistics by severity
- Top error messages by frequency
- Critical findings with timestamps
- Log file locations for investigation

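To make the error counting and critical-event detection concrete, here is a minimal sketch that scans log lines. The regular expressions are illustrative assumptions based on common kernel message wording; the authoritative patterns belong to the logs-analysis skill document, which is not reproduced here.

```python
import re
from collections import Counter

# Illustrative patterns only (assumed, not taken from the skill document).
CRITICAL_PATTERNS = {
    "oom_killer": re.compile(r"invoked oom-killer|Out of memory: Kill", re.IGNORECASE),
    "kernel_panic": re.compile(r"kernel panic", re.IGNORECASE),
    "segfault": re.compile(r"segfault at", re.IGNORECASE),
}
SEVERITY_PATTERN = re.compile(r"\b(critical|error|warning)\b", re.IGNORECASE)


def scan_log_lines(lines):
    """Count severity keywords and collect (line_number, kind, text) events."""
    severity_counts = Counter()
    critical_events = []
    for number, line in enumerate(lines, start=1):
        match = SEVERITY_PATTERN.search(line)
        if match:
            severity_counts[match.group(1).lower()] += 1
        for kind, pattern in CRITICAL_PATTERNS.items():
            if pattern.search(line):
                critical_events.append((number, kind, line.rstrip("\n")))
    return severity_counts, critical_events
```

Because events keep their line numbers, sorting or slicing `critical_events` gives a simple timeline that points back to the exact location in the log file.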
### 4. Analyze Resource Usage

**Run condition**: Only if `resources` is in the resolved list of areas (selected by `--only`, not excluded by `--skip`, or no filters specified)
**Detailed implementation**: See `plugins/sosreport/skills/resource-analysis/SKILL.md`

Perform resource analysis including:
- Memory usage and pressure indicators
- CPU information and load averages
- Disk usage and I/O errors
- Process analysis (top consumers, zombies)
- Resource exhaustion patterns

**Key outputs**:
- Memory usage metrics and swap status
- CPU count and load per CPU
- Filesystems near capacity
- Top CPU and memory-consuming processes
- Resource-related issues and recommendations

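Two of the outputs above, filesystems near capacity and load per CPU, reduce to simple text parsing. The sketch below assumes `df`-style output with a `Use%` value in the fifth column and a `/proc/loadavg`-style line; the 90% threshold and the function names are hypothetical choices for illustration.

```python
def filesystems_near_capacity(df_output: str, threshold: int = 90):
    """Return (mount_point, used_percent) pairs at or above the threshold."""
    results = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        digits = fields[4].rstrip("%")
        if not digits.isdigit():
            continue
        used = int(digits)
        if used >= threshold:
            # Rejoin the remainder in case the mount point contains spaces.
            results.append((" ".join(fields[5:]), used))
    return sorted(results, key=lambda item: item[1], reverse=True)


def load_per_cpu(loadavg_text: str, cpu_count: int):
    """Divide the 1, 5, and 15 minute load averages by the CPU count."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    values = [float(value) for value in loadavg_text.split()[:3]]
    return tuple(round(value / cpu_count, 2) for value in values)
```

A per-CPU load consistently above 1.0 suggests more runnable work than the machine has cores, which is a useful starting point for the resource exhaustion checks.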
### 5. Analyze Network Configuration

**Run condition**: Only if `network` is in the resolved list of areas (selected by `--only`, not excluded by `--skip`, or no filters specified)
**Detailed implementation**: See `plugins/sosreport/skills/network-analysis/SKILL.md`

Perform network analysis including:
- Network interface configuration and status
- Routing table and default gateway
- Active connections and listening services
- Firewall rules (firewalld/iptables)
- DNS configuration and hostname resolution
- Network errors from logs

**Key outputs**:
- Interface status with IP addresses
- Routing configuration
- Connection statistics by state
- Firewall configuration summary
- DNS and hostname settings
- Network-related errors and issues

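Connection statistics by state can be derived by tallying the state column of captured socket listings. The sketch below assumes input shaped like `ss -tan` output, where the state is the first column; listings that start with a `Netid` column or come from `netstat` would need a different column index. The function name is hypothetical.

```python
from collections import Counter


def connection_states(ss_output: str) -> Counter:
    """Count sockets per state, assuming the state is the first column."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "State":  # skip blanks and the header
            continue
        counts[fields[0]] += 1
    return counts
```

An unusually high count of a single state, such as many sockets waiting to close, is the kind of signal worth cross-checking against the network errors found in the logs.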
### 6. Analyze Installed Packages and System Configuration

**Run condition**: Only if `system-config` is in the resolved list of areas (selected by `--only`, not excluded by `--skip`, or no filters specified)
**Detailed implementation**: See `plugins/sosreport/skills/system-config-analysis/SKILL.md`

Perform system configuration analysis including:
- OS version and kernel information
- Installed package versions
- Systemd service status
- Failed services with reasons
- SELinux/AppArmor configuration and denials
- Kernel parameters and resource limits

**Key outputs**:
- System information summary
- Key package versions
- Failed services with failure reasons
- SELinux status and denial count
- Configuration issues and recommendations

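Identifying failed services is mostly a matter of reading the unit table captured from `systemctl list-units`. The sketch below assumes the usual `UNIT LOAD ACTIVE SUB DESCRIPTION` column order, with failed units optionally prefixed by a bullet character. It only identifies which units failed; the failure reasons would still have to come from the unit status output or the logs. The function name is hypothetical.

```python
def failed_units(list_units_output: str):
    """Return (unit, description) pairs whose ACTIVE column is `failed`."""
    failed = []
    for line in list_units_output.splitlines():
        # Drop the leading status bullet and indentation before splitting.
        fields = line.lstrip("●* \t").split(None, 4)
        if len(fields) >= 4 and fields[2] == "failed":
            description = fields[4] if len(fields) > 4 else ""
            failed.append((fields[0], description))
    return failed
```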
### 7. Generate Interactive Summary

1. **Create findings structure**
   - Organize findings by severity (Critical, High, Medium, Low, Info)
   - Include only findings from the selected analysis areas
   - For each finding, include:
     - Severity level
     - Category (logs, resources, network, system-config)
     - Description of the issue
     - Evidence (file paths, log snippets, metrics)
     - Recommended actions

2. **Display summary in terminal**
   - Show executive summary with key statistics
   - List critical and high-severity findings
   - Provide file paths for detailed investigation
   - Include timeline of significant events
   - Suggest next steps for troubleshooting

3. **Format output**
   ```
   SOSREPORT ANALYSIS SUMMARY
   ==========================

   System: {hostname}
   Report Date: {date}
   OS: {os_version}
   Kernel: {kernel_version}

   CRITICAL ISSUES (count)
   -----------------------
   - [Issue description with file reference]

   HIGH PRIORITY (count)
   ---------------------
   - [Issue description with file reference]

   MEDIUM PRIORITY (count)
   -----------------------
   - [Issue description with file reference]

   RESOURCE SUMMARY
   ----------------
   - Memory: X GB used / Y GB total (Z% used)
   - Disk: Most full filesystem at X%
   - Load Average: X.XX, X.XX, X.XX

   TOP ERRORS IN LOGS
   ------------------
   1. [Error message] (count occurrences)
   2. [Error message] (count occurrences)

   FAILED SERVICES
   ---------------
   - [service name]: [reason]

   RECOMMENDATIONS
   ---------------
   1. [Actionable recommendation]
   2. [Actionable recommendation]

   ANALYSIS LOCATION
   -----------------
   Extracted to: {extraction_path}
   ```

4. **Interactive drill-down**
   - Offer to explore specific areas in more detail
   - Allow user to ask follow-up questions about findings
   - Provide file paths for manual investigation

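One possible shape for the findings structure, and for rendering a single section of the summary, is sketched below. The `Finding` class, its field names, and the helper functions are hypothetical; they mirror the fields listed in step 1 and the section layout of the template rather than any confirmed implementation.

```python
from dataclasses import dataclass, field

SEVERITIES = ("Critical", "High", "Medium", "Low", "Info")


@dataclass
class Finding:
    severity: str          # one of SEVERITIES
    category: str          # analysis area that produced the finding
    description: str
    evidence: list = field(default_factory=list)  # file paths, snippets, metrics
    recommendation: str = ""


def group_by_severity(findings):
    """Bucket findings by severity, keeping the fixed severity order."""
    grouped = {level: [] for level in SEVERITIES}
    for finding in findings:
        grouped[finding.severity].append(finding)
    return grouped


def render_section(title, findings):
    """Render one summary section with a heading, underline, and bullets."""
    heading = f"{title} ({len(findings)})"
    lines = [heading, "-" * len(heading)]
    for finding in findings:
        refs = f" [{', '.join(finding.evidence)}]" if finding.evidence else ""
        lines.append(f"- {finding.description}{refs}")
    return "\n".join(lines)
```

Keeping findings as structured records until the final rendering step also makes the drill-down in step 4 easier, since follow-up questions can filter by category or severity without re-parsing the report.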
## Return Value

- **Format**: Interactive summary displayed in terminal with categorized findings
- **Exit code**:
  - 0 if analysis completes successfully
  - 1 if sosreport path is invalid
  - 2 if sosreport structure is malformed

## Examples

1. **Comprehensive analysis (default)**:
   ```bash
   /sosreport:analyze /tmp/sosreport-server01-2024-01-15.tar.xz
   ```

   Extracts archive to `.work/sosreport-analyze/{timestamp}/` and performs comprehensive analysis using all skills (logs, resources, network, system-config).

2. **Analyze only logs and network**:
   ```bash
   /sosreport:analyze /tmp/sosreport-server01-2024-01-15.tar.xz --only logs,network
   ```

   Performs only log analysis and network analysis. Useful when investigating connectivity or service issues without needing full resource analysis.

3. **Skip resource analysis**:
   ```bash
   /sosreport:analyze /tmp/sosreport.tar.gz --skip resources
   ```

   Performs all analysis except resource analysis. Useful when you already know resource metrics and want to focus on configuration and logs.

4. **Quick log-only analysis**:
   ```bash
   /sosreport:analyze /tmp/sosreport.tar.xz --only logs
   ```

   Performs only log analysis. Fastest option for quickly identifying errors and critical events without analyzing configuration or resources.

5. **Analyze extracted sosreport directory**:
   ```bash
   /sosreport:analyze /tmp/sosreport-server01-2024-01-15/
   ```

   Analyzes an already extracted sosreport directory with comprehensive analysis.

6. **Selective analysis on extracted directory**:
   ```bash
   /sosreport:analyze /tmp/sosreport-server01-2024-01-15/ --only system-config,network
   ```

   Analyzes only system configuration and network from an already extracted directory.

7. **Follow-up investigation**:
   ```
   User: /sosreport:analyze /tmp/sosreport.tar.gz --only logs
   Agent: [Shows log analysis summary]
   User: Can you now analyze the resources as well?
   Agent: /sosreport:analyze /tmp/sosreport.tar.gz --only resources
   Agent: [Shows resource analysis]
   ```

## Notes

- Sosreport structure varies by OS version and sosreport version
- Command handles both compressed archives and extracted directories
- Analysis focuses on common issues but can be extended for specific use cases
- For OpenShift/Kubernetes sosreports, additional pod/container analysis may be relevant
- Large sosreports (>1GB) may take several minutes to analyze
- **Selective analysis**: Use `--only` or `--skip` to run specific analysis areas for faster results
- **Performance**: Running only needed analysis areas reduces analysis time significantly
- **Valid areas**: `logs`, `resources`, `network`, `system-config`

## Prerequisites

1. **tar utility**: Required for extracting compressed sosreports
   - Check: `which tar`
   - Usually pre-installed on Linux/macOS

2. **Sufficient disk space**: Extracted sosreports can be large
   - Check available space: `df -h .work/`
   - Recommend at least 2x the compressed archive size

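Both prerequisites can also be checked programmatically before extraction. The sketch below uses `shutil.which` and `shutil.disk_usage` from the Python standard library; the function names are hypothetical, and the factor of 2 simply encodes the recommendation above.

```python
import os
import shutil


def has_enough_space(archive_bytes: int, free_bytes: int, factor: float = 2.0) -> bool:
    """Apply the rule of thumb: free space of at least `factor` times the archive."""
    return free_bytes >= archive_bytes * factor


def check_prerequisites(archive_path: str, work_dir: str = ".") -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if shutil.which("tar") is None:
        problems.append("tar was not found on PATH")
    archive_bytes = os.path.getsize(archive_path)
    free_bytes = shutil.disk_usage(work_dir).free
    if not has_enough_space(archive_bytes, free_bytes):
        problems.append(
            f"not enough disk space in {work_dir}: "
            f"{free_bytes} bytes free, {2 * archive_bytes} recommended"
        )
    return problems
```

Checking the parent of the working directory (here the default `.`) avoids an error when `.work/` has not been created yet.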
## See Also

### Analysis Skills
- **Logs Analysis**: `plugins/sosreport/skills/logs-analysis/SKILL.md` - Detailed guidance for analyzing system and application logs
- **Resource Analysis**: `plugins/sosreport/skills/resource-analysis/SKILL.md` - Detailed guidance for analyzing memory, CPU, disk, and processes
- **Network Analysis**: `plugins/sosreport/skills/network-analysis/SKILL.md` - Detailed guidance for analyzing network configuration and connectivity
- **System Configuration Analysis**: `plugins/sosreport/skills/system-config-analysis/SKILL.md` - Detailed guidance for analyzing packages, services, and security settings

### External Resources
- Sosreport documentation: <https://github.com/sosreport/sos>
- Red Hat sosreport guide: <https://access.redhat.com/solutions/3592>