Initial commit

Zhongwei Li
2025-11-30 08:53:27 +08:00
commit 3860d946f9
11 changed files with 2940 additions and 0 deletions

@@ -0,0 +1,221 @@
---
name: kernel-debug-loop
description: This skill should be used when performing fast iterative kernel debugging or running time-bounded kernel sessions to detect specific log signals or test kernel behavior. Use for rapid feedback cycles during kernel development, boot sequence analysis, or feature verification.
---
# Kernel Debug Loop
Fast iterative kernel debugging with signal detection and time-bounded execution.
## Purpose
This skill provides a rapid feedback loop for kernel development by running the Breenix kernel in short, time-bounded sessions (default 15 seconds) while monitoring its log output in real time for a specific signal. The script terminates the kernel as soon as the expected signal is detected or the timeout expires, which keeps debugging iterations short.
## When to Use This Skill
Use this skill when:
- **Iterative debugging**: Testing kernel changes with quick feedback loops
- **Boot sequence analysis**: Verifying the kernel reaches specific initialization checkpoints
- **Signal detection**: Waiting for specific kernel log messages before proceeding
- **Behavior verification**: Confirming the kernel responds correctly to tests or inputs
- **Fast failure detection**: Identifying boot failures or hangs quickly without waiting for full timeout
- **Checkpoint validation**: Ensuring the kernel reaches expected states during execution
## How to Use
### Basic Usage
The skill provides the `quick_debug.py` script for time-bounded kernel runs with optional signal detection.
**Run kernel with signal detection:**
```bash
kernel-debug-loop/scripts/quick_debug.py --signal "KERNEL_INITIALIZED"
```
This runs the kernel for up to 15 seconds, terminating immediately when "KERNEL_INITIALIZED" appears in the logs.
**Run kernel with custom timeout:**
```bash
kernel-debug-loop/scripts/quick_debug.py --timeout 30
```
Runs the kernel for up to 30 seconds without specific signal detection.
**Run in BIOS mode:**
```bash
kernel-debug-loop/scripts/quick_debug.py --signal "Boot complete" --mode bios
```
**Quiet mode (kernel output only):**
```bash
kernel-debug-loop/scripts/quick_debug.py --signal "READY" --quiet
```
Suppresses progress messages, showing only kernel output.
### Common Signals to Watch For
Based on Breenix's test infrastructure, common signals include:
- `🎯 KERNEL_POST_TESTS_COMPLETE 🎯` - All runtime tests completed
- `KERNEL_INITIALIZED` - Basic kernel initialization complete
- `USER_PROCESS_STARTED` - User process execution began
- `MEMORY_MANAGER_READY` - Memory management subsystem initialized
- Custom checkpoint markers added for specific debugging needs
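
Because detection is a plain substring test on each output line, a signal does not have to reproduce the surrounding decoration of a log line. The sketch below illustrates that matching rule only; `first_match` and the sample log lines are hypothetical and are not part of `quick_debug.py`.

```python
def first_match(lines, pattern):
    """Return the index of the first line containing pattern, or None."""
    for index, line in enumerate(lines):
        if pattern in line:  # case-sensitive substring test
            return index
    return None


# Hypothetical kernel output, for illustration only.
sample_log = [
    "[ INFO] MEMORY_MANAGER_READY",
    "[ INFO] KERNEL_INITIALIZED",
    "🎯 KERNEL_POST_TESTS_COMPLETE 🎯",
]

print(first_match(sample_log, "KERNEL_POST_TESTS_COMPLETE"))  # 2
print(first_match(sample_log, "kernel_initialized"))          # None
```

Note that a short signal such as `READY` would also match the `MEMORY_MANAGER_READY` line, which is why unique, descriptive signals are preferable.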
### Workflow Patterns
#### Pattern 1: Fast Iteration During Development
When making changes to kernel initialization:
1. Make code change
2. Run: `kernel-debug-loop/scripts/quick_debug.py --signal "TARGET_CHECKPOINT" --timeout 10`
3. Verify signal appears or analyze why it didn't
4. Iterate
With `--timeout 10`, each run ends within roughly 10 seconds, instead of waiting for full kernel execution or manual termination.
#### Pattern 2: Boot Sequence Verification
When debugging boot issues:
1. Identify the checkpoint expected to be reached
2. Run with that checkpoint as the signal
3. If timeout occurs, the kernel failed to reach that point
4. Examine the output buffer to see how far boot progressed
5. Add intermediate checkpoints to narrow down the failure point
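
Steps 3 to 5 can be partly automated once the output has been captured. The following is a minimal sketch, assuming the checkpoints are emitted strictly in the order listed; `last_checkpoint_reached`, the checkpoint order, and the sample output are hypothetical and only illustrate the idea.

```python
def last_checkpoint_reached(output, checkpoints):
    """Return the last checkpoint, in boot order, that appears in output.

    Stops at the first missing checkpoint, on the assumption that later
    checkpoints cannot be reached without passing the earlier ones.
    """
    reached = None
    for checkpoint in checkpoints:
        if checkpoint not in output:
            break
        reached = checkpoint
    return reached


# Hypothetical boot order and captured output, for illustration only.
boot_order = ["MEMORY_MANAGER_READY", "KERNEL_INITIALIZED", "USER_PROCESS_STARTED"]
captured = "[ INFO] MEMORY_MANAGER_READY\n[ INFO] KERNEL_INITIALIZED\n"

print(last_checkpoint_reached(captured, boot_order))  # KERNEL_INITIALIZED
```

The failure point then lies between the returned checkpoint and the next one in the list, which is where intermediate markers are most useful.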
#### Pattern 3: Regression Testing
When verifying fixes:
1. Run with the signal that was previously failing to appear
2. Success (signal found) confirms the fix worked
3. Failure (timeout) indicates the issue persists
4. The output buffer contains diagnostic information
#### Pattern 4: Performance Checkpoint Analysis
When optimizing boot time:
1. Run with a specific checkpoint signal
2. Note the elapsed time when signal is found
3. Make optimization changes
4. Re-run to measure improvement
5. The script reports exact elapsed time for comparison
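
The elapsed time appears in the session summary, which the script writes to stderr as a line of the form `Time: 4.82s / 15.0s` (the summary is not printed with `--quiet`). A minimal sketch for comparing two runs follows; `parse_elapsed` and the two sample summaries are hypothetical, and the timings are made up for illustration.

```python
import re

# Matches the "Time: <elapsed>s / <timeout>s" line of the session summary.
TIME_LINE = re.compile(r"^Time:\s+(\d+(?:\.\d+)?)s\s*/", re.MULTILINE)


def parse_elapsed(summary):
    """Return the elapsed seconds reported in a summary, or None if absent."""
    match = TIME_LINE.search(summary)
    return float(match.group(1)) if match else None


# Hypothetical summaries captured from stderr before and after a change.
before = "Status: ✅ SUCCESS\nSignal: Found\nTime: 4.82s / 15.0s\nOutput lines: 120\n"
after = "Status: ✅ SUCCESS\nSignal: Found\nTime: 3.97s / 15.0s\nOutput lines: 118\n"

saved = parse_elapsed(before) - parse_elapsed(after)
print(f"Checkpoint reached {saved:.2f}s sooner")  # Checkpoint reached 0.85s sooner
```

Since the timer starts when `cargo run` is launched, comparisons are only meaningful when both runs start from an already built kernel.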
### Integration with Claude Workflows
When assisting with kernel debugging:
1. **Suggest checkpoints**: Recommend adding strategic log markers at key points
2. **Run quick tests**: Use this script to verify changes before running the full test suite
3. **Analyze output**: Parse the output buffer to diagnose issues
4. **Iterate rapidly**: Chain multiple quick debug runs to test hypotheses
5. **Report findings**: Summarize what signals were found and timing information
### Script Output
The script provides:
- **Real-time kernel output**: All kernel logs stream to stdout during execution
- **Status indicators**: Visual feedback on signal detection and timeout
- **Session summary**: Success/failure status, timing, and output statistics
- **Exit code**: 0 if the signal was found (or no signal was specified); 1 if the run ended without the signal (timeout or early kernel exit) or was interrupted
### Output Buffer Analysis
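After a debug session, the entire kernel output is available for analysis, either by redirecting stdout to a file or through the `output` field of the report returned by `DebugSession.run()`: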
- Search for error messages or warnings
- Verify initialization sequence order
- Check memory allocation patterns
- Analyze interrupt handling
- Examine test results
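
Two of these checks, searching for errors and verifying initialization order, are easy to script against captured output. This is a minimal sketch under stated assumptions: `find_lines`, `appears_in_order`, the keywords, and the sample log are all hypothetical, and real Breenix log lines may be formatted differently.

```python
def find_lines(output, keywords):
    """Return (line_number, line) pairs containing any keyword, ignoring case."""
    hits = []
    for number, line in enumerate(output.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword.lower() in lowered for keyword in keywords):
            hits.append((number, line))
    return hits


def appears_in_order(output, markers):
    """Return True if every marker occurs, each one after the previous."""
    position = 0
    for marker in markers:
        found = output.find(marker, position)
        if found == -1:
            return False
        position = found + len(marker)
    return True


# Hypothetical captured output, for illustration only.
captured = (
    "[ INFO] MEMORY_MANAGER_READY\n"
    "[ WARN] slow timer calibration\n"
    "[ INFO] KERNEL_INITIALIZED\n"
    "[ERROR] page fault at 0xdeadbeef\n"
)

for number, line in find_lines(captured, ["error", "warn"]):
    print(f"{number}: {line}")
print(appears_in_order(captured, ["MEMORY_MANAGER_READY", "KERNEL_INITIALIZED"]))  # True
```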
### Advanced Usage
**Multiple checkpoint verification:**
Run sequential sessions to verify a series of checkpoints:
```bash
for signal in "PHASE1" "PHASE2" "PHASE3"; do
  kernel-debug-loop/scripts/quick_debug.py --signal "$signal" --quiet || break
done
```
**Capture output for analysis:**
```bash
kernel-debug-loop/scripts/quick_debug.py --signal "READY" > kernel_output.log 2>&1
```
**Integration with test scripts:**
```python
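import subprocess


def analyze_output(output):
    """Placeholder: replace with project-specific log analysis."""
    print(output[-2000:])  # show the tail of the kernel log


result = subprocess.run(
    ['kernel-debug-loop/scripts/quick_debug.py', '--signal', 'TEST_COMPLETE'],
    capture_output=True,
    text=True
)

if result.returncode == 0:
    print("Test passed!")
else:
    print("Test failed or timed out")
    # Kernel output is on stdout; the session summary is on stderr.
    analyze_output(result.stdout)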
```
## Best Practices
1. **Add strategic checkpoints**: Insert log markers at key kernel execution points
2. **Use descriptive signals**: Make signal patterns unique and meaningful
3. **Set appropriate timeouts**: Balance between waiting long enough and fast iteration
4. **Check exit codes**: Use return codes in scripts for automation
5. **Save output for analysis**: Redirect output when debugging complex issues
6. **Start broad, narrow down**: If a checkpoint isn't reached, add earlier checkpoints
7. **Combine with full tests**: Use for quick iteration, then validate with full test suite
## Technical Details
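- **Timeout**: Default 15 seconds, configurable via `--timeout` (fractional seconds are accepted); the clock starts when `cargo run` is launched, so any build time counts toward it
- **Signal detection**: Case-sensitive substring match on each output line
- **Termination**: SIGTERM first, then SIGKILL if the process has not exited within 2 seconds
- **Output buffering**: Kernel output is processed and displayed line by line as it arrives
- **Exit codes**: 0 for success (signal found or no signal specified), 1 otherwise (signal not seen before the timeout or kernel exit, or the run was interrupted)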
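
The termination step can be sketched in isolation. The example below is a minimal illustration of the terminate-then-kill pattern, not the script's exact code; `stop_process` is a hypothetical helper, and a sleeping Python child stands in for the kernel process.

```python
import subprocess
import sys


def stop_process(process, grace_seconds=2.0):
    """Ask the process to exit, then force-kill it if it does not comply."""
    if process.poll() is not None:
        return process.returncode  # already exited
    process.terminate()  # SIGTERM on POSIX
    try:
        process.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        process.kill()  # SIGKILL on POSIX
        process.wait()
    return process.returncode


# Stand-in for the kernel: a child Python process that sleeps for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_process(child)
print(f"child exited with {exit_code}")
```

The grace period gives a well-behaved process the chance to flush output before it is killed; the exit code reported for a signalled process is platform dependent.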
## Examples
**Verify kernel reaches user mode:**
```bash
kernel-debug-loop/scripts/quick_debug.py --signal "USER_PROCESS_STARTED" --timeout 20
```
**Quick sanity check after changes:**
```bash
kernel-debug-loop/scripts/quick_debug.py --timeout 5
```
**Debug memory initialization:**
```bash
kernel-debug-loop/scripts/quick_debug.py --signal "MEMORY_MANAGER_READY" --quiet > mem_init.log
```
**Test both UEFI and BIOS modes:**
```bash
kernel-debug-loop/scripts/quick_debug.py --signal "BOOT_COMPLETE" --mode uefi
kernel-debug-loop/scripts/quick_debug.py --signal "BOOT_COMPLETE" --mode bios
```

@@ -0,0 +1,206 @@
#!/usr/bin/env python3
"""
Fast kernel debug loop with signal detection.
Runs the Breenix kernel for up to a specified timeout (default 15s),
monitoring logs in real-time for specific signals. Terminates immediately
when the signal is found or when the timeout expires.
"""
import argparse
import os
import select
import signal as sig
import subprocess
import sys
import time
from pathlib import Path


class DebugSession:
    def __init__(self, signal_pattern=None, timeout=15, mode="uefi", quiet=False):
        self.signal_pattern = signal_pattern
        self.timeout = timeout
        self.mode = mode
        self.quiet = quiet
        self.process = None
        self.output_buffer = []
        self.signal_found = False
        self.start_time = None

    def run(self):
        """Execute the debug session."""
        project_root = Path(__file__).parent.parent.parent.resolve()

        # Determine the cargo command
        if self.mode == "bios":
            cmd = ["cargo", "run", "--release", "--features", "testing",
                   "--bin", "qemu-bios", "--", "-serial", "stdio", "-display", "none"]
        else:
            cmd = ["cargo", "run", "--release", "--features", "testing",
                   "--bin", "qemu-uefi", "--", "-serial", "stdio", "-display", "none"]

        if not self.quiet:
            print(f"🔍 Starting kernel debug session ({self.timeout}s timeout)", file=sys.stderr)
            if self.signal_pattern:
                print(f"   Watching for signal: {self.signal_pattern}", file=sys.stderr)
            print("", file=sys.stderr)

        self.start_time = time.time()

        # Start the process. The pipe stays in binary mode because
        # _monitor_output reads the raw file descriptor directly.
        self.process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            cwd=project_root,
        )

        # Set up signal handler for clean termination
        sig.signal(sig.SIGINT, self._signal_handler)
        sig.signal(sig.SIGTERM, self._signal_handler)

        try:
            self._monitor_output()
        finally:
            self._cleanup()

        return self._generate_report()

    def _monitor_output(self):
        """Monitor process output in real time, enforcing the timeout."""
        fd = self.process.stdout.fileno()
        pending = b""  # bytes of a line that has not been terminated yet

        while True:
            # Check timeout
            remaining = self.timeout - (time.time() - self.start_time)
            if remaining <= 0:
                if not self.quiet:
                    print(f"\n⏱️ Timeout reached ({self.timeout}s)", file=sys.stderr)
                break

            # Wait for output without blocking past the deadline. A plain
            # readline() would block indefinitely if the kernel hangs silently.
            # Note: select() on pipes is POSIX-only.
            ready, _, _ = select.select([fd], [], [], min(remaining, 0.1))
            if not ready:
                if self.process.poll() is None:
                    continue
                # Process exited: stop unless output is still left in the pipe
                if not select.select([fd], [], [], 0)[0]:
                    break

            chunk = os.read(fd, 4096)
            if not chunk:
                # End of output: handle any unterminated final line
                if pending:
                    self._handle_line(pending)
                break

            pending += chunk
            *lines, pending = pending.split(b"\n")
            for raw in lines:
                if self._handle_line(raw):
                    return

    def _handle_line(self, raw):
        """Decode and record one line; return True if it contains the signal."""
        line = raw.decode("utf-8", errors="replace").rstrip("\r")
        self._process_line(line)

        # Check for signal
        if self.signal_pattern and self.signal_pattern in line:
            self.signal_found = True
            if not self.quiet:
                print(f"\n✅ Signal found: {self.signal_pattern}", file=sys.stderr)
            return True
        return False

    def _process_line(self, line):
        """Process a single line of output."""
        self.output_buffer.append(line)
        # Kernel output is always shown; --quiet only hides progress messages.
        print(line, flush=True)

    def _cleanup(self):
        """Clean up the subprocess."""
        if self.process and self.process.poll() is None:
            if not self.quiet:
                print("\n🛑 Terminating kernel...", file=sys.stderr)

            # Try graceful termination first
            self.process.terminate()
            try:
                self.process.wait(timeout=2)
            except subprocess.TimeoutExpired:
                # Force kill if needed
                self.process.kill()
                self.process.wait()

    def _signal_handler(self, signum, frame):
        """Handle interrupt signals."""
        if not self.quiet:
            print("\n\n⚠️ Interrupted by user", file=sys.stderr)
        self._cleanup()
        sys.exit(1)

    def _generate_report(self):
        """Generate a debug report from the session."""
        elapsed = time.time() - self.start_time

        report = {
            'success': self.signal_found if self.signal_pattern else True,
            'signal_found': self.signal_found,
            'signal_pattern': self.signal_pattern,
            'elapsed_time': elapsed,
            'timeout': self.timeout,
            'output_lines': len(self.output_buffer),
            'output': '\n'.join(self.output_buffer),
        }

        if not self.quiet:
            print("\n" + "="*60, file=sys.stderr)
            print("📊 Debug Session Summary", file=sys.stderr)
            print("="*60, file=sys.stderr)
            print(f"Status: {'✅ SUCCESS' if report['success'] else '❌ TIMEOUT'}", file=sys.stderr)
            if self.signal_pattern:
                print(f"Signal: {'Found' if self.signal_found else 'Not found'}", file=sys.stderr)
            print(f"Time: {elapsed:.2f}s / {self.timeout}s", file=sys.stderr)
            print(f"Output lines: {len(self.output_buffer)}", file=sys.stderr)
            print("="*60, file=sys.stderr)

        return report


def main():
    parser = argparse.ArgumentParser(
        description='Fast kernel debug loop with signal detection'
    )
    parser.add_argument(
        '--signal',
        help='Signal pattern to watch for in kernel output'
    )
    parser.add_argument(
        '--timeout',
        type=float,
        default=15.0,
        help='Maximum time to run (seconds, default: 15)'
    )
    parser.add_argument(
        '--mode',
        choices=['uefi', 'bios'],
        default='uefi',
        help='Boot mode (default: uefi)'
    )
    parser.add_argument(
        '--quiet',
        action='store_true',
        help='Suppress progress output, only show kernel output'
    )

    args = parser.parse_args()

    session = DebugSession(
        signal_pattern=args.signal,
        timeout=args.timeout,
        mode=args.mode,
        quiet=args.quiet
    )
    report = session.run()

    # Exit with appropriate code
    sys.exit(0 if report['success'] else 1)


if __name__ == '__main__':
    main()