Initial commit

This commit is contained in:
Zhongwei Li
2025-11-29 18:16:56 +08:00
commit 8a3d331e04
61 changed files with 11808 additions and 0 deletions


@@ -0,0 +1,292 @@
# Mode 1: Git Worktree Isolation
## When to Use
**Best for:**
- Read-only skills or skills with minimal file operations
- Quick validation during development
- Skills that don't require system package installation
- Testing iterations where speed matters
**Not suitable for:**
- Skills that install system packages (npm install, apt-get, brew, etc.)
- Skills that modify system configurations
- Skills that require a clean Node.js environment
**Risk Level**: Low complexity skills only
## Advantages
- ⚡ **Fast**: Creates a worktree in seconds
- 💾 **Efficient**: Shares git history, minimal disk space
- 🔄 **Repeatable**: Easy to create, test, and destroy
- 🛠️ **Familiar**: Same git tools you already know
## Limitations
- ❌ Shares globally installed tools and packages (e.g., global npm packages); only the checkout itself is separate
- ❌ Shares environment variables and configs
- ❌ Same OS user and permissions
- ❌ Cannot test system-level dependencies
- ⚠️ Not true isolation - just a separate git checkout
## Prerequisites
1. Must be in a git repository
2. Git worktree feature available (Git 2.5+)
3. Clean working directory (or willing to proceed with uncommitted changes)
4. Sufficient disk space for additional worktree
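The Git 2.5+ requirement can be checked before any worktree is created. A minimal sketch, assuming two hypothetical helpers (`version_ge` and `check_git_worktree_support` are illustrative names, not part of git) and assuming `git --version` prints the version as its third field:
```bash
# Hypothetical helper: succeeds when version $1 >= version $2.
# Compares up to three dot-separated numeric fields; missing fields count as 0.
version_ge() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    split(have, h, "."); split(want, w, ".")
    for (i = 1; i <= 3; i++) {
      if ((h[i] + 0) > (w[i] + 0)) exit 0
      if ((h[i] + 0) < (w[i] + 0)) exit 1
    }
    exit 0
  }'
}

# Assumes `git --version` prints "git version X.Y.Z ..." (third field is the version).
check_git_worktree_support() {
  local installed
  installed=$(git --version | awk '{print $3}')
  if version_ge "$installed" 2.5; then
    echo "git $installed supports worktrees"
  else
    echo "git $installed is too old; worktrees need 2.5+" >&2
    return 1
  fi
}
```
Calling `check_git_worktree_support || exit 1` at the top of a test script stops early with a readable message rather than failing later at `git worktree add`.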
## Workflow
### Step 1: Validate Environment
```bash
# Check if in git repo
git rev-parse --is-inside-work-tree
# Check for uncommitted changes
git status --porcelain
# Get current repo name
basename "$(git rev-parse --show-toplevel)"
```
If the working directory is dirty, warn the user but allow them to proceed: the worktree is checked out from the last commit, so uncommitted changes are not carried into it.
### Step 2: Create Isolation Worktree
**Generate unique branch name:**
```bash
BRANCH_NAME="test-skill-$(date +%s)" # e.g., test-skill-1699876543
```
**Create worktree:**
```bash
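# Build an absolute path so later steps still work after `cd`
REPO_ROOT="$(git rev-parse --show-toplevel)"
WORKTREE_PATH="$(dirname "$REPO_ROOT")/$(basename "$REPO_ROOT")-${BRANCH_NAME}"
git worktree add "$WORKTREE_PATH" -b "$BRANCH_NAME"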
```
Example result: `/Users/connor/claude-test-skill-1699876543/`
### Step 3: Copy Skill to Worktree
```bash
# Make sure the target directory exists in the worktree
mkdir -p "$WORKTREE_PATH/.claude/skills"
# Copy skill directory to worktree's .claude/skills/
cp -r ~/.claude/skills/[skill-name] "$WORKTREE_PATH/.claude/skills/"
# Or if skill is in current repo
cp -r ./skills/[skill-name] "$WORKTREE_PATH/.claude/skills/"
```
**Verify copy:**
```bash
ls -la "$WORKTREE_PATH/.claude/skills/[skill-name]/"
```
### Step 4: Setup Development Environment
**Install dependencies if needed:**
```bash
cd "$WORKTREE_PATH"
# Detect package manager
if [ -f "pnpm-lock.yaml" ]; then
pnpm install
elif [ -f "yarn.lock" ]; then
yarn install
elif [ -f "package-lock.json" ]; then
npm install
fi
```
**Copy environment files (optional):**
```bash
# Only if skill needs .env for testing (run from the original repository root)
cp .env "$WORKTREE_PATH/.env"
```
### Step 5: Take "Before" Snapshot
```bash
# List all files in worktree
find "$WORKTREE_PATH" -type f > /tmp/before-files.txt
# List running processes (for comparison later)
ps aux > /tmp/before-processes.txt
# Current disk usage
du -sh "$WORKTREE_PATH" > /tmp/before-disk.txt
```
### Step 6: Execute Skill in Worktree
**Open new Claude Code session in worktree:**
```bash
cd "$WORKTREE_PATH"
claude
```
**Run skill with test trigger:**
- User manually tests skill with trigger phrases
- OR: Use Claude CLI to run skill programmatically (if available)
**Monitor execution:**
- Watch for errors in output
- Note execution time
- Check resource usage
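Execution time can be recorded by wrapping whatever command drives the test. A minimal sketch, assuming a hypothetical wrapper named `run_timed`; it reports whole seconds only, which is usually enough for the report in Step 9:
```bash
# Hypothetical wrapper: run a command, print its wall-clock duration to stderr,
# and return the command's own exit status.
run_timed() {
  local start end status=0
  start=$(date +%s)
  "$@" || status=$?
  end=$(date +%s)
  echo "Execution time: $((end - start))s (exit code $status)" >&2
  return "$status"
}
```
For example, `run_timed sleep 2` prints a duration of about two seconds. Any command can be wrapped the same way.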
### Step 7: Take "After" Snapshot
```bash
# List all files after execution
find "$WORKTREE_PATH" -type f > /tmp/after-files.txt
# Compare before/after
# (sort first so that directory-order differences are not reported as changes)
diff <(sort /tmp/before-files.txt) <(sort /tmp/after-files.txt) > /tmp/file-changes.txt
# Check for new processes
ps aux > /tmp/after-processes.txt
diff /tmp/before-processes.txt /tmp/after-processes.txt > /tmp/process-changes.txt
# Check disk usage
du -sh "$WORKTREE_PATH" > /tmp/after-disk.txt
```
### Step 8: Analyze Results
**Check for side effects:**
```bash
# Files created (diff marks added lines with a leading ">")
grep -c '^>' /tmp/file-changes.txt
# Files deleted (diff marks removed lines with a leading "<")
grep -c '^<' /tmp/file-changes.txt
# New processes (filter out expected ones)
# Look for processes related to skill
```
**Validate cleanup:**
```bash
# Check for leftover temp files
find "$WORKTREE_PATH" -name "*.tmp" -o -name "*.temp" -o -name ".cache"
# Check for orphaned processes
# Look for processes still running from skill
```
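The process checks above are left as comments. One possible approach is to compare only the command column of the two `ps aux` snapshots, since PIDs and CPU figures change between runs. A sketch under that assumption; `ps_commands` and `new_commands` are hypothetical helper names, and the field layout is assumed to be the usual `ps aux` one with the command starting at field 11:
```bash
# Hypothetical helper: extract the sorted, de-duplicated command column from a `ps aux` snapshot.
ps_commands() {
  # Skip the header, blank out the first ten fields, and trim the leading spaces left behind.
  awk 'NR > 1 { $1=$2=$3=$4=$5=$6=$7=$8=$9=$10=""; sub(/^ +/, ""); print }' "$1" | sort -u
}

# Print commands that appear in the "after" snapshot ($2) but not in the "before" snapshot ($1).
new_commands() {
  local tmp_before tmp_after
  tmp_before=$(mktemp)
  tmp_after=$(mktemp)
  ps_commands "$1" > "$tmp_before"
  ps_commands "$2" > "$tmp_after"
  # comm -13 keeps only lines unique to the second (after) file.
  comm -13 "$tmp_before" "$tmp_after"
  rm -f "$tmp_before" "$tmp_after"
}
```
Running `new_commands /tmp/before-processes.txt /tmp/after-processes.txt` lists candidate orphans. Unrelated processes that happened to start during the test still need to be filtered by hand.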
### Step 9: Generate Report
**Execution Results:**
- ✅ Skill completed successfully / ❌ Skill failed with error
- ⏱️ Execution time: Xs
- 📊 Resource usage: XMB disk, X% CPU
**Side Effects:**
- Files created: [count] (list if < 10)
- Files modified: [count]
- Processes created: [count]
- Temporary files remaining: [count]
**Dependency Analysis:**
- Required tools: [list tools used by skill]
- Hardcoded paths: [list any absolute paths found]
- Environment variables: [list any ENV vars referenced]
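The file-related lines of this report can be filled in from the diff written in Step 7. A minimal sketch, assuming a hypothetical helper `summarize_file_changes` that takes the path to that diff. It covers created and deleted files only, because a diff of file listings cannot detect modified files:
```bash
# Hypothetical helper: summarize a diff of two file listings.
# Lines starting with ">" are files that appeared; "<" are files that disappeared.
summarize_file_changes() {
  local changes=$1 created deleted
  # grep -c exits non-zero when the count is 0, so tolerate that case.
  created=$(grep -c '^>' "$changes" || true)
  deleted=$(grep -c '^<' "$changes" || true)
  echo "Files created: $created"
  echo "Files deleted: $deleted"
  # Follow the report convention: list created files only when there are fewer than 10.
  if [ "$created" -gt 0 ] && [ "$created" -lt 10 ]; then
    grep '^>' "$changes" | sed 's/^> /  - /'
  fi
}
```
Typical use is `summarize_file_changes /tmp/file-changes.txt`, pasted into the report under **Side Effects**.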
### Step 10: Cleanup
**Ask user:**
```
Test complete. Worktree location: $WORKTREE_PATH
Options:
1. Keep worktree for debugging
2. Remove worktree and branch
3. Remove worktree, keep branch
Your choice?
```
**Cleanup commands:**
```bash
# Option 2: Full cleanup
git worktree remove "$WORKTREE_PATH"
git branch -D "$BRANCH_NAME"
# Option 3: Keep branch
git worktree remove "$WORKTREE_PATH"
```
## Interpreting Results
### ✅ **PASS** - Ready for git worktree environments
- Skill completed without errors
- No unexpected file modifications
- No orphaned processes
- No hardcoded paths detected
- Temporary files cleaned up
### ⚠️ **WARNING** - Works but has minor issues
- Skill works but left temporary files
- Uses some hardcoded paths (but non-critical)
- Performance could be improved
- Missing some documentation
### ❌ **FAIL** - Not ready
- Skill crashed or hung
- Requires system packages not installed
- Modifies files outside skill directory without permission
- Creates orphaned processes
- Has critical hardcoded paths
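The measurable parts of these criteria can be turned into a first-pass verdict. A rough sketch, assuming a hypothetical function `classify_result` fed with three numbers collected in earlier steps. Hardcoded paths, documentation gaps, and out-of-scope file modifications still need manual review:
```bash
# Hypothetical classifier: maps three observations to PASS / WARNING / FAIL.
#   $1 = skill exit code, $2 = orphaned process count, $3 = leftover temp file count
classify_result() {
  local exit_code=$1 orphans=$2 leftovers=$3
  if [ "$exit_code" -ne 0 ] || [ "$orphans" -gt 0 ]; then
    echo "FAIL"      # crashed, or left processes running
  elif [ "$leftovers" -gt 0 ]; then
    echo "WARNING"   # works, but did not clean up temp files
  else
    echo "PASS"
  fi
}
```
For instance, `classify_result 0 0 3` prints `WARNING`.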
## Common Issues
### Issue: "Skill not found in Claude"
**Cause**: Skill wasn't copied to worktree's .claude/skills/
**Fix**: Verify copy command and path
### Issue: "Permission denied" errors
**Cause**: Skill trying to write to protected directories
**Fix**: Identify problematic paths, suggest using /tmp or skill directory
### Issue: "Command not found"
**Cause**: Skill depends on system tool not installed
**Fix**: Document dependency, suggest adding to skill README
### Issue: Test results different from main directory
**Cause**: Different node_modules or configs
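**Fix**: This is expected. The worktree is a separate checkout, so untracked files such as `node_modules` and `.env` are not carried over, while global packages and environment variables are still shared with the host.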
## Best Practices
1. **Always take before/after snapshots** for accurate comparison
2. **Test multiple times** to ensure consistency
3. **Check temp directories** (`/tmp`, `/var/tmp`) for leftover files
4. **Monitor processes** for at least 30s after skill completes
5. **Document all dependencies** found during testing
6. **Use relative paths** in skill code, never absolute
7. **Cleanup worktrees** regularly to avoid clutter
## Quick Command Reference
```bash
# Create test worktree
git worktree add ../test-branch -b test-branch
# List all worktrees
git worktree list
# Remove worktree
git worktree remove ../test-branch
# Remove worktree and branch
git worktree remove ../test-branch && git branch -D test-branch
# Find temp files created
find /tmp -name "*skill-name*" -mtime -1
```
---
**Remember:** Git worktree provides quick, lightweight isolation but is NOT true isolation. Use for low-risk skills or fast iteration during development. For skills that modify system state, use Docker or VM modes.


@@ -0,0 +1,468 @@
# Mode 2: Docker Container Isolation
## Using Docker Helper Library
**RECOMMENDED:** Use the helper library for robust error handling and cleanup.
```bash
source ~/.claude/skills/skill-isolation-tester/lib/docker-helpers.sh
# Set cleanup trap (runs automatically on exit)
trap cleanup_on_exit EXIT
# Pre-flight checks
preflight_check_docker || exit 1
```
The helper library provides:
- Shell command validation (prevents syntax errors)
- Retry logic with exponential backoff
- Automatic cleanup on exit
- Pre-flight Docker environment checks
- Safe build and run functions
See `lib/docker-helpers.sh` for full documentation.
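For readers without the helper library at hand, the retry-with-backoff idea looks roughly like this. This is an illustrative sketch only, not the library's actual code; the function name `retry_with_backoff` and its argument order are assumptions:
```bash
# Illustrative sketch, not the contents of lib/docker-helpers.sh.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))        # double the wait after each failure
    attempt=$((attempt + 1))
  done
}
```
A call such as `retry_with_backoff 3 2 docker pull ubuntu:22.04` would wait 2s, then 4s, between its three attempts.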
---
## When to Use
**Best for:**
- Skills that install npm/pip packages or system dependencies
- Skills that modify configuration files
- Medium-risk skills that need OS-level isolation
- Testing skills with different Claude Code versions
- Reproducible testing environments
**Not suitable for:**
- Skills that require VM operations or nested virtualization
- Skills that need GUI access (without X11 forwarding)
- Extremely high-risk skills (use VM mode instead)
**Risk Level**: Low to medium complexity skills
## Advantages
- 🏗️ **True OS Isolation**: Complete filesystem and process separation
- 📦 **Reproducible**: Same environment every time
- 🔒 **Sandboxed**: Limited access to host system
- 🎯 **Precise**: Control exactly what's installed
- 🗑️ **Clean**: Easy to destroy and recreate
## Limitations
- ⏱️ Slower than git worktree (container overhead)
- 💾 Requires disk space for images
- 🐳 Requires Docker installation and running daemon
- ⚙️ More complex setup than worktree
- 🔧 May need volume mounts for file access
## Prerequisites
1. Docker installed and running (`docker info`)
2. Sufficient disk space (~1GB for base image + skill)
3. Permissions to run Docker commands
4. Internet connection (first time only, to pull images)
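The disk-space prerequisite can be checked with `df`. A minimal sketch, assuming a hypothetical helper `has_free_space_kb`. Which path to check is itself an assumption: `/var/lib/docker` is the usual data root on Linux, while Docker Desktop keeps images inside its own VM disk.
```bash
# Hypothetical helper: succeed when the filesystem holding $1 has at least $2 KB free.
# Uses POSIX `df -Pk`, whose fourth column is available space in 1024-byte blocks.
has_free_space_kb() {
  local path=$1 required_kb=$2 available_kb
  available_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
  [ "$available_kb" -ge "$required_kb" ]
}
```
For the ~1GB guideline above: `has_free_space_kb /var/lib/docker 1048576 || echo "Less than 1GB free"`.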
## Workflow
### Step 1: Validate Docker Environment
```bash
# Check Docker is installed
command -v docker || { echo "Docker not installed"; exit 1; }
# Check Docker daemon is running
docker info > /dev/null 2>&1 || { echo "Docker daemon not running"; exit 1; }
# Check disk space
docker system df
```
### Step 2: Choose Base Image
**Options:**
1. **claude-code-base** (preferred if available)
- Pre-built image with Claude Code installed
- Fastest startup time
2. **ubuntu:22.04** (fallback)
- Install Claude Code manually
- More control over environment
**Check if custom image exists:**
```bash
docker images | grep claude-code-base
```
### Step 3: Prepare Skill for Container
**Create temporary directory:**
```bash
TEST_DIR="/tmp/skill-test-$(date +%s)"
mkdir -p "$TEST_DIR"
# Copy skill to test directory
cp -r ~/.claude/skills/[skill-name] "$TEST_DIR/"
# Create Dockerfile
cat > "$TEST_DIR/Dockerfile" <<'EOF'
FROM ubuntu:22.04
# Install system dependencies
RUN apt-get update && apt-get install -y \
curl \
git \
nodejs \
npm \
&& rm -rf /var/lib/apt/lists/*
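# Install Claude Code (adjust version as needed)
# Note: Ubuntu 22.04's apt nodejs package is older than Claude Code supports;
# install a newer Node.js (18+) first if the step below fails
RUN npm install -g @anthropic-ai/claude-code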
# Create directory structure
RUN mkdir -p /root/.claude/skills
# Copy skill
COPY [skill-name]/ /root/.claude/skills/[skill-name]/
# Set working directory
WORKDIR /root
# Default command
CMD ["/bin/bash"]
EOF
```
### Step 4: Build Docker Image
```bash
cd "$TEST_DIR"
# Build image with tag
docker build -t skill-test:[skill-name] .
# Verify build succeeded
docker images | grep skill-test
```
**Expected build time:** 2-5 minutes (first time), < 30s (cached)
### Step 5: Take "Before" Snapshot
**Create container (don't start yet):**
```bash
CONTAINER_ID=$(docker create -it \
--name "skill-test-$(date +%s)" \
--memory="512m" \
--cpus="1.0" \
skill-test:[skill-name])
echo "Container ID: $CONTAINER_ID"
```
**Snapshot filesystem:**
```bash
docker export "$CONTAINER_ID" | tar -tf - > /tmp/before-files.txt
```
### Step 6: Run Skill in Container
**Start container interactively:**
```bash
docker start -ai $CONTAINER_ID
```
**Or run with test command:**
```bash
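# Note: this starts a fresh container (not $CONTAINER_ID), and --rm deletes it on exit,
# so it cannot be inspected afterwards. `claude skill run ... --test` is a placeholder;
# substitute whatever command actually triggers your skill.
docker run -it \
--name skill-test \
--rm \
--memory="512m" \
--cpus="1.0" \
skill-test:[skill-name] \
bash -c "claude skill run [skill-name] --test"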
```
**Monitor execution:**
```bash
# In another terminal, watch resource usage
docker stats $CONTAINER_ID
# Watch logs
docker logs -f $CONTAINER_ID
```
### Step 7: Take "After" Snapshot
**Commit container state:**
```bash
docker commit $CONTAINER_ID skill-test:[skill-name]-after
```
**Export and compare files:**
```bash
# Export after state
docker export "$CONTAINER_ID" | tar -tf - > /tmp/after-files.txt
# Find differences (sort first so ordering differences are not reported as changes)
diff <(sort /tmp/before-files.txt) <(sort /tmp/after-files.txt) > /tmp/file-changes.txt
# Count changes
echo "Files added: $(grep -c '^>' /tmp/file-changes.txt)"
echo "Files removed: $(grep -c '^<' /tmp/file-changes.txt)"
```
**Check for running processes:**
```bash
# docker exec only works while the container is still running
docker exec "$CONTAINER_ID" ps aux > /tmp/processes.txt
```
### Step 8: Analyze Results
**Extract skill logs:**
```bash
docker logs $CONTAINER_ID > /tmp/skill-execution.log
# Check for errors
grep -iE "error|fail|exception" /tmp/skill-execution.log
```
**Check resource usage:**
```bash
docker stats --no-stream $CONTAINER_ID
```
**Inspect filesystem changes:**
```bash
# List files in skill directory
docker exec $CONTAINER_ID find /root/.claude/skills/[skill-name] -type f
# Check temp directories
docker exec $CONTAINER_ID find /tmp -name "*skill*" -o -name "*.tmp"
# Check for leftover processes
docker exec "$CONTAINER_ID" ps aux | grep -vE "ps|bash"
```
**Analyze dependencies:**
```bash
# Check what packages were installed
docker diff $CONTAINER_ID | grep -E "^A /usr|^A /var/lib"
# Check what commands were executed
docker logs $CONTAINER_ID | grep -E "npm install|apt-get|pip install"
```
### Step 9: Generate Report
**Execution Status:**
```markdown
## Execution Results
**Container**: $CONTAINER_ID
**Base Image**: ubuntu:22.04
**Status**: [Running/Stopped/Exited]
**Exit Code**: $(docker inspect $CONTAINER_ID --format='{{.State.ExitCode}}')
**Resource Usage**:
- Memory: XMB / 512MB
- CPU: X%
- Execution Time: Xs
```
**Side Effects:**
```markdown
## Filesystem Changes
Files added: X
Files modified: X
Files deleted: X
**Significant changes:**
- /tmp/skill-temp-xyz.log (5KB)
- /root/.claude/cache/skill-data.json (15KB)
```
**Dependency Analysis:**
```markdown
## Dependencies Detected
**System Packages**:
- curl (already present)
- jq (installed by skill)
**NPM Packages**:
- lodash@4.17.21 (installed)
**Hardcoded Paths**:
⚠️ /root/.claude/config (line 45)
→ Use $HOME/.claude/config instead
```
### Step 10: Cleanup
**Ask user:**
```
Test complete. Container: $CONTAINER_ID
Options:
1. Keep container for debugging (docker start -ai $CONTAINER_ID)
2. Stop container, keep image (can restart later)
3. Remove container and image (full cleanup)
Your choice?
```
**Cleanup commands:**
```bash
# Option 2: Stop container
docker stop $CONTAINER_ID
# Option 3: Full cleanup
docker rm -f $CONTAINER_ID
docker rmi skill-test:[skill-name]
docker rmi skill-test:[skill-name]-after
# Cleanup test directory
rm -rf "$TEST_DIR"
```
**Cleanup all test containers:**
```bash
docker ps -aq --filter "name=skill-test" | xargs docker rm -f
docker images -q skill-test | xargs docker rmi -f
```
## Interpreting Results
### ✅ **PASS** - Production Ready
- Container exited with code 0
- Skill completed successfully
- No excessive resource usage
- All dependencies documented
- No orphaned processes
- Temp files in acceptable locations (/tmp only)
### ⚠️ **WARNING** - Needs Improvement
- Exit code 0 but warnings in logs
- Higher than expected resource usage
- Some undocumented dependencies
- Minor cleanup issues
### ❌ **FAIL** - Not Ready
- Container exited with non-zero code
- Skill crashed or hung
- Excessive resource usage (hit the 512MB memory limit)
- Attempted to access resources outside the container
- Critical dependencies not documented
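Exit code and memory-limit status can be read from `docker inspect` and combined with the captured log for a first-pass verdict. A sketch under stated assumptions: the function names are hypothetical, the warning keywords are only examples, and the log path is the one written in Step 8. Undocumented dependencies still need manual review.
```bash
# Hypothetical classifier for the verdicts above.
#   $1 = container exit code, $2 = "true"/"false" OOM-killed flag, $3 = path to captured log
classify_container_run() {
  local exit_code=$1 oom_killed=$2 log_file=$3
  if [ "$exit_code" -ne 0 ] || [ "$oom_killed" = "true" ]; then
    echo "FAIL"
  elif grep -qiE 'warn|deprecated' "$log_file"; then
    echo "WARNING"
  else
    echo "PASS"
  fi
}

# Gathers the inputs from Docker for container $1; assumes the log from Step 8 exists.
classify_from_docker() {
  local state
  state=$(docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' "$1")
  # Intentional word splitting: state is "<exit code> <true|false>".
  classify_container_run $state /tmp/skill-execution.log
}
```
After the skill has finished, `classify_from_docker "$CONTAINER_ID"` prints one of the three labels.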
## Common Issues
### Issue: "Docker daemon not running"
**Fix**:
```bash
# macOS
open -a Docker
# Linux
sudo systemctl start docker
```
### Issue: "Permission denied" when building image
**Cause**: User not in docker group
**Fix**:
```bash
# Add user to docker group
sudo usermod -aG docker $USER
# Logout/login or run:
newgrp docker
```
### Issue: "No space left on device"
**Cause**: Docker disk space full
**Fix**:
```bash
# Clean up old images and containers
docker system prune -a
# Check space
docker system df
```
### Issue: Skill requires GUI
**Cause**: Skill opens browser or displays graphics
**Fix**: Add X11 forwarding or mark skill as requiring GUI
## Advanced Techniques
### Volume Mounts for Live Testing
```bash
# Mount skill directory for live editing
docker run -it \
-v ~/.claude/skills/[skill-name]:/root/.claude/skills/[skill-name] \
skill-test:[skill-name]
```
### Custom Network Settings
```bash
# Isolated network (no internet)
docker run -it --network=none skill-test:[skill-name]
# Grant NET_ADMIN so network tools inside the container can inspect and configure interfaces
docker run -it --cap-add=NET_ADMIN skill-test:[skill-name]
```
### Multi-Version Testing
```bash
# Test with different Node versions
# (requires the Dockerfile to declare `ARG NODE_VERSION` and use it when installing Node)
docker build -t skill-test:node16 --build-arg NODE_VERSION=16 .
docker build -t skill-test:node18 --build-arg NODE_VERSION=18 .
docker build -t skill-test:node20 --build-arg NODE_VERSION=20 .
```
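The `--build-arg NODE_VERSION=...` flags only take effect if the Dockerfile declares that argument, which the Dockerfile from Step 3 does not. A minimal sketch of a variant that does, assuming the official `node` images on Docker Hub as the base; `OUT_DIR` is a hypothetical scratch directory:
```bash
# Sketch: write a Dockerfile that honors --build-arg NODE_VERSION.
OUT_DIR=$(mktemp -d)
cat > "$OUT_DIR/Dockerfile" <<'EOF'
# An ARG declared before FROM can be used in the FROM line.
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}
RUN mkdir -p /root/.claude/skills
WORKDIR /root
CMD ["/bin/bash"]
EOF
echo "Wrote $OUT_DIR/Dockerfile"
```
Building with `docker build -t skill-test:node18 --build-arg NODE_VERSION=18 "$OUT_DIR"` then selects the `node:18` base image. Without the flag, the default of 20 applies.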
## Best Practices
1. **Always set resource limits** (`--memory`, `--cpus`) to prevent runaway processes
2. **Use `--rm` flag** for auto-cleanup in simple tests
3. **Tag images clearly** with skill name and version
4. **Cache base images** to speed up subsequent tests
5. **Export test results** before removing containers
6. **Test with minimal permissions** first, add as needed
7. **Document all APT/NPM/PIP installs** found during testing
## Quick Command Reference
```bash
# Build test image
docker build -t skill-test:my-skill .
# Run with auto-cleanup
docker run -it --rm skill-test:my-skill
# Run with resource limits
docker run -it --memory="512m" --cpus="1.0" skill-test:my-skill
# Check container status
docker ps -a | grep skill-test
# View container logs
docker logs <container-id>
# Execute command in running container
docker exec <container-id> <command>
# Stop and remove all test containers
docker ps -aq --filter "name=skill-test" | xargs docker rm -f
# Remove all test images
docker images -q skill-test | xargs docker rmi
```
---
**Remember:** Docker provides strong isolation with reproducible environments. Use for skills that install packages or modify system files. For highest security, use VM mode instead.


@@ -0,0 +1,565 @@
# Mode 3: VM (Virtual Machine) Isolation
## When to Use
**Best for:**
- High-risk skills that modify system configurations
- Skills that require kernel modules or system services
- Testing skills that interact with VMs themselves
- Maximum isolation and security
- Skills from untrusted sources
**Not suitable for:**
- Quick iteration during development (too slow)
- Skills that are obviously safe and read-only
- Situations where speed is more important than isolation
**Risk Level**: Medium to high complexity skills
## Advantages
- 🔒 **Complete Isolation**: Separate kernel, OS, and all resources
- 🛡️ **Maximum Security**: Host system is completely protected
- 🖥️ **Real OS Environment**: Test on actual Linux/macOS distributions
- 📸 **Snapshots**: Easy rollback to clean state
- 🧪 **Destructive Testing**: Safe to test potentially dangerous operations
## Limitations
- 🐌 **Slow**: Minutes to provision, slower execution
- 💾 **Disk Space**: 10-20GB per VM
- 💰 **Resource Intensive**: Requires significant RAM and CPU
- 🔧 **Complex Setup**: More moving parts to configure
- ⏱️ **Longer Feedback Loop**: Not ideal for rapid iteration
## Prerequisites
1. Virtualization software installed:
- **macOS**: UTM, Parallels, or VMware Fusion
- **Linux**: QEMU/KVM, VirtualBox, or virt-manager
- **Windows**: VirtualBox, Hyper-V, or VMware Workstation
2. Base VM image or ISO:
- Ubuntu 22.04 LTS (recommended)
- Debian 12
- Fedora 39
3. System resources:
- 8GB+ host RAM (allocate 2-4GB to VM)
- 20GB+ disk space
- CPU virtualization enabled (VT-x/AMD-V)
4. Command-line tools:
- **macOS with UTM**: `utmctl` or use UI
- **Linux**: `virsh` (libvirt) or `vboxmanage` (VirtualBox)
- **Multipass**: `multipass` (cross-platform, recommended)
## Recommended: Use Multipass
Multipass is the easiest option for cross-platform VM management:
```bash
# Install Multipass
# macOS:
brew install multipass
# Linux:
sudo snap install multipass
# Windows:
# Download from https://multipass.run/
```
## Workflow
### Step 1: Validate Virtualization Environment
```bash
# Check virtualization is enabled (Linux)
grep -E 'vmx|svm' /proc/cpuinfo
# Check Multipass is installed
command -v multipass || { echo "Install Multipass"; exit 1; }
# Check the Multipass daemon is reachable (lists existing VMs, if any)
multipass list || echo "First time setup needed"
```
### Step 2: Create Base VM
**Launch clean Ubuntu VM:**
```bash
VM_NAME="skill-test-$(date +%s)"
# Launch VM with Multipass
multipass launch \
--name "$VM_NAME" \
--cpus 2 \
--memory 2G \
--disk 10G \
22.04
# Wait for VM to be ready
multipass exec "$VM_NAME" -- cloud-init status --wait
```
**Or use UTM (macOS GUI):**
1. Download Ubuntu 22.04 ARM64 ISO
2. Create new VM with 2GB RAM, 10GB disk
3. Install Ubuntu and setup user
4. Note VM name for scripts
**Or use virsh (Linux CLI):**
```bash
# Download cloud image
wget https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img
# Create VM
virt-install \
--name "$VM_NAME" \
--memory 2048 \
--vcpus 2 \
--disk ubuntu-22.04-server-cloudimg-amd64.img \
--import \
--os-variant ubuntu22.04
```
### Step 3: Install Claude Code in VM
```bash
# Install system dependencies
multipass exec "$VM_NAME" -- sudo apt-get update
multipass exec "$VM_NAME" -- sudo apt-get install -y \
curl \
git \
nodejs \
npm
# Install Claude Code
multipass exec "$VM_NAME" -- sudo npm install -g @anthropic-ai/claude-code
# Verify installation
multipass exec "$VM_NAME" -- which claude
```
### Step 4: Copy Skill to VM
```bash
# Create directory structure
multipass exec "$VM_NAME" -- mkdir -p /home/ubuntu/.claude/skills
# Copy skill to VM (--recursive is needed for directories)
multipass transfer --recursive \
~/.claude/skills/[skill-name] \
"$VM_NAME":/home/ubuntu/.claude/skills/
# Verify copy
multipass exec "$VM_NAME" -- ls -la /home/ubuntu/.claude/skills/[skill-name]
```
### Step 5: Take VM Snapshot
**With Multipass:**
```bash
# Snapshot support depends on the Multipass version (newer releases add `multipass snapshot`),
# so capture filesystem state explicitly instead
multipass exec "$VM_NAME" -- find /home/ubuntu -type f > /tmp/before-files.txt
multipass exec "$VM_NAME" -- dpkg -l > /tmp/before-packages.txt
multipass exec "$VM_NAME" -- ps aux > /tmp/before-processes.txt
```
**With UTM (macOS):**
```bash
# Take the snapshot via the UTM UI; the CLI form below assumes your utmctl
# version provides a snapshot subcommand, which it may not
utmctl snapshot "$VM_NAME" --name "before-skill-test"
```
**With virsh (Linux):**
```bash
virsh snapshot-create-as "$VM_NAME" before-skill-test "Before skill test"
```
### Step 6: Execute Skill in VM
**Start Claude Code session in VM:**
```bash
# Interactive session
multipass shell "$VM_NAME"
# Then inside VM:
claude
# Run skill with trigger phrase
```
**Or execute non-interactively:**
```bash
# If skill has test command
multipass exec "$VM_NAME" -- \
bash -c "claude skill run [skill-name] --test"
```
**Monitor from host:**
```bash
# Watch resource usage (JSON field names may differ between Multipass versions)
multipass info "$VM_NAME" --format json | jq '.info[] | {memory, load}'
# Tail logs
multipass exec "$VM_NAME" -- tail -f /var/log/syslog
```
### Step 7: Take Post-Execution Snapshot
```bash
# Capture filesystem state
multipass exec "$VM_NAME" -- find /home/ubuntu -type f > /tmp/after-files.txt
multipass exec "$VM_NAME" -- dpkg -l > /tmp/after-packages.txt
multipass exec "$VM_NAME" -- ps aux > /tmp/after-processes.txt
# Compare
diff <(sort /tmp/before-files.txt) <(sort /tmp/after-files.txt) > /tmp/file-changes.txt
diff /tmp/before-packages.txt /tmp/after-packages.txt > /tmp/package-changes.txt
diff /tmp/before-processes.txt /tmp/after-processes.txt > /tmp/process-changes.txt
```
**Snapshot VM state:**
```bash
# virsh
virsh snapshot-create-as "$VM_NAME" after-skill-test "After skill test"
# UTM (macOS)
utmctl snapshot "$VM_NAME" --name "after-skill-test"
```
### Step 8: Analyze Results
**Extract execution logs:**
```bash
# Copy Claude Code logs from VM (--recursive is needed for directories)
multipass transfer --recursive \
"$VM_NAME":/home/ubuntu/.claude/logs/ \
/tmp/skill-test-logs/
# Analyze logs
grep -iE "error|warning|fail" /tmp/skill-test-logs/*.log
```
**Check filesystem changes:**
```bash
echo "Files added: $(grep -c '^>' /tmp/file-changes.txt)"
echo "Files removed: $(grep -c '^<' /tmp/file-changes.txt)"
# Check for unexpected modifications
# (diff prefixes added lines with "> "; these match only if the snapshots also cover /etc and /usr/local)
grep '^> /etc/' /tmp/file-changes.txt # System config changes
grep '^> /usr/local/' /tmp/file-changes.txt # Global installs
```
**Check package changes:**
```bash
# List newly installed packages
grep '^>' /tmp/package-changes.txt
# Check for removed packages
grep '^<' /tmp/package-changes.txt
```
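To turn the package diff into the "Installed Packages" list of the report, the name and version columns can be pulled out directly. A small sketch, assuming a hypothetical helper `new_packages` and the usual `dpkg -l` row layout (`ii  <name>  <version>  <arch>  <description>`):
```bash
# Hypothetical helper: print "<name> <version>" for packages present only in the "after" snapshot.
# In diff output the row is prefixed with ">", so the name is field 3 and the version field 4.
new_packages() {
  awk '$1 == ">" && $2 == "ii" { print $3, $4 }' "$1"
}
```
Usage: `new_packages /tmp/package-changes.txt`. An upgraded package also shows up here, because its old row is removed and a new row added.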
**Check for orphaned processes:**
```bash
# Processes still running after skill completion
grep '^>' /tmp/process-changes.txt | grep -vE "ps|grep|ssh"
```
**System modifications:**
```bash
# Check for systemd services
multipass exec "$VM_NAME" -- systemctl list-units --type=service --state=running
# Check for cron jobs
multipass exec "$VM_NAME" -- crontab -l
# Check for environment modifications
multipass exec "$VM_NAME" -- cat /etc/environment
```
### Step 9: Generate Comprehensive Report
```markdown
# VM Isolation Test Report: [skill-name]
## Environment
**VM Platform**: Multipass / UTM / virsh
**OS**: Ubuntu 22.04 LTS
**VM Name**: $VM_NAME
**Resources**: 2 vCPU, 2GB RAM, 10GB disk
## Execution Results
**Status**: ✅ Completed successfully
**Duration**: 45 seconds
**Exit Code**: 0
## Filesystem Changes
**Files Added**: 12
- `/home/ubuntu/.claude/cache/skill-data.json` (15KB)
- `/tmp/skill-temp-*.log` (3 files, 45KB total)
- `/home/ubuntu/.cache/skill-assets/` (8 files, 120KB)
**Files Modified**: 2
- `/home/ubuntu/.claude/config.json` (updated skill registry)
- `/home/ubuntu/.bash_history` (normal)
**Files Deleted**: 0
## Package Changes
**Installed Packages**: 2
- `jq` (1.6-2.1ubuntu3)
- `tree` (2.0.2-1)
**Removed Packages**: 0
## System Modifications
✅ No systemd services added
✅ No cron jobs created
✅ No environment variables modified
⚠️ Found leftover temp files in /tmp
## Process Analysis
**Orphaned Processes**: 0
**Background Jobs**: 0
**Network Connections**: 0
## Security Assessment
✅ No unauthorized file access attempts
✅ No privilege escalation attempts
✅ No suspicious network activity
✅ All operations within user home directory
## Dependency Analysis
**System Packages Required**:
- `jq` (for JSON processing) - Not documented in README
- `tree` (for directory visualization) - Optional
**NPM Packages Required**: None beyond Claude Code
**Hardcoded Paths Detected**:
⚠️ `/home/ubuntu/.claude/cache` (line 67)
→ Should use `$HOME/.claude/cache` or `~/.claude/cache`
## Recommendations
1. **CRITICAL**: Document `jq` dependency in README.md
2. **HIGH**: Fix hardcoded path on line 67
3. **MEDIUM**: Clean up /tmp files before skill exits
4. **LOW**: Consider making `tree` dependency optional
## Overall Grade: B (READY with minor fixes)
**Portability**: 85/100
**Cleanliness**: 75/100
**Security**: 100/100
**Documentation**: 70/100
**Final Status**: ✅ **APPROVED** for public release after addressing CRITICAL and HIGH priority items
```
### Step 10: Cleanup or Preserve
**Ask user:**
```
Test complete. VM: $VM_NAME
Options:
1. Keep VM for manual inspection
Command: multipass shell $VM_NAME
2. Stop VM (can restart later)
Command: multipass stop $VM_NAME
3. Delete VM and snapshots (full cleanup)
Command: multipass delete $VM_NAME && multipass purge
4. Rollback to "before" snapshot and retest
(virsh/UTM only)
Your choice?
```
**Cleanup commands:**
```bash
# Option 2: Stop VM
multipass stop "$VM_NAME"
# Option 3: Full cleanup
multipass delete "$VM_NAME"
multipass purge
# Cleanup temp files
rm -rf /tmp/skill-test-logs
rm -f /tmp/before-*.txt /tmp/after-*.txt /tmp/*-changes.txt
```
## Interpreting Results
### ✅ **PASS** - Production Ready
- VM still bootable after test
- Skill completed successfully
- No unauthorized system modifications
- All dependencies documented
- No security issues detected
- Clean cleanup (no orphaned resources)
### ⚠️ **WARNING** - Needs Review
- Skill works but left system modifications
- Installed undocumented packages
- Modified system configs (needs user consent)
- Performance issues (high resource usage)
### ❌ **FAIL** - Not Safe
- VM corrupted or unbootable
- Skill crashed or hung indefinitely
- Unauthorized privilege escalation
- Malicious behavior detected
- Critical undocumented dependencies
- Data exfiltration attempts
## Common Issues
### Issue: "Multipass not found"
**Fix**:
```bash
# macOS
brew install multipass
# Linux
sudo snap install multipass
```
### Issue: "Virtualization not enabled"
**Cause**: VT-x/AMD-V disabled in BIOS
**Fix**: Enable virtualization in BIOS/UEFI settings
### Issue: "Failed to launch VM"
**Cause**: Insufficient resources
**Fix**:
```bash
# Reduce VM resources
multipass launch --cpus 1 --memory 1G --disk 5G
```
### Issue: "VM network not working"
**Cause**: Network bridge issues
**Fix**:
```bash
# Restart Multipass daemon
# macOS
sudo launchctl kickstart -k system/com.canonical.multipassd
# Linux
sudo systemctl restart snap.multipass.multipassd
```
### Issue: "Can't copy files to VM"
**Cause**: SSH/sftp issues
**Fix**:
```bash
# Mount host directory instead
multipass mount ~/.claude/skills "$VM_NAME":/mnt/skills
```
## Advanced Techniques
### Automated Testing Pipeline
```bash
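#!/bin/bash
# test-skill-vm.sh
set -euo pipefail
SKILL_NAME="${1:?Usage: test-skill-vm.sh <skill-name>}"
VM_NAME="skill-test-$SKILL_NAME-$(date +%s)"
# Always delete the VM on exit, even if a step fails
trap 'multipass delete --purge "$VM_NAME"' EXIT
# Launch VM
multipass launch --name "$VM_NAME" 22.04
# Setup
multipass exec "$VM_NAME" -- bash -c "
sudo apt-get update
sudo apt-get install -y nodejs npm
sudo npm install -g @anthropic-ai/claude-code
mkdir -p /home/ubuntu/.claude/skills
"
# Copy skill (--recursive is needed for directories)
multipass transfer --recursive "$HOME/.claude/skills/$SKILL_NAME" "$VM_NAME":/home/ubuntu/.claude/skills/
# Run test (`claude skill test` is a placeholder; use the command that triggers your skill)
multipass exec "$VM_NAME" -- claude skill test "$SKILL_NAME"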
```
### Testing on Multiple OS Versions
```bash
# Test on Ubuntu 20.04, 22.04, and 24.04
for version in 20.04 22.04 24.04; do
# Instance names may not contain dots, so replace them with hyphens
VM="skill-test-ubuntu-${version//./-}"
multipass launch --name "$VM" "$version"
# ... run tests ...
multipass delete --purge "$VM"
done
```
### Network Isolation Testing
```bash
# Create VM without internet access (if supported by hypervisor)
# Then test if skill fails gracefully without network
```
## Best Practices
1. **Always take snapshots** before running skills
2. **Test on clean VMs** - don't reuse VMs between tests
3. **Monitor resource usage** - catch runaway processes
4. **Check system logs** (`/var/log/syslog`) for warnings
5. **Test rollback** - ensure VM can be restored
6. **Document all system dependencies** found
7. **Use minimal VM resources** to catch resource issues
8. **Archive test results** before destroying VMs
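Archiving can be as simple as bundling the snapshot and diff files before the VM is deleted. A minimal sketch, assuming a hypothetical helper `archive_results`; the archive name format is an arbitrary choice:
```bash
# Hypothetical helper: bundle result files into one timestamped archive and print its path.
#   $1 = directory holding the result files, $2 = directory to write the archive into
# Keep $2 outside $1, otherwise the archive would try to include itself.
archive_results() {
  local src_dir=$1 dest_dir=$2 archive
  archive="$dest_dir/skill-test-results-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$dest_dir"
  # -C makes the archived paths relative to the source directory.
  tar -czf "$archive" -C "$src_dir" .
  echo "$archive"
}
```
For example, after copying the `/tmp/before-*.txt`, `/tmp/after-*.txt`, and `/tmp/*-changes.txt` files into one directory, `archive_results that-directory ~/skill-test-archives` keeps them available once the VM is gone.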
## Quick Command Reference
```bash
# Launch VM
multipass launch --name test-vm 22.04
# List VMs
multipass list
# Shell into VM
multipass shell test-vm
# Execute command in VM
multipass exec test-vm -- <command>
# Copy file to VM
multipass transfer local-file test-vm:/remote/path
# Copy file from VM
multipass transfer test-vm:/remote/path local-file
# Stop VM
multipass stop test-vm
# Start VM
multipass start test-vm
# Delete VM
multipass delete test-vm && multipass purge
# VM info
multipass info test-vm
```
---
**Remember:** VM isolation is the gold standard for testing high-risk skills. It's slower but provides complete security and accurate testing of system-level behaviors. Use for skills from untrusted sources or skills that modify system state.