Initial commit

2025-11-29 18:16:56 +08:00
commit 8a3d331e04
61 changed files with 11808 additions and 0 deletions
--- a/skills/skill-isolation-tester/modes/mode1-git-worktree.md
+++ b/skills/skill-isolation-tester/modes/mode1-git-worktree.md
@@ -0,0 +1,292 @@
+# Mode 1: Git Worktree Isolation
+
+## When to Use
+
+**Best for:**
+- Read-only skills or skills with minimal file operations
+- Quick validation during development
+- Skills that don't require system package installation
+- Testing iterations where speed matters
+
+**Not suitable for:**
+- Skills that install system packages (npm install, apt-get, brew, etc.)
+- Skills that modify system configurations
+- Skills that require a clean Node.js environment
+
+**Risk Level**: Low complexity skills only
+
+## Advantages
+
+- ⚡ **Fast**: Creates worktree in seconds
+- 💾 **Efficient**: Shares git history, minimal disk space
+- 🔄 **Repeatable**: Easy to create, test, and destroy
+- 🛠️ **Familiar**: Same git tools you already know
+
+## Limitations
+
+- ❌ Shares the host's globally installed packages and tools (global npm packages, system binaries)
+- ❌ Shares environment variables and configs
+- ❌ Same OS user and permissions
+- ❌ Cannot test system-level dependencies
+- ⚠️ Not true isolation - just a separate git checkout
+
+## Prerequisites
+
+1. Must be in a git repository
+2. Git worktree feature available (`git worktree add` needs Git 2.5+; `git worktree remove`, used in cleanup, needs Git 2.17+)
+3. Clean working directory (or willing to proceed with uncommitted changes)
+4. Sufficient disk space for additional worktree
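The version prerequisite can be checked from a script. A minimal sketch, assuming `sort -V` (version sort, available in GNU coreutils and recent BSD/macOS) is present; `version_ge` and `git_version` are hypothetical helper names, not part of git:

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Extract "2.43.0" from output such as "git version 2.43.0".
git_version() {
  git --version | awk '{print $3}'
}
```

Usage: `version_ge "$(git_version)" 2.5.0 || echo "Git is too old for worktrees"`.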
+
+## Workflow
+
+### Step 1: Validate Environment
+
+```bash
+# Check if in git repo
+git rev-parse --is-inside-work-tree
+
+# Check for uncommitted changes
+git status --porcelain
+
+# Get current repo name
+basename "$(git rev-parse --show-toplevel)"
+```
+
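+If the working directory is dirty, warn the user but allow proceeding: the new worktree is checked out from the current commit (`HEAD`), so uncommitted changes are not carried into it.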
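The dirty-tree check can be turned into a short warning message. A sketch, assuming the porcelain v1 format in which untracked files are prefixed with `??`; `summarize_status` is a hypothetical helper name:

```shell
# Hypothetical helper: reads `git status --porcelain` output on stdin and
# prints "clean" or a count of tracked changes and untracked files.
summarize_status() {
  awk '
    substr($0, 1, 2) == "??" { untracked++; next }  # untracked file
    NF                       { changed++ }          # any other non-empty line
    END {
      if (untracked + changed == 0) { print "clean"; exit }
      printf "dirty: %d changed, %d untracked\n", changed, untracked
    }'
}
```

Usage: `git status --porcelain | summarize_status`.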
+
+### Step 2: Create Isolation Worktree
+
+**Generate unique branch name:**
+```bash
+BRANCH_NAME="test-skill-$(date +%s)"  # e.g., test-skill-1699876543
+```
+
+**Create worktree:**
+```bash
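+# Use an absolute path so later steps still work after `cd "$WORKTREE_PATH"`
+REPO_ROOT="$(git rev-parse --show-toplevel)"
+WORKTREE_PATH="$(dirname "$REPO_ROOT")/$(basename "$REPO_ROOT")-${BRANCH_NAME}"
+git worktree add "$WORKTREE_PATH" -b "$BRANCH_NAME"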
+```
+
+Example result: `/Users/connor/claude-test-skill-1699876543/`
+
+### Step 3: Copy Skill to Worktree
+
+```bash
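+# Make sure the target directory exists in the worktree
+mkdir -p "$WORKTREE_PATH/.claude/skills"
+
+# Copy skill directory to worktree's .claude/skills/
+cp -r ~/.claude/skills/[skill-name] "$WORKTREE_PATH/.claude/skills/"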
+
+# Or if skill is in current repo
+cp -r ./skills/[skill-name] "$WORKTREE_PATH/.claude/skills/"
+```
+
+**Verify copy:**
+```bash
+ls -la "$WORKTREE_PATH/.claude/skills/[skill-name]/"
+```
+
+### Step 4: Setup Development Environment
+
+**Install dependencies if needed:**
+```bash
+cd "$WORKTREE_PATH"
+
+# Detect package manager
+if [ -f "pnpm-lock.yaml" ]; then
+  pnpm install
+elif [ -f "yarn.lock" ]; then
+  yarn install
+elif [ -f "package-lock.json" ]; then
+  npm install
+fi
+```
+
+**Copy environment files (optional):**
+```bash
+# Only if skill needs .env for testing; run this from the main repository root
+cp .env "$WORKTREE_PATH/.env"
+```
+
+### Step 5: Take "Before" Snapshot
+
+```bash
+# List all files in worktree
+find "$WORKTREE_PATH" -type f > /tmp/before-files.txt
+
+# List running processes (for comparison later)
+ps aux > /tmp/before-processes.txt
+
+# Current disk usage
+du -sh "$WORKTREE_PATH" > /tmp/before-disk.txt
+```
+
+### Step 6: Execute Skill in Worktree
+
+**Open new Claude Code session in worktree:**
+```bash
+cd "$WORKTREE_PATH"
+claude
+```
+
+**Run skill with test trigger:**
+- User manually tests skill with trigger phrases
+- OR: Use Claude CLI to run skill programmatically (if available)
+
+**Monitor execution:**
+- Watch for errors in output
+- Note execution time
+- Check resource usage
+
+### Step 7: Take "After" Snapshot
+
+```bash
+# List all files after execution
+find "$WORKTREE_PATH" -type f > /tmp/after-files.txt
+
+# Compare before/after (sort first so listing order cannot cause false differences)
+diff <(sort /tmp/before-files.txt) <(sort /tmp/after-files.txt) > /tmp/file-changes.txt
+
+# Check for new processes
+ps aux > /tmp/after-processes.txt
+diff /tmp/before-processes.txt /tmp/after-processes.txt > /tmp/process-changes.txt
+
+# Check disk usage
+du -sh "$WORKTREE_PATH" > /tmp/after-disk.txt
+```
+
+### Step 8: Analyze Results
+
+**Check for side effects:**
+```bash
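+# Files created (diff marks lines only in the "after" listing with a leading ">")
+grep -c '^>' /tmp/file-changes.txt
+
+# Files deleted (lines only in the "before" listing start with "<")
+grep -c '^<' /tmp/file-changes.txt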
+
+# New processes (filter out expected ones)
+# Look for processes related to skill
+```
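Counting `<` and `>` lines in raw `diff` output depends on both listings being in the same order. A sketch of an alternative that sorts each listing and uses `comm`; `list_created` and `list_deleted` are hypothetical helper names, and the arguments are the listing files written in Steps 5 and 7:

```shell
# Hypothetical helpers: compare two file listings (one path per line).
# Each listing is sorted into a temp file first because comm needs sorted input.
_sorted_copy() {
  out=$(mktemp) && LC_ALL=C sort -u "$1" > "$out" && printf '%s\n' "$out"
}

# Paths present only in the "after" listing ($2).
list_created() {
  b=$(_sorted_copy "$1"); a=$(_sorted_copy "$2")
  LC_ALL=C comm -13 "$b" "$a"
  rm -f "$b" "$a"
}

# Paths present only in the "before" listing ($1).
list_deleted() {
  b=$(_sorted_copy "$1"); a=$(_sorted_copy "$2")
  LC_ALL=C comm -23 "$b" "$a"
  rm -f "$b" "$a"
}
```

Usage: `list_created /tmp/before-files.txt /tmp/after-files.txt | wc -l`.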
+
+**Validate cleanup:**
+```bash
+# Check for leftover temp files
+find "$WORKTREE_PATH" -name "*.tmp" -o -name "*.temp" -o -name ".cache"
+
+# Check for orphaned processes
+# Look for processes still running from skill
+```
+
+### Step 9: Generate Report
+
+**Execution Results:**
+- ✅ Skill completed successfully / ❌ Skill failed with error
+- ⏱️ Execution time: Xs
+- 📊 Resource usage: XMB disk, X% CPU
+
+**Side Effects:**
+- Files created: [count] (list if < 10)
+- Files modified: [count]
+- Processes created: [count]
+- Temporary files remaining: [count]
+
+**Dependency Analysis:**
+- Required tools: [list tools used by skill]
+- Hardcoded paths: [list any absolute paths found]
+- Environment variables: [list any ENV vars referenced]
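The hardcoded-path item can be approximated with a text search. A rough sketch, assuming user-specific paths look like `/Users/<name>/` or `/home/<name>/`; `find_hardcoded_paths` is a hypothetical helper, and the pattern will miss other absolute paths and may flag paths that appear only in documentation:

```shell
# Hypothetical helper: print file:line:text for every line under the given
# skill directory that mentions a user-specific absolute path.
# Returns non-zero when nothing matches (grep's usual convention).
find_hardcoded_paths() {
  grep -rnE '/(Users|home)/[A-Za-z0-9._-]+/' "$1" 2>/dev/null
}
```

Usage: `find_hardcoded_paths ~/.claude/skills/[skill-name] || echo "none found"`.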
+
+### Step 10: Cleanup
+
+**Ask user:**
+```
+Test complete. Worktree location: $WORKTREE_PATH
+
+Options:
+1. Keep worktree for debugging
+2. Remove worktree and branch
+3. Remove worktree, keep branch
+
+Your choice?
+```
+
+**Cleanup commands:**
+```bash
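+# Option 2: Full cleanup
+# --force is needed because the copied skill leaves untracked files in the worktree
+git worktree remove --force "$WORKTREE_PATH"
+git branch -D "$BRANCH_NAME"
+
+# Option 3: Keep branch
+git worktree remove --force "$WORKTREE_PATH"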
+```
+
+## Interpreting Results
+
+### ✅ **PASS** - Ready for git worktree environments
+- Skill completed without errors
+- No unexpected file modifications
+- No orphaned processes
+- No hardcoded paths detected
+- Temporary files cleaned up
+
+### ⚠️ **WARNING** - Works but has minor issues
+- Skill works but left temporary files
+- Uses some hardcoded paths (but non-critical)
+- Performance could be improved
+- Missing some documentation
+
+### ❌ **FAIL** - Not ready
+- Skill crashed or hung
+- Requires system packages not installed
+- Modifies files outside skill directory without permission
+- Creates orphaned processes
+- Has critical hardcoded paths
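The criteria above can be folded into a small decision function. A sketch under simplifying assumptions: the four inputs are counts collected in Steps 7 and 8, and `worktree_verdict` is a hypothetical name. Checks that need judgment, such as whether a hardcoded path is critical, stay manual; here any hardcoded path only produces a warning:

```shell
# Hypothetical helper. Arguments:
#   $1 skill exit status      $2 orphaned process count
#   $3 leftover temp files    $4 hardcoded paths found
worktree_verdict() {
  if [ "$1" -ne 0 ] || [ "$2" -gt 0 ]; then
    echo "FAIL"        # crashed, or left processes running
  elif [ "$3" -gt 0 ] || [ "$4" -gt 0 ]; then
    echo "WARNING"     # works, but needs cleanup or path fixes
  else
    echo "PASS"
  fi
}
```

Usage: `worktree_verdict "$exit_status" 0 "$(find "$WORKTREE_PATH" -name '*.tmp' | wc -l)" 0`.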
+
+## Common Issues
+
+### Issue: "Skill not found in Claude"
+**Cause**: Skill wasn't copied to worktree's .claude/skills/
+**Fix**: Verify copy command and path
+
+### Issue: "Permission denied" errors
+**Cause**: Skill trying to write to protected directories
+**Fix**: Identify problematic paths, suggest using /tmp or skill directory
+
+### Issue: "Command not found"
+**Cause**: Skill depends on system tool not installed
+**Fix**: Document dependency, suggest adding to skill README
+
+### Issue: Test results different from main directory
+**Cause**: Different node_modules or configs
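+**Fix**: This is expected. The worktree has its own checkout and freshly installed dependencies, but it still shares the host's global tools and environment, so it is not true isolation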
+
+## Best Practices
+
+1. **Always take before/after snapshots** for accurate comparison
+2. **Test multiple times** to ensure consistency
+3. **Check temp directories** (`/tmp`, `/var/tmp`) for leftover files
+4. **Monitor processes** for at least 30s after skill completes
+5. **Document all dependencies** found during testing
+6. **Use relative paths** in skill code, never absolute
+7. **Cleanup worktrees** regularly to avoid clutter
+
+## Quick Command Reference
+
+```bash
+# Create test worktree
+git worktree add ../test-branch -b test-branch
+
+# List all worktrees
+git worktree list
+
+# Remove worktree (add --force if it contains untracked or modified files)
+git worktree remove ../test-branch
+
+# Remove worktree and branch
+git worktree remove ../test-branch && git branch -D test-branch
+
+# Find temp files created
+find /tmp -name "*skill-name*" -mtime -1
+```
+
+---
+
+**Remember:** Git worktree provides quick, lightweight isolation but is NOT true isolation. Use for low-risk skills or fast iteration during development. For skills that modify system state, use Docker or VM modes.
--- a/skills/skill-isolation-tester/modes/mode2-docker.md
+++ b/skills/skill-isolation-tester/modes/mode2-docker.md
@@ -0,0 +1,468 @@
+# Mode 2: Docker Container Isolation
+
+## Using Docker Helper Library
+
+**RECOMMENDED:** Use the helper library for robust error handling and cleanup.
+
+```bash
+source ~/.claude/skills/skill-isolation-tester/lib/docker-helpers.sh
+
+# Set cleanup trap (runs automatically on exit)
+trap cleanup_on_exit EXIT
+
+# Pre-flight checks
+preflight_check_docker || exit 1
+```
+
+The helper library provides:
+- Shell command validation (prevents syntax errors)
+- Retry logic with exponential backoff
+- Automatic cleanup on exit
+- Pre-flight Docker environment checks
+- Safe build and run functions
+
+See `lib/docker-helpers.sh` for full documentation.
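For readers without the helper library at hand, retry with exponential backoff can be sketched as follows. This is an illustration only, not the contents of `lib/docker-helpers.sh`; the function name `retry_with_backoff` and its argument order are assumptions:

```shell
# Illustrative sketch: run a command up to $1 times, sleeping $2 seconds
# after the first failure and doubling the delay after each further failure.
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

Usage: `retry_with_backoff 3 2 docker pull ubuntu:22.04` tries the pull up to three times, waiting 2s and then 4s between attempts.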
+
+---
+
+## When to Use
+
+**Best for:**
+- Skills that install npm/pip packages or system dependencies
+- Skills that modify configuration files
+- Medium-risk skills that need OS-level isolation
+- Testing skills with different Claude Code versions
+- Reproducible testing environments
+
+**Not suitable for:**
+- Skills that require VM operations or nested virtualization
+- Skills that need GUI access (without X11 forwarding)
+- Extremely high-risk skills (use VM mode instead)
+
+**Risk Level**: Low to medium complexity skills
+
+## Advantages
+
+- 🏗️ **OS-Level Isolation**: Separate filesystem and process namespaces (the host kernel is still shared)
+- 📦 **Reproducible**: Same environment every time
+- 🔒 **Sandboxed**: Limited access to host system
+- 🎯 **Precise**: Control exactly what's installed
+- 🗑️ **Clean**: Easy to destroy and recreate
+
+## Limitations
+
+- ⏱️ Slower than git worktree (container overhead)
+- 💾 Requires disk space for images
+- 🐳 Requires Docker installation and running daemon
+- ⚙️ More complex setup than worktree
+- 🔧 May need volume mounts for file access
+
+## Prerequisites
+
+1. Docker installed and running (`docker info`)
+2. Sufficient disk space (~1GB for base image + skill)
+3. Permissions to run Docker commands
+4. Internet connection (first time only, to pull images)
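The disk-space prerequisite can be checked before building. A sketch using POSIX `df -Pk`; `has_free_kb` is a hypothetical helper. It assumes images are stored on the filesystem that contains the given path, which does not hold for Docker Desktop, where image data lives inside a VM disk:

```shell
# Hypothetical helper: succeed if the filesystem holding path $1 has at
# least $2 KiB available. `df -Pk` prints one header line, then one data
# line whose 4th column is the available space in 1024-byte blocks.
has_free_kb() {
  avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

Usage, taking `/var/lib/docker` as the data directory of a default Linux install: `has_free_kb /var/lib/docker 1048576 || echo "Less than 1 GB free"`.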
+
+## Workflow
+
+### Step 1: Validate Docker Environment
+
+```bash
+# Check Docker is installed
+command -v docker || { echo "Docker not installed"; exit 1; }
+
+# Check Docker daemon is running
+docker info > /dev/null 2>&1 || { echo "Docker daemon not running"; exit 1; }
+
+# Check disk space
+docker system df
+```
+
+### Step 2: Choose Base Image
+
+**Options:**
+1. **claude-code-base** (preferred if available)
+   - Pre-built image with Claude Code installed
+   - Fastest startup time
+
+2. **ubuntu:22.04** (fallback)
+   - Install Claude Code manually
+   - More control over environment
+
+**Check if custom image exists:**
+```bash
+docker images | grep claude-code-base
+```
+
+### Step 3: Prepare Skill for Container
+
+**Create temporary directory:**
+```bash
+TEST_DIR="/tmp/skill-test-$(date +%s)"
+mkdir -p "$TEST_DIR"
+
+# Copy skill to test directory
+cp -r ~/.claude/skills/[skill-name] "$TEST_DIR/"
+
+# Create Dockerfile
+cat > "$TEST_DIR/Dockerfile" <<'EOF'
+FROM ubuntu:22.04
+
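+# Install system dependencies
+# NOTE: Ubuntu 22.04's apt repository ships Node.js 12, while Claude Code
+# requires Node.js 18 or newer; install a newer Node.js if the npm step fails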
+RUN apt-get update && apt-get install -y \
+    curl \
+    git \
+    nodejs \
+    npm \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install Claude Code (adjust version as needed)
+RUN npm install -g @anthropic-ai/claude-code
+
+# Create directory structure
+RUN mkdir -p /root/.claude/skills
+
+# Copy skill
+COPY [skill-name]/ /root/.claude/skills/[skill-name]/
+
+# Set working directory
+WORKDIR /root
+
+# Default command
+CMD ["/bin/bash"]
+EOF
+```
+
+### Step 4: Build Docker Image
+
+```bash
+cd "$TEST_DIR"
+
+# Build image with tag
+docker build -t skill-test:[skill-name] .
+
+# Verify build succeeded
+docker images | grep skill-test
+```
+
+**Expected build time:** 2-5 minutes (first time), < 30s (cached)
+
+### Step 5: Take "Before" Snapshot
+
+**Create container (don't start yet):**
+```bash
+# -it keeps bash attached to a terminal so `docker start -ai` gives an interactive shell
+CONTAINER_ID=$(docker create -it \
+  --name skill-test-$(date +%s) \
+  --memory="512m" \
+  --cpus="1.0" \
+  skill-test:[skill-name])
+
+echo "Container ID: $CONTAINER_ID"
+```
+
+**Snapshot filesystem:**
+```bash
+docker export $CONTAINER_ID | tar -t > /tmp/before-files.txt
+```
+
+### Step 6: Run Skill in Container
+
+**Start container interactively:**
+```bash
+docker start -ai $CONTAINER_ID
+```
+
+**Or run with test command:**
+```bash
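+# NOTE: "claude skill run ... --test" is a placeholder; substitute the
+# non-interactive test command your Claude Code version supports.
+# With --rm the container is deleted on exit, so later steps that use
+# $CONTAINER_ID apply only to the container created in Step 5.
+docker run -it \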
+  --name skill-test \
+  --rm \
+  --memory="512m" \
+  --cpus="1.0" \
+  skill-test:[skill-name] \
+  bash -c "claude skill run [skill-name] --test"
+```
+
+**Monitor execution:**
+```bash
+# In another terminal, watch resource usage
+docker stats $CONTAINER_ID
+
+# Watch logs
+docker logs -f $CONTAINER_ID
+```
+
+### Step 7: Take "After" Snapshot
+
+**Commit container state:**
+```bash
+docker commit $CONTAINER_ID skill-test:[skill-name]-after
+```
+
+**Export and compare files:**
+```bash
+# Export after state
+docker export $CONTAINER_ID | tar -t > /tmp/after-files.txt
+
+# Find differences (sorted so listing order cannot cause false differences)
+diff <(sort /tmp/before-files.txt) <(sort /tmp/after-files.txt) > /tmp/file-changes.txt
+
+# Count changes
+echo "Files added: $(grep -c '^>' /tmp/file-changes.txt)"
+echo "Files removed: $(grep -c '^<' /tmp/file-changes.txt)"
+```
+
+**Check for running processes (the container must still be running; `docker exec` fails on a stopped container):**
+```bash
+docker exec $CONTAINER_ID ps aux > /tmp/processes.txt
+```
+
+### Step 8: Analyze Results
+
+**Extract skill logs:**
+```bash
+docker logs $CONTAINER_ID > /tmp/skill-execution.log
+
+# Check for errors
+grep -i "error\|fail\|exception" /tmp/skill-execution.log
+```
+
+**Check resource usage:**
+```bash
+docker stats --no-stream $CONTAINER_ID
+```
+
+**Inspect filesystem changes:**
+```bash
+# List files in skill directory
+docker exec $CONTAINER_ID find /root/.claude/skills/[skill-name] -type f
+
+# Check temp directories
+docker exec $CONTAINER_ID find /tmp -name "*skill*" -o -name "*.tmp"
+
+# Check for leftover processes
+docker exec $CONTAINER_ID ps aux | grep -v "ps\|bash"
+```
+
+**Analyze dependencies:**
+```bash
+# Check what packages were installed
+docker diff $CONTAINER_ID | grep -E "^A /usr|^A /var/lib"
+
+# Check what commands were executed
+docker logs $CONTAINER_ID | grep -E "npm install|apt-get|pip install"
+```
+
+### Step 9: Generate Report
+
+**Execution Status:**
+```markdown
+## Execution Results
+
+**Container**: $CONTAINER_ID
+**Base Image**: ubuntu:22.04
+**Status**: [Running/Stopped/Exited]
+**Exit Code**: $(docker inspect $CONTAINER_ID --format='{{.State.ExitCode}}')
+
+**Resource Usage**:
+- Memory: XMB / 512MB
+- CPU: X%
+- Execution Time: Xs
+```
+
+**Side Effects:**
+```markdown
+## Filesystem Changes
+
+Files added: X
+Files modified: X
+Files deleted: X
+
+**Significant changes:**
+- /tmp/skill-temp-xyz.log (5KB)
+- /root/.claude/cache/skill-data.json (15KB)
+```
+
+**Dependency Analysis:**
+```markdown
+## Dependencies Detected
+
+**System Packages**:
+- curl (already present)
+- jq (installed by skill)
+
+**NPM Packages**:
+- lodash@4.17.21 (installed)
+
+**Hardcoded Paths**:
+⚠️ /root/.claude/config (line 45)
+→ Use $HOME/.claude/config instead
+```
+
+### Step 10: Cleanup
+
+**Ask user:**
+```
+Test complete. Container: $CONTAINER_ID
+
+Options:
+1. Keep container for debugging (docker start -ai $CONTAINER_ID)
+2. Stop container, keep image (can restart later)
+3. Remove container and image (full cleanup)
+
+Your choice?
+```
+
+**Cleanup commands:**
+```bash
+# Option 2: Stop container
+docker stop $CONTAINER_ID
+
+# Option 3: Full cleanup
+docker rm -f $CONTAINER_ID
+docker rmi skill-test:[skill-name]
+docker rmi skill-test:[skill-name]-after
+
+# Cleanup test directory
+rm -rf "$TEST_DIR"
+```
+
+**Cleanup all test containers:**
+```bash
+docker ps -a | grep skill-test | awk '{print $1}' | xargs docker rm -f
+docker images | grep skill-test | awk '{print $3}' | xargs docker rmi -f
+```
+
+## Interpreting Results
+
+### ✅ **PASS** - Production Ready
+- Container exited with code 0
+- Skill completed successfully
+- No excessive resource usage
+- All dependencies documented
+- No orphaned processes
+- Temp files in acceptable locations (/tmp only)
+
+### ⚠️ **WARNING** - Needs Improvement
+- Exit code 0 but warnings in logs
+- Higher than expected resource usage
+- Some undocumented dependencies
+- Minor cleanup issues
+
+### ❌ **FAIL** - Not Ready
+- Container exited with non-zero code
+- Skill crashed or hung
+- Excessive resource usage (> 512MB memory)
+- Attempted to access resources outside the container
+- Critical dependencies not documented
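When a container exits with a non-zero code, the value itself narrows down the cause. A sketch that relies on the shell convention that a status above 128 means termination by a signal (status minus 128); `describe_exit_code` is a hypothetical helper. A container killed for exceeding `--memory` typically reports 137 (SIGKILL), but any SIGKILL produces the same code:

```shell
# Hypothetical helper: turn a container exit code into a short description.
describe_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "success"
  elif [ "$code" -eq 137 ]; then
    echo "killed by SIGKILL (possibly out of memory)"
  elif [ "$code" -gt 128 ]; then
    echo "terminated by signal $((code - 128))"
  else
    echo "failed with exit code $code"
  fi
}
```

Usage: `describe_exit_code "$(docker inspect "$CONTAINER_ID" --format='{{.State.ExitCode}}')"`. To confirm an out-of-memory kill, `docker inspect` also exposes `{{.State.OOMKilled}}`.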
+
+## Common Issues
+
+### Issue: "Docker daemon not running"
+**Fix**:
+```bash
+# macOS
+open -a Docker
+
+# Linux
+sudo systemctl start docker
+```
+
+### Issue: "Permission denied" when building image
+**Cause**: User not in docker group
+**Fix**:
+```bash
+# Add user to docker group
+sudo usermod -aG docker $USER
+
+# Logout/login or run:
+newgrp docker
+```
+
+### Issue: "No space left on device"
+**Cause**: Docker disk space full
+**Fix**:
+```bash
+# Clean up old images and containers
+docker system prune -a
+
+# Check space
+docker system df
+```
+
+### Issue: Skill requires GUI
+**Cause**: Skill opens browser or displays graphics
+**Fix**: Add X11 forwarding or mark skill as requiring GUI
+
+## Advanced Techniques
+
+### Volume Mounts for Live Testing
+
+```bash
+# Mount skill directory for live editing
+docker run -it \
+  -v ~/.claude/skills/[skill-name]:/root/.claude/skills/[skill-name] \
+  skill-test:[skill-name]
+```
+
+### Custom Network Settings
+
+```bash
+# Isolated network (no internet)
+docker run -it --network=none skill-test:[skill-name]
+
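+# Add NET_ADMIN so network tooling (e.g. iptables, tc) can run inside the container;
+# the flag alone does not capture traffic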
+docker run -it --cap-add=NET_ADMIN skill-test:[skill-name]
+```
+
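+### Multi-Version Testing
+
+```bash
+# Test with different Node versions
+# (requires the Dockerfile to declare `ARG NODE_VERSION` and use it when
+# installing Node.js; the Dockerfile in Step 3 does not do this)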
+docker build -t skill-test:node16 --build-arg NODE_VERSION=16 .
+docker build -t skill-test:node18 --build-arg NODE_VERSION=18 .
+docker build -t skill-test:node20 --build-arg NODE_VERSION=20 .
+```
+
+## Best Practices
+
+1. **Always set resource limits** (`--memory`, `--cpus`) to prevent runaway processes
+2. **Use `--rm` flag** for auto-cleanup in simple tests
+3. **Tag images clearly** with skill name and version
+4. **Cache base images** to speed up subsequent tests
+5. **Export test results** before removing containers
+6. **Test with minimal permissions** first, add as needed
+7. **Document all APT/NPM/PIP installs** found during testing
+
+## Quick Command Reference
+
+```bash
+# Build test image
+docker build -t skill-test:my-skill .
+
+# Run with auto-cleanup
+docker run -it --rm skill-test:my-skill
+
+# Run with resource limits
+docker run -it --memory="512m" --cpus="1.0" skill-test:my-skill
+
+# Check container status
+docker ps -a | grep skill-test
+
+# View container logs
+docker logs <container-id>
+
+# Execute command in running container
+docker exec <container-id> <command>
+
+# Stop and remove all test containers
+docker ps -a | grep skill-test | awk '{print $1}' | xargs docker rm -f
+
+# Remove all test images
+docker images | grep skill-test | awk '{print $3}' | xargs docker rmi
+```
+
+---
+
+**Remember:** Docker provides strong isolation with reproducible environments. Use for skills that install packages or modify system files. For highest security, use VM mode instead.
--- a/skills/skill-isolation-tester/modes/mode3-vm.md
+++ b/skills/skill-isolation-tester/modes/mode3-vm.md
@@ -0,0 +1,565 @@
+# Mode 3: VM (Virtual Machine) Isolation
+
+## When to Use
+
+**Best for:**
+- High-risk skills that modify system configurations
+- Skills that require kernel modules or system services
+- Testing skills that interact with VMs themselves
+- Maximum isolation and security
+- Skills from untrusted sources
+
+**Not suitable for:**
+- Quick iteration during development (too slow)
+- Skills that are obviously safe and read-only
+- Situations where speed is more important than isolation
+
+**Risk Level**: Medium to high complexity skills
+
+## Advantages
+
+- 🔒 **Complete Isolation**: Separate kernel, OS, and all resources
+- 🛡️ **Maximum Security**: Host system is completely protected
+- 🖥️ **Real OS Environment**: Test on actual Linux/macOS distributions
+- 📸 **Snapshots**: Easy rollback to clean state
+- 🧪 **Destructive Testing**: Safe to test potentially dangerous operations
+
+## Limitations
+
+- 🐌 **Slow**: Minutes to provision, slower execution
+- 💾 **Disk Space**: 10-20GB per VM
+- 💰 **Resource Intensive**: Requires significant RAM and CPU
+- 🔧 **Complex Setup**: More moving parts to configure
+- ⏱️ **Longer Feedback Loop**: Not ideal for rapid iteration
+
+## Prerequisites
+
+1. Virtualization software installed:
+   - **macOS**: UTM, Parallels, or VMware Fusion
+   - **Linux**: QEMU/KVM, VirtualBox, or virt-manager
+   - **Windows**: VirtualBox, Hyper-V, or VMware Workstation
+
+2. Base VM image or ISO:
+   - Ubuntu 22.04 LTS (recommended)
+   - Debian 12
+   - Fedora 39
+
+3. System resources:
+   - 8GB+ host RAM (allocate 2-4GB to VM)
+   - 20GB+ disk space
+   - CPU virtualization enabled (VT-x/AMD-V)
+
+4. Command-line tools:
+   - **macOS with UTM**: `utmctl` or use UI
+   - **Linux**: `virsh` (libvirt) or `vboxmanage` (VirtualBox)
+   - **Multipass**: `multipass` (cross-platform, recommended)
+
+## Recommended: Use Multipass
+
+Multipass is the easiest option for cross-platform VM management:
+
+```bash
+# Install Multipass
+# macOS:
+brew install multipass
+
+# Linux:
+sudo snap install multipass
+
+# Windows:
+# Download from https://multipass.run/
+```
+
+## Workflow
+
+### Step 1: Validate Virtualization Environment
+
+```bash
+# Check virtualization is enabled (Linux)
+grep -E 'vmx|svm' /proc/cpuinfo
+
+# Check Multipass is installed
+command -v multipass || { echo "Install Multipass"; exit 1; }
+
+# Check available resources
+multipass info || echo "First time setup needed"
+```
+
+### Step 2: Create Base VM
+
+**Launch clean Ubuntu VM:**
+```bash
+VM_NAME="skill-test-$(date +%s)"
+
+# Launch VM with Multipass
+multipass launch \
+  --name "$VM_NAME" \
+  --cpus 2 \
+  --memory 2G \
+  --disk 10G \
+  22.04
+
+# Wait for VM to be ready
+multipass exec "$VM_NAME" -- cloud-init status --wait
+```
+
+**Or use UTM (macOS GUI):**
+1. Download Ubuntu 22.04 ARM64 ISO
+2. Create new VM with 2GB RAM, 10GB disk
+3. Install Ubuntu and setup user
+4. Note VM name for scripts
+
+**Or use virsh (Linux CLI):**
+```bash
+# Download cloud image
+wget https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img
+
+# Create VM
+virt-install \
+  --name "$VM_NAME" \
+  --memory 2048 \
+  --vcpus 2 \
+  --disk ubuntu-22.04-server-cloudimg-amd64.img \
+  --import \
+  --os-variant ubuntu22.04
+```
+
+### Step 3: Install Claude Code in VM
+
+```bash
+# Install system dependencies
+multipass exec "$VM_NAME" -- sudo apt-get update
+multipass exec "$VM_NAME" -- sudo apt-get install -y \
+  curl \
+  git \
+  nodejs \
+  npm
+
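+# Install Claude Code (global installs need sudo; Claude Code requires Node.js 18+,
+# which is newer than the Node.js 12 in Ubuntu 22.04's apt repository)
+multipass exec "$VM_NAME" -- sudo npm install -g @anthropic-ai/claude-code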
+
+# Verify installation
+multipass exec "$VM_NAME" -- which claude
+```
+
+### Step 4: Copy Skill to VM
+
+```bash
+# Create directory structure
+multipass exec "$VM_NAME" -- mkdir -p /home/ubuntu/.claude/skills
+
+# Copy skill to VM (--recursive is required to transfer a directory)
+multipass transfer --recursive \
+  ~/.claude/skills/[skill-name] \
+  "$VM_NAME":/home/ubuntu/.claude/skills/
+
+# Verify copy
+multipass exec "$VM_NAME" -- ls -la /home/ubuntu/.claude/skills/[skill-name]
+```
+
+### Step 5: Take VM Snapshot
+
+**With Multipass:**
+```bash
+# Older Multipass releases have no snapshot command (newer ones add `multipass snapshot`).
+# Either way, capture filesystem state so it can be diffed after the test
+multipass exec "$VM_NAME" -- find /home/ubuntu -type f > /tmp/before-files.txt
+multipass exec "$VM_NAME" -- dpkg -l > /tmp/before-packages.txt
+multipass exec "$VM_NAME" -- ps aux > /tmp/before-processes.txt
+```
+
+**With UTM (macOS):**
+```bash
+# Take snapshot via the UTM UI; the utmctl subcommand below may not exist in your UTM version
+utmctl snapshot "$VM_NAME" --name "before-skill-test"
+```
+
+**With virsh (Linux):**
+```bash
+virsh snapshot-create-as "$VM_NAME" before-skill-test "Before skill test"
+```
+
+### Step 6: Execute Skill in VM
+
+**Start Claude Code session in VM:**
+```bash
+# Interactive session
+multipass shell "$VM_NAME"
+
+# Then inside VM:
+claude
+
+# Run skill with trigger phrase
+```
+
+**Or execute non-interactively:**
+```bash
+# If skill has test command
+multipass exec "$VM_NAME" -- \
+  bash -c "claude skill run [skill-name] --test"
+```
+
+**Monitor from host:**
+```bash
+# Watch resource usage (the JSON output reports `memory` and `load` per instance)
+multipass info "$VM_NAME" --format json | jq '.info[] | {memory, load}'
+
+# Tail logs
+multipass exec "$VM_NAME" -- tail -f /var/log/syslog
+```
+
+### Step 7: Take Post-Execution Snapshot
+
+```bash
+# Capture filesystem state
+multipass exec "$VM_NAME" -- find /home/ubuntu -type f > /tmp/after-files.txt
+multipass exec "$VM_NAME" -- dpkg -l > /tmp/after-packages.txt
+multipass exec "$VM_NAME" -- ps aux > /tmp/after-processes.txt
+
+# Compare (sort the file listings so ordering cannot cause false differences)
+diff <(sort /tmp/before-files.txt) <(sort /tmp/after-files.txt) > /tmp/file-changes.txt
+diff /tmp/before-packages.txt /tmp/after-packages.txt > /tmp/package-changes.txt
+diff /tmp/before-processes.txt /tmp/after-processes.txt > /tmp/process-changes.txt
+```
+
+**Snapshot VM state:**
+```bash
+# virsh
+virsh snapshot-create-as "$VM_NAME" after-skill-test "After skill test"
+
+# UTM (macOS)
+utmctl snapshot "$VM_NAME" --name "after-skill-test"
+```
+
+### Step 8: Analyze Results
+
+**Extract execution logs:**
+```bash
+# Copy Claude Code logs from VM (adjust the path if your version logs elsewhere)
+multipass transfer --recursive \
+  "$VM_NAME":/home/ubuntu/.claude/logs/ \
+  /tmp/skill-test-logs/
+
+# Analyze logs
+grep -i "error\|warning\|fail" /tmp/skill-test-logs/*.log
+```
+
+**Check filesystem changes:**
+```bash
+echo "Files added: $(grep -c '^>' /tmp/file-changes.txt)"
+echo "Files removed: $(grep -c '^<' /tmp/file-changes.txt)"
+
+# Check for unexpected modifications. diff prefixes added lines with "> ";
+# these only match if the Step 5 and Step 7 listings also cover /etc and /usr/local
+grep '^> /etc/' /tmp/file-changes.txt  # System config changes
+grep '^> /usr/local/' /tmp/file-changes.txt  # Global installs
+```
+
+**Check package changes:**
+```bash
+# List newly installed packages
+grep '^>' /tmp/package-changes.txt
+
+# Check for removed packages
+grep '^<' /tmp/package-changes.txt
+```
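Raw `dpkg -l` diffs include header lines and column padding. A sketch that reduces the two snapshots to package names first; `installed_names` and `new_packages` are hypothetical helpers, and only rows in the `ii` (installed) state are considered:

```shell
# Hypothetical helper: print the sorted names of fully installed packages
# from a saved `dpkg -l` listing (rows whose first column is "ii").
installed_names() {
  awk '$1 == "ii" { print $2 }' "$1" | LC_ALL=C sort -u
}

# Print names installed in listing $2 but not in listing $1.
new_packages() {
  b=$(mktemp); a=$(mktemp)
  installed_names "$1" > "$b"
  installed_names "$2" > "$a"
  LC_ALL=C comm -13 "$b" "$a"
  rm -f "$b" "$a"
}
```

Usage: `new_packages /tmp/before-packages.txt /tmp/after-packages.txt`.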
+
+**Check for orphaned processes:**
+```bash
+# Processes still running after skill completion
+grep ">" /tmp/process-changes.txt | grep -v "ps\|grep\|ssh"
+```
+
+**System modifications:**
+```bash
+# Check for systemd services
+multipass exec "$VM_NAME" -- systemctl list-units --type=service --state=running
+
+# Check for cron jobs
+multipass exec "$VM_NAME" -- crontab -l
+
+# Check for environment modifications
+multipass exec "$VM_NAME" -- cat /etc/environment
+```
+
+### Step 9: Generate Comprehensive Report
+
+```markdown
+# VM Isolation Test Report: [skill-name]
+
+## Environment
+**VM Platform**: Multipass / UTM / virsh
+**OS**: Ubuntu 22.04 LTS
+**VM Name**: $VM_NAME
+**Resources**: 2 vCPU, 2GB RAM, 10GB disk
+
+## Execution Results
+**Status**: ✅ Completed successfully
+**Duration**: 45 seconds
+**Exit Code**: 0
+
+## Filesystem Changes
+**Files Added**: 12
+- `/home/ubuntu/.claude/cache/skill-data.json` (15KB)
+- `/tmp/skill-temp-*.log` (3 files, 45KB total)
+- `/home/ubuntu/.cache/skill-assets/` (8 files, 120KB)
+
+**Files Modified**: 2
+- `/home/ubuntu/.claude/config.json` (updated skill registry)
+- `/home/ubuntu/.bash_history` (normal)
+
+**Files Deleted**: 0
+
+## Package Changes
+**Installed Packages**: 2
+- `jq` (1.6-2.1ubuntu3)
+- `tree` (2.0.2-1)
+
+**Removed Packages**: 0
+
+## System Modifications
+✅ No systemd services added
+✅ No cron jobs created
+✅ No environment variables modified
+⚠️  Found leftover temp files in /tmp
+
+## Process Analysis
+**Orphaned Processes**: 0
+**Background Jobs**: 0
+**Network Connections**: 0
+
+## Security Assessment
+✅ No unauthorized file access attempts
+✅ No privilege escalation attempts
+✅ No suspicious network activity
+✅ All operations within user home directory
+
+## Dependency Analysis
+**System Packages Required**:
+- `jq` (for JSON processing) - Not documented in README
+- `tree` (for directory visualization) - Optional
+
+**NPM Packages Required**: None beyond Claude Code
+
+**Hardcoded Paths Detected**:
+⚠️  `/home/ubuntu/.claude/cache` (line 67)
+→  Should use `$HOME/.claude/cache` or `~/.claude/cache`
+
+## Recommendations
+1. **CRITICAL**: Document `jq` dependency in README.md
+2. **HIGH**: Fix hardcoded path on line 67
+3. **MEDIUM**: Clean up /tmp files before skill exits
+4. **LOW**: Consider making `tree` dependency optional
+
+## Overall Grade: B (READY with minor fixes)
+
+**Portability**: 85/100
+**Cleanliness**: 75/100
+**Security**: 100/100
+**Documentation**: 70/100
+
+**Final Status**: ✅ **APPROVED** for public release after addressing CRITICAL and HIGH priority items
+```
+
+### Step 10: Cleanup or Preserve
+
+**Ask user:**
+```
+Test complete. VM: $VM_NAME
+
+Options:
+1. Keep VM for manual inspection
+   Command: multipass shell $VM_NAME
+
+2. Stop VM (can restart later)
+   Command: multipass stop $VM_NAME
+
+3. Delete VM and snapshots (full cleanup)
+   Command: multipass delete $VM_NAME && multipass purge
+
+4. Rollback to "before" snapshot and retest
+   (virsh/UTM only)
+
+Your choice?
+```
+
+**Cleanup commands:**
+```bash
+# Option 2: Stop VM
+multipass stop "$VM_NAME"
+
+# Option 3: Full cleanup
+multipass delete "$VM_NAME"
+multipass purge
+
+# Cleanup temp files
+rm -rf /tmp/skill-test-logs
+rm -f /tmp/before-*.txt /tmp/after-*.txt /tmp/*-changes.txt
+```
+
+## Interpreting Results
+
+### ✅ **PASS** - Production Ready
+- VM still bootable after test
+- Skill completed successfully
+- No unauthorized system modifications
+- All dependencies documented
+- No security issues detected
+- Clean cleanup (no orphaned resources)
+
+### ⚠️ **WARNING** - Needs Review
+- Skill works but left system modifications
+- Installed undocumented packages
+- Modified system configs (needs user consent)
+- Performance issues (high resource usage)
+
+### ❌ **FAIL** - Not Safe
+- VM corrupted or unbootable
+- Skill crashed or hung indefinitely
+- Unauthorized privilege escalation
+- Malicious behavior detected
+- Critical undocumented dependencies
+- Data exfiltration attempts
+
+## Common Issues
+
+### Issue: "Multipass not found"
+**Fix**:
+```bash
+# macOS
+brew install multipass
+
+# Linux
+sudo snap install multipass
+```
+
+### Issue: "Virtualization not enabled"
+**Cause**: VT-x/AMD-V disabled in BIOS
+**Fix**: Enable virtualization in BIOS/UEFI settings
+
+### Issue: "Failed to launch VM"
+**Cause**: Insufficient resources
+**Fix**:
+```bash
+# Reduce VM resources
+multipass launch --cpus 1 --memory 1G --disk 5G
+```
+
+### Issue: "VM network not working"
+**Cause**: Network bridge issues
+**Fix**:
+```bash
+# Restart Multipass daemon
+# macOS
+sudo launchctl kickstart -k system/com.canonical.multipassd
+
+# Linux
+sudo systemctl restart snap.multipass.multipassd
+```
+
+### Issue: "Can't copy files to VM"
+**Cause**: SSH/sftp issues
+**Fix**:
+```bash
+# Mount host directory instead
+multipass mount ~/.claude/skills "$VM_NAME":/mnt/skills
+```
+
+## Advanced Techniques
+
+### Automated Testing Pipeline
+
+```bash
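+#!/bin/bash
+# test-skill-vm.sh
+set -euo pipefail
+
+SKILL_NAME="${1:?usage: test-skill-vm.sh <skill-name>}"
+VM_NAME="skill-test-$SKILL_NAME-$(date +%s)"
+
+# Always delete the VM, even if a step fails
+cleanup() {
+  multipass delete "$VM_NAME" 2>/dev/null || true
+  multipass purge
+}
+trap cleanup EXIT
+
+# Launch VM
+multipass launch --name "$VM_NAME" 22.04
+
+# Setup (the skills directory must exist before the transfer below)
+multipass exec "$VM_NAME" -- bash -c "
+  sudo apt-get update
+  sudo apt-get install -y nodejs npm
+  sudo npm install -g @anthropic-ai/claude-code
+  mkdir -p /home/ubuntu/.claude/skills
+"
+
+# Copy skill
+multipass transfer --recursive ~/.claude/skills/"$SKILL_NAME" "$VM_NAME":/home/ubuntu/.claude/skills/
+
+# Run test ("claude skill test" is a placeholder; use your Claude Code version's test command)
+multipass exec "$VM_NAME" -- claude skill test "$SKILL_NAME"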
+```
+
+### Testing on Multiple OS Versions
+
+```bash
+# Test on Ubuntu 20.04, 22.04, and 24.04
+for version in 20.04 22.04 24.04; do
+  VM="skill-test-ubuntu-${version//./-}"  # instance names cannot contain dots
+  multipass launch --name "$VM" "$version"
+  # ... run tests ...
+  multipass delete "$VM"
+done
+```
+
+### Network Isolation Testing
+
+```bash
+# Create VM without internet access (if supported by hypervisor)
+# Then test if skill fails gracefully without network
+```
+
+## Best Practices
+
+1. **Always take snapshots** before running skills
+2. **Test on clean VMs** - don't reuse VMs between tests
+3. **Monitor resource usage** - catch runaway processes
+4. **Check system logs** (`/var/log/syslog`) for warnings
+5. **Test rollback** - ensure VM can be restored
+6. **Document all system dependencies** found
+7. **Use minimal VM resources** to catch resource issues
+8. **Archive test results** before destroying VMs
+
+## Quick Command Reference
+
+```bash
+# Launch VM
+multipass launch --name test-vm 22.04
+
+# List VMs
+multipass list
+
+# Shell into VM
+multipass shell test-vm
+
+# Execute command in VM
+multipass exec test-vm -- <command>
+
+# Copy file to VM
+multipass transfer local-file test-vm:/remote/path
+
+# Copy file from VM
+multipass transfer test-vm:/remote/path local-file
+
+# Stop VM
+multipass stop test-vm
+
+# Start VM
+multipass start test-vm
+
+# Delete VM
+multipass delete test-vm && multipass purge
+
+# VM info
+multipass info test-vm
+```
+
+---
+
+**Remember:** VM isolation is the gold standard for testing high-risk skills. It's slower but provides complete security and accurate testing of system-level behaviors. Use for skills from untrusted sources or skills that modify system state.