# Skill Risk Assessment Guide

## Overview

This guide helps you assess the risk level of a skill to determine the appropriate isolation environment for testing. A sound assessment avoids both over-isolation, which wastes time, and under-isolation, which exposes the host to security issues.

## Risk Levels

### Low Risk → Git Worktree

**Characteristics:**
- Read-only operations on existing files
- No system commands (bash, npm, apt, etc.)
- No file creation outside skill directory
- No network requests
- Pure data processing or analysis
- File reading and reporting only

**Examples:**
- Code analyzer that reads files and generates reports
- Configuration validator that checks syntax
- Documentation generator from code comments
- Markdown formatter or linter
- Log file parser

**Appropriate Environment:** Git Worktree (fast, lightweight)

### Medium Risk → Docker

**Characteristics:**
- File creation in user directories
- NPM/pip package installation
- Bash commands for file operations
- Git operations (clone, commit, etc.)
- Network requests (API calls, downloads)
- Environment variable reads
- Temporary file creation
- Database connections (local)

**Examples:**
- Code generator that creates new files
- Package installer or dependency manager
- API integration that fetches remote data
- Build tool that compiles code
- Test runner that executes tests
- Migration tool that updates files

**Appropriate Environment:** Docker (OS isolation, reproducible)

### High Risk → VM

**Characteristics:**
- System configuration changes (/etc/ modifications)
- Service installation (systemd, cron)
- Kernel module loading
- VM or container operations
- Database schema migrations (production)
- Destructive operations (file deletion, disk formatting)
- Privilege escalation (sudo commands)
- Unknown or untrusted source

**Examples:**
- System setup automation
- Infrastructure provisioning
- VM management tools
- Security testing tools
- Experimental or unreviewed skills
- Skills from external repositories

**Appropriate Environment:** VM (complete isolation, safest)
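
The mapping from risk level to environment described above can be captured in a small lookup. This is a minimal sketch: the names `ENVIRONMENTS` and `environmentFor` are hypothetical, and falling back to a VM for unrecognized levels is an assumption in line with the "when in doubt, choose higher isolation" advice.

```javascript
// Hypothetical lookup mirroring the three risk levels in this guide.
const ENVIRONMENTS = new Map([
  ["low", "worktree"],  // Git Worktree: fast, lightweight
  ["medium", "docker"], // Docker: OS isolation, reproducible
  ["high", "vm"],       // VM: complete isolation
]);

// Unrecognized levels fall back to the safest environment (assumption).
function environmentFor(level) {
  return ENVIRONMENTS.get(level) ?? "vm";
}
```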

## Assessment Checklist

### Step 1: Parse Skill Manifest (SKILL.md)

Read the skill's SKILL.md and look for these keywords:

**Low Risk Indicators:**
- "analyze", "read", "parse", "validate", "check", "lint", "format"
- "generate report", "calculate", "summarize"
- Read-only file operations
- No system commands mentioned

**Medium Risk Indicators:**
- "install", "create", "write", "modify", "update", "build", "compile"
- "npm install", "pip install", "git clone"
- "fetch", "download", "API call"
- File creation mentioned
- Bash commands for file operations

**High Risk Indicators:**
- "sudo", "systemctl", "cron", "service"
- "configure system", "modify /etc"
- "VM", "docker run", "container"
- "delete", "remove", "format disk", "mkfs"
- "root access", "privilege"
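
A keyword pass over SKILL.md along these lines could be sketched as follows. This is an illustration only: `INDICATORS` holds a subset of the keywords above, `keywordRisk` is a hypothetical name, and whole-word matching is an assumption that will miss inflected forms such as "reads".

```javascript
// Subset of the Step 1 keywords, ordered from most to least severe.
const INDICATORS = [
  ["high", ["sudo", "systemctl", "cron", "root access", "modify /etc"]],
  ["medium", ["install", "create", "write", "build", "fetch", "download", "git clone"]],
  ["low", ["analyze", "read", "parse", "validate", "lint", "summarize"]],
];

// Escape regex metacharacters so keywords are matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Returns the most severe level whose keywords occur as whole words,
// or null when no keyword matches.
function keywordRisk(manifestText) {
  for (const [level, keywords] of INDICATORS) {
    const hit = keywords.some((k) =>
      new RegExp(`\\b${escapeRegExp(k)}\\b`, "i").test(manifestText)
    );
    if (hit) return level;
  }
  return null;
}
```

Because the levels are checked from high to low, a manifest that mentions both "parse" and "sudo" is reported as high.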

### Step 2: Scan Skill Code

If the skill includes scripts or code files, scan them for:

**Red Flags (High Risk):**
```bash
# In bash scripts
sudo
systemctl
/etc/
chmod 777
rm -rf /
dd if=
mkfs
usermod
passwd
```

```javascript
// In JavaScript/Node
require('child_process').exec('sudo')
fs.rmSync('/', { recursive: true, force: true })
fs.rmdirSync('/', { recursive: true })  // older, deprecated form
process.setuid(0)
```

```python
# In Python
os.system('sudo')
import subprocess
subprocess.run(['sudo', ...])
```

**Medium Risk Patterns:**
```bash
npm install
git clone
curl | bash        # treat as high risk when the script source is unknown (see Step 3)
apt-get install
brew install
pip install
mkdir -p
touch
echo > file
```

**Low Risk Patterns:**
```bash
cat file.txt
grep pattern
find . -name
ls -la
echo "message"
```
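
The scan can be automated with a handful of regular expressions. The sketch below covers only some of the patterns listed above; `PATTERNS` and `scanScript` are hypothetical names, and line-by-line matching is an assumption that will not catch commands split across lines or built dynamically.

```javascript
// A few of the Step 2 patterns as regular expressions (not exhaustive).
const PATTERNS = [
  { level: "high", regex: /\bsudo\b/ },
  { level: "high", regex: /\bsystemctl\b/ },
  { level: "high", regex: /\brm\s+-rf\s+\/(\s|$)/ },
  { level: "high", regex: /\bchmod\s+777\b/ },
  { level: "medium", regex: /\b(npm|pip)\s+install\b/ },
  { level: "medium", regex: /\bgit\s+clone\b/ },
  { level: "medium", regex: /\bcurl\b.*\|\s*(ba)?sh\b/ },
];

// Returns one finding per matching (line, pattern) pair,
// with 1-based line numbers.
function scanScript(source) {
  const findings = [];
  source.split("\n").forEach((text, index) => {
    for (const { level, regex } of PATTERNS) {
      if (regex.test(text)) {
        findings.push({ line: index + 1, level, text: text.trim() });
      }
    }
  });
  return findings;
}
```

A line that matches several patterns, such as `sudo systemctl restart nginx`, produces one finding per pattern.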

### Step 3: Check Dependencies

Review plugin.json or README for dependencies:

**Low Risk:**
- No external dependencies
- Pure JavaScript/Python/Ruby standard library
- Read-only CLI tools (cat, grep, jq for reading only)

**Medium Risk:**
- NPM packages listed
- Python packages (via requirements.txt)
- Common CLI tools (git, curl, wget)
- Database connections (read/write)

**High Risk:**
- System packages (apt, yum, brew)
- Kernel modules
- Root-level dependencies
- Unsigned binaries
- External scripts from unknown sources

### Step 4: Review File Operations

Check what directories the skill accesses:

**Low Risk:**
- Reads from current directory only
- Reads from specified input files
- Writes reports to current directory

**Medium Risk:**
- Reads/writes to ~/.claude/
- Reads/writes to /tmp/
- Creates files in user directories
- Modifies project files

**High Risk:**
- Accesses /etc/
- Accesses /usr/ or /usr/local/
- Accesses /sys/ or /proc/
- Modifies system binaries
- Accesses /var/log/
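
The directory tiers above lend themselves to a prefix check. The following is a sketch under assumptions: `pathRisk` is a hypothetical helper, paths are expected to be already normalized (no `..` segments), and any other absolute path is treated as a user directory.

```javascript
const HIGH_RISK_PREFIXES = ["/etc/", "/usr/", "/sys/", "/proc/", "/var/log/"];
const MEDIUM_RISK_PREFIXES = ["~/", "/tmp/"];

function pathRisk(path) {
  // Append a slash so "/etc" matches the "/etc/" prefix.
  const p = path.endsWith("/") ? path : path + "/";
  if (HIGH_RISK_PREFIXES.some((prefix) => p.startsWith(prefix))) return "high";
  if (MEDIUM_RISK_PREFIXES.some((prefix) => p.startsWith(prefix))) return "medium";
  // Assumption: other absolute paths are user directories.
  if (p.startsWith("/") || p.startsWith("~")) return "medium";
  // Relative paths stay inside the current directory.
  return "low";
}
```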

### Step 5: Network Activity Assessment

**Low Risk:**
- No network activity
- Reads from local cache only

**Medium Risk:**
- HTTP GET requests to public APIs
- Documented API endpoints
- Read-only data fetching
- HTTPS only

**High Risk:**
- HTTP POST with sensitive data
- Unclear network destinations
- Raw socket operations
- Arbitrary URL from user input
- Self-updating mechanism
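
One way to apply these tiers to a single outgoing request is sketched below. The `request` descriptor and its fields (`method`, `url`, `urlFromUserInput`) are hypothetical, and the function conservatively treats anything other than an HTTPS GET to a fixed URL as high risk, which is stricter than the list above.

```javascript
// `request` is a hypothetical descriptor: { method, url, urlFromUserInput }.
// Pass null or undefined when the skill makes no network requests.
function networkRisk(request) {
  if (!request) return "low"; // no network activity
  if (request.urlFromUserInput) return "high"; // arbitrary URL from user input

  let parsed;
  try {
    parsed = new URL(request.url);
  } catch {
    return "high"; // unparseable URL: unclear destination
  }

  const method = (request.method ?? "GET").toUpperCase();
  if (parsed.protocol === "https:" && method === "GET") return "medium";
  return "high"; // plain HTTP, or a method that sends data
}
```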

## Automatic Risk Scoring

Use this scoring system:

```javascript
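// Assumption: `skill.text` holds the concatenated SKILL.md and script contents.
// Both helpers are simple heuristics, not a parser.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// True if any term appears as a whole word or phrase (case-insensitive).
function mentions(skill, ...terms) {
  return terms.some((t) =>
    new RegExp(`\\b${escapeRegExp(t)}\\b`, "i").test(skill.text)
  );
}

// True if any path prefix appears anywhere in the text.
function accesses(skill, ...paths) {
  return paths.some((p) => skill.text.includes(p));
}

function assessSkillRisk(skill) {
  let score = 0;

  // File operations
  if (mentions(skill, "read", "parse", "analyze")) score += 1;
  if (mentions(skill, "write", "create", "modify")) score += 3;
  if (mentions(skill, "delete", "remove", "rm -rf")) score += 8;

  // System operations
  if (mentions(skill, "npm install", "pip install")) score += 3;
  if (mentions(skill, "apt-get", "brew install")) score += 5;
  if (mentions(skill, "sudo", "systemctl", "service")) score += 10;

  // File paths
  if (accesses(skill, "~/", "/tmp/")) score += 2;
  if (accesses(skill, "/etc/", "/usr/")) score += 8;

  // Network
  if (mentions(skill, "fetch", "API", "curl")) score += 2;
  if (mentions(skill, "download", "wget")) score += 3;

  // Process operations
  if (mentions(skill, "exec", "spawn", "child_process")) score += 4;

  // Determine risk level
  if (score <= 3) return "low";      // Worktree
  if (score <= 10) return "medium";  // Docker
  return "high";                     // VM
}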
```

**Scoring Reference:**
- 0-3: Low Risk → Git Worktree
- 4-10: Medium Risk → Docker
- 11+: High Risk → VM

## Special Cases

### Unknown or Unreviewed Skills

**Default:** High Risk (VM isolation)

Even if a skill appears low risk, use a VM for the first test of:
- Skills from external repositories
- Skills without documentation
- Skills with obfuscated code
- Skills from untrusted authors
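
This default can be applied as a final step on top of any assessed level. The sketch below is illustrative: `firstTestLevel` and the `source` fields (`external`, `documented`, `obfuscated`, `trustedAuthor`) are hypothetical names for the four conditions listed above.

```javascript
// Returns the level to use for the first test of a skill.
// Any of the four conditions forces "high", whatever the assessed level.
function firstTestLevel(assessedLevel, source) {
  const untrusted =
    source.external ||       // from an external repository
    !source.documented ||    // no documentation
    source.obfuscated ||     // obfuscated code
    !source.trustedAuthor;   // untrusted author
  return untrusted ? "high" : assessedLevel;
}
```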

### Skills in Active Development

**Recommendation:** Medium Risk (Docker)

For your own skills during development:
- Start with Git Worktree for speed
- Use Docker before committing
- Use VM before public release

### Skills from Marketplace

**Recommendation:** Follow listed risk level

Trusted marketplace skills can use their documented risk level.

## Override Cases

The user can always override automatic detection:

```
test skill low-risk-skill in vm     # More isolation than needed (safe but slow)
test skill high-risk-skill in docker # Less isolation (not recommended)
```

**Warn the user if they choose lower isolation than recommended.**
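
A sketch of that check, assuming environments are identified by the strings `worktree`, `docker`, and `vm`; `applyOverride` and `ISOLATION_RANK` are hypothetical names.

```javascript
// Isolation strength, weakest to strongest.
const ISOLATION_RANK = new Map([["worktree", 0], ["docker", 1], ["vm", 2]]);

// Honors the user's choice, attaching a warning when it is weaker
// than the recommended environment.
function applyOverride(recommended, chosen) {
  if (!ISOLATION_RANK.has(recommended) || !ISOLATION_RANK.has(chosen)) {
    throw new Error(`Unknown environment: ${recommended} / ${chosen}`);
  }
  const weaker = ISOLATION_RANK.get(chosen) < ISOLATION_RANK.get(recommended);
  return {
    environment: chosen,
    warning: weaker
      ? `"${chosen}" provides less isolation than the recommended "${recommended}".`
      : null,
  };
}
```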

## Risk Re-assessment

Re-assess risk whenever a skill is updated, especially after:
- Major version changes
- New dependencies added
- New file operations
- Expanded scope
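
Some of these triggers can be detected by comparing two snapshots of a skill. The sketch below assumes a hypothetical snapshot shape (`version` as a dotted string, `dependencies` and `paths` as string arrays); "expanded scope" still needs human review.

```javascript
// Returns the list of reasons a re-assessment is warranted (empty if none).
function reassessmentReasons(previous, current) {
  const reasons = [];
  const major = (version) => Number.parseInt(version.split(".")[0], 10);
  const added = (before, after) => after.filter((x) => !before.includes(x));

  if (major(current.version) !== major(previous.version)) {
    reasons.push("major version change");
  }
  if (added(previous.dependencies, current.dependencies).length > 0) {
    reasons.push("new dependencies");
  }
  if (added(previous.paths, current.paths).length > 0) {
    reasons.push("new file operations");
  }
  return reasons;
}
```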

## Decision Tree

```
Start
  |
  ├─ Is skill from untrusted source?
  |    ├─ YES → High Risk (VM)
  |    └─ NO → Continue
  |
  ├─ Does skill modify system configs or use sudo?
  |    ├─ YES → High Risk (VM)
  |    └─ NO → Continue
  |
  ├─ Does skill install packages or modify files?
  |    ├─ YES → Medium Risk (Docker)
  |    └─ NO → Continue
  |
  └─ Does skill read files only?
       ├─ YES → Low Risk (Worktree)
       └─ NO → Medium Risk (Docker)
```
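
The same questions can be expressed as a function. This sketch asks the untrusted-source and system-level questions first, so that the high-risk default from Special Cases cannot be bypassed by an earlier "reads files only" answer; the function name and the four boolean fields are hypothetical.

```javascript
// Each field is a yes/no answer to one question in the decision tree.
function decideRisk({
  untrustedSource = false,
  modifiesSystemOrUsesSudo = false,
  installsOrModifiesFiles = false,
  readsFilesOnly = false,
} = {}) {
  if (untrustedSource) return "high";          // VM
  if (modifiesSystemOrUsesSudo) return "high"; // VM
  if (installsOrModifiesFiles) return "medium"; // Docker
  if (readsFilesOnly) return "low";            // Worktree
  return "medium"; // nothing answered "yes": default to Docker
}
```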

## Example Assessments

### Example 1: "code-formatter"

**Description:** Formats JavaScript/TypeScript files using prettier

**Analysis:**
- Reads files: Yes (score: +1)
- Writes files: Yes (score: +3)
- System commands: No
- Dependencies: prettier (npm package) (score: +3)
- File paths: Current directory only

**Total Score:** 7
**Risk Level:** Medium → Docker

**Reasoning:** Modifies files but limited to project directory. Docker provides adequate isolation.

### Example 2: "log-analyzer"

**Description:** Parses log files and generates HTML report

**Analysis:**
- Reads files: Yes (score: +1)
- Writes files: Yes (HTML report) (score: +3)
- System commands: No
- Dependencies: None
- File paths: Current directory + /tmp for temp files (score: +2)

**Total Score:** 6
**Risk Level:** Medium → Docker

**Reasoning:** Safe operations but creates files. Docker ensures clean testing.

### Example 3: "system-auditor"

**Description:** Audits system security configuration

**Analysis:**
- Reads files: Yes, including /etc/ (score: +1)
- System commands: Runs systemctl, checks services (score: +10)
- Dependencies: System tools
- File paths: /etc/, /var/log/ (score: +8)

**Total Score:** 19
**Risk Level:** High → VM

**Reasoning:** Accesses sensitive system directories and uses system commands. VM required.

### Example 4: "markdown-linter"

**Description:** Checks markdown files for style violations

**Analysis:**
- Reads files: Yes (score: +1)
- Writes files: No (only stdout)
- System commands: No
- Dependencies: None
- File paths: Current directory only

**Total Score:** 1
**Risk Level:** Low → Git Worktree

**Reasoning:** Pure read-only analysis. Worktree is sufficient and fast.

---

**Remember:** When in doubt, choose higher isolation. It's better to be safe than to clean up a compromised system. Speed is secondary to security.