Files
gh-cskiro-claudex-meta-tools/skills/skill-isolation-tester/data/risk-assessment.md
2025-11-29 18:16:56 +08:00

9.3 KiB

Skill Risk Assessment Guide

Overview

This guide helps you assess the risk level of a skill to determine the appropriate isolation environment for testing. Risk assessment prevents over-isolation (wasting time) and under-isolation (security issues).

Risk Levels

Low Risk → Git Worktree

Characteristics:

  • Read-only operations on existing files
  • No system commands (bash, npm, apt, etc.)
  • No file creation outside skill directory
  • No network requests
  • Pure data processing or analysis
  • File reading and reporting only

Examples:

  • Code analyzer that reads files and generates reports
  • Configuration validator that checks syntax
  • Documentation generator from code comments
  • Markdown formatter or linter
  • Log file parser

Appropriate Environment: Git Worktree (fast, lightweight)

Medium Risk → Docker

Characteristics:

  • File creation in user directories
  • NPM/pip package installation
  • Bash commands for file operations
  • Git operations (clone, commit, etc.)
  • Network requests (API calls, downloads)
  • Environment variable reads
  • Temporary file creation
  • Database connections (local)

Examples:

  • Code generator that creates new files
  • Package installer or dependency manager
  • API integration that fetches remote data
  • Build tool that compiles code
  • Test runner that executes tests
  • Migration tool that updates files

Appropriate Environment: Docker (OS isolation, reproducible)

High Risk → VM

Characteristics:

  • System configuration changes (/etc/ modifications)
  • Service installation (systemd, cron)
  • Kernel module loading
  • VM or container operations
  • Database schema migrations (production)
  • Destructive operations (file deletion, disk formatting)
  • Privilege escalation (sudo commands)
  • Unknown or untrusted source

Examples:

  • System setup automation
  • Infrastructure provisioning
  • VM management tools
  • Security testing tools
  • Experimental or unreviewed skills
  • Skills from external repositories

Appropriate Environment: VM (complete isolation, safest)

Assessment Checklist

Step 1: Parse Skill Manifest (SKILL.md)

Read the skill's SKILL.md and look for these keywords:

Low Risk Indicators:

  • "analyze", "read", "parse", "validate", "check", "lint", "format"
  • "generate report", "calculate", "summarize"
  • Read-only file operations
  • No system commands mentioned

Medium Risk Indicators:

  • "install", "create", "write", "modify", "update", "build", "compile"
  • "npm install", "pip install", "git clone"
  • "fetch", "download", "API call"
  • File creation mentioned
  • Bash commands for file operations

High Risk Indicators:

  • "sudo", "systemctl", "cron", "service"
  • "configure system", "modify /etc"
  • "VM", "docker run", "container"
  • "delete", "remove", "format"
  • "root access", "privilege"

Step 2: Scan Skill Code

If skill includes scripts or code files, scan for:

Red Flags (High Risk):

# In bash scripts
sudo
systemctl
/etc/
chmod 777
rm -rf /
dd if=
mkfs
usermod
passwd
// In JavaScript/Node
require('child_process').exec('sudo')
fs.rmdirSync('/', { recursive: true })
process.setuid(0)
# In Python
os.system('sudo')
import subprocess
subprocess.run(['sudo', ...])

Medium Risk Patterns:

npm install
git clone
curl | bash
apt-get install
brew install
pip install
mkdir -p
touch
echo > file

Low Risk Patterns:

cat file.txt
grep pattern
find . -name
ls -la
echo "message"

Step 3: Check Dependencies

Review plugin.json or README for dependencies:

Low Risk:

  • No external dependencies
  • Pure JavaScript/Python/Ruby standard library
  • Read-only CLI tools (cat, grep, jq for reading only)

Medium Risk:

  • NPM packages listed
  • Python packages (via requirements.txt)
  • Common CLI tools (git, curl, wget)
  • Database connections (read/write)

High Risk:

  • System packages (apt, yum, brew)
  • Kernel modules
  • Root-level dependencies
  • Unsigned binaries
  • External scripts from unknown sources

Step 4: Review File Operations

Check what directories the skill accesses:

Low Risk:

  • Reads from current directory only
  • Reads from specified input files
  • Writes reports to current directory

Medium Risk:

  • Reads/writes to ~/.claude/
  • Reads/writes to /tmp/
  • Creates files in user directories
  • Modifies project files

High Risk:

  • Accesses /etc/
  • Accesses /usr/ or /usr/local/
  • Accesses /sys/ or /proc/
  • Modifies system binaries
  • Accesses /var/log/

Step 5: Network Activity Assessment

Low Risk:

  • No network activity
  • Reads from local cache only

Medium Risk:

  • HTTP GET requests to public APIs
  • Documented API endpoints
  • Read-only data fetching
  • HTTPS only

High Risk:

  • HTTP POST with sensitive data
  • Unclear network destinations
  • Raw socket operations
  • Arbitrary URL from user input
  • Self-updating mechanism

Automatic Risk Scoring

Use this scoring system:

function assessSkillRisk(skill) {
  let score = 0;

  // File operations
  if (mentions(skill, "read", "parse", "analyze")) score += 1;
  if (mentions(skill, "write", "create", "modify")) score += 3;
  if (mentions(skill, "delete", "remove", "rm -rf")) score += 8;

  // System operations
  if (mentions(skill, "npm install", "pip install")) score += 3;
  if (mentions(skill, "apt-get", "brew install")) score += 5;
  if (mentions(skill, "sudo", "systemctl", "service")) score += 10;

  // File paths
  if (accesses(skill, "~/", "/tmp/")) score += 2;
  if (accesses(skill, "/etc/", "/usr/")) score += 8;

  // Network
  if (mentions(skill, "fetch", "API", "curl")) score += 2;
  if (mentions(skill, "download", "wget")) score += 3;

  // Process operations
  if (mentions(skill, "exec", "spawn", "child_process")) score += 4;

  // Determine risk level
  if (score <= 3) return "low";      // Worktree
  if (score <= 10) return "medium";  // Docker
  return "high";                     // VM
}

Scoring Reference:

  • 0-3: Low Risk → Git Worktree
  • 4-10: Medium Risk → Docker
  • 11+: High Risk → VM

Special Cases

Unknown or Unreviewed Skills

Default: High Risk (VM isolation)

Even if skill appears low risk, use VM for first test of:

  • Skills from external repositories
  • Skills without documentation
  • Skills with obfuscated code
  • Skills from untrusted authors

Skills in Active Development

Recommendation: Medium Risk (Docker)

For your own skills during development:

  • Start with Git Worktree for speed
  • Use Docker before committing
  • Use VM before public release

Skills from Marketplace

Recommendation: Follow listed risk level

Trusted marketplace skills can use their documented risk level.

Override Cases

User can always override automatic detection:

test skill low-risk-skill in vm     # More isolation than needed (safe but slow)
test skill high-risk-skill in docker # Less isolation (not recommended)

Warn user if choosing lower isolation than recommended.

Risk Re-assessment

Re-assess risk if skill is updated:

  • Major version changes
  • New dependencies added
  • New file operations
  • Expanded scope

Decision Tree

Start
  |
  ├─ Does skill read files only?
  |    └─ YES → Low Risk (Worktree)
  |    └─ NO → Continue
  |
  ├─ Does skill install packages or modify files?
  |    └─ YES → Medium Risk (Docker)
  |    └─ NO → Continue
  |
  ├─ Does skill modify system configs or use sudo?
  |    └─ YES → High Risk (VM)
  |    └─ NO → Continue
  |
  └─ Is skill from untrusted source?
       └─ YES → High Risk (VM)
       └─ NO → Medium Risk (Docker)

Example Assessments

Example 1: "code-formatter"

Description: Formats JavaScript/TypeScript files using prettier

Analysis:

  • Reads files: Yes (score: +1)
  • Writes files: Yes (score: +3)
  • System commands: No
  • Dependencies: prettier (npm package) (score: +3)
  • File paths: Current directory only

Total Score: 7 Risk Level: Medium → Docker

Reasoning: Modifies files but limited to project directory. Docker provides adequate isolation.

Example 2: "log-analyzer"

Description: Parses log files and generates HTML report

Analysis:

  • Reads files: Yes (score: +1)
  • Writes files: Yes (HTML report) (score: +3)
  • System commands: No
  • Dependencies: None
  • File paths: Current directory + /tmp for temp files (score: +2)

Total Score: 6 Risk Level: Medium → Docker

Reasoning: Safe operations but creates files. Docker ensures clean testing.

Example 3: "system-auditor"

Description: Audits system security configuration

Analysis:

  • Reads files: Yes, including /etc/ (score: +1 + 8)
  • System commands: Runs systemctl, checks services (score: +10)
  • Dependencies: System tools
  • File paths: /etc/, /var/log/ (score: +8)

Total Score: 27 Risk Level: High → VM

Reasoning: Accesses sensitive system directories and uses system commands. VM required.

Example 4: "markdown-linter"

Description: Checks markdown files for style violations

Analysis:

  • Reads files: Yes (score: +1)
  • Writes files: No (only stdout)
  • System commands: No
  • Dependencies: None
  • File paths: Current directory only

Total Score: 1 Risk Level: Low → Git Worktree

Reasoning: Pure read-only analysis. Worktree is sufficient and fast.


Remember: When in doubt, choose higher isolation. It's better to be safe than to clean up a compromised system. Speed is secondary to security.