6.2 KiB
6.2 KiB
Operation: Check URL Safety
Validate URL safety, enforce HTTPS, and detect malicious patterns in URLs and code.
Parameters from $ARGUMENTS
- path: Target directory or file to scan (required)
- https-only: Enforce HTTPS for all URLs (true|false, default: false)
- allow-localhost: Allow http://localhost URLs (true|false, default: true)
- check-code-patterns: Check for dangerous code execution patterns (true|false, default: true)
URL Safety Checks
Protocol Validation:
- HTTPS enforcement (production contexts)
- HTTP allowed only for localhost/127.0.0.1
- FTP/telnet flagged as insecure
- file:// protocol flagged (potential security risk)
Malicious Patterns:
curl ... | sh- Remote code executionwget ... | bash- Remote script executioneval(fetch(...))- Dynamic code executionexec(...)with URLs - Command injection riskrm -rfin scripts downloaded from URLs- Obfuscated URLs (base64, hex encoded)
Domain Validation:
- Check for typosquatting (common package registries)
- Suspicious TLDs (.tk, .ml, .ga, .cf)
- IP addresses instead of domains
- Shortened URLs (bit.ly, tinyurl) - potential phishing
Workflow
-
Parse arguments
Extract path, https-only, allow-localhost, check-code-patterns Validate path exists Determine scan scope -
Execute URL validator
Execute .scripts/url-validator.py "$path" "$https_only" "$allow_localhost" "$check_code_patterns" Returns: - 0: All URLs safe - 1: Unsafe URLs detected - 2: Validation error -
Analyze results
Categorize findings: - CRITICAL: Remote code execution patterns - HIGH: Non-HTTPS in production, obfuscated URLs - MEDIUM: HTTP in non-localhost, suspicious TLDs - LOW: Shortened URLs, IP addresses Generate context-aware remediation -
Format output
URL Safety Scan Results ━━━━━━━━━━━━━━━━━━━━━━ Path: <path> URLs Scanned: <count> CRITICAL Issues (<count>): ❌ <file>:<line>: Remote code execution pattern Pattern: curl https://example.com/script.sh | bash Risk: Executes arbitrary code without verification Remediation: Download, verify, then execute HIGH Issues (<count>): ⚠️ <file>:<line>: Non-HTTPS URL in production context URL: http://api.example.com Risk: Man-in-the-middle attacks Remediation: Use HTTPS Summary: - Total URLs: <count> - Safe: <count> - Unsafe: <count> - Action required: <yes|no>
Dangerous Code Patterns
Remote Execution (CRITICAL):
# Dangerous patterns
curl https://example.com/install.sh | bash
wget -qO- https://get.example.com | sh
eval "$(curl -fsSL https://example.com/script)"
bash <(curl -s https://example.com/setup.sh)
Dynamic Code Execution (HIGH):
// Dangerous patterns
eval(fetch(url).then(r => r.text()))
new Function(await fetch(url).text())()
exec(`curl ${url}`)
Command Injection (HIGH):
# Vulnerable patterns
wget $USER_INPUT
curl "$UNTRUSTED_URL"
git clone $URL # without validation
Safe Alternatives
Instead of curl | sh:
# Safe: Download, verify, then execute
wget https://example.com/install.sh
sha256sum -c install.sh.sha256
chmod +x install.sh
./install.sh
Instead of eval(fetch()):
// Safe: Fetch as data, validate, then use
const response = await fetch(url);
const data = await response.json();
// Process data, not as code
Examples
# Check all URLs, enforce HTTPS
/security-scan urls path:. https-only:true
# Allow localhost HTTP during development
/security-scan urls path:. https-only:true allow-localhost:true
# Check for code execution patterns
/security-scan urls path:./scripts/ check-code-patterns:true
# Scan specific file
/security-scan urls path:./install.sh
Error Handling
Path not found:
ERROR: Path does not exist: <path>
Remediation: Verify path and try again
No URLs found:
INFO: No URLs detected
No action required
Python unavailable:
ERROR: Python3 not available
Remediation: Install Python 3.x or skip URL validation
Context-Aware Rules
Production contexts (strict):
- package.json scripts
- Dockerfiles
- CI/CD configs (.github/, .gitlab-ci.yml)
- Installation scripts (install.sh, setup.sh) → Enforce HTTPS, no remote execution
Development contexts (relaxed):
- Test files (test, spec)
- Mock data
- Local development configs → Allow HTTP for localhost
Documentation contexts (informational):
- README.md
- *.md files
- Comments → Flag but don't fail
URL Categories
Registry URLs (validate carefully):
- npm: https://registry.npmjs.org
- PyPI: https://pypi.org
- Docker: https://registry.hub.docker.com
- GitHub: https://github.com → Verify exact domain, check for typosquatting
CDN URLs (HTTPS required):
- https://cdn.jsdelivr.net
- https://unpkg.com
- https://cdnjs.cloudflare.com → Must use HTTPS, verify integrity hashes
Shortened URLs (flag for review):
- bit.ly, tinyurl.com, goo.gl → Cannot verify destination, recommend expanding
Remediation Guidance
For remote code execution:
- Remove pipe-to-shell patterns
- Download scripts explicitly
- Verify checksums/signatures
- Review code before execution
- Use official package managers when possible
For non-HTTPS URLs:
- Update to HTTPS version
- Verify certificate validity
- Pin certificate if highly sensitive
- Consider using subresource integrity (SRI) for CDNs
For suspicious URLs:
- Verify domain legitimacy
- Check for typosquatting
- Expand shortened URLs
- Review destination manually
Output Format
{
"scan_type": "urls",
"path": "<path>",
"urls_scanned": <count>,
"unsafe_urls": <count>,
"severity_breakdown": {
"critical": <count>,
"high": <count>,
"medium": <count>,
"low": <count>
},
"findings": [
{
"file": "<file_path>",
"line": <line_number>,
"url": "<url>",
"issue": "<issue_type>",
"severity": "<severity>",
"risk": "<risk_description>",
"remediation": "<action>"
}
],
"action_required": <boolean>
}
Request: $ARGUMENTS