gh-dhofheinz-open-plugins-p…/commands/security-scan/check-urls.md
2025-11-29 18:20:28 +08:00


## Operation: Check URL Safety
Validate URL safety, enforce HTTPS, and detect malicious patterns in URLs and code.
### Parameters from $ARGUMENTS
- **path**: Target directory or file to scan (required)
- **https-only**: Enforce HTTPS for all URLs (true|false, default: false)
- **allow-localhost**: Allow http://localhost URLs (true|false, default: true)
- **check-code-patterns**: Check for dangerous code execution patterns (true|false, default: true)
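
The parameters above can be read from a `key:value` argument string such as the ones in the Examples section. The following is a minimal sketch, not the plugin's actual parser; `parse_arguments` is a hypothetical helper name, and the rule that unknown tokens are skipped is an assumption.

```python
def parse_arguments(raw):
    """Parse 'key:value' tokens into an options dict (hypothetical helper)."""
    # Defaults mirror the parameter list in this document.
    defaults = {
        "path": None,
        "https-only": False,
        "allow-localhost": True,
        "check-code-patterns": True,
    }
    options = dict(defaults)
    for token in raw.split():
        key, separator, value = token.partition(":")
        if not separator or key not in defaults:
            continue  # assumption: skip tokens such as the "urls" subcommand
        if key == "path":
            options[key] = value
        else:
            options[key] = value.lower() == "true"
    if options["path"] is None:
        raise ValueError("path is required")
    return options
```

In this sketch any boolean value other than `true` is treated as false, and paths containing spaces are not handled.
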
### URL Safety Checks
**Protocol Validation**:
- HTTPS enforcement (production contexts)
- HTTP allowed only for localhost/127.0.0.1
- FTP/telnet flagged as insecure
- file:// protocol flagged (potential security risk)
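
One way the protocol rules above could be expressed is sketched below. It is an illustration under assumptions, not the validator's real logic: `check_protocol` and the issue labels it returns are hypothetical names, and deciding severity is left to the caller.

```python
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1"}
INSECURE_SCHEMES = {"ftp", "telnet"}

def check_protocol(url, allow_localhost=True):
    """Return a hypothetical issue label for the URL's scheme, or None if acceptable."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if scheme == "https":
        return None
    if scheme == "http":
        if host in LOCAL_HOSTS:
            # HTTP is tolerated only for local hosts, and only when allowed.
            return None if allow_localhost else "http_localhost"
        return "insecure_http"
    if scheme in INSECURE_SCHEMES:
        return "insecure_protocol"
    if scheme == "file":
        return "file_protocol"
    return None  # other schemes are outside the rules listed above
```
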
**Malicious Patterns**:
- `curl ... | sh` - Remote code execution
- `wget ... | bash` - Remote script execution
- `eval(fetch(...))` - Dynamic code execution
- `exec(...)` with URLs - Command injection risk
- `rm -rf` in scripts downloaded from URLs
- Obfuscated URLs (base64, hex encoded)
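
Several of the patterns above can be approximated with line-based regular expressions. The sketch below is illustrative only: the regexes, the pattern names, and `find_dangerous_patterns` are assumptions, and simple regexes like these will miss multi-line or obfuscated variants.

```python
import re

# Illustrative regexes; the actual validator's rules are not shown in this document.
DANGEROUS_PATTERNS = {
    # curl ... | sh, wget ... | bash
    "pipe_to_shell": re.compile(
        r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z|da)?sh\b"
    ),
    # eval "$(curl ...)"
    "eval_command_substitution": re.compile(r'\beval\s+"?\$\(\s*(?:curl|wget)\b'),
    # bash <(curl ...)
    "process_substitution": re.compile(r"\b(?:ba|z|da)?sh\s+<\(\s*(?:curl|wget)\b"),
    # eval(fetch(...)) and close variants
    "eval_fetch": re.compile(r"\beval\s*\([^)]*\bfetch\s*\("),
}

def find_dangerous_patterns(text):
    """Return (line_number, pattern_name) pairs for every match, in file order."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in DANGEROUS_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits
```
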
**Domain Validation**:
- Check for typosquatting (common package registries)
- Suspicious TLDs (.tk, .ml, .ga, .cf)
- IP addresses instead of domains
- Shortened URLs (bit.ly, tinyurl) - potential phishing
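
The TLD, IP-address, and shortener checks above might look roughly like this. It is a sketch under assumptions: `check_domain` and its issue labels are hypothetical, loopback addresses are exempted here on the assumption that the localhost rules handle them, and typosquatting is not covered.

```python
import ipaddress
from urllib.parse import urlsplit

SUSPICIOUS_TLDS = {"tk", "ml", "ga", "cf"}
URL_SHORTENERS = {"bit.ly", "tinyurl.com"}

def check_domain(url):
    """Return a list of hypothetical issue labels for the URL's host."""
    host = (urlsplit(url).hostname or "").lower()
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        address = None  # not an IP literal, so treat it as a domain name
    if address is not None:
        # Assumption: loopback addresses are covered by the localhost rules.
        return [] if address.is_loopback else ["ip_address"]
    issues = []
    if host in URL_SHORTENERS:
        issues.append("shortened_url")
    if host.rpartition(".")[2] in SUSPICIOUS_TLDS:
        issues.append("suspicious_tld")
    return issues
```
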
### Workflow
1. **Parse arguments**
```
Extract path, https-only, allow-localhost, check-code-patterns
Validate path exists
Determine scan scope
```
2. **Execute URL validator**
```bash
# Run the validator; arguments are positional
python3 .scripts/url-validator.py "$path" "$https_only" "$allow_localhost" "$check_code_patterns"
status=$?
# Exit codes:
#   0: All URLs safe
#   1: Unsafe URLs detected
#   2: Validation error
```
3. **Analyze results**
```
Categorize findings:
- CRITICAL: Remote code execution patterns
- HIGH: Non-HTTPS in production, obfuscated URLs
- MEDIUM: HTTP to non-localhost hosts outside production contexts, suspicious TLDs
- LOW: Shortened URLs, IP addresses
Generate context-aware remediation
```
4. **Format output**
```
URL Safety Scan Results
━━━━━━━━━━━━━━━━━━━━━━
Path: <path>
URLs Scanned: <count>

CRITICAL Issues (<count>):
  ❌ <file>:<line>: Remote code execution pattern
     Pattern: curl https://example.com/script.sh | bash
     Risk: Executes arbitrary code without verification
     Remediation: Download, verify, then execute

HIGH Issues (<count>):
  ⚠️ <file>:<line>: Non-HTTPS URL in production context
     URL: http://api.example.com
     Risk: Man-in-the-middle attacks
     Remediation: Use HTTPS

Summary:
  - Total URLs: <count>
  - Safe: <count>
  - Unsafe: <count>
  - Action required: <yes|no>
```
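
The severity categories and exit codes from the workflow can be tied together as in the sketch below. The issue labels, `SEVERITY_BY_ISSUE`, `count_by_severity`, and `exit_code` are hypothetical names chosen for illustration; only the severity levels and the meaning of exit codes 0, 1, and 2 come from this document.

```python
# Hypothetical issue labels mapped to the severities listed in step 3.
SEVERITY_BY_ISSUE = {
    "remote_code_execution": "CRITICAL",
    "non_https_production": "HIGH",
    "obfuscated_url": "HIGH",
    "http_non_localhost": "MEDIUM",
    "suspicious_tld": "MEDIUM",
    "shortened_url": "LOW",
    "ip_address": "LOW",
}
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]

def count_by_severity(issues):
    """Count findings per severity level, keeping the report order."""
    counts = {level: 0 for level in SEVERITY_ORDER}
    for issue in issues:
        counts[SEVERITY_BY_ISSUE[issue]] += 1
    return counts

def exit_code(issues, validation_error=False):
    """Map scan results to the exit codes described in step 2."""
    if validation_error:
        return 2
    return 1 if issues else 0
```
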
### Dangerous Code Patterns
**Remote Execution** (CRITICAL):
```bash
# Dangerous patterns
curl https://example.com/install.sh | bash
wget -qO- https://get.example.com | sh
eval "$(curl -fsSL https://example.com/script)"
bash <(curl -s https://example.com/setup.sh)
```
**Dynamic Code Execution** (HIGH):
```javascript
// Dangerous patterns: each runs code or commands built from a remote or untrusted URL
eval(await (await fetch(url)).text())
new Function(await (await fetch(url)).text())()
exec(`curl ${url}`)
```
**Command Injection** (HIGH):
```bash
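# Vulnerable patterns
wget $USER_INPUT        # unquoted: word splitting lets the input add options or extra arguments
curl "$UNTRUSTED_URL"   # quoted, but the URL itself is never validated
git clone $URL          # without validation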
```
### Safe Alternatives
**Instead of curl | sh**:
```bash
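# Safer: download, verify, review, then execute
wget https://example.com/install.sh
wget https://example.com/install.sh.sha256   # assumes the publisher provides a checksum file
sha256sum -c install.sh.sha256 || exit 1     # stop if the checksum does not match
less install.sh                              # review the script before running it
chmod +x install.sh
./install.sh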
```
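
The same download-then-verify step can be done from Python when `sha256sum` is not available. This is a minimal sketch; `sha256_of_file` and `verify_download` are hypothetical helper names, and the expected digest is assumed to come from a source you trust independently of the download.

```python
import hashlib
import hmac

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path, expected_hex):
    """Return True only if the file's SHA-256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

Only run the script if `verify_download` returns `True`.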
**Instead of eval(fetch())**:
```javascript
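// Safe: Fetch as data, validate, then use
const response = await fetch(url);
if (!response.ok) throw new Error(`Request failed: ${response.status}`);
const data = await response.json();
// Process the result as data; never pass it to eval() or new Function()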
```
### Examples
```bash
# Check all URLs, enforce HTTPS
/security-scan urls path:. https-only:true
# Allow localhost HTTP during development
/security-scan urls path:. https-only:true allow-localhost:true
# Check for code execution patterns
/security-scan urls path:./scripts/ check-code-patterns:true
# Scan specific file
/security-scan urls path:./install.sh
```
### Error Handling
**Path not found**:
```
ERROR: Path does not exist: <path>
Remediation: Verify path and try again
```
**No URLs found**:
```
INFO: No URLs detected
No action required
```
**Python unavailable**:
```
ERROR: Python3 not available
Remediation: Install Python 3.x or skip URL validation
```
### Context-Aware Rules
**Production contexts** (strict):
- package.json scripts
- Dockerfiles
- CI/CD configs (.github/, .gitlab-ci.yml)
- Installation scripts (install.sh, setup.sh)
→ Enforce HTTPS, no remote execution
**Development contexts** (relaxed):
- Test files (*test*, *spec*)
- Mock data
- Local development configs
→ Allow HTTP for localhost
**Documentation contexts** (informational):
- README.md
- *.md files
- Comments
→ Flag but don't fail
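
The file-based parts of these rules could be approximated from the path alone, as sketched below. `classify_context`, the precedence order (production first, then development, then documentation), and the `"unspecified"` fallback are assumptions; mock data, local development configs, and comments are not detectable from a filename and are left out.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

PRODUCTION_NAMES = {"package.json", "dockerfile", ".gitlab-ci.yml", "install.sh", "setup.sh"}

def classify_context(path):
    """Guess the scan context for a file path (hypothetical helper)."""
    parsed = PurePosixPath(path)
    name = parsed.name.lower()
    parts = [part.lower() for part in parsed.parts]
    # Assumption: production rules win when several contexts could apply.
    if name in PRODUCTION_NAMES or ".github" in parts:
        return "production"
    if fnmatchcase(name, "*test*") or fnmatchcase(name, "*spec*"):
        return "development"
    if name.endswith(".md"):
        return "documentation"
    return "unspecified"
```
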
### URL Categories
**Registry URLs** (validate carefully):
- npm: https://registry.npmjs.org
- PyPI: https://pypi.org
- Docker: https://registry.hub.docker.com
- GitHub: https://github.com
→ Verify exact domain, check for typosquatting
**CDN URLs** (HTTPS required):
- https://cdn.jsdelivr.net
- https://unpkg.com
- https://cdnjs.cloudflare.com
→ Must use HTTPS, verify integrity hashes
**Shortened URLs** (flag for review):
- bit.ly, tinyurl.com, goo.gl
→ Cannot verify destination, recommend expanding
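
A rough typosquatting check against the registry hosts listed above can be built on string similarity. The sketch below uses `difflib.SequenceMatcher` from the standard library; `find_typosquat_target` and the 0.85 threshold are assumptions chosen for illustration, and a heuristic like this can produce both false positives and false negatives.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

# Hosts taken from the registry list in this document.
KNOWN_REGISTRY_HOSTS = [
    "registry.npmjs.org",
    "pypi.org",
    "registry.hub.docker.com",
    "github.com",
]

def find_typosquat_target(url, threshold=0.85):
    """Return the known host this URL's host resembles, or None."""
    host = (urlsplit(url).hostname or "").lower()
    if host in KNOWN_REGISTRY_HOSTS:
        return None  # exact match: the legitimate domain
    for known in KNOWN_REGISTRY_HOSTS:
        if SequenceMatcher(None, host, known).ratio() >= threshold:
            return known
    return None
```
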
### Remediation Guidance
**For remote code execution**:
1. Remove pipe-to-shell patterns
2. Download scripts explicitly
3. Verify checksums/signatures
4. Review code before execution
5. Use official package managers when possible
**For non-HTTPS URLs**:
1. Update to HTTPS version
2. Verify certificate validity
3. Pin certificate if highly sensitive
4. Consider using subresource integrity (SRI) for CDNs
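
For the SRI step, an integrity value is the hash algorithm name, a hyphen, and the base64-encoded digest of the exact file bytes. The sketch below computes one; `sri_hash` is a hypothetical helper name, and the content is assumed to be the byte-for-byte file served by the CDN.

```python
import base64
import hashlib

SRI_ALGORITHMS = ("sha256", "sha384", "sha512")

def sri_hash(content, algorithm="sha384"):
    """Build a Subresource Integrity value such as 'sha384-<base64 digest>'."""
    if algorithm not in SRI_ALGORITHMS:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"
```

The result goes into the `integrity` attribute of the `<script>` or `<link>` tag that loads the CDN resource.
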
**For suspicious URLs**:
1. Verify domain legitimacy
2. Check for typosquatting
3. Expand shortened URLs
4. Review destination manually
### Output Format
```json
{
  "scan_type": "urls",
  "path": "<path>",
  "urls_scanned": <count>,
  "unsafe_urls": <count>,
  "severity_breakdown": {
    "critical": <count>,
    "high": <count>,
    "medium": <count>,
    "low": <count>
  },
  "findings": [
    {
      "file": "<file_path>",
      "line": <line_number>,
      "url": "<url>",
      "issue": "<issue_type>",
      "severity": "<severity>",
      "risk": "<risk_description>",
      "remediation": "<action>"
    }
  ],
  "action_required": <boolean>
}
```
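
A report in this shape can be assembled from a list of findings as sketched below. `build_report` is a hypothetical helper; counting `unsafe_urls` as distinct URLs with at least one finding, and setting `action_required` whenever any finding exists, are assumptions not stated in this document.

```python
import json

def build_report(path, urls_scanned, findings):
    """Assemble the JSON-serializable report structure from finding dicts."""
    breakdown = {"critical": 0, "high": 0, "medium": 0, "low": 0}
    for finding in findings:
        breakdown[finding["severity"].lower()] += 1
    return {
        "scan_type": "urls",
        "path": path,
        "urls_scanned": urls_scanned,
        # Assumption: count each unsafe URL once, even with several findings.
        "unsafe_urls": len({finding["url"] for finding in findings}),
        "severity_breakdown": breakdown,
        "findings": findings,
        # Assumption: any finding at all requires action.
        "action_required": bool(findings),
    }

def render_report(report):
    """Serialize the report with two-space indentation."""
    return json.dumps(report, indent=2)
```
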
**Request**: $ARGUMENTS