Initial commit
284
skills/appsec/sast-semgrep/SKILL.md
Normal file
@@ -0,0 +1,284 @@
|
||||
---
|
||||
name: sast-semgrep
|
||||
description: >
|
||||
Static application security testing (SAST) using Semgrep for vulnerability detection,
|
||||
security code review, and secure coding guidance with OWASP and CWE framework mapping.
|
||||
Use when: (1) Scanning code for security vulnerabilities across multiple languages,
|
||||
(2) Performing security code reviews with pattern-based detection, (3) Integrating
|
||||
SAST checks into CI/CD pipelines, (4) Providing remediation guidance with OWASP Top 10
|
||||
and CWE mappings, (5) Creating custom security rules for organization-specific patterns,
|
||||
(6) Analyzing dependencies for known vulnerabilities.
|
||||
version: 0.1.0
|
||||
maintainer: SirAppSec
|
||||
category: appsec
|
||||
tags: [sast, semgrep, vulnerability-scanning, code-security, owasp, cwe, security-review]
|
||||
frameworks: [OWASP, CWE, SANS-25]
|
||||
dependencies:
|
||||
python: ">=3.8"
|
||||
packages: [semgrep]
|
||||
tools: [git]
|
||||
references:
|
||||
- https://semgrep.dev/docs/
|
||||
- https://owasp.org/Top10/
|
||||
- https://cwe.mitre.org/
|
||||
---
|
||||
|
||||
# SAST with Semgrep
|
||||
|
||||
## Overview
|
||||
|
||||
Perform comprehensive static application security testing using Semgrep, a fast, open-source
|
||||
static analysis tool. This skill provides automated vulnerability detection, security code
|
||||
review workflows, and remediation guidance mapped to OWASP Top 10 and CWE standards.
|
||||
|
||||
## Quick Start
|
||||
|
||||
Scan a codebase for security vulnerabilities:
|
||||
|
||||
```bash
|
||||
semgrep --config=auto --severity=ERROR --severity=WARNING /path/to/code
|
||||
```
|
||||
|
||||
Run with OWASP Top 10 ruleset:
|
||||
|
||||
```bash
|
||||
semgrep --config="p/owasp-top-ten" /path/to/code
|
||||
```
|
||||
|
||||
## Core Workflows
|
||||
|
||||
### Workflow 1: Initial Security Scan
|
||||
|
||||
1. Identify the primary languages in the codebase
|
||||
2. Run `scripts/semgrep_scan.py` with appropriate rulesets
|
||||
3. Parse findings and categorize by severity (CRITICAL, HIGH, MEDIUM, LOW)
|
||||
4. Map findings to OWASP Top 10 and CWE categories
|
||||
5. Generate prioritized remediation report
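
Steps 3 and 4 can be sketched in a few lines of Python. This is a minimal illustration, not the bundled `scripts/semgrep_scan.py`: the function name `group_by_cwe` and the `SAMPLE` data are hypothetical, and it assumes rules carry a `cwe` entry in their metadata (as a string or a list, both of which appear in this repository's rule examples).

```python
import json
from collections import defaultdict


def group_by_cwe(semgrep_json: str) -> dict:
    """Group findings from `semgrep --json` output by CWE identifier."""
    report = json.loads(semgrep_json)
    grouped = defaultdict(list)
    for finding in report.get("results", []):
        metadata = finding.get("extra", {}).get("metadata", {})
        cwes = metadata.get("cwe", [])
        if isinstance(cwes, str):  # a rule may store one string or a list
            cwes = [cwes]
        for cwe in cwes or ["UNMAPPED"]:
            # Keep only the identifier: "CWE-89: SQL Injection" -> "CWE-89"
            cwe_id = cwe.split(":")[0].strip()
            location = f'{finding["path"]}:{finding["start"]["line"]}'
            grouped[cwe_id].append(f'{location} {finding["check_id"]}')
    return dict(grouped)


# Hypothetical sample shaped like Semgrep's JSON output
SAMPLE = json.dumps({"results": [
    {"check_id": "example-sql-injection", "path": "app/db.py",
     "start": {"line": 12},
     "extra": {"severity": "ERROR",
               "metadata": {"cwe": ["CWE-89: SQL Injection"]}}},
    {"check_id": "example-hardcoded-secret", "path": "app/settings.py",
     "start": {"line": 3},
     "extra": {"severity": "WARNING", "metadata": {"cwe": "CWE-798"}}},
]})

if __name__ == "__main__":
    for cwe_id, locations in sorted(group_by_cwe(SAMPLE).items()):
        print(cwe_id, locations)
```

Findings without CWE metadata land in an `UNMAPPED` bucket so they are not silently dropped from the report.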
|
||||
|
||||
### Workflow 2: Security Code Review
|
||||
|
||||
1. For pull requests or commits, run targeted scans on changed files
|
||||
2. Use `semgrep --baseline-commit <base-ref>` to report only findings introduced by the change
|
||||
3. Flag high-severity findings as blocking issues
|
||||
4. Provide inline remediation guidance from `references/remediation_guide.md`
|
||||
5. Link findings to secure coding patterns
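
A review gate along the lines of step 3 might look like the sketch below. It assumes Semgrep's JSON output and treats `ERROR` as the blocking severity; that threshold, and the names `blocking_findings` and `gate_exit_code`, are illustrative choices rather than part of this skill's scripts.

```python
import json

# Assumption: only ERROR-severity findings block a merge
BLOCKING_SEVERITIES = frozenset({"ERROR"})


def blocking_findings(semgrep_json: str, blocking=BLOCKING_SEVERITIES) -> list:
    """Return one formatted line per finding whose severity is blocking."""
    report = json.loads(semgrep_json)
    return [
        f'[{r["extra"]["severity"]}] {r["check_id"]} - '
        f'{r["path"]}:{r["start"]["line"]}'
        for r in report.get("results", [])
        if r.get("extra", {}).get("severity") in blocking
    ]


def gate_exit_code(semgrep_json: str) -> int:
    """Exit code for CI: 1 if any blocking finding exists, otherwise 0."""
    return 1 if blocking_findings(semgrep_json) else 0
```

Non-blocking findings can still be posted as review comments; only the exit code decides whether the merge is held.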
|
||||
|
||||
### Workflow 3: Custom Rule Development
|
||||
|
||||
1. Identify organization-specific security patterns to detect
|
||||
2. Create custom Semgrep rules in YAML format using `assets/rule_template.yaml`
|
||||
3. Test rules against known vulnerable code samples
|
||||
4. Integrate custom rules into CI/CD pipeline
|
||||
5. Document rules in `references/custom_rules.md`
|
||||
|
||||
### Workflow 4: CI/CD Integration
|
||||
|
||||
1. Add Semgrep to CI/CD pipeline using `assets/ci_config_examples/`
|
||||
2. Configure baseline scanning for pull requests
|
||||
3. Set severity thresholds (fail on CRITICAL/HIGH)
|
||||
4. Generate SARIF output for security dashboards
|
||||
5. Track metrics: vulnerabilities found, fix rate, false positives
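
The metrics in step 5 are simple ratios. The sketch below uses one possible set of definitions, which are assumptions rather than something this skill prescribes: fix rate is fixed findings over true positives, and false positive rate is false positives over all findings.

```python
from dataclasses import dataclass


@dataclass
class ScanMetrics:
    found: int            # total findings reported
    fixed: int            # true positives that have been remediated
    false_positives: int  # findings triaged as not real issues

    @property
    def fix_rate(self) -> float:
        """Share of true positives that have been fixed."""
        true_positives = self.found - self.false_positives
        return self.fixed / true_positives if true_positives else 0.0

    @property
    def false_positive_rate(self) -> float:
        """Share of all findings that were false positives."""
        return self.false_positives / self.found if self.found else 0.0


if __name__ == "__main__":
    metrics = ScanMetrics(found=40, fixed=24, false_positives=8)
    print(f"fix rate: {metrics.fix_rate:.0%}")
    print(f"false positive rate: {metrics.false_positive_rate:.0%}")
```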
|
||||
|
||||
## Security Considerations
|
||||
|
||||
- **Sensitive Data Handling**: Semgrep scans code locally; ensure scan results don't leak
|
||||
secrets or proprietary code patterns. Use `--max-lines-per-finding` to limit output.
|
||||
|
||||
- **Access Control**: Semgrep scans require read access to source code. Restrict scan
|
||||
result access to authorized security and development teams.
|
||||
|
||||
- **Audit Logging**: Log all scan executions with timestamps, user, commit hash, and
|
||||
findings count for compliance auditing.
|
||||
|
||||
- **Compliance**: SAST scan records can serve as evidence for secure-development controls in SOC 2, PCI DSS, and GDPR programs.
  Maintain scan history and remediation tracking.
|
||||
|
||||
- **Safe Defaults**: Use `--config=auto` for balanced detection; it requires metrics to be enabled, so it cannot be combined with `--metrics=off`. For security-critical
|
||||
applications, use `--config="p/security-audit"` for comprehensive coverage.
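
The audit-logging item above can be illustrated with a small helper that emits one JSON line per scan. The field names and the `audit_record` function are hypothetical; adapt them to whatever log schema your compliance process expects.

```python
import json
from datetime import datetime, timezone


def audit_record(user: str, commit: str, findings_count: int, now=None) -> str:
    """Build a one-line JSON audit entry for a scan execution."""
    now = now or datetime.now(timezone.utc)
    return json.dumps({
        "timestamp": now.isoformat(),  # UTC, ISO 8601
        "user": user,
        "commit": commit,
        "findings_count": findings_count,
    }, sort_keys=True)


if __name__ == "__main__":
    # Append this line to a write-once log or forward it to a log collector
    print(audit_record("ci-bot", "abc1234", 3))
```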
|
||||
|
||||
## Language Support
|
||||
|
||||
Semgrep supports 30+ languages including:
|
||||
- **Web**: JavaScript, TypeScript, Python, Ruby, PHP, Java, C#, Go
|
||||
- **Mobile**: Swift, Kotlin, Java (Android)
|
||||
- **Infrastructure**: Terraform, Dockerfile, YAML, JSON
|
||||
- **Other**: C, C++, Rust, Scala, Solidity
|
||||
|
||||
## Bundled Resources
|
||||
|
||||
### Scripts
|
||||
|
||||
- `scripts/semgrep_scan.py` - Full-featured scanning with OWASP/CWE mapping and reporting
|
||||
- `scripts/baseline_scan.sh` - Quick baseline scan for CI/CD
|
||||
- `scripts/diff_scan.sh` - Scan only changed files (for PRs)
|
||||
|
||||
### References
|
||||
|
||||
- `references/owasp_cwe_mapping.md` - OWASP Top 10 to CWE mapping with Semgrep rules
|
||||
- `references/remediation_guide.md` - Vulnerability remediation patterns by category
|
||||
- `references/rule_library.md` - Curated list of useful Semgrep rulesets
|
||||
|
||||
### Assets
|
||||
|
||||
- `assets/rule_template.yaml` - Template for creating custom Semgrep rules
|
||||
- `assets/ci_config_examples/` - CI/CD integration examples (GitHub Actions, GitLab CI)
|
||||
- `assets/semgrep_config.yaml` - Recommended Semgrep configuration
|
||||
|
||||
## Common Patterns
|
||||
|
||||
### Pattern 1: Daily Security Baseline Scan
|
||||
|
||||
```bash
|
||||
# Run comprehensive scan and generate report
|
||||
scripts/semgrep_scan.py --config security-audit \
|
||||
--output results.json \
|
||||
--format json \
|
||||
--severity HIGH CRITICAL
|
||||
```
|
||||
|
||||
### Pattern 2: Pull Request Security Gate
|
||||
|
||||
```bash
|
||||
# Scan only changed files, fail on HIGH/CRITICAL
|
||||
scripts/diff_scan.sh --fail-on high \
|
||||
--base-branch main \
|
||||
--output sarif
|
||||
```
|
||||
|
||||
### Pattern 3: Vulnerability Research
|
||||
|
||||
```bash
|
||||
# Search for specific vulnerability patterns
|
||||
semgrep --config "r/javascript.lang.security.audit.xss" \
|
||||
--json /path/to/code | jq '.results'
|
||||
```
|
||||
|
||||
### Pattern 4: Custom Rule Validation
|
||||
|
||||
```bash
|
||||
# Test custom rule against vulnerable samples
|
||||
semgrep --config assets/custom_rules.yaml \
|
||||
--test tests/vulnerable_samples/
|
||||
```
|
||||
|
||||
## Integration Points
|
||||
|
||||
### CI/CD Integration
|
||||
|
||||
- **GitHub Actions**: Use `semgrep/semgrep-action@v1` with SARIF upload
|
||||
- **GitLab CI**: Run as security scanning job with artifact reports
|
||||
- **Jenkins**: Execute as build step with quality gate integration
|
||||
- **pre-commit hooks**: Run lightweight scans on staged files
|
||||
|
||||
See `assets/ci_config_examples/` for ready-to-use configurations.
|
||||
|
||||
### Security Tool Integration
|
||||
|
||||
- **SIEM/SOAR**: Export findings in JSON/SARIF for ingestion
|
||||
- **Vulnerability Management**: Integrate with Jira, DefectDojo, or ThreadFix
|
||||
- **IDE Integration**: Use Semgrep IDE plugins for real-time detection
|
||||
- **Secret Scanning**: Combine with tools like trufflehog, gitleaks
|
||||
|
||||
### SDLC Integration
|
||||
|
||||
- **Requirements Phase**: Define security requirements and custom rules
|
||||
- **Development**: IDE plugins provide real-time feedback
|
||||
- **Code Review**: Automated security review in PR workflow
|
||||
- **Testing**: Integrate with security testing framework
|
||||
- **Deployment**: Final security gate before production
|
||||
|
||||
## Severity Classification
|
||||
|
||||
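This skill classifies findings into the severity levels below. Rules based on `assets/rule_template.yaml` report `ERROR`, `WARNING`, or `INFO`, which is why the CI examples filter on those values: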
|
||||
|
||||
- **CRITICAL**: Exploitable vulnerabilities (SQLi, RCE, Auth bypass)
|
||||
- **HIGH**: Significant security risks (XSS, CSRF, sensitive data exposure)
|
||||
- **MEDIUM**: Security weaknesses (weak crypto, missing validation)
|
||||
- **LOW**: Code quality issues with security implications
|
||||
- **INFO**: Security best practice recommendations
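
One way to derive these levels from rule output is sketched below. The mapping is an assumption made for illustration (the bundled scripts may map differently): it starts from the rule's `severity` and promotes a finding to CRITICAL only when the rule metadata marks both `impact` and `likelihood` as HIGH.

```python
# Hypothetical mapping from rule severity to this skill's levels
SEMGREP_TO_SKILL = {"ERROR": "HIGH", "WARNING": "MEDIUM", "INFO": "INFO"}


def skill_severity(semgrep_severity: str, impact: str = "", likelihood: str = "") -> str:
    """Map a rule severity plus optional metadata to a skill severity level."""
    level = SEMGREP_TO_SKILL.get(semgrep_severity.upper(), "LOW")
    if level == "HIGH" and impact.upper() == "HIGH" and likelihood.upper() == "HIGH":
        return "CRITICAL"
    return level
```

Unknown severity values fall back to LOW so that new or custom labels never raise an error during report generation.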
|
||||
|
||||
## Performance Optimization
|
||||
|
||||
For large codebases:
|
||||
|
||||
```bash
|
||||
# Use --jobs for parallel scanning
|
||||
semgrep --config auto --jobs 4
|
||||
|
||||
# Exclude vendor/test code
|
||||
semgrep --config auto --exclude "vendor/" --exclude "test/"
|
||||
|
||||
# Use lightweight rulesets for faster feedback
|
||||
semgrep --config "p/owasp-top-ten" --exclude-rule "generic.*"
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Issue: Too Many False Positives
|
||||
|
||||
**Solution**:
|
||||
- Use `--exclude-rule` to disable noisy rules
|
||||
- Create a `.semgrepignore` file to exclude files and directories that produce false positives
|
||||
- Tune rules using `--severity` filtering
|
||||
- Add `# nosemgrep` comments for confirmed false positives (with justification)
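
Before excluding rules, it helps to know which ones generate the most findings. The following sketch counts findings per rule ID in Semgrep's JSON output; `noisiest_rules` is a hypothetical helper, and a high count is only a hint that a rule deserves triage, not proof that it is noisy.

```python
import json
from collections import Counter


def noisiest_rules(semgrep_json: str, top: int = 3) -> list:
    """Return (rule_id, count) pairs for the rules with the most findings."""
    report = json.loads(semgrep_json)
    counts = Counter(r["check_id"] for r in report.get("results", []))
    return counts.most_common(top)
```

Rules that top this list and are confirmed as false positives after review are candidates for `--exclude-rule`.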
|
||||
|
||||
### Issue: Scan Taking Too Long
|
||||
|
||||
**Solution**:
|
||||
- Use `--exclude` for vendor/generated code
|
||||
- Increase `--jobs` for parallel processing
|
||||
- Use targeted rulesets instead of `--config=auto`
|
||||
- Run incremental scans with `--baseline-commit <ref>` so only findings introduced since that commit are reported
|
||||
|
||||
### Issue: Missing Vulnerabilities
|
||||
|
||||
**Solution**:
|
||||
- Use comprehensive rulesets: `p/security-audit` or `p/owasp-top-ten`
|
||||
- Consult `references/rule_library.md` for specialized rules
|
||||
- Create custom rules for organization-specific patterns
|
||||
- Combine with dynamic analysis (DAST) and dependency scanning
|
||||
|
||||
## Advanced Usage
|
||||
|
||||
### Creating Custom Rules
|
||||
|
||||
See `references/rule_library.md` for guidance on writing effective Semgrep rules.
|
||||
Use `assets/rule_template.yaml` as a starting point.
|
||||
|
||||
Example rule structure:
|
||||
```yaml
rules:
  - id: custom-sql-injection
    patterns:
      - pattern: execute($QUERY)
      # pattern-inside must describe the enclosing code: the assignment,
      # followed by `...` for the statements up to the execute() call
      - pattern-inside: |
          $QUERY = $USER_INPUT + ...
          ...
    message: Potential SQL injection from user input concatenation
    severity: ERROR
    languages: [python]
    metadata:
      cwe: "CWE-89"
      owasp: "A03:2021-Injection"
```
|
||||
|
||||
### OWASP Top 10 Coverage
|
||||
|
||||
This skill provides detection for all OWASP Top 10 2021 categories.
|
||||
See `references/owasp_cwe_mapping.md` for complete coverage matrix.
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Baseline First**: Establish security baseline before enforcing gates
|
||||
2. **Progressive Rollout**: Start with HIGH/CRITICAL, expand to MEDIUM over time
|
||||
3. **Developer Training**: Educate team on common vulnerabilities and fixes
|
||||
4. **Rule Maintenance**: Regularly update rulesets and tune for your stack
|
||||
5. **Metrics Tracking**: Monitor vulnerability trends, MTTR, and false positive rate
|
||||
6. **Defense in Depth**: Combine with DAST, SCA, and manual code review
|
||||
|
||||
## References
|
||||
|
||||
- [Semgrep Documentation](https://semgrep.dev/docs/)
|
||||
- [Semgrep Rule Registry](https://semgrep.dev/explore)
|
||||
- [OWASP Top 10 2021](https://owasp.org/Top10/)
|
||||
- [CWE Top 25](https://cwe.mitre.org/top25/)
|
||||
- [SANS Top 25](https://www.sans.org/top25-software-errors/)
|
||||
@@ -0,0 +1,141 @@
|
||||
# GitHub Actions - Semgrep Security Scanning
|
||||
# Save as .github/workflows/semgrep.yml
|
||||
|
||||
name: Semgrep Security Scan
|
||||
|
||||
on:
|
||||
# Scan on push to main/master
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
- master
|
||||
# Scan pull requests
|
||||
pull_request:
|
||||
branches:
|
||||
- main
|
||||
- master
|
||||
# Manual trigger
|
||||
workflow_dispatch:
|
||||
# Schedule daily scans
|
||||
schedule:
|
||||
- cron: '0 0 * * *' # Run at midnight UTC
|
||||
|
||||
jobs:
|
||||
semgrep:
|
||||
name: SAST Security Scan
|
||||
runs-on: ubuntu-latest
|
||||
|
||||
# Required for uploading results to GitHub Security
|
||||
permissions:
|
||||
security-events: write
|
||||
actions: read
|
||||
contents: read
|
||||
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v4
|
||||
|
||||
- name: Run Semgrep
|
||||
uses: semgrep/semgrep-action@v1
|
||||
with:
|
||||
# Ruleset to use
|
||||
config: >-
|
||||
p/security-audit
|
||||
p/owasp-top-ten
|
||||
p/cwe-top-25
|
||||
|
||||
          # Publish results to Semgrep Cloud Platform (requires these secrets)
          publishToken: ${{ secrets.SEMGREP_APP_TOKEN }}
          publishDeployment: ${{ secrets.SEMGREP_DEPLOYMENT_ID }}

          # Generate semgrep.sarif for the upload steps below
          generateSarif: "1"
|
||||
|
||||
# Fail on HIGH/ERROR severity
|
||||
# auditOn: push
|
||||
|
||||
- name: Upload SARIF to GitHub Security
|
||||
if: always()
|
||||
uses: github/codeql-action/upload-sarif@v3
|
||||
with:
|
||||
sarif_file: semgrep.sarif
|
||||
|
||||
- name: Upload scan results as artifact
|
||||
if: always()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: semgrep-results
|
||||
path: semgrep.sarif
|
||||
|
||||
# Alternative: Simpler configuration without Semgrep Cloud
|
||||
---
|
||||
name: Semgrep Security Scan (Simple)
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
branches: [main, master]
|
||||
push:
|
||||
branches: [main, master]
|
||||
|
||||
jobs:
|
||||
semgrep:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: '3.11'
|
||||
|
||||
- name: Install Semgrep
|
||||
run: pip install semgrep
|
||||
|
||||
- name: Run Semgrep Scan
|
||||
run: |
|
||||
semgrep --config="p/security-audit" \
|
||||
--config="p/owasp-top-ten" \
|
||||
--sarif \
|
||||
--output=semgrep-results.sarif \
|
||||
--severity=ERROR \
|
||||
--severity=WARNING
|
||||
|
||||
- name: Upload SARIF results
|
||||
if: always()
|
||||
uses: github/codeql-action/upload-sarif@v3
|
||||
with:
|
||||
sarif_file: semgrep-results.sarif
|
||||
|
||||
# PR-specific: Only scan changed files
|
||||
---
|
||||
name: Semgrep PR Scan
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
|
||||
jobs:
|
||||
semgrep-diff:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
fetch-depth: 0 # Fetch full history for diff
|
||||
|
||||
- name: Install Semgrep
|
||||
run: pip install semgrep
|
||||
|
||||
- name: Scan changed files only
|
||||
run: |
|
||||
semgrep --config="p/security-audit" \
|
||||
--baseline-commit="${{ github.event.pull_request.base.sha }}" \
|
||||
--json \
|
||||
--output=results.json
|
||||
|
||||
- name: Check for findings
|
||||
run: |
|
||||
FINDINGS=$(jq '.results | length' results.json)
|
||||
echo "Found $FINDINGS security issues"
|
||||
if [ "$FINDINGS" -gt 0 ]; then
|
||||
echo "❌ Security issues detected!"
|
||||
jq '.results[] | "[\(.extra.severity)] \(.check_id) - \(.path):\(.start.line)"' results.json
|
||||
exit 1
|
||||
else
|
||||
echo "✅ No security issues found"
|
||||
fi
|
||||
@@ -0,0 +1,106 @@
|
||||
# GitLab CI - Semgrep Security Scanning
|
||||
# Add to .gitlab-ci.yml
|
||||
|
||||
stages:
|
||||
- test
|
||||
- security
|
||||
|
||||
# Basic Semgrep scan
|
||||
semgrep-scan:
|
||||
stage: security
|
||||
image: semgrep/semgrep:latest
|
||||
script:
|
||||
- semgrep --config="p/security-audit"
|
||||
--config="p/owasp-top-ten"
|
||||
--gitlab-sast
|
||||
--output=gl-sast-report.json
|
||||
artifacts:
|
||||
reports:
|
||||
sast: gl-sast-report.json
|
||||
paths:
|
||||
- gl-sast-report.json
|
||||
expire_in: 1 week
|
||||
rules:
|
||||
- if: $CI_MERGE_REQUEST_ID # Run on MRs
|
||||
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH # Run on default branch
|
||||
|
||||
# Advanced: Fail on HIGH severity findings
|
||||
semgrep-strict:
|
||||
stage: security
|
||||
image: python:3.11-slim
|
||||
before_script:
|
||||
- pip install semgrep
    - apt-get update && apt-get install -y --no-install-recommends jq  # jq is not included in python:3.11-slim
|
||||
script:
|
||||
- |
|
||||
semgrep --config="p/security-audit" \
|
||||
--severity=ERROR \
|
||||
--json \
|
||||
--output=results.json
|
||||
|
||||
CRITICAL=$(jq '[.results[] | select(.extra.severity == "ERROR")] | length' results.json)
|
||||
echo "Found $CRITICAL critical findings"
|
||||
|
||||
if [ "$CRITICAL" -gt 0 ]; then
|
||||
echo "❌ Critical security issues detected!"
|
||||
jq '.results[] | select(.extra.severity == "ERROR")' results.json
|
||||
exit 1
|
||||
fi
|
||||
artifacts:
|
||||
paths:
|
||||
- results.json
|
||||
expire_in: 1 week
|
||||
when: always
|
||||
allow_failure: false
|
||||
|
||||
# Differential scanning - only new findings in MR
|
||||
semgrep-diff:
|
||||
stage: security
|
||||
image: semgrep/semgrep:latest
|
||||
script:
|
||||
- git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
|
||||
- |
|
||||
semgrep --config="p/security-audit" \
|
||||
--baseline-commit="origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME" \
|
||||
--gitlab-sast \
|
||||
--output=gl-sast-report.json
|
||||
artifacts:
|
||||
reports:
|
||||
sast: gl-sast-report.json
|
||||
rules:
|
||||
- if: $CI_MERGE_REQUEST_ID
|
||||
|
||||
# Scheduled full scan (daily)
|
||||
semgrep-scheduled:
|
||||
stage: security
|
||||
image: semgrep/semgrep:latest
|
||||
script:
|
||||
- |
|
||||
semgrep --config="p/security-audit" \
|
||||
--config="p/owasp-top-ten" \
|
||||
--config="p/cwe-top-25" \
|
||||
--json \
|
||||
--output=full-scan-results.json
|
||||
artifacts:
|
||||
paths:
|
||||
- full-scan-results.json
|
||||
expire_in: 30 days
|
||||
rules:
|
||||
- if: $CI_PIPELINE_SOURCE == "schedule"
|
||||
|
||||
# Custom rules integration
|
||||
semgrep-custom:
|
||||
stage: security
|
||||
image: semgrep/semgrep:latest
|
||||
script:
|
||||
- |
|
||||
semgrep --config="p/owasp-top-ten" \
|
||||
--config="custom-rules/security.yaml" \
|
||||
--gitlab-sast \
|
||||
--output=gl-sast-report.json
|
||||
artifacts:
|
||||
reports:
|
||||
sast: gl-sast-report.json
|
||||
rules:
|
||||
- if: $CI_MERGE_REQUEST_ID
|
||||
exists:
|
||||
- custom-rules/security.yaml
|
||||
@@ -0,0 +1,190 @@
|
||||
// Jenkinsfile - Semgrep Security Scanning
|
||||
// Basic pipeline with Semgrep security gate
|
||||
|
||||
pipeline {
|
||||
agent any
|
||||
|
||||
environment {
|
||||
SEMGREP_VERSION = '1.50.0' // Pin to specific version
|
||||
}
|
||||
|
||||
stages {
|
||||
stage('Checkout') {
|
||||
steps {
|
||||
checkout scm
|
||||
}
|
||||
}
|
||||
|
||||
stage('Security Scan') {
|
||||
steps {
|
||||
script {
|
||||
// Install Semgrep
|
||||
sh 'pip3 install semgrep==${SEMGREP_VERSION}'
|
||||
|
||||
// Run Semgrep scan
|
||||
sh '''
|
||||
semgrep --config="p/security-audit" \
|
||||
--config="p/owasp-top-ten" \
|
||||
--json \
|
||||
--output=semgrep-results.json \
|
||||
--severity=ERROR \
|
||||
--severity=WARNING
|
||||
'''
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
stage('Process Results') {
|
||||
steps {
|
||||
script {
|
||||
// Parse results
|
||||
def results = readJSON file: 'semgrep-results.json'
|
||||
def findings = results.results.size()
|
||||
def critical = results.results.findAll {
|
||||
it.extra.severity == 'ERROR'
|
||||
}.size()
|
||||
|
||||
echo "Total findings: ${findings}"
|
||||
echo "Critical findings: ${critical}"
|
||||
|
||||
// Fail build if critical findings
|
||||
if (critical > 0) {
|
||||
error("❌ Critical security vulnerabilities detected!")
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
post {
|
||||
always {
|
||||
// Archive scan results
|
||||
archiveArtifacts artifacts: 'semgrep-results.json',
|
||||
fingerprint: true
|
||||
|
||||
// Publish results (if using warnings-ng plugin)
|
||||
// recordIssues(
|
||||
// tools: [semgrep(pattern: 'semgrep-results.json')],
|
||||
// qualityGates: [[threshold: 1, type: 'TOTAL', unstable: false]]
|
||||
// )
|
||||
}
|
||||
failure {
|
||||
echo '❌ Security scan failed - review findings'
|
||||
}
|
||||
success {
|
||||
echo '✅ No critical security issues detected'
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Advanced: Differential scanning for PRs
|
||||
pipeline {
|
||||
agent any
|
||||
|
||||
environment {
|
||||
        // Declarative environment values must be quoted strings or function calls
        TARGET_BRANCH = "${env.CHANGE_TARGET ?: 'main'}"
|
||||
}
|
||||
|
||||
stages {
|
||||
stage('Checkout') {
|
||||
steps {
|
||||
checkout scm
|
||||
|
||||
script {
|
||||
// Fetch target branch for comparison
|
||||
sh """
|
||||
git fetch origin ${TARGET_BRANCH}:${TARGET_BRANCH}
|
||||
"""
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
stage('Differential Scan') {
|
||||
when {
|
||||
changeRequest() // Only for pull requests
|
||||
}
|
||||
steps {
|
||||
sh """
|
||||
pip3 install semgrep
|
||||
|
||||
semgrep --config="p/security-audit" \
|
||||
--baseline-commit="${TARGET_BRANCH}" \
|
||||
--json \
|
||||
--output=semgrep-diff.json
|
||||
"""
|
||||
|
||||
script {
|
||||
def results = readJSON file: 'semgrep-diff.json'
|
||||
def newFindings = results.results.size()
|
||||
|
||||
if (newFindings > 0) {
|
||||
echo "❌ ${newFindings} new security issues introduced"
|
||||
error("Fix security issues before merging")
|
||||
} else {
|
||||
echo "✅ No new security issues"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
stage('Full Scan') {
|
||||
when {
|
||||
branch 'main' // Full scan on main branch
|
||||
}
|
||||
steps {
|
||||
sh """
                    # Install here as well: the Differential Scan stage is skipped on main
                    pip3 install semgrep
|
||||
semgrep --config="p/security-audit" \
|
||||
--config="p/owasp-top-ten" \
|
||||
--config="p/cwe-top-25" \
|
||||
--json \
|
||||
--output=semgrep-full.json
|
||||
"""
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
post {
|
||||
always {
|
||||
archiveArtifacts artifacts: 'semgrep-*.json',
|
||||
allowEmptyArchive: true
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// With custom rules
|
||||
pipeline {
|
||||
agent any
|
||||
|
||||
stages {
|
||||
stage('Security Scan with Custom Rules') {
|
||||
steps {
|
||||
sh """
|
||||
pip3 install semgrep
|
||||
|
||||
# Run with both official and custom rules
|
||||
semgrep --config="p/owasp-top-ten" \
|
||||
--config="custom-rules/" \
|
||||
--json \
|
||||
--output=results.json
|
||||
"""
|
||||
|
||||
script {
|
||||
                    // Print a per-severity summary of the findings
|
||||
sh """
|
||||
python3 -c "
|
||||
import json
|
||||
with open('results.json') as f:
|
||||
results = json.load(f)
|
||||
findings = results['results']
|
||||
print(f'Security Scan Complete:')
|
||||
print(f' Total Findings: {len(findings)}')
|
||||
for severity in ['ERROR', 'WARNING', 'INFO']:
|
||||
count = len([f for f in findings if f.get('extra', {}).get('severity') == severity])
|
||||
print(f' {severity}: {count}')
|
||||
"
|
||||
"""
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
120
skills/appsec/sast-semgrep/assets/rule_template.yaml
Normal file
@@ -0,0 +1,120 @@
|
||||
rules:
|
||||
- id: custom-rule-template
|
||||
# Pattern matching - choose one or combine multiple
|
||||
pattern: dangerous_function($ARG)
|
||||
# OR use pattern combinations:
|
||||
# patterns:
|
||||
# - pattern: execute($QUERY)
|
||||
# - pattern-inside: |
|
||||
# $QUERY = $USER_INPUT + ...
|
||||
# - pattern-not: execute("SAFE_QUERY")
|
||||
|
||||
# Message shown when rule matches
|
||||
message: |
|
||||
Potential security vulnerability detected.
|
||||
Explain the risk and provide remediation guidance.
|
||||
|
||||
# Severity level
|
||||
severity: ERROR # ERROR, WARNING, or INFO
|
||||
|
||||
# Supported languages
|
||||
languages: [python] # python, javascript, java, go, etc.
|
||||
|
||||
# Metadata for categorization and tracking
|
||||
metadata:
|
||||
category: security
|
||||
technology: [web-app]
|
||||
cwe:
|
||||
- "CWE-XXX: Vulnerability Name"
|
||||
owasp:
|
||||
- "AXX:2021-Category Name"
|
||||
confidence: HIGH # HIGH, MEDIUM, LOW
|
||||
likelihood: MEDIUM # How likely is exploitation
|
||||
impact: HIGH # Potential security impact
|
||||
references:
|
||||
- https://owasp.org/...
|
||||
- https://cwe.mitre.org/data/definitions/XXX.html
|
||||
subcategory:
|
||||
- vuln-type # e.g., sqli, xss, command-injection
|
||||
|
||||
# Optional: Autofix suggestion
|
||||
# fix: |
|
||||
# safe_function($ARG)
|
||||
|
||||
# Optional: Path filtering
|
||||
# paths:
|
||||
# include:
|
||||
# - "src/"
|
||||
# exclude:
|
||||
# - "*/tests/*"
|
||||
# - "*/test_*.py"
|
||||
|
||||
# Example: SQL Injection Detection
|
||||
- id: example-sql-injection
|
||||
patterns:
|
||||
- pattern-either:
|
||||
- pattern: cursor.execute(f"... {$VAR} ...")
|
||||
- pattern: cursor.execute("..." + $VAR + "...")
|
||||
- pattern-not: cursor.execute("...", ...)
|
||||
message: |
|
||||
SQL injection vulnerability detected. User input is concatenated into SQL query.
|
||||
|
||||
Remediation:
|
||||
- Use parameterized queries: cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
|
||||
- Use ORM methods that automatically parameterize queries
|
||||
severity: ERROR
|
||||
languages: [python]
|
||||
metadata:
|
||||
category: security
|
||||
cwe: ["CWE-89: SQL Injection"]
|
||||
owasp: ["A03:2021-Injection"]
|
||||
confidence: HIGH
|
||||
likelihood: HIGH
|
||||
impact: HIGH
|
||||
references:
|
||||
- https://owasp.org/Top10/A03_2021-Injection/
|
||||
|
||||
# Example: Hard-coded Secret Detection
|
||||
- id: example-hardcoded-secret
|
||||
    # `|-` strips the trailing newline so it does not become part of the regex
    pattern-regex: |-
|
||||
(password|passwd|pwd|secret|token|api[_-]?key)\s*=\s*['"][^'"]{8,}['"]
|
||||
message: |
|
||||
Potential hard-coded secret detected.
|
||||
|
||||
Remediation:
|
||||
- Use environment variables: os.getenv('API_KEY')
|
||||
- Use secrets management: AWS Secrets Manager, HashiCorp Vault
|
||||
- Never commit secrets to version control
|
||||
severity: WARNING
|
||||
languages: [python, javascript, java, go]
|
||||
metadata:
|
||||
category: security
|
||||
cwe: ["CWE-798: Use of Hard-coded Credentials"]
|
||||
owasp: ["A07:2021-Identification-and-Authentication-Failures"]
|
||||
confidence: MEDIUM
|
||||
|
||||
# Example: Insecure Deserialization
|
||||
- id: example-unsafe-deserialization
|
||||
    patterns:
      - pattern-either:
          - pattern: pickle.loads($DATA)
          - pattern: pickle.load($FILE)
      # To exempt vetted call sites, add a specific `pattern-not-inside` clause here.
      # A bare `...` placeholder is not a meaningful filter and should not be left in a real rule.
|
||||
message: |
|
||||
Unsafe deserialization using pickle. Attackers can execute arbitrary code.
|
||||
|
||||
Remediation:
|
||||
- Use JSON for serialization: json.loads(data)
|
||||
- If pickle is required, validate and sanitize data source
|
||||
- Never deserialize data from untrusted sources
|
||||
severity: ERROR
|
||||
languages: [python]
|
||||
metadata:
|
||||
category: security
|
||||
cwe: ["CWE-502: Deserialization of Untrusted Data"]
|
||||
owasp: ["A08:2021-Software-and-Data-Integrity-Failures"]
|
||||
confidence: HIGH
|
||||
likelihood: HIGH
|
||||
      impact: HIGH  # impact uses LOW, MEDIUM, or HIGH
|
||||
80
skills/appsec/sast-semgrep/assets/semgrep_config.yaml
Normal file
@@ -0,0 +1,80 @@
|
||||
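# Recommended Semgrep Configuration (settings reference)
# Note: the Semgrep CLI does not load these keys from a project file; files passed
# to --config must contain rule definitions. Apply each setting through its CLI
# flag (--config, --exclude, --include, --max-target-bytes, --timeout, --jobs,
# --metrics) and put path exclusions in .semgrepignore.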
|
||||
|
||||
# Rules to run
|
||||
rules: p/security-audit
|
||||
|
||||
# Alternative: Specify multiple rulesets
|
||||
# rules:
|
||||
# - p/owasp-top-ten
|
||||
# - p/cwe-top-25
|
||||
# - path/to/custom-rules.yaml
|
||||
|
||||
# Paths to exclude from scanning
|
||||
exclude:
|
||||
- "*/node_modules/*"
|
||||
- "*/vendor/*"
|
||||
- "*/.venv/*"
|
||||
- "*/venv/*"
|
||||
- "*/dist/*"
|
||||
- "*/build/*"
|
||||
- "*/.git/*"
|
||||
- "*/tests/*"
|
||||
- "*/test/*"
|
||||
- "*_test.go"
|
||||
- "test_*.py"
|
||||
- "*.test.js"
|
||||
- "*.spec.js"
|
||||
- "*.min.js"
|
||||
- "*.bundle.js"
|
||||
|
||||
# Paths to include (optional - scans all by default)
|
||||
# include:
|
||||
# - "src/"
|
||||
# - "app/"
|
||||
# - "lib/"
|
||||
|
||||
# Maximum file size to scan (in bytes)
|
||||
max_target_bytes: 1000000 # 1MB
|
||||
|
||||
# Timeout for each file (in seconds)
|
||||
timeout: 30
|
||||
|
||||
# Number of jobs for parallel scanning
|
||||
# jobs: 4
|
||||
|
||||
# Metrics and telemetry (disable for privacy)
|
||||
metrics: off
|
||||
|
||||
# Autofix mode (use with caution)
|
||||
# autofix: false
|
||||
|
||||
# Output format
|
||||
# Can be: text, json, sarif, gitlab-sast, junit-xml, emacs, vim
|
||||
# Set via CLI: semgrep --config=<this-file> --json
|
||||
# output_format: text
|
||||
|
||||
# Severity thresholds
|
||||
# Only report findings at or above this severity
|
||||
# Can be: ERROR, WARNING, INFO
|
||||
# min_severity: WARNING
|
||||
|
||||
# Scan statistics
|
||||
# Show timing and performance stats
|
||||
# time: false
|
||||
# Show stats after scanning
|
||||
# verbose: false
|
||||
|
||||
# CI/CD specific settings
|
||||
# These are typically set via CLI or CI environment
|
||||
|
||||
# Fail on findings
|
||||
# Set exit code 1 if findings are detected
|
||||
# error: true
|
||||
|
||||
# Baseline commit for diff scanning
|
||||
# baseline_commit: origin/main
|
||||
|
||||
# SARIF output settings (for GitHub Security, etc.)
|
||||
# sarif:
|
||||
# output: semgrep-results.sarif
|
||||
300
skills/appsec/sast-semgrep/references/owasp_cwe_mapping.md
Normal file
@@ -0,0 +1,300 @@
|
||||
# OWASP Top 10 to CWE Mapping with Semgrep Rules
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [A01:2021 - Broken Access Control](#a012021---broken-access-control)
|
||||
- [A02:2021 - Cryptographic Failures](#a022021---cryptographic-failures)
|
||||
- [A03:2021 - Injection](#a032021---injection)
|
||||
- [A04:2021 - Insecure Design](#a042021---insecure-design)
|
||||
- [A05:2021 - Security Misconfiguration](#a052021---security-misconfiguration)
|
||||
- [A06:2021 - Vulnerable and Outdated Components](#a062021---vulnerable-and-outdated-components)
|
||||
- [A07:2021 - Identification and Authentication Failures](#a072021---identification-and-authentication-failures)
|
||||
- [A08:2021 - Software and Data Integrity Failures](#a082021---software-and-data-integrity-failures)
|
||||
- [A09:2021 - Security Logging and Monitoring Failures](#a092021---security-logging-and-monitoring-failures)
|
||||
- [A10:2021 - Server-Side Request Forgery (SSRF)](#a102021---server-side-request-forgery-ssrf)
|
||||
|
||||
## A01:2021 - Broken Access Control
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-22: Path Traversal
|
||||
- CWE-23: Relative Path Traversal
|
||||
- CWE-35: Path Traversal: '.../...//'
|
||||
- CWE-352: Cross-Site Request Forgery (CSRF)
|
||||
- CWE-434: Unrestricted Upload of Dangerous File Type
|
||||
- CWE-639: Authorization Bypass Through User-Controlled Key
|
||||
- CWE-862: Missing Authorization
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# Path traversal detection
|
||||
semgrep --config "r/python.lang.security.audit.path-traversal"
|
||||
|
||||
# Generic secret detection (missing authorization checks usually need custom, application-specific rules)
|
||||
semgrep --config "r/generic.secrets.security.detected-generic-secret"
|
||||
|
||||
# CSRF protection
|
||||
semgrep --config "r/javascript.express.security.audit.express-check-csurf-middleware-usage"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Unrestricted file access using user input
|
||||
- Missing or improper authorization checks
|
||||
- Insecure direct object references (IDOR)
|
||||
- Elevation of privilege vulnerabilities
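
As an illustration of the first pattern, the sketch below shows one common way to keep user-supplied paths inside a base directory. `safe_join` is a hypothetical helper and is not a rule shipped with Semgrep; it resolves the joined path and rejects anything that ends up outside the base.

```python
import os


def safe_join(base_dir: str, user_path: str) -> str:
    """Join user_path onto base_dir, refusing results outside base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and resolves symlinks before the check
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Absolute inputs such as `/etc/passwd` are rejected too, because `os.path.join` discards the base when the second argument is absolute and the containment check then fails.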
|
||||
|
||||
## A02:2021 - Cryptographic Failures
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-259: Use of Hard-coded Password
|
||||
- CWE-326: Inadequate Encryption Strength
|
||||
- CWE-327: Use of Broken/Risky Crypto Algorithm
|
||||
- CWE-328: Use of Weak Hash
|
||||
- CWE-330: Use of Insufficiently Random Values
|
||||
- CWE-780: Use of RSA Without OAEP
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# Weak crypto algorithms
|
||||
semgrep --config "p/crypto"
|
||||
|
||||
# Hard-coded secrets
|
||||
semgrep --config "p/secrets"
|
||||
|
||||
# Insecure random
|
||||
semgrep --config "r/python.lang.security.audit.insecure-random"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Use of MD5, SHA1 for cryptographic purposes
|
||||
- Hard-coded passwords, API keys, tokens
|
||||
- Weak encryption algorithms (DES, RC4)
|
||||
- Insecure random number generation
|
||||
|
||||
## A03:2021 - Injection
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-79: Cross-site Scripting (XSS)
|
||||
- CWE-89: SQL Injection
|
||||
- CWE-95: Improper Neutralization of Directives in Dynamically Evaluated Code (eval injection)
|
||||
- CWE-917: Expression Language Injection
|
||||
- CWE-943: Improper Neutralization of Special Elements in Data Query Logic
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# SQL Injection
|
||||
semgrep --config "r/python.django.security.injection.sql"
|
||||
semgrep --config "r/javascript.sequelize.security.audit.sequelize-injection"
|
||||
|
||||
# XSS
|
||||
semgrep --config "r/javascript.express.security.audit.xss"
|
||||
semgrep --config "r/python.flask.security.audit.template-xss"
|
||||
|
||||
# Command Injection
|
||||
semgrep --config "r/python.lang.security.audit.dangerous-subprocess-use"
|
||||
|
||||
# Code Injection
|
||||
semgrep --config "r/python.lang.security.audit.exec-used"
|
||||
semgrep --config "r/javascript.lang.security.audit.eval-detected"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Unsafe SQL query construction
|
||||
- Unescaped user input in HTML context
|
||||
- OS command execution with user input
|
||||
- Use of eval() or similar dynamic code execution
|
||||
|
||||
## A04:2021 - Insecure Design
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-209: Generation of Error Message with Sensitive Information
|
||||
- CWE-256: Unprotected Storage of Credentials
|
||||
- CWE-501: Trust Boundary Violation
|
||||
- CWE-522: Insufficiently Protected Credentials
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# Information disclosure
|
||||
semgrep --config "r/python.flask.security.audit.debug-enabled"
|
||||
|
||||
# Missing security controls
|
||||
semgrep --config "p/security-audit"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Debug mode enabled in production
|
||||
- Verbose error messages exposing internals
|
||||
- Missing rate limiting
|
||||
- Insecure default configurations
|
||||
|
||||
## A05:2021 - Security Misconfiguration
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-16: Configuration
|
||||
- CWE-611: Improper Restriction of XML External Entity Reference
|
||||
- CWE-614: Sensitive Cookie in HTTPS Session Without 'Secure' Attribute
|
||||
- CWE-756: Missing Custom Error Page
|
||||
- CWE-776: Improper Restriction of Recursive Entity References in DTDs
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# XXE vulnerabilities
|
||||
semgrep --config "r/python.lang.security.audit.avoid-lxml-in-xml-parsing"
|
||||
|
||||
# Insecure cookie settings
|
||||
semgrep --config "r/javascript.express.security.audit.express-cookie-settings"
|
||||
|
||||
# CORS misconfiguration
|
||||
semgrep --config "r/javascript.express.security.audit.express-cors-misconfiguration"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- XML External Entity (XXE) vulnerabilities
|
||||
- Insecure cookie flags (missing Secure, HttpOnly, SameSite)
|
||||
- Open CORS policies
|
||||
- Unnecessary features enabled
|
||||
|
||||
## A06:2021 - Vulnerable and Outdated Components
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-1035: Using Components with Known Vulnerabilities
|
||||
- CWE-1104: Use of Unmaintained Third Party Components
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# Known vulnerable dependencies
|
||||
semgrep --config "p/supply-chain"
|
||||
|
||||
# Deprecated APIs
|
||||
semgrep --config "p/owasp-top-ten"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Outdated library versions
|
||||
- Dependencies with known CVEs
|
||||
- Use of deprecated/unmaintained packages
|
||||
- Insecure package imports
|
||||
|
||||
## A07:2021 - Identification and Authentication Failures
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-287: Improper Authentication
|
||||
- CWE-288: Authentication Bypass Using Alternate Path/Channel
|
||||
- CWE-306: Missing Authentication for Critical Function
|
||||
- CWE-307: Improper Restriction of Excessive Authentication Attempts
|
||||
- CWE-521: Weak Password Requirements
|
||||
- CWE-798: Use of Hard-coded Credentials
|
||||
- CWE-916: Use of Password Hash With Insufficient Computational Effort
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# Weak password hashing
|
||||
semgrep --config "r/python.lang.security.audit.hashlib-md5-used"
|
||||
|
||||
# JWT handling issues (e.g., unverified or weakly signed tokens)
|
||||
semgrep --config "p/jwt"
|
||||
|
||||
# Session management
|
||||
semgrep --config "r/javascript.express.security.audit.express-session-misconfiguration"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Weak password hashing (MD5, SHA1 without salt)
|
||||
- Missing multi-factor authentication
|
||||
- Predictable session identifiers
|
||||
- Credential stuffing vulnerabilities
|
||||
|
||||
## A08:2021 - Software and Data Integrity Failures
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-345: Insufficient Verification of Data Authenticity
|
||||
- CWE-502: Deserialization of Untrusted Data
|
||||
- CWE-829: Inclusion of Functionality from Untrusted Control Sphere
|
||||
- CWE-915: Improperly Controlled Modification of Dynamically-Determined Object Attributes
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# Unsafe deserialization
|
||||
semgrep --config "r/python.lang.security.audit.unsafe-pickle"
|
||||
semgrep --config "r/javascript.lang.security.audit.unsafe-deserialization"
|
||||
|
||||
# Prototype pollution
|
||||
semgrep --config "r/javascript.lang.security.audit.prototype-pollution"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Unsafe deserialization (pickle, YAML, JSON)
|
||||
- Missing integrity checks on updates
|
||||
- Prototype pollution in JavaScript
|
||||
- Unsafe code loading from external sources
|
||||
|
||||
## A09:2021 - Security Logging and Monitoring Failures
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-117: Improper Output Neutralization for Logs
|
||||
- CWE-223: Omission of Security-relevant Information
|
||||
- CWE-532: Information Exposure Through Log Files
|
||||
- CWE-778: Insufficient Logging
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# Log injection
|
||||
semgrep --config "r/python.lang.security.audit.logging-unsanitized-input"
|
||||
|
||||
# Sensitive data in logs
|
||||
semgrep --config "p/secrets"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Log injection vulnerabilities
|
||||
- Sensitive data logged (passwords, tokens)
|
||||
- Missing security event logging
|
||||
- Insufficient audit trails
|
||||
|
||||
## A10:2021 - Server-Side Request Forgery (SSRF)
|
||||
|
||||
### CWE Mappings
|
||||
- CWE-918: Server-Side Request Forgery (SSRF)
|
||||
|
||||
### Semgrep Rules
|
||||
```bash
|
||||
# SSRF detection
|
||||
semgrep --config "r/python.requests.security.audit.requests-http-request"
|
||||
semgrep --config "r/javascript.lang.security.audit.detect-unsafe-url"
|
||||
```
|
||||
|
||||
### Detection Patterns
|
||||
- Unvalidated URL fetching
|
||||
- Internal network access via user input
|
||||
- Missing URL validation
|
||||
- Bypassing access controls via SSRF
|
||||
|
||||
## Using This Mapping
|
||||
|
||||
### Scan for Specific OWASP Category
|
||||
|
||||
```bash
|
||||
# Example: Scan for Injection vulnerabilities (A03)
|
||||
semgrep --config "r/python.django.security.injection.sql" \
|
||||
--config "r/python.lang.security.audit.exec-used" \
|
||||
/path/to/code
|
||||
```
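
When the same categories are scanned repeatedly, the per-category rule lists can be kept in one table and turned into a command line. The following is a minimal sketch, not part of Semgrep: `CATEGORY_CONFIGS` and `build_semgrep_command` are hypothetical names, and the config identifiers are copied from this mapping and should be checked against the registry before use.

```python
from typing import Dict, List

# Hypothetical lookup table; the config identifiers are copied from this mapping.
CATEGORY_CONFIGS: Dict[str, List[str]] = {
    "A03": [
        "r/python.django.security.injection.sql",
        "r/python.lang.security.audit.exec-used",
    ],
    "A10": [
        "r/python.requests.security.audit.requests-http-request",
    ],
}


def build_semgrep_command(category: str, target: str) -> List[str]:
    """Return an argv list that scans `target` with the configs for one category."""
    configs = CATEGORY_CONFIGS.get(category)
    if not configs:
        raise KeyError(f"No configs registered for category {category!r}")
    argv = ["semgrep"]
    for config in configs:
        argv += ["--config", config]
    argv.append(target)
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Semgrep installed.
    print(" ".join(build_semgrep_command("A03", "/path/to/code")))
```

Returning an argv list rather than a shell string means the result can be passed to `subprocess.run` without `shell=True`.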
|
||||
|
||||
### Comprehensive OWASP Top 10 Scan
|
||||
|
||||
```bash
|
||||
semgrep --config="p/owasp-top-ten" /path/to/code
|
||||
```
|
||||
|
||||
### Filter by CWE
|
||||
|
||||
```bash
|
||||
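# Scan and filter results by CWE
# (metadata.cwe may be a string or a list of strings, so match on its text instead of testing equality)
semgrep --config="p/security-audit" --json /path/to/code | \
  jq '.results[] | select(.extra.metadata.cwe | tostring | test("CWE-89\\b"))'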
|
||||
```
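
The same filtering can be done in Python when the report is post-processed in a script. This is a sketch under stated assumptions: it assumes the JSON report exposes `results[].extra.metadata.cwe` as in the `jq` filter above, and that the value is either a single string or a list of strings such as `"CWE-89: ..."`. `findings_for_cwe` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def findings_for_cwe(report: Dict[str, Any], cwe_id: str) -> List[Dict[str, Any]]:
    """Return results whose metadata mentions `cwe_id` (for example "CWE-89")."""
    matches = []
    for result in report.get("results", []):
        metadata = result.get("extra", {}).get("metadata", {}) or {}
        cwe = metadata.get("cwe") or []
        # Normalize a single string to a one-element list.
        entries = [cwe] if isinstance(cwe, str) else list(cwe)
        for entry in entries:
            # Compare only the identifier before any ":" so CWE-89 does not match CWE-890.
            if str(entry).split(":", 1)[0].strip() == cwe_id:
                matches.append(result)
                break
    return matches


if __name__ == "__main__":
    sample = {
        "results": [
            {"check_id": "a", "extra": {"metadata": {"cwe": ["CWE-89: SQL Injection"]}}},
            {"check_id": "b", "extra": {"metadata": {"cwe": "CWE-79"}}},
        ]
    }
    print([r["check_id"] for r in findings_for_cwe(sample, "CWE-89")])
```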
|
||||
|
||||
## References
|
||||
|
||||
- [OWASP Top 10 2021](https://owasp.org/Top10/)
|
||||
- [CWE/SANS Top 25](https://cwe.mitre.org/top25/)
|
||||
- [Semgrep Rule Registry](https://semgrep.dev/explore)
|
||||
471
skills/appsec/sast-semgrep/references/remediation_guide.md
Normal file
@@ -0,0 +1,471 @@
|
||||
# Vulnerability Remediation Guide
|
||||
|
||||
Security remediation patterns organized by vulnerability category.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [SQL Injection](#sql-injection)
|
||||
- [Cross-Site Scripting (XSS)](#cross-site-scripting-xss)
|
||||
- [Command Injection](#command-injection)
|
||||
- [Path Traversal](#path-traversal)
|
||||
- [Insecure Deserialization](#insecure-deserialization)
|
||||
- [Weak Cryptography](#weak-cryptography)
|
||||
- [Authentication & Session Management](#authentication--session-management)
|
||||
- [CSRF](#csrf)
|
||||
- [SSRF](#ssrf)
|
||||
- [XXE](#xxe)
|
||||
|
||||
## SQL Injection
|
||||
|
||||
### Vulnerability Pattern
|
||||
```python
|
||||
# VULNERABLE
|
||||
query = f"SELECT * FROM users WHERE id = {user_id}"
|
||||
cursor.execute(query)
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```python
|
||||
# SECURE: Use parameterized queries
|
||||
query = "SELECT * FROM users WHERE id = %s"
|
||||
cursor.execute(query, (user_id,))
|
||||
|
||||
# Or use ORM
|
||||
user = User.objects.get(id=user_id)
|
||||
```
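
To see the difference concretely, the sketch below runs both styles against an in-memory SQLite database. It is illustrative only: the table and values are made up, and `sqlite3` uses `?` placeholders where the driver in the example above uses `%s`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)

malicious_id = "1 OR 1=1"

# String formatting lets the input rewrite the query, so every row is returned.
leaked = conn.execute(
    f"SELECT name FROM users WHERE id = {malicious_id}"
).fetchall()

# A bound parameter is treated as one value, so the injected text matches nothing.
safe = conn.execute(
    "SELECT name FROM users WHERE id = ?", (malicious_id,)
).fetchall()

print(len(leaked), len(safe))
```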
|
||||
|
||||
### Framework-Specific Solutions
|
||||
|
||||
**Django:**
|
||||
```python
|
||||
# Use Django ORM (safe by default)
|
||||
User.objects.filter(email=user_email)
|
||||
|
||||
# For raw SQL, use parameterized queries
|
||||
User.objects.raw('SELECT * FROM myapp_user WHERE email = %s', [user_email])
|
||||
```
|
||||
|
||||
**Node.js (Sequelize):**
|
||||
```javascript
|
||||
// Use parameterized queries
|
||||
User.findAll({
|
||||
where: { email: userEmail }
|
||||
});
|
||||
|
||||
// Or use replacements
|
||||
sequelize.query(
|
||||
'SELECT * FROM users WHERE email = :email',
|
||||
{ replacements: { email: userEmail } }
|
||||
);
|
||||
```
|
||||
|
||||
**Java (JDBC):**
|
||||
```java
|
||||
// Use PreparedStatement
|
||||
String query = "SELECT * FROM users WHERE id = ?";
|
||||
PreparedStatement stmt = conn.prepareStatement(query);
|
||||
stmt.setInt(1, userId);
|
||||
ResultSet rs = stmt.executeQuery();
|
||||
```
|
||||
|
||||
## Cross-Site Scripting (XSS)
|
||||
|
||||
### Vulnerability Pattern
|
||||
```javascript
|
||||
// VULNERABLE
|
||||
element.innerHTML = userInput;
|
||||
document.write(userInput);
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```javascript
|
||||
// SECURE: Use textContent for text
|
||||
element.textContent = userInput;
|
||||
|
||||
// Or properly escape HTML
|
||||
element.innerHTML = escapeHtml(userInput);
|
||||
|
||||
function escapeHtml(unsafe) {
  return unsafe
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}
|
||||
```
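
On the server side, the same escaping is available in Python's standard library. A minimal sketch, assuming the value is placed in an HTML text or quoted-attribute context; `render_comment` is a hypothetical helper, and other contexts (URLs, JavaScript, CSS) need their own encoding.

```python
import html


def render_comment(user_input: str) -> str:
    """Wrap untrusted text in a paragraph, escaping HTML metacharacters."""
    # quote=True also escapes double and single quotes, which matters inside attributes.
    return f"<p>{html.escape(user_input, quote=True)}</p>"


if __name__ == "__main__":
    print(render_comment('<script>alert("x")</script>'))
```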
|
||||
|
||||
### Framework-Specific Solutions
|
||||
|
||||
**React:**
|
||||
```javascript
|
||||
// React auto-escapes by default
|
||||
<div>{userInput}</div>
|
||||
|
||||
// For HTML content, sanitize first
|
||||
import DOMPurify from 'dompurify';
|
||||
<div dangerouslySetInnerHTML={{__html: DOMPurify.sanitize(userInput)}} />
|
||||
```
|
||||
|
||||
**Flask/Jinja2:**
|
||||
```python
|
||||
# Jinja2 templates rendered by Flask auto-escape by default:
#   {{ user_input }}

# For HTML content, sanitize in the view before marking the result as safe
from markupsafe import Markup
import bleach

safe_html = Markup(bleach.clean(user_input))
# Then render it in the template with {{ safe_html }}
|
||||
```
|
||||
|
||||
**Django:**
|
||||
```django
|
||||
{# Auto-escaped by default #}
|
||||
{{ user_input }}
|
||||
|
||||
{# Mark as safe only after sanitization #}
|
||||
{{ user_input|safe }}
|
||||
```
|
||||
|
||||
## Command Injection
|
||||
|
||||
### Vulnerability Pattern
|
||||
```python
|
||||
# VULNERABLE
|
||||
os.system(f"ping {user_host}")
|
||||
subprocess.call(f"ls {user_directory}", shell=True)
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```python
|
||||
# SECURE: Use subprocess with list arguments
import re
import subprocess

subprocess.run(['ping', '-c', '1', user_host],
               capture_output=True, check=True)

# Validate input against an allowlist pattern; the first character must be
# alphanumeric so the value cannot be parsed as a command-line option
if not re.match(r'^[a-zA-Z0-9][a-zA-Z0-9.-]*$', user_host):
    raise ValueError("Invalid hostname")
subprocess.run(['ping', '-c', '1', user_host])
|
||||
```
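
The validation step can be isolated in a small function that is easy to unit test. The sketch below only builds the argument list and does not run `ping`; `build_ping_argv` and the length limit are assumptions made for illustration.

```python
import re
from typing import List

# First character must be alphanumeric so the value cannot start with "-".
_HOSTNAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.-]{0,252}")


def build_ping_argv(user_host: str) -> List[str]:
    """Return an argv list for ping, rejecting anything that is not a plain hostname."""
    # fullmatch avoids the trailing-newline edge case of "$" with re.match.
    if not _HOSTNAME_RE.fullmatch(user_host):
        raise ValueError("Invalid hostname")
    return ["ping", "-c", "1", user_host]


if __name__ == "__main__":
    print(build_ping_argv("example.com"))
```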
|
||||
|
||||
**Node.js:**
|
||||
```javascript
|
||||
// VULNERABLE
|
||||
exec(`ls ${userDir}`);
|
||||
|
||||
// SECURE
|
||||
const { execFile } = require('child_process');
|
||||
execFile('ls', [userDir], (error, stdout) => {
|
||||
// Handle output
|
||||
});
|
||||
```
|
||||
|
||||
## Path Traversal
|
||||
|
||||
### Vulnerability Pattern
|
||||
```python
|
||||
# VULNERABLE
|
||||
file_path = os.path.join('/uploads', user_filename)
with open(file_path) as f:
    return f.read()
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```python
|
||||
# SECURE: Validate and normalize path
from pathlib import Path

def safe_join(directory, user_path):
    # Normalize and resolve path
    base_dir = Path(directory).resolve()
    file_path = (base_dir / user_path).resolve()

    # Ensure it's within the base directory. Compare path components rather
    # than string prefixes, which would also accept siblings like /uploads_evil
    if file_path != base_dir and base_dir not in file_path.parents:
        raise ValueError("Path traversal detected")

    return file_path

try:
    safe_path = safe_join('/uploads', user_filename)
    with open(safe_path) as f:
        return f.read()
except ValueError:
    return "Invalid filename"
|
||||
```
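
A containment check is worth testing against a sibling directory that shares the base directory's name as a prefix, because a plain string-prefix comparison would accept it. The sketch below uses a temporary directory; `is_within` and the directory names are hypothetical.

```python
import tempfile
from pathlib import Path


def is_within(base_dir: Path, candidate: Path) -> bool:
    """Return True if `candidate` resolves to `base_dir` or a path beneath it."""
    base = base_dir.resolve()
    target = candidate.resolve()
    return target == base or base in target.parents


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        uploads = Path(tmp) / "uploads"
        sibling = Path(tmp) / "uploads_evil"
        uploads.mkdir()
        sibling.mkdir()
        inside = uploads / "report.txt"
        escaped = uploads / ".." / "uploads_evil" / "secret.txt"
        return is_within(uploads, inside), is_within(uploads, escaped)


if __name__ == "__main__":
    print(demo())
```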
|
||||
|
||||
## Insecure Deserialization
|
||||
|
||||
### Vulnerability Pattern
|
||||
```python
|
||||
# VULNERABLE
|
||||
import pickle
|
||||
data = pickle.loads(user_data)
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```python
|
||||
# SECURE: Use safe formats like JSON
|
||||
import json
|
||||
data = json.loads(user_data)
|
||||
|
||||
# If you must deserialize, validate and restrict
|
||||
import yaml
|
||||
data = yaml.safe_load(user_data) # Use safe_load, not load
|
||||
```
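
Parsing JSON safely still leaves the question of whether the parsed data has the expected shape. A minimal sketch of a structural check using only the standard library; the `name` and `email` fields are assumptions chosen for illustration, and a schema library may be a better fit for larger payloads.

```python
import json
from typing import Any, Dict

# Hypothetical schema: field name -> required Python type.
EXPECTED_FIELDS = {"name": str, "email": str}


def load_profile(raw: str) -> Dict[str, Any]:
    """Parse untrusted JSON and reject anything that is not the expected object."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    unexpected = set(data) - set(EXPECTED_FIELDS)
    if unexpected:
        raise ValueError(f"Unexpected fields: {sorted(unexpected)}")
    for key, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(data.get(key), expected_type):
            raise ValueError(f"Field {key!r} is missing or has the wrong type")
    return data


if __name__ == "__main__":
    print(load_profile('{"name": "Ada", "email": "ada@example.com"}'))
```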
|
||||
|
||||
**Node.js:**
|
||||
```javascript
|
||||
// VULNERABLE
|
||||
const data = eval(userInput);
|
||||
const obj = Function(userInput)();
|
||||
|
||||
// SECURE
|
||||
const data = JSON.parse(userInput);
|
||||
|
||||
// For complex objects, use schema validation
|
||||
const Joi = require('joi');
|
||||
const schema = Joi.object({
|
||||
name: Joi.string().required(),
|
||||
email: Joi.string().email().required()
|
||||
});
|
||||
const { value, error } = schema.validate(JSON.parse(userInput));
|
||||
```
|
||||
|
||||
## Weak Cryptography
|
||||
|
||||
### Vulnerability Pattern
|
||||
```python
|
||||
# VULNERABLE
|
||||
import hashlib
|
||||
password_hash = hashlib.md5(password.encode()).hexdigest()
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```python
|
||||
# SECURE: Use bcrypt or argon2
|
||||
import bcrypt
|
||||
|
||||
# Hashing
|
||||
password_hash = bcrypt.hashpw(password.encode(), bcrypt.gensalt())
|
||||
|
||||
# Verification
|
||||
if bcrypt.checkpw(password.encode(), stored_hash):
    print("Password correct")
|
||||
|
||||
# Or use argon2
|
||||
from argon2 import PasswordHasher
|
||||
ph = PasswordHasher()
|
||||
hash = ph.hash(password)
|
||||
ph.verify(hash, password)
|
||||
```
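
Where third-party packages are not an option, the standard library offers PBKDF2. The sketch below swaps in `hashlib.pbkdf2_hmac` for bcrypt or argon2; the iteration count is an assumption to be tuned for the deployment, and `hash_password` and `verify_password` are hypothetical helper names.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 600_000  # Assumed work factor for this sketch; tune for your hardware.


def hash_password(
    password: str, salt: Optional[bytes] = None, iterations: int = ITERATIONS
) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a random per-password salt."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(
    password: str, salt: bytes, expected: bytes, iterations: int = ITERATIONS
) -> bool:
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    salt, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, digest))
```

The salt and iteration count need to be stored next to the digest so the value can be recomputed at login.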
|
||||
|
||||
**Encryption:**
|
||||
```python
|
||||
# VULNERABLE
|
||||
from Crypto.Cipher import DES
|
||||
cipher = DES.new(key, DES.MODE_ECB)
|
||||
|
||||
# SECURE: Use AES-GCM
|
||||
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
|
||||
import os
|
||||
|
||||
key = AESGCM.generate_key(bit_length=256)
|
||||
aesgcm = AESGCM(key)
|
||||
nonce = os.urandom(12)
|
||||
ciphertext = aesgcm.encrypt(nonce, plaintext, associated_data)
|
||||
```
|
||||
|
||||
## Authentication & Session Management
|
||||
|
||||
### Vulnerability Pattern
|
||||
```javascript
|
||||
// VULNERABLE
|
||||
app.use(session({
|
||||
secret: 'weak-secret',
|
||||
cookie: { secure: false }
|
||||
}));
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```javascript
|
||||
// SECURE
|
||||
const session = require('express-session');
|
||||
app.use(session({
|
||||
secret: process.env.SESSION_SECRET, // Strong random secret
|
||||
resave: false,
|
||||
saveUninitialized: false,
|
||||
cookie: {
|
||||
secure: true, // HTTPS only
|
||||
httpOnly: true, // No JavaScript access
|
||||
sameSite: 'strict', // CSRF protection
|
||||
maxAge: 3600000 // 1 hour
|
||||
}
|
||||
}));
|
||||
```
|
||||
|
||||
**Password Requirements:**
|
||||
```python
|
||||
# Implement strong password policy
|
||||
import re
|
||||
|
||||
def validate_password(password):
    if len(password) < 12:
        return False
    if not re.search(r'[A-Z]', password):
        return False
    if not re.search(r'[a-z]', password):
        return False
    if not re.search(r'[0-9]', password):
        return False
    if not re.search(r'[!@#$%^&*(),.?":{}|<>]', password):
        return False
    return True
|
||||
```
|
||||
|
||||
## CSRF
|
||||
|
||||
### Vulnerability Pattern
|
||||
```python
|
||||
# VULNERABLE: No CSRF protection
|
||||
@app.route('/transfer', methods=['POST'])
def transfer():
    amount = request.form['amount']
    to_account = request.form['to']
    # Process transfer
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```python
|
||||
# SECURE: Use CSRF tokens
from flask_wtf.csrf import CSRFProtect
csrf = CSRFProtect(app)

@app.route('/transfer', methods=['POST'])
def transfer():
    # CSRF token automatically validated by CSRFProtect
    # (do not add @csrf.exempt here; it disables the check for this route)
    amount = request.form['amount']
    to_account = request.form['to']
|
||||
```
|
||||
|
||||
**Express.js:**
|
||||
```javascript
|
||||
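// Note: the csurf package has been deprecated by its maintainers; check its status before adopting it
const csrf = require('csurf');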
|
||||
const csrfProtection = csrf({ cookie: true });
|
||||
|
||||
app.post('/transfer', csrfProtection, (req, res) => {
|
||||
// CSRF token validated
|
||||
const { amount, to } = req.body;
|
||||
});
|
||||
```
|
||||
|
||||
## SSRF
|
||||
|
||||
### Vulnerability Pattern
|
||||
```python
|
||||
# VULNERABLE
|
||||
import requests
|
||||
url = request.args.get('url')
|
||||
response = requests.get(url)
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```python
|
||||
# SECURE: Validate URLs and use allowlist
import ipaddress
import requests
from urllib.parse import urlparse

ALLOWED_DOMAINS = ['api.example.com', 'cdn.example.com']

def safe_fetch(url):
    parsed = urlparse(url)

    # Check protocol
    if parsed.scheme not in ['http', 'https']:
        raise ValueError("Invalid protocol")

    # Check the hostname against the allowlist
    # (netloc may also carry a port or userinfo, so compare hostname instead)
    if parsed.hostname not in ALLOWED_DOMAINS:
        raise ValueError("Domain not allowed")

    # Block internal IPs. Parse first so the rejection below is not
    # swallowed by the except clause
    try:
        ip = ipaddress.ip_address(parsed.hostname)
    except ValueError:
        ip = None  # Not an IP literal, continue
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        raise ValueError("Private IP not allowed")

    # Do not follow redirects to locations that were never validated
    return requests.get(url, timeout=5, allow_redirects=False)
|
||||
```
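
The internal-address check is easiest to get right as a separate, tested function. A minimal sketch using the standard `ipaddress` module; `is_internal_address` is a hypothetical helper. It only classifies IP literals, so a hostname would first need to be resolved and each resolved address checked.

```python
import ipaddress


def is_internal_address(host: str) -> bool:
    """Return True for IP literals that point at loopback, private, or link-local ranges."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Hostnames must be resolved and re-checked by the caller.
        return False
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_unspecified
    )


if __name__ == "__main__":
    for candidate in ["127.0.0.1", "10.0.0.5", "169.254.169.254", "8.8.8.8"]:
        print(candidate, is_internal_address(candidate))
```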
|
||||
|
||||
## XXE
|
||||
|
||||
### Vulnerability Pattern
|
||||
```python
|
||||
# VULNERABLE
|
||||
from lxml import etree
|
||||
tree = etree.parse(user_xml)
|
||||
```
|
||||
|
||||
### Secure Remediation
|
||||
```python
|
||||
# SECURE: Disable external entities
|
||||
from lxml import etree
|
||||
|
||||
parser = etree.XMLParser(
|
||||
resolve_entities=False,
|
||||
no_network=True,
|
||||
dtd_validation=False
|
||||
)
|
||||
tree = etree.parse(user_xml, parser)
|
||||
|
||||
# Or use defusedxml
|
||||
from defusedxml import ElementTree
|
||||
tree = ElementTree.parse(user_xml)
|
||||
```
|
||||
|
||||
**Node.js:**
|
||||
```javascript
|
||||
// Use secure XML parser
|
||||
const libxmljs = require('libxmljs');
|
||||
const xml = libxmljs.parseXml(userXml, {
|
||||
noent: false, // Disable entity expansion
|
||||
dtdload: false,
|
||||
dtdvalid: false
|
||||
});
|
||||
```
|
||||
|
||||
## General Security Principles
|
||||
|
||||
1. **Input Validation**: Validate all user input against expected format
|
||||
2. **Output Encoding**: Encode output based on context (HTML, URL, SQL, etc.)
|
||||
3. **Least Privilege**: Grant minimum necessary permissions
|
||||
4. **Defense in Depth**: Use multiple layers of security controls
|
||||
5. **Fail Securely**: Ensure failures don't expose sensitive data
|
||||
6. **Secure Defaults**: Use secure configuration by default
|
||||
7. **Keep Dependencies Updated**: Regularly update libraries and frameworks
|
||||
|
||||
## Testing Remediation
|
||||
|
||||
After applying fixes:
|
||||
|
||||
1. **Verify with Semgrep**: Re-scan to ensure vulnerability is resolved
|
||||
```bash
|
||||
semgrep --config <ruleset> fixed_file.py
|
||||
```
|
||||
|
||||
2. **Manual Testing**: Attempt to exploit the vulnerability
|
||||
3. **Code Review**: Have peer review the fix
|
||||
4. **Integration Tests**: Add tests to prevent regression
|
||||
|
||||
## References
|
||||
|
||||
- [OWASP Cheat Sheet Series](https://cheatsheetseries.owasp.org/)
|
||||
- [CWE Mitigations](https://cwe.mitre.org/)
|
||||
- [Semgrep Autofix](https://semgrep.dev/docs/writing-rules/autofix/)
|
||||
425
skills/appsec/sast-semgrep/references/rule_library.md
Normal file
@@ -0,0 +1,425 @@
|
||||
# Semgrep Rule Library
|
||||
|
||||
Curated collection of useful Semgrep rulesets and custom rule writing guidance.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Official Rulesets](#official-rulesets)
|
||||
- [Language-Specific Rules](#language-specific-rules)
|
||||
- [Framework-Specific Rules](#framework-specific-rules)
|
||||
- [Custom Rule Writing](#custom-rule-writing)
|
||||
- [Rule Testing](#rule-testing)
|
||||
|
||||
## Official Rulesets
|
||||
|
||||
### Comprehensive Rulesets
|
||||
|
||||
| Ruleset | Config | Description | Use Case |
|---------|--------|-------------|----------|
| Auto | `auto` | Automatically selected rules based on detected languages | Quick scans, baseline |
| Security Audit | `p/security-audit` | Comprehensive security rules across languages | Deep security review |
| OWASP Top 10 | `p/owasp-top-ten` | OWASP Top 10 2021 coverage | Compliance, security gates |
| CWE Top 25 | `p/cwe-top-25` | SANS/CWE Top 25 dangerous errors | Critical vulnerability detection |
| CI | `p/ci` | Fast, low false-positive rules for CI/CD | Pull request gates |
| Default | `p/default` | Balanced security and quality rules | General purpose scanning |
|
||||
|
||||
### Specialized Rulesets
|
||||
|
||||
| Ruleset | Config | Focus Area |
|---------|--------|------------|
| Secrets | `p/secrets` | Hard-coded credentials, API keys |
| Cryptography | `p/crypto` | Weak crypto, hashing issues |
| Supply Chain | `p/supply-chain` | Dependency vulnerabilities |
| JWT | `p/jwt` | JSON Web Token security |
| SQL Injection | `p/sql-injection` | SQL injection patterns |
| XSS | `p/xss` | Cross-site scripting |
| Command Injection | `p/command-injection` | OS command injection |
|
||||
|
||||
## Language-Specific Rules
|
||||
|
||||
### Python
|
||||
|
||||
```bash
|
||||
# Django security
|
||||
semgrep --config "p/django"
|
||||
|
||||
# Flask security
|
||||
semgrep --config "r/python.flask.security"
|
||||
|
||||
# General Python security
|
||||
semgrep --config "r/python.lang.security"
|
||||
|
||||
# Specific vulnerabilities
|
||||
semgrep --config "r/python.lang.security.audit.exec-used"
|
||||
semgrep --config "r/python.lang.security.audit.unsafe-pickle"
|
||||
semgrep --config "r/python.lang.security.audit.dangerous-subprocess-use"
|
||||
```
|
||||
|
||||
**Key Python Rules:**
|
||||
- `python.django.security.injection.sql.sql-injection-db-cursor-execute`
|
||||
- `python.flask.security.xss.audit.template-xss`
|
||||
- `python.lang.security.audit.exec-used`
|
||||
- `python.lang.security.audit.dangerous-os-module-methods`
|
||||
- `python.lang.security.audit.hashlib-md5-used`
|
||||
|
||||
### JavaScript/TypeScript
|
||||
|
||||
```bash
|
||||
# Express.js security
|
||||
semgrep --config "p/express"
|
||||
|
||||
# React security
|
||||
semgrep --config "p/react"
|
||||
|
||||
# Node.js security
|
||||
semgrep --config "r/javascript.lang.security"
|
||||
|
||||
# Specific vulnerabilities
|
||||
semgrep --config "r/javascript.lang.security.audit.eval-detected"
|
||||
semgrep --config "r/javascript.lang.security.audit.unsafe-exec"
|
||||
```
|
||||
|
||||
**Key JavaScript Rules:**
|
||||
- `javascript.express.security.audit.xss.mustache.var-in-href`
|
||||
- `javascript.lang.security.audit.eval-detected`
|
||||
- `javascript.lang.security.audit.path-traversal`
|
||||
- `javascript.sequelize.security.audit.sequelize-injection-express`
|
||||
|
||||
### Java
|
||||
|
||||
```bash
|
||||
# Spring security
|
||||
semgrep --config "p/spring"
|
||||
|
||||
# General Java security
|
||||
semgrep --config "r/java.lang.security"
|
||||
|
||||
# Specific frameworks
|
||||
semgrep --config "r/java.spring.security"
|
||||
```
|
||||
|
||||
**Key Java Rules:**
|
||||
- `java.lang.security.audit.sqli.jdbc-sqli`
|
||||
- `java.lang.security.audit.xxe.xmlinputfactory-xxe`
|
||||
- `java.spring.security.audit.spring-cookie-missing-httponly`
|
||||
|
||||
### Go
|
||||
|
||||
```bash
|
||||
# Go security rules
|
||||
semgrep --config "r/go.lang.security"
|
||||
|
||||
# Specific vulnerabilities
|
||||
semgrep --config "r/go.lang.security.audit.net.use-of-tls-with-go-sql-driver"
|
||||
semgrep --config "r/go.lang.security.audit.crypto.use_of_weak_crypto"
|
||||
```
|
||||
|
||||
### PHP
|
||||
|
||||
```bash
|
||||
# PHP security
|
||||
semgrep --config "p/php"
|
||||
|
||||
# Laravel security
|
||||
semgrep --config "r/php.laravel.security"
|
||||
|
||||
# Specific vulnerabilities
|
||||
semgrep --config "r/php.lang.security.audit.sqli"
|
||||
semgrep --config "r/php.lang.security.audit.dangerous-exec"
|
||||
```
|
||||
|
||||
## Framework-Specific Rules
|
||||
|
||||
### Web Frameworks
|
||||
|
||||
**Django:**
|
||||
```bash
|
||||
semgrep --config "p/django"
|
||||
# Covers: SQL injection, XSS, CSRF, auth issues
|
||||
```
|
||||
|
||||
**Flask:**
|
||||
```bash
|
||||
semgrep --config "r/python.flask.security"
|
||||
# Covers: XSS, debug mode, secure cookies
|
||||
```
|
||||
|
||||
**Express.js:**
|
||||
```bash
|
||||
semgrep --config "p/express"
|
||||
# Covers: XSS, CSRF, session config, CORS
|
||||
```
|
||||
|
||||
**Spring Boot:**
|
||||
```bash
|
||||
semgrep --config "p/spring"
|
||||
# Covers: SQL injection, XXE, auth, SSRF
|
||||
```
|
||||
|
||||
### Cloud & Infrastructure
|
||||
|
||||
**Terraform:**
|
||||
```bash
|
||||
semgrep --config "r/terraform.lang.security"
|
||||
# Covers: S3 buckets, security groups, encryption
|
||||
```
|
||||
|
||||
**Kubernetes:**
|
||||
```bash
|
||||
semgrep --config "r/yaml.kubernetes.security"
|
||||
# Covers: privileged containers, secrets, rbac
|
||||
```
|
||||
|
||||
**Docker:**
|
||||
```bash
|
||||
semgrep --config "r/dockerfile.security"
|
||||
# Covers: unsafe base images, secrets, root user
|
||||
```
|
||||
|
||||
## Custom Rule Writing
|
||||
|
||||
### Rule Anatomy
|
||||
|
||||
```yaml
|
||||
rules:
  - id: custom-rule-id
    pattern: execute($SQL)
    message: Potential security issue detected
    severity: WARNING
    languages: [python]
    metadata:
      category: security
      cwe: "CWE-89"
      owasp: "A03:2021-Injection"
      confidence: HIGH
|
||||
```
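
Before handing a rule file to Semgrep, a quick structural check can catch missing fields early. The sketch below works on a rule that has already been loaded into a Python dict (for example by a YAML parser) and covers only search-mode rules; `missing_rule_fields` is a hypothetical helper, and `semgrep --validate` remains the authoritative check.

```python
from typing import Any, Dict, List

REQUIRED_KEYS = ("id", "message", "severity", "languages")
# A search-mode rule needs at least one of these pattern operators.
PATTERN_KEYS = ("pattern", "patterns", "pattern-either", "pattern-regex")


def missing_rule_fields(rule: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a single rule dict (empty if none)."""
    problems = [key for key in REQUIRED_KEYS if key not in rule]
    if not any(key in rule for key in PATTERN_KEYS):
        problems.append("one of: " + ", ".join(PATTERN_KEYS))
    return problems


if __name__ == "__main__":
    rule = {
        "id": "custom-rule-id",
        "pattern": "execute($SQL)",
        "message": "Potential security issue detected",
        "severity": "WARNING",
        "languages": ["python"],
    }
    print(missing_rule_fields(rule))
```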
|
||||
|
||||
### Pattern Types
|
||||
|
||||
**1. Basic Pattern**
|
||||
```yaml
|
||||
pattern: dangerous_function($ARG)
|
||||
```
|
||||
|
||||
**2. Pattern-Inside (Context)**
|
||||
```yaml
|
||||
patterns:
  - pattern: execute($QUERY)
  - pattern-inside: |
      $QUERY = $USER_INPUT + ...
      ...
|
||||
```
|
||||
|
||||
**3. Pattern-Not (Exclusion)**
|
||||
```yaml
|
||||
patterns:
|
||||
- pattern: execute($QUERY)
|
||||
- pattern-not: execute("SELECT * FROM safe_table")
|
||||
```
|
||||
|
||||
**4. Pattern-Either (OR logic)**
|
||||
```yaml
|
||||
pattern-either:
|
||||
- pattern: eval($ARG)
|
||||
- pattern: exec($ARG)
|
||||
```
|
||||
|
||||
**5. Metavariable Comparison**
|
||||
```yaml
|
||||
patterns:
  - pattern: crypto.generate_key($SIZE)
  - metavariable-comparison:
      metavariable: $SIZE
      # The comparison is a restricted Python expression over the matched value;
      # functions such as len() are not supported
      comparison: $SIZE < 16
|
||||
```
|
||||
|
||||
### Example Custom Rules
|
||||
|
||||
**Detect Hard-coded AWS Keys:**
|
||||
```yaml
|
||||
rules:
  - id: hardcoded-aws-key
    patterns:
      - pattern-regex: 'AKIA[0-9A-Z]{16}'
    message: Hard-coded AWS access key detected
    severity: ERROR
    languages: [python, javascript, java, go]
    metadata:
      category: security
      cwe: "CWE-798"
      confidence: HIGH
|
||||
```
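
The regular expression can be tried out on sample text before it goes into a rule. A small sketch with Python's `re` module; the key in the example is the placeholder value AWS uses in its documentation, and `find_aws_keys` is a hypothetical helper.

```python
import re
from typing import List

# Same expression as the pattern-regex in the rule above.
AWS_KEY_RE = re.compile(r"AKIA[0-9A-Z]{16}")


def find_aws_keys(text: str) -> List[str]:
    """Return every substring that looks like an AWS access key ID."""
    return AWS_KEY_RE.findall(text)


if __name__ == "__main__":
    print(find_aws_keys('aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"'))
```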
|
||||
|
||||
**Detect Unsafe File Operations:**
|
||||
```yaml
|
||||
rules:
  - id: unsafe-file-read
    patterns:
      - pattern: open($PATH, ...)
      - pattern-inside: |
          def $FUNC(..., $USER_INPUT, ...):
            ...
            $PATH = ... + $USER_INPUT + ...
            ...
    message: File path constructed from user input (path traversal risk)
    severity: WARNING
    languages: [python]
    metadata:
      cwe: "CWE-22"
      owasp: "A01:2021-Broken-Access-Control"
|
||||
```
|
||||
|
||||
**Detect Missing CSRF Protection:**
|
||||
```yaml
|
||||
rules:
  - id: flask-missing-csrf
    patterns:
      - pattern: |
          @app.route($PATH, methods=[..., "POST", ...])
          def $FUNC(...):
            ...
      - pattern-not-inside: |
          @csrf.exempt
          ...
      - pattern-not-inside: |
          csrf_token = ...
          ...
    message: POST route without CSRF protection
    severity: ERROR
    languages: [python]
    metadata:
      cwe: "CWE-352"
      owasp: "A01:2021-Broken-Access-Control"
|
||||
```
|
||||
|
||||
**Detect Insecure Random:**
|
||||
```yaml
|
||||
rules:
  - id: insecure-random-for-crypto
    patterns:
      - pattern-either:
          - pattern: random.random()
          - pattern: random.randint(...)
      - pattern-inside: |
          def $FUNC(...):
            ...
      - metavariable-regex:
          metavariable: $FUNC
          regex: .*_token$
    message: Using insecure random for a security token; use the secrets module instead (e.g., secrets.token_bytes(32))
    severity: ERROR
    languages: [python]
    metadata:
      cwe: "CWE-330"
|
||||
```
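
The remediation for a finding from this rule is usually a one-line change to the standard `secrets` module. A minimal sketch; `new_session_token` is a hypothetical function name and the 32-byte size is an assumption.

```python
import secrets


def new_session_token(nbytes: int = 32) -> str:
    """Return a URL-safe token drawn from the operating system's CSPRNG."""
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    print(new_session_token())
    # Raw bytes are available as well, for example for key material.
    print(len(secrets.token_bytes(32)))
```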
|
||||
|
||||
### Rule Metadata Best Practices
|
||||
|
||||
Include comprehensive metadata:
|
||||
```yaml
|
||||
metadata:
  category: security              # Type of issue
  cwe: "CWE-XXX"                  # CWE mapping
  owasp: "AXX:2021-Name"          # OWASP category
  confidence: HIGH|MEDIUM|LOW     # Detection confidence
  likelihood: HIGH|MEDIUM|LOW     # Exploitation likelihood
  impact: HIGH|MEDIUM|LOW         # Security impact
  subcategory: [vuln-type]        # More specific categorization
  source-rule: url                # If adapted from elsewhere
  references:
    - https://example.com/docs
|
||||
```
|
||||
|
||||
## Rule Testing
|
||||
|
||||
### Test File Structure
|
||||
```
|
||||
custom-rules/
├── rules.yaml          # Your custom rules
└── tests/
    ├── test-sqli.py    # Test cases
    └── test-xss.js     # Test cases
|
||||
```

### Writing Tests

```python
# tests/test-sqli.py

# ruleid: custom-sql-injection
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")

# ok: custom-sql-injection
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
```
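
Each annotation comment describes the line directly below it: `ruleid:` marks a line the rule must flag, and `ok:` marks a line it must not flag. The sketch below illustrates that convention by extracting the expectations from a test file's text. It is a simplified illustration, not Semgrep's own test runner; the `expected_results` helper and the `SAMPLE` string are hypothetical.

```python
import re
from typing import List, Tuple

# Matches "# ruleid: <rule-id>" or "# ok: <rule-id>" in Python-style comments.
ANNOTATION = re.compile(r"#\s*(ruleid|ok):\s*([\w.-]+)")


def expected_results(source: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, kind, rule_id) for every annotated line."""
    results = []
    for index, line in enumerate(source.splitlines()):
        match = ANNOTATION.search(line)
        if match:
            # The annotation applies to the following line, so report the
            # 1-based number of that next line (index + 2).
            results.append((index + 2, match.group(1), match.group(2)))
    return results


SAMPLE = '''# ruleid: custom-sql-injection
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")

# ok: custom-sql-injection
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
'''
```

Calling `expected_results(SAMPLE)` reports that line 2 must be flagged by `custom-sql-injection` and line 5 must not be.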

### Running Tests

```bash
# Test custom rules
semgrep --config rules.yaml --test tests/

# Validate rule syntax
semgrep --validate --config rules.yaml
```

## Rule Performance Optimization

### 1. Use Specific Patterns
```yaml
# SLOW: a bare metavariable matches every expression
pattern: $X

# FAST: anchored to a specific call
pattern: dangerous_function($X)
```

### 2. Limit Language Scope
```yaml
# Only scan relevant languages
languages: [python, javascript]
```

### 3. Use Pattern-Inside Wisely
```yaml
# Narrow down context early
patterns:
  - pattern-inside: |
      def handle_request(...):
        ...
  - pattern: execute($QUERY)
```

### 4. Exclude Test Files
```yaml
paths:
  exclude:
    - "*/test_*.py"
    - "*/tests/*"
    - "*_test.go"
```

## Community Rules

Explore community-contributed rules:

```bash
# Scan with rules for a specific technology
semgrep --config "r/python.django"
semgrep --config "r/javascript.react"
semgrep --config "r/go.gorilla"

# Scan with rules for a specific vulnerability type
semgrep --config "r/generic.secrets"
semgrep --config "r/generic.html-templates"
```

**Useful Community Rulesets:**
- `r/python.aws-lambda.security` - AWS Lambda security
- `r/terraform.aws.security` - AWS Terraform
- `r/dockerfile.best-practice` - Docker best practices
- `r/yaml.github-actions.security` - GitHub Actions security

## References

- [Semgrep Rule Syntax](https://semgrep.dev/docs/writing-rules/rule-syntax/)
- [Semgrep Registry](https://semgrep.dev/explore)
- [Pattern Examples](https://semgrep.dev/docs/writing-rules/pattern-examples/)
- [Rule Writing Tutorial](https://semgrep.dev/learn)