Initial commit

2025-11-29 17:51:02 +08:00
commit ff1f4bd119
252 changed files with 72682 additions and 0 deletions
--- a/skills/appsec/sast-bandit/references/remediation_guide.md
+++ b/skills/appsec/sast-bandit/references/remediation_guide.md
@@ -0,0 +1,622 @@
+# Bandit Finding Remediation Guide
+
+Comprehensive secure coding patterns and remediation strategies for common Bandit findings.
+
+## Table of Contents
+
+- [Hardcoded Credentials](#hardcoded-credentials)
+- [SQL Injection](#sql-injection)
+- [Command Injection](#command-injection)
+- [Weak Cryptography](#weak-cryptography)
+- [Insecure Deserialization](#insecure-deserialization)
+- [XML External Entity (XXE)](#xml-external-entity-xxe)
+- [Security Misconfiguration](#security-misconfiguration)
+
+---
+
+## Hardcoded Credentials
+
+### B105, B106, B107: Hardcoded Passwords
+
+**Vulnerable Code:**
+```python
+# B105: Hardcoded password string
+DATABASE_PASSWORD = "admin123"
+
+# B106: Hardcoded password in function call
+db.connect(host="localhost", password="secret_password")
+
+# B107: Hardcoded password default argument
+def connect_db(password="default_pass"):
+    pass
+```
+
+**Secure Solution:**
+```python
+import os
+from dotenv import load_dotenv
+
+# Load environment variables
+load_dotenv()
+
+# Use environment variables
+DATABASE_PASSWORD = os.environ.get("DATABASE_PASSWORD")
+if not DATABASE_PASSWORD:
+    raise ValueError("DATABASE_PASSWORD environment variable not set")
+
+# Use environment variables in function calls
+db.connect(
+    host=os.environ.get("DB_HOST", "localhost"),
+    password=os.environ.get("DB_PASSWORD")
+)
+
+# Use secret management service (example with AWS Secrets Manager)
+import boto3
+from botocore.exceptions import ClientError
+
+def get_secret(secret_name):
+    session = boto3.session.Session()
+    client = session.client(service_name='secretsmanager', region_name='us-east-1')
+    try:
+        response = client.get_secret_value(SecretId=secret_name)
+        return response['SecretString']
+    except ClientError as e:
+        raise Exception(f"Failed to retrieve secret: {e}")
+
+DATABASE_PASSWORD = get_secret("prod/db/password")
+```
+
+**Best Practices:**
+- Use environment variables with `.env` files (never commit `.env` to version control)
+- Use secret management services (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault)
+- Implement secret rotation policies
+- Use configuration management tools (Ansible Vault, Kubernetes Secrets)
+
+---
+
+## SQL Injection
+
+### B608: SQL Injection via String Formatting
+
+**Vulnerable Code:**
+```python
+# String formatting (UNSAFE)
+user_id = request.GET['id']
+query = f"SELECT * FROM users WHERE id = {user_id}"
+cursor.execute(query)
+
+# String concatenation (UNSAFE)
+query = "SELECT * FROM users WHERE username = '" + username + "'"
+cursor.execute(query)
+
+# Percent formatting (UNSAFE)
+query = "SELECT * FROM users WHERE email = '%s'" % email
+cursor.execute(query)
+```
+
+**Secure Solution with psycopg2:**
+```python
+import psycopg2
+
+# Parameterized queries (SAFE)
+user_id = request.GET['id']
+query = "SELECT * FROM users WHERE id = %s"
+cursor.execute(query, (user_id,))
+
+# Multiple parameters
+query = "SELECT * FROM users WHERE username = %s AND active = %s"
+cursor.execute(query, (username, True))
+
+# Named parameters
+query = "SELECT * FROM users WHERE username = %(username)s AND email = %(email)s"
+cursor.execute(query, {'username': username, 'email': email})
+```
+
+**Secure Solution with SQLAlchemy ORM:**
+```python
+from sqlalchemy import create_engine, select
+from sqlalchemy.orm import Session
+
+# Using ORM (SAFE)
+with Session(engine) as session:
+    stmt = select(User).where(User.username == username)
+    user = session.execute(stmt).scalar_one_or_none()
+
+# Using bound parameters with raw SQL (SAFE)
+with Session(engine) as session:
+    result = session.execute(
+        text("SELECT * FROM users WHERE username = :username"),
+        {"username": username}
+    )
+```
+
+**Secure Solution with Django ORM:**
+```python
+from django.db.models import Q
+
+# Django ORM (SAFE)
+users = User.objects.filter(username=username)
+
+# Complex queries (SAFE)
+users = User.objects.filter(Q(username=username) | Q(email=email))
+
+# Raw SQL with parameters (SAFE)
+from django.db import connection
+with connection.cursor() as cursor:
+    cursor.execute("SELECT * FROM users WHERE username = %s", [username])
+```
+
+**Best Practices:**
+- Always use parameterized queries or prepared statements
+- Never concatenate user input into SQL queries
+- Use ORM when possible for automatic escaping
+- Validate and sanitize inputs at application boundaries
+- Apply least privilege principle to database accounts
+
+---
+
+## Command Injection
+
+### B602, B604, B605: Shell Injection in Subprocess
+
+**Vulnerable Code:**
+```python
+import subprocess
+import os
+
+# shell=True with user input (VERY UNSAFE)
+filename = request.GET['file']
+subprocess.call(f"cat {filename}", shell=True)
+
+# os.system with user input (VERY UNSAFE)
+os.system(f"ping -c 1 {hostname}")
+
+# String concatenation (UNSAFE)
+cmd = "curl " + user_url
+subprocess.call(cmd, shell=True)
+```
+
+**Secure Solution:**
+```python
+import subprocess
+import shlex
+from pathlib import Path
+
+# Use list of arguments without shell=True (SAFE)
+filename = request.GET['file']
+subprocess.run(["cat", filename], check=True, capture_output=True)
+
+# Validate input before use
+def validate_filename(filename):
+    """Validate filename to prevent path traversal."""
+    # Allow only alphanumeric, dash, underscore, and dot
+    if not re.match(r'^[a-zA-Z0-9_.-]+$', filename):
+        raise ValueError("Invalid filename")
+
+    # Resolve to absolute path and check it's within allowed directory
+    file_path = Path(UPLOAD_DIR) / filename
+    if not file_path.resolve().is_relative_to(Path(UPLOAD_DIR).resolve()):
+        raise ValueError("Path traversal detected")
+
+    return file_path
+
+filename = validate_filename(request.GET['file'])
+subprocess.run(["cat", str(filename)], check=True, capture_output=True)
+
+# Use shlex.split() for complex commands
+import shlex
+command_string = "ping -c 1 example.com"
+subprocess.run(shlex.split(command_string), check=True, capture_output=True)
+
+# Whitelist approach for restricted commands
+ALLOWED_COMMANDS = {
+    'ping': ['ping', '-c', '1'],
+    'traceroute': ['traceroute', '-m', '10'],
+}
+
+command_type = request.GET['command']
+target = request.GET['target']
+
+if command_type not in ALLOWED_COMMANDS:
+    raise ValueError("Command not allowed")
+
+# Validate target (e.g., IP address or hostname)
+if not re.match(r'^[a-zA-Z0-9.-]+$', target):
+    raise ValueError("Invalid target")
+
+cmd = ALLOWED_COMMANDS[command_type] + [target]
+subprocess.run(cmd, check=True, capture_output=True, timeout=10)
+```
+
+**Best Practices:**
+- Never use `shell=True` with user input
+- Pass arguments as list, not string
+- Validate and whitelist all user inputs
+- Use `shlex.split()` for parsing command strings
+- Implement timeouts to prevent DoS
+- Run subprocesses with minimal privileges
+
+---
+
+## Weak Cryptography
+
+### B303, B304, B324: Weak Hash Functions
+
+**Vulnerable Code:**
+```python
+import hashlib
+import md5  # Deprecated
+
+# MD5 (WEAK)
+password_hash = hashlib.md5(password.encode()).hexdigest()
+
+# SHA1 (WEAK)
+token = hashlib.sha1(user_data.encode()).hexdigest()
+```
+
+**Secure Solution:**
+```python
+import hashlib
+import secrets
+import bcrypt
+from argon2 import PasswordHasher
+
+# SHA-256 for general hashing (ACCEPTABLE for non-password data)
+data_hash = hashlib.sha256(data.encode()).hexdigest()
+
+# SHA-512 (BETTER for general hashing)
+data_hash = hashlib.sha512(data.encode()).hexdigest()
+
+# bcrypt for password hashing (RECOMMENDED)
+def hash_password(password: str) -> bytes:
+    """Hash password using bcrypt with salt."""
+    salt = bcrypt.gensalt(rounds=12)  # Cost factor 12
+    return bcrypt.hashpw(password.encode(), salt)
+
+def verify_password(password: str, hashed: bytes) -> bool:
+    """Verify password against bcrypt hash."""
+    return bcrypt.checkpw(password.encode(), hashed)
+
+# Argon2 for password hashing (BEST - winner of Password Hashing Competition)
+ph = PasswordHasher(
+    time_cost=2,       # Number of iterations
+    memory_cost=65536, # Memory usage in KiB (64 MB)
+    parallelism=4,     # Number of parallel threads
+)
+
+def hash_password_argon2(password: str) -> str:
+    """Hash password using Argon2."""
+    return ph.hash(password)
+
+def verify_password_argon2(password: str, hashed: str) -> bool:
+    """Verify password against Argon2 hash."""
+    try:
+        ph.verify(hashed, password)
+        return True
+    except:
+        return False
+
+# HMAC for message authentication
+import hmac
+def create_signature(message: str, secret_key: bytes) -> str:
+    """Create HMAC-SHA256 signature."""
+    return hmac.new(
+        secret_key,
+        message.encode(),
+        hashlib.sha256
+    ).hexdigest()
+```
+
+### B501, B502, B503: Weak SSL/TLS Configuration
+
+**Vulnerable Code:**
+```python
+import ssl
+import requests
+
+# Weak SSL version (UNSAFE)
+context = ssl.SSLContext(ssl.PROTOCOL_SSLv3)
+
+# Disabling certificate verification (VERY UNSAFE)
+requests.get('https://example.com', verify=False)
+```
+
+**Secure Solution:**
+```python
+import ssl
+import requests
+
+# Strong SSL/TLS configuration (SAFE)
+context = ssl.create_default_context()
+context.minimum_version = ssl.TLSVersion.TLSv1_2
+context.maximum_version = ssl.TLSVersion.TLSv1_3
+
+# Restrict cipher suites
+context.set_ciphers('ECDHE+AESGCM:ECDHE+CHACHA20:DHE+AESGCM:DHE+CHACHA20:!aNULL:!MD5:!DSS')
+
+# Enable certificate verification (default in requests)
+response = requests.get('https://example.com', verify=True)
+
+# Custom CA bundle
+response = requests.get('https://example.com', verify='/path/to/ca-bundle.crt')
+
+# For urllib
+import urllib.request
+import certifi
+
+url = 'https://example.com'
+response = urllib.request.urlopen(url, context=context, cafile=certifi.where())
+```
+
+**Best Practices:**
+- Use TLS 1.2 or TLS 1.3 only
+- Disable weak cipher suites
+- Always verify certificates in production
+- Use certificate pinning for critical connections
+- Regularly update SSL/TLS libraries
+
+---
+
+## Insecure Deserialization
+
+### B301: Pickle Usage
+
+**Vulnerable Code:**
+```python
+import pickle
+
+# Deserializing untrusted data (VERY UNSAFE)
+user_data = pickle.loads(request.body)
+
+# Loading from file (UNSAFE if file is from untrusted source)
+with open('user_session.pkl', 'rb') as f:
+    session = pickle.load(f)
+```
+
+**Secure Solution:**
+```python
+import json
+import msgpack
+from cryptography.fernet import Fernet
+
+# Use JSON for simple data (SAFE)
+user_data = json.loads(request.body)
+
+# Use MessagePack for binary efficiency (SAFE)
+user_data = msgpack.unpackb(request.body)
+
+# If pickle is absolutely necessary, use cryptographic signing
+import hmac
+import hashlib
+import pickle
+
+SECRET_KEY = os.environ['SECRET_KEY'].encode()
+
+def secure_pickle_dumps(obj):
+    """Serialize with HMAC signature."""
+    pickled = pickle.dumps(obj)
+    signature = hmac.new(SECRET_KEY, pickled, hashlib.sha256).digest()
+    return signature + pickled
+
+def secure_pickle_loads(data):
+    """Deserialize with signature verification."""
+    signature = data[:32]  # SHA256 is 32 bytes
+    pickled = data[32:]
+
+    expected_signature = hmac.new(SECRET_KEY, pickled, hashlib.sha256).digest()
+
+    if not hmac.compare_digest(signature, expected_signature):
+        raise ValueError("Invalid signature - data may be tampered")
+
+    return pickle.loads(pickled)
+
+# Better: Use itsdangerous for secure serialization
+from itsdangerous import URLSafeSerializer
+
+serializer = URLSafeSerializer(SECRET_KEY)
+
+# Serialize (signed and safe)
+token = serializer.dumps({'user_id': 123, 'role': 'admin'})
+
+# Deserialize (verified)
+data = serializer.loads(token)
+```
+
+**Best Practices:**
+- Avoid pickle for untrusted data
+- Use JSON, MessagePack, or Protocol Buffers
+- If pickle is required, implement cryptographic signing
+- Use `itsdangerous` library for secure token serialization
+- Restrict pickle to internal, trusted data only
+
+---
+
+## XML External Entity (XXE)
+
+### B313-B320, B405-B412: XML Parsing Vulnerabilities
+
+**Vulnerable Code:**
+```python
+import xml.etree.ElementTree as ET
+from lxml import etree
+
+# Unsafe XML parsing (VULNERABLE to XXE)
+tree = ET.parse(user_xml_file)
+root = tree.getroot()
+
+# lxml unsafe parsing
+parser = etree.XMLParser()
+tree = etree.parse(user_xml_file, parser)
+```
+
+**Secure Solution:**
+```python
+import xml.etree.ElementTree as ET
+from lxml import etree
+import defusedxml.ElementTree as defusedET
+
+# Use defusedxml (RECOMMENDED)
+tree = defusedET.parse(user_xml_file)
+root = tree.getroot()
+
+# Disable external entities in ElementTree
+ET.XMLParser.entity = {}  # Disable entity expansion
+
+# Secure lxml configuration
+parser = etree.XMLParser(
+    resolve_entities=False,  # Disable entity resolution
+    no_network=True,         # Disable network access
+    dtd_validation=False,    # Disable DTD validation
+    load_dtd=False          # Don't load DTD
+)
+tree = etree.parse(user_xml_file, parser)
+
+# Alternative: Use JSON instead of XML when possible
+import json
+data = json.loads(request.body)
+```
+
+**Best Practices:**
+- Use `defusedxml` library for all XML parsing
+- Disable DTD processing and external entity resolution
+- Validate XML against strict schema (XSD)
+- Consider using JSON instead of XML for APIs
+- Never parse XML from untrusted sources without defusedxml
+
+---
+
+## Security Misconfiguration
+
+### B201: Flask Debug Mode
+
+**Vulnerable Code:**
+```python
+from flask import Flask
+
+app = Flask(__name__)
+
+# Debug mode in production (VERY UNSAFE)
+app.run(debug=True, host='0.0.0.0')
+```
+
+**Secure Solution:**
+```python
+from flask import Flask
+import os
+
+app = Flask(__name__)
+
+# Use environment-based configuration
+DEBUG = os.environ.get('FLASK_DEBUG', 'false').lower() == 'true'
+ENV = os.environ.get('FLASK_ENV', 'production')
+
+if ENV == 'production' and DEBUG:
+    raise ValueError("Debug mode cannot be enabled in production")
+
+app.config['DEBUG'] = DEBUG
+app.config['ENV'] = ENV
+app.config['SECRET_KEY'] = os.environ['SECRET_KEY']
+
+# Use production WSGI server
+if ENV == 'production':
+    # Deploy with gunicorn or uwsgi, not app.run()
+    # gunicorn -w 4 -b 0.0.0.0:8000 app:app
+    pass
+else:
+    app.run(debug=DEBUG, host='127.0.0.1', port=5000)
+```
+
+### B506: YAML Load
+
+**Vulnerable Code:**
+```python
+import yaml
+
+# Arbitrary code execution (VERY UNSAFE)
+config = yaml.load(user_input, Loader=yaml.Loader)
+```
+
+**Secure Solution:**
+```python
+import yaml
+
+# Safe YAML loading (SAFE)
+config = yaml.safe_load(user_input)
+
+# For complex objects, use schema validation
+from schema import Schema, And, Use, Optional
+
+config_schema = Schema({
+    'database': {
+        'host': And(str, len),
+        'port': And(Use(int), lambda n: 1024 <= n <= 65535),
+    },
+    Optional('debug'): bool,
+})
+
+config = yaml.safe_load(user_input)
+validated_config = config_schema.validate(config)
+```
+
+### B701, B702, B703: Template Autoescape
+
+**Vulnerable Code:**
+```python
+from jinja2 import Environment
+
+# Autoescape disabled (XSS VULNERABLE)
+env = Environment(autoescape=False)
+template = env.from_string(user_template)
+output = template.render(name=user_input)
+```
+
+**Secure Solution:**
+```python
+from jinja2 import Environment, select_autoescape
+from markupsafe import Markup, escape
+
+# Enable autoescape (SAFE)
+env = Environment(
+    autoescape=select_autoescape(['html', 'xml'])
+)
+
+# Or for all templates
+env = Environment(autoescape=True)
+
+# Explicitly mark safe content
+def render_html(content):
+    # Sanitize first
+    clean_content = escape(content)
+    return Markup(clean_content)
+
+# Django: Ensure autoescape is enabled (default)
+# In Django templates:
+# {{ user_input }}  <!-- Auto-escaped -->
+# {{ user_input|safe }}  <!-- Only use after sanitization -->
+```
+
+**Best Practices:**
+- Always enable autoescape in template engines
+- Never mark user input as safe without sanitization
+- Use Content Security Policy (CSP) headers
+- Validate and sanitize all user inputs
+- Use templating libraries with secure defaults
+
+---
+
+## General Security Principles
+
+1. **Defense in Depth**: Implement multiple layers of security controls
+2. **Least Privilege**: Grant minimum necessary permissions
+3. **Fail Securely**: Errors should not expose sensitive information
+4. **Input Validation**: Validate all inputs at trust boundaries
+5. **Output Encoding**: Encode data based on output context
+6. **Secure Defaults**: Use secure configurations by default
+7. **Keep Dependencies Updated**: Regularly update security libraries
+8. **Security Testing**: Include security tests in CI/CD pipelines
+
+## Additional Resources
+
+- [OWASP Cheat Sheet Series](https://cheatsheetseries.owasp.org/)
+- [Python Security Best Practices](https://python.readthedocs.io/en/stable/library/security_warnings.html)
+- [CWE Top 25](https://cwe.mitre.org/top25/)