# Bandit Finding Remediation Guide Comprehensive secure coding patterns and remediation strategies for common Bandit findings. ## Table of Contents - [Hardcoded Credentials](#hardcoded-credentials) - [SQL Injection](#sql-injection) - [Command Injection](#command-injection) - [Weak Cryptography](#weak-cryptography) - [Insecure Deserialization](#insecure-deserialization) - [XML External Entity (XXE)](#xml-external-entity-xxe) - [Security Misconfiguration](#security-misconfiguration) --- ## Hardcoded Credentials ### B105, B106, B107: Hardcoded Passwords **Vulnerable Code:** ```python # B105: Hardcoded password string DATABASE_PASSWORD = "admin123" # B106: Hardcoded password in function call db.connect(host="localhost", password="secret_password") # B107: Hardcoded password default argument def connect_db(password="default_pass"): pass ``` **Secure Solution:** ```python import os from dotenv import load_dotenv # Load environment variables load_dotenv() # Use environment variables DATABASE_PASSWORD = os.environ.get("DATABASE_PASSWORD") if not DATABASE_PASSWORD: raise ValueError("DATABASE_PASSWORD environment variable not set") # Use environment variables in function calls db.connect( host=os.environ.get("DB_HOST", "localhost"), password=os.environ.get("DB_PASSWORD") ) # Use secret management service (example with AWS Secrets Manager) import boto3 from botocore.exceptions import ClientError def get_secret(secret_name): session = boto3.session.Session() client = session.client(service_name='secretsmanager', region_name='us-east-1') try: response = client.get_secret_value(SecretId=secret_name) return response['SecretString'] except ClientError as e: raise Exception(f"Failed to retrieve secret: {e}") DATABASE_PASSWORD = get_secret("prod/db/password") ``` **Best Practices:** - Use environment variables with `.env` files (never commit `.env` to version control) - Use secret management services (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault) - Implement secret rotation policies - Use configuration management tools (Ansible Vault, Kubernetes Secrets) --- ## SQL Injection ### B608: SQL Injection via String Formatting **Vulnerable Code:** ```python # String formatting (UNSAFE) user_id = request.GET['id'] query = f"SELECT * FROM users WHERE id = {user_id}" cursor.execute(query) # String concatenation (UNSAFE) query = "SELECT * FROM users WHERE username = '" + username + "'" cursor.execute(query) # Percent formatting (UNSAFE) query = "SELECT * FROM users WHERE email = '%s'" % email cursor.execute(query) ``` **Secure Solution with psycopg2:** ```python import psycopg2 # Parameterized queries (SAFE) user_id = request.GET['id'] query = "SELECT * FROM users WHERE id = %s" cursor.execute(query, (user_id,)) # Multiple parameters query = "SELECT * FROM users WHERE username = %s AND active = %s" cursor.execute(query, (username, True)) # Named parameters query = "SELECT * FROM users WHERE username = %(username)s AND email = %(email)s" cursor.execute(query, {'username': username, 'email': email}) ``` **Secure Solution with SQLAlchemy ORM:** ```python from sqlalchemy import create_engine, select from sqlalchemy.orm import Session # Using ORM (SAFE) with Session(engine) as session: stmt = select(User).where(User.username == username) user = session.execute(stmt).scalar_one_or_none() # Using bound parameters with raw SQL (SAFE) with Session(engine) as session: result = session.execute( text("SELECT * FROM users WHERE username = :username"), {"username": username} ) ``` **Secure Solution with Django ORM:** ```python from django.db.models import Q # Django ORM (SAFE) users = User.objects.filter(username=username) # Complex queries (SAFE) users = User.objects.filter(Q(username=username) | Q(email=email)) # Raw SQL with parameters (SAFE) from django.db import connection with connection.cursor() as cursor: cursor.execute("SELECT * FROM users WHERE username = %s", [username]) ``` **Best Practices:** - Always use parameterized queries or prepared statements - Never concatenate user input into SQL queries - Use ORM when possible for automatic escaping - Validate and sanitize inputs at application boundaries - Apply least privilege principle to database accounts --- ## Command Injection ### B602, B604, B605: Shell Injection in Subprocess **Vulnerable Code:** ```python import subprocess import os # shell=True with user input (VERY UNSAFE) filename = request.GET['file'] subprocess.call(f"cat {filename}", shell=True) # os.system with user input (VERY UNSAFE) os.system(f"ping -c 1 {hostname}") # String concatenation (UNSAFE) cmd = "curl " + user_url subprocess.call(cmd, shell=True) ``` **Secure Solution:** ```python import subprocess import shlex from pathlib import Path # Use list of arguments without shell=True (SAFE) filename = request.GET['file'] subprocess.run(["cat", filename], check=True, capture_output=True) # Validate input before use def validate_filename(filename): """Validate filename to prevent path traversal.""" # Allow only alphanumeric, dash, underscore, and dot if not re.match(r'^[a-zA-Z0-9_.-]+$', filename): raise ValueError("Invalid filename") # Resolve to absolute path and check it's within allowed directory file_path = Path(UPLOAD_DIR) / filename if not file_path.resolve().is_relative_to(Path(UPLOAD_DIR).resolve()): raise ValueError("Path traversal detected") return file_path filename = validate_filename(request.GET['file']) subprocess.run(["cat", str(filename)], check=True, capture_output=True) # Use shlex.split() for complex commands import shlex command_string = "ping -c 1 example.com" subprocess.run(shlex.split(command_string), check=True, capture_output=True) # Whitelist approach for restricted commands ALLOWED_COMMANDS = { 'ping': ['ping', '-c', '1'], 'traceroute': ['traceroute', '-m', '10'], } command_type = request.GET['command'] target = request.GET['target'] if command_type not in ALLOWED_COMMANDS: raise ValueError("Command not allowed") # Validate target (e.g., IP address or hostname) if not re.match(r'^[a-zA-Z0-9.-]+$', target): raise ValueError("Invalid target") cmd = ALLOWED_COMMANDS[command_type] + [target] subprocess.run(cmd, check=True, capture_output=True, timeout=10) ``` **Best Practices:** - Never use `shell=True` with user input - Pass arguments as list, not string - Validate and whitelist all user inputs - Use `shlex.split()` for parsing command strings - Implement timeouts to prevent DoS - Run subprocesses with minimal privileges --- ## Weak Cryptography ### B303, B304, B324: Weak Hash Functions **Vulnerable Code:** ```python import hashlib import md5 # Deprecated # MD5 (WEAK) password_hash = hashlib.md5(password.encode()).hexdigest() # SHA1 (WEAK) token = hashlib.sha1(user_data.encode()).hexdigest() ``` **Secure Solution:** ```python import hashlib import secrets import bcrypt from argon2 import PasswordHasher # SHA-256 for general hashing (ACCEPTABLE for non-password data) data_hash = hashlib.sha256(data.encode()).hexdigest() # SHA-512 (BETTER for general hashing) data_hash = hashlib.sha512(data.encode()).hexdigest() # bcrypt for password hashing (RECOMMENDED) def hash_password(password: str) -> bytes: """Hash password using bcrypt with salt.""" salt = bcrypt.gensalt(rounds=12) # Cost factor 12 return bcrypt.hashpw(password.encode(), salt) def verify_password(password: str, hashed: bytes) -> bool: """Verify password against bcrypt hash.""" return bcrypt.checkpw(password.encode(), hashed) # Argon2 for password hashing (BEST - winner of Password Hashing Competition) ph = PasswordHasher( time_cost=2, # Number of iterations memory_cost=65536, # Memory usage in KiB (64 MB) parallelism=4, # Number of parallel threads ) def hash_password_argon2(password: str) -> str: """Hash password using Argon2.""" return ph.hash(password) def verify_password_argon2(password: str, hashed: str) -> bool: """Verify password against Argon2 hash.""" try: ph.verify(hashed, password) return True except: return False # HMAC for message authentication import hmac def create_signature(message: str, secret_key: bytes) -> str: """Create HMAC-SHA256 signature.""" return hmac.new( secret_key, message.encode(), hashlib.sha256 ).hexdigest() ``` ### B501, B502, B503: Weak SSL/TLS Configuration **Vulnerable Code:** ```python import ssl import requests # Weak SSL version (UNSAFE) context = ssl.SSLContext(ssl.PROTOCOL_SSLv3) # Disabling certificate verification (VERY UNSAFE) requests.get('https://example.com', verify=False) ``` **Secure Solution:** ```python import ssl import requests # Strong SSL/TLS configuration (SAFE) context = ssl.create_default_context() context.minimum_version = ssl.TLSVersion.TLSv1_2 context.maximum_version = ssl.TLSVersion.TLSv1_3 # Restrict cipher suites context.set_ciphers('ECDHE+AESGCM:ECDHE+CHACHA20:DHE+AESGCM:DHE+CHACHA20:!aNULL:!MD5:!DSS') # Enable certificate verification (default in requests) response = requests.get('https://example.com', verify=True) # Custom CA bundle response = requests.get('https://example.com', verify='/path/to/ca-bundle.crt') # For urllib import urllib.request import certifi url = 'https://example.com' response = urllib.request.urlopen(url, context=context, cafile=certifi.where()) ``` **Best Practices:** - Use TLS 1.2 or TLS 1.3 only - Disable weak cipher suites - Always verify certificates in production - Use certificate pinning for critical connections - Regularly update SSL/TLS libraries --- ## Insecure Deserialization ### B301: Pickle Usage **Vulnerable Code:** ```python import pickle # Deserializing untrusted data (VERY UNSAFE) user_data = pickle.loads(request.body) # Loading from file (UNSAFE if file is from untrusted source) with open('user_session.pkl', 'rb') as f: session = pickle.load(f) ``` **Secure Solution:** ```python import json import msgpack from cryptography.fernet import Fernet # Use JSON for simple data (SAFE) user_data = json.loads(request.body) # Use MessagePack for binary efficiency (SAFE) user_data = msgpack.unpackb(request.body) # If pickle is absolutely necessary, use cryptographic signing import hmac import hashlib import pickle SECRET_KEY = os.environ['SECRET_KEY'].encode() def secure_pickle_dumps(obj): """Serialize with HMAC signature.""" pickled = pickle.dumps(obj) signature = hmac.new(SECRET_KEY, pickled, hashlib.sha256).digest() return signature + pickled def secure_pickle_loads(data): """Deserialize with signature verification.""" signature = data[:32] # SHA256 is 32 bytes pickled = data[32:] expected_signature = hmac.new(SECRET_KEY, pickled, hashlib.sha256).digest() if not hmac.compare_digest(signature, expected_signature): raise ValueError("Invalid signature - data may be tampered") return pickle.loads(pickled) # Better: Use itsdangerous for secure serialization from itsdangerous import URLSafeSerializer serializer = URLSafeSerializer(SECRET_KEY) # Serialize (signed and safe) token = serializer.dumps({'user_id': 123, 'role': 'admin'}) # Deserialize (verified) data = serializer.loads(token) ``` **Best Practices:** - Avoid pickle for untrusted data - Use JSON, MessagePack, or Protocol Buffers - If pickle is required, implement cryptographic signing - Use `itsdangerous` library for secure token serialization - Restrict pickle to internal, trusted data only --- ## XML External Entity (XXE) ### B313-B320, B405-B412: XML Parsing Vulnerabilities **Vulnerable Code:** ```python import xml.etree.ElementTree as ET from lxml import etree # Unsafe XML parsing (VULNERABLE to XXE) tree = ET.parse(user_xml_file) root = tree.getroot() # lxml unsafe parsing parser = etree.XMLParser() tree = etree.parse(user_xml_file, parser) ``` **Secure Solution:** ```python import xml.etree.ElementTree as ET from lxml import etree import defusedxml.ElementTree as defusedET # Use defusedxml (RECOMMENDED) tree = defusedET.parse(user_xml_file) root = tree.getroot() # Disable external entities in ElementTree ET.XMLParser.entity = {} # Disable entity expansion # Secure lxml configuration parser = etree.XMLParser( resolve_entities=False, # Disable entity resolution no_network=True, # Disable network access dtd_validation=False, # Disable DTD validation load_dtd=False # Don't load DTD ) tree = etree.parse(user_xml_file, parser) # Alternative: Use JSON instead of XML when possible import json data = json.loads(request.body) ``` **Best Practices:** - Use `defusedxml` library for all XML parsing - Disable DTD processing and external entity resolution - Validate XML against strict schema (XSD) - Consider using JSON instead of XML for APIs - Never parse XML from untrusted sources without defusedxml --- ## Security Misconfiguration ### B201: Flask Debug Mode **Vulnerable Code:** ```python from flask import Flask app = Flask(__name__) # Debug mode in production (VERY UNSAFE) app.run(debug=True, host='0.0.0.0') ``` **Secure Solution:** ```python from flask import Flask import os app = Flask(__name__) # Use environment-based configuration DEBUG = os.environ.get('FLASK_DEBUG', 'false').lower() == 'true' ENV = os.environ.get('FLASK_ENV', 'production') if ENV == 'production' and DEBUG: raise ValueError("Debug mode cannot be enabled in production") app.config['DEBUG'] = DEBUG app.config['ENV'] = ENV app.config['SECRET_KEY'] = os.environ['SECRET_KEY'] # Use production WSGI server if ENV == 'production': # Deploy with gunicorn or uwsgi, not app.run() # gunicorn -w 4 -b 0.0.0.0:8000 app:app pass else: app.run(debug=DEBUG, host='127.0.0.1', port=5000) ``` ### B506: YAML Load **Vulnerable Code:** ```python import yaml # Arbitrary code execution (VERY UNSAFE) config = yaml.load(user_input, Loader=yaml.Loader) ``` **Secure Solution:** ```python import yaml # Safe YAML loading (SAFE) config = yaml.safe_load(user_input) # For complex objects, use schema validation from schema import Schema, And, Use, Optional config_schema = Schema({ 'database': { 'host': And(str, len), 'port': And(Use(int), lambda n: 1024 <= n <= 65535), }, Optional('debug'): bool, }) config = yaml.safe_load(user_input) validated_config = config_schema.validate(config) ``` ### B701, B702, B703: Template Autoescape **Vulnerable Code:** ```python from jinja2 import Environment # Autoescape disabled (XSS VULNERABLE) env = Environment(autoescape=False) template = env.from_string(user_template) output = template.render(name=user_input) ``` **Secure Solution:** ```python from jinja2 import Environment, select_autoescape from markupsafe import Markup, escape # Enable autoescape (SAFE) env = Environment( autoescape=select_autoescape(['html', 'xml']) ) # Or for all templates env = Environment(autoescape=True) # Explicitly mark safe content def render_html(content): # Sanitize first clean_content = escape(content) return Markup(clean_content) # Django: Ensure autoescape is enabled (default) # In Django templates: # {{ user_input }} # {{ user_input|safe }} ``` **Best Practices:** - Always enable autoescape in template engines - Never mark user input as safe without sanitization - Use Content Security Policy (CSP) headers - Validate and sanitize all user inputs - Use templating libraries with secure defaults --- ## General Security Principles 1. **Defense in Depth**: Implement multiple layers of security controls 2. **Least Privilege**: Grant minimum necessary permissions 3. **Fail Securely**: Errors should not expose sensitive information 4. **Input Validation**: Validate all inputs at trust boundaries 5. **Output Encoding**: Encode data based on output context 6. **Secure Defaults**: Use secure configurations by default 7. **Keep Dependencies Updated**: Regularly update security libraries 8. **Security Testing**: Include security tests in CI/CD pipelines ## Additional Resources - [OWASP Cheat Sheet Series](https://cheatsheetseries.owasp.org/) - [Python Security Best Practices](https://python.readthedocs.io/en/stable/library/security_warnings.html) - [CWE Top 25](https://cwe.mitre.org/top25/)