Bandit Finding Remediation Guide
Comprehensive secure coding patterns and remediation strategies for common Bandit findings.
Table of Contents
- Hardcoded Credentials
- SQL Injection
- Command Injection
- Weak Cryptography
- Insecure Deserialization
- XML External Entity (XXE)
- Security Misconfiguration
- General Security Principles
Hardcoded Credentials
B105, B106, B107: Hardcoded Passwords
Vulnerable Code:
# B105: Hardcoded password string
DATABASE_PASSWORD = "admin123"
# B106: Hardcoded password in function call
db.connect(host="localhost", password="secret_password")
# B107: Hardcoded password default argument
def connect_db(password="default_pass"):
    pass
Secure Solution:
import os
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# Use environment variables
DATABASE_PASSWORD = os.environ.get("DATABASE_PASSWORD")
if not DATABASE_PASSWORD:
    raise ValueError("DATABASE_PASSWORD environment variable not set")
# Use environment variables in function calls
db.connect(
    host=os.environ.get("DB_HOST", "localhost"),
    password=os.environ.get("DB_PASSWORD")
)
# Use secret management service (example with AWS Secrets Manager)
import boto3
from botocore.exceptions import ClientError
def get_secret(secret_name):
    session = boto3.session.Session()
    client = session.client(service_name='secretsmanager', region_name='us-east-1')
    try:
        response = client.get_secret_value(SecretId=secret_name)
        return response['SecretString']
    except ClientError as e:
        # Chain the original error so the root cause stays visible
        raise RuntimeError(f"Failed to retrieve secret {secret_name!r}") from e
DATABASE_PASSWORD = get_secret("prod/db/password")
Best Practices:
- Use environment variables with .env files (never commit .env to version control)
- Use secret management services (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault)
- Implement secret rotation policies
- Use configuration management tools (Ansible Vault, Kubernetes Secrets)
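The environment-variable pattern above can be wrapped in a small fail-fast helper. This is a minimal sketch: `require_env` is a hypothetical name, not part of any library, and the error deliberately reports only the variable name.

```python
import os


def require_env(name: str) -> str:
    """Return the value of a required environment variable or fail fast."""
    value = os.environ.get(name)
    if not value:
        # Report only the variable name, never a secret value.
        raise RuntimeError(f"Required environment variable {name!r} is not set")
    return value
```

Calling `require_env("DATABASE_PASSWORD")` once at startup surfaces a missing secret immediately instead of at the first database call.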
SQL Injection
B608: SQL Injection via String Formatting
Vulnerable Code:
# String formatting (UNSAFE)
user_id = request.GET['id']
query = f"SELECT * FROM users WHERE id = {user_id}"
cursor.execute(query)
# String concatenation (UNSAFE)
query = "SELECT * FROM users WHERE username = '" + username + "'"
cursor.execute(query)
# Percent formatting (UNSAFE)
query = "SELECT * FROM users WHERE email = '%s'" % email
cursor.execute(query)
Secure Solution with psycopg2:
import psycopg2
# Parameterized queries (SAFE)
user_id = request.GET['id']
query = "SELECT * FROM users WHERE id = %s"
cursor.execute(query, (user_id,))
# Multiple parameters
query = "SELECT * FROM users WHERE username = %s AND active = %s"
cursor.execute(query, (username, True))
# Named parameters
query = "SELECT * FROM users WHERE username = %(username)s AND email = %(email)s"
cursor.execute(query, {'username': username, 'email': email})
Secure Solution with SQLAlchemy ORM:
from sqlalchemy import create_engine, select, text
from sqlalchemy.orm import Session
# Assumes `engine` (from create_engine) and a mapped `User` model are defined elsewhere
# Using ORM (SAFE)
with Session(engine) as session:
    stmt = select(User).where(User.username == username)
    user = session.execute(stmt).scalar_one_or_none()
# Using bound parameters with raw SQL (SAFE)
with Session(engine) as session:
    result = session.execute(
        text("SELECT * FROM users WHERE username = :username"),
        {"username": username}
    )
Secure Solution with Django ORM:
from django.db.models import Q
# Django ORM (SAFE)
users = User.objects.filter(username=username)
# Complex queries (SAFE)
users = User.objects.filter(Q(username=username) | Q(email=email))
# Raw SQL with parameters (SAFE)
from django.db import connection
with connection.cursor() as cursor:
    cursor.execute("SELECT * FROM users WHERE username = %s", [username])
Best Practices:
- Always use parameterized queries or prepared statements
- Never concatenate user input into SQL queries
- Use an ORM when possible; it binds values as query parameters instead of splicing them into SQL text
- Validate and sanitize inputs at application boundaries
- Apply least privilege principle to database accounts
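To see why parameter binding matters, here is a self-contained sketch using the standard-library `sqlite3` module (which uses `?` placeholders rather than psycopg2's `%s`). The table layout and the `find_user` helper are illustrative assumptions.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list:
    """Look up a user with a bound parameter; the value is never parsed as SQL."""
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

# A classic injection payload is treated as a literal (non-matching) username.
payload = "alice' OR '1'='1"
print(find_user(conn, "alice"))  # [(1, 'alice')]
print(find_user(conn, payload))  # []
```

With string formatting, the same payload would have turned the WHERE clause into a condition that matches every row.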
Command Injection
B602, B604, B605: Shell Injection in Subprocess
Vulnerable Code:
import subprocess
import os
# shell=True with user input (VERY UNSAFE)
filename = request.GET['file']
subprocess.call(f"cat {filename}", shell=True)
# os.system with user input (VERY UNSAFE)
os.system(f"ping -c 1 {hostname}")
# String concatenation (UNSAFE)
cmd = "curl " + user_url
subprocess.call(cmd, shell=True)
Secure Solution:
import re
import shlex
import subprocess
from pathlib import Path
# Use list of arguments without shell=True (SAFE from shell injection)
# "--" ends option parsing, so a value such as "-n" is not read as a flag
filename = request.GET['file']
subprocess.run(["cat", "--", filename], check=True, capture_output=True)
# Validate input before use
def validate_filename(filename):
    """Validate filename to prevent path traversal."""
    # Allow only alphanumeric, dash, underscore, and dot; no leading dash
    if not re.match(r'^[a-zA-Z0-9_.][a-zA-Z0-9_.-]*$', filename):
        raise ValueError("Invalid filename")
    # Resolve to absolute path and check it's within allowed directory
    # (UPLOAD_DIR is defined elsewhere; Path.is_relative_to needs Python 3.9+)
    file_path = Path(UPLOAD_DIR) / filename
    if not file_path.resolve().is_relative_to(Path(UPLOAD_DIR).resolve()):
        raise ValueError("Path traversal detected")
    return file_path
filename = validate_filename(request.GET['file'])
subprocess.run(["cat", "--", str(filename)], check=True, capture_output=True)
# Use shlex.split() to tokenize trusted command strings only;
# it does not make untrusted input safe
command_string = "ping -c 1 example.com"
subprocess.run(shlex.split(command_string), check=True, capture_output=True)
# Whitelist approach for restricted commands
ALLOWED_COMMANDS = {
    'ping': ['ping', '-c', '1'],
    'traceroute': ['traceroute', '-m', '10'],
}
command_type = request.GET['command']
target = request.GET['target']
if command_type not in ALLOWED_COMMANDS:
    raise ValueError("Command not allowed")
# Validate target (e.g., IP address or hostname); a leading dash is
# rejected so the target cannot be interpreted as an option
if not re.match(r'^[a-zA-Z0-9][a-zA-Z0-9.-]*$', target):
    raise ValueError("Invalid target")
cmd = ALLOWED_COMMANDS[command_type] + [target]
subprocess.run(cmd, check=True, capture_output=True, timeout=10)
Best Practices:
- Never use shell=True with user input
- Pass arguments as list, not string
- Validate and whitelist all user inputs
- Use shlex.split() for parsing command strings
- Implement timeouts to prevent DoS
- Run subprocesses with minimal privileges
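The list-of-arguments rule can be checked with a small experiment. In this sketch the child process is the current Python interpreter (so it runs on any platform), and `echo_argument` is a hypothetical helper that reports what the child actually received.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass `value` to a child process as one argv entry and return its echo."""
    # No shell is involved, so metacharacters such as ';' or '|' have no meaning.
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", value]
    result = subprocess.run(
        cmd, check=True, capture_output=True, text=True, timeout=10
    )
    return result.stdout.rstrip("\r\n")


# The would-be injected command arrives as plain text in a single argument.
print(echo_argument("file.txt; echo INJECTED"))
```

The same input passed through `shell=True` would have been split at the semicolon and run as two commands.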
Weak Cryptography
B303, B304, B324: Weak Hash Functions
Vulnerable Code:
import hashlib
import md5 # Python 2 only; the module was removed in Python 3
# MD5 (WEAK)
password_hash = hashlib.md5(password.encode()).hexdigest()
# SHA1 (WEAK)
token = hashlib.sha1(user_data.encode()).hexdigest()
Secure Solution:
import hashlib
import secrets
import bcrypt
from argon2 import PasswordHasher
# SHA-256 for general hashing (ACCEPTABLE for non-password data)
data_hash = hashlib.sha256(data.encode()).hexdigest()
# SHA-512 (BETTER for general hashing)
data_hash = hashlib.sha512(data.encode()).hexdigest()
# bcrypt for password hashing (RECOMMENDED)
def hash_password(password: str) -> bytes:
    """Hash password using bcrypt with salt."""
    salt = bcrypt.gensalt(rounds=12) # Cost factor 12
    return bcrypt.hashpw(password.encode(), salt)
def verify_password(password: str, hashed: bytes) -> bool:
    """Verify password against bcrypt hash."""
    return bcrypt.checkpw(password.encode(), hashed)
# Argon2 for password hashing (BEST - winner of Password Hashing Competition)
from argon2.exceptions import VerifyMismatchError
ph = PasswordHasher(
    time_cost=2, # Number of iterations
    memory_cost=65536, # Memory usage in KiB (64 MiB)
    parallelism=4, # Number of parallel threads
)
def hash_password_argon2(password: str) -> str:
    """Hash password using Argon2."""
    return ph.hash(password)
def verify_password_argon2(password: str, hashed: str) -> bool:
    """Verify password against Argon2 hash."""
    try:
        ph.verify(hashed, password)
        return True
    except VerifyMismatchError:
        # Wrong password; other errors (e.g. a malformed hash) propagate
        return False
# HMAC for message authentication
import hmac
def create_signature(message: str, secret_key: bytes) -> str:
    """Create HMAC-SHA256 signature."""
    return hmac.new(
        secret_key,
        message.encode(),
        hashlib.sha256
    ).hexdigest()
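Creating a signature is only half of message authentication; the receiver also has to compare signatures in constant time. A minimal standard-library sketch, where `sign` and `verify` are illustrative names:

```python
import hashlib
import hmac
import secrets


def sign(message: bytes, key: bytes) -> str:
    """Return the HMAC-SHA256 signature of `message` as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, key: bytes, signature: str) -> bool:
    """Check a signature with a constant-time comparison."""
    expected = sign(message, key)
    # compare_digest avoids leaking, via timing, how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())


key = secrets.token_bytes(32)  # generate keys randomly; never hardcode them
signature = sign(b"order=42", key)
print(verify(b"order=42", key, signature))  # True
print(verify(b"order=43", key, signature))  # False
```

Comparing with `==` instead would work functionally but can expose timing differences to an attacker who is guessing signatures.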
B501, B502, B503: Weak SSL/TLS Configuration
Vulnerable Code:
import ssl
import requests
# Weak SSL version (UNSAFE)
context = ssl.SSLContext(ssl.PROTOCOL_SSLv3)
# Disabling certificate verification (VERY UNSAFE)
requests.get('https://example.com', verify=False)
Secure Solution:
import ssl
import requests
# Strong SSL/TLS configuration (SAFE)
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2
context.maximum_version = ssl.TLSVersion.TLSv1_3
# Restrict cipher suites
context.set_ciphers('ECDHE+AESGCM:ECDHE+CHACHA20:DHE+AESGCM:DHE+CHACHA20:!aNULL:!MD5:!DSS')
# Enable certificate verification (default in requests)
response = requests.get('https://example.com', verify=True)
# Custom CA bundle
response = requests.get('https://example.com', verify='/path/to/ca-bundle.crt')
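# For urllib: load the CA bundle into the context
# (urlopen rejects cafile= when context= is also given)
import urllib.request
import certifi
url = 'https://example.com'
context = ssl.create_default_context(cafile=certifi.where())
context.minimum_version = ssl.TLSVersion.TLSv1_2
response = urllib.request.urlopen(url, context=context)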
Best Practices:
- Use TLS 1.2 or TLS 1.3 only
- Disable weak cipher suites
- Always verify certificates in production
- Use certificate pinning for critical connections
- Regularly update SSL/TLS libraries
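One way to enforce these practices is to check a context's settings before it is used. The sketch below relies only on the standard `ssl` module and makes no network calls; `build_client_context` and `is_strict` are hypothetical helper names.

```python
import ssl


def build_client_context() -> ssl.SSLContext:
    """Create a client context with verification on and TLS 1.2 as the floor."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def is_strict(context: ssl.SSLContext) -> bool:
    """Return True if certificates and hostnames are verified and TLS >= 1.2."""
    return (
        context.verify_mode == ssl.CERT_REQUIRED
        and context.check_hostname
        and context.minimum_version >= ssl.TLSVersion.TLSv1_2
    )


strict = build_client_context()

# A context with verification switched off, shown only to exercise the check.
loose = ssl.create_default_context()
loose.check_hostname = False
loose.verify_mode = ssl.CERT_NONE

print(is_strict(strict))  # True
print(is_strict(loose))   # False
```

A check like this could run in a unit test so that an accidental `verify_mode = ssl.CERT_NONE` does not reach production.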
Insecure Deserialization
B301: Pickle Usage
Vulnerable Code:
import pickle
# Deserializing untrusted data (VERY UNSAFE)
user_data = pickle.loads(request.body)
# Loading from file (UNSAFE if file is from untrusted source)
with open('user_session.pkl', 'rb') as f:
    session = pickle.load(f)
Secure Solution:
import json
import os
import msgpack
# Use JSON for simple data (SAFE)
user_data = json.loads(request.body)
# Use MessagePack for binary efficiency (SAFE)
user_data = msgpack.unpackb(request.body)
# If pickle is absolutely necessary, use cryptographic signing
# (this only protects against tampering by parties who lack the key)
import hmac
import hashlib
import pickle
SECRET_KEY = os.environ['SECRET_KEY'].encode()
def secure_pickle_dumps(obj):
    """Serialize with HMAC signature."""
    pickled = pickle.dumps(obj)
    signature = hmac.new(SECRET_KEY, pickled, hashlib.sha256).digest()
    return signature + pickled
def secure_pickle_loads(data):
    """Deserialize with signature verification."""
    signature = data[:32] # SHA256 is 32 bytes
    pickled = data[32:]
    expected_signature = hmac.new(SECRET_KEY, pickled, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected_signature):
        raise ValueError("Invalid signature - data may be tampered")
    return pickle.loads(pickled)
# Better: Use itsdangerous for secure serialization
from itsdangerous import URLSafeSerializer
serializer = URLSafeSerializer(SECRET_KEY)
# Serialize (signed and safe)
token = serializer.dumps({'user_id': 123, 'role': 'admin'})
# Deserialize (verified)
data = serializer.loads(token)
Best Practices:
- Avoid pickle for untrusted data
- Use JSON, MessagePack, or Protocol Buffers
- If pickle is required, implement cryptographic signing
- Use itsdangerous library for secure token serialization
- Restrict pickle to internal, trusted data only
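When the data fits in JSON, signing and safe deserialization can be combined without pickle or third-party packages. The following is a sketch of that idea, not a replacement for a maintained library such as itsdangerous; the token layout (hex tag, a dot, then the JSON payload) is an assumption made for this example.

```python
import hashlib
import hmac
import json


def dumps_signed(obj, key: bytes) -> bytes:
    """Serialize `obj` as JSON and prepend an HMAC-SHA256 tag."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    return tag + b"." + payload


def loads_signed(blob: bytes, key: bytes):
    """Verify the tag first; parse the JSON payload only if it matches."""
    tag, sep, payload = blob.partition(b".")
    if not sep:
        raise ValueError("Malformed token")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("Invalid signature - data may be tampered")
    return json.loads(payload)
```

Even if verification were skipped by mistake, `json.loads` can only produce plain data types, whereas `pickle.loads` can execute code.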
XML External Entity (XXE)
B313-B320, B405-B412: XML Parsing Vulnerabilities
Vulnerable Code:
import xml.etree.ElementTree as ET
from lxml import etree
# Unsafe XML parsing of untrusted input (exposed to entity-expansion attacks; flagged by Bandit)
tree = ET.parse(user_xml_file)
root = tree.getroot()
# lxml unsafe parsing
parser = etree.XMLParser()
tree = etree.parse(user_xml_file, parser)
Secure Solution:
from lxml import etree
import defusedxml.ElementTree as defusedET
# Use defusedxml (RECOMMENDED)
tree = defusedET.parse(user_xml_file)
root = tree.getroot()
# The standard-library ElementTree has no supported switch for disabling
# entity expansion, so parse untrusted input with defusedxml instead
# Secure lxml configuration
parser = etree.XMLParser(
    resolve_entities=False, # Disable entity resolution
    no_network=True, # Disable network access
    dtd_validation=False, # Disable DTD validation
    load_dtd=False # Don't load DTD
)
tree = etree.parse(user_xml_file, parser)
# Alternative: Use JSON instead of XML when possible
import json
data = json.loads(request.body)
Best Practices:
- Use defusedxml library for all XML parsing
- Disable DTD processing and external entity resolution
- Validate XML against strict schema (XSD)
- Consider using JSON instead of XML for APIs
- Never parse XML from untrusted sources without defusedxml
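Where defusedxml cannot be installed, the "disable DTD processing" rule can be approximated with the standard library by rejecting any document that declares a DOCTYPE before it is parsed. This is a hedged sketch, not a substitute for defusedxml: `parse_without_dtd` and `DoctypeForbidden` are hypothetical names, and the input is scanned twice.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as it sees "<!DOCTYPE", before entities are defined.
    raise DoctypeForbidden(f"DOCTYPE declaration not allowed: {name!r}")


def parse_without_dtd(data: bytes) -> ET.Element:
    """Parse XML only if it contains no DOCTYPE (and therefore no entities)."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)  # raises DoctypeForbidden or expat.ExpatError
    return ET.fromstring(data)
```

Rejecting the DOCTYPE outright blocks both external entities and entity-expansion payloads, at the cost of refusing legitimate documents that carry a DTD.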
Security Misconfiguration
B201: Flask Debug Mode
Vulnerable Code:
from flask import Flask
app = Flask(__name__)
# Debug mode in production (VERY UNSAFE)
app.run(debug=True, host='0.0.0.0')
Secure Solution:
from flask import Flask
import os
app = Flask(__name__)
# Use environment-based configuration
# (Flask 2.3 removed FLASK_ENV and the ENV config key, so ENV here is an
# application-level setting rather than something Flask itself interprets)
DEBUG = os.environ.get('FLASK_DEBUG', 'false').lower() == 'true'
ENV = os.environ.get('FLASK_ENV', 'production')
if ENV == 'production' and DEBUG:
    raise ValueError("Debug mode cannot be enabled in production")
app.config['DEBUG'] = DEBUG
app.config['ENV'] = ENV
app.config['SECRET_KEY'] = os.environ['SECRET_KEY']
# Use production WSGI server
if ENV == 'production':
    # Deploy with gunicorn or uwsgi, not app.run()
    # gunicorn -w 4 -b 0.0.0.0:8000 app:app
    pass
else:
    app.run(debug=DEBUG, host='127.0.0.1', port=5000)
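The production guard can be isolated into a pure function so that it is easy to unit test without starting Flask. This is a sketch under assumptions: `APP_ENV` and `APP_DEBUG` are hypothetical variable names chosen for the example, and `resolve_debug` is not a Flask API.

```python
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}


def resolve_debug(environ: Mapping[str, str]) -> bool:
    """Decide whether debug mode may be enabled; default to production-safe."""
    env_name = environ.get("APP_ENV", "production").lower()
    debug = environ.get("APP_DEBUG", "false").strip().lower() in TRUTHY
    if env_name == "production" and debug:
        raise ValueError("Debug mode cannot be enabled in production")
    return debug


print(resolve_debug({}))  # False
print(resolve_debug({"APP_ENV": "development", "APP_DEBUG": "1"}))  # True
```

Passing `os.environ` at startup keeps the behaviour identical while letting tests supply a plain dictionary.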
B506: YAML Load
Vulnerable Code:
import yaml
# Arbitrary code execution (VERY UNSAFE)
config = yaml.load(user_input, Loader=yaml.Loader)
Secure Solution:
import yaml
# Safe YAML loading (SAFE)
config = yaml.safe_load(user_input)
# For complex objects, use schema validation
from schema import Schema, And, Use, Optional
config_schema = Schema({
    'database': {
        'host': And(str, len),
        'port': And(Use(int), lambda n: 1024 <= n <= 65535),
    },
    Optional('debug'): bool,
})
config = yaml.safe_load(user_input)
validated_config = config_schema.validate(config)
B701, B702, B703: Template Autoescape
Vulnerable Code:
from jinja2 import Environment
# Autoescape disabled (XSS VULNERABLE)
env = Environment(autoescape=False)
template = env.from_string(user_template)
output = template.render(name=user_input)
Secure Solution:
from jinja2 import Environment, select_autoescape
from markupsafe import escape
# Enable autoescape (SAFE)
env = Environment(
    autoescape=select_autoescape(['html', 'xml'])
)
# Or for all templates
env = Environment(autoescape=True)
# Escape untrusted content explicitly when building markup by hand
def render_html(content):
    # escape() HTML-encodes the value and already returns a Markup object,
    # so no extra Markup() wrapper is needed
    return escape(content)
# Django: Ensure autoescape is enabled (default)
# In Django templates:
# {{ user_input }} <!-- Auto-escaped -->
# {{ user_input|safe }} <!-- Only use after sanitization -->
Best Practices:
- Always enable autoescape in template engines
- Never mark user input as safe without sanitization
- Use Content Security Policy (CSP) headers
- Validate and sanitize all user inputs
- Use templating libraries with secure defaults
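What autoescaping does to untrusted input can be shown with the standard-library `html.escape`, which performs the same kind of HTML encoding. This is a minimal sketch; `render_greeting` is a hypothetical function, and a real application would leave this to the template engine.

```python
import html


def render_greeting(name: str) -> str:
    """Embed untrusted text in HTML after encoding its special characters."""
    return f"<p>Hello, {html.escape(name, quote=True)}!</p>"


print(render_greeting("<script>alert(1)</script>"))
# <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;!</p>
```

The browser displays the payload as text instead of executing it, which is the effect `autoescape=True` gives every `{{ ... }}` expression.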
General Security Principles
- Defense in Depth: Implement multiple layers of security controls
- Least Privilege: Grant minimum necessary permissions
- Fail Securely: Errors should not expose sensitive information
- Input Validation: Validate all inputs at trust boundaries
- Output Encoding: Encode data based on output context
- Secure Defaults: Use secure configurations by default
- Keep Dependencies Updated: Regularly update security libraries
- Security Testing: Include security tests in CI/CD pipelines