Files
gh-agentsecops-secopsagentkit/skills/appsec/sast-bandit/references/remediation_guide.md
2025-11-29 17:51:02 +08:00

16 KiB

Bandit Finding Remediation Guide

Comprehensive secure coding patterns and remediation strategies for common Bandit findings.

Table of Contents


Hardcoded Credentials

B105, B106, B107: Hardcoded Passwords

Vulnerable Code:

# B105: Hardcoded password string
DATABASE_PASSWORD = "admin123"

# B106: Hardcoded password in function call
db.connect(host="localhost", password="secret_password")

# B107: Hardcoded password default argument
def connect_db(password="default_pass"):
    pass

Secure Solution:

import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Use environment variables
DATABASE_PASSWORD = os.environ.get("DATABASE_PASSWORD")
if not DATABASE_PASSWORD:
    raise ValueError("DATABASE_PASSWORD environment variable not set")

# Use environment variables in function calls
db.connect(
    host=os.environ.get("DB_HOST", "localhost"),
    password=os.environ.get("DB_PASSWORD")
)

# Use secret management service (example with AWS Secrets Manager)
import boto3
from botocore.exceptions import ClientError

def get_secret(secret_name):
    session = boto3.session.Session()
    client = session.client(service_name='secretsmanager', region_name='us-east-1')
    try:
        response = client.get_secret_value(SecretId=secret_name)
        return response['SecretString']
    except ClientError as e:
        raise Exception(f"Failed to retrieve secret: {e}")

DATABASE_PASSWORD = get_secret("prod/db/password")

Best Practices:

  • Use environment variables with .env files (never commit .env to version control)
  • Use secret management services (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault)
  • Implement secret rotation policies
  • Use configuration management tools (Ansible Vault, Kubernetes Secrets)

SQL Injection

B608: SQL Injection via String Formatting

Vulnerable Code:

# String formatting (UNSAFE)
user_id = request.GET['id']
query = f"SELECT * FROM users WHERE id = {user_id}"
cursor.execute(query)

# String concatenation (UNSAFE)
query = "SELECT * FROM users WHERE username = '" + username + "'"
cursor.execute(query)

# Percent formatting (UNSAFE)
query = "SELECT * FROM users WHERE email = '%s'" % email
cursor.execute(query)

Secure Solution with psycopg2:

import psycopg2

# Parameterized queries (SAFE)
user_id = request.GET['id']
query = "SELECT * FROM users WHERE id = %s"
cursor.execute(query, (user_id,))

# Multiple parameters
query = "SELECT * FROM users WHERE username = %s AND active = %s"
cursor.execute(query, (username, True))

# Named parameters
query = "SELECT * FROM users WHERE username = %(username)s AND email = %(email)s"
cursor.execute(query, {'username': username, 'email': email})

Secure Solution with SQLAlchemy ORM:

from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session

# Using ORM (SAFE)
with Session(engine) as session:
    stmt = select(User).where(User.username == username)
    user = session.execute(stmt).scalar_one_or_none()

# Using bound parameters with raw SQL (SAFE)
with Session(engine) as session:
    result = session.execute(
        text("SELECT * FROM users WHERE username = :username"),
        {"username": username}
    )

Secure Solution with Django ORM:

from django.db.models import Q

# Django ORM (SAFE)
users = User.objects.filter(username=username)

# Complex queries (SAFE)
users = User.objects.filter(Q(username=username) | Q(email=email))

# Raw SQL with parameters (SAFE)
from django.db import connection
with connection.cursor() as cursor:
    cursor.execute("SELECT * FROM users WHERE username = %s", [username])

Best Practices:

  • Always use parameterized queries or prepared statements
  • Never concatenate user input into SQL queries
  • Use ORM when possible for automatic escaping
  • Validate and sanitize inputs at application boundaries
  • Apply least privilege principle to database accounts

Command Injection

B602, B604, B605: Shell Injection in Subprocess

Vulnerable Code:

import subprocess
import os

# shell=True with user input (VERY UNSAFE)
filename = request.GET['file']
subprocess.call(f"cat {filename}", shell=True)

# os.system with user input (VERY UNSAFE)
os.system(f"ping -c 1 {hostname}")

# String concatenation (UNSAFE)
cmd = "curl " + user_url
subprocess.call(cmd, shell=True)

Secure Solution:

import subprocess
import shlex
from pathlib import Path

# Use list of arguments without shell=True (SAFE)
filename = request.GET['file']
subprocess.run(["cat", filename], check=True, capture_output=True)

# Validate input before use
def validate_filename(filename):
    """Validate filename to prevent path traversal."""
    # Allow only alphanumeric, dash, underscore, and dot
    if not re.match(r'^[a-zA-Z0-9_.-]+$', filename):
        raise ValueError("Invalid filename")

    # Resolve to absolute path and check it's within allowed directory
    file_path = Path(UPLOAD_DIR) / filename
    if not file_path.resolve().is_relative_to(Path(UPLOAD_DIR).resolve()):
        raise ValueError("Path traversal detected")

    return file_path

filename = validate_filename(request.GET['file'])
subprocess.run(["cat", str(filename)], check=True, capture_output=True)

# Use shlex.split() for complex commands
import shlex
command_string = "ping -c 1 example.com"
subprocess.run(shlex.split(command_string), check=True, capture_output=True)

# Whitelist approach for restricted commands
ALLOWED_COMMANDS = {
    'ping': ['ping', '-c', '1'],
    'traceroute': ['traceroute', '-m', '10'],
}

command_type = request.GET['command']
target = request.GET['target']

if command_type not in ALLOWED_COMMANDS:
    raise ValueError("Command not allowed")

# Validate target (e.g., IP address or hostname)
if not re.match(r'^[a-zA-Z0-9.-]+$', target):
    raise ValueError("Invalid target")

cmd = ALLOWED_COMMANDS[command_type] + [target]
subprocess.run(cmd, check=True, capture_output=True, timeout=10)

Best Practices:

  • Never use shell=True with user input
  • Pass arguments as list, not string
  • Validate and whitelist all user inputs
  • Use shlex.split() for parsing command strings
  • Implement timeouts to prevent DoS
  • Run subprocesses with minimal privileges

Weak Cryptography

B303, B304, B324: Weak Hash Functions

Vulnerable Code:

import hashlib
import md5  # Deprecated

# MD5 (WEAK)
password_hash = hashlib.md5(password.encode()).hexdigest()

# SHA1 (WEAK)
token = hashlib.sha1(user_data.encode()).hexdigest()

Secure Solution:

import hashlib
import secrets
import bcrypt
from argon2 import PasswordHasher

# SHA-256 for general hashing (ACCEPTABLE for non-password data)
data_hash = hashlib.sha256(data.encode()).hexdigest()

# SHA-512 (BETTER for general hashing)
data_hash = hashlib.sha512(data.encode()).hexdigest()

# bcrypt for password hashing (RECOMMENDED)
def hash_password(password: str) -> bytes:
    """Hash password using bcrypt with salt."""
    salt = bcrypt.gensalt(rounds=12)  # Cost factor 12
    return bcrypt.hashpw(password.encode(), salt)

def verify_password(password: str, hashed: bytes) -> bool:
    """Verify password against bcrypt hash."""
    return bcrypt.checkpw(password.encode(), hashed)

# Argon2 for password hashing (BEST - winner of Password Hashing Competition)
ph = PasswordHasher(
    time_cost=2,       # Number of iterations
    memory_cost=65536, # Memory usage in KiB (64 MB)
    parallelism=4,     # Number of parallel threads
)

def hash_password_argon2(password: str) -> str:
    """Hash password using Argon2."""
    return ph.hash(password)

def verify_password_argon2(password: str, hashed: str) -> bool:
    """Verify password against Argon2 hash."""
    try:
        ph.verify(hashed, password)
        return True
    except:
        return False

# HMAC for message authentication
import hmac
def create_signature(message: str, secret_key: bytes) -> str:
    """Create HMAC-SHA256 signature."""
    return hmac.new(
        secret_key,
        message.encode(),
        hashlib.sha256
    ).hexdigest()

B501, B502, B503: Weak SSL/TLS Configuration

Vulnerable Code:

import ssl
import requests

# Weak SSL version (UNSAFE)
context = ssl.SSLContext(ssl.PROTOCOL_SSLv3)

# Disabling certificate verification (VERY UNSAFE)
requests.get('https://example.com', verify=False)

Secure Solution:

import ssl
import requests

# Strong SSL/TLS configuration (SAFE)
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2
context.maximum_version = ssl.TLSVersion.TLSv1_3

# Restrict cipher suites
context.set_ciphers('ECDHE+AESGCM:ECDHE+CHACHA20:DHE+AESGCM:DHE+CHACHA20:!aNULL:!MD5:!DSS')

# Enable certificate verification (default in requests)
response = requests.get('https://example.com', verify=True)

# Custom CA bundle
response = requests.get('https://example.com', verify='/path/to/ca-bundle.crt')

# For urllib
import urllib.request
import certifi

url = 'https://example.com'
response = urllib.request.urlopen(url, context=context, cafile=certifi.where())

Best Practices:

  • Use TLS 1.2 or TLS 1.3 only
  • Disable weak cipher suites
  • Always verify certificates in production
  • Use certificate pinning for critical connections
  • Regularly update SSL/TLS libraries

Insecure Deserialization

B301: Pickle Usage

Vulnerable Code:

import pickle

# Deserializing untrusted data (VERY UNSAFE)
user_data = pickle.loads(request.body)

# Loading from file (UNSAFE if file is from untrusted source)
with open('user_session.pkl', 'rb') as f:
    session = pickle.load(f)

Secure Solution:

import json
import msgpack
from cryptography.fernet import Fernet

# Use JSON for simple data (SAFE)
user_data = json.loads(request.body)

# Use MessagePack for binary efficiency (SAFE)
user_data = msgpack.unpackb(request.body)

# If pickle is absolutely necessary, use cryptographic signing
import hmac
import hashlib
import pickle

SECRET_KEY = os.environ['SECRET_KEY'].encode()

def secure_pickle_dumps(obj):
    """Serialize with HMAC signature."""
    pickled = pickle.dumps(obj)
    signature = hmac.new(SECRET_KEY, pickled, hashlib.sha256).digest()
    return signature + pickled

def secure_pickle_loads(data):
    """Deserialize with signature verification."""
    signature = data[:32]  # SHA256 is 32 bytes
    pickled = data[32:]

    expected_signature = hmac.new(SECRET_KEY, pickled, hashlib.sha256).digest()

    if not hmac.compare_digest(signature, expected_signature):
        raise ValueError("Invalid signature - data may be tampered")

    return pickle.loads(pickled)

# Better: Use itsdangerous for secure serialization
from itsdangerous import URLSafeSerializer

serializer = URLSafeSerializer(SECRET_KEY)

# Serialize (signed and safe)
token = serializer.dumps({'user_id': 123, 'role': 'admin'})

# Deserialize (verified)
data = serializer.loads(token)

Best Practices:

  • Avoid pickle for untrusted data
  • Use JSON, MessagePack, or Protocol Buffers
  • If pickle is required, implement cryptographic signing
  • Use itsdangerous library for secure token serialization
  • Restrict pickle to internal, trusted data only

XML External Entity (XXE)

B313-B320, B405-B412: XML Parsing Vulnerabilities

Vulnerable Code:

import xml.etree.ElementTree as ET
from lxml import etree

# Unsafe XML parsing (VULNERABLE to XXE)
tree = ET.parse(user_xml_file)
root = tree.getroot()

# lxml unsafe parsing
parser = etree.XMLParser()
tree = etree.parse(user_xml_file, parser)

Secure Solution:

import xml.etree.ElementTree as ET
from lxml import etree
import defusedxml.ElementTree as defusedET

# Use defusedxml (RECOMMENDED)
tree = defusedET.parse(user_xml_file)
root = tree.getroot()

# Disable external entities in ElementTree
ET.XMLParser.entity = {}  # Disable entity expansion

# Secure lxml configuration
parser = etree.XMLParser(
    resolve_entities=False,  # Disable entity resolution
    no_network=True,         # Disable network access
    dtd_validation=False,    # Disable DTD validation
    load_dtd=False          # Don't load DTD
)
tree = etree.parse(user_xml_file, parser)

# Alternative: Use JSON instead of XML when possible
import json
data = json.loads(request.body)

Best Practices:

  • Use defusedxml library for all XML parsing
  • Disable DTD processing and external entity resolution
  • Validate XML against strict schema (XSD)
  • Consider using JSON instead of XML for APIs
  • Never parse XML from untrusted sources without defusedxml

Security Misconfiguration

B201: Flask Debug Mode

Vulnerable Code:

from flask import Flask

app = Flask(__name__)

# Debug mode in production (VERY UNSAFE)
app.run(debug=True, host='0.0.0.0')

Secure Solution:

from flask import Flask
import os

app = Flask(__name__)

# Use environment-based configuration
DEBUG = os.environ.get('FLASK_DEBUG', 'false').lower() == 'true'
ENV = os.environ.get('FLASK_ENV', 'production')

if ENV == 'production' and DEBUG:
    raise ValueError("Debug mode cannot be enabled in production")

app.config['DEBUG'] = DEBUG
app.config['ENV'] = ENV
app.config['SECRET_KEY'] = os.environ['SECRET_KEY']

# Use production WSGI server
if ENV == 'production':
    # Deploy with gunicorn or uwsgi, not app.run()
    # gunicorn -w 4 -b 0.0.0.0:8000 app:app
    pass
else:
    app.run(debug=DEBUG, host='127.0.0.1', port=5000)

B506: YAML Load

Vulnerable Code:

import yaml

# Arbitrary code execution (VERY UNSAFE)
config = yaml.load(user_input, Loader=yaml.Loader)

Secure Solution:

import yaml

# Safe YAML loading (SAFE)
config = yaml.safe_load(user_input)

# For complex objects, use schema validation
from schema import Schema, And, Use, Optional

config_schema = Schema({
    'database': {
        'host': And(str, len),
        'port': And(Use(int), lambda n: 1024 <= n <= 65535),
    },
    Optional('debug'): bool,
})

config = yaml.safe_load(user_input)
validated_config = config_schema.validate(config)

B701, B702, B703: Template Autoescape

Vulnerable Code:

from jinja2 import Environment

# Autoescape disabled (XSS VULNERABLE)
env = Environment(autoescape=False)
template = env.from_string(user_template)
output = template.render(name=user_input)

Secure Solution:

from jinja2 import Environment, select_autoescape
from markupsafe import Markup, escape

# Enable autoescape (SAFE)
env = Environment(
    autoescape=select_autoescape(['html', 'xml'])
)

# Or for all templates
env = Environment(autoescape=True)

# Explicitly mark safe content
def render_html(content):
    # Sanitize first
    clean_content = escape(content)
    return Markup(clean_content)

# Django: Ensure autoescape is enabled (default)
# In Django templates:
# {{ user_input }}  <!-- Auto-escaped -->
# {{ user_input|safe }}  <!-- Only use after sanitization -->

Best Practices:

  • Always enable autoescape in template engines
  • Never mark user input as safe without sanitization
  • Use Content Security Policy (CSP) headers
  • Validate and sanitize all user inputs
  • Use templating libraries with secure defaults

General Security Principles

  1. Defense in Depth: Implement multiple layers of security controls
  2. Least Privilege: Grant minimum necessary permissions
  3. Fail Securely: Errors should not expose sensitive information
  4. Input Validation: Validate all inputs at trust boundaries
  5. Output Encoding: Encode data based on output context
  6. Secure Defaults: Use secure configurations by default
  7. Keep Dependencies Updated: Regularly update security libraries
  8. Security Testing: Include security tests in CI/CD pipelines

Additional Resources