False Positives Management

Strategies for managing false positives in Gitleaks secret detection.

Understanding False Positives

False positives occur when legitimate code patterns match secret detection rules.

Categories of False Positives

  1. Example/Placeholder Values: Documentation and examples using fake credentials
  2. Test Fixtures: Test data with credential-like patterns
  3. Non-Secret Constants: Configuration values that match patterns but aren't sensitive
  4. Generated Code: Auto-generated code with high-entropy strings
  5. Comments and Documentation: Explanatory text matching patterns

Impact Assessment

Before allowlisting, verify it's truly a false positive:

# Check whether the flagged value decodes as valid base64
echo "api_key_here" | base64 --decode
curl -H "Authorization: Bearer <token>" https://api.service.com/test  # Test if active

# Check git history for when added
git log -p --all -S "flagged_value"

# Review context around detection
git show <commit-sha>:<file-path>

Allowlist Strategies

1. Path-Based Allowlisting

Exclude entire directories or file patterns:

[allowlist]
description = "Exclude test and documentation files"
paths = [
  '''test/.*''',              # All test directories
  '''tests/.*''',             # Alternative test directory name
  '''.*/fixtures/.*''',       # Test fixtures anywhere
  '''examples/.*''',          # Example code
  '''docs/.*''',              # Documentation
  '''.*\.md$''',              # Markdown files
  '''.*\.rst$''',             # ReStructuredText files
  '''.*_test\.go$''',         # Go test files
  '''.*\.test\.js$''',        # JavaScript test files
  '''.*\.spec\.ts$''',        # TypeScript spec files
]
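
Path patterns are regular expressions, and a pattern without anchors can match in the middle of a path. The sketch below is one way to try candidate patterns before adding them. It assumes `grep -E` is a close enough stand-in for the Go RE2 syntax Gitleaks uses, and `path_matches` is a hypothetical helper, not part of Gitleaks:

```bash
# path_matches PATTERN PATH
# Succeeds when PATTERN (an extended regex) matches anywhere in PATH.
# Approximation only: grep -E is not identical to Go RE2.
path_matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# An unanchored pattern also matches directories that merely end in "test/"
path_matches 'test/.*' 'src/contest/keys.py' \
  && echo "unanchored: matches src/contest/keys.py"

# Anchoring to the start of a path segment is narrower
path_matches '(^|/)test/.*' 'src/contest/keys.py' \
  || echo "anchored: does not match src/contest/keys.py"
path_matches '(^|/)test/.*' 'pkg/test/data.json' \
  && echo "anchored: matches pkg/test/data.json"
```

If the broader match is not intended, prefixing directory patterns with `(^|/)` keeps them tied to a whole path segment.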

2. Stopword Allowlisting

Filter out known placeholder values:

[allowlist]
description = "Common placeholder values"
stopwords = [
  "example",
  "placeholder",
  "your_api_key_here",
  "your_secret_here",
  "REPLACEME",
  "CHANGEME",
  "xxxxxx",
  "000000",
  "123456",
  "abcdef",
  "sample",
  "dummy",
  "fake",
  "test_key",
  "mock_token",
]
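
Short stopwords such as "123456" or "abcdef" deserve extra care. The sketch below assumes Gitleaks treats a stopword as a case-insensitive substring of the extracted secret; that is an assumption about its matching behavior, so confirm it against the version you run. `contains_stopword` is a hypothetical helper and both sample values are made up:

```bash
# contains_stopword SECRET STOPWORD...
# Succeeds if any STOPWORD occurs inside SECRET, ignoring case.
contains_stopword() {
  secret=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  shift
  for word in "$@"; do
    word=$(printf '%s' "$word" | tr '[:upper:]' '[:lower:]')
    case "$secret" in
      *"$word"*) return 0 ;;
    esac
  done
  return 1
}

# A placeholder is suppressed, as intended
contains_stopword 'your_api_key_here' 'your_api_key_here' '123456' \
  && echo "suppressed: placeholder"

# A realistic-looking value is suppressed too, because it contains "123456"
contains_stopword 'k9Xv123456QmZt7RwLp2' 'your_api_key_here' '123456' \
  && echo "suppressed: realistic-looking key"
```

Under that assumption, longer and more distinctive stopwords are the safer choice.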

3. Commit-Based Allowlisting

Allowlist specific commits after manual verification:

[allowlist]
description = "Verified false positives"
commits = [
  "a1b2c3d4e5f6",  # Initial test fixtures - verified 2024-01-15
  "f6e5d4c3b2a1",  # Documentation examples - verified 2024-01-16
]

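Add a comment explaining why each commit is allowlisted. The hashes above are shortened for readability; Gitleaks is expected to compare each entry against the full commit SHA, so list the full 40-character hash in a real configuration.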

4. Regex Allowlisting

Allowlist specific patterns:

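[allowlist]
description = "Pattern-based allowlist"
# By default allowlist regexes are checked against the extracted secret only.
# regexTarget = "line" lets patterns use surrounding context such as `key =`
# (requires a Gitleaks version that supports regexTarget).
regexTarget = "line"
regexes = [
  '''example_api_key_[0-9]+''',            # Example keys with numeric suffix
  '''key\s*=\s*["']EXAMPLE["']''',         # Explicitly marked examples
  '''(?i)test_?password_?[0-9]*''',        # Test passwords
  '''(?i)dummy.*secret''',                 # Dummy secrets
]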

5. Rule-Specific Allowlisting

Create exceptions for specific rules only:

[[rules]]
id = "generic-api-key"
description = "Generic API Key"
regex = '''(?i)api_key\s*=\s*["']([a-zA-Z0-9]{32})["']'''

[rules.allowlist]
description = "Allow generic API key pattern in specific contexts"
paths = ['''config/defaults\.yaml''']
regexTarget = "match"  # Check the whole match; the default target is only the captured secret
regexes = ['''api_key\s*=\s*["']example''']

6. Global vs Rule Allowlists

The global allowlist applies to every rule and takes precedence over rule-specific allowlists:

# Global allowlist - highest precedence
[allowlist]
description = "Organization-wide exceptions"
paths = ['''vendor/''', '''node_modules/''']

# Rule-specific allowlist
[[rules]]
id = "custom-secret"
[rules.allowlist]
description = "Exceptions only for this rule"
paths = ['''config/template\.yml''']

Common False Positive Patterns

1. Documentation Examples

Problem: README and documentation contain example credentials.

Solution:

[allowlist]
paths = [
  '''README\.md$''',
  '''CONTRIBUTING\.md$''',
  '''docs/.*\.md$''',
  '''.*\.example$''',           # .env.example files
  '''.*\.template$''',          # Template files
  '''.*\.sample$''',            # Sample configurations
]

stopwords = [
  "example.com",
  "user@example.org",
  "YOUR_API_KEY",
]

2. Test Fixtures

Problem: Test data contains credential-like strings for testing credential handling.

Solution:

[allowlist]
paths = [
  '''test/fixtures/.*''',
  '''spec/fixtures/.*''',
  '''.*/testdata/.*''',         # Go convention
  '''.*/mocks/.*''',
  '''cypress/fixtures/.*''',    # Cypress test data
]

# Or use inline comments in code
# password = "test_password_123"  # gitleaks:allow

3. Generated Code

Problem: Code generators produce high-entropy identifiers.

Solution:

[allowlist]
description = "Generated code"
paths = [
  '''.*\.pb\.go$''',            # Protocol buffer generated code
  '''.*_generated\..*''',        # Generated file marker
  '''node_modules/.*''',         # Dependencies
  '''vendor/.*''',               # Vendored dependencies
  '''dist/.*''',                 # Build output
  '''build/.*''',
]

4. Configuration Templates

Problem: Config templates with placeholder values match patterns.

Solution:

[allowlist]
paths = [
  '''config/.*\.template''',
  '''templates/.*''',
  '''.*\.tpl$''',
  '''.*\.tmpl$''',
]

stopwords = [
  "REPLACE_WITH_YOUR",
  "CONFIGURE_ME",
  "SET_THIS_VALUE",
]

5. Base64 Encoded Strings

Problem: Non-secret base64 data flagged due to high entropy.

Solution:

# Raise the entropy threshold to reduce false positives
[[rules]]
id = "high-entropy-base64"
regex = '''[a-zA-Z0-9+/]{40,}={0,2}'''
entropy = 5.5  # Example value, raised from 4.5; base64 text cannot exceed about 6.0 bits per character

Or allowlist specific patterns:

[allowlist]
regexTarget = "line"  # Check the whole line; these prefixes sit outside the extracted secret
regexes = [
  '''data:image/[^;]+;base64,''',  # Base64 encoded images
  '''-----BEGIN CERTIFICATE-----''',  # Public certificates (not private keys)
]

6. Public Keys and Certificates

Problem: Public keys detected (which are not secrets).

Solution:

[allowlist]
regexes = [
  '''-----BEGIN PUBLIC KEY-----''',
  '''-----BEGIN CERTIFICATE-----''',
  '''-----BEGIN X509 CERTIFICATE-----''',
]

# But DO NOT allowlist:
# -----BEGIN PRIVATE KEY-----
# -----BEGIN RSA PRIVATE KEY-----

7. UUIDs and Identifiers

Problem: UUIDs match high-entropy patterns.

Solution:

[allowlist]
regexes = [
  '''^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$''',  # UUID
  '''^[0-9a-f]{24}$''',  # MongoDB ObjectId; anchored so it cannot match inside a longer hex secret
]

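Or raise the entropy threshold on the rule that produces the findings:

[[rules]]
id = "generic-high-entropy"
regex = '''[a-zA-Z0-9+/_-]{32,}'''  # Illustrative pattern; a rule needs a regex or path to have any effect
entropy = 4.5  # Hex strings and UUIDs cannot exceed about 4.1 bits per character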
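
To pick a threshold, it helps to measure the entropy of representative strings first. A rough sketch, assuming the threshold is compared against Shannon entropy in bits per character of the matched text; `shannon_entropy` is a hypothetical helper built on awk:

```bash
# shannon_entropy STRING
# Prints the Shannon entropy of STRING in bits per character (two decimals).
shannon_entropy() {
  printf '%s\n' "$1" | awk '{
    n = length($0)
    if (n == 0) { print "0.00"; exit }
    # Count how often each character occurs
    for (i = 1; i <= n; i++) count[substr($0, i, 1)]++
    h = 0
    for (c in count) {
      p = count[c] / n
      h -= p * log(p) / log(2)   # awk log() is the natural logarithm
    }
    printf "%.2f\n", h
  }'
}

shannon_entropy 'abcd'               # 2.00
shannon_entropy '0123456789abcdef'   # 4.00
```

A string drawn from 16 hex symbols can never exceed 4.0 bits per character, and one drawn from 64 base64 symbols can never exceed 6.0. A threshold at or near the maximum for a rule's alphabet therefore suppresses almost every match.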

Configuration Examples

Minimal Configuration

Start with broad allowlists, refine over time:

title = "Minimal Gitleaks Configuration"

[extend]
useDefault = true  # Use all built-in rules

[allowlist]
description = "Broad allowlist for initial rollout"
paths = [
  '''test/.*''',
  '''.*\.md$''',
  '''vendor/.*''',
  '''node_modules/.*''',
]

stopwords = [
  "example",
  "test",
  "mock",
  "dummy",
]

Strict Configuration

Allowlist only specific, verified false positives so that detection stays as broad as possible:

title = "Strict Gitleaks Configuration"

[extend]
useDefault = true

[allowlist]
description = "Minimal allowlist - verify all exceptions"

# Only allow specific known false positives
paths = [
  '''docs/api-examples\.md''',     # API documentation with examples
  '''test/fixtures/auth\.json''',  # Authentication test fixtures
]

# Specific known placeholder values
stopwords = [
  "YOUR_API_KEY_HERE",
  "sk_test_example_key_123456789",
]

# Manually verified commits
commits = [
  "abc123def456",  # Test fixtures added - verified 2024-01-15 by security@company.com
]

Balanced Configuration

Balance detection sensitivity with operational overhead:

title = "Balanced Gitleaks Configuration"

[extend]
useDefault = true

[allowlist]
description = "Balanced allowlist"

# Common non-secret paths
paths = [
  '''test/fixtures/.*''',
  '''spec/fixtures/.*''',
  '''.*\.md$''',
  '''docs/.*''',
  '''examples/.*''',
  '''vendor/.*''',
  '''node_modules/.*''',
]

# Common placeholders
stopwords = [
  "example",
  "placeholder",
  "your_key_here",
  "replace_me",
  "changeme",
  "test",
  "dummy",
  "mock",
]

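# Public non-secrets
# regexTarget = "line" checks the whole line; by default only the extracted secret is checked
regexTarget = "line"
regexes = [
  '''-----BEGIN CERTIFICATE-----''',
  '''-----BEGIN PUBLIC KEY-----''',
  '''data:image/[^;]+;base64,''',
]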

Best Practices

1. Document Allowlist Decisions

Always add comments explaining why patterns are allowlisted:

[allowlist]
description = "Verified false positives - reviewed 2024-01-15"

# Test fixtures created during initial test suite development
# Contains only example credentials for testing credential validation
paths = ['''test/fixtures/credentials\.json''']

# Documentation examples using clearly fake values
# All examples prefixed with "example_" or "test_"
stopwords = ["example_", "test_"]

2. Regular Allowlist Review

Schedule periodic reviews:

#!/bin/bash
# review-allowlist.sh

echo "Gitleaks Allowlist Review"
echo "========================="
echo ""

# Show allowlisted paths (arrays usually span several lines, so print through the closing bracket)
echo "Allowlisted paths:"
awk '/^paths = \[/ {p=1} p {print} p && /\]$/ {p=0}' .gitleaks.toml

# Show allowlisted commits
echo ""
echo "Allowlisted commits:"
awk '/^commits = \[/ {p=1} p {print} p && /\]$/ {p=0}' .gitleaks.toml

# Check if commits still exist
# (May have been removed in history rewrite)
git rev-parse --verify abc123def456 2>/dev/null || echo "WARNING: Commit abc123def456 not found"
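
The existence check above uses one hard-coded hash. A possible generalization reads the hashes from the configuration instead. This is a line-based sketch, not a TOML parser: it assumes double-quoted hex values inside `commits = [ ... ]`, and both function names are hypothetical:

```bash
# allowlisted_commits CONFIG
# Prints every quoted hex value inside a `commits = [ ... ]` array, one per line.
allowlisted_commits() {
  awk '
    /^[ \t]*commits[ \t]*=/ { in_list = 1 }
    in_list {
      line = $0
      sub(/#.*/, "", line)                 # ignore trailing comments
      closes = (line ~ /\]/)               # array ends on this line
      while (match(line, /"[0-9a-fA-F]+"/)) {
        print substr(line, RSTART + 1, RLENGTH - 2)
        line = substr(line, RSTART + RLENGTH)
      }
      if (closes) in_list = 0
    }
  ' "$1"
}

# check_allowlisted_commits CONFIG
# Warns about allowlisted commits that are missing from the current repository.
check_allowlisted_commits() {
  allowlisted_commits "$1" | while read -r sha; do
    git rev-parse --verify --quiet "${sha}^{commit}" >/dev/null \
      || echo "WARNING: Commit ${sha} not found"
  done
}

# Usage (run inside the repository):
#   check_allowlisted_commits .gitleaks.toml
```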

3. Use Inline Annotations Sparingly

For one-off false positives, add a gitleaks:allow comment on the same line as the flagged value:

# This is a test password for unit tests only
TEST_PASSWORD = "test_password_123"  # gitleaks:allow

Warning: Overuse of inline annotations indicates a poorly tuned configuration.
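
One way to keep an eye on this is to count annotations per file. A minimal sketch using only find, grep, awk, and sort; the helper name `count_inline_allows` is hypothetical:

```bash
# count_inline_allows DIR
# Prints "<count> <file>" for each file under DIR that contains gitleaks:allow,
# most annotated first. The .git directory is skipped.
count_inline_allows() {
  find "$1" -path '*/.git' -prune -o -type f \
    -exec grep -c 'gitleaks:allow' /dev/null {} + |
    awk -F: '$NF > 0 { n = $NF; sub(/:[0-9]+$/, ""); print n, $0 }' |
    sort -rn
}

# Usage:
#   count_inline_allows .
```

Files that accumulate many annotations are candidates for a path or rule-specific allowlist instead.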

4. Version Control Your Configuration

Track changes to .gitleaks.toml:

git log -p .gitleaks.toml

# See who allowlisted what and when
git blame .gitleaks.toml

5. Test Allowlist Changes

Before committing allowlist changes:

# Test configuration
gitleaks detect --config .gitleaks.toml -v

# Verify specific file is now allowed (--no-git scans the path as plain files)
gitleaks detect --config .gitleaks.toml --source test/fixtures/credentials.json --no-git

# Verify secret is still caught in production code
echo 'api_key = "sk_live_actual_key"' > /tmp/test_detection.py
gitleaks detect --config .gitleaks.toml --source /tmp/test_detection.py --no-git

6. Separate Allowlists by Environment

Use different configurations for different contexts:

# Strict config for production code (--no-git scans the directory as plain files)
gitleaks detect --config .gitleaks.strict.toml --source src/ --no-git

# Lenient config for test code
gitleaks detect --config .gitleaks.lenient.toml --source test/ --no-git

7. Monitor False Positive Rate

Track metrics over time:

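# Total findings with built-in rules only. Without --config, Gitleaks loads
# .gitleaks.toml from the repository, so point it at a baseline file instead
# (.gitleaks.baseline.toml is assumed to contain only [extend] useDefault = true).
gitleaks detect --config .gitleaks.baseline.toml --report-format json --report-path /tmp/baseline.json --exit-code 0
TOTAL=$(jq 'length' /tmp/baseline.json)

# Run with allowlist
gitleaks detect --config .gitleaks.toml --report-format json --report-path /tmp/filtered.json --exit-code 0
AFTER_FILTER=$(jq 'length' /tmp/filtered.json)

# Calculate reduction
echo "Findings suppressed by allowlist: $((TOTAL - AFTER_FILTER)) / $TOTAL"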

Target: < 10% false positive rate for good developer experience.
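
The difference between the two runs measures how many findings the allowlist suppresses, not how many of the remaining findings are false positives. Comparing against the 10% target needs triaged counts. A small sketch, assuming reviewers record how many remaining findings they judged to be false positives; the helper name and the numbers are illustrative:

```bash
# false_positive_rate FALSE_POSITIVES TOTAL_FINDINGS
# Prints the share of findings that reviewers marked as false positives.
false_positive_rate() {
  if [ "$2" -eq 0 ]; then
    echo "n/a"
    return 0
  fi
  awk -v fp="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", fp * 100 / total }'
}

# 3 of 40 remaining findings were triaged as false positives
false_positive_rate 3 40   # 7.5%
```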

8. Security Review for New Allowlists

Require security team approval for:

  • New allowlisted paths in src/ or production code
  • New allowlisted commits (verify manually first)
  • Changes to rule-specific allowlists
  • New stopwords that could mask real secrets

9. Avoid Overly Broad Patterns

Bad (too broad):

[allowlist]
paths = ['''.*''']  # Disables all detection!
stopwords = ["key", "secret"]  # Matches too many real secrets

Good (specific):

[allowlist]
paths = ['''test/unit/.*\.test\.js$''']  # Specific test directory
stopwords = ["example_key", "test_secret"]  # Specific placeholders

10. Escape Special Characters

When using regex patterns, escape properly:

[allowlist]
regexes = [
  '''api\.example\.com''',  # Literal dot
  '''config\[\'key\'\]''',  # Literal brackets and quotes
]
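
Escaping by hand is easy to get wrong for longer literals. A minimal sketch of a helper that escapes common regex metacharacters with sed; `regex_escape` is a hypothetical name:

```bash
# regex_escape STRING
# Prints STRING with regex metacharacters backslash-escaped.
regex_escape() {
  printf '%s\n' "$1" | sed 's/[][\.*^$+?(){}|]/\\&/g'
}

regex_escape 'api.example.com'   # api\.example\.com
regex_escape "config['key']"     # config\['key'\]
```

TOML literal strings ('''...''') do not process backslashes, so the output can be pasted between the triple quotes unchanged.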

Troubleshooting False Positives

Issue: Can't Identify Source of False Positive

# Run with verbose output
gitleaks detect -v | grep "RuleID"

# Write a JSON report, then list file, line, and rule for each finding
gitleaks detect --report-format json --report-path findings.json
jq '.[] | {file: .File, line: .StartLine, rule: .RuleID}' findings.json

# View context around the first finding (5 lines either side)
FILE=$(jq -r '.[0].File' findings.json)
LINE=$(jq -r '.[0].StartLine' findings.json)
sed -n "$(( LINE > 5 ? LINE - 5 : 1 )),$(( LINE + 5 ))p" "$FILE"

Issue: Allowlist Not Working

# Verify config is loaded
gitleaks detect --config .gitleaks.toml -v 2>&1 | grep "config"

# Check regex syntax (grep -E only approximates the Go RE2 syntax Gitleaks uses; flags such as (?i) are not supported)
echo "test_string" | grep -E 'your_regex_pattern'

# Test path matching
echo "test/fixtures/file.json" | grep -E 'test/fixtures/.*'

Issue: Too Many False Positives

  1. Export findings: gitleaks detect --report-format json --report-path findings.json
  2. Analyze patterns: jq -r '.[].File' findings.json | sort | uniq -c | sort -rn
  3. Group by rule: jq -r '.[].RuleID' findings.json | sort | uniq -c | sort -rn
  4. Create targeted allowlists based on analysis
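
Step 2 can be taken one level up by grouping findings by top-level directory, which often points directly at a path allowlist candidate. A sketch that reads file paths on stdin; `top_level_counts` is a hypothetical helper:

```bash
# top_level_counts
# Reads file paths on stdin (one per line) and prints "<count> <top-level dir>",
# highest count first. Files in the repository root are grouped together.
top_level_counts() {
  awk -F/ '
    { key = (NF > 1) ? ($1 "/") : "(repo root)"; count[key]++ }
    END { for (k in count) print count[k], k }
  ' | LC_ALL=C sort -k1,1nr -k2
}

# Usage:
#   jq -r '.[].File' findings.json | top_level_counts
```

A directory that dominates the output and holds only tests, fixtures, or vendored code is a reasonable first allowlist entry; a production directory near the top calls for rule tuning instead.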

False Positive vs Real Secret

When unsure, err on the side of caution:

Indicator      False Positive               Real Secret
-----------    -------------------------    ------------------
Location       Test/docs/examples           Production code
Pattern        "example", "test", "mock"    No such indicators
Entropy        Low/medium                   High
Format         Incomplete/truncated         Complete/valid
Context        Educational comments         Functional code
Git history    Added in test commits        Added furtively

When in doubt: Treat as real secret and investigate.