zhongwei/gh-agentsecops-secopsagentkit

Fork 0

Files

Zhongwei Li ff1f4bd119 Initial commit

2025-11-29 17:51:02 +08:00

14 KiB

Raw Permalink Blame History

False Positives Management

Strategies for managing false positives in Gitleaks secret detection.

Understanding False Positives
Allowlist Strategies
Common False Positive Patterns
Configuration Examples
Best Practices

Understanding False Positives

False positives occur when legitimate code patterns match secret detection rules.

Categories of False Positives

Example/Placeholder Values: Documentation and examples using fake credentials
Test Fixtures: Test data with credential-like patterns
Non-Secret Constants: Configuration values that match patterns but aren't sensitive
Generated Code: Auto-generated code with high-entropy strings
Comments and Documentation: Explanatory text matching patterns

Impact Assessment

Before allowlisting, verify it's truly a false positive:

# Extract the flagged value
echo "api_key_here" | base64  # Check if valid encoding
curl -H "Authorization: Bearer <token>" https://api.service.com/test  # Test if active

# Check git history for when added
git log -p --all -S "flagged_value"

# Review context around detection
git show <commit-sha>:<file-path>

Allowlist Strategies

1. Path-Based Allowlisting

Exclude entire directories or file patterns:

[allowlist]
description = "Exclude test and documentation files"
paths = [
  '''test/.*''',              # All test directories
  '''tests/.*''',             # Alternative test directory name
  '''.*/fixtures/.*''',       # Test fixtures anywhere
  '''examples/.*''',          # Example code
  '''docs/.*''',              # Documentation
  '''.*\.md$''',              # Markdown files
  '''.*\.rst$''',             # ReStructuredText files
  '''.*_test\.go$''',         # Go test files
  '''.*\.test\.js$''',        # JavaScript test files
  '''.*\.spec\.ts$''',        # TypeScript spec files
]

2. Stopword Allowlisting

Filter out known placeholder values:

[allowlist]
description = "Common placeholder values"
stopwords = [
  "example",
  "placeholder",
  "your_api_key_here",
  "your_secret_here",
  "REPLACEME",
  "CHANGEME",
  "xxxxxx",
  "000000",
  "123456",
  "abcdef",
  "sample",
  "dummy",
  "fake",
  "test_key",
  "mock_token",
]

3. Commit-Based Allowlisting

Allowlist specific commits after manual verification:

[allowlist]
description = "Verified false positives"
commits = [
  "a1b2c3d4e5f6",  # Initial test fixtures - verified 2024-01-15
  "f6e5d4c3b2a1",  # Documentation examples - verified 2024-01-16
]

Add comment explaining why each commit is allowlisted.

4. Regex Allowlisting

Allowlist specific patterns:

[allowlist]
description = "Pattern-based allowlist"
regexes = [
  '''example_api_key_[0-9]+''',            # Example keys with numeric suffix
  '''key\s*=\s*["']EXAMPLE["']''',         # Explicitly marked examples
  '''(?i)test_?password_?[0-9]*''',        # Test passwords
  '''(?i)dummy.*secret''',                 # Dummy secrets
]

5. Rule-Specific Allowlisting

Create exceptions for specific rules only:

[[rules]]
id = "generic-api-key"
description = "Generic API Key"
regex = '''(?i)api_key\s*=\s*["']([a-zA-Z0-9]{32})["']'''

[rules.allowlist]
description = "Allow generic API key pattern in specific contexts"
paths = ['''config/defaults\.yaml''']
regexes = ['''api_key\s*=\s*["']example''']

6. Global vs Rule Allowlists

Global allowlists override rule-specific ones:

# Global allowlist - highest precedence
[allowlist]
description = "Organization-wide exceptions"
paths = ['''vendor/''', '''node_modules/''']

# Rule-specific allowlist
[[rules]]
id = "custom-secret"
[rules.allowlist]
description = "Exceptions only for this rule"
paths = ['''config/template\.yml''']

Common False Positive Patterns

1. Documentation Examples

Problem: README and documentation contain example credentials.

Solution:

[allowlist]
paths = [
  '''README\.md$''',
  '''CONTRIBUTING\.md$''',
  '''docs/.*\.md$''',
  '''.*\.example$''',           # .env.example files
  '''.*\.template$''',          # Template files
  '''.*\.sample$''',            # Sample configurations
]

stopwords = [
  "example.com",
  "user@example.org",
  "YOUR_API_KEY",
]

2. Test Fixtures

Problem: Test data contains credential-like strings for testing credential handling.

Solution:

[allowlist]
paths = [
  '''test/fixtures/.*''',
  '''spec/fixtures/.*''',
  '''.*/testdata/.*''',         # Go convention
  '''.*/mocks/.*''',
  '''cypress/fixtures/.*''',    # Cypress test data
]

# Or use inline comments in code
# password = "test_password_123"  # gitleaks:allow

3. Generated Code

Problem: Code generators produce high-entropy identifiers.

Solution:

[allowlist]
description = "Generated code"
paths = [
  '''.*\.pb\.go$''',            # Protocol buffer generated code
  '''.*_generated\..*''',        # Generated file marker
  '''node_modules/.*''',         # Dependencies
  '''vendor/.*''',               # Vendored dependencies
  '''dist/.*''',                 # Build output
  '''build/.*''',
]

4. Configuration Templates

Problem: Config templates with placeholder values match patterns.

Solution:

[allowlist]
paths = [
  '''config/.*\.template''',
  '''templates/.*''',
  '''.*\.tpl$''',
  '''.*\.tmpl$''',
]

stopwords = [
  "REPLACE_WITH_YOUR",
  "CONFIGURE_ME",
  "SET_THIS_VALUE",
]

5. Base64 Encoded Strings

Problem: Non-secret base64 data flagged due to high entropy.

Solution:

# Increase entropy threshold to reduce false positives
[[rules]]
id = "high-entropy-base64"
regex = '''[a-zA-Z0-9+/]{40,}={0,2}'''
entropy = 5.5  # Increase from default 4.5

Or allowlist specific patterns:

[allowlist]
regexes = [
  '''data:image/[^;]+;base64,''',  # Base64 encoded images
  '''-----BEGIN CERTIFICATE-----''',  # Public certificates (not private keys)
]

6. Public Keys and Certificates

Problem: Public keys detected (which are not secrets).

Solution:

[allowlist]
regexes = [
  '''-----BEGIN PUBLIC KEY-----''',
  '''-----BEGIN CERTIFICATE-----''',
  '''-----BEGIN X509 CERTIFICATE-----''',
]

# But DO NOT allowlist:
# -----BEGIN PRIVATE KEY-----
# -----BEGIN RSA PRIVATE KEY-----

7. UUIDs and Identifiers

Problem: UUIDs match high-entropy patterns.

Solution:

[allowlist]
regexes = [
  '''[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}''',  # UUID
  '''[0-9a-f]{24}''',  # MongoDB ObjectId
]

Or adjust entropy detection:

[[rules]]
id = "generic-high-entropy"
entropy = 6.0  # Only flag very high entropy

Configuration Examples

Minimal Configuration

Start with broad allowlists, refine over time:

title = "Minimal Gitleaks Configuration"

[extend]
useDefault = true  # Use all built-in rules

[allowlist]
description = "Broad allowlist for initial rollout"
paths = [
  '''test/.*''',
  '''.*\.md$''',
  '''vendor/.*''',
  '''node_modules/.*''',
]

stopwords = [
  "example",
  "test",
  "mock",
  "dummy",
]

Strict Configuration

Minimize false positives with targeted allowlists:

title = "Strict Gitleaks Configuration"

[extend]
useDefault = true

[allowlist]
description = "Minimal allowlist - verify all exceptions"

# Only allow specific known false positives
paths = [
  '''docs/api-examples\.md''',     # API documentation with examples
  '''test/fixtures/auth\.json''',  # Authentication test fixtures
]

# Specific known placeholder values
stopwords = [
  "YOUR_API_KEY_HERE",
  "sk_test_example_key_123456789",
]

# Manually verified commits
commits = [
  "abc123def456",  # Test fixtures added - verified 2024-01-15 by security@company.com
]

Balanced Configuration

Balance detection sensitivity with operational overhead:

title = "Balanced Gitleaks Configuration"

[extend]
useDefault = true

[allowlist]
description = "Balanced allowlist"

# Common non-secret paths
paths = [
  '''test/fixtures/.*''',
  '''spec/fixtures/.*''',
  '''.*\.md$''',
  '''docs/.*''',
  '''examples/.*''',
  '''vendor/.*''',
  '''node_modules/.*''',
]

# Common placeholders
stopwords = [
  "example",
  "placeholder",
  "your_key_here",
  "replace_me",
  "changeme",
  "test",
  "dummy",
  "mock",
]

# Public non-secrets
regexes = [
  '''-----BEGIN CERTIFICATE-----''',
  '''-----BEGIN PUBLIC KEY-----''',
  '''data:image/[^;]+;base64,''',
]

Best Practices

1. Document Allowlist Decisions

Always add comments explaining why patterns are allowlisted:

[allowlist]
description = "Verified false positives - reviewed 2024-01-15"

# Test fixtures created during initial test suite development
# Contains only example credentials for testing credential validation
paths = ['''test/fixtures/credentials\.json''']

# Documentation examples using clearly fake values
# All examples prefixed with "example_" or "test_"
stopwords = ["example_", "test_"]

2. Regular Allowlist Review

Schedule periodic reviews:

#!/bin/bash
# review-allowlist.sh

echo "Gitleaks Allowlist Review"
echo "========================="
echo ""

# Show allowlist paths
echo "Allowlisted paths:"
grep -A 10 "^\[allowlist\]" .gitleaks.toml | grep "paths = "

# Show allowlisted commits
echo ""
echo "Allowlisted commits:"
grep -A 10 "^\[allowlist\]" .gitleaks.toml | grep "commits = "

# Check if commits still exist
# (May have been removed in history rewrite)
git rev-parse --verify abc123def456 2>/dev/null || echo "WARNING: Commit abc123def456 not found"

3. Use Inline Annotations Sparingly

For one-off false positives, use inline comments:

# This is a test password for unit tests only
# gitleaks:allow
TEST_PASSWORD = "test_password_123"

Warning: Overuse of inline annotations indicates poorly tuned configuration.

4. Version Control Your Configuration

Track changes to .gitleaks.toml:

git log -p .gitleaks.toml

# See who allowlisted what and when
git blame .gitleaks.toml

5. Test Allowlist Changes

Before committing allowlist changes:

# Test configuration
gitleaks detect --config .gitleaks.toml -v

# Verify specific file is now allowed
gitleaks detect --config .gitleaks.toml --source test/fixtures/credentials.json

# Verify secret is still caught in production code
echo 'api_key = "sk_live_actual_key"' > /tmp/test_detection.py
gitleaks detect --config .gitleaks.toml --source /tmp/test_detection.py --no-git

6. Separate Allowlists by Environment

Use different configurations for different contexts:

# Strict config for production code
gitleaks detect --config .gitleaks.strict.toml --source src/

# Lenient config for test code
gitleaks detect --config .gitleaks.lenient.toml --source test/

7. Monitor False Positive Rate

Track metrics over time:

# Total findings
TOTAL=$(gitleaks detect --report-format json 2>/dev/null | jq '. | length')

# Run with allowlist
AFTER_FILTER=$(gitleaks detect --config .gitleaks.toml --report-format json 2>/dev/null | jq '. | length')

# Calculate reduction
echo "False positive reduction: $(($TOTAL - $AFTER_FILTER)) / $TOTAL"

Target: < 10% false positive rate for good developer experience.

8. Security Review for New Allowlists

Require security team approval for:

New allowlisted paths in src/ or production code
New allowlisted commits (verify manually first)
Changes to rule-specific allowlists
New stopwords that could mask real secrets

9. Avoid Overly Broad Patterns

Bad (too broad):

[allowlist]
paths = ['''.*''']  # Disables all detection!
stopwords = ["key", "secret"]  # Matches too many real secrets

Good (specific):

[allowlist]
paths = ['''test/unit/.*\.test\.js$''']  # Specific test directory
stopwords = ["example_key", "test_secret"]  # Specific placeholders

10. Escape Special Characters

When using regex patterns, escape properly:

[allowlist]
regexes = [
  '''api\.example\.com''',  # Literal dot
  '''config\[\'key\'\]''',  # Literal brackets and quotes
]

Troubleshooting False Positives

Issue: Can't Identify Source of False Positive

# Run with verbose output
gitleaks detect -v | grep "RuleID"

# Get detailed finding information
gitleaks detect --report-format json | jq '.[] | {file: .File, line: .StartLine, rule: .RuleID}'

# View context around detection
gitleaks detect --report-format json | jq -r '.[0] | .File, .StartLine' | xargs -I {} sh -c 'sed -n "{}-5,{}+5p" {}'

Issue: Allowlist Not Working

# Verify config is loaded
gitleaks detect --config .gitleaks.toml -v 2>&1 | grep "config"

# Check regex syntax
echo "test_string" | grep -E 'your_regex_pattern'

# Test path matching
echo "test/fixtures/file.json" | grep -E 'test/fixtures/.*'

Issue: Too Many False Positives

Export findings: gitleaks detect --report-format json > findings.json
Analyze patterns: jq -r '.[].File' findings.json | sort | uniq -c | sort -rn
Group by rule: jq -r '.[].RuleID' findings.json | sort | uniq -c | sort -rn
Create targeted allowlists based on analysis

False Positive vs Real Secret

When unsure, err on the side of caution:

Indicator	False Positive	Real Secret
Location	Test/docs/examples	Production code
Pattern	"example", "test", "mock"	No such indicators
Entropy	Low/medium	High
Format	Incomplete/truncated	Complete/valid
Context	Educational comments	Functional code
Git history	Added in test commits	Added furtively

When in doubt: Treat as real secret and investigate.

14 KiB Raw Permalink Blame History

False Positives Management

Table of Contents

Understanding False Positives

Categories of False Positives

Impact Assessment

Allowlist Strategies

1. Path-Based Allowlisting

2. Stopword Allowlisting

3. Commit-Based Allowlisting

4. Regex Allowlisting

5. Rule-Specific Allowlisting

6. Global vs Rule Allowlists

Common False Positive Patterns

1. Documentation Examples

2. Test Fixtures

3. Generated Code

4. Configuration Templates

5. Base64 Encoded Strings

6. Public Keys and Certificates

7. UUIDs and Identifiers

Configuration Examples

Minimal Configuration

Strict Configuration

Balanced Configuration

Best Practices

1. Document Allowlist Decisions

2. Regular Allowlist Review

3. Use Inline Annotations Sparingly

4. Version Control Your Configuration

5. Test Allowlist Changes

6. Separate Allowlists by Environment

7. Monitor False Positive Rate

8. Security Review for New Allowlists

9. Avoid Overly Broad Patterns

10. Escape Special Characters

Troubleshooting False Positives

Issue: Can't Identify Source of False Positive

Issue: Allowlist Not Working

Issue: Too Many False Positives

False Positive vs Real Secret

14 KiB

Raw Permalink Blame History