---
name: haveibeenpwned
description: HaveIBeenPwned API Documentation - Check if email accounts or passwords have been compromised in data breaches
---

# Have I Been Pwned API Skill

Guidance for integrating the Have I Been Pwned (HIBP) API v3 to check accounts, passwords, and domains against known data breaches. It covers building security tools, breach notification systems, and password validation features.

## When to Use This Skill

This skill should be triggered when:
- **Checking if emails/accounts appear in data breaches** - "check if this email was pwned"
- **Validating password security** - "check if password is in breach database"
- **Building breach notification systems** - "notify users about compromised accounts"
- **Implementing password validation** - "prevent users from choosing pwned passwords"
- **Querying stealer logs** - "check if credentials were stolen by malware"
- **Integrating HIBP into authentication flows** - "add breach checking to login"
- **Monitoring domains for compromised emails** - "track breaches affecting our domain"
- **Working with the HIBP API** - any questions about authentication, rate limits, or endpoints

## Quick Reference

### 1. Basic Account Breach Check

```python
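import requests
from urllib.parse import quote

def check_account_breaches(email, api_key):
    """Check if an account appears in any breaches"""
    headers = {
        'hibp-api-key': api_key,
        'user-agent': 'MyApp/1.0'
    }

    # URL-encode the account so characters such as '+' survive in the path
    account = quote(email, safe='')
    url = f'https://haveibeenpwned.com/api/v3/breachedaccount/{account}'
    response = requests.get(url, headers=headers, timeout=10)

    if response.status_code == 200:
        # List of breach objects (names only unless truncateResponse=false)
        return response.json()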
    elif response.status_code == 404:
        return []  # Account not found in breaches
    else:
        response.raise_for_status()

# Usage
breaches = check_account_breaches('user@example.com', 'your-api-key')
print(f"Found in {len(breaches)} breaches")
```

### 2. Password Breach Check (k-Anonymity)

```python
import hashlib
import requests

def check_password_pwned(password):
    """Check if password appears in breaches using k-anonymity"""
    # Hash password with SHA-1
    sha1_hash = hashlib.sha1(password.encode('utf-8')).hexdigest().upper()
    prefix = sha1_hash[:5]
    suffix = sha1_hash[5:]

    # Query API with first 5 characters only
    url = f'https://api.pwnedpasswords.com/range/{prefix}'
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Fail loudly instead of reporting "not found" on an error

    # Parse response for matching suffix
    hashes = (line.split(':') for line in response.text.splitlines())
    for hash_suffix, count in hashes:
        if hash_suffix == suffix:
            return int(count)  # Times password appears in breaches
    return 0  # Password not found

# Usage
count = check_password_pwned('password123')
if count > 0:
    print(f"⚠️ Password found {count} times in breaches!")
```

### 3. Get All Breaches in System

```python
import requests

def get_all_breaches(domain=None):
    """Retrieve all breaches, optionally filtered by domain"""
    url = 'https://haveibeenpwned.com/api/v3/breaches'
    params = {'domain': domain} if domain else {}

    headers = {'user-agent': 'MyApp/1.0'}
    response = requests.get(url, headers=headers, params=params, timeout=10)
    response.raise_for_status()  # Raise on 403, 429, etc. instead of parsing an error body

    return response.json()

# Usage - no authentication required
breaches = get_all_breaches()
print(f"Total breaches: {len(breaches)}")

# Filter by domain
adobe_breaches = get_all_breaches(domain='adobe.com')
```

### 4. Monitor for New Breaches

```python
import requests
import time

def monitor_latest_breach(check_interval=3600):
    """Poll for new breaches every hour"""
    last_breach_name = None

    while True:
        url = 'https://haveibeenpwned.com/api/v3/latestbreach'
        headers = {'user-agent': 'MyApp/1.0'}
        response = requests.get(url, headers=headers)

        if response.status_code == 200:
            breach = response.json()
            if breach['Name'] != last_breach_name:
                print(f"🆕 New breach: {breach['Title']}")
                print(f"   Accounts affected: {breach['PwnCount']:,}")
                last_breach_name = breach['Name']

        time.sleep(check_interval)
```

### 5. Domain-Wide Breach Search

```python
import requests

def search_domain_breaches(domain, api_key):
    """Search for all breached emails in a verified domain"""
    headers = {
        'hibp-api-key': api_key,
        'user-agent': 'MyApp/1.0'
    }

    url = f'https://haveibeenpwned.com/api/v3/breacheddomain/{domain}'
    response = requests.get(url, headers=headers)

    if response.status_code == 200:
        results = response.json()
        # Returns: {"alias1": ["Adobe"], "alias2": ["Adobe", "Gawker"]}
        total_affected = len(results)
        print(f"Found {total_affected} compromised accounts")
        return results
    else:
        response.raise_for_status()
```

### 6. Check Pastes for Account

```python
import requests
from urllib.parse import quote

def check_pastes(email, api_key):
    """Check if email appears in any pastes"""
    headers = {
        'hibp-api-key': api_key,
        'user-agent': 'MyApp/1.0'
    }

    # URL-encode the account before placing it in the path
    account = quote(email, safe='')
    url = f'https://haveibeenpwned.com/api/v3/pasteaccount/{account}'
    response = requests.get(url, headers=headers, timeout=10)

    if response.status_code == 200:
        pastes = response.json()
        for paste in pastes:
            print(f"{paste['Source']}: {paste['Title']}")
            print(f"  Date: {paste['Date']}")
            print(f"  Emails found: {paste['EmailCount']}")
        return pastes
    elif response.status_code == 404:
        return []  # No pastes found
    else:
        response.raise_for_status()  # Surface 401, 403, 429 instead of returning None
```

### 7. Enhanced Password Check with Padding

```python
import hashlib
import requests

def check_password_secure(password):
    """Check password with padding to prevent inference attacks"""
    sha1_hash = hashlib.sha1(password.encode('utf-8')).hexdigest().upper()
    prefix = sha1_hash[:5]
    suffix = sha1_hash[5:]

    headers = {'Add-Padding': 'true'}
    url = f'https://api.pwnedpasswords.com/range/{prefix}'
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # Do not treat an error response as "not found"

    # Parse response, ignore padded entries (count=0)
    for line in response.text.splitlines():
        hash_suffix, count = line.split(':')
        if hash_suffix == suffix and int(count) > 0:
            return int(count)
    return 0
```

### 8. Handle Rate Limiting

```python
import requests
import time

def api_call_with_retry(url, headers, max_retries=3):
    """Make API call with automatic retry on rate limit"""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code == 429:
            # Rate limited - wait and retry
            retry_after = int(response.headers.get('retry-after', 2))
            print(f"Rate limited, waiting {retry_after}s...")
            time.sleep(retry_after)
            continue

        return response

    raise Exception("Max retries exceeded")
```

### 9. Check Subscription Status

```python
import requests

def get_subscription_info(api_key):
    """Retrieve API subscription details and limits"""
    headers = {
        'hibp-api-key': api_key,
        'user-agent': 'MyApp/1.0'
    }

    url = 'https://haveibeenpwned.com/api/v3/subscription/status'
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # Surface 401/429 instead of silently returning None

    info = response.json()
    print(f"Plan: {info['SubscriptionName']}")
    print(f"Rate limit: {info['Rpm']} requests/minute")
    print(f"Valid until: {info['SubscribedUntil']}")
    return info
```

### 10. Stealer Logs Search

```python
import requests
from urllib.parse import quote

def check_stealer_logs(email, api_key):
    """Check if credentials appear in info stealer malware logs"""
    headers = {
        'hibp-api-key': api_key,
        'user-agent': 'MyApp/1.0'
    }

    # URL-encode the email before placing it in the path
    account = quote(email, safe='')
    url = f'https://haveibeenpwned.com/api/v3/stealerlogsbyemail/{account}'
    response = requests.get(url, headers=headers, timeout=10)

    if response.status_code == 200:
        domains = response.json()  # List of website domains
        print(f"Credentials found for {len(domains)} websites")
        return domains
    elif response.status_code == 404:
        return []  # Not found in stealer logs
    else:
        response.raise_for_status()  # Surface 401, 403, 429 instead of returning None

# Requires Pwned 5+ subscription
```

## Key Concepts

### Authentication
- **API Key Format**: 32-character hexadecimal string
- **Header**: `hibp-api-key: {your-key}`
- **User-Agent Required**: Must set valid user-agent header (returns 403 if missing)
- **Test Key**: `00000000000000000000000000000000` for integration testing
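
The header requirements above can be collected in one place. This is a minimal sketch: `build_hibp_headers` is a hypothetical helper, and the validation relies on the 32-character hexadecimal key format stated in the list above.

```python
import re

# Key format as stated above: 32 hexadecimal characters.
_KEY_PATTERN = re.compile(r'[0-9a-fA-F]{32}')

def build_hibp_headers(api_key, user_agent='MyApp/1.0'):
    """Return headers for authenticated HIBP endpoints (hypothetical helper)."""
    if not _KEY_PATTERN.fullmatch(api_key):
        raise ValueError('API key must be a 32-character hexadecimal string')
    if not user_agent.strip():
        # A missing user-agent is rejected by the API with 403
        raise ValueError('user-agent must not be empty')
    return {'hibp-api-key': api_key, 'user-agent': user_agent}

# Usage with the test key
headers = build_hibp_headers('00000000000000000000000000000000')
```

Validating locally catches a mis-pasted key before it costs a request.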

### k-Anonymity Model
The Pwned Passwords API uses **k-anonymity** to protect user privacy:
1. Client hashes password locally with SHA-1
2. Sends only **first 5 characters** of hash to API
3. API returns ~800 matching hash suffixes
4. Client checks locally if full hash matches

This ensures that neither the password nor its full hash **ever leaves your system**.
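
The four steps can be traced offline, without calling the API. In this sketch `split_sha1` and `count_in_range_response` are hypothetical helper names, and `sample_body` is a made-up response body in the `SUFFIX:COUNT` line format, not real API output.

```python
import hashlib

def split_sha1(password):
    """Steps 1-2: hash locally, keep only the 5-character prefix for the request."""
    digest = hashlib.sha1(password.encode('utf-8')).hexdigest().upper()
    return digest[:5], digest[5:]

def count_in_range_response(suffix, response_text):
    """Step 4: look for our suffix among the returned SUFFIX:COUNT lines."""
    for line in response_text.splitlines():
        candidate, _, count = line.strip().partition(':')
        if candidate == suffix:
            return int(count)
    return 0

prefix, suffix = split_sha1('password123')

# Made-up body standing in for step 3; the counts are illustrative only
sample_body = f"{'0' * 35}:3\r\n{suffix}:42\r\n{'F' * 35}:1"

print(f"Prefix sent to API: {prefix}")
print(f"Local match count: {count_in_range_response(suffix, sample_body)}")
```

Only `prefix` would appear in the request URL; the comparison against `suffix` happens entirely on the client.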

### Rate Limiting
- **Varies by subscription tier**: Pwned 5 = 1,000 requests/minute
- **HTTP 429 response** when exceeded with `retry-after` header
- **Pwned Passwords API**: No rate limit
- **Best practice**: Honor the `retry-after` value on 429 responses, and fall back to exponential backoff when it is absent
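
The retry guidance above can be sketched independently of any HTTP library. Here `call_with_backoff` is a hypothetical helper, and `send` is an assumed callable returning `(status_code, headers, body)` with lowercase header names. The delay values and jitter are illustrative choices, not values prescribed by HIBP.

```python
import random
import time

def call_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Retry send() on HTTP 429 with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Exponential backoff: base, 2x base, 4x base, ... capped at max_delay
        backoff = min(max_delay, base_delay * 2 ** attempt)
        try:
            hinted = float(headers.get('retry-after', 0))
        except ValueError:
            hinted = 0.0
        # Wait at least as long as the server asked, plus a little jitter
        sleep(max(hinted, backoff) + random.uniform(0, 0.5))
    raise RuntimeError('Rate limit retries exhausted')

# Usage with scripted responses and a recording sleep (no real waiting)
scripted = [(429, {'retry-after': '3'}, ''), (429, {}, ''), (200, {}, 'ok')]
waits = []
status, body = call_with_backoff(lambda: scripted.pop(0), sleep=waits.append)
print(status, body, len(waits))
```

Injecting `sleep` keeps the helper testable. In production the default `time.sleep` applies.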

### Breach Model Attributes
Key fields in breach objects:
- **Name**: Unique identifier (e.g., "Adobe")
- **Title**: Human-readable name
- **BreachDate**: When breach occurred (ISO 8601)
- **PwnCount**: Total compromised accounts
- **DataClasses**: Types of data exposed (emails, passwords, etc.)
- **IsVerified**: Breach authenticity confirmed
- **IsSensitive**: Excluded from public searches
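
The fields above can be mapped onto a typed object. This is a sketch only: the `Breach` class is hypothetical, `sample` is made-up data rather than a real breach record, and it assumes `BreachDate` arrives as a date-only ISO 8601 string such as `2020-01-15`.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

@dataclass(frozen=True)
class Breach:
    name: str
    title: str
    breach_date: date
    pwn_count: int
    data_classes: List[str]
    is_verified: bool
    is_sensitive: bool

    @classmethod
    def from_api(cls, raw):
        """Build a Breach from one decoded JSON breach object."""
        return cls(
            name=raw['Name'],
            title=raw['Title'],
            breach_date=date.fromisoformat(raw['BreachDate']),
            pwn_count=int(raw['PwnCount']),
            data_classes=list(raw.get('DataClasses', [])),
            is_verified=bool(raw.get('IsVerified', False)),
            is_sensitive=bool(raw.get('IsSensitive', False)),
        )

# Illustrative data only
sample = {
    'Name': 'ExampleBreach',
    'Title': 'Example Breach',
    'BreachDate': '2020-01-15',
    'PwnCount': 12345,
    'DataClasses': ['Email addresses', 'Passwords'],
    'IsVerified': True,
    'IsSensitive': False,
}
breach = Breach.from_api(sample)
print(f"{breach.title}: {breach.pwn_count:,} accounts on {breach.breach_date}")
```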

### Response Codes
| Code | Meaning |
|------|---------|
| 200 | Success - data found |
| 404 | Not found (account not in breaches) |
| 401 | Unauthorized (invalid API key) |
| 403 | Forbidden (missing user-agent) |
| 429 | Rate limit exceeded |
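
One way to keep this table in code is a small lookup. The helper name and the action labels in this sketch are hypothetical. Only the status codes come from the table.

```python
# Status codes from the table above; the action labels are illustrative
_ACTIONS = {
    200: 'found',
    404: 'not_found',
    401: 'check_api_key',
    403: 'set_user_agent',
    429: 'retry_later',
}

def classify_response(status_code):
    """Map an HIBP status code to a coarse next step."""
    return _ACTIONS.get(status_code, 'unexpected')

for code in (200, 404, 429, 500):
    print(code, classify_response(code))
```

Treating 404 as a normal outcome rather than an error keeps "not in any breach" on the success path.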

## Reference Files

This skill includes comprehensive API documentation in `references/`:

- **other.md** - Complete HIBP API v3 reference with all endpoints, authentication, and usage examples

The reference file contains:
- **All API endpoints** - Breaches, pastes, passwords, stealer logs
- **Request/response formats** - Headers, parameters, JSON structures
- **Authentication details** - API key setup and usage
- **Rate limiting information** - Subscription tiers and retry strategies
- **Test accounts** - Pre-configured test data for integration
- **Code examples** - Real-world implementation patterns

Use `view` to read the reference file when you need detailed information about specific endpoints or advanced features.

## Working with This Skill

### For Beginners
Start by understanding the core concepts:
1. **Password checking** - Use Pwned Passwords API (no authentication required)
2. **Account breaches** - Requires API key from haveibeenpwned.com
3. **k-Anonymity** - Learn how password hashing protects privacy

Begin with Quick Reference examples #1 (breach check) and #2 (password check).

### For Integration Projects
Focus on:
1. **Authentication setup** - Get API key and configure headers
2. **Rate limiting** - Implement retry logic (example #8)
3. **Error handling** - Handle 404, 401, 429 responses properly
4. **User experience** - Provide clear messaging about breach exposure

Review Quick Reference examples #5 (domain search) and #9 (subscription info).

### For Production Systems
Consider:
1. **Caching** - Store breach results to reduce API calls
2. **Background processing** - Check breaches asynchronously
3. **Monitoring** - Track new breaches with latest breach endpoint (example #4)
4. **Privacy** - Never log passwords, use k-anonymity model
5. **Compliance** - Follow attribution requirements (CC BY 4.0)
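
The caching point can be sketched as a small in-memory TTL cache. `BreachCache` is a hypothetical class, `fetch` stands for any lookup function such as `check_account_breaches` with the API key bound, and the one-hour TTL is an arbitrary choice. The sketch assumes account lookups are case-insensitive, so keys are lowercased and trimmed.

```python
import time

class BreachCache:
    """Cache breach lookups per account for ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # normalized account -> (stored_at, result)

    def get(self, account):
        key = account.strip().lower()
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # Fresh enough: skip the API call
        result = self._fetch(key)
        self._entries[key] = (now, result)
        return result

# Usage with a stub in place of a real API call
def stub_fetch(account):
    print(f"fetching {account}")
    return []

cache = BreachCache(stub_fetch, ttl_seconds=600)
cache.get('User@example.com')
cache.get('user@example.com')  # Served from cache, no second fetch
```

A shared store such as a database or Redis would replace the dictionary in a multi-process deployment.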

### For Security Tools
Advanced patterns:
1. **Stealer logs** - Check malware-stolen credentials (example #10)
2. **Domain monitoring** - Track all compromised accounts in your organization
3. **Paste monitoring** - Alert on email exposure in public pastes (example #6)
4. **Padding** - Use response padding to prevent inference attacks (example #7)

## Common Patterns

### Pattern 1: Sign-up Password Validation
```python
# Prevent users from choosing compromised passwords
def validate_signup_password(password):
    count = check_password_pwned(password)
    if count > 0:
        return False, f"This password has been seen {count} times in data breaches"
    return True, "Password not found in known breaches"
```

### Pattern 2: Breach Notification System
```python
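# Notify users when their account appears in a new breach.
# get_latest_breach, query_users_in_breach, and send_notification are
# application-specific helpers you supply; they are not part of the HIBP API.
def notify_affected_users():
    latest = get_latest_breach()  # e.g. a wrapper around GET /latestbreach
    affected_users = query_users_in_breach(latest['Name'])
    for user in affected_users:
        send_notification(user, latest)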
```

### Pattern 3: Compliance Check
```python
from datetime import datetime, timezone

# Summarize breached accounts on a verified domain for compliance reporting
def domain_security_audit(domain, api_key):
    breached = search_domain_breaches(domain, api_key)
    report = {
        'total_accounts': len(breached),  # Count of breached aliases, not all accounts
        'affected_accounts': breached,
        'timestamp': datetime.now(timezone.utc)
    }
    return report
```

## API Endpoints Summary

### Authenticated Endpoints (Require API Key)
- `GET /breachedaccount/{account}` - Check account breaches
- `GET /pasteaccount/{account}` - Check pastes
- `GET /breacheddomain/{domain}` - Domain-wide search
- `GET /subscribeddomains` - List verified domains
- `GET /subscription/status` - Check subscription
- `GET /stealerlogsbyemail/{email}` - Stealer logs by email
- `GET /stealerlogsbywebsitedomain/{domain}` - Stealer logs by site
- `GET /stealerlogsbyemaildomain/{domain}` - Stealer logs by email domain

### Public Endpoints (No Authentication)
- `GET /breaches` - All breaches in system
- `GET /breach/{name}` - Single breach details
- `GET /latestbreach` - Most recent breach
- `GET /dataclasses` - List of data types
- `GET https://api.pwnedpasswords.com/range/{prefix}` - Password check
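
The `/api/v3` paths listed above take values in the URL path, so those values need URL-encoding. This sketch uses a hypothetical helper, `endpoint_url`, to fill in a path template with encoded parameters. The base URL is the one used in the Quick Reference examples.

```python
from urllib.parse import quote

API_BASE = 'https://haveibeenpwned.com/api/v3'

def endpoint_url(path_template, **params):
    """Fill a path such as '/breachedaccount/{account}' with URL-encoded values."""
    encoded = {name: quote(str(value), safe='') for name, value in params.items()}
    return API_BASE + path_template.format(**encoded)

print(endpoint_url('/breachedaccount/{account}', account='user+tag@example.com'))
print(endpoint_url('/breach/{name}', name='Adobe'))
print(endpoint_url('/latestbreach'))
```

Passing `safe=''` makes `quote` encode every reserved character, including `/`, `+`, and `@`.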

## Testing

### Test Accounts
Use these on domain `hibp-integration-tests.com`:
- `account-exists@` - Has breaches and pastes
- `multiple-breaches@` - Three different breaches
- `spam-list-only@` - Only spam-flagged breach
- `stealer-log@` - In stealer logs
- `opt-out@` - No results (opted out)

### Test API Key
Use `00000000000000000000000000000000` for integration testing.

## Best Practices

1. **Always set User-Agent** - Required header, returns 403 without it
2. **Use HTTPS only** - API requires TLS 1.2+
3. **Implement retry logic** - Handle 429 rate limits gracefully
4. **Cache breach data** - Reduce API calls for frequently checked accounts
5. **Never log passwords** - Use k-anonymity model, hash locally
6. **Provide attribution** - Link to haveibeenpwned.com (CC BY 4.0 license)
7. **Handle 404 gracefully** - "Not found" is good news for users
8. **Use padding for passwords** - Add `Add-Padding: true` header

## Resources

### Official Links
- API Documentation: https://haveibeenpwned.com/API/v3
- Get API Key: https://haveibeenpwned.com/API/Key
- Dashboard: https://haveibeenpwned.com/DomainSearch

### Community Tools
- **PwnedPasswordsDownloader** (GitHub) - Download full password database
- Integration libraries available for Python, JavaScript, Go, C#, and more

## Acceptable Use

**Permitted:**
- Security tools and breach notifications
- Password validation in authentication systems
- Compliance and security audits
- Educational and research purposes

**Prohibited:**
- Targeting or harming breach victims
- Denial-of-service attacks
- Circumventing security measures
- Misrepresenting data source
- Automating undocumented APIs

Violations may result in API key revocation or IP blocking.

## Notes

- Breach data licensed under **Creative Commons Attribution 4.0**
- Pwned Passwords has no licensing requirements
- CORS only supported for unauthenticated endpoints
- Never expose API keys in client-side code
- Service tracks **917+ breaches** as of API documentation date