480 lines
15 KiB
Markdown
480 lines
15 KiB
Markdown
---
|
|
name: haveibeenpwned
|
|
description: HaveIBeenPwned API Documentation - Check if email accounts or passwords have been compromised in data breaches
|
|
---
|
|
|
|
# Have I Been Pwned API Skill
|
|
|
|
Expert assistance for integrating the Have I Been Pwned (HIBP) API v3 to check for compromised accounts, passwords, and data breaches. This skill provides comprehensive guidance for building security tools, breach notification systems, and password validation features.
|
|
|
|
## When to Use This Skill
|
|
|
|
This skill should be triggered when:
|
|
- **Checking if emails/accounts appear in data breaches** - "check if this email was pwned"
|
|
- **Validating password security** - "check if password is in breach database"
|
|
- **Building breach notification systems** - "notify users about compromised accounts"
|
|
- **Implementing password validation** - "prevent users from choosing pwned passwords"
|
|
- **Querying stealer logs** - "check if credentials were stolen by malware"
|
|
- **Integrating HIBP into authentication flows** - "add breach checking to login"
|
|
- **Monitoring domains for compromised emails** - "track breaches affecting our domain"
|
|
- **Working with the HIBP API** - any questions about authentication, rate limits, or endpoints
|
|
|
|
## Quick Reference
|
|
|
|
### 1. Basic Account Breach Check
|
|
|
|
```python
|
|
import requests
|
|
|
|
def check_account_breaches(email, api_key):
|
|
"""Check if an account appears in any breaches"""
|
|
headers = {
|
|
'hibp-api-key': api_key,
|
|
'user-agent': 'MyApp/1.0'
|
|
}
|
|
|
|
url = f'https://haveibeenpwned.com/api/v3/breachedaccount/{email}'
|
|
response = requests.get(url, headers=headers)
|
|
|
|
if response.status_code == 200:
|
|
return response.json() # List of breach objects
|
|
elif response.status_code == 404:
|
|
return [] # Account not found in breaches
|
|
else:
|
|
response.raise_for_status()
|
|
|
|
# Usage
|
|
breaches = check_account_breaches('user@example.com', 'your-api-key')
|
|
print(f"Found in {len(breaches)} breaches")
|
|
```
|
|
|
|
### 2. Password Breach Check (k-Anonymity)
|
|
|
|
```python
|
|
import hashlib
|
|
import requests
|
|
|
|
def check_password_pwned(password):
|
|
"""Check if password appears in breaches using k-anonymity"""
|
|
# Hash password with SHA-1
|
|
sha1_hash = hashlib.sha1(password.encode('utf-8')).hexdigest().upper()
|
|
prefix = sha1_hash[:5]
|
|
suffix = sha1_hash[5:]
|
|
|
|
# Query API with first 5 characters only
|
|
url = f'https://api.pwnedpasswords.com/range/{prefix}'
|
|
response = requests.get(url)
|
|
|
|
# Parse response for matching suffix
|
|
hashes = (line.split(':') for line in response.text.splitlines())
|
|
for hash_suffix, count in hashes:
|
|
if hash_suffix == suffix:
|
|
return int(count) # Times password appears in breaches
|
|
return 0 # Password not found
|
|
|
|
# Usage
|
|
count = check_password_pwned('password123')
|
|
if count > 0:
|
|
print(f"⚠️ Password found {count} times in breaches!")
|
|
```
|
|
|
|
### 3. Get All Breaches in System
|
|
|
|
```python
|
|
import requests
|
|
|
|
def get_all_breaches(domain=None):
|
|
"""Retrieve all breaches, optionally filtered by domain"""
|
|
url = 'https://haveibeenpwned.com/api/v3/breaches'
|
|
params = {'domain': domain} if domain else {}
|
|
|
|
headers = {'user-agent': 'MyApp/1.0'}
|
|
response = requests.get(url, headers=headers, params=params)
|
|
|
|
return response.json()
|
|
|
|
# Usage - no authentication required
|
|
breaches = get_all_breaches()
|
|
print(f"Total breaches: {len(breaches)}")
|
|
|
|
# Filter by domain
|
|
adobe_breaches = get_all_breaches(domain='adobe.com')
|
|
```
|
|
|
|
### 4. Monitor for New Breaches
|
|
|
|
```python
|
|
import requests
|
|
import time
|
|
|
|
def monitor_latest_breach(check_interval=3600):
|
|
"""Poll for new breaches every hour"""
|
|
last_breach_name = None
|
|
|
|
while True:
|
|
url = 'https://haveibeenpwned.com/api/v3/latestbreach'
|
|
headers = {'user-agent': 'MyApp/1.0'}
|
|
response = requests.get(url, headers=headers)
|
|
|
|
if response.status_code == 200:
|
|
breach = response.json()
|
|
if breach['Name'] != last_breach_name:
|
|
print(f"🆕 New breach: {breach['Title']}")
|
|
print(f" Accounts affected: {breach['PwnCount']:,}")
|
|
last_breach_name = breach['Name']
|
|
|
|
time.sleep(check_interval)
|
|
```
|
|
|
|
### 5. Domain-Wide Breach Search
|
|
|
|
```python
|
|
import requests
|
|
|
|
def search_domain_breaches(domain, api_key):
|
|
"""Search for all breached emails in a verified domain"""
|
|
headers = {
|
|
'hibp-api-key': api_key,
|
|
'user-agent': 'MyApp/1.0'
|
|
}
|
|
|
|
url = f'https://haveibeenpwned.com/api/v3/breacheddomain/{domain}'
|
|
response = requests.get(url, headers=headers)
|
|
|
|
if response.status_code == 200:
|
|
results = response.json()
|
|
# Returns: {"alias1": ["Adobe"], "alias2": ["Adobe", "Gawker"]}
|
|
total_affected = len(results)
|
|
print(f"Found {total_affected} compromised accounts")
|
|
return results
|
|
else:
|
|
response.raise_for_status()
|
|
```
|
|
|
|
### 6. Check Pastes for Account
|
|
|
|
```python
|
|
import requests
|
|
|
|
def check_pastes(email, api_key):
|
|
"""Check if email appears in any pastes"""
|
|
headers = {
|
|
'hibp-api-key': api_key,
|
|
'user-agent': 'MyApp/1.0'
|
|
}
|
|
|
|
url = f'https://haveibeenpwned.com/api/v3/pasteaccount/{email}'
|
|
response = requests.get(url, headers=headers)
|
|
|
|
if response.status_code == 200:
|
|
pastes = response.json()
|
|
for paste in pastes:
|
|
print(f"{paste['Source']}: {paste['Title']}")
|
|
print(f" Date: {paste['Date']}")
|
|
print(f" Emails found: {paste['EmailCount']}")
|
|
return pastes
|
|
elif response.status_code == 404:
|
|
return [] # No pastes found
|
|
```
|
|
|
|
### 7. Enhanced Password Check with Padding
|
|
|
|
```python
|
|
import hashlib
|
|
import requests
|
|
|
|
def check_password_secure(password):
|
|
"""Check password with padding to prevent inference attacks"""
|
|
sha1_hash = hashlib.sha1(password.encode('utf-8')).hexdigest().upper()
|
|
prefix = sha1_hash[:5]
|
|
suffix = sha1_hash[5:]
|
|
|
|
headers = {'Add-Padding': 'true'}
|
|
url = f'https://api.pwnedpasswords.com/range/{prefix}'
|
|
response = requests.get(url, headers=headers)
|
|
|
|
# Parse response, ignore padded entries (count=0)
|
|
for line in response.text.splitlines():
|
|
hash_suffix, count = line.split(':')
|
|
if hash_suffix == suffix and int(count) > 0:
|
|
return int(count)
|
|
return 0
|
|
```
|
|
|
|
### 8. Handle Rate Limiting
|
|
|
|
```python
|
|
import requests
|
|
import time
|
|
|
|
def api_call_with_retry(url, headers, max_retries=3):
|
|
"""Make API call with automatic retry on rate limit"""
|
|
for attempt in range(max_retries):
|
|
response = requests.get(url, headers=headers)
|
|
|
|
if response.status_code == 429:
|
|
# Rate limited - wait and retry
|
|
retry_after = int(response.headers.get('retry-after', 2))
|
|
print(f"Rate limited, waiting {retry_after}s...")
|
|
time.sleep(retry_after)
|
|
continue
|
|
|
|
return response
|
|
|
|
raise Exception("Max retries exceeded")
|
|
```
|
|
|
|
### 9. Check Subscription Status
|
|
|
|
```python
|
|
import requests
|
|
|
|
def get_subscription_info(api_key):
|
|
"""Retrieve API subscription details and limits"""
|
|
headers = {
|
|
'hibp-api-key': api_key,
|
|
'user-agent': 'MyApp/1.0'
|
|
}
|
|
|
|
url = 'https://haveibeenpwned.com/api/v3/subscription/status'
|
|
response = requests.get(url, headers=headers)
|
|
|
|
if response.status_code == 200:
|
|
info = response.json()
|
|
print(f"Plan: {info['SubscriptionName']}")
|
|
print(f"Rate limit: {info['Rpm']} requests/minute")
|
|
print(f"Valid until: {info['SubscribedUntil']}")
|
|
return info
|
|
```
|
|
|
|
### 10. Stealer Logs Search
|
|
|
|
```python
|
|
import requests
|
|
|
|
def check_stealer_logs(email, api_key):
|
|
"""Check if credentials appear in info stealer malware logs"""
|
|
headers = {
|
|
'hibp-api-key': api_key,
|
|
'user-agent': 'MyApp/1.0'
|
|
}
|
|
|
|
url = f'https://haveibeenpwned.com/api/v3/stealerlogsbyemail/{email}'
|
|
response = requests.get(url, headers=headers)
|
|
|
|
if response.status_code == 200:
|
|
domains = response.json() # List of website domains
|
|
print(f"Credentials found for {len(domains)} websites")
|
|
return domains
|
|
elif response.status_code == 404:
|
|
return [] # Not found in stealer logs
|
|
|
|
# Requires Pwned 5+ subscription
|
|
```
|
|
|
|
## Key Concepts
|
|
|
|
### Authentication
|
|
- **API Key Format**: 32-character hexadecimal string
|
|
- **Header**: `hibp-api-key: {your-key}`
|
|
- **User-Agent Required**: Must set valid user-agent header (returns 403 if missing)
|
|
- **Test Key**: `00000000000000000000000000000000` for integration testing
|
|
|
|
### k-Anonymity Model
|
|
The Pwned Passwords API uses **k-anonymity** to protect user privacy:
|
|
1. Client hashes password locally with SHA-1
|
|
2. Sends only **first 5 characters** of hash to API
|
|
3. API returns ~800 matching hash suffixes
|
|
4. Client checks locally if full hash matches
|
|
|
|
This ensures the actual password **never leaves your system**.
|
|
|
|
### Rate Limiting
|
|
- **Varies by subscription tier**: Pwned 5 = 1,000 requests/minute
|
|
- **HTTP 429 response** when exceeded with `retry-after` header
|
|
- **Pwned Passwords API**: No rate limit
|
|
- **Best practice**: Implement exponential backoff on 429 responses
|
|
|
|
### Breach Model Attributes
|
|
Key fields in breach objects:
|
|
- **Name**: Unique identifier (e.g., "Adobe")
|
|
- **Title**: Human-readable name
|
|
- **BreachDate**: When breach occurred (ISO 8601)
|
|
- **PwnCount**: Total compromised accounts
|
|
- **DataClasses**: Types of data exposed (emails, passwords, etc.)
|
|
- **IsVerified**: Breach authenticity confirmed
|
|
- **IsSensitive**: Excluded from public searches
|
|
|
|
### Response Codes
|
|
| Code | Meaning |
|
|
|------|---------|
|
|
| 200 | Success - data found |
|
|
| 404 | Not found (account not in breaches) |
|
|
| 401 | Unauthorized (invalid API key) |
|
|
| 403 | Forbidden (missing user-agent) |
|
|
| 429 | Rate limit exceeded |
|
|
|
|
## Reference Files
|
|
|
|
This skill includes comprehensive API documentation in `references/`:
|
|
|
|
- **other.md** - Complete HIBP API v3 reference with all endpoints, authentication, and usage examples
|
|
|
|
The reference file contains:
|
|
- **All API endpoints** - Breaches, pastes, passwords, stealer logs
|
|
- **Request/response formats** - Headers, parameters, JSON structures
|
|
- **Authentication details** - API key setup and usage
|
|
- **Rate limiting information** - Subscription tiers and retry strategies
|
|
- **Test accounts** - Pre-configured test data for integration
|
|
- **Code examples** - Real-world implementation patterns
|
|
|
|
Use `view` to read the reference file when you need detailed information about specific endpoints or advanced features.
|
|
|
|
## Working with This Skill
|
|
|
|
### For Beginners
|
|
Start by understanding the core concepts:
|
|
1. **Password checking** - Use Pwned Passwords API (no authentication required)
|
|
2. **Account breaches** - Requires API key from haveibeenpwned.com
|
|
3. **k-Anonymity** - Learn how password hashing protects privacy
|
|
|
|
Begin with Quick Reference examples #1 (breach check) and #2 (password check).
|
|
|
|
### For Integration Projects
|
|
Focus on:
|
|
1. **Authentication setup** - Get API key and configure headers
|
|
2. **Rate limiting** - Implement retry logic (example #8)
|
|
3. **Error handling** - Handle 404, 401, 429 responses properly
|
|
4. **User experience** - Provide clear messaging about breach exposure
|
|
|
|
Review Quick Reference examples #5 (domain search) and #9 (subscription info).
|
|
|
|
### For Production Systems
|
|
Consider:
|
|
1. **Caching** - Store breach results to reduce API calls
|
|
2. **Background processing** - Check breaches asynchronously
|
|
3. **Monitoring** - Track new breaches with latest breach endpoint (example #4)
|
|
4. **Privacy** - Never log passwords, use k-anonymity model
|
|
5. **Compliance** - Follow attribution requirements (CC BY 4.0)
|
|
|
|
### For Security Tools
|
|
Advanced patterns:
|
|
1. **Stealer logs** - Check malware-stolen credentials (example #10)
|
|
2. **Domain monitoring** - Track all compromised accounts in your organization
|
|
3. **Paste monitoring** - Alert on email exposure in public pastes (example #6)
|
|
4. **Padding** - Use response padding to prevent inference attacks (example #7)
|
|
|
|
## Common Patterns
|
|
|
|
### Pattern 1: Sign-up Password Validation
|
|
```python
|
|
# Prevent users from choosing compromised passwords
|
|
def validate_signup_password(password):
|
|
count = check_password_pwned(password)
|
|
if count > 0:
|
|
return False, f"This password appears in {count} data breaches"
|
|
return True, "Password is secure"
|
|
```
|
|
|
|
### Pattern 2: Breach Notification System
|
|
```python
|
|
# Notify users when their account appears in new breach
|
|
def notify_affected_users():
|
|
latest = get_latest_breach()
|
|
affected_users = query_users_in_breach(latest['Name'])
|
|
for user in affected_users:
|
|
send_notification(user, latest)
|
|
```
|
|
|
|
### Pattern 3: Compliance Check
|
|
```python
|
|
# Verify all domain accounts for compliance reporting
|
|
def domain_security_audit(domain, api_key):
|
|
breached = search_domain_breaches(domain, api_key)
|
|
report = {
|
|
'total_accounts': len(breached),
|
|
'affected_accounts': breached,
|
|
'timestamp': datetime.now()
|
|
}
|
|
return report
|
|
```
|
|
|
|
## API Endpoints Summary
|
|
|
|
### Authenticated Endpoints (Require API Key)
|
|
- `GET /breachedaccount/{account}` - Check account breaches
|
|
- `GET /pasteaccount/{account}` - Check pastes
|
|
- `GET /breacheddomain/{domain}` - Domain-wide search
|
|
- `GET /subscribeddomains` - List verified domains
|
|
- `GET /subscription/status` - Check subscription
|
|
- `GET /stealerlogsbyemail/{email}` - Stealer logs by email
|
|
- `GET /stealerlogsbywebsitedomain/{domain}` - Stealer logs by site
|
|
- `GET /stealerlogsbyemaildomain/{domain}` - Stealer logs by email domain
|
|
|
|
### Public Endpoints (No Authentication)
|
|
- `GET /breaches` - All breaches in system
|
|
- `GET /breach/{name}` - Single breach details
|
|
- `GET /latestbreach` - Most recent breach
|
|
- `GET /dataclasses` - List of data types
|
|
- `GET https://api.pwnedpasswords.com/range/{prefix}` - Password check
|
|
|
|
## Testing
|
|
|
|
### Test Accounts
|
|
Use these on domain `hibp-integration-tests.com`:
|
|
- `account-exists@` - Has breaches and pastes
|
|
- `multiple-breaches@` - Three different breaches
|
|
- `spam-list-only@` - Only spam-flagged breach
|
|
- `stealer-log@` - In stealer logs
|
|
- `opt-out@` - No results (opted out)
|
|
|
|
### Test API Key
|
|
Use `00000000000000000000000000000000` for integration testing.
|
|
|
|
## Best Practices
|
|
|
|
1. **Always set User-Agent** - Required header, returns 403 without it
|
|
2. **Use HTTPS only** - API requires TLS 1.2+
|
|
3. **Implement retry logic** - Handle 429 rate limits gracefully
|
|
4. **Cache breach data** - Reduce API calls for frequently checked accounts
|
|
5. **Never log passwords** - Use k-anonymity model, hash locally
|
|
6. **Provide attribution** - Link to haveibeenpwned.com (CC BY 4.0 license)
|
|
7. **Handle 404 gracefully** - "Not found" is good news for users
|
|
8. **Use padding for passwords** - Add `Add-Padding: true` header
|
|
|
|
## Resources
|
|
|
|
### Official Links
|
|
- API Documentation: https://haveibeenpwned.com/API/v3
|
|
- Get API Key: https://haveibeenpwned.com/API/Key
|
|
- Dashboard: https://haveibeenpwned.com/DomainSearch
|
|
|
|
### Community Tools
|
|
- **PwnedPasswordsDownloader** (GitHub) - Download full password database
|
|
- Integration libraries available for Python, JavaScript, Go, C#, and more
|
|
|
|
## Acceptable Use
|
|
|
|
**Permitted:**
|
|
- Security tools and breach notifications
|
|
- Password validation in authentication systems
|
|
- Compliance and security audits
|
|
- Educational and research purposes
|
|
|
|
**Prohibited:**
|
|
- Targeting or harming breach victims
|
|
- Denial-of-service attacks
|
|
- Circumventing security measures
|
|
- Misrepresenting data source
|
|
- Automating undocumented APIs
|
|
|
|
Violations may result in API key revocation or IP blocking.
|
|
|
|
## Notes
|
|
|
|
- Breach data licensed under **Creative Commons Attribution 4.0**
|
|
- Pwned Passwords has no licensing requirements
|
|
- CORS only supported for unauthenticated endpoints
|
|
- Never expose API keys in client-side code
|
|
- Service tracks **917+ breaches** as of API documentation date
|