Initial commit
This commit is contained in:
80
references/rate-limits.md
Normal file
80
references/rate-limits.md
Normal file
@@ -0,0 +1,80 @@
|
||||
# Rate Limits Guide
|
||||
|
||||
Complete guide to Claude API rate limits and how to handle them.
|
||||
|
||||
## Overview
|
||||
|
||||
Claude API uses **token bucket algorithm** for rate limiting:
|
||||
- Capacity continuously replenishes
|
||||
- Three types: Requests per minute (RPM), Tokens per minute (TPM), Daily tokens
|
||||
- Limits vary by account tier and model
|
||||
|
||||
## Rate Limit Tiers
|
||||
|
||||
| Tier | Requirements | Example Limits (Sonnet 3.5) |
|
||||
|------|--------------|------------------------------|
|
||||
| Tier 1 | New account | 50 RPM, 40k TPM |
|
||||
| Tier 2 | $10 spend | 1000 RPM, 100k TPM |
|
||||
| Tier 3 | $50 spend | 2000 RPM, 200k TPM |
|
||||
| Tier 4 | $500 spend | 4000 RPM, 400k TPM |
|
||||
|
||||
**Note**: Limits vary by model. Check Console for exact limits.
|
||||
|
||||
## Response Headers
|
||||
|
||||
Every API response includes:
|
||||
|
||||
```
|
||||
anthropic-ratelimit-requests-limit: 50
|
||||
anthropic-ratelimit-requests-remaining: 49
|
||||
anthropic-ratelimit-requests-reset: 2025-10-25T12:00:00Z
|
||||
anthropic-ratelimit-tokens-limit: 50000
|
||||
anthropic-ratelimit-tokens-remaining: 49500
|
||||
anthropic-ratelimit-tokens-reset: 2025-10-25T12:01:00Z
|
||||
```
|
||||
|
||||
On 429 errors:
|
||||
```
|
||||
retry-after: 60 // Seconds until retry allowed
|
||||
```
|
||||
|
||||
## Handling Rate Limits
|
||||
|
||||
### Basic Exponential Backoff
|
||||
|
||||
```typescript
|
||||
async function withRetry(requestFn, maxRetries = 3) {
|
||||
for (let attempt = 0; attempt < maxRetries; attempt++) {
|
||||
try {
|
||||
return await requestFn();
|
||||
} catch (error) {
|
||||
if (error.status === 429) {
|
||||
const delay = 1000 * Math.pow(2, attempt);
|
||||
await new Promise(resolve => setTimeout(resolve, delay));
|
||||
} else {
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Respecting retry-after Header
|
||||
|
||||
```typescript
|
||||
const retryAfter = error.response?.headers?.['retry-after'];
|
||||
const delay = retryAfter ? parseInt(retryAfter) * 1000 : baseDelay;
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Monitor headers** - Check remaining requests/tokens
|
||||
2. **Implement backoff** - Exponential delay on 429
|
||||
3. **Respect retry-after** - Use provided wait time
|
||||
4. **Batch requests** - Group when possible
|
||||
5. **Use caching** - Reduce duplicate requests
|
||||
6. **Upgrade tier** - Scale with usage
|
||||
|
||||
## Official Docs
|
||||
|
||||
https://docs.claude.com/en/api/rate-limits
|
||||
Reference in New Issue
Block a user