Initial commit
references/models-catalog.md
# Cloudflare Workers AI - Models Catalog

Complete catalog of Workers AI models organized by task type.

**Last Updated**: 2025-10-21
**Official Catalog**: https://developers.cloudflare.com/workers-ai/models/

---

## Text Generation (LLMs)

### Meta Llama Models

| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/meta/llama-3.1-8b-instruct` | 8B | General purpose, balanced | 300/min |
| `@cf/meta/llama-3.1-8b-instruct-fast` | 8B | Faster inference | 300/min |
| `@cf/meta/llama-3.2-1b-instruct` | 1B | Ultra-fast, simple tasks | 300/min |
| `@cf/meta/llama-3.2-3b-instruct` | 3B | Fast, good quality | 300/min |
| `@cf/meta/llama-2-7b-chat-int8` | 7B | Legacy, reliable | 300/min |
| `@cf/meta/llama-2-13b-chat-awq` | 13B | Higher quality (slower) | 300/min |
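
As a rough illustration of how one of the model IDs above might be called from a Worker, the sketch below builds a chat-style payload and passes it to an AI binding. It assumes the binding exposes `run(model, inputs)` and that text generation models return an object with a `response` string. The `AiBinding` interface, `buildChatInput`, and `generateText` are illustrative names, not part of this catalog, so check the official catalog for each model's exact input and output schema.

```typescript
// Minimal structural type for the AI binding (assumption: only `run` is needed here).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Model ID taken from the Meta Llama table above.
const LLAMA_MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Build a chat-style input payload with one system and one user message.
function buildChatInput(system: string, user: string): { messages: ChatMessage[] } {
  return {
    messages: [
      { role: "system", content: system },
      { role: "user", content: user },
    ],
  };
}

// Run the model and extract the generated text.
// Assumption: the result has the shape { response: string }.
async function generateText(ai: AiBinding, prompt: string): Promise<string> {
  const result = await ai.run(
    LLAMA_MODEL,
    buildChatInput("You are a concise assistant.", prompt),
  );
  if (typeof result === "object" && result !== null && "response" in result) {
    const response = (result as { response: unknown }).response;
    if (typeof response === "string") {
      return response;
    }
  }
  throw new Error("Unexpected response shape from model");
}
```

Inside a Worker this would typically be invoked as `generateText(env.AI, "...")`, assuming the binding is named `AI` in the Wrangler configuration.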

### Qwen Models

| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/qwen/qwen1.5-14b-chat-awq` | 14B | High quality, complex reasoning | 150/min |
| `@cf/qwen/qwen1.5-7b-chat-awq` | 7B | Balanced quality/speed | 300/min |
| `@cf/qwen/qwen1.5-1.8b-chat` | 1.8B | Fast, lightweight | 720/min |
| `@cf/qwen/qwen1.5-0.5b-chat` | 0.5B | Ultra-fast, ultra-lightweight | 1500/min |

### Mistral Models

| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@hf/thebloke/mistral-7b-instruct-v0.1-awq` | 7B | Fast, efficient | 400/min |
| `@hf/thebloke/openhermes-2.5-mistral-7b-awq` | 7B | Instruction following | 300/min |

### DeepSeek Models

| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` | 32B | Coding, technical content | 300/min |
| `@cf/deepseek-ai/deepseek-coder-6.7b-instruct-awq` | 6.7B | Code generation | 300/min |

### Other Models

| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/tinyllama/tinyllama-1.1b-chat-v1.0` | 1.1B | Extremely fast, limited capability | 720/min |
| `@cf/microsoft/phi-2` | 2.7B | Fast, efficient | 720/min |
| `@cf/google/gemma-2b-it-lora` | 2B | Instruction tuned | 300/min |
| `@cf/google/gemma-7b-it-lora` | 7B | Higher quality | 300/min |

---

## Text Embeddings

| Model ID | Dimensions | Best For | Rate Limit |
|----------|-----------|----------|------------|
| `@cf/baai/bge-base-en-v1.5` | 768 | General purpose RAG | 3000/min |
| `@cf/baai/bge-large-en-v1.5` | 1024 | High accuracy search | 1500/min |
| `@cf/baai/bge-small-en-v1.5` | 384 | Fast, low storage | 3000/min |
| `@cf/baai/bge-m3` | 1024 | Multilingual | 3000/min |

**Use Case**: RAG, semantic search, similarity detection, clustering

---

## Image Generation

| Model ID | Type | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/black-forest-labs/flux-1-schnell` | Text-to-Image | Photorealistic, high quality | 720/min |
| `@cf/stabilityai/stable-diffusion-xl-base-1.0` | Text-to-Image | General purpose | 720/min |
| `@cf/lykon/dreamshaper-8-lcm` | Text-to-Image | Artistic, stylized | 720/min |
| `@cf/runwayml/stable-diffusion-v1-5-img2img` | Image-to-Image | Transform images | 1500/min |
| `@cf/runwayml/stable-diffusion-v1-5-inpainting` | Inpainting | Edit specific areas | 1500/min |
| `@cf/bytedance/stable-diffusion-xl-lightning` | Text-to-Image | Fast generation | 720/min |

**Output**: PNG images (~5 MB max)

---

## Vision Models

| Model ID | Task | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/meta/llama-3.2-11b-vision-instruct` | Image Understanding | Q&A, captioning, analysis | 720/min |
| `@cf/unum/uform-gen2-qwen-500m` | Image Captioning | Fast captions | 720/min |

**Input**: Base64-encoded images

---

## Translation

| Model ID | Languages | Rate Limit |
|----------|-----------|------------|
| `@cf/meta/m2m100-1.2b` | 100+ languages | 720/min |

**Supported Language Pairs**: https://developers.cloudflare.com/workers-ai/models/m2m100-1.2b/

---

## Text Classification

| Model ID | Task | Rate Limit |
|----------|------|------------|
| `@cf/huggingface/distilbert-sst-2-int8` | Sentiment analysis | 2000/min |
| `@hf/thebloke/openhermes-2.5-mistral-7b-awq` | General classification | 300/min |

**Output**: Label + confidence score

---

## Automatic Speech Recognition

| Model ID | Best For | Rate Limit |
|----------|----------|------------|
| `@cf/openai/whisper` | General transcription | 720/min |
| `@cf/openai/whisper-tiny-en` | English only, fast | 720/min |

**Input**: Audio files (MP3, WAV, etc.)

---

## Object Detection

| Model ID | Task | Rate Limit |
|----------|------|------------|
| `@cf/facebook/detr-resnet-50` | Object detection | 3000/min |

**Output**: Bounding boxes + labels

---

## Image Classification

| Model ID | Classes | Rate Limit |
|----------|---------|------------|
| `@cf/microsoft/resnet-50` | 1000 ImageNet classes | 3000/min |

**Output**: Top-5 predictions with probabilities

---

## Summarization

| Model ID | Best For | Rate Limit |
|----------|----------|------------|
| `@cf/facebook/bart-large-cnn` | News articles, documents | 1500/min |

---

## Image-to-Image (Legacy)

| Model ID | Type | Rate Limit |
|----------|------|------------|
| `@cf/runwayml/stable-diffusion-v1-5-img2img` | Image-to-Image | 1500/min |

---

## Model Selection Guide

### For Text Generation

**Speed Priority:**
1. `@cf/qwen/qwen1.5-0.5b-chat` (1500/min)
2. `@cf/meta/llama-3.2-1b-instruct` (300/min)
3. `@cf/tinyllama/tinyllama-1.1b-chat-v1.0` (720/min)

**Quality Priority:**
1. `@cf/qwen/qwen1.5-14b-chat-awq` (150/min)
2. `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` (300/min)
3. `@cf/meta/llama-3.1-8b-instruct` (300/min)

**Balanced:**
1. `@cf/meta/llama-3.1-8b-instruct` (300/min)
2. `@hf/thebloke/mistral-7b-instruct-v0.1-awq` (400/min)
3. `@cf/qwen/qwen1.5-7b-chat-awq` (300/min)
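
The ranked lists above can be encoded as a small lookup with a fallback, for example when the first choice is unavailable or over its rate limit. This is only a sketch: the `Priority` type, `TEXT_MODELS` map, and `pickTextModel` helper are hypothetical names introduced for illustration, and the rankings are copied from this guide rather than from any official API.

```typescript
type Priority = "speed" | "quality" | "balanced";

// Ranked model IDs, copied from the selection guide above (first entry = first choice).
const TEXT_MODELS: Record<Priority, readonly string[]> = {
  speed: [
    "@cf/qwen/qwen1.5-0.5b-chat",
    "@cf/meta/llama-3.2-1b-instruct",
    "@cf/tinyllama/tinyllama-1.1b-chat-v1.0",
  ],
  quality: [
    "@cf/qwen/qwen1.5-14b-chat-awq",
    "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
    "@cf/meta/llama-3.1-8b-instruct",
  ],
  balanced: [
    "@cf/meta/llama-3.1-8b-instruct",
    "@hf/thebloke/mistral-7b-instruct-v0.1-awq",
    "@cf/qwen/qwen1.5-7b-chat-awq",
  ],
};

// Return the highest-ranked model for a priority that is not in `unavailable`.
function pickTextModel(
  priority: Priority,
  unavailable: ReadonlySet<string> = new Set<string>(),
): string {
  const choice = TEXT_MODELS[priority].find((id) => !unavailable.has(id));
  if (choice === undefined) {
    throw new Error(`No available model for priority "${priority}"`);
  }
  return choice;
}
```

Passing a set of model IDs to skip makes the helper fall through to the next entry in the same list, which keeps the fallback order identical to the guide.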

### For Embeddings

**General Purpose RAG:**
- `@cf/baai/bge-base-en-v1.5` (768 dims, 3000/min)

**High Accuracy:**
- `@cf/baai/bge-large-en-v1.5` (1024 dims, 1500/min)

**Fast/Low Storage:**
- `@cf/baai/bge-small-en-v1.5` (384 dims, 3000/min)
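
To make the dimension trade-off above concrete, the sketch below estimates per-vector storage and compares two embeddings with cosine similarity, a common choice for semantic search. It assumes vectors are stored as 4-byte float32 values, which is an assumption about your vector store rather than something this catalog specifies. `float32VectorBytes` and `cosineSimilarity` are illustrative helper names.

```typescript
// Bytes needed to store one embedding, assuming 4-byte float32 components.
function float32VectorBytes(dimensions: number): number {
  return dimensions * 4;
}

// Cosine similarity between two vectors of equal length.
// Vectors from models with different dimensions (e.g. 384 vs 768) cannot be compared.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // Treat an all-zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / Math.sqrt(normA * normB);
}
```

Under the float32 assumption, a 384-dimension vector takes 1,536 bytes and a 1024-dimension vector takes 4,096 bytes, so the small model needs under half the storage of the large one per document.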

### For Image Generation

**Best Quality:**
- `@cf/black-forest-labs/flux-1-schnell`

**General Purpose:**
- `@cf/stabilityai/stable-diffusion-xl-base-1.0`

**Artistic/Stylized:**
- `@cf/lykon/dreamshaper-8-lcm`

**Fast:**
- `@cf/bytedance/stable-diffusion-xl-lightning`

---

## Rate Limits Summary

| Task Type | Default Limit | Exceptions |
|-----------|---------------|------------|
| Text Generation | 300/min | 150/min (Qwen 1.5 14B); 400-1500/min (Mistral 7B, Phi-2, TinyLlama, small Qwen models) |
| Text Embeddings | 3000/min | 1500/min (`bge-large-en-v1.5`) |
| Image Generation | 720/min | 1500/min (img2img, inpainting) |
| Vision Models | 720/min | None |
| Translation | 720/min | None |
| Text Classification | 2000/min | 300/min (OpenHermes 2.5) |
| Speech Recognition | 720/min | None |
| Object Detection | 3000/min | None |
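
One conservative way to stay under a per-minute limit from the table above is to space requests evenly on the client. The sketch below converts a limit into a minimum interval and a simple send schedule. It assumes limits are counted as requests per minute and that even spacing is acceptable for your workload; how the platform actually enforces the window is not described in this catalog. `minIntervalMs` and `scheduleOffsetsMs` are hypothetical helper names.

```typescript
// Minimum spacing between requests, in milliseconds, to stay at or under a per-minute limit.
function minIntervalMs(limitPerMinute: number): number {
  if (!(limitPerMinute > 0)) {
    throw new Error("limitPerMinute must be a positive number");
  }
  return 60000 / limitPerMinute;
}

// Start offsets (ms from now) for `count` evenly spaced requests.
// Offsets are rounded up so the spacing never drops below the minimum interval.
function scheduleOffsetsMs(count: number, limitPerMinute: number): number[] {
  const interval = minIntervalMs(limitPerMinute);
  return Array.from({ length: count }, (_, i) => Math.ceil(i * interval));
}

// Examples using limits from this catalog:
const llamaIntervalMs = minIntervalMs(300); // 300/min text generation default
const embeddingIntervalMs = minIntervalMs(3000); // 3000/min embeddings default
```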

---

## Pricing (Neurons)

Pricing varies by model. Common examples:

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|-------|-----------------------|------------------------|
| Llama 3.2 1B | $0.027 | $0.201 |
| Llama 3.1 8B | $0.088 | $0.606 |
| BGE-base embeddings | $0.005 | N/A |
| Flux image gen | ~$0.011/image | N/A |

**Free Tier**: 10,000 neurons/day
**Paid Tier**: $0.011 per 1,000 neurons
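
The figures above can be turned into a quick cost estimate. The sketch below computes a per-request token cost from the per-1M-token prices and a daily neuron charge from the paid-tier rate. It assumes the 10,000 free neurons are deducted from each day's usage before billing, which should be confirmed against the official pricing page. All names (`TokenPrice`, `tokenCostUsd`, `dailyNeuronCostUsd`) are illustrative.

```typescript
interface TokenPrice {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Prices copied from the table above.
const LLAMA_3_2_1B: TokenPrice = { inputPerMillionUsd: 0.027, outputPerMillionUsd: 0.201 };
const LLAMA_3_1_8B: TokenPrice = { inputPerMillionUsd: 0.088, outputPerMillionUsd: 0.606 };

// Cost in USD for a given number of input and output tokens.
function tokenCostUsd(price: TokenPrice, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * price.inputPerMillionUsd +
    (outputTokens / 1_000_000) * price.outputPerMillionUsd
  );
}

const USD_PER_1000_NEURONS = 0.011;
const FREE_NEURONS_PER_DAY = 10_000;

// Billable cost in USD for one day's neuron usage.
// Assumption: the free daily allowance is subtracted before billing.
function dailyNeuronCostUsd(neuronsUsed: number): number {
  const billable = Math.max(0, neuronsUsed - FREE_NEURONS_PER_DAY);
  return (billable / 1000) * USD_PER_1000_NEURONS;
}
```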

---

## References

- [Official Models Catalog](https://developers.cloudflare.com/workers-ai/models/)
- [Rate Limits](https://developers.cloudflare.com/workers-ai/platform/limits/)
- [Pricing](https://developers.cloudflare.com/workers-ai/platform/pricing/)