Files
gh-jezweb-claude-skills-ski…/references/models-catalog.md
2025-11-30 08:24:38 +08:00

246 lines
7.3 KiB
Markdown

# Cloudflare Workers AI - Models Catalog
Complete catalog of Workers AI models organized by task type.
**Last Updated**: 2025-10-21
**Official Catalog**: https://developers.cloudflare.com/workers-ai/models/
---
## Text Generation (LLMs)
### Meta Llama Models
| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/meta/llama-3.1-8b-instruct` | 8B | General purpose, balanced | 300/min |
| `@cf/meta/llama-3.1-8b-instruct-fast` | 8B | Faster inference | 300/min |
| `@cf/meta/llama-3.2-1b-instruct` | 1B | Ultra-fast, simple tasks | 300/min |
| `@cf/meta/llama-3.2-3b-instruct` | 3B | Fast, good quality | 300/min |
| `@cf/meta/llama-2-7b-chat-int8` | 7B | Legacy, reliable | 300/min |
| `@cf/meta/llama-2-13b-chat-awq` | 13B | Higher quality (slower) | 300/min |
### Qwen Models
| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/qwen/qwen1.5-14b-chat-awq` | 14B | High quality, complex reasoning | 150/min |
| `@cf/qwen/qwen1.5-7b-chat-awq` | 7B | Balanced quality/speed | 300/min |
| `@cf/qwen/qwen1.5-1.8b-chat` | 1.8B | Fast, lightweight | 720/min |
| `@cf/qwen/qwen1.5-0.5b-chat` | 0.5B | Ultra-fast, ultra-lightweight | 1500/min |
### Mistral Models
| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@hf/thebloke/mistral-7b-instruct-v0.1-awq` | 7B | Fast, efficient | 400/min |
| `@hf/thebloke/openhermes-2.5-mistral-7b-awq` | 7B | Instruction following | 300/min |
### DeepSeek Models
| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` | 32B | Coding, technical content | 300/min |
| `@cf/deepseek-ai/deepseek-coder-6.7b-instruct-awq` | 6.7B | Code generation | 300/min |
### Other Models
| Model ID | Size | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/tinyllama/tinyllama-1.1b-chat-v1.0` | 1.1B | Extremely fast, limited capability | 720/min |
| `@cf/microsoft/phi-2` | 2.7B | Fast, efficient | 720/min |
| `@cf/google/gemma-2b-it-lora` | 2B | Instruction tuned | 300/min |
| `@cf/google/gemma-7b-it-lora` | 7B | Higher quality | 300/min |
---
## Text Embeddings
| Model ID | Dimensions | Best For | Rate Limit |
|----------|-----------|----------|------------|
| `@cf/baai/bge-base-en-v1.5` | 768 | General purpose RAG | 3000/min |
| `@cf/baai/bge-large-en-v1.5` | 1024 | High accuracy search | 1500/min |
| `@cf/baai/bge-small-en-v1.5` | 384 | Fast, low storage | 3000/min |
| `@cf/baai/bge-m3` | 1024 | Multilingual | 3000/min |
**Use Case**: RAG, semantic search, similarity detection, clustering
---
## Image Generation
| Model ID | Type | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/black-forest-labs/flux-1-schnell` | Text-to-Image | Photorealistic, high quality | 720/min |
| `@cf/stabilityai/stable-diffusion-xl-base-1.0` | Text-to-Image | General purpose | 720/min |
| `@cf/lykon/dreamshaper-8-lcm` | Text-to-Image | Artistic, stylized | 720/min |
| `@cf/runwayml/stable-diffusion-v1-5-img2img` | Image-to-Image | Transform images | 1500/min |
| `@cf/runwayml/stable-diffusion-v1-5-inpainting` | Inpainting | Edit specific areas | 1500/min |
| `@cf/bytedance/stable-diffusion-xl-lightning` | Text-to-Image | Fast generation | 720/min |
**Output**: PNG images (~5 MB max)
---
## Vision Models
| Model ID | Task | Best For | Rate Limit |
|----------|------|----------|------------|
| `@cf/meta/llama-3.2-11b-vision-instruct` | Image Understanding | Q&A, captioning, analysis | 720/min |
| `@cf/unum/uform-gen2-qwen-500m` | Image Captioning | Fast captions | 720/min |
**Input**: Base64-encoded images
---
## Translation
| Model ID | Languages | Rate Limit |
|----------|-----------|------------|
| `@cf/meta/m2m100-1.2b` | 100+ languages | 720/min |
**Supported Language Pairs**: https://developers.cloudflare.com/workers-ai/models/m2m100-1.2b/
---
## Text Classification
| Model ID | Task | Rate Limit |
|----------|------|------------|
| `@cf/huggingface/distilbert-sst-2-int8` | Sentiment analysis | 2000/min |
| `@hf/thebloke/openhermes-2.5-mistral-7b-awq` | General classification | 300/min |
**Output**: Label + confidence score
---
## Automatic Speech Recognition
| Model ID | Best For | Rate Limit |
|----------|----------|------------|
| `@cf/openai/whisper` | General transcription | 720/min |
| `@cf/openai/whisper-tiny-en` | English only, fast | 720/min |
**Input**: Audio files (MP3, WAV, etc.)
---
## Object Detection
| Model ID | Task | Rate Limit |
|----------|------|------------|
| `@cf/facebook/detr-resnet-50` | Object detection | 3000/min |
**Output**: Bounding boxes + labels
---
## Image Classification
| Model ID | Classes | Rate Limit |
|----------|---------|------------|
| `@cf/microsoft/resnet-50` | 1000 ImageNet classes | 3000/min |
**Output**: Top-5 predictions with probabilities
---
## Summarization
| Model ID | Best For | Rate Limit |
|----------|----------|------------|
| `@cf/facebook/bart-large-cnn` | News articles, documents | 1500/min |
---
## Text-to-Image (Legacy)
| Model ID | Type | Rate Limit |
|----------|------|------------|
| `@cf/stabilityai/stable-diffusion-v1-5-img2img` | Image-to-Image | 1500/min |
---
## Model Selection Guide
### For Text Generation
**Speed Priority:**
1. `@cf/qwen/qwen1.5-0.5b-chat` (1500/min)
2. `@cf/meta/llama-3.2-1b-instruct` (300/min)
3. `@cf/tinyllama/tinyllama-1.1b-chat-v1.0` (720/min)
**Quality Priority:**
1. `@cf/qwen/qwen1.5-14b-chat-awq` (150/min)
2. `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` (300/min)
3. `@cf/meta/llama-3.1-8b-instruct` (300/min)
**Balanced:**
1. `@cf/meta/llama-3.1-8b-instruct` (300/min)
2. `@hf/thebloke/mistral-7b-instruct-v0.1-awq` (400/min)
3. `@cf/qwen/qwen1.5-7b-chat-awq` (300/min)
### For Embeddings
**General Purpose RAG:**
- `@cf/baai/bge-base-en-v1.5` (768 dims, 3000/min)
**High Accuracy:**
- `@cf/baai/bge-large-en-v1.5` (1024 dims, 1500/min)
**Fast/Low Storage:**
- `@cf/baai/bge-small-en-v1.5` (384 dims, 3000/min)
### For Image Generation
**Best Quality:**
- `@cf/black-forest-labs/flux-1-schnell`
**General Purpose:**
- `@cf/stabilityai/stable-diffusion-xl-base-1.0`
**Artistic/Stylized:**
- `@cf/lykon/dreamshaper-8-lcm`
**Fast:**
- `@cf/bytedance/stable-diffusion-xl-lightning`
---
## Rate Limits Summary
| Task Type | Default Limit | High-Speed Models |
|-----------|---------------|-------------------|
| Text Generation | 300/min | 400-1500/min |
| Text Embeddings | 3000/min | 1500/min (large) |
| Image Generation | 720/min | 720/min |
| Vision Models | 720/min | 720/min |
| Translation | 720/min | 720/min |
| Classification | 2000/min | 2000/min |
| Speech Recognition | 720/min | 720/min |
| Object Detection | 3000/min | 3000/min |
---
## Pricing (Neurons)
Pricing varies by model. Common examples:
| Model | Input (1M tokens) | Output (1M tokens) |
|-------|-------------------|-------------------|
| Llama 3.2 1B | $0.027 | $0.201 |
| Llama 3.1 8B | $0.088 | $0.606 |
| BGE-base embeddings | $0.005 | N/A |
| Flux image gen | ~$0.011/image | N/A |
**Free Tier**: 10,000 neurons/day
**Paid Tier**: $0.011 per 1,000 neurons
---
## References
- [Official Models Catalog](https://developers.cloudflare.com/workers-ai/models/)
- [Rate Limits](https://developers.cloudflare.com/workers-ai/platform/limits/)
- [Pricing](https://developers.cloudflare.com/workers-ai/platform/pricing/)