# Model Providers Reference Complete reference for SAP AI Core model providers and available models. **Documentation Source:** [https://github.com/SAP-docs/sap-artificial-intelligence/tree/main/docs/sap-ai-core](https://github.com/SAP-docs/sap-artificial-intelligence/tree/main/docs/sap-ai-core) **Latest Models:** SAP Note 3437766 --- ## Overview SAP AI Core provides access to models from six providers via the Generative AI Hub. All models are accessed through a unified API, allowing easy switching between providers. --- ## Provider Summary | Provider | Executable ID | Access Type | Model Categories | |----------|---------------|-------------|------------------| | Azure OpenAI | `azure-openai` | Remote | Chat, Embeddings, Vision | | SAP Open Source | `aicore-opensource` | Local | Chat, Embeddings, Vision | | Google Vertex AI | `gcp-vertexai` | Remote | Chat, Embeddings, Code | | AWS Bedrock | `aws-bedrock` | Remote | Chat, Embeddings | | Mistral AI | `aicore-mistralai` | Local | Chat, Code | | IBM | `aicore-ibm` | Local | Chat, Code | --- ## 1. Azure OpenAI **Executable ID:** `azure-openai` **Access Type:** Remote (Azure-hosted) ### Chat Models | Model | Context | Capabilities | Use Case | |-------|---------|--------------|----------| | gpt-4o | 128K | Chat, Vision | Advanced reasoning, multimodal | | gpt-4o-mini | 128K | Chat, Vision | Cost-efficient, fast | | gpt-4-turbo | 128K | Chat, Vision | Previous flagship | | gpt-4 | 8K/32K | Chat | Reasoning, analysis | | gpt-35-turbo | 4K/16K | Chat | Fast, economical | ### Embedding Models | Model | Dimensions | Use Case | |-------|------------|----------| | text-embedding-3-large | 3072 | High accuracy embeddings | | text-embedding-3-small | 1536 | Cost-efficient embeddings | | text-embedding-ada-002 | 1536 | Legacy embeddings | ### Configuration Example ```json { "name": "azure-gpt4o-config", "executableId": "azure-openai", "scenarioId": "foundation-models", "parameterBindings": [ {"key": "modelName", "value": "gpt-4o"}, {"key": "modelVersion", "value": "2024-05-13"} ] } ``` --- ## 2. SAP-Hosted Open Source **Executable ID:** `aicore-opensource` **Access Type:** Local (SAP-hosted) ### Llama Models | Model | Parameters | Context | Capabilities | |-------|------------|---------|--------------| | llama-3.1-405b | 405B | 128K | Advanced reasoning | | llama-3.1-70b | 70B | 128K | Strong reasoning | | llama-3.1-8b | 8B | 128K | Fast, efficient | | llama-3.2-90b-vision | 90B | 128K | Vision + text | | llama-3.2-11b-vision | 11B | 128K | Vision + text | | llama-3.2-3b | 3B | 128K | Lightweight | | llama-3.2-1b | 1B | 128K | Edge deployment | ### Mistral Models (Open Source) | Model | Parameters | Context | |-------|------------|---------| | mistral-7b-instruct | 7B | 32K | | mixtral-8x7b | 46.7B | 32K | ### Falcon Models | Model | Parameters | Context | |-------|------------|---------| | falcon-40b | 40B | 2K | ### Configuration Example ```json { "name": "llama-config", "executableId": "aicore-opensource", "scenarioId": "foundation-models", "parameterBindings": [ {"key": "modelName", "value": "meta--llama-3.1-70b-instruct"}, {"key": "modelVersion", "value": "latest"} ] } ``` --- ## 3. Google Vertex AI **Executable ID:** `gcp-vertexai` **Access Type:** Remote (Google Cloud) ### Gemini Models | Model | Context | Capabilities | |-------|---------|--------------| | gemini-2.5-pro | 2M | Chat, Vision, Code, Long context | | gemini-2.5-flash | 1M | Fast, multimodal | | gemini-2.5-flash-lite | 1M | Fast, lower-cost multimodal | | gemini-2.0-flash | 1M | Flash family, multimodal | | gemini-2.0-flash-lite | 1M | Flash family, lower-cost | ### PaLM 2 Models | Model | Use Case | |-------|----------| | text-bison | Text generation | | chat-bison | Conversational | | code-bison | Code generation | ### Embedding Models | Model | Dimensions | |-------|------------| | text-embedding-004 | 768 | | textembedding-gecko | 768 | ### Configuration Example ```json { "name": "gemini-config", "executableId": "gcp-vertexai", "scenarioId": "foundation-models", "parameterBindings": [ {"key": "modelName", "value": "gemini-1.5-pro"}, {"key": "modelVersion", "value": "latest"} ] } ``` --- ## 4. AWS Bedrock **Executable ID:** `aws-bedrock` **Access Type:** Remote (AWS) ### Anthropic Claude Models | Model | Context | Capabilities | |-------|---------|--------------| | claude-sonnet-4-5 | 200K | Latest, advanced reasoning | | claude-4-opus | 200K | Highest capability | | claude-4-sonnet | 200K | Balanced, high performance | | claude-opus-4-1 | 200K | Extended Opus capabilities | | claude-3-7-sonnet | 200K | Improved Sonnet 3.5 | | claude-3-5-sonnet | 200K | Advanced reasoning | | claude-3-opus | 200K | High capability | | claude-3-sonnet | 200K | Balanced performance | | claude-3-haiku | 200K | Fast, efficient | ### Amazon Titan Models | Model | Use Case | |-------|----------| | titan-text-express | General text | | titan-text-lite | Lightweight | | titan-embed-text | Embeddings | ### Meta Llama 3 (Bedrock) | Model | Parameters | |-------|------------| | llama-3-70b | 70B | | llama-3-8b | 8B | ### Configuration Example ```json { "name": "claude-config", "executableId": "aws-bedrock", "scenarioId": "foundation-models", "parameterBindings": [ {"key": "modelName", "value": "anthropic--claude-3-5-sonnet"}, {"key": "modelVersion", "value": "latest"} ] } ``` --- ## 5. Mistral AI **Executable ID:** `aicore-mistralai` **Access Type:** Local (SAP-hosted) ### Models | Model | Parameters | Context | Use Case | |-------|------------|---------|----------| | mistral-large | - | 32K | Advanced reasoning | | mistral-medium | - | 32K | Balanced | | mistral-small | - | 32K | Cost-efficient | | codestral | - | 32K | Code generation | ### Configuration Example ```json { "name": "mistral-config", "executableId": "aicore-mistralai", "scenarioId": "foundation-models", "parameterBindings": [ {"key": "modelName", "value": "mistralai--mistral-large"}, {"key": "modelVersion", "value": "latest"} ] } ``` --- ## 6. IBM **Executable ID:** `aicore-ibm` **Access Type:** Local (SAP-hosted) ### Granite Models | Model | Parameters | Use Case | |-------|------------|----------| | granite-13b-chat | 13B | Conversational | | granite-13b-instruct | 13B | Task completion | | granite-code | - | Code generation | ### Configuration Example ```json { "name": "granite-config", "executableId": "aicore-ibm", "scenarioId": "foundation-models", "parameterBindings": [ {"key": "modelName", "value": "ibm--granite-13b-chat"}, {"key": "modelVersion", "value": "latest"} ] } ``` --- ## Model Selection Guide ### By Use Case | Use Case | Recommended Models | |----------|-------------------| | General chat | gpt-4o, claude-3-5-sonnet, gemini-1.5-pro | | Code generation | gpt-4o, codestral, claude-3-5-sonnet | | Long documents | gemini-1.5-pro (2M), claude-3 (200K), gpt-4o (128K) | | Vision/images | gpt-4o, gemini-1.5-pro, llama-3.2-vision | | Embeddings | text-embedding-3-large, text-embedding-004 | | Cost-sensitive | gpt-4o-mini, mistral-small, llama-3.1-8b | | High throughput | gpt-35-turbo, claude-3-haiku, mistral-small | ### By Budget | Budget | Tier | Models | |--------|------|--------| | Low | Economy | gpt-4o-mini, claude-3-haiku, mistral-small | | Medium | Standard | gpt-4o, claude-3-sonnet, gemini-1.5-flash | | High | Premium | claude-3-opus, gpt-4-turbo, gemini-1.5-pro | ### By Capability | Capability | Best Models | |------------|-------------| | Reasoning | claude-3-opus, gpt-4o, llama-3.1-405b | | Speed | claude-3-haiku, gpt-35-turbo, mistral-small | | Context length | gemini-1.5-pro (2M), claude-3 (200K) | | Multimodal | gpt-4o, gemini-1.5-pro, llama-3.2-vision | | Code | codestral, gpt-4o, claude-3-5-sonnet | --- ## Model Version Management ### Version Strategies | Strategy | Configuration | Use Case | |----------|---------------|----------| | Latest | `"modelVersion": "latest"` | Development, auto-upgrade | | Pinned | `"modelVersion": "2024-05-13"` | Production stability | ### Checking Available Versions ```bash curl -X GET "$AI_API_URL/v2/lm/scenarios/foundation-models/models" \ -H "Authorization: Bearer $AUTH_TOKEN" \ -H "AI-Resource-Group: default" | \ jq '.resources[] | select(.model == "gpt-4o") | .versions' ``` ### Handling Deprecation 1. Monitor `deprecationDate` in model metadata 2. Plan migration before `retirementDate` 3. Test new version in staging 4. Update configuration with new version 5. Patch existing deployments --- ## Pricing Considerations Pricing varies by: - Model complexity (larger = more expensive) - Input vs output tokens (output often 2-3x input cost) - Provider region - Access type (Remote vs Local) **Reference:** SAP Note 3437766 for current token rates. ### Cost Optimization 1. **Right-size models**: Use smaller models for simple tasks 2. **Batch requests**: Combine multiple queries when possible 3. **Cache responses**: Store and reuse common query results 4. **Limit tokens**: Set appropriate `max_tokens` limits 5. **Use streaming**: No additional cost, better UX --- ## Rate Limits Rate limits vary by: - Service plan tier - Model provider - Specific model **Default limits** (vary by configuration): - Requests per minute: 60-600 - Tokens per minute: 40K-400K ### Handling Rate Limits ```python import time from requests.exceptions import HTTPError def call_with_retry(func, max_retries=3): for attempt in range(max_retries): try: return func() except HTTPError as e: if e.response.status_code == 429: wait_time = 2 ** attempt time.sleep(wait_time) else: raise raise Exception("Max retries exceeded") ``` --- ## Documentation Links - Supported Models: [https://github.com/SAP-docs/sap-artificial-intelligence/blob/main/docs/sap-ai-core/supported-models-509e588.md](https://github.com/SAP-docs/sap-artificial-intelligence/blob/main/docs/sap-ai-core/supported-models-509e588.md) - Generative AI Hub: [https://github.com/SAP-docs/sap-artificial-intelligence/blob/main/docs/sap-ai-core/generative-ai-hub-7db524e.md](https://github.com/SAP-docs/sap-artificial-intelligence/blob/main/docs/sap-ai-core/generative-ai-hub-7db524e.md) - SAP Note 3437766: Token rates, limits, deprecation dates - SAP Discovery Center: [https://discovery-center.cloud.sap/serviceCatalog/sap-ai-core](https://discovery-center.cloud.sap/serviceCatalog/sap-ai-core)