# Generative AI Hub Reference

Complete reference for SAP AI Core Generative AI Hub.

**Documentation Source:** [https://github.com/SAP-docs/sap-artificial-intelligence/tree/main/docs/sap-ai-core](https://github.com/SAP-docs/sap-artificial-intelligence/tree/main/docs/sap-ai-core)

---

## Overview

The Generative AI Hub integrates large language models (LLMs) into SAP AI Core and SAP AI Launchpad, providing unified access to models from multiple providers.

### Key Features

- Access to LLMs from 6 providers via a unified API
- Harmonized API for model switching without code changes
- Prompt experimentation in AI Launchpad UI
- Orchestration workflows with filtering, masking, grounding
- Token-based metering and billing

### Prerequisites

- SAP AI Core with **Extended** service plan
- Valid service key credentials
- Resource group created

---

## Global Scenarios

Two scenarios provide generative AI access:

| Scenario ID | Description | Use Case |
|-------------|-------------|----------|
| `foundation-models` | Direct model access | Single model deployment |
| `orchestration` | Unified multi-model access | Pipeline workflows |

---

## Model Providers

### 1. Azure OpenAI (`azure-openai`)

Access to OpenAI models via Azure's private instance.

**Models:**
- GPT-4o, GPT-4o-mini
- GPT-4 Turbo, GPT-4
- GPT-3.5 Turbo
- text-embedding-3-large, text-embedding-3-small

**Capabilities:** Chat, embeddings, vision

### 2. SAP-Hosted Open Source (`aicore-opensource`)

SAP-hosted open source models via OpenAI-compatible API.

**Models:**
- Llama 3.1 (8B, 70B, 405B)
- Llama 3.2 (1B, 3B, 11B-Vision, 90B-Vision)
- Mistral 7B, Mixtral 8x7B
- Falcon 40B

**Capabilities:** Chat, embeddings, vision (select models)

### 3. Google Vertex AI (`gcp-vertexai`)

Access to Google's AI models.

**Models:**
- Gemini 1.5 Pro, Gemini 1.5 Flash
- Gemini 1.0 Pro
- PaLM 2 (text-bison, chat-bison)
- text-embedding-004

**Capabilities:** Chat, embeddings, vision, code

### 4. AWS Bedrock (`aws-bedrock`)

Access to models via AWS Bedrock.

**Models:**
- Anthropic Claude 3.5 Sonnet, Claude 3 Opus/Sonnet/Haiku
- Amazon Titan Text, Titan Embeddings
- Meta Llama 3
- Cohere Command

**Capabilities:** Chat, embeddings

### 5. Mistral AI (`aicore-mistralai`)

SAP-hosted Mistral models.

**Models:**
- Mistral Large
- Mistral Medium
- Mistral Small
- Mistral 7B Instruct
- Codestral

**Capabilities:** Chat, code

### 6. IBM (`aicore-ibm`)

SAP-hosted IBM models.

**Models:**
- Granite 13B Chat, Granite 13B Instruct
- Granite Code

**Capabilities:** Chat, code

---

## API: List Available Models

```bash
curl -X GET "$AI_API_URL/v2/lm/scenarios/foundation-models/models" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json"
```

### Response Structure

```json
{
  "count": 50,
  "resources": [
    {
      "model": "gpt-4o",
      "accessType": "Remote",
      "displayName": "GPT-4o",
      "provider": "azure-openai",
      "allowedScenarios": ["foundation-models"],
      "executableId": "azure-openai",
      "description": "OpenAI's most advanced model",
      "versions": [
        {
          "name": "2024-05-13",
          "isLatest": true,
          "capabilities": ["text-generation", "chat", "vision"],
          "contextLength": 128000,
          "inputCost": 5.0,
          "outputCost": 15.0,
          "deprecationDate": null,
          "retirementDate": null,
          "isStreamingSupported": true
        }
      ]
    }
  ]
}
```

### Model Metadata Fields

| Field | Description |
|-------|-------------|
| `model` | Model identifier for API calls |
| `accessType` | "Remote" (external) or "Local" (SAP-hosted) |
| `provider` | Provider identifier |
| `executableId` | Executable ID for deployments |
| `contextLength` | Maximum context window tokens |
| `inputCost` | Cost per 1K input tokens |
| `outputCost` | Cost per 1K output tokens |
| `deprecationDate` | Date version becomes deprecated |
| `retirementDate` | Date version is removed |
| `isStreamingSupported` | Streaming capability |

---

## Deploying a Model

### Step 1: Create Configuration

```bash
curl -X POST "$AI_API_URL/v2/lm/configurations" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "gpt4o-deployment-config",
    "executableId": "azure-openai",
    "scenarioId": "foundation-models",
    "parameterBindings": [
      {"key": "modelName", "value": "gpt-4o"},
      {"key": "modelVersion", "value": "latest"}
    ]
  }'
```

### Step 2: Create Deployment

```bash
# Replace <configuration-id> with the id returned by Step 1
curl -X POST "$AI_API_URL/v2/lm/deployments" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json" \
  -d '{
    "configurationId": "<configuration-id>"
  }'
```

### Step 3: Check Status

```bash
# Replace <deployment-id> with the id returned by Step 2
curl -X GET "$AI_API_URL/v2/lm/deployments/<deployment-id>" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default"
```

Wait for status `RUNNING` and note the `deploymentUrl`.

---

## Using the Harmonized API

The harmonized API provides unified access without model-specific code.

### Chat Completion

```bash
curl -X POST "$DEPLOYMENT_URL/chat/completions" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is SAP AI Core?"}
    ],
    "max_tokens": 1000,
    "temperature": 0.7
  }'
```

### With Streaming

```bash
curl -X POST "$DEPLOYMENT_URL/chat/completions" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
```

### Embeddings

```bash
curl -X POST "$DEPLOYMENT_URL/embeddings" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-large",
    "input": ["Document chunk to embed"],
    "encoding_format": "float"
  }'
```

---

## Orchestration Deployment

For unified access to multiple models:

### Create Orchestration Deployment

```bash
# Get orchestration configuration ID
curl -X GET "$AI_API_URL/v2/lm/configurations?scenarioId=orchestration" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default"

# Create deployment (replace <configuration-id> with the id from the call above)
curl -X POST "$AI_API_URL/v2/lm/deployments" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json" \
  -d '{
    "configurationId": "<configuration-id>"
  }'
```

### Use Orchestration API

```bash
curl -X POST "$ORCHESTRATION_URL/v2/completion" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "module_configurations": {
        "llm_module_config": {
          "model_name": "gpt-4o",
          "model_version": "latest"
        },
        "templating_module_config": {
          "template": [
            {"role": "user", "content": "{{?prompt}}"}
          ]
        }
      }
    },
    "input_params": {
      "prompt": "What is machine learning?"
    }
  }'
```

---

## Model Version Management

### Auto-Upgrade Strategy

Set `modelVersion` to `"latest"` for automatic upgrades:

```json
{
  "parameterBindings": [
    {"key": "modelName", "value": "gpt-4o"},
    {"key": "modelVersion", "value": "latest"}
  ]
}
```

### Pinned Version Strategy

Specify an exact version for stability:

```json
{
  "parameterBindings": [
    {"key": "modelName", "value": "gpt-4o"},
    {"key": "modelVersion", "value": "2024-05-13"}
  ]
}
```

### Manual Version Upgrade

Patch the deployment with a new configuration:

```bash
# Replace <deployment-id> with the existing deployment and
# <new-configuration-id> with the configuration that names the new version
curl -X PATCH "$AI_API_URL/v2/lm/deployments/<deployment-id>" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  -H "AI-Resource-Group: default" \
  -H "Content-Type: application/json" \
  -d '{
    "configurationId": "<new-configuration-id>"
  }'
```

---

## SAP AI Launchpad UI

### Prompt Experimentation

Access: **Workspaces** → **Generative AI Hub** → **Prompt Editor**

Features:
- Interactive prompt testing
- Model selection and parameter tuning
- Variable placeholders
- Image inputs (select models)
- Streaming responses
- Save prompts (manager roles)

### Required Roles

| Role | Capabilities |
|------|--------------|
| `genai_manager` | Full access, save prompts |
| `genai_experimenter` | Test only, no save |
| `prompt_manager` | Manage saved prompts |
| `prompt_experimenter` | Use saved prompts |
| `prompt_media_executor` | Upload images |

### Prompt Types

- **Question Answering**: Q&A interactions
- **Summarization**: Extract key points
- **Inferencing**: Sentiment, entity extraction
- **Transformations**: Translation, format conversion
- **Expansions**: Content generation

---

## Model Library

View model specifications and benchmarks in AI Launchpad:

**Access:** Generative AI Hub → Model Library

Information available:
- Model capabilities
- Context window sizes
- Performance benchmarks (win rates, arena scores)
- Cost per token
- Deprecation schedules

---

## Rate Limits and Quotas

Refer to **SAP Note 3437766** for:
- Token conversion rates per model
- Rate limits (requests/minute, tokens/minute)
- Regional availability
- Deprecation dates

### Quota Increase Request

Submit a support ticket:
- Component: `CA-ML-AIC`
- Include: tenant ID, current limits, requested limits, justification

---

## Best Practices

### Model Selection

| Use Case | Recommended Model |
|----------|-------------------|
| General chat | GPT-4o, Claude 3.5 Sonnet |
| Cost-sensitive | GPT-4o-mini, Mistral Small |
| Long context | GPT-4o (128K), Claude 3 (200K) |
| Embeddings | text-embedding-3-large |
| Code | Codestral, GPT-4o |
| Vision | GPT-4o, Gemini 1.5 Pro |

### Cost Optimization

1. Use smaller models for simple tasks
2. Implement caching for repeated queries
3. Set appropriate `max_tokens` limits
4. Use streaming for better UX without extra cost
5. Monitor token usage via AI Launchpad analytics

### Reliability

1. Implement fallback configurations
2. Pin model versions in production
3. Monitor deprecation dates
4. Test before upgrading versions

---

## Documentation Links

- Generative AI Hub: [https://github.com/SAP-docs/sap-artificial-intelligence/blob/main/docs/sap-ai-core/generative-ai-hub-7db524e.md](https://github.com/SAP-docs/sap-artificial-intelligence/blob/main/docs/sap-ai-core/generative-ai-hub-7db524e.md)
- Supported Models: [https://github.com/SAP-docs/sap-artificial-intelligence/blob/main/docs/sap-ai-core/supported-models-509e588.md](https://github.com/SAP-docs/sap-artificial-intelligence/blob/main/docs/sap-ai-core/supported-models-509e588.md)
- SAP Note 3437766: Token rates, limits, deprecation
- SAP Discovery Center: [https://discovery-center.cloud.sap/serviceCatalog/sap-ai-core](https://discovery-center.cloud.sap/serviceCatalog/sap-ai-core)
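
The auto-upgrade and pinned-version strategies from the Model Version Management section differ only in the `modelVersion` binding, so the configuration payload can be generated rather than hand-edited. The following is a minimal sketch under stated assumptions, not an official tool: `build_config_payload` is a hypothetical helper name, it assumes its arguments contain no characters that need JSON escaping, and it only sends the request when `AI_API_URL` and `AUTH_TOKEN` are set as in the examples in this reference.

```bash
#!/bin/sh
# Hypothetical helper (not part of any SAP CLI or API): prints a configuration
# payload for the foundation-models scenario.
# Usage: build_config_payload <config-name> <executable-id> <model-name> [model-version]
# When the version is omitted it defaults to "latest" (auto-upgrade);
# pass an exact version string to pin it.
# Assumption: arguments are plain identifiers with no quotes or backslashes.
build_config_payload() {
  printf '{"name": "%s", "executableId": "%s", "scenarioId": "foundation-models", "parameterBindings": [{"key": "modelName", "value": "%s"}, {"key": "modelVersion", "value": "%s"}]}\n' \
    "$1" "$2" "$3" "${4:-latest}"
}

# Pinned version, as recommended for production under Reliability
payload="$(build_config_payload gpt4o-deployment-config azure-openai gpt-4o 2024-05-13)"
echo "$payload"

# Send the payload only when credentials are present in the environment
if [ -n "${AI_API_URL:-}" ] && [ -n "${AUTH_TOKEN:-}" ]; then
  curl -X POST "$AI_API_URL/v2/lm/configurations" \
    -H "Authorization: Bearer $AUTH_TOKEN" \
    -H "AI-Resource-Group: default" \
    -H "Content-Type: application/json" \
    -d "$payload"
fi
```

Dropping the fourth argument produces the auto-upgrade payload, so the same call site can serve both a test landscape that tracks `latest` and a production landscape that pins a version.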