Data Workflow Operators Guide
Complete guide for data workflow orchestration in SAP Data Intelligence.
Table of Contents
- Overview
- Workflow Structure
- Available Operators
- Data Transfer
- Remote Execution
- Control Flow
- Notifications
- Best Practices
Overview
Data Workflow operators orchestrate data processing tasks that run for a limited time and finish with either "completed" or "dead" status.
Key Characteristics:
- Sequential execution via signal passing
- Operators start after receiving input signal
- Each operator has input, output, and error ports
- Unconnected output ports cause graph failure
Important: Do not mix Data Workflow operators with non-Data Workflow operators in the same graph.
Workflow Structure
Required Components
Every data workflow requires:
- Workflow Trigger: First operator (starts the workflow)
- Workflow Terminator: Last operator (ends the workflow)
Basic Structure
[Workflow Trigger] -> [Task Operator(s)] -> [Workflow Terminator]
Signal Flow
1. The Workflow Trigger sends the initial signal.
2. Each operator waits for its input signal.
3. The operator executes its task.
4. The operator sends an output signal (or an error signal).
5. The next operator receives the signal and executes.
6. The Workflow Terminator completes the graph.
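The sketch below models this signal-passing contract in plain Python. The names (Signal, run_workflow) are illustrative, not SAP Data Intelligence APIs; the point is that each task runs only after receiving a signal and forwards either success or an error, and the terminator maps the final signal to a "completed" or "dead" status.

```python
# Illustrative model of Data Workflow signal passing; these names are
# not SAP Data Intelligence APIs.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Signal:
    ok: bool = True
    message: str = ""

def run_workflow(tasks: List[Callable[[], None]]) -> str:
    signal = Signal()                  # Workflow Trigger emits the initial signal
    for task in tasks:
        try:
            task()                     # operator runs after receiving its input signal
            signal = Signal(ok=True)   # success: forward an output signal
        except Exception as exc:       # failure: forward through the error port
            signal = Signal(ok=False, message=str(exc))
            break                      # unhandled error ends the workflow
    # Workflow Terminator maps the final signal to a graph status
    return "completed" if signal.ok else "dead"

print(run_workflow([lambda: None, lambda: None]))  # -> completed
```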
Available Operators
Core Workflow Operators
| Operator | Purpose |
|---|---|
| Workflow Trigger | Initiates workflow execution |
| Workflow Terminator | Concludes workflow with status |
| Workflow Split | Duplicates signal for parallel paths |
| Workflow Merge (AND) | Combines outputs using logical AND |
| Workflow Merge (OR) | Combines outputs using logical OR |
Task Operators
| Operator | Purpose |
|---|---|
| Data Transfer | Move data between systems |
| Data Transform | Apply data transformations |
| Pipeline | Execute DI graphs locally or remotely |
| SAP Data Services Job | Run remote Data Services jobs |
| SAP HANA Flowgraph | Execute HANA flowgraphs |
| BW Process Chain | Run BW process chains |
| Notification | Send email notifications |
Data Transfer
Purpose
Transfer data from SAP systems to cloud storage.
Supported Sources
- SAP Business Warehouse (BW)
- SAP HANA
Supported Targets
- Amazon S3
- Google Cloud Storage
- Hadoop Distributed File System (HDFS)
- SAP Vora
Transfer Modes
| Mode | Description | Best For |
|---|---|---|
| BW OLAP | Default BW access | Small datasets |
| Generated HANA Views | Partition-based transfer | Large datasets |
| BW ODP | DataStore extraction | Cloud/distributed storage |
BW OLAP Mode
- Default mode for BW sources
- Uses standard OLAP interface (like RSRT2)
- Single result set processing
- Subject to OLAP cell export limits
- Not suitable for large-scale transfers
Generated HANA Views Mode
Requirements:
- Connection via DI Connection Management
- SAP BW 4.2.0 or later
- Working HANA database connection
- SSL certificates (if required)
- Query with generated calculation view (no restricted attributes)
Advantage: Transfers partitions separately, enabling large result sets and parallel processing.
BW ODP Mode
Works with: DataStore objects only
Supported Targets:
- Azure Data Lake
- Google Cloud Storage
- HDFS
- Alibaba OSS
- Amazon S3
- Semantic Data Lake
- Azure Storage Blob
Note: Partition processing is sequential, not parallel.
Configuration
- Source Connection: BW_SYSTEM
- Target Connection: S3_BUCKET
- Transfer Mode: Generated HANA Views
- Source Query: /NAMESPACE/QUERY
- Target Path: /data/export/
Remote Execution
Pipeline Operator
Execute SAP Data Intelligence graphs.
Options:
- Local execution (same DI instance)
- Remote execution (different DI instance)
- Parameter passing
- Synchronous or asynchronous execution
Configuration:
- Graph: /namespace/my_pipeline
- Connection: (required for remote execution)
- Parameters: key=value pairs
- Wait for Completion: Yes/No
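For remote execution, the Pipeline operator ultimately starts a graph on another DI instance over HTTP. The sketch below is a hypothetical illustration of that idea using the requests library; the endpoint path, payload fields, and authentication scheme are assumptions, not a confirmed SAP Data Intelligence API contract, so verify them against your system's API documentation.

```python
# Hypothetical sketch only: endpoint path, payload fields, and auth
# scheme are assumptions, not a confirmed DI API contract.
import requests

DI_HOST = "https://my-di-instance.example.com"  # placeholder host

response = requests.post(
    f"{DI_HOST}/app/pipeline-modeler/service/v1/runtime/graphs",  # assumed path
    auth=("tenant\\user", "password"),                            # assumed auth scheme
    json={
        "src": "/namespace/my_pipeline",    # graph from the configuration above
        "configurationSubstitutions": {     # assumed field for parameter passing
            "key": "value",
        },
    },
)
response.raise_for_status()
print(response.json())  # handle of the started graph instance
```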
SAP Data Services Job
Execute jobs in remote SAP Data Services systems.
Prerequisites:
- Data Services connection configured
- Job accessible from DI
Configuration:
- Connection: DS_CONNECTION
- Repository: REPO_NAME
- Job: JOB_NAME
- Global Variables: VAR1=VALUE1
SAP HANA Flowgraph
Execute flowgraphs in remote HANA systems.
Prerequisites:
- HANA connection configured
- Flowgraph deployed
Configuration:
- Connection: HANA_CONNECTION
- Flowgraph: SCHEMA.FLOWGRAPH_NAME
- Parameters: (if applicable)
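As a rough illustration of what executing a flowgraph amounts to: a deployed HANA flowgraph is typically exposed as a stored procedure that can be invoked over SQL. The sketch below uses SAP's hdbcli Python driver; the schema, the assumed procedure name, and the connection details are placeholders.

```python
# Placeholder connection details; the assumed "_SP" procedure name is
# for illustration only.
from hdbcli import dbapi  # SAP HANA Python client

conn = dbapi.connect(address="hana-host.example.com", port=30015,
                     user="USER", password="secret")
cursor = conn.cursor()
cursor.execute('CALL "SCHEMA"."FLOWGRAPH_NAME_SP"()')  # invoke the deployed flowgraph
cursor.close()
conn.close()
```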
BW Process Chain
Execute SAP BW process chains.
Prerequisites:
- BW connection configured
- Process chain accessible
Configuration:
- Connection: BW_CONNECTION
- Process Chain: CHAIN_ID
- Variant: (if applicable)
- Wait for Completion: Yes/No
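For a sense of what the operator does behind the configured BW connection, the sketch below starts a chain through the standard BW API RSPC_API_CHAIN_START over RFC using the PyRFC library. The connection details are placeholders, and the parameter names should be verified against your BW system.

```python
# Placeholder credentials; verify parameter names against your BW system.
from pyrfc import Connection  # SAP PyRFC library

conn = Connection(ashost="bw-host.example.com", sysnr="00",
                  client="100", user="USER", passwd="secret")

# RSPC_API_CHAIN_START is the standard BW API for starting a process chain.
result = conn.call("RSPC_API_CHAIN_START", I_CHAIN="CHAIN_ID")
print(result)  # includes the log ID of the started run
```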
Control Flow
Workflow Split
Duplicates incoming signal to multiple output ports.
                    ┌──→ [Task A]
[Trigger] → [Split] ┼──→ [Task B]
                    └──→ [Task C]
Use Case: Parallel execution paths
Workflow Merge (AND)
Combines multiple inputs using logical AND. Sends output only when all inputs received.
[Task A] ──┐
[Task B] ──┼──→ [Merge AND] → [Next Task]
[Task C] ──┘
Use Case: Wait for all parallel tasks to complete
Workflow Merge (OR)
Combines multiple inputs using logical OR. Sends output when any input received.
[Task A] ──┐
[Task B] ──┼──→ [Merge OR] → [Next Task]
[Task C] ──┘
Use Case: Continue when first task completes
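The two merge semantics map cleanly onto Python's concurrent.futures.wait, which makes a convenient (non-SAP) analogy: AND corresponds to ALL_COMPLETED, OR to FIRST_COMPLETED.

```python
# Non-SAP analogy: Merge (AND) waits for all branches, Merge (OR)
# continues as soon as the first branch finishes.
from concurrent.futures import (ThreadPoolExecutor, wait,
                                ALL_COMPLETED, FIRST_COMPLETED)
import time

def task(name: str, seconds: float) -> str:
    time.sleep(seconds)
    return name

with ThreadPoolExecutor() as pool:
    branches = [pool.submit(task, n, s)
                for n, s in [("A", 0.3), ("B", 0.1), ("C", 0.2)]]
    done, _ = wait(branches, return_when=ALL_COMPLETED)   # Merge (AND)
    print("AND fires after:", sorted(f.result() for f in done))

with ThreadPoolExecutor() as pool:
    branches = [pool.submit(task, n, s)
                for n, s in [("A", 0.3), ("B", 0.1), ("C", 0.2)]]
    done, _ = wait(branches, return_when=FIRST_COMPLETED)  # Merge (OR)
    print("OR fires after:", [f.result() for f in done])
```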
Error Handling
Error Port:
- All task operators have error ports
- Connect error port to handle failures
- Unhandled errors terminate workflow
                   ┌── success ──→ [Continue]
[Task] ────────────┤
                   └── error ──→ [Error Handler]
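In plain Python terms, the routing above is a try/except that decides which downstream branch fires; the function and port names below are illustrative only.

```python
# Illustrative names only: the task outcome decides which "port" fires.
def run_task_with_ports(task, on_success, on_error):
    try:
        result = task()
    except Exception as exc:
        on_error(exc)        # error port -> error handler branch
    else:
        on_success(result)   # success port -> continue branch

run_task_with_ports(
    task=lambda: 42,
    on_success=lambda r: print("continue with", r),
    on_error=lambda e: print("handle failure:", e),
)
```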
Notifications
Notification Operator
Send email notifications during workflow execution.
Configuration:
- SMTP Connection: EMAIL_CONNECTION
- To: recipients@company.com
- CC: (optional)
- Subject: Workflow ${workflow.name} - ${status}
- Body: The workflow completed at ${timestamp}
- Attachment: (optional file path)
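Functionally, the operator sends mail over the configured SMTP connection. A minimal stand-alone equivalent with Python's smtplib looks like this; host, port, and credentials are placeholders.

```python
# Stand-alone analogy; the operator itself reads these settings from
# the configured SMTP connection. Host and credentials are placeholders.
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "workflows@company.com"
msg["To"] = "recipients@company.com"
msg["Subject"] = "Workflow daily_load - completed"
msg.set_content("The workflow completed at 2025-11-22T06:00:00Z.")

with smtplib.SMTP("smtp.company.com", 587) as smtp:  # placeholder server
    smtp.starttls()                                  # assumes STARTTLS support
    smtp.login("workflows@company.com", "app-password")
    smtp.send_message(msg)
```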
Use Cases
- Success notifications
- Error alerts
- Progress updates
- Audit trail
Template Variables
| Variable | Description |
|---|---|
| ${workflow.name} | Workflow name |
| ${status} | Execution status |
| ${timestamp} | Current timestamp |
| ${error.message} | Error details (if failed) |
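As an illustration of how such ${...} placeholders can be expanded, the sketch below uses Python's string.Template (a stand-in, not what DI uses internally); the default identifier pattern rejects dots, so it is widened to accept names like workflow.name.

```python
# string.Template is a stand-in, not DI's internal mechanism.
from string import Template

class PlaceholderTemplate(Template):
    idpattern = r"[A-Za-z][A-Za-z0-9_.]*"  # allow dots, e.g. ${workflow.name}

values = {
    "workflow.name": "daily_load",
    "status": "completed",
    "timestamp": "2025-11-22T06:00:00Z",
}
print(PlaceholderTemplate("Workflow ${workflow.name} - ${status}").substitute(values))
print(PlaceholderTemplate("The workflow completed at ${timestamp}").substitute(values))
```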
Best Practices
Design Principles
- Clear Flow: Design linear or clearly branching workflows
- Error Handling: Always connect error ports
- Notifications: Add alerts for critical failures
- Idempotency: Design tasks to be re-runnable
Performance
- Parallelize: Use Split/Merge for independent tasks
- Optimize Transfers: Choose appropriate transfer mode
- Monitor Progress: Track workflow status
- Resource Planning: Consider target system load
Reliability
- Test Components: Validate each task individually
- Handle Failures: Implement retry logic where needed (see the sketch after this list)
- Clean Up: Manage temporary data
- Document: Maintain workflow documentation
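A minimal retry-with-backoff helper, one way to realize the "Handle Failures" practice for tasks you drive from custom code; this is a plain Python sketch, not a built-in Data Workflow feature.

```python
# Plain Python sketch, not a built-in Data Workflow feature.
import time

def retry(task, attempts: int = 3, base_delay: float = 2.0):
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == attempts:
                raise                                    # give up on the last attempt
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff

retry(lambda: print("transient task succeeded"))
```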
Example Workflow
[Trigger]
    ↓
[Split]
    ├──→ [Transfer from BW] ───┐
    └──→ [Transfer from HANA] ─┴──→ [Merge AND]
                                        ↓
                                [Transform Data]
                                        ↓
                                [Load to Target]
                                        ↓
                                [Send Notification]
                                        ↓
                                [Terminator]
Documentation Links
- Data Workflow Operators: https://github.com/SAP-docs/sap-hana-cloud-data-intelligence/tree/main/docs/modelingguide/working-with-dataworkflow-operators
- Transfer Data: https://github.com/SAP-docs/sap-hana-cloud-data-intelligence/blob/main/docs/modelingguide/working-with-dataworkflow-operators/transfer-data-b250a0b.md
- Transfer Modes: https://github.com/SAP-docs/sap-hana-cloud-data-intelligence/blob/main/docs/modelingguide/working-with-dataworkflow-operators/transfer-modes-a615280.md
Last Updated: 2025-11-22