# Data Acquisition and Preparation Reference

**Source**: [https://github.com/SAP-docs/sap-datasphere/tree/main/docs/Acquiring-Preparing-Modeling-Data/Acquiring-and-Preparing-Data-in-the-Data-Builder](https://github.com/SAP-docs/sap-datasphere/tree/main/docs/Acquiring-Preparing-Modeling-Data/Acquiring-and-Preparing-Data-in-the-Data-Builder)

---

## Table of Contents

1. [Data Flows](#data-flows)
2. [Replication Flows](#replication-flows)
3. [Transformation Flows](#transformation-flows)
4. [Local Tables](#local-tables)
5. [Remote Tables](#remote-tables)
6. [Task Chains](#task-chains)
7. [Python Operators](#python-operators)
8. [Data Transformation](#data-transformation)
9. [Semantic Onboarding](#semantic-onboarding)
10. [File Spaces and Object Store](#file-spaces-and-object-store)

---
## Data Flows

Data flows provide ETL capabilities for data transformation and loading.

### Prerequisites

**Required Privileges**:

- Data Warehouse General (`-R------`) - SAP Datasphere access
- Connection (`-R------`) - Read connections
- Data Warehouse Data Builder (`CRUD----`) - Create/edit/delete flows
- Space Files (`CRUD----`) - Manage space objects
- Data Warehouse Data Integration (`-RU-----`) - Run flows
- Data Warehouse Data Integration (`-R--E---`) - Schedule flows
### Creating a Data Flow

1. Navigate to the Data Builder
2. Select "New Data Flow"
3. Add source operators
4. Add transformation operators
5. Add the target operator
6. Save and deploy
### Key Limitations

- **No delta processing**: Use replication flows for delta/CDC data instead
- **Single target table** only per data flow
- **Local tables only**: Data flows load exclusively to local tables in the repository
- **Double quotes unsupported** in identifiers (column/table names)
- **Spatial data types** not supported
- **ABAP source preview** unavailable (except CDS views and LTR objects)
- **Transformation operators** cannot be previewed
### Advanced Properties

**Dynamic Memory Allocation**:

| Setting | Memory Range | Use Case |
|---------|--------------|----------|
| Small | 1-2 GB | Low volume |
| Medium | 2-3 GB | Standard volume |
| Large | 3-5 GB | High volume |

**Additional Options**:

- Automatic restart on failure
- Input parameters support
### Data Flow Operators

**Source Operators**:

- Remote tables
- Local tables
- Views
- CSV files

**Transformation Operators**:

| Operator | Purpose | Configuration |
|----------|---------|---------------|
| Join | Combine sources | Join type, conditions |
| Union | Stack sources | Column mapping |
| Projection | Select columns | Include/exclude, rename |
| Filter | Row filtering | Filter conditions |
| Aggregation | Group and aggregate | Group by, aggregates |
| Script | Custom Python | Python code |
| Calculated Column | Derived values | Expression |

**Target Operators**:

- Local table (new or existing)
- Load mode: truncate and insert, or delta merge
### Join Operations

**Join Types**:

- Inner Join: Matching rows only
- Left Outer: All left + matching right
- Right Outer: All right + matching left
- Full Outer: All rows from both
- Cross Join: Cartesian product

**Join Conditions**:

```
source1.column = source2.column
```
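
To make the join semantics concrete, here is a minimal pandas sketch; the `orders`/`customers` frames and their columns are hypothetical, and pandas `merge` stands in for the Join operator, not for the data flow engine itself:

```python
import pandas as pd

# Hypothetical sample data to illustrate the join types above.
orders = pd.DataFrame({"customer_id": [1, 2, 4], "amount": [100, 250, 75]})
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ada", "Bo", "Cy"]})

# Inner Join: matching rows only.
inner = orders.merge(customers, on="customer_id", how="inner")

# Left Outer: all orders plus matching customers (unmatched -> NaN).
left = orders.merge(customers, on="customer_id", how="left")

# Right Outer: all customers plus matching orders.
right = orders.merge(customers, on="customer_id", how="right")

# Full Outer: all rows from both sides.
full = orders.merge(customers, on="customer_id", how="outer")

# Cross Join: Cartesian product of both frames.
cross = orders.merge(customers, how="cross")
```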
### Aggregation Operations

**Aggregate Functions**:

- SUM, AVG, MIN, MAX
- COUNT, COUNT DISTINCT
- FIRST, LAST
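
A hedged pandas equivalent of the Aggregation operator's group-by logic; the sample data and column names are assumptions:

```python
import pandas as pd

# Hypothetical input for a group-by aggregation.
sales = pd.DataFrame({
    "region": ["US", "US", "EU"],
    "product": ["A", "B", "A"],
    "amount": [100.0, 250.0, 75.0],
})

# Group by region, then apply SUM / AVG / COUNT / COUNT DISTINCT equivalents.
result = sales.groupby("region").agg(
    total=("amount", "sum"),
    average=("amount", "mean"),
    rows=("amount", "count"),
    distinct_products=("product", "nunique"),
).reset_index()
print(result)
```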
### Calculated Columns

**Expression Syntax**:

```sql
CASE WHEN column1 > 100 THEN 'High' ELSE 'Low' END
CONCAT(first_name, ' ', last_name)
ROUND(amount * exchange_rate, 2)
```
### Input Parameters

Define runtime parameters for dynamic filtering:

**Parameter Types**:

- String
- Integer
- Date
- Timestamp

**Usage in Expressions**:

```sql
WHERE region = :IP_REGION
```
### Running Data Flows

**Execution Options**:

- Manual run from the Data Builder
- Scheduled via task chain
- API trigger

**Run Modes**:

- Full: Process all data
- Delta: Process changes only (requires delta capture)
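
To script the API trigger option, a run can be started over HTTP. The sketch below only shows the general shape of such a call: the endpoint path, tenant URL, and token handling are placeholders, not the documented SAP Datasphere API, so consult the official API reference for the real interface.

```python
import requests

# Placeholder tenant URL and token -- both are assumptions for this sketch.
BASE_URL = "https://<tenant>.hana.ondemand.com"
TOKEN = "<oauth-access-token>"  # obtained via an OAuth client, not shown here

def trigger_data_flow(space_id: str, flow_name: str) -> None:
    """Illustrative only: start a data flow run via a hypothetical REST endpoint."""
    response = requests.post(
        f"{BASE_URL}/api/v1/dataflows/{space_id}/{flow_name}/run",  # placeholder path
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    response.raise_for_status()

trigger_data_flow("MY_SPACE", "MY_DATA_FLOW")
```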
---
## Replication Flows

Replicate data from source systems to SAP Datasphere or external targets.

### Creating a Replication Flow

1. Navigate to the Data Builder
2. Select "New Replication Flow"
3. Add the source connection and objects
4. Add the target connection
5. Configure load type and mappings
6. Save and deploy
### Source Systems

**SAP Sources**:

- SAP S/4HANA Cloud (ODP, CDS views)
- SAP S/4HANA On-Premise (ODP, SLT, CDS)
- SAP BW/4HANA
- SAP ECC
- SAP HANA

**Cloud Storage Sources**:

- Amazon S3
- Azure Blob Storage
- Google Cloud Storage
- SFTP

**Streaming Sources**:

- Apache Kafka
- Confluent Kafka
### Target Systems

**SAP Datasphere Targets**:

- Local tables (managed by the replication flow)

**External Targets**:

- Apache Kafka
- Confluent Kafka
- Google BigQuery
- Amazon S3
- Azure Blob Storage
- Google Cloud Storage
- SFTP
- SAP Signavio
### Load Types

| Load Type | Description | Use Case |
|-----------|-------------|----------|
| Initial Only | One-time full load | Static data |
| Initial + Delta | Full load then changes | Standard replication |
| Real-Time | Continuous streaming | Live data |
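
To make *Initial + Delta* concrete, here is a hedged pandas simulation of an initial snapshot followed by delta records (upserts and deletes); the frames and change-type codes are illustrative, not the replication engine's internals:

```python
import pandas as pd

# Initial load: a full snapshot of the (hypothetical) source table.
target = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]}).set_index("id")

# Delta records captured afterwards: updates/inserts ("U") and deletions ("D").
delta = pd.DataFrame({
    "id": [2, 3, 4],
    "value": ["b2", None, "d"],
    "change_type": ["U", "D", "U"],
}).set_index("id")

# Apply deletions first, then upsert the remaining changes into the target.
target = target.drop(delta[delta["change_type"] == "D"].index, errors="ignore")
upserts = delta[delta["change_type"] != "D"].drop(columns="change_type")
target = upserts.combine_first(target)  # new keys inserted, existing keys updated
print(target.sort_index())  # id 1 -> "a", id 2 -> "b2", id 4 -> "d"
```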
### Configuration Options

**Flow-Level Properties**:

| Property | Description | Default |
|----------|-------------|---------|
| Delta Load Frequency | Interval for loading delta changes | Configurable |
| Skip Unmapped Target Columns | Ignore unmapped columns | Optional |
| Merge Data Automatically | Auto-merge for file space targets | Requires consent |
| Source Thread Limit | Parallel threads for the source (1-160) | 16 |
| Target Thread Limit | Parallel threads for the target (1-160) | 16 |
| Content Type | Template or Native format | Template |

**Object-Level Properties**:

| Property | Description |
|----------|-------------|
| Load Type | Initial Only, Initial + Delta, Delta Only |
| Delta Capture | Enable CDC tracking |
| ABAP Exit | Custom projection logic |
| Object Thread Count | Thread count for delta operations |
| Delete Before Load | Clear the target before loading |
### Critical Constraints

- **No input parameters**: Replication flows do not support input parameters
- **Thread limits read-only at design time**: Editable only after deployment
- **Content Type applies globally**: The selection affects all replication objects in the flow
- **ABAP systems**: Consult SAP Note 3297105 before creating replication flows
### Content Type (ABAP Sources)

| Type | Date/Timestamp Handling | Use Case |
|------|-------------------------|----------|
| Template Type | Applies ISO format requirements | Standard integration |
| Native Type | Dates → strings, timestamps → decimals | Custom formatting |

**Filters**:

- Define row-level filters on the source
- Combine multiple filter conditions with AND/OR
- **Important**: For ODP-CDS, filters must apply to primary key fields only

**Mappings**:

- Automatic column mapping
- Manual mapping overrides
- Exclude columns

**Projections**:

- Custom SQL expressions
- Column transformations
- Calculated columns
- ABAP Exit for custom projection logic
### Sizing and Performance

**Thread Configuration**:

- Source/Target Thread Limits: 1-160 (default: 16)
- Higher values give more parallelism but consume more resources
- Consider the source system's capacity

**Capacity Planning**:

- Estimate the data volume per table
- Consider network bandwidth
- Plan for parallel execution
- Use RFC fast serialization (SAP Note 3486245) for improved performance

**Load Balancing**:

- Distribute tables across multiple flows
- Schedule during off-peak hours
- Monitor resource consumption
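
As a worked example of capacity planning, here is a back-of-the-envelope sizing sketch; every figure in it (table volumes, per-thread throughput) is an assumption to be replaced with measurements from your own landscape:

```python
import math

# Illustrative assumptions -- measure these in your own landscape.
TABLE_VOLUMES_GB = {"SALES": 120, "DELIVERIES": 40, "MATERIALS": 5}
GB_PER_THREAD_HOUR = 2.0  # assumed effective throughput per thread
MAX_THREADS = 160         # documented upper bound for the thread limits

def suggest_thread_limit(load_window_hours: float) -> int:
    """Suggest a thread limit that finishes the initial load in the window."""
    total_gb = sum(TABLE_VOLUMES_GB.values())
    needed = total_gb / (GB_PER_THREAD_HOUR * load_window_hours)
    return max(1, min(MAX_THREADS, math.ceil(needed)))

# 165 GB at 2 GB/thread/hour within a 5-hour window -> 17 threads.
print(suggest_thread_limit(load_window_hours=5.0))
```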
### Unsupported Data Types

- BLOB, CLOB (large objects)
- Spatial data types
- Custom ABAP types
- Virtual Tables (SAP HANA Smart Data Access)
- Row Tables (use COLUMN TABLE only)

---
## Transformation Flows

Delta-aware transformations with automatic change propagation.

### Creating a Transformation Flow

1. Navigate to the Data Builder
2. Select "New Transformation Flow"
3. Add a source view (graphical or SQL)
4. Add the target table
5. Configure run settings
6. Save and deploy
### Key Constraints and Limitations

**Data Access Restrictions**:

Views and Open SQL schema objects cannot be used if they:

- Reference remote tables (except BW Bridge)
- Consume views with data access controls
- Have data access controls applied to them

**Loading Constraints**:

- Loading delta changes from views is not supported
- Transformation flows load data only to local tables in the SAP Datasphere repository
- Remote tables in BW Bridge spaces must be shared with the SAP Datasphere space
### Runtime Options

| Runtime | Storage Target | Use Case |
|---------|----------------|----------|
| HANA | SAP HANA database storage | Standard transformations |
| SPARK | SAP HANA Data Lake Files storage | Large-scale file processing |

### Load Types

| Load Type | Description | Requirements |
|-----------|-------------|--------------|
| Initial Only | Full dataset load | None |
| Initial and Delta | Full load then changes | Delta capture enabled on source and target tables |
### Input Parameter Constraints

- Cannot be created or edited in the Graphical View Editor
- Scheduled flows use default values
- **Not supported** in Python operations (SPARK runtime)
- Exclude from task chain input parameters

### Source Options

- Graphical view (created inline)
- SQL view (created inline)
- Existing views
### Target Table Management

**Options**:

- Create a new local table
- Use an existing local table

**Column Handling**:

- Add new columns automatically
- Map columns manually
- Exclude columns
### Run Modes

| Mode | Action | Use Case |
|------|--------|----------|
| Start | Process delta changes | Regular runs |
| Delete | Remove target records | Cleanup |
| Truncate | Clear and reload | Full refresh |

### Delta Processing

Transformation flows track changes automatically:

- Insert: New records
- Update: Modified records
- Delete: Removed records
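
The sketch below illustrates the bookkeeping idea: each change is logged with a type, and replaying the log (latest change per key, deletes dropped) reproduces the current state. The column names are assumptions for this sketch, not the actual technical columns of delta-capture tables:

```python
import pandas as pd

# Illustrative delta log: one row per change, newest last.
delta_log = pd.DataFrame({
    "id":          [1,   2,   2,    3,   3],
    "value":       ["a", "b", "b2", "c", None],
    "change_type": ["I", "I", "U",  "I", "D"],  # Insert / Update / Delete
})

# Replay the log: keep the latest change per key, then drop deleted keys.
latest = delta_log.groupby("id").tail(1)
state = latest[latest["change_type"] != "D"].drop(columns="change_type")
print(state)  # id 1 -> "a", id 2 -> "b2"; id 3 was inserted, then deleted
```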
### File Space Transformations

Transform data in the object store (file spaces):

**Supported Functions**:

- String functions
- Numeric functions
- Date functions
- Conversion functions

---
## Local Tables

Store data directly in SAP Datasphere.

### Creating Local Tables

**Methods**:

1. Data Builder > New Table
2. Import from CSV
3. Create as a data flow target
4. Create as a replication flow target
### Storage Options

| Storage | Target System | Use Case |
|---------|---------------|----------|
| Disk | SAP HANA Cloud, SAP HANA database | Standard persistent storage |
| In-Memory | SAP HANA Cloud, SAP HANA database | High-performance hot data |
| File | SAP HANA Cloud data lake storage | Large-scale, cost-effective storage |
### Table Properties

**Key Columns**:

- Primary key definition
- Unique constraints

**Data Types**:

- String (VARCHAR)
- Integer (INT, BIGINT)
- Decimal (DECIMAL)
- Date, Time, Timestamp
- Boolean
- Binary
### Partitioning

**Partition Types**:

- Range partitioning (date/numeric)
- Hash partitioning

**Benefits**:

- Improved query performance
- Parallel processing
- Selective data loading
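
A minimal pandas sketch of what range partitioning buys: splitting on a date column so each partition can be loaded, refreshed, or pruned independently (the frame and the per-year boundaries are hypothetical):

```python
import pandas as pd

# Hypothetical fact table with an order_date column.
orders = pd.DataFrame({
    "order_id": range(6),
    "order_date": pd.to_datetime([
        "2023-03-01", "2023-11-15", "2024-02-01",
        "2024-07-04", "2024-12-31", "2025-01-02",
    ]),
})

# Range partitioning by calendar year: each group is one partition.
for year, partition in orders.groupby(orders["order_date"].dt.year):
    # A query filtered on order_date only needs to scan matching partitions.
    print(year, len(partition), "rows")
```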
### Delta Capture

Enable change tracking for incremental processing:

1. Enable delta capture on the table
2. Track insert/update/delete operations
3. Query changes with delta tokens

**Important Constraint**: Once delta capture is enabled and deployed, it **cannot be modified or disabled**.
### Allow Data Transport

Available for dimensions on SAP Business Data Cloud formation tenants:

- Enables data inclusion during repository package transport
- Limited to initial import data initialization
- **Applies only to**: Dimensions, text entities, or relational datasets
### Data Maintenance

**Operations**:

- Insert records
- Update records
- Delete records
- Truncate table
- Load from file

### Local Table (File)

Store data in the object store:

**Supported Formats**:

- Parquet
- CSV
- JSON

**Use Cases**:

- Large datasets
- Cost-effective storage
- Integration with data lakes

---
## Remote Tables

Virtual access to external data without copying.

### Importing Remote Tables

1. Select a connection in the source browser
2. Choose tables/views to import
3. Configure import settings
4. Deploy the remote table
### Data Access Modes

| Mode | Description | Performance |
|------|-------------|-------------|
| Remote | Query the source directly | Network dependent |
| Replicated | Copy to local storage | Fast queries |

### Replication Options

**Full Replication**:

- Copy all data
- Scheduled refresh

**Real-Time Replication**:

- Continuous change capture
- Near real-time updates

**Partitioned Replication**:

- Divide data into partitions
- Parallel loading

### Remote Table Properties

**Statistics**:

- Create statistics for query optimization
- Update statistics periodically

**Filters**:

- Define partitioning filters
- Limit data volume

---
## Task Chains

Orchestrate multiple data integration tasks.

### Creating Task Chains

1. Navigate to the Data Builder
2. Select "New Task Chain"
3. Add task nodes
4. Configure dependencies
5. Save and deploy
### Supported Task Types

**Repository Objects**:

| Task Type | Activity | Description |
|-----------|----------|-------------|
| Remote Tables | Replicate | Replicate remote table data |
| Views | Persist | Persist view data to storage |
| Intelligent Lookups | Run | Execute intelligent lookup |
| Data Flows | Run | Execute data flow |
| Replication Flows | Run | Run with load type *Initial Only* |
| Transformation Flows | Run | Execute transformation flow |
| Local Tables | Delete Records | Delete records with Change Type "Deleted" |
| Local Tables (File) | Merge | Merge delta files |
| Local Tables (File) | Optimize | Compact files |
| Local Tables (File) | Delete Records | Remove data |

**Non-Repository Objects**:

| Task Type | Description |
|-----------|-------------|
| Open SQL Procedure | Execute SAP HANA schema procedures |
| BW Bridge Process Chain | Run SAP BW Bridge processes |

**Toolbar-Only Objects**:

| Task Type | Description |
|-----------|-------------|
| API Task | Call external REST APIs |
| Notification Task | Send email notifications |

**Nested Objects**:

| Task Type | Description |
|-----------|-------------|
| Task Chain | Reference locally created or shared task chains |
### Object Prerequisites

- All objects must be deployed before they can be added to a task chain
- SAP HANA Open SQL schema procedures require EXECUTE privileges granted to space users
- Views **cannot** have data access controls assigned
- Data flows with input parameters use their default values during task chain execution
- Views to be persisted may include at most one input parameter, which must have a default value
### Execution Control

**Sequential Execution**:

- Tasks run one after another
- A task runs only when the preceding task finishes with *completed* status
- A failure stops chain execution

**Parallel Execution** (illustrated in the sketch below):

- Multiple branches run simultaneously
- Completion condition options:
  - **ANY**: Succeeds when any parallel task completes
  - **ALL**: Succeeds only when all parallel tasks complete
- Synchronization at join points
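
The ANY/ALL completion conditions map naturally onto `concurrent.futures`; the sketch below uses local threads as stand-ins for parallel task chain branches, purely to illustrate the two join semantics:

```python
import concurrent.futures as cf
import time

def task(name: str, seconds: float) -> str:
    """Stand-in for a task chain node (e.g. a data flow run)."""
    time.sleep(seconds)
    return name

with cf.ThreadPoolExecutor() as pool:
    branches = [("flow_a", 0.2), ("flow_b", 0.5)]
    futures = [pool.submit(task, name, secs) for name, secs in branches]

    # ANY: the branch is satisfied as soon as one parallel task completes.
    done, _pending = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
    print("ANY satisfied by:", next(iter(done)).result())

    # ALL: the join point waits until every parallel task has completed.
    done, _ = cf.wait(futures, return_when=cf.ALL_COMPLETED)
    print("ALL satisfied:", sorted(f.result() for f in done))
```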
**Layout Options**:

- Top-bottom orientation
- Left-right orientation
- Drag tasks to reorder

**Apache Spark Settings**:

- Override the default Apache Spark application settings per task
- Configure memory and executor settings
### Input Parameters

Pass parameters to task chain tasks:

**Parameter Definition**:

```yaml
name: region
type: string
default: "US"
```

**Parameter Usage**:

- Pass to data flows
- Use in filters
- Dynamic configuration
### Scheduling

**Simple Schedule**:

- Daily, weekly, monthly
- At a specific time

**Cron Expression**:

```
0 0 6 * * ?     # Daily at 6 AM
0 0 */4 * * ?   # Every 4 hours
```

**Important Scheduling Constraint**: If a schedule includes remote tables with *Replicated (Real-Time)* data access, the replication type is converted to batch replication at the next scheduled run, which ends real-time updates.
### Email Notifications

Configure notifications for:

- Success
- Failure
- Warning

**Recipient Options**:

- Tenant users (searchable after the task chain is deployed)
- External email addresses (requires a deployed task chain for recipient selection)

**Export Constraint**: CSN/JSON export does not include notification recipients.

---
## Python Operators

Custom data processing with Python.

### Creating Python Operators

1. Add a Script operator to a data flow
2. Define input/output ports
3. Write Python code
4. Configure execution
### Python Script Structure

```python
def my_function(value):
    """Placeholder for your custom row-level logic."""
    return value

def transform(data):
    """
    Transform input data.

    Args:
        data: pandas DataFrame

    Returns:
        pandas DataFrame
    """
    # Your transformation logic
    result = data.copy()
    result['new_column'] = result['existing'].apply(my_function)
    return result
```

### Available Libraries

- pandas
- numpy
- scipy
- scikit-learn
- datetime
### Best Practices

- Keep transformations simple
- Handle null values explicitly
- Log errors appropriately
- Test with sample data
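
A hedged sketch of a script-operator body that follows these practices; the column names and the logging setup are illustrative assumptions:

```python
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("script_operator")

def transform(data: pd.DataFrame) -> pd.DataFrame:
    """Derive a margin column; handle nulls explicitly and log anomalies."""
    result = data.copy()

    # Handle null values explicitly instead of letting them propagate.
    result["revenue"] = result["revenue"].fillna(0.0)
    result["cost"] = result["cost"].fillna(0.0)

    # Keep the transformation itself simple.
    result["margin"] = result["revenue"] - result["cost"]

    # Log suspicious rows rather than failing silently.
    negative = int((result["margin"] < 0).sum())
    if negative:
        log.warning("%d rows have negative margin", negative)
    return result

# Test with sample data before deploying.
sample = pd.DataFrame({"revenue": [100.0, None], "cost": [40.0, 10.0]})
print(transform(sample))
```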
---

## Data Transformation

Column-level transformations in graphical views.

### Text Transformations

| Function | Description | Example |
|----------|-------------|---------|
| Change Case | Upper/lower/title case | UPPER(name) |
| Concatenate | Join columns | CONCAT(first, last) |
| Extract | Substring | SUBSTRING(text, 1, 5) |
| Split | Divide by delimiter | SPLIT(full_name, ' ') |
| Find/Replace | Text substitution | REPLACE(text, 'old', 'new') |

### Numeric Transformations

| Function | Description |
|----------|-------------|
| ROUND | Round to precision |
| FLOOR | Round down |
| CEIL | Round up |
| ABS | Absolute value |
| MOD | Modulo operation |

### Date Transformations

| Function | Description |
|----------|-------------|
| YEAR | Extract year |
| MONTH | Extract month |
| DAY | Extract day |
| DATEDIFF | Date difference |
| ADD_DAYS | Add days to a date |

### Filter Operations

```sql
-- Numeric filter
amount > 1000

-- Text filter
region IN ('US', 'EU', 'APAC')

-- Date filter
order_date >= '2024-01-01'

-- Null handling
customer_name IS NOT NULL
```
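
For comparison, here is the same filter logic expressed in pandas (a sketch over a hypothetical frame, not the view editor's syntax):

```python
import pandas as pd

# Hypothetical orders frame mirroring the SQL filters above.
orders = pd.DataFrame({
    "amount": [500, 1500, 2500],
    "region": ["US", "MX", "EU"],
    "order_date": pd.to_datetime(["2023-12-31", "2024-01-15", "2024-06-01"]),
    "customer_name": ["Ada", None, "Cy"],
})

filtered = orders[
    (orders["amount"] > 1000)
    & (orders["region"].isin(["US", "EU", "APAC"]))
    & (orders["order_date"] >= "2024-01-01")
    & (orders["customer_name"].notna())
]
print(filtered)  # only the EU row passes every condition
```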
---

## Semantic Onboarding

Import objects with business semantics from SAP systems.

### SAP S/4HANA Import

Import CDS views with annotations:

- Semantic types (currency, unit)
- Associations
- Hierarchies
- Text relationships

### SAP BW/4HANA Import

Import BW objects:

- InfoObjects
- CompositeProviders
- Queries
- Analysis Authorizations

### Import Process

1. Select the source connection
2. Browse the available objects
3. Select objects to import
4. Review the semantic mapping
5. Deploy the imported objects
---

## File Spaces and Object Store

Store and process data in the object store.

### Creating File Spaces

1. System > Configuration > Spaces
2. Create a new file space
3. Configure the object store connection
4. Set storage limits

### Data Loading

**Supported Formats**:

- Parquet (recommended)
- CSV
- JSON

**Loading Methods**:

- Replication flows
- Transformation flows
- API upload

### In-Memory Acceleration

Enable in-memory storage for faster queries:

1. Select a table/view
2. Enable in-memory storage
3. Configure the refresh schedule

### Premium Outbound Integration

Export data to external systems:

- Configure an outbound connection
- Schedule exports
- Monitor transfer status

---
## Documentation Links

- **Data Flows**: [https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/e30fd14](https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/e30fd14)
- **Replication Flows**: [https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/25e2bd7](https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/25e2bd7)
- **Transformation Flows**: [https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/f7161e6](https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/f7161e6)
- **Task Chains**: [https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/d1afbc2](https://help.sap.com/docs/SAP_DATASPHERE/c8a54ee704e94e15926551293243fd1d/d1afbc2)

---

**Last Updated**: 2025-11-22