Initial commit

Zhongwei Li
2025-11-29 18:18:03 +08:00
commit 31ff8e1c29
18 changed files with 5925 additions and 0 deletions


@@ -0,0 +1,29 @@
# Creating Workspaces
## Create Workspace with Large Storage Format
**Step 1:** List available capacities
```bash
fab ls .capacities
```
**Step 2:** Create workspace on chosen capacity
```bash
fab mkdir "Workspace Name.Workspace" -P capacityName="MyCapacity"
```
**Step 3:** Get workspace ID
```bash
fab get "Workspace Name.Workspace" -q "id"
```
**Step 4:** Set default storage format to Large
```bash
fab api -A powerbi -X patch "groups/<workspace-id>" -i '{"defaultDatasetStorageFormat":"Large"}'
```
Done. The workspace now defaults to Large storage format for all new semantic models.
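The four steps combine into one short script; a sketch, with placeholder workspace and capacity names:
```bash
# Sketch: create a workspace on a named capacity and default it to the Large storage format
WS_NAME="Analytics"   # hypothetical workspace name
fab mkdir "$WS_NAME.Workspace" -P capacityName="MyCapacity"
WS_ID=$(fab get "$WS_NAME.Workspace" -q "id" | tr -d '"')
fab api -A powerbi -X patch "groups/$WS_ID" -i '{"defaultDatasetStorageFormat":"Large"}'
```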


@@ -0,0 +1,242 @@
# Fabric API Reference
Direct API access via `fab api` for operations beyond standard commands.
## API Basics
```bash
# Fabric API (default)
fab api "<endpoint>"
# Power BI API
fab api -A powerbi "<endpoint>"
# With query
fab api "<endpoint>" -q "value[0].id"
# POST with body
fab api -X post "<endpoint>" -i '{"key":"value"}'
```
## Capacities
```bash
# List all capacities
fab api capacities
# Response includes: id, displayName, sku (F2, F64, FT1, PP3), region, state
```
**Pause capacity** (cost savings):
```bash
# CAUTION: Pausing stops all workloads on that capacity
# Resume is intentionally NOT documented - too dangerous for automation
# Use Azure Portal for resume operations
# To pause via Azure CLI (not fab):
az resource update --ids "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Fabric/capacities/{name}" \
--set properties.state=Paused
```
## Gateways
```bash
# List gateways
fab api -A powerbi gateways
# Get gateway datasources
GATEWAY_ID="<gateway-id>"
fab api -A powerbi "gateways/$GATEWAY_ID/datasources"
# Get gateway users
fab api -A powerbi "gateways/$GATEWAY_ID/users"
```
## Deployment Pipelines
```bash
# List pipelines (user)
fab api -A powerbi pipelines
# List pipelines (admin - all tenant)
fab api -A powerbi admin/pipelines
# Get pipeline stages
PIPELINE_ID="<pipeline-id>"
fab api -A powerbi "pipelines/$PIPELINE_ID/stages"
# Get pipeline operations
fab api -A powerbi "pipelines/$PIPELINE_ID/operations"
```
**Deploy content** (use Fabric API):
```bash
# Assign workspace to stage
fab api -X post "deploymentPipelines/$PIPELINE_ID/stages/$STAGE_ID/assignWorkspace" \
-i '{"workspaceId":"<workspace-id>"}'
# Deploy to next stage
fab api -X post "deploymentPipelines/$PIPELINE_ID/deploy" -i '{
"sourceStageOrder": 0,
"targetStageOrder": 1,
"options": {"allowCreateArtifact": true, "allowOverwriteArtifact": true}
}'
```
## Domains
```bash
# List domains
fab api admin/domains
# Get domain workspaces
DOMAIN_ID="<domain-id>"
fab api "admin/domains/$DOMAIN_ID/workspaces"
# Assign workspaces to domain
fab api -X post "admin/domains/$DOMAIN_ID/assignWorkspaces" \
-i '{"workspacesIds":["<ws-id-1>","<ws-id-2>"]}'
```
## Dataflows
**Gen1** (Power BI dataflows):
```bash
# List all dataflows (admin)
fab api -A powerbi admin/dataflows
# List workspace dataflows
WS_ID="<workspace-id>"
fab api -A powerbi "groups/$WS_ID/dataflows"
# Refresh dataflow
DATAFLOW_ID="<dataflow-id>"
fab api -A powerbi -X post "groups/$WS_ID/dataflows/$DATAFLOW_ID/refreshes"
```
**Gen2** (Fabric dataflows):
```bash
# Gen2 dataflows are Fabric items - use standard fab commands
fab ls "ws.Workspace" | grep DataflowGen2
fab get "ws.Workspace/Flow.DataflowGen2" -q "id"
```
## Apps
**Workspace Apps** (published from workspaces):
```bash
# List user's apps
fab api -A powerbi apps
# List all apps (admin)
fab api -A powerbi 'admin/apps?$top=100'
# Get app details
APP_ID="<app-id>"
fab api -A powerbi "apps/$APP_ID"
# Get app reports
fab api -A powerbi "apps/$APP_ID/reports"
# Get app dashboards
fab api -A powerbi "apps/$APP_ID/dashboards"
```
**Org Apps** (template apps from AppSource):
```bash
# Org apps are installed from AppSource marketplace
# They appear in the regular apps endpoint after installation
# No separate API for org app catalog - use AppSource
```
## Admin Operations
### Workspaces
```bash
# List all workspaces (requires $top)
fab api -A powerbi 'admin/groups?$top=100'
# Response includes: id, name, type, state, capacityId, pipelineId
# Get workspace users
fab api -A powerbi "admin/groups/$WS_ID/users"
```
### Items
```bash
# List all items in tenant
fab api admin/items
# Response includes: id, type, name, workspaceId, capacityId, creatorPrincipal
```
### Security Scanning
```bash
# Reports shared with entire org (security risk)
fab api -A powerbi "admin/widelySharedArtifacts/linksSharedToWholeOrganization"
# Reports published to web (security risk)
fab api -A powerbi "admin/widelySharedArtifacts/publishedToWeb"
```
### Activity Events
```bash
# Get activity events (last 30 days max)
# Dates must be in ISO 8601 format with quotes
START="2025-11-26T00:00:00Z"
END="2025-11-27T00:00:00Z"
fab api -A powerbi "admin/activityevents?startDateTime='$START'&endDateTime='$END'"
```
## Common Patterns
### Extract ID for Chaining
```bash
# Get ID and remove quotes
WS_ID=$(fab get "ws.Workspace" -q "id" | tr -d '"')
MODEL_ID=$(fab get "ws.Workspace/Model.SemanticModel" -q "id" | tr -d '"')
# Use in API call
fab api -A powerbi "groups/$WS_ID/datasets/$MODEL_ID/refreshes" -X post -i '{"type":"Full"}'
```
### Pagination
```bash
# APIs with $top often have pagination
# Check for @odata.nextLink in response
fab api -A powerbi 'admin/groups?$top=100' -q "@odata.nextLink"
# Use returned URL for next page
```
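A rough paging loop; it assumes `fab api` prints the raw OData JSON body and that the path-and-query portion of `@odata.nextLink` can be replayed as an endpoint:
```bash
# Sketch: walk admin/groups 100 workspaces at a time
ENDPOINT='admin/groups?$top=100'
while [ -n "$ENDPOINT" ]; do
  PAGE=$(fab api -A powerbi "$ENDPOINT")
  echo "$PAGE" | jq -r '.value[].name'                        # process this page
  NEXT=$(echo "$PAGE" | jq -r '."@odata.nextLink" // empty')
  ENDPOINT="${NEXT#https://api.powerbi.com/v1.0/myorg/}"      # strip the base URL before reusing
done
```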
### Error Handling
```bash
# Check status_code in response
# 200 = success
# 400 = bad request (check parameters)
# 401 = unauthorized (re-authenticate)
# 403 = forbidden (insufficient permissions)
# 404 = not found
```
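A minimal sketch that fails fast on non-200 responses, assuming the response JSON exposes `status_code` as described above:
```bash
RESPONSE=$(fab api -A powerbi 'groups/<ws-id>/datasets/<model-id>/refreshes?$top=1')
STATUS=$(echo "$RESPONSE" | jq -r '.status_code // empty')
if [ -n "$STATUS" ] && [ "$STATUS" != "200" ]; then
  echo "Request failed with HTTP $STATUS" >&2
  exit 1
fi
```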
## API Audiences
| Audience | Flag | Base URL | Use Case |
|----------|------|----------|----------|
| Fabric | (default) | api.fabric.microsoft.com | Fabric items, workspaces, admin |
| Power BI | `-A powerbi` | api.powerbi.com | Reports, datasets, gateways, pipelines |
Most admin operations work with both APIs but return different formats.


@@ -0,0 +1,666 @@
# Notebook Operations
Comprehensive guide for working with Fabric notebooks using the Fabric CLI.
## Overview
Fabric notebooks are interactive documents for data engineering, data science, and analytics. They can be executed, scheduled, and managed via the CLI.
## Getting Notebook Information
### Basic Notebook Info
```bash
# Check if notebook exists
fab exists "Production.Workspace/ETL Pipeline.Notebook"
# Get notebook properties
fab get "Production.Workspace/ETL Pipeline.Notebook"
# Get with verbose details
fab get "Production.Workspace/ETL Pipeline.Notebook" -v
# Get only notebook ID
fab get "Production.Workspace/ETL Pipeline.Notebook" -q "id"
```
### Get Notebook Definition
```bash
# Get full notebook definition
fab get "Production.Workspace/ETL Pipeline.Notebook" -q "definition"
# Save definition to file
fab get "Production.Workspace/ETL Pipeline.Notebook" -q "definition" -o /tmp/notebook-def.json
# Get notebook content (cells)
fab get "Production.Workspace/ETL Pipeline.Notebook" -q "definition.parts[?path=='notebook-content.py'].payload | [0]"
```
## Exporting Notebooks
### Export as IPYNB
```bash
# Export notebook
fab export "Production.Workspace/ETL Pipeline.Notebook" -o /tmp/notebooks
# This creates:
# /tmp/notebooks/ETL Pipeline.Notebook/
# ├── notebook-content.py (or .ipynb)
# └── metadata files
```
### Export All Notebooks from Workspace
```bash
# Export all notebooks
WS_ID=$(fab get "Production.Workspace" -q "id" | tr -d '"')
# Read names line by line so notebooks with spaces in their names are handled
fab api "workspaces/$WS_ID/items" -q "value[?type=='Notebook'].displayName" | jq -r '.[]' |
while IFS= read -r NOTEBOOK; do
fab export "Production.Workspace/$NOTEBOOK.Notebook" -o /tmp/notebooks
done
```
## Importing Notebooks
### Import from Local
```bash
# Import notebook from .ipynb format (default)
fab import "Production.Workspace/New ETL.Notebook" -i /tmp/notebooks/ETL\ Pipeline.Notebook
# Import from .py format
fab import "Production.Workspace/Script.Notebook" -i /tmp/script.py --format py
```
### Copy Between Workspaces
```bash
# Copy notebook
fab cp "Dev.Workspace/ETL.Notebook" "Production.Workspace"
# Copy with new name
fab cp "Dev.Workspace/ETL.Notebook" "Production.Workspace/Prod ETL.Notebook"
```
## Creating Notebooks
### Create Blank Notebook
```bash
# Get workspace ID first
fab get "Production.Workspace" -q "id"
# Create via API
fab api -X post "workspaces/<workspace-id>/notebooks" -i '{"displayName": "New Data Processing"}'
```
### Create and Configure Query Notebook
Use this workflow to create a notebook for querying lakehouse tables with Spark SQL.
#### Step 1: Create the notebook
```bash
# Get workspace ID
fab get "Sales.Workspace" -q "id"
# Returns: 4caf7825-81ac-4c94-9e46-306b4c20a4d5
# Create notebook
fab api -X post "workspaces/4caf7825-81ac-4c94-9e46-306b4c20a4d5/notebooks" -i '{"displayName": "Data Query"}'
# Returns notebook ID: 97bbd18d-c293-46b8-8536-82fb8bc9bd58
```
#### Step 2: Get lakehouse ID (required for notebook metadata)
```bash
fab get "Sales.Workspace/SalesLH.Lakehouse" -q "id"
# Returns: ddbcc575-805b-4922-84db-ca451b318755
```
#### Step 3: Create notebook code in Fabric format
```bash
cat > /tmp/notebook.py <<'EOF'
# Fabric notebook source
# METADATA ********************
# META {
# META "kernel_info": {
# META "name": "synapse_pyspark"
# META },
# META "dependencies": {
# META "lakehouse": {
# META "default_lakehouse": "ddbcc575-805b-4922-84db-ca451b318755",
# META "default_lakehouse_name": "SalesLH",
# META "default_lakehouse_workspace_id": "4caf7825-81ac-4c94-9e46-306b4c20a4d5"
# META }
# META }
# META }
# CELL ********************
# Query lakehouse table
df = spark.sql("""
SELECT
date_key,
COUNT(*) as num_records
FROM gold.sets
GROUP BY date_key
ORDER BY date_key DESC
LIMIT 10
""")
# IMPORTANT: Convert to pandas and print to capture output
# display(df) will NOT show results via API
pandas_df = df.toPandas()
print(pandas_df)
print(f"\nLatest date: {pandas_df.iloc[0]['date_key']}")
EOF
```
#### Step 4: Base64 encode and create update definition
```bash
base64 < /tmp/notebook.py | tr -d '\n' > /tmp/notebook-b64.txt   # keep it on one line so the JSON payload stays valid
cat > /tmp/update.json <<EOF
{
"definition": {
"parts": [
{
"path": "notebook-content.py",
"payload": "$(cat /tmp/notebook-b64.txt)",
"payloadType": "InlineBase64"
}
]
}
}
EOF
```
#### Step 5: Update notebook with code
```bash
fab api -X post "workspaces/4caf7825-81ac-4c94-9e46-306b4c20a4d5/notebooks/97bbd18d-c293-46b8-8536-82fb8bc9bd58/updateDefinition" -i /tmp/update.json --show_headers
# Returns operation ID in Location header
```
#### Step 6: Check update completed
```bash
fab api "operations/<operation-id>"
# Wait for status: "Succeeded"
```
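Rather than checking by hand, a small polling loop works; this is a sketch and assumes the operation response exposes a `status` field as shown above:
```bash
# Poll the long-running updateDefinition operation until it finishes
OP_ID="<operation-id>"
while true; do
  STATUS=$(fab api "operations/$OP_ID" -q "status" | tr -d '"')
  echo "Operation status: $STATUS"
  [ "$STATUS" = "Succeeded" ] && break
  [ "$STATUS" = "Failed" ] && { echo "Update failed" >&2; exit 1; }
  sleep 5
done
```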
#### Step 7: Run the notebook
```bash
fab job start "Sales.Workspace/Data Query.Notebook"
# Returns job instance ID
```
#### Step 8: Check execution status
```bash
fab job run-status "Sales.Workspace/Data Query.Notebook" --id <job-id>
# Wait for status: "Completed"
```
#### Step 9: Get results (download from Fabric UI)
- Open notebook in Fabric UI after execution
- Print output will be visible in cell outputs
- Download .ipynb file to see printed results locally
#### Critical Requirements
1. **File format**: Must be `notebook-content.py` (NOT `.ipynb`)
2. **Lakehouse ID**: Must include `default_lakehouse` ID in metadata (not just name)
3. **Spark session**: Will be automatically available when lakehouse is attached
4. **Capturing output**: Use `df.toPandas()` and `print()` - `display()` won't show in API
5. **Results location**: Print output visible in UI and downloaded .ipynb, NOT in definition
#### Common Issues
- `NameError: name 'spark' is not defined` - Lakehouse not attached (missing default_lakehouse ID)
- Job "Completed" but no results - Used display() instead of print()
- Update fails - Used .ipynb path instead of .py
### Create from Template
```bash
# Export template
fab export "Templates.Workspace/Template Notebook.Notebook" -o /tmp/templates
# Import as new notebook
fab import "Production.Workspace/Custom Notebook.Notebook" -i /tmp/templates/Template\ Notebook.Notebook
```
## Running Notebooks
### Run Synchronously (Wait for Completion)
```bash
# Run notebook and wait
fab job run "Production.Workspace/ETL Pipeline.Notebook"
# Run with timeout (seconds)
fab job run "Production.Workspace/Long Process.Notebook" --timeout 600
```
### Run with Parameters
```bash
# Run with basic parameters
fab job run "Production.Workspace/ETL Pipeline.Notebook" -P \
date:string=2024-01-01,\
batch_size:int=1000,\
debug_mode:bool=false,\
threshold:float=0.95
# Parameters must match types defined in notebook
# Supported types: string, int, float, bool
```
### Run with Spark Configuration
```bash
# Run with custom Spark settings
fab job run "Production.Workspace/Big Data Processing.Notebook" -C '{
"conf": {
"spark.executor.memory": "8g",
"spark.executor.cores": "4",
"spark.dynamicAllocation.enabled": "true"
},
"environment": {
"id": "<environment-id>",
"name": "Production Environment"
}
}'
# Run with default lakehouse
fab job run "Production.Workspace/Data Ingestion.Notebook" -C '{
"defaultLakehouse": {
"name": "MainLakehouse",
"id": "<lakehouse-id>",
"workspaceId": "<workspace-id>"
}
}'
# Run with workspace Spark pool
fab job run "Production.Workspace/Analytics.Notebook" -C '{
"useStarterPool": false,
"useWorkspacePool": "HighMemoryPool"
}'
```
### Run with Combined Parameters and Configuration
```bash
# Combine parameters and configuration
fab job run "Production.Workspace/ETL Pipeline.Notebook" \
-P date:string=2024-01-01,batch:int=500 \
-C '{
"defaultLakehouse": {"name": "StagingLH", "id": "<lakehouse-id>"},
"conf": {"spark.sql.shuffle.partitions": "200"}
}'
```
### Run Asynchronously
```bash
# Start notebook and return immediately
JOB_ID=$(fab job start "Production.Workspace/ETL Pipeline.Notebook" | grep -o '"id": "[^"]*"' | cut -d'"' -f4)
# Check status later
fab job run-status "Production.Workspace/ETL Pipeline.Notebook" --id "$JOB_ID"
```
## Monitoring Notebook Executions
### Get Job Status
```bash
# Check specific job
fab job run-status "Production.Workspace/ETL Pipeline.Notebook" --id <job-id>
# Get detailed status via API
WS_ID=$(fab get "Production.Workspace" -q "id" | tr -d '"')
NOTEBOOK_ID=$(fab get "Production.Workspace/ETL Pipeline.Notebook" -q "id" | tr -d '"')
fab api "workspaces/$WS_ID/items/$NOTEBOOK_ID/jobs/instances/<job-id>"
```
### List Execution History
```bash
# List all job runs
fab job run-list "Production.Workspace/ETL Pipeline.Notebook"
# List only scheduled runs
fab job run-list "Production.Workspace/ETL Pipeline.Notebook" --schedule
# Get latest run status
fab job run-list "Production.Workspace/ETL Pipeline.Notebook" | head -n 1
```
### Cancel Running Job
```bash
fab job run-cancel "Production.Workspace/ETL Pipeline.Notebook" --id <job-id>
```
## Scheduling Notebooks
### Create Cron Schedule
```bash
# Run every 30 minutes
fab job run-sch "Production.Workspace/ETL Pipeline.Notebook" \
--type cron \
--interval 30 \
--start 2024-11-15T00:00:00 \
--end 2025-12-31T23:59:00 \
--enable
```
### Create Daily Schedule
```bash
# Run daily at 2 AM and 2 PM
fab job run-sch "Production.Workspace/ETL Pipeline.Notebook" \
--type daily \
--interval 02:00,14:00 \
--start 2024-11-15T00:00:00 \
--end 2025-12-31T23:59:00 \
--enable
```
### Create Weekly Schedule
```bash
# Run Monday and Friday at 9 AM
fab job run-sch "Production.Workspace/Weekly Report.Notebook" \
--type weekly \
--interval 09:00 \
--days Monday,Friday \
--start 2024-11-15T00:00:00 \
--enable
```
### Update Schedule
```bash
# Modify existing schedule
fab job run-update "Production.Workspace/ETL Pipeline.Notebook" \
--id <schedule-id> \
--type daily \
--interval 03:00 \
--enable
# Disable schedule
fab job run-update "Production.Workspace/ETL Pipeline.Notebook" \
--id <schedule-id> \
--disable
```
## Notebook Configuration
### Set Default Lakehouse
```bash
# Via notebook properties
fab set "Production.Workspace/ETL.Notebook" -q lakehouse -i '{
"known_lakehouses": [{"id": "<lakehouse-id>"}],
"default_lakehouse": "<lakehouse-id>",
"default_lakehouse_name": "MainLakehouse",
"default_lakehouse_workspace_id": "<workspace-id>"
}'
```
### Set Default Environment
```bash
fab set "Production.Workspace/ETL.Notebook" -q environment -i '{
"environmentId": "<environment-id>",
"workspaceId": "<workspace-id>"
}'
```
### Set Default Warehouse
```bash
fab set "Production.Workspace/Analytics.Notebook" -q warehouse -i '{
"known_warehouses": [{"id": "<warehouse-id>", "type": "Datawarehouse"}],
"default_warehouse": "<warehouse-id>"
}'
```
## Updating Notebooks
### Update Display Name
```bash
fab set "Production.Workspace/ETL.Notebook" -q displayName -i "ETL Pipeline v2"
```
### Update Description
```bash
fab set "Production.Workspace/ETL.Notebook" -q description -i "Daily ETL pipeline for sales data ingestion and transformation"
```
## Deleting Notebooks
```bash
# Delete with confirmation (interactive)
fab rm "Dev.Workspace/Old Notebook.Notebook"
# Force delete without confirmation
fab rm "Dev.Workspace/Old Notebook.Notebook" -f
```
## Advanced Workflows
### Parameterized Notebook Execution
```python
# In the notebook, create a cell tagged as "parameters" with default values:
date = "2024-01-01"   # default
batch_size = 1000     # default
debug = False         # default
```
```bash
# Execute with different parameters injected into the tagged cell
fab job run "Production.Workspace/Parameterized.Notebook" -P \
date:string=2024-02-15,\
batch_size:int=2000,\
debug:bool=true
```
### Notebook Orchestration Pipeline
```bash
#!/bin/bash
WORKSPACE="Production.Workspace"
DATE=$(date +%Y-%m-%d)
# 1. Run ingestion notebook
echo "Starting data ingestion..."
fab job run "$WORKSPACE/1_Ingest_Data.Notebook" -P date:string=$DATE
# 2. Run transformation notebook
echo "Running transformations..."
fab job run "$WORKSPACE/2_Transform_Data.Notebook" -P date:string=$DATE
# 3. Run analytics notebook
echo "Generating analytics..."
fab job run "$WORKSPACE/3_Analytics.Notebook" -P date:string=$DATE
# 4. Run reporting notebook
echo "Creating reports..."
fab job run "$WORKSPACE/4_Reports.Notebook" -P date:string=$DATE
echo "Pipeline completed for $DATE"
```
### Monitoring Long-Running Notebook
```bash
#!/bin/bash
NOTEBOOK="Production.Workspace/Long Process.Notebook"
# Start job
JOB_ID=$(fab job start "$NOTEBOOK" -P date:string=$(date +%Y-%m-%d) | \
grep -o '"id": "[^"]*"' | head -1 | cut -d'"' -f4)
echo "Started job: $JOB_ID"
# Poll status every 30 seconds
while true; do
STATUS=$(fab job run-status "$NOTEBOOK" --id "$JOB_ID" | \
grep -o '"status": "[^"]*"' | cut -d'"' -f4)
echo "[$(date +%H:%M:%S)] Status: $STATUS"
if [[ "$STATUS" == "Completed" ]] || [[ "$STATUS" == "Failed" ]]; then
break
fi
sleep 30
done
if [[ "$STATUS" == "Completed" ]]; then
echo "Job completed successfully"
exit 0
else
echo "Job failed"
exit 1
fi
```
### Conditional Notebook Execution
```bash
#!/bin/bash
WORKSPACE="Production.Workspace"
# Check if data is ready
DATA_READY=$(fab api "workspaces/<ws-id>/lakehouses/<lh-id>/Files/ready.flag" 2>&1 | grep -c "200")
if [ "$DATA_READY" -eq 1 ]; then
echo "Data ready, running notebook..."
fab job run "$WORKSPACE/Process Data.Notebook" -P date:string=$(date +%Y-%m-%d)
else
echo "Data not ready, skipping execution"
fi
```
## Notebook Definition Structure
Notebook definition contains:
```
NotebookName.Notebook/
├── .platform             # Git integration metadata
├── notebook-content.py   # Python code (or .ipynb format)
└── metadata.json         # Notebook metadata
```
### Query Notebook Content
```bash
NOTEBOOK="Production.Workspace/ETL.Notebook"
# Get Python code content
fab get "$NOTEBOOK" -q "definition.parts[?path=='notebook-content.py'].payload | [0]" | base64 -d
# Get metadata
fab get "$NOTEBOOK" -q "definition.parts[?path=='metadata.json'].payload | [0]" | base64 -d | jq .
```
## Troubleshooting
### Notebook Execution Failures
```bash
# Check recent execution
fab job run-list "Production.Workspace/ETL.Notebook" | head -n 5
# Get detailed error
fab job run-status "Production.Workspace/ETL.Notebook" --id <job-id> -q "error"
# Common issues:
# - Lakehouse not attached
# - Invalid parameters
# - Spark configuration errors
# - Missing dependencies
```
### Parameter Type Mismatches
```bash
# Parameters must match expected types
# ❌ Wrong: -P count:string=100 (should be int)
# ✅ Right: -P count:int=100
# Check notebook definition for parameter types
fab get "Production.Workspace/ETL.Notebook" -q "definition.parts[?path=='notebook-content.py']"
```
### Lakehouse Access Issues
```bash
# Verify lakehouse exists and is accessible
fab exists "Production.Workspace/MainLakehouse.Lakehouse"
# Check notebook's lakehouse configuration
fab get "Production.Workspace/ETL.Notebook" -q "properties.lakehouse"
# Re-attach lakehouse
fab set "Production.Workspace/ETL.Notebook" -q lakehouse -i '{
"known_lakehouses": [{"id": "<lakehouse-id>"}],
"default_lakehouse": "<lakehouse-id>",
"default_lakehouse_name": "MainLakehouse",
"default_lakehouse_workspace_id": "<workspace-id>"
}'
```
## Performance Tips
1. **Use workspace pools**: Faster startup than starter pool
2. **Cache data in lakehouses**: Avoid re-fetching data
3. **Parameterize notebooks**: Reuse logic with different inputs
4. **Monitor execution time**: Set appropriate timeouts
5. **Use async execution**: Don't block on long-running notebooks
6. **Optimize Spark config**: Tune for specific workloads
## Best Practices
1. **Tag parameter cells**: Use "parameters" tag for injected params
2. **Handle failures gracefully**: Add error handling and logging
3. **Version control notebooks**: Export and commit to Git
4. **Use descriptive names**: Clear naming for scheduled jobs
5. **Document parameters**: Add comments explaining expected inputs
6. **Test locally first**: Validate in development workspace
7. **Monitor schedules**: Review execution history regularly
8. **Clean up old notebooks**: Remove unused notebooks
## Security Considerations
1. **Credential management**: Use Key Vault for secrets
2. **Workspace permissions**: Control who can execute notebooks
3. **Parameter validation**: Sanitize inputs in notebook code
4. **Data access**: Respect lakehouse/warehouse permissions
5. **Logging**: Don't log sensitive information
## Related Scripts
- `scripts/run_notebook_pipeline.py` - Orchestrate multiple notebooks
- `scripts/monitor_notebook.py` - Monitor long-running executions
- `scripts/export_notebook.py` - Export with validation
- `scripts/schedule_notebook.py` - Simplified scheduling interface


@@ -0,0 +1,47 @@
# Querying Data
## Query a Semantic Model (DAX)
```bash
# 1. Get workspace and model IDs
fab get "ws.Workspace" -q "id"
fab get "ws.Workspace/Model.SemanticModel" -q "id"
# 2. Execute DAX query
fab api -A powerbi "groups/<ws-id>/datasets/<model-id>/executeQueries" \
-X post -i '{"queries":[{"query":"EVALUATE TOPN(10, '\''TableName'\'')"}]}'
```
Or use the helper script:
```bash
python3 scripts/execute_dax.py "ws.Workspace/Model.SemanticModel" -q "EVALUATE TOPN(10, 'Table')"
```
## Query a Lakehouse Table
Lakehouse tables cannot be queried directly through the REST API. Create a Direct Lake semantic model first and query it with DAX, or use the SQL endpoint (below) from an external client.
```bash
# 1. Create Direct Lake model from lakehouse table
python3 scripts/create_direct_lake_model.py \
"src.Workspace/LH.Lakehouse" \
"dest.Workspace/Model.SemanticModel" \
-t schema.table
# 2. Query via DAX
python3 scripts/execute_dax.py "dest.Workspace/Model.SemanticModel" -q "EVALUATE TOPN(10, 'table')"
# 3. (Optional) Delete temporary model
fab rm "dest.Workspace/Model.SemanticModel" -f
```
## Get Lakehouse SQL Endpoint
For external SQL clients:
```bash
fab get "ws.Workspace/LH.Lakehouse" -q "properties.sqlEndpointProperties"
```
Returns `connectionString` and `id` for SQL connections.
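For example, with `sqlcmd` and Microsoft Entra authentication; a sketch that assumes the returned `connectionString` is the server name and the lakehouse name doubles as the database name:
```bash
SQL_SERVER=$(fab get "ws.Workspace/LH.Lakehouse" -q "properties.sqlEndpointProperties.connectionString" | tr -d '"')
sqlcmd -S "$SQL_SERVER" -d "LH" -G -Q "SELECT TOP 10 * FROM schema.table"
```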


@@ -0,0 +1,454 @@
# Fabric CLI Quick Start Guide
Real working examples using Fabric workspaces and items. These commands are ready to copy-paste and modify for your own workspaces.
## Finding Items
### List Workspaces
```bash
# List all workspaces
fab ls
# List with details (shows IDs, types, etc.)
fab ls -l
# Find specific workspace
fab ls | grep "Sales"
```
### List Items in Workspace
```bash
# List all items in workspace
fab ls "Sales.Workspace"
# List with details (shows IDs, modification dates)
fab ls "Sales.Workspace" -l
# Filter by type
fab ls "Sales.Workspace" | grep "Notebook"
fab ls "Sales.Workspace" | grep "SemanticModel"
fab ls "Sales.Workspace" | grep "Lakehouse"
```
### Check if Item Exists
```bash
# Check workspace exists
fab exists "Sales.Workspace"
# Check specific item exists
fab exists "Sales.Workspace/Sales Model.SemanticModel"
fab exists "Sales.Workspace/SalesLH.Lakehouse"
fab exists "Sales.Workspace/ETL - Extract.Notebook"
```
### Get Item Details
```bash
# Get basic properties
fab get "Sales.Workspace/Sales Model.SemanticModel"
# Get all properties (verbose)
fab get "Sales.Workspace/Sales Model.SemanticModel" -v
# Get specific property (workspace ID)
fab get "Sales.Workspace" -q "id"
# Get specific property (model ID)
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "id"
# Get display name
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "displayName"
```
## Working with Semantic Models
### Get Model Information
```bash
# Get model definition (full TMDL structure)
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "definition"
# Save definition to file
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "definition" > sales-model-definition.json
# Get model creation date
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "properties.createdDateTime"
# Get model type (DirectLake, Import, etc.)
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "properties.mode"
```
### Check Refresh Status
```bash
# First, get the workspace ID
fab get "Sales.Workspace" -q "id"
# Returns: a1b2c3d4-e5f6-7890-abcd-ef1234567890
# Then get the model ID
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "id"
# Returns: 12345678-abcd-ef12-3456-789abcdef012
# Now use those IDs to get latest refresh (put $top in the URL)
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/refreshes?\$top=1"
# Get full refresh history
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/refreshes"
```
### Query Model with DAX
```bash
# First, get the model definition to see table/column names
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "definition"
# Get the workspace and model IDs
fab get "Sales.Workspace" -q "id"
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "id"
# Execute DAX query (using proper table qualification)
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/executeQueries" -X post -i '{"queries":[{"query":"EVALUATE TOPN(1, '\''Orders'\'', '\''Orders'\''[OrderDate], DESC)"}],"serializerSettings":{"includeNulls":true}}'
# Query top 5 records from a table
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/executeQueries" -X post -i '{"queries":[{"query":"EVALUATE TOPN(5, '\''Orders'\'')"}],"serializerSettings":{"includeNulls":true}}'
```
### Trigger Model Refresh
```bash
# Get workspace and model IDs
fab get "Sales.Workspace" -q "id"
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "id"
# Trigger full refresh
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/refreshes" -X post -i '{"type": "Full", "commitMode": "Transactional"}'
# Monitor refresh status
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/refreshes?\$top=1"
```
## Working with Notebooks
### List Notebooks
```bash
# List all notebooks in workspace
fab ls "Sales.Workspace" | grep "Notebook"
# Get specific notebook details
fab get "Sales.Workspace/ETL - Extract.Notebook"
# Get notebook ID
fab get "Sales.Workspace/ETL - Extract.Notebook" -q "id"
```
### Run Notebook
```bash
# Run notebook synchronously (wait for completion)
fab job run "Sales.Workspace/ETL - Extract.Notebook"
# Run with timeout (300 seconds = 5 minutes)
fab job run "Sales.Workspace/ETL - Extract.Notebook" --timeout 300
# Run with parameters
fab job run "Sales.Workspace/ETL - Extract.Notebook" -P \
date:string=2025-10-17,\
debug:bool=true
```
### Run Notebook Asynchronously
```bash
# Start notebook and return immediately
fab job start "Sales.Workspace/ETL - Extract.Notebook"
# Check execution history
fab job run-list "Sales.Workspace/ETL - Extract.Notebook"
# Check specific job status (replace <job-id> with actual ID)
fab job run-status "Sales.Workspace/ETL - Extract.Notebook" --id <job-id>
```
### Get Notebook Definition
```bash
# Get full notebook definition
fab get "Sales.Workspace/ETL - Extract.Notebook" -q "definition"
# Save to file
fab get "Sales.Workspace/ETL - Extract.Notebook" -q "definition" > etl-extract-notebook.json
# Get notebook code content
fab get "Sales.Workspace/ETL - Extract.Notebook" -q "definition.parts[?path=='notebook-content.py'].payload | [0]" | base64 -d
```
## Working with Lakehouses
### Browse Lakehouse
```bash
# List lakehouse contents
fab ls "Sales.Workspace/SalesLH.Lakehouse"
# List Files directory
fab ls "Sales.Workspace/SalesLH.Lakehouse/Files"
# List specific folder in Files
fab ls "Sales.Workspace/SalesLH.Lakehouse/Files/2025/10"
# List Tables
fab ls "Sales.Workspace/SalesLH.Lakehouse/Tables"
# List with details (shows sizes, modified dates)
fab ls "Sales.Workspace/SalesLH.Lakehouse/Tables" -l
# List specific schema tables
fab ls "Sales.Workspace/SalesLH.Lakehouse/Tables/bronze"
fab ls "Sales.Workspace/SalesLH.Lakehouse/Tables/gold"
```
### Get Table Schema
```bash
# View table schema
fab table schema "Sales.Workspace/SalesLH.Lakehouse/Tables/bronze/raw_orders"
fab table schema "Sales.Workspace/SalesLH.Lakehouse/Tables/gold/orders"
# Save schema to file
fab table schema "Sales.Workspace/SalesLH.Lakehouse/Tables/gold/orders" > orders-schema.json
```
### Check Table Last Modified
```bash
# List tables with modification times
fab ls "Sales.Workspace/SalesLH.Lakehouse/Tables/gold" -l
# Get specific table details
fab get "Sales.Workspace/SalesLH.Lakehouse/Tables/gold/orders"
```
## Working with Reports
### List Reports
```bash
# List all reports
fab ls "Sales.Workspace" | grep "Report"
# Get report details
fab get "Sales.Workspace/Sales Dashboard.Report"
# Get report ID
fab get "Sales.Workspace/Sales Dashboard.Report" -q "id"
```
### Export Report Definition
```bash
# Get report definition as JSON
fab get "Sales.Workspace/Sales Dashboard.Report" -q "definition" > sales-report.json
# Export report to local directory (creates PBIR structure)
fab export "Sales.Workspace/Sales Dashboard.Report" -o ./reports-backup -f
```
### Get Report Metadata
```bash
# Get connected semantic model ID
fab get "Sales.Workspace/Sales Dashboard.Report" -q "properties.datasetId"
# Get report connection string
fab get "Sales.Workspace/Sales Dashboard.Report" -q "definition.parts[?path=='definition.pbir'].payload.datasetReference"
```
## Download and Re-upload Workflows
### Backup Semantic Model
```bash
# 1. Get model definition
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "definition" > backup-sales-model-$(date +%Y%m%d).json
# 2. Get model metadata
fab get "Sales.Workspace/Sales Model.SemanticModel" > backup-sales-model-metadata-$(date +%Y%m%d).json
```
### Export and Import Notebook
```bash
# Export notebook
fab export "Sales.Workspace/ETL - Extract.Notebook" -o ./notebooks-backup
# Import to another workspace (or same workspace with different name)
fab import "Dev.Workspace/ETL Extract Copy.Notebook" -i ./notebooks-backup/ETL\ -\ Extract.Notebook
```
### Copy Items Between Workspaces
```bash
# Copy semantic model
fab cp "Sales.Workspace/Sales Model.SemanticModel" "Dev.Workspace"
# Copy with new name
fab cp "Sales.Workspace/Sales Model.SemanticModel" "Dev.Workspace/Sales Model Test.SemanticModel"
# Copy notebook
fab cp "Sales.Workspace/ETL - Extract.Notebook" "Dev.Workspace"
# Copy report
fab cp "Sales.Workspace/Sales Dashboard.Report" "Dev.Workspace"
```
## Combined Workflows
### Complete Model Status Check
```bash
# Check last refresh
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/refreshes?\$top=1"
# Check latest data in model
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/executeQueries" -X post -i '{"queries":[{"query":"EVALUATE TOPN(1, '\''Orders'\'', '\''Orders'\''[OrderDate], DESC)"}],"serializerSettings":{"includeNulls":true}}'
# Check lakehouse data freshness
fab ls "Sales.Workspace/SalesLH.Lakehouse/Tables/gold/orders" -l
```
### Check All Notebooks in Workspace
```bash
# List all notebooks
fab ls "Sales.Workspace" | grep Notebook
# Check execution history for each
fab job run-list "Sales.Workspace/ETL - Extract.Notebook"
fab job run-list "Sales.Workspace/ETL - Transform.Notebook"
```
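To sweep every notebook without listing each one by hand, a loop helps; a sketch assuming `fab ls` prints one bare item name per line:
```bash
fab ls "Sales.Workspace" | grep "\.Notebook$" | sed 's/\.Notebook$//' |
while IFS= read -r NB; do
  echo "=== $NB ==="
  fab job run-list "Sales.Workspace/$NB.Notebook" | head -n 3
done
```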
### Monitor Lakehouse Data Freshness
```bash
# Check gold layer tables
fab ls "Sales.Workspace/SalesLH.Lakehouse/Tables/gold" -l
# Check bronze layer tables
fab ls "Sales.Workspace/SalesLH.Lakehouse/Tables/bronze" -l
# Check latest files
fab ls "Sales.Workspace/SalesLH.Lakehouse/Files/2025/10" -l
```
## Tips and Tricks
### Get IDs for API Calls
```bash
# Get workspace ID
fab get "Sales.Workspace" -q "id"
# Get model ID
fab get "Sales.Workspace/Sales Model.SemanticModel" -q "id"
# Get lakehouse ID
fab get "Sales.Workspace/SalesLH.Lakehouse" -q "id"
# Then use the IDs directly in API calls
fab api "workspaces/a1b2c3d4-e5f6-7890-abcd-ef1234567890/items"
fab api -A powerbi "groups/a1b2c3d4-e5f6-7890-abcd-ef1234567890/datasets/12345678-abcd-ef12-3456-789abcdef012/refreshes"
```
### Pipe to jq for Pretty JSON
```bash
# Pretty print JSON output
fab get "Sales.Workspace/Sales Model.SemanticModel" | jq .
# Extract specific fields
fab get "Sales.Workspace/Sales Model.SemanticModel" | jq '{id: .id, name: .displayName, created: .properties.createdDateTime}'
# Get workspace ID first, then filter arrays
fab get "Sales.Workspace" -q "id"
fab api "workspaces/a1b2c3d4-e5f6-7890-abcd-ef1234567890/items" | jq '.value[] | select(.type=="Notebook") | .displayName'
```
### Use with grep for Filtering
```bash
# Find items by pattern
fab ls "Sales.Workspace" | grep -i "etl"
fab ls "Sales.Workspace" | grep -i "sales"
# Count items by type
fab ls "Sales.Workspace" | grep -c "Notebook"
fab ls "Sales.Workspace" | grep -c "SemanticModel"
```
### Create Aliases for Common Commands
```bash
# Add to ~/.bashrc or ~/.zshrc
alias sales-ls='fab ls "Sales.Workspace"'
alias sales-notebooks='fab ls "Sales.Workspace" | grep Notebook'
alias sales-refresh='fab api -A powerbi "groups/<ws-id>/datasets/<model-id>/refreshes?\$top=1"'
# Then use:
sales-ls
sales-notebooks
sales-refresh
```
## Common Patterns
### Get All Items of a Type
```bash
# Get workspace ID first
fab get "Sales.Workspace" -q "id"
# Get all notebooks
fab api "workspaces/a1b2c3d4-e5f6-7890-abcd-ef1234567890/items" -q "value[?type=='Notebook']"
# Get all semantic models
fab api "workspaces/a1b2c3d4-e5f6-7890-abcd-ef1234567890/items" -q "value[?type=='SemanticModel']"
# Get all lakehouses
fab api "workspaces/a1b2c3d4-e5f6-7890-abcd-ef1234567890/items" -q "value[?type=='Lakehouse']"
# Get all reports
fab api "workspaces/a1b2c3d4-e5f6-7890-abcd-ef1234567890/items" -q "value[?type=='Report']"
```
### Export Entire Workspace
```bash
# Export all items in workspace
fab export "Sales.Workspace" -o ./sales-workspace-backup -a
# This creates a full backup with all items
```
### Find Items by Name Pattern
```bash
# Get workspace ID first
fab get "Sales.Workspace" -q "id"
# Find items with "ETL" in name
fab api "workspaces/a1b2c3d4-e5f6-7890-abcd-ef1234567890/items" -q "value[?contains(displayName, 'ETL')]"
# Find items with "Sales" in name
fab api "workspaces/a1b2c3d4-e5f6-7890-abcd-ef1234567890/items" -q "value[?contains(displayName, 'Sales')]"
```
## Next Steps
- See [semantic-models.md](./semantic-models.md) for advanced model operations
- See [notebooks.md](./notebooks.md) for notebook scheduling and orchestration
- See [reports.md](./reports.md) for report deployment workflows
- See [scripts/README.md](../scripts/README.md) for helper scripts

File diff suppressed because it is too large


@@ -0,0 +1,274 @@
# Report Operations
## Get Report Info
```bash
# Check exists
fab exists "ws.Workspace/Report.Report"
# Get properties
fab get "ws.Workspace/Report.Report"
# Get ID
fab get "ws.Workspace/Report.Report" -q "id"
```
## Get Report Definition
```bash
# Full definition
fab get "ws.Workspace/Report.Report" -q "definition"
# Save to file
fab get "ws.Workspace/Report.Report" -q "definition" -o /tmp/report-def.json
# Specific parts
fab get "ws.Workspace/Report.Report" -q "definition.parts[?path=='definition/report.json'].payload | [0]"
```
## Get Connected Model
```bash
# Get model reference from definition.pbir
fab get "ws.Workspace/Report.Report" -q "definition.parts[?contains(path, 'definition.pbir')].payload | [0]"
```
Output shows `byConnection.connectionString` with `semanticmodelid`.
## Export Report
1. Export to local directory:
```bash
fab export "ws.Workspace/Report.Report" -o /tmp/exports -f
```
2. Creates structure:
```
Report.Report/
├── .platform
├── definition.pbir
└── definition/
├── report.json
├── version.json
└── pages/
└── {page-id}/
├── page.json
└── visuals/{visual-id}/visual.json
```
## Import Report
1. Import from local PBIP:
```bash
fab import "ws.Workspace/Report.Report" -i /tmp/exports/Report.Report -f
```
2. Import with new name:
```bash
fab import "ws.Workspace/NewName.Report" -i /tmp/exports/Report.Report -f
```
## Copy Report Between Workspaces
```bash
fab cp "dev.Workspace/Report.Report" "prod.Workspace" -f
```
## Create Blank Report
1. Get model ID:
```bash
fab get "ws.Workspace/Model.SemanticModel" -q "id"
```
2. Create report via API:
```bash
WS_ID=$(fab get "ws.Workspace" -q "id" | tr -d '"')
fab api -X post "workspaces/$WS_ID/reports" -i '{
"displayName": "New Report",
"datasetId": "<model-id>"
}'
```
## Update Report Properties
```bash
# Rename
fab set "ws.Workspace/Report.Report" -q displayName -i "New Name"
# Update description
fab set "ws.Workspace/Report.Report" -q description -i "Description text"
```
## Rebind to Different Model
1. Get new model ID:
```bash
fab get "ws.Workspace/NewModel.SemanticModel" -q "id"
```
2. Rebind:
```bash
fab set "ws.Workspace/Report.Report" -q semanticModelId -i "<new-model-id>"
```
## Delete Report
```bash
fab rm "ws.Workspace/Report.Report" -f
```
## List Pages
```bash
fab get "ws.Workspace/Report.Report" -q "definition.parts[?contains(path, 'page.json')].path"
```
## List Visuals
```bash
fab get "ws.Workspace/Report.Report" -q "definition.parts[?contains(path, '/visuals/')].path"
```
## Count Visuals by Type
1. Export visuals:
```bash
fab get "ws.Workspace/Report.Report" -q "definition.parts[?contains(path,'/visuals/')]" > /tmp/visuals.json
```
2. Count by type:
```bash
jq -r '.[] | .payload.visual.visualType' < /tmp/visuals.json | sort | uniq -c | sort -rn
```
## Extract Fields Used in Report
1. Export visuals (if not done):
```bash
fab get "ws.Workspace/Report.Report" -q "definition.parts[?contains(path,'/visuals/')]" > /tmp/visuals.json
```
2. List unique fields:
```bash
jq -r '[.[] | (.payload.visual.query.queryState // {} | to_entries[] | .value.projections[]? | if .field.Column then "\(.field.Column.Expression.SourceRef.Entity).\(.field.Column.Property)" elif .field.Measure then "\(.field.Measure.Expression.SourceRef.Entity).\(.field.Measure.Property)" else empty end)] | unique | sort | .[]' < /tmp/visuals.json
```
## Validate Fields Against Model
1. Export report:
```bash
fab export "ws.Workspace/Report.Report" -o /tmp/report -f
```
2. Extract field references:
```bash
find /tmp/report -name "visual.json" -exec grep -B2 '"Property":' {} \; | \
grep -E '"Entity":|"Property":' | paste -d' ' - - | \
sed 's/.*"Entity": "\([^"]*\)".*"Property": "\([^"]*\)".*/\1.\2/' | sort -u
```
3. Compare against model definition to find missing fields.
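A minimal sketch of step 3, assuming the field list from step 2 was saved to `/tmp/report-fields.txt` (one `Table.Field` per line) and that a plain name match against the exported TMDL is good enough:
```bash
# Hypothetical check: flag report fields whose names never appear in the model's table TMDL
fab export "ws.Workspace/Model.SemanticModel" -o /tmp/model -f
while IFS= read -r FIELD; do
  NAME="${FIELD#*.}"   # strip the "Table." prefix
  grep -rqF "$NAME" /tmp/model/Model.SemanticModel/definition/tables/ \
    || echo "Not found in model: $FIELD"
done < /tmp/report-fields.txt
```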
## Report Permissions
1. Get IDs:
```bash
WS_ID=$(fab get "ws.Workspace" -q "id" | tr -d '"')
REPORT_ID=$(fab get "ws.Workspace/Report.Report" -q "id" | tr -d '"')
```
2. List users:
```bash
fab api -A powerbi "groups/$WS_ID/reports/$REPORT_ID/users"
```
3. Add user:
```bash
fab api -A powerbi "groups/$WS_ID/reports/$REPORT_ID/users" -X post -i '{
"emailAddress": "user@domain.com",
"reportUserAccessRight": "View"
}'
```
## Deploy Report (Dev to Prod)
1. Export from dev:
```bash
fab export "dev.Workspace/Report.Report" -o /tmp/deploy -f
```
2. Import to prod:
```bash
fab import "prod.Workspace/Report.Report" -i /tmp/deploy/Report.Report -f
```
3. Verify:
```bash
fab exists "prod.Workspace/Report.Report"
```
## Clone Report with Different Model
1. Export source:
```bash
fab export "ws.Workspace/Template.Report" -o /tmp/clone -f
```
2. Edit `/tmp/clone/Template.Report/definition.pbir` to update `semanticmodelid` (see the sed sketch after this list)
3. Import as new report:
```bash
fab import "ws.Workspace/NewReport.Report" -i /tmp/clone/Template.Report -f
```
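Step 2 can also be scripted; a sketch, assuming both model IDs are known (placeholder values shown):
```bash
OLD_ID="<old-model-id>"
NEW_ID=$(fab get "ws.Workspace/OtherModel.SemanticModel" -q "id" | tr -d '"')
# -i.bak keeps a backup copy and works with both GNU and BSD sed
sed -i.bak "s/semanticmodelid=$OLD_ID/semanticmodelid=$NEW_ID/" /tmp/clone/Template.Report/definition.pbir
```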
## Troubleshooting
### Report Not Found
```bash
fab exists "ws.Workspace"
fab ls "ws.Workspace" | grep -i report
```
### Model Binding Issues
```bash
# Check current binding
fab get "ws.Workspace/Report.Report" -q "definition.parts[?contains(path, 'definition.pbir')].payload | [0]"
# Rebind
fab set "ws.Workspace/Report.Report" -q semanticModelId -i "<model-id>"
```
### Import Fails
```bash
# Verify structure
ls -R /tmp/exports/Report.Report/
# Check definition is valid JSON
fab get "ws.Workspace/Report.Report" -q "definition" | jq . > /dev/null && echo "Valid"
```


@@ -0,0 +1,583 @@
# Semantic Model Operations
Comprehensive guide for working with semantic models (Power BI datasets) using the Fabric CLI.
## Overview
Semantic models in Fabric use TMDL (Tabular Model Definition Language) format for their definitions. This guide covers getting, updating, exporting, and managing semantic models.
## Getting Model Information
### Basic Model Info
```bash
# Check if model exists
fab exists "Production.Workspace/Sales.SemanticModel"
# Get model properties
fab get "Production.Workspace/Sales.SemanticModel"
# Get model with all details (verbose)
fab get "Production.Workspace/Sales.SemanticModel" -v
# Get only model ID
fab get "Production.Workspace/Sales.SemanticModel" -q "id"
```
### Get Model Definition
The model definition contains all TMDL parts (tables, measures, relationships, etc.):
```bash
# Get full definition (all TMDL parts)
fab get "Production.Workspace/Sales.SemanticModel" -q "definition"
# Save definition to file
fab get "Production.Workspace/Sales.SemanticModel" -q "definition" -o /tmp/model-def.json
```
### Get Specific TMDL Parts
```bash
# Get model.tmdl (main model properties)
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[?path=='model.tmdl'].payload | [0]"
# Get specific table definition
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[?path=='definition/tables/Customers.tmdl'].payload | [0]"
# Get all table definitions
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[?starts_with(path, 'definition/tables/')]"
# Get relationships.tmdl
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[?path=='definition/relationships.tmdl'].payload | [0]"
# Get functions.tmdl (DAX functions)
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[?path=='definition/functions.tmdl'].payload | [0]"
# Get all definition part paths (for reference)
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[].path"
```
## Exporting Models
### Export as PBIP (Power BI Project)
PBIP format is best for local development in Power BI Desktop or Tabular Editor:
```bash
# Export using the export script
python3 scripts/export_semantic_model_as_pbip.py \
"Production.Workspace/Sales.SemanticModel" -o /tmp/exports
```
### Export as TMDL
The export script creates PBIP format which includes TMDL in the definition folder:
```bash
python3 scripts/export_semantic_model_as_pbip.py \
"Production.Workspace/Sales.SemanticModel" -o /tmp/exports
# TMDL files will be in: /tmp/exports/Sales.SemanticModel/definition/
```
### Export Specific Parts Only
```bash
# Export just tables
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[?starts_with(path, 'definition/tables/')]" -o /tmp/tables.json
# Export just measures (within tables)
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[?contains(path, '/tables/')]" | grep -A 20 "measure"
```
## Listing Model Contents
```bash
# List all items in model (if OneLake enabled)
fab ls "Production.Workspace/Sales.SemanticModel"
# Query model structure via API
fab api workspaces -q "value[?displayName=='Production'].id | [0]" | xargs -I {} \
fab api "workspaces/{}/items" -q "value[?type=='SemanticModel']"
```
## Updating Model Definitions
**CRITICAL**: When updating semantic models, you must:
1. Get the full definition
2. Modify the specific parts you want to change
3. Include ALL parts in the update request (modified + unmodified)
4. Never include `.platform` file
5. Test immediately
### Update Workflow
```bash
# 1. Get workspace and model IDs
WS_ID=$(fab get "Production.Workspace" -q "id" | tr -d '"')
MODEL_ID=$(fab get "Production.Workspace/Sales.SemanticModel" -q "id" | tr -d '"')
# 2. Get current definition
fab get "Production.Workspace/Sales.SemanticModel" -q "definition" -o /tmp/current-def.json
# 3. Modify definition (edit JSON file or use script)
# ... modify /tmp/current-def.json ...
# 4. Wrap definition in update request
cat > /tmp/update-request.json <<EOF
{
"definition": $(cat /tmp/current-def.json)
}
EOF
# 5. Update via API
fab api -X post "workspaces/$WS_ID/semanticModels/$MODEL_ID/updateDefinition" \
-i /tmp/update-request.json \
--show_headers
# 6. Extract operation ID from Location header and poll status
OPERATION_ID="<extracted-from-Location-header>"
fab api "operations/$OPERATION_ID"
```
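Steps 5 and 6 can be combined by capturing the headers from the update call and pulling the operation ID out of the `Location` header; a sketch, since the exact header formatting is an assumption:
```bash
RESPONSE=$(fab api -X post "workspaces/$WS_ID/semanticModels/$MODEL_ID/updateDefinition" \
  -i /tmp/update-request.json --show_headers)
OPERATION_ID=$(echo "$RESPONSE" | grep -io 'location: [^[:space:]]*' | grep -oiE '[0-9a-f-]{36}' | head -1)
fab api "operations/$OPERATION_ID" -q "status"
```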
### Example: Add a Measure
```python
# Python script to add measure to definition
import json
with open('/tmp/current-def.json', 'r') as f:
definition = json.load(f)
# Find the table's TMDL part
for part in definition['parts']:
if part['path'] == 'definition/tables/Sales.tmdl':
# Decode base64 content
import base64
tmdl_content = base64.b64decode(part['payload']).decode('utf-8')
# Add measure (simplified - real implementation needs proper TMDL syntax)
measure_tmdl = """
measure 'Total Revenue' = SUM(Sales[Amount])
formatString: #,0.00
displayFolder: "KPIs"
"""
tmdl_content += measure_tmdl
# Re-encode
part['payload'] = base64.b64encode(tmdl_content.encode('utf-8')).decode('utf-8')
# Save modified definition
with open('/tmp/modified-def.json', 'w') as f:
json.dump(definition, f)
```
## Executing DAX Queries
Use Power BI API to execute DAX queries against semantic models:
```bash
# Get model ID
MODEL_ID=$(fab get "Production.Workspace/Sales.SemanticModel" -q "id" | tr -d '"')
# Execute simple DAX query
fab api -A powerbi "datasets/$MODEL_ID/executeQueries" -X post -i '{
"queries": [{
"query": "EVALUATE VALUES(Date[Year])"
}]
}'
# Execute TOPN query
fab api -A powerbi "datasets/$MODEL_ID/executeQueries" -X post -i '{
"queries": [{
"query": "EVALUATE TOPN(10, Sales, Sales[Amount], DESC)"
}]
}'
# Execute multiple queries
fab api -A powerbi "datasets/$MODEL_ID/executeQueries" -X post -i '{
"queries": [
{"query": "EVALUATE VALUES(Date[Year])"},
{"query": "EVALUATE SUMMARIZE(Sales, Date[Year], \"Total\", SUM(Sales[Amount]))"}
],
"serializerSettings": {
"includeNulls": false
}
}'
# Execute query with parameters
fab api -A powerbi "datasets/$MODEL_ID/executeQueries" -X post -i '{
"queries": [{
"query": "EVALUATE FILTER(Sales, Sales[Year] = @Year)",
"parameters": [
{"name": "@Year", "value": "2024"}
]
}]
}'
```
### Using the DAX Query Script
```bash
python3 scripts/execute_dax.py \
--workspace "Production.Workspace" \
--model "Sales.SemanticModel" \
--query "EVALUATE TOPN(10, Sales)" \
--output /tmp/results.json
```
## Refreshing Models
```bash
# Get workspace and model IDs
WS_ID=$(fab get "Production.Workspace" -q "id" | tr -d '"')
MODEL_ID=$(fab get "Production.Workspace/Sales.SemanticModel" -q "id" | tr -d '"')
# Trigger full refresh
fab api -A powerbi "groups/$WS_ID/datasets/$MODEL_ID/refreshes" -X post -i '{"type":"Full"}'
# Check latest refresh status
fab api -A powerbi "groups/$WS_ID/datasets/$MODEL_ID/refreshes?\$top=1"
# Get refresh history
fab api -A powerbi "groups/$WS_ID/datasets/$MODEL_ID/refreshes"
# Cancel refresh
REFRESH_ID="<refresh-request-id>"
fab api -A powerbi "groups/$WS_ID/datasets/$MODEL_ID/refreshes/$REFRESH_ID" -X delete
```
## Model Refresh Schedule
```bash
MODEL_ID=$(fab get "Production.Workspace/Sales.SemanticModel" -q "id" | tr -d '"')
# Get current schedule
fab api -A powerbi "datasets/$MODEL_ID/refreshSchedule"
# Update schedule (daily at 2 AM)
fab api -A powerbi "datasets/$MODEL_ID/refreshSchedule" -X patch -i '{
"enabled": true,
"days": ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"],
"times": ["02:00"],
"localTimeZoneId": "UTC"
}'
# Disable schedule
fab api -A powerbi "datasets/$MODEL_ID/refreshSchedule" -X patch -i '{
"enabled": false
}'
```
## Copying Models
```bash
# Copy semantic model between workspaces (full paths required)
fab cp "Dev.Workspace/Sales.SemanticModel" "Production.Workspace/Sales.SemanticModel" -f
# Copy with new name
fab cp "Dev.Workspace/Sales.SemanticModel" "Production.Workspace/SalesProduction.SemanticModel" -f
# Note: Both source and destination must include workspace.Workspace/model.SemanticModel
# This copies the definition, not data or refreshes
```
## Model Deployment Workflow
### Dev to Production
```bash
#!/bin/bash
DEV_WS="Development.Workspace"
PROD_WS="Production.Workspace"
MODEL_NAME="Sales.SemanticModel"
# 1. Export from dev
fab export "$DEV_WS/$MODEL_NAME" -o /tmp/deployment
# 2. Test locally (optional - requires Power BI Desktop)
# Open the exported project under /tmp/deployment/Sales.SemanticModel in Power BI Desktop
# 3. Import to production
fab import "$PROD_WS/$MODEL_NAME" -i /tmp/deployment/$MODEL_NAME
# 4. Trigger refresh in production
PROD_MODEL_ID=$(fab get "$PROD_WS/$MODEL_NAME" -q "id" | tr -d '"')
fab api -A powerbi "datasets/$PROD_MODEL_ID/refreshes" -X post -i '{"type": "Full"}'
# 5. Monitor refresh
sleep 10
fab api -A powerbi "datasets/$PROD_MODEL_ID/refreshes" -q "value[0]"
```
## Working with Model Metadata
### Update Display Name
```bash
fab set "Production.Workspace/Sales.SemanticModel" -q displayName -i "Sales Analytics Model"
```
### Update Description
```bash
fab set "Production.Workspace/Sales.SemanticModel" -q description -i "Primary sales analytics semantic model for production reporting"
```
## Advanced Patterns
### Extract All Measures
```bash
# Get all table definitions containing measures
fab get "Production.Workspace/Sales.SemanticModel" -q "definition.parts[?contains(path, '/tables/')]" -o /tmp/tables.json
# Process with script to extract measures
python3 << 'EOF'
import json
import base64
with open('/tmp/tables.json', 'r') as f:
parts = json.load(f)
measures = []
for part in parts:
if 'tables' in part['path']:
content = base64.b64decode(part['payload']).decode('utf-8')
# Extract measure definitions (simple regex - real parser needed for production)
import re
measure_blocks = re.findall(r'measure\s+[^\n]+\s*=.*?(?=\n\s*(?:measure|column|$))', content, re.DOTALL)
measures.extend(measure_blocks)
for i, measure in enumerate(measures, 1):
print(f"\n--- Measure {i} ---")
print(measure)
EOF
```
### Compare Models (Diff)
```bash
# Export both models
fab get "Production.Workspace/Sales.SemanticModel" -q "definition" -o /tmp/model1-def.json
fab get "Dev.Workspace/Sales.SemanticModel" -q "definition" -o /tmp/model2-def.json
# Use diff tool
diff <(jq -S . /tmp/model1-def.json) <(jq -S . /tmp/model2-def.json)
# jq -S sorts keys for consistent comparison
```
### Backup Model Definition
```bash
#!/bin/bash
WORKSPACE="Production.Workspace"
MODEL="Sales.SemanticModel"
BACKUP_DIR="/backups/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
# Export as multiple formats for redundancy
fab get "$WORKSPACE/$MODEL" -q "definition" -o "$BACKUP_DIR/definition.json"
# Export as PBIP
python3 scripts/export_semantic_model_as_pbip.py \
"$WORKSPACE/$MODEL" -o "$BACKUP_DIR/pbip"
# Save metadata
fab get "$WORKSPACE/$MODEL" -o "$BACKUP_DIR/metadata.json"
echo "Backup completed: $BACKUP_DIR"
```
## TMDL Structure Reference
A semantic model's TMDL definition consists of these parts:
```
model.tmdl # Model properties, culture, compatibility
.platform # Git integration metadata (exclude from updates)
definition/
├── model.tmdl # Alternative location for model properties
├── database.tmdl # Database properties
├── roles.tmdl # Row-level security roles
├── relationships.tmdl # Relationships between tables
├── functions.tmdl # DAX user-defined functions
├── expressions/ # M queries for data sources
│ ├── Source1.tmdl
│ └── Source2.tmdl
└── tables/ # Table definitions
├── Customers.tmdl # Columns, measures, hierarchies
├── Sales.tmdl
├── Products.tmdl
└── Date.tmdl
```
### Common TMDL Parts to Query
```bash
MODEL="Production.Workspace/Sales.SemanticModel"
# Model properties
fab get "$MODEL" -q "definition.parts[?path=='model.tmdl'].payload | [0]"
# Roles and RLS
fab get "$MODEL" -q "definition.parts[?path=='definition/roles.tmdl'].payload | [0]"
# Relationships
fab get "$MODEL" -q "definition.parts[?path=='definition/relationships.tmdl'].payload | [0]"
# Data source expressions
fab get "$MODEL" -q "definition.parts[?starts_with(path, 'definition/expressions/')]"
# All tables
fab get "$MODEL" -q "definition.parts[?starts_with(path, 'definition/tables/')].path"
```
## Troubleshooting
### Model Not Found
```bash
# Verify workspace exists
fab exists "Production.Workspace"
# List semantic models in workspace
WS_ID=$(fab get "Production.Workspace" -q "id" | tr -d '"')
fab api "workspaces/$WS_ID/items" -q "value[?type=='SemanticModel']"
```
### Update Definition Fails
Common issues:
1. **Included `.platform` file**: Never include this in updates
2. **Missing parts**: Must include ALL parts, not just modified ones
3. **Invalid TMDL syntax**: Validate TMDL before updating
4. **Encoding issues**: Ensure base64 encoding is correct
```bash
# Debug update operation
fab api "operations/$OPERATION_ID" -q "error"
```
### DAX Query Errors
```bash
# Check model is online
fab get "Production.Workspace/Sales.SemanticModel" -q "properties"
# Try simple query first
MODEL_ID=$(fab get "Production.Workspace/Sales.SemanticModel" -q "id" | tr -d '"')
fab api -A powerbi "datasets/$MODEL_ID/executeQueries" -X post -i '{
"queries": [{"query": "EVALUATE {1}"}]
}'
```
## Storage Mode
Check the table partition mode to determine whether a model is Direct Lake, Import, or DirectQuery.
```bash
# Get table definition and check partition mode
fab get "ws.Workspace/Model.SemanticModel" -q "definition.parts[?contains(path, 'tables/TableName')].payload | [0]"
```
Output shows partition type:
```
# Direct Lake
partition TableName = entity
mode: directLake
source
entityName: table_name
schemaName: schema
expressionSource: DatabaseQuery
# Import
partition TableName = m
mode: import
source =
let
Source = Sql.Database("connection", "database"),
Data = Source{[Schema="schema",Item="table"]}[Data]
in
Data
```
## Workspace Access
```bash
# Get workspace ID
fab get "ws.Workspace" -q "id"
# List users with access
fab api -A powerbi "groups/<workspace-id>/users"
```
Output:
```json
{
"value": [
{
"emailAddress": "user@domain.com",
"groupUserAccessRight": "Admin",
"displayName": "User Name",
"principalType": "User"
}
]
}
```
Access rights: `Admin`, `Member`, `Contributor`, `Viewer`
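To grant access rather than just list it, the Power BI Add Group User endpoint takes the same shape (Viewer shown here; adjust the access right as needed):
```bash
fab api -A powerbi "groups/<workspace-id>/users" -X post -i '{
  "emailAddress": "user@domain.com",
  "groupUserAccessRight": "Viewer"
}'
```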
## Find Reports Using a Model
Check report's `definition.pbir` for `byConnection.semanticmodelid`:
```bash
# Get model ID
fab get "ws.Workspace/Model.SemanticModel" -q "id"
# Check a report's connection
fab get "ws.Workspace/Report.Report" -q "definition.parts[?contains(path, 'definition.pbir')].payload | [0]"
```
Output:
```json
{
"datasetReference": {
"byConnection": {
"connectionString": "...semanticmodelid=bee906a0-255e-..."
}
}
}
```
To find all reports using a model, check each report's definition.pbir for matching `semanticmodelid`.
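A rough way to sweep a whole workspace; a sketch that assumes `fab ls` prints bare item names and allows for the `.pbir` payload being either raw JSON or base64-encoded:
```bash
MODEL_ID=$(fab get "ws.Workspace/Model.SemanticModel" -q "id" | tr -d '"')
fab ls "ws.Workspace" | grep "\.Report$" | sed 's/\.Report$//' |
while IFS= read -r REPORT; do
  PBIR=$(fab get "ws.Workspace/$REPORT.Report" \
    -q "definition.parts[?contains(path, 'definition.pbir')].payload | [0]")
  # check both the raw payload and a base64-decoded copy for the model ID
  if { echo "$PBIR"; echo "$PBIR" | base64 -d 2>/dev/null; } | grep -q "$MODEL_ID"; then
    echo "Uses model: $REPORT"
  fi
done
```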
## Performance Tips
1. **Cache model IDs**: Don't repeatedly query for the same ID
2. **Use JMESPath filtering**: Get only what you need
3. **Batch DAX queries**: Combine multiple queries in one request
4. **Export during off-hours**: Large model exports can be slow
5. **Use Power BI API for queries**: It's optimized for DAX execution
## Security Considerations
1. **Row-Level Security**: Check roles before exposing data
2. **Credentials in data sources**: Don't commit data source credentials
3. **Sensitive measures**: Review calculated columns/measures for sensitive logic
4. **Export restrictions**: Ensure exported models don't contain sensitive data
## Related Scripts
- `scripts/create_direct_lake_model.py` - Create Direct Lake model from lakehouse table
- `scripts/export_semantic_model_as_pbip.py` - Export model as PBIP
- `scripts/execute_dax.py` - Execute DAX queries

# Workspace Operations
Comprehensive guide for managing Fabric workspaces using the Fabric CLI.
## Overview
Workspaces are containers for Fabric items and provide collaboration and security boundaries. This guide covers workspace management, configuration, and operations.
## Listing Workspaces
### List All Workspaces
```bash
# Simple list
fab ls
# Detailed list with metadata
fab ls -l
# List with hidden tenant-level items
fab ls -la
# Hidden items include: capacities, connections, domains, gateways
```
### Filter Workspaces
```bash
# Using API with JMESPath query
fab api workspaces -q "value[].{name: displayName, id: id, type: type}"
# Filter by name pattern
fab api workspaces -q "value[?contains(displayName, 'Production')]"
# Filter by capacity
fab api workspaces -q "value[?capacityId=='<capacity-id>']"
# Get workspace count
fab api workspaces -q "length(value)"
```
## Getting Workspace Information
### Basic Workspace Info
```bash
# Check if workspace exists
fab exists "Production.Workspace"
# Get workspace details
fab get "Production.Workspace"
# Get specific property
fab get "Production.Workspace" -q "id"
fab get "Production.Workspace" -q "capacityId"
fab get "Production.Workspace" -q "description"
# Get all properties (verbose)
fab get "Production.Workspace" -v
# Save to file
fab get "Production.Workspace" -o /tmp/workspace-info.json
```
### Get Workspace Configuration
```bash
# Get Spark settings
fab get "Production.Workspace" -q "sparkSettings"
# Get Spark runtime version
fab get "Production.Workspace" -q "sparkSettings.environment.runtimeVersion"
# Get default Spark pool
fab get "Production.Workspace" -q "sparkSettings.pool.defaultPool"
```
## Creating Workspaces
### Create with Default Capacity
```bash
# Use CLI-configured default capacity
fab mkdir "NewWorkspace.Workspace"
# Verify capacity configuration first
fab api workspaces -q "value[0].capacityId"
```
### Create with Specific Capacity
```bash
# Assign to specific capacity
fab mkdir "Production Workspace.Workspace" -P capacityname=ProductionCapacity
# Get capacity name from capacity list
fab ls -la | grep Capacity
```
### Create without Capacity
```bash
# Create in shared capacity (not recommended for production)
fab mkdir "Dev Workspace.Workspace" -P capacityname=none
```
## Listing Workspace Contents
### List Items in Workspace
```bash
# Simple list
fab ls "Production.Workspace"
# Detailed list with metadata
fab ls "Production.Workspace" -l
# Include hidden items (Spark pools, managed identities, etc.)
fab ls "Production.Workspace" -la
# Hidden workspace items include:
# - External Data Shares
# - Managed Identities
# - Managed Private Endpoints
# - Spark Pools
```
### Filter Items by Type
```bash
WS_ID=$(fab get "Production.Workspace" -q "id")
# List semantic models only
fab api "workspaces/$WS_ID/items" -q "value[?type=='SemanticModel']"
# List reports only
fab api "workspaces/$WS_ID/items" -q "value[?type=='Report']"
# List notebooks
fab api "workspaces/$WS_ID/items" -q "value[?type=='Notebook']"
# List lakehouses
fab api "workspaces/$WS_ID/items" -q "value[?type=='Lakehouse']"
# Count items by type
fab api "workspaces/$WS_ID/items" | jq -r '.value | group_by(.type) | map("\(.[0].type): \(length)") | .[]'
```
## Updating Workspaces
### Update Display Name
```bash
fab set "OldName.Workspace" -q displayName -i "NewName"
# Note: This changes the display name, not the workspace ID
```
### Update Description
```bash
fab set "Production.Workspace" -q description -i "Production environment for enterprise analytics"
```
### Configure Spark Settings
```bash
# Set Spark runtime version
fab set "Production.Workspace" -q sparkSettings.environment.runtimeVersion -i 1.2
# Set starter pool as default
fab set "Production.Workspace" -q sparkSettings.pool.defaultPool -i '{
"name": "Starter Pool",
"type": "Workspace"
}'
# Set custom workspace pool
fab set "Production.Workspace" -q sparkSettings.pool.defaultPool -i '{
"name": "HighMemoryPool",
"type": "Workspace",
"id": "<pool-id>"
}'
```
## Capacity Management
### Assign Workspace to Capacity
```bash
# Get capacity ID
CAPACITY_ID=$(fab api -A azure "subscriptions/<subscription-id>/providers/Microsoft.Fabric/capacities?api-version=2023-11-01" -q "value[?name=='MyCapacity'].id | [0]")
# Assign workspace
fab assign "Production.Workspace" -P capacityId=$CAPACITY_ID
```
### Unassign from Capacity
```bash
# Move to shared capacity
fab unassign "Dev.Workspace"
```
### List Workspaces by Capacity
```bash
# Group workspaces by capacity
fab api workspaces | jq '.value | group_by(.capacityId)'
# List workspaces on specific capacity
fab api workspaces -q "value[?capacityId=='<capacity-id>'].displayName"
```
## Workspace Migration
### Export Entire Workspace
```bash
# Export all items
fab export "Production.Workspace" -o /tmp/workspace-backup -a
# This exports all supported item types:
# - Notebooks
# - Data Pipelines
# - Reports
# - Semantic Models
# - etc.
```
### Selective Export
```bash
#!/bin/bash
WORKSPACE="Production.Workspace"
OUTPUT_DIR="/tmp/migration"
WS_ID=$(fab get "$WORKSPACE" -q "id")
# Export only semantic models
# (-q returns a JSON array; use jq + while read so names with spaces are handled)
fab api "workspaces/$WS_ID/items" -q "value[?type=='SemanticModel'].displayName" | jq -r '.[]' | while read -r MODEL; do
  fab export "$WORKSPACE/$MODEL.SemanticModel" -o "$OUTPUT_DIR/models"
done
# Export only reports
fab api "workspaces/$WS_ID/items" -q "value[?type=='Report'].displayName" | jq -r '.[]' | while read -r REPORT; do
  fab export "$WORKSPACE/$REPORT.Report" -o "$OUTPUT_DIR/reports"
done
```
### Copy Workspace Contents
```bash
# Copy all items to another workspace (interactive selection)
fab cp "Source.Workspace" "Target.Workspace"
# Copy specific items
fab cp "Source.Workspace/Model.SemanticModel" "Target.Workspace"
fab cp "Source.Workspace/Report.Report" "Target.Workspace"
fab cp "Source.Workspace/Notebook.Notebook" "Target.Workspace"
```
## Deleting Workspaces
### Delete with Confirmation
```bash
# Interactive confirmation (lists items first)
fab rm "OldWorkspace.Workspace"
```
### Force Delete
```bash
# Delete workspace and all contents without confirmation
# ⚠️ DANGEROUS - Cannot be undone
fab rm "TestWorkspace.Workspace" -f
```
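When scripting deletions, a cautious pattern is to show what will be removed and require explicit confirmation before passing `-f`. A sketch; the workspace name is a placeholder:
```bash
WS="TestWorkspace.Workspace"
WS_ID=$(fab get "$WS" -q "id")
echo "Items that will be deleted: $(fab api "workspaces/$WS_ID/items" | jq '.value | length')"
read -r -p "Type DELETE to confirm: " CONFIRM
[ "$CONFIRM" = "DELETE" ] && fab rm "$WS" -f
```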
## Navigation
### Change to Workspace
```bash
# Navigate to workspace
fab cd "Production.Workspace"
# Verify current location
fab pwd
# Navigate to personal workspace
fab cd ~
```
### Relative Navigation
```bash
# From one workspace to another
fab cd "../Dev.Workspace"
# To parent (tenant level)
fab cd ..
```
## Workspace Inventory
### Get Complete Inventory
```bash
#!/bin/bash
WORKSPACE="Production.Workspace"
WS_ID=$(fab get "$WORKSPACE" -q "id")
echo "=== Workspace: $WORKSPACE ==="
echo
# Get all items
ITEMS=$(fab api "workspaces/$WS_ID/items")
# Count by type
echo "Item Counts:"
echo "$ITEMS" | jq -r '.value | group_by(.type) | map({type: .[0].type, count: length}) | .[] | "\(.type): \(.count)"'
echo
echo "Total Items: $(echo "$ITEMS" | jq '.value | length')"
# List items
echo
echo "=== Items ==="
echo "$ITEMS" | jq -r '.value[] | "\(.type): \(.displayName)"' | sort
```
### Generate Inventory Report
```bash
#!/bin/bash
OUTPUT_FILE="/tmp/workspace-inventory.csv"
echo "Workspace,Item Type,Item Name,Created Date,Modified Date" > "$OUTPUT_FILE"
# Get all workspaces
WORKSPACES=$(fab api workspaces -q "value[].{name: displayName, id: id}")
echo "$WORKSPACES" | jq -r '.[] | [.name, .id] | @tsv' | while IFS=$'\t' read -r WS_NAME WS_ID; do
# Get items in workspace
ITEMS=$(fab api "workspaces/$WS_ID/items")
echo "$ITEMS" | jq -r --arg ws "$WS_NAME" '.value[] | [$ws, .type, .displayName, .createdDate, .lastModifiedDate] | @csv' >> "$OUTPUT_FILE"
done
echo "Inventory saved to $OUTPUT_FILE"
```
## Workspace Permissions
### List Workspace Users
```bash
WS_ID=$(fab get "Production.Workspace" -q "id")
# List users with access
fab api -A powerbi "groups/$WS_ID/users"
```
### Add User to Workspace
```bash
WS_ID=$(fab get "Production.Workspace" -q "id")
# Add user as member
fab api -A powerbi "groups/$WS_ID/users" -X post -i '{
"emailAddress": "user@company.com",
"groupUserAccessRight": "Member"
}'
# Access levels: Admin, Member, Contributor, Viewer
```
### Remove User from Workspace
```bash
WS_ID=$(fab get "Production.Workspace" -q "id")
# Remove user
fab api -A powerbi "groups/$WS_ID/users/user@company.com" -X delete
```
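To change an existing user's role rather than removing and re-adding them, the Power BI "Update Group User" endpoint (PUT) can be used. A sketch, assuming `fab api` passes PUT through like the other verbs shown here; the email address is a placeholder:
```bash
WS_ID=$(fab get "Production.Workspace" -q "id")
# Replace the user's access right in place
fab api -A powerbi "groups/$WS_ID/users" -X put -i '{
  "emailAddress": "user@company.com",
  "groupUserAccessRight": "Contributor"
}'
```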
## Workspace Settings
### Git Integration
```bash
WS_ID=$(fab get "Production.Workspace" -q "id")
# Get Git connection status
fab api "workspaces/$WS_ID/git/connection"
# Connect to Git (Azure DevOps example; requires Git integration to be enabled)
fab api -X post "workspaces/$WS_ID/git/connect" -i '{
  "gitProviderDetails": {
    "gitProviderType": "AzureDevOps",
    "organizationName": "myorg",
    "projectName": "fabric-project",
    "repositoryName": "production",
    "branchName": "main",
    "directoryName": "/workspace-content"
  }
}'
# Then initialize the connection
fab api -X post "workspaces/$WS_ID/git/initializeConnection" -i '{"initializationStrategy": "PreferWorkspace"}'
```
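After connecting, checking sync state before committing or updating is a reasonable first step. A minimal sketch using the Git status endpoint:
```bash
# Lists the changes between the workspace and the connected branch
fab api "workspaces/$WS_ID/git/status"
```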
## Advanced Workflows
### Clone Workspace
```bash
#!/bin/bash
SOURCE_WS="Template.Workspace"
TARGET_WS="New Project.Workspace"
CAPACITY="MyCapacity"
# 1. Create target workspace
fab mkdir "$TARGET_WS" -P capacityname=$CAPACITY
# 2. Export all items from source
fab export "$SOURCE_WS" -o /tmp/clone -a
# 3. Import items to target
for ITEM in /tmp/clone/*; do
ITEM_NAME=$(basename "$ITEM")
fab import "$TARGET_WS/$ITEM_NAME" -i "$ITEM"
done
echo "Workspace cloned successfully"
```
### Workspace Comparison
```bash
#!/bin/bash
WS1="Production.Workspace"
WS2="Development.Workspace"
WS1_ID=$(fab get "$WS1" -q "id")
WS2_ID=$(fab get "$WS2" -q "id")
echo "=== Comparing Workspaces ==="
echo
echo "--- $WS1 ---"
fab api "workspaces/$WS1_ID/items" -q "value[].{type: type, name: displayName}" | jq -r '.[] | "\(.type): \(.name)"' | sort > /tmp/ws1.txt
echo "--- $WS2 ---"
fab api "workspaces/$WS2_ID/items" -q "value[].{type: type, name: displayName}" | jq -r '.[] | "\(.type): \(.name)"' | sort > /tmp/ws2.txt
echo
echo "=== Differences ==="
diff /tmp/ws1.txt /tmp/ws2.txt
rm /tmp/ws1.txt /tmp/ws2.txt
```
### Batch Workspace Operations
```bash
#!/bin/bash
# Update description for all production workspaces
# -q returns a JSON array; use jq + while read so names with spaces are handled
fab api workspaces -q "value[?contains(displayName, 'Prod')].displayName" | jq -r '.[]' | while read -r WS; do
  echo "Updating $WS..."
  fab set "$WS.Workspace" -q description -i "Production environment - managed by Data Platform team"
done
```
## Workspace Monitoring
### Monitor Workspace Activity
```bash
WS_ID=$(fab get "Production.Workspace" -q "id")
# Get activity events (requires admin access)
# Note: the API requires startDateTime and endDateTime (max 24-hour window)
# and pages results via continuationUri; workspace filtering is done client-side
fab api -A powerbi "admin/activityevents?startDateTime='2025-01-01T00:00:00'&endDateTime='2025-01-01T23:59:59'" \
  | jq --arg ws "$WS_ID" '[.activityEventEntities[]? | select(.WorkspaceId == $ws)]'
```
### Track Workspace Size
```bash
#!/bin/bash
WORKSPACE="Production.Workspace"
WS_ID=$(fab get "$WORKSPACE" -q "id")
# Count items
ITEM_COUNT=$(fab api "workspaces/$WS_ID/items" -q "length(value)")
# Count by type
echo "=== Workspace: $WORKSPACE ==="
echo "Total Items: $ITEM_COUNT"
echo
echo "Items by Type:"
fab api "workspaces/$WS_ID/items" -q "value | group_by(@, &type) | map({type: .[0].type, count: length}) | sort_by(.count) | reverse | .[]" | jq -r '"\(.type): \(.count)"'
```
## Troubleshooting
### Workspace Not Found
```bash
# List all workspaces to verify name
fab ls | grep -i "production"
# Get by ID directly
fab api "workspaces/<workspace-id>"
```
### Capacity Issues
```bash
# Check workspace capacity assignment
fab get "Production.Workspace" -q "capacityId"
# List available capacities
fab ls -la | grep Capacity
# Verify capacity status (via Azure API)
fab api -A azure "subscriptions/<subscription-id>/providers/Microsoft.Fabric/capacities?api-version=2023-11-01" -q "value[].{name: name, state: properties.state, sku: sku.name}"
```
### Permission Errors
```bash
# Verify your access level (search for your email / UPN)
WS_ID=$(fab get "Production.Workspace" -q "id")
fab api -A powerbi "groups/$WS_ID/users" | grep -i "your@email.com"
# Check if you're workspace admin
fab api -A powerbi "groups/$WS_ID/users" -q "value[?emailAddress=='your@email.com'].groupUserAccessRight"
```
## Best Practices
1. **Naming conventions**: Use consistent naming (e.g., "ProjectName - Environment")
2. **Capacity planning**: Assign workspaces to appropriate capacities
3. **Access control**: Use least-privilege principle for permissions
4. **Git integration**: Enable for production workspaces
5. **Regular backups**: Export critical workspaces periodically (see the sketch after this list)
6. **Documentation**: Maintain workspace descriptions
7. **Monitoring**: Track workspace activity and growth
8. **Cleanup**: Remove unused workspaces regularly
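For best practice 5, a minimal backup sketch: export each critical workspace into a date-stamped folder. The workspace names and backup root are placeholders:
```bash
#!/bin/bash
DATE=$(date +%Y-%m-%d)
for WS in "Production" "Finance"; do
  # -a exports all supported items without prompting
  fab export "$WS.Workspace" -o "/backups/$DATE/$WS" -a
done
```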
## Performance Tips
1. **Cache workspace IDs**: Don't repeatedly query for the same ID
2. **Use JMESPath filters**: Get only needed data
3. **Parallel operations**: Export multiple items concurrently
4. **Batch updates**: Group similar operations
5. **Off-peak operations**: Schedule large migrations during low usage
## Security Considerations
1. **Access reviews**: Regularly audit workspace permissions
2. **Sensitive data**: Use appropriate security labels
3. **Capacity isolation**: Separate dev/test/prod workspaces
4. **Git secrets**: Don't commit credentials in Git-integrated workspaces
5. **Audit logging**: Enable and monitor activity logs
## Related Scripts
- `scripts/download_workspace.py` - Download complete workspace with all items and lakehouse files