---
name: databricks-2025
description: Databricks Job activity and 2025 Azure Data Factory connectors
---
## 🚨 CRITICAL GUIDELINES
### Windows File Path Requirements
**MANDATORY: Always Use Backslashes on Windows for File Paths**
When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`).
**Examples:**
- ❌ WRONG: `D:/repos/project/file.tsx`
- ✅ CORRECT: `D:\repos\project\file.tsx`
This applies to:
- Edit tool file_path parameter
- Write tool file_path parameter
- All file operations on Windows systems
### Documentation Guidelines
**NEVER create new documentation files unless explicitly requested by the user.**
- **Priority**: Update existing README.md files rather than creating new documentation
- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise
- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone
- **User preference**: Only create additional .md files when user specifically asks for documentation
---
# Azure Data Factory Databricks Integration 2025
## Databricks Job Activity (Recommended 2025)
**🚨 CRITICAL UPDATE (2025):** The Databricks Job activity is now the **ONLY recommended method** for orchestrating Databricks in ADF. Microsoft strongly recommends migrating from legacy Notebook, Python, and JAR activities.
### Why Databricks Job Activity?
**Old Pattern (Notebook Activity - ❌ LEGACY):**
```json
{
"name": "RunNotebook",
"type": "DatabricksNotebook", // ❌ DEPRECATED - Migrate to DatabricksJob
"linkedServiceName": { "referenceName": "DatabricksLinkedService" },
"typeProperties": {
"notebookPath": "/Users/user@example.com/MyNotebook",
"baseParameters": { "param1": "value1" }
}
}
```
**New Pattern (Databricks Job Activity - ✅ CURRENT 2025):**
```json
{
"name": "RunDatabricksWorkflow",
"type": "DatabricksJob", // ✅ CORRECT activity type (NOT DatabricksSparkJob)
"linkedServiceName": { "referenceName": "DatabricksLinkedService" },
"typeProperties": {
"jobId": "123456", // Reference existing Databricks Workflow Job
"jobParameters": { // Pass parameters to the Job
"param1": "value1",
"runDate": "@pipeline().parameters.ProcessingDate"
}
},
"policy": {
"timeout": "0.12:00:00",
"retry": 2,
"retryIntervalInSeconds": 30
}
}
```
### Benefits of Databricks Job Activity (2025)
1. **Serverless Execution by Default:**
- ✅ No cluster specification needed in linked service
- ✅ Automatically runs on Databricks serverless compute
- ✅ Faster startup times and lower costs
- ✅ Managed infrastructure by Databricks
2. **Advanced Workflow Features:**
- **Run As** - Execute jobs as specific users/service principals
- **Task Values** - Pass data between tasks within workflow
- **Conditional Execution** - If/Else and For Each task types
- **AI/BI Tasks** - Model serving endpoints, Power BI semantic models
- **Repair Runs** - Rerun failed tasks without reprocessing successful ones (see the repair-run sketch after this list)
- **Notifications/Alerts** - Built-in alerting on job failures
- **Git Integration** - Version control for notebooks and code
- **DABs Support** - Databricks Asset Bundles for deployment
- **Built-in Lineage** - Data lineage tracking across tasks
- **Queuing and Concurrent Runs** - Better resource management
3. **Centralized Job Management:**
- Jobs defined once in Databricks workspace
- Single source of truth for all environments
- Versioning through Databricks (Git-backed)
- Consistent across orchestration tools
4. **Better Orchestration:**
- Complex task dependencies within Job
- Multiple heterogeneous tasks (notebook, Python, SQL, Delta Live Tables)
- Job-level monitoring and logging
- Parameter passing between tasks
5. **Improved Reliability:**
- Retry logic at Job and task level
- Better error handling and recovery
- Automatic cluster management
6. **Cost Optimization:**
- Serverless compute (pay only for execution)
- Job clusters (auto-terminating)
- Optimized cluster sizing per task
- Spot instance support
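Several of these features are also reachable programmatically. As referenced in the list above, here is a minimal sketch of triggering a repair run through the Databricks Jobs API 2.1; the workspace URL, token, run ID, and task keys are placeholders:
```python
import requests

# Placeholders: substitute your workspace URL, a PAT or Entra ID token,
# and the run ID of an actual failed job run.
WORKSPACE = "https://adb-123456789.azuredatabricks.net"
TOKEN = "<databricks-token>"
FAILED_RUN_ID = 987654

# POST /api/2.1/jobs/runs/repair reruns only the named tasks of a failed run,
# leaving previously successful tasks untouched.
resp = requests.post(
    f"{WORKSPACE}/api/2.1/jobs/runs/repair",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "run_id": FAILED_RUN_ID,
        "rerun_tasks": ["transform", "load"],  # task_keys from the job definition
    },
)
resp.raise_for_status()
print(resp.json())  # contains a repair_id for tracking the repair run
```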
### Implementation
#### 1. Create Databricks Job
```python
# Job definition, created in the Databricks workspace (Jobs UI or Jobs API).
# Assigned to a variable so it can be submitted via the API -- see the sketch
# after this block.
job_config = {
    "name": "Data Processing Job",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {
                "notebook_path": "/Notebooks/Ingest",
                "base_parameters": {}
            },
            "job_cluster_key": "small_cluster"
        },
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingest"}],
            "notebook_task": {
                "notebook_path": "/Notebooks/Transform"
            },
            "job_cluster_key": "medium_cluster"
        },
        {
            "task_key": "load",
            "depends_on": [{"task_key": "transform"}],
            "notebook_task": {
                "notebook_path": "/Notebooks/Load"
            },
            "job_cluster_key": "small_cluster"
        }
    ],
    "job_clusters": [
        {
            "job_cluster_key": "small_cluster",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 2
            }
        },
        {
            "job_cluster_key": "medium_cluster",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS4_v2",
                "num_workers": 8
            }
        }
    ]
}
# Note the job ID returned on creation -- the ADF DatabricksJob activity references it.
```
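The definition above can be registered without the UI by submitting `job_config` to the Jobs API; a minimal sketch assuming the `requests` library and a token with permission to create jobs (workspace URL and token are placeholders):
```python
import requests

WORKSPACE = "https://adb-123456789.azuredatabricks.net"  # placeholder
TOKEN = "<databricks-token>"                              # placeholder

# POST /api/2.1/jobs/create registers the job and returns its job_id,
# which is what the ADF DatabricksJob activity references.
# job_config is the dict defined in the block above.
resp = requests.post(
    f"{WORKSPACE}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_config,
)
resp.raise_for_status()
job_id = resp.json()["job_id"]
print(f"Use this in the ADF activity's jobId property: {job_id}")
```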
#### 2. Create ADF Pipeline with Databricks Job Activity (2025)
```json
{
"name": "PL_Databricks_Serverless_Workflow",
"properties": {
"activities": [
{
"name": "ExecuteDatabricksWorkflow",
"type": "DatabricksJob", // ✅ Correct activity type
"dependsOn": [],
"policy": {
"timeout": "0.12:00:00",
"retry": 2,
"retryIntervalInSeconds": 30
},
"typeProperties": {
"jobId": "123456", // Databricks Job ID from workspace
"jobParameters": { // ⚠️ Use jobParameters (not parameters)
"input_path": "/mnt/data/input",
"output_path": "/mnt/data/output",
"run_date": "@pipeline().parameters.runDate",
"environment": "@pipeline().parameters.environment"
}
},
"linkedServiceName": {
"referenceName": "DatabricksLinkedService_Serverless",
"type": "LinkedServiceReference"
}
},
{
"name": "LogJobExecution",
"type": "WebActivity",
"dependsOn": [
{
"activity": "ExecuteDatabricksWorkflow",
"dependencyConditions": ["Succeeded"]
}
],
"typeProperties": {
"url": "@pipeline().parameters.LoggingEndpoint",
"method": "POST",
"body": {
"jobId": "123456",
"runId": "@activity('ExecuteDatabricksWorkflow').output.runId",
"status": "Succeeded",
"duration": "@activity('ExecuteDatabricksWorkflow').output.executionDuration"
}
}
}
],
"parameters": {
"runDate": {
"type": "string",
"defaultValue": "@utcnow()"
},
"environment": {
"type": "string",
"defaultValue": "production"
},
"LoggingEndpoint": {
"type": "string"
}
}
}
}
```
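For ad-hoc testing outside the portal, the pipeline can be started through the Data Factory REST API's `createRun` operation. A hedged sketch using `azure-identity`; the subscription, resource group, factory name, and logging endpoint are placeholders:
```python
import requests
from azure.identity import DefaultAzureCredential

# Placeholders for your environment.
SUB, RG, FACTORY = "<subscription-id>", "<resource-group>", "<factory-name>"
PIPELINE = "PL_Databricks_Serverless_Workflow"

# Acquire an ARM token with whatever identity DefaultAzureCredential resolves
# (managed identity on Azure, developer login locally).
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    f"https://management.azure.com/subscriptions/{SUB}/resourceGroups/{RG}"
    f"/providers/Microsoft.DataFactory/factories/{FACTORY}"
    f"/pipelines/{PIPELINE}/createRun?api-version=2018-06-01"
)

# The request body is the pipeline parameters object; runDate is passed
# explicitly because parameter defaults are static strings.
resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {token}"},
    json={
        "runDate": "2025-01-01T00:00:00Z",
        "environment": "production",
        "LoggingEndpoint": "https://example.com/log",  # placeholder endpoint
    },
)
resp.raise_for_status()
print(resp.json()["runId"])  # ADF pipeline run ID for monitoring
```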
#### 3. Configure Linked Service (2025 - Serverless)
**✅ RECOMMENDED: Serverless Linked Service (No Cluster Configuration)**
```json
{
"name": "DatabricksLinkedService_Serverless",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "AzureDatabricks",
"typeProperties": {
"domain": "https://adb-123456789.azuredatabricks.net",
"authentication": "MSI" // ✅ Managed Identity (recommended 2025)
// ⚠️ NO existingClusterId or newClusterNodeType needed for serverless!
// The Databricks Job activity automatically uses serverless compute
}
}
}
```
**Alternative: Access Token Authentication**
```json
{
"name": "DatabricksLinkedService_Token",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "AzureDatabricks",
"typeProperties": {
"domain": "https://adb-123456789.azuredatabricks.net",
"accessToken": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "databricks-access-token"
}
}
}
}
```
**🚨 CRITICAL: For Databricks Job activity, DO NOT specify cluster properties in the linked service. The job configuration in Databricks workspace controls compute resources.**
## 🆕 2025 New Connectors and Enhancements
### ServiceNow V2 Connector (RECOMMENDED - V1 End of Support)
**🚨 CRITICAL: ServiceNow V1 connector is at End of Support stage. Migrate to V2 immediately!**
**Key Features of V2:**
- **Native Query Builder** - Aligns with ServiceNow's condition builder experience
- **Enhanced Performance** - Optimized data extraction
- **Better Error Handling** - Improved diagnostics and retry logic
- **OData Support** - Modern API integration patterns
**Copy Activity Example:**
```json
{
"name": "CopyFromServiceNowV2",
"type": "Copy",
"inputs": [
{
"referenceName": "ServiceNowV2Source",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "AzureSqlSink",
"type": "DatasetReference"
}
],
"typeProperties": {
"source": {
"type": "ServiceNowV2Source",
"query": "sysparm_query=active=true^priority=1^sys_created_on>=javascript:gs.dateGenerate('2025-01-01')",
"httpRequestTimeout": "00:01:40" // 100 seconds
},
"sink": {
"type": "AzureSqlSink",
"writeBehavior": "upsert",
"upsertSettings": {
"useTempDB": true,
"keys": ["sys_id"]
}
},
"enableStaging": true,
"stagingSettings": {
"linkedServiceName": {
"referenceName": "AzureBlobStorage",
"type": "LinkedServiceReference"
}
}
}
}
```
**Linked Service (OAuth2 - Recommended):**
```json
{
"name": "ServiceNowV2LinkedService",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "ServiceNowV2",
"typeProperties": {
"endpoint": "https://dev12345.service-now.com",
"authenticationType": "OAuth2",
"clientId": "your-oauth-client-id",
"clientSecret": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "servicenow-client-secret"
},
"username": "service-account@company.com",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "servicenow-password"
},
"grantType": "password"
}
}
}
```
**Linked Service (Basic Authentication - Legacy):**
```json
{
"name": "ServiceNowV2LinkedService_Basic",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "ServiceNowV2",
"typeProperties": {
"endpoint": "https://dev12345.service-now.com",
"authenticationType": "Basic",
"username": "admin",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "servicenow-password"
}
}
}
}
```
**Migration from V1 to V2:**
1. Update linked service type from `ServiceNow` to `ServiceNowV2`
2. Update source type from `ServiceNowSource` to `ServiceNowV2Source`
3. Test queries in ServiceNow UI's condition builder first (or against the Table API, as in the sketch below)
4. Adjust timeout settings if needed (V2 may have different performance)
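As mentioned in step 3, a `sysparm_query` can be validated directly against ServiceNow's Table API before it goes into the copy activity. A minimal sketch with basic auth; the instance URL, table, and credentials are placeholders:
```python
import requests

INSTANCE = "https://dev12345.service-now.com"  # placeholder instance
AUTH = ("service-account", "<password>")        # placeholder credentials

# Same sysparm_query syntax the V2 connector consumes; cap the result
# set so the check stays cheap.
resp = requests.get(
    f"{INSTANCE}/api/now/table/incident",
    auth=AUTH,
    params={
        "sysparm_query": "active=true^priority=1",
        "sysparm_limit": 5,
    },
    headers={"Accept": "application/json"},
)
resp.raise_for_status()
for record in resp.json()["result"]:
    print(record["sys_id"], record.get("short_description"))
```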
### Enhanced PostgreSQL Connector
Improved performance and features:
```json
{
"name": "PostgreSQLLinkedService",
"type": "PostgreSql",
"typeProperties": {
"connectionString": "host=myserver.postgres.database.azure.com;port=5432;database=mydb;uid=myuser",
"password": {
"type": "AzureKeyVaultSecret",
"store": { "referenceName": "KeyVault" },
"secretName": "postgres-password"
},
// 2025 enhancement
"enableSsl": true,
"sslMode": "Require"
}
}
```
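To confirm the server actually enforces TLS before relying on the connector settings, the same SSL mode can be exercised client-side. A sketch with `psycopg2`; host and credentials are placeholders:
```python
import psycopg2

# sslmode="require" mirrors the linked service's sslMode setting;
# the connection fails rather than silently falling back to plaintext.
conn = psycopg2.connect(
    host="myserver.postgres.database.azure.com",
    port=5432,
    dbname="mydb",
    user="myuser",
    password="<password>",  # placeholder; pull from a vault in practice
    sslmode="require",
)
with conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])
conn.close()
```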
### Microsoft Fabric Warehouse Connector (NEW 2025)
**🆕 Native support for Microsoft Fabric Warehouse (Q3 2024+)**
**Supported Activities:**
- ✅ Copy Activity (source and sink)
- ✅ Lookup Activity
- ✅ Get Metadata Activity
- ✅ Script Activity
- ✅ Stored Procedure Activity
**Linked Service Configuration:**
```json
{
"name": "FabricWarehouseLinkedService",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "Warehouse", // ✅ NEW dedicated Fabric Warehouse type
"typeProperties": {
"endpoint": "myworkspace.datawarehouse.fabric.microsoft.com",
"warehouse": "MyWarehouse",
"authenticationType": "ServicePrincipal", // Recommended
"servicePrincipalId": "<app-registration-id>",
"servicePrincipalKey": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "fabric-warehouse-sp-key"
},
"tenant": "<tenant-id>"
}
}
}
```
**Alternative: Managed Identity Authentication (Preferred)**
```json
{
"name": "FabricWarehouseLinkedService_ManagedIdentity",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "Warehouse",
"typeProperties": {
"endpoint": "myworkspace.datawarehouse.fabric.microsoft.com",
"warehouse": "MyWarehouse",
"authenticationType": "SystemAssignedManagedIdentity"
}
}
}
```
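Either identity can be smoke-tested against the warehouse's SQL endpoint outside ADF. A hedged `pyodbc` sketch using the service principal variant (assumes ODBC Driver 18 for SQL Server is installed; the endpoint, credentials, and table name are placeholders):
```python
import pyodbc

# Placeholders; the service principal needs access to the Fabric workspace.
conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.datawarehouse.fabric.microsoft.com;"
    "Database=MyWarehouse;"
    "Authentication=ActiveDirectoryServicePrincipal;"
    "UID=<app-registration-id>;"
    "PWD=<client-secret>;"
    "Encrypt=yes;"
)
with pyodbc.connect(conn_str) as conn:
    cur = conn.cursor()
    cur.execute("SELECT TOP 5 * FROM dbo.my_table;")  # hypothetical table
    for row in cur.fetchall():
        print(row)
```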
**Copy Activity Example:**
```json
{
"name": "CopyToFabricWarehouse",
"type": "Copy",
"inputs": [
{
"referenceName": "AzureSqlSource",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "FabricWarehouseSink",
"type": "DatasetReference"
}
],
"typeProperties": {
"source": {
"type": "AzureSqlSource"
},
"sink": {
"type": "WarehouseSink",
"writeBehavior": "insert", // or "upsert"
"writeBatchSize": 10000,
"tableOption": "autoCreate" // Auto-create table if not exists
},
"enableStaging": true, // Recommended for large data
"stagingSettings": {
"linkedServiceName": {
"referenceName": "AzureBlobStorage",
"type": "LinkedServiceReference"
},
"path": "staging/fabric-warehouse"
},
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": { "name": "CustomerID" },
"sink": { "name": "customer_id" }
}
]
}
}
}
```
**Best Practices for Fabric Warehouse:**
- ✅ Use managed identity for authentication (no secret rotation)
- ✅ Enable staging for large data loads (> 1GB)
- ✅ Use `tableOption: autoCreate` for dynamic schema creation
- ✅ Leverage Fabric's lakehouse integration for unified analytics
- ✅ Monitor Fabric capacity units (CU) consumption
### Enhanced Snowflake Connector
Improved performance:
```json
{
"name": "SnowflakeLinkedService",
"type": "Snowflake",
"typeProperties": {
"connectionString": "jdbc:snowflake://myaccount.snowflakecomputing.com",
"database": "mydb",
"warehouse": "mywarehouse",
"authenticationType": "KeyPair",
"username": "myuser",
"privateKey": {
"type": "AzureKeyVaultSecret",
"store": { "referenceName": "KeyVault" },
"secretName": "snowflake-private-key"
},
"privateKeyPassphrase": {
"type": "AzureKeyVaultSecret",
"store": { "referenceName": "KeyVault" },
"secretName": "snowflake-passphrase"
}
}
}
```
## Managed Identity for Azure Storage (2025)
### Azure Table Storage
Now supports system-assigned and user-assigned managed identity:
```json
{
"name": "AzureTableStorageLinkedService",
"type": "AzureTableStorage",
"typeProperties": {
"serviceEndpoint": "https://mystorageaccount.table.core.windows.net",
"authenticationType": "ManagedIdentity" // New in 2025
// Or user-assigned:
// "credential": {
// "referenceName": "UserAssignedManagedIdentity"
// }
}
}
```
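The same managed-identity pattern can be verified from application code with `azure-identity` and `azure-data-tables`. A minimal sketch; the account endpoint is a placeholder, and the identity needs a data-plane role such as Storage Table Data Reader:
```python
from azure.data.tables import TableServiceClient
from azure.identity import DefaultAzureCredential

# DefaultAzureCredential picks up the managed identity when running on
# Azure (or your developer login locally); no keys or connection strings.
service = TableServiceClient(
    endpoint="https://mystorageaccount.table.core.windows.net",
    credential=DefaultAzureCredential(),
)

# Listing tables exercises the data-plane role assignment end to end.
for table in service.list_tables():
    print(table.name)
```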
### Azure Files
Now supports managed identity authentication:
```json
{
"name": "AzureFilesLinkedService",
"type": "AzureFileStorage",
"typeProperties": {
"fileShare": "myshare",
"accountName": "mystorageaccount",
"authenticationType": "ManagedIdentity" // New in 2025
}
}
```
## Mapping Data Flows - Spark 3.3
Spark 3.3 now powers Mapping Data Flows:
**Performance Improvements:**
- 30% faster data processing
- Improved memory management
- Better partition handling
- Enhanced join performance
**New Features:**
- Adaptive Query Execution (AQE)
- Dynamic partition pruning
- Improved caching
- Better column statistics
```json
{
"name": "DataFlow1",
"type": "MappingDataFlow",
"typeProperties": {
"sources": [
{
"dataset": { "referenceName": "SourceDataset" }
}
],
"transformations": [
{
"name": "Transform1"
}
],
"sinks": [
{
"dataset": { "referenceName": "SinkDataset" }
}
]
}
}
```
## Azure DevOps Server 2022 Support
Git integration now supports on-premises Azure DevOps Server 2022:
```json
{
"name": "DataFactory",
"properties": {
"repoConfiguration": {
"type": "AzureDevOpsGit",
"accountName": "on-prem-ado-server",
"projectName": "MyProject",
"repositoryName": "adf-repo",
"collaborationBranch": "main",
"rootFolder": "/",
"hostName": "https://ado-server.company.com" // On-premises server
}
}
}
```
## 🔐 Managed Identity 2025 Best Practices
### User-Assigned vs System-Assigned Managed Identity
**System-Assigned Managed Identity:**
```json
{
"type": "AzureBlobStorage",
"typeProperties": {
"serviceEndpoint": "https://mystorageaccount.blob.core.windows.net",
"accountKind": "StorageV2"
// ✅ Uses Data Factory's system-assigned identity automatically
}
}
```
**User-Assigned Managed Identity (NEW 2025):**
```json
{
"type": "AzureBlobStorage",
"typeProperties": {
"serviceEndpoint": "https://mystorageaccount.blob.core.windows.net",
"accountKind": "StorageV2",
"credential": {
"referenceName": "UserAssignedManagedIdentityCredential",
"type": "CredentialReference"
}
}
}
```
**When to Use User-Assigned:**
- ✅ Sharing identity across multiple data factories
- ✅ Complex multi-environment setups
- ✅ Granular permission management
- ✅ Identity lifecycle independent of data factory
**Credential Consolidation (NEW 2025):**
ADF now supports a centralized **Credentials** feature:
```json
{
"name": "ManagedIdentityCredential",
"type": "Microsoft.DataFactory/factories/credentials",
"properties": {
"type": "ManagedIdentity",
"typeProperties": {
"resourceId": "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identity-name}"
}
}
}
```
**Benefits:**
- ✅ Consolidate all Microsoft Entra ID-based credentials in one place
- ✅ Reuse credentials across multiple linked services
- ✅ Centralized permission management
- ✅ Easier audit and compliance tracking
### MFA Enforcement Compatibility (October 2025)
**🚨 IMPORTANT: Azure requires MFA for all users by October 2025**
**Impact on ADF:**
- **Managed identities are UNAFFECTED** - No MFA required for service accounts
- ✅ Continue using system-assigned and user-assigned identities without changes
- **Interactive user logins affected** - Personal Azure AD accounts need MFA
- **Service principals with certificate auth** - Recommended alternative to secrets
**Best Practice:**
```json
{
"type": "AzureSqlDatabase",
"typeProperties": {
"server": "myserver.database.windows.net",
"database": "mydb",
"authenticationType": "SystemAssignedManagedIdentity"
// ✅ No MFA needed, no secret rotation, passwordless
}
}
```
### Principle of Least Privilege (2025)
**Storage Blob Data Roles:**
- `Storage Blob Data Reader` - Read-only access (source)
- `Storage Blob Data Contributor` - Read/write access (sink)
- ❌ Avoid `Storage Blob Data Owner` unless needed
**SQL Database Roles:**
```sql
-- Create contained database user for managed identity
CREATE USER [datafactory-name] FROM EXTERNAL PROVIDER;
-- Grant minimal required permissions
ALTER ROLE db_datareader ADD MEMBER [datafactory-name];
ALTER ROLE db_datawriter ADD MEMBER [datafactory-name];
-- ❌ Avoid db_owner unless truly needed
```
**Key Vault Access Policies:**
```json
{
"permissions": {
"secrets": ["Get"] // ✅ Only Get permission needed
// ❌ Don't grant List, Set, Delete unless required
}
}
```
## Best Practices (2025)
1. **Use Databricks Job Activity (MANDATORY):**
- ❌ STOP using Notebook, Python, JAR activities
- ✅ Migrate to DatabricksJob activity immediately
- ✅ Define workflows in Databricks workspace
- ✅ Leverage serverless compute (no cluster config needed)
- ✅ Utilize advanced features (Run As, Task Values, If/Else, Repair Runs)
2. **Managed Identity Authentication (MANDATORY 2025):**
- ✅ Use managed identities for ALL Azure resources
- ✅ Prefer system-assigned for simple scenarios
- ✅ Use user-assigned for shared identity needs
- ✅ Leverage Credentials feature for consolidation
- ✅ MFA-compliant for October 2025 enforcement
- ❌ Avoid access keys and connection strings
- ✅ Store any remaining secrets in Key Vault
3. **Monitor Job Execution:**
- Track Databricks Job run IDs from ADF output
- Log Job parameters for auditability
- Set up alerts for job failures
- Use Databricks job-level monitoring
- Leverage built-in lineage tracking
4. **Optimize Spark 3.3 Usage (Data Flows):**
- Enable Adaptive Query Execution (AQE)
- Use appropriate partition counts (4-8 per core)
- Monitor execution plans in Databricks
- Use broadcast joins for small dimensions
- Implement dynamic partition pruning (see the PySpark sketch below)
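These levers are easiest to observe on the Databricks side. A minimal PySpark sketch, assuming a Databricks notebook where `spark` is predefined; table and column names are hypothetical:
```python
from pyspark.sql.functions import broadcast

# AQE and dynamic partition pruning are on by default in Spark 3.3,
# but can be pinned explicitly.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.optimizer.dynamicPartitionPruning.enabled", "true")

facts = spark.table("sales_facts")       # hypothetical large fact table
dims = spark.table("store_dimensions")   # hypothetical small dimension

# Broadcasting the small dimension avoids shuffling the large side.
joined = facts.join(broadcast(dims), on="store_id", how="inner")
joined.explain()  # inspect the physical plan for BroadcastHashJoin
```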
## Resources
- [Databricks Job Activity](https://learn.microsoft.com/azure/data-factory/transform-data-using-databricks-spark-job)
- [ADF Connectors](https://learn.microsoft.com/azure/data-factory/connector-overview)
- [Managed Identity Authentication](https://learn.microsoft.com/azure/data-factory/data-factory-service-identity)
- [Mapping Data Flows](https://learn.microsoft.com/azure/data-factory/concepts-data-flow-overview)