---
name: transform-table
description: Transform a single database table to staging format with data quality improvements, PII handling, and JSON extraction
---

# Transform Single Table to Staging

## ⚠️ CRITICAL: This command enforces strict sub-agent delegation

I'll help you transform a database table to staging format using the appropriate staging-transformer sub-agent (Presto/Trino or Hive).

---

## Required Information

Please provide the following details:

### 1. Source Table
- **Database Name**: Source database (e.g., `client_src`, `demo_db`)
- **Table Name**: Source table name (e.g., `customer_profiles_histunion`)

### 2. Target Configuration
- **Staging Database**: Target database (default: `client_stg`)
- **Lookup Database**: Reference database for rules (default: `client_config`)

### 3. SQL Engine (Optional)
- **Engine**: Choose one:
  - `presto` or `trino` - Presto/Trino SQL engine (default)
  - `hive` - Hive SQL engine
- If not specified, will default to Presto/Trino

### 4. Transformation Requirements (Optional)
- **Deduplication**: Required? (will check client_config.staging_trnsfrm_rules)
- **JSON Columns**: Will auto-detect and process
- **Join Logic**: Any joins needed? (will check additional_rules)

---

## What I'll Do

### Step 1: Detect SQL Engine
I will determine the appropriate sub-agent:
- **Presto/Trino keywords** → `staging-transformer-presto`
- **Hive keywords** → `staging-transformer-hive`
- **No specification** → `staging-transformer-presto` (default)

### Step 2: Delegate to Specialized Agent
I will invoke the appropriate staging-transformer agent with:
- Complete table transformation context
- All mandatory requirements (13 rules)
- Engine-specific SQL generation
- Full compliance validation

### Step 3: Sub-Agent Will Execute
The specialized agent will:
1. **Validate table existence** (MANDATORY first step)
2. **Analyze metadata** (columns, types, data samples)
3. **Check configuration** (deduplication rules, additional rules)
4. **Detect JSON columns** (automatic processing)
5. **Generate SQL files**:
   - `staging/init_queries/{source_db}_{table_name}_init.sql` (Presto)
   - `staging/queries/{source_db}_{table_name}.sql` (Presto)
   - `staging/queries/{source_db}_{table_name}_upsert.sql` (if dedup, Presto)
   - OR `staging_hive/queries/{source_db}_{table_name}.sql` (Hive)
6. **Update configuration**: `staging/config/src_params.yml` or `staging_hive/config/src_params.yml`
7. **Create/verify DIG file**: `staging/staging_transformation.dig` or `staging_hive/staging_hive.dig`
8. **Execute git workflow**: Commit, branch, push, PR creation

---

## Quality Assurance

The sub-agent ensures complete compliance with all requirements:

✅ **Column Limit Management** (max 200 columns)
✅ **JSON Detection & Extraction** (automatic)
✅ **Date Processing** (4 outputs per date column)
✅ **Email/Phone Validation** (with hashing)
✅ **String Standardization** (UPPER, TRIM, NULL handling)
✅ **Deduplication Logic** (if configured)
✅ **Join Processing** (if specified)
✅ **Incremental Processing** (state tracking)
✅ **SQL File Creation** (init, incremental, upsert)
✅ **DIG File Management** (conditional creation)
✅ **Configuration Update** (src_params.yml)
✅ **Git Workflow** (complete automation)
✅ **Treasure Data Compatibility** (VARCHAR/BIGINT timestamps)

---

## Output Files

### For Presto/Trino Engine:
1. `staging/init_queries/{source_db}_{table_name}_init.sql` - Initial load SQL
2. `staging/queries/{source_db}_{table_name}.sql` - Incremental SQL
3. `staging/queries/{source_db}_{table_name}_upsert.sql` - Upsert SQL (if dedup)
4. `staging/config/src_params.yml` - Updated configuration
5. `staging/staging_transformation.dig` - Workflow (created if not exists)

### For Hive Engine:
1. `staging_hive/queries/{source_db}_{table_name}.sql` - Combined SQL
2. `staging_hive/config/src_params.yml` - Updated configuration
3. `staging_hive/staging_hive.dig` - Workflow (created if not exists)
4. `staging_hive/queries/get_max_time.sql` - Template (created if not exists)
5. `staging_hive/queries/get_stg_rows_for_delete.sql` - Template (created if not exists)

### Plus:
- Git commit with comprehensive message
- Pull request with transformation summary
- Validation report

---

## Example Usage

### Example 1: Presto Engine (Default)
```
User: Transform table client_src.customer_profiles_histunion

→ Engine: Presto (default)
→ Sub-agent: staging-transformer-presto
→ Output: staging/ directory files
```

### Example 2: Hive Engine (Explicit)
```
User: Transform table client_src.klaviyo_events_histunion using Hive

→ Engine: Hive
→ Sub-agent: staging-transformer-hive
→ Output: staging_hive/ directory files
```

### Example 3: With Custom Databases
```
User: Transform demo_db.orders_histunion
      Use demo_db_stg as staging database
      Use client_config for lookup

→ Engine: Presto (default)
→ Custom databases applied
```

---

## Next Steps After Transformation

1. **Review generated files**:
   ```bash
   ls -l staging/queries/
   ls -l staging/init_queries/
   cat staging/config/src_params.yml
   ```

2. **Review Pull Request**:
   - Check transformation summary
   - Verify all validation gates passed
   - Review generated SQL

3. **Test the workflow**:
   ```bash
   cd staging
   td wf push
   td wf run staging_transformation.dig
   ```

4. **Monitor execution**:
   ```sql
   SELECT * FROM client_config.inc_log
   WHERE table_name = '{your_table}'
   ORDER BY inc_value DESC
   LIMIT 1
   ```

---

## Production-Ready Guarantee

All transformations will:
- ✅ Work the first time
- ✅ Follow consistent patterns
- ✅ Include complete error handling
- ✅ Include comprehensive data quality
- ✅ Be maintainable and documented
- ✅ Match production standards exactly

---

**Ready to proceed? Please provide the source database and table name, and I'll delegate to the appropriate staging-transformer agent for complete processing.**
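
For reference, the engine-selection rule from Step 1 and the file layout from Output Files can be sketched in Python. This is only an illustration: the function names `select_sub_agent` and `expected_output_files` are hypothetical, and the keyword matching is an assumption, since this command does not define exactly what counts as a "Hive keyword". The sub-agent names and paths are copied from the sections above.

```python
import re

# Sub-agent names and paths come from this command's description.
# The helper names and the keyword-matching rule are hypothetical.
PRESTO_AGENT = "staging-transformer-presto"
HIVE_AGENT = "staging-transformer-hive"


def select_sub_agent(request: str) -> str:
    """Pick a sub-agent from a free-text request; Presto/Trino is the default."""
    # \w+ keeps identifiers such as "hive_events" as one token, so a table
    # name that merely contains "hive" does not select the Hive agent.
    words = set(re.findall(r"\w+", request.lower()))
    if "hive" in words:
        return HIVE_AGENT
    # Covers "presto", "trino", and requests that name no engine at all.
    return PRESTO_AGENT


def expected_output_files(agent: str, source_db: str, table_name: str,
                          dedup: bool = False) -> list[str]:
    """List the files the chosen sub-agent is expected to create or update."""
    base = f"{source_db}_{table_name}"
    if agent == HIVE_AGENT:
        return [
            f"staging_hive/queries/{base}.sql",
            "staging_hive/config/src_params.yml",
            "staging_hive/staging_hive.dig",
            "staging_hive/queries/get_max_time.sql",
            "staging_hive/queries/get_stg_rows_for_delete.sql",
        ]
    files = [
        f"staging/init_queries/{base}_init.sql",
        f"staging/queries/{base}.sql",
    ]
    if dedup:
        # The upsert query is only generated when deduplication is configured.
        files.append(f"staging/queries/{base}_upsert.sql")
    files += [
        "staging/config/src_params.yml",
        "staging/staging_transformation.dig",
    ]
    return files
```

Calling `select_sub_agent` on the request from Example 2 selects the Hive sub-agent, while the requests in Examples 1 and 3 fall through to the Presto default. Requests that mention both engines are not covered by the rules above; this sketch treats any standalone mention of Hive as an explicit choice.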