5.9 KiB
name, description
| name | description |
|---|---|
| transform-table | Transform a single database table to staging format with data quality improvements, PII handling, and JSON extraction |
Transform Single Table to Staging
⚠️ CRITICAL: This command enforces strict sub-agent delegation
I'll help you transform a database table to staging format using the appropriate staging-transformer sub-agent (Presto/Trino or Hive).
Required Information
Please provide the following details:
1. Source Table
- Database Name: Source database (e.g.,
client_src,demo_db) - Table Name: Source table name (e.g.,
customer_profiles_histunion)
2. Target Configuration
- Staging Database: Target database (default:
client_stg) - Lookup Database: Reference database for rules (default:
client_config)
3. SQL Engine (Optional)
- Engine: Choose one:
prestoortrino- Presto/Trino SQL engine (default)hive- Hive SQL engine- If not specified, will default to Presto/Trino
4. Transformation Requirements (Optional)
- Deduplication: Required? (will check client_config.staging_trnsfrm_rules)
- JSON Columns: Will auto-detect and process
- Join Logic: Any joins needed? (will check additional_rules)
What I'll Do
Step 1: Detect SQL Engine
I will determine the appropriate sub-agent:
- Presto/Trino keywords →
staging-transformer-presto - Hive keywords →
staging-transformer-hive - No specification →
staging-transformer-presto(default)
Step 2: Delegate to Specialized Agent
I will invoke the appropriate staging-transformer agent with:
- Complete table transformation context
- All mandatory requirements (13 rules)
- Engine-specific SQL generation
- Full compliance validation
Step 3: Sub-Agent Will Execute
The specialized agent will:
- Validate table existence (MANDATORY first step)
- Analyze metadata (columns, types, data samples)
- Check configuration (deduplication rules, additional rules)
- Detect JSON columns (automatic processing)
- Generate SQL files:
staging/init_queries/{source_db}_{table_name}_init.sql(Presto)staging/queries/{source_db}_{table_name}.sql(Presto)staging/queries/{source_db}_{table_name}_upsert.sql(if dedup, Presto)- OR
staging_hive/queries/{source_db}_{table_name}.sql(Hive)
- Update configuration:
staging/config/src_params.ymlorstaging_hive/config/src_params.yml - Create/verify DIG file:
staging/staging_transformation.digorstaging_hive/staging_hive.dig - Execute git workflow: Commit, branch, push, PR creation
Quality Assurance
The sub-agent ensures complete compliance with all requirements:
✅ Column Limit Management (max 200 columns) ✅ JSON Detection & Extraction (automatic) ✅ Date Processing (4 outputs per date column) ✅ Email/Phone Validation (with hashing) ✅ String Standardization (UPPER, TRIM, NULL handling) ✅ Deduplication Logic (if configured) ✅ Join Processing (if specified) ✅ Incremental Processing (state tracking) ✅ SQL File Creation (init, incremental, upsert) ✅ DIG File Management (conditional creation) ✅ Configuration Update (src_params.yml) ✅ Git Workflow (complete automation) ✅ Treasure Data Compatibility (VARCHAR/BIGINT timestamps)
Output Files
For Presto/Trino Engine:
staging/init_queries/{source_db}_{table_name}_init.sql- Initial load SQLstaging/queries/{source_db}_{table_name}.sql- Incremental SQLstaging/queries/{source_db}_{table_name}_upsert.sql- Upsert SQL (if dedup)staging/config/src_params.yml- Updated configurationstaging/staging_transformation.dig- Workflow (created if not exists)
For Hive Engine:
staging_hive/queries/{source_db}_{table_name}.sql- Combined SQLstaging_hive/config/src_params.yml- Updated configurationstaging_hive/staging_hive.dig- Workflow (created if not exists)staging_hive/queries/get_max_time.sql- Template (created if not exists)staging_hive/queries/get_stg_rows_for_delete.sql- Template (created if not exists)
Plus:
- Git commit with comprehensive message
- Pull request with transformation summary
- Validation report
Example Usage
Example 1: Presto Engine (Default)
User: Transform table client_src.customer_profiles_histunion
→ Engine: Presto (default)
→ Sub-agent: staging-transformer-presto
→ Output: staging/ directory files
Example 2: Hive Engine (Explicit)
User: Transform table client_src.klaviyo_events_histunion using Hive
→ Engine: Hive
→ Sub-agent: staging-transformer-hive
→ Output: staging_hive/ directory files
Example 3: With Custom Databases
User: Transform demo_db.orders_histunion
Use demo_db_stg as staging database
Use client_config for lookup
→ Engine: Presto (default)
→ Custom databases applied
Next Steps After Transformation
-
Review generated files:
ls -l staging/queries/ ls -l staging/init_queries/ cat staging/config/src_params.yml -
Review Pull Request:
- Check transformation summary
- Verify all validation gates passed
- Review generated SQL
-
Test the workflow:
cd staging td wf push td wf run staging_transformation.dig -
Monitor execution:
SELECT * FROM client_config.inc_log WHERE table_name = '{your_table}' ORDER BY inc_value DESC LIMIT 1
Production-Ready Guarantee
All transformations will:
- ✅ Work the first time
- ✅ Follow consistent patterns
- ✅ Include complete error handling
- ✅ Include comprehensive data quality
- ✅ Be maintainable and documented
- ✅ Match production standards exactly
Ready to proceed? Please provide the source database and table name, and I'll delegate to the appropriate staging-transformer agent for complete processing.