Initial commit
186
commands/transform-table.md
Normal file
@@ -0,0 +1,186 @@
---
name: transform-table
description: Transform a single database table to staging format with data quality improvements, PII handling, and JSON extraction
---

# Transform Single Table to Staging

## ⚠️ CRITICAL: This command enforces strict sub-agent delegation

I'll help you transform a database table to staging format using the appropriate staging-transformer sub-agent (Presto/Trino or Hive).

---

## Required Information

Please provide the following details:

### 1. Source Table
- **Database Name**: Source database (e.g., `client_src`, `demo_db`)
- **Table Name**: Source table name (e.g., `customer_profiles_histunion`)

### 2. Target Configuration
- **Staging Database**: Target database (default: `client_stg`)
- **Lookup Database**: Reference database for rules (default: `client_config`)

### 3. SQL Engine (Optional)
- **Engine**: Choose one:
  - `presto` or `trino` - Presto/Trino SQL engine (default)
  - `hive` - Hive SQL engine
- If not specified, the engine defaults to Presto/Trino

### 4. Transformation Requirements (Optional)
- **Deduplication**: Required? (checked against `client_config.staging_trnsfrm_rules`)
- **JSON Columns**: Detected and processed automatically
- **Join Logic**: Any joins needed? (checked against `additional_rules`)

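The inputs above, together with their stated defaults, can be pictured as a small record. The sketch below is illustrative only: the class name `TransformRequest` and its field names are hypothetical and not part of this command. Only the default values (`client_stg`, `client_config`, Presto/Trino) are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformRequest:
    """Hypothetical container for the inputs this command asks for."""

    source_db: str                    # required, e.g. "client_src"
    table_name: str                   # required, e.g. "customer_profiles_histunion"
    staging_db: str = "client_stg"    # documented default
    lookup_db: str = "client_config"  # documented default
    engine: str = "presto"            # documented default (Presto/Trino)


# Only the two required fields are supplied; everything else uses the defaults.
request = TransformRequest("client_src", "customer_profiles_histunion")
print(request.staging_db, request.lookup_db, request.engine)
```

Supplying only the source database and table name is enough; the remaining fields fall back to the documented defaults.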

---

## What I'll Do

### Step 1: Detect SQL Engine
I will determine the appropriate sub-agent:
- **Presto/Trino keywords** → `staging-transformer-presto`
- **Hive keywords** → `staging-transformer-hive`
- **No specification** → `staging-transformer-presto` (default)

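The selection rule in Step 1 can be sketched as a small function. This is a minimal illustration, not the command's actual detection logic: the function name `select_sub_agent` is hypothetical, and matching on a standalone word `hive` is an assumption about what "Hive keywords" means. Only the two sub-agent names come from the list above.

```python
def select_sub_agent(request_text: str) -> str:
    """Pick a sub-agent name from engine keywords in the request (assumed rule)."""
    # Compare whole words only, so a table such as "client_src.hive_events"
    # is not mistaken for an engine keyword.
    words = [word.strip(".,;:!?") for word in request_text.lower().split()]
    if "hive" in words:
        return "staging-transformer-hive"
    # "presto", "trino", or no engine keyword all resolve to the default agent.
    return "staging-transformer-presto"


print(select_sub_agent("Transform table client_src.klaviyo_events_histunion using Hive"))
print(select_sub_agent("Transform table client_src.customer_profiles_histunion"))
```

In this sketch a request that names both engines resolves to Hive. That tie-break is an arbitrary choice made for the example.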

### Step 2: Delegate to Specialized Agent
I will invoke the appropriate staging-transformer agent with:
- Complete table transformation context
- All mandatory requirements (13 rules)
- Engine-specific SQL generation
- Full compliance validation

### Step 3: Sub-Agent Will Execute
The specialized agent will:
1. **Validate table existence** (MANDATORY first step)
2. **Analyze metadata** (columns, types, data samples)
3. **Check configuration** (deduplication rules, additional rules)
4. **Detect JSON columns** (automatic processing)
5. **Generate SQL files**:
   - `staging/init_queries/{source_db}_{table_name}_init.sql` (Presto)
   - `staging/queries/{source_db}_{table_name}.sql` (Presto)
   - `staging/queries/{source_db}_{table_name}_upsert.sql` (if dedup, Presto)
   - OR `staging_hive/queries/{source_db}_{table_name}.sql` (Hive)
6. **Update configuration**: `staging/config/src_params.yml` or `staging_hive/config/src_params.yml`
7. **Create/verify DIG file**: `staging/staging_transformation.dig` or `staging_hive/staging_hive.dig`
8. **Execute git workflow**: Commit, branch, push, PR creation


---

## Quality Assurance

The sub-agent ensures complete compliance with all requirements:

✅ **Column Limit Management** (max 200 columns)
✅ **JSON Detection & Extraction** (automatic)
✅ **Date Processing** (4 outputs per date column)
✅ **Email/Phone Validation** (with hashing)
✅ **String Standardization** (UPPER, TRIM, NULL handling)
✅ **Deduplication Logic** (if configured)
✅ **Join Processing** (if specified)
✅ **Incremental Processing** (state tracking)
✅ **SQL File Creation** (init, incremental, upsert)
✅ **DIG File Management** (conditional creation)
✅ **Configuration Update** (src_params.yml)
✅ **Git Workflow** (complete automation)
✅ **Treasure Data Compatibility** (VARCHAR/BIGINT timestamps)

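To make two of the checks above concrete, here is a row-level sketch of string standardization and email hashing. It is written in Python purely for illustration; the sub-agent generates SQL, and its actual rules are not shown in this document. The function names, the email pattern, the choice of SHA-256, and the mapping of empty strings to NULL are all assumptions.

```python
import hashlib
import re
from typing import Optional

# Deliberately simple pattern; the real validation rule is an unknown here.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def standardize_string(value: Optional[str]) -> Optional[str]:
    """TRIM, then UPPER; empty results become None (assumed NULL handling)."""
    if value is None:
        return None
    cleaned = value.strip().upper()
    return cleaned or None


def hash_valid_email(value: Optional[str]) -> Optional[str]:
    """Return a SHA-256 hex digest for a well-formed email, else None (assumed)."""
    if value is None:
        return None
    email = value.strip().lower()  # normalize before hashing so variants agree
    if not EMAIL_PATTERN.match(email):
        return None
    return hashlib.sha256(email.encode("utf-8")).hexdigest()


print(standardize_string("  acme corp "))
print(hash_valid_email("not-an-email"))
```

Normalizing the address before hashing means `User@Example.com` and `user@example.com` produce the same digest, which is usually what downstream joins on hashed PII need.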

---

## Output Files

### For Presto/Trino Engine:
1. `staging/init_queries/{source_db}_{table_name}_init.sql` - Initial load SQL
2. `staging/queries/{source_db}_{table_name}.sql` - Incremental SQL
3. `staging/queries/{source_db}_{table_name}_upsert.sql` - Upsert SQL (if dedup)
4. `staging/config/src_params.yml` - Updated configuration
5. `staging/staging_transformation.dig` - Workflow (created if not exists)

### For Hive Engine:
1. `staging_hive/queries/{source_db}_{table_name}.sql` - Combined SQL
2. `staging_hive/config/src_params.yml` - Updated configuration
3. `staging_hive/staging_hive.dig` - Workflow (created if not exists)
4. `staging_hive/queries/get_max_time.sql` - Template (created if not exists)
5. `staging_hive/queries/get_stg_rows_for_delete.sql` - Template (created if not exists)

### Plus:
- Git commit with comprehensive message
- Pull request with transformation summary
- Validation report

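The file lists in this section follow a fixed naming pattern, so the expected paths for a given table can be computed ahead of time, for example to check a pull request. The helper below is a hypothetical sketch: the name `expected_output_files` and its parameters are not part of the command, and only the path templates are taken from the lists above.

```python
from typing import List


def expected_output_files(engine: str, source_db: str, table_name: str,
                          dedup: bool = False) -> List[str]:
    """List the files named in this section for one table and engine."""
    stem = f"{source_db}_{table_name}"
    if engine == "hive":
        return [
            f"staging_hive/queries/{stem}.sql",
            "staging_hive/config/src_params.yml",
            "staging_hive/staging_hive.dig",
            "staging_hive/queries/get_max_time.sql",
            "staging_hive/queries/get_stg_rows_for_delete.sql",
        ]
    # Any other value is treated as Presto/Trino, the documented default.
    files = [
        f"staging/init_queries/{stem}_init.sql",
        f"staging/queries/{stem}.sql",
    ]
    if dedup:
        # The upsert query is only listed when deduplication applies.
        files.append(f"staging/queries/{stem}_upsert.sql")
    files += ["staging/config/src_params.yml", "staging/staging_transformation.dig"]
    return files


for path in expected_output_files("presto", "client_src", "customer_profiles_histunion"):
    print(path)
```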

---

## Example Usage

### Example 1: Presto Engine (Default)
```
User: Transform table client_src.customer_profiles_histunion
→ Engine: Presto (default)
→ Sub-agent: staging-transformer-presto
→ Output: staging/ directory files
```

### Example 2: Hive Engine (Explicit)
```
User: Transform table client_src.klaviyo_events_histunion using Hive
→ Engine: Hive
→ Sub-agent: staging-transformer-hive
→ Output: staging_hive/ directory files
```

### Example 3: With Custom Databases
```
User: Transform demo_db.orders_histunion
Use demo_db_stg as staging database
Use client_config for lookup
→ Engine: Presto (default)
→ Custom databases applied
```


---

## Next Steps After Transformation

1. **Review generated files**:
   ```bash
   ls -l staging/queries/
   ls -l staging/init_queries/
   cat staging/config/src_params.yml
   ```

2. **Review Pull Request**:
   - Check transformation summary
   - Verify all validation gates passed
   - Review generated SQL

3. **Test the workflow**:
   ```bash
   cd staging
   # Replace PROJECT_NAME with your workflow project name; push requires one
   td wf push PROJECT_NAME
   td wf run staging_transformation.dig
   ```

4. **Monitor execution**:
   ```sql
   SELECT * FROM client_config.inc_log
   WHERE table_name = '{your_table}'
   ORDER BY inc_value DESC
   LIMIT 1
   ```


---

## Production-Ready Guarantee

All transformations will:
- ✅ Work the first time
- ✅ Follow consistent patterns
- ✅ Include complete error handling
- ✅ Include comprehensive data quality
- ✅ Be maintainable and documented
- ✅ Match production standards exactly

---

**Ready to proceed? Please provide the source database and table name, and I'll delegate to the appropriate staging-transformer agent for complete processing.**