| name | description |
|---|---|
| unify-create-config | Generate core ID unification configuration files (unify.yml and id_unification.dig) |
Create Core Unification Configuration
Overview
I'll generate core ID unification configuration files using the id-unification-creator specialized agent.
This command creates TD-COMPLIANT unification files:
- ✅ DYNAMIC CONFIGURATION - Based on prep table analysis
- ✅ METHOD-SPECIFIC - persistent_id OR canonical_id (never both)
- ✅ REGIONAL ENDPOINTS - Correct URL for your region
- ✅ SCHEMA VALIDATION - Prevents first-run failures
Prerequisites
REQUIRED: Prep table configuration must exist:
- unification/config/environment.yml - Client configuration
- unification/config/src_prep_params.yml - Prep table mappings
If you haven't created these yet, run:
- /cdp-unification:unify-create-prep first, OR
- /cdp-unification:unify-setup for complete end-to-end setup
What You Need to Provide
1. ID Method Selection
Choose ONE method:
Option A: persistent_id (RECOMMENDED)
- Stable IDs that persist across updates
- Better for customer data platforms
- Recommended for most use cases
- Provide persistent_id name (e.g., td_claude_id, stable_customer_id)
Option B: canonical_id
- Traditional approach with merge capabilities
- Good for legacy systems
- Provide canonical_id name (e.g., master_customer_id)
2. Update Strategy
- Full Refresh: Reprocess all data each time (full_refresh: true)
- Incremental: Process only new/updated records (full_refresh: false)
3. Regional Endpoint
Choose your Treasure Data region:
- US: https://api-cdp.treasuredata.com/unifications/workflow_call
- EU: https://api-cdp.eu01.treasuredata.com/unifications/workflow_call
- Asia Pacific: https://api-cdp.ap02.treasuredata.com/unifications/workflow_call
- Japan: https://api-cdp.treasuredata.co.jp/unifications/workflow_call
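The region choice reduces to a lookup from region name to endpoint URL. A minimal Python sketch, using only the four URLs listed above; the function name `endpoint_for` and the lowercase region keys are illustrative and not part of any TD tooling:

```python
# Region keys are normalized names chosen for this sketch; the URLs are the ones listed above.
REGIONAL_ENDPOINTS = {
    "us": "https://api-cdp.treasuredata.com/unifications/workflow_call",
    "eu": "https://api-cdp.eu01.treasuredata.com/unifications/workflow_call",
    "asia pacific": "https://api-cdp.ap02.treasuredata.com/unifications/workflow_call",
    "japan": "https://api-cdp.treasuredata.co.jp/unifications/workflow_call",
}


def endpoint_for(region: str) -> str:
    """Return the unification endpoint for a region, failing loudly on typos."""
    key = region.strip().lower()
    if key not in REGIONAL_ENDPOINTS:
        raise ValueError(
            f"unknown region {region!r}; expected one of {sorted(REGIONAL_ENDPOINTS)}"
        )
    return REGIONAL_ENDPOINTS[key]
```

Raising on an unknown region, rather than falling back to the US URL, keeps a mistyped region from silently sending the call to the wrong endpoint.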
4. Unification Name
- Name for this unification project (e.g., claude, customer_360)
What I'll Do
Step 1: Validate Prerequisites
I'll check that these files exist:
- unification/config/environment.yml
- unification/config/src_prep_params.yml
And extract:
- Client short name
- Unified input table name
- All prep table configurations with column mappings
Step 2: Extract Key Information
I'll parse src_prep_params.yml to identify:
- All unique alias_as column names
- Key types: email, phone, td_client_id, td_global_id, customer_id, etc.
- Complete list of available keys for merge_by_keys
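One way to picture the key extraction is a walk over the parsed YAML that collects every value stored under an `alias_as` key. The sketch below assumes the file has already been loaded into plain dicts and lists (the loading step is omitted), and the `sample` structure is a hypothetical stand-in, since the real layout of src_prep_params.yml is not shown here:

```python
def collect_alias_keys(node, found=None):
    """Collect every `alias_as` value in a parsed YAML tree, in first-seen order."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "alias_as" and isinstance(value, str):
                if value not in found:  # keep each key name once
                    found.append(value)
            else:
                collect_alias_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_alias_keys(item, found)
    return found


# Hypothetical stand-in for the parsed contents of src_prep_params.yml.
sample = {
    "prep_tbls": [
        {"columns": [{"name": "email_address", "alias_as": "email"},
                     {"name": "cookie", "alias_as": "td_client_id"}]},
        {"columns": [{"name": "mail", "alias_as": "email"},
                     {"name": "mobile", "alias_as": "phone"}]},
    ]
}
keys = collect_alias_keys(sample)
```

Because the walk is recursive, it does not depend on how deeply the column mappings are nested, only on the `alias_as` key name that the document mentions.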
Step 3: Generate unification/config/unify.yml
I'll create:
name: {unif_name}
keys:
  - name: email
    invalid_texts: ['']
  - name: td_client_id
    invalid_texts: ['']
  - name: phone
    invalid_texts: ['']
  # ... ALL detected key types
tables:
  - database: ${client_short_name}_${stg}
    table: ${globals.unif_input_tbl}
    incremental_columns: [time]
    key_columns:
      - {column: email, key: email}
      - {column: td_client_id, key: td_client_id}
      - {column: phone, key: phone}
      # ... ALL alias_as columns mapped
# ONLY ONE of these sections (based on your selection):
persistent_ids:
  - name: {persistent_id_name}
    merge_by_keys: [email, td_client_id, phone, ...]
    merge_iterations: 15
# OR
canonical_ids:
  - name: {canonical_id_name}
    merge_by_keys: [email, td_client_id, phone, ...]
    merge_iterations: 15
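The "only one of these sections" rule can be sketched as a function that builds the configuration as a plain dict and emits exactly one ID section. This is an illustration of the template above, not the agent's actual implementation; `build_unify_config` and its parameters are hypothetical names, and serializing the dict to YAML is left out:

```python
def build_unify_config(name, keys, method, id_name, merge_iterations=15):
    """Return a dict mirroring the unify.yml template with exactly one ID section."""
    sections = {"persistent_id": "persistent_ids", "canonical_id": "canonical_ids"}
    if method not in sections:
        raise ValueError(f"method must be one of {sorted(sections)}, got {method!r}")
    return {
        "name": name,
        # One entry per detected key, with the empty string treated as invalid.
        "keys": [{"name": k, "invalid_texts": [""]} for k in keys],
        "tables": [{
            "database": "${client_short_name}_${stg}",
            "table": "${globals.unif_input_tbl}",
            "incremental_columns": ["time"],
            "key_columns": [{"column": k, "key": k} for k in keys],
        }],
        # Only the section for the selected method is emitted.
        sections[method]: [{
            "name": id_name,
            "merge_by_keys": list(keys),
            "merge_iterations": merge_iterations,
        }],
    }


config = build_unify_config(
    "customer_360", ["email", "td_client_id"], "persistent_id", "td_claude_id"
)
```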
Step 4: Validate and Update Schema (CRITICAL)
I'll prevent first-run failures by:
- Reading unify.yml to extract the merge_by_keys list
- Reading queries/create_schema.sql to check existing columns
- Comparing required vs existing columns
- Updating create_schema.sql if columns are missing:
  - Add all keys from merge_by_keys as varchar
  - Add source, time, ingest_time columns
  - Update BOTH table definitions (main and tmp)
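The comparison step can be sketched as follows. The regular expression is a deliberately simple assumption about how create_schema.sql is written (plain `CREATE TABLE ... (col type, ...);` statements, no types with embedded commas such as `decimal(10,2)`); the real file may need a proper SQL parser. The function and table names are hypothetical:

```python
import re

# Columns the schema needs in addition to the merge keys, per the steps above.
BASE_COLUMNS = ["source", "time", "ingest_time"]

CREATE_TABLE = re.compile(
    r"create\s+table\s+(?:if\s+not\s+exists\s+)?([\w.${}]+)\s*\((.*?)\)\s*;",
    re.IGNORECASE | re.DOTALL,
)


def table_columns(sql: str) -> dict[str, list[str]]:
    """Map each CREATE TABLE name to its column names (simple DDL only)."""
    tables = {}
    for name, body in CREATE_TABLE.findall(sql):
        # The first token of each comma-separated part is taken as the column name.
        tables[name] = [part.split()[0] for part in body.split(",") if part.strip()]
    return tables


def missing_columns(merge_by_keys, sql: str) -> dict[str, list[str]]:
    """For every table definition, list required columns that are absent."""
    required = list(merge_by_keys) + BASE_COLUMNS
    return {
        table: [col for col in required if col not in cols]
        for table, cols in table_columns(sql).items()
    }


sql = """
CREATE TABLE IF NOT EXISTS unif_input (source varchar, time bigint, email varchar);
CREATE TABLE IF NOT EXISTS unif_input_tmp (source varchar, time bigint);
"""
report = missing_columns(["email", "phone"], sql)
```

Reporting per table makes it visible when the main and tmp definitions have drifted apart, which is the case the "update BOTH table definitions" step guards against.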
Step 5: Generate unification/id_unification.dig
I'll create:
timezone: UTC
_export:
  !include : config/environment.yml
  !include : config/src_prep_params.yml
+call_unification:
  http_call>: {REGIONAL_ENDPOINT_URL}
  headers:
    - authorization: ${secret:td.apikey}
    - content-type: application/json
  method: POST
  retry: true
  content_format: json
  content:
    run_persistent_ids: {true/false}  # ONLY if persistent_id
    run_canonical_ids: {true/false}   # ONLY if canonical_id
    run_enrichments: true
    run_master_tables: true
    full_refresh: {true/false}
    keep_debug_tables: true
    unification:
      !include : config/unify.yml
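How the two user choices (ID method and update strategy) map onto the `content` flags can be sketched like this. The function name and the accepted string values are assumptions for illustration; the `unification` include is omitted because it is resolved by the workflow engine rather than computed:

```python
def workflow_content(method: str, strategy: str) -> dict:
    """Build the request flags for the selected ID method and update strategy."""
    method_flags = {
        "persistent_id": "run_persistent_ids",
        "canonical_id": "run_canonical_ids",
    }
    refresh = {"full_refresh": True, "incremental": False}
    if method not in method_flags:
        raise ValueError(f"unknown ID method {method!r}")
    if strategy not in refresh:
        raise ValueError(f"unknown update strategy {strategy!r}")
    return {
        method_flags[method]: True,  # only the selected method's flag is emitted
        "run_enrichments": True,
        "run_master_tables": True,
        "full_refresh": refresh[strategy],
        "keep_debug_tables": True,
    }


content = workflow_content("persistent_id", "incremental")
```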
Expected Output
Files Created
unification/
├── config/
│   └── unify.yml              ✓ Dynamic configuration
├── queries/
│   └── create_schema.sql      ✓ Updated with all columns
└── id_unification.dig         ✓ Core unification workflow
Example unify.yml (persistent_id method)
name: customer_360
keys:
  - name: email
    invalid_texts: ['']
  - name: td_client_id
    invalid_texts: ['']
    valid_regexp: '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
tables:
  - database: ${client_short_name}_${stg}
    table: ${globals.unif_input_tbl}
    incremental_columns: [time]
    key_columns:
      - {column: email, key: email}
      - {column: td_client_id, key: td_client_id}
persistent_ids:
  - name: td_claude_id
    merge_by_keys: [email, td_client_id]
    merge_iterations: 15
Example id_unification.dig (US region, incremental)
timezone: UTC
_export:
  !include : config/environment.yml
  !include : config/src_prep_params.yml
+call_unification:
  http_call>: https://api-cdp.treasuredata.com/unifications/workflow_call
  headers:
    - authorization: ${secret:td.apikey}
    - content-type: application/json
  method: POST
  retry: true
  content_format: json
  content:
    run_persistent_ids: true
    run_enrichments: true
    run_master_tables: true
    full_refresh: false
    keep_debug_tables: true
    unification:
      !include : config/unify.yml
Critical Requirements
✅ Dynamic Configuration
- All keys detected from src_prep_params.yml
- All column mappings from prep analysis
- Method-specific configuration (never both)
⚠️ Schema Completeness
- create_schema.sql MUST contain ALL columns from merge_by_keys
- Prevents "column not found" errors on first run
- Updates both main and tmp table definitions
⚠️ Config File Inclusion
- id_unification.dig MUST include BOTH config files in _export:
  - environment.yml - For ${client_short_name}_${stg}
  - src_prep_params.yml - For ${globals.unif_input_tbl}
⚠️ Regional Endpoint
- Must use exact URL for selected region
- Different endpoints for US, EU, Asia Pacific, Japan
Validation Checklist
Before completing, I'll verify:
- unify.yml contains all detected key types
- key_columns section maps ALL alias_as columns
- Only ONE ID method section exists
- merge_by_keys includes ALL available keys
- CRITICAL: create_schema.sql contains ALL columns from merge_by_keys
- CRITICAL: Both table definitions updated (main and tmp)
- id_unification.dig has correct regional endpoint
- CRITICAL: _export includes BOTH config files
- Workflow flags match selected method only
- Proper TD YAML/DIG syntax
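Several of the unify.yml items in this checklist are mechanical and could be checked in code. The sketch below covers three of them (one ID method section, key_columns referencing declared keys, merge_by_keys covering all keys) against a config already parsed into a dict; `check_unify_config` is a hypothetical helper, not part of the agent:

```python
def check_unify_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    declared = {key["name"] for key in config.get("keys", [])}

    # Only ONE ID method section may exist.
    id_sections = [s for s in ("persistent_ids", "canonical_ids") if config.get(s)]
    if len(id_sections) != 1:
        problems.append(f"expected exactly one ID method section, found {id_sections}")

    # Every key_columns entry must point at a declared key.
    for table in config.get("tables", []):
        for mapping in table.get("key_columns", []):
            if mapping["key"] not in declared:
                problems.append(f"key_columns uses undeclared key {mapping['key']!r}")

    # merge_by_keys must include all available keys.
    for section in id_sections:
        for id_def in config[section]:
            missing = declared - set(id_def.get("merge_by_keys", []))
            if missing:
                problems.append(f"{id_def['name']}: merge_by_keys lacks {sorted(missing)}")
    return problems


good = {
    "keys": [{"name": "email"}, {"name": "td_client_id"}],
    "tables": [{"key_columns": [{"column": "email", "key": "email"},
                                {"column": "td_client_id", "key": "td_client_id"}]}],
    "persistent_ids": [{"name": "td_claude_id",
                        "merge_by_keys": ["email", "td_client_id"]}],
}
problems = check_unify_config(good)
```

Returning all problems at once, instead of stopping at the first, lets a single pass report everything that needs fixing before the first workflow run.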
Success Criteria
All generated files will:
- ✅ TD-COMPLIANT - Work without modification in TD
- ✅ DYNAMICALLY CONFIGURED - Based on actual prep analysis
- ✅ METHOD-ACCURATE - Exact implementation of selected method
- ✅ REGIONALLY CORRECT - Proper endpoint for region
- ✅ SCHEMA-COMPLETE - All required columns present
Next Steps
After creating core config, you can:
- Test unification workflow: dig run unification/id_unification.dig
- Add enrichment: Use /cdp-unification:unify-setup to add staging enrichment
- Create main orchestrator: Combine prep + unification + enrichment
Getting Started
Ready to create core unification config? Please provide:
- ID Method:
  - Choose: persistent_id or canonical_id
  - Provide ID name: e.g., td_claude_id
- Update Strategy:
  - Choose: incremental or full_refresh
- Regional Endpoint:
  - Choose: US, EU, Asia Pacific, or Japan
- Unification Name:
  - e.g., customer_360, claude
Example:
ID Method: persistent_id
ID Name: td_claude_id
Update Strategy: incremental
Region: US
Unification Name: customer_360
I'll call the id-unification-creator agent to generate all core unification files.
Let's create your unification configuration!