# Database Connections Guide

Complete guide for connecting MXCP to external databases (PostgreSQL, MySQL, SQLite, SQL Server) using DuckDB's ATTACH functionality and dbt integration.
## Overview

MXCP can connect to external databases in two ways:

1. **Direct querying** via DuckDB ATTACH (read data from external databases)
2. **dbt integration** (transform external data using dbt sources and models)

**Key principle**: External databases → DuckDB (via ATTACH or dbt) → MXCP tools
## When to Use Database Connections

**Use database connections when**:

- You have existing data in PostgreSQL, MySQL, or other SQL databases
- You want to query production databases (read-only recommended)
- You need to join external data with local data
- You want to cache/materialize external data locally

**Don't use database connections when**:

- You can export the data to CSV (use dbt seeds instead - simpler and safer)
- You need real-time writes (MXCP is read-focused)
- The database has complex security requirements (use an API wrapper instead)
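The first "don't" above deserves a concrete sketch: if the external data fits in a flat file, exporting it once to a CSV under `seeds/` and loading it with `dbt seed` avoids a live connection entirely. The path and column names below are illustrative, not from a real project:

```python
import csv
import os
import tempfile

# Illustrative only: a real project would write to seeds/customers.csv
seed_dir = tempfile.mkdtemp()
seed_path = os.path.join(seed_dir, "customers.csv")

rows = [(1, "Ada", "US"), (2, "Grace", "UK")]
with open(seed_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["customer_id", "name", "country"])  # header row
    writer.writerows(rows)

# `dbt seed` would now load this file as a table in DuckDB
with open(seed_path) as f:
    print(sum(1 for _ in f))  # → 3 (header + 2 data rows)
```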
## Method 1: Direct Database Access with ATTACH

### PostgreSQL Connection

#### Basic ATTACH Syntax

```sql
-- Attach PostgreSQL database
INSTALL postgres;
LOAD postgres;

ATTACH 'host=localhost port=5432 dbname=mydb user=myuser password=mypass'
AS postgres_db (TYPE POSTGRES);

-- Query attached database
SELECT * FROM postgres_db.public.customers WHERE country = 'US';
```
#### Complete Working Example

**Project structure**:

```
postgres-query/
├── mxcp-site.yml
├── config.yml              # Database credentials
├── tools/
│   ├── query_customers.yml
│   └── get_orders.yml
└── sql/
    └── setup.sql           # ATTACH commands
```
**Step 1: Create config.yml with database credentials**

```yaml
# config.yml (in project directory)
mxcp: 1

profiles:
  default:
    secrets:
      - name: postgres_connection
        type: env
        parameters:
          env_var: POSTGRES_CONNECTION_STRING

      # Alternative: separate credentials
      - name: db_host
        type: env
        parameters:
          env_var: DB_HOST
      - name: db_user
        type: env
        parameters:
          env_var: DB_USER
      - name: db_password
        type: env
        parameters:
          env_var: DB_PASSWORD
```
**Step 2: Set environment variables**

```bash
# Option 1: Connection string
export POSTGRES_CONNECTION_STRING="host=localhost port=5432 dbname=mydb user=myuser password=mypass"

# Option 2: Separate credentials
export DB_HOST="localhost"
export DB_USER="myuser"
export DB_PASSWORD="mypass"
```
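The two options are interchangeable: the separate variables from Option 2 can be assembled into the Option 1 libpq-style string in code. A minimal sketch (the `DB_*` variable names follow the guide's scheme; values are examples):

```python
import os

# Set the same variables as Option 2 above (example values)
os.environ["DB_HOST"] = "localhost"
os.environ["DB_USER"] = "myuser"
os.environ["DB_PASSWORD"] = "mypass"

def pg_connection_string(dbname: str) -> str:
    """Assemble a libpq-style connection string from DB_* variables."""
    return (
        f"host={os.environ['DB_HOST']} port=5432 dbname={dbname} "
        f"user={os.environ['DB_USER']} password={os.environ['DB_PASSWORD']}"
    )

print(pg_connection_string("mydb"))
# → host=localhost port=5432 dbname=mydb user=myuser password=mypass
```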
**Step 3: Create SQL setup file**

```sql
-- sql/setup.sql
-- Install and load PostgreSQL extension
INSTALL postgres;
LOAD postgres;

-- Attach database (connection string from environment)
ATTACH 'host=${DB_HOST} port=5432 dbname=mydb user=${DB_USER} password=${DB_PASSWORD}'
AS prod_db (TYPE POSTGRES);
```
**Step 4: Create query tool**

```yaml
# tools/query_customers.yml
mxcp: 1
tool:
  name: query_customers
  description: "Query customers from PostgreSQL database by country"
  parameters:
    - name: country
      type: string
      description: "Filter by country code (e.g., 'US', 'UK')"
      required: false
  return:
    type: array
    items:
      type: object
      properties:
        customer_id: { type: integer }
        name: { type: string }
        email: { type: string }
        country: { type: string }
  source:
    code: |
      -- First ensure PostgreSQL is attached
      INSTALL postgres;
      LOAD postgres;
      ATTACH IF NOT EXISTS 'host=${DB_HOST} port=5432 dbname=mydb user=${DB_USER} password=${DB_PASSWORD}'
      AS prod_db (TYPE POSTGRES);

      -- Query attached database
      SELECT
        customer_id,
        name,
        email,
        country
      FROM prod_db.public.customers
      WHERE $country IS NULL OR country = $country
      ORDER BY customer_id
      LIMIT 1000
  tests:
    - name: "test_connection"
      arguments: []
      # The test verifies the connection works
```
**Step 5: Validate and test**

```bash
# Set credentials
export DB_HOST="localhost"
export DB_USER="myuser"
export DB_PASSWORD="mypass"

# Validate structure
mxcp validate

# Test tool
mxcp run tool query_customers --param country="US"

# Start server
mxcp serve
```
### MySQL Connection

```sql
-- Install MySQL extension
INSTALL mysql;
LOAD mysql;

-- Attach MySQL database
ATTACH 'host=localhost port=3306 database=mydb user=root password=pass'
AS mysql_db (TYPE MYSQL);

-- Query
SELECT * FROM mysql_db.orders WHERE order_date >= '2024-01-01';
```
**Complete tool example**:

```yaml
# tools/query_mysql_orders.yml
mxcp: 1
tool:
  name: query_mysql_orders
  description: "Query orders from MySQL database"
  parameters:
    - name: start_date
      type: string
      format: date
      required: false
    - name: status
      type: string
      required: false
  return:
    type: array
    items:
      type: object
  source:
    code: |
      INSTALL mysql;
      LOAD mysql;
      ATTACH IF NOT EXISTS 'host=${MYSQL_HOST} database=${MYSQL_DB} user=${MYSQL_USER} password=${MYSQL_PASSWORD}'
      AS mysql_db (TYPE MYSQL);

      SELECT
        order_id,
        customer_id,
        order_date,
        total_amount,
        status
      FROM mysql_db.orders
      WHERE ($start_date IS NULL OR order_date >= $start_date)
        AND ($status IS NULL OR status = $status)
      ORDER BY order_date DESC
      LIMIT 1000
```
### SQLite Connection

```sql
-- Attach SQLite database
ATTACH 'path/to/database.db' AS sqlite_db (TYPE SQLITE);

-- Query
SELECT * FROM sqlite_db.users WHERE active = true;
```
**Tool example**:

```yaml
# tools/query_sqlite.yml
mxcp: 1
tool:
  name: query_sqlite_users
  description: "Query users from SQLite database"
  parameters:
    - name: active_only
      type: boolean
      default: true
  return:
    type: array
  source:
    code: |
      ATTACH IF NOT EXISTS '${SQLITE_DB_PATH}' AS sqlite_db (TYPE SQLITE);

      SELECT user_id, username, email, created_at
      FROM sqlite_db.users
      WHERE $active_only = false OR active = true
      ORDER BY created_at DESC
```
### SQL Server Connection

**Note**: unlike the PostgreSQL, MySQL, and SQLite scanners, SQL Server support is not shipped as a core DuckDB extension. Verify that a suitable extension is available in your DuckDB build before relying on this pattern.

```sql
-- Install SQL Server extension
INSTALL sqlserver;
LOAD sqlserver;

-- Attach SQL Server database
ATTACH 'Server=localhost;Database=mydb;Uid=user;Pwd=pass;'
AS sqlserver_db (TYPE SQLSERVER);

-- Query
SELECT * FROM sqlserver_db.dbo.products WHERE category = 'Electronics';
```
## Method 2: dbt Integration with External Databases

**Use dbt when**:

- You want to materialize/cache external data locally
- You need to transform external data before querying
- You want data quality tests on external data
- You prefer declarative SQL over ATTACH statements

### dbt Sources for External Databases

**Pattern**: External DB → dbt source → dbt model → MXCP tool
#### Step 1: Configure dbt profile for external database

```yaml
# profiles.yml (auto-generated by MXCP, or manually edited)
my_project:
  outputs:
    dev:
      type: postgres          # or mysql, sqlserver, etc.
      host: localhost
      port: 5432
      user: "{{ env_var('DB_USER') }}"
      password: "{{ env_var('DB_PASSWORD') }}"
      dbname: mydb
      schema: public
      threads: 4

    # Hybrid: use DuckDB for local, Postgres for source
    hybrid:
      type: duckdb
      path: "{{ env_var('MXCP_DUCKDB_PATH', 'data/db-default.duckdb') }}"
  target: hybrid
```
#### Step 2: Define external database as dbt source

```yaml
# models/sources.yml
version: 2

sources:
  - name: production_db
    description: "Production PostgreSQL database"
    database: postgres_db   # Matches ATTACH name
    schema: public
    tables:
      - name: customers
        description: "Customer master data"
        columns:
          - name: customer_id
            description: "Unique customer identifier"
            tests:
              - unique
              - not_null
          - name: email
            tests:
              - not_null
          - name: country
            tests:
              - not_null

      - name: orders
        description: "Order transactions"
        columns:
          - name: order_id
            tests:
              - unique
              - not_null
          - name: customer_id
            tests:
              - not_null
              - relationships:
                  to: source('production_db', 'customers')
                  field: customer_id
```
#### Step 3: Create dbt model to cache/transform external data

```sql
-- models/customer_summary.sql
-- Customer summary materialized from the production database
-- (the model's description lives in schema.yml below)
{{ config(materialized='table') }}

SELECT
    c.customer_id,
    c.name,
    c.email,
    c.country,
    COUNT(o.order_id) AS order_count,
    COALESCE(SUM(o.total_amount), 0) AS total_spent,
    MAX(o.order_date) AS last_order_date
FROM {{ source('production_db', 'customers') }} c
LEFT JOIN {{ source('production_db', 'orders') }} o
    ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name, c.email, c.country
```
```yaml
# models/schema.yml
version: 2

models:
  - name: customer_summary
    description: "Aggregated customer metrics from production"
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null
      - name: order_count
        tests:
          - not_null
      - name: total_spent
        tests:
          - not_null
```
#### Step 4: Run dbt to materialize data

```bash
# Test connection to external database
dbt debug

# Run models (fetches from external DB, materializes in DuckDB)
dbt run --select customer_summary

# Test data quality
dbt test --select customer_summary
```
#### Step 5: Create MXCP tool to query materialized data

```yaml
# tools/get_customer_summary.yml
mxcp: 1
tool:
  name: get_customer_summary
  description: "Get customer summary statistics from cached production data"
  parameters:
    - name: country
      type: string
      required: false
    - name: min_orders
      type: integer
      default: 0
  return:
    type: array
    items:
      type: object
      properties:
        customer_id: { type: integer }
        name: { type: string }
        order_count: { type: integer }
        total_spent: { type: number }
  source:
    code: |
      SELECT
        customer_id,
        name,
        email,
        country,
        order_count,
        total_spent,
        last_order_date
      FROM customer_summary
      WHERE ($country IS NULL OR country = $country)
        AND order_count >= $min_orders
      ORDER BY total_spent DESC
      LIMIT 100
```
#### Step 6: Refresh data periodically

```bash
# Manual refresh
dbt run --select customer_summary
```

Or create a Python tool to trigger the refresh:
```yaml
# tools/refresh_data.yml
mxcp: 1
tool:
  name: refresh_customer_data
  description: "Refresh customer summary from production database"
  language: python
  return:
    type: object
  source:
    file: ../python/refresh.py
```
```python
# python/refresh.py
import subprocess

from mxcp.runtime import reload_duckdb


def refresh_customer_data() -> dict:
    """Refresh customer summary from the external database."""

    def run_dbt():
        result = subprocess.run(
            ["dbt", "run", "--select", "customer_summary"],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            raise Exception(f"dbt run failed: {result.stderr}")

        test_result = subprocess.run(
            ["dbt", "test", "--select", "customer_summary"],
            capture_output=True,
            text=True,
        )
        if test_result.returncode != 0:
            raise Exception(f"dbt test failed: {test_result.stderr}")

    # Run dbt with exclusive database access
    reload_duckdb(
        payload_func=run_dbt,
        description="Refreshing customer data from production",
    )

    return {
        "status": "success",
        "message": "Customer data refreshed from production database",
    }
```
### Incremental dbt Models for Large Tables

For large external tables, use incremental materialization:

```sql
-- models/orders_incremental.sql
{{ config(
    materialized='incremental',
    unique_key='order_id',
    on_schema_change='fail'
) }}

SELECT
    order_id,
    customer_id,
    order_date,
    total_amount,
    status
FROM {{ source('production_db', 'orders') }}

{% if is_incremental() %}
-- Only fetch new/updated orders
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}
```
```bash
# First run: fetch all historical data
dbt run --select orders_incremental --full-refresh

# Subsequent runs: only fetch new data
dbt run --select orders_incremental
```
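The incremental logic boils down to a high-water-mark filter: find the newest `order_date` already cached, then pull only rows past it. A stand-in sketch using the stdlib `sqlite3` module in place of the real source database and DuckDB cache:

```python
import sqlite3

# "Source" database with three orders
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
src.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, "2024-01-01"), (2, "2024-02-01"), (3, "2024-03-01")])

# Local "cache" that already holds the oldest order
cache = sqlite3.connect(":memory:")
cache.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
cache.execute("INSERT INTO orders VALUES (1, '2024-01-01')")

def incremental_refresh() -> int:
    """Copy only rows newer than the cache's high-water mark."""
    (high_water,) = cache.execute("SELECT MAX(order_date) FROM orders").fetchone()
    new_rows = src.execute(
        "SELECT order_id, order_date FROM orders WHERE order_date > ?",
        (high_water,),
    ).fetchall()
    cache.executemany("INSERT INTO orders VALUES (?, ?)", new_rows)
    return len(new_rows)

print(incremental_refresh())  # → 2 (orders 2 and 3 are copied)
```

Running it again copies nothing, which is exactly why subsequent `dbt run` invocations are cheap.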
## Connection Patterns and Best Practices

### Pattern 1: Read-Only Querying

**Use case**: Query the production database directly without caching

```yaml
tool:
  name: query_live_data
  source:
    code: |
      ATTACH IF NOT EXISTS 'connection_string' AS prod (TYPE POSTGRES);
      SELECT * FROM prod.public.table WHERE ...
```

**Pros**: Always fresh data

**Cons**: Slower queries, adds load on the source database
### Pattern 2: Cached/Materialized Data

**Use case**: Cache external data in DuckDB for fast queries

```sql
-- dbt model caches external data
SELECT * FROM {{ source('external_db', 'table') }}
```

```yaml
# MXCP tool queries the cache
tool:
  source:
    code: SELECT * FROM cached_table WHERE ...
```

**Pros**: Fast queries, no load on the source database

**Cons**: Data staleness, needs periodic refresh
### Pattern 3: Hybrid (Cache + Live)

**Use case**: Cache most data, query live for real-time needs

```sql
-- Combine cached and live data
SELECT * FROM cached_historical_orders
UNION ALL
SELECT * FROM prod.public.orders
WHERE order_date >= CURRENT_DATE - INTERVAL '7 days'
```
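The same split can be sketched end to end with stdlib `sqlite3` standing in for both the DuckDB cache and the live database (table names mirror the SQL above; the seven-day window is replaced by a fixed cutoff for determinism):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cached_historical_orders (order_id INTEGER, order_date TEXT)")
con.execute("CREATE TABLE live_orders (order_id INTEGER, order_date TEXT)")
con.executemany("INSERT INTO cached_historical_orders VALUES (?, ?)",
                [(1, "2024-01-01"), (2, "2024-05-01")])
con.executemany("INSERT INTO live_orders VALUES (?, ?)",
                [(3, "2024-06-01"), (4, "2024-06-02")])

# Everything before the cutoff comes from the cache, everything after from "live";
# the disjoint date ranges keep the UNION ALL free of duplicates.
cutoff = "2024-06-01"
rows = con.execute(
    """
    SELECT order_id, order_date FROM cached_historical_orders WHERE order_date < ?
    UNION ALL
    SELECT order_id, order_date FROM live_orders WHERE order_date >= ?
    """,
    (cutoff, cutoff),
).fetchall()
print(len(rows))  # → 4
```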
### Security Best Practices

#### 1. Use Read-Only Database Users

```sql
-- PostgreSQL: Create read-only user
CREATE USER readonly_user WITH PASSWORD 'secure_password';
GRANT CONNECT ON DATABASE mydb TO readonly_user;
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
```
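The guarantee those GRANTs give a Postgres user (reads succeed, writes fail) can be demonstrated locally with SQLite's `mode=ro` URI flag:

```python
import os
import sqlite3
import tempfile

# Create a database file normally, then reopen it read-only
path = os.path.join(tempfile.mkdtemp(), "app.db")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
rw.execute("INSERT INTO customers VALUES (1, 'Ada')")
rw.commit()
rw.close()

ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
print(ro.execute("SELECT COUNT(*) FROM customers").fetchone()[0])  # → 1

try:
    ro.execute("INSERT INTO customers VALUES (2, 'Bob')")
except sqlite3.OperationalError as exc:
    print("write rejected:", exc)  # the read-only connection refuses writes
```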
#### 2. Store Credentials in Secrets

```yaml
# config.yml - NEVER commit passwords
secrets:
  - name: db_password
    type: env
    parameters:
      env_var: DB_PASSWORD

  # Production: use Vault
  - name: prod_db_password
    type: vault
    parameters:
      path: secret/data/myapp/database
      field: password
```
#### 3. Use Connection Pooling (for Python approach)

```python
# python/db_client.py
import os

import psycopg2.pool

from mxcp.runtime import on_init, on_shutdown

connection_pool = None


@on_init
def setup_pool():
    global connection_pool
    connection_pool = psycopg2.pool.SimpleConnectionPool(
        minconn=1,
        maxconn=5,
        host=os.getenv("DB_HOST"),
        database=os.getenv("DB_NAME"),
        user=os.getenv("DB_USER"),
        password=os.getenv("DB_PASSWORD"),
    )


@on_shutdown
def close_pool():
    global connection_pool
    if connection_pool:
        connection_pool.closeall()


def query_database(sql: str) -> list[dict]:
    conn = connection_pool.getconn()
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        # Convert tuple rows to dicts using the cursor's column names
        columns = [desc[0] for desc in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        connection_pool.putconn(conn)
```
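The pooling mechanism itself is small enough to sketch with stdlib pieces: a fixed set of connections checked out and returned through a `queue.Queue`. `psycopg2`'s `SimpleConnectionPool` does the same bookkeeping for Postgres; the `MiniPool` class here is a hypothetical illustration, not an MXCP or psycopg2 API:

```python
import queue
import sqlite3

class MiniPool:
    """Toy pool: pre-opened connections handed out and returned via a Queue."""

    def __init__(self, size: int):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(sqlite3.connect(":memory:", check_same_thread=False))

    def getconn(self):
        # Blocks when the pool is exhausted, bounding concurrent connections
        return self._q.get()

    def putconn(self, conn):
        self._q.put(conn)

pool = MiniPool(2)
conn = pool.getconn()
print(conn.execute("SELECT 1").fetchone()[0])  # → 1
pool.putconn(conn)  # connection goes back for reuse instead of being closed
```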
### Error Handling

#### Handle Connection Failures

```yaml
# tools/query_with_error_handling.yml
tool:
  name: safe_query
  language: python
  source:
    file: ../python/safe_query.py
```
```python
# python/safe_query.py
import os

import duckdb

from mxcp.runtime import db


def safe_query(table_name: str) -> dict:
    """Query the external database with error handling."""
    # Table names cannot be bound as query parameters, so reject
    # anything that is not a simple identifier before interpolating
    if not table_name.isidentifier():
        return {
            "success": False,
            "error": "Invalid table name",
            "message": f"'{table_name}' is not a valid identifier"
        }

    try:
        # Attach if not already attached. Note: ${VAR} expansion does not
        # happen inside Python strings, so read the credentials explicitly.
        db.execute(f"""
            INSTALL postgres;
            LOAD postgres;
            ATTACH IF NOT EXISTS 'host={os.getenv('DB_HOST')} dbname={os.getenv('DB_NAME')} user={os.getenv('DB_USER')} password={os.getenv('DB_PASSWORD')}'
            AS prod (TYPE POSTGRES);
        """)

        # Query
        results = db.execute(f"SELECT * FROM prod.public.{table_name} LIMIT 100").fetchall()

        return {
            "success": True,
            "row_count": len(results),
            "data": results
        }

    except duckdb.CatalogException:
        return {
            "success": False,
            "error": "Table not found",
            "message": f"Table {table_name} does not exist in external database",
            "suggestion": "Check table name and database connection"
        }

    except duckdb.IOException:
        return {
            "success": False,
            "error": "Connection failed",
            "message": "Could not connect to external database",
            "suggestion": "Check database credentials and network connectivity"
        }

    except Exception as e:
        return {
            "success": False,
            "error": "Unexpected error",
            "message": str(e)
        }
```
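Transient failures (network blips, connection limits) are often worth retrying before surfacing an error to the caller. A small generic sketch; `with_retries` is a hypothetical helper, not part of `mxcp.runtime`:

```python
import time

def with_retries(fn, attempts=3, base_delay=0.01):
    """Call fn, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))

# Simulated flaky query: fails twice, then succeeds
calls = {"n": 0}

def flaky_query():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("database unavailable")
    return [{"customer_id": 1}]

print(with_retries(flaky_query))  # succeeds on the third attempt
```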
### Performance Optimization

#### 1. Add Indexes on Frequently Filtered Columns

```sql
-- On the external database (PostgreSQL)
CREATE INDEX idx_customers_country ON customers(country);
CREATE INDEX idx_orders_date ON orders(order_date);
```

#### 2. Limit Result Sets

```sql
-- Always add LIMIT for large tables
SELECT * FROM prod.public.orders
WHERE order_date >= '2024-01-01'
LIMIT 1000  -- Prevent runaway result sets
```
#### 3. Materialize Complex Joins

```sql
-- Instead of running a complex join on every query,
-- create a dbt model to materialize the join
{{ config(materialized='table') }}

SELECT ... complex join logic ...
FROM {{ source('prod', 'table1') }} t1
JOIN {{ source('prod', 'table2') }} t2 ...
```
## Complete Example: PostgreSQL to MXCP

**Scenario**: Query a production PostgreSQL customer database

```bash
# 1. Create project
mkdir postgres-customers && cd postgres-customers
mxcp init --bootstrap

# 2. Create config
cat > config.yml <<'EOF'
mxcp: 1

profiles:
  default:
    secrets:
      - name: db_host
        type: env
        parameters:
          env_var: DB_HOST
      - name: db_user
        type: env
        parameters:
          env_var: DB_USER
      - name: db_password
        type: env
        parameters:
          env_var: DB_PASSWORD
EOF

# 3. Create tool
cat > tools/query_customers.yml <<'EOF'
mxcp: 1
tool:
  name: query_customers
  description: "Query customers from PostgreSQL"
  parameters:
    - name: country
      type: string
      required: false
  return:
    type: array
  source:
    code: |
      INSTALL postgres;
      LOAD postgres;
      ATTACH IF NOT EXISTS 'host=${DB_HOST} port=5432 dbname=customers user=${DB_USER} password=${DB_PASSWORD}'
      AS prod (TYPE POSTGRES);

      SELECT customer_id, name, email, country
      FROM prod.public.customers
      WHERE $country IS NULL OR country = $country
      LIMIT 100
EOF

# 4. Set credentials
export DB_HOST="localhost"
export DB_USER="readonly_user"
export DB_PASSWORD="secure_password"

# 5. Test
mxcp validate
mxcp run tool query_customers --param country="US"

# 6. Start server
mxcp serve
```
## Summary

**For external database connections**:

1. **Direct querying** → Use ATTACH with parameterized connection strings
2. **Cached data** → Use dbt sources + models for materialization
3. **Always use read-only users** for security
4. **Store credentials in environment variables** or Vault
5. **Handle connection errors** gracefully in Python tools
6. **Test with** `mxcp validate && mxcp run tool <name>`
7. **Use dbt for** large tables (incremental models) and transformations

**Decision guide**:

- Small queries, real-time data needed → ATTACH
- Large tables, can tolerate staleness → dbt materialization
- Complex transformations → dbt models
- Simple SELECT queries → ATTACH

This approach gives you full SQL database access while maintaining MXCP's security, validation, and testing workflow.