# Project Selection Guide

Decision tree and heuristics for selecting the right MXCP approach and templates based on **technical requirements**.

**Scope**: This guide helps select implementation patterns (SQL vs Python, template selection, architecture patterns) based on data sources, authentication mechanisms, and technical constraints. It does NOT help define business requirements or determine what features to build.

## Decision Tree

Use this decision tree to determine the appropriate MXCP implementation approach:

```
User Request
├─ Data File
│  ├─ CSV file
│  │  ├─ Static data → dbt seed + SQL tool
│  │  ├─ Needs transformation → dbt seed + dbt model + SQL tool
│  │  └─ Large file (>100MB) → Convert to Parquet + dbt model
│  ├─ Excel file (.xlsx, .xls)
│  │  ├─ Static/one-time → Convert to CSV + dbt seed
│  │  ├─ User upload (dynamic) → Python tool with pandas + DuckDB table
│  │  └─ Multi-sheet → Python tool to load all sheets as tables
│  ├─ JSON/Parquet
│  │  └─ DuckDB read_json/read_parquet directly in SQL tool
│  └─ Synthetic data needed
│     ├─ For testing → dbt model with GENERATE_SERIES
│     ├─ Dynamic generation → Python tool with parameters
│     └─ With statistics → Generate + analyze in single tool
│
├─ External API Integration
│  ├─ OAuth required
│  │  ├─ Google (Calendar, Sheets, etc.) → google-calendar template
│  │  ├─ Jira Cloud → jira-oauth template
│  │  ├─ Salesforce → salesforce-oauth template
│  │  └─ Other OAuth → Adapt google-calendar template
│  │
│  ├─ API Token/Basic Auth
│  │  ├─ Jira → jira template
│  │  ├─ Confluence → confluence template
│  │  ├─ Salesforce → salesforce template
│  │  ├─ Custom API → python-demo template
│  │  └─ REST API → Create new Python tool
│  │
│  └─ Public API (no auth)
│     └─ Create SQL tool with read_json/read_csv from URL
│
├─ Database Connection
│  ├─ PostgreSQL
│  │  ├─ Direct query → DuckDB ATTACH + SQL tools
│  │  └─ Cache data → dbt source + model + SQL tools
│  ├─ MySQL
│  │  ├─ Direct query → DuckDB ATTACH + SQL tools
│  │  └─ Cache data → dbt source + model
│  ├─ SQLite → DuckDB ATTACH + SQL tools (simple)
│  ├─ SQL Server → DuckDB ATTACH + SQL tools
│  └─ Other/NoSQL → Create Python tool with connection library
│
├─ Complex Logic/Processing
│  ├─ Data transformation → dbt model
│  ├─ Business logic → Python tool
│  ├─ ML/AI processing → Python tool with libraries
│  └─ Async operations → Python tool with async/await
│
└─ Authentication/Security System
   ├─ Keycloak → keycloak template
   ├─ Custom SSO → Adapt keycloak template
   └─ Policy enforcement → Use MXCP policies
```

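For the "Public API (no auth)" branch, a minimal sketch of such a SQL tool is shown below. The URL is hypothetical; `read_parquet`/`read_json` are built-in DuckDB functions, and the `httpfs` extension must be loaded for remote URLs:

```yaml
mxcp: 1
tool:
  name: fetch_public_data
  description: "Read a public Parquet file directly from its URL"
  parameters: []
  return:
    type: array
  source:
    code: |
      INSTALL httpfs; LOAD httpfs;  -- required for http(s) URLs
      SELECT *
      FROM read_parquet('https://example.com/data/events.parquet')
      LIMIT 100
```
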
## Available Project Templates

### Data-Focused Templates

#### covid_owid

**Use when**: Working with external data sources, caching datasets

**Features**:
- dbt integration for data caching
- External CSV/JSON fetching
- Data quality tests
- Incremental updates

**Example use cases**:
- "Cache COVID statistics for offline analysis"
- "Query external datasets regularly"
- "Download and transform public data"

**Key files**:
- `models/` - dbt models for data transformation
- `tools/` - SQL tools querying cached data

#### earthquakes

**Use when**: Real-time data monitoring, geospatial data

**Features**:
- Real-time API queries
- Geospatial filtering
- Time-based queries

**Example use cases**:
- "Monitor earthquake activity"
- "Query geospatial data by region"
- "Real-time event tracking"

### API Integration Templates

#### google-calendar (OAuth)

**Use when**: Integrating with Google APIs or other OAuth 2.0 services

**Features**:
- OAuth 2.0 authentication flow
- Token management
- Google API client integration
- Python endpoints with async support

**Example use cases**:
- "Connect to Google Calendar"
- "Access Google Sheets data"
- "Integrate with Gmail"
- "Any OAuth 2.0 API integration"

**Adaptation guide**:
1. Replace Google API client with target API client
2. Update OAuth scopes and endpoints
3. Modify tool definitions for new API methods
4. Update configuration with new OAuth provider

#### jira (API Token)

**Use when**: Integrating with Jira using API tokens

**Features**:
- API token authentication
- JQL query support
- Issue, user, project management
- Python HTTP client pattern

**Example use cases**:
- "Query Jira issues"
- "Get project information"
- "Search for users"

#### jira-oauth (OAuth)

**Use when**: Jira integration requiring OAuth

**Features**:
- OAuth 1.0a for Jira
- More secure than API tokens
- Full Jira REST API access

#### confluence

**Use when**: Atlassian Confluence integration

**Features**:
- Confluence REST API
- Page and space queries
- Content search

**Example use cases**:
- "Search Confluence pages"
- "Get page content"
- "List spaces"

#### salesforce / salesforce-oauth

**Use when**: Salesforce CRM integration

**Features**:
- Salesforce REST API
- SOQL queries
- OAuth or username/password auth

**Example use cases**:
- "Query Salesforce records"
- "Get account information"
- "Search opportunities"

### Development Templates

#### python-demo

**Use when**: Building custom Python-based tools

**Features**:
- Python endpoint patterns
- Async/await examples
- Database access patterns
- Error handling

**Example use cases**:
- "Create custom API integration"
- "Implement complex business logic"
- "Build ML/AI-powered tools"

**Key patterns**:

```python
import asyncio

from mxcp.runtime import db  # database proxy provided by the MXCP runtime

# Sync endpoint
def simple_tool(param: str) -> dict:
    return {"result": param.upper()}

# Async endpoint (fetch_data is a user-defined coroutine)
async def async_tool(ids: list[str]) -> list[dict]:
    results = await asyncio.gather(*[fetch_data(id) for id in ids])
    return results

# Database access
def db_tool(query: str) -> list[dict]:
    return db.execute(query).fetchall()
```

### Infrastructure Templates

#### plugin

**Use when**: Extending DuckDB with custom functions

**Features**:
- DuckDB plugin development
- Custom SQL functions
- Compiled extensions

**Example use cases**:
- "Add custom SQL functions"
- "Integrate C/C++ libraries"
- "Optimize performance-critical operations"

#### keycloak

**Use when**: Enterprise authentication/authorization

**Features**:
- Keycloak integration
- SSO support
- Role-based access control

**Example use cases**:
- "Integrate with Keycloak SSO"
- "Implement role-based policies"
- "Enterprise user management"

#### squirro

**Use when**: Enterprise search and insights integration

**Features**:
- Squirro API integration
- Search and analytics
- Enterprise data access

## Common Scenarios and Heuristics

### Scenario 1: CSV File to Query

**User says**: "I need to connect my chat to a CSV file"

**Heuristic**:
1. **DO NOT** use existing templates
2. **CREATE** new MXCP project from scratch
3. **APPROACH**:
   - Place CSV in `seeds/` directory
   - Create `seeds/schema.yml` with schema definition and tests
   - Run `dbt seed` to load into DuckDB
   - Create SQL tool: `SELECT * FROM <table_name>`
   - Add parameters for filtering if needed

**Implementation steps**:

```bash
# 1. Initialize project
mkdir csv-server && cd csv-server
mxcp init --bootstrap

# 2. Setup dbt
mkdir seeds
cp /path/to/file.csv seeds/data.csv

# 3. Create schema
cat > seeds/schema.yml <<EOF
version: 2
seeds:
  - name: data
    description: "User uploaded CSV data"
    columns:
      - name: id
        tests: [unique, not_null]
      # ... add all columns
EOF

# 4. Load seed
dbt seed

# 5. Create tool
cat > tools/query_data.yml <<EOF
mxcp: 1
tool:
  name: query_data
  description: "Query the uploaded CSV data"
  parameters:
    - name: filter_column
      type: string
      required: false
  return:
    type: array
  source:
    code: |
      SELECT * FROM data
      WHERE \$filter_column IS NULL OR column_name = \$filter_column
EOF

# 6. Test
mxcp validate
mxcp test
mxcp serve
```

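The `mxcp test` step needs at least one test case in the tool definition. A minimal sketch, assuming the `tests` block takes key/value argument pairs (adjust to the schema your MXCP version expects):

```yaml
# Appended under `tool:` in tools/query_data.yml
tests:
  - name: unfiltered_query
    description: "Returns rows when no filter is supplied"
    arguments:
      - key: filter_column
        value: null
```
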
### Scenario 2: API Integration (OAuth)

**User says**: "Connect to [OAuth-enabled API]"

**Heuristic**:
1. **Check** if template exists (Google, Jira OAuth, Salesforce OAuth)
2. **If exists**: Copy and adapt template
3. **If not**: Copy `google-calendar` template and modify

**Implementation steps**:

```bash
# 1. Copy template
cp -r assets/project-templates/google-calendar my-api-project
cd my-api-project

# 2. Update mxcp-site.yml
vim mxcp-site.yml  # Change project name

# 3. Update config.yml for new OAuth provider
vim config.yml  # Update OAuth endpoints and scopes

# 4. Replace API client
pip install new-api-client-library
vim python/*.py  # Replace google-api-client with new library

# 5. Update tools for new API methods
vim tools/*.yml  # Adapt to new API endpoints

# 6. Test OAuth flow
mxcp serve
# Follow OAuth flow in browser
```

### Scenario 3: API Integration (Token/Basic Auth)

**User says**: "Connect to [API with token]"

**Heuristic**:
1. **Check** if template exists (Jira, Confluence, Salesforce)
2. **If exists**: Copy and adapt template
3. **If not**: Use `python-demo` template

**Implementation steps**:

```bash
# 1. Copy python-demo template
cp -r assets/project-templates/python-demo my-api-project
cd my-api-project

# 2. Create Python endpoint
cat > python/api_client.py <<EOF
import httpx
from mxcp.runtime import get_secret

async def fetch_data(endpoint: str) -> dict:
    secret = get_secret("api_token")
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://api.example.com/{endpoint}",
            headers={"Authorization": f"Bearer {secret['token']}"}
        )
        return response.json()
EOF

# 3. Create tool (see the sketch below)
# 4. Configure secret in config.yml
# 5. Test
```

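A sketch of the step 3 tool definition, assuming Python tools declare `language: python` and point `source.file` at the implementation module:

```yaml
mxcp: 1
tool:
  name: fetch_data
  description: "Fetch data from the external API"
  language: python
  parameters:
    - name: endpoint
      type: string
      description: "API path to fetch, e.g. 'users'"
  return:
    type: object
  source:
    file: ../python/api_client.py
```
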
### Scenario 4: Complex Data Transformation

**User says**: "Transform this data and provide analytics"

**Heuristic**:
1. **Use** dbt for transformations
2. **Use** SQL tools for queries
3. **Pattern**: seed → model → tool

**Implementation steps**:

```bash
# 1. Load source data (seed or external)

# 2. Create dbt model for transformation
cat > models/analytics.sql <<EOF
{{ config(materialized='table') }}

SELECT
    DATE_TRUNC('month', date) as month,
    category,
    SUM(amount) as total,
    AVG(amount) as average,
    COUNT(*) as count
FROM {{ ref('source_data') }}
GROUP BY month, category
EOF

# 3. Create schema.yml
# 4. Run dbt
dbt run --select analytics
dbt test --select analytics

# 5. Create tool to query model (see the sketch below)
# 6. Test
```

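A minimal sketch of the step 5 tool over the `analytics` model (the optional `category` parameter is illustrative):

```yaml
mxcp: 1
tool:
  name: monthly_analytics
  description: "Query aggregated monthly analytics by category"
  parameters:
    - name: category
      type: string
      required: false
  return:
    type: array
  source:
    code: |
      SELECT month, category, total, average, count
      FROM analytics
      WHERE $category IS NULL OR category = $category
      ORDER BY month
```
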
### Scenario 5: Excel File Integration

**User says**: "I have an Excel file with sales data" or "Read this xlsx file"

**Heuristic**:
1. **If static/one-time**: Convert to CSV, use dbt seed
2. **If user upload**: Python tool with pandas to load into DuckDB
3. **If multi-sheet**: Python tool to process all sheets

**Implementation steps**:

```bash
# Option A: Static Excel → CSV → dbt seed
python -c "import pandas as pd; pd.read_excel('data.xlsx').to_csv('seeds/data.csv', index=False)"
cat > seeds/schema.yml  # Create schema
dbt seed

# Option B: Dynamic upload → Python tool (see the loader sketch below)
cat > python/excel_loader.py   # Create loader
cat > tools/load_excel.yml     # Create tool
pip install openpyxl pandas    # Add dependencies
```

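A sketch of the Option B loader, assuming `openpyxl` and `pandas` are installed; the database path is an assumption mirroring the default `data/db-default.duckdb` used later in this guide:

```python
# python/excel_loader.py
import duckdb
import pandas as pd

def load_excel(path: str, database: str = "data/db-default.duckdb") -> dict:
    """Load every sheet of an Excel workbook into DuckDB, one table per sheet."""
    sheets = pd.read_excel(path, sheet_name=None)  # dict of sheet name -> DataFrame
    con = duckdb.connect(database)
    loaded = {}
    try:
        for name, frame in sheets.items():
            con.register("sheet_df", frame)  # expose the DataFrame to SQL
            con.execute(f'CREATE OR REPLACE TABLE "{name}" AS SELECT * FROM sheet_df')
            con.unregister("sheet_df")
            loaded[name] = len(frame)
    finally:
        con.close()
    return {"tables": loaded}
```
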
See **references/excel-integration.md** for complete patterns.

### Scenario 6: Synthetic Data Generation

**User says**: "Generate test data" or "Create synthetic customer records" or "I need dummy data for testing"

**Heuristic**:
1. **If persistent test data**: dbt model with GENERATE_SERIES
2. **If dynamic/parameterized**: Python tool
3. **If with analysis**: Generate + calculate statistics in one tool

**Implementation steps**:

```bash
# Option A: Persistent via dbt
cat > models/synthetic_customers.sql <<EOF
{{ config(materialized='table') }}
SELECT
    ROW_NUMBER() OVER () AS id,
    'customer' || ROW_NUMBER() OVER () || '@example.com' AS email
FROM GENERATE_SERIES(1, 1000)
EOF

dbt run --select synthetic_customers

# Option B: Dynamic via Python (see the generator sketch below)
cat > python/generate_data.py        # Create generator
cat > tools/generate_test_data.yml   # Create tool
```

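A sketch of the Option B generator using only the standard library (the field names are illustrative):

```python
# python/generate_data.py
import random

def generate_test_data(count: int = 100, seed: int | None = None) -> list[dict]:
    """Generate fake customer records; deterministic when a seed is given."""
    rng = random.Random(seed)
    countries = ["US", "DE", "FR", "JP"]
    return [
        {
            "id": i,
            "email": f"customer{i}@example.com",
            "country": rng.choice(countries),
            "amount": round(rng.uniform(10, 500), 2),
        }
        for i in range(1, count + 1)
    ]
```
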
See **references/synthetic-data-patterns.md** for complete patterns.

### Scenario 7: Python Library Wrapping

**User says**: "Wrap the Stripe API" or "Use pandas for analysis" or "Connect to Redis"

**Heuristic**:
1. **Check** if it's an API client library (stripe, twilio, etc.)
2. **Check** if it's a data/ML library (pandas, sklearn, etc.)
3. **Use** `python-demo` as base
4. **Add** library to requirements.txt
5. **Use** @on_init for initialization if stateful

**Implementation steps**:

```bash
# 1. Copy python-demo template
cp -r assets/project-templates/python-demo my-project

# 2. Install library
echo "stripe>=5.4.0" >> requirements.txt
pip install stripe

# 3. Create wrapper (see the sketch below)
cat > python/stripe_wrapper.py   # Implement wrapper functions

# 4. Create tools
cat > tools/create_customer.yml  # Map to wrapper functions

# 5. Create project config with secrets
cat > config.yml <<EOF
mxcp: 1
profiles:
  default:
    secrets:
      - name: api_key
        type: env
        parameters:
          env_var: API_KEY
EOF

# User sets: export API_KEY=xxx
# Or user copies to ~/.mxcp/ manually if preferred
```

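A sketch of the step 3 wrapper. The `stripe` calls follow that library's documented API; how the secret value is keyed inside `get_secret`'s return is an assumption, so adjust the field name to your configuration:

```python
# python/stripe_wrapper.py
import stripe
from mxcp.runtime import get_secret, on_init

@on_init
def configure() -> None:
    # Assumption: the env-backed secret exposes its value under a single field.
    stripe.api_key = get_secret("api_key")["value"]

def create_customer(email: str, name: str) -> dict:
    customer = stripe.Customer.create(email=email, name=name)
    return {"id": customer.id, "email": customer.email}
```
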
See **references/python-api.md** (Wrapping External Libraries section) for complete patterns.

### Scenario 8: ML/AI Processing

**User says**: "Analyze sentiment" or "Classify images" or "Train a model"

**Heuristic**:
1. **Use** Python tool (not SQL)
2. **Use** `python-demo` template as base
3. **Add** ML libraries (transformers, scikit-learn, etc.)
4. **Use** @on_init to load models (expensive operation)

**Implementation steps**:

```bash
# 1. Copy python-demo
# 2. Install ML libraries
pip install transformers torch

# 3. Create Python endpoint with model loading
cat > python/ml_tool.py <<EOF
from mxcp.runtime import on_init
from transformers import pipeline

classifier = None

@on_init
def load_model():
    global classifier
    classifier = pipeline("sentiment-analysis")

async def analyze_sentiment(texts: list[str]) -> list[dict]:
    results = classifier(texts)
    return [{"text": t, **r} for t, r in zip(texts, results)]
EOF

# 4. Create tool definition (see the sketch below)
# 5. Test
```

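A sketch of the step 4 tool definition, under the same assumption as Scenario 3 that Python tools use `language: python` with `source.file` (the array `items` schema is also an assumption):

```yaml
mxcp: 1
tool:
  name: analyze_sentiment
  description: "Classify the sentiment of each input text"
  language: python
  parameters:
    - name: texts
      type: array
      items:
        type: string
  return:
    type: array
  source:
    file: ../python/ml_tool.py
```
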
### Scenario 9: External Database Connection

**User says**: "Connect to my PostgreSQL database" or "Query my MySQL production database"

**Heuristic**:
1. **Ask** if data can be exported to CSV (simpler approach)
2. **Ask** if they need real-time data or can cache it
3. **Decide**: Direct query (ATTACH) vs cached (dbt)

**Implementation steps - Direct Query (ATTACH)**:

```bash
# 1. Create project
mkdir db-connection && cd db-connection
mxcp init --bootstrap

# 2. Create config with credentials
cat > config.yml <<EOF
mxcp: 1
profiles:
  default:
    secrets:
      - name: db_host
        type: env
        parameters:
          env_var: DB_HOST
      - name: db_user
        type: env
        parameters:
          env_var: DB_USER
      - name: db_password
        type: env
        parameters:
          env_var: DB_PASSWORD
      - name: db_name
        type: env
        parameters:
          env_var: DB_NAME
EOF

# 3. Create SQL tool with ATTACH
cat > tools/query_database.yml <<EOF
mxcp: 1
tool:
  name: query_customers
  description: "Query customers from PostgreSQL database"
  parameters:
    - name: country
      type: string
      required: false
  return:
    type: array
  source:
    code: |
      -- Install and attach PostgreSQL
      INSTALL postgres;
      LOAD postgres;
      ATTACH IF NOT EXISTS 'host=\${DB_HOST} port=5432 dbname=\${DB_NAME} user=\${DB_USER} password=\${DB_PASSWORD}'
          AS prod_db (TYPE POSTGRES);

      -- Query attached database
      SELECT customer_id, name, email, country
      FROM prod_db.public.customers
      WHERE \$country IS NULL OR country = \$country
      LIMIT 1000
EOF

# 4. Set credentials and test
export DB_HOST="localhost"
export DB_USER="readonly_user"
export DB_PASSWORD="secure_pass"
export DB_NAME="mydb"

mxcp validate
mxcp run tool query_customers --param country="US"
mxcp serve
```

**Implementation steps - Cached with dbt**:

```bash
# 1. Create project
mkdir db-cache && cd db-cache
mxcp init --bootstrap

# 2. Create dbt source
mkdir -p models
cat > models/sources.yml <<EOF
version: 2
sources:
  - name: production
    database: postgres_db
    schema: public
    tables:
      - name: customers
        columns:
          - name: customer_id
            tests: [unique, not_null]
EOF

# 3. Create dbt model to cache data
cat > models/customer_cache.sql <<EOF
{{ config(materialized='table') }}

-- Attach PostgreSQL
{% set attach_sql %}
INSTALL postgres;
LOAD postgres;
ATTACH IF NOT EXISTS 'host=\${DB_HOST} dbname=\${DB_NAME} user=\${DB_USER} password=\${DB_PASSWORD}'
    AS postgres_db (TYPE POSTGRES);
{% endset %}
{% do run_query(attach_sql) %}

-- Cache data
SELECT * FROM postgres_db.public.customers
EOF

# 4. Create schema
cat > models/schema.yml <<EOF
version: 2
models:
  - name: customer_cache
    columns:
      - name: customer_id
        tests: [unique, not_null]
EOF

# 5. Run dbt to cache data
export DB_HOST="localhost" DB_USER="user" DB_PASSWORD="pass" DB_NAME="mydb"
dbt run --select customer_cache
dbt test --select customer_cache

# 6. Create MXCP tool to query cache (fast!)
cat > tools/query_cached.yml <<EOF
mxcp: 1
tool:
  name: query_customers
  description: "Query cached customers with an optional country filter"
  parameters:
    - name: country
      type: string
      required: false
  return:
    type: array
  source:
    code: SELECT * FROM customer_cache WHERE \$country IS NULL OR country = \$country
EOF

# 7. Create refresh tool
# (see minimal-working-examples.md Example 7 for complete refresh tool)
```

**When to use which approach**:
- **ATTACH (Direct)**: Real-time data needed, small queries, low query frequency
- **dbt (Cached)**: Large tables, frequent queries, can tolerate staleness, want data quality tests

See **references/database-connections.md** for complete patterns.

## When to Ask for Clarification

**If user request is ambiguous, ask these questions**:

### Data Source Unclear
- "What type of data are you working with? (CSV, API, database, etc.)"
- "Do you have a file to upload, or are you connecting to an external source?"

### API Integration Unclear
- "Does this API require authentication? (OAuth, API token, basic auth, or none)"
- "What operations do you need? (read data, write data, both)"

### Data Volume Unclear
- "How large is the dataset? (<1MB, 1-100MB, >100MB)"
- "How often does the data update? (static, daily, real-time)"

### Security Requirements Unclear
- "Who should have access to this data? (everyone, specific roles, specific users)"
- "Are there any sensitive fields that need protection?"

### Functionality Unclear
- "What questions do you want to ask about this data?"
- "What operations should be available through the MCP server?"

## Heuristics When No Interaction Available

**If you cannot ask questions, use these defaults**:

1. **CSV file mentioned** → dbt seed + SQL tool with `SELECT *`
2. **Excel mentioned** → Convert to CSV + dbt seed OR Python pandas tool
3. **API mentioned** → Check for template, otherwise use Python tool with httpx
4. **OAuth mentioned** → Use google-calendar template as base
5. **Database mentioned** → DuckDB ATTACH for direct query OR dbt for caching
6. **PostgreSQL/MySQL mentioned** → Use ATTACH with read-only user
7. **Transformation needed** → dbt model
8. **Complex logic** → Python tool
9. **Security not mentioned** → No policies (user can add later)
10. **No auth mentioned for API** → Assume token/basic auth

## Configuration Management

### Project-Local Config (Recommended)

**ALWAYS create `config.yml` in the project directory, NOT `~/.mxcp/config.yml`**

**Why?**
- User maintains control over global config
- Project is self-contained and portable
- Safer for agents (no global config modification)
- User can review before copying to ~/.mxcp/

**Basic config.yml template**:

```yaml
# config.yml (in project root)
mxcp: 1

profiles:
  default:
    # Secrets via environment variables (recommended)
    secrets:
      - name: api_token
        type: env
        parameters:
          env_var: API_TOKEN

    # Database configuration (optional, default is data/db-default.duckdb)
    database:
      path: "data/db-default.duckdb"

    # Authentication (if needed)
    auth:
      provider: github  # or google, microsoft, etc.

  production:
    database:
      path: "prod.duckdb"
    audit:
      enabled: true
      path: "audit.jsonl"
```

**Usage options**:

```bash
# Option 1: Auto-discover (mxcp looks for ./config.yml)
mxcp serve

# Option 2: Explicit path via environment variable
MXCP_CONFIG=./config.yml mxcp serve

# Option 3: User manually copies to global location
cp config.yml ~/.mxcp/config.yml
mxcp serve
```

**In skill implementations**:

```bash
# CORRECT: Create local config
cat > config.yml <<EOF
mxcp: 1
profiles:
  default:
    secrets:
      - name: github_token
        type: env
        parameters:
          env_var: GITHUB_TOKEN
EOF

echo "Config created at ./config.yml"
echo "Set environment variable: export GITHUB_TOKEN=your_token"
echo "Or copy to global config: cp config.yml ~/.mxcp/config.yml"
```

```bash
# WRONG: Don't edit user's global config
# DON'T DO THIS:
# vim ~/.mxcp/config.yml  # ❌ Never do this!
```

### Secrets Management

**Three approaches (in order of preference)**:

1. **Environment Variables** (Best for development):

   ```yaml
   # config.yml
   secrets:
     - name: api_key
       type: env
       parameters:
         env_var: API_KEY
   ```

   ```bash
   export API_KEY=your_secret_key
   mxcp serve
   ```

2. **Vault/1Password** (Best for production):

   ```yaml
   # config.yml
   secrets:
     - name: database_creds
       type: vault
       parameters:
         path: secret/data/myapp/db
         field: password
   ```

3. **Direct in config.yml** (Only for non-sensitive or example values):

   ```yaml
   # config.yml - ONLY for non-sensitive data
   secrets:
     - name: api_endpoint
       type: python
       parameters:
         url: "https://api.example.com"  # Not sensitive
   ```

**Instructions for users**:

```bash
# After agent creates config.yml, user can:

# Option A: Use environment variables
export API_KEY=xxx
export DB_PASSWORD=yyy
mxcp serve

# Option B: Copy to global config and edit
cp config.yml ~/.mxcp/config.yml
vim ~/.mxcp/config.yml  # User edits their own config
mxcp serve

# Option C: Use with explicit path
MXCP_CONFIG=./config.yml mxcp serve
```

## Security-First Checklist

**ALWAYS consider security**:

- [ ] **Authentication**: What auth method is needed?
- [ ] **Authorization**: Who can access this data?
- [ ] **Input validation**: Add parameter validation in tool definition
- [ ] **Output filtering**: Use policies to filter sensitive fields
- [ ] **Secrets management**: Use Vault/1Password, never hardcode
- [ ] **Audit logging**: Enable for production systems
- [ ] **SQL injection**: Use parameterized queries (`$param`)
- [ ] **Rate limiting**: Consider for external API calls

## Robustness Checklist

**ALWAYS ensure robustness**:

- [ ] **Error handling**: Add try/except in Python, handle nulls in SQL
- [ ] **Type validation**: Define return types and parameter types
- [ ] **Tests**: Create test cases for all tools
- [ ] **Data validation**: Add dbt tests for seeds and models
- [ ] **Documentation**: Add descriptions to all tools/resources
- [ ] **Schema validation**: Create schema.yml for all dbt seeds/models

## Testing Checklist

**ALWAYS test before deployment**:

- [ ] `mxcp validate` - Structure validation
- [ ] `mxcp test` - Functional testing
- [ ] `mxcp lint` - Metadata quality
- [ ] `dbt test` - Data quality (if using dbt)
- [ ] Manual testing with `mxcp run tool <name>`
- [ ] Test with invalid inputs
- [ ] Test with edge cases (empty data, nulls, etc.)

## Summary

**Quick reference for common requests**:

| User Request | Approach | Template | Key Steps |
|--------------|----------|----------|-----------|
| "Query my CSV" | dbt seed + SQL tool | None | seed → schema.yml → dbt seed/test → SQL tool |
| "Read Excel file" | Convert to CSV + dbt seed OR pandas tool | None | Excel→CSV → seed OR pandas → DuckDB table |
| "Connect to PostgreSQL" | ATTACH + SQL tool OR dbt cache | None | ATTACH → SQL tool OR dbt source/model → SQL tool |
| "Connect to MySQL" | ATTACH + SQL tool OR dbt cache | None | ATTACH → SQL tool OR dbt source/model → SQL tool |
| "Generate test data" | dbt model or Python | None | GENERATE_SERIES → dbt model or Python tool |
| "Wrap library X" | Python wrapper | python-demo | Install lib → wrap functions → create tools |
| "Connect to Google Calendar" | OAuth + Python | google-calendar | Copy template → configure OAuth |
| "Connect to Jira" | Token + Python | jira or jira-oauth | Copy template → configure token |
| "Transform data" | dbt model | None | seed/source → model → schema.yml → dbt run/test → SQL tool |
| "Complex logic" | Python tool | python-demo | Copy template → implement function |
| "ML/AI task" | Python + libraries | python-demo | Add ML libs → implement model |
| "External API" | Python + httpx | python-demo | Implement client → create tool |

**Priority order**:
1. Security (auth, policies, validation)
2. Robustness (error handling, types, tests)
3. Testing (validate, test, lint)
4. Features (based on user needs)