93 lines
2.5 KiB
Markdown
93 lines
2.5 KiB
Markdown
# Phase 1: Schema Design
|
|
|
|
**Objective**: Create a production-ready JSON schema respecting all limitations
|
|
|
|
## Steps
|
|
|
|
### 1. Define Output Structure
|
|
|
|
Ask the user:
|
|
- "What fields do you need in the output?"
|
|
- "Which fields are required vs. optional?"
|
|
- "What are the data types for each field?"
|
|
- "Are there nested objects or arrays?"
|
|
|
|
### 2. Choose Schema Approach
|
|
|
|
**Option A: Pydantic (Python) - Recommended**
|
|
```python
|
|
from pydantic import BaseModel
|
|
from typing import List, Optional
|
|
|
|
class ContactInfo(BaseModel):
|
|
name: str
|
|
email: str
|
|
plan_interest: Optional[str] = None
|
|
demo_requested: bool = False
|
|
tags: List[str] = []
|
|
```
|
|
|
|
**Option B: Zod (TypeScript) - Recommended**
|
|
```typescript
|
|
import { z } from 'zod';
|
|
|
|
const ContactInfoSchema = z.object({
|
|
name: z.string(),
|
|
email: z.string().email(),
|
|
plan_interest: z.string().optional(),
|
|
demo_requested: z.boolean().default(false),
|
|
tags: z.array(z.string()).default([]),
|
|
});
|
|
```
|
|
|
|
**Option C: Raw JSON Schema**
|
|
```json
|
|
{
|
|
"type": "object",
|
|
"properties": {
|
|
"name": {"type": "string", "description": "Full name"},
|
|
"email": {"type": "string", "description": "Email address"},
|
|
"plan_interest": {"type": "string", "description": "Interested plan"},
|
|
"demo_requested": {"type": "boolean"},
|
|
"tags": {"type": "array", "items": {"type": "string"}}
|
|
},
|
|
"required": ["name", "email", "demo_requested"],
|
|
"additionalProperties": false
|
|
}
|
|
```
|
|
|
|
### 3. Apply JSON Schema Limitations
|
|
|
|
**✅ Supported Features:**
|
|
- All basic types: object, array, string, integer, number, boolean, null
|
|
- `enum` (strings, numbers, bools, nulls only)
|
|
- `const`
|
|
- `anyOf` and `allOf` (limited)
|
|
- `$ref`, `$def`, `definitions` (local only)
|
|
- `required` and `additionalProperties: false`
|
|
- String formats: date-time, time, date, email, uri, uuid, ipv4, ipv6
|
|
- Array `minItems` (0 or 1 only)
|
|
|
|
**❌ NOT Supported (SDK can transform these):**
|
|
- Recursive schemas
|
|
- Numerical constraints (minimum, maximum)
|
|
- String constraints (minLength, maxLength)
|
|
- Complex array constraints
|
|
- External `$ref`
|
|
|
|
### 4. Add AI-Friendly Descriptions
|
|
|
|
```python
|
|
class Invoice(BaseModel):
|
|
invoice_number: str # Field(description="Invoice ID, format: INV-XXXXX")
|
|
date: str # Field(description="Invoice date in YYYY-MM-DD format")
|
|
total: float # Field(description="Total amount in USD")
|
|
items: List[LineItem] # Field(description="Line items on the invoice")
|
|
```
|
|
|
|
Good descriptions help Claude understand what to extract.
|
|
|
|
## Output
|
|
|
|
Production-ready schema following Anthropic's limitations.
|