Files
2025-11-29 18:16:46 +08:00

2.5 KiB

Phase 1: Schema Design

Objective: Create a production-ready JSON schema respecting all limitations

Steps

1. Define Output Structure

Ask the user:

  • "What fields do you need in the output?"
  • "Which fields are required vs. optional?"
  • "What are the data types for each field?"
  • "Are there nested objects or arrays?"

2. Choose Schema Approach

Option A: Pydantic (Python) - Recommended

from pydantic import BaseModel
from typing import List, Optional

class ContactInfo(BaseModel):
    name: str
    email: str
    plan_interest: Optional[str] = None
    demo_requested: bool = False
    tags: List[str] = []

Option B: Zod (TypeScript) - Recommended

import { z } from 'zod';

const ContactInfoSchema = z.object({
  name: z.string(),
  email: z.string().email(),
  plan_interest: z.string().optional(),
  demo_requested: z.boolean().default(false),
  tags: z.array(z.string()).default([]),
});

Option C: Raw JSON Schema

{
  "type": "object",
  "properties": {
    "name": {"type": "string", "description": "Full name"},
    "email": {"type": "string", "description": "Email address"},
    "plan_interest": {"type": "string", "description": "Interested plan"},
    "demo_requested": {"type": "boolean"},
    "tags": {"type": "array", "items": {"type": "string"}}
  },
  "required": ["name", "email", "demo_requested"],
  "additionalProperties": false
}

3. Apply JSON Schema Limitations

Supported Features:

  • All basic types: object, array, string, integer, number, boolean, null
  • enum (strings, numbers, bools, nulls only)
  • const
  • anyOf and allOf (limited)
  • $ref, $def, definitions (local only)
  • required and additionalProperties: false
  • String formats: date-time, time, date, email, uri, uuid, ipv4, ipv6
  • Array minItems (0 or 1 only)

NOT Supported (SDK can transform these):

  • Recursive schemas
  • Numerical constraints (minimum, maximum)
  • String constraints (minLength, maxLength)
  • Complex array constraints
  • External $ref

4. Add AI-Friendly Descriptions

class Invoice(BaseModel):
    invoice_number: str  # Field(description="Invoice ID, format: INV-XXXXX")
    date: str  # Field(description="Invoice date in YYYY-MM-DD format")
    total: float  # Field(description="Total amount in USD")
    items: List[LineItem]  # Field(description="Line items on the invoice")

Good descriptions help Claude understand what to extract.

Output

Production-ready schema following Anthropic's limitations.