Initial commit

Zhongwei Li
2025-11-29 18:16:46 +08:00
commit 3d2cb201f0
33 changed files with 2911 additions and 0 deletions

@@ -0,0 +1,47 @@
# JSON Schema Limitations Reference
## Supported Features
- ✅ All basic types (object, array, string, integer, number, boolean, null)
- ✅ `enum` (primitives only)
- ✅ `const`, `anyOf`, `allOf`
- ✅ `$ref`, `$defs`, `definitions` (local references only)
- ✅ `required`, `additionalProperties: false`
- ✅ String formats: date-time, time, date, email, uri, uuid, ipv4, ipv6
- ✅ `minItems: 0` or `minItems: 1` for arrays
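
As a rough illustration of the list above, the following sketch builds a schema that stays within the supported features. The schema and all of its field names (`name`, `email`, `plan`, `tags`, `address`) are hypothetical and chosen only for this example.

```python
# Illustrative schema that uses only features listed as supported.
# Every field name below is hypothetical.
CONTACT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Full name of the contact"},
        "email": {"type": "string", "format": "email"},          # supported string format
        "plan": {"type": "string", "enum": ["free", "pro"]},     # enum of primitives
        "tags": {"type": "array", "items": {"type": "string"}, "minItems": 1},
        "address": {"$ref": "#/definitions/Address"},            # local $ref only
    },
    "required": ["name", "email", "plan", "tags", "address"],
    "additionalProperties": False,
    "definitions": {
        "Address": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "country": {"type": "string"},
            },
            "required": ["city", "country"],
            "additionalProperties": False,
        }
    },
}
```

Note that both the top-level object and the nested `Address` object set `additionalProperties` to `False`.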
## NOT Supported
- ❌ Recursive schemas
- ❌ Numerical constraints (minimum, maximum, multipleOf)
- ❌ String constraints (minLength, maxLength, pattern with complex regex)
- ❌ Array constraints (beyond minItems 0/1)
- ❌ External `$ref`
- ❌ Complex types in enums
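
A pre-flight check can catch some of these cases before a request is sent. The sketch below is one possible approach, not part of any SDK: `find_unsupported` is a hypothetical helper, and it covers only the keywords named in the list above (it does not detect recursive schemas or judge regex complexity).

```python
from typing import Any, List

# Keywords taken from the "NOT Supported" list above.
UNSUPPORTED = {"minimum", "maximum", "multipleOf", "minLength", "maxLength"}
# Keys whose children are user-chosen names, not schema keywords.
NAME_MAPS = {"properties", "$defs", "definitions"}
# Keys whose values are data rather than subschemas.
DATA_KEYS = {"enum", "const", "required", "default"}


def find_unsupported(schema: Any, path: str = "#") -> List[str]:
    """Return human-readable problems found in a JSON Schema dict."""
    problems: List[str] = []
    if isinstance(schema, list):
        for i, item in enumerate(schema):
            problems += find_unsupported(item, f"{path}/{i}")
        return problems
    if not isinstance(schema, dict):
        return problems
    for key, value in schema.items():
        here = f"{path}/{key}"
        if key in UNSUPPORTED:
            problems.append(f"{here}: unsupported constraint")
        elif key == "minItems" and value not in (0, 1):
            problems.append(f"{here}: only minItems 0 or 1 is supported")
        elif key == "$ref" and isinstance(value, str) and not value.startswith("#"):
            problems.append(f"{here}: external $ref")
        elif key == "enum" and any(isinstance(v, (dict, list)) for v in value):
            problems.append(f"{here}: complex type in enum")
        if key in NAME_MAPS and isinstance(value, dict):
            # Recurse into each named subschema without treating names as keywords.
            for name, sub in value.items():
                problems += find_unsupported(sub, f"{here}/{name}")
        elif key not in DATA_KEYS:
            problems += find_unsupported(value, here)
    return problems


example = {
    "type": "object",
    "properties": {
        "minimum": {"type": "integer", "minimum": 0},  # property *named* minimum
        "items": {
            "type": "array",
            "minItems": 2,
            "items": {"$ref": "other.json#/Item"},
        },
    },
}
issues = find_unsupported(example)
for issue in issues:
    print(issue)
```

The property named `minimum` is not flagged by itself; only the `minimum` keyword inside its subschema is.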
## SDK Transformation
The Python and TypeScript SDKs automatically strip unsupported constraints from the schema before sending it and append them to the affected fields' descriptions.
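
To make the idea concrete, here is a minimal sketch of what such a transformation could look like. It is an illustration under stated assumptions, not the SDKs' actual implementation: `move_constraints_to_description` is a hypothetical helper, and the wording of the appended note is invented.

```python
from typing import Any

CONSTRAINT_KEYS = ("minimum", "maximum", "multipleOf", "minLength", "maxLength")
NAME_MAPS = ("properties", "$defs", "definitions")    # children are names, not keywords
DATA_KEYS = ("enum", "const", "required", "default")  # values are data, not subschemas


def move_constraints_to_description(schema: Any) -> Any:
    """Return a copy of the schema with constraint keywords folded into descriptions."""
    if isinstance(schema, list):
        return [move_constraints_to_description(s) for s in schema]
    if not isinstance(schema, dict):
        return schema
    out = {}
    removed = []
    for key, value in schema.items():
        if key in CONSTRAINT_KEYS:
            removed.append(f"{key}={value}")
        elif key in NAME_MAPS and isinstance(value, dict):
            out[key] = {n: move_constraints_to_description(s) for n, s in value.items()}
        elif key in DATA_KEYS:
            out[key] = value
        else:
            out[key] = move_constraints_to_description(value)
    if removed:
        note = "Constraints: " + ", ".join(removed)
        existing = out.get("description")
        out["description"] = f"{existing} ({note})" if existing else note
    return out


original = {
    "type": "object",
    "properties": {
        "age": {"type": "integer", "minimum": 0, "maximum": 120,
                "description": "Age in years"},
    },
}
relaxed = move_constraints_to_description(original)
print(relaxed["properties"]["age"])
```

The input schema is left untouched, so it can still be used to validate the response locally against the full constraints.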
## Success Criteria
- [ ] Schema designed with all required fields
- [ ] JSON Schema limitations respected
- [ ] SDK helper integrated (Pydantic/Zod)
- [ ] Beta header included in requests
- [ ] Error handling for refusals and token limits
- [ ] Tested with representative examples
- [ ] Edge cases covered (missing fields, invalid data)
- [ ] Production optimization considered (caching, tokens)
- [ ] Monitoring in place (latency, costs)
- [ ] Documentation provided
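
For the error-handling item in the checklist, one possible shape is a small dispatcher on the response's `stop_reason`. The values `"refusal"` and `"max_tokens"` are stop reasons reported by the Messages API; the function name and the returned action labels are hypothetical.

```python
def next_action(stop_reason: str) -> str:
    """Map a Messages API stop_reason to a handling strategy (labels are illustrative)."""
    if stop_reason == "refusal":
        # The output may not match the schema; retrying the same request is not useful.
        return "report_refusal"
    if stop_reason == "max_tokens":
        # The JSON was likely cut off; retry with a larger max_tokens.
        return "retry_with_more_tokens"
    return "use_output"


for reason in ("end_turn", "refusal", "max_tokens"):
    print(reason, "->", next_action(reason))
```

In application code this would typically be called with `response.stop_reason` before reading the parsed output.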
## Important Reminders
1. **Use SDK helpers** - `client.beta.messages.parse()` auto-validates
2. **Respect limitations** - No recursive schemas, no min/max constraints
3. **Add descriptions** - Helps Claude understand what to extract
4. **Handle refusals** - Don't retry safety refusals
5. **Monitor performance** - Watch for cache misses and high latency
6. **Set `additionalProperties: false`** - Required for all objects
7. **Test thoroughly** - Edge cases often reveal schema issues
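
When schemas are written by hand rather than generated by an SDK helper, reminder 6 is easy to miss on nested objects. The following sketch, built around a hypothetical `forbid_extra_properties` helper, shows one way to apply the setting throughout a schema.

```python
from typing import Any

NAME_MAPS = ("properties", "$defs", "definitions")    # children are names, not keywords
DATA_KEYS = ("enum", "const", "required", "default")  # values are data, not subschemas


def forbid_extra_properties(schema: Any) -> Any:
    """Return a copy of the schema with additionalProperties: false on every object."""
    if isinstance(schema, list):
        return [forbid_extra_properties(s) for s in schema]
    if not isinstance(schema, dict):
        return schema
    out = {}
    for key, value in schema.items():
        if key in NAME_MAPS and isinstance(value, dict):
            out[key] = {n: forbid_extra_properties(s) for n, s in value.items()}
        elif key in DATA_KEYS:
            out[key] = value
        else:
            out[key] = forbid_extra_properties(value)
    if out.get("type") == "object" or "properties" in out:
        out["additionalProperties"] = False
    return out


draft = {
    "type": "object",
    "properties": {
        "orders": {
            "type": "array",
            "items": {"type": "object", "properties": {"id": {"type": "string"}}},
        }
    },
}
closed = forbid_extra_properties(draft)
print(closed["properties"]["orders"]["items"]["additionalProperties"])
```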

@@ -0,0 +1,86 @@
# Common Use Cases
## Use Case 1: Data Extraction
**Scenario**: Extract invoice data from text/images
```python
import anthropic
from pydantic import BaseModel
from typing import List

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float
    total: float


class Invoice(BaseModel):
    invoice_number: str
    date: str
    customer_name: str
    line_items: List[LineItem]
    subtotal: float
    tax: float
    total_amount: float


# invoice_text and db are assumed to be provided by the application
response = client.beta.messages.parse(
    model="claude-sonnet-4-5",
    betas=["structured-outputs-2025-11-13"],
    max_tokens=2048,
    messages=[{"role": "user", "content": f"Extract invoice:\n{invoice_text}"}],
    output_format=Invoice,
)
invoice = response.parsed_output
# Insert into the database; each field has the type declared on Invoice
db.insert_invoice(invoice.model_dump())
```
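
Structured output constrains the shape and types of the result, not the arithmetic inside it, so a consistency check before the database insert can be worthwhile. The sketch below works on a plain dict of the form produced by `invoice.model_dump()`. The helper name `invoice_total_errors`, the tolerance, and the assumed relationships (line total = quantity × unit price, total = subtotal + tax) are illustrative and will not hold for invoices with discounts or fees.

```python
from typing import Any, Dict, List


def invoice_total_errors(invoice: Dict[str, Any], tolerance: float = 0.01) -> List[str]:
    """Return a list of arithmetic inconsistencies found in an extracted invoice."""
    errors: List[str] = []
    for i, item in enumerate(invoice["line_items"]):
        expected = item["quantity"] * item["unit_price"]
        if abs(expected - item["total"]) > tolerance:
            errors.append(f"line_items[{i}].total does not match quantity * unit_price")
    line_sum = sum(item["total"] for item in invoice["line_items"])
    if abs(line_sum - invoice["subtotal"]) > tolerance:
        errors.append("subtotal does not match the sum of line item totals")
    if abs(invoice["subtotal"] + invoice["tax"] - invoice["total_amount"]) > tolerance:
        errors.append("total_amount does not match subtotal + tax")
    return errors


sample = {
    "line_items": [
        {"description": "Widget", "quantity": 2, "unit_price": 10.0, "total": 20.0},
        {"description": "Gadget", "quantity": 1, "unit_price": 5.5, "total": 5.5},
    ],
    "subtotal": 25.5,
    "tax": 2.55,
    "total_amount": 28.05,
}
print(invoice_total_errors(sample))
```

A non-empty result could be routed to manual review instead of being inserted directly.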
## Use Case 2: Classification
**Scenario**: Classify support tickets
```python
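from typing import List, Optional
from pydantic import BaseModel


class TicketClassification(BaseModel):
    category: str  # "billing", "technical", "sales"
    priority: str  # "low", "medium", "high", "critical"
    confidence: float
    requires_human: bool
    suggested_assignee: Optional[str] = None
    tags: List[str]


# `client` is assumed to be an anthropic.Anthropic() instance;
# ticket, route_to_human and auto_assign are application-defined.
response = client.beta.messages.parse(
    model="claude-sonnet-4-5",
    betas=["structured-outputs-2025-11-13"],
    max_tokens=1024,  # required by the Messages API; the value is illustrative
    messages=[{"role": "user", "content": f"Classify:\n{ticket}"}],
    output_format=TicketClassification,
)
classification = response.parsed_output
# Escalate when the model asks for a human or is not confident enough
if classification.requires_human or classification.confidence < 0.7:
    route_to_human(ticket)
else:
    auto_assign(ticket, classification.category)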
```
## Use Case 3: API Response Formatting
**Scenario**: Generate API-ready responses
```python
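from typing import List, Optional
from flask import jsonify
from pydantic import BaseModel

class APIResponse(BaseModel):
    status: str  # "success" or "error"
    # Caution: untyped dict fields may sit awkwardly with the
    # `additionalProperties: false` requirement; prefer explicit models where possible.
    data: dict
    errors: Optional[List[dict]] = None
    metadata: dict

# Hypothetical helper called from a Flask view;
# `client` is assumed to be an anthropic.Anthropic() instance.
def process(request: str):
    response = client.beta.messages.parse(
        model="claude-sonnet-4-5",
        betas=["structured-outputs-2025-11-13"],
        max_tokens=1024,  # required by the Messages API; the value is illustrative
        messages=[{"role": "user", "content": f"Process: {request}"}],
        output_format=APIResponse,
    )
    # Directly return as JSON API response
    return jsonify(response.parsed_output.model_dump())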
```