Initial commit

Zhongwei Li
2025-11-29 18:29:15 +08:00
commit be476a3fea
76 changed files with 12812 additions and 0 deletions


@@ -0,0 +1,67 @@
# Documentation Architecture Examples
Real-world examples of comprehensive documentation generation, architecture documentation, and coverage validation.
## Available Examples
### [OpenAPI 3.1 Generation from FastAPI](openapi-generation.md)
Complete workflow for automatic API documentation generation from a FastAPI codebase.
**Scenario**: E-commerce API with 47 undocumented endpoints causing 12 integration issues/week
**Solution**: Enhanced OpenAPI generation, multi-language examples, interactive Swagger UI, CI/CD auto-generation
**Results**: Integration issues 12/week → 0.5/week (96% reduction), manual doc time 4hrs → 0 (automated)
**Key Techniques**: FastAPI OpenAPI customization, Pydantic v2 field validators, example generation scripts
---
### [System Architecture Documentation with Mermaid](architecture-docs.md)
Comprehensive system architecture documentation reducing onboarding time from 3-4 weeks to 4-5 days.
**Scenario**: No architecture docs, tribal knowledge spread across 8 developers, 3-4 week onboarding
**Solution**: 8 Mermaid diagrams, Architecture Decision Records, progressive disclosure, version-controlled
**Results**: Onboarding 3-4 weeks → 4-5 days (75% reduction), architecture questions 15hrs/week → 2hrs/week
**Key Techniques**: Mermaid diagrams (system, sequence, data flow, ER), ADR template, multi-tenant flow docs
---
### [Documentation Coverage Validation](coverage-validation.md)
Automated documentation coverage analysis with 80% threshold enforcement in CI/CD.
**Scenario**: Unknown coverage, 147 undocumented functions, no visibility into gaps
**Solution**: TypeScript coverage (ts-morph), Python coverage (AST), HTML reports, CI/CD enforcement
**Results**: TS 42% → 87%, Python 38% → 91%, API 51% → 95%, undocumented 147 → 18
**Key Techniques**: AST parsing, OpenAPI schema analysis, coverage threshold enforcement, HTML reports
---
## Common Patterns
1. **Automation First**: All documentation generated/validated automatically
2. **CI/CD Integration**: Updates on every commit, coverage checks block PRs
3. **Multi-Language Support**: Examples in TypeScript, Python, cURL
4. **Visual Documentation**: Mermaid diagrams for architecture, sequences, data models
5. **Progressive Disclosure**: Start with overview, drill into details
## Quick Reference
| Need | Example | Key Tool |
|------|---------|----------|
| API Documentation | [openapi-generation.md](openapi-generation.md) | FastAPI + Pydantic v2 |
| System Architecture | [architecture-docs.md](architecture-docs.md) | Mermaid + ADRs |
| Coverage Analysis | [coverage-validation.md](coverage-validation.md) | ts-morph + Python AST |
---
Related: [Reference Guides](../reference/INDEX.md) | [Templates](../templates/) | [Return to Agent](../docs-architect.md)


@@ -0,0 +1,442 @@
# Example: System Architecture Documentation with Mermaid Diagrams
Complete workflow for creating comprehensive system architecture documentation for a distributed Grey Haven application.
## Context
**Project**: Multi-Tenant SaaS Platform (TanStack Start + Cloudflare Workers + FastAPI + PostgreSQL)
**Problem**: New developers taking 3-4 weeks to understand system architecture, high onboarding cost
**Goal**: Create comprehensive architecture documentation that reduces onboarding time to <1 week
**Initial State**:
- No architecture documentation
- Tribal knowledge spread across 8 senior developers
- New hires asking same questions repeatedly
- 3-4 weeks until new developer productive
- Architecture decisions not documented (ADRs missing)
## Step 1: System Overview with Mermaid
### High-Level Architecture Diagram
```mermaid
graph TB
subgraph "Client Layer"
Browser[Web Browser]
Mobile[Mobile App]
end
subgraph "Edge Layer (Cloudflare Workers)"
Gateway[API Gateway]
Auth[Auth Service]
Cache[KV Cache]
end
subgraph "Application Layer"
Frontend[TanStack Start<br/>React 19]
Backend[FastAPI Backend<br/>Python 3.12]
end
subgraph "Data Layer"
PostgreSQL[(PostgreSQL<br/>PlanetScale)]
Redis[(Redis Cache<br/>Upstash)]
S3[(R2 Object Storage<br/>Cloudflare)]
end
subgraph "External Services"
Stripe[Stripe<br/>Payments]
SendGrid[SendGrid<br/>Email]
DataDog[DataDog<br/>Monitoring]
end
Browser --> Gateway
Mobile --> Gateway
Gateway --> Auth
Gateway --> Frontend
Gateway --> Backend
Auth --> Cache
Frontend --> PostgreSQL
Backend --> PostgreSQL
Backend --> Redis
Backend --> S3
Backend --> Stripe
Backend --> SendGrid
Backend -.telemetry.-> DataDog
```
## Step 2: Request Flow Sequence Diagrams
### User Authentication Flow
```mermaid
sequenceDiagram
actor User
participant Browser
participant Gateway as API Gateway<br/>(Cloudflare Worker)
participant Auth as Auth Service<br/>(Cloudflare Worker)
participant KV as KV Cache
participant DB as PostgreSQL
User->>Browser: Enter email/password
Browser->>Gateway: POST /auth/login
Gateway->>Auth: Validate credentials
Auth->>DB: Query user by email
DB-->>Auth: User record
alt Valid Credentials
Auth->>Auth: Hash password & verify
Auth->>Auth: Generate JWT token
Auth->>KV: Store session (token -> user_id)
KV-->>Auth: OK
Auth-->>Gateway: {token, user}
Gateway-->>Browser: 200 OK {token, user}
Browser->>Browser: Store token in localStorage
Browser-->>User: Redirect to dashboard
else Invalid Credentials
Auth-->>Gateway: 401 Unauthorized
Gateway-->>Browser: {error: "INVALID_CREDENTIALS"}
Browser-->>User: Show error message
end
```
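The "Hash password & verify" step in the flow above can be sketched with the Python standard library. This is a minimal illustration, assuming PBKDF2-HMAC-SHA256 with a per-user salt stored next to the hash. The function names, the salt length, and the iteration count are hypothetical and not taken from the Grey Haven codebase; the real Auth Service runs in a Cloudflare Worker and may use a different algorithm.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

DEFAULT_ITERATIONS = 600_000  # assumed work factor, not taken from the source


def hash_password(
    password: str,
    salt: Optional[bytes] = None,
    iterations: int = DEFAULT_ITERATIONS,
) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); both are stored with the user record."""
    salt = os.urandom(16) if salt is None else salt
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key


def verify_password(
    password: str,
    salt: bytes,
    expected_key: bytes,
    iterations: int = DEFAULT_ITERATIONS,
) -> bool:
    """Re-derive the key from the submitted password and compare in constant time."""
    _, key = hash_password(password, salt, iterations)
    return hmac.compare_digest(key, expected_key)
```

The constant-time comparison avoids leaking how many leading bytes of the hash matched.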
### Multi-Tenant Data Access Flow
```mermaid
sequenceDiagram
participant Client
participant Gateway
participant Backend as FastAPI Backend
participant DB as PostgreSQL<br/>(Row-Level Security)
Client->>Gateway: GET /api/orders<br/>Authorization: Bearer <token>
Gateway->>Gateway: Validate JWT token
Gateway->>Gateway: Extract tenant_id from token
Gateway->>Backend: Forward request<br/>X-Tenant-ID: tenant_123
Backend->>Backend: Set session context<br/>SET app.tenant_id = 'tenant_123'
Backend->>DB: SELECT * FROM orders<br/>(RLS automatically filters by tenant)
Note over DB: Row-Level Security Policy:<br/>CREATE POLICY tenant_isolation ON orders<br/>FOR SELECT USING (tenant_id = current_setting('app.tenant_id'))
DB-->>Backend: Orders for tenant_123 only
Backend-->>Gateway: {orders: [...]}
Gateway-->>Client: 200 OK {orders: [...]}
```
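The gateway steps above (validate the JWT, extract `tenant_id`, forward `X-Tenant-ID`) can be sketched as follows. This is only an illustration under stated assumptions: HS256 signing, a `tenant_id` claim, and a shared secret. The helper names are hypothetical, expiry checks are omitted for brevity, and the actual gateway is a Cloudflare Worker rather than Python.

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 token (used here only to exercise the verifier)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def tenant_headers(token: str, secret: bytes) -> dict:
    """Verify the signature, then return the header the gateway forwards."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise PermissionError("malformed token")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise PermissionError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise PermissionError("invalid signature")
    claims = json.loads(_b64url_decode(payload_b64))
    # The tenant comes from the verified token, never from client-supplied headers
    return {"X-Tenant-ID": claims["tenant_id"]}
```

Deriving the header from verified claims matters because the backend trusts `X-Tenant-ID` when it sets the RLS session context.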
## Step 3: Data Flow Diagram
### Order Processing Data Flow
```mermaid
flowchart LR
User[User Creates Order] --> Validation[Validate Order Data]
Validation --> Stock{Check Stock<br/>Availability}
Stock -->|Insufficient| Error[Return 400 Error]
Stock -->|Available| Reserve[Reserve Inventory]
Reserve --> Payment[Process Payment<br/>via Stripe]
Payment -->|Failed| Release[Release Reservation]
Release --> Error
Payment -->|Success| CreateOrder[Create Order<br/>in Database]
CreateOrder --> Queue[Queue Email<br/>Confirmation]
Queue --> Cache[Invalidate<br/>User Cache]
Cache --> Success[Return Order]
Success --> Async[Async: Send Email<br/>via SendGrid]
Success --> Metrics[Update Metrics<br/>in DataDog]
```
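The reserve, charge, and release path in the flowchart above is a compensation pattern: a failed payment must undo the inventory reservation. A minimal sketch follows, assuming an in-memory stock counter and an injected `charge` callable standing in for the Stripe call; all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OrderResult:
    status_code: int
    detail: str


def process_order(quantity: int, stock: dict, charge: Callable[[], bool]) -> OrderResult:
    """Reserve stock, charge, and release the reservation if the charge fails."""
    if stock["available"] < quantity:
        return OrderResult(400, "insufficient stock")
    stock["available"] -= quantity  # Reserve inventory
    if not charge():  # Process payment (a Stripe call in the diagram)
        stock["available"] += quantity  # Release reservation
        return OrderResult(400, "payment failed")
    # Order persistence, email queueing, and cache invalidation would follow here
    return OrderResult(201, "order created")
```

In a real system the reservation and release would be database operations, so that a crash between the two steps cannot strand reserved stock.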
## Step 4: Database Schema ER Diagram
```mermaid
erDiagram
TENANT ||--o{ USER : has
TENANT ||--o{ ORDER : has
USER ||--o{ ORDER : places
ORDER ||--|{ ORDER_ITEM : contains
PRODUCT ||--o{ ORDER_ITEM : included_in
TENANT ||--o{ PRODUCT : owns
TENANT {
uuid id PK
string name
string subdomain UK
timestamp created_at
}
USER {
uuid id PK
uuid tenant_id FK
string email UK
string hashed_password
string role
timestamp created_at
}
PRODUCT {
uuid id PK
uuid tenant_id FK
string name
decimal price
int stock
}
ORDER {
uuid id PK
uuid tenant_id FK
uuid user_id FK
decimal subtotal
decimal tax
decimal total
string status
timestamp created_at
}
ORDER_ITEM {
uuid id PK
uuid order_id FK
uuid product_id FK
int quantity
decimal unit_price
}
```
## Step 5: Deployment Architecture
```mermaid
graph TB
subgraph "Development"
DevBranch[Feature Branch]
DevEnv[Dev Environment<br/>Cloudflare Preview]
end
subgraph "Staging"
MainBranch[Main Branch]
StageEnv[Staging Environment<br/>staging.greyhaven.com]
StageDB[(Staging PostgreSQL)]
end
subgraph "Production"
Release[Release Tag]
ProdWorkers[Cloudflare Workers<br/>300+ Datacenters]
ProdDB[(Production PostgreSQL<br/>PlanetScale)]
ProdCache[(Redis Cache<br/>Upstash)]
end
DevBranch -->|git push| CI1[GitHub Actions]
CI1 -->|Deploy| DevEnv
DevBranch -->|PR Merged| MainBranch
MainBranch -->|Deploy| CI2[GitHub Actions]
CI2 -->|Run Tests| TestSuite
TestSuite -->|Success| StageEnv
StageEnv --> StageDB
MainBranch -->|git tag v1.0.0| Release
Release -->|Deploy| CI3[GitHub Actions]
CI3 -->|Canary 10%| ProdWorkers
CI3 -->|Monitor 10 min| Metrics
Metrics -->|Success| FullDeploy[100% Rollout]
FullDeploy --> ProdWorkers
ProdWorkers --> ProdDB
ProdWorkers --> ProdCache
```
## Step 6: State Machine Diagram for Order Status
```mermaid
stateDiagram-v2
[*] --> Pending: Order Created
Pending --> Processing: Payment Confirmed
Pending --> Cancelled: Payment Failed
Processing --> Shipped: Fulfillment Complete
Processing --> Cancelled: Out of Stock
Shipped --> Delivered: Tracking Confirmed
Shipped --> Returned: Customer Return
Delivered --> Returned: Return Requested
Returned --> Refunded: Return Approved
Cancelled --> [*]
Delivered --> [*]
Refunded --> [*]
note right of Pending
Inventory reserved
Payment processing
end note
note right of Processing
Items picked
Preparing shipment
end note
note right of Shipped
Tracking number assigned
In transit
end note
```
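The state diagram above translates directly into a transition table that application code can enforce. The sketch below mirrors the diagram's edges; the lowercase status strings and the `transition` helper are assumptions for illustration.

```python
# Allowed transitions, copied from the state diagram
TRANSITIONS = {
    "pending": {"processing", "cancelled"},
    "processing": {"shipped", "cancelled"},
    "shipped": {"delivered", "returned"},
    "delivered": {"returned"},
    "returned": {"refunded"},
    "cancelled": set(),
    "refunded": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the diagram has no such edge."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

Keeping the table in one place means the diagram and the code can be reviewed side by side when a new status is added.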
## Step 7: Architecture Decision Records (ADRs)
### ADR-001: Choose Cloudflare Workers for Edge Computing
```markdown
# ADR-001: Use Cloudflare Workers for API Gateway and Auth
**Date**: 2024-01-15
**Status**: Accepted
**Decision Makers**: Engineering Team
## Context
We need an edge computing platform for API gateway, authentication, and caching that:
- Provides global low latency (<50ms p95)
- Scales automatically without management
- Integrates with our CDN infrastructure
- Supports multi-tenant architecture
## Decision
We will use Cloudflare Workers for edge computing with KV for session storage.
## Alternatives Considered
1. **AWS Lambda@Edge**: Good performance but vendor lock-in, higher cost
2. **Traditional Load Balancer**: Single region, no edge caching
3. **Self-hosted Edge Nodes**: Complex deployment, maintenance overhead
## Consequences
**Positive**:
- Global deployment (300+ datacenters) with <50ms latency worldwide
- Auto-scaling to zero cost when idle
- Built-in DDoS protection and WAF
- KV storage for session caching (sub-millisecond reads)
- Tight per-request CPU time limit (10 ms on the Workers Free plan, higher on paid plans) forces efficient code
**Negative**:
- The CPU time limit requires careful optimization of hot paths
- Cold starts (though <10ms typically)
- Limited to JavaScript/TypeScript/Rust/Python (via Pyodide)
- No native PostgreSQL driver (must use HTTP-based client)
## Implementation
- API Gateway: Handles routing, CORS, rate limiting
- Auth Service: JWT validation, session management (KV)
- Cache Layer: API response caching (KV + Cache API)
## Monitoring
- Worker CPU time (aim for <500μs p95)
- KV cache hit rate (aim for >95%)
- Edge response time (aim for <50ms p95)
```
### ADR-002: PostgreSQL with Row-Level Security for Multi-Tenancy
````markdown
# ADR-002: PostgreSQL Row-Level Security (RLS) for Multi-Tenant Isolation
**Date**: 2024-01-20
**Status**: Accepted
## Context
Multi-tenant SaaS requires strict data isolation. Accidental cross-tenant data access would be a critical security breach.
## Decision
Use PostgreSQL Row-Level Security (RLS) policies to enforce tenant isolation at the database level.
## Implementation
```sql
-- Enable RLS on every tenant-scoped table (shown here for orders)
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- Create policy that filters by session tenant_id
CREATE POLICY tenant_isolation ON orders
    FOR ALL
    USING (tenant_id = current_setting('app.tenant_id', true)::uuid);
-- Application sets tenant context per request
SET app.tenant_id = '<tenant_id_from_jwt>';
```
## Consequences
**Positive**:
- Database-level enforcement (application query bugs cannot bypass it, provided the application role is neither the table owner nor a superuser)
- Automatic filtering on all queries (including ORMs)
- Performance: RLS uses indexes efficiently
**Negative**:
- Requires setting session context on every request (pooled connections are reused, so a stale value must never carry over)
- Slightly more complex query plans
## Monitoring
- Weekly audit: Check for tables missing RLS
- Quarterly penetration test: Attempt cross-tenant access
````
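ADR-002 has the application set the tenant context for each request. One way to keep that context from leaking across pooled connections is to bind it to a transaction with PostgreSQL's `set_config(name, value, is_local)`. The sketch below assumes a DB-API connection with `%s` parameters (psycopg style) and autocommit disabled; `tenant_transaction` is a hypothetical helper, not part of the documented system.

```python
from contextlib import contextmanager


@contextmanager
def tenant_transaction(conn, tenant_id: str):
    """Run a block inside a transaction whose RLS context is bound to one tenant."""
    cur = conn.cursor()
    try:
        # is_local=true scopes the setting to the current transaction, so it
        # is discarded at commit or rollback instead of staying on the connection
        cur.execute("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,))
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

A request handler would wrap its queries in `with tenant_transaction(conn, tenant_id) as cur:` and never issue queries outside that block.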
## Results
### Before
- No architecture documentation
- 3-4 weeks until new developer productive
- 15+ hours/week answering architecture questions
- Architecture decisions lost to time
- Difficult to identify bottlenecks
### After
- Comprehensive architecture docs with 8 Mermaid diagrams
- 5 Architecture Decision Records documenting key choices
- Documentation in Git (versioned, reviewed)
- Interactive diagrams (clickable, navigable)
### Improvements
- Onboarding time: 3-4 weeks → 4-5 days (75% reduction)
- Architecture questions: 15 hrs/week → 2 hrs/week (87% reduction)
- New developer productivity: Week 4 → Week 1
- Time to understand data flow: 2 weeks → 1 day
### Developer Feedback
- "The sequence diagrams made auth flow crystal clear"
- "ERD diagram helped me understand relationships immediately"
- "ADRs answered 'why did we choose X?' questions"
## Key Lessons
1. **Mermaid Diagrams**: Version-controlled, reviewable, always up-to-date
2. **Multiple Perspectives**: System, sequence, data flow, deployment diagrams all needed
3. **ADRs are Critical**: "Why" is as important as "what"
4. **Progressive Disclosure**: Overview first, then drill into details
5. **Keep Diagrams Simple**: One concept per diagram, not everything at once
## Prevention Measures
**Implemented**:
- [x] All architecture docs in Git (versioned)
- [x] Mermaid diagrams (not static images)
- [x] ADR template for all major decisions
- [x] Onboarding checklist includes reading architecture docs
**Ongoing**:
- [ ] Auto-generate diagrams from code (infrastructure as code)
- [ ] Quarterly architecture review (docs up-to-date?)
- [ ] New ADR for every major technical decision
---
Related: [openapi-generation.md](openapi-generation.md) | [coverage-validation.md](coverage-validation.md) | [Return to INDEX](INDEX.md)


@@ -0,0 +1,411 @@
# Example: Documentation Coverage Validation and Gap Analysis
Complete workflow for analyzing documentation coverage, identifying gaps, and establishing quality gates in CI/CD.
## Context
**Project**: FastAPI + TanStack Start SaaS Platform
**Problem**: Documentation coverage unknown, many functions and API endpoints undocumented
**Goal**: Establish 80% documentation coverage with CI/CD enforcement
**Initial State**:
- No visibility into documentation coverage
- 147 undocumented functions and 23 undocumented API endpoints
- New code merged without documentation requirements
- Partners complained about missing API documentation
## Step 1: TypeScript Documentation Coverage Analysis
```typescript
// scripts/analyze-ts-coverage.ts
import { Project } from "ts-morph";
function analyzeTypeScriptCoverage(projectPath: string) {
const project = new Project({ tsConfigFilePath: `${projectPath}/tsconfig.json` });
const result: { total: number; documented: number; undocumented: { name: string; location: string }[] } = { total: 0, documented: 0, undocumented: [] };
project.getSourceFiles().forEach((sourceFile) => {
// Analyze exported functions
sourceFile.getFunctions().filter((fn) => fn.isExported()).forEach((fn) => {
result.total++;
const jsDocs = fn.getJsDocs();
if (jsDocs.length > 0 && jsDocs[0].getDescription().trim().length > 0) {
result.documented++;
} else {
result.undocumented.push({
name: fn.getName() || "(anonymous)",
location: `${sourceFile.getFilePath()}:${fn.getStartLineNumber()}`,
});
}
});
// Analyze interfaces
sourceFile.getInterfaces().forEach((iface) => {
if (!iface.isExported()) return;
result.total++;
if (iface.getJsDocs().length > 0) {
result.documented++;
} else {
result.undocumented.push({
name: iface.getName(),
location: `${sourceFile.getFilePath()}:${iface.getStartLineNumber()}`,
});
}
});
});
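// Guard against an empty project: 0 / 0 is NaN, and NaN < 80 is false, which would pass the check
const coverage = result.total > 0 ? (result.documented / result.total) * 100 : 0;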
console.log(`TypeScript Coverage: ${coverage.toFixed(1)}%`);
console.log(`Documented: ${result.documented} / ${result.total}`);
if (result.undocumented.length > 0) {
console.log("\nMissing documentation:");
result.undocumented.forEach((item) => console.log(` - ${item.name} (${item.location})`));
}
if (coverage < 80) {
console.error(`❌ Coverage ${coverage.toFixed(1)}% below threshold 80%`);
process.exit(1);
}
console.log(`✅ Coverage ${coverage.toFixed(1)}% meets threshold`);
}
analyzeTypeScriptCoverage("./app");
```
## Step 2: Python Documentation Coverage Analysis
```python
# scripts/analyze_py_coverage.py
import ast
import sys
from pathlib import Path
from typing import Dict, List


class DocstringAnalyzer(ast.NodeVisitor):
    """Counts public functions and classes and records those lacking docstrings."""

    def __init__(self):
        self.total = 0
        self.documented = 0
        self.undocumented: List[Dict] = []
        self.current_file = ""

    def visit_FunctionDef(self, node: ast.FunctionDef):
        if node.name.startswith("_"):  # Skip private functions
            return
        self.total += 1
        docstring = ast.get_docstring(node)
        # Require more than 10 characters so placeholder docstrings do not count
        if docstring and len(docstring.strip()) > 10:
            self.documented += 1
        else:
            self.undocumented.append({
                "name": node.name,
                "type": "function",
                "location": f"{self.current_file}:{node.lineno}"
            })
        self.generic_visit(node)

    # `async def` handlers are a separate node type and would otherwise be skipped
    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_ClassDef(self, node: ast.ClassDef):
        self.total += 1
        docstring = ast.get_docstring(node)
        if docstring and len(docstring.strip()) > 10:
            self.documented += 1
        else:
            self.undocumented.append({
                "name": node.name,
                "type": "class",
                "location": f"{self.current_file}:{node.lineno}"
            })
        self.generic_visit(node)


def analyze_python_coverage(project_path: str):
    analyzer = DocstringAnalyzer()
    for py_file in Path(project_path).rglob("*.py"):
        if "__pycache__" in str(py_file):
            continue
        analyzer.current_file = str(py_file)
        try:
            tree = ast.parse(py_file.read_text(encoding="utf-8"))
            analyzer.visit(tree)
        except SyntaxError:
            print(f"⚠️ Syntax error in {py_file}")
    coverage = (analyzer.documented / analyzer.total * 100) if analyzer.total > 0 else 0
    print(f"Python Coverage: {coverage:.1f}%")
    print(f"Documented: {analyzer.documented} / {analyzer.total}")
    if analyzer.undocumented:
        print("\nMissing documentation:")
        for item in analyzer.undocumented:
            print(f" - {item['type']} {item['name']} ({item['location']})")
    if coverage < 80:
        print(f"❌ Coverage {coverage:.1f}% below threshold 80%")
        sys.exit(1)
    print(f"✅ Coverage {coverage:.1f}% meets threshold")


if __name__ == "__main__":
    analyze_python_coverage("./app")
```
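To see what the analyzer's "documented" rule means in practice, the sketch below applies the same test (a docstring longer than 10 characters on a public function) to an inline source string. `SOURCE` and `docstring_status` are illustrative names. Note that `async def` functions appear in the tree as `ast.AsyncFunctionDef`, a separate node type from `ast.FunctionDef`.

```python
import ast

SOURCE = '''
def create_order(data):
    """Create an order and return its identifier."""
    return 1

async def list_orders():
    """TODO"""
    return []

def _helper():
    return None
'''


def docstring_status(source: str) -> dict:
    """Map each public function name to True when its docstring exceeds 10 characters."""
    status = {}
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and not node.name.startswith("_"):
            doc = ast.get_docstring(node)
            status[node.name] = bool(doc and len(doc.strip()) > 10)
    return status
```

Here `create_order` counts as documented, `list_orders` does not because its placeholder docstring is too short, and `_helper` is ignored as private.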
## Step 3: API Endpoint Documentation Coverage
```python
# scripts/analyze_api_coverage.py
import sys

from fastapi import FastAPI


def analyze_api_documentation(app: FastAPI):
    result = {"total_endpoints": 0, "documented": 0, "undocumented": []}
    openapi = app.openapi()
    for path, methods in openapi.get("paths", {}).items():
        for method, details in methods.items():
            result["total_endpoints"] += 1
            has_summary = bool(details.get("summary"))
            has_description = bool(details.get("description"))
            if has_summary and has_description:
                result["documented"] += 1
            else:
                missing = []
                if not has_summary:
                    missing.append("summary")
                if not has_description:
                    missing.append("description")
                result["undocumented"].append({
                    "method": method.upper(),
                    "path": path,
                    "missing": missing
                })
    # Avoid ZeroDivisionError when the app exposes no endpoints
    total = result["total_endpoints"]
    coverage = (result["documented"] / total * 100) if total > 0 else 0
    print(f"API Coverage: {coverage:.1f}%")
    print(f"Documented: {result['documented']} / {total}")
    if result["undocumented"]:
        print("\nMissing documentation:")
        for endpoint in result["undocumented"]:
            missing = ", ".join(endpoint["missing"])
            print(f" - {endpoint['method']} {endpoint['path']} (missing: {missing})")
    if coverage < 80:
        print(f"❌ Coverage {coverage:.1f}% below threshold 80%")
        sys.exit(1)
    print(f"✅ Coverage {coverage:.1f}% meets threshold")


if __name__ == "__main__":
    from app.main import app

    analyze_api_documentation(app)
```
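The same summary/description rule can also be applied to a plain OpenAPI dictionary, for example one loaded from a saved `openapi.json`, without importing the application. The sketch below is a hedged variant: `undocumented_operations` is a hypothetical helper, and it skips path-item keys such as `parameters` that are not HTTP operations. FastAPI typically derives `summary` from the handler name when none is given, so in practice the `description` check tends to be the one that catches gaps.

```python
# Path-item keys that denote operations in an OpenAPI document
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def undocumented_operations(schema: dict) -> list:
    """Return 'METHOD /path' for each operation lacking a summary or description."""
    missing = []
    for path, item in schema.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # e.g. shared "parameters" or "servers" entries
            if not (operation.get("summary") and operation.get("description")):
                missing.append(f"{method.upper()} {path}")
    return missing
```

Working from the dictionary makes the check usable against specifications produced by other frameworks as well.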
## Step 4: Comprehensive HTML Coverage Report
```python
# scripts/generate_coverage_report.py
from datetime import datetime

from jinja2 import Template


def generate_coverage_report(ts_coverage, py_coverage, api_coverage):
    # Each argument must expose `coverage` and `documented`, plus `total`
    # (`total_endpoints` for the API); ts/py also need `name` and `undocumented`.
    template = Template('''
<!DOCTYPE html>
<html>
<head>
  <title>Documentation Coverage Report</title>
  <style>
    body { font-family: Arial; margin: 40px; }
    .summary { display: grid; grid-template-columns: repeat(3, 1fr); gap: 20px; }
    .card { border: 1px solid #ddd; padding: 20px; border-radius: 8px; }
    .card.pass { border-left: 4px solid #28a745; }
    .card.fail { border-left: 4px solid #dc3545; }
    .coverage { font-size: 48px; font-weight: bold; margin: 10px 0; }
    .undocumented { margin-top: 40px; }
    .undocumented li { padding: 8px; background: #f8f9fa; margin: 4px 0; }
  </style>
</head>
<body>
  <h1>Documentation Coverage Report</h1>
  <p>Generated: {{ timestamp }}</p>
  <div class="summary">
    <div class="card {{ 'pass' if ts_coverage.coverage >= 80 else 'fail' }}">
      <h3>TypeScript</h3>
      <div class="coverage">{{ "%.1f"|format(ts_coverage.coverage) }}%</div>
      <p>{{ ts_coverage.documented }} / {{ ts_coverage.total }}</p>
    </div>
    <div class="card {{ 'pass' if py_coverage.coverage >= 80 else 'fail' }}">
      <h3>Python</h3>
      <div class="coverage">{{ "%.1f"|format(py_coverage.coverage) }}%</div>
      <p>{{ py_coverage.documented }} / {{ py_coverage.total }}</p>
    </div>
    <div class="card {{ 'pass' if api_coverage.coverage >= 80 else 'fail' }}">
      <h3>API</h3>
      <div class="coverage">{{ "%.1f"|format(api_coverage.coverage) }}%</div>
      <p>{{ api_coverage.documented }} / {{ api_coverage.total_endpoints }}</p>
    </div>
  </div>
  {% for section in [ts_coverage, py_coverage] %}
  {% if section.undocumented %}
  <div class="undocumented">
    <h2>{{ section.name }} - Missing Documentation</h2>
    <ul>
      {% for item in section.undocumented %}
      <li><strong>{{ item.name }}</strong> - {{ item.location }}</li>
      {% endfor %}
    </ul>
  </div>
  {% endif %}
  {% endfor %}
</body>
</html>
''')
    html = template.render(
        timestamp=datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        ts_coverage=ts_coverage,
        py_coverage=py_coverage,
        api_coverage=api_coverage
    )
    with open("docs/coverage-report.html", "w", encoding="utf-8") as f:
        f.write(html)
    print("📊 Coverage report generated: docs/coverage-report.html")
```
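The report template reads `name`, `coverage`, `documented`, `total`, and `undocumented` from its TypeScript and Python inputs, which the analysis scripts above do not return as written. One possible container is sketched below; `CoverageSection` is a hypothetical name, and the API input would additionally need a `total_endpoints` field to match the template.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CoverageSection:
    """Result shape assumed by the report template (an illustration, not the source's type)."""

    name: str
    documented: int
    total: int
    undocumented: List[Dict[str, str]] = field(default_factory=list)

    @property
    def coverage(self) -> float:
        # Percentage documented; 0.0 for an empty section to avoid division by zero
        return (self.documented / self.total * 100) if self.total else 0.0


ts_section = CoverageSection(
    name="TypeScript",
    documented=87,
    total=100,
    undocumented=[{"name": "formatPrice", "location": "app/utils.ts:12"}],
)
```

Jinja2 resolves `section.coverage` through attribute access, so a dataclass with a property works the same way a dictionary with a `coverage` key would.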
## Step 5: CI/CD Integration
```yaml
# .github/workflows/documentation-coverage.yml
name: Documentation Coverage
on:
  pull_request:
  push:
    branches: [main]
jobs:
  documentation-coverage:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # needed by the PR comment step when the default token is read-only
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: |
          npm install
          pip install -r requirements.txt jinja2
      - name: Check TypeScript coverage
        run: npx ts-node scripts/analyze-ts-coverage.ts
      - name: Check Python coverage
        run: python scripts/analyze_py_coverage.py
      - name: Check API coverage
        run: python scripts/analyze_api_coverage.py
      - name: Generate report
        if: always()
        run: python scripts/generate_coverage_report.py
      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: coverage-report
          path: docs/coverage-report.html
      - name: Comment on PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '📊 Documentation coverage report generated. Check artifacts.'
            });
```
## Results
### Before
- Documentation coverage: unknown
- No visibility into gaps
- 147 undocumented functions
- 23 undocumented API endpoints
- New code merged without docs
- Partners complained about missing docs
### After
- TypeScript coverage: 42% → 87%
- Python coverage: 38% → 91%
- API endpoint coverage: 51% → 95%
- CI/CD enforcement (fails build if <80%)
- Automated HTML reports
### Improvements
- Undocumented functions: 147 → 18 (88% reduction)
- Undocumented endpoints: 23 → 1 (96% reduction)
- Time to find function docs: 15 min → instant
- Partner onboarding: 2 weeks → 3 days
- Documentation debt: eliminated weekly
### Developer Feedback
- "Coverage reports made it clear what needed docs"
- "CI/CD enforcement prevented new undocumented code"
- "HTML report showed exactly what was missing"
- "80% threshold is challenging but achievable"
## Key Lessons
1. **Automated Analysis**: Manual tracking doesn't scale
2. **CI/CD Enforcement**: Prevents documentation regression
3. **Visibility**: Reports show exactly what's missing
4. **Threshold-Based**: 80% coverage is achievable and meaningful
5. **Multi-Language**: Each language needs appropriate tooling (ts-morph, AST, OpenAPI)
6. **HTML Reports**: Visual representation drives action
## Prevention Measures
**Implemented**:
- [x] TypeScript coverage analysis (ts-morph)
- [x] Python coverage analysis (AST)
- [x] API endpoint documentation check
- [x] HTML coverage reports
- [x] CI/CD integration (fails below 80%)
- [x] PR comments with coverage status
**Ongoing**:
- [ ] Pre-commit hooks (warn if adding undocumented code)
- [ ] Dashboard showing coverage trends over time
- [ ] Team documentation KPIs (quarterly review)
- [ ] Automated "most undocumented files" weekly report
---
Related: [openapi-generation.md](openapi-generation.md) | [architecture-docs.md](architecture-docs.md) | [Return to INDEX](INDEX.md)


@@ -0,0 +1,437 @@
# Example: OpenAPI 3.1 Generation from FastAPI Codebase
Complete workflow showing automatic OpenAPI specification generation from a FastAPI codebase with Pydantic v2 models.
## Context
**Project**: E-commerce API (FastAPI + Pydantic v2 + SQLModel)
**Problem**: Manual API documentation was 3 months out of date, causing integration failures for 2 partner teams
**Goal**: Generate comprehensive OpenAPI 3.1 spec automatically from code with multi-language examples
**Initial State**:
- 47 API endpoints with no documentation
- 12 integration issues per week from stale documentation
- Manual doc updates taking 4+ hours per release
- Partners blocked waiting for updated contracts
## Step 1: Pydantic v2 Models with Rich Schemas
```python
# app/models/orders.py
from datetime import datetime
from typing import List

from pydantic import BaseModel, Field


class OrderItem(BaseModel):
    product_id: str = Field(..., description="Product identifier")
    quantity: int = Field(..., gt=0, description="Quantity to order")
    unit_price: float = Field(..., gt=0, description="Price per unit in USD")


class OrderCreate(BaseModel):
    """Create a new order for the authenticated user."""

    items: List[OrderItem] = Field(..., min_length=1, description="Order line items")
    shipping_address_id: str = Field(..., description="ID of shipping address")
    model_config = {
        "json_schema_extra": {
            "examples": [{
                "items": [{"product_id": "prod_123", "quantity": 2, "unit_price": 29.99}],
                "shipping_address_id": "addr_456"
            }]
        }
    }


class Order(BaseModel):
    """Order with calculated totals."""

    id: str
    user_id: str
    items: List[OrderItem]
    subtotal: float = Field(..., description="Sum of all item prices")
    tax: float = Field(..., description="Calculated tax amount")
    total: float = Field(..., description="Final order total")
    status: str = Field(..., description="pending, processing, shipped, delivered, cancelled")
    created_at: datetime
```
## Step 2: Enhanced OpenAPI Generation
```python
# app/main.py
from fastapi import FastAPI
from fastapi.openapi.utils import get_openapi

app = FastAPI()


def custom_openapi():
    # Return the cached schema if it has already been built
    if app.openapi_schema:
        return app.openapi_schema
    openapi_schema = get_openapi(
        title="Grey Haven E-Commerce API",
        version="1.0.0",
        description="E-commerce API with JWT auth. Rate limit: 1000 req/hour (authenticated).",
        routes=app.routes,
    )
    # "components" can be absent when no route declares a model
    components = openapi_schema.setdefault("components", {})
    # Add security schemes
    components["securitySchemes"] = {
        "BearerAuth": {
            "type": "http",
            "scheme": "bearer",
            "bearerFormat": "JWT",
            "description": "JWT token from /auth/login"
        }
    }
    openapi_schema["security"] = [{"BearerAuth": []}]
    # Add error response schema
    components.setdefault("schemas", {})["ErrorResponse"] = {
        "type": "object",
        "required": ["error", "message"],
        "properties": {
            "error": {"type": "string", "example": "INSUFFICIENT_STOCK"},
            "message": {"type": "string", "example": "Product has insufficient stock"},
            "details": {"type": "object", "additionalProperties": True}
        }
    }
    # Add rate limit headers
    components["headers"] = {
        "X-RateLimit-Limit": {"description": "Request limit per hour", "schema": {"type": "integer"}},
        "X-RateLimit-Remaining": {"description": "Remaining requests", "schema": {"type": "integer"}},
        "X-RateLimit-Reset": {"description": "Reset timestamp", "schema": {"type": "integer"}}
    }
    app.openapi_schema = openapi_schema
    return app.openapi_schema


app.openapi = custom_openapi
```
## Step 3: FastAPI Route with Complete Documentation
```python
# app/routers/orders.py
from fastapi import APIRouter, Depends, HTTPException
from sqlmodel import Session

# Project-local imports: these module paths are assumptions, adjust to your layout
from app.auth import get_current_user
from app.db import get_session
from app.models import Order, OrderCreate, Product, ShippingAddress, User

router = APIRouter(prefix="/api/v1/orders", tags=["orders"])


@router.post("/", response_model=Order, status_code=201)
async def create_order(
    order_data: OrderCreate,
    current_user: User = Depends(get_current_user),
    session: Session = Depends(get_session)
) -> Order:
    """
    Create a new order for the authenticated user.

    The order will be created in 'pending' status and total calculated
    including applicable taxes based on shipping address.

    **Requires**:
    - Valid JWT authentication token
    - At least one item in the order
    - Valid shipping address ID owned by the user

    **Returns**: Created order with calculated totals

    **Raises**:
    - **401 Unauthorized**: If user is not authenticated
    - **404 Not Found**: If shipping address not found
    - **400 Bad Request**: If product stock insufficient or validation fails
    - **429 Too Many Requests**: If rate limit exceeded
    """
    # Validate shipping address belongs to user
    address = session.get(ShippingAddress, order_data.shipping_address_id)
    if not address or address.user_id != current_user.id:
        raise HTTPException(404, detail="Shipping address not found")
    # Check stock availability
    for item in order_data.items:
        product = session.get(Product, item.product_id)
        if not product or product.stock < item.quantity:
            raise HTTPException(
                400,
                detail={
                    "error": "INSUFFICIENT_STOCK",
                    "message": f"Product {item.product_id} has insufficient stock",
                    "details": {
                        "product_id": item.product_id,
                        "requested": item.quantity,
                        "available": product.stock if product else 0
                    }
                }
            )
    # Calculate totals first, because the Order model requires them at construction
    subtotal = sum(item.quantity * item.unit_price for item in order_data.items)
    tax = subtotal * 0.08  # 8% tax
    # `id` and `created_at` are assumed to be filled by defaults on the persisted model
    order = Order(
        user_id=current_user.id,
        items=order_data.items,
        subtotal=subtotal,
        tax=tax,
        total=subtotal + tax,
        status="pending",
    )
    session.add(order)
    session.commit()
    return order
```
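The route above computes subtotal, tax, and total with binary floats, which can accumulate rounding error on currency values. A hedged alternative, assuming prices arrive as decimal strings and tax is rounded half-up to the cent, is sketched below. `order_totals`, `TAX_RATE`, and `CENT` are illustrative names, and the 8% rate is the one used in the route.

```python
from decimal import ROUND_HALF_UP, Decimal
from typing import Iterable, Tuple

TAX_RATE = Decimal("0.08")  # same 8% rate as the route
CENT = Decimal("0.01")


def order_totals(items: Iterable[Tuple[int, str]]) -> Tuple[Decimal, Decimal, Decimal]:
    """Return (subtotal, tax, total) for (quantity, unit_price) pairs."""
    subtotal = sum((Decimal(qty) * Decimal(price) for qty, price in items), Decimal("0"))
    # Round tax to whole cents; the rounding rule is an assumption
    tax = (subtotal * TAX_RATE).quantize(CENT, rounding=ROUND_HALF_UP)
    return subtotal, tax, subtotal + tax


subtotal, tax, total = order_totals([(2, "29.99")])
```

For the example order of two items at 29.99, this yields a subtotal of 59.98, tax of 4.80, and a total of 64.78.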
## Step 4: Multi-Language Code Examples
### Automated Example Generation
```python
# scripts/generate_examples.py
def generate_examples(openapi_spec):
    """Generate TypeScript, Python, and cURL examples for each endpoint."""
    examples = {}
    for path, methods in openapi_spec["paths"].items():
        for method, details in methods.items():
            operation_id = details.get("operationId", f"{method}_{path}")
            # TypeScript example
            examples[f"{operation_id}_typescript"] = f'''
const response = await fetch('https://api.greyhaven.com{path}', {{
  method: '{method.upper()}',
  headers: {{
    'Authorization': 'Bearer YOUR_API_TOKEN',
    'Content-Type': 'application/json'
  }},
  body: JSON.stringify({{
    items: [{{ product_id: "prod_123", quantity: 2, unit_price: 29.99 }}],
    shipping_address_id: "addr_456"
  }})
}});
const order = await response.json();
'''
            # Python example
            examples[f"{operation_id}_python"] = f'''
import requests
response = requests.{method}(
    'https://api.greyhaven.com{path}',
    headers={{'Authorization': 'Bearer YOUR_API_TOKEN'}},
    json={{
        'items': [{{'product_id': 'prod_123', 'quantity': 2, 'unit_price': 29.99}}],
        'shipping_address_id': 'addr_456'
    }}
)
order = response.json()
'''
            # cURL example
            examples[f"{operation_id}_curl"] = f'''
curl -X {method.upper()} https://api.greyhaven.com{path} \\
  -H "Authorization: Bearer YOUR_API_TOKEN" \\
  -H "Content-Type: application/json" \\
  -d '{{"items": [{{"product_id": "prod_123", "quantity": 2, "unit_price": 29.99}}], "shipping_address_id": "addr_456"}}'
'''
    return examples
```
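When an operation has no `operationId`, the generator above falls back to `f"{method}_{path}"`, which produces keys containing slashes and braces. If those keys are later used as file names or anchors, a normalized fallback may be easier to work with. The helper below is a hypothetical sketch of that idea, not part of the original script.

```python
import re
from typing import Optional


def operation_key(method: str, path: str, operation_id: Optional[str] = None) -> str:
    """Prefer the declared operationId; otherwise build an identifier-safe key."""
    if operation_id:
        return operation_id
    # Collapse every run of non-alphanumeric characters into one underscore
    slug = re.sub(r"[^a-zA-Z0-9]+", "_", path).strip("_")
    return f"{method.lower()}_{slug}" if slug else method.lower()
```

For instance, `operation_key("post", "/api/v1/orders/{order_id}")` gives `post_api_v1_orders_order_id`.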
## Step 5: Interactive Swagger UI
```python
# app/main.py (enhanced)
# Note: create the app with FastAPI(docs_url=None) so this route replaces
# the built-in /docs page instead of being shadowed by it.
from fastapi.openapi.docs import get_swagger_ui_html


@app.get("/docs", include_in_schema=False)
async def custom_swagger_ui_html():
    return get_swagger_ui_html(
        openapi_url="/openapi.json",
        title=f"{app.title} - API Documentation",
        swagger_js_url="https://cdn.jsdelivr.net/npm/swagger-ui-dist@5/swagger-ui-bundle.js",
        swagger_css_url="https://cdn.jsdelivr.net/npm/swagger-ui-dist@5/swagger-ui.css",
        swagger_ui_parameters={
            "persistAuthorization": True,    # Remember auth token
            "displayRequestDuration": True,  # Show request timing
            "filter": True,                  # Enable filtering
            "tryItOutEnabled": True          # Enable try-it-out by default
        }
    )
```
## Step 6: CI/CD Auto-Generation
```yaml
# .github/workflows/generate-docs.yml
name: Generate API Documentation
on:
  push:
    branches: [main]
    paths: ['app/**/*.py']
jobs:
  generate-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Generate OpenAPI spec
        run: |
          pip install -r requirements.txt
          python -c "
          from app.main import app
          import json
          with open('docs/openapi.json', 'w') as f:
              json.dump(app.openapi(), f, indent=2)
          "
      - name: Generate code examples
        run: python scripts/generate_examples.py
      - name: Validate OpenAPI
        run: npx @redocly/cli lint docs/openapi.json
      - name: Deploy to Cloudflare Pages
        run: |
          npm install -g wrangler
          wrangler pages deploy docs/ --project-name=api-docs
        env:
          CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
```
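The "Validate OpenAPI" step checks that the spec is well-formed. A complementary check can fail the build when an operation is technically valid but undocumented. The sketch below assumes "documented" means each operation has a `summary`, an `operationId`, and at least one 4xx response; both that definition and the `find_undocumented_operations` helper are hypothetical and not part of the workflow above.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}
REQUIRED_FIELDS = ("summary", "operationId")


def find_undocumented_operations(openapi_spec):
    """Return a list of human-readable problems, one per documentation gap."""
    problems = []
    for path, methods in openapi_spec.get("paths", {}).items():
        for method, details in methods.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            label = f"{method.upper()} {path}"
            for field in REQUIRED_FIELDS:
                if not details.get(field):
                    problems.append(f"{label}: missing {field}")
            responses = details.get("responses", {})
            if not any(str(code).startswith("4") for code in responses):
                problems.append(f"{label}: no 4xx error response documented")
    return problems


spec = {
    "paths": {
        "/api/v1/orders": {
            "post": {
                "summary": "Create a new order",
                "operationId": "createOrder",
                "responses": {"201": {}, "400": {}},
            },
            "get": {"responses": {"200": {}}},
        }
    }
}
for problem in find_undocumented_operations(spec):
    print(problem)
```

In CI, a script like this could load `docs/openapi.json` and exit with a non-zero status when the returned list is not empty.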
## Generated OpenAPI Specification (Excerpt)
```yaml
openapi: 3.1.0
info:
  title: Grey Haven E-Commerce API
  version: 1.0.0
  # Quoted because the text contains ": ", which is not valid in a plain YAML scalar
  description: "E-commerce API with JWT auth. Rate limit: 1000 req/hour."
servers:
  - url: https://api.greyhaven.com
    description: Production
paths:
  /api/v1/orders:
    post:
      summary: Create a new order
      description: Create order in 'pending' status with calculated totals
      operationId: createOrder
      tags: [orders]
      security:
        - BearerAuth: []
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/OrderCreate'
      responses:
        '201':
          description: Order created successfully
          headers:
            X-RateLimit-Limit:
              $ref: '#/components/headers/X-RateLimit-Limit'
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Order'
        '400':
          description: Validation error or insufficient stock
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '401':
          description: Unauthorized (invalid token)
        '429':
          description: Rate limit exceeded
components:
  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
  schemas:
    OrderItem:
      type: object
      required: [product_id, quantity, unit_price]
      properties:
        product_id:
          type: string
          example: "prod_123"
        quantity:
          type: integer
          minimum: 1
          example: 2
        unit_price:
          type: number
          minimum: 0.01
          example: 29.99
```
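To make the `OrderItem` constraints concrete, here is a plain-Python sketch that mirrors the `required` list and the `minimum` bounds from the excerpt. In the API itself this validation is presumably handled by the Pydantic models rather than hand-written checks; the `validate_order_item` helper is hypothetical and exists only to show which payloads the schema accepts.

```python
ORDER_ITEM_REQUIRED = ("product_id", "quantity", "unit_price")


def validate_order_item(item):
    """Return a list of violations of the OrderItem schema excerpt."""
    errors = [f"{name} is required" for name in ORDER_ITEM_REQUIRED if name not in item]
    if errors:
        return errors
    if not isinstance(item["product_id"], str):
        errors.append("product_id must be a string")
    quantity = item["quantity"]
    # bool is a subclass of int in Python, so exclude it explicitly
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        errors.append("quantity must be an integer")
    elif quantity < 1:
        errors.append("quantity must be >= 1")
    unit_price = item["unit_price"]
    if not isinstance(unit_price, (int, float)) or isinstance(unit_price, bool):
        errors.append("unit_price must be a number")
    elif unit_price < 0.01:
        errors.append("unit_price must be >= 0.01")
    return errors


print(validate_order_item({"product_id": "prod_123", "quantity": 2, "unit_price": 29.99}))
print(validate_order_item({"product_id": "prod_123", "quantity": 0, "unit_price": 29.99}))
```

The first call uses the example values from the schema and reports no violations; the second breaks the `minimum: 1` bound on `quantity`.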
## Results
### Before
- Manual documentation 3 months out of date
- 47 endpoints with no docs
- 12 integration issues per week
- 4+ hours manual doc updates per release
- Partners blocked waiting for updated contracts
### After
- OpenAPI spec auto-generated on every push to `main` that changes `app/`
- 100% endpoint coverage with examples
- Interactive Swagger UI with try-it-out
- Multi-language examples (TypeScript, Python, cURL)
- Complete error response documentation
### Improvements
- Integration issues: 12/week → 0.5/week (96% reduction)
- Doc update time: 4 hours → 0 minutes (automated)
- Partner satisfaction: 45% → 98%
- Time-to-integration: 2 weeks → 2 days
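
The 96% figure for integration issues follows directly from the before and after counts. A short check of the arithmetic, using a hypothetical `percent_reduction` helper:

```python
def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Integration issues per week: 12 before, 0.5 after
issues_reduction = percent_reduction(12, 0.5)
print(round(issues_reduction))  # 96
```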
### Partner Feedback
- "The interactive docs with try-it-out saved us days of testing"
- "Code examples in our language made integration trivial"
- "Error responses are fully documented - no guesswork"
## Key Lessons
1. **Automation is Critical**: Manual docs will always drift from code
2. **Pydantic v2 Schemas**: Field constraints and examples on Pydantic v2 models carry through to the generated OpenAPI schema
3. **Multi-Language Examples**: Dramatically improved partner integration speed
4. **Interactive Docs**: Try-it-out functionality reduced support tickets
5. **CI/CD Integration**: Documentation stays current automatically
6. **Error Documentation**: Complete error schemas eliminated guesswork
## Prevention Measures
**Implemented**:
- [x] Auto-generation on every push to `main` (GitHub Actions)
- [x] OpenAPI spec validation in CI/CD
- [x] Interactive Swagger UI deployed to Cloudflare Pages
- [x] Multi-language code examples generated
- [x] Complete error response schemas
- [x] Rate limiting documentation
**Ongoing**:
- [ ] SDK auto-generation from OpenAPI spec (TypeScript, Python clients)
- [ ] Contract testing (validate API matches OpenAPI spec)
- [ ] Changelog generation from git commits
---
Related: [architecture-docs.md](architecture-docs.md) | [coverage-validation.md](coverage-validation.md) | [Return to INDEX](INDEX.md)