Initial commit
413
skills/observability-monitoring/SKILL.md
Normal file
@@ -0,0 +1,413 @@
---
name: observability-monitoring
description: Implement observability and monitoring using Cloudflare Workers Analytics, wrangler tail for logs, and health checks. Use when setting up monitoring, implementing logging, configuring alerts, or debugging production issues.
---

# Grey Haven Observability and Monitoring

Implement comprehensive monitoring for Grey Haven applications using **Cloudflare Workers** built-in observability tools.

## Observability Stack

### Grey Haven Monitoring Architecture

- **Logging**: Cloudflare Workers logs + wrangler tail
- **Metrics**: Cloudflare Workers Analytics dashboard
- **Custom Events**: Cloudflare Analytics Engine
- **Health Checks**: Cloudflare Health Checks for endpoint availability
- **Error Tracking**: Console errors visible in Cloudflare dashboard
## Cloudflare Workers Logging

### Console Logging in Workers

```typescript
// app/utils/logger.ts
export interface LogEvent {
  level: "debug" | "info" | "warn" | "error";
  message: string;
  context?: Record<string, unknown>;
  userId?: string;
  tenantId?: string;
  requestId?: string;
  duration?: number;
}

export function log(event: LogEvent) {
  const logData = {
    timestamp: new Date().toISOString(),
    level: event.level,
    message: event.message,
    // process.env is only populated in Workers when Node.js compatibility
    // (nodejs_compat) is enabled; otherwise pass the value in from `env`.
    environment: process.env.ENVIRONMENT,
    user_id: event.userId,
    tenant_id: event.tenantId,
    request_id: event.requestId,
    duration_ms: event.duration,
    // Spread last: context keys (e.g. request_id) override the fields above.
    ...event.context,
  };

  // Structured console logging (visible in Cloudflare dashboard)
  console[event.level](JSON.stringify(logData));
}

// Convenience methods
export const logger = {
  debug: (message: string, context?: Record<string, unknown>) =>
    log({ level: "debug", message, context }),
  info: (message: string, context?: Record<string, unknown>) =>
    log({ level: "info", message, context }),
  warn: (message: string, context?: Record<string, unknown>) =>
    log({ level: "warn", message, context }),
  error: (message: string, context?: Record<string, unknown>) =>
    log({ level: "error", message, context }),
};
```
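
The logger above takes request and tenant fields on every call, which is easy to forget on one code path. A minimal sketch of a bound logger follows. `createBoundLogger` and its `sink` parameter are hypothetical names introduced here for illustration, not part of the Grey Haven logger; the sink exists only so the output can be captured in tests.

```typescript
type Level = "debug" | "info" | "warn" | "error";
type Fields = Record<string, unknown>;
type Sink = (level: Level, line: string) => void;

// Default sink mirrors the logger above: one JSON string per console call.
const consoleSink: Sink = (level, line) => console[level](line);

// Hypothetical helper: binds fields such as request_id or tenant_id once,
// so every later call on the returned logger includes them.
export function createBoundLogger(bound: Fields, sink: Sink = consoleSink) {
  const emit =
    (level: Level) =>
    (message: string, extra: Fields = {}) =>
      sink(
        level,
        JSON.stringify({
          timestamp: new Date().toISOString(),
          level,
          message,
          ...bound,
          ...extra, // per-call fields override bound ones
        })
      );

  return {
    debug: emit("debug"),
    info: emit("info"),
    warn: emit("warn"),
    error: emit("error"),
  };
}
```

A handler would create one bound logger per request, for example `createBoundLogger({ request_id, tenant_id })`, and pass it down instead of the global `logger`.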
### Logging Middleware

```typescript
// app/middleware/logging.ts
import { logger } from "~/utils/logger";
import { v4 as uuidv4 } from "uuid";

export async function loggingMiddleware(
  request: Request,
  next: () => Promise<Response>
): Promise<Response> {
  const requestId = uuidv4();
  const startTime = Date.now();

  try {
    const response = await next();
    const duration = Date.now() - startTime;

    logger.info("Request completed", {
      request_id: requestId,
      method: request.method,
      url: request.url,
      status: response.status,
      duration_ms: duration,
    });

    return response;
  } catch (error) {
    const duration = Date.now() - startTime;
    // Caught values are `unknown` in strict TypeScript; normalize before reading fields
    const err = error instanceof Error ? error : new Error(String(error));

    logger.error("Request failed", {
      request_id: requestId,
      method: request.method,
      url: request.url,
      error: err.message,
      stack: err.stack,
      duration_ms: duration,
    });

    // Rethrow the original value so upstream handlers see it unchanged
    throw error;
  }
}
```
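
The middleware above logs `request.url` verbatim, and query strings sometimes carry tokens or signed parameters. A minimal sketch of redacting them before logging is shown below. `redactUrl` and the default parameter list are assumptions made for illustration; adjust the names to whatever the application actually puts in query strings.

```typescript
// Hypothetical default list of sensitive query parameter names.
const DEFAULT_SENSITIVE_PARAMS = ["token", "access_token", "api_key", "signature"];

// Returns the URL with the values of sensitive query parameters replaced.
export function redactUrl(
  rawUrl: string,
  sensitive: readonly string[] = DEFAULT_SENSITIVE_PARAMS
): string {
  const url = new URL(rawUrl);
  const names = new Set(sensitive.map((name) => name.toLowerCase()));

  // Copy the keys first so the parameters are not mutated while iterating.
  for (const key of Array.from(url.searchParams.keys())) {
    if (names.has(key.toLowerCase())) {
      url.searchParams.set(key, "REDACTED");
    }
  }

  return url.toString();
}
```

In the middleware this would replace `url: request.url` with `url: redactUrl(request.url)`.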
## Cloudflare Workers Analytics

### Workers Analytics Dashboard

Access metrics in the Cloudflare dashboard (`https://dash.cloudflare.com`) under Workers → Analytics.

**Key Metrics**:
- Request rate (requests/second)
- CPU time (milliseconds)
- Error rate (%)
- Success rate (%)
- Response time (P50, P95, P99)
- Invocations per day
- GB-seconds (compute usage)
### Wrangler Tail (Real-time Logs)

```bash
# Stream production logs
npx wrangler tail --config wrangler.production.toml

# Filter by invocation status (ok, error, or canceled), not HTTP status code
npx wrangler tail --status error --config wrangler.production.toml

# Filter by method
npx wrangler tail --method POST --config wrangler.production.toml

# Filter by IP address
npx wrangler tail --ip 1.2.3.4 --config wrangler.production.toml

# Output to file
npx wrangler tail --config wrangler.production.toml > logs.txt
```
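
When tail output is saved to a file, the structured JSON lines written by the logger can be pulled back out for analysis. The sketch below assumes the events come from `wrangler tail --format json` and that each event carries a `logs` array whose entries have `level` and `message` (the array of `console` arguments). That shape is an assumption to verify against real output, and `extractStructuredLogs` is a hypothetical helper.

```typescript
// Assumed subset of one tail event; verify field names against real output.
interface TailLog {
  level: string;
  message: unknown[];
  timestamp: number;
}

interface TailEvent {
  outcome: string;
  logs: TailLog[];
}

const LEVEL_RANK: Record<string, number> = { debug: 0, log: 1, info: 1, warn: 2, error: 3 };

// Returns the parsed JSON payloads of log entries at or above minLevel.
export function extractStructuredLogs(
  event: TailEvent,
  minLevel: string = "warn"
): Record<string, unknown>[] {
  const threshold = LEVEL_RANK[minLevel] ?? 0;
  const entries: Record<string, unknown>[] = [];

  for (const log of event.logs) {
    if ((LEVEL_RANK[log.level] ?? 0) < threshold) continue;

    // The logger passes a single JSON string as the first console argument.
    const first = log.message[0];
    if (typeof first !== "string") continue;

    try {
      const parsed: unknown = JSON.parse(first);
      if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
        entries.push(parsed as Record<string, unknown>);
      }
    } catch {
      // Not JSON (for example a plain console.log string); skip it.
    }
  }

  return entries;
}
```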
### Accessing Logs in Cloudflare Dashboard

1. Go to `https://dash.cloudflare.com`
2. Navigate to Workers & Pages
3. Select your Worker
4. Click "Logs" tab
5. View real-time logs with filtering

**Log Features**:
- Real-time streaming
- Filter by status code
- Filter by request method
- Search log content
- Export logs (JSON)
## Analytics Engine (Custom Events)

### Setup Analytics Engine

**wrangler.toml**:
```toml
[[analytics_engine_datasets]]
binding = "ANALYTICS"
dataset = "my_analytics" # dataset name used when querying
```
### Track Custom Events

```typescript
// app/utils/analytics.ts
// Env is assumed to declare the binding, e.g. `ANALYTICS: AnalyticsEngineDataset`.
export async function trackEvent(
  env: Env,
  eventName: string,
  data: {
    user_id?: string;
    tenant_id?: string;
    duration_ms?: number;
    [key: string]: string | number | undefined;
  }
) {
  try {
    // writeDataPoint is non-blocking and returns nothing, so it is not awaited
    env.ANALYTICS.writeDataPoint({
      blobs: [eventName, data.tenant_id ?? ""], // blob1 = event name, blob2 = tenant_id
      doubles: [data.duration_ms ?? 0], // double1 = duration in ms
      // Analytics Engine accepts a single index per data point
      indexes: [data.user_id ?? ""], // index1 = user_id
    });
  } catch (error) {
    console.error("Failed to track event:", error);
  }
}

// Usage in server function (createServerFn and authenticateUser are imported elsewhere)
export const loginUser = createServerFn({ method: "POST" }).handler(
  async ({ data, context }) => {
    const startTime = Date.now();
    const user = await authenticateUser(data);
    const duration = Date.now() - startTime;

    // Track login event
    await trackEvent(context.env, "user_login", {
      user_id: user.id,
      tenant_id: user.tenantId,
      duration_ms: duration,
    });

    return user;
  }
);
```
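
Analytics Engine columns are positional (`blob1`, `double1`, `index1`, and so on), so the order in which values are written is the schema that later queries depend on. A minimal sketch that keeps that layout in one pure, testable function follows. `toDataPoint`, `EventData`, and `DataPoint` are hypothetical names, and the layout (tenant in `blob2`, user as the single index) is one possible choice, not a confirmed Grey Haven convention.

```typescript
interface EventData {
  user_id?: string;
  tenant_id?: string;
  duration_ms?: number;
}

// Shape accepted by writeDataPoint on an Analytics Engine binding.
interface DataPoint {
  blobs: string[];
  doubles: number[];
  indexes: string[];
}

// Maps an event to positional columns. Changing the order here changes
// what blob1, blob2, double1, and index1 mean in every query.
export function toDataPoint(eventName: string, data: EventData): DataPoint {
  return {
    blobs: [eventName, data.tenant_id ?? ""], // blob1, blob2
    doubles: [data.duration_ms ?? 0], // double1
    indexes: [data.user_id ?? ""], // index1 (one index per data point)
  };
}
```

A caller would then write `env.ANALYTICS.writeDataPoint(toDataPoint("user_login", { ... }))`.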
### Query Analytics Data

Use the Cloudflare GraphQL API:

```graphql
query GetLoginStats($accountId: String!) {
  viewer {
    accounts(filter: { accountTag: $accountId }) {
      workersAnalyticsEngineDataset(dataset: "my_analytics") {
        query(
          filter: {
            blob1: "user_login"
            datetime_gt: "2025-01-01T00:00:00Z"
          }
        ) {
          count
          dimensions {
            blob1 # event name
            index1 # user_id
          }
        }
      }
    }
  }
}
```
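
Analytics Engine datasets can also be queried through Cloudflare's SQL API, which takes the SQL text as the body of a POST request. The sketch below is an illustration under stated assumptions: the dataset name `my_analytics` and the column meanings (`blob1` as event name, `index1` as user ID) are carried over from the examples above, and `buildLoginStatsSql`, `runAnalyticsSql`, and `FetchLike` are hypothetical names. The fetch function is injected so the sketch does not depend on a particular runtime.

```typescript
// Builds the SQL text. SUM(_sample_interval) accounts for sampling.
export function buildLoginStatsSql(dataset: string, days: number): string {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(dataset)) {
    throw new Error(`Invalid dataset name: ${dataset}`);
  }
  if (!Number.isInteger(days) || days <= 0) {
    throw new Error("days must be a positive integer");
  }
  return [
    "SELECT blob1 AS event, index1 AS user_id, SUM(_sample_interval) AS events",
    `FROM ${dataset}`,
    `WHERE blob1 = 'user_login' AND timestamp > NOW() - INTERVAL '${days}' DAY`,
    "GROUP BY event, user_id",
    "ORDER BY events DESC",
  ].join("\n");
}

// Minimal structural type so any fetch-compatible function can be passed in.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Sends the query and returns the raw response body.
export async function runAnalyticsSql(
  fetchImpl: FetchLike,
  accountId: string,
  apiToken: string,
  sql: string
): Promise<string> {
  const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/analytics_engine/sql`;
  const response = await fetchImpl(url, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiToken}` },
    body: sql,
  });
  const body = await response.text();
  if (!response.ok) {
    throw new Error(`Analytics Engine query failed (${response.status}): ${body}`);
  }
  return body;
}
```

In a script this would be called as `runAnalyticsSql(fetch, accountId, apiToken, buildLoginStatsSql("my_analytics", 7))`, with an API token that is allowed to read account analytics.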
## Health Checks

### Health Check Endpoint

```typescript
// app/routes/api/health.ts
import { createServerFn } from "@tanstack/start";
import { db } from "~/lib/server/db";

export const GET = createServerFn({ method: "GET" }).handler(async ({ context }) => {
  const startTime = Date.now();
  const checks: Record<string, string> = {};

  // Check database
  let dbHealthy = false;
  try {
    await db.execute("SELECT 1");
    dbHealthy = true;
    checks.database = "ok";
  } catch (error) {
    console.error("Database health check failed:", error);
    checks.database = "failed";
  }

  // Check Redis (if using Upstash)
  let redisHealthy = false;
  if (context.env.REDIS) {
    try {
      await context.env.REDIS.ping();
      redisHealthy = true;
      checks.redis = "ok";
    } catch (error) {
      console.error("Redis health check failed:", error);
      checks.redis = "failed";
    }
  }

  const duration = Date.now() - startTime;
  // Redis only counts toward overall health when the binding is configured
  const healthy = dbHealthy && (!context.env.REDIS || redisHealthy);

  return new Response(
    JSON.stringify({
      status: healthy ? "healthy" : "unhealthy",
      checks,
      duration_ms: duration,
      timestamp: new Date().toISOString(),
      // process.env requires Node.js compatibility (nodejs_compat) in Workers
      environment: process.env.ENVIRONMENT,
    }),
    {
      // 503 lets external health checks treat the endpoint as down
      status: healthy ? 200 : 503,
      headers: { "Content-Type": "application/json" },
    }
  );
});
```
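
The dependency checks above await each call with no upper bound, so a hung database or Redis connection would stall the endpoint until the external health check gives up. A minimal sketch of bounding each check follows. `withTimeout` is a hypothetical helper, and the timeout value is an assumption to tune below the health check's own timeout.

```typescript
// Rejects if `work` does not settle within `ms` milliseconds.
// Note: this stops waiting; it does not cancel the underlying operation.
export async function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  label: string
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });

  try {
    // Whichever settles first wins: the real work or the timeout rejection.
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so it does not fire after the race is decided.
    clearTimeout(timer);
  }
}
```

In the endpoint this would wrap each probe, for example `await withTimeout(db.execute("SELECT 1"), 2000, "database")`, so a timeout lands in the existing `catch` branch and marks the check as failed.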
### Cloudflare Health Checks

Configure in Cloudflare dashboard:

1. Go to Traffic → Health Checks
2. Create health check for `/api/health`
3. Configure:
   - Interval: 60 seconds
   - Timeout: 5 seconds
   - Retries: 2
   - Expected status: 200
4. Set up notifications (email/webhook)
## Error Tracking

### Structured Error Logging

```typescript
// app/utils/error-handler.ts
import { logger } from "~/utils/logger";
import { trackEvent } from "~/utils/analytics";

export function handleError(error: unknown, context?: Record<string, unknown>) {
  // Caught values are `unknown`; normalize so name, message, and stack exist
  const err = error instanceof Error ? error : new Error(String(error));
  // Keep `env` out of the log payload: it holds bindings and may hold secrets
  const { env, ...logContext } = context ?? {};

  // Log error with full context
  logger.error(err.message, {
    error_name: err.name,
    stack: err.stack,
    ...logContext,
  });

  // Also log to Analytics Engine for tracking
  if (env) {
    void trackEvent(env as Env, "error_occurred", {
      error_name: err.name,
      error_message: err.message,
    });
  }
}

// Usage in server function
export const updateUser = createServerFn({ method: "POST" }).handler(
  async ({ data, context }) => {
    try {
      return await userService.update(data);
    } catch (error) {
      handleError(error, {
        user_id: context.user?.id,
        tenant_id: context.tenant?.id,
        env: context.env,
      });
      throw error;
    }
  }
);
```
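
JavaScript allows any value to be thrown, and errors can wrap an underlying `cause`, so reading only `name`, `message`, and `stack` can lose information. A minimal sketch of turning an arbitrary thrown value into a loggable object follows. `serializeError`, `ErrorInfo`, the `NonErrorThrown` label, and the depth limit of 3 are all assumptions chosen for illustration.

```typescript
export interface ErrorInfo {
  error_name: string;
  message: string;
  stack?: string;
  cause?: ErrorInfo;
}

// Stringifies non-Error values without throwing on circular structures.
function safeStringify(value: unknown): string {
  try {
    return JSON.stringify(value) ?? String(value);
  } catch {
    return String(value);
  }
}

// Converts any thrown value into a plain object that JSON.stringify can log.
export function serializeError(value: unknown, depth: number = 0): ErrorInfo {
  if (value instanceof Error) {
    const info: ErrorInfo = { error_name: value.name, message: value.message };
    if (value.stack) info.stack = value.stack;

    // Follow the cause chain a few levels deep to avoid unbounded recursion.
    const cause = (value as Error & { cause?: unknown }).cause;
    if (cause !== undefined && depth < 3) {
      info.cause = serializeError(cause, depth + 1);
    }
    return info;
  }

  return {
    error_name: "NonErrorThrown",
    message: typeof value === "string" ? value : safeStringify(value),
  };
}
```

The result can be spread into the log context, for example `logger.error("Update failed", { ...serializeError(error), tenant_id })`.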
### Viewing Errors in Cloudflare

1. **Workers Dashboard**: View errors in real-time
2. **Wrangler Tail**: `npx wrangler tail --status error`
3. **Analytics**: Check error rate metrics
4. **Health Checks**: Monitor endpoint failures
## Supporting Documentation

All supporting files are under 500 lines per Anthropic best practices:

- **[examples/](examples/)** - Complete monitoring examples
  - [cloudflare-logging.md](examples/cloudflare-logging.md) - Structured console logging
  - [wrangler-tail.md](examples/wrangler-tail.md) - Real-time log streaming
  - [analytics-engine.md](examples/analytics-engine.md) - Custom event tracking
  - [health-checks.md](examples/health-checks.md) - Health check implementations
  - [error-tracking.md](examples/error-tracking.md) - Error handling patterns
  - [INDEX.md](examples/INDEX.md) - Examples navigation

- **[reference/](reference/)** - Monitoring references
  - [cloudflare-metrics.md](reference/cloudflare-metrics.md) - Available metrics
  - [wrangler-commands.md](reference/wrangler-commands.md) - Wrangler CLI reference
  - [alert-configuration.md](reference/alert-configuration.md) - Setting up alerts
  - [INDEX.md](reference/INDEX.md) - Reference navigation

- **[templates/](templates/)** - Copy-paste ready templates
  - [logger.ts](templates/logger.ts) - Cloudflare logger template
  - [health-check.ts](templates/health-check.ts) - Health check endpoint

- **[checklists/](checklists/)** - Monitoring checklists
  - [observability-setup-checklist.md](checklists/observability-setup-checklist.md) - Setup checklist
## When to Apply This Skill

Use this skill when:
- Setting up monitoring for new Cloudflare Workers projects
- Implementing structured logging with console
- Debugging production issues with wrangler tail
- Setting up health checks
- Implementing custom metrics tracking with Analytics Engine
- Configuring Cloudflare alerts

## Template Reference

These patterns are from Grey Haven's production monitoring:
- **Cloudflare Workers Analytics**: Request and performance metrics
- **Wrangler tail**: Real-time log streaming
- **Console logging**: Structured JSON logs
- **Analytics Engine**: Custom event tracking

## Critical Reminders

1. **Structured logging**: Use JSON.stringify for console logs
2. **Request IDs**: Track requests with UUIDs for debugging
3. **Error context**: Include tenant_id, user_id in all error logs
4. **Health checks**: Monitor database and external service connections
5. **Wrangler tail**: Use filters to narrow down logs (--status, --method)
6. **Performance**: Track duration_ms for all operations
7. **Environment**: Log environment in all messages for filtering
8. **Analytics Engine**: Use for custom metrics and event tracking
9. **Dashboard access**: Logs available in Cloudflare Workers dashboard
10. **Real-time debugging**: Use wrangler tail for live production debugging
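
Reminder 6 asks for `duration_ms` on every operation, which tends to produce repeated `Date.now()` arithmetic. A minimal sketch of a wrapper that measures any async operation and reports the result, whether it succeeds or throws, follows. `timed` and the `report` callback are hypothetical names; in practice `report` could forward to the logger or to Analytics Engine.

```typescript
interface TimingEntry {
  operation: string;
  duration_ms: number;
  ok: boolean;
}

// Runs `fn`, then reports how long it took and whether it succeeded.
// Errors from `fn` propagate unchanged to the caller.
export async function timed<T>(
  operation: string,
  fn: () => Promise<T>,
  report: (entry: TimingEntry) => void
): Promise<T> {
  const start = Date.now();
  let ok = false;
  try {
    const result = await fn();
    ok = true;
    return result;
  } finally {
    // Runs on both success and failure, so every call is measured.
    report({ operation, duration_ms: Date.now() - start, ok });
  }
}
```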
|
||||
48
skills/observability-monitoring/examples/INDEX.md
Normal file
@@ -0,0 +1,48 @@
# Observability Examples

Complete monitoring and logging examples for Grey Haven Cloudflare Workers applications.

## Available Examples

### [cloudflare-logging.md](cloudflare-logging.md)
Structured console logging for Cloudflare Workers.
- TypeScript logger implementation
- Log levels and context
- JSON structured logging
- Viewing logs in Cloudflare dashboard

### [wrangler-tail.md](wrangler-tail.md)
Real-time log streaming with wrangler tail.
- Streaming production logs
- Filtering by status, method, IP
- Exporting logs to files
- Common troubleshooting patterns

### [analytics-engine.md](analytics-engine.md)
Cloudflare Analytics Engine for custom events.
- Custom event tracking
- Analytics Engine API
- GraphQL queries for analytics
- Event aggregation patterns

### [health-checks.md](health-checks.md)
Health check endpoint implementations.
- Database health checks
- Redis health checks
- Multi-service health checks
- Cloudflare health check configuration

### [error-tracking.md](error-tracking.md)
Error handling and tracking patterns.
- Structured error logging
- Error context tracking
- Analytics Engine for error metrics
- Error rate monitoring

## Quick Reference

**Need logging?** → [cloudflare-logging.md](cloudflare-logging.md)
**Need real-time logs?** → [wrangler-tail.md](wrangler-tail.md)
**Need custom metrics?** → [analytics-engine.md](analytics-engine.md)
**Need health checks?** → [health-checks.md](health-checks.md)
**Need error tracking?** → [error-tracking.md](error-tracking.md)
|
||||
32
skills/observability-monitoring/reference/INDEX.md
Normal file
@@ -0,0 +1,32 @@
# Observability Reference

Configuration references and patterns for Cloudflare Workers monitoring.

## Available References

### [cloudflare-metrics.md](cloudflare-metrics.md)
Available Cloudflare Workers metrics.
- Workers Analytics metrics
- Analytics Engine metrics
- Health check metrics
- Performance metrics (CPU time, response time)

### [wrangler-commands.md](wrangler-commands.md)
Wrangler CLI commands for monitoring.
- wrangler tail reference
- Log filtering options
- Output formatting
- Common command patterns

### [alert-configuration.md](alert-configuration.md)
Setting up alerts and notifications.
- Cloudflare health check alerts
- Email notifications
- Webhook integration
- Alert thresholds

## Quick Reference

**Need metrics?** → [cloudflare-metrics.md](cloudflare-metrics.md)
**Need CLI commands?** → [wrangler-commands.md](wrangler-commands.md)
**Need alerts?** → [alert-configuration.md](alert-configuration.md)
|
||||