Initial commit
12
.claude-plugin/plugin.json
Normal file
@@ -0,0 +1,12 @@
{
  "name": "cloudflare-browser-rendering",
  "description": "Add headless Chrome automation with Puppeteer/Playwright on Cloudflare Workers. Use when: taking screenshots, generating PDFs, web scraping, crawling sites, browser automation, or troubleshooting XPath errors, browser timeouts, binding not passed errors, or session limits.",
  "version": "1.0.0",
  "author": {
    "name": "Jeremy Dawes",
    "email": "jeremy@jezweb.net"
  },
  "skills": [
    "./"
  ]
}
3
README.md
Normal file
@@ -0,0 +1,3 @@
# cloudflare-browser-rendering

Add headless Chrome automation with Puppeteer/Playwright on Cloudflare Workers. Use when: taking screenshots, generating PDFs, web scraping, crawling sites, browser automation, or troubleshooting XPath errors, browser timeouts, binding not passed errors, or session limits.
783
SKILL.md
Normal file
@@ -0,0 +1,783 @@
---
name: cloudflare-browser-rendering
description: |
  Add headless Chrome automation with Puppeteer/Playwright on Cloudflare Workers. Use when: taking screenshots, generating PDFs, web scraping, crawling sites, browser automation, or troubleshooting XPath errors, browser timeouts, binding not passed errors, or session limits.
license: MIT
---

# Cloudflare Browser Rendering - Complete Reference

Production-ready knowledge domain for building browser automation workflows with Cloudflare Browser Rendering.

**Status**: Production Ready ✅
**Last Updated**: 2025-11-23
**Dependencies**: cloudflare-worker-base (for Worker setup)
**Latest Versions**: @cloudflare/puppeteer@1.0.4 (July 2025), @cloudflare/playwright@1.0.0 (Playwright v1.55 GA Sept 2025), wrangler@4.50.0

**Recent Updates (2025)**:
- **Sept 2025**: Playwright v1.55 GA, Stagehand framework support (Workers AI), /links excludeExternalLinks param
- **Aug 2025**: Billing GA (Aug 20), /sessions endpoint in local dev, X-Browser-Ms-Used header
- **July 2025**: Playwright v1.54.1 + MCP v0.0.30, Playwright local dev support (wrangler@4.26.0+), Puppeteer v22.13.1 sync, /content returns title, /json custom_ai param, /screenshot viewport 1920x1080 default
- **June 2025**: Web Bot Auth headers auto-included
- **April 2025**: Playwright support launched, free tier introduced

---

## Table of Contents

1. [Quick Start (5 minutes)](#quick-start-5-minutes)
2. [Browser Rendering Overview](#browser-rendering-overview)
3. [Puppeteer API Reference](#puppeteer-api-reference)
4. [Playwright API Reference](#playwright-api-reference)
5. [Session Management](#session-management)
6. [Common Patterns](#common-patterns)
7. [Pricing & Limits](#pricing--limits)
8. [Known Issues Prevention](#known-issues-prevention)
9. [Production Checklist](#production-checklist)

---

## Quick Start (5 minutes)

### 1. Add Browser Binding

**wrangler.jsonc:**
```jsonc
{
  "name": "browser-worker",
  "main": "src/index.ts",
  "compatibility_date": "2023-03-14",
  "compatibility_flags": ["nodejs_compat"],
  "browser": {
    "binding": "MYBROWSER"
  }
}
```

**Why nodejs_compat?** Browser Rendering requires Node.js APIs and polyfills.

### 2. Install Puppeteer

```bash
npm install @cloudflare/puppeteer
```

### 3. Take Your First Screenshot

```typescript
import puppeteer from "@cloudflare/puppeteer";

interface Env {
  MYBROWSER: Fetcher;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { searchParams } = new URL(request.url);
    const url = searchParams.get("url") || "https://example.com";

    // Launch browser
    const browser = await puppeteer.launch(env.MYBROWSER);
    const page = await browser.newPage();

    // Navigate and capture
    await page.goto(url);
    const screenshot = await page.screenshot();

    // Clean up
    await browser.close();

    return new Response(screenshot, {
      headers: { "content-type": "image/png" }
    });
  }
};
```

### 4. Deploy

```bash
npx wrangler deploy
```

Test at: `https://your-worker.workers.dev/?url=https://example.com`

**CRITICAL:**
- Always pass `env.MYBROWSER` to `puppeteer.launch()` (not undefined)
- Always call `browser.close()` when done (or use `browser.disconnect()` for session reuse)
- Use `nodejs_compat` compatibility flag

---

## Browser Rendering Overview

### What is Browser Rendering?

Cloudflare Browser Rendering provides headless Chromium browsers running on Cloudflare's global network. Use familiar tools like Puppeteer and Playwright to automate browser tasks:

- **Screenshots** - Capture visual snapshots of web pages
- **PDF Generation** - Convert HTML/URLs to PDFs
- **Web Scraping** - Extract content from dynamic websites
- **Testing** - Automate frontend tests
- **Crawling** - Navigate multi-page workflows

### Two Integration Methods

| Method | Best For | Complexity |
|--------|----------|-----------|
| **Workers Bindings** | Complex automation, custom workflows, session management | Advanced |
| **REST API** | Simple screenshot/PDF tasks | Simple |

**This skill covers Workers Bindings** (the advanced method with full Puppeteer/Playwright APIs).
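
For contrast, the REST API route is a single HTTP call per task. A rough sketch only, assuming the `/screenshot` endpoint shape; `ACCOUNT_ID` and `API_TOKEN` are placeholders, and the exact path and body fields should be verified against the REST API docs:

```typescript
// Hedged sketch of the REST API alternative (not covered in depth by this skill).
const ACCOUNT_ID = "<your-account-id>"; // placeholder
const API_TOKEN = "<your-api-token>";   // placeholder

const resp = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/browser-rendering/screenshot`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json",
    },
    // Assumed request body: a target URL to render
    body: JSON.stringify({ url: "https://example.com" }),
  }
);
const png = await resp.arrayBuffer(); // image bytes on success
```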

### Puppeteer vs Playwright

| Feature | Puppeteer | Playwright |
|---------|-----------|------------|
| **API Familiarity** | Most popular | Growing adoption |
| **Package** | `@cloudflare/puppeteer@1.0.4` | `@cloudflare/playwright@1.0.0` |
| **Session Management** | ✅ Advanced APIs | ⚠️ Basic |
| **Browser Support** | Chromium only | Chromium only (Firefox/Safari not yet supported) |
| **Best For** | Screenshots, PDFs, scraping | Testing, frontend automation |

**Recommendation**: Use Puppeteer for most use cases. Playwright is ideal if you're already using it for testing.

---

## Puppeteer API Reference

**Core APIs** (complete reference: https://pptr.dev/api/):

**Global Functions:**
- `puppeteer.launch(env.MYBROWSER, options?)` - Launch new browser (CRITICAL: must pass binding)
- `puppeteer.connect(env.MYBROWSER, sessionId)` - Connect to existing session
- `puppeteer.sessions(env.MYBROWSER)` - List running sessions
- `puppeteer.history(env.MYBROWSER)` - List recent sessions (open + closed)
- `puppeteer.limits(env.MYBROWSER)` - Check account limits

**Browser Methods:**
- `browser.newPage()` - Create new tab (preferred over launching new browsers)
- `browser.sessionId()` - Get session ID for reuse
- `browser.close()` - Terminate session
- `browser.disconnect()` - Keep session alive for reuse
- `browser.createBrowserContext()` - Isolated incognito context (separate cookies/cache)

**Page Methods:**
- `page.goto(url, { waitUntil, timeout })` - Navigate (use `"networkidle0"` for dynamic content)
- `page.screenshot({ fullPage, type, quality, clip })` - Capture image
- `page.pdf({ format, printBackground, margin })` - Generate PDF
- `page.evaluate(() => ...)` - Execute JS in browser (data extraction, XPath workaround)
- `page.content()` / `page.setContent(html)` - Get/set HTML
- `page.waitForSelector(selector)` - Wait for element
- `page.type(selector, text)` / `page.click(selector)` - Form interaction

**Critical Patterns:**
```typescript
// Must pass binding
let browser = await puppeteer.launch(env.MYBROWSER); // ✅
// const browser = await puppeteer.launch(); // ❌ Error!

// Session reuse for performance
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
  browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
}

// Keep session alive
await browser.disconnect(); // Don't close

// XPath workaround (not directly supported)
const data = await page.evaluate(() => {
  return new XPathEvaluator()
    .createExpression("/html/body/div/h1")
    .evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
    .singleNodeValue.innerHTML;
});
```

---

## Playwright API Reference

**Status**: GA (Sept 2025) - Playwright v1.55, MCP v0.0.30 support, local dev support (wrangler@4.26.0+)

**Installation:**
```bash
npm install @cloudflare/playwright
```

**Configuration Requirements (2025 Update):**
```jsonc
{
  "compatibility_flags": ["nodejs_compat"],
  "compatibility_date": "2025-09-15" // Required for Playwright v1.55
}
```

**Basic Usage:**
```typescript
import { chromium } from "@cloudflare/playwright";

const browser = await chromium.launch(env.MYBROWSER);
const page = await browser.newPage();
await page.goto("https://example.com");
const screenshot = await page.screenshot();
await browser.close();
```

**Puppeteer vs Playwright:**
- **Import**: `puppeteer` vs `{ chromium }` from "@cloudflare/playwright"
- **Session API**: Puppeteer has advanced session management (sessions/history/limits); Playwright's is basic
- **Auto-waiting**: Playwright has built-in auto-waiting, Puppeteer requires manual `waitForSelector()`
- **MCP Support**: Playwright MCP v0.0.30 (July 2025), Playwright MCP server available

**Recommendation**: Use Puppeteer for session reuse patterns. Use Playwright if you are migrating existing tests or need MCP integration.

**Official Docs**: https://developers.cloudflare.com/browser-rendering/playwright/

---

## Session Management

**Why**: Launching new browsers is slow and consumes concurrency limits. Reuse sessions for faster responses, lower concurrency usage, and better resource utilization.

### Session Reuse Pattern (Critical)

```typescript
async function getBrowser(env: Env): Promise<Browser> {
  const sessions = await puppeteer.sessions(env.MYBROWSER);
  const freeSessions = sessions.filter(s => !s.connectionId);

  if (freeSessions.length > 0) {
    try {
      return await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
    } catch (e) {
      console.log("Failed to connect, launching new browser");
    }
  }

  return await puppeteer.launch(env.MYBROWSER);
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const browser = await getBrowser(env);

    try {
      const page = await browser.newPage();
      await page.goto("https://example.com");
      const screenshot = await page.screenshot();

      await browser.disconnect(); // ✅ Keep alive for reuse

      return new Response(screenshot, {
        headers: { "content-type": "image/png" }
      });
    } catch (error) {
      await browser.close(); // Close (don't reuse) on error
      throw error;
    }
  }
};
```

**Key Rules:**
- ✅ `browser.disconnect()` - Keep session alive for reuse
- ❌ `browser.close()` - Only on errors or when truly done
- ✅ Always handle connection failures

### Browser Contexts (Cookie/Cache Isolation)

Use `browser.createBrowserContext()` to share a browser but isolate cookies/cache:

```typescript
const browser = await puppeteer.launch(env.MYBROWSER);
const context1 = await browser.createBrowserContext(); // User 1
const context2 = await browser.createBrowserContext(); // User 2

const page1 = await context1.newPage();
const page2 = await context2.newPage();
// Separate cookies/cache per context
```

### Multiple Tabs Pattern

**❌ Bad**: Launch 10 browsers for 10 URLs (wastes concurrency)
**✅ Good**: 1 browser, 10 tabs via `Promise.all()` + `browser.newPage()`

```typescript
const browser = await puppeteer.launch(env.MYBROWSER);
const results = await Promise.all(
  urls.map(async (url) => {
    const page = await browser.newPage();
    await page.goto(url);
    const data = await page.evaluate(() => ({ title: document.title }));
    await page.close();
    return { url, data };
  })
);
await browser.close();
```

---

## Common Patterns

### Screenshot with KV Caching

Cache screenshots to reduce browser usage and improve performance:

```typescript
interface Env {
  MYBROWSER: Fetcher;
  CACHE: KVNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { searchParams } = new URL(request.url);
    const url = searchParams.get("url");
    if (!url) return new Response("Missing ?url parameter", { status: 400 });

    const normalizedUrl = new URL(url).toString();

    // Check cache first
    let screenshot = await env.CACHE.get(normalizedUrl, { type: "arrayBuffer" });

    if (!screenshot) {
      const browser = await puppeteer.launch(env.MYBROWSER);
      const page = await browser.newPage();
      await page.goto(normalizedUrl);
      screenshot = await page.screenshot();
      await browser.close();

      // Cache for 24 hours
      await env.CACHE.put(normalizedUrl, screenshot, { expirationTtl: 60 * 60 * 24 });
    }

    return new Response(screenshot, { headers: { "content-type": "image/png" } });
  }
};
```

### AI-Enhanced Scraping

Combine Browser Rendering with Workers AI for structured data extraction:

```typescript
interface Env {
  MYBROWSER: Fetcher;
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { searchParams } = new URL(request.url);
    const url = searchParams.get("url");

    // Scrape page content
    const browser = await puppeteer.launch(env.MYBROWSER);
    const page = await browser.newPage();
    await page.goto(url!, { waitUntil: "networkidle0" });
    const bodyContent = await page.$eval("body", el => el.innerHTML);
    await browser.close();

    // Extract structured data with AI
    const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      messages: [{
        role: "user",
        content: `Extract product info as JSON from this HTML. Include: name, price, description.\n\nHTML:\n${bodyContent.slice(0, 4000)}`
      }]
    });

    return Response.json({ url, product: JSON.parse(response.response) });
  }
};
```

**Other Common Patterns**: PDF generation (`page.pdf()`), structured scraping (`page.evaluate()`), form automation (`page.type()` + `page.click()`). See the bundled `templates/` directory; a minimal PDF sketch follows below.
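
As a quick illustration of the PDF path, here is a minimal sketch. It assumes the same `Env` shape (a `MYBROWSER` binding) as the examples above; `templates/pdf-generation.ts` is the fuller version:

```typescript
import puppeteer from "@cloudflare/puppeteer";

interface Env {
  MYBROWSER: Fetcher;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const browser = await puppeteer.launch(env.MYBROWSER);
    const page = await browser.newPage();

    // Render your own HTML; page.goto(url) works the same way for live pages
    await page.setContent("<h1>Hello from Workers</h1><p>Rendered to PDF.</p>");
    const pdf = await page.pdf({ format: "A4", printBackground: true });

    await browser.close();
    return new Response(pdf, { headers: { "content-type": "application/pdf" } });
  },
};
```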

---

## Pricing & Limits

**Billing GA**: August 20, 2025

**Free Tier**: 10 min/day, 3 concurrent, 3 launches/min, 60s timeout
**Paid Tier**: 10 hrs/month included ($0.09/hr after), 10 concurrent avg ($2.00/browser after), 30 launches/min, 60s-10min timeout

**Concurrency Calculation**: Monthly average of daily peak usage (e.g., 15 browsers avg = (15 - 10 included) × $2.00 = $10.00/mo)

**Rate Limiting**: Enforced per-second (180 req/min = 3 req/sec, not bursty). Check `puppeteer.limits(env.MYBROWSER)` before launching:

```typescript
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
  const delay = limits.timeUntilNextAllowedBrowserAcquisition || 1000;
  await new Promise(resolve => setTimeout(resolve, delay));
}
```

---

## Known Issues Prevention

This skill prevents **6 documented issues**:

---

### Issue #1: XPath Selectors Not Supported

**Error:** "XPath selector not supported" or selector failures
**Source:** https://developers.cloudflare.com/browser-rendering/faq/#why-cant-i-use-an-xpath-selector-when-using-browser-rendering-with-puppeteer
**Why It Happens:** XPath poses a security risk to Workers
**Prevention:** Use CSS selectors or `page.evaluate()` with XPathEvaluator

**Solution:**
```typescript
// ❌ Don't use XPath directly (not supported)
// await page.$x('/html/body/div/h1')

// ✅ Use CSS selector
const heading = await page.$("div > h1");

// ✅ Or use XPath in page.evaluate()
const innerHtml = await page.evaluate(() => {
  return new XPathEvaluator()
    .createExpression("/html/body/div/h1")
    .evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
    .singleNodeValue.innerHTML;
});
```

---

### Issue #2: Browser Binding Not Passed

**Error:** "Cannot read properties of undefined (reading 'fetch')"
**Source:** https://developers.cloudflare.com/browser-rendering/faq/#cannot-read-properties-of-undefined-reading-fetch
**Why It Happens:** `puppeteer.launch()` called without browser binding
**Prevention:** Always pass `env.MYBROWSER` to launch

**Solution:**
```typescript
// ❌ Missing browser binding
const browser = await puppeteer.launch(); // Error!

// ✅ Pass binding
const browser = await puppeteer.launch(env.MYBROWSER);
```

---

### Issue #3: Browser Timeout (60 seconds)

**Error:** Browser closes unexpectedly after 60 seconds
**Source:** https://developers.cloudflare.com/browser-rendering/platform/limits/#note-on-browser-timeout
**Why It Happens:** Default timeout is 60 seconds of inactivity
**Prevention:** Use `keep_alive` option to extend up to 10 minutes

**Solution:**
```typescript
// Extend timeout to 5 minutes for long-running tasks
const browser = await puppeteer.launch(env.MYBROWSER, {
  keep_alive: 300000 // 5 minutes = 300,000 ms
});
```

**Note:** The browser closes if it receives no devtools commands for the specified duration.

---

### Issue #4: Concurrency Limits Reached

**Error:** "Rate limit exceeded" or new browser launch fails
**Source:** https://developers.cloudflare.com/browser-rendering/platform/limits/
**Why It Happens:** Exceeded concurrent browser limit (3 free, 10-30 paid)
**Prevention:** Reuse sessions, use tabs instead of multiple browsers, check limits before launching

**Solutions:**
```typescript
// 1. Check limits before launching
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
  return new Response("Concurrency limit reached", { status: 429 });
}

// 2. Reuse sessions
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
  const browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
}

// 3. Use tabs instead of multiple browsers
const browser = await puppeteer.launch(env.MYBROWSER);
const page1 = await browser.newPage();
const page2 = await browser.newPage(); // Same browser, different tabs
```

---

### Issue #5: Local Development Request Size Limit

**Error:** Request larger than 1MB fails in `wrangler dev`
**Source:** https://developers.cloudflare.com/browser-rendering/faq/#does-local-development-support-all-browser-rendering-features
**Why It Happens:** Local development limitation
**Prevention:** Use `remote: true` in the browser binding for local dev

**Solution:**
```jsonc
// wrangler.jsonc for local development
{
  "browser": {
    "binding": "MYBROWSER",
    "remote": true // Use real headless browser during dev
  }
}
```

---

### Issue #6: Bot Protection Always Triggered

**Error:** Website blocks requests as bot traffic
**Source:** https://developers.cloudflare.com/browser-rendering/faq/#will-browser-rendering-bypass-cloudflares-bot-protection
**Why It Happens:** Browser Rendering requests are always identified as bots
**Prevention:** Cannot bypass; if scraping your own zone, create a WAF skip rule

**Solution:**
```typescript
// ❌ Cannot bypass bot protection
// Requests will always be identified as bots

// ✅ If scraping your own Cloudflare zone:
// 1. Go to Security > WAF > Custom rules
// 2. Create skip rule with custom header:
//    Header: X-Custom-Auth
//    Value: your-secret-token
// 3. Pass header in your scraping requests

// Note: Automatic headers are included:
// - cf-biso-request-id
// - cf-biso-devtools
```

---

## Production Checklist

Before deploying Browser Rendering Workers to production:

### Configuration
- [ ] **Browser binding configured** in wrangler.jsonc
- [ ] **nodejs_compat flag enabled** (required for Browser Rendering)
- [ ] **Keep-alive timeout set** if tasks take > 60 seconds
- [ ] **Remote binding enabled** for local development if needed

### Error Handling
- [ ] **Retry logic implemented** for rate limits
- [ ] **Timeout handling** for page.goto()
- [ ] **Browser cleanup** in try-finally blocks
- [ ] **Concurrency limit checks** before launching browsers
- [ ] **Graceful degradation** when browser unavailable

### Performance
- [ ] **Session reuse implemented** for high-traffic routes
- [ ] **Multiple tabs used** instead of multiple browsers
- [ ] **Incognito contexts** for session isolation
- [ ] **KV caching** for repeated screenshots/PDFs
- [ ] **Batch operations** to maximize browser utilization

### Monitoring
- [ ] **Log browser session IDs** for debugging (see the sketch after this list)
- [ ] **Track browser duration** for billing estimates
- [ ] **Monitor concurrency usage** with puppeteer.limits()
- [ ] **Alert on rate limit errors**
- [ ] **Dashboard monitoring** at https://dash.cloudflare.com/?to=/:account/workers/browser-rendering
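
A minimal monitoring sketch combining APIs referenced above (`puppeteer.limits()`, `browser.sessionId()`); the duration log is only a rough local estimate of billable time, not an official metric:

```typescript
import puppeteer from "@cloudflare/puppeteer";

interface Env {
  MYBROWSER: Fetcher;
}

async function runWithMonitoring(env: Env): Promise<void> {
  // Log remaining launch allowance before acquiring a browser
  const limits = await puppeteer.limits(env.MYBROWSER);
  console.log("browser limits", {
    allowedAcquisitions: limits.allowedBrowserAcquisitions,
    retryAfterMs: limits.timeUntilNextAllowedBrowserAcquisition,
  });

  const startedAt = Date.now();
  const browser = await puppeteer.launch(env.MYBROWSER);
  console.log("browser session", browser.sessionId()); // correlate Worker logs with sessions

  // ... run your automation here ...

  await browser.close();
  console.log("approx browser ms used", Date.now() - startedAt); // rough billing estimate only
}
```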

### Security
- [ ] **Input validation** for URLs (prevent SSRF; see the sketch after this list)
- [ ] **Timeout limits** to prevent abuse
- [ ] **Rate limiting** on public endpoints
- [ ] **Authentication** for sensitive scraping endpoints
- [ ] **WAF rules** if scraping your own zone
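
One way to approach the URL-validation item above, as a rough allow-list sketch (illustrative only; a production guard should also consider DNS rebinding and your own routing rules):

```typescript
// Rough SSRF guard sketch: reject non-HTTP(S) schemes and obvious internal targets
function isSafeTargetUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a valid URL at all
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;

  const host = url.hostname;
  if (host === "localhost" || host.endsWith(".internal")) return false;
  if (/^(127\.|10\.|192\.168\.|169\.254\.)/.test(host)) return false;

  return true;
}

// Usage in a handler:
// if (!isSafeTargetUrl(url)) return new Response("Invalid URL", { status: 400 });
```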

### Testing
- [ ] **Test screenshot capture** with various page sizes
- [ ] **Test PDF generation** with custom HTML
- [ ] **Test scraping** with dynamic content (networkidle0)
- [ ] **Test error scenarios** (invalid URLs, timeouts)
- [ ] **Load test** concurrency limits

---

## Error Handling Best Practices

**Production Pattern** - Use try-catch with proper cleanup:

```typescript
async function withBrowser<T>(env: Env, fn: (browser: Browser) => Promise<T>): Promise<T> {
  let browser: Browser | null = null;

  try {
    // 1. Check limits before launching
    const limits = await puppeteer.limits(env.MYBROWSER);
    if (limits.allowedBrowserAcquisitions === 0) {
      throw new Error("Rate limit reached");
    }

    // 2. Try session reuse first
    const sessions = await puppeteer.sessions(env.MYBROWSER);
    const freeSessions = sessions.filter(s => !s.connectionId);
    browser = freeSessions.length > 0
      ? await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId)
      : await puppeteer.launch(env.MYBROWSER);

    // 3. Execute user function
    const result = await fn(browser);

    // 4. Disconnect (keep alive)
    await browser.disconnect();
    return result;
  } catch (error) {
    // 5. Close on error
    if (browser) await browser.close();
    throw error;
  }
}
```

**Key Principles**: Check limits → Reuse sessions → Execute → Disconnect on success, close on error

---

## Using Bundled Resources

### Templates (templates/)

Ready-to-use code templates for common patterns:

- `basic-screenshot.ts` - Minimal screenshot example
- `screenshot-with-kv-cache.ts` - Screenshot with KV caching
- `pdf-generation.ts` - Generate PDFs from HTML or URLs
- `web-scraper-basic.ts` - Basic web scraping pattern
- `web-scraper-batch.ts` - Batch scrape multiple URLs
- `session-reuse.ts` - Session reuse for performance
- `ai-enhanced-scraper.ts` - Scraping with Workers AI
- `playwright-example.ts` - Playwright alternative example
- `wrangler-browser-config.jsonc` - Browser binding configuration

**Usage:**
```bash
# Copy template to your project
cp ~/.claude/skills/cloudflare-browser-rendering/templates/basic-screenshot.ts src/index.ts
```

### References (references/)

Deep-dive documentation:

- `session-management.md` - Complete session reuse guide
- `pricing-and-limits.md` - Detailed pricing breakdown
- `common-errors.md` - All known issues and solutions
- `puppeteer-vs-playwright.md` - Feature comparison and migration

**When to load:** Reference these when implementing advanced patterns or debugging specific issues.

---

## Dependencies

**Required:**
- `@cloudflare/puppeteer@1.0.4` - Puppeteer for Workers
- `wrangler@4.43.0+` - Cloudflare CLI

**Optional:**
- `@cloudflare/playwright@1.0.0` - Playwright for Workers (alternative)
- `@cloudflare/workers-types@4.20251014.0+` - TypeScript types

**Related Skills:**
- `cloudflare-worker-base` - Worker setup with Hono
- `cloudflare-kv` - KV caching for screenshots
- `cloudflare-r2` - R2 storage for generated files
- `cloudflare-workers-ai` - AI-enhanced scraping

---

## Official Documentation

- **Browser Rendering Docs**: https://developers.cloudflare.com/browser-rendering/
- **Puppeteer API**: https://pptr.dev/api/
- **Playwright API**: https://playwright.dev/docs/api/class-playwright
- **Cloudflare Puppeteer Fork**: https://github.com/cloudflare/puppeteer
- **Cloudflare Playwright Fork**: https://github.com/cloudflare/playwright
- **Pricing**: https://developers.cloudflare.com/browser-rendering/platform/pricing/
- **Limits**: https://developers.cloudflare.com/browser-rendering/platform/limits/

---

## Package Versions (Verified 2025-10-22)

```json
{
  "dependencies": {
    "@cloudflare/puppeteer": "^1.0.4"
  },
  "devDependencies": {
    "@cloudflare/workers-types": "^4.20251014.0",
    "wrangler": "^4.43.0"
  }
}
```

**Alternative (Playwright):**
```json
{
  "dependencies": {
    "@cloudflare/playwright": "^1.0.0"
  }
}
```

---

## Troubleshooting

### Problem: "Cannot read properties of undefined (reading 'fetch')"
**Solution:** Pass the browser binding to puppeteer.launch():
```typescript
const browser = await puppeteer.launch(env.MYBROWSER); // Not just puppeteer.launch()
```

### Problem: XPath selectors not working
**Solution:** Use CSS selectors or page.evaluate() with XPathEvaluator (see Issue #1)

### Problem: Browser closes after 60 seconds
**Solution:** Extend timeout with keep_alive:
```typescript
const browser = await puppeteer.launch(env.MYBROWSER, { keep_alive: 300000 });
```

### Problem: Rate limit reached
**Solution:** Reuse sessions, use tabs, check limits before launching (see Issue #4)

### Problem: Local dev request > 1MB fails
**Solution:** Enable remote binding in wrangler.jsonc:
```jsonc
{ "browser": { "binding": "MYBROWSER", "remote": true } }
```

### Problem: Website blocks as bot
**Solution:** Cannot bypass. If it's your own zone, create a WAF skip rule (see Issue #6)

---

**Questions? Issues?**

1. Check `references/common-errors.md` for detailed solutions
2. Review `references/session-management.md` for performance optimization
3. Verify browser binding is configured in wrangler.jsonc
4. Check official docs: https://developers.cloudflare.com/browser-rendering/
5. Ensure `nodejs_compat` compatibility flag is enabled
101
plugin.lock.json
Normal file
@@ -0,0 +1,101 @@
{
  "$schema": "internal://schemas/plugin.lock.v1.json",
  "pluginId": "gh:jezweb/claude-skills:skills/cloudflare-browser-rendering",
  "normalized": {
    "repo": null,
    "ref": "refs/tags/v20251128.0",
    "commit": "f6b65a162ae4f587ce6a8e81896247875b63e3e5",
    "treeHash": "e3ef8d6debe9616a1cbe302e85dc18cb49d96c36872a8c07c7acae4cf142113d",
    "generatedAt": "2025-11-28T10:18:58.576479Z",
    "toolVersion": "publish_plugins.py@0.2.0"
  },
  "origin": {
    "remote": "git@github.com:zhongweili/42plugin-data.git",
    "branch": "master",
    "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
    "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
  },
  "manifest": {
    "name": "cloudflare-browser-rendering",
    "description": "Add headless Chrome automation with Puppeteer/Playwright on Cloudflare Workers. Use when: taking screenshots, generating PDFs, web scraping, crawling sites, browser automation, or troubleshooting XPath errors, browser timeouts, binding not passed errors, or session limits.",
    "version": "1.0.0"
  },
  "content": {
    "files": [
      {
        "path": "README.md",
        "sha256": "00452a543d02e6fe8c0f8e46d746fdb0a16787aa06f850460200fd92135065cb"
      },
      {
        "path": "SKILL.md",
        "sha256": "f0d36b31ffbdbeb1b5411f9e3ab488f5a6fd4b8751a5882792059ad9b060746d"
      },
      {
        "path": "references/common-errors.md",
        "sha256": "7395079a6f3573ed4858df1c5b653341911f9d4293c01f0384359059b2100171"
      },
      {
        "path": "references/pricing-and-limits.md",
        "sha256": "c38d89c4a3dd11d8564695eaeaab584c0d3cfd64e03fa33d1878715f74c416b1"
      },
      {
        "path": "references/puppeteer-vs-playwright.md",
        "sha256": "44ceb27acff58f2216d42b69f2902c2f6365a56de347a9e6a2605696858e1744"
      },
      {
        "path": "references/session-management.md",
        "sha256": "78467d521547a60ce85e464f5237bb5313dc6c19127267b5492da93a11167131"
      },
      {
        "path": "scripts/check-versions.sh",
        "sha256": "7101b170427b9183cb1375263790732b9c11ff84df86ef09504a04148794173d"
      },
      {
        "path": ".claude-plugin/plugin.json",
        "sha256": "9891b4c3cbdfbd2a5833ef02a25165152e768782cbbb25e2b781702428a64bb9"
      },
      {
        "path": "templates/session-reuse.ts",
        "sha256": "42a96c01227e25aa2cb0c2e9b9fdb83ade99fb4288a3bf616760645c727ca4b4"
      },
      {
        "path": "templates/wrangler-browser-config.jsonc",
        "sha256": "b587dc298f75a82dff9ba343ad7edb555a25cc9b2621d393839f378f04d7b0a1"
      },
      {
        "path": "templates/pdf-generation.ts",
        "sha256": "cdfd88c037ace52984185a023555cc6a852c2c0bd9036c1a0d08756d4dd849a7"
      },
      {
        "path": "templates/web-scraper-basic.ts",
        "sha256": "3434f82fdd25d8cd4f0a16ff5b12437d0833ec571fcf849feb69332ed2a7b60c"
      },
      {
        "path": "templates/playwright-example.ts",
        "sha256": "c575c7e163675c819bdccce6624e18cb8695c3c7e19dc1107618d946949eb5a0"
      },
      {
        "path": "templates/screenshot-with-kv-cache.ts",
        "sha256": "16d841deba9b8376cb2e80c8a32c0389de44248422350fc12fe1e6b9e46c3ce1"
      },
      {
        "path": "templates/basic-screenshot.ts",
        "sha256": "36564d257d9dd1721f0a1a7e2168d9e719d386c8494c766890fa8b04ac97ff51"
      },
      {
        "path": "templates/web-scraper-batch.ts",
        "sha256": "626904560ecf736f1f39a55fc1a9d6f302e865ead02d32f317573370921e1a25"
      },
      {
        "path": "templates/ai-enhanced-scraper.ts",
        "sha256": "523872273fbe1c4bf850e0243b3b97a7f16a581355c6897e4741b1a1a586590e"
      }
    ],
    "dirSha256": "e3ef8d6debe9616a1cbe302e85dc18cb49d96c36872a8c07c7acae4cf142113d"
  },
  "security": {
    "scannedAt": null,
    "scannerVersion": null,
    "flags": []
  }
}
632
references/common-errors.md
Normal file
@@ -0,0 +1,632 @@
# Common Errors and Solutions

Complete reference for all known Browser Rendering errors with sources, root causes, and solutions.

---

## Error 1: "Cannot read properties of undefined (reading 'fetch')"

**Full Error:**
```
TypeError: Cannot read properties of undefined (reading 'fetch')
```

**Source**: https://developers.cloudflare.com/browser-rendering/faq/#cannot-read-properties-of-undefined-reading-fetch

**Root Cause**: Browser binding not passed to `puppeteer.launch()`

**Why It Happens:**
```typescript
// ❌ Missing browser binding
const browser = await puppeteer.launch();
//                                      ^ undefined - no binding passed!
```

**Solution:**
```typescript
// ✅ Pass browser binding
const browser = await puppeteer.launch(env.MYBROWSER);
//                                     ^^^^^^^^^^^^^ binding from env
```

**Prevention**: Always pass `env.MYBROWSER` (or your configured binding name) to `puppeteer.launch()`.

---

## Error 2: XPath Selector Not Supported

**Full Error:**
```
Error: XPath selectors are not supported in Browser Rendering
```

**Source**: https://developers.cloudflare.com/browser-rendering/faq/#why-cant-i-use-an-xpath-selector-when-using-browser-rendering-with-puppeteer

**Root Cause**: XPath poses a security risk to Workers

**Why It Happens:**
```typescript
// ❌ XPath selectors not directly supported
const elements = await page.$x('/html/body/div/h1');
```

**Solution 1: Use CSS Selectors**
```typescript
// ✅ Use CSS selector instead
const element = await page.$("div > h1");
const elements = await page.$$("div > h1");
```

**Solution 2: Use XPath in page.evaluate()**
```typescript
// ✅ Use XPath inside page.evaluate()
const innerHtml = await page.evaluate(() => {
  return (
    // @ts-ignore - runs in browser context
    new XPathEvaluator()
      .createExpression("/html/body/div/h1")
      // @ts-ignore
      .evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
      .singleNodeValue.innerHTML
  );
});
```

**Prevention**: Use CSS selectors by default. Only use XPath via `page.evaluate()` if absolutely necessary.

---

## Error 3: Browser Timeout

**Full Error:**
```
Error: Browser session closed due to inactivity
```

**Source**: https://developers.cloudflare.com/browser-rendering/platform/limits/#note-on-browser-timeout

**Root Cause**: Default 60 second idle timeout

**Why It Happens:**
- No devtools commands sent for 60 seconds
- Browser automatically closes to free resources

**Solution: Extend Timeout**
```typescript
// ✅ Extend timeout to 5 minutes
const browser = await puppeteer.launch(env.MYBROWSER, {
  keep_alive: 300000 // 5 minutes = 300,000 ms
});
```

**Maximum**: 600,000ms (10 minutes)

**Use Cases for Extended Timeout:**
- Multi-step workflows
- Long-running scraping
- Session reuse across requests

**Prevention**: Only extend if actually needed. Longer timeout = more billable hours.

---

## Error 4: Rate Limit Exceeded

**Full Error:**
```
Error: Rate limit exceeded. Too many concurrent browsers.
```

**Source**: https://developers.cloudflare.com/browser-rendering/platform/limits/

**Root Cause**: Exceeded concurrent browser limit

**Limits:**
- Free tier: 3 concurrent browsers
- Paid tier: 10-30 concurrent browsers

**Solution 1: Check Limits Before Launching**
```typescript
const limits = await puppeteer.limits(env.MYBROWSER);

if (limits.allowedBrowserAcquisitions === 0) {
  return new Response(
    JSON.stringify({
      error: "Rate limit reached",
      retryAfter: limits.timeUntilNextAllowedBrowserAcquisition
    }),
    { status: 429 }
  );
}

const browser = await puppeteer.launch(env.MYBROWSER);
```

**Solution 2: Reuse Sessions**
```typescript
// Try to connect to existing session first
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSession = sessions.find(s => !s.connectionId);

if (freeSession) {
  try {
    return await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
  } catch {
    // Session closed, launch new
  }
}

return await puppeteer.launch(env.MYBROWSER);
```

**Solution 3: Use Multiple Tabs**
```typescript
// ❌ Bad: 10 browsers
for (const url of urls) {
  const browser = await puppeteer.launch(env.MYBROWSER);
  // ...
}

// ✅ Good: 1 browser, 10 tabs
const browser = await puppeteer.launch(env.MYBROWSER);
await Promise.all(urls.map(async url => {
  const page = await browser.newPage();
  // ...
  await page.close();
}));
await browser.close();
```

**Prevention**: Monitor concurrency usage, implement session reuse, use tabs instead of multiple browsers.

---

## Error 5: Local Development Request Size Limit

**Full Error:**
```
Error: Request payload too large (>1MB)
```

**Source**: https://developers.cloudflare.com/browser-rendering/faq/#does-local-development-support-all-browser-rendering-features

**Root Cause**: Local development limitation (requests >1MB fail)

**Solution: Use Remote Binding**
```jsonc
// wrangler.jsonc
{
  "browser": {
    "binding": "MYBROWSER",
    "remote": true // ← Use real headless browser during dev
  }
}
```

**With Remote Binding:**
- Connects to actual Cloudflare browser (not local simulation)
- No 1MB request limit
- Counts toward your quota

**Prevention**: Enable `remote: true` for local development if working with large payloads.

---

## Error 6: Bot Protection Triggered

**Full Error:**
```
Blocked by bot protection / CAPTCHA challenge
```

**Source**: https://developers.cloudflare.com/browser-rendering/faq/#will-browser-rendering-bypass-cloudflares-bot-protection

**Root Cause**: Browser Rendering requests are always identified as bots

**Why It Happens:**
- Cloudflare automatically identifies Browser Rendering traffic
- Cannot bypass bot protection
- Automatic headers added: `cf-biso-request-id`, `cf-biso-devtools`

**Solution (If Scraping Your Own Zone):**
Create a WAF skip rule:

1. Go to Security > WAF > Custom rules
2. Create skip rule with custom header:
   - Header: `X-Custom-Auth`
   - Value: `your-secret-token`
3. Add header in your Worker:

```typescript
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();

// Set custom header
await page.setExtraHTTPHeaders({
  "X-Custom-Auth": "your-secret-token"
});

await page.goto(url);
```

**Solution (If Scraping External Sites):**
- Cannot bypass bot protection
- Some sites will block Browser Rendering traffic
- Consider using the site's official API instead

**Prevention**: Use official APIs when available. Only scrape your own zones if possible.

---

## Error 7: Navigation Timeout

**Full Error:**
```
TimeoutError: Navigation timeout of 30000 ms exceeded
```

**Root Cause**: Page failed to load within timeout

**Why It Happens:**
- Slow website
- Large page assets
- Network issues
- Page never reaches desired load state

**Solution 1: Increase Timeout**
```typescript
await page.goto(url, {
  timeout: 60000 // 60 seconds
});
```

**Solution 2: Change Wait Condition**
```typescript
// ❌ Strict (waits for all network requests)
await page.goto(url, { waitUntil: "networkidle0" });

// ✅ More lenient (waits for DOMContentLoaded)
await page.goto(url, { waitUntil: "domcontentloaded" });

// ✅ Most lenient (waits for load event only)
await page.goto(url, { waitUntil: "load" });
```

**Solution 3: Handle Timeout Gracefully**
```typescript
try {
  await page.goto(url, { timeout: 30000 });
} catch (error) {
  if (error instanceof Error && error.name === "TimeoutError") {
    console.log("Navigation timeout, taking screenshot anyway");
    const screenshot = await page.screenshot();
    return screenshot;
  }
  throw error;
}
```

**Prevention**: Set appropriate timeouts for your use case. Use lenient wait conditions for slow sites.

---

## Error 8: Memory Limit Exceeded

**Full Error:**
```
Error: Browser exceeded its memory limit
```

**Root Cause**: Page too large or too many tabs open

**Why It Happens:**
- Opening many tabs simultaneously
- Large pages with many assets
- Memory leaks from not closing pages

**Solution 1: Close Pages**
```typescript
const page = await browser.newPage();
// ... use page ...
await page.close(); // ← Don't forget!
```

**Solution 2: Limit Concurrent Tabs**
```typescript
import PQueue from "p-queue";

const browser = await puppeteer.launch(env.MYBROWSER);
const queue = new PQueue({ concurrency: 5 }); // Max 5 tabs

await Promise.all(urls.map(url =>
  queue.add(async () => {
    const page = await browser.newPage();
    await page.goto(url);
    // ...
    await page.close();
  })
));
```

**Solution 3: Use Smaller Viewports**
```typescript
await page.setViewport({
  width: 1280,
  height: 720 // Smaller than default
});
```

**Prevention**: Always close pages when done. Limit concurrent tabs. Process URLs in batches.

---

## Error 9: Failed to Connect to Session

**Full Error:**
```
Error: Failed to connect to browser session
```

**Root Cause**: Session closed between `.sessions()` and `.connect()` calls

**Why It Happens:**
- Session timed out (60s idle)
- Session closed by another Worker
- Session terminated unexpectedly

**Solution: Handle Connection Failures**
```typescript
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSession = sessions.find(s => !s.connectionId);

if (freeSession) {
  try {
    const browser = await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
    return browser;
  } catch (error) {
    console.log("Failed to connect to session, launching new browser");
  }
}

// Fall back to launching new browser
return await puppeteer.launch(env.MYBROWSER);
```

**Prevention**: Always wrap `puppeteer.connect()` in try-catch. Have a fallback to `puppeteer.launch()`.

---

## Error 10: Too Many Requests Per Minute

**Full Error:**
```
Error: Too many browser launches per minute
```

**Root Cause**: Exceeded "new browsers per minute" limit

**Limits:**
- Free tier: 3 per minute (1 every 20 seconds)
- Paid tier: 30 per minute (1 every 2 seconds)

**Solution: Implement Rate Limiting**
```typescript
async function launchWithRateLimit(env: Env): Promise<Browser> {
  const limits = await puppeteer.limits(env.MYBROWSER);

  if (limits.allowedBrowserAcquisitions === 0) {
    const delay = limits.timeUntilNextAllowedBrowserAcquisition || 2000;
    console.log(`Rate limited, waiting ${delay}ms`);
    await new Promise(resolve => setTimeout(resolve, delay));
  }

  return await puppeteer.launch(env.MYBROWSER);
}
```

**Prevention**: Check limits before launching. Implement exponential backoff. Reuse sessions instead of launching new browsers.

---

## Error 11: Binding Not Configured

**Full Error:**
```
Error: Browser binding not found
```

**Root Cause**: Browser binding not configured in wrangler.jsonc

**Solution: Add Browser Binding**
```jsonc
// wrangler.jsonc
{
  "browser": {
    "binding": "MYBROWSER"
  },
  "compatibility_flags": ["nodejs_compat"]
}
```

**Also Add to TypeScript Types:**
```typescript
interface Env {
  MYBROWSER: Fetcher;
}
```

**Prevention**: Always configure the browser binding and nodejs_compat flag.

---

## Error 12: nodejs_compat Flag Missing

**Full Error:**
```
Error: Node.js APIs not available
```

**Root Cause**: `nodejs_compat` compatibility flag not enabled

**Solution: Add Compatibility Flag**
```jsonc
// wrangler.jsonc
{
  "compatibility_flags": ["nodejs_compat"]
}
```

**Why It's Required:**
Browser Rendering needs Node.js APIs and polyfills to work.

**Prevention**: Always include `nodejs_compat` when using Browser Rendering.

---

## Error Handling Template

Complete error handling for production use:

```typescript
import puppeteer, { Browser } from "@cloudflare/puppeteer";

interface Env {
  MYBROWSER: Fetcher;
}

async function withBrowser<T>(
  env: Env,
  fn: (browser: Browser) => Promise<T>
): Promise<T> {
  let browser: Browser | null = null;

  try {
    // Check limits
    const limits = await puppeteer.limits(env.MYBROWSER);
    if (limits.allowedBrowserAcquisitions === 0) {
      throw new Error(
        `Rate limit reached. Retry after ${limits.timeUntilNextAllowedBrowserAcquisition}ms`
      );
    }

    // Try to reuse session
    const sessions = await puppeteer.sessions(env.MYBROWSER);
    const freeSession = sessions.find(s => !s.connectionId);

    if (freeSession) {
      try {
        browser = await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
      } catch (error) {
        console.log("Failed to connect, launching new browser");
        browser = await puppeteer.launch(env.MYBROWSER);
      }
    } else {
      browser = await puppeteer.launch(env.MYBROWSER);
    }

    // Execute user function
    const result = await fn(browser);

    // Disconnect (keep session alive)
    await browser.disconnect();

    return result;
  } catch (error) {
    // Close on error
    if (browser) {
      await browser.close();
    }

    // Re-throw with context
    if (error instanceof Error) {
      error.message = `Browser operation failed: ${error.message}`;
    }
    throw error;
  }
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    try {
      const screenshot = await withBrowser(env, async (browser) => {
        const page = await browser.newPage();

        try {
          await page.goto("https://example.com", {
            waitUntil: "networkidle0",
            timeout: 30000
          });
        } catch (error) {
          if (error instanceof Error && error.name === "TimeoutError") {
            console.log("Navigation timeout, taking screenshot anyway");
          } else {
            throw error;
          }
        }

        return await page.screenshot();
      });

      return new Response(screenshot, {
        headers: { "content-type": "image/png" }
      });
    } catch (error) {
      console.error("Request failed:", error);

      return new Response(
        JSON.stringify({
          error: error instanceof Error ? error.message : "Unknown error"
        }),
        {
          status: 500,
          headers: { "content-type": "application/json" }
        }
      );
    }
  }
};
```

---

## Debugging Checklist

When encountering browser errors:

1. **Check browser binding**
   - [ ] Binding configured in wrangler.jsonc?
   - [ ] nodejs_compat flag enabled?
   - [ ] Binding passed to puppeteer.launch()?

2. **Check limits**
   - [ ] Within concurrent browser limit?
   - [ ] Within new browsers/minute limit?
   - [ ] Call puppeteer.limits() to verify?

3. **Check timeouts**
   - [ ] Navigation timeout appropriate?
   - [ ] Browser keep_alive set if needed?
   - [ ] Timeout errors handled gracefully?

4. **Check session management**
   - [ ] browser.close() called on errors?
   - [ ] Pages closed when done?
   - [ ] Session reuse implemented correctly?

5. **Check network**
   - [ ] Target URL accessible?
   - [ ] No CORS/bot protection issues?
   - [ ] Appropriate wait conditions used?

---

## References

- **FAQ**: https://developers.cloudflare.com/browser-rendering/faq/
- **Limits**: https://developers.cloudflare.com/browser-rendering/platform/limits/
- **GitHub Issues**: https://github.com/cloudflare/puppeteer/issues
- **Discord**: https://discord.cloudflare.com/

---

**Last Updated**: 2025-10-22
593
references/pricing-and-limits.md
Normal file
@@ -0,0 +1,593 @@
|
||||
# Pricing and Limits Reference
|
||||
|
||||
Complete breakdown of Cloudflare Browser Rendering pricing, limits, and cost optimization strategies.
|
||||
|
||||
---
|
||||
|
||||
## Pricing Overview
|
||||
|
||||
Browser Rendering is billed on **two metrics**:
|
||||
|
||||
1. **Duration** - Total browser hours used
|
||||
2. **Concurrency** - Monthly average of concurrent browsers (Workers Bindings only)
|
||||
|
||||
---
|
||||
|
||||
## Free Tier (Workers Free Plan)
|
||||
|
||||
| Feature | Limit |
|
||||
|---------|-------|
|
||||
| **Browser Duration** | 10 minutes per day |
|
||||
| **Concurrent Browsers** | 3 per account |
|
||||
| **New Browsers per Minute** | 3 per minute |
|
||||
| **REST API Requests** | 6 per minute |
|
||||
| **Browser Timeout (Idle)** | 60 seconds |
|
||||
| **Max Session Duration** | No hard limit (closes on idle timeout) |
|
||||
|
||||
### Free Tier Use Cases
|
||||
|
||||
**Good for:**
|
||||
- Development and testing
|
||||
- Personal projects
|
||||
- Low-traffic screenshot services (<100 requests/day)
|
||||
- Learning and experimentation
|
||||
|
||||
**Not suitable for:**
|
||||
- Production applications
|
||||
- High-traffic services
|
||||
- Long-running scraping jobs
|
||||
- Batch operations (>3 concurrent browsers)
|
||||
|
||||
---
|
||||
|
||||
## Paid Tier (Workers Paid Plan)
|
||||
|
||||
### Included Limits
|
||||
|
||||
| Feature | Included |
|
||||
|---------|----------|
|
||||
| **Browser Duration** | 10 hours per month |
|
||||
| **Concurrent Browsers** | 10 (monthly average) |
|
||||
| **New Browsers per Minute** | 30 per minute |
|
||||
| **REST API Requests** | 180 per minute |
|
||||
| **Max Concurrent Browsers** | 30 per account |
|
||||
| **Browser Timeout** | 60 seconds (extendable to 10 minutes with keep_alive) |
|
||||
|
||||
### Beyond Included Limits
|
||||
|
||||
| Metric | Price |
|
||||
|--------|-------|
|
||||
| **Additional Browser Hours** | $0.09 per hour |
|
||||
| **Additional Concurrent Browsers** | $2.00 per browser (monthly average) |
|
||||
|
||||
### Requesting Higher Limits
|
||||
|
||||
If you need more than:
|
||||
- 30 concurrent browsers
|
||||
- 30 new browsers per minute
|
||||
- 180 REST API requests per minute
|
||||
|
||||
**Request higher limits**: https://forms.gle/CdueDKvb26mTaepa9
|
||||
|
||||
---
|
||||
|
||||
## Rate Limits
|
||||
|
||||
### Per-Second Enforcement
|
||||
|
||||
Rate limits are enforced **per-second**, not per-minute.
|
||||
|
||||
**Example**: 180 requests per minute = 3 requests per second
|
||||
|
||||
**This means:**
|
||||
- ❌ Cannot send all 180 requests at once
|
||||
- ✅ Must spread evenly over the minute (3/second)
|
||||
|
||||
**Implementation:**
|
||||
```typescript
|
||||
async function rateLimitedLaunch(env: Env): Promise<Browser> {
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
|
||||
if (limits.allowedBrowserAcquisitions === 0) {
|
||||
const delay = limits.timeUntilNextAllowedBrowserAcquisition;
|
||||
await new Promise(resolve => setTimeout(resolve, delay));
|
||||
}
|
||||
|
||||
return await puppeteer.launch(env.MYBROWSER);
|
||||
}
|
||||
```
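
For REST API calls, the same idea applies: pace requests instead of bursting them. A minimal sketch, assuming the 3-per-second paid-tier figure above; `callRestApi` is a placeholder for whatever request you are actually making:

```typescript
// Spread REST API calls evenly instead of sending them in a burst
async function pacedCalls<T>(
  items: T[],
  perSecond: number,
  callRestApi: (item: T) => Promise<void>
): Promise<void> {
  const intervalMs = 1000 / perSecond;
  for (const item of items) {
    const started = Date.now();
    await callRestApi(item);
    const elapsed = Date.now() - started;
    if (elapsed < intervalMs) {
      await new Promise(resolve => setTimeout(resolve, intervalMs - elapsed));
    }
  }
}

// Example: stay at 3 requests/second
// await pacedCalls(urls, 3, url => takeScreenshotViaRestApi(url)); // placeholder helper
```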
|
||||
|
||||
### Free Tier Rate Limits
|
||||
|
||||
- **Concurrent browsers**: 3
|
||||
- **New browsers/minute**: 3 (= 1 every 20 seconds)
|
||||
- **REST API requests/minute**: 6 (= 1 every 10 seconds)
|
||||
|
||||
### Paid Tier Rate Limits
|
||||
|
||||
- **Concurrent browsers**: 30 (default, can request higher)
|
||||
- **New browsers/minute**: 30 (= 1 every 2 seconds)
|
||||
- **REST API requests/minute**: 180 (= 3 per second)
|
||||
|
||||
---
|
||||
|
||||
## Duration Billing
|
||||
|
||||
### How It Works
|
||||
|
||||
1. **Daily Totals**: Cloudflare sums all browser usage each day (in seconds)
|
||||
2. **Monthly Total**: Sum of all daily totals
|
||||
3. **Rounded to Hours**: Total rounded to nearest hour
|
||||
4. **Billed**: Total hours minus 10 included hours
|
||||
|
||||
**Example:**
|
||||
- Day 1: 60 seconds (1 minute)
|
||||
- Day 2: 120 seconds (2 minutes)
|
||||
- ...
|
||||
- Day 30: 90 seconds (1.5 minutes)
|
||||
- **Monthly Total**: 45 minutes = 0.75 hours (rounded to 1 hour)
|
||||
- **Billable**: 1 hour - 10 included = 0 hours (still within free allowance)
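
A minimal sketch of that calculation, assuming you track daily usage totals (in seconds) yourself; the 10-hour figure is the paid plan's included allowance:

```typescript
// Estimate the monthly duration charge from daily usage totals (seconds)
function estimateDurationCost(
  dailySeconds: number[],
  includedHours = 10,
  ratePerHour = 0.09
): number {
  const totalSeconds = dailySeconds.reduce((sum, s) => sum + s, 0);
  const totalHours = Math.round(totalSeconds / 3600); // rounded to nearest hour
  const billableHours = Math.max(0, totalHours - includedHours);
  return billableHours * ratePerHour; // USD
}

// 45 minutes of usage rounds to 1 hour - still within the included 10 hours
console.log(estimateDurationCost([60, 120, 90, 2430])); // 0
```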
|
||||
|
||||
### Failed Requests
|
||||
|
||||
**Failed requests are NOT billed** if they fail with `waitForTimeout` error.
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
try {
|
||||
await page.goto(url, { timeout: 30000 });
|
||||
} catch (error) {
|
||||
// If this times out, browser time is NOT charged
|
||||
console.log("Navigation timeout - not billed");
|
||||
}
|
||||
```
|
||||
|
||||
### Duration Optimization
|
||||
|
||||
**Minimize browser time:**
|
||||
|
||||
1. **Close browsers promptly**
|
||||
```typescript
|
||||
await browser.close(); // Don't leave hanging
|
||||
```
|
||||
|
||||
2. **Use session reuse**
|
||||
```typescript
|
||||
// Reuse session instead of launching new browser
|
||||
const browser = await puppeteer.connect(env.MYBROWSER, sessionId);
|
||||
```
|
||||
|
||||
3. **Timeout management**
|
||||
```typescript
|
||||
// Set appropriate timeouts (don't wait forever)
|
||||
await page.goto(url, { timeout: 30000 });
|
||||
```
|
||||
|
||||
4. **Cache aggressively**
|
||||
```typescript
|
||||
// Cache screenshots in KV to avoid re-rendering
|
||||
const cached = await env.KV.get(url, { type: "arrayBuffer" });
|
||||
if (cached) return new Response(cached);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Concurrency Billing
|
||||
|
||||
### How It Works
|
||||
|
||||
1. **Daily Peak**: Cloudflare records highest concurrent browsers each day
|
||||
2. **Monthly Average**: Average of all daily peaks
|
||||
3. **Billed**: Average - 10 included browsers
|
||||
|
||||
**Formula:**
|
||||
```
|
||||
monthly_average = sum(daily_peaks) / days_in_month
|
||||
billable = max(0, monthly_average - 10)
|
||||
cost = billable * $2.00
|
||||
```
|
||||
|
||||
**Example:**
|
||||
- Days 1-15: 10 concurrent browsers (daily peak)
|
||||
- Days 16-30: 20 concurrent browsers (daily peak)
|
||||
- Monthly average: ((10 × 15) + (20 × 15)) / 30 = 15 browsers
|
||||
- Billable: 15 - 10 = 5 browsers
|
||||
- **Cost**: 5 × $2.00 = **$10.00**
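
The same formula as a small sketch; the daily peak counts would come from your own monitoring (for example the dashboard graph or periodic `puppeteer.limits()` samples):

```typescript
// Estimate the monthly concurrency charge from daily peak concurrent-browser counts
function estimateConcurrencyCost(
  dailyPeaks: number[],
  includedBrowsers = 10,
  ratePerBrowser = 2.0
): number {
  const monthlyAverage =
    dailyPeaks.reduce((sum, peak) => sum + peak, 0) / dailyPeaks.length;
  const billable = Math.max(0, monthlyAverage - includedBrowsers);
  return billable * ratePerBrowser; // USD
}

// 15 days at 10 browsers, 15 days at 20 -> average 15 -> 5 billable -> $10.00
const peaks = [...Array(15).fill(10), ...Array(15).fill(20)];
console.log(estimateConcurrencyCost(peaks)); // 10
```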
|
||||
|
||||
### Concurrency vs Duration
|
||||
|
||||
| Scenario | Concurrency Impact | Duration Impact |
|
||||
|----------|-------------------|-----------------|
|
||||
| 1 browser for 10 hours | 1 concurrent browser | 10 browser hours |
|
||||
| 10 browsers for 1 hour | 10 concurrent browsers | 10 browser hours |
|
||||
| 100 browsers for 6 minutes | 100 concurrent browsers (!!) | 10 browser hours |
|
||||
|
||||
**Key Insight**: Short bursts of high concurrency are EXPENSIVE.
|
||||
|
||||
### Concurrency Optimization
|
||||
|
||||
**Minimize concurrent browsers:**
|
||||
|
||||
1. **Use multiple tabs**
|
||||
```typescript
|
||||
// ❌ Bad: 10 browsers
|
||||
for (const url of urls) {
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
// ...
|
||||
}
|
||||
|
||||
// ✅ Good: 1 browser, 10 tabs
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
await Promise.all(urls.map(async url => {
|
||||
const page = await browser.newPage();
|
||||
// ...
|
||||
}));
|
||||
```
|
||||
|
||||
2. **Session reuse**
|
||||
```typescript
|
||||
// Maintain pool of warm browsers
|
||||
// Reuse instead of launching new ones
|
||||
```
|
||||
|
||||
3. **Queue requests**
|
||||
```typescript
|
||||
// Limit concurrent operations (PQueue comes from the "p-queue" npm package)
import PQueue from "p-queue";

const queue = new PQueue({ concurrency: 3 });
|
||||
await Promise.all(urls.map(url => queue.add(() => process(url))));
|
||||
```
|
||||
|
||||
4. **Incognito contexts**
|
||||
```typescript
|
||||
// Share browser, isolate sessions
|
||||
const context1 = await browser.createBrowserContext();
|
||||
const context2 = await browser.createBrowserContext();
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Cost Examples
|
||||
|
||||
### Example 1: Screenshot Service
|
||||
|
||||
**Scenario:**
|
||||
- 10,000 screenshots per month
|
||||
- 3 second average per screenshot
|
||||
- No caching, no session reuse
|
||||
|
||||
**Duration:**
|
||||
- 10,000 × 3 seconds = 30,000 seconds = 8.33 hours
|
||||
- Billable: 8.33 - 10 = 0 hours (within free allowance)
|
||||
- **Duration Cost**: $0.00
|
||||
|
||||
**Concurrency:**
|
||||
- Assume 100 requests/hour during peak (9am-5pm weekdays)
|
||||
- 100 requests/hour ÷ 3,600 seconds ≈ 0.028 requests/second, so well under 1 browser busy on average
- Peak (short bursts): ~3 concurrent browsers
|
||||
- Daily peak (weekdays): 3 browsers
|
||||
- Daily peak (weekends): 1 browser
|
||||
- Monthly average: ((3 × 22) + (1 × 8)) / 30 = 2.5 browsers
|
||||
- Billable: 2.5 - 10 = 0 (within free allowance)
|
||||
- **Concurrency Cost**: $0.00
|
||||
|
||||
**Total: $0.00** (within free tier!)
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Heavy Scraping
|
||||
|
||||
**Scenario:**
|
||||
- 1,000 URLs per day
|
||||
- 10 seconds average per URL
|
||||
- Batch processing (10 concurrent browsers)
|
||||
|
||||
**Duration:**
|
||||
- 1,000 × 10 seconds × 30 days = 300,000 seconds = 83.33 hours
|
||||
- Billable: 83.33 - 10 = 73.33 hours
|
||||
- **Duration Cost**: 73.33 × $0.09 = **$6.60**
|
||||
|
||||
**Concurrency:**
|
||||
- Daily peak: 10 concurrent browsers (every day)
|
||||
- Monthly average: 10 browsers
|
||||
- Billable: 10 - 10 = 0 (within free allowance)
|
||||
- **Concurrency Cost**: $0.00
|
||||
|
||||
**Total: $6.60/month**
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Burst Traffic
|
||||
|
||||
**Scenario:**
|
||||
- Newsletter sent monthly with screenshot links
|
||||
- 10,000 screenshots in 1 hour
|
||||
- Each screenshot: 3 seconds
|
||||
|
||||
**Duration:**
|
||||
- 10,000 × 3 seconds = 30,000 seconds = 8.33 hours
|
||||
- Billable: 8.33 - 10 = 0 hours
|
||||
- **Duration Cost**: $0.00
|
||||
|
||||
**Concurrency:**
|
||||
- 10,000 screenshots in 1 hour = 166 requests/minute
|
||||
- At 3 seconds each: ~8.3 concurrent browsers
|
||||
- The 30 new-browsers-per-minute limit forces queueing, so the burst drains more slowly than it arrives
- Assume a worst-case daily peak of 30 browsers (the account cap) on send day
|
||||
- Monthly average: (30 × 1 day + 1 × 29 days) / 30 = 1.97 browsers
|
||||
- Billable: 1.97 - 10 = 0
|
||||
- **Concurrency Cost**: $0.00
|
||||
|
||||
**Total: $0.00**
|
||||
|
||||
**Note**: Would hit rate limits. Better to spread over longer period or request higher limits.
|
||||
|
||||
---
|
||||
|
||||
### Example 4: Production API (Optimized)
|
||||
|
||||
**Scenario:**
|
||||
- 100,000 screenshots per month
|
||||
- Session reuse + KV caching (90% cache hit rate)
|
||||
- 10,000 actual browser renderings
|
||||
- 5 seconds average per render
|
||||
- Maintain pool of 5 warm browsers
|
||||
|
||||
**Duration:**
|
||||
- 10,000 × 5 seconds = 50,000 seconds = 13.89 hours
|
||||
- Billable: 13.89 - 10 = 3.89 hours
|
||||
- **Duration Cost**: 3.89 × $0.09 = **$0.35**
|
||||
|
||||
**Concurrency:**
|
||||
- Maintain pool of 5 browsers (keep_alive)
|
||||
- Daily peak: 5 browsers
|
||||
- Monthly average: 5 browsers
|
||||
- Billable: 5 - 10 = 0
|
||||
- **Concurrency Cost**: $0.00
|
||||
|
||||
**Total: $0.35/month** for 100k requests!
|
||||
|
||||
**Effective cost**: ~$0.0000035 per screenshot
|
||||
|
||||
---
|
||||
|
||||
## Cost Optimization Strategies
|
||||
|
||||
### 1. Aggressive Caching
|
||||
|
||||
**Strategy**: Cache screenshots/PDFs in KV or R2
|
||||
|
||||
**Impact**:
|
||||
- Reduces browser hours by 80-95%
|
||||
- Reduces concurrency needs
|
||||
- Faster response times
|
||||
|
||||
**Implementation**:
|
||||
```typescript
|
||||
// Check cache first
|
||||
const cached = await env.KV.get(url, { type: "arrayBuffer" });
|
||||
if (cached) return new Response(cached);
|
||||
|
||||
// Generate and cache
|
||||
const screenshot = await generateScreenshot(url);
|
||||
await env.KV.put(url, screenshot, { expirationTtl: 86400 });
|
||||
```
|
||||
|
||||
**Cost Savings**: 80-95% reduction
|
||||
|
||||
---
|
||||
|
||||
### 2. Session Reuse
|
||||
|
||||
**Strategy**: Maintain pool of warm browsers, reuse sessions
|
||||
|
||||
**Impact**:
|
||||
- Reduces cold start time
|
||||
- Lower concurrency charges
|
||||
- Better throughput
|
||||
|
||||
**Implementation**: See `session-reuse.ts` template
|
||||
|
||||
**Cost Savings**: 30-50% reduction
|
||||
|
||||
---
|
||||
|
||||
### 3. Multiple Tabs
|
||||
|
||||
**Strategy**: Use tabs instead of multiple browsers
|
||||
|
||||
**Impact**:
|
||||
- 10-50x reduction in concurrency
|
||||
- Minimal duration increase
|
||||
- Much cheaper
|
||||
|
||||
**Implementation**:
|
||||
```typescript
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
await Promise.all(urls.map(async url => {
|
||||
const page = await browser.newPage();
|
||||
// process
|
||||
await page.close();
|
||||
}));
|
||||
await browser.close();
|
||||
```
|
||||
|
||||
**Cost Savings**: 90%+ reduction in concurrency charges
|
||||
|
||||
---
|
||||
|
||||
### 4. Appropriate Timeouts
|
||||
|
||||
**Strategy**: Set reasonable timeouts, don't wait forever
|
||||
|
||||
**Impact**:
|
||||
- Prevents hanging browsers
|
||||
- Reduces wasted duration
|
||||
- Better error handling
|
||||
|
||||
**Implementation**:
|
||||
```typescript
|
||||
await page.goto(url, {
|
||||
timeout: 30000, // 30 second max
|
||||
waitUntil: "networkidle0"
|
||||
});
|
||||
```
|
||||
|
||||
**Cost Savings**: 20-40% reduction
|
||||
|
||||
---
|
||||
|
||||
### 5. Request Queueing
|
||||
|
||||
**Strategy**: Limit concurrent operations to stay within limits
|
||||
|
||||
**Impact**:
|
||||
- Avoid rate limit errors
|
||||
- Predictable costs
|
||||
- Better resource utilization
|
||||
|
||||
**Implementation**:
|
||||
```typescript
|
||||
import PQueue from "p-queue";
|
||||
|
||||
const queue = new PQueue({ concurrency: 5 });
|
||||
|
||||
await Promise.all(urls.map(url =>
|
||||
queue.add(() => processUrl(url))
|
||||
));
|
||||
```
|
||||
|
||||
**Cost Savings**: Avoids failed requests and wasted retries from hitting rate limits
|
||||
|
||||
---
|
||||
|
||||
## Monitoring Usage
|
||||
|
||||
### Dashboard
|
||||
|
||||
View usage in Cloudflare Dashboard:
|
||||
|
||||
https://dash.cloudflare.com/?to=/:account/workers/browser-rendering
|
||||
|
||||
**Metrics available:**
|
||||
- Total browser hours used
|
||||
- REST API requests
|
||||
- Concurrent browsers (graph)
|
||||
- Cost estimates
|
||||
|
||||
### Response Headers
|
||||
|
||||
REST API returns browser time used:
|
||||
|
||||
```
|
||||
X-Browser-Ms-Used: 2340
|
||||
```
|
||||
|
||||
(Browser time in milliseconds for that request)
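
For example, a Worker calling the REST API could log the header per request. This is a sketch only: the endpoint path, `ACCOUNT_ID`, and `API_TOKEN` are illustrative placeholders for your own values.

```typescript
// Log browser time consumed by a single REST API screenshot request (sketch)
const response = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/browser-rendering/screenshot`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ url: "https://example.com" })
  }
);

const msUsed = Number(response.headers.get("X-Browser-Ms-Used") ?? 0);
console.log(`Browser time for this request: ${msUsed}ms`);
```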
|
||||
|
||||
### Custom Tracking
|
||||
|
||||
```typescript
|
||||
interface UsageMetrics {
|
||||
date: string;
|
||||
browserHours: number;
|
||||
peakConcurrency: number;
|
||||
requests: number;
|
||||
cacheHitRate: number;
|
||||
}
|
||||
|
||||
// Track in D1 or Analytics Engine
|
||||
await env.ANALYTICS.writeDataPoint({
|
||||
indexes: [date],
|
||||
blobs: ["browser_usage"],
|
||||
doubles: [browserHours, peakConcurrency, requests]
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Cost Alerts
|
||||
|
||||
### Set Up Alerts
|
||||
|
||||
1. **Monitor daily peaks**
|
||||
```typescript
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
if (limits.activeSessions.length > 15) {
|
||||
console.warn("High concurrency detected:", limits.activeSessions.length);
|
||||
}
|
||||
```
|
||||
|
||||
2. **Track hourly usage**
|
||||
```typescript
|
||||
const usage = await getHourlyUsage();
|
||||
if (usage.browserHours > 1) {
|
||||
console.warn("High browser usage this hour:", usage.browserHours);
|
||||
}
|
||||
```
|
||||
|
||||
3. **Set budget limits**
|
||||
```typescript
|
||||
const monthlyBudget = 50; // $50/month
|
||||
const currentCost = await estimateCurrentCost();
|
||||
if (currentCost > monthlyBudget * 0.8) {
|
||||
console.warn("Approaching monthly budget:", currentCost);
|
||||
}
|
||||
```
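
The `estimateCurrentCost()` call above is a placeholder; one way to flesh it out is to combine the duration and concurrency estimators sketched in the billing sections earlier in this document. The two `loadDaily…` helpers stand in for your own metric store:

```typescript
// Placeholder: combine the duration and concurrency estimators from above
async function estimateCurrentCost(): Promise<number> {
  const dailySeconds = await loadDailySeconds(); // your own metric store (placeholder)
  const dailyPeaks = await loadDailyPeaks();     // your own metric store (placeholder)
  return (
    estimateDurationCost(dailySeconds) + estimateConcurrencyCost(dailyPeaks)
  );
}
```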
|
||||
|
||||
---
|
||||
|
||||
## Best Practices Summary
|
||||
|
||||
1. **Always cache** screenshots/PDFs in KV or R2
|
||||
2. **Reuse sessions** instead of launching new browsers
|
||||
3. **Use multiple tabs** instead of multiple browsers
|
||||
4. **Set appropriate timeouts** to prevent hanging
|
||||
5. **Monitor usage** in dashboard and logs
|
||||
6. **Queue requests** to stay within rate limits
|
||||
7. **Test caching** to optimize hit rate
|
||||
8. **Profile operations** to identify slow requests
|
||||
9. **Use incognito contexts** for session isolation
|
||||
10. **Request higher limits** if needed for production
|
||||
|
||||
---
|
||||
|
||||
## Common Questions
|
||||
|
||||
### Q: Are failed requests billed?
|
||||
|
||||
**A**: No. Requests that fail with `waitForTimeout` error are NOT billed.
|
||||
|
||||
### Q: How is concurrency calculated?
|
||||
|
||||
**A**: Monthly average of daily peak concurrent browsers.
|
||||
|
||||
### Q: Can I reduce my bill?
|
||||
|
||||
**A**: Yes! Use caching, session reuse, and multiple tabs. See optimization strategies above.
|
||||
|
||||
### Q: What if I hit limits?
|
||||
|
||||
**A**: Implement queueing, or request higher limits: https://forms.gle/CdueDKvb26mTaepa9
|
||||
|
||||
### Q: Is there a free tier?
|
||||
|
||||
**A**: Yes! 10 minutes/day browser time, 3 concurrent browsers.
|
||||
|
||||
### Q: How do I estimate costs?
|
||||
|
||||
**A**: Monitor usage in dashboard, then calculate:
|
||||
- Duration: (hours - 10) × $0.09
|
||||
- Concurrency: (avg - 10) × $2.00
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- **Official Pricing Docs**: https://developers.cloudflare.com/browser-rendering/platform/pricing/
|
||||
- **Limits Docs**: https://developers.cloudflare.com/browser-rendering/platform/limits/
|
||||
- **Dashboard**: https://dash.cloudflare.com/?to=/:account/workers/browser-rendering
|
||||
- **Request Higher Limits**: https://forms.gle/CdueDKvb26mTaepa9
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: 2025-10-22
|
||||
627
references/puppeteer-vs-playwright.md
Normal file
@@ -0,0 +1,627 @@
|
||||
# Puppeteer vs Playwright Comparison
|
||||
|
||||
Complete comparison guide for choosing between @cloudflare/puppeteer and @cloudflare/playwright.
|
||||
|
||||
---
|
||||
|
||||
## Quick Recommendation
|
||||
|
||||
**Use Puppeteer if:**
|
||||
- ✅ Starting a new project
|
||||
- ✅ Need session management features
|
||||
- ✅ Want to optimize performance/costs
|
||||
- ✅ Building screenshot/PDF services
|
||||
- ✅ Web scraping workflows
|
||||
|
||||
**Use Playwright if:**
|
||||
- ✅ Already have Playwright tests to migrate
|
||||
- ✅ Prefer auto-waiting behavior
|
||||
- ✅ Don't need advanced session features
|
||||
- ✅ Want cross-browser APIs (even if only Chromium supported now)
|
||||
|
||||
**Bottom Line**: **Puppeteer is recommended** for most Browser Rendering use cases.
|
||||
|
||||
---
|
||||
|
||||
## Package Installation
|
||||
|
||||
### Puppeteer
|
||||
```bash
|
||||
npm install @cloudflare/puppeteer
|
||||
```
|
||||
|
||||
**Version**: 1.0.4 (based on Puppeteer v23.x)
|
||||
|
||||
### Playwright
|
||||
```bash
|
||||
npm install @cloudflare/playwright
|
||||
```
|
||||
|
||||
**Version**: 1.0.0 (based on Playwright v1.55.0)
|
||||
|
||||
---
|
||||
|
||||
## API Comparison
|
||||
|
||||
### Launching a Browser
|
||||
|
||||
**Puppeteer:**
|
||||
```typescript
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
```
|
||||
|
||||
**Playwright:**
|
||||
```typescript
|
||||
import { chromium } from "@cloudflare/playwright";
|
||||
|
||||
const browser = await chromium.launch(env.BROWSER);
|
||||
```
|
||||
|
||||
**Key Difference**: Playwright uses `chromium.launch()` (browser-specific), Puppeteer uses `puppeteer.launch()` (generic).
|
||||
|
||||
---
|
||||
|
||||
### Basic Screenshot Example
|
||||
|
||||
**Puppeteer:**
|
||||
```typescript
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
const page = await browser.newPage();
|
||||
await page.goto("https://example.com");
|
||||
const screenshot = await page.screenshot();
|
||||
await browser.close();
|
||||
|
||||
return new Response(screenshot, {
|
||||
headers: { "content-type": "image/png" }
|
||||
});
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
**Playwright:**
|
||||
```typescript
|
||||
import { chromium } from "@cloudflare/playwright";
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const browser = await chromium.launch(env.BROWSER);
|
||||
const page = await browser.newPage();
|
||||
await page.goto("https://example.com");
|
||||
const screenshot = await page.screenshot();
|
||||
await browser.close();
|
||||
|
||||
return new Response(screenshot, {
|
||||
headers: { "content-type": "image/png" }
|
||||
});
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
**Key Difference**: Nearly identical! Main difference is import and launch method.
|
||||
|
||||
---
|
||||
|
||||
## Feature Comparison
|
||||
|
||||
| Feature | Puppeteer | Playwright | Notes |
|
||||
|---------|-----------|------------|-------|
|
||||
| **Basic Screenshots** | ✅ Yes | ✅ Yes | Both support PNG/JPEG |
|
||||
| **PDF Generation** | ✅ Yes | ✅ Yes | Identical API |
|
||||
| **Page Navigation** | ✅ Yes | ✅ Yes | Similar API |
|
||||
| **Element Selectors** | CSS only | CSS, text, XPath, layout | Playwright has more selector types |
|
||||
| **Auto-waiting** | ❌ Manual | ✅ Built-in | Playwright waits for elements automatically |
|
||||
| **Session Management** | ✅ Advanced | ⚠️ Basic | Puppeteer has .sessions(), .history(), .limits() |
|
||||
| **Session Reuse** | ✅ Yes | ⚠️ Limited | Puppeteer has .connect() with sessionId |
|
||||
| **Browser Contexts** | ✅ Yes | ✅ Yes | Both support incognito contexts |
|
||||
| **Multiple Tabs** | ✅ Yes | ✅ Yes | Both support multiple pages |
|
||||
| **Network Interception** | ✅ Yes | ✅ Yes | Similar APIs |
|
||||
| **Geolocation** | ✅ Yes | ✅ Yes | Similar APIs |
|
||||
| **Emulation** | ✅ Yes | ✅ Yes | Device emulation, viewport |
|
||||
| **Browser Support** | Chromium only | Chromium only | Firefox/Safari not yet supported |
|
||||
| **TypeScript Types** | ✅ Yes | ✅ Yes | Both fully typed |
|
||||
|
||||
---
|
||||
|
||||
## Session Management
|
||||
|
||||
### Puppeteer (Advanced)
|
||||
|
||||
```typescript
|
||||
// List active sessions
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
|
||||
// Find free session
|
||||
const freeSession = sessions.find(s => !s.connectionId);
|
||||
|
||||
// Connect to existing session
|
||||
if (freeSession) {
|
||||
const browser = await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
|
||||
}
|
||||
|
||||
// Check limits
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
console.log("Can launch:", limits.allowedBrowserAcquisitions > 0);
|
||||
|
||||
// View history
|
||||
const history = await puppeteer.history(env.MYBROWSER);
|
||||
```
|
||||
|
||||
**Puppeteer APIs:**
|
||||
- ✅ `puppeteer.sessions()` - List active sessions
|
||||
- ✅ `puppeteer.connect()` - Connect to session by ID
|
||||
- ✅ `puppeteer.history()` - View recent sessions
|
||||
- ✅ `puppeteer.limits()` - Check account limits
|
||||
- ✅ `browser.sessionId()` - Get current session ID
|
||||
- ✅ `browser.disconnect()` - Disconnect without closing
|
||||
|
||||
---
|
||||
|
||||
### Playwright (Basic)
|
||||
|
||||
```typescript
|
||||
// Launch browser
|
||||
const browser = await chromium.launch(env.BROWSER);
|
||||
|
||||
// Get session info (basic)
|
||||
// Note: No .sessions(), .history(), or .limits() APIs
|
||||
```
|
||||
|
||||
**Playwright APIs:**
|
||||
- ❌ No `chromium.sessions()` equivalent
|
||||
- ❌ No session reuse APIs
|
||||
- ❌ No limits checking
|
||||
- ❌ No session history
|
||||
|
||||
**Workaround**: Use Puppeteer-style session management via REST API (more complex).
|
||||
|
||||
---
|
||||
|
||||
## Auto-Waiting Behavior
|
||||
|
||||
### Puppeteer (Manual)
|
||||
|
||||
```typescript
|
||||
// Must explicitly wait for elements
|
||||
await page.goto("https://example.com");
|
||||
await page.waitForSelector("button#submit");
|
||||
await page.click("button#submit");
|
||||
```
|
||||
|
||||
**Pros**: Fine-grained control
|
||||
|
||||
**Cons**: More verbose, easy to forget waits
|
||||
|
||||
---
|
||||
|
||||
### Playwright (Auto-waiting)
|
||||
|
||||
```typescript
|
||||
// Automatically waits for elements
|
||||
await page.goto("https://example.com");
|
||||
await page.click("button#submit"); // Waits automatically!
|
||||
```
|
||||
|
||||
**Pros**: Less boilerplate, fewer timing issues
|
||||
|
||||
**Cons**: Less control over wait behavior
|
||||
|
||||
---
|
||||
|
||||
## Selector Support
|
||||
|
||||
### Puppeteer
|
||||
|
||||
**Supported:**
|
||||
- CSS selectors: `"button#submit"`, `"div > p"`
|
||||
- `:visible`, `:hidden` pseudo-classes
|
||||
- `page.$()`, `page.$$()` for querying
|
||||
|
||||
**Not Supported:**
|
||||
- XPath selectors (use `page.evaluate()` workaround)
|
||||
- Text selectors
|
||||
- Layout selectors
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
// CSS selector
|
||||
const button = await page.$("button#submit");
|
||||
|
||||
// XPath workaround
|
||||
const heading = await page.evaluate(() => {
|
||||
return new XPathEvaluator()
|
||||
.createExpression("//h1[@class='title']")
|
||||
.evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
|
||||
.singleNodeValue.textContent;
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Playwright
|
||||
|
||||
**Supported:**
|
||||
- CSS selectors: `"button#submit"`
|
||||
- Text selectors: `"text=Submit"`
|
||||
- XPath selectors: `"xpath=//button"`
|
||||
- Layout selectors: `"button :right-of(:text('Cancel'))"`
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
// CSS selector
|
||||
await page.click("button#submit");
|
||||
|
||||
// Text selector
|
||||
await page.click("text=Submit");
|
||||
|
||||
// Combined selector
|
||||
await page.click("button >> text=Submit");
|
||||
```
|
||||
|
||||
**Advantage**: More flexible selector options
|
||||
|
||||
---
|
||||
|
||||
## Performance & Cost
|
||||
|
||||
### Puppeteer (Optimized)
|
||||
|
||||
**Session Reuse:**
|
||||
```typescript
|
||||
// Reuse sessions to reduce costs
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
const browser = await puppeteer.connect(env.MYBROWSER, sessionId);
|
||||
await browser.disconnect(); // Keep alive
|
||||
```
|
||||
|
||||
**Cost Impact:**
|
||||
- ✅ Reduce cold starts by 50-70%
|
||||
- ✅ Lower concurrency charges
|
||||
- ✅ Better throughput
|
||||
|
||||
---
|
||||
|
||||
### Playwright (Limited Optimization)
|
||||
|
||||
**No Session Reuse:**
|
||||
```typescript
|
||||
// Must launch new browser each time
|
||||
const browser = await chromium.launch(env.BROWSER);
|
||||
await browser.close(); // Cannot keep alive for reuse
|
||||
```
|
||||
|
||||
**Cost Impact:**
|
||||
- ❌ Higher browser hours (cold starts every request)
|
||||
- ❌ Higher concurrency usage
|
||||
- ❌ Lower throughput
|
||||
|
||||
**Difference**: ~30-50% higher costs with Playwright vs optimized Puppeteer.
|
||||
|
||||
---
|
||||
|
||||
## API Differences
|
||||
|
||||
| Operation | Puppeteer | Playwright |
|
||||
|-----------|-----------|------------|
|
||||
| **Import** | `import puppeteer from "@cloudflare/puppeteer"` | `import { chromium } from "@cloudflare/playwright"` |
|
||||
| **Launch** | `puppeteer.launch(env.MYBROWSER)` | `chromium.launch(env.BROWSER)` |
|
||||
| **Connect** | `puppeteer.connect(env.MYBROWSER, sessionId)` | ❌ Not available |
|
||||
| **Sessions** | `puppeteer.sessions(env.MYBROWSER)` | ❌ Not available |
|
||||
| **Limits** | `puppeteer.limits(env.MYBROWSER)` | ❌ Not available |
|
||||
| **Goto** | `page.goto(url, { waitUntil: "networkidle0" })` | `page.goto(url, { waitUntil: "networkidle" })` |
|
||||
| **Screenshot** | `page.screenshot({ fullPage: true })` | `page.screenshot({ fullPage: true })` |
|
||||
| **PDF** | `page.pdf({ format: "A4" })` | `page.pdf({ format: "A4" })` |
|
||||
| **Wait** | `page.waitForSelector("button")` | `page.locator("button").waitFor()` |
|
||||
| **Click** | `page.click("button")` | `page.click("button")` (auto-waits) |
|
||||
|
||||
---
|
||||
|
||||
## Migration Guide
|
||||
|
||||
### Puppeteer → Playwright
|
||||
|
||||
```typescript
|
||||
// Before (Puppeteer)
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
const page = await browser.newPage();
|
||||
await page.goto(url, { waitUntil: "networkidle0" });
|
||||
await page.waitForSelector("button#submit");
|
||||
await page.click("button#submit");
|
||||
const screenshot = await page.screenshot();
|
||||
await browser.close();
|
||||
```
|
||||
|
||||
```typescript
|
||||
// After (Playwright)
|
||||
import { chromium } from "@cloudflare/playwright";
|
||||
|
||||
const browser = await chromium.launch(env.BROWSER);
|
||||
const page = await browser.newPage();
|
||||
await page.goto(url, { waitUntil: "networkidle" });
|
||||
// No waitForSelector needed - auto-waits
|
||||
await page.click("button#submit");
|
||||
const screenshot = await page.screenshot();
|
||||
await browser.close();
|
||||
```
|
||||
|
||||
**Changes:**
|
||||
1. Import: `puppeteer` → `{ chromium }`
|
||||
2. Launch: `puppeteer.launch()` → `chromium.launch()`
|
||||
3. Wait: `networkidle0` → `networkidle`
|
||||
4. Remove explicit `waitForSelector()` (auto-waits)
|
||||
|
||||
---
|
||||
|
||||
### Playwright → Puppeteer
|
||||
|
||||
```typescript
|
||||
// Before (Playwright)
|
||||
import { chromium } from "@cloudflare/playwright";
|
||||
|
||||
const browser = await chromium.launch(env.BROWSER);
|
||||
const page = await browser.newPage();
|
||||
await page.goto(url);
|
||||
await page.click("button#submit"); // Auto-waits
|
||||
```
|
||||
|
||||
```typescript
|
||||
// After (Puppeteer)
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
const page = await browser.newPage();
|
||||
await page.goto(url, { waitUntil: "networkidle0" });
|
||||
await page.waitForSelector("button#submit"); // Explicit wait
|
||||
await page.click("button#submit");
|
||||
```
|
||||
|
||||
**Changes:**
|
||||
1. Import: `{ chromium }` → `puppeteer`
|
||||
2. Launch: `chromium.launch()` → `puppeteer.launch()`
|
||||
3. Add explicit waits: `page.waitForSelector()`
|
||||
4. Specify wait conditions: `waitUntil: "networkidle0"`
|
||||
|
||||
---
|
||||
|
||||
## Use Case Recommendations
|
||||
|
||||
### Screenshot Service
|
||||
**Winner**: **Puppeteer**
|
||||
|
||||
**Reason**: Session reuse reduces costs by 30-50%
|
||||
|
||||
```typescript
|
||||
// Puppeteer: Reuse sessions
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
const browser = await puppeteer.connect(env.MYBROWSER, sessionId);
|
||||
await browser.disconnect(); // Keep alive
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### PDF Generation
|
||||
**Winner**: **Tie**
|
||||
|
||||
**Reason**: Identical API, no session reuse benefit
|
||||
|
||||
```typescript
|
||||
// Both have same API
|
||||
const pdf = await page.pdf({ format: "A4" });
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Web Scraping
|
||||
**Winner**: **Puppeteer**
|
||||
|
||||
**Reason**: Session management + limit checking
|
||||
|
||||
```typescript
|
||||
// Puppeteer: Check limits before scraping
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
if (limits.allowedBrowserAcquisitions === 0) {
|
||||
await delay(limits.timeUntilNextAllowedBrowserAcquisition);
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Test Migration
|
||||
**Winner**: **Playwright**
|
||||
|
||||
**Reason**: Easier to migrate existing Playwright tests
|
||||
|
||||
```typescript
|
||||
// Minimal changes needed
|
||||
// Just update imports and launch
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Interactive Automation
|
||||
**Winner**: **Tie**
|
||||
|
||||
**Reason**: Both support form filling, clicking, etc.
|
||||
|
||||
---
|
||||
|
||||
## Configuration
|
||||
|
||||
### wrangler.jsonc (Puppeteer)
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"browser": {
|
||||
"binding": "MYBROWSER"
|
||||
},
|
||||
"compatibility_flags": ["nodejs_compat"]
|
||||
}
|
||||
```
|
||||
|
||||
```typescript
|
||||
interface Env {
|
||||
MYBROWSER: Fetcher;
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### wrangler.jsonc (Playwright)
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"browser": {
|
||||
"binding": "BROWSER"
|
||||
},
|
||||
"compatibility_flags": ["nodejs_compat"]
|
||||
}
|
||||
```
|
||||
|
||||
```typescript
|
||||
interface Env {
|
||||
BROWSER: Fetcher;
|
||||
}
|
||||
```
|
||||
|
||||
**Note**: Binding name is arbitrary, but convention is `MYBROWSER` for Puppeteer and `BROWSER` for Playwright.
|
||||
|
||||
---
|
||||
|
||||
## Production Considerations
|
||||
|
||||
### Puppeteer Advantages
|
||||
- ✅ Session reuse (30-50% cost savings)
|
||||
- ✅ Limit checking (`puppeteer.limits()`)
|
||||
- ✅ Session monitoring (`puppeteer.sessions()`, `.history()`)
|
||||
- ✅ Better performance optimization options
|
||||
- ✅ More mature Cloudflare fork
|
||||
|
||||
### Playwright Advantages
|
||||
- ✅ Auto-waiting (less code)
|
||||
- ✅ More selector types
|
||||
- ✅ Better cross-browser APIs (future-proof)
|
||||
- ✅ Easier migration from existing tests
|
||||
|
||||
---
|
||||
|
||||
## Recommendation Summary
|
||||
|
||||
| Scenario | Recommended | Reason |
|
||||
|----------|-------------|--------|
|
||||
| New project | **Puppeteer** | Session management + cost optimization |
|
||||
| Screenshot service | **Puppeteer** | Session reuse saves 30-50% |
|
||||
| PDF generation | **Tie** | Identical API |
|
||||
| Web scraping | **Puppeteer** | Limit checking + session management |
|
||||
| Migrating Playwright tests | **Playwright** | Minimal changes needed |
|
||||
| High traffic production | **Puppeteer** | Better performance optimization |
|
||||
| Quick prototype | **Tie** | Both easy to start with |
|
||||
|
||||
---
|
||||
|
||||
## Code Examples
|
||||
|
||||
### Puppeteer (Production-Optimized)
|
||||
|
||||
```typescript
|
||||
import puppeteer, { Browser } from "@cloudflare/puppeteer";
|
||||
|
||||
async function getBrowser(env: Env): Promise<Browser> {
|
||||
// Check limits
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
if (limits.allowedBrowserAcquisitions === 0) {
|
||||
throw new Error("Rate limit reached");
|
||||
}
|
||||
|
||||
// Try to reuse session
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
const freeSession = sessions.find(s => !s.connectionId);
|
||||
|
||||
if (freeSession) {
|
||||
try {
|
||||
return await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
|
||||
} catch {
|
||||
// Session closed, launch new
|
||||
}
|
||||
}
|
||||
|
||||
return await puppeteer.launch(env.MYBROWSER);
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const browser = await getBrowser(env);
|
||||
|
||||
try {
|
||||
const page = await browser.newPage();
|
||||
await page.goto("https://example.com", {
|
||||
waitUntil: "networkidle0",
|
||||
timeout: 30000
|
||||
});
|
||||
const screenshot = await page.screenshot();
|
||||
|
||||
// Disconnect (keep alive)
|
||||
await browser.disconnect();
|
||||
|
||||
return new Response(screenshot, {
|
||||
headers: { "content-type": "image/png" }
|
||||
});
|
||||
} catch (error) {
|
||||
await browser.close();
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Playwright (Simple)
|
||||
|
||||
```typescript
|
||||
import { chromium } from "@cloudflare/playwright";
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const browser = await chromium.launch(env.BROWSER);
|
||||
|
||||
try {
|
||||
const page = await browser.newPage();
|
||||
await page.goto("https://example.com", {
|
||||
waitUntil: "networkidle",
|
||||
timeout: 30000
|
||||
});
|
||||
const screenshot = await page.screenshot();
|
||||
|
||||
await browser.close();
|
||||
|
||||
return new Response(screenshot, {
|
||||
headers: { "content-type": "image/png" }
|
||||
});
|
||||
} catch (error) {
|
||||
await browser.close();
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- **Puppeteer Docs**: https://pptr.dev/
|
||||
- **Playwright Docs**: https://playwright.dev/
|
||||
- **Cloudflare Puppeteer Fork**: https://github.com/cloudflare/puppeteer
|
||||
- **Cloudflare Playwright Fork**: https://github.com/cloudflare/playwright
|
||||
- **Browser Rendering Docs**: https://developers.cloudflare.com/browser-rendering/
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: 2025-10-22
|
||||
739
references/session-management.md
Normal file
@@ -0,0 +1,739 @@
|
||||
# Session Management Guide
|
||||
|
||||
Complete guide to browser session management for performance optimization and concurrency handling.
|
||||
|
||||
---
|
||||
|
||||
## Why Session Management Matters
|
||||
|
||||
**The Problem:**
|
||||
- Launching new browsers is slow (~2-3 seconds cold start)
|
||||
- Each launch consumes concurrency quota
|
||||
- Free tier: Only 3 concurrent browsers
|
||||
- Paid tier: 10-30 concurrent browsers (costs $2/browser beyond included)
|
||||
|
||||
**The Solution:**
|
||||
- Reuse browser sessions across requests
|
||||
- Use multiple tabs instead of multiple browsers
|
||||
- Check limits before launching
|
||||
- Disconnect (don't close) to keep sessions alive
|
||||
|
||||
**Benefits:**
|
||||
- ⚡ **50-70% faster** (no cold start)
|
||||
- 💰 **Lower costs** (reduced concurrency charges)
|
||||
- 📊 **Better utilization** (one browser, many tabs)
|
||||
|
||||
---
|
||||
|
||||
## Session Lifecycle
|
||||
|
||||
```
|
||||
1. Launch → Browser session created (session ID assigned)
|
||||
2. Connected → Worker actively using browser
|
||||
3. Disconnected → Session idle, available for reuse
|
||||
4. Timeout → Session closed after 60s idle (configurable)
|
||||
5. Closed → Session terminated (must launch new one)
|
||||
```
|
||||
|
||||
### Session States
|
||||
|
||||
| State | Description | Can Connect? |
|
||||
|-------|-------------|--------------|
|
||||
| **Active with connection** | Worker is using browser | ❌ No (occupied) |
|
||||
| **Active without connection** | Browser idle, waiting | ✅ Yes (available) |
|
||||
| **Closed** | Session terminated | ❌ No (gone) |
|
||||
|
||||
---
|
||||
|
||||
## Session Management API
|
||||
|
||||
### puppeteer.sessions()
|
||||
|
||||
List all currently running browser sessions.
|
||||
|
||||
**Signature:**
|
||||
```typescript
|
||||
await puppeteer.sessions(binding: Fetcher): Promise<SessionInfo[]>
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```typescript
|
||||
interface SessionInfo {
|
||||
sessionId: string; // Unique session ID
|
||||
startTime: number; // Unix timestamp (ms)
|
||||
connectionId?: string; // Present if worker is connected
|
||||
connectionStartTime?: number;
|
||||
}
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
|
||||
// Find free sessions (no active connection)
|
||||
const freeSessions = sessions.filter(s => !s.connectionId);
|
||||
|
||||
// Find occupied sessions
|
||||
const occupiedSessions = sessions.filter(s => s.connectionId);
|
||||
|
||||
console.log({
|
||||
total: sessions.length,
|
||||
available: freeSessions.length,
|
||||
occupied: occupiedSessions.length
|
||||
});
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```json
|
||||
[
|
||||
{
|
||||
"sessionId": "478f4d7d-e943-40f6-a414-837d3736a1dc",
|
||||
"startTime": 1711621703708,
|
||||
"connectionId": "2a2246fa-e234-4dc1-8433-87e6cee80145",
|
||||
"connectionStartTime": 1711621704607
|
||||
},
|
||||
{
|
||||
"sessionId": "565e05fb-4d2a-402b-869b-5b65b1381db7",
|
||||
"startTime": 1711621703808
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
**Interpretation:**
|
||||
- Session `478f4d...` is **occupied** (has connectionId)
|
||||
- Session `565e05...` is **available** (no connectionId)
|
||||
|
||||
---
|
||||
|
||||
### puppeteer.history()
|
||||
|
||||
List recent sessions, both open and closed.
|
||||
|
||||
**Signature:**
|
||||
```typescript
|
||||
await puppeteer.history(binding: Fetcher): Promise<HistoryEntry[]>
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```typescript
|
||||
interface HistoryEntry {
|
||||
sessionId: string;
|
||||
startTime: number;
|
||||
endTime?: number; // Present if closed
|
||||
closeReason?: number; // Numeric close code
|
||||
closeReasonText?: string; // Human-readable reason
|
||||
}
|
||||
```
|
||||
|
||||
**Close Reasons:**
|
||||
- `"NormalClosure"` - Explicitly closed with browser.close()
|
||||
- `"BrowserIdle"` - Timeout due to 60s idle period
|
||||
- `"Unknown"` - Unexpected closure
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
const history = await puppeteer.history(env.MYBROWSER);
|
||||
|
||||
history.forEach(entry => {
|
||||
  const duration = entry.endTime
    ? `${(entry.endTime - entry.startTime) / 1000}s`
    : "still running";

  console.log({
    sessionId: entry.sessionId,
    duration,
|
||||
closeReason: entry.closeReasonText || 'N/A'
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
**Use Cases:**
|
||||
- Monitor browser usage patterns
|
||||
- Debug unexpected closures
|
||||
- Track session lifetimes
|
||||
- Estimate costs
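
For example, a rough duration estimate from recently closed sessions (open sessions have no `endTime` yet, so they are skipped here):

```typescript
// Sum browser time used by recently closed sessions (sketch)
const history = await puppeteer.history(env.MYBROWSER);

const secondsUsed = history
  .filter(entry => entry.endTime !== undefined)
  .reduce((sum, entry) => sum + (entry.endTime! - entry.startTime) / 1000, 0);

console.log(`~${Math.round(secondsUsed)}s of browser time across recent sessions`);
```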
|
||||
|
||||
---
|
||||
|
||||
### puppeteer.limits()
|
||||
|
||||
Check current account limits and session availability.
|
||||
|
||||
**Signature:**
|
||||
```typescript
|
||||
await puppeteer.limits(binding: Fetcher): Promise<LimitsInfo>
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```typescript
|
||||
interface LimitsInfo {
|
||||
activeSessions: Array<{ id: string }>;
|
||||
maxConcurrentSessions: number;
|
||||
allowedBrowserAcquisitions: number; // Can launch now?
|
||||
timeUntilNextAllowedBrowserAcquisition: number; // ms to wait
|
||||
}
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
|
||||
console.log({
|
||||
active: limits.activeSessions.length,
|
||||
max: limits.maxConcurrentSessions,
|
||||
canLaunch: limits.allowedBrowserAcquisitions > 0,
|
||||
waitTime: limits.timeUntilNextAllowedBrowserAcquisition
|
||||
});
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```json
|
||||
{
|
||||
"activeSessions": [
|
||||
{ "id": "478f4d7d-e943-40f6-a414-837d3736a1dc" },
|
||||
{ "id": "565e05fb-4d2a-402b-869b-5b65b1381db7" }
|
||||
],
|
||||
"allowedBrowserAcquisitions": 1,
|
||||
"maxConcurrentSessions": 10,
|
||||
"timeUntilNextAllowedBrowserAcquisition": 0
|
||||
}
|
||||
```
|
||||
|
||||
**Interpretation:**
|
||||
- 2 sessions currently active
|
||||
- Maximum 10 concurrent sessions allowed
|
||||
- Can launch 1 more browser now
|
||||
- No wait time required
|
||||
|
||||
---
|
||||
|
||||
### puppeteer.connect()
|
||||
|
||||
Connect to an existing browser session.
|
||||
|
||||
**Signature:**
|
||||
```typescript
|
||||
await puppeteer.connect(binding: Fetcher, sessionId: string): Promise<Browser>
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
const freeSession = sessions.find(s => !s.connectionId);
|
||||
|
||||
if (freeSession) {
|
||||
try {
|
||||
const browser = await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
|
||||
console.log("Connected to existing session:", browser.sessionId());
|
||||
} catch (error) {
|
||||
console.log("Connection failed, session may have closed");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Error Handling:**
|
||||
Session may close between `.sessions()` call and `.connect()` call. Always wrap in try-catch.
|
||||
|
||||
---
|
||||
|
||||
### browser.sessionId()
|
||||
|
||||
Get the current browser's session ID.
|
||||
|
||||
**Signature:**
|
||||
```typescript
|
||||
browser.sessionId(): string
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
const sessionId = browser.sessionId();
|
||||
console.log("Current session:", sessionId);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### browser.disconnect()
|
||||
|
||||
Disconnect from browser WITHOUT closing it.
|
||||
|
||||
**Signature:**
|
||||
```typescript
|
||||
await browser.disconnect(): Promise<void>
|
||||
```
|
||||
|
||||
**When to use:**
|
||||
- Want to reuse session later
|
||||
- Keep browser warm for next request
|
||||
- Reduce cold start times
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
const sessionId = browser.sessionId();
|
||||
|
||||
// Do work
|
||||
const page = await browser.newPage();
|
||||
await page.goto("https://example.com");
|
||||
|
||||
// Disconnect (keep alive)
|
||||
await browser.disconnect();
|
||||
|
||||
// Later: reconnect
|
||||
const browserAgain = await puppeteer.connect(env.MYBROWSER, sessionId);
|
||||
```
|
||||
|
||||
**Important:**
|
||||
- Browser will still timeout after 60s idle (use `keep_alive` to extend)
|
||||
- Session remains in your concurrent browser count
|
||||
- Other workers CAN connect to this session
|
||||
|
||||
---
|
||||
|
||||
### browser.close()
|
||||
|
||||
Close the browser and terminate the session.
|
||||
|
||||
**Signature:**
|
||||
```typescript
|
||||
await browser.close(): Promise<void>
|
||||
```
|
||||
|
||||
**When to use:**
|
||||
- Done with browser completely
|
||||
- Want to free concurrency slot
|
||||
- Error occurred during processing
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
try {
|
||||
// Do work
|
||||
} catch (error) {
|
||||
await browser.close(); // Clean up on error
|
||||
throw error;
|
||||
}
|
||||
|
||||
await browser.close(); // Normal cleanup
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Session Reuse Patterns
|
||||
|
||||
### Pattern 1: Simple Reuse
|
||||
|
||||
```typescript
|
||||
async function getBrowser(env: Env): Promise<Browser> {
|
||||
// Try to connect to existing session
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
const freeSession = sessions.find(s => !s.connectionId);
|
||||
|
||||
if (freeSession) {
|
||||
try {
|
||||
return await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
|
||||
} catch {
|
||||
// Session closed, launch new one
|
||||
}
|
||||
}
|
||||
|
||||
// Launch new browser
|
||||
return await puppeteer.launch(env.MYBROWSER);
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const browser = await getBrowser(env);
|
||||
|
||||
// Do work
|
||||
const page = await browser.newPage();
|
||||
// ...
|
||||
|
||||
// Disconnect (keep alive)
|
||||
await browser.disconnect();
|
||||
|
||||
return response;
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Pattern 2: Reuse with Limits Check
|
||||
|
||||
```typescript
|
||||
async function getBrowserSafe(env: Env): Promise<Browser> {
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
const freeSession = sessions.find(s => !s.connectionId);
|
||||
|
||||
if (freeSession) {
|
||||
try {
|
||||
return await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
|
||||
} catch {
|
||||
// Continue to launch
|
||||
}
|
||||
}
|
||||
|
||||
// Check limits before launching
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
|
||||
if (limits.allowedBrowserAcquisitions === 0) {
|
||||
throw new Error(
|
||||
`Rate limit reached. Retry after ${limits.timeUntilNextAllowedBrowserAcquisition}ms`
|
||||
);
|
||||
}
|
||||
|
||||
return await puppeteer.launch(env.MYBROWSER);
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Pattern 3: Retry with Backoff
|
||||
|
||||
```typescript
|
||||
async function getBrowserWithRetry(
|
||||
env: Env,
|
||||
maxRetries = 3
|
||||
): Promise<Browser> {
|
||||
for (let i = 0; i < maxRetries; i++) {
|
||||
try {
|
||||
// Try existing session first
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
const freeSession = sessions.find(s => !s.connectionId);
|
||||
|
||||
if (freeSession) {
|
||||
try {
|
||||
return await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
|
||||
} catch {
|
||||
// Continue to launch
|
||||
}
|
||||
}
|
||||
|
||||
// Check limits
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
|
||||
if (limits.allowedBrowserAcquisitions > 0) {
|
||||
return await puppeteer.launch(env.MYBROWSER);
|
||||
}
|
||||
|
||||
// Rate limited, wait and retry
|
||||
if (i < maxRetries - 1) {
|
||||
      // Wait at least until a launch is allowed again, with exponential backoff
      const delay = Math.max(
        limits.timeUntilNextAllowedBrowserAcquisition,
        Math.pow(2, i) * 1000
      );
|
||||
await new Promise(resolve => setTimeout(resolve, delay));
|
||||
}
|
||||
} catch (error) {
|
||||
if (i === maxRetries - 1) throw error;
|
||||
}
|
||||
}
|
||||
|
||||
throw new Error("Failed to acquire browser after retries");
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Browser Timeout Management
|
||||
|
||||
### Default Timeout
|
||||
|
||||
Browsers close after **60 seconds of inactivity** (no devtools commands).
|
||||
|
||||
**Inactivity means:**
|
||||
- No `page.goto()`
|
||||
- No `page.screenshot()`
|
||||
- No `page.evaluate()`
|
||||
- No other browser/page operations
|
||||
|
||||
### Extending Timeout with keep_alive
|
||||
|
||||
```typescript
|
||||
const browser = await puppeteer.launch(env.MYBROWSER, {
|
||||
keep_alive: 300000 // 5 minutes = 300,000 ms
|
||||
});
|
||||
```
|
||||
|
||||
**Maximum:** 600,000ms (10 minutes)
|
||||
|
||||
**Use Cases:**
|
||||
- Long-running scraping workflows
|
||||
- Multi-step form automation
|
||||
- Session reuse across multiple requests
|
||||
|
||||
**Cost Impact:**
|
||||
- Longer keep_alive = more browser hours billed
|
||||
- Only extend if actually needed
|
||||
|
||||
---
|
||||
|
||||
## Incognito Browser Contexts
|
||||
|
||||
Use browser contexts to isolate cookies/cache while sharing a browser.
|
||||
|
||||
**Benefits:**
|
||||
- 1 concurrent browser instead of N
|
||||
- Separate cookies/cache per context
|
||||
- Test multi-user scenarios
|
||||
- Session isolation
|
||||
|
||||
**Example:**
|
||||
```typescript
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
// Create isolated contexts
|
||||
const context1 = await browser.createBrowserContext();
|
||||
const context2 = await browser.createBrowserContext();
|
||||
|
||||
// Each context has separate state
|
||||
const page1 = await context1.newPage();
|
||||
const page2 = await context2.newPage();
|
||||
|
||||
await page1.goto("https://app.example.com"); // User 1
|
||||
await page2.goto("https://app.example.com"); // User 2
|
||||
|
||||
// page1 and page2 have separate cookies
|
||||
await context1.close();
|
||||
await context2.close();
|
||||
await browser.close();
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Multiple Tabs vs Multiple Browsers
|
||||
|
||||
### ❌ Bad: Multiple Browsers
|
||||
|
||||
```typescript
|
||||
// Uses 10 concurrent browsers!
|
||||
for (const url of urls) {
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
const page = await browser.newPage();
|
||||
await page.goto(url);
|
||||
await browser.close();
|
||||
}
|
||||
```
|
||||
|
||||
**Problems:**
|
||||
- 10x concurrency usage
|
||||
- 10x cold start delays
|
||||
- May hit concurrency limits
|
||||
|
||||
---
|
||||
|
||||
### ✅ Good: Multiple Tabs
|
||||
|
||||
```typescript
|
||||
// Uses 1 concurrent browser
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
const results = await Promise.all(
|
||||
urls.map(async (url) => {
|
||||
const page = await browser.newPage();
|
||||
await page.goto(url);
|
||||
const data = await page.evaluate(() => ({
|
||||
title: document.title
|
||||
}));
|
||||
await page.close();
|
||||
return data;
|
||||
})
|
||||
);
|
||||
|
||||
await browser.close();
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- 1 concurrent browser (10x reduction)
|
||||
- Faster (no repeated cold starts)
|
||||
- Cheaper (reduced concurrency charges)
|
||||
|
||||
---
|
||||
|
||||
## Monitoring and Debugging
|
||||
|
||||
### Log Session Activity
|
||||
|
||||
```typescript
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
const sessionId = browser.sessionId();
|
||||
|
||||
console.log({
|
||||
event: "browser_launched",
|
||||
sessionId,
|
||||
timestamp: Date.now()
|
||||
});
|
||||
|
||||
// Do work
|
||||
|
||||
await browser.disconnect();
|
||||
|
||||
console.log({
|
||||
event: "browser_disconnected",
|
||||
sessionId,
|
||||
timestamp: Date.now()
|
||||
});
|
||||
```
|
||||
|
||||
### Track Session Metrics
|
||||
|
||||
```typescript
|
||||
interface SessionMetrics {
|
||||
sessionId: string;
|
||||
launched: boolean; // true = new, false = reused
|
||||
duration: number; // ms
|
||||
operations: number; // page navigations
|
||||
}
|
||||
|
||||
async function trackSession(env: Env, fn: (browser: Browser) => Promise<void>) {
|
||||
const start = Date.now();
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
const freeSession = sessions.find(s => !s.connectionId);
|
||||
|
||||
let browser: Browser;
|
||||
let launched: boolean;
|
||||
|
||||
  if (freeSession) {
    try {
      browser = await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
      launched = false;
    } catch {
      // Session closed between sessions() and connect() - fall back to a new launch
      browser = await puppeteer.launch(env.MYBROWSER);
      launched = true;
    }
  } else {
    browser = await puppeteer.launch(env.MYBROWSER);
    launched = true;
  }
|
||||
|
||||
await fn(browser);
|
||||
|
||||
const metrics: SessionMetrics = {
|
||||
sessionId: browser.sessionId(),
|
||||
launched,
|
||||
duration: Date.now() - start,
|
||||
operations: 1 // Track actual operations in production
|
||||
};
|
||||
|
||||
await browser.disconnect();
|
||||
|
||||
return metrics;
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Production Best Practices
|
||||
|
||||
1. **Always Check Limits**
|
||||
- Call `puppeteer.limits()` before launching
|
||||
- Handle rate limit errors gracefully
|
||||
- Implement retry with backoff
|
||||
|
||||
2. **Prefer Session Reuse**
|
||||
- Try `puppeteer.connect()` first
|
||||
- Fall back to `puppeteer.launch()` only if needed
|
||||
- Use `browser.disconnect()` instead of `browser.close()`
|
||||
|
||||
3. **Use Multiple Tabs**
|
||||
- One browser, many tabs
|
||||
- Reduces concurrency usage 10-50x
|
||||
- Faster than multiple browsers
|
||||
|
||||
4. **Set Appropriate Timeouts**
|
||||
- Default 60s is fine for most use cases
|
||||
- Extend only if actually needed (keep_alive)
|
||||
- Remember: longer timeout = more billable hours
|
||||
|
||||
5. **Handle Errors**
|
||||
- Always `browser.close()` on errors
|
||||
- Wrap `puppeteer.connect()` in try-catch
|
||||
- Gracefully handle rate limits
|
||||
|
||||
6. **Monitor Usage**
|
||||
- Log session IDs
|
||||
- Track reuse rate
|
||||
- Monitor concurrency in dashboard
|
||||
|
||||
7. **Use Incognito Contexts**
|
||||
- Isolate sessions while sharing browser
|
||||
- Better than multiple browsers
|
||||
- Test multi-user scenarios safely
|
||||
|
||||
---
|
||||
|
||||
## Cost Optimization
|
||||
|
||||
### Scenario: Screenshot Service (1000 requests/hour)
|
||||
|
||||
**Bad Approach (No Session Reuse):**
- Launch a new browser for every request: 1,000 launches/hour
- Each request pays a ~2-3 second cold start on top of ~5 seconds of work, and the cold start is billable running time too
- Bursty launches create many short-lived overlapping browsers, inflating daily peak concurrency, and can run into the 30 new-browsers-per-minute limit

**Good Approach (Session Reuse):**
- Keep a small, right-sized pool of warm browsers (2-5 for this load) and reuse sessions across requests
- Requests arrive every few seconds, so reused browsers stay busy instead of idling out, and no cold start is paid per request
- Daily peak concurrency stays at the pool size, well below the 10 browsers included on the paid plan

**Result**: faster responses (no cold starts), predictable concurrency with no overage, launches that stay far below the rate limits, and modestly fewer billable browser hours.
|
||||
|
||||
---
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Issue: "Failed to connect to session"
|
||||
|
||||
**Cause:** Session closed between `.sessions()` and `.connect()` calls
|
||||
|
||||
**Solution:**
|
||||
```typescript
|
||||
const freeSession = sessions.find(s => !s.connectionId);
|
||||
if (freeSession) {
|
||||
try {
|
||||
return await puppeteer.connect(env.MYBROWSER, freeSession.sessionId);
|
||||
} catch (error) {
|
||||
console.log("Session closed, launching new browser");
|
||||
return await puppeteer.launch(env.MYBROWSER);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Issue: Sessions timing out too quickly
|
||||
|
||||
**Cause:** Default 60s idle timeout
|
||||
|
||||
**Solution:** Extend with keep_alive:
|
||||
```typescript
|
||||
const browser = await puppeteer.launch(env.MYBROWSER, {
|
||||
keep_alive: 300000 // 5 minutes
|
||||
});
|
||||
```
|
||||
|
||||
### Issue: Rate limit reached
|
||||
|
||||
**Cause:** Too many concurrent browsers or launches per minute
|
||||
|
||||
**Solution:** Check limits before launching:
|
||||
```typescript
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
if (limits.allowedBrowserAcquisitions === 0) {
|
||||
return new Response("Rate limit reached", { status: 429 });
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Reference
|
||||
|
||||
- **Official Docs**: https://developers.cloudflare.com/browser-rendering/workers-bindings/reuse-sessions/
|
||||
- **Limits**: https://developers.cloudflare.com/browser-rendering/platform/limits/
|
||||
- **Pricing**: https://developers.cloudflare.com/browser-rendering/platform/pricing/
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: 2025-10-22
|
||||
53
scripts/check-versions.sh
Executable file
@@ -0,0 +1,53 @@
|
||||
#!/bin/bash
|
||||
# check-versions.sh
|
||||
# Verify package versions for Cloudflare Browser Rendering skill
|
||||
|
||||
set -e
|
||||
|
||||
echo "Checking Cloudflare Browser Rendering package versions..."
|
||||
echo ""
|
||||
|
||||
# Function to check package version
|
||||
check_package() {
|
||||
local package=$1
|
||||
local current=$2
|
||||
|
||||
echo "📦 $package"
|
||||
echo " Current in skill: $current"
|
||||
|
||||
if command -v npm &> /dev/null; then
|
||||
latest=$(npm view $package version 2>/dev/null || echo "N/A")
|
||||
echo " Latest on npm: $latest"
|
||||
|
||||
if [ "$current" != "$latest" ] && [ "$latest" != "N/A" ]; then
|
||||
echo " ⚠️ Update available!"
|
||||
else
|
||||
echo " ✅ Up to date"
|
||||
fi
|
||||
else
|
||||
echo " ⚠️ npm not found, skipping latest version check"
|
||||
fi
|
||||
|
||||
echo ""
|
||||
}
|
||||
|
||||
echo "=== Core Packages ==="
|
||||
echo ""
|
||||
|
||||
check_package "@cloudflare/puppeteer" "1.0.4"
|
||||
check_package "@cloudflare/playwright" "1.0.0"
|
||||
|
||||
echo "=== Related Packages ==="
|
||||
echo ""
|
||||
|
||||
check_package "wrangler" "4.43.0"
|
||||
check_package "@cloudflare/workers-types" "4.20251014.0"
|
||||
|
||||
echo "=== Verification Complete ==="
|
||||
echo ""
|
||||
echo "To update a package version in this skill:"
|
||||
echo "1. Update the version in SKILL.md"
|
||||
echo "2. Update templates if API changes"
|
||||
echo "3. Test all template files"
|
||||
echo "4. Update 'Last Updated' date"
|
||||
echo "5. Commit changes"
|
||||
139
templates/ai-enhanced-scraper.ts
Normal file
@@ -0,0 +1,139 @@
|
||||
// AI-Enhanced Web Scraper
|
||||
// Combine Browser Rendering with Workers AI to extract structured data intelligently
|
||||
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
interface Env {
|
||||
MYBROWSER: Fetcher;
|
||||
AI: Ai;
|
||||
}
|
||||
|
||||
interface ProductData {
|
||||
name: string;
|
||||
price: string;
|
||||
description: string;
|
||||
availability: string;
|
||||
[key: string]: any;
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const { searchParams } = new URL(request.url);
|
||||
const url = searchParams.get("url");
|
||||
|
||||
if (!url) {
|
||||
return new Response("Missing ?url parameter", { status: 400 });
|
||||
}
|
||||
|
||||
// Step 1: Scrape page content with browser
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
try {
|
||||
const page = await browser.newPage();
|
||||
|
||||
await page.goto(url, {
|
||||
waitUntil: "networkidle0",
|
||||
timeout: 30000,
|
||||
});
|
||||
|
||||
// Extract raw HTML content
|
||||
const bodyContent = await page.$eval("body", (el) => el.innerHTML);
|
||||
|
||||
await browser.close();
|
||||
|
||||
// Truncate to fit AI context (4000 chars)
|
||||
const truncatedContent = bodyContent.slice(0, 4000);
|
||||
|
||||
// Step 2: Extract structured data with AI
|
||||
const aiResponse = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
|
||||
messages: [
|
||||
{
|
||||
role: "system",
|
||||
content:
|
||||
"You are a data extraction assistant. Extract product information from HTML and return ONLY valid JSON.",
|
||||
},
|
||||
{
|
||||
role: "user",
|
||||
content: `Extract product information from this HTML. Return JSON with these fields: name, price, description, availability. If any field is not found, use empty string.\n\nHTML:\n${truncatedContent}`,
|
||||
},
|
||||
],
|
||||
stream: false,
|
||||
});
|
||||
|
||||
// Parse AI response
|
||||
let productData: ProductData;
|
||||
try {
|
||||
const responseText = (aiResponse as any).response;
|
||||
// Try to extract JSON from response (AI might wrap it in markdown)
|
||||
const jsonMatch = responseText.match(/\{[\s\S]*\}/);
|
||||
if (jsonMatch) {
|
||||
productData = JSON.parse(jsonMatch[0]);
|
||||
} else {
|
||||
productData = JSON.parse(responseText);
|
||||
}
|
||||
} catch {
|
||||
productData = {
|
||||
name: "",
|
||||
price: "",
|
||||
description: "",
|
||||
availability: "",
|
||||
raw: (aiResponse as any).response,
|
||||
};
|
||||
}
|
||||
|
||||
return Response.json({
|
||||
url,
|
||||
product: productData,
|
||||
extractedAt: new Date().toISOString(),
|
||||
});
|
||||
} catch (error) {
|
||||
// Browser may already be closed if the error happened after scraping
await browser.close().catch(() => {});
|
||||
return Response.json(
|
||||
{
|
||||
error: error instanceof Error ? error.message : "AI-enhanced scraping failed",
|
||||
},
|
||||
{ status: 500 }
|
||||
);
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Setup:
|
||||
* Add AI binding to wrangler.jsonc:
|
||||
* {
|
||||
* "browser": { "binding": "MYBROWSER" },
|
||||
* "ai": { "binding": "AI" },
|
||||
* "compatibility_flags": ["nodejs_compat"]
|
||||
* }
|
||||
*
|
||||
* Usage:
|
||||
* GET /?url=https://example.com/product
|
||||
*
|
||||
* Response:
|
||||
* {
|
||||
* "url": "https://example.com/product",
|
||||
* "product": {
|
||||
* "name": "Example Product",
|
||||
* "price": "$99.99",
|
||||
* "description": "Product description...",
|
||||
* "availability": "In Stock"
|
||||
* },
|
||||
* "extractedAt": "2025-10-22T12:34:56.789Z"
|
||||
* }
|
||||
*
|
||||
* Benefits:
|
||||
* - No need to write custom CSS selectors for each site
|
||||
* - AI adapts to different page structures
|
||||
* - Extracts semantic information, not just raw HTML
|
||||
* - Handles variations in HTML structure
|
||||
*
|
||||
* Limitations:
|
||||
* - AI context limited to ~4000 chars of HTML
|
||||
* - May hallucinate if data not present
|
||||
* - Requires AI binding (uses neurons quota)
|
||||
*
|
||||
* See also:
|
||||
* - cloudflare-workers-ai skill for more AI patterns
|
||||
* - web-scraper-basic.ts for traditional CSS selector approach
|
||||
*/
|
||||
76
templates/basic-screenshot.ts
Normal file
@@ -0,0 +1,76 @@
|
||||
// Basic Screenshot Example
|
||||
// Minimal example for taking screenshots with Cloudflare Browser Rendering
|
||||
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
interface Env {
|
||||
MYBROWSER: Fetcher;
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const { searchParams } = new URL(request.url);
|
||||
const url = searchParams.get("url");
|
||||
|
||||
if (!url) {
|
||||
return new Response("Missing ?url parameter. Example: ?url=https://example.com", {
|
||||
status: 400,
|
||||
});
|
||||
}
|
||||
|
||||
let normalizedUrl: string;
|
||||
try {
|
||||
normalizedUrl = new URL(url).toString();
|
||||
} catch {
|
||||
return new Response("Invalid URL", { status: 400 });
|
||||
}
|
||||
|
||||
// Launch browser
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
try {
|
||||
// Create new page
|
||||
const page = await browser.newPage();
|
||||
|
||||
// Navigate to URL
|
||||
await page.goto(normalizedUrl, {
|
||||
waitUntil: "networkidle0", // Wait for network to be idle
|
||||
timeout: 30000, // 30 second timeout
|
||||
});
|
||||
|
||||
// Take screenshot
|
||||
const screenshot = await page.screenshot({
|
||||
fullPage: true, // Capture full scrollable page
|
||||
type: "png",
|
||||
});
|
||||
|
||||
// Clean up
|
||||
await browser.close();
|
||||
|
||||
return new Response(screenshot, {
|
||||
headers: {
|
||||
"content-type": "image/png",
|
||||
"cache-control": "public, max-age=3600", // Cache for 1 hour
|
||||
},
|
||||
});
|
||||
} catch (error) {
|
||||
// Always close browser on error
|
||||
await browser.close();
|
||||
throw error;
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Deploy:
|
||||
* npx wrangler deploy
|
||||
*
|
||||
* Test:
|
||||
* https://your-worker.workers.dev/?url=https://example.com
|
||||
*
|
||||
* Configuration (wrangler.jsonc):
|
||||
* {
|
||||
* "browser": { "binding": "MYBROWSER" },
|
||||
* "compatibility_flags": ["nodejs_compat"]
|
||||
* }
|
||||
*/
|
||||
127
templates/pdf-generation.ts
Normal file
@@ -0,0 +1,127 @@
|
||||
// PDF Generation
|
||||
// Generate PDFs from URLs or custom HTML content
|
||||
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
interface Env {
|
||||
MYBROWSER: Fetcher;
|
||||
}
|
||||
|
||||
interface PDFRequest {
|
||||
url?: string;
|
||||
html?: string;
|
||||
options?: {
|
||||
format?: "Letter" | "A4" | "A3" | "Legal";
|
||||
landscape?: boolean;
|
||||
margin?: {
|
||||
top?: string;
|
||||
right?: string;
|
||||
bottom?: string;
|
||||
left?: string;
|
||||
};
|
||||
};
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
if (request.method !== "POST") {
|
||||
return new Response("Method not allowed. Use POST with JSON body.", {
|
||||
status: 405,
|
||||
});
|
||||
}
|
||||
|
||||
const body = await request.json<PDFRequest>();
|
||||
const { url, html, options = {} } = body;
|
||||
|
||||
if (!url && !html) {
|
||||
return new Response('Missing "url" or "html" in request body', {
|
||||
status: 400,
|
||||
});
|
||||
}
|
||||
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
try {
|
||||
const page = await browser.newPage();
|
||||
|
||||
// Load content
|
||||
if (html) {
|
||||
await page.setContent(html, { waitUntil: "networkidle0" });
|
||||
} else if (url) {
|
||||
await page.goto(url, {
|
||||
waitUntil: "networkidle0",
|
||||
timeout: 30000,
|
||||
});
|
||||
}
|
||||
|
||||
// Generate PDF
|
||||
const pdf = await page.pdf({
|
||||
format: options.format || "A4",
|
||||
landscape: options.landscape || false,
|
||||
printBackground: true, // Include background colors/images
|
||||
margin: options.margin || {
|
||||
top: "1cm",
|
||||
right: "1cm",
|
||||
bottom: "1cm",
|
||||
left: "1cm",
|
||||
},
|
||||
});
|
||||
|
||||
await browser.close();
|
||||
|
||||
// Generate filename
|
||||
const filename = url
|
||||
? `${new URL(url).hostname.replace(/\./g, "_")}.pdf`
|
||||
: "document.pdf";
|
||||
|
||||
return new Response(pdf, {
|
||||
headers: {
|
||||
"content-type": "application/pdf",
|
||||
"content-disposition": `attachment; filename="${filename}"`,
|
||||
},
|
||||
});
|
||||
} catch (error) {
|
||||
await browser.close();
|
||||
return new Response(
|
||||
JSON.stringify({
|
||||
error: error instanceof Error ? error.message : "PDF generation failed",
|
||||
}),
|
||||
{
|
||||
status: 500,
|
||||
headers: { "content-type": "application/json" },
|
||||
}
|
||||
);
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Usage Examples:
|
||||
*
|
||||
* 1. PDF from URL:
|
||||
* POST /
|
||||
* Content-Type: application/json
|
||||
* {
|
||||
* "url": "https://example.com"
|
||||
* }
|
||||
*
|
||||
* 2. PDF from custom HTML:
|
||||
* POST /
|
||||
* {
|
||||
* "html": "<!DOCTYPE html><html><body><h1>Invoice</h1></body></html>"
|
||||
* }
|
||||
*
|
||||
* 3. PDF with custom options:
|
||||
* POST /
|
||||
* {
|
||||
* "url": "https://example.com",
|
||||
* "options": {
|
||||
* "format": "Letter",
|
||||
* "landscape": true,
|
||||
* "margin": {
|
||||
* "top": "2cm",
|
||||
* "bottom": "2cm"
|
||||
* }
|
||||
* }
|
||||
* }
|
||||
*/
|
||||
99
templates/playwright-example.ts
Normal file
@@ -0,0 +1,99 @@
|
||||
// Playwright Example
|
||||
// Alternative to Puppeteer using @cloudflare/playwright
|
||||
|
||||
import { chromium } from "@cloudflare/playwright";
|
||||
|
||||
interface Env {
|
||||
BROWSER: Fetcher;
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const { searchParams } = new URL(request.url);
|
||||
const url = searchParams.get("url") || "https://example.com";
|
||||
|
||||
// Launch browser (note: chromium.launch instead of puppeteer.launch)
|
||||
const browser = await chromium.launch(env.BROWSER);
|
||||
|
||||
try {
|
||||
// Create new page
|
||||
const page = await browser.newPage();
|
||||
|
||||
// Navigate to URL
|
||||
await page.goto(url, {
|
||||
waitUntil: "networkidle",
|
||||
timeout: 30000,
|
||||
});
|
||||
|
||||
// Take screenshot
|
||||
const screenshot = await page.screenshot({
|
||||
fullPage: true,
|
||||
type: "png",
|
||||
});
|
||||
|
||||
// Clean up
|
||||
await browser.close();
|
||||
|
||||
return new Response(screenshot, {
|
||||
headers: {
|
||||
"content-type": "image/png",
|
||||
"cache-control": "public, max-age=3600",
|
||||
},
|
||||
});
|
||||
} catch (error) {
|
||||
await browser.close();
|
||||
return new Response(
|
||||
JSON.stringify({
|
||||
error: error instanceof Error ? error.message : "Screenshot failed",
|
||||
}),
|
||||
{
|
||||
status: 500,
|
||||
headers: { "content-type": "application/json" },
|
||||
}
|
||||
);
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Playwright vs Puppeteer:
|
||||
*
|
||||
* Similarities:
|
||||
* - Very similar API (page.goto, page.screenshot, etc.)
|
||||
* - Both support Chromium on Workers
|
||||
* - Same use cases (screenshots, PDFs, scraping)
|
||||
*
|
||||
* Differences:
|
||||
*
|
||||
* | Feature | Puppeteer | Playwright |
|
||||
* |---------|-----------|------------|
|
||||
* | Import | `import puppeteer from "@cloudflare/puppeteer"` | `import { chromium } from "@cloudflare/playwright"` |
|
||||
* | Launch | `puppeteer.launch(env.MYBROWSER)` | `chromium.launch(env.BROWSER)` |
|
||||
* | Session Management | ✅ Advanced (sessions, history, limits) | ⚠️ Basic |
|
||||
* | Auto-waiting | Manual waitForSelector() | Built-in auto-waiting |
|
||||
* | Selectors | CSS only | CSS, text, XPath (via workaround) |
|
||||
* | Version | @cloudflare/puppeteer@1.0.4 | @cloudflare/playwright@1.0.0 |
|
||||
*
|
||||
* When to use Playwright:
|
||||
* - Already using Playwright for testing
|
||||
* - Prefer auto-waiting behavior
|
||||
* - Don't need advanced session management
|
||||
*
|
||||
* When to use Puppeteer:
|
||||
* - Need session reuse for performance
|
||||
* - Want to check limits before launching
|
||||
* - More familiar with Puppeteer API
|
||||
*
|
||||
* Installation:
|
||||
* npm install @cloudflare/playwright
|
||||
*
|
||||
* Configuration (wrangler.jsonc):
|
||||
* {
|
||||
* "browser": { "binding": "BROWSER" },
|
||||
* "compatibility_flags": ["nodejs_compat"]
|
||||
* }
|
||||
*
|
||||
* Recommendation:
|
||||
* Stick with Puppeteer for most use cases unless you have
|
||||
* existing Playwright tests to migrate.
|
||||
*/
|
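// A small, hedged sketch of the auto-waiting difference noted above.
// Assumes the same chromium import and Env/BROWSER binding as this file;
// the URL and selector are illustrative.
export async function clickFirstLink(env: Env): Promise<void> {
  const browser = await chromium.launch(env.BROWSER);
  try {
    const page = await browser.newPage();
    await page.goto("https://example.com", { waitUntil: "networkidle" });
    // Playwright locators auto-wait until the element is attached and actionable,
    // so no explicit waitForSelector() is needed before interacting.
    await page.locator("a").first().click();
  } finally {
    await browser.close();
  }
}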
||||
107
templates/screenshot-with-kv-cache.ts
Normal file
@@ -0,0 +1,107 @@
|
||||
// Screenshot with KV Caching
|
||||
// Production-ready screenshot service with KV caching to reduce browser usage
|
||||
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
interface Env {
|
||||
MYBROWSER: Fetcher;
|
||||
SCREENSHOT_CACHE: KVNamespace;
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const { searchParams } = new URL(request.url);
|
||||
const url = searchParams.get("url");
|
||||
const refresh = searchParams.get("refresh") === "true";
|
||||
|
||||
if (!url) {
|
||||
return new Response("Missing ?url parameter", { status: 400 });
|
||||
}
|
||||
|
||||
const normalizedUrl = new URL(url).toString();
|
||||
|
||||
// Check cache (unless refresh requested)
|
||||
if (!refresh) {
|
||||
const cached = await env.SCREENSHOT_CACHE.get(normalizedUrl, {
|
||||
type: "arrayBuffer",
|
||||
});
|
||||
|
||||
if (cached) {
|
||||
return new Response(cached, {
|
||||
headers: {
|
||||
"content-type": "image/png",
|
||||
"x-cache": "HIT",
|
||||
"cache-control": "public, max-age=3600",
|
||||
},
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
// Generate screenshot
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
try {
|
||||
const page = await browser.newPage();
|
||||
|
||||
await page.goto(normalizedUrl, {
|
||||
waitUntil: "networkidle0",
|
||||
timeout: 30000,
|
||||
});
|
||||
|
||||
const screenshot = await page.screenshot({
|
||||
fullPage: true,
|
||||
type: "png",
|
||||
});
|
||||
|
||||
await browser.close();
|
||||
|
||||
// Cache for 24 hours
|
||||
await env.SCREENSHOT_CACHE.put(normalizedUrl, screenshot, {
|
||||
expirationTtl: 60 * 60 * 24, // 24 hours
|
||||
});
|
||||
|
||||
return new Response(screenshot, {
|
||||
headers: {
|
||||
"content-type": "image/png",
|
||||
"x-cache": "MISS",
|
||||
"cache-control": "public, max-age=3600",
|
||||
},
|
||||
});
|
||||
} catch (error) {
|
||||
await browser.close();
|
||||
return new Response(
|
||||
JSON.stringify({
|
||||
error: error instanceof Error ? error.message : "Screenshot failed",
|
||||
}),
|
||||
{
|
||||
status: 500,
|
||||
headers: { "content-type": "application/json" },
|
||||
}
|
||||
);
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Setup:
|
||||
* 1. Create KV namespace:
|
||||
* npx wrangler kv namespace create SCREENSHOT_CACHE
|
||||
* npx wrangler kv namespace create SCREENSHOT_CACHE --preview
|
||||
*
|
||||
* 2. Add to wrangler.jsonc:
|
||||
* {
|
||||
* "browser": { "binding": "MYBROWSER" },
|
||||
* "compatibility_flags": ["nodejs_compat"],
|
||||
* "kv_namespaces": [
|
||||
* {
|
||||
* "binding": "SCREENSHOT_CACHE",
|
||||
* "id": "YOUR_KV_ID",
|
||||
* "preview_id": "YOUR_PREVIEW_ID"
|
||||
* }
|
||||
* ]
|
||||
* }
|
||||
*
|
||||
* Usage:
|
||||
* New screenshot: ?url=https://example.com
|
||||
* Force refresh: ?url=https://example.com&refresh=true
|
||||
*/
|
||||
118
templates/session-reuse.ts
Normal file
@@ -0,0 +1,118 @@
|
||||
// Session Reuse Pattern
|
||||
// Optimize performance by reusing browser sessions instead of launching new ones
|
||||
|
||||
import puppeteer, { Browser } from "@cloudflare/puppeteer";
|
||||
|
||||
interface Env {
|
||||
MYBROWSER: Fetcher;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get or create a browser instance
|
||||
* Tries to connect to existing session first, launches new one if needed
|
||||
*/
|
||||
async function getBrowser(env: Env): Promise<{ browser: Browser; launched: boolean }> {
|
||||
// Check for available sessions
|
||||
const sessions = await puppeteer.sessions(env.MYBROWSER);
|
||||
|
||||
// Find sessions without active connections
|
||||
const freeSessions = sessions.filter((s) => !s.connectionId);
|
||||
|
||||
if (freeSessions.length > 0) {
|
||||
// Try to connect to existing session
|
||||
try {
|
||||
console.log("Connecting to existing session:", freeSessions[0].sessionId);
|
||||
const browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
|
||||
return { browser, launched: false };
|
||||
} catch (error) {
|
||||
console.log("Failed to connect, launching new browser:", error);
|
||||
}
|
||||
}
|
||||
|
||||
// Check limits before launching
|
||||
const limits = await puppeteer.limits(env.MYBROWSER);
|
||||
if (limits.allowedBrowserAcquisitions === 0) {
|
||||
throw new Error(
|
||||
`Rate limit reached. Retry after ${limits.timeUntilNextAllowedBrowserAcquisition}ms`
|
||||
);
|
||||
}
|
||||
|
||||
// Launch new session
|
||||
console.log("Launching new browser session");
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
return { browser, launched: true };
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const { searchParams } = new URL(request.url);
|
||||
const url = searchParams.get("url") || "https://example.com";
|
||||
|
||||
try {
|
||||
// Get or create browser
|
||||
const { browser, launched } = await getBrowser(env);
|
||||
const sessionId = browser.sessionId();
|
||||
|
||||
console.log({
|
||||
sessionId,
|
||||
launched,
|
||||
message: launched ? "New browser launched" : "Reused existing session",
|
||||
});
|
||||
|
||||
// Do work
|
||||
const page = await browser.newPage();
|
||||
await page.goto(url, {
|
||||
waitUntil: "networkidle0",
|
||||
timeout: 30000,
|
||||
});
|
||||
|
||||
const screenshot = await page.screenshot();
|
||||
await page.close();
|
||||
|
||||
// IMPORTANT: Disconnect (don't close) to keep session alive for reuse
|
||||
await browser.disconnect();
|
||||
|
||||
return new Response(screenshot, {
|
||||
headers: {
|
||||
"content-type": "image/png",
|
||||
"x-session-id": sessionId,
|
||||
"x-session-reused": launched ? "false" : "true",
|
||||
},
|
||||
});
|
||||
} catch (error) {
|
||||
return new Response(
|
||||
JSON.stringify({
|
||||
error: error instanceof Error ? error.message : "Unknown error",
|
||||
}),
|
||||
{
|
||||
status: 500,
|
||||
headers: { "content-type": "application/json" },
|
||||
}
|
||||
);
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Key Concepts:
|
||||
*
|
||||
* 1. puppeteer.sessions() - List all active sessions
|
||||
* 2. puppeteer.connect() - Connect to existing session
|
||||
* 3. browser.disconnect() - Disconnect WITHOUT closing (keeps session alive)
|
||||
* 4. browser.close() - Terminate session completely
|
||||
* 5. puppeteer.limits() - Check rate limits before launching
|
||||
*
|
||||
* Benefits:
|
||||
* - Faster response times (no cold start)
|
||||
* - Lower concurrency usage
|
||||
* - Better resource utilization
|
||||
*
|
||||
* Trade-offs:
|
||||
* - Sessions time out after 60s idle (extend with keep_alive)
|
||||
* - Must handle connection failures gracefully
|
||||
* - Need to track which sessions are available
|
||||
*
|
||||
* Response Headers:
|
||||
* - x-session-id: Browser session ID
|
||||
* - x-session-reused: true if reused existing session
|
||||
*/
|
||||
116
templates/web-scraper-basic.ts
Normal file
@@ -0,0 +1,116 @@
|
||||
// Basic Web Scraper
|
||||
// Extract structured data from web pages
|
||||
|
||||
import puppeteer from "@cloudflare/puppeteer";
|
||||
|
||||
interface Env {
|
||||
MYBROWSER: Fetcher;
|
||||
}
|
||||
|
||||
interface ScrapedData {
|
||||
url: string;
|
||||
title: string;
|
||||
description: string;
|
||||
headings: string[];
|
||||
links: Array<{ text: string; href: string }>;
|
||||
images: Array<{ alt: string; src: string }>;
|
||||
timestamp: string;
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
const { searchParams } = new URL(request.url);
|
||||
const url = searchParams.get("url");
|
||||
|
||||
if (!url) {
|
||||
return new Response("Missing ?url parameter", { status: 400 });
|
||||
}
|
||||
|
||||
const normalizedUrl = new URL(url).toString();
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
try {
|
||||
const page = await browser.newPage();
|
||||
|
||||
// Navigate to page
|
||||
await page.goto(normalizedUrl, {
|
||||
waitUntil: "networkidle0",
|
||||
timeout: 30000,
|
||||
});
|
||||
|
||||
// Wait for body to be present
|
||||
await page.waitForSelector("body");
|
||||
|
||||
// Extract structured data
|
||||
const data = await page.evaluate((): ScrapedData => {
|
||||
// Get all headings
|
||||
const headings = Array.from(document.querySelectorAll("h1, h2, h3")).map(
|
||||
(el) => el.textContent?.trim() || ""
|
||||
);
|
||||
|
||||
// Get all links
|
||||
const links = Array.from(document.querySelectorAll("a"))
|
||||
.filter((a) => a.href)
|
||||
.map((a) => ({
|
||||
text: a.textContent?.trim() || "",
|
||||
href: a.href,
|
||||
}))
|
||||
.slice(0, 50); // Limit to first 50 links
|
||||
|
||||
// Get all images
|
||||
const images = Array.from(document.querySelectorAll("img"))
|
||||
.filter((img) => img.src)
|
||||
.map((img) => ({
|
||||
alt: img.alt || "",
|
||||
src: img.src,
|
||||
}))
|
||||
.slice(0, 20); // Limit to first 20 images
|
||||
|
||||
return {
|
||||
url: window.location.href,
|
||||
title: document.title,
|
||||
description:
|
||||
document.querySelector('meta[name="description"]')?.getAttribute("content") ||
|
||||
"",
|
||||
headings,
|
||||
links,
|
||||
images,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
});
|
||||
|
||||
await browser.close();
|
||||
|
||||
return Response.json(data, {
|
||||
headers: {
|
||||
"cache-control": "public, max-age=3600",
|
||||
},
|
||||
});
|
||||
} catch (error) {
|
||||
await browser.close();
|
||||
return Response.json(
|
||||
{
|
||||
error: error instanceof Error ? error.message : "Scraping failed",
|
||||
url: normalizedUrl,
|
||||
},
|
||||
{ status: 500 }
|
||||
);
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Usage:
|
||||
* GET /?url=https://example.com
|
||||
*
|
||||
* Response:
|
||||
* {
|
||||
* "url": "https://example.com",
|
||||
* "title": "Example Domain",
|
||||
* "description": "...",
|
||||
* "headings": ["Example Domain"],
|
||||
* "links": [{ "text": "More information...", "href": "..." }],
|
||||
* "images": [],
|
||||
* "timestamp": "2025-10-22T12:34:56.789Z"
|
||||
* }
|
||||
*/
|
||||
138
templates/web-scraper-batch.ts
Normal file
@@ -0,0 +1,138 @@
|
||||
// Batch Web Scraper
|
||||
// Scrape multiple URLs efficiently using browser tabs
|
||||
|
||||
import puppeteer, { Browser } from "@cloudflare/puppeteer";
|
||||
|
||||
interface Env {
|
||||
MYBROWSER: Fetcher;
|
||||
}
|
||||
|
||||
interface ScrapeResult {
|
||||
url: string;
|
||||
success: boolean;
|
||||
data?: {
|
||||
title: string;
|
||||
description: string;
|
||||
textContent: string; // First 500 chars
|
||||
};
|
||||
error?: string;
|
||||
}
|
||||
|
||||
async function scrapePage(browser: Browser, url: string): Promise<ScrapeResult> {
|
||||
const page = await browser.newPage();
|
||||
|
||||
try {
|
||||
await page.goto(url, {
|
||||
waitUntil: "networkidle0",
|
||||
timeout: 30000,
|
||||
});
|
||||
|
||||
const data = await page.evaluate(() => ({
|
||||
title: document.title,
|
||||
description:
|
||||
document.querySelector('meta[name="description"]')?.getAttribute("content") ||
|
||||
"",
|
||||
textContent: document.body.innerText.slice(0, 500), // First 500 chars
|
||||
}));
|
||||
|
||||
await page.close();
|
||||
|
||||
return {
|
||||
url,
|
||||
success: true,
|
||||
data,
|
||||
};
|
||||
} catch (error) {
|
||||
await page.close();
|
||||
|
||||
return {
|
||||
url,
|
||||
success: false,
|
||||
error: error instanceof Error ? error.message : "Unknown error",
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
export default {
|
||||
async fetch(request: Request, env: Env): Promise<Response> {
|
||||
if (request.method !== "POST") {
|
||||
return new Response("Method not allowed. Use POST with JSON body.", {
|
||||
status: 405,
|
||||
});
|
||||
}
|
||||
|
||||
const { urls } = await request.json<{ urls: string[] }>();
|
||||
|
||||
if (!urls || !Array.isArray(urls) || urls.length === 0) {
|
||||
return new Response('Missing "urls" array in request body', {
|
||||
status: 400,
|
||||
});
|
||||
}
|
||||
|
||||
// Limit batch size
|
||||
if (urls.length > 20) {
|
||||
return new Response("Maximum 20 URLs per batch", { status: 400 });
|
||||
}
|
||||
|
||||
// Launch single browser
|
||||
const browser = await puppeteer.launch(env.MYBROWSER);
|
||||
|
||||
try {
|
||||
// Scrape all URLs in parallel (each in its own tab)
|
||||
const results = await Promise.all(urls.map((url) => scrapePage(browser, url)));
|
||||
|
||||
await browser.close();
|
||||
|
||||
const summary = {
|
||||
total: results.length,
|
||||
successful: results.filter((r) => r.success).length,
|
||||
failed: results.filter((r) => !r.success).length,
|
||||
};
|
||||
|
||||
return Response.json({
|
||||
summary,
|
||||
results,
|
||||
});
|
||||
} catch (error) {
|
||||
await browser.close();
|
||||
return Response.json(
|
||||
{
|
||||
error: error instanceof Error ? error.message : "Batch scraping failed",
|
||||
},
|
||||
{ status: 500 }
|
||||
);
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* Usage:
|
||||
* POST /
|
||||
* Content-Type: application/json
|
||||
* {
|
||||
* "urls": [
|
||||
* "https://example.com",
|
||||
* "https://example.org",
|
||||
* "https://example.net"
|
||||
* ]
|
||||
* }
|
||||
*
|
||||
* Response:
|
||||
* {
|
||||
* "summary": {
|
||||
* "total": 3,
|
||||
* "successful": 3,
|
||||
* "failed": 0
|
||||
* },
|
||||
* "results": [
|
||||
* {
|
||||
* "url": "https://example.com",
|
||||
* "success": true,
|
||||
* "data": { "title": "...", "description": "...", "textContent": "..." }
|
||||
* }
|
||||
* ]
|
||||
* }
|
||||
*
|
||||
* Note: Uses 1 browser with multiple tabs instead of multiple browsers.
|
||||
* This reduces concurrency usage and is more efficient.
|
||||
*/
|
||||
116
templates/wrangler-browser-config.jsonc
Normal file
@@ -0,0 +1,116 @@
|
||||
// Complete wrangler.jsonc configuration for Browser Rendering
|
||||
{
|
||||
"name": "browser-worker",
|
||||
"main": "src/index.ts",
|
||||
"compatibility_date": "2023-03-14",
|
||||
|
||||
// REQUIRED: nodejs_compat flag for Browser Rendering
|
||||
"compatibility_flags": [
|
||||
"nodejs_compat"
|
||||
],
|
||||
|
||||
// Browser binding (required)
|
||||
"browser": {
|
||||
"binding": "MYBROWSER"
|
||||
// Optional: Use real headless browser during local development
|
||||
// "remote": true
|
||||
},
|
||||
|
||||
// Optional: KV for caching screenshots/PDFs
|
||||
// Create with: npx wrangler kv namespace create SCREENSHOT_CACHE
|
||||
// npx wrangler kv namespace create SCREENSHOT_CACHE --preview
|
||||
"kv_namespaces": [
|
||||
{
|
||||
"binding": "SCREENSHOT_CACHE",
|
||||
"id": "YOUR_KV_ID", // Replace with actual ID
|
||||
"preview_id": "YOUR_PREVIEW_ID" // Replace with actual preview ID
|
||||
}
|
||||
],
|
||||
|
||||
// Optional: R2 for storing generated files
|
||||
// Create with: npx wrangler r2 bucket create browser-files
|
||||
"r2_buckets": [
|
||||
{
|
||||
"binding": "BROWSER_FILES",
|
||||
"bucket_name": "browser-files"
|
||||
}
|
||||
],
|
||||
|
||||
// Optional: AI binding for AI-enhanced scraping
|
||||
"ai": {
|
||||
"binding": "AI"
|
||||
},
|
||||
|
||||
// Optional: D1 for storing scraping results
|
||||
// Create with: npx wrangler d1 create browser-db
|
||||
"d1_databases": [
|
||||
{
|
||||
"binding": "DB",
|
||||
"database_name": "browser-db",
|
||||
"database_id": "YOUR_DB_ID"
|
||||
}
|
||||
],
|
||||
|
||||
// Optional: Environment variables
|
||||
"vars": {
|
||||
"ENVIRONMENT": "production"
|
||||
},
|
||||
|
||||
// Optional: Secrets (set with: npx wrangler secret put SECRET_NAME)
|
||||
// "secrets": ["API_KEY"]
|
||||
|
||||
// Optional: Custom routes for production
|
||||
// "routes": [
|
||||
// {
|
||||
// "pattern": "browser.example.com/*",
|
||||
// "zone_name": "example.com"
|
||||
// }
|
||||
// ]
|
||||
}
|
||||
|
||||
/**
|
||||
* Key Configuration Notes:
|
||||
*
|
||||
* 1. nodejs_compat flag is REQUIRED
|
||||
* - Browser Rendering needs Node.js APIs
|
||||
* - Automatically enables nodejs_compat_v2 if compatibility_date >= 2024-09-23
|
||||
*
|
||||
* 2. Browser binding name
|
||||
* - Use "MYBROWSER" or any name you prefer
|
||||
* - Reference in code: env.MYBROWSER
|
||||
*
|
||||
* 3. Remote binding for local development
|
||||
* - "remote": true connects to real headless browser
|
||||
* - Useful if hitting 1MB request limit in local dev
|
||||
* - Remove for production (not needed)
|
||||
*
|
||||
* 4. KV for caching
|
||||
* - Highly recommended for production screenshot services
|
||||
* - Reduces browser usage and costs
|
||||
* - Cache TTL: typically 1-24 hours
|
||||
*
|
||||
* 5. R2 for file storage
|
||||
* - Store generated PDFs or screenshots long-term
|
||||
* - Cheaper than KV for large files
|
||||
* - Use presigned URLs for downloads
|
||||
*
|
||||
* 6. AI binding
|
||||
* - Optional: for AI-enhanced scraping
|
||||
* - Requires Workers Paid plan
|
||||
* - See cloudflare-workers-ai skill
|
||||
*
|
||||
* 7. D1 database
|
||||
* - Optional: store scraping metadata
|
||||
* - Track URLs, timestamps, status
|
||||
* - See cloudflare-d1 skill
|
||||
*
|
||||
* Commands:
|
||||
* npx wrangler dev # Local development
|
||||
* npx wrangler deploy # Deploy to production
|
||||
* npx wrangler tail # View logs
|
||||
*
|
||||
* See also:
|
||||
* - cloudflare-worker-base skill for complete Worker setup
|
||||
* - cloudflare-kv skill for KV caching patterns
|
||||
* - cloudflare-r2 skill for R2 storage patterns
|
||||
*/
|
||||
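/*
 * Follow-up to note 5 (R2 for file storage): a minimal TypeScript sketch of
 * persisting a generated PDF to the BROWSER_FILES bucket configured above.
 * The key naming is illustrative; see templates/pdf-generation.ts for the
 * full PDF workflow.
 *
 *   const pdf = await page.pdf({ format: "A4", printBackground: true });
 *   const key = `pdfs/${new URL(url).hostname}-${Date.now()}.pdf`;
 *   await env.BROWSER_FILES.put(key, pdf, {
 *     httpMetadata: { contentType: "application/pdf" },
 *   });
 */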