Files
gh-jezweb-claude-skills-ski…/SKILL.md
2025-11-30 08:24:08 +08:00

24 KiB
Raw Blame History

name, description, license
name description license
cloudflare-browser-rendering Add headless Chrome automation with Puppeteer/Playwright on Cloudflare Workers. Use when: taking screenshots, generating PDFs, web scraping, crawling sites, browser automation, or troubleshooting XPath errors, browser timeouts, binding not passed errors, or session limits. MIT

Cloudflare Browser Rendering - Complete Reference

Production-ready knowledge domain for building browser automation workflows with Cloudflare Browser Rendering.

Status: Production Ready Last Updated: 2025-11-23 Dependencies: cloudflare-worker-base (for Worker setup) Latest Versions: @cloudflare/puppeteer@1.0.4 (July 2025), @cloudflare/playwright@1.0.0 (Playwright v1.55 GA Sept 2025), wrangler@4.50.0

Recent Updates (2025):

  • Sept 2025: Playwright v1.55 GA, Stagehand framework support (Workers AI), /links excludeExternalLinks param
  • Aug 2025: Billing GA (Aug 20), /sessions endpoint in local dev, X-Browser-Ms-Used header
  • July 2025: Playwright v1.54.1 + MCP v0.0.30, Playwright local dev support (wrangler@4.26.0+), Puppeteer v22.13.1 sync, /content returns title, /json custom_ai param, /screenshot viewport 1920x1080 default
  • June 2025: Web Bot Auth headers auto-included
  • April 2025: Playwright support launched, free tier introduced

Table of Contents

  1. Quick Start (5 minutes)
  2. Browser Rendering Overview
  3. Puppeteer API Reference
  4. Playwright API Reference
  5. Session Management
  6. Common Patterns
  7. Pricing & Limits
  8. Known Issues Prevention
  9. Production Checklist

Quick Start (5 minutes)

1. Add Browser Binding

wrangler.jsonc:

{
  "name": "browser-worker",
  "main": "src/index.ts",
  "compatibility_date": "2023-03-14",
  "compatibility_flags": ["nodejs_compat"],
  "browser": {
    "binding": "MYBROWSER"
  }
}

Why nodejs_compat? Browser Rendering requires Node.js APIs and polyfills.

2. Install Puppeteer

npm install @cloudflare/puppeteer

3. Take Your First Screenshot

import puppeteer from "@cloudflare/puppeteer";

interface Env {
  MYBROWSER: Fetcher;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { searchParams } = new URL(request.url);
    const url = searchParams.get("url") || "https://example.com";

    // Launch browser
    const browser = await puppeteer.launch(env.MYBROWSER);
    const page = await browser.newPage();

    // Navigate and capture
    await page.goto(url);
    const screenshot = await page.screenshot();

    // Clean up
    await browser.close();

    return new Response(screenshot, {
      headers: { "content-type": "image/png" }
    });
  }
};

4. Deploy

npx wrangler deploy

Test at: https://your-worker.workers.dev/?url=https://example.com

CRITICAL:

  • Always pass env.MYBROWSER to puppeteer.launch() (not undefined)
  • Always call browser.close() when done (or use browser.disconnect() for session reuse)
  • Use nodejs_compat compatibility flag

Browser Rendering Overview

What is Browser Rendering?

Cloudflare Browser Rendering provides headless Chromium browsers running on Cloudflare's global network. Use familiar tools like Puppeteer and Playwright to automate browser tasks:

  • Screenshots - Capture visual snapshots of web pages
  • PDF Generation - Convert HTML/URLs to PDFs
  • Web Scraping - Extract content from dynamic websites
  • Testing - Automate frontend tests
  • Crawling - Navigate multi-page workflows

Two Integration Methods

Method Best For Complexity
Workers Bindings Complex automation, custom workflows, session management Advanced
REST API Simple screenshot/PDF tasks Simple

This skill covers Workers Bindings (the advanced method with full Puppeteer/Playwright APIs).

Puppeteer vs Playwright

Feature Puppeteer Playwright
API Familiarity Most popular Growing adoption
Package @cloudflare/puppeteer@1.0.4 @cloudflare/playwright@1.0.0
Session Management Advanced APIs ⚠️ Basic
Browser Support Chromium only Chromium only (Firefox/Safari not yet supported)
Best For Screenshots, PDFs, scraping Testing, frontend automation

Recommendation: Use Puppeteer for most use cases. Playwright is ideal if you're already using it for testing.


Puppeteer API Reference

Core APIs (complete reference: https://pptr.dev/api/):

Global Functions:

  • puppeteer.launch(env.MYBROWSER, options?) - Launch new browser (CRITICAL: must pass binding)
  • puppeteer.connect(env.MYBROWSER, sessionId) - Connect to existing session
  • puppeteer.sessions(env.MYBROWSER) - List running sessions
  • puppeteer.history(env.MYBROWSER) - List recent sessions (open + closed)
  • puppeteer.limits(env.MYBROWSER) - Check account limits

Browser Methods:

  • browser.newPage() - Create new tab (preferred over launching new browsers)
  • browser.sessionId() - Get session ID for reuse
  • browser.close() - Terminate session
  • browser.disconnect() - Keep session alive for reuse
  • browser.createBrowserContext() - Isolated incognito context (separate cookies/cache)

Page Methods:

  • page.goto(url, { waitUntil, timeout }) - Navigate (use "networkidle0" for dynamic content)
  • page.screenshot({ fullPage, type, quality, clip }) - Capture image
  • page.pdf({ format, printBackground, margin }) - Generate PDF
  • page.evaluate(() => ...) - Execute JS in browser (data extraction, XPath workaround)
  • page.content() / page.setContent(html) - Get/set HTML
  • page.waitForSelector(selector) - Wait for element
  • page.type(selector, text) / page.click(selector) - Form interaction

Critical Patterns:

// Must pass binding
const browser = await puppeteer.launch(env.MYBROWSER); // ✅
// const browser = await puppeteer.launch(); // ❌ Error!

// Session reuse for performance
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
  browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
}

// Keep session alive
await browser.disconnect(); // Don't close

// XPath workaround (not directly supported)
const data = await page.evaluate(() => {
  return new XPathEvaluator()
    .createExpression("/html/body/div/h1")
    .evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
    .singleNodeValue.innerHTML;
});

Playwright API Reference

Status: GA (Sept 2025) - Playwright v1.55, MCP v0.0.30 support, local dev support (wrangler@4.26.0+)

Installation:

npm install @cloudflare/playwright

Configuration Requirements (2025 Update):

{
  "compatibility_flags": ["nodejs_compat"],
  "compatibility_date": "2025-09-15"  // Required for Playwright v1.55
}

Basic Usage:

import { chromium } from "@cloudflare/playwright";

const browser = await chromium.launch(env.BROWSER);
const page = await browser.newPage();
await page.goto("https://example.com");
const screenshot = await page.screenshot();
await browser.close();

Puppeteer vs Playwright:

  • Import: puppeteer vs { chromium } from "@cloudflare/playwright"
  • Session API: Puppeteer has advanced session management (sessions/history/limits), Playwright basic
  • Auto-waiting: Playwright has built-in auto-waiting, Puppeteer requires manual waitForSelector()
  • MCP Support: Playwright MCP v0.0.30 (July 2025), Playwright MCP server available

Recommendation: Use Puppeteer for session reuse patterns. Use Playwright if migrating existing tests or need MCP integration.

Official Docs: https://developers.cloudflare.com/browser-rendering/playwright/


Session Management

Why: Launching new browsers is slow and consumes concurrency limits. Reuse sessions for faster response, lower concurrency usage, better resource utilization.

Session Reuse Pattern (Critical)

async function getBrowser(env: Env): Promise<Browser> {
  const sessions = await puppeteer.sessions(env.MYBROWSER);
  const freeSessions = sessions.filter(s => !s.connectionId);

  if (freeSessions.length > 0) {
    try {
      return await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
    } catch (e) {
      console.log("Failed to connect, launching new browser");
    }
  }

  return await puppeteer.launch(env.MYBROWSER);
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const browser = await getBrowser(env);

    try {
      const page = await browser.newPage();
      await page.goto("https://example.com");
      const screenshot = await page.screenshot();

      await browser.disconnect(); // ✅ Keep alive for reuse

      return new Response(screenshot, {
        headers: { "content-type": "image/png" }
      });
    } catch (error) {
      await browser.close(); // ❌ Close on error
      throw error;
    }
  }
};

Key Rules:

  • browser.disconnect() - Keep session alive for reuse
  • browser.close() - Only on errors or when truly done
  • Always handle connection failures

Browser Contexts (Cookie/Cache Isolation)

Use browser.createBrowserContext() to share browser but isolate cookies/cache:

const browser = await puppeteer.launch(env.MYBROWSER);
const context1 = await browser.createBrowserContext(); // User 1
const context2 = await browser.createBrowserContext(); // User 2

const page1 = await context1.newPage();
const page2 = await context2.newPage();
// Separate cookies/cache per context

Multiple Tabs Pattern

Bad: Launch 10 browsers for 10 URLs (wastes concurrency) Good: 1 browser, 10 tabs via Promise.all() + browser.newPage()

const browser = await puppeteer.launch(env.MYBROWSER);
const results = await Promise.all(
  urls.map(async (url) => {
    const page = await browser.newPage();
    await page.goto(url);
    const data = await page.evaluate(() => ({ title: document.title }));
    await page.close();
    return { url, data };
  })
);
await browser.close();

Common Patterns

Screenshot with KV Caching

Cache screenshots to reduce browser usage and improve performance:

interface Env {
  MYBROWSER: Fetcher;
  CACHE: KVNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { searchParams } = new URL(request.url);
    const url = searchParams.get("url");
    if (!url) return new Response("Missing ?url parameter", { status: 400 });

    const normalizedUrl = new URL(url).toString();

    // Check cache first
    let screenshot = await env.CACHE.get(normalizedUrl, { type: "arrayBuffer" });

    if (!screenshot) {
      const browser = await puppeteer.launch(env.MYBROWSER);
      const page = await browser.newPage();
      await page.goto(normalizedUrl);
      screenshot = await page.screenshot();
      await browser.close();

      // Cache for 24 hours
      await env.CACHE.put(normalizedUrl, screenshot, { expirationTtl: 60 * 60 * 24 });
    }

    return new Response(screenshot, { headers: { "content-type": "image/png" } });
  }
};

AI-Enhanced Scraping

Combine Browser Rendering with Workers AI for structured data extraction:

interface Env {
  MYBROWSER: Fetcher;
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { searchParams } = new URL(request.url);
    const url = searchParams.get("url");

    // Scrape page content
    const browser = await puppeteer.launch(env.MYBROWSER);
    const page = await browser.newPage();
    await page.goto(url!, { waitUntil: "networkidle0" });
    const bodyContent = await page.$eval("body", el => el.innerHTML);
    await browser.close();

    // Extract structured data with AI
    const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      messages: [{
        role: "user",
        content: `Extract product info as JSON from this HTML. Include: name, price, description.\n\nHTML:\n${bodyContent.slice(0, 4000)}`
      }]
    });

    return Response.json({ url, product: JSON.parse(response.response) });
  }
};

Other Common Patterns: PDF generation (page.pdf()), structured scraping (page.evaluate()), form automation (page.type() + page.click()). See bundled templates/ directory.


Pricing & Limits

Billing GA: August 20, 2025

Free Tier: 10 min/day, 3 concurrent, 3 launches/min, 60s timeout Paid Tier: 10 hrs/month included ($0.09/hr after), 10 concurrent avg ($2.00/browser after), 30 launches/min, 60s-10min timeout

Concurrency Calculation: Monthly average of daily peak usage (e.g., 15 browsers avg = (15 - 10 included) × $2.00 = $10.00/mo)

Rate Limiting: Enforced per-second (180 req/min = 3 req/sec, not bursty). Check puppeteer.limits(env.MYBROWSER) before launching:

const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
  const delay = limits.timeUntilNextAllowedBrowserAcquisition || 1000;
  await new Promise(resolve => setTimeout(resolve, delay));
}

Known Issues Prevention

This skill prevents 6 documented issues:


Issue #1: XPath Selectors Not Supported

Error: "XPath selector not supported" or selector failures Source: https://developers.cloudflare.com/browser-rendering/faq/#why-cant-i-use-an-xpath-selector-when-using-browser-rendering-with-puppeteer Why It Happens: XPath poses a security risk to Workers Prevention: Use CSS selectors or page.evaluate() with XPathEvaluator

Solution:

// ❌ Don't use XPath directly (not supported)
// await page.$x('/html/body/div/h1')

// ✅ Use CSS selector
const heading = await page.$("div > h1");

// ✅ Or use XPath in page.evaluate()
const innerHtml = await page.evaluate(() => {
  return new XPathEvaluator()
    .createExpression("/html/body/div/h1")
    .evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
    .singleNodeValue.innerHTML;
});

Issue #2: Browser Binding Not Passed

Error: "Cannot read properties of undefined (reading 'fetch')" Source: https://developers.cloudflare.com/browser-rendering/faq/#cannot-read-properties-of-undefined-reading-fetch Why It Happens: puppeteer.launch() called without browser binding Prevention: Always pass env.MYBROWSER to launch

Solution:

// ❌ Missing browser binding
const browser = await puppeteer.launch(); // Error!

// ✅ Pass binding
const browser = await puppeteer.launch(env.MYBROWSER);

Issue #3: Browser Timeout (60 seconds)

Error: Browser closes unexpectedly after 60 seconds Source: https://developers.cloudflare.com/browser-rendering/platform/limits/#note-on-browser-timeout Why It Happens: Default timeout is 60 seconds of inactivity Prevention: Use keep_alive option to extend up to 10 minutes

Solution:

// Extend timeout to 5 minutes for long-running tasks
const browser = await puppeteer.launch(env.MYBROWSER, {
  keep_alive: 300000 // 5 minutes = 300,000 ms
});

Note: Browser closes if no devtools commands for the specified duration.


Issue #4: Concurrency Limits Reached

Error: "Rate limit exceeded" or new browser launch fails Source: https://developers.cloudflare.com/browser-rendering/platform/limits/ Why It Happens: Exceeded concurrent browser limit (3 free, 10-30 paid) Prevention: Reuse sessions, use tabs instead of multiple browsers, check limits before launching

Solutions:

// 1. Check limits before launching
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
  return new Response("Concurrency limit reached", { status: 429 });
}

// 2. Reuse sessions
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
  const browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
}

// 3. Use tabs instead of multiple browsers
const browser = await puppeteer.launch(env.MYBROWSER);
const page1 = await browser.newPage();
const page2 = await browser.newPage(); // Same browser, different tabs

Issue #5: Local Development Request Size Limit

Error: Request larger than 1MB fails in wrangler dev Source: https://developers.cloudflare.com/browser-rendering/faq/#does-local-development-support-all-browser-rendering-features Why It Happens: Local development limitation Prevention: Use remote: true in browser binding for local dev

Solution:

// wrangler.jsonc for local development
{
  "browser": {
    "binding": "MYBROWSER",
    "remote": true  // Use real headless browser during dev
  }
}

Issue #6: Bot Protection Always Triggered

Error: Website blocks requests as bot traffic Source: https://developers.cloudflare.com/browser-rendering/faq/#will-browser-rendering-bypass-cloudflares-bot-protection Why It Happens: Browser Rendering requests always identified as bots Prevention: Cannot bypass; if scraping your own zone, create WAF skip rule

Solution:

// ❌ Cannot bypass bot protection
// Requests will always be identified as bots

// ✅ If scraping your own Cloudflare zone:
// 1. Go to Security > WAF > Custom rules
// 2. Create skip rule with custom header:
//    Header: X-Custom-Auth
//    Value: your-secret-token
// 3. Pass header in your scraping requests

// Note: Automatic headers are included:
// - cf-biso-request-id
// - cf-biso-devtools

Production Checklist

Before deploying Browser Rendering Workers to production:

Configuration

  • Browser binding configured in wrangler.jsonc
  • nodejs_compat flag enabled (required for Browser Rendering)
  • Keep-alive timeout set if tasks take > 60 seconds
  • Remote binding enabled for local development if needed

Error Handling

  • Retry logic implemented for rate limits
  • Timeout handling for page.goto()
  • Browser cleanup in try-finally blocks
  • Concurrency limit checks before launching browsers
  • Graceful degradation when browser unavailable

Performance

  • Session reuse implemented for high-traffic routes
  • Multiple tabs used instead of multiple browsers
  • Incognito contexts for session isolation
  • KV caching for repeated screenshots/PDFs
  • Batch operations to maximize browser utilization

Monitoring

Security

  • Input validation for URLs (prevent SSRF)
  • Timeout limits to prevent abuse
  • Rate limiting on public endpoints
  • Authentication for sensitive scraping endpoints
  • WAF rules if scraping your own zone

Testing

  • Test screenshot capture with various page sizes
  • Test PDF generation with custom HTML
  • Test scraping with dynamic content (networkidle0)
  • Test error scenarios (invalid URLs, timeouts)
  • Load test concurrency limits

Error Handling Best Practices

Production Pattern - Use try-catch with proper cleanup:

async function withBrowser<T>(env: Env, fn: (browser: Browser) => Promise<T>): Promise<T> {
  let browser: Browser | null = null;

  try {
    // 1. Check limits before launching
    const limits = await puppeteer.limits(env.MYBROWSER);
    if (limits.allowedBrowserAcquisitions === 0) {
      throw new Error("Rate limit reached");
    }

    // 2. Try session reuse first
    const sessions = await puppeteer.sessions(env.MYBROWSER);
    const freeSessions = sessions.filter(s => !s.connectionId);
    browser = freeSessions.length > 0
      ? await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId)
      : await puppeteer.launch(env.MYBROWSER);

    // 3. Execute user function
    const result = await fn(browser);

    // 4. Disconnect (keep alive)
    await browser.disconnect();
    return result;
  } catch (error) {
    // 5. Close on error
    if (browser) await browser.close();
    throw error;
  }
}

Key Principles: Check limits → Reuse sessions → Execute → Disconnect on success, close on error


Using Bundled Resources

Templates (templates/)

Ready-to-use code templates for common patterns:

  • basic-screenshot.ts - Minimal screenshot example
  • screenshot-with-kv-cache.ts - Screenshot with KV caching
  • pdf-generation.ts - Generate PDFs from HTML or URLs
  • web-scraper-basic.ts - Basic web scraping pattern
  • web-scraper-batch.ts - Batch scrape multiple URLs
  • session-reuse.ts - Session reuse for performance
  • ai-enhanced-scraper.ts - Scraping with Workers AI
  • playwright-example.ts - Playwright alternative example
  • wrangler-browser-config.jsonc - Browser binding configuration

Usage:

# Copy template to your project
cp ~/.claude/skills/cloudflare-browser-rendering/templates/basic-screenshot.ts src/index.ts

References (references/)

Deep-dive documentation:

  • session-management.md - Complete session reuse guide
  • pricing-and-limits.md - Detailed pricing breakdown
  • common-errors.md - All known issues and solutions
  • puppeteer-vs-playwright.md - Feature comparison and migration

When to load: Reference when implementing advanced patterns or debugging specific issues.


Dependencies

Required:

  • @cloudflare/puppeteer@1.0.4 - Puppeteer for Workers
  • wrangler@4.43.0+ - Cloudflare CLI

Optional:

  • @cloudflare/playwright@1.0.0 - Playwright for Workers (alternative)
  • @cloudflare/workers-types@4.20251014.0+ - TypeScript types

Related Skills:

  • cloudflare-worker-base - Worker setup with Hono
  • cloudflare-kv - KV caching for screenshots
  • cloudflare-r2 - R2 storage for generated files
  • cloudflare-workers-ai - AI-enhanced scraping

Official Documentation


Package Versions (Verified 2025-10-22)

{
  "dependencies": {
    "@cloudflare/puppeteer": "^1.0.4"
  },
  "devDependencies": {
    "@cloudflare/workers-types": "^4.20251014.0",
    "wrangler": "^4.43.0"
  }
}

Alternative (Playwright):

{
  "dependencies": {
    "@cloudflare/playwright": "^1.0.0"
  }
}

Troubleshooting

Problem: "Cannot read properties of undefined (reading 'fetch')"

Solution: Pass browser binding to puppeteer.launch():

const browser = await puppeteer.launch(env.MYBROWSER); // Not just puppeteer.launch()

Problem: XPath selectors not working

Solution: Use CSS selectors or page.evaluate() with XPathEvaluator (see Issue #1)

Problem: Browser closes after 60 seconds

Solution: Extend timeout with keep_alive:

const browser = await puppeteer.launch(env.MYBROWSER, { keep_alive: 300000 });

Problem: Rate limit reached

Solution: Reuse sessions, use tabs, check limits before launching (see Issue #4)

Problem: Local dev request > 1MB fails

Solution: Enable remote binding in wrangler.jsonc:

{ "browser": { "binding": "MYBROWSER", "remote": true } }

Problem: Website blocks as bot

Solution: Cannot bypass. If your own zone, create WAF skip rule (see Issue #6)


Questions? Issues?

  1. Check references/common-errors.md for detailed solutions
  2. Review references/session-management.md for performance optimization
  3. Verify browser binding is configured in wrangler.jsonc
  4. Check official docs: https://developers.cloudflare.com/browser-rendering/
  5. Ensure nodejs_compat compatibility flag is enabled