Browser Automation Examples
This document provides detailed examples of common browser automation tasks using the CLI tool.
Example 1: Extract Product Information from E-commerce
User request: "Go to example.com/product/123 and extract the product details"
Workflow:
- Navigate to the product page:
  browser navigate https://example.com/product/123
- Extract product data with schema:
  browser extract "Extract the product information" '{"productName": "string", "price": "number", "currency": "string", "inStock": "boolean", "rating": "number", "reviewCount": "number"}'
- Close the browser:
  browser close
Expected result: JSON object with product details that can be analyzed or stored.
Example 2: Fill Out and Submit a Contact Form
User request: "Fill out the contact form on example.com with my information"
Workflow:
- Navigate to contact page:
  browser navigate https://example.com/contact
- Act: Fill in name field:
  browser act "Fill in the name field with 'John Doe'"
- Act: Fill in email field:
  browser act "Fill in the email field with 'john.doe@example.com'"
- Act: Fill in message field:
  browser act "Fill in the message field with 'I would like to inquire about your services'"
- Act: Submit the form:
  browser act "Click the Submit button"
- Screenshot to capture confirmation:
  browser screenshot
- Close the browser:
  browser close
Example 3: Research and Summarize News Articles
User request: "Check the latest tech news on techcrunch.com and summarize the top stories"
Workflow:
- Navigate to news site:
  browser navigate https://techcrunch.com
- Extract article headlines and summaries:
  browser extract "Extract the top 5 article headlines and their summaries" '{"headlines": "string", "summary": "string", "author": "string", "publishedDate": "string"}'
- Close the browser:
  browser close
- Analyze and summarize the extracted data using Claude's text analysis capabilities.
Example 4: Login and Navigate Authenticated Area
User request: "Log into example.com and navigate to my dashboard"
Workflow:
- Navigate to login page:
  browser navigate https://example.com/login
- Act: Fill in username:
  browser act "Fill in the username field with 'myusername'"
- Act: Fill in password:
  browser act "Fill in the password field with 'mypassword'"
- Act: Click login button:
  browser act "Click the Login button"
- Act: Wait for page load:
  browser act "Wait for the page to fully load"
- Navigate to dashboard:
  browser navigate https://example.com/dashboard
- Screenshot the dashboard:
  browser screenshot
- Close the browser:
  browser close
Note: This example uses Chrome's user profile (.chrome-profile/), which may preserve session cookies between runs.
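Typing the password directly into the command, as in the steps above, can leave it in shell history. One way to avoid that is to read the credentials from environment variables. The sketch below is an illustration only: the `fill_prompt` and `login` helpers and the `EXAMPLE_USER` / `EXAMPLE_PASS` variable names are hypothetical, and it assumes each `browser` command exits with a non-zero status on failure, which this document does not confirm.

```shell
# Hypothetical helper: build a "fill in" instruction from a field name and a value.
fill_prompt() {
  printf "Fill in the %s field with '%s'" "$1" "$2"
}

# Hypothetical wrapper around the login steps above.
# EXAMPLE_USER and EXAMPLE_PASS are assumed variable names, not part of the CLI.
login() {
  # Abort early if either variable is unset or empty.
  : "${EXAMPLE_USER:?set EXAMPLE_USER first}"
  : "${EXAMPLE_PASS:?set EXAMPLE_PASS first}"
  browser navigate https://example.com/login &&
  browser act "$(fill_prompt username "$EXAMPLE_USER")" &&
  browser act "$(fill_prompt password "$EXAMPLE_PASS")" &&
  browser act "Click the Login button"
}
```

The values still reach the CLI as part of the instruction text, so this keeps secrets out of the script and the history file, not out of the tool itself.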
Example 5: Search and Collect Results
User request: "Search Google for 'best TypeScript practices' and get the top 5 results"
Workflow:
- Navigate to Google:
  browser navigate https://www.google.com
- Act: Perform search:
  browser act "Type 'best TypeScript practices' in the search box and press Enter"
- Act: Wait for results:
  browser act "Wait for search results to load"
- Extract search results:
  browser extract "Extract the top 5 search results" '{"title": "string", "url": "string", "snippet": "string"}'
- Close the browser:
  browser close
Example 6: Download a File
User request: "Download the PDF file from example.com/documents/report.pdf"
Workflow:
- Navigate to the file URL:
  browser navigate https://example.com/documents/report.pdf
- Act: Wait for the download to complete:
  browser act "Wait for 5 seconds for the download to complete"
- Close the browser:
  browser close
Note: Files are downloaded automatically to the ./agent/downloads/ directory because of the CDP (Chrome DevTools Protocol) download configuration.
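A fixed 5-second wait can be too short for large files and wasteful for small ones. A possible alternative is to poll the download directory until the file appears. In this sketch, `wait_for_file` is a hypothetical helper, and the file name `report.pdf` assumes the download keeps the name from the URL. A non-empty file may also still be in progress, so treat the check as a heuristic.

```shell
# Hypothetical helper: wait until a file exists and is non-empty, or give up.
# $1 = path to the file, $2 = timeout in seconds (default 30).
# The function body runs in a subshell, so its variables stay local.
wait_for_file() (
  path=$1
  remaining=${2:-30}
  while [ "$remaining" -gt 0 ]; do
    [ -s "$path" ] && exit 0
    sleep 1
    remaining=$((remaining - 1))
  done
  # One last check after the timeout has elapsed.
  [ -s "$path" ]
)

# Example call (the file name is an assumption):
# wait_for_file ./agent/downloads/report.pdf 30 || echo "download did not finish" >&2
```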
Example 7: Debugging a Page Issue
User request: "Check why the submit button isn't working on example.com/form"
Workflow:
- Navigate to the form page:
  browser navigate https://example.com/form
- Screenshot initial state:
  browser screenshot
- Observe available elements:
  browser observe "Find all buttons and their states"
- Observe form fields:
  browser observe "Find all form input fields and their required status"
- Act: Try filling required fields:
  browser act "Fill in all required fields with test data"
- Screenshot after filling:
  browser screenshot
- Observe button state again:
  browser observe "Check if the submit button is now enabled"
- Close the browser:
  browser close
Analyze the screenshots and observations to determine the issue.
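To make the before/after comparison in the steps above less manual, the same observe instruction can be run on both sides of an action and the two outputs compared. This is a sketch under assumptions: `observe_around` is a hypothetical helper, and it assumes `browser observe` writes its findings to standard output. The wording of an observation may differ between runs even when the page has not changed, so a reported difference is a hint rather than proof.

```shell
# Hypothetical helper: run the same observe instruction before and after a
# command and show what changed.
# $1 = observe instruction, remaining arguments = command to run in between.
# Assumes `browser observe` prints its findings on stdout (not confirmed here).
observe_around() (
  instruction=$1
  shift
  before=$(mktemp) || exit 2
  after=$(mktemp) || exit 2
  browser observe "$instruction" > "$before"
  "$@"
  browser observe "$instruction" > "$after"
  diff "$before" "$after"
  rc=$?
  rm -f "$before" "$after"
  exit "$rc"   # 0 = identical output, 1 = output changed
)

# Example call, using instructions from the workflow above:
# observe_around "Find all buttons and their states" \
#   browser act "Fill in all required fields with test data"
```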
Example 8: Multi-Page Data Collection
User request: "Extract product information from the first 3 pages of results on example.com/products"
Workflow:
- Navigate to products page:
  browser navigate https://example.com/products
- Extract products from page 1:
  browser extract "Extract all products on this page" '{"name": "string", "price": "number", "imageUrl": "string"}'
- Act: Click next page:
  browser act "Click the Next Page button"
- Extract products from page 2:
  browser extract "Extract all products on this page" '{"name": "string", "price": "number", "imageUrl": "string"}'
- Act: Click next page:
  browser act "Click the Next Page button"
- Extract products from page 3:
  browser extract "Extract all products on this page" '{"name": "string", "price": "number", "imageUrl": "string"}'
- Close the browser:
  browser close
Combine and process all extracted data.
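The repeated extract and next-page steps above can be written as a loop. A minimal sketch: `collect_pages` is a hypothetical helper, and it assumes that `browser extract` prints its result on standard output and that commands exit non-zero on failure, neither of which is stated in this document.

```shell
# Hypothetical helper: extract the same schema from N consecutive result pages.
# $1 = number of pages, $2 = output directory.
# Call it after `browser navigate https://example.com/products`.
collect_pages() (
  pages=$1
  outdir=$2
  schema='{"name": "string", "price": "number", "imageUrl": "string"}'
  mkdir -p "$outdir" || exit 1
  page=1
  while [ "$page" -le "$pages" ]; do
    # Assumption: the extracted data arrives on stdout.
    browser extract "Extract all products on this page" "$schema" \
      > "$outdir/page-$page.json" || exit 1
    # Only click "next" when another page is still to be read.
    if [ "$page" -lt "$pages" ]; then
      browser act "Click the Next Page button" || exit 1
    fi
    page=$((page + 1))
  done
)

# Example call:
# collect_pages 3 ./results && browser close
```

Writing one file per page keeps partial results if a later page fails.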
Tips for Success
- Be specific with natural language: "Click the blue Submit button in the footer" is better than "click submit". Many pages contain several elements that match a vague description, so specificity matters.
- Wait when needed: After navigation or actions that trigger page changes, wait explicitly before the next step
- Use observe for discovery: When unsure what elements exist, use observe first
- Take screenshots for debugging: Visual confirmation helps understand what the browser sees
- Handle errors gracefully: If an action fails, try breaking it into smaller steps
- Clean up resources: Always close the browser when done to free up system resources
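Two of the tips above, falling back to smaller steps when an action fails and always closing the browser, can be wrapped in small shell functions. This is a sketch: `act_or_steps` and `with_browser` are hypothetical helpers, and both assume that a failed `browser` command exits with a non-zero status, which this document does not confirm.

```shell
# Hypothetical helper: try one combined instruction; if it fails, fall back
# to the smaller step instructions given as the remaining arguments.
act_or_steps() {
  browser act "$1" && return 0
  shift
  # No fallback steps were supplied, so report the failure.
  [ "$#" -gt 0 ] || return 1
  for step in "$@"; do
    browser act "$step" || return 1
  done
}

# Hypothetical helper: run a command, then close the browser whether or not
# the command succeeded. Returns the command's exit status.
with_browser() {
  "$@"
  rc=$?
  browser close
  return "$rc"
}

# Example call:
# with_browser act_or_steps \
#   "Type 'best TypeScript practices' in the search box and press Enter" \
#   "Click the search box" \
#   "Type 'best TypeScript practices'" \
#   "Press Enter"
```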