Initial commit
38
commands/scrape-site.md
Normal file
@@ -0,0 +1,38 @@
---
allowed-tools: mcp__mcp-server-firecrawl__firecrawl_scrape, Write, Bash
description: Scrape websites using Firecrawl MCP and save content to research folders
model: claude-sonnet-4-5-20250929
---

# Scrape Site

This command scrapes a website with the Firecrawl MCP and saves the content to organized research folders within the desktop-commander documentation system.

$ARGUMENTS

**Usage Examples:**

- `/scrape-site https://docs.anthropic.com/claude/guide` - Scrape and auto-organize in research folder
- `/scrape-site https://example.com/api "api-docs"` - Scrape and save to specific subfolder
- `/scrape-site https://github.com/owner/repo/wiki "github-wiki"` - Save with custom folder name
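
The argument handling implied by these examples can be sketched as follows. This is a minimal illustration, assuming `$ARGUMENTS` arrives as a single shell-style string; `parse_scrape_args` is a hypothetical helper name and is not part of Firecrawl or the command runtime.

```python
import shlex
from typing import Optional, Tuple


def parse_scrape_args(arguments: str) -> Tuple[str, Optional[str]]:
    """Split a raw argument string into (url, optional subfolder)."""
    # shlex honours shell quoting, so "api-docs" and api-docs are treated alike.
    parts = shlex.split(arguments)
    if not parts:
        raise ValueError("a URL is required as the first argument")
    url = parts[0]
    # The second argument is optional; None means "derive the folder from the URL".
    subfolder = parts[1] if len(parts) > 1 else None
    return url, subfolder


print(parse_scrape_args('https://example.com/api "api-docs"'))
# ('https://example.com/api', 'api-docs')
```

Returning `None` for a missing subfolder keeps the "auto-generate from the domain" decision in one place, downstream of parsing.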

## Instructions

- Extract the URL from `$ARGUMENTS` (the first argument is always the URL to scrape)
- If a second argument is provided, use it as the subfolder name; otherwise auto-generate one from the URL's domain
- Use the Firecrawl MCP to scrape the site with markdown format and main content extraction
- Create an organized folder structure in `docs/research/[domain-or-subfolder]/`
- Generate a descriptive filename based on the URL path or page title
- Save the scraped content as a markdown file with a metadata header (URL, date, source)
- Create or update an index file in the research folder listing all scraped content
- Provide a summary of the scraped content and the file location
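
The folder and filename steps above could be sketched like this. It is only an illustration under stated assumptions: `slugify` and `target_path` are hypothetical names, the domain rule (take the second-to-last host label) is a naive guess that gives `anthropic` or `github` but is wrong for suffixes such as `co.uk`, and the filename is derived from the URL path only, not the page title.

```python
import re
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


def slugify(text: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def target_path(url: str, subfolder: Optional[str] = None,
                root: str = "docs/research") -> Path:
    """Return the markdown path for a scraped URL (nothing is written)."""
    parsed = urlparse(url)
    host = parsed.hostname or "unknown"
    # Assumed domain rule: second-to-last label, e.g. docs.anthropic.com -> anthropic.
    labels = host.split(".")
    domain = labels[-2] if len(labels) >= 2 else labels[0]
    # An explicit subfolder wins over the derived domain name.
    folder = slugify(subfolder) if subfolder else slugify(domain)
    # Assumed filename rule: slug of the URL path, "index" for the site root.
    name = slugify(parsed.path) or "index"
    return Path(root) / folder / f"{name}.md"


print(target_path("https://docs.anthropic.com/claude/guide").as_posix())
# docs/research/anthropic/claude-guide.md
```

Keeping the function pure (it computes a path but creates nothing) makes the naming rule easy to check before any file is written.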

## Context

- Research folder structure: `docs/research/` (organized by domain/topic)
- Existing research: !`ls -la docs/context7-research/ docs/research/ 2>/dev/null | head -10`
- Firecrawl MCP status: Available for web scraping with markdown output
- Current date: !`date "+%Y-%m-%d"`
- Content organization: domain-based folders (anthropic, github, etc.) or custom subfolder names
- File naming: descriptive names based on URL path, avoiding special characters
- Metadata format: YAML frontmatter with url, scraped_date, domain, and title fields
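
One way the metadata header could be rendered is sketched below. The four field names come from the list above; everything else is an assumption for illustration, including the helper name `build_frontmatter`, the choice to quote every value, and the sample title and date.

```python
import json
from datetime import date
from urllib.parse import urlparse


def build_frontmatter(url: str, title: str, scraped: date) -> str:
    """Render url, scraped_date, domain, and title as a YAML frontmatter block."""
    fields = {
        "url": url,
        # isoformat() gives YYYY-MM-DD, the same shape as `date "+%Y-%m-%d"`.
        "scraped_date": scraped.isoformat(),
        "domain": urlparse(url).hostname or "",
        "title": title,
    }
    # json.dumps yields double-quoted strings, which are valid YAML scalars,
    # so colons or quotes inside a title cannot break the header.
    lines = [f"{key}: {json.dumps(value)}" for key, value in fields.items()]
    return "---\n" + "\n".join(lines) + "\n---\n"


# Sample values below are made up for the demonstration.
print(build_frontmatter("https://example.com/api", "API: Overview", date(2025, 1, 15)))
```

The scraped markdown body would then be appended directly after the returned block.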