Initial commit
389
skills/skill/references/skill-seekers-integration.md
Normal file
@@ -0,0 +1,389 @@
# Skill_Seekers Integration Guide

How skill-factory integrates with Skill_Seekers for automated skill creation.

## What is Skill_Seekers?

[Skill_Seekers](https://github.com/yusufkaraaslan/Skill_Seekers) is a Python tool (3,562★) that automatically converts:
- Documentation websites → Claude skills
- GitHub repositories → Claude skills
- PDF files → Claude skills

**Key features:**
- AST parsing for code analysis
- OCR for scanned PDFs
- Conflict detection (docs vs actual code)
- MCP integration
- 299 passing tests

## Installation

### One-Command Install

```bash
~/Projects/claude-skills/skill-factory/skill/scripts/install-skill-seekers.sh
```

### Manual Install

```bash
# Clone
git clone https://github.com/yusufkaraaslan/Skill_Seekers ~/Skill_Seekers

# Install dependencies
cd ~/Skill_Seekers
pip install -r requirements.txt

# Optional: MCP integration
./setup_mcp.sh
```

### Verify Installation

```bash
cd ~/Skill_Seekers
python3 -c "import cli.doc_scraper" && echo "✅ Installed correctly"
```

## Usage from skill-factory

skill-factory automatically uses Skill_Seekers when appropriate.

**Automatic detection:**
```
User: "Create React skill from react.dev"
↓
skill-factory detects documentation source
↓
Automatically runs Skill_Seekers
↓
Post-processes output
↓
Quality checks
↓
Delivers result
```
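
The "detects documentation source" step in the flow above could be sketched as follows. This is a minimal illustration, not skill-factory's actual detection logic: the function name `detect_source_type` and its return labels are hypothetical, and the rules (a `.pdf` suffix, a `github.com` host or `org/repo` shorthand, any other dotted hostname) are assumptions chosen to match the three source types this guide describes.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper: the name and the labels it returns are illustrative only.
def detect_source_type(source: str) -> str:
    """Classify a source string as 'pdf', 'github', 'docs', or 'unknown'."""
    text = source.strip()
    if text.lower().endswith('.pdf'):
        return 'pdf'
    # "org/repo" shorthand, e.g. "facebook/react" (org names contain no dots)
    if re.fullmatch(r'[A-Za-z0-9-]+/[\w.-]+', text):
        return 'github'
    # Add a scheme so bare hosts such as "react.dev" parse correctly
    parsed = urlparse(text if '://' in text else f'https://{text}')
    host = (parsed.hostname or '').lower()
    if host in ('github.com', 'www.github.com'):
        return 'github'
    # Any other dotted hostname is treated as a documentation site
    if '.' in host:
        return 'docs'
    return 'unknown'
```

With these rules, `react.dev` is classified as a documentation site, `facebook/react` as a GitHub repository, and `/path/to/manual.pdf` as a PDF.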

## Integration Points

### 1. Automatic Installation Check

Before using Skill_Seekers, skill-factory verifies the installation:
```python
import os
import subprocess

def check_skill_seekers():
    # SKILL_SEEKERS_PATH overrides the default install location
    seekers_path = os.environ.get(
        'SKILL_SEEKERS_PATH', os.path.expanduser('~/Skill_Seekers')
    )

    if not os.path.exists(seekers_path):
        print("Skill_Seekers not found. Install? (y/n)")
        if input().strip().lower() == 'y':
            install_skill_seekers()
        else:
            return False

    # Verify dependencies by importing the scraper from the install directory
    try:
        subprocess.run(
            ['python3', '-c', 'import cli.doc_scraper'],
            cwd=seekers_path,
            check=True,
            capture_output=True
        )
        return True
    except subprocess.CalledProcessError:
        print("Dependencies missing. Installing...")
        install_dependencies(seekers_path)
        return True
```
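
The next section calls a `get_seekers_path()` helper that this guide does not define. A minimal sketch of what it might look like, assuming it resolves the location the same way as the check above (the `SKILL_SEEKERS_PATH` variable, falling back to `~/Skill_Seekers`):

```python
import os
from pathlib import Path

# Assumed implementation; the real helper in skill-factory may differ.
def get_seekers_path() -> str:
    """Return the Skill_Seekers directory, honoring SKILL_SEEKERS_PATH."""
    default = Path.home() / 'Skill_Seekers'
    return os.environ.get('SKILL_SEEKERS_PATH', str(default))
```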

### 2. Scraping with Optimal Settings

```python
import subprocess

def scrape_documentation(url: str, skill_name: str):
    seekers_path = get_seekers_path()
    output_dir = f'{seekers_path}/output/{skill_name}'

    # Optimal settings for Claude skills
    cmd = [
        'python3', 'cli/doc_scraper.py',
        '--url', url,
        '--name', skill_name,
        '--async',  # 2-3x faster
        '--output', output_dir
    ]

    # Run with progress monitoring
    process = subprocess.Popen(
        cmd,
        cwd=seekers_path,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT
    )

    for line in process.stdout:
        # Show progress to user
        print(f"  {line.decode().strip()}")

    # Wait for the scraper to exit and surface failures
    if process.wait() != 0:
        raise RuntimeError(f"Skill_Seekers exited with code {process.returncode}")

    return output_dir
```

### 3. Post-Processing Output

Skill_Seekers output needs enhancement for Claude compatibility:

```python
def post_process_skill_seekers_output(output_dir):
    skill_path = f'{output_dir}/SKILL.md'

    # Load skill
    skill = load_skill(skill_path)

    # Enhancements
    enhancements = []

    # 1. Check frontmatter
    if not has_proper_frontmatter(skill):
        skill = add_frontmatter(skill)
        enhancements.append("Added proper YAML frontmatter")

    # 2. Check description specificity
    if is_description_generic(skill):
        skill = improve_description(skill)
        enhancements.append("Improved description specificity")

    # 3. Check examples
    example_count = count_code_blocks(skill)
    if example_count < 5:
        # Extract more from scraped data
        skill = extract_more_examples(skill, output_dir)
        enhancements.append(f"Added {count_code_blocks(skill) - example_count} more examples")

    # 4. Apply progressive disclosure if needed
    if count_lines(skill) > 500:
        skill = apply_progressive_disclosure(skill)
        enhancements.append("Applied progressive disclosure")

    # Save enhanced skill
    save_skill(skill_path, skill)

    return skill_path, enhancements
```
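
The checks used above (`has_proper_frontmatter`, `count_code_blocks`, `count_lines`) are not defined in this guide. The sketch below shows one way they could work, assuming `load_skill` returns the contents of `SKILL.md` as a single string and that "proper" frontmatter means a leading `---` block with `name` and `description` keys. Both are assumptions, not confirmed skill-factory behavior.

```python
import re

FENCE = '`' * 3  # three backticks, built here to keep this example readable

def has_proper_frontmatter(skill: str) -> bool:
    """True if the text opens with a '---' block that has name and description keys."""
    match = re.match(r'^---\n(.*?)\n---\s*(\n|$)', skill, re.DOTALL)
    if not match:
        return False
    keys = {
        line.split(':', 1)[0].strip()
        for line in match.group(1).splitlines()
        if ':' in line
    }
    return {'name', 'description'} <= keys

def count_code_blocks(skill: str) -> int:
    """Count fenced code blocks by pairing opening and closing fence lines."""
    fences = [line for line in skill.splitlines() if line.lstrip().startswith(FENCE)]
    return len(fences) // 2

def count_lines(skill: str) -> int:
    """Number of lines in the skill text."""
    return len(skill.splitlines())
```

Counting fence lines in pairs is a simplification: it miscounts if a fence is left unclosed or if a code block itself contains fence lines.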

### 4. Quality Scoring

```python
def quality_check_seekers_output(skill_path):
    # Score against Anthropic best practices
    score, issues = score_skill(skill_path)

    print(f"📊 Initial quality: {score}/10")

    if score < 8.0:
        print(f" ⚠️ Issues: {len(issues)}")
        for issue in issues:
            print(f" - {issue}")

    return score, issues
```

## Supported Documentation Sources

### Documentation Websites

**Common frameworks:**
- React: https://react.dev
- Vue: https://vuejs.org
- Django: https://docs.djangoproject.com
- FastAPI: https://fastapi.tiangolo.com
- Rust docs: https://docs.rs/[crate]

**Usage:**
```python
scrape_documentation('https://react.dev', 'react-development')
```

### GitHub Repositories

**Example:**
```python
scrape_github_repo('facebook/react', 'react-internals')
```

Features:
- AST parsing for actual API
- Conflict detection vs docs
- README extraction
- Issues/PR analysis
- CHANGELOG parsing

### PDF Files

**Example:**
```python
scrape_pdf('/path/to/manual.pdf', 'api-manual')
```

Features:
- Text extraction
- OCR for scanned pages
- Table extraction
- Code block detection
- Image extraction

## Configuration

### Environment Variables

```bash
# Skill_Seekers location
export SKILL_SEEKERS_PATH="$HOME/Skill_Seekers"

# Cache behavior
export SKILL_SEEKERS_NO_CACHE="true"  # For "latest" requests

# Output location
export SKILL_SEEKERS_OUTPUT="$HOME/.claude/skills"
```
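
On the Python side, these variables could be read into a small config object. The sketch below is illustrative: the `SeekersConfig` class and `load_config` function are hypothetical names, and the fallback values simply mirror the exports shown above. Whether skill-factory applies the same defaults when a variable is unset is an assumption.

```python
import os
from dataclasses import dataclass

@dataclass
class SeekersConfig:
    path: str         # SKILL_SEEKERS_PATH
    no_cache: bool    # SKILL_SEEKERS_NO_CACHE == "true"
    output_dir: str   # SKILL_SEEKERS_OUTPUT

def load_config(env=os.environ) -> SeekersConfig:
    """Build a config from environment variables, with assumed defaults."""
    home = os.path.expanduser('~')
    return SeekersConfig(
        path=env.get('SKILL_SEEKERS_PATH', os.path.join(home, 'Skill_Seekers')),
        no_cache=env.get('SKILL_SEEKERS_NO_CACHE', '').lower() == 'true',
        output_dir=env.get('SKILL_SEEKERS_OUTPUT', os.path.join(home, '.claude', 'skills')),
    )
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.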

### Custom Presets

Skill_Seekers has presets for common frameworks:
```python
presets = {
    'react': {
        'url': 'https://react.dev',
        'selectors': {'main_content': 'article'},
        'categories': ['components', 'hooks', 'api']
    },
    'rust': {
        'url_pattern': 'https://docs.rs/{crate}',
        'type': 'rust_docs'
    },
    # ... more presets
}
```
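
The two presets above use different keys: a fixed `url` for React and a `url_pattern` with a `{crate}` placeholder for Rust. A sketch of how a caller might turn either form into a concrete URL follows. The `resolve_preset_url` helper is hypothetical and not part of Skill_Seekers, and the trimmed `PRESETS` dictionary only repeats the fields needed here.

```python
# Trimmed copy of the presets shown above, for illustration only
PRESETS = {
    'react': {'url': 'https://react.dev'},
    'rust': {'url_pattern': 'https://docs.rs/{crate}', 'type': 'rust_docs'},
}

def resolve_preset_url(name: str, **params: str) -> str:
    """Return the preset's fixed URL, or fill in its URL pattern."""
    preset = PRESETS[name]
    if 'url' in preset:
        return preset['url']
    # str.format raises KeyError if a placeholder such as {crate} is missing
    return preset['url_pattern'].format(**params)
```

For example, `resolve_preset_url('rust', crate='serde')` fills the pattern to give `https://docs.rs/serde`.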

## Performance

Typical scraping times:

| Documentation Size | Sync Mode | Async Mode |
|-------------------|-----------|------------|
| Small (100-500 pages) | 15-30 min | 5-10 min |
| Medium (500-2K pages) | 30-60 min | 10-20 min |
| Large (10K+ pages) | 60-120 min | 20-40 min |

**Always use `--async` flag** (2-3x faster)

## Troubleshooting

### Skill_Seekers Not Found

```bash
# Check installation
ls ~/Skill_Seekers

# If missing, install
scripts/install-skill-seekers.sh
```

### Dependencies Missing

```bash
cd ~/Skill_Seekers
pip install -r requirements.txt
```

### Python Version Error

Skill_Seekers requires Python 3.10+:
```bash
python3 --version  # Should be 3.10 or higher
```
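
The same requirement can be checked from Python before launching the scraper. This is a small sketch; the `python_is_supported` name is hypothetical, and it only inspects the interpreter running the check, which may differ from the `python3` on your `PATH`.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated above

def python_is_supported(version_info=sys.version_info) -> bool:
    """True if the given (major, minor, ...) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON
```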

### Scraping Fails

If the default selectors don't match the site's layout, pass a custom selector:
```bash
# Override the default content selector with a custom one
python3 cli/doc_scraper.py \
  --url https://example.com \
  --name example \
  --selector "main" \
  --async
```

## Advanced Features

### Conflict Detection

When combining docs + GitHub:
```python
scrape_multi_source({
    'docs': 'https://react.dev',
    'github': 'facebook/react'
}, 'react-complete')

# Outputs:
# - Documented APIs
# - Actual code APIs
# - ⚠️ Conflicts highlighted
# - Side-by-side comparison
```
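
To make the idea of a docs-versus-code conflict concrete, here is a toy comparison over two mappings of API name to signature. It is not Skill_Seekers' actual algorithm, which this guide does not describe. The `find_conflicts` function, its input shape, and its category names are all assumptions for illustration.

```python
def find_conflicts(documented: dict, actual: dict) -> dict:
    """Compare API name -> signature mappings taken from docs and from code."""
    doc_names, code_names = set(documented), set(actual)
    return {
        # Present in the code but never mentioned in the docs
        'undocumented': sorted(code_names - doc_names),
        # Mentioned in the docs but absent from the code
        'missing_in_code': sorted(doc_names - code_names),
        # Present in both, with differing signatures
        'signature_mismatch': sorted(
            name for name in doc_names & code_names
            if documented[name] != actual[name]
        ),
    }
```

A side-by-side report could then be rendered from the three lists, pairing each mismatched name with its documented and actual signature.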

### MCP Integration

If Skill_Seekers MCP is installed:
```
User (in Claude Code): "Generate React skill from react.dev"

Claude automatically uses Skill_Seekers MCP server
```

## Quality Enhancement Loop

After Skill_Seekers scraping:

```
1. Scrape with Skill_Seekers → Initial skill
2. Quality check → Score: 7.4/10
3. Apply enhancements → Fix issues
4. Re-check → Score: 8.2/10 ✅
5. Test with scenarios
6. Deliver
```
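
Steps 2 through 4 above amount to a score-and-fix loop. The sketch below shows one possible shape for it, with the scorer and enhancer passed in as callables. The `enhancement_loop` function and its `max_rounds` cap are hypothetical; the 8.0 threshold is taken from the quality check earlier in this guide.

```python
from typing import Callable, List, Tuple

def enhancement_loop(
    skill: str,
    score: Callable[[str], Tuple[float, List[str]]],
    enhance: Callable[[str, List[str]], str],
    threshold: float = 8.0,
    max_rounds: int = 3,
) -> Tuple[str, float]:
    """Re-score and enhance until the threshold is met or rounds run out."""
    current, issues = score(skill)
    rounds = 0
    while current < threshold and rounds < max_rounds:
        skill = enhance(skill, issues)   # step 3: fix reported issues
        current, issues = score(skill)   # step 4: re-check
        rounds += 1
    return skill, current
```

Capping the number of rounds keeps the loop from running forever when an enhancement pass cannot raise the score.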

## When NOT to Use Skill_Seekers

Don't use for:
- Custom workflows (no docs to scrape)
- Company-specific processes
- Novel methodologies
- Skills requiring original thinking

Use manual TDD approach instead (Path B).

## Source

Integration built on [Skill_Seekers v2.0.0](https://github.com/yusufkaraaslan/Skill_Seekers)
- MIT License
- 3,562 stars
- Active maintenance
- 299 passing tests

## Quick Reference

```bash
# Check installation
scripts/check-skill-seekers.sh

# Install
scripts/install-skill-seekers.sh

# Scrape documentation
scripts/run-automated.sh <url> <skill-name>

# Scrape GitHub
scripts/run-github-scrape.sh <org/repo> <skill-name>

# Scrape PDF
scripts/run-pdf-scrape.sh <pdf-path> <skill-name>
```