# Best Practices

Essential principles and proven strategies for effective documentation discovery.

## 1. Prioritize context7.com for llms.txt

### Why

- **Comprehensive aggregator**: Single source for most documentation
- **Most efficient**: Instant access without searching
- **Authoritative**: Aggregates official sources
- **Up-to-date**: Continuously maintained
- **Fast**: Direct URL construction vs searching
- **Topic filtering**: Targeted results with ?topic= parameter

### Implementation

```
Step 1: Try context7.com (ALWAYS FIRST)
    ↓
Know GitHub repo?
    YES → https://context7.com/{org}/{repo}/llms.txt
    NO → Continue
    ↓
Know website?
    YES → https://context7.com/websites/{normalized-path}/llms.txt
    NO → Continue
    ↓
Specific topic needed?
    YES → Add ?topic={query} parameter
    NO → Use base URL
    ↓
Found?
    YES → Use as primary source
    NO → Fall back to WebSearch for llms.txt
    ↓
Still not found?
    YES → Fall back to repository analysis
```
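
The decision chain above is mechanical enough to script. Here is a minimal sketch in Python; the helper name `context7_llms_url` is ours, and the website-path normalization is an assumption (the flow above only says `{normalized-path}`), so verify it against context7.com's actual convention.

```python
from urllib.parse import quote, urlparse

def context7_llms_url(org_repo: str | None = None,
                      site_url: str | None = None,
                      topic: str | None = None) -> str | None:
    """Construct a context7.com llms.txt URL from a repo slug or a docs site URL."""
    if org_repo:                      # e.g. "vercel/next.js"
        base = f"https://context7.com/{org_repo}/llms.txt"
    elif site_url:                    # e.g. "https://docs.astro.build"
        parsed = urlparse(site_url)
        # ASSUMPTION: host and path are joined as-is; the real
        # normalization rule used by context7.com may differ.
        path = "/".join(p for p in [parsed.netloc, parsed.path.strip("/")] if p)
        base = f"https://context7.com/websites/{path}/llms.txt"
    else:
        return None                   # nothing to build → fall back to WebSearch
    return f"{base}?topic={quote(topic)}" if topic else base

# context7_llms_url("shadcn-ui/ui", topic="date")
# → "https://context7.com/shadcn-ui/ui/llms.txt?topic=date"
```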

### Examples

```
Best approach (context7.com):
1. Direct URL: https://context7.com/vercel/next.js/llms.txt
2. WebFetch llms.txt
3. Launch Explorer agents for URLs
Total time: ~15 seconds

Topic-specific approach:
1. Direct URL: https://context7.com/shadcn-ui/ui/llms.txt?topic=date
2. WebFetch filtered content
3. Present focused results
Total time: ~10 seconds

Good fallback approach:
1. context7.com returns 404
2. WebSearch: "Astro llms.txt site:docs.astro.build"
3. Found → WebFetch llms.txt
4. Launch Explorer agents for URLs
Total time: ~60 seconds

Poor approach:
1. Skip context7.com entirely
2. Search for various documentation pages
3. Manually collect URLs
4. Process one by one
Total time: ~5 minutes
```

### When Not Available

Fallback strategy when context7.com is unavailable (a code sketch of this chain follows the list):
- If context7.com returns 404 → try WebSearch for llms.txt
- If WebSearch finds nothing in 30 seconds → move to repository analysis
- If domain is incorrect → try 2-3 alternatives, then move on
- If documentation is very old → likely doesn't have llms.txt
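
A minimal sketch of that chain, assuming `web_fetch` and `web_search` stand-ins for whatever fetch/search tools are available (both stubs and their signatures are illustrative, not a real API):

```python
from typing import Optional

def web_fetch(url: str, timeout: int) -> Optional[str]:
    """Stand-in for the WebFetch tool: page text, or None on 404/timeout."""
    ...

def web_search(query: str, timeout: int) -> list[str]:
    """Stand-in for the WebSearch tool: candidate llms.txt URLs."""
    ...

def find_llms_txt(library: str, candidates: list[str]) -> Optional[tuple[str, str]]:
    """Walk the fallback chain: context7.com candidates first, then WebSearch."""
    for url in candidates[:3]:              # try 2-3 alternatives, then move on
        content = web_fetch(url, timeout=60)
        if content:
            return ("context7", content)
    for hit in web_search(f'"{library}" llms.txt', timeout=30):
        content = web_fetch(hit, timeout=60)
        if content:
            return ("websearch", content)
    return None                             # caller falls back to repository analysis
```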

## 2. Use Parallel Agents Aggressively

### Why

- **Speed**: N tasks in the time of 1 (vs N × time sequentially)
- **Efficiency**: Better resource utilization
- **Coverage**: Comprehensive results faster
- **Scalability**: Handles large documentation sets

### Guidelines

**Always use parallel agents for 3+ URLs:**
```
3 URLs → 1 Explorer agent (acceptable)
4-10 URLs → 3-5 Explorer agents (optimal)
11+ URLs → 5-7 agents in phases (best)
```
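
These thresholds reduce to a one-line heuristic; a sketch (the counts are the guideline's, the function name is ours):

```python
def agent_count(n_urls: int) -> int:
    """Map URL count to Explorer agent count per the guideline above."""
    if n_urls <= 3:
        return 1                             # one agent is acceptable for up to 3 URLs
    if n_urls <= 10:
        return min(5, max(3, n_urls // 2))   # 3-5 agents is the optimal band
    return 7                                 # 11+ URLs: cap at 5-7, run in phases
```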

**Launch all agents in a single message:**
```
Good:
[Send one message with 5 Task tool calls]

Bad:
[Send message with Task call]
[Wait for result]
[Send another message with Task call]
[Wait for result]
...
```

### Distribution Strategy

**Even distribution:**
```
10 URLs, 5 agents:
Agent 1: URLs 1-2
Agent 2: URLs 3-4
Agent 3: URLs 5-6
Agent 4: URLs 7-8
Agent 5: URLs 9-10
```
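
Even distribution is a simple chunking problem; a sketch that splits any URL list into near-equal, contiguous batches (the function name is ours):

```python
def distribute(urls: list[str], n_agents: int) -> list[list[str]]:
    """Split URLs into n_agents near-equal, contiguous batches."""
    k, r = divmod(len(urls), n_agents)
    batches, start = [], 0
    for i in range(n_agents):
        size = k + (1 if i < r else 0)   # spread the remainder over the first batches
        batches.append(urls[start:start + size])
        start += size
    return batches

# distribute([f"u{i}" for i in range(1, 11)], 5)
# → [['u1', 'u2'], ['u3', 'u4'], ['u5', 'u6'], ['u7', 'u8'], ['u9', 'u10']]
```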

**Topic-based distribution:**
```
10 URLs, 3 agents:
Agent 1: Installation & Setup (URLs 1-3)
Agent 2: Core Concepts & API (URLs 4-7)
Agent 3: Examples & Guides (URLs 8-10)
```

### When Not to Parallelize

- Single URL (use WebFetch)
- 2 URLs (single agent is fine)
- Dependencies between tasks (sequential required)
- Limited documentation (1-2 pages)

## 3. Verify Official Sources

### Why

- **Accuracy**: Avoid outdated information
- **Security**: Prevent malicious content
- **Credibility**: Maintain trust
- **Relevance**: Match user's version/needs

### Verification Checklist

**For llms.txt:**
```
[ ] Domain matches official site
[ ] HTTPS connection
[ ] Content format is valid
[ ] URLs point to official docs
[ ] Last-Modified header is recent (if available)
```
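
The mechanical parts of this checklist are scriptable with the standard library alone; a minimal sketch (the `official_domains` input is a hypothetical allowlist you maintain yourself):

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen

def basic_llms_txt_checks(url: str, official_domains: set[str]) -> dict[str, bool]:
    """Automate the scheme, domain, and header checks from the list above."""
    parsed = urlparse(url)
    checks = {
        "https": parsed.scheme == "https",
        "official_domain": parsed.netloc in official_domains,
    }
    try:
        # HEAD request: we only need the status and headers, not the body
        resp = urlopen(Request(url, method="HEAD"), timeout=10)
        checks["reachable"] = resp.status == 200
        checks["has_last_modified"] = "Last-Modified" in resp.headers
    except OSError:
        checks["reachable"] = False
        checks["has_last_modified"] = False
    return checks
```

Content validity and spot-checking the listed URLs remain judgment calls; this only automates the mechanical part.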

**For repositories:**
```
[ ] Organization matches official entity
[ ] Star count appropriate for library
[ ] Recent commits (last 6 months)
[ ] README mentions official status
[ ] Links back to official website
[ ] License matches expectations
```

**For documentation:**
```
[ ] Domain is official
[ ] Version matches user request
[ ] Last updated date visible
[ ] Content is complete (not stubs)
[ ] Links work (not 404s)
```

### Red Flags

⚠️ **Unofficial sources:**
- Personal GitHub forks
- Outdated tutorials (>2 years old)
- Unmaintained repositories
- Suspicious domains
- No version information
- Conflicting with official docs

### When to Use Unofficial Sources

Acceptable when:
- No official documentation exists
- Clearly labeled as community resource
- Recent and well-maintained
- Cross-referenced with official info
- User is aware of unofficial status

## 4. Report Methodology

### Why

- **Transparency**: User knows how you found info
- **Reproducibility**: User can verify
- **Troubleshooting**: Helps debug issues
- **Trust**: Builds confidence in results

### What to Include

**Always report:**
```markdown
## Source

**Method**: llms.txt / Repository / Research / Mixed
**Primary source**: [main URL or repository]
**Additional sources**: [list]
**Date accessed**: [current date]
**Version**: [documentation version]
```
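
If these fields are tracked as structured data during discovery, the Source block can be rendered mechanically; a sketch (field names mirror the template above, the function itself is hypothetical):

```python
from datetime import date

def render_source_block(method: str, primary: str,
                        additional: list[str], version: str) -> str:
    """Fill the Source template above from collected metadata."""
    extra = ", ".join(additional) if additional else "none"
    return (
        "## Source\n\n"
        f"**Method**: {method}\n"
        f"**Primary source**: {primary}\n"
        f"**Additional sources**: {extra}\n"
        f"**Date accessed**: {date.today().isoformat()}\n"
        f"**Version**: {version}\n"
    )

# print(render_source_block("llms.txt", "https://docs.astro.build/llms.txt", [], "latest"))
```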

**For llms.txt:**
```markdown
**Method**: llms.txt
**URL**: https://docs.astro.build/llms.txt
**URLs processed**: 8
**Date accessed**: 2025-10-26
**Version**: Latest (as of Oct 2025)
```

**For repository:**
```markdown
**Method**: Repository analysis (Repomix)
**Repository**: https://github.com/org/library
**Commit**: abc123f (2025-10-20)
**Stars**: 15.2k
**Analysis date**: 2025-10-26
```

**For research:**
```markdown
**Method**: Multi-source research
**Sources**:
- Official website: [url]
- Package registry: [url]
- Stack Overflow: [url]
- Community tutorials: [urls]
**Date accessed**: 2025-10-26
**Note**: No official llms.txt or repository available
```

### Limitations Disclosure

Always note:
```markdown
## ⚠️ Limitations

- Documentation for v2.x (user may need v3.x)
- API reference section incomplete
- Examples based on TypeScript (Python examples unavailable)
- Last updated 6 months ago
```

## 5. Handle Versions Explicitly

### Why

- **Compatibility**: Avoid version mismatch errors
- **Accuracy**: Features vary by version
- **Migration**: Support upgrade paths
- **Clarity**: No ambiguity about what's covered

### Version Detection

**Check these sources:**
```
1. URL path: /docs/v2/
2. Page header/title
3. Version selector on page
4. Git tag/branch name
5. package.json or equivalent
6. Release date correlation
```
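
Source 1 is easy to automate; a sketch that pulls a version segment out of a docs URL path (the regex covers common `/v2/` and `/docs/2.1/` shapes and is an assumption, not exhaustive):

```python
import re

def version_from_url(url: str) -> str | None:
    """Extract a version segment like '2' or '2.1' from a docs URL path."""
    m = re.search(r"/v?(\d+(?:\.\d+)*)(?:/|$)", url)
    return m.group(1) if m else None

# version_from_url("https://example.com/docs/v2/")     → "2"
# version_from_url("https://example.com/docs/2.1/api") → "2.1"
```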

### Version Handling Rules

**User specifies version:**
```
Request: "Documentation for React 18"
→ Search: "React v18 documentation"
→ Verify: Check version in content
→ Report: "Documentation for React v18.2.0"
```

**User doesn't specify:**
```
Request: "Documentation for Next.js"
→ Default: Assume latest
→ Confirm: "I'll find the latest Next.js documentation"
→ Report: "Documentation for Next.js 14.0 (latest as of [date])"
```

**Version mismatch found:**
```
Request: "Docs for v2"
Found: Only v3 documentation
→ Report: "⚠️ Requested v2, but only v3 docs available. Here's v3 with migration guide."
```

### Multi-Version Scenarios

**Comparison request:**
```
Request: "Compare v1 and v2"
→ Find both versions
→ Launch parallel agents (set A for v1, set B for v2)
→ Present side-by-side analysis
```

**Migration request:**
```
Request: "How to migrate from v1 to v2"
→ Find v2 migration guide
→ Also fetch v1 and v2 docs
→ Highlight breaking changes
→ Provide code examples (before/after)
```

## 6. Aggregate Intelligently

### Why

- **Clarity**: Easier to understand
- **Efficiency**: Less cognitive load
- **Completeness**: Unified view
- **Actionability**: Clear next steps

### Bad Aggregation (Don't Do This)

```markdown
## Results

Agent 1 found:
[dump of agent 1 output]

Agent 2 found:
[dump of agent 2 output]

Agent 3 found:
[dump of agent 3 output]
```

Problems:
- Redundant information repeated
- No synthesis
- Hard to scan
- Lacks narrative

### Good Aggregation (Do This)

````markdown
## Installation

[Synthesized from agents 1 & 2]
Three installation methods available:

1. **npm (recommended)**:
   ```bash
   npm install library-name
   ```

2. **CDN**: [from agent 1]
   ```html
   <script src="..."></script>
   ```

3. **Manual**: [from agent 3]
   Download and include in project

## Core Concepts

[Synthesized from agents 2 & 4]
The library is built around three main concepts:

1. **Components**: [definition from agent 2]
2. **State**: [definition from agent 4]
3. **Effects**: [definition from agent 2]

## Examples

[From agents 3 & 5, deduplicated]
...
````

Benefits:
- Organized by topic
- Deduplicated
- Clear narrative
- Easy to scan

### Synthesis Techniques

**Deduplication:**
```
Agent 1: "Install with npm install foo"
Agent 2: "You can install using npm: npm install foo"
→ Synthesized: "Install: `npm install foo`"
```
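
Deduplication usually reduces to normalizing each finding before comparing; a minimal sketch (the normalization rule here, lowercasing plus collapsed whitespace, is an assumption; differently worded duplicates like the pair above need fuzzier matching):

```python
import re

def dedupe(findings: list[str]) -> list[str]:
    """Drop findings whose normalized text has already been seen."""
    seen: set[str] = set()
    unique: list[str] = []
    for text in findings:
        key = re.sub(r"\s+", " ", text.lower()).strip()
        if key not in seen:
            seen.add(key)
            unique.append(text)   # keep the first phrasing encountered
    return unique
```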

**Prioritization:**
```
Agent 1: Basic usage example
Agent 2: Basic usage example (same)
Agent 3: Advanced usage example
→ Keep: Basic (from agent 1) + Advanced (from agent 3)
```

**Organization:**
```
Agents returned mixed information:
- Installation steps
- Configuration
- Usage example
- Installation requirements
- More usage examples

→ Reorganize:
1. Installation (requirements + steps)
2. Configuration
3. Usage (all examples together)
```

## 7. Time Management

### Why

- **User experience**: Fast results
- **Resource efficiency**: Don't waste compute
- **Fail fast**: Quickly try alternatives
- **Practical limits**: Avoid hanging

### Timeouts

**Set explicit timeouts:**
```
WebSearch: 30 seconds
WebFetch: 60 seconds
Repository clone: 5 minutes
Repomix processing: 10 minutes
Explorer agent: 5 minutes per URL
Researcher agent: 10 minutes
```
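
Whatever tool performs the fetch, wrap it so a hung call cannot stall the pipeline; a sketch using the standard library's concurrent.futures (the `fetch` callable and the timeout value are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeout
from typing import Callable, Optional

def with_timeout(fetch: Callable[[str], str], url: str,
                 seconds: float) -> Optional[str]:
    """Run fetch(url), giving up after `seconds` (returns None on timeout)."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fetch, url)
    try:
        return future.result(timeout=seconds)
    except FuturesTimeout:
        return None   # fail fast: caller moves to the next method
    finally:
        # Don't block waiting on a hung worker; note the thread itself
        # may still run to completion in the background.
        pool.shutdown(wait=False, cancel_futures=True)

# e.g. with_timeout(my_web_fetch, "https://docs.example.com", 60)
```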

### Time Budgets

**Simple query (single library, latest version):**
```
Target: <2 minutes total

Phase 1 (Discovery): 30 seconds
- llms.txt search: 15 seconds
- Fetch llms.txt: 15 seconds

Phase 2 (Exploration): 60 seconds
- Launch agents: 5 seconds
- Agents fetch URLs: 60 seconds (parallel)

Phase 3 (Aggregation): 30 seconds
- Synthesize results
- Format output

Total: ~2 minutes
```

**Complex query (multiple versions, comparison):**
```
Target: <5 minutes total

Phase 1 (Discovery): 60 seconds
- Search both versions
- Fetch both llms.txt files

Phase 2 (Exploration): 180 seconds
- Launch 6 agents (2 sets of 3)
- Parallel exploration

Phase 3 (Comparison): 60 seconds
- Analyze differences
- Format side-by-side

Total: ~5 minutes
```

### When to Extend Timeouts

Acceptable to go longer when:
- User explicitly requests comprehensive analysis
- Repository is large but necessary
- Multiple fallbacks attempted
- User is informed of delay

### When to Give Up

Move to next method after:
- 3 failed attempts on same approach
- Timeout exceeded by 2x
- No progress for 30 seconds
- Error indicates permanent failure (404, auth required)

## 8. Cache Findings

### Why

- **Speed**: Instant results for repeated requests
- **Efficiency**: Reduce network requests
- **Consistency**: Same results within session
- **Reliability**: Less dependent on network

### What to Cache

**High value (always cache):**
```
- Repomix output (large, expensive to generate)
- llms.txt content (static, frequently referenced)
- Repository README (relatively static)
- Package registry metadata (changes rarely)
```

**Medium value (cache within session):**
```
- Documentation page content
- Search results
- Repository structure
- Version lists
```

**Low value (don't cache):**
```
- Real-time data (latest releases)
- User-specific content
- Time-sensitive information
```

### Cache Duration

```
Within conversation:
- All fetched content (reuse freely)

Within session:
- Repomix output (until conversation ends)
- llms.txt content (until new version requested)

Across sessions:
- Don't cache (start fresh each time)
```

### Cache Invalidation

Refresh cache when:
```
- User requests specific different version
- User says "get latest" or "refresh"
- Explicit time reference ("docs from today")
- Previous cache is from different library
```

### Implementation

```
# First request for library X
1. Fetch llms.txt
2. Store content in session variable
3. Use for processing

# Second request for library X (same session)
1. Check if llms.txt cached
2. Reuse cached content
3. Skip redundant fetch

# Request for library Y
1. Don't reuse library X cache
2. Fetch fresh for library Y
```
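
In code, the session cache is just a dict keyed by library and artifact, with a timestamp for the cache-hit message; a minimal sketch (the class and its method names are ours):

```python
import time

class SessionCache:
    """Per-session store keyed by (library, artifact), so library X never serves library Y."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, str], tuple[float, str]] = {}

    def get(self, library: str, artifact: str) -> str | None:
        entry = self._store.get((library, artifact))
        if entry is None:
            return None               # cache miss: fetch fresh
        fetched_at, content = entry
        age_min = (time.time() - fetched_at) / 60
        print(f"ℹ️ Using cached {artifact} from {age_min:.0f} minutes ago.")
        return content

    def put(self, library: str, artifact: str, content: str) -> None:
        self._store[(library, artifact)] = (time.time(), content)

    def invalidate(self, library: str) -> None:
        """Drop all artifacts for one library (e.g. the user said "refresh")."""
        self._store = {k: v for k, v in self._store.items() if k[0] != library}
```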

### Cache Hit Messages

```markdown
ℹ️ Using cached llms.txt from 5 minutes ago.
To fetch fresh, say "refresh" or "get latest".
```

## Quick Reference Checklist

### Before Starting

- [ ] Identify library name clearly
- [ ] Confirm version (default: latest)
- [ ] Check if cached data available
- [ ] Plan method (llms.txt → repo → research)

### During Discovery

- [ ] Start with llms.txt search
- [ ] Verify source is official
- [ ] Check version matches requirement
- [ ] Set timeout for each operation
- [ ] Fall back quickly if method fails

### During Exploration

- [ ] Use parallel agents for 3+ URLs
- [ ] Launch all agents in single message
- [ ] Distribute workload evenly
- [ ] Monitor for errors/timeouts
- [ ] Be ready to retry or fall back

### Before Presenting

- [ ] Synthesize by topic (not by agent)
- [ ] Deduplicate repeated information
- [ ] Verify version is correct
- [ ] Include source attribution
- [ ] Note any limitations
- [ ] Format clearly
- [ ] Check completeness

### Quality Gates

Ask before presenting:
- [ ] Is information accurate?
- [ ] Are sources official?
- [ ] Does version match request?
- [ ] Are all key topics covered?
- [ ] Are limitations noted?
- [ ] Is methodology documented?
- [ ] Is output well-organized?