# Best Practices
Essential principles and proven strategies for effective documentation discovery.
## 1. Prioritize context7.com for llms.txt
### Why
- **Comprehensive aggregator**: Single source for most documentation
- **Most efficient**: Instant access without searching
- **Authoritative**: Aggregates official sources
- **Up-to-date**: Continuously maintained
- **Fast**: Direct URL construction vs searching
- **Topic filtering**: Targeted results with ?topic= parameter
### Implementation
```
Step 1: Try context7.com (ALWAYS FIRST)

Know GitHub repo?
  YES → https://context7.com/{org}/{repo}/llms.txt
  NO  → Continue

Know website?
  YES → https://context7.com/websites/{normalized-path}/llms.txt
  NO  → Continue

Specific topic needed?
  YES → Add ?topic={query} parameter
  NO  → Use base URL

Found?
  YES → Use as primary source
  NO  → Fall back to WebSearch for llms.txt

Still not found?
  YES → Fall back to repository analysis
```
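A minimal sketch of the URL construction above, in Python. The repo-slug form mirrors the pattern shown; the exact normalization rule for website paths is an assumption, and `context7_url` is an illustrative helper, not part of any tool.
```python
from urllib.parse import quote

def context7_url(org_repo: str | None = None,
                 website_path: str | None = None,
                 topic: str | None = None) -> str:
    """Build a context7.com llms.txt URL from a GitHub slug or a website path."""
    if org_repo:
        base = f"https://context7.com/{org_repo}/llms.txt"
    elif website_path:
        # Assumed normalization: scheme already dropped, slashes trimmed, path kept.
        base = f"https://context7.com/websites/{website_path.strip('/')}/llms.txt"
    else:
        raise ValueError("need a GitHub org/repo slug or a website path")
    return f"{base}?topic={quote(topic)}" if topic else base

print(context7_url(org_repo="vercel/next.js"))
# https://context7.com/vercel/next.js/llms.txt
print(context7_url(org_repo="shadcn-ui/ui", topic="date"))
# https://context7.com/shadcn-ui/ui/llms.txt?topic=date
```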
### Examples
```
Best approach (context7.com):
1. Direct URL: https://context7.com/vercel/next.js/llms.txt
2. WebFetch llms.txt
3. Launch Explorer agents for URLs
Total time: ~15 seconds

Topic-specific approach:
1. Direct URL: https://context7.com/shadcn-ui/ui/llms.txt?topic=date
2. WebFetch filtered content
3. Present focused results
Total time: ~10 seconds

Good fallback approach:
1. context7.com returns 404
2. WebSearch: "Astro llms.txt site:docs.astro.build"
3. Found → WebFetch llms.txt
4. Launch Explorer agents for URLs
Total time: ~60 seconds

Poor approach:
1. Skip context7.com entirely
2. Search for various documentation pages
3. Manually collect URLs
4. Process one by one
Total time: ~5 minutes
```
### When Not Available
Fallback strategy when context7.com unavailable:
- If context7.com returns 404 → try WebSearch for llms.txt
- If WebSearch finds nothing in 30 seconds → move to repository
- If domain is incorrect → try 2-3 alternatives, then move on
- If documentation is very old → likely doesn't have llms.txt
## 2. Use Parallel Agents Aggressively
### Why
- **Speed**: N tasks complete in roughly the time of one (instead of N × time sequentially)
- **Efficiency**: Better resource utilization
- **Coverage**: Comprehensive results faster
- **Scalability**: Handles large documentation sets
### Guidelines
**Always use parallel for 3+ URLs:**
```
3 URLs → 1 Explorer agent (acceptable)
4-10 URLs → 3-5 Explorer agents (optimal)
11+ URLs → 5-7 agents in phases (best)
```
**Launch all agents in single message:**
```
Good:
[Send one message with 5 Task tool calls]
Bad:
[Send message with Task call]
[Wait for result]
[Send another message with Task call]
[Wait for result]
...
```
### Distribution Strategy
**Even distribution:**
```
10 URLs, 5 agents:
Agent 1: URLs 1-2
Agent 2: URLs 3-4
Agent 3: URLs 5-6
Agent 4: URLs 7-8
Agent 5: URLs 9-10
```
**Topic-based distribution:**
```
10 URLs, 3 agents:
Agent 1: Installation & Setup (URLs 1-3)
Agent 2: Core Concepts & API (URLs 4-7)
Agent 3: Examples & Guides (URLs 8-10)
```
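A small sketch of the even-distribution rule, assuming the URL list has already been collected from llms.txt; `distribute` is an illustrative helper, and its batches would then be handed to Explorer agents launched in a single message.
```python
def distribute(urls: list[str], agent_count: int) -> list[list[str]]:
    """Split URLs into contiguous, near-equal batches, one batch per Explorer agent."""
    size, extra = divmod(len(urls), agent_count)
    batches, start = [], 0
    for i in range(agent_count):
        end = start + size + (1 if i < extra else 0)  # the first `extra` agents take one more URL
        batches.append(urls[start:end])
        start = end
    return batches

urls = [f"https://docs.example.com/page-{n}" for n in range(1, 11)]  # hypothetical URLs
for n, batch in enumerate(distribute(urls, 5), start=1):
    print(f"Agent {n}: {len(batch)} URLs")  # Agent 1: 2 URLs, Agent 2: 2 URLs, ...
```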
### When Not to Parallelize
- Single URL (use WebFetch)
- 2 URLs (single agent is fine)
- Dependencies between tasks (sequential required)
- Limited documentation (1-2 pages)
## 3. Verify Official Sources
### Why
- **Accuracy**: Avoid outdated information
- **Security**: Prevent malicious content
- **Credibility**: Maintain trust
- **Relevance**: Match user's version/needs
### Verification Checklist
**For llms.txt:**
```
[ ] Domain matches official site
[ ] HTTPS connection
[ ] Content format is valid
[ ] URLs point to official docs
[ ] Last-Modified header is recent (if available)
```
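The first two items are mechanical and can be checked in code; the rest need content-level judgement. A hedged sketch, assuming the official domain is already known (`looks_official` is illustrative, not a real tool):
```python
from urllib.parse import urlparse

def looks_official(llms_txt_url: str, official_domain: str) -> bool:
    """Cheap structural checks only: HTTPS plus a host on the official domain."""
    parsed = urlparse(llms_txt_url)
    host = parsed.hostname or ""
    on_domain = host == official_domain or host.endswith("." + official_domain)
    return parsed.scheme == "https" and on_domain

print(looks_official("https://docs.astro.build/llms.txt", "astro.build"))       # True
print(looks_official("http://astro-docs.example.net/llms.txt", "astro.build"))  # False (hypothetical look-alike)
```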
**For repositories:**
```
[ ] Organization matches official entity
[ ] Star count appropriate for library
[ ] Recent commits (last 6 months)
[ ] README mentions official status
[ ] Links back to official website
[ ] License matches expectations
```
**For documentation:**
```
[ ] Domain is official
[ ] Version matches user request
[ ] Last updated date visible
[ ] Content is complete (not stubs)
[ ] Links work (not 404s)
```
### Red Flags
⚠️ **Unofficial sources:**
- Personal GitHub forks
- Outdated tutorials (>2 years old)
- Unmaintained repositories
- Suspicious domains
- No version information
- Conflicting with official docs
### When to Use Unofficial Sources
Acceptable when:
- No official documentation exists
- Clearly labeled as community resource
- Recent and well-maintained
- Cross-referenced with official info
- User is aware of unofficial status
## 4. Report Methodology
### Why
- **Transparency**: User knows how you found info
- **Reproducibility**: User can verify
- **Troubleshooting**: Helps debug issues
- **Trust**: Builds confidence in results
### What to Include
**Always report:**
```markdown
## Source
**Method**: llms.txt / Repository / Research / Mixed
**Primary source**: [main URL or repository]
**Additional sources**: [list]
**Date accessed**: [current date]
**Version**: [documentation version]
```
**For llms.txt:**
```markdown
**Method**: llms.txt
**URL**: https://docs.astro.build/llms.txt
**URLs processed**: 8
**Date accessed**: 2025-10-26
**Version**: Latest (as of Oct 2025)
```
**For repository:**
```markdown
**Method**: Repository analysis (Repomix)
**Repository**: https://github.com/org/library
**Commit**: abc123f (2025-10-20)
**Stars**: 15.2k
**Analysis date**: 2025-10-26
```
**For research:**
```markdown
**Method**: Multi-source research
**Sources**:
- Official website: [url]
- Package registry: [url]
- Stack Overflow: [url]
- Community tutorials: [urls]
**Date accessed**: 2025-10-26
**Note**: No official llms.txt or repository available
```
### Limitations Disclosure
Always note:
```markdown
## ⚠️ Limitations
- Documentation for v2.x (user may need v3.x)
- API reference section incomplete
- Examples based on TypeScript (Python examples unavailable)
- Last updated 6 months ago
```
## 5. Handle Versions Explicitly
### Why
- **Compatibility**: Avoid version mismatch errors
- **Accuracy**: Features vary by version
- **Migration**: Support upgrade paths
- **Clarity**: No ambiguity about what's covered
### Version Detection
**Check these sources:**
```
1. URL path: /docs/v2/
2. Page header/title
3. Version selector on page
4. Git tag/branch name
5. Package.json or equivalent
6. Release date correlation
```
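The first check in that list (a versioned URL path) is easy to automate; a sketch below, assuming versions show up as path segments like `/v2/` or `/14.2/`. The pattern is an assumption rather than something every docs site follows, so the other sources remain the fallback.
```python
import re

def version_from_url(url: str) -> str | None:
    """Pull a version-looking path segment (e.g. /v2/, /14.2/) out of a docs URL."""
    match = re.search(r"/v?(\d+(?:\.\d+)*)(?:/|$)", url)
    return match.group(1) if match else None

print(version_from_url("https://docs.example.com/docs/v2/"))       # "2"    (hypothetical URL)
print(version_from_url("https://docs.example.com/en/14.2/intro"))  # "14.2"
print(version_from_url("https://docs.example.com/latest/intro"))   # None → check the page itself
```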
### Version Handling Rules
**User specifies version:**
```
Request: "Documentation for React 18"
→ Search: "React v18 documentation"
→ Verify: Check version in content
→ Report: "Documentation for React v18.2.0"
```
**User doesn't specify:**
```
Request: "Documentation for Next.js"
→ Default: Assume latest
→ Confirm: "I'll find the latest Next.js documentation"
→ Report: "Documentation for Next.js 14.0 (latest as of [date])"
```
**Version mismatch found:**
```
Request: "Docs for v2"
Found: Only v3 documentation
→ Report: "⚠️ Requested v2, but only v3 docs available. Here's v3 with migration guide."
```
### Multi-Version Scenarios
**Comparison request:**
```
Request: "Compare v1 and v2"
→ Find both versions
→ Launch parallel agents (set A for v1, set B for v2)
→ Present side-by-side analysis
```
**Migration request:**
```
Request: "How to migrate from v1 to v2"
→ Find v2 migration guide
→ Also fetch v1 and v2 docs
→ Highlight breaking changes
→ Provide code examples (before/after)
```
## 6. Aggregate Intelligently
### Why
- **Clarity**: Easier to understand
- **Efficiency**: Less cognitive load
- **Completeness**: Unified view
- **Actionability**: Clear next steps
### Bad Aggregation (Don't Do This)
```markdown
## Results
Agent 1 found:
[dump of agent 1 output]
Agent 2 found:
[dump of agent 2 output]
Agent 3 found:
[dump of agent 3 output]
```
Problems:
- Redundant information repeated
- No synthesis
- Hard to scan
- Lacks narrative
### Good Aggregation (Do This)
````markdown
## Installation
[Synthesized from agents 1 & 2]
Three installation methods available:
1. **npm (recommended)**:
```bash
npm install library-name
```
2. **CDN**: [from agent 1]
```html
<script src="..."></script>
```
3. **Manual**: [from agent 3]
Download and include in project
## Core Concepts
[Synthesized from agents 2 & 4]
The library is built around three main concepts:
1. **Components**: [definition from agent 2]
2. **State**: [definition from agent 4]
3. **Effects**: [definition from agent 2]
## Examples
[From agents 3 & 5, deduplicated]
...
````
Benefits:
- Organized by topic
- Deduplicated
- Clear narrative
- Easy to scan
### Synthesis Techniques
**Deduplication:**
```
Agent 1: "Install with npm install foo"
Agent 2: "You can install using npm: npm install foo"
→ Synthesized: "Install: `npm install foo`"
```
**Prioritization:**
```
Agent 1: Basic usage example
Agent 2: Basic usage example (same)
Agent 3: Advanced usage example
→ Keep: Basic (from agent 1) + Advanced (from agent 3)
```
**Organization:**
```
Agents returned mixed information:
- Installation steps
- Configuration
- Usage example
- Installation requirements
- More usage examples
→ Reorganize:
1. Installation (requirements + steps)
2. Configuration
3. Usage (all examples together)
```
## 7. Time Management
### Why
- **User experience**: Fast results
- **Resource efficiency**: Don't waste compute
- **Fail fast**: Quickly try alternatives
- **Practical limits**: Avoid hanging
### Timeouts
**Set explicit timeouts:**
```
WebSearch: 30 seconds
WebFetch: 60 seconds
Repository clone: 5 minutes
Repomix processing: 10 minutes
Explorer agent: 5 minutes per URL
Researcher agent: 10 minutes
```
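One hedged way to enforce these budgets, assuming each operation can be wrapped as a plain Python callable; `run_with_timeout` and the `TIMEOUTS` table are illustrative, not part of any real tool API.
```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# Budgets in seconds, mirroring the table above.
TIMEOUTS = {
    "web_search": 30,
    "web_fetch": 60,
    "repo_clone": 5 * 60,
    "repomix": 10 * 60,
    "explorer_per_url": 5 * 60,
    "researcher": 10 * 60,
}

def run_with_timeout(operation: str, fn, *args, **kwargs):
    """Run fn within the budget for this operation; return None so the caller falls back."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=TIMEOUTS[operation])
    except TimeoutError:
        return None  # note: the worker thread is abandoned, not killed
    finally:
        pool.shutdown(wait=False, cancel_futures=True)
```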
### Time Budgets
**Simple query (single library, latest version):**
```
Target: <2 minutes total
Phase 1 (Discovery): 30 seconds
- llms.txt search: 15 seconds
- Fetch llms.txt: 15 seconds
Phase 2 (Exploration): 60 seconds
- Launch agents: 5 seconds
- Agents fetch URLs: 60 seconds (parallel)
Phase 3 (Aggregation): 30 seconds
- Synthesize results
- Format output
Total: ~2 minutes
```
**Complex query (multiple versions, comparison):**
```
Target: <5 minutes total
Phase 1 (Discovery): 60 seconds
- Search both versions
- Fetch both llms.txt files
Phase 2 (Exploration): 180 seconds
- Launch 6 agents (2 sets of 3)
- Parallel exploration
Phase 3 (Comparison): 60 seconds
- Analyze differences
- Format side-by-side
Total: ~5 minutes
```
### When to Extend Timeouts
Acceptable to go longer when:
- User explicitly requests comprehensive analysis
- Repository is large but necessary
- Multiple fallbacks attempted
- User is informed of delay
### When to Give Up
Move to next method after:
- 3 failed attempts on same approach
- Timeout exceeded by 2x
- No progress for 30 seconds
- Error indicates permanent failure (404, auth required)
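A compact sketch of the give-up rule, assuming each attempt returns an HTTP-style status code; the status values and the `try_method` helper are illustrative.
```python
PERMANENT_FAILURES = {401, 403, 404}  # auth required or resource missing: do not retry

def try_method(attempt_once, max_attempts: int = 3):
    """Run one discovery method; stop after 3 failed attempts or any permanent failure."""
    for _ in range(max_attempts):
        status, body = attempt_once()
        if status == 200:
            return body
        if status in PERMANENT_FAILURES:
            break  # permanent: fall back to the next method immediately
    return None  # caller moves on (e.g. llms.txt → repository → research)
```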
## 8. Cache Findings
### Why
- **Speed**: Instant results for repeated requests
- **Efficiency**: Reduce network requests
- **Consistency**: Same results within session
- **Reliability**: Less dependent on network
### What to Cache
**High value (always cache):**
```
- Repomix output (large, expensive to generate)
- llms.txt content (static, frequently referenced)
- Repository README (relatively static)
- Package registry metadata (changes rarely)
```
**Medium value (cache within session):**
```
- Documentation page content
- Search results
- Repository structure
- Version lists
```
**Low value (don't cache):**
```
- Real-time data (latest releases)
- User-specific content
- Time-sensitive information
```
### Cache Duration
```
Within conversation:
- All fetched content (reuse freely)
Within session:
- Repomix output (until conversation ends)
- llms.txt content (until new version requested)
Across sessions:
- Don't cache (start fresh each time)
```
### Cache Invalidation
Refresh cache when:
```
- User requests specific different version
- User says "get latest" or "refresh"
- Explicit time reference ("docs from today")
- Previous cache is from different library
```
### Implementation
```
# First request for library X
1. Fetch llms.txt
2. Store content in session variable
3. Use for processing
# Second request for library X (same session)
1. Check if llms.txt cached
2. Reuse cached content
3. Skip redundant fetch
# Request for library Y
1. Don't reuse library X cache
2. Fetch fresh for library Y
```
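A minimal sketch of that per-library session cache, assuming a plain in-memory dict that lives for the conversation; `fetch` stands in for the actual WebFetch call and is an assumption.
```python
_session_cache: dict[tuple[str, str], str] = {}

def get_llms_txt(library: str, fetch, refresh: bool = False) -> str:
    """Return cached llms.txt for a library; fetch only on a miss or an explicit refresh."""
    key = (library, "llms.txt")
    if refresh or key not in _session_cache:
        _session_cache[key] = fetch(library)  # first request for library X: real fetch
    return _session_cache[key]                # repeat request for X: cached, no network

# A request for library Y uses the key ("Y", "llms.txt"), so library X's cache is never reused.
```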
### Cache Hit Messages
```markdown
Using cached llms.txt from 5 minutes ago.
To fetch fresh, say "refresh" or "get latest".
```
## Quick Reference Checklist
### Before Starting
- [ ] Identify library name clearly
- [ ] Confirm version (default: latest)
- [ ] Check if cached data available
- [ ] Plan method (llms.txt → repo → research)
### During Discovery
- [ ] Start with llms.txt search
- [ ] Verify source is official
- [ ] Check version matches requirement
- [ ] Set timeout for each operation
- [ ] Fall back quickly if method fails
### During Exploration
- [ ] Use parallel agents for 3+ URLs
- [ ] Launch all agents in single message
- [ ] Distribute workload evenly
- [ ] Monitor for errors/timeouts
- [ ] Be ready to retry or fallback
### Before Presenting
- [ ] Synthesize by topic (not by agent)
- [ ] Deduplicate repeated information
- [ ] Verify version is correct
- [ ] Include source attribution
- [ ] Note any limitations
- [ ] Format clearly
- [ ] Check completeness
### Quality Gates
Ask before presenting:
- [ ] Is information accurate?
- [ ] Are sources official?
- [ ] Does version match request?
- [ ] Are all key topics covered?
- [ ] Are limitations noted?
- [ ] Is methodology documented?
- [ ] Is output well-organized?