# Performance Optimization
Strategies and techniques for maximizing speed and efficiency in documentation discovery.
## Core Principles
### 0. Use context7.com for Instant llms.txt Access
**Fastest Approach:**
Direct URL construction instead of searching:
```
Traditional: WebSearch (15-30s) → WebFetch (5-10s) = 20-40s
context7.com: Direct WebFetch (5-10s) = 5-10s
Speed improvement: 2-4x faster
```
**Benefits:**
- No search required (instant URL construction)
- Consistent URL patterns
- Reliable availability
- Topic filtering for targeted results
**Examples:**
```
GitHub repo:
https://context7.com/vercel/next.js/llms.txt
→ Instant, no search needed

Website:
https://context7.com/websites/imgix/llms.txt
→ Instant, no search needed

Topic-specific:
https://context7.com/shadcn-ui/ui/llms.txt?topic=date
→ Filtered results, even faster
```
**Performance Impact:**
```
Without context7.com:
1. WebSearch for llms.txt: 15s
2. WebFetch llms.txt: 5s
3. Launch agents: 5s
Total: 25s

With context7.com:
1. Direct WebFetch: 5s
2. Launch agents: 5s
Total: 10s (2.5x faster!)

With context7.com + topic:
1. Direct WebFetch (filtered): 3s
2. Process focused results: 2s
Total: 5s (5x faster!)
```
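Because the URL shapes are fixed, construction is mechanical. A minimal Python sketch of the patterns above (the helper itself is illustrative, not an existing client library):

```python
from urllib.parse import quote, urlencode

def context7_url(name: str, topic: str | None = None, website: bool = False) -> str:
    """Build a context7.com llms.txt URL directly instead of searching.

    name:    "vercel/next.js" for a GitHub repo, or "imgix" for a website.
    topic:   optional filter (e.g. "date"), appended as ?topic=... for
             narrower, faster results.
    website: use the /websites/ prefix instead of an org/repo path.
    """
    path = f"websites/{quote(name)}" if website else quote(name, safe="/")
    url = f"https://context7.com/{path}/llms.txt"
    if topic:
        url += "?" + urlencode({"topic": topic})
    return url

print(context7_url("vercel/next.js"))               # GitHub repo
print(context7_url("imgix", website=True))          # website
print(context7_url("shadcn-ui/ui", topic="date"))   # topic-filtered
```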
### 1. Minimize Sequential Operations
**The Problem:**
Sequential operations add up linearly:
```
Total Time = Op1 + Op2 + Op3 + ... + OpN
```
Example:
```
Fetch URL 1: 5 seconds
Fetch URL 2: 5 seconds
Fetch URL 3: 5 seconds
Total: 15 seconds
```
**The Solution:**
Parallel operations complete in max time of slowest:
```
Total Time = max(Op1, Op2, Op3, ..., OpN)
```
Example:
```
Launch 3 agents simultaneously
All complete in: ~5 seconds
Total: 5 seconds (3x faster!)
```
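In plain Python, the same idea looks like this, with a thread pool standing in for parallel Explorer agents (`fetch` is a stand-in for a single WebFetch; the URLs are placeholders):

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def fetch(url: str) -> str:
    """Stand-in for one WebFetch: download a single page."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")

urls = ["https://example.com"] * 3   # placeholders for three distinct doc pages

start = time.perf_counter()
# Sequential version for comparison: total time is the SUM of all fetches.
#   pages = [fetch(u) for u in urls]
# Parallel version: total time is roughly the SLOWEST single fetch.
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    pages = list(pool.map(fetch, urls))
print(f"Fetched {len(pages)} pages in {time.perf_counter() - start:.1f}s")
```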
### 2. Batch Related Operations
**Benefits:**
- Fewer context switches
- Better resource utilization
- Easier to track
- More efficient aggregation
**Grouping Strategies:**
**By topic:**
```
Agent 1: All authentication-related docs
Agent 2: All database-related docs
Agent 3: All API-related docs
```
**By content type:**
```
Agent 1: All tutorials
Agent 2: All reference docs
Agent 3: All examples
```
**By priority:**
```
Phase 1 (critical): Getting started, installation, core concepts
Phase 2 (important): Guides, API reference, configuration
Phase 3 (optional): Advanced topics, internals, optimization
```
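Topic grouping can be as simple as keyword bucketing over the URL list. A sketch, with illustrative keyword sets (not a fixed taxonomy):

```python
from collections import defaultdict

TOPIC_KEYWORDS = {                 # illustrative buckets
    "auth":     ["auth", "login", "oauth", "session"],
    "database": ["db", "database", "query", "schema"],
    "api":      ["api", "endpoint", "rest", "graphql"],
}

def group_by_topic(urls: list[str]) -> dict[str, list[str]]:
    """Assign each URL to the first topic whose keyword appears in it;
    everything else lands in a catch-all batch."""
    batches: dict[str, list[str]] = defaultdict(list)
    for url in urls:
        lowered = url.lower()
        for topic, words in TOPIC_KEYWORDS.items():
            if any(w in lowered for w in words):
                batches[topic].append(url)
                break
        else:
            batches["misc"].append(url)
    return dict(batches)

print(group_by_topic([
    "https://docs.example.com/auth/oauth",
    "https://docs.example.com/database/schema",
    "https://docs.example.com/api/rest",
]))
```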
### 3. Smart Caching
**What to cache:**
- Repomix output (expensive to generate)
- llms.txt content (static)
- Repository structure (rarely changes)
- Documentation URLs (reference list)
**When to refresh:**
- User requests specific version
- Documentation updated (check last-modified)
- Cache is older than the current session
- User explicitly requests fresh data
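A minimal in-session cache along these lines (the one-hour max age is an assumed stand-in for "older than the current session"):

```python
import time

class SessionCache:
    """Cache expensive artifacts (Repomix output, llms.txt bodies) for the
    current session, with an age check so stale entries get refetched."""

    def __init__(self, max_age_s: float = 3600.0):
        self.max_age_s = max_age_s          # assumed session length
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> str | None:
        entry = self._store.get(key)
        if entry is None:
            return None
        fetched_at, value = entry
        if time.time() - fetched_at > self.max_age_s:
            del self._store[key]            # stale: force a fresh fetch
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._store[key] = (time.time(), value)

cache = SessionCache()
cache.put("https://context7.com/vercel/next.js/llms.txt", "...contents...")
print(cache.get("https://context7.com/vercel/next.js/llms.txt") is not None)
```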
### 4. Early Termination
**When to stop:**
```
✓ User's core needs met
✓ Critical information found
✓ Time limit approaching
✓ Diminishing returns (90% coverage achieved)
```
**How to decide:**
```
After Phase 1 (critical docs):
- Review what was found
- Check against user request
- If 80%+ covered → deliver now
- Offer to fetch more if needed
```
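The 80% rule can be made concrete by scoring found topics against requested ones. A sketch:

```python
def coverage(requested: set[str], found: set[str]) -> float:
    """Fraction of requested topics that phase-1 results already cover."""
    return len(requested & found) / len(requested) if requested else 1.0

requested = {"installation", "configuration", "basic-usage", "examples", "deployment"}
found     = {"installation", "configuration", "basic-usage", "examples"}

if coverage(requested, found) >= 0.8:
    print("80%+ covered -> deliver now, offer to fetch the rest on request")
else:
    print("Gaps remain -> continue to the next phase")
```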
## Performance Patterns
### Pattern 1: Parallel Exploration
**Scenario:** llms.txt contains 10 URLs
**Slow approach (sequential):**
```
Time: 10 URLs × 5 seconds = 50 seconds
Step 1: Fetch URL 1 (5s)
Step 2: Fetch URL 2 (5s)
Step 3: Fetch URL 3 (5s)
...
Step 10: Fetch URL 10 (5s)
```
**Fast approach (parallel):**
```
Time: ~5-10 seconds total
Step 1: Launch 5 Explorer agents (simultaneous)
  Agent 1: URLs 1-2
  Agent 2: URLs 3-4
  Agent 3: URLs 5-6
  Agent 4: URLs 7-8
  Agent 5: URLs 9-10
Step 2: Wait for all (max time: ~5-10s)
Step 3: Aggregate results
```
**Speedup:** 5-10x faster
### Pattern 2: Lazy Loading
**Scenario:** Documentation has 30+ pages
**Slow approach (fetch everything):**
```
Time: 30 URLs × 5 seconds ÷ 5 agents = 30 seconds
Fetch all 30 pages upfront
User only needs 5 of them
Wasted: 25 pages × 5 seconds ÷ 5 = 25 seconds
```
**Fast approach (priority loading):**
```
Time: 10 URLs × 5 seconds ÷ 5 agents = 10 seconds
Phase 1: Fetch critical 10 pages
Review: Does this cover user's needs?
If yes: Stop here (saved 20 seconds)
If no: Fetch additional as needed
```
**Speedup:** Up to 3x faster for typical use cases
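A sketch of the phased version; `fetch_all` and `needs_met` are placeholders for the parallel fetch and the coverage check shown earlier:

```python
def fetch_all(urls: list[str]) -> dict[str, str]:
    """Placeholder: fetch URLs in parallel and return {url: content}."""
    return {u: f"<contents of {u}>" for u in urls}

def needs_met(results: dict[str, str]) -> bool:
    """Placeholder: in practice, run the coverage check from above."""
    return len(results) >= 3

phases = [
    ("critical",      ["/install", "/quickstart", "/concepts"]),
    ("important",     ["/guides", "/api-reference", "/configuration"]),
    ("supplementary", ["/internals", "/optimization"]),
]

results: dict[str, str] = {}
for name, urls in phases:
    results.update(fetch_all(urls))
    if needs_met(results):
        print(f"Stopping after '{name}' phase; {len(results)} pages were enough")
        break
```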
### Pattern 3: Smart Fallbacks
**Scenario:** llms.txt not found
**Slow approach (exhaustive search):**
```
Time: ~5 minutes
Try: docs.library.com/llms.txt (30s timeout)
Try: library.dev/llms.txt (30s timeout)
Try: library.io/llms.txt (30s timeout)
Try: library.org/llms.txt (30s timeout)
Try: www.library.com/llms.txt (30s timeout)
Then: Fall back to repository
```
**Fast approach (quick fallback):**
```
Time: ~1 minute
Try: docs.library.com/llms.txt (15s)
Try: library.dev/llms.txt (15s)
Not found → Immediately try repository (30s)
```
**Speedup:** 5x faster
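The quick-fallback chain, sketched with short per-attempt timeouts and an immediate jump to the repository path (the domains are the placeholders from the example above, so in practice this prints the fallback message):

```python
import urllib.error
import urllib.request

CANDIDATES = [                       # only the two most likely hosts
    "https://docs.library.com/llms.txt",
    "https://library.dev/llms.txt",
]

def try_llms_txt(timeout_s: float = 15.0) -> str | None:
    """Return the first llms.txt found, or None to trigger the repo fallback."""
    for url in CANDIDATES:
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                return resp.read().decode("utf-8", errors="replace")
        except (urllib.error.URLError, TimeoutError):
            continue                 # fail fast, move to the next candidate
    return None

content = try_llms_txt()
if content is None:
    print("Not found -> immediately fall back to the repository")
```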
### Pattern 4: Incremental Results
**Scenario:** Large documentation set
**Slow approach (all-or-nothing):**
```
Time: 5 minutes until first result
Fetch all documentation
Aggregate everything
Present complete report
User waits 5 minutes
```
**Fast approach (streaming):**
```
Time: 30 seconds to first result
Phase 1: Fetch critical docs (30s)
Present: Initial findings
Phase 2: Fetch important docs (60s)
Update: Additional findings
Phase 3: Fetch supplementary (90s)
Final: Complete report
```
**Benefit:** User gets value immediately, can stop early if satisfied
## Optimization Techniques
### Technique 1: Workload Balancing
**Problem:** Uneven distribution causes bottlenecks
```
Bad distribution:
Agent 1: 1 URL (small) → finishes in 5s
Agent 2: 10 URLs (large) → finishes in 50s
Total: 50s (bottlenecked by Agent 2)
```
**Solution:** Balance by estimated size
```
Good distribution:
Agent 1: 3 URLs (medium pages) → ~15s
Agent 2: 3 URLs (medium pages) → ~15s
Agent 3: 3 URLs (medium pages) → ~15s
Agent 4: 1 URL (large page) → ~15s
Total: ~15s (balanced)
```
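Balancing by estimated size is a small greedy bin-packing problem: give the next-largest page to the currently least-loaded agent. A sketch (page costs are illustrative estimates in seconds):

```python
import heapq

def balance(pages: dict[str, int], n_agents: int) -> list[list[str]]:
    """Greedy longest-processing-time assignment: sort pages by estimated
    cost, then repeatedly assign to the least-loaded agent."""
    heap = [(0, i) for i in range(n_agents)]      # (current load, agent index)
    heapq.heapify(heap)
    batches: list[list[str]] = [[] for _ in range(n_agents)]
    for url, cost in sorted(pages.items(), key=lambda kv: -kv[1]):
        load, i = heapq.heappop(heap)
        batches[i].append(url)
        heapq.heappush(heap, (load + cost, i))
    return batches

pages = {"/big-api-ref": 15, "/guide1": 5, "/guide2": 5, "/guide3": 5,
         "/faq": 5, "/install": 5, "/examples": 5, "/config": 5,
         "/cli": 5, "/misc": 5}
for i, batch in enumerate(balance(pages, 4), 1):
    print(f"Agent {i}: {batch}")    # each agent ends up at ~15s of work
```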
### Technique 2: Request Coalescing
**Problem:** Redundant requests slow things down
```
Bad:
Agent 1: Fetch README.md
Agent 2: Fetch README.md (duplicate!)
Agent 3: Fetch README.md (duplicate!)
Wasted: 2 redundant fetches
```
**Solution:** Deduplicate before fetching
```
Good:
Pre-processing: Identify unique URLs
Agent 1: Fetch README.md (once)
Agent 2: Fetch INSTALL.md
Agent 3: Fetch API.md
Share: README.md content across agents if needed
```
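Deduplication before dispatch is a few lines with first-seen order preserved (a sketch; the trailing-slash strip is an assumed normalization, and real rules may need more):

```python
def dedupe(urls: list[str]) -> list[str]:
    """Drop duplicate URLs before assigning them to agents."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in urls:
        key = url.rstrip("/")
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique

print(dedupe([
    "https://repo.example.com/README.md",
    "https://repo.example.com/README.md",   # duplicate: skipped
    "https://repo.example.com/INSTALL.md",
    "https://repo.example.com/API.md",
]))
```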
### Technique 3: Timeout Tuning
**Problem:** Default timeouts too conservative
```
Slow:
WebFetch timeout: 120s (too long for fast sites)
If site is down: Wait 120s before failing
```
**Solution:** Adaptive timeouts
```
Fast:
Known fast sites (official docs): 30s timeout
Unknown sites: 60s timeout
Large repos: 120s timeout
If timeout hit: Immediately try alternative
```
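Adaptive timeouts can be a simple lookup keyed on what is known about the target. The tiers mirror the values above; the known-fast host list is an assumption:

```python
from urllib.parse import urlparse

KNOWN_FAST_HOSTS = {"nextjs.org", "react.dev", "docs.python.org"}  # assumed list

def timeout_for(url: str, is_large_repo: bool = False) -> int:
    """Pick a timeout tier: 30s for known-fast official docs, 120s for
    large repository operations, 60s for everything else."""
    if is_large_repo:
        return 120
    host = urlparse(url).hostname or ""
    return 30 if host in KNOWN_FAST_HOSTS else 60

print(timeout_for("https://react.dev/learn"))                      # 30
print(timeout_for("https://smallblog.example.com/post"))           # 60
print(timeout_for("https://github.com/org/repo", is_large_repo=True))  # 120
```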
### Technique 4: Selective Fetching
**Problem:** Fetching irrelevant content
```
Wasteful:
Fetch: Installation guide ✓ (needed)
Fetch: API reference ✓ (needed)
Fetch: Internal architecture ✗ (not needed for basic usage)
Fetch: Contributing guide ✗ (not needed)
Fetch: Changelog ✗ (not needed)
```
**Solution:** Filter by user needs
```
Efficient:
User need: "How to get started"
Fetch only: Installation, basic usage, examples
Skip: Advanced topics, internals, contribution
Savings: ~50% less fetching
```
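Filtering by user need can reuse the keyword idea from the batching section, this time as an allowlist plus a skip list (both keyword sets are illustrative):

```python
GETTING_STARTED = ["install", "quickstart", "basic", "example", "tutorial"]
SKIP_ALWAYS     = ["changelog", "contributing", "internals", "architecture"]

def select(urls: list[str], need_keywords: list[str]) -> list[str]:
    """Keep URLs matching the user's need; drop known-irrelevant pages."""
    keep = []
    for url in urls:
        lowered = url.lower()
        if any(s in lowered for s in SKIP_ALWAYS):
            continue
        if any(k in lowered for k in need_keywords):
            keep.append(url)
    return keep

urls = ["/install", "/basic-usage", "/examples",
        "/internals", "/contributing", "/changelog"]
print(select(urls, GETTING_STARTED))   # ['/install', '/basic-usage', '/examples']
```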
## Performance Benchmarks
### Target Times
| Scenario | Target Time | Acceptable | Too Slow |
|----------|-------------|------------|----------|
| Single URL | <10s | 10-20s | >20s |
| llms.txt (5 URLs) | <30s | 30-60s | >60s |
| llms.txt (15 URLs) | <60s | 60-120s | >120s |
| Repository analysis | <2min | 2-5min | >5min |
| Research fallback | <3min | 3-7min | >7min |
### Real-World Examples
**Fast case (Next.js with llms.txt):**
```
00:00 - Start
00:05 - Found llms.txt
00:10 - Fetched content (12 URLs)
00:15 - Launched 4 agents
00:45 - All agents complete
00:55 - Report ready
Total: 55 seconds ✓
```
**Medium case (Repository without llms.txt):**
```
00:00 - Start
00:15 - llms.txt not found
00:20 - Found repository
00:30 - Cloned repository
02:00 - Repomix complete
02:30 - Analyzed output
02:45 - Report ready
Total: 2m 45s ✓
```
**Slow case (Scattered documentation):**
```
00:00 - Start
00:30 - llms.txt not found
00:45 - Repository not found
01:00 - Launched 4 Researcher agents
05:00 - All research complete
06:00 - Aggregated findings
06:30 - Report ready
Total: 6m 30s (acceptable for research)
```
## Common Performance Issues
### Issue 1: Too Many Agents
**Symptom:** Slower than sequential
```
Problem:
Launched 15 agents for 15 URLs
Overhead: Agent initialization, coordination
Result: Slower than 5 agents with 3 URLs each
```
**Solution:**
```
Max 7 agents per batch
Group URLs sensibly
Use phases for large sets
```
### Issue 2: Blocking Operations
**Symptom:** Agents waiting unnecessarily
```
Problem:
Agent 1: Fetch URL, wait for Agent 2
Agent 2: Fetch URL, wait for Agent 3
Agent 3: Fetch URL
Result: Sequential instead of parallel
```
**Solution:**
```
Launch all agents independently
No dependencies between agents
Aggregate after all complete
```
### Issue 3: Redundant Fetching
**Symptom:** Same content fetched multiple times
```
Problem:
Phase 1: Fetch installation guide
Phase 2: Fetch installation guide again
Result: Wasted time
```
**Solution:**
```
Cache fetched content
Check cache before fetching
Reuse within session
```
### Issue 4: Late Bailout
**Symptom:** Continuing when it should stop
```
Problem:
Found 90% of needed info after 1 minute
Spent 4 more minutes on remaining 10%
Result: 5x time for marginal gain
```
**Solution:**
```
Check progress after critical phase
If 80%+ covered → offer to stop
Only continue if user wants comprehensive
```
## Performance Monitoring
### Key Metrics
**Track these times:**
```
- llms.txt discovery: Target <30s
- Repository clone: Target <60s
- Repomix processing: Target <2min
- Agent exploration: Target <60s
- Total time: Target <3min for typical case
```
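Timings like these are easy to collect with a small context manager (a sketch; the step names are the ones listed above):

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def timed(step: str):
    """Record the wall-clock duration of one step into `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[step] = time.perf_counter() - start

with timed("llms.txt discovery"):
    time.sleep(0.1)                 # stand-in for the real work
with timed("agent exploration"):
    time.sleep(0.2)

for step, seconds in timings.items():
    print(f"{step}: {seconds:.1f}s")
```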
### Performance Report Template
```markdown
## Performance Summary
**Total time**: 1m 25s
**Method**: llms.txt + parallel exploration
**Breakdown**:
- Discovery: 15s (llms.txt search & fetch)
- Exploration: 50s (4 agents, 12 URLs)
- Aggregation: 20s (synthesis & formatting)
**Efficiency**: ~8.5x faster than a fully sequential pass, assuming each URL takes ~60s to fetch and analyze one at a time
(12 URLs × ~60s ≈ 12min sequential vs 1m 25s total)
```
### When to Optimize Further
Optimize if:
- [ ] Total time >2x target
- [ ] User explicitly requests "fast"
- [ ] Repeated similar queries (cache benefit)
- [ ] Large documentation set (>20 URLs)
Don't over-optimize if:
- [ ] Already meeting targets
- [ ] One-time query
- [ ] User values completeness over speed
- [ ] Research requires thoroughness
## Quick Optimization Checklist
### Before Starting
- [ ] Check if content already cached
- [ ] Identify fastest method for this case
- [ ] Plan for parallel execution
- [ ] Set appropriate timeouts
### During Execution
- [ ] Launch agents in parallel (not sequential)
- [ ] Use single message for multiple agents
- [ ] Monitor for bottlenecks
- [ ] Be ready to terminate early
### After First Phase
- [ ] Assess coverage achieved
- [ ] Determine if user needs met
- [ ] Decide: continue or deliver now
- [ ] Cache results for potential reuse
### Optimization Decision Tree
```
Need documentation?
  Check cache
    HIT  → Use cached (0s) ✓
    MISS → Continue
  llms.txt available?
    YES → Parallel agents (30-60s) ✓
    NO  → Continue
  Repository available?
    YES → Repomix (2-5min)
    NO  → Research (3-7min)

After Phase 1:
  80%+ coverage?
    YES → Deliver now (save time) ✓
    NO  → Continue to Phase 2
```
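Flattened into code, the tree is a few guarded returns (a sketch; the durations are the estimates from the tree above):

```python
def choose_method(cached: bool, has_llms_txt: bool, has_repo: bool) -> str:
    """Walk the decision tree top-down and return the fastest viable method."""
    if cached:
        return "cached (0s)"
    if has_llms_txt:
        return "parallel agents (30-60s)"
    if has_repo:
        return "repomix (2-5min)"
    return "research (3-7min)"

print(choose_method(cached=False, has_llms_txt=True, has_repo=True))
# -> parallel agents (30-60s)
```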