# Performance Optimization

Strategies and techniques for maximizing speed and efficiency in documentation discovery.

## Core Principles

### 0. Use context7.com for Instant llms.txt Access

**Fastest Approach:**

Direct URL construction instead of searching:
```
Traditional: WebSearch (15-30s) → WebFetch (5-10s) = 20-40s
context7.com: Direct WebFetch (5-10s) = 5-10s

Speed improvement: 2-4x faster
```

**Benefits:**
- No search required (instant URL construction)
- Consistent URL patterns
- Reliable availability
- Topic filtering for targeted results

**Examples:**
```
GitHub repo:
https://context7.com/vercel/next.js/llms.txt
→ Instant, no search needed

Website:
https://context7.com/websites/imgix/llms.txt
→ Instant, no search needed

Topic-specific:
https://context7.com/shadcn-ui/ui/llms.txt?topic=date
→ Filtered results, even faster
```

**Performance Impact:**
```
Without context7.com:
1. WebSearch for llms.txt: 15s
2. WebFetch llms.txt: 5s
3. Launch agents: 5s
Total: 25s

With context7.com:
1. Direct WebFetch: 5s
2. Launch agents: 5s
Total: 10s (2.5x faster!)

With context7.com + topic:
1. Direct WebFetch (filtered): 3s
2. Process focused results: 2s
Total: 5s (5x faster!)
```

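The URL patterns above are regular enough to build programmatically. A minimal sketch — the helper name and signature are illustrative, not part of any library:

```python
def context7_url(source: str, kind: str = "repo", topic: str | None = None) -> str:
    """Build a context7.com llms.txt URL for a GitHub repo ("org/repo")
    or a website (site name), following the patterns shown above."""
    if kind == "repo":
        url = f"https://context7.com/{source}/llms.txt"
    else:  # website
        url = f"https://context7.com/websites/{source}/llms.txt"
    if topic:
        url += f"?topic={topic}"  # optional topic filter
    return url

# Reproduces the examples above:
context7_url("vercel/next.js")              # GitHub repo
context7_url("imgix", kind="website")       # website
context7_url("shadcn-ui/ui", topic="date")  # topic-specific
```
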
### 1. Minimize Sequential Operations

**The Problem:**

Sequential operations add up linearly:
```
Total Time = Op1 + Op2 + Op3 + ... + OpN
```

Example:
```
Fetch URL 1: 5 seconds
Fetch URL 2: 5 seconds
Fetch URL 3: 5 seconds
Total: 15 seconds
```

**The Solution:**

Parallel operations complete in the time of the slowest one:
```
Total Time = max(Op1, Op2, Op3, ..., OpN)
```

Example:
```
Launch 3 agents simultaneously
All complete in: ~5 seconds
Total: 5 seconds (3x faster!)
```

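In code form, the difference is a single change in how fetches are scheduled. A minimal sketch using Python's standard library (the URLs are placeholders):

```python
import concurrent.futures
import urllib.request

def fetch(url: str, timeout: int = 30) -> str:
    """Fetch one page; assume ~5 seconds per call."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

urls = [f"https://example.com/docs/page{i}" for i in range(1, 4)]

# Sequential: total time = sum of all fetches (~15s for 3 URLs).
# pages = [fetch(u) for u in urls]

# Parallel: total time ≈ the slowest single fetch (~5s).
with concurrent.futures.ThreadPoolExecutor(max_workers=len(urls)) as pool:
    pages = list(pool.map(fetch, urls))
```
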
### 2. Batch Related Operations

**Benefits:**
- Fewer context switches
- Better resource utilization
- Easier to track
- More efficient aggregation

**Grouping Strategies:**

**By topic:**
```
Agent 1: All authentication-related docs
Agent 2: All database-related docs
Agent 3: All API-related docs
```

**By content type:**
```
Agent 1: All tutorials
Agent 2: All reference docs
Agent 3: All examples
```

**By priority:**
```
Phase 1 (critical): Getting started, installation, core concepts
Phase 2 (important): Guides, API reference, configuration
Phase 3 (optional): Advanced topics, internals, optimization
```

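Topic grouping can be approximated by keyword matching on URL paths. A rough sketch — the keyword map is an illustrative assumption; real batching might key on page titles or llms.txt section headers instead:

```python
TOPICS = {  # illustrative keyword map
    "auth": ["auth", "login", "oauth", "session"],
    "database": ["db", "database", "orm", "migration"],
    "api": ["api", "endpoint", "rest"],
}

def group_by_topic(urls: list[str]) -> dict[str, list[str]]:
    """Assign each URL to the first topic whose keyword appears in it;
    anything unmatched goes to a catch-all batch."""
    groups: dict[str, list[str]] = {t: [] for t in TOPICS}
    groups["other"] = []
    for url in urls:
        for topic, keywords in TOPICS.items():
            if any(k in url.lower() for k in keywords):
                groups[topic].append(url)
                break
        else:
            groups["other"].append(url)
    return groups
```
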
### 3. Smart Caching

**What to cache:**
- Repomix output (expensive to generate)
- llms.txt content (static)
- Repository structure (rarely changes)
- Documentation URLs (reference list)

**When to refresh:**
- User requests specific version
- Documentation updated (check last-modified)
- Cache older than session
- User explicitly requests fresh data

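A minimal in-session cache sketch covering two of the refresh rules above (staleness and explicit refresh); the names and the one-hour default are assumptions:

```python
import time
import urllib.request

_cache: dict[str, tuple[float, str]] = {}  # url -> (fetched_at, content)

def cached_fetch(url: str, max_age_s: float = 3600.0, force: bool = False) -> str:
    """Return cached content unless it is stale or a refresh is forced."""
    hit = _cache.get(url)
    if hit and not force and time.time() - hit[0] < max_age_s:
        return hit[1]  # cache hit: no network time at all
    with urllib.request.urlopen(url, timeout=30) as resp:
        content = resp.read().decode("utf-8", errors="replace")
    _cache[url] = (time.time(), content)
    return content
```
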
### 4. Early Termination

**When to stop:**
```
✓ User's core needs met
✓ Critical information found
✓ Time limit approaching
✓ Diminishing returns (90% coverage achieved)
```

**How to decide:**
```
After Phase 1 (critical docs):
- Review what was found
- Check against user request
- If 80%+ covered → deliver now
- Offer to fetch more if needed
```

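The 80% rule reduces to a one-line coverage check. A sketch, assuming the user's request can be tracked as a set of topics:

```python
def coverage(found: set[str], needed: set[str]) -> float:
    """Fraction of the user's requested topics already answered."""
    return len(found & needed) / max(len(needed), 1)

needed = {"install", "configure", "auth", "deploy", "test"}
found = {"install", "configure", "auth", "deploy"}
if coverage(found, needed) >= 0.8:  # 4/5 = 0.8 -> deliver now
    print("Core needs met; offer to fetch the rest on request.")
```
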
## Performance Patterns

### Pattern 1: Parallel Exploration

**Scenario:** llms.txt contains 10 URLs

**Slow approach (sequential):**
```
Time: 10 URLs × 5 seconds = 50 seconds

Step 1: Fetch URL 1 (5s)
Step 2: Fetch URL 2 (5s)
Step 3: Fetch URL 3 (5s)
...
Step 10: Fetch URL 10 (5s)
```

**Fast approach (parallel):**
```
Time: ~5-10 seconds total

Step 1: Launch 5 Explorer agents (simultaneous)
Agent 1: URLs 1-2
Agent 2: URLs 3-4
Agent 3: URLs 5-6
Agent 4: URLs 7-8
Agent 5: URLs 9-10

Step 2: Wait for all (max time: ~5-10s)
Step 3: Aggregate results
```

**Speedup:** 5-10x faster

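Splitting the URL list into contiguous batches, as in the fast approach above, takes a few lines. A sketch:

```python
def chunk(urls: list[str], n_agents: int) -> list[list[str]]:
    """Split URLs into contiguous batches: URLs 1-2 to agent 1,
    URLs 3-4 to agent 2, and so on."""
    size = -(-len(urls) // n_agents)  # ceiling division
    return [urls[i:i + size] for i in range(0, len(urls), size)]

urls = [f"/doc{i}" for i in range(1, 11)]
chunk(urls, 5)  # [['/doc1', '/doc2'], ['/doc3', '/doc4'], ..., ['/doc9', '/doc10']]
```
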
### Pattern 2: Lazy Loading

**Scenario:** Documentation has 30+ pages

**Slow approach (fetch everything):**
```
Time: 30 URLs × 5 seconds ÷ 5 agents = 30 seconds

Fetch all 30 pages upfront
User only needs 5 of them
Wasted: 25 pages × 5 seconds ÷ 5 agents = 25 seconds
```

**Fast approach (priority loading):**
```
Time: 10 URLs × 5 seconds ÷ 5 agents = 10 seconds

Phase 1: Fetch critical 10 pages
Review: Does this cover user's needs?
If yes: Stop here (saved 20 seconds)
If no: Fetch additional as needed
```

**Speedup:** Up to 3x faster for typical use cases

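Priority loading is a phased loop with an exit check between phases. A sketch with stand-in functions — `fetch_all` and `user_needs_met` are placeholders for launching agents and for the coverage check from Principle 4:

```python
def fetch_all(paths: list[str]) -> None:
    """Stand-in for launching parallel Explorer agents on these paths."""
    print(f"fetching: {paths}")

def user_needs_met() -> bool:
    """Stand-in for the coverage check after each phase."""
    return True  # pretend the critical docs were enough

phases = [
    ("critical", ["/getting-started", "/install"]),
    ("important", ["/guides", "/api-reference"]),
    ("optional", ["/internals", "/benchmarks"]),
]
for name, paths in phases:
    fetch_all(paths)
    if user_needs_met():
        break  # later phases are only fetched on demand
```
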
### Pattern 3: Smart Fallbacks

**Scenario:** llms.txt not found

**Slow approach (exhaustive search):**
```
Time: ~5 minutes

Try: docs.library.com/llms.txt (30s timeout)
Try: library.dev/llms.txt (30s timeout)
Try: library.io/llms.txt (30s timeout)
Try: library.org/llms.txt (30s timeout)
Try: www.library.com/llms.txt (30s timeout)
Then: Fall back to repository
```

**Fast approach (quick fallback):**
```
Time: ~1 minute

Try: docs.library.com/llms.txt (15s)
Try: library.dev/llms.txt (15s)
Not found → Immediately try repository (30s)
```

**Speedup:** 5x faster

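The quick-fallback version in code: try only the two most likely hosts with short timeouts, then hand control back to the repository path. A sketch (the candidate URLs mirror the placeholder hosts above):

```python
import urllib.error
import urllib.request

CANDIDATES = [  # two most likely hosts only
    "https://docs.library.com/llms.txt",
    "https://library.dev/llms.txt",
]

def find_llms_txt(timeout: int = 15) -> str | None:
    """Return llms.txt content, or None quickly so the caller can
    fall back to cloning the repository instead."""
    for url in CANDIDATES:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read().decode("utf-8", errors="replace")
        except (urllib.error.URLError, TimeoutError):
            continue  # fail fast, try the next candidate
    return None
```
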
### Pattern 4: Incremental Results

**Scenario:** Large documentation set

**Slow approach (all-or-nothing):**
```
Time: 5 minutes until first result

Fetch all documentation
Aggregate everything
Present complete report
User waits 5 minutes
```

**Fast approach (streaming):**
```
Time: 30 seconds to first result

Phase 1: Fetch critical docs (30s)
Present: Initial findings
Phase 2: Fetch important docs (60s)
Update: Additional findings
Phase 3: Fetch supplementary (90s)
Final: Complete report
```

**Benefit:** User gets value immediately, can stop early if satisfied

## Optimization Techniques

### Technique 1: Workload Balancing

**Problem:** Uneven distribution causes bottlenecks

```
Bad distribution:
Agent 1: 1 URL (small) → finishes in 5s
Agent 2: 9 URLs (large) → finishes in 45s
Total: 45s (bottlenecked by Agent 2)
```

**Solution:** Balance by estimated size

```
Good distribution:
Agent 1: 3 URLs (medium pages) → ~15s
Agent 2: 3 URLs (medium pages) → ~15s
Agent 3: 3 URLs (medium pages) → ~15s
Agent 4: 1 URL (large page) → ~15s
Total: ~15s (balanced)
```

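Balancing by estimated size is the classic longest-processing-time heuristic: sort by cost, always assign to the least-loaded agent. A sketch (the cost estimates are illustrative):

```python
def balance(costs: dict[str, int], n_agents: int) -> list[list[str]]:
    """Greedy longest-processing-time assignment: hand each URL
    (with an estimated cost in seconds) to the least-loaded agent."""
    loads = [0] * n_agents
    batches: list[list[str]] = [[] for _ in range(n_agents)]
    for url, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
        i = loads.index(min(loads))  # least-loaded agent so far
        loads[i] += cost
        batches[i].append(url)
    return batches

# 9 medium pages (~5s each) plus 1 large page (~15s), 4 agents:
est = {f"/page{i}": 5 for i in range(1, 10)} | {"/big-api-ref": 15}
balance(est, 4)  # every agent ends up with ~15s of work
```
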
### Technique 2: Request Coalescing

**Problem:** Redundant requests slow things down

```
Bad:
Agent 1: Fetch README.md
Agent 2: Fetch README.md (duplicate!)
Agent 3: Fetch README.md (duplicate!)
Wasted: 2 redundant fetches
```

**Solution:** Deduplicate before fetching

```
Good:
Pre-processing: Identify unique URLs
Agent 1: Fetch README.md (once)
Agent 2: Fetch INSTALL.md
Agent 3: Fetch API.md
Share: README.md content across agents if needed
```

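Deduplication is a single pass over the planned batches before anything is fetched. A sketch:

```python
def coalesce(per_agent: list[list[str]]) -> list[list[str]]:
    """Drop URLs already assigned to an earlier agent so each unique
    URL is fetched exactly once; content can be shared afterwards."""
    seen: set[str] = set()
    deduped: list[list[str]] = []
    for batch in per_agent:
        unique = [u for u in batch if u not in seen]
        seen.update(unique)
        deduped.append(unique)
    return deduped

plan = [["README.md", "API.md"], ["README.md", "INSTALL.md"], ["README.md"]]
coalesce(plan)  # [['README.md', 'API.md'], ['INSTALL.md'], []]
```
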
### Technique 3: Timeout Tuning

**Problem:** Default timeouts too conservative

```
Slow:
WebFetch timeout: 120s (too long for fast sites)
If site is down: Wait 120s before failing
```

**Solution:** Adaptive timeouts

```
Fast:
Known fast sites (official docs): 30s timeout
Unknown sites: 60s timeout
Large repos: 120s timeout
If timeout hit: Immediately try alternative
```

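The tiering above reduces to a lookup. A sketch — the host list is an illustrative assumption:

```python
from urllib.parse import urlparse

FAST_HOSTS = {"nextjs.org", "react.dev"}  # known-fast official docs (illustrative)

def timeout_for(url: str, is_large_repo: bool = False) -> int:
    """Pick a timeout tier for a fetch, per the table above."""
    if is_large_repo:
        return 120
    if urlparse(url).hostname in FAST_HOSTS:
        return 30
    return 60  # unknown site
```
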
### Technique 4: Selective Fetching

**Problem:** Fetching irrelevant content

```
Wasteful:
Fetch: Installation guide ✓ (needed)
Fetch: API reference ✓ (needed)
Fetch: Internal architecture ✗ (not needed for basic usage)
Fetch: Contributing guide ✗ (not needed)
Fetch: Changelog ✗ (not needed)
```

**Solution:** Filter by user needs

```
Efficient:
User need: "How to get started"
Fetch only: Installation, basic usage, examples
Skip: Advanced topics, internals, contribution
Speedup: 50% less fetching
```

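One way to implement the filter is keyword matching between the user's need and URL paths; the keyword map below is an assumption for illustration:

```python
NEED_KEYWORDS = {  # illustrative mapping from user need to path keywords
    "getting started": ["install", "quickstart", "basic", "example"],
}

def select(urls: list[str], user_need: str) -> list[str]:
    """Keep only URLs whose path mentions a keyword tied to the need."""
    keywords = NEED_KEYWORDS.get(user_need.lower(), [])
    return [u for u in urls if any(k in u.lower() for k in keywords)]

urls = ["/install", "/quickstart", "/internals", "/contributing", "/changelog"]
select(urls, "Getting started")  # ['/install', '/quickstart']
```
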
## Performance Benchmarks

### Target Times

| Scenario | Target Time | Acceptable | Too Slow |
|----------|-------------|------------|----------|
| Single URL | <10s | 10-20s | >20s |
| llms.txt (5 URLs) | <30s | 30-60s | >60s |
| llms.txt (15 URLs) | <60s | 60-120s | >120s |
| Repository analysis | <2min | 2-5min | >5min |
| Research fallback | <3min | 3-7min | >7min |

### Real-World Examples

**Fast case (Next.js with llms.txt):**
```
00:00 - Start
00:05 - Found llms.txt
00:10 - Fetched content (12 URLs)
00:15 - Launched 4 agents
00:45 - All agents complete
00:55 - Report ready
Total: 55 seconds ✓
```

**Medium case (Repository without llms.txt):**
```
00:00 - Start
00:15 - llms.txt not found
00:20 - Found repository
00:30 - Cloned repository
02:00 - Repomix complete
02:30 - Analyzed output
02:45 - Report ready
Total: 2m 45s ✓
```

**Slow case (Scattered documentation):**
```
00:00 - Start
00:30 - llms.txt not found
00:45 - Repository not found
01:00 - Launched 4 Researcher agents
05:00 - All research complete
06:00 - Aggregated findings
06:30 - Report ready
Total: 6m 30s (acceptable for research)
```

## Common Performance Issues

### Issue 1: Too Many Agents

**Symptom:** Parallel run is slower than sequential

```
Problem:
Launched 15 agents for 15 URLs
Overhead: Agent initialization, coordination
Result: Slower than 5 agents with 3 URLs each
```

**Solution:**
```
Max 7 agents per batch
Group URLs sensibly
Use phases for large sets
```

### Issue 2: Blocking Operations

**Symptom:** Agents waiting unnecessarily

```
Problem:
Agent 1: Fetch URL, wait for Agent 2
Agent 2: Fetch URL, wait for Agent 3
Agent 3: Fetch URL
Result: Sequential instead of parallel
```

**Solution:**
```
Launch all agents independently
No dependencies between agents
Aggregate after all complete
```

### Issue 3: Redundant Fetching

**Symptom:** Same content fetched multiple times

```
Problem:
Phase 1: Fetch installation guide
Phase 2: Fetch installation guide again
Result: Wasted time
```

**Solution:**
```
Cache fetched content
Check cache before fetching
Reuse within session
```

### Issue 4: Late Bailout

**Symptom:** Continuing after the user's needs are already met

```
Problem:
Found 90% of needed info after 1 minute
Spent 4 more minutes on remaining 10%
Result: 5x time for marginal gain
```

**Solution:**
```
Check progress after critical phase
If 80%+ covered → offer to stop
Only continue if user wants comprehensive coverage
```

## Performance Monitoring

### Key Metrics

**Track these times:**
```
- llms.txt discovery: Target <30s
- Repository clone: Target <60s
- Repomix processing: Target <2min
- Agent exploration: Target <60s
- Total time: Target <3min for typical case
```

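A small timing helper makes these numbers easy to collect per step. A sketch:

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def timed(step: str):
    """Record wall-clock seconds for a named pipeline step."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[step] = time.perf_counter() - start

with timed("discovery"):
    ...  # llms.txt search & fetch goes here
with timed("exploration"):
    ...  # parallel agent fetches go here
print(timings)  # feeds the report template below
```
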
### Performance Report Template

```markdown
## Performance Summary

**Total time**: 1m 25s
**Method**: llms.txt + parallel exploration

**Breakdown**:
- Discovery: 15s (llms.txt search & fetch)
- Exploration: 50s (4 agents, 12 URLs)
- Aggregation: 20s (synthesis & formatting)

**Efficiency**: exploration ~1.2x faster than sequential
(12 URLs × 5s = 60s sequential vs. 50s parallel)
```

### When to Optimize Further

Optimize if:
- [ ] Total time >2x target
- [ ] User explicitly requests "fast"
- [ ] Repeated similar queries (cache benefit)
- [ ] Large documentation set (>20 URLs)

Don't over-optimize if:
- [ ] Already meeting targets
- [ ] One-time query
- [ ] User values completeness over speed
- [ ] Research requires thoroughness

## Quick Optimization Checklist

### Before Starting

- [ ] Check if content already cached
- [ ] Identify fastest method for this case
- [ ] Plan for parallel execution
- [ ] Set appropriate timeouts

### During Execution

- [ ] Launch agents in parallel (not sequential)
- [ ] Use single message for multiple agents
- [ ] Monitor for bottlenecks
- [ ] Be ready to terminate early

### After First Phase

- [ ] Assess coverage achieved
- [ ] Determine if user needs met
- [ ] Decide: continue or deliver now
- [ ] Cache results for potential reuse

### Optimization Decision Tree

```
Need documentation?
↓
Check cache
↓
HIT → Use cached (0s) ✓
MISS → Continue
↓
llms.txt available?
↓
YES → Parallel agents (30-60s) ✓
NO → Continue
↓
Repository available?
↓
YES → Repomix (2-5min)
NO → Research (3-7min)
↓
After Phase 1:
80%+ coverage?
↓
YES → Deliver now (save time) ✓
NO → Continue to Phase 2
```