# Performance Optimization
Strategies and techniques for maximizing speed and efficiency in documentation discovery.
## Core Principles
### 0. Use context7.com for Instant llms.txt Access
**Fastest Approach:**
Direct URL construction instead of searching:
```
Traditional: WebSearch (15-30s) → WebFetch (5-10s) = 20-40s
context7.com: Direct WebFetch (5-10s) = 5-10s
Speed improvement: 2-4x faster
```
**Benefits:**
- No search required (instant URL construction)
- Consistent URL patterns
- Reliable availability
- Topic filtering for targeted results
**Examples:**
```
GitHub repo:
https://context7.com/vercel/next.js/llms.txt
→ Instant, no search needed
Website:
https://context7.com/websites/imgix/llms.txt
→ Instant, no search needed
Topic-specific:
https://context7.com/shadcn-ui/ui/llms.txt?topic=date
→ Filtered results, even faster
```
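Because the patterns are fixed, URL construction can be a one-liner. A minimal Python sketch; the helper name `context7_url` is our own, and the URL patterns are taken from the examples above:
```python
from urllib.parse import quote, urlencode

def context7_url(source: str, topic: str | None = None) -> str:
    """Build a context7.com llms.txt URL directly, with no search step.

    `source` is "owner/repo" for a GitHub project or "websites/<name>"
    for a plain website, matching the patterns above.
    """
    url = f"https://context7.com/{quote(source)}/llms.txt"
    if topic:
        url += "?" + urlencode({"topic": topic})
    return url

print(context7_url("vercel/next.js"))              # repo docs
print(context7_url("websites/imgix"))              # website docs
print(context7_url("shadcn-ui/ui", topic="date"))  # topic-filtered
```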
**Performance Impact:**
```
Without context7.com:
1. WebSearch for llms.txt: 15s
2. WebFetch llms.txt: 5s
3. Launch agents: 5s
Total: 25s
With context7.com:
1. Direct WebFetch: 5s
2. Launch agents: 5s
Total: 10s (2.5x faster!)
With context7.com + topic:
1. Direct WebFetch (filtered): 3s
2. Process focused results: 2s
Total: 5s (5x faster!)
```
### 1. Minimize Sequential Operations
**The Problem:**
Sequential operations add up linearly:
```
Total Time = Op1 + Op2 + Op3 + ... + OpN
```
Example:
```
Fetch URL 1: 5 seconds
Fetch URL 2: 5 seconds
Fetch URL 3: 5 seconds
Total: 15 seconds
```
**The Solution:**
Parallel operations complete in the time of the slowest one:
```
Total Time = max(Op1, Op2, Op3, ..., OpN)
```
Example:
```
Launch 3 agents simultaneously
All complete in: ~5 seconds
Total: 5 seconds (3x faster!)
```
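A minimal Python sketch of the parallel pattern, assuming a generic blocking `fetch` helper stands in for whatever fetch tool the runtime actually provides:
```python
import asyncio
import urllib.request

def fetch(url: str) -> str:
    # Blocking fetch; placeholder for the real fetch tool.
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")

async def fetch_all(urls: list[str]) -> list[str]:
    # Run every fetch concurrently: total time ≈ max(Op1..OpN),
    # not the sum, which is the whole point of this principle.
    return await asyncio.gather(*(asyncio.to_thread(fetch, u) for u in urls))

pages = asyncio.run(fetch_all([
    "https://context7.com/vercel/next.js/llms.txt",
    "https://context7.com/websites/imgix/llms.txt",
]))
```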
### 2. Batch Related Operations
**Benefits:**
- Fewer context switches
- Better resource utilization
- Easier to track
- More efficient aggregation
**Grouping Strategies:**
**By topic:**
```
Agent 1: All authentication-related docs
Agent 2: All database-related docs
Agent 3: All API-related docs
```
**By content type:**
```
Agent 1: All tutorials
Agent 2: All reference docs
Agent 3: All examples
```
**By priority:**
```
Phase 1 (critical): Getting started, installation, core concepts
Phase 2 (important): Guides, API reference, configuration
Phase 3 (optional): Advanced topics, internals, optimization
```
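Grouping is straightforward to mechanize. A Python sketch of the by-topic strategy; the keyword map is hypothetical and would be tuned per library:
```python
# Hypothetical topic → keyword map; extend per library.
TOPIC_KEYWORDS = {
    "auth": ("auth", "login", "oauth", "session"),
    "database": ("db", "database", "orm", "migration"),
    "api": ("api", "endpoint", "rest", "graphql"),
}

def group_by_topic(urls: list[str]) -> dict[str, list[str]]:
    """Assign each URL to the first topic whose keywords appear in it."""
    groups: dict[str, list[str]] = {t: [] for t in TOPIC_KEYWORDS}
    groups["other"] = []
    for url in urls:
        lowered = url.lower()
        for topic, words in TOPIC_KEYWORDS.items():
            if any(w in lowered for w in words):
                groups[topic].append(url)
                break
        else:
            groups["other"].append(url)
    return groups  # launch one agent per non-empty group
```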
### 3. Smart Caching
**What to cache:**
- Repomix output (expensive to generate)
- llms.txt content (static)
- Repository structure (rarely changes)
- Documentation URLs (reference list)
**When to refresh:**
- User requests a specific version
- Documentation has been updated (check last-modified)
- Cache entry is older than the current session
- User explicitly requests fresh data
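A minimal session-cache sketch implementing the rules above; the one-hour default age is an assumption, not a fixed policy:
```python
import time

class SessionCache:
    """In-memory cache for fetched documentation content."""

    def __init__(self, max_age_s: float = 3600):  # assumed session window
        self.max_age_s = max_age_s
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, url: str, force_fresh: bool = False) -> str | None:
        entry = self._store.get(url)
        if entry is None or force_fresh:  # miss, or user wants fresh data
            return None
        fetched_at, content = entry
        if time.time() - fetched_at > self.max_age_s:
            return None  # older than the session window → refetch
        return content

    def put(self, url: str, content: str) -> None:
        self._store[url] = (time.time(), content)
```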
### 4. Early Termination
**When to stop:**
```
✓ User's core needs met
✓ Critical information found
✓ Time limit approaching
✓ Diminishing returns (90% coverage achieved)
```
**How to decide:**
```
After Phase 1 (critical docs):
- Review what was found
- Check against user request
- If 80%+ covered → deliver now
- Offer to fetch more if needed
```
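The decision can be made mechanical. A sketch, assuming the user's request and the findings so far are both tracked as simple topic sets:
```python
def should_stop(found: set[str], requested: set[str],
                threshold: float = 0.8) -> bool:
    """True once coverage of the user's request passes the threshold."""
    if not requested:
        return True
    coverage = len(found & requested) / len(requested)
    return coverage >= threshold

# 3 of 4 requested topics found → 0.75 coverage → keep going
should_stop({"install", "usage", "config"},
            {"install", "usage", "config", "deploy"})  # False
```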
## Performance Patterns
### Pattern 1: Parallel Exploration
**Scenario:** llms.txt contains 10 URLs
**Slow approach (sequential):**
```
Time: 10 URLs × 5 seconds = 50 seconds
Step 1: Fetch URL 1 (5s)
Step 2: Fetch URL 2 (5s)
Step 3: Fetch URL 3 (5s)
...
Step 10: Fetch URL 10 (5s)
```
**Fast approach (parallel):**
```
Time: ~5-10 seconds total
Step 1: Launch 5 Explorer agents (simultaneous)
Agent 1: URLs 1-2
Agent 2: URLs 3-4
Agent 3: URLs 5-6
Agent 4: URLs 7-8
Agent 5: URLs 9-10
Step 2: Wait for all (max time: ~5-10s)
Step 3: Aggregate results
```
**Speedup:** 5-10x faster
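The URL-splitting step is a simple chunking problem. A sketch; contiguous near-equal batches are one reasonable policy when page sizes are unknown:
```python
def chunk_for_agents(urls: list[str], max_agents: int = 5) -> list[list[str]]:
    """Split URLs into at most `max_agents` near-equal contiguous batches."""
    n = min(max_agents, len(urls)) or 1
    size, extra = divmod(len(urls), n)
    batches, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)  # spread the remainder
        batches.append(urls[start:end])
        start = end
    return batches

# 10 URLs, 5 agents → five batches of 2, matching the plan above.
```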
### Pattern 2: Lazy Loading
**Scenario:** Documentation has 30+ pages
**Slow approach (fetch everything):**
```
Time: 30 URLs × 5 seconds ÷ 5 agents = 30 seconds
Fetch all 30 pages upfront
User only needs 5 of them
Wasted: 25 pages × 5 seconds ÷ 5 = 25 seconds
```
**Fast approach (priority loading):**
```
Time: 10 URLs × 5 seconds ÷ 5 agents = 10 seconds
Phase 1: Fetch critical 10 pages
Review: Does this cover user's needs?
If yes: Stop here (saved 20 seconds)
If no: Fetch additional as needed
```
**Speedup:** Up to 3x faster for typical use cases
### Pattern 3: Smart Fallbacks
**Scenario:** llms.txt not found
**Slow approach (exhaustive search):**
```
Time: ~5 minutes
Try: docs.library.com/llms.txt (30s timeout)
Try: library.dev/llms.txt (30s timeout)
Try: library.io/llms.txt (30s timeout)
Try: library.org/llms.txt (30s timeout)
Try: www.library.com/llms.txt (30s timeout)
Then: Fall back to repository
```
**Fast approach (quick fallback):**
```
Time: ~1 minute
Try: docs.library.com/llms.txt (15s)
Try: library.dev/llms.txt (15s)
Not found → Immediately try repository (30s)
```
**Speedup:** 5x faster
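A sketch of the quick-fallback chain; the candidate hosts reuse the placeholder domains above, and 15s is the short timeout from the fast approach:
```python
import urllib.error
import urllib.request

CANDIDATES = [  # placeholder domains from the example above
    "https://docs.library.com/llms.txt",
    "https://library.dev/llms.txt",
]

def find_llms_txt(timeout_s: int = 15) -> str | None:
    for url in CANDIDATES:
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                return resp.read().decode("utf-8", errors="replace")
        except (urllib.error.URLError, TimeoutError):
            continue  # fail fast and try the next candidate
    return None  # caller immediately falls back to the repository path
```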
### Pattern 4: Incremental Results
**Scenario:** Large documentation set
**Slow approach (all-or-nothing):**
```
Time: 5 minutes until first result
Fetch all documentation
Aggregate everything
Present complete report
User waits 5 minutes
```
**Fast approach (streaming):**
```
Time: 30 seconds to first result
Phase 1: Fetch critical docs (30s)
Present: Initial findings
Phase 2: Fetch important docs (60s)
Update: Additional findings
Phase 3: Fetch supplementary (90s)
Final: Complete report
```
**Benefit:** User gets value immediately, can stop early if satisfied
## Optimization Techniques
### Technique 1: Workload Balancing
**Problem:** Uneven distribution causes bottlenecks
```
Bad distribution:
Agent 1: 1 URL (small) → finishes in 5s
Agent 2: 10 URLs (large) → finishes in 50s
Total: 50s (bottlenecked by Agent 2)
```
**Solution:** Balance by estimated size
```
Good distribution:
Agent 1: 3 URLs (medium pages) → ~15s
Agent 2: 3 URLs (medium pages) → ~15s
Agent 3: 3 URLs (medium pages) → ~15s
Agent 4: 1 URL (large page) → ~15s
Total: ~15s (balanced)
```
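Balancing by estimated size is classic greedy bin-packing: hand the next-largest page to the least-loaded agent. A sketch, assuming a rough size estimate per URL is available:
```python
import heapq

def balance(urls_with_size: list[tuple[str, int]],
            agents: int = 4) -> list[list[str]]:
    """Distribute (url, estimated_size) pairs so agent loads stay even."""
    heap = [(0, i) for i in range(agents)]  # (current load, agent index)
    heapq.heapify(heap)
    batches: list[list[str]] = [[] for _ in range(agents)]
    for url, size in sorted(urls_with_size, key=lambda p: -p[1]):
        load, i = heapq.heappop(heap)      # least-loaded agent so far
        batches[i].append(url)
        heapq.heappush(heap, (load + size, i))
    return batches
```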
### Technique 2: Request Coalescing
**Problem:** Redundant requests slow things down
```
Bad:
Agent 1: Fetch README.md
Agent 2: Fetch README.md (duplicate!)
Agent 3: Fetch README.md (duplicate!)
Wasted: 2 redundant fetches
```
**Solution:** Deduplicate before fetching
```
Good:
Pre-processing: Identify unique URLs
Agent 1: Fetch README.md (once)
Agent 2: Fetch INSTALL.md
Agent 3: Fetch API.md
Share: README.md content across agents if needed
```
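Deduplication is a one-pass set check before any agent is assigned. A sketch; normalizing trailing slashes and case is an extra assumption beyond plain string equality:
```python
def dedupe(urls: list[str]) -> list[str]:
    """Drop duplicate URLs while preserving order."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in urls:
        key = url.rstrip("/").lower()  # treat trailing-slash variants as one
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```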
### Technique 3: Timeout Tuning
**Problem:** Default timeouts are too conservative
```
Slow:
WebFetch timeout: 120s (too long for fast sites)
If site is down: Wait 120s before failing
```
**Solution:** Adaptive timeouts
```
Fast:
Known fast sites (official docs): 30s timeout
Unknown sites: 60s timeout
Large repos: 120s timeout
If timeout hit: Immediately try alternative
```
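A sketch of adaptive timeout selection; the host allowlist is hypothetical, and the tier values mirror the ones above:
```python
from urllib.parse import urlparse

FAST_HOSTS = {"nextjs.org", "react.dev"}  # hypothetical known-fast sites

def pick_timeout(url: str, is_large_repo: bool = False) -> int:
    """Return a timeout in seconds using the tiers above."""
    if is_large_repo:
        return 120
    return 30 if urlparse(url).netloc in FAST_HOSTS else 60
```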
### Technique 4: Selective Fetching
**Problem:** Fetching irrelevant content
```
Wasteful:
Fetch: Installation guide ✓ (needed)
Fetch: API reference ✓ (needed)
Fetch: Internal architecture ✗ (not needed for basic usage)
Fetch: Contributing guide ✗ (not needed)
Fetch: Changelog ✗ (not needed)
```
**Solution:** Filter by user needs
```
Efficient:
User need: "How to get started"
Fetch only: Installation, basic usage, examples
Skip: Advanced topics, internals, contribution
Speedup: 50% less fetching
```
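A sketch of need-based filtering; the intent-to-keyword map is hypothetical and would grow with the kinds of requests actually seen:
```python
# Hypothetical map from a user intent to the doc sections worth fetching.
NEED_SECTIONS = {
    "getting started": ("install", "quickstart", "usage", "example"),
    "api": ("api", "reference"),
}

def select_urls(urls: list[str], need: str) -> list[str]:
    keywords = NEED_SECTIONS.get(need.lower(), ())
    if not keywords:
        return urls  # unknown need → fetch everything rather than guess
    return [u for u in urls if any(k in u.lower() for k in keywords)]
```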
## Performance Benchmarks
### Target Times
| Scenario | Target Time | Acceptable | Too Slow |
|----------|-------------|------------|----------|
| Single URL | <10s | 10-20s | >20s |
| llms.txt (5 URLs) | <30s | 30-60s | >60s |
| llms.txt (15 URLs) | <60s | 60-120s | >120s |
| Repository analysis | <2min | 2-5min | >5min |
| Research fallback | <3min | 3-7min | >7min |
### Real-World Examples
**Fast case (Next.js with llms.txt):**
```
00:00 - Start
00:05 - Found llms.txt
00:10 - Fetched content (12 URLs)
00:15 - Launched 4 agents
00:45 - All agents complete
00:55 - Report ready
Total: 55 seconds ✓
```
**Medium case (Repository without llms.txt):**
```
00:00 - Start
00:15 - llms.txt not found
00:20 - Found repository
00:30 - Cloned repository
02:00 - Repomix complete
02:30 - Analyzed output
02:45 - Report ready
Total: 2m 45s ✓
```
**Slow case (Scattered documentation):**
```
00:00 - Start
00:30 - llms.txt not found
00:45 - Repository not found
01:00 - Launched 4 Researcher agents
05:00 - All research complete
06:00 - Aggregated findings
06:30 - Report ready
Total: 6m 30s (acceptable for research)
```
## Common Performance Issues
### Issue 1: Too Many Agents
**Symptom:** Slower than sequential
```
Problem:
Launched 15 agents for 15 URLs
Overhead: Agent initialization, coordination
Result: Slower than 5 agents with 3 URLs each
```
**Solution:**
```
Max 7 agents per batch
Group URLs sensibly
Use phases for large sets
```
### Issue 2: Blocking Operations
**Symptom:** Agents waiting unnecessarily
```
Problem:
Agent 1: Fetch URL, wait for Agent 2
Agent 2: Fetch URL, wait for Agent 3
Agent 3: Fetch URL
Result: Sequential instead of parallel
```
**Solution:**
```
Launch all agents independently
No dependencies between agents
Aggregate after all complete
```
### Issue 3: Redundant Fetching
**Symptom:** Same content fetched multiple times
```
Problem:
Phase 1: Fetch installation guide
Phase 2: Fetch installation guide again
Result: Wasted time
```
**Solution:**
```
Cache fetched content
Check cache before fetching
Reuse within session
```
### Issue 4: Late Bailout
**Symptom:** Work continues after the user's needs are already met
```
Problem:
Found 90% of needed info after 1 minute
Spent 4 more minutes on remaining 10%
Result: 5x time for marginal gain
```
**Solution:**
```
Check progress after critical phase
If 80%+ covered → offer to stop
Only continue if user wants comprehensive
```
## Performance Monitoring
### Key Metrics
**Track these times:**
```
- llms.txt discovery: Target <30s
- Repository clone: Target <60s
- Repomix processing: Target <2min
- Agent exploration: Target <60s
- Total time: Target <3min for typical case
```
### Performance Report Template
```markdown
## Performance Summary
**Total time**: 1m 25s
**Method**: llms.txt + parallel exploration
**Breakdown**:
- Discovery: 15s (llms.txt search & fetch)
- Exploration: 50s (4 agents, 12 URLs)
- Aggregation: 20s (synthesis & formatting)
**Efficiency**: 1.2x faster than sequential
(12 URLs × 5s = 60s sequential, actual: 50s parallel)
```
### When to Optimize Further
Optimize if:
- [ ] Total time >2x target
- [ ] User explicitly requests "fast"
- [ ] Repeated similar queries (cache benefit)
- [ ] Large documentation set (>20 URLs)
Don't over-optimize if:
- [ ] Already meeting targets
- [ ] One-time query
- [ ] User values completeness over speed
- [ ] Research requires thoroughness
## Quick Optimization Checklist
### Before Starting
- [ ] Check if content already cached
- [ ] Identify fastest method for this case
- [ ] Plan for parallel execution
- [ ] Set appropriate timeouts
### During Execution
- [ ] Launch agents in parallel (not sequential)
- [ ] Use single message for multiple agents
- [ ] Monitor for bottlenecks
- [ ] Be ready to terminate early
### After First Phase
- [ ] Assess coverage achieved
- [ ] Determine if user needs met
- [ ] Decide: continue or deliver now
- [ ] Cache results for potential reuse
### Optimization Decision Tree
```
Need documentation?
  Check cache
    HIT  → Use cached (0s) ✓
    MISS → Continue
  llms.txt available?
    YES → Parallel agents (30-60s) ✓
    NO  → Continue
  Repository available?
    YES → Repomix (2-5min)
    NO  → Research (3-7min)
After Phase 1:
  80%+ coverage?
    YES → Deliver now (save time) ✓
    NO  → Continue to Phase 2
```