
Performance Optimization

Strategies and techniques for maximizing speed and efficiency in documentation discovery.

Core Principles

0. Use context7.com for Instant llms.txt Access

Fastest Approach:

Direct URL construction instead of searching:

Traditional: WebSearch (15-30s) → WebFetch (5-10s) = 20-40s
context7.com: Direct WebFetch (5-10s) = 5-10s

Speed improvement: 2-4x faster

Benefits:

  • No search required (instant URL construction)
  • Consistent URL patterns
  • Reliable availability
  • Topic filtering for targeted results

Examples:

GitHub repo:
https://context7.com/vercel/next.js/llms.txt
→ Instant, no search needed

Website:
https://context7.com/websites/imgix/llms.txt
→ Instant, no search needed

Topic-specific:
https://context7.com/shadcn-ui/ui/llms.txt?topic=date
→ Filtered results, even faster
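
These patterns are simple enough to build programmatically. A minimal sketch of the URL construction, assuming the patterns shown above (the helper name `context7_url` is hypothetical):

```python
from urllib.parse import urlencode

def context7_url(repo: str | None = None, website: str | None = None,
                 topic: str | None = None) -> str:
    """Build a context7.com llms.txt URL directly -- no search step needed."""
    if repo:                                  # e.g. "vercel/next.js"
        base = f"https://context7.com/{repo}/llms.txt"
    elif website:                             # e.g. "imgix"
        base = f"https://context7.com/websites/{website}/llms.txt"
    else:
        raise ValueError("need a repo or website slug")
    return f"{base}?{urlencode({'topic': topic})}" if topic else base

# context7_url("shadcn-ui/ui", topic="date")
# -> "https://context7.com/shadcn-ui/ui/llms.txt?topic=date"
```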

Performance Impact:

Without context7.com:
1. WebSearch for llms.txt: 15s
2. WebFetch llms.txt: 5s
3. Launch agents: 5s
Total: 25s

With context7.com:
1. Direct WebFetch: 5s
2. Launch agents: 5s
Total: 10s (2.5x faster!)

With context7.com + topic:
1. Direct WebFetch (filtered): 3s
2. Process focused results: 2s
Total: 5s (5x faster!)

1. Minimize Sequential Operations

The Problem:

Sequential operations add up linearly:

Total Time = Op1 + Op2 + Op3 + ... + OpN

Example:

Fetch URL 1: 5 seconds
Fetch URL 2: 5 seconds
Fetch URL 3: 5 seconds
Total: 15 seconds

The Solution:

Parallel operations complete in the time of the slowest operation:

Total Time = max(Op1, Op2, Op3, ..., OpN)

Example:

Launch 3 agents simultaneously
All complete in: ~5 seconds
Total: 5 seconds (3x faster!)
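
To make the pattern concrete, here is a minimal parallel-fetch sketch using a thread pool (`fetch_url` is a stand-in for a WebFetch-style call, and the URLs are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch_url(url: str) -> str:
    # Stand-in for a WebFetch-style call; assume ~5s each.
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")

urls = ["https://example.com/docs/a",
        "https://example.com/docs/b",
        "https://example.com/docs/c"]

# Sequential: ~15s total. Parallel: wall time ≈ slowest single fetch (~5s).
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    results = list(pool.map(fetch_url, urls))
```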

Benefits:

  • Fewer context switches
  • Better resource utilization
  • Easier to track
  • More efficient aggregation

2. Group Related Content

Assign each agent a coherent slice of the documentation rather than an arbitrary one.

Grouping strategies (a sketch follows the list):

By topic:

Agent 1: All authentication-related docs
Agent 2: All database-related docs
Agent 3: All API-related docs

By content type:

Agent 1: All tutorials
Agent 2: All reference docs
Agent 3: All examples

By priority:

Phase 1 (critical): Getting started, installation, core concepts
Phase 2 (important): Guides, API reference, configuration
Phase 3 (optional): Advanced topics, internals, optimization
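
A sketch of topic-based grouping, assuming simple keyword matching on URLs (the keyword table is illustrative, not exhaustive):

```python
TOPIC_KEYWORDS = {
    "auth":     ("auth", "login", "oauth", "session"),
    "database": ("database", "sql", "migration", "schema"),
    "api":      ("api", "endpoint", "rest", "graphql"),
}

def group_by_topic(urls: list[str]) -> dict[str, list[str]]:
    """Assign each URL to one topic group; one agent handles each group."""
    groups: dict[str, list[str]] = {t: [] for t in TOPIC_KEYWORDS}
    groups["other"] = []
    for url in urls:
        for topic, keywords in TOPIC_KEYWORDS.items():
            if any(k in url.lower() for k in keywords):
                groups[topic].append(url)
                break
        else:
            groups["other"].append(url)
    return groups
```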

3. Smart Caching

What to cache:

  • Repomix output (expensive to generate)
  • llms.txt content (static)
  • Repository structure (rarely changes)
  • Documentation URLs (reference list)

When to refresh:

  • User requests specific version
  • Documentation updated (check last-modified)
  • Cache older than session
  • User explicitly requests fresh data
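
A minimal session cache honoring those refresh rules (the one-hour expiry window is an assumption; tune it to session length):

```python
import time

class SessionCache:
    """Session-scoped content cache; entries expire after max_age_s."""

    def __init__(self, max_age_s: float = 3600):
        self.max_age_s = max_age_s
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, url: str) -> str | None:
        entry = self._store.get(url)
        if entry is None:
            return None
        fetched_at, content = entry
        if time.time() - fetched_at > self.max_age_s:
            del self._store[url]          # stale -> force a fresh fetch
            return None
        return content

    def put(self, url: str, content: str) -> None:
        self._store[url] = (time.time(), content)
```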

4. Early Termination

When to stop:

✓ User's core needs met
✓ Critical information found
✓ Time limit approaching
✓ Diminishing returns (90% coverage achieved)

How to decide:

After Phase 1 (critical docs):
- Review what was found
- Check against user request
- If 80%+ covered → deliver now
- Offer to fetch more if needed
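
The coverage check reduces to a one-liner. A sketch, treating the user's request as a set of topics (the 80% threshold comes from the rule above):

```python
def should_stop(requested: set[str], covered: set[str],
                threshold: float = 0.8) -> bool:
    """Stop fetching once coverage of the requested topics crosses the threshold."""
    if not requested:
        return True
    return len(requested & covered) / len(requested) >= threshold

# should_stop({"install", "config", "auth", "deploy", "test"},
#             {"install", "config", "auth", "deploy"})   # 80% -> True
```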

Performance Patterns

Pattern 1: Parallel Exploration

Scenario: llms.txt contains 10 URLs

Slow approach (sequential):

Time: 10 URLs × 5 seconds = 50 seconds

Step 1: Fetch URL 1 (5s)
Step 2: Fetch URL 2 (5s)
Step 3: Fetch URL 3 (5s)
...
Step 10: Fetch URL 10 (5s)

Fast approach (parallel):

Time: ~5-10 seconds total

Step 1: Launch 5 Explorer agents (simultaneous)
  Agent 1: URLs 1-2
  Agent 2: URLs 3-4
  Agent 3: URLs 5-6
  Agent 4: URLs 7-8
  Agent 5: URLs 9-10

Step 2: Wait for all (max time: ~5-10s)
Step 3: Aggregate results

Speedup: 5-10x faster
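
The URL-splitting step is mechanical. A sketch that produces the contiguous batches shown above (at most max_agents batches):

```python
def split_for_agents(urls: list[str], max_agents: int = 5) -> list[list[str]]:
    """Split URLs into contiguous batches, one batch per agent."""
    if not urls:
        return []
    n = min(max_agents, len(urls))
    size = -(-len(urls) // n)                 # ceiling division
    return [urls[i:i + size] for i in range(0, len(urls), size)]

# split_for_agents([f"u{i}" for i in range(1, 11)])
# -> [['u1','u2'], ['u3','u4'], ['u5','u6'], ['u7','u8'], ['u9','u10']]
```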

Pattern 2: Lazy Loading

Scenario: Documentation has 30+ pages

Slow approach (fetch everything):

Time: 30 URLs × 5 seconds ÷ 5 agents = 30 seconds

Fetch all 30 pages upfront
User only needs 5 of them
Wasted: 25 pages × 5 seconds ÷ 5 = 25 seconds

Fast approach (priority loading):

Time: 10 URLs × 5 seconds ÷ 5 agents = 10 seconds

Phase 1: Fetch critical 10 pages
Review: Does this cover user's needs?
If yes: Stop here (saved 20 seconds)
If no: Fetch additional as needed

Speedup: Up to 3x faster for typical use cases
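
A sketch of the phased loop, assuming a fetch callable and a needs_met predicate like the coverage check above (both names are placeholders):

```python
def fetch_in_phases(urls_by_phase: dict[int, list[str]],
                    fetch, needs_met) -> list[str]:
    """Fetch phase by phase; stop as soon as the user's needs are covered."""
    fetched: list[str] = []
    for phase in sorted(urls_by_phase):       # 1 = critical, 2 = important, ...
        fetched += [fetch(url) for url in urls_by_phase[phase]]
        if needs_met(fetched):
            break                             # skip later phases entirely
    return fetched
```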

Pattern 3: Smart Fallbacks

Scenario: llms.txt not found

Slow approach (exhaustive search):

Time: ~5 minutes

Try: docs.library.com/llms.txt (30s timeout)
Try: library.dev/llms.txt (30s timeout)
Try: library.io/llms.txt (30s timeout)
Try: library.org/llms.txt (30s timeout)
Try: www.library.com/llms.txt (30s timeout)
Then: Fall back to repository

Fast approach (quick fallback):

Time: ~1 minute

Try: docs.library.com/llms.txt (15s)
Try: library.dev/llms.txt (15s)
Not found → Immediately try repository (30s)

Speedup: 5x faster
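
A sketch of the quick-fallback probe, assuming a fetch callable that raises on timeout or 404 (the candidate URL patterns mirror the ones above):

```python
CANDIDATES = [
    "https://docs.{name}.com/llms.txt",
    "https://{name}.dev/llms.txt",
]

def find_llms_txt(name: str, fetch, timeout_s: int = 15) -> str | None:
    """Probe the two most likely hosts with short timeouts, then give up fast."""
    for pattern in CANDIDATES:
        try:
            return fetch(pattern.format(name=name), timeout=timeout_s)
        except Exception:
            continue                          # fail fast -> next candidate
    return None                               # caller falls back to the repository
```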

Pattern 4: Incremental Results

Scenario: Large documentation set

Slow approach (all-or-nothing):

Time: 5 minutes until first result

Fetch all documentation
Aggregate everything
Present complete report
User waits 5 minutes

Fast approach (streaming):

Time: 30 seconds to first result

Phase 1: Fetch critical docs (30s)
Present: Initial findings
Phase 2: Fetch important docs (60s)
Update: Additional findings
Phase 3: Fetch supplementary (90s)
Final: Complete report

Benefit: User gets value immediately, can stop early if satisfied
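
Streaming maps naturally onto a generator. A minimal sketch (the present call is a placeholder for however findings are delivered):

```python
from typing import Iterator

def stream_findings(phases: list[tuple[str, list[str]]],
                    fetch) -> Iterator[tuple[str, list[str]]]:
    """Yield results one phase at a time so the user sees value immediately."""
    for label, urls in phases:
        yield label, [fetch(url) for url in urls]

# for label, docs in stream_findings(phases, fetch):
#     present(label, docs)    # user can stop after any phase
```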

Optimization Techniques

Technique 1: Workload Balancing

Problem: Uneven distribution causes bottlenecks

Bad distribution:
Agent 1: 1 URL (small) → finishes in 5s
Agent 2: 10 URLs (large) → finishes in 50s
Total: 50s (bottlenecked by Agent 2)

Solution: Balance by estimated size

Good distribution:
Agent 1: 3 URLs (medium pages) → ~15s
Agent 2: 3 URLs (medium pages) → ~15s
Agent 3: 3 URLs (medium pages) → ~15s
Agent 4: 1 URL (large page) → ~15s
Total: ~15s (balanced)
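
Balanced assignment is the classic longest-processing-time greedy: sort by estimated size, then always hand the next URL to the least-loaded agent. A sketch (size estimates are assumed to come from content-length or page type):

```python
import heapq

def balance(urls_with_size: list[tuple[str, int]],
            agents: int) -> list[list[str]]:
    """Greedy LPT: assign each URL (largest first) to the least-loaded agent."""
    heap = [(0, i, []) for i in range(agents)]        # (load, agent_id, batch)
    heapq.heapify(heap)
    for url, size in sorted(urls_with_size, key=lambda x: -x[1]):
        load, i, batch = heapq.heappop(heap)
        batch.append(url)
        heapq.heappush(heap, (load + size, i, batch))
    return [batch for _, _, batch in heap]
```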

Technique 2: Request Coalescing

Problem: Redundant requests slow things down

Bad:
Agent 1: Fetch README.md
Agent 2: Fetch README.md (duplicate!)
Agent 3: Fetch README.md (duplicate!)
Wasted: 2 redundant fetches

Solution: Deduplicate before fetching

Good:
Pre-processing: Identify unique URLs
Agent 1: Fetch README.md (once)
Agent 2: Fetch INSTALL.md
Agent 3: Fetch API.md
Share: README.md content across agents if needed
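
The dedup pass is a set walk over the planned batches. A minimal sketch:

```python
def dedupe(batches: list[list[str]]) -> list[list[str]]:
    """Drop URLs already assigned to an earlier agent; share content afterwards."""
    seen: set[str] = set()
    deduped = []
    for batch in batches:
        unique = [url for url in batch if url not in seen]
        seen.update(unique)
        deduped.append(unique)
    return deduped
```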

Technique 3: Timeout Tuning

Problem: Default timeouts too conservative

Slow:
WebFetch timeout: 120s (too long for fast sites)
If site is down: Wait 120s before failing

Solution: Adaptive timeouts

Fast:
Known fast sites (official docs): 30s timeout
Unknown sites: 60s timeout
Large repos: 120s timeout
If timeout hit: Immediately try alternative
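
A sketch of the adaptive rule (the known-fast host list is illustrative):

```python
KNOWN_FAST_HOSTS = ("react.dev", "nextjs.org", "docs.python.org")  # illustrative

def pick_timeout(url: str, is_large_repo: bool = False) -> int:
    """Adaptive timeout in seconds: short for known-fast docs, long for big repos."""
    if is_large_repo:
        return 120
    if any(host in url for host in KNOWN_FAST_HOSTS):
        return 30
    return 60
```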

Technique 4: Selective Fetching

Problem: Fetching irrelevant content

Wasteful:
Fetch: Installation guide ✓ (needed)
Fetch: API reference ✓ (needed)
Fetch: Internal architecture ✗ (not needed for basic usage)
Fetch: Contributing guide ✗ (not needed)
Fetch: Changelog ✗ (not needed)

Solution: Filter by user needs

Efficient:
User need: "How to get started"
Fetch only: Installation, basic usage, examples
Skip: Advanced topics, internals, contribution
Speedup: 50% less fetching
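
A sketch of the filter for a getting-started request (the skip list is illustrative):

```python
SKIP_FOR_BASICS = ("contributing", "changelog", "internals", "architecture")

def relevant_urls(urls: list[str], user_need: str) -> list[str]:
    """For a 'getting started' request, drop pages irrelevant to basic usage."""
    if "start" in user_need.lower():
        return [u for u in urls
                if not any(skip in u.lower() for skip in SKIP_FOR_BASICS)]
    return urls
```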

Performance Benchmarks

Target Times

| Scenario | Target Time | Acceptable | Too Slow |
|---|---|---|---|
| Single URL | <10s | 10-20s | >20s |
| llms.txt (5 URLs) | <30s | 30-60s | >60s |
| llms.txt (15 URLs) | <60s | 60-120s | >120s |
| Repository analysis | <2min | 2-5min | >5min |
| Research fallback | <3min | 3-7min | >7min |

Real-World Examples

Fast case (Next.js with llms.txt):

00:00 - Start
00:05 - Found llms.txt
00:10 - Fetched content (12 URLs)
00:15 - Launched 4 agents
00:45 - All agents complete
00:55 - Report ready
Total: 55 seconds ✓

Medium case (Repository without llms.txt):

00:00 - Start
00:15 - llms.txt not found
00:20 - Found repository
00:30 - Cloned repository
02:00 - Repomix complete
02:30 - Analyzed output
02:45 - Report ready
Total: 2m 45s ✓

Slow case (Scattered documentation):

00:00 - Start
00:30 - llms.txt not found
00:45 - Repository not found
01:00 - Launched 4 Researcher agents
05:00 - All research complete
06:00 - Aggregated findings
06:30 - Report ready
Total: 6m 30s (acceptable for research)

Common Performance Issues

Issue 1: Too Many Agents

Symptom: Slower than sequential

Problem:
Launched 15 agents for 15 URLs
Overhead: Agent initialization, coordination
Result: Slower than 5 agents with 3 URLs each

Solution:

Max 7 agents per batch
Group URLs sensibly
Use phases for large sets

Issue 2: Blocking Operations

Symptom: Agents waiting unnecessarily

Problem:
Agent 1: Fetch URL, wait for Agent 2
Agent 2: Fetch URL, wait for Agent 3
Agent 3: Fetch URL
Result: Sequential instead of parallel

Solution:

Launch all agents independently
No dependencies between agents
Aggregate after all complete

Issue 3: Redundant Fetching

Symptom: Same content fetched multiple times

Problem:
Phase 1: Fetch installation guide
Phase 2: Fetch installation guide again
Result: Wasted time

Solution:

Cache fetched content
Check cache before fetching
Reuse within session

Issue 4: Late Bailout

Symptom: Continuing when should stop

Problem:
Found 90% of needed info after 1 minute
Spent 4 more minutes on remaining 10%
Result: 5x time for marginal gain

Solution:

Check progress after critical phase
If 80%+ covered → offer to stop
Only continue if user wants comprehensive

Performance Monitoring

Key Metrics

Track these times:

- llms.txt discovery: Target <30s
- Repository clone: Target <60s
- Repomix processing: Target <2min
- Agent exploration: Target <60s
- Total time: Target <3min for typical case
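
A minimal timing tracker for filling in the report template below (a sketch using a context manager; the step names are up to you):

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def timed(step: str):
    """Record wall time per step for the performance report."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[step] = time.perf_counter() - start

# with timed("discovery"):   ...search & fetch llms.txt...
# with timed("exploration"): ...run agents...
# timings -> {"discovery": 15.2, "exploration": 50.1}
```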

Performance Report Template

## Performance Summary

**Total time**: 1m 25s
**Method**: llms.txt + parallel exploration

**Breakdown**:
- Discovery: 15s (llms.txt search & fetch)
- Exploration: 50s (4 agents, 12 URLs)
- Aggregation: 20s (synthesis & formatting)

**Efficiency**: 1.2x faster than sequential exploration
(12 URLs × 5s = 60s sequential; actual: 50s parallel)

When to Optimize Further

Optimize if:

  • Total time >2x target
  • User explicitly requests "fast"
  • Repeated similar queries (cache benefit)
  • Large documentation set (>20 URLs)

Don't over-optimize if:

  • Already meeting targets
  • One-time query
  • User values completeness over speed
  • Research requires thoroughness

Quick Optimization Checklist

Before Starting

  • Check if content already cached
  • Identify fastest method for this case
  • Plan for parallel execution
  • Set appropriate timeouts

During Execution

  • Launch agents in parallel (not sequential)
  • Use single message for multiple agents
  • Monitor for bottlenecks
  • Be ready to terminate early

After First Phase

  • Assess coverage achieved
  • Determine if user needs met
  • Decide: continue or deliver now
  • Cache results for potential reuse

Optimization Decision Tree

Need documentation?
  ↓
Check cache
  ↓
HIT → Use cached (0s) ✓
MISS → Continue
  ↓
llms.txt available?
  ↓
YES → Parallel agents (30-60s) ✓
NO → Continue
  ↓
Repository available?
  ↓
YES → Repomix (2-5min)
NO → Research (3-7min)
  ↓
After Phase 1:
80%+ coverage?
  ↓
YES → Deliver now (save time) ✓
NO → Continue to Phase 2