Initial commit
Zhongwei Li, 2025-11-29 17:52:13 +08:00, commit 4b20ee9596
10 changed files with 3079 additions and 0 deletions

# Example: llms.txt Generation for Different Project Types
This document shows examples of llms.txt files generated for different types of projects, demonstrating how to structure the file based on project characteristics.
---
## Example 1: Python Library (Data Processing)
### Project Context
A Python library called "DataFlow" for stream data processing with multiple output formats.
### Generated llms.txt
```markdown
# DataFlow
> DataFlow is a Python library for processing data streams with real-time transformations
> and multiple output formats. It provides efficient stream processing with lazy evaluation
> and built-in error handling.
Key features:
- Fast stream processing with lazy evaluation
- Support for CSV, JSON, Parquet, and custom formats
- Built-in error handling and recovery
- Zero-dependency core library
- Extensible plugin system
## Documentation
- [Quick Start Guide](https://github.com/example/dataflow/blob/main/docs/quickstart.md): Get up and running in 5 minutes
- [Core Concepts](https://github.com/example/dataflow/blob/main/docs/concepts.md): Understanding streams, transformations, and processing
- [Configuration Guide](https://github.com/example/dataflow/blob/main/docs/configuration.md): All configuration options explained
## API Reference
- [Stream API](https://github.com/example/dataflow/blob/main/docs/api/stream.md): Stream creation and manipulation methods
- [Transformations](https://github.com/example/dataflow/blob/main/docs/api/transforms.md): Built-in transformation functions
- [Exports](https://github.com/example/dataflow/blob/main/docs/api/exports.md): Output format specifications
## Examples
- [Basic Usage](https://github.com/example/dataflow/blob/main/examples/basic.md): Simple stream processing examples
- [Common Patterns](https://github.com/example/dataflow/blob/main/examples/patterns.md): Filtering, mapping, and aggregation
- [Error Handling](https://github.com/example/dataflow/blob/main/examples/errors.md): Handling failures and recovery
- [Advanced Usage](https://github.com/example/dataflow/blob/main/examples/advanced.md): Parallel processing and custom plugins
## Development
- [Contributing Guide](https://github.com/example/dataflow/blob/main/CONTRIBUTING.md): How to contribute to DataFlow
- [Development Setup](https://github.com/example/dataflow/blob/main/docs/development.md): Setting up local development environment
- [Testing](https://github.com/example/dataflow/blob/main/docs/testing.md): Running and writing tests
## Optional
- [DataFlow Blog](https://dataflow.example.com/blog/): Latest updates and tutorials
- [Changelog](https://github.com/example/dataflow/blob/main/CHANGELOG.md): Version history and release notes
- [Performance Benchmarks](https://github.com/example/dataflow/blob/main/docs/performance.md): Benchmark results and optimization tips
```
### Why This Structure?
- **Blockquote**: Clearly explains what DataFlow is and its main value proposition
- **Key Features**: Bullet list highlights important capabilities
- **Documentation**: Essential guides for getting started and understanding core concepts
- **API Reference**: Organized by major components (Stream, Transformations, Exports)
- **Examples**: Progressive from basic to advanced, includes error handling
- **Development**: Resources for contributors
- **Optional**: Secondary resources like blog and benchmarks
---
## Example 2: CLI Tool (Developer Tool)
### Project Context
A command-line tool called "BuildKit" for managing build processes and deployment pipelines.
### Generated llms.txt
```markdown
# BuildKit
> BuildKit is a CLI tool for managing build processes, running tests, and deploying
> applications across multiple environments. It provides a unified interface for common
> development workflows.
BuildKit follows these principles:
- Convention over configuration
- Fast feedback loops
- Environment parity
- Reproducible builds
## Getting Started
- [Installation](https://buildkit.dev/docs/install.md): Installing BuildKit on macOS, Linux, and Windows
- [Quick Start](https://buildkit.dev/docs/quickstart.md): Your first BuildKit project in 5 minutes
- [Core Concepts](https://buildkit.dev/docs/concepts.md): Understanding tasks, pipelines, and environments
## Commands
- [build](https://buildkit.dev/docs/commands/build.md): Build your project with automatic dependency detection
- [test](https://buildkit.dev/docs/commands/test.md): Run tests with parallel execution
- [deploy](https://buildkit.dev/docs/commands/deploy.md): Deploy to staging or production
- [watch](https://buildkit.dev/docs/commands/watch.md): Watch for changes and rebuild automatically
- [All Commands](https://buildkit.dev/docs/commands/): Complete command reference
## Configuration
- [buildkit.yml](https://buildkit.dev/docs/config.md): Configuration file reference
- [Environment Variables](https://buildkit.dev/docs/env.md): Environment-specific configuration
- [Plugins](https://buildkit.dev/docs/plugins.md): Extending BuildKit with custom plugins
## Examples
- [Node.js Projects](https://buildkit.dev/examples/nodejs.md): Building and deploying Node.js apps
- [Python Projects](https://buildkit.dev/examples/python.md): Python application workflows
- [Monorepos](https://buildkit.dev/examples/monorepo.md): Managing multiple packages
- [CI/CD Integration](https://buildkit.dev/examples/ci.md): Using BuildKit in CI/CD pipelines
## Optional
- [BuildKit Blog](https://buildkit.dev/blog/): Tutorials and case studies
- [Plugin Directory](https://buildkit.dev/plugins/): Community plugins
- [Troubleshooting](https://buildkit.dev/docs/troubleshooting.md): Common issues and solutions
```
### Why This Structure?
- **Principles**: Shows design philosophy upfront
- **Getting Started**: Installation and quickstart are priority for CLI tools
- **Commands**: Individual command documentation (most important for CLI tools)
- **Configuration**: Clear section for config files and customization
- **Examples**: Language/framework-specific guides
- **Optional**: Community resources and troubleshooting
---
## Example 3: Web Framework
### Project Context
A web framework called "FastWeb" for building modern web applications.
### Generated llms.txt
```markdown
# FastWeb
> FastWeb is a modern web framework for building full-stack applications with TypeScript.
> It provides server-side rendering, API routes, and built-in database support with
> zero configuration required.
FastWeb features:
- File-based routing with automatic code splitting
- Server-side rendering (SSR) and static site generation (SSG)
- Built-in API routes and middleware
- Real-time capabilities with WebSockets
- TypeScript-first with excellent type inference
## Documentation
- [Getting Started](https://fastweb.dev/docs/getting-started.md): Create your first FastWeb app
- [Routing](https://fastweb.dev/docs/routing.md): File-based routing and dynamic routes
- [Data Fetching](https://fastweb.dev/docs/data.md): Loading data on server and client
- [Rendering](https://fastweb.dev/docs/rendering.md): SSR, SSG, and client-side rendering
- [API Routes](https://fastweb.dev/docs/api.md): Building REST and GraphQL APIs
## Guides
- [Authentication](https://fastweb.dev/guides/auth.md): User authentication and authorization
- [Database Integration](https://fastweb.dev/guides/database.md): Working with databases
- [Deployment](https://fastweb.dev/guides/deployment.md): Deploying to production
- [Testing](https://fastweb.dev/guides/testing.md): Unit and integration testing
- [Performance](https://fastweb.dev/guides/performance.md): Optimization best practices
## API Reference
- [Configuration](https://fastweb.dev/api/config.md): fastweb.config.js options
- [CLI](https://fastweb.dev/api/cli.md): Command-line interface reference
- [Components](https://fastweb.dev/api/components.md): Built-in components
- [Hooks](https://fastweb.dev/api/hooks.md): React-style hooks API
- [Utilities](https://fastweb.dev/api/utils.md): Helper functions and utilities
## Examples
- [Blog](https://fastweb.dev/examples/blog.md): Building a blog with markdown
- [E-commerce](https://fastweb.dev/examples/ecommerce.md): Product catalog and checkout
- [Dashboard](https://fastweb.dev/examples/dashboard.md): Admin dashboard with charts
- [Real-time Chat](https://fastweb.dev/examples/chat.md): WebSocket-based chat app
## Integrations
- [Databases](https://fastweb.dev/integrations/databases.md): PostgreSQL, MySQL, MongoDB
- [CSS Frameworks](https://fastweb.dev/integrations/css.md): Tailwind, Bootstrap, etc.
- [Analytics](https://fastweb.dev/integrations/analytics.md): Google Analytics, Plausible
- [CMS](https://fastweb.dev/integrations/cms.md): Headless CMS integrations
## Optional
- [FastWeb Blog](https://fastweb.dev/blog/): Tutorials and announcements
- [Showcase](https://fastweb.dev/showcase/): Sites built with FastWeb
- [Community](https://fastweb.dev/community/): Discord, GitHub discussions
- [Changelog](https://fastweb.dev/changelog/): Version history
```
### Why This Structure?
- **Framework Features**: Lists core capabilities upfront
- **Documentation**: Core framework concepts and features
- **Guides**: Task-oriented how-to guides (authentication, deployment, etc.)
- **API Reference**: Technical reference for configuration and APIs
- **Examples**: Complete application examples
- **Integrations**: Third-party tool integration guides
- **Optional**: Community and showcase resources
---
## Example 4: Claude Skill
### Project Context
A Claude skill for optimizing documentation (this project!).
### Generated llms.txt
```markdown
# c7score-optimizer
> A Claude skill that optimizes project documentation and README files to score highly
> on Context7's c7score benchmark, making docs more effective for AI-assisted coding tools.
> Also generates llms.txt files for projects.
The skill provides:
- Documentation analysis and quality assessment
- Question-driven content restructuring
- Code snippet enhancement with context
- llms.txt file generation
- Python analysis script for automated scanning
## Documentation
- [README](https://github.com/example/c7score-optimizer/blob/main/README.md): Overview, installation, and usage
- [Skill Definition](https://github.com/example/c7score-optimizer/blob/main/SKILL.md): Complete skill workflow and instructions
- [Changelog](https://github.com/example/c7score-optimizer/blob/main/CHANGELOG.md): Version history and updates
## Reference Materials
- [C7Score Metrics](https://github.com/example/c7score-optimizer/blob/main/references/c7score_metrics.md): Understanding the c7score benchmark
- [Optimization Patterns](https://github.com/example/c7score-optimizer/blob/main/references/optimization_patterns.md): 20+ transformation patterns
- [llms.txt Format](https://github.com/example/c7score-optimizer/blob/main/references/llmstxt_format.md): Complete llms.txt specification
## Examples
- [README Optimization](https://github.com/example/c7score-optimizer/blob/main/examples/sample_readme.md): Before/after documentation transformation
- [llms.txt Generation](https://github.com/example/c7score-optimizer/blob/main/examples/sample_llmstxt.md): Generated llms.txt examples
## Development
- [Analysis Script](https://github.com/example/c7score-optimizer/blob/main/scripts/analyze_docs.py): Python tool for documentation scanning
- [Contributing](https://github.com/example/c7score-optimizer/blob/main/CONTRIBUTING.md): How to contribute improvements
## Optional
- [Context7 c7score](https://www.context7.ai/c7score): Official c7score benchmark
- [llmstxt.org](https://llmstxt.org/): Official llms.txt specification
- [Claude Code Docs](https://docs.claude.com/claude-code): Claude Code documentation
```
### Why This Structure?
- **Skill Capabilities**: Clear explanation of what the skill does
- **Documentation**: Essential files (README, SKILL.md, CHANGELOG)
- **Reference Materials**: Detailed specifications and patterns
- **Examples**: Practical before/after demonstrations
- **Development**: Tools and contribution guides
- **Optional**: External resources and official documentation
---
## Key Patterns Across All Examples
### 1. Strong Opening
Every example has:
- Clear H1 with project name
- Informative blockquote explaining what it is
- Key features/principles in bullets
### 2. Logical Section Progression
Common pattern:
1. **Getting Started / Documentation** (high priority)
2. **API / Commands / Core Features** (high priority)
3. **Guides / Examples** (practical applications)
4. **Development / Contributing** (for contributors)
5. **Optional** (secondary resources)
### 3. Descriptive Links
All links include:
- Clear, action-oriented titles
- Helpful descriptions after colons
- Context about what each resource contains
### 4. Full URLs
All examples use complete URLs with protocol:
- ✅ `https://example.com/docs/guide.md`
- ❌ `/docs/guide.md`
- ❌ `../guide.md`
### 5. Markdown-First
Prefer linking to `.md` files:
- ✅ `docs/guide.md`
- ⚠️ `docs/guide.html` (acceptable if no .md available)
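The pattern checks above (single H1, blockquote summary, full URLs, markdown-first links) can be automated. Below is a rough validator sketch, not part of any official llms.txt tooling; the rule set is deliberately minimal and the function name is illustrative:

```python
import re

def check_llmstxt(text: str) -> list[str]:
    """Return warnings for an llms.txt document (minimal rule set)."""
    warnings = []
    lines = text.splitlines()
    # Strong opening: an H1 with the project name
    if not any(line.startswith("# ") for line in lines):
        warnings.append("missing H1 project name")
    # Strong opening: an informative blockquote summary
    if not any(line.startswith("> ") for line in lines):
        warnings.append("missing blockquote summary")
    # Full URLs and markdown-first link targets
    for url in re.findall(r"\]\(([^)]+)\)", text):
        if not url.startswith(("http://", "https://")):
            warnings.append(f"relative URL: {url}")
        elif not (url.endswith(".md") or url.endswith("/")):
            warnings.append(f"non-markdown link: {url}")
    return warnings
```

Running it on a well-formed fragment returns an empty list; on a fragment with relative links and no opening, it returns one warning per violated pattern.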
---
## Decision Tree: What Sections to Include?
### For Libraries/Packages
- **Must have**: Documentation, API Reference, Examples
- **Should have**: Getting Started, Development
- **Nice to have**: Guides, Integrations, Optional
### For CLI Tools
- **Must have**: Getting Started, Commands, Examples
- **Should have**: Configuration, Development
- **Nice to have**: Plugins, Troubleshooting, Optional
### For Frameworks
- **Must have**: Documentation, Guides, API Reference, Examples
- **Should have**: Integrations, Getting Started
- **Nice to have**: Showcase, Optional
### For Skills/Plugins
- **Must have**: Documentation, Reference Materials
- **Should have**: Examples, Development
- **Nice to have**: Optional (external resources)
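The decision tree above can also be expressed as data, which is handy when generating llms.txt files programmatically. The table below simply restates the tier lists; the `SECTION_TIERS` name and `planned_sections` helper are illustrative, not from any existing tool:

```python
# Section tiers per project type, mirroring the decision tree above.
SECTION_TIERS = {
    "library": {
        "must": ["Documentation", "API Reference", "Examples"],
        "should": ["Getting Started", "Development"],
        "nice": ["Guides", "Integrations", "Optional"],
    },
    "cli": {
        "must": ["Getting Started", "Commands", "Examples"],
        "should": ["Configuration", "Development"],
        "nice": ["Plugins", "Troubleshooting", "Optional"],
    },
    "framework": {
        "must": ["Documentation", "Guides", "API Reference", "Examples"],
        "should": ["Integrations", "Getting Started"],
        "nice": ["Showcase", "Optional"],
    },
    "skill": {
        "must": ["Documentation", "Reference Materials"],
        "should": ["Examples", "Development"],
        "nice": ["Optional"],
    },
}

def planned_sections(project_type: str) -> list[str]:
    """Sections to include for a project type, highest priority first."""
    tiers = SECTION_TIERS[project_type]
    return tiers["must"] + tiers["should"] + tiers["nice"]
```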
---
## Common Customizations by Project Type
### Open Source Project
Add to Optional:
- Contributing guide
- Code of conduct
- Governance
- Roadmap
### Commercial Product
Add sections:
- Pricing/Plans
- Support
- Enterprise features
- Migration guides
### Educational Resource
Add sections:
- Tutorials
- Video courses
- Exercises
- Certification
### Research Project
Add sections:
- Papers
- Datasets
- Experiments
- Citations
---
## Anti-Patterns to Avoid
### ❌ Too Granular
```markdown
## Installation
- [macOS Install](url)
- [Linux Install](url)
- [Windows Install](url)
- [Docker Install](url)
```
Better: One "Installation" link covering all platforms
### ❌ No Descriptions
```markdown
- [Guide](url)
- [Docs](url)
- [API](url)
```
Better: Add helpful context after colons
### ❌ Outdated Links
```markdown
- [Guide](https://example.com/v1/guide.md)
```
Better: Link to latest version or version-agnostic URLs
### ❌ Relative URLs
```markdown
- [Guide](../docs/guide.md)
```
Better: Use full URLs with protocol
### ❌ Too Much Content
Don't paste entire documentation. Link to it.
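Several of these anti-patterns are mechanically detectable. The sketch below is a heuristic linter for link lines, complementing the structural checks earlier; the regexes and warning messages are assumptions, not a published checker:

```python
import re

# Matches "- [Title](url)" with an optional ": description" tail.
LINK = re.compile(r"^- \[([^\]]+)\]\(([^)]+)\)(.*)$")

def lint_links(llmstxt: str) -> list[str]:
    """Flag missing descriptions, relative URLs, and version-pinned URLs."""
    problems = []
    for line in llmstxt.splitlines():
        m = LINK.match(line.strip())
        if not m:
            continue
        title, url, rest = m.groups()
        if not rest.strip().startswith(":"):
            problems.append(f"no description: [{title}]")
        if not url.startswith(("http://", "https://")):
            problems.append(f"relative URL: {url}")
        if re.search(r"/v\d+/", url):
            problems.append(f"version-pinned URL: {url}")
    return problems
```

A link such as `- [Guide](https://example.com/guide.md): How to start` passes cleanly; `- [Guide](https://example.com/v1/guide.md)` triggers both the missing-description and version-pin warnings.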
---
Use these examples as templates when generating llms.txt files for different project types!

---
# Example: Before and After C7Score Optimization
This example demonstrates the transformation of a typical README into an optimized version that scores higher on the c7score benchmark.
---
## BEFORE Optimization (Low c7score)
````markdown
# DataFlow
![Build Status](https://img.shields.io/badge/build-passing-brightgreen)
![Version](https://img.shields.io/badge/version-2.3.1-blue)
Last updated: March 15, 2024
DataFlow is a Python library for processing data streams.
## Features
- Stream processing
- Data transformation
- Multiple output formats
- Error handling
## Installation
```bash
pip install dataflow
```
## Usage
```python
from dataflow import Stream
s = Stream()
```
## API
See the API documentation for details.
## License
MIT
````
### Issues with the BEFORE version:
- ❌ Metadata badges and timestamps (reduce c7score)
- ❌ Vague descriptions without concrete examples
- ❌ Code snippets lack context and imports
- ❌ No question-driven structure
- ❌ Missing error handling examples
- ❌ No progressive complexity
- ❌ Features listed without demonstrations
- ❌ Incomplete quickstart example
---
## AFTER Optimization (High c7score)
````markdown
# DataFlow
A Python library for processing data streams with real-time transformations and multiple output formats.
## Quick Start
### How do I install DataFlow?
```bash
pip install dataflow
```
For development installation:
```bash
git clone https://github.com/example/dataflow.git
cd dataflow
pip install -e ".[dev]"
```
### How do I process my first data stream?
```python
from dataflow import Stream, Processor
# Create a stream and add data
stream = Stream()
stream.add_data([1, 2, 3, 4, 5])
# Process the stream
processor = Processor()
result = processor.transform(stream, lambda x: x * 2)
print(result.to_list())  # Output: [2, 4, 6, 8, 10]
```
### How do I handle different data sources?
```python
from dataflow import Stream, FileSource, APISource
# Load from a file
file_stream = Stream.from_source(
    FileSource('data.csv', format='csv')
)
# Load from an API
api_stream = Stream.from_source(
    APISource('https://api.example.com/data', auth_token='your_token')
)
# Process both streams
for item in file_stream:
    print(f"Processing: {item}")
```
## Common Use Cases
### How do I transform data in a stream?
```python
from dataflow import Stream, Transformer
stream = Stream.from_list([
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25}
])
# Apply transformations
transformed = stream.map(lambda x: {
    **x,
    'age_group': 'adult' if x['age'] >= 18 else 'minor'
})
for person in transformed:
    print(person)
# Output: {'name': 'Alice', 'age': 30, 'age_group': 'adult'}
#         {'name': 'Bob', 'age': 25, 'age_group': 'adult'}
```
### How do I filter data in a stream?
```python
from dataflow import Stream
stream = Stream.from_list([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Filter even numbers
evens = stream.filter(lambda x: x % 2 == 0)
print(evens.to_list())  # Output: [2, 4, 6, 8, 10]
```
### How do I export stream data to different formats?
```python
from dataflow import Stream
stream = Stream.from_list([
    {'id': 1, 'value': 100},
    {'id': 2, 'value': 200}
])
# Export to JSON
stream.to_json('output.json', indent=2)
# Export to CSV
stream.to_csv('output.csv', headers=True)
# Export to Parquet (requires pyarrow)
stream.to_parquet('output.parquet', compression='snappy')
```
## Error Handling
### How do I handle errors during stream processing?
```python
from dataflow import Stream, ProcessingError
stream = Stream.from_list([1, 2, 'invalid', 4, 5])
try:
    result = stream.map(lambda x: x * 2)
except ProcessingError as e:
    print(f"Error processing stream: {e}")
    print(f"Failed at item: {e.item}")
    print(f"Position: {e.position}")
```
### How do I skip invalid items in a stream?
```python
from dataflow import Stream
stream = Stream.from_list([1, 2, 'invalid', 4, 5])
# Skip invalid items by checking types ('invalid' * 2 is valid string
# repetition in Python, so catching TypeError would not filter it out)
def safe_transform(x):
    if not isinstance(x, (int, float)):
        return None
    return x * 2
result = stream.map(safe_transform).filter(lambda x: x is not None)
print(result.to_list())  # Output: [2, 4, 8, 10]
```
## Advanced Usage
### How do I chain multiple transformations?
```python
from dataflow import Stream
result = (
    Stream.from_list([1, 2, 3, 4, 5])
    .filter(lambda x: x > 2)      # Keep values > 2
    .map(lambda x: x * 2)         # Double them
    .map(lambda x: {'value': x})  # Convert to dict
    .to_list()
)
print(result)
# Output: [{'value': 6}, {'value': 8}, {'value': 10}]
```
### How do I process streams in parallel?
```python
from dataflow import Stream, ParallelProcessor
stream = Stream.from_list(range(1000))
# Process with 4 worker threads
processor = ParallelProcessor(workers=4)
result = processor.map(stream, lambda x: expensive_operation(x))
for item in result:
    print(item)
```
### How do I aggregate data from a stream?
```python
from dataflow import Stream
stream = Stream.from_list([1, 2, 3, 4, 5])
# Calculate sum
total = stream.reduce(lambda acc, x: acc + x, initial=0)
print(f"Sum: {total}")  # Output: Sum: 15
# Calculate average
count = stream.count()
average = total / count
print(f"Average: {average}")  # Output: Average: 3.0
```
## Performance
### How do I optimize stream processing performance?
```python
from dataflow import Stream, BufferedStream
# Use buffering for better performance
stream = BufferedStream.from_source(
    source=large_data_source,
    buffer_size=1000  # Process in chunks of 1000
)
# Enable lazy evaluation
result = stream.lazy().map(transform_fn).filter(filter_fn)
# Only evaluate when needed
final_data = result.to_list()
```
## Requirements
- Python 3.7 or higher
- Optional dependencies:
  - `pyarrow` for Parquet support
  - `pandas` for DataFrame integration
## Installation Options
Standard installation:
```bash
pip install dataflow
```
With optional dependencies:
```bash
pip install dataflow[parquet]  # For Parquet support
pip install dataflow[pandas]   # For pandas integration
pip install dataflow[all]      # All optional features
```
## License
MIT License - see LICENSE file for details
````
### Improvements in the AFTER version:
- ✅ Removed metadata badges and timestamps
- ✅ Question-driven headers throughout
- ✅ Complete code examples with imports and context
- ✅ Progressive complexity (basic → advanced)
- ✅ Error handling examples
- ✅ Multiple use cases demonstrated
- ✅ Concrete outputs shown in comments
- ✅ Installation options clearly explained
- ✅ Common questions answered with working code
---
## C7Score Impact Estimate
### BEFORE Version Metrics:
- Question-Snippet Matching: ~40/100 (incomplete examples, poor alignment)
- LLM Evaluation: ~50/100 (vague descriptions)
- Formatting: ~70/100 (basic markdown, code blocks present)
- Metadata Removal: ~30/100 (badges and timestamps present)
- Initialization Examples: ~50/100 (incomplete quickstart)
**Estimated BEFORE c7score: ~45/100**
### AFTER Version Metrics:
- Question-Snippet Matching: ~90/100 (excellent Q&A alignment)
- LLM Evaluation: ~95/100 (comprehensive, clear)
- Formatting: ~95/100 (proper structure, complete blocks)
- Metadata Removal: ~100/100 (all noise removed)
- Initialization Examples: ~95/100 (complete, progressive)
**Estimated AFTER c7score: ~92/100**
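For illustration, an equal-weight average of the five sub-scores can be computed directly. The actual c7score weighting is not specified here, which is why the plain means (48 and 95) differ slightly from the ~45 and ~92 estimates above:

```python
def average_score(metrics: dict[str, int]) -> float:
    """Unweighted mean of sub-scores; a ballpark only, since the
    real c7score weighting is not documented in this example."""
    return sum(metrics.values()) / len(metrics)

before = {
    "question_snippet": 40,
    "llm_eval": 50,
    "formatting": 70,
    "metadata_removal": 30,
    "init_examples": 50,
}
after = {
    "question_snippet": 90,
    "llm_eval": 95,
    "formatting": 95,
    "metadata_removal": 100,
    "init_examples": 95,
}
print(average_score(before))  # 48.0
print(average_score(after))   # 95.0
```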
---
## Key Transformation Patterns Used
1. **Question Headers**: "Installation" → "How do I install DataFlow?"
2. **Complete Examples**: Added imports, setup, and expected outputs
3. **Progressive Complexity**: Basic → Common → Advanced sections
4. **Error Scenarios**: Dedicated error handling examples
5. **Concrete Outputs**: Included actual output in code comments
6. **Noise Removal**: Stripped badges and timestamps
7. **Context Addition**: Every snippet is runnable as-is
8. **Multiple Paths**: Showed different ways to achieve goals
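Pattern 1 (question headers) can be sketched as a small rewrite pass over markdown headings. The mapping table below is illustrative only and would be extended per project; none of these names come from an existing tool:

```python
import re

# Illustrative mapping from noun-phrase headers to question headers.
QUESTION_HEADERS = {
    "Installation": "How do I install {name}?",
    "Usage": "How do I use {name}?",
    "Configuration": "How do I configure {name}?",
}

def questionify(markdown: str, name: str) -> str:
    """Rewrite '## Installation'-style headers as questions; leave
    headers without a mapping (e.g. '## License') untouched."""
    def repl(m):
        template = QUESTION_HEADERS.get(m.group(2))
        if template is None:
            return m.group(0)
        return m.group(1) + template.format(name=name)
    return re.sub(r"^(#{2,3} )(.+)$", repl, markdown, flags=re.MULTILINE)

print(questionify("## Installation", "DataFlow"))
# ## How do I install DataFlow?
```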
Use this example as a template for optimizing your own documentation!