Initial commit

Zhongwei Li
2025-11-29 17:52:13 +08:00
commit 4b20ee9596
10 changed files with 3079 additions and 0 deletions


@@ -0,0 +1,349 @@
# C7Score Metrics Reference
## Overview
c7score evaluates documentation quality for Context7 using 5 metrics divided into two groups:
- **LLM Analysis** (Metrics 1-2): AI-powered evaluation
- **Text Analysis** (Metrics 3-5): Rule-based checks
## Metric 1: Question-Snippet Comparison (LLM)
**What it measures:** How well code snippets answer common developer questions about the library.
**Scoring approach:**
- LLM generates 15 common questions developers might ask about the library
- Each snippet is evaluated on how well it answers these questions
- Higher scores for snippets that directly address practical usage questions
**Optimization strategies:**
- Include code examples that answer "how do I..." questions
- Provide working code snippets for common use cases
- Address setup, configuration, and basic operations
- Show real-world usage patterns, not just API signatures
- Include examples that demonstrate the library's main features
**What scores well:**
- "How do I initialize the client?" with full working example
- "How do I handle authentication?" with complete code
- "How do I make a basic query?" with error handling included
**What scores poorly:**
- Partial code that doesn't run standalone
- API reference without usage examples
- Theoretical explanations without practical code
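To make the mechanism concrete, the sketch below approximates this metric: it asks an LLM for 15 common questions, then asks it to rate how well the snippets answer each one. The `ask_llm` callable, the prompt wording, and the 0-100 scale are illustrative assumptions, not c7score's actual implementation.
```python
from typing import Callable

# Illustrative approximation of Metric 1 -- not c7score's real code.
# ask_llm is a placeholder for any "prompt in, text out" LLM call you have.
def question_snippet_score(
    library_name: str,
    snippets: list[str],
    ask_llm: Callable[[str], str],
) -> float:
    questions_text = ask_llm(
        f"List 15 common questions developers ask about {library_name}, one per line."
    )
    questions = [q.strip() for q in questions_text.splitlines() if q.strip()]
    scores = []
    for question in questions:
        answer = ask_llm(
            "On a scale of 0-100, how well do the following documentation snippets "
            f"answer this question: {question!r}? Reply with a single number.\n\n"
            + "\n\n".join(snippets)
        )
        scores.append(float(answer.strip()))
    # Average coverage across all generated questions
    return sum(scores) / len(scores) if scores else 0.0
```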
## Metric 2: LLM Evaluation (LLM)
**What it measures:** Overall snippet quality including relevancy, clarity, and correctness.
**Scoring criteria:**
- **Relevancy**: Does the snippet provide useful information about the library?
- **Clarity**: Is the code and explanation easy to understand?
- **Correctness**: Is the code syntactically correct and using proper APIs?
- **Uniqueness**: Are snippets providing unique information or duplicating content?
**Optimization strategies:**
- Ensure each snippet provides distinct, valuable information
- Use clear variable names and structure
- Add brief explanatory comments where helpful
- Verify all code is syntactically correct
- Remove or consolidate duplicate snippets
- Test code examples to ensure they work
**What causes low scores:**
- High rate of duplicate snippets (>25% identical copies)
- Unclear or confusing code structure
- Syntax errors or incorrect API usage
- Snippets that don't add new information
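The duplicate problem in particular can be spotted before any LLM call. A minimal pre-check sketch, assuming the documentation is already split into snippet strings (the whitespace normalization and the 25% threshold follow the guidance above, not c7score's exact rules):
```python
from collections import Counter

def duplicate_ratio(snippets: list[str]) -> float:
    """Fraction of snippets that are exact copies of an earlier snippet."""
    # Normalize whitespace so trivially reformatted copies still match
    normalized = [" ".join(s.split()) for s in snippets]
    counts = Counter(normalized)
    duplicates = sum(count - 1 for count in counts.values() if count > 1)
    return duplicates / len(snippets) if snippets else 0.0

# Illustrative snippets only
example = ["import foo", "import foo", "foo.run()", "foo.run()"]
print(duplicate_ratio(example))  # 0.5 -- well above the 25% threshold
```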
## Metric 3: Formatting (Text Analysis)
**What it measures:** Whether snippets have the expected format and structure.
**Checks performed:**
- Are categories missing? (e.g., no title, description, or code)
- Are code snippets too short or too long?
- Are language tags actually descriptions? (e.g., "FORTE Build System Configuration")
- Are languages set to "none" or showing console output?
- Is the code just a list or argument descriptions?
**Optimization strategies:**
- Follow consistent snippet structure: TITLE / DESCRIPTION / CODE
- Use 40-dash delimiters between snippets (----------------------------------------)
- Set proper language tags (python, javascript, typescript, bash, etc.)
- Avoid very short snippets (<3 lines) unless absolutely necessary
- Avoid very long snippets (>100 lines) - break into focused examples
- Don't use lists in place of code
**Example good format:**
````
Getting Started with Authentication
----------------------------------------
Initialize the client with your API key and authenticate requests.
```python
from library import Client
client = Client(api_key="your_api_key")
client.authenticate()
```
````
**What to avoid:**
- Language tags like "CLI Arguments" or "Configuration File"
- Pretty-printed tables instead of code
- Numbered/bulleted lists masquerading as code
- Missing titles or descriptions
- Inconsistent formatting
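The checks listed under "Checks performed" above are plain text rules, so they can be approximated without an LLM. A rough sketch, assuming each snippet has already been parsed into title, description, language tag, and code (field names and thresholds are illustrative, not c7score's source):
```python
KNOWN_LANGUAGES = {"python", "javascript", "typescript", "bash", "json", "yaml", "go", "rust"}

def formatting_issues(title: str, description: str, language: str, code: str) -> list[str]:
    """Illustrative rule-based checks; not c7score's actual implementation."""
    issues = []
    if not title.strip() or not description.strip() or not code.strip():
        issues.append("missing title, description, or code")
    if language.lower() not in KNOWN_LANGUAGES:
        issues.append(f"language tag {language!r} is a description, 'none', or console output")
    lines = code.strip().splitlines()
    if len(lines) < 3:
        issues.append("code is very short (<3 lines)")
    if len(lines) > 100:
        issues.append("code is very long (>100 lines); split into focused examples")
    if lines and all(line.lstrip().startswith(("-", "*", "1.", "2.")) for line in lines):
        issues.append("code block is just a list, not code")
    return issues
```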
## Metric 4: Project Metadata (Text Analysis)
**What it measures:** Presence of irrelevant project information that doesn't help developers use the library.
**Checks performed:**
- BibTeX citations (would have language tag "Bibtex")
- Licensing information
- Directory structure listings
- Project governance or administrative content
**Optimization strategies:**
- Remove or minimize licensing snippets
- Avoid directory tree representations
- Don't include citation information
- Focus on usage, not project management
- Keep administrative content out of code documentation
**What to remove or relocate:**
- LICENSE files or license text
- CONTRIBUTING.md guidelines
- Directory listings or project structure
- Academic citations (BibTeX, APA, etc.)
- Governance policies
**Exception:** Brief installation or setup instructions that mention directories are okay if needed for library usage.
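These checks, too, come down to pattern matching. A minimal sketch (the patterns below are illustrative; c7score's own detection rules may differ):
```python
import re

# Illustrative patterns for project metadata
METADATA_PATTERNS = [
    (re.compile(r"@(article|misc|inproceedings)\s*\{", re.IGNORECASE), "BibTeX citation"),
    (re.compile(r"\b(MIT|Apache|GPL)\s+License\b|Copyright \(c\)", re.IGNORECASE), "license text"),
    (re.compile(r"[├└]──"), "directory tree listing"),
]

def metadata_flags(snippet: str) -> list[str]:
    """Return a label for each project-metadata pattern found in a snippet."""
    return [label for pattern, label in METADATA_PATTERNS if pattern.search(snippet)]
```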
## Metric 5: Initialization (Text Analysis)
**What it measures:** Snippets that are only imports or installations without meaningful content.
**Checks performed:**
- Snippets that are just import statements
- Snippets that are just installation commands (pip install, npm install)
- No additional context or usage examples
**Optimization strategies:**
- Combine imports with usage examples
- Show installation in context of setup process
- Always follow imports with actual usage code
- Make installation snippets include next steps
**Good approach:**
```python
# Installation and basic usage
# First install: pip install library-name
from library import Client
# Initialize and make your first request
client = Client()
result = client.get_data()
```
**Poor approach:**
```python
# Just imports
import library
from library import Client
```
```bash
# Just installation
pip install library-name
```
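Setup-only snippets like the ones above can be flagged with simple line-level rules. A sketch, with illustrative prefixes rather than c7score's actual rules:
```python
# Illustrative prefixes; real detection rules may differ
SETUP_PREFIXES = ("import ", "from ", "pip install", "npm install", "yarn add", "#")

def is_setup_only(code: str) -> bool:
    """True if every non-empty line is an import, install command, or comment."""
    lines = [line.strip() for line in code.splitlines() if line.strip()]
    return bool(lines) and all(line.startswith(SETUP_PREFIXES) for line in lines)

print(is_setup_only("import library\nfrom library import Client"))     # True
print(is_setup_only("from library import Client\nclient = Client()"))  # False
```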
## Scoring Weights
Default c7score weights (can be customized):
- Question-Snippet Comparison: 0.8 (80%)
- LLM Evaluation: 0.05 (5%)
- Formatting: 0.05 (5%)
- Project Metadata: 0.05 (5%)
- Initialization: 0.05 (5%)
The question-answer metric dominates because Context7's primary goal is helping developers answer practical questions about library usage.
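Combining the metrics is then a plain weighted sum. A small sketch using the default weights listed above, with each metric score assumed to be on a 0-100 scale:
```python
# Default c7score weights as listed above
DEFAULT_WEIGHTS = {
    "question_snippet": 0.80,
    "llm_evaluation": 0.05,
    "formatting": 0.05,
    "project_metadata": 0.05,
    "initialization": 0.05,
}

def overall_score(metric_scores: dict[str, float], weights=DEFAULT_WEIGHTS) -> float:
    """Weighted sum of per-metric scores, each on a 0-100 scale."""
    return sum(metric_scores[name] * weight for name, weight in weights.items())

print(overall_score({
    "question_snippet": 85,
    "llm_evaluation": 90,
    "formatting": 100,
    "project_metadata": 100,
    "initialization": 95,
}))  # 87.25
```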
## Overall Best Practices
1. **Focus on answering questions**: Think "How would a developer actually use this?"
2. **Provide complete, working examples**: Not just fragments
3. **Ensure uniqueness**: Each snippet should teach something new
4. **Structure consistently**: TITLE / DESCRIPTION / CODE format
5. **Use proper language tags**: python, javascript, typescript, etc.
6. **Remove noise**: No licensing, directory trees, or pure imports
7. **Test your code**: All examples should be syntactically correct
8. **Keep it practical**: Real-world usage beats theoretical explanation
---
## Self-Evaluation Rubrics
When evaluating documentation quality using c7score methodology, use these detailed rubrics:
### 1. Question-Snippet Matching Rubric (80% weight)
**Score: 90-100 (Excellent)**
- All major developer questions have complete answers
- Code examples are self-contained and runnable
- Examples include imports, setup, and usage context
- Common use cases are clearly demonstrated
- Error handling is shown where relevant
- Examples progress from simple to advanced
**Score: 70-89 (Good)**
- Most questions are answered with working code
- Examples are mostly complete but may miss minor details
- Some context or imports may be implicit
- Common use cases covered
- Minor gaps in error handling
**Score: 50-69 (Fair)**
- Some questions answered, others partially addressed
- Examples require significant external knowledge
- Missing imports or setup context
- Limited use case coverage
- Error handling largely absent
**Score: 30-49 (Poor)**
- Few questions fully answered
- Examples are fragments without context
- Unclear how to actually use the code
- Major use cases not covered
- No error handling
**Score: 0-29 (Very Poor)**
- Questions not addressed in documentation
- No practical examples
- Only API signatures without usage
- Cannot determine how to use the library
### 2. LLM Evaluation Rubric (10% weight)
**Unique Information (30% of metric):**
- 100%: Every snippet provides unique value, no duplicates
- 75%: Minimal duplication, mostly unique content
- 50%: Some repeated information across snippets
- 25%: Significant duplication
- 0%: Many duplicate snippets
**Clarity (30% of metric):**
- 100%: Well-worded, professional, no errors
- 75%: Clear with minor grammar/wording issues
- 50%: Understandable but awkward phrasing
- 25%: Confusing or poorly worded
- 0%: Unclear, incomprehensible
**Correct Syntax (40% of metric):**
- 100%: All code syntactically perfect
- 75%: Minor syntax issues (missing semicolons, etc.)
- 50%: Some syntax errors but code is recognizable
- 25%: Multiple syntax errors
- 0%: Code is not valid
**Final LLM Evaluation Score** = (Unique×0.3) + (Clarity×0.3) + (Syntax×0.4)
### 3. Formatting Rubric (5% weight)
**Score: 100 (Perfect)**
- All snippets have proper language tags (python, javascript, etc.)
- Language tags are actual languages, not descriptions
- All code blocks use triple backticks with language
- Code blocks are properly closed
- No lists within CODE sections
- Minimum length requirements met (5+ words)
**Score: 80-99 (Minor Issues)**
- 1-2 snippets missing language tags
- One or two incorrectly formatted blocks
- Minor inconsistencies
**Score: 50-79 (Multiple Problems)**
- Several snippets missing language tags
- Some use descriptive strings instead of language names
- Inconsistent formatting
**Score: 0-49 (Significant Issues)**
- Many snippets improperly formatted
- Widespread use of wrong language tags
- Code not in proper blocks
### 4. Metadata Removal Rubric (2.5% weight)
**Score: 100 (Clean)**
- No license text in code examples
- No citation formats (BibTeX, RIS)
- No directory structure listings
- No project metadata
- Pure code and usage examples
**Score: 75-99 (Minimal Metadata)**
- One or two snippets with minor metadata
- Brief license mentions that don't dominate
**Score: 50-74 (Some Metadata)**
- Several snippets include project metadata
- Directory structures present
- Some citation content
**Score: 0-49 (Heavy Metadata)**
- Significant license/citation content
- Multiple directory listings
- Project metadata dominates
### 5. Initialization Rubric (2.5% weight)
**Score: 100 (Excellent)**
- All examples show usage beyond setup
- Installation combined with first usage
- Imports followed by practical examples
- No standalone import/install snippets
**Score: 75-99 (Mostly Good)**
- 1-2 snippets are setup-only
- Most examples show actual usage
**Score: 50-74 (Some Init-Only)**
- Several snippets are just imports/installation
- Mixed quality
**Score: 0-49 (Many Init-Only)**
- Many snippets are only imports
- Many snippets are only installation
- Lack of usage examples
### Scoring Best Practices
**When evaluating:**
1. **Read entire documentation** before scoring
2. **Count specific examples** (e.g., "7 out of 10 snippets...")
3. **Be consistent** between before/after evaluations
4. **Explain scores** with concrete evidence
5. **Use percentages** when quantifying (e.g., "80% of examples...")
6. **Identify improvements** specifically
7. **Calculate weighted average**: (Q×0.8) + (L×0.1) + (F×0.05) + (M×0.025) + (I×0.025)
**Example Calculation:**
- Question-Snippet: 85/100 × 0.8 = 68
- LLM Evaluation: 90/100 × 0.1 = 9
- Formatting: 100/100 × 0.05 = 5
- Metadata: 100/100 × 0.025 = 2.5
- Initialization: 95/100 × 0.025 = 2.375
- **Total: 86.875 ≈ 87/100**
### Common Scoring Mistakes to Avoid
- **Being too generous**: Score based on evidence, not potential
- **Ignoring weights**: Question-answer matters most (80%)
- **Vague explanations**: Say "5 of 8 examples lack imports" not "some issues"
- **Inconsistent standards**: Apply same rubric to before/after
- **Forgetting context**: Consider project type and audience
**Be specific, objective, and consistent.**


@@ -0,0 +1,406 @@
# llms.txt Format Specification
This document provides a complete reference for creating llms.txt files according to the official specification at https://llmstxt.org/
## Overview
**llms.txt** is a standardized markdown file format designed to provide LLM-friendly content summaries and documentation. It solves a critical problem: context windows are too small to handle most websites in their entirety.
### Purpose
- Provides brief background information, guidance, and links to detailed markdown files
- Optimized for consumption by language models and AI agents
- Used at inference time when users explicitly request information
- Helps LLMs navigate documentation, understand projects, and access the right resources
- Enables chatbots with search functionality to retrieve relevant information efficiently
### Why Markdown?
The specification uses markdown rather than XML/JSON because "we expect many of these files to be read by language models and agents" while still being "readable using standard programmatic-based tools."
## File Structure
The format follows a specific structural hierarchy:
1. **H1 heading** (`# Title`) - **REQUIRED**
2. **Blockquote summary** (`> text`) - Optional but recommended
3. **Descriptive content** (paragraphs, bullet lists) - Optional
4. **H2-delimited sections** (`## Section Name`) with file lists - Optional
### Basic Template
```markdown
# Project Name
> Brief summary of what this project does and why it exists.
- Key principle or feature
- Another important concept
- Third key point
## Documentation
- [Main Guide](https://example.com/docs/guide.md): Getting started guide
- [API Reference](https://example.com/docs/api.md): Complete API documentation
## Examples
- [Basic Usage](https://example.com/examples/basic.md): Simple examples
- [Advanced Patterns](https://example.com/examples/advanced.md): Complex use cases
## Optional
- [Blog](https://example.com/blog/): Latest news and updates
- [Community](https://example.com/community/): Join discussions
```
## Required Elements
### H1 Title (Required)
The project or site name - this is the **ONLY mandatory element**.
```markdown
# Project Name
```
## Optional Elements
### Blockquote Summary (Recommended)
Brief project description with key information necessary for understanding the rest of the file.
```markdown
> Project Name is a Python library for data processing. It provides efficient
> stream transformations and supports multiple output formats.
```
### Descriptive Content (Optional)
Any markdown content **EXCEPT headings**. Use paragraphs, bullet lists, etc.
```markdown
Key features:
- Fast stream processing with lazy evaluation
- Built-in error handling and recovery
- Zero-dependency core library
- Extensible plugin system
Project Name follows these principles:
1. Simplicity over complexity
2. Performance by default
3. Developer experience first
```
**Important:** Do NOT use H2, H3, or other headings in descriptive content. Only H1 (title) and H2 (section headers) are allowed.
### File List Sections (Optional)
H2-headed sections containing links to resources.
```markdown
## Section Name
- [Link Title](https://full-url): Optional description or notes about the resource
- [Another Link](https://url): More details here
```
## Link Format Requirements
Each file list entry must follow this exact pattern:
```markdown
- [Link Title](https://full-url): Optional description
```
### Rules:
1. Use markdown bullet lists (`-`)
2. Include markdown hyperlinks `[name](url)`
3. Optionally add `:` followed by notes about the file
4. Links should point to markdown versions of documentation (preferably `.md` files)
5. Use full URLs, not relative paths
### Examples:
```markdown
## Documentation
- [Quick Start](https://docs.example.com/quickstart.md): Get up and running in 5 minutes
- [Configuration Guide](https://docs.example.com/config.md): All configuration options explained
- [API Reference](https://docs.example.com/api.md): Complete API documentation with examples
```
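Because each entry follows the same shape, it can be checked mechanically. A minimal sketch using a regular expression (an approximation of the pattern above, not an official validator):
```python
import re

# Matches: - [Link Title](https://full-url): Optional description
LINK_ENTRY = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>https?://[^)]+)\)(?::\s*(?P<notes>.+))?$"
)

line = "- [Quick Start](https://docs.example.com/quickstart.md): Get up and running in 5 minutes"
match = LINK_ENTRY.match(line)
if match:
    print(match.group("title"), match.group("url"), match.group("notes"), sep=" | ")
```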
## Special Sections
### Optional Section
The **"Optional"** section has special meaning: content here can be skipped when shorter context is needed.
```markdown
## Optional
- [Blog](https://example.com/blog/): Latest news about the project
- [Case Studies](https://example.com/cases/): Real-world usage examples
- [Video Tutorials](https://example.com/videos/): Visual learning resources
```
Use this section for:
- Secondary resources
- Community links
- Blog posts and news
- Extended tutorials
- Background reading
## Common Section Names
### Documentation-focused Projects
```markdown
## Documentation
- Core docs, guides, tutorials
## API Reference
- Function references, method documentation
## Examples
- Code samples, patterns, recipes
## Guides
- How-to guides, best practices
## Development
- Contributing, setup, testing
## Optional
- Blog, community, extended resources
```
### Tool/CLI Projects
```markdown
## Getting Started
- Installation, quickstart
## Commands
- CLI reference, usage examples
## Configuration
- Config files, options
## Examples
- Common workflows, patterns
## Optional
- Advanced usage, plugins
```
### Framework Projects
```markdown
## Core Concepts
- Architecture, principles
## Documentation
- Guides, tutorials
## API Reference
- Component APIs, hooks
## Examples
- Starter templates, patterns
## Plugins/Integrations
- Extensions, third-party tools
## Optional
- Blog, showcase, community
```
## File Placement
### Repository Location
Place at **`/llms.txt`** in the repository root, alongside `README.md`.
### Web Serving
For websites, serve at the root path `/llms.txt` (e.g., `https://example.com/llms.txt`).
### Companion Files
You can create expanded versions:
- `llms-ctx.txt` - Expanded content without URLs
- `llms-ctx-full.txt` - Expanded content with URLs
For referenced pages, create markdown versions:
- `page.html` → `page.html.md`
- Or use `index.html.md` for pages without filenames
## Best Practices
### Content Guidelines
1. **Be Concise**: Use clear, brief language
2. **Avoid Jargon**: Explain technical terms or link to explanations
3. **Information Hierarchy**: Most important content first
4. **Test with LLMs**: Verify that language models can understand your content
5. **Keep Updated**: Maintain accuracy as your project evolves
### Link Best Practices
1. **Descriptive Titles**: Use meaningful link text (not "click here")
2. **Helpful Notes**: Add context after colons to explain what each resource contains
3. **Stable URLs**: Link to permanent, versioned documentation
4. **Markdown Files**: Prefer `.md` files over HTML when possible
5. **Complete URLs**: Use full URLs with protocol (https://)
### Organizational Strategy
1. **Start with Essentials**: Put most important docs first
2. **Logical Grouping**: Group related resources under descriptive H2 headings
3. **Progressive Detail**: Basic → Intermediate → Advanced
4. **Optional Last**: Secondary resources go in the "Optional" section
5. **Consistent Format**: Use the same link format throughout
## Examples from the Wild
### Real-World Implementations
- **Astro**: https://docs.astro.build/llms.txt
- **FastHTML**: https://www.fastht.ml/docs/llms.txt
- **Shopify**: https://shopify.dev/llms.txt
- **Strapi**: https://docs.strapi.io/llms.txt
- **Modal**: https://modal.com/llms.txt
### Example: FastHTML Style
```markdown
# FastHTML
> FastHTML is a Python library for building web applications using pure Python.
FastHTML follows these principles:
- Write HTML in Python with no JavaScript required
- Use standard Python patterns and idioms
- Deploy anywhere Python runs
## Documentation
- [Tutorial](https://docs.fastht.ml/tutorial.md): Step-by-step introduction
- [Reference](https://docs.fastht.ml/reference.md): Complete API reference
- [Examples](https://docs.fastht.ml/examples.md): Common patterns and recipes
## Optional
- [FastHTML Blog](https://fastht.ml/blog/): Latest updates
```
### Example: Framework Style
```markdown
# Astro
> Astro is an all-in-one web framework for building fast, content-focused websites.
- Uses island architecture for better performance
- Server-first design with minimal client JavaScript
- Supports React, Vue, Svelte, and other UI frameworks
- Zero JavaScript by default
## Documentation Sets
- [Getting Started](https://docs.astro.build/getting-started.md): Installation and first project
- [Core Concepts](https://docs.astro.build/core-concepts.md): Islands, components, routing
- [Complete Docs](https://docs.astro.build/llms-full.txt): Full documentation set
## API Reference
- [Configuration](https://docs.astro.build/reference/configuration.md): astro.config.mjs options
- [CLI Commands](https://docs.astro.build/reference/cli.md): Command-line reference
- [Integrations API](https://docs.astro.build/reference/integrations.md): Building integrations
## Optional
- [Astro Blog](https://astro.build/blog/): Development news
- [Showcase](https://astro.build/showcase/): Sites built with Astro
```
## Allowed Markdown Elements
### Supported
- `#` H1 for title (required)
- `##` H2 for section headers
- `>` Blockquotes for summary
- `-` Bullet lists
- `[text](url)` Markdown links
- `:` Colon separator for notes after links
- Plain paragraphs
- Numbered lists (`1.`, `2.`, etc.)
### Not Used/Forbidden
- H3, H4, H5, H6 headings in descriptive content
- XML, JSON, or other structured formats
- Complex markdown tables
- Images (focus on text content)
- Code blocks (link to them instead)
## Tools and Integration
### CLI Tool
`llms_txt2ctx` - Command-line tool for processing and expanding llms.txt files
### Framework Plugins
- **VitePress**: https://github.com/okineadev/vitepress-plugin-llms
- **Docusaurus**: https://github.com/rachfop/docusaurus-plugin-llms
- **Drupal**: https://www.drupal.org/project/llm_support
- **PHP**: https://github.com/raphaelstolt/llms-txt-php
### Directories
- https://llmstxt.site/ - Directory of available llms.txt files
- https://directory.llmstxt.cloud/ - Community directory
## Common Mistakes to Avoid
1. **Using Relative URLs**: Always use full URLs with protocol
2. **Too Much Content**: Keep it concise, link to details
3. **Missing Descriptions**: Add helpful notes after link colons
4. **No Structure**: Use H2 sections to organize links
5. **Outdated Links**: Keep URLs current as docs evolve
6. **Complex Formatting**: Stick to simple markdown
7. **No Summary**: Always include a blockquote summary
8. **Wrong File Location**: Must be at repository root as `/llms.txt`
## Validation Checklist
Before publishing your llms.txt:
- ✅ File is named exactly `llms.txt` (lowercase)
- ✅ File is at repository root
- ✅ Has H1 title as first element
- ✅ Has blockquote summary
- ✅ Uses only H1 and H2 headings
- ✅ All links use full URLs
- ✅ Links use proper markdown format `[text](url)`
- ✅ Descriptive notes added after colons where helpful
- ✅ Sections logically organized
- ✅ Essential content comes before optional
- ✅ Links point to markdown files when possible
- ✅ Content is concise and clear
- ✅ Tested with an LLM for comprehension
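Several of these checks can be automated. A minimal sketch that verifies a few of them, assuming the file content is already loaded as a string (an illustrative helper, not an official validator):
```python
import re

def check_llms_txt(text: str) -> list[str]:
    """Illustrative checks for a handful of the checklist items above."""
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]
    # H1 title must be the first element
    if not lines or not lines[0].startswith("# "):
        problems.append("first element is not an H1 title")
    # Only H1 and H2 headings are allowed
    if any(re.match(r"#{3,}\s", line) for line in lines):
        problems.append("uses H3 or deeper headings")
    # Every link should be a full URL with protocol
    for title, url in re.findall(r"\[([^\]]+)\]\(([^)]+)\)", text):
        if not url.startswith(("http://", "https://")):
            problems.append(f"link {title!r} does not use a full URL: {url}")
    return problems
```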
## Additional Resources
- **Official Site**: https://llmstxt.org/
- **GitHub**: https://github.com/answerdotai/llms-txt
- **Issues**: https://github.com/AnswerDotAI/llms-txt/issues/new
- **Community**: Discord channel (check official site for link)
## Version
This reference is based on the llms.txt specification as of November 2025. Check https://llmstxt.org/ for the latest updates.


@@ -0,0 +1,428 @@
# Documentation Optimization Patterns for C7Score
## Analysis Workflow
### Step 1: Audit Current Documentation
Review the existing documentation and categorize snippets:
1. **Question-answering snippets**: Count how many snippets directly answer developer questions
2. **API reference snippets**: Count pure API documentation without usage examples
3. **Installation/import-only snippets**: Count snippets that are just setup with no usage
4. **Metadata snippets**: Count licensing, directory structures, citations
5. **Duplicate snippets**: Identify repeated or very similar content
6. **Formatting issues**: Note inconsistent formats, wrong language tags, etc.
### Step 2: Generate Common Questions
Use an LLM to generate 15-20 common questions a developer would ask about the library:
**Example questions:**
- How do I install and set up [library]?
- How do I [main feature 1]?
- How do I [main feature 2]?
- How do I handle errors?
- How do I configure [common setting]?
- What are the authentication options?
- How do I integrate with [common use case]?
- What are the rate limits and how do I handle them?
- How do I use [advanced feature]?
- How do I test code using [library]?
### Step 3: Map Questions to Snippets
Create a mapping:
- Which questions are well-answered by existing snippets?
- Which questions have weak or missing answers?
- Which snippets don't answer any important questions?
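One lightweight way to keep this mapping while auditing is a plain dictionary from question to the snippets that answer it; empty lists are the gaps to fill first. The questions and titles below are hypothetical:
```python
# Hypothetical audit data: generated questions mapped to answering snippet titles
coverage = {
    "How do I install and set up the library?": ["Quick Start: Installation and First Query"],
    "How do I authenticate?": ["Authenticating Your Client"],
    "How do I handle errors?": [],
    "How do I run batch operations?": [],
}

unanswered = [question for question, snippets in coverage.items() if not snippets]
print("Questions with no answering snippet:")
for question in unanswered:
    print(" -", question)
```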
### Step 4: Optimize High-Impact Areas
Focus optimization efforts based on c7score weights:
**Priority 1 (80% of score): Question-Snippet Matching**
- Add missing snippets for unanswered questions
- Enhance snippets that partially answer questions
- Ensure each snippet addresses at least one common question
**Priority 2 (5% each): Other Metrics**
- Remove duplicates
- Fix formatting inconsistencies
- Remove metadata snippets
- Combine import-only snippets with usage
## Snippet Transformation Patterns
### Pattern 1: API Reference → Usage Example
**Before:**
```
Client.authenticate(api_key: str) -> bool
Authenticates the client with the provided API key.
Parameters:
- api_key (str): Your API key
Returns:
- bool: True if authentication succeeded
```
**After:**
````
Authenticating Your Client
----------------------------------------
Authenticate your client using your API key from the dashboard.
```python
from library import Client
# Initialize with your API key
client = Client(api_key="your_api_key_here")
# Authenticate
if client.authenticate():
    print("Successfully authenticated!")
else:
    print("Authentication failed")
```
````
### Pattern 2: Import-Only → Complete Setup
**Before:**
```python
from library import Client, Query, Response
```
**After:**
````
Quick Start: Making Your First Request
----------------------------------------
Install the library, import the client, and make your first API call.
```python
# Install: pip install library-name
from library import Client
# Initialize and authenticate
client = Client(api_key="your_api_key")
# Make your first request
response = client.query("SELECT * FROM data LIMIT 10")
for row in response:
    print(row)
```
````
### Pattern 3: Multiple Small → One Comprehensive
**Before (3 separate snippets):**
```python
client = Client()
```
```python
client.connect()
```
```python
client.query("SELECT * FROM table")
```
**After (1 comprehensive snippet):**
````
Complete Workflow: Connect and Query
----------------------------------------
Full example showing initialization, connection, and querying.
```python
from library import Client
# Initialize the client
client = Client(
    api_key="your_api_key",
    region="us-west-2"
)
# Establish connection
client.connect()
# Execute query
result = client.query("SELECT * FROM users WHERE active = true")
# Process results
for row in result:
    print(f"User: {row['name']}, Email: {row['email']}")
# Close connection
client.close()
```
````
### Pattern 4: Remove Metadata Snippets
**Remove these entirely:**
```
Project Structure
----------------------------------------
myproject/
├── src/
│ ├── main.py
│ └── utils.py
├── tests/
└── README.md
```
```
License
----------------------------------------
MIT License
Copyright (c) 2024...
```
```
Citation
----------------------------------------
@article{library2024,
title={The Library Paper},
...
}
```
## README Optimization
### Structure Your README for High Scores
**1. Quick Start Section (High Priority)**
````markdown
## Quick Start
```python
# Install: pip install your-library
# Import and use
from your_library import Client
client = Client(api_key="key")
result = client.do_something()
print(result)
```
````
**2. Common Use Cases (High Priority)**
For each major feature, provide:
- Clear section title answering "How do I...?"
- Brief description
- Complete, working code example
- Expected output or result
**3. API Reference (Lower Priority)**
Keep it, but ensure each API method has at least one usage example.
**4. Configuration Examples (Medium Priority)**
Show common configuration scenarios with full context.
**5. Error Handling (Medium Priority)**
Demonstrate proper error handling in realistic scenarios.
### What to Minimize or Remove
- **Installation only**: Always combine with first usage
- **Long lists**: Convert to example-driven content
- **Project governance**: Move to separate CONTRIBUTING.md
- **Licensing**: Link to LICENSE file, don't duplicate
- **Directory trees**: Remove unless essential for setup
- **Academic citations**: Remove from main docs
## Testing Documentation Quality
### Manual Quality Checks
Before finalizing, verify each snippet:
1. ✅ **Can run standalone**: Copy-paste the code and it works (with minimal setup)
2. ✅ **Answers a question**: Clearly addresses a "how do I..." query
3. ✅ **Unique information**: Doesn't duplicate other snippets
4. ✅ **Proper format**: Has title, description, and code with correct language tag
5. ✅ **Practical focus**: Shows real-world usage, not just theory
6. ✅ **Complete imports**: Includes all necessary imports
7. ✅ **No metadata**: No licensing, citations, or directory trees
8. ✅ **Correct syntax**: Code is valid and would actually run
### Question Coverage Matrix
Create a checklist:
- [ ] Installation and setup
- [ ] Basic initialization
- [ ] Authentication methods
- [ ] Primary use case 1
- [ ] Primary use case 2
- [ ] Configuration options
- [ ] Error handling
- [ ] Advanced features
- [ ] Integration examples
- [ ] Testing approaches
Each checkbox should map to at least one high-quality snippet.
## Iteration and Refinement
After creating optimized documentation:
1. Run c7score to get baseline metrics
2. Identify lowest-scoring metric
3. Apply targeted improvements for that metric
4. Re-run c7score
5. Repeat until reaching target score (typically 85+)
### Common Score Ranges
- **90-100**: Excellent, comprehensive, question-focused documentation
- **80-89**: Good documentation with some gaps or formatting issues
- **70-79**: Adequate but needs more complete examples or has duplicates
- **60-69**: Significant gaps in question coverage or many formatting issues
- **Below 60**: Major restructuring needed
## Example: Full Snippet Transformation
### Original (Low Score)
````markdown
## Installation
```bash
npm install my-library
```
## Usage
Import the library:
```javascript
const MyLibrary = require('my-library');
```
## API
### connect(options)
Connects to the service.
### query(sql)
Executes a query.
````
### Optimized (High Score)
````markdown
## Getting Started: Installation and First Query
```javascript
// Install the library
// npm install my-library
const MyLibrary = require('my-library');
// Connect to your database
const client = new MyLibrary({
  host: 'your-host.example.com',
  apiKey: 'your-api-key',
  database: 'production'
});
await client.connect();
// Run your first query
const results = await client.query('SELECT * FROM users LIMIT 5');
console.log(results);
// Always close the connection
await client.close();
```
## Common Use Cases
### Authenticating with OAuth
```javascript
const MyLibrary = require('my-library');
// OAuth authentication flow
const client = new MyLibrary({
  authMethod: 'oauth',
  clientId: 'your-client-id',
  clientSecret: 'your-client-secret'
});
// Get auth URL for user
const authUrl = client.getAuthUrl('http://localhost:3000/callback');
console.log(`Visit: ${authUrl}`);
// After user authorizes, exchange code for token
const tokens = await client.exchangeCode(authCode);
await client.connect();
```
### Handling Errors and Retries
```javascript
const MyLibrary = require('my-library');
const client = new MyLibrary({
  host: 'your-host.example.com',
  apiKey: 'your-api-key',
  // Configure automatic retries
  retries: 3,
  retryDelay: 1000
});
try {
  await client.connect();
  const results = await client.query('SELECT * FROM users');
  console.log(results);
} catch (error) {
  if (error.code === 'TIMEOUT') {
    console.error('Query timed out, try a smaller result set');
  } else if (error.code === 'AUTH_ERROR') {
    console.error('Authentication failed, check your API key');
  } else {
    console.error('Unexpected error:', error.message);
  }
} finally {
  await client.close();
}
```
### Advanced: Batch Operations
```javascript
const MyLibrary = require('my-library');
const client = new MyLibrary({
  host: 'your-host.example.com',
  apiKey: 'your-api-key'
});
await client.connect();
// Batch insert for better performance
const users = [
  { name: 'Alice', email: 'alice@example.com' },
  { name: 'Bob', email: 'bob@example.com' },
  { name: 'Charlie', email: 'charlie@example.com' }
];
const result = await client.batchInsert('users', users);
console.log(`Inserted ${result.rowCount} users`);
await client.close();
```
````
**Score Impact:**
- Question coverage: +40 points (answers 4 major questions)
- Removes import-only: +5 points
- Consistent formatting: +5 points
- Working examples: +20 points
- No duplicates: +10 points
- **Total improvement: ~80 point increase**