# Mandatory Quality Standards

## Fundamental Principles

### Production-Ready, Not Prototype
- Code must work without modifications
- No "now implement X" follow-up work required
- Can be used immediately
### Functional, Not Placeholder
- Complete code in all functions
- No `TODO`, `pass`, or `NotImplementedError`
- Robust error handling
### Useful, Not Generic
- Specific and detailed content
- Concrete examples, not abstract explanations
- Not just external links
## Standards by File Type

### Python Scripts

#### ✅ MANDATORY
1. **Complete structure**:

```python
#!/usr/bin/env python3
"""Module docstring"""

# Imports
import ...

# Constants
CONST = value

# Classes/Functions
class/def ...

# Main
def main():
    ...

if __name__ == "__main__":
    main()
```
2. Docstrings:
- Module docstring: 3-5 lines
- Class docstring: Description + Example
- Method docstring: Args, Returns, Raises, Example
3. **Type hints**:

```python
def function(param1: str, param2: int = 10) -> Dict[str, Any]:
    ...
```
4. **Error handling**:

```python
try:
    result = risky_operation()
except SpecificError as e:
    # Handle specifically
    log_error(e)
    raise CustomError(f"Context: {e}")
```
5. **Validations**:

```python
def process(data: Dict) -> pd.DataFrame:
    # Validate input
    if not data:
        raise ValueError("Data cannot be empty")
    if 'required_field' not in data:
        raise ValueError("Missing required field")

    # Process
    ...

    # Validate output
    assert len(result) > 0, "Result cannot be empty"
    assert result['value'].notna().all(), "No null values allowed"
    return result
```
6. **Appropriate logging**:

```python
import logging

logger = logging.getLogger(__name__)

def fetch_data():
    logger.info("Fetching data from API...")
    # ...
    logger.debug(f"Received {len(data)} records")
    # ...
    logger.error(f"API error: {e}")
```
#### ❌ FORBIDDEN

```python
# ❌ DON'T DO THIS:
def analyze():
    # TODO: implement analysis
    pass

def process(data):  # ❌ No type hints
    # ❌ No docstring
    result = data  # ❌ No real logic
    return result  # ❌ No validation

def fetch_api(url):
    response = requests.get(url)  # ❌ No timeout
    return response.json()  # ❌ No error handling
```
✅ DO THIS:

```python
def analyze_yoy(df: pd.DataFrame, commodity: str, year1: int, year2: int) -> Dict:
    """
    Perform year-over-year analysis

    Args:
        df: DataFrame with parsed data
        commodity: Commodity name (e.g., "CORN")
        year1: Current year
        year2: Previous year

    Returns:
        Dict with keys:
        - production_current: float
        - production_previous: float
        - change_percent: float
        - interpretation: str

    Raises:
        ValueError: If data not found for specified years
        DataQualityError: If data fails validation

    Example:
        >>> analyze_yoy(df, "CORN", 2023, 2022)
        {'production_current': 15.3, 'change_percent': 11.7, ...}
    """
    # Validate inputs
    if commodity not in df['commodity'].unique():
        raise ValueError(f"Commodity {commodity} not found in data")

    # Filter data
    df1 = df[(df['commodity'] == commodity) & (df['year'] == year1)]
    df2 = df[(df['commodity'] == commodity) & (df['year'] == year2)]
    if len(df1) == 0 or len(df2) == 0:
        raise ValueError(f"Data not found for {commodity} in {year1} or {year2}")

    # Extract values
    prod1 = df1['production'].iloc[0]
    prod2 = df2['production'].iloc[0]

    # Calculate
    change = prod1 - prod2
    change_pct = (change / prod2) * 100

    # Interpret
    if abs(change_pct) < 2:
        interpretation = "stable"
    elif change_pct > 10:
        interpretation = "significant_increase"
    elif change_pct > 2:
        interpretation = "moderate_increase"
    elif change_pct < -10:
        interpretation = "significant_decrease"
    else:
        interpretation = "moderate_decrease"

    # Return
    return {
        "commodity": commodity,
        "production_current": round(prod1, 1),
        "production_previous": round(prod2, 1),
        "change_absolute": round(change, 1),
        "change_percent": round(change_pct, 1),
        "interpretation": interpretation
    }
```
### SKILL.md

#### ✅ MANDATORY

1. **Valid frontmatter**:
```yaml
---
name: agent-name
description: [150-250 characters with keywords]
---
```
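A small stdlib-only check for these frontmatter rules could look like the sketch below. It is an illustration, not part of the agent: the field parsing is naive `key: value` splitting rather than full YAML, and `check_frontmatter` is a hypothetical helper name.

```python
import re

def check_frontmatter(text: str) -> list:
    """Return a list of problems with a SKILL.md's frontmatter (empty = OK)."""
    problems = []
    match = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if not match:
        return ["missing frontmatter block delimited by '---' lines"]
    # Naive key: value parsing -- enough for flat frontmatter, not nested YAML
    fields = dict(
        line.split(":", 1)
        for line in match.group(1).splitlines()
        if ":" in line
    )
    if "name" not in fields:
        problems.append("missing 'name' field")
    desc = fields.get("description", "").strip()
    if not 150 <= len(desc) <= 250:
        problems.append(f"description is {len(desc)} characters (want 150-250)")
    return problems

text = "---\nname: nass-agent\ndescription: Too short.\n---\n# Body"
print(check_frontmatter(text))  # ['description is 10 characters (want 150-250)']
```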
2. Size: 5000-7000 words
3. Mandatory sections:
- When to use (specific triggers)
- Data source (detailed API)
- Workflows (complete step-by-step)
- Scripts (each one explained)
- Analyses (methodologies)
- Errors (complete handling)
- Validations (mandatory)
- Keywords (complete list)
- Examples (5+ complete)
4. Detailed workflows:
✅ GOOD:

### Workflow: YoY Comparison

1. **Identify question parameters**
   - Commodity: [extract from question]
   - Years: Current vs previous (or specified)

2. **Fetch data**
   ```bash
   python scripts/fetch_nass.py \
     --commodity CORN \
     --years 2023,2022 \
     --output data/raw/corn_2023_2022.json
   ```

3. **Parse**
   ```bash
   python scripts/parse_nass.py \
     --input data/raw/corn_2023_2022.json \
     --output data/processed/corn.csv
   ```

4. **Analyze**
   ```bash
   python scripts/analyze_nass.py \
     --input data/processed/corn.csv \
     --analysis yoy \
     --commodity CORN \
     --year1 2023 \
     --year2 2022 \
     --output data/analysis/corn_yoy.json
   ```

5. **Interpret results**

   File `data/analysis/corn_yoy.json` contains:
   ```json
   {
     "production_current": 15.3,
     "change_percent": 11.7,
     "interpretation": "significant_increase"
   }
   ```
   Respond to user: "Corn production grew 11.7% in 2023..."
❌ **BAD**:

```markdown
### Workflow: Comparison
1. Get data
2. Compare
3. Return result
```
5. Complete examples:
✅ GOOD:

```markdown
### Example 1: YoY Comparison

**Question**: "How's corn production compared to last year?"

**Executed flow**:
[Specific commands with outputs]

**Generated answer**:
"Corn production in 2023 is 15.3 billion bushels,
growth of 11.7% vs 2022 (13.7 billion). Growth
comes mainly from area increase (+8%) with stable yield."
```
❌ BAD:

```markdown
### Example: Comparison
User asks about comparison. Agent compares and responds.
```
#### ❌ FORBIDDEN
- Empty sections
- "See documentation"
- Workflows without specific commands
- Generic examples
### References

#### ✅ MANDATORY
1. Useful and self-contained content:
✅ GOOD (references/api-guide.md):
## Endpoint: Get Production Data

**URL**: `GET https://quickstats.nass.usda.gov/api/api_GET/`

**Parameters**:
- `commodity_desc`: Commodity name
  - Example: "CORN", "SOYBEANS"
  - Case-sensitive
- `year`: Desired year
  - Example: 2023
  - Range: 1866-present

**Complete request example**:
```bash
curl -H "X-Api-Key: YOUR_KEY" \
  "https://quickstats.nass.usda.gov/api/api_GET/?commodity_desc=CORN&year=2023&format=JSON"
```

**Expected response**:
```json
{
  "data": [
    {
      "year": 2023,
      "commodity_desc": "CORN",
      "value": "15,300,000,000",
      "unit_desc": "BU"
    }
  ]
}
```

**Important fields**:
- `value`: Comes as a STRING with commas
  - Solution: `value.replace(',', '')`, then convert to float
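The comma-stripping fix can be wrapped in a small helper. A sketch (the `(D)` suppressed-value marker is an assumption about the feed; verify against real responses before relying on it):

```python
from typing import Optional

def parse_value(raw: str) -> Optional[float]:
    """Convert a NASS-style value string like '15,300,000,000' to a float.

    Returns None for non-numeric entries (e.g. suppressed values)
    instead of raising mid-pipeline.
    """
    cleaned = raw.replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None

print(parse_value("15,300,000,000"))  # 15300000000.0
print(parse_value("(D)"))             # None
```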
❌ **BAD**:

```markdown
## API Endpoint
For details on how to use the API, consult the official documentation at:
https://quickstats.nass.usda.gov/api
[End of file]
```
2. Adequate size:
- API guide: 1500-2000 words
- Analysis methods: 2000-3000 words
- Troubleshooting: 1000-1500 words
3. Concrete examples:
- Always include examples with real values
- Executable code blocks
- Expected outputs
#### ❌ FORBIDDEN
- "For more information, see [link]"
- Sections with only 2-3 lines
- Lists without details
- Circular references ("see other doc that sees other doc")
### Assets (Configs)

#### ✅ MANDATORY

1. **Syntactically valid JSON**:

```bash
# ALWAYS validate:
python -c "import json; json.load(open('config.json'))"
```
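The one-liner covers a single file; to sweep every config in one pass, a helper like this works (a sketch; `validate_json_files` is a hypothetical name and the `assets/` layout is this guide's assumed structure):

```python
import json
from pathlib import Path

def validate_json_files(directory: str) -> dict:
    """Map each .json file under `directory` to None (valid) or an error string."""
    results = {}
    for path in sorted(Path(directory).rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
            results[str(path)] = None
        except json.JSONDecodeError as e:
            results[str(path)] = f"line {e.lineno}, col {e.colno}: {e.msg}"
    return results
```

Usage: `for name, err in validate_json_files("assets").items(): ...` and fail the build if any `err` is not `None`.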
2. Real values:
✅ GOOD:

```json
{
  "api": {
    "base_url": "https://quickstats.nass.usda.gov/api",
    "api_key_env": "NASS_API_KEY",
    "_instructions": "Get free API key from: https://quickstats.nass.usda.gov/api#registration",
    "rate_limit_per_day": 1000,
    "timeout_seconds": 30
  }
}
```
❌ BAD:

```json
{
  "api": {
    "base_url": "YOUR_API_URL_HERE",
    "api_key": "YOUR_KEY_HERE"
  }
}
```
3. **Inline comments** (using `_comment` or `_note`):

```json
{
  "_comment": "Differentiated TTL by data type",
  "cache": {
    "ttl_historical_days": 365,
    "_note_historical": "Historical data doesn't change",
    "ttl_current_days": 7,
    "_note_current": "Current year data may be revised"
  }
}
```
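One way the differentiated TTLs might be consumed by a script. This is a sketch: the split between "historical" and "current" by calendar year is an assumption, and `cache_ttl_days` is a hypothetical helper name.

```python
import json
from datetime import datetime

def cache_ttl_days(config: dict, data_year: int) -> int:
    """Pick the cache TTL for a record based on whether its year is still current."""
    cache = config["cache"]
    if data_year < datetime.now().year:
        return cache["ttl_historical_days"]  # finalized data: cache for a long time
    return cache["ttl_current_days"]         # current year: may still be revised

config = json.loads('{"cache": {"ttl_historical_days": 365, "ttl_current_days": 7}}')
print(cache_ttl_days(config, 2020))  # 365 (assuming the current year is after 2020)
```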
### README.md

#### ✅ MANDATORY

1. **Complete installation instructions**:
✅ GOOD:

## Installation

### 1. Get API Key (Free)

1. Access https://quickstats.nass.usda.gov/api#registration
2. Fill form:
   - Name: [your name]
   - Email: [your email]
   - Purpose: "Personal research"
3. Click "Submit"
4. You'll receive email with API key in ~1 minute
5. Key format: `A1B2C3D4-E5F6-G7H8-I9J0-K1L2M3N4O5P6`

### 2. Configure Environment

**Option A - Export** (temporary):
```bash
export NASS_API_KEY="your_key_here"
```

**Option B - .bashrc/.zshrc** (permanent):
```bash
echo 'export NASS_API_KEY="your_key_here"' >> ~/.bashrc
source ~/.bashrc
```

**Option C - .env file** (per project):
```bash
echo "NASS_API_KEY=your_key_here" > .env
```

### 3. Install Dependencies

```bash
cd nass-usda-agriculture
pip install -r requirements.txt
```

Requirements:
- requests
- pandas
- numpy
❌ **BAD**:

```markdown
## Installation
1. Get API key from the official website
2. Configure environment
3. Install dependencies
4. Done!
```
2. Concrete usage examples:
✅ GOOD:
## Examples
### Example 1: Current Production
You: "What's US corn production in 2023?"
Claude: "Corn production in 2023 was 15.3 billion bushels (389 million metric tons)..."
### Example 2: YoY Comparison
You: "Compare soybeans this year vs last year"
Claude: "Soybean production in 2023 is 2.6% below 2022:
- 2023: 4.165 billion bushels
- 2022: 4.276 billion bushels
- Drop from area (-4.5%), yield improved (+0.8%)"
[3-5 more examples]
❌ BAD:

```markdown
## Usage
Ask questions about agriculture and the agent will respond.
```
3. Specific troubleshooting:
✅ GOOD:

### Error: "NASS_API_KEY environment variable not found"

**Cause**: API key not configured

**Step-by-step solution**:
1. Verify key was obtained: https://...
2. Configure environment:
   ```bash
   export NASS_API_KEY="your_key_here"
   ```
3. Verify:
   ```bash
   echo $NASS_API_KEY  # Should show your key
   ```
4. If it doesn't work, restart the terminal

**Still not working?**
- Check for extra spaces in key
- Verify key hasn't expired (validity: 1 year)
- Re-generate key if needed
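Scripts can make this error self-diagnosing by validating the key at startup. A sketch (`get_api_key` is a hypothetical helper; the registration URL is the one cited earlier in this guide):

```python
import os

def get_api_key(env_var: str = "NASS_API_KEY") -> str:
    """Read the API key from the environment, failing with an actionable message."""
    key = os.environ.get(env_var, "").strip()  # strip() guards against stray spaces
    if not key:
        raise RuntimeError(
            f"{env_var} environment variable not found. Get a free key at "
            f"https://quickstats.nass.usda.gov/api#registration and run: "
            f'export {env_var}="your_key_here"'
        )
    return key
```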
---
## Quality Checklist
### Per Python Script
- [ ] Shebang: `#!/usr/bin/env python3`
- [ ] Module docstring (3-5 lines)
- [ ] Organized imports (stdlib, 3rd party, local)
- [ ] Constants at top (if applicable)
- [ ] Type hints in all public functions
- [ ] Docstrings in classes (description + attributes + example)
- [ ] Docstrings in methods (Args, Returns, Raises, Example)
- [ ] Error handling for risky operations
- [ ] Input validations
- [ ] Output validations
- [ ] Appropriate logging
- [ ] Main function with argparse
- [ ] `if __name__ == "__main__":` guard
- [ ] Functional code (no TODO/pass)
- [ ] Valid syntax (test: `python -m py_compile script.py`)
### Per SKILL.md
- [ ] Frontmatter with name and description
- [ ] Description 150-250 characters with keywords
- [ ] Size 5000+ words
- [ ] "When to Use" section with specific triggers
- [ ] "Data Source" section detailed
- [ ] Step-by-step workflows with commands
- [ ] Scripts explained individually
- [ ] Analyses documented (objective, methodology)
- [ ] Errors handled (all expected)
- [ ] Validations listed
- [ ] Performance/cache explained
- [ ] Complete keywords
- [ ] Complete examples (5+)
### Per Reference File
- [ ] 1000+ words
- [ ] Useful content (not just links)
- [ ] Concrete examples with real values
- [ ] Executable code blocks
- [ ] Well structured (headings, lists)
- [ ] No empty sections
- [ ] No "TODO: write"
### Per Asset (Config)
- [ ] Syntactically valid JSON (validate!)
- [ ] Real values (not "YOUR_X_HERE" without context)
- [ ] Inline comments (_comment, _note)
- [ ] Instructions for values user must fill
- [ ] Logical and organized structure
### Per README.md
- [ ] Step-by-step installation
- [ ] How to get API key (detailed)
- [ ] How to configure (3 options)
- [ ] How to install dependencies
- [ ] How to install in Claude Code
- [ ] Usage examples (5+)
- [ ] Troubleshooting (10+ problems)
- [ ] License
- [ ] Contact/contribution (if applicable)
### Complete Agent
- [ ] DECISIONS.md documents all choices
- [ ] **VERSION** file created (e.g. 1.0.0)
- [ ] **CHANGELOG.md** created with complete v1.0.0 entry
- [ ] **INSTALACAO.md** with complete didactic tutorial
- [ ] **comprehensive_{domain}_report()** implemented
- [ ] marketplace.json with version field
- [ ] 18+ files created
- [ ] ~1500+ lines of Python code
- [ ] ~10,000+ words of documentation
- [ ] 2+ configs
- [ ] requirements.txt
- [ ] .gitignore (if needed)
- [ ] No placeholder/TODO
- [ ] Valid syntax (Python, JSON, YAML)
- [ ] Ready to use (production-ready)
---
## Quality Examples
### Example: Error Handling
❌ **BAD**:

```python
def fetch(url):
    return requests.get(url).json()
```

✅ **GOOD**:

```python
def fetch(url: str, timeout: int = 30) -> Dict:
    """
    Fetch data from URL with error handling

    Args:
        url: URL to fetch
        timeout: Timeout in seconds

    Returns:
        JSON response as dict

    Raises:
        NetworkError: If connection fails
        TimeoutError: If request times out
        APIError: If API returns error
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        data = response.json()
        if 'error' in data:
            raise APIError(f"API error: {data['error']}")
        return data
    except requests.Timeout:
        raise TimeoutError(f"Request timed out after {timeout}s")
    except requests.ConnectionError as e:
        raise NetworkError(f"Connection failed: {e}")
    except requests.HTTPError as e:
        if e.response.status_code == 429:
            raise RateLimitError("Rate limit exceeded")
        else:
            raise APIError(f"HTTP {e.response.status_code}: {e}")
```
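The GOOD example raises `APIError`, `NetworkError`, and `RateLimitError` without defining them. One possible hierarchy (an assumption, not a prescribed design: rooting the specific errors under a shared base lets callers catch everything with a single `except APIError`):

```python
class APIError(Exception):
    """The API returned an error payload or an unexpected HTTP status."""

class NetworkError(APIError):
    """The connection could not be established."""

class RateLimitError(APIError):
    """HTTP 429: the daily request quota is exhausted."""
```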
### Example: Validations

❌ BAD:

```python
def parse(data):
    df = pd.DataFrame(data)
    return df
```
✅ GOOD:

```python
def parse(data: List[Dict]) -> pd.DataFrame:
    """Parse and validate data"""
    # Validate input
    if not data:
        raise ValueError("Data cannot be empty")
    if not isinstance(data, list):
        raise TypeError(f"Expected list, got {type(data)}")

    # Parse
    df = pd.DataFrame(data)

    # Validate schema
    required_cols = ['year', 'commodity', 'value']
    missing = set(required_cols) - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    # Validate types
    df['year'] = pd.to_numeric(df['year'], errors='raise')
    df['value'] = pd.to_numeric(df['value'], errors='raise')

    # Validate ranges
    current_year = datetime.now().year
    if (df['year'] > current_year).any():
        raise ValueError(f"Future years found (max allowed: {current_year})")
    if (df['value'] < 0).any():
        raise ValueError("Negative values found")

    # Validate no duplicates
    if df.duplicated(subset=['year', 'commodity']).any():
        raise ValueError("Duplicate records found")

    return df
```
### Example: Docstrings

❌ BAD:

```python
def analyze(df, commodity):
    """Analyze data"""
    # ...
```
✅ GOOD:

```python
def analyze_yoy(
    df: pd.DataFrame,
    commodity: str,
    year1: int,
    year2: int
) -> Dict[str, Any]:
    """
    Perform year-over-year comparison analysis

    Compares production, area, and yield between two years
    and decomposes growth into area vs yield contributions.

    Args:
        df: DataFrame with columns ['year', 'commodity', 'production', 'area', 'yield']
        commodity: Commodity name (e.g., "CORN", "SOYBEANS")
        year1: Current year to compare
        year2: Previous year to compare against

    Returns:
        Dict containing:
        - production_current (float): Production in year1 (million units)
        - production_previous (float): Production in year2
        - change_absolute (float): Absolute change
        - change_percent (float): Percent change
        - decomposition (dict): Area vs yield contribution
        - interpretation (str): "increase", "decrease", or "stable"

    Raises:
        ValueError: If commodity not found in data
        ValueError: If either year not found in data
        DataQualityError: If production != area * yield (tolerance > 1%)

    Example:
        >>> df = pd.DataFrame([
        ...     {'year': 2023, 'commodity': 'CORN', 'production': 15.3, 'area': 94.6, 'yield': 177},
        ...     {'year': 2022, 'commodity': 'CORN', 'production': 13.7, 'area': 89.2, 'yield': 173}
        ... ])
        >>> result = analyze_yoy(df, "CORN", 2023, 2022)
        >>> result['change_percent']
        11.7
    """
    # [Complete implementation]
```
## Anti-Patterns

### Anti-Pattern 1: Partial Implementation

❌ NO:

```python
def yoy_comparison(df, commodity, year1, year2):
    # Implement YoY comparison
    pass

def state_ranking(df, commodity):
    # TODO: implement ranking
    raise NotImplementedError()
```

✅ YES:

```python
# [Complete and functional code for BOTH functions]
```
### Anti-Pattern 2: Empty References

❌ NO:

```markdown
# Analysis Methods

## YoY Comparison
This method compares two years.

## Ranking
This method ranks states.
```

✅ YES:

```markdown
# Analysis Methods

## YoY Comparison

### Objective
Compare metrics between current and previous year...

### Detailed Methodology

**Formulas**:

ΔX = X(t) - X(t-1)
ΔX% = (ΔX / X(t-1)) × 100

**Decomposition** (for production):
[Complete mathematics]

**Interpretation**:
- |Δ| < 2%: Stable
- Δ > 10%: Significant increase
[...]

### Validations
[List]

### Complete Numerical Example
[With real values]
```
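The formulas and interpretation bands can be sketched directly in code, using the corn figures that run through this guide (15.3 vs 13.7 billion bushels); the thresholds below are taken from the document's own scale:

```python
def yoy_change_percent(current: float, previous: float) -> float:
    """ΔX% = (X(t) - X(t-1)) / X(t-1) × 100"""
    return (current - previous) / previous * 100

def interpret(change_pct: float) -> str:
    """Map a percent change onto the interpretation bands."""
    if abs(change_pct) < 2:
        return "stable"
    if change_pct > 10:
        return "significant_increase"
    if change_pct > 2:
        return "moderate_increase"
    if change_pct < -10:
        return "significant_decrease"
    return "moderate_decrease"

# Corn production, billion bushels (the document's running example)
pct = yoy_change_percent(15.3, 13.7)
print(round(pct, 1), interpret(pct))  # 11.7 significant_increase
```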
### Anti-Pattern 3: Useless Configs

❌ NO:

```json
{
  "api_url": "INSERT_URL",
  "api_key": "INSERT_KEY"
}
```

✅ YES:

```json
{
  "_comment": "Configuration for NASS USDA Agent",
  "api": {
    "base_url": "https://quickstats.nass.usda.gov/api",
    "_note": "This is the official USDA NASS API base URL",
    "api_key_env": "NASS_API_KEY",
    "_key_instructions": "Get free API key from: https://quickstats.nass.usda.gov/api#registration"
  }
}
```
## Final Validation

Before delivering to user, verify:

### Sanity Test

```bash
# 1. Python syntax
find scripts -name "*.py" -exec python -m py_compile {} \;

# 2. JSON syntax
python -c "import json; json.load(open('assets/config.json'))"

# 3. Imports make sense
grep -r "^import\|^from" scripts/*.py | sort | uniq
# Verify all libs are: stdlib, requests, pandas, numpy
# No imports of uninstalled libs

# 4. SKILL.md has frontmatter
head -5 SKILL.md | grep "^---$"

# 5. SKILL.md size
wc -w SKILL.md
# Should be > 5000 words
```
### Final Checklist

- [ ] Syntax check passed (Python, JSON)
- [ ] No import of non-existent lib
- [ ] No TODO or pass
- [ ] SKILL.md > 5000 words
- [ ] References with content
- [ ] README with complete instructions
- [ ] DECISIONS.md created
- [ ] requirements.txt created