Initial commit

2025-11-29 18:27:28 +08:00
commit 8db9c44dd8
79 changed files with 37715 additions and 0 deletions
--- a/skills/FrancyJGLisboa__agent-skill-creator/references/quality-standards.md
+++ b/skills/FrancyJGLisboa__agent-skill-creator/references/quality-standards.md
@@ -0,0 +1,937 @@
+# Mandatory Quality Standards
+
+## Fundamental Principles
+
+**Production-Ready, Not Prototype**
+- Code must work without modifications
+- Needs no follow-up request such as "now implement X"
+- Can be used immediately
+
+**Functional, Not Placeholder**
+- Complete code in all functions
+- No TODO, pass, NotImplementedError
+- Robust error handling
+
+**Useful, Not Generic**
+- Specific and detailed content
+- Concrete examples, not abstract
+- Not just external links
+
+---
+
+## Standards by File Type
+
+### Python Scripts
+
+#### ✅ MANDATORY
+
+**1. Complete structure**:
+```python
+#!/usr/bin/env python3
+"""Module docstring"""
+
+# Imports
+import ...
+
+# Constants
+CONST = value
+
+# Classes/Functions
+class/def ...
+
+# Main
+def main():
+    ...
+
+if __name__ == "__main__":
+    main()
+```
+
+**2. Docstrings**:
+- Module docstring: 3-5 lines
+- Class docstring: Description + Example
+- Method docstring: Args, Returns, Raises, Example
+
+**3. Type hints**:
+```python
+def function(param1: str, param2: int = 10) -> Dict[str, Any]:
+    ...
+```
+
+**4. Error handling**:
+```python
+try:
+    result = risky_operation()
+except SpecificError as e:
+    # Handle specifically
+    log_error(e)
+    raise CustomError(f"Context: {e}")
+```
+
+**5. Validations**:
+```python
+def process(data: Dict) -> pd.DataFrame:
+    # Validate input
+    if not data:
+        raise ValueError("Data cannot be empty")
+
+    if 'required_field' not in data:
+        raise ValueError("Missing required field")
+
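+    # Process
+    result = ...  # build the DataFrame from data
+
+    # Validate output (explicit raises; asserts are stripped under `python -O`)
+    if len(result) == 0:
+        raise ValueError("Result cannot be empty")
+    if result['value'].isna().any():
+        raise ValueError("No null values allowed")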
+
+    return result
+```
+
+**6. Appropriate logging**:
+```python
+import logging
+
+logger = logging.getLogger(__name__)
+
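+def fetch_data():
+    logger.info("Fetching data from API...")
+    try:
+        data = request_records()  # placeholder for the actual API call
+    except APIError as e:  # project-specific exception, defined elsewhere
+        logger.error(f"API error: {e}")
+        raise
+    logger.debug(f"Received {len(data)} records")
+    return data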
+```
+
+#### ❌ FORBIDDEN
+
+```python
+# ❌ DON'T DO THIS:
+
+def analyze():
+    # TODO: implement analysis
+    pass
+
+def process(data):  # ❌ No type hints
+    # ❌ No docstring
+    result = data  # ❌ No real logic
+    return result  # ❌ No validation
+
+def fetch_api(url):
+    response = requests.get(url)  # ❌ No timeout
+    return response.json()  # ❌ No error handling
+```
+
+#### ✅ DO THIS:
+
+```python
+def analyze_yoy(df: pd.DataFrame, commodity: str, year1: int, year2: int) -> Dict:
+    """
+    Perform year-over-year analysis
+
+    Args:
+        df: DataFrame with parsed data
+        commodity: Commodity name (e.g., "CORN")
+        year1: Current year
+        year2: Previous year
+
+    Returns:
+        Dict with keys:
+            - commodity: str
+            - production_current: float
+            - production_previous: float
+            - change_absolute: float
+            - change_percent: float
+            - interpretation: str
+
+    Raises:
+        ValueError: If data not found for specified years
+        DataQualityError: If data fails validation
+
+    Example:
+        >>> analyze_yoy(df, "CORN", 2023, 2022)
+        {'production_current': 15.3, 'change_percent': 11.7, ...}
+    """
+    # Validate inputs
+    if commodity not in df['commodity'].unique():
+        raise ValueError(f"Commodity {commodity} not found in data")
+
+    # Filter data
+    df1 = df[(df['commodity'] == commodity) & (df['year'] == year1)]
+    df2 = df[(df['commodity'] == commodity) & (df['year'] == year2)]
+
+    if len(df1) == 0 or len(df2) == 0:
+        raise ValueError(f"Data not found for {commodity} in {year1} or {year2}")
+
+    # Extract values
+    prod1 = df1['production'].iloc[0]
+    prod2 = df2['production'].iloc[0]
+
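+    # Calculate (guard against a zero baseline, which makes percent change undefined)
+    if prod2 == 0:
+        raise ValueError(f"Production for {commodity} in {year2} is zero")
+    change = prod1 - prod2
+    change_pct = (change / prod2) * 100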
+
+    # Interpret
+    if abs(change_pct) < 2:
+        interpretation = "stable"
+    elif change_pct > 10:
+        interpretation = "significant_increase"
+    elif change_pct >= 2:  # >= so exactly 2% is not classified as a decrease
+        interpretation = "moderate_increase"
+    elif change_pct < -10:
+        interpretation = "significant_decrease"
+    else:
+        interpretation = "moderate_decrease"
+
+    # Return
+    return {
+        "commodity": commodity,
+        "production_current": round(prod1, 1),
+        "production_previous": round(prod2, 1),
+        "change_absolute": round(change, 1),
+        "change_percent": round(change_pct, 1),
+        "interpretation": interpretation
+    }
+```
+
+---
+
+### SKILL.md
+
+#### ✅ MANDATORY
+
+**1. Valid frontmatter**:
+```yaml
+---
+name: agent-name
+description: [150-250 words with keywords]
+---
+```
+
+**2. Size**: 5000-7000 words
+
+**3. Mandatory sections**:
+- When to use (specific triggers)
+- Data source (detailed API)
+- Workflows (complete step-by-step)
+- Scripts (each one explained)
+- Analyses (methodologies)
+- Errors (complete handling)
+- Validations (mandatory)
+- Keywords (complete list)
+- Examples (5+ complete)
+
+**4. Detailed workflows**:
+
+✅ **GOOD**:
+```markdown
+### Workflow: YoY Comparison
+
+1. **Identify question parameters**
+   - Commodity: [extract from question]
+   - Years: Current vs previous (or specified)
+
+2. **Fetch data**
+   ```bash
+   python scripts/fetch_nass.py \
+     --commodity CORN \
+     --years 2023,2022 \
+     --output data/raw/corn_2023_2022.json
+   ```
+
+3. **Parse**
+   ```bash
+   python scripts/parse_nass.py \
+     --input data/raw/corn_2023_2022.json \
+     --output data/processed/corn.csv
+   ```
+
+4. **Analyze**
+   ```bash
+   python scripts/analyze_nass.py \
+     --input data/processed/corn.csv \
+     --analysis yoy \
+     --commodity CORN \
+     --year1 2023 \
+     --year2 2022 \
+     --output data/analysis/corn_yoy.json
+   ```
+
+5. **Interpret results**
+
+   File `data/analysis/corn_yoy.json` contains:
+   ```json
+   {
+     "production_current": 15.3,
+     "change_percent": 11.7,
+     "interpretation": "significant_increase"
+   }
+   ```
+
+   Respond to user:
+   "Corn production grew 11.7% in 2023..."
+```
+
+❌ **BAD**:
+```markdown
+### Workflow: Comparison
+
+1. Get data
+2. Compare
+3. Return result
+```
+
+**5. Complete examples**:
+
+✅ **GOOD**:
+```markdown
+### Example 1: YoY Comparison
+
+**Question**: "How's corn production compared to last year?"
+
+**Executed flow**:
+[Specific commands with outputs]
+
+**Generated answer**:
+"Corn production in 2023 is 15.3 billion bushels,
+growth of 11.7% vs 2022 (13.7 billion). Growth
+comes mainly from area increase (+8%) with stable yield."
+```
+
+❌ **BAD**:
+```markdown
+### Example: Comparison
+
+User asks about comparison. Agent compares and responds.
+```
+
+#### ❌ FORBIDDEN
+
+- Empty sections
+- "See documentation"
+- Workflows without specific commands
+- Generic examples
+
+---
+
+### References
+
+#### ✅ MANDATORY
+
+**1. Useful and self-contained content**:
+
+✅ **GOOD** (references/api-guide.md):
+```markdown
+## Endpoint: Get Production Data
+
+**URL**: `GET https://quickstats.nass.usda.gov/api/api_GET/`
+
+**Parameters**:
+- `commodity_desc`: Commodity name
+  - Example: "CORN", "SOYBEANS"
+  - Case-sensitive
+- `year`: Desired year
+  - Example: 2023
+  - Range: 1866-present
+
+**Complete request example**:
+```bash
+curl "https://quickstats.nass.usda.gov/api/api_GET/?key=YOUR_KEY&commodity_desc=CORN&year=2023&format=JSON"
+```
+
+**Expected response**:
+```json
+{
+  "data": [
+    {
+      "year": 2023,
+      "commodity_desc": "CORN",
+      "value": "15,300,000,000",
+      "unit_desc": "BU"
+    }
+  ]
+}
+```
+
+**Important fields**:
+- `value`: Comes as STRING with commas
+  - Solution: `value.replace(',', '')`
+  - Convert to float after
+```
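The `value` handling described in the example above can be sketched in Python. This is a minimal sketch that assumes the response shape shown in that example; `parse_nass_value` is a hypothetical helper name, not part of any NASS client library.

```python
def parse_nass_value(raw: str) -> float:
    """Convert a comma-formatted numeric string such as "15,300,000,000" to float."""
    cleaned = raw.replace(",", "").strip()
    if not cleaned:
        raise ValueError("Empty value string")
    return float(cleaned)  # float() raises ValueError for non-numeric text


# Record shaped like the example response above (illustrative only)
record = {"year": 2023, "commodity_desc": "CORN", "value": "15,300,000,000", "unit_desc": "BU"}
production = parse_nass_value(record["value"])
print(production)  # 15300000000.0
```

Raising on empty or non-numeric strings keeps bad records from silently turning into zeros further down the pipeline.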
+
+❌ **BAD**:
+```markdown
+## API Endpoint
+
+For details on how to use the API, consult the official documentation at:
+https://quickstats.nass.usda.gov/api
+
+[End of file]
+```
+
+**2. Adequate size**:
+- API guide: 1500-2000 words
+- Analysis methods: 2000-3000 words
+- Troubleshooting: 1000-1500 words
+
+**3. Concrete examples**:
+- Always include examples with real values
+- Executable code blocks
+- Expected outputs
+
+#### ❌ FORBIDDEN
+
+- "For more information, see [link]"
+- Sections with only 2-3 lines
+- Lists without details
+- Circular references (one doc says "see the other doc", which in turn points back)
+
+---
+
+### Assets (Configs)
+
+#### ✅ MANDATORY
+
+**1. Syntactically valid JSON**:
+```bash
+# ALWAYS validate:
+python -c "import json; json.load(open('config.json'))"
+```
+
+**2. Real values**:
+
+✅ **GOOD**:
+```json
+{
+  "api": {
+    "base_url": "https://quickstats.nass.usda.gov/api",
+    "api_key_env": "NASS_API_KEY",
+    "_instructions": "Get free API key from: https://quickstats.nass.usda.gov/api#registration",
+    "rate_limit_per_day": 1000,
+    "timeout_seconds": 30
+  }
+}
+```
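One way a script might consume the `api_key_env` field from the GOOD config: the config stores only the name of the environment variable, and the key itself is read at runtime. This is a sketch under assumptions; `load_api_settings` is a hypothetical helper, and the `environ` parameter exists only so the example can run without touching the real environment.

```python
import json
import os


def load_api_settings(config_text: str, environ=os.environ) -> dict:
    """Return API settings with the key resolved from the environment."""
    api = json.loads(config_text)["api"]
    env_name = api["api_key_env"]
    api_key = environ.get(env_name)
    if not api_key:
        raise RuntimeError(f"{env_name} environment variable not found")
    return {
        "base_url": api["base_url"],
        "timeout": api["timeout_seconds"],
        "api_key": api_key,
    }


config_text = (
    '{"api": {"base_url": "https://quickstats.nass.usda.gov/api", '
    '"api_key_env": "NASS_API_KEY", "timeout_seconds": 30}}'
)
# A stand-in mapping replaces os.environ for the demonstration
settings = load_api_settings(config_text, environ={"NASS_API_KEY": "example-key"})
print(settings["timeout"])  # 30
```

Keeping the secret out of the JSON file means the config can be committed as-is.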
+
+❌ **BAD**:
+```json
+{
+  "api": {
+    "base_url": "YOUR_API_URL_HERE",
+    "api_key": "YOUR_KEY_HERE"
+  }
+}
+```
+
+**3. Inline comments** (using `_comment` or `_note`):
+```json
+{
+  "_comment": "Differentiated TTL by data type",
+  "cache": {
+    "ttl_historical_days": 365,
+    "_note_historical": "Historical data doesn't change",
+    "ttl_current_days": 7,
+    "_note_current": "Current year data may be revised"
+  }
+}
+```
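Because JSON has no comment syntax, the `_comment` and `_note` entries above are ordinary keys that the consuming code has to skip. A minimal sketch of one way to do that, assuming the convention that every annotation key starts with an underscore; `strip_annotations` is a hypothetical helper name.

```python
import json
from typing import Any


def strip_annotations(node: Any) -> Any:
    """Recursively drop dict keys that start with "_" (e.g. _comment, _note_*)."""
    if isinstance(node, dict):
        return {k: strip_annotations(v) for k, v in node.items() if not k.startswith("_")}
    if isinstance(node, list):
        return [strip_annotations(item) for item in node]
    return node


raw = json.loads("""
{
  "_comment": "Differentiated TTL by data type",
  "cache": {
    "ttl_historical_days": 365,
    "_note_historical": "Historical data doesn't change",
    "ttl_current_days": 7
  }
}
""")
config = strip_annotations(raw)
print(config)  # {'cache': {'ttl_historical_days': 365, 'ttl_current_days': 7}}
```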
+
+---
+
+### README.md
+
+#### ✅ MANDATORY
+
+**1. Complete installation instructions**:
+
+✅ **GOOD**:
+```markdown
+## Installation
+
+### 1. Get API Key (Free)
+
+1. Access https://quickstats.nass.usda.gov/api#registration
+2. Fill form:
+   - Name: [your name]
+   - Email: [your email]
+   - Purpose: "Personal research"
+3. Click "Submit"
+4. You'll receive email with API key in ~1 minute
+5. Key format: `A1B2C3D4-E5F6-G7H8-I9J0-K1L2M3N4O5P6`
+
+### 2. Configure Environment
+
+**Option A - Export** (temporary):
+```bash
+export NASS_API_KEY="your_key_here"
+```
+
+**Option B - .bashrc/.zshrc** (permanent):
+```bash
+echo 'export NASS_API_KEY="your_key_here"' >> ~/.bashrc
+source ~/.bashrc
+```
+
+**Option C - .env file** (per project):
+```bash
+echo "NASS_API_KEY=your_key_here" > .env
+```
+
+### 3. Install Dependencies
+
+```bash
+cd nass-usda-agriculture
+pip install -r requirements.txt
+```
+
+Requirements:
+- requests
+- pandas
+- numpy
+```
+
+❌ **BAD**:
+```markdown
+## Installation
+
+1. Get API key from the official website
+2. Configure environment
+3. Install dependencies
+4. Done!
+```
+
+**2. Concrete usage examples**:
+
+✅ **GOOD**:
+```markdown
+## Examples
+
+### Example 1: Current Production
+
+```
+You: "What's US corn production in 2023?"
+
+Claude: "Corn production in 2023 was 15.3 billion
+bushels (389 million metric tons)..."
+```
+
+### Example 2: YoY Comparison
+
+```
+You: "Compare soybeans this year vs last year"
+
+Claude: "Soybean production in 2023 is 2.6% below 2022:
+- 2023: 4.165 billion bushels
+- 2022: 4.276 billion bushels
+- Drop from area (-4.5%), yield improved (+0.8%)"
+```
+
+[3-5 more examples]
+```
+
+❌ **BAD**:
+```markdown
+## Usage
+
+Ask questions about agriculture and the agent will respond.
+```
+
+**3. Specific troubleshooting**:
+
+✅ **GOOD**:
+```markdown
+### Error: "NASS_API_KEY environment variable not found"
+
+**Cause**: API key not configured
+
+**Step-by-step solution**:
+1. Verify key was obtained: https://...
+2. Configure environment:
+   ```bash
+   export NASS_API_KEY="your_key_here"
+   ```
+3. Verify:
+   ```bash
+   echo $NASS_API_KEY
+   ```
+4. The command should print your key
+5. If it doesn't work, restart the terminal
+
+**Still not working?**
+- Check for extra spaces in key
+- Verify key hasn't expired (validity: 1 year)
+- Re-generate key if needed
+```
+
+---
+
+## Quality Checklist
+
+### Per Python Script
+
+- [ ] Shebang: `#!/usr/bin/env python3`
+- [ ] Module docstring (3-5 lines)
+- [ ] Organized imports (stdlib, 3rd party, local)
+- [ ] Constants at top (if applicable)
+- [ ] Type hints in all public functions
+- [ ] Docstrings in classes (description + attributes + example)
+- [ ] Docstrings in methods (Args, Returns, Raises, Example)
+- [ ] Error handling for risky operations
+- [ ] Input validations
+- [ ] Output validations
+- [ ] Appropriate logging
+- [ ] Main function with argparse
+- [ ] if __name__ == "__main__"
+- [ ] Functional code (no TODO/pass)
+- [ ] Valid syntax (test: `python -m py_compile script.py`)
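Several items in the Python checklist above can be checked mechanically. The following is a heuristic sketch using the standard `ast` module; `audit_source` is a hypothetical name, and it covers only docstrings, type hints, TODO markers, and placeholder bodies, not the full list.

```python
import ast
from typing import List


def audit_source(source: str) -> List[str]:
    """Return checklist violations found in Python source (heuristic sketch)."""
    problems = []
    tree = ast.parse(source)  # a SyntaxError here also fails the "valid syntax" item
    if ast.get_docstring(tree) is None:
        problems.append("module: missing docstring")
    if "TODO" in source:
        problems.append("module: contains TODO")
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if node.name.startswith("_"):
            continue  # the checklist only requires hints on public functions
        if ast.get_docstring(node) is None:
            problems.append(f"{node.name}: missing docstring")
        if node.returns is None:
            problems.append(f"{node.name}: missing return type hint")
        params = [a for a in node.args.args if a.arg not in ("self", "cls")]
        if any(a.annotation is None for a in params):
            problems.append(f"{node.name}: parameter without type hint")
        # Ignore docstrings and bare constants such as `...` when judging the body
        body = [s for s in node.body
                if not (isinstance(s, ast.Expr) and isinstance(s.value, ast.Constant))]
        if all(isinstance(s, ast.Pass) for s in body):
            problems.append(f"{node.name}: body is only pass/docstring")
    return problems


bad = 'def analyze():\n    # TODO: implement analysis\n    pass\n'
for problem in audit_source(bad):
    print(problem)
```

A check like this catches the obvious placeholders; it cannot tell whether the logic inside a function is actually meaningful.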
+
+### Per SKILL.md
+
+- [ ] Frontmatter with name and description
+- [ ] Description 150-250 words with keywords
+- [ ] Size 5000+ words
+- [ ] "When to Use" section with specific triggers
+- [ ] "Data Source" section detailed
+- [ ] Step-by-step workflows with commands
+- [ ] Scripts explained individually
+- [ ] Analyses documented (objective, methodology)
+- [ ] Errors handled (all expected)
+- [ ] Validations listed
+- [ ] Performance/cache explained
+- [ ] Complete keywords
+- [ ] Complete examples (5+)
+
+### Per Reference File
+
+- [ ] 1000+ words
+- [ ] Useful content (not just links)
+- [ ] Concrete examples with real values
+- [ ] Executable code blocks
+- [ ] Well structured (headings, lists)
+- [ ] No empty sections
+- [ ] No "TODO: write"
+
+### Per Asset (Config)
+
+- [ ] Syntactically valid JSON (validate!)
+- [ ] Real values (not "YOUR_X_HERE" without context)
+- [ ] Inline comments (_comment, _note)
+- [ ] Instructions for values user must fill
+- [ ] Logical and organized structure
+
+### Per README.md
+
+- [ ] Step-by-step installation
+- [ ] How to get API key (detailed)
+- [ ] How to configure (3 options)
+- [ ] How to install dependencies
+- [ ] How to install in Claude Code
+- [ ] Usage examples (5+)
+- [ ] Troubleshooting (10+ problems)
+- [ ] License
+- [ ] Contact/contribution (if applicable)
+
+### Complete Agent
+
+- [ ] DECISIONS.md documents all choices
+- [ ] **VERSION** file created (e.g. 1.0.0)
+- [ ] **CHANGELOG.md** created with complete v1.0.0 entry
+- [ ] **INSTALACAO.md** with complete didactic tutorial
+- [ ] **comprehensive_{domain}_report()** implemented
+- [ ] marketplace.json with version field
+- [ ] 18+ files created
+- [ ] ~1500+ lines of Python code
+- [ ] ~10,000+ words of documentation
+- [ ] 2+ configs
+- [ ] requirements.txt
+- [ ] .gitignore (if needed)
+- [ ] No placeholder/TODO
+- [ ] Valid syntax (Python, JSON, YAML)
+- [ ] Ready to use (production-ready)
+
+---
+
+## Quality Examples
+
+### Example: Error Handling
+
+❌ **BAD**:
+```python
+def fetch(url):
+    return requests.get(url).json()
+```
+
+✅ **GOOD**:
+```python
+import requests
+from typing import Dict
+
+
+def fetch(url: str, timeout: int = 30) -> Dict:
+    """
+    Fetch data from URL with error handling
+
+    Args:
+        url: URL to fetch
+        timeout: Timeout in seconds
+
+    Returns:
+        JSON response as dict
+
+    Raises:
+        NetworkError: If connection fails
+        TimeoutError: If request times out
+        RateLimitError: If API returns HTTP 429
+        APIError: If API returns any other error
+    """
+    try:
+        response = requests.get(url, timeout=timeout)
+        response.raise_for_status()
+
+        data = response.json()
+
+        if 'error' in data:
+            raise APIError(f"API error: {data['error']}")
+
+        return data
+
+    except requests.Timeout as e:
+        raise TimeoutError(f"Request timed out after {timeout}s") from e
+
+    except requests.ConnectionError as e:
+        raise NetworkError(f"Connection failed: {e}") from e
+
+    except requests.HTTPError as e:
+        if e.response.status_code == 429:
+            raise RateLimitError("Rate limit exceeded") from e
+        else:
+            raise APIError(f"HTTP {e.response.status_code}: {e}") from e
+```
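The GOOD `fetch` example raises `NetworkError`, `APIError`, and `RateLimitError`, which this section does not define (`TimeoutError` is a Python built-in). One possible hierarchy is sketched below; the base class `SkillError` and the `describe` helper are hypothetical, introduced only for illustration.

```python
class SkillError(Exception):
    """Hypothetical common base so callers can catch every skill-specific failure."""


class NetworkError(SkillError):
    """Connection could not be established."""


class APIError(SkillError):
    """The API answered, but with an error payload or a non-success status."""


class RateLimitError(APIError):
    """HTTP 429: request quota exhausted."""


def describe(error: Exception) -> str:
    """Map an exception to a short user-facing hint, most specific class first."""
    if isinstance(error, RateLimitError):
        return "rate limit exceeded, retry later"
    if isinstance(error, APIError):
        return "API rejected the request"
    if isinstance(error, NetworkError):
        return "network problem, check connectivity"
    return "unexpected error"


print(describe(RateLimitError("Rate limit exceeded")))  # rate limit exceeded, retry later
```

Making `RateLimitError` a subclass of `APIError` lets callers that only handle `APIError` still catch rate-limit failures.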
+
+### Example: Validations
+
+❌ **BAD**:
+```python
+def parse(data):
+    df = pd.DataFrame(data)
+    return df
+```
+
+✅ **GOOD**:
+```python
+from datetime import datetime
+from typing import Dict, List
+
+import pandas as pd
+
+
+def parse(data: List[Dict]) -> pd.DataFrame:
+    """Parse and validate data"""
+
+    # Validate input
+    if not data:
+        raise ValueError("Data cannot be empty")
+
+    if not isinstance(data, list):
+        raise TypeError(f"Expected list, got {type(data)}")
+
+    # Parse
+    df = pd.DataFrame(data)
+
+    # Validate schema
+    required_cols = ['year', 'commodity', 'value']
+    missing = set(required_cols) - set(df.columns)
+    if missing:
+        raise ValueError(f"Missing required columns: {missing}")
+
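+    # Validate types (values may arrive as strings with thousands separators)
+    df['year'] = pd.to_numeric(df['year'], errors='raise')
+    df['value'] = pd.to_numeric(
+        df['value'].astype(str).str.replace(',', '', regex=False),
+        errors='raise'
+    )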
+
+    # Validate ranges
+    current_year = datetime.now().year
+    if (df['year'] > current_year).any():
+        raise ValueError(f"Future years found (max allowed: {current_year})")
+
+    if (df['value'] < 0).any():
+        raise ValueError("Negative values found")
+
+    # Validate no duplicates
+    if df.duplicated(subset=['year', 'commodity']).any():
+        raise ValueError("Duplicate records found")
+
+    return df
+```
+
+### Example: Docstrings
+
+❌ **BAD**:
+```python
+def analyze(df, commodity):
+    """Analyze data"""
+    # ...
+```
+
+✅ **GOOD**:
+```python
+def analyze_yoy(
+    df: pd.DataFrame,
+    commodity: str,
+    year1: int,
+    year2: int
+) -> Dict[str, Any]:
+    """
+    Perform year-over-year comparison analysis
+
+    Compares production, area, and yield between two years
+    and decomposes growth into area vs yield contributions.
+
+    Args:
+        df: DataFrame with columns ['year', 'commodity', 'production', 'area', 'yield']
+        commodity: Commodity name (e.g., "CORN", "SOYBEANS")
+        year1: Current year to compare
+        year2: Previous year to compare against
+
+    Returns:
+        Dict containing:
+            - production_current (float): Production in year1 (million units)
+            - production_previous (float): Production in year2
+            - change_absolute (float): Absolute change
+            - change_percent (float): Percent change
+            - decomposition (dict): Area vs yield contribution
+            - interpretation (str): "stable", "moderate_increase", "significant_increase",
+              "moderate_decrease", or "significant_decrease"
+
+    Raises:
+        ValueError: If commodity not found in data
+        ValueError: If either year not found in data
+        DataQualityError: If production differs from area * yield by more than 1%
+
+    Example:
+        >>> df = pd.DataFrame([
+        ...     {'year': 2023, 'commodity': 'CORN', 'production': 15.3, 'area': 94.6, 'yield': 177},
+        ...     {'year': 2022, 'commodity': 'CORN', 'production': 13.7, 'area': 89.2, 'yield': 173}
+        ... ])
+        >>> result = analyze_yoy(df, "CORN", 2023, 2022)
+        >>> result['change_percent']
+        11.7
+    """
+    # [Complete implementation]
+```
+
+---
+
+## Anti-Patterns
+
+### Anti-Pattern 1: Partial Implementation
+
+❌ **NO**:
+```python
+def yoy_comparison(df, commodity, year1, year2):
+    # Implement YoY comparison
+    pass
+
+def state_ranking(df, commodity):
+    # TODO: implement ranking
+    raise NotImplementedError()
+```
+
+✅ **YES**:
+```python
+# [Complete and functional code for BOTH functions]
+```
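For contrast with the placeholder versions above, here is what a complete `state_ranking` could look like. This is a sketch under assumptions: it works on plain record dicts rather than a DataFrame to stay dependency-free, the `state` field and the `year` and `top_n` parameters are not in the original signature, and the numbers are illustrative rather than real NASS data.

```python
from typing import Dict, List


def state_ranking(records: List[Dict], commodity: str, year: int, top_n: int = 3) -> List[Dict]:
    """Rank states by production for one commodity and year, largest first."""
    if top_n <= 0:
        raise ValueError("top_n must be positive")
    rows = [r for r in records if r["commodity"] == commodity and r["year"] == year]
    if not rows:
        raise ValueError(f"No data for {commodity} in {year}")
    total = sum(r["production"] for r in rows)
    ranked = sorted(rows, key=lambda r: r["production"], reverse=True)[:top_n]
    return [
        {
            "rank": i,
            "state": r["state"],
            "production": r["production"],
            "share_percent": round(r["production"] / total * 100, 1) if total else 0.0,
        }
        for i, r in enumerate(ranked, start=1)
    ]


# Illustrative numbers only
records = [
    {"state": "A", "commodity": "CORN", "year": 2023, "production": 50.0},
    {"state": "B", "commodity": "CORN", "year": 2023, "production": 30.0},
    {"state": "C", "commodity": "CORN", "year": 2023, "production": 20.0},
    {"state": "A", "commodity": "CORN", "year": 2022, "production": 40.0},
]
for row in state_ranking(records, "CORN", 2023, top_n=2):
    print(row)
```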
+
+### Anti-Pattern 2: Empty References
+
+❌ **NO**:
+```markdown
+# Analysis Methods
+
+## YoY Comparison
+
+This method compares two years.
+
+## Ranking
+
+This method ranks states.
+```
+
+✅ **YES**:
+```markdown
+# Analysis Methods
+
+## YoY Comparison
+
+### Objective
+Compare metrics between current and previous year...
+
+### Detailed Methodology
+
+**Formulas**:
+```
+Δ X = X(t) - X(t-1)
+Δ X% = (Δ X / X(t-1)) × 100
+```
+
+**Decomposition** (for production):
+[Complete mathematics]
+
+**Interpretation**:
+- |Δ| < 2%: Stable
+- Δ > 10%: Significant increase
+[...]
+
+### Validations
+[List]
+
+### Complete Numerical Example
+[With real values]
+```
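The percent-change formula in the example above, and one possible form of the production decomposition, can be worked through numerically. The log decomposition below is a common technique chosen for illustration, not necessarily the method the standards intend; it assumes production = area × yield, and the area and yield inputs are illustrative rather than real data.

```python
import math


def pct_change(current: float, previous: float) -> float:
    """Percent change: (X(t) - X(t-1)) / X(t-1) * 100."""
    if previous == 0:
        raise ValueError("previous value is zero; percent change undefined")
    return (current - previous) / previous * 100


def log_decomposition(area_now: float, area_prev: float,
                      yield_now: float, yield_prev: float) -> dict:
    """Split production growth into area and yield log contributions.

    Assumes production = area * yield, so
    ln(P1/P0) = ln(A1/A0) + ln(Y1/Y0) holds exactly.
    """
    area_term = math.log(area_now / area_prev)
    yield_term = math.log(yield_now / yield_prev)
    return {"area": area_term, "yield": yield_term, "total": area_term + yield_term}


# Production figures quoted in this document's examples
print(round(pct_change(15.3, 13.7), 1))  # 11.7

# Illustrative inputs: area +8%, yield +2%
parts = log_decomposition(108.0, 100.0, 102.0, 100.0)
production_ratio = (108.0 * 102.0) / (100.0 * 100.0)
assert math.isclose(parts["total"], math.log(production_ratio))
area_share = parts["area"] / parts["total"] * 100
print(round(area_share, 1))  # 79.5
```

Log terms add up exactly, which avoids the leftover interaction term that appears when percent changes are simply summed.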
+
+### Anti-Pattern 3: Useless Configs
+
+❌ **NO**:
+```json
+{
+  "api_url": "INSERT_URL",
+  "api_key": "INSERT_KEY"
+}
+```
+
+✅ **YES**:
+```json
+{
+  "_comment": "Configuration for NASS USDA Agent",
+  "api": {
+    "base_url": "https://quickstats.nass.usda.gov/api",
+    "_note": "This is the official USDA NASS API base URL",
+    "api_key_env": "NASS_API_KEY",
+    "_key_instructions": "Get free API key from: https://quickstats.nass.usda.gov/api#registration"
+  }
+}
+```
+
+---
+
+## Final Validation
+
+Before delivering to user, verify:
+
+### Sanity Test
+
+```bash
+# 1. Python syntax
+find scripts -name "*.py" -exec python -m py_compile {} \;
+
+# 2. JSON syntax
+python -c "import json; json.load(open('assets/config.json'))"
+
+# 3. Imports make sense
+grep -hE "^(import|from) " scripts/*.py | sort -u
+# Verify all libs are: stdlib, requests, pandas, numpy
+# No imports of uninstalled libs
+
+# 4. SKILL.md has frontmatter
+head -5 SKILL.md | grep "^---$"
+
+# 5. SKILL.md size
+wc -w SKILL.md
+# Should be > 5000 words
+```
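Steps 4 and 5 of the sanity test can also be expressed in Python, which makes them easier to run from a test suite. This is a minimal sketch; `check_skill_md` is a hypothetical helper, and unlike `wc -w SKILL.md` it counts only the words after the frontmatter.

```python
import re
from typing import Dict


def check_skill_md(text: str, min_words: int = 5000) -> Dict[str, bool]:
    """Check frontmatter delimiters, required keys, and body word count."""
    match = re.match(r"^---\n(.*?)\n---\n", text, flags=re.DOTALL)
    frontmatter = match.group(1) if match else ""
    body = text[match.end():] if match else text
    return {
        "has_frontmatter": match is not None,
        "has_name": bool(re.search(r"^name:\s*\S", frontmatter, flags=re.MULTILINE)),
        "has_description": bool(re.search(r"^description:\s*\S", frontmatter, flags=re.MULTILINE)),
        "long_enough": len(body.split()) > min_words,
    }


sample = "---\nname: agent-name\ndescription: Example description\n---\n" + "word " * 10
print(check_skill_md(sample, min_words=5))
```

Requiring both `---` delimiters is stricter than the `head | grep` check, which passes as soon as a single delimiter line is found.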
+
+### Final Checklist
+
+- [ ] Syntax check passed (Python, JSON)
+- [ ] No import of non-existent lib
+- [ ] No TODO or pass
+- [ ] SKILL.md > 5000 words
+- [ ] References with content
+- [ ] README with complete instructions
+- [ ] DECISIONS.md created
+- [ ] requirements.txt created