Initial commit

2025-11-29 18:27:28 +08:00
commit 8db9c44dd8
79 changed files with 37715 additions and 0 deletions
--- a/skills/FrancyJGLisboa__agent-skill-creator/references/phase3-architecture.md
+++ b/skills/FrancyJGLisboa__agent-skill-creator/references/phase3-architecture.md
@@ -0,0 +1,351 @@
+# Phase 3: Architecture and Structuring
+
+## Objective
+
+**STRUCTURE** the agent for maintainability and performance: define its folders, files, script responsibilities, cache strategy, and rate limiting.
+
+## Detailed Process
+
+### Step 1: Define Agent Name
+
+Based on the domain and objective, create a descriptive name:
+
+**Format**: `domain-objective-type`
+
+**Examples**:
+- US Agriculture → `nass-usda-agriculture`
+- Stock analysis → `stock-technical-analysis`
+- Global climate → `noaa-climate-analysis`
+- Brazil CONAB → `conab-crop-yield-analysis`
+
+**Rules**:
+- lowercase
+- hyphens to separate words
+- maximum 50 characters
+- descriptive but concise
+
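A minimal sketch of how the naming rules above could be checked automatically. The function name `is_valid_agent_name` and the choice to allow digits are assumptions, not part of the original; the "descriptive but concise" rule is a judgment call and is not checked.

```python
import re

# Assumption: a valid name is lowercase alphanumeric words joined by single hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
MAX_NAME_LENGTH = 50  # "maximum 50 characters" from the rules above


def is_valid_agent_name(name: str) -> bool:
    """Return True if the name is lowercase, hyphen-separated, and at most 50 characters."""
    return len(name) <= MAX_NAME_LENGTH and NAME_PATTERN.fullmatch(name) is not None
```

For instance, `is_valid_agent_name("stock-technical-analysis")` passes, while a name containing uppercase letters, underscores, or doubled hyphens is rejected.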
+### Step 2: Directory Structure
+
+**Decision**: How many levels of organization?
+
+**Option A - Simple** (small agents):
+```
+agent-name/
+├── .claude-plugin/
+│   └── marketplace.json    ← REQUIRED for installation!
+├── SKILL.md
+├── scripts/
+│   └── main.py
+├── references/
+│   └── guide.md
+└── assets/
+    └── config.json
+```
+
+**Option B - Organized** (medium agents):
+```
+agent-name/
+├── .claude-plugin/
+│   └── marketplace.json    ← REQUIRED for installation!
+├── SKILL.md
+├── scripts/
+│   ├── fetch.py
+│   ├── parse.py
+│   ├── analyze.py
+│   └── utils/
+│       ├── cache.py
+│       └── validators.py
+├── references/
+│   ├── api-guide.md
+│   └── methodology.md
+└── assets/
+    └── config.json
+```
+
+**Option C - Complete** (complex agents):
+```
+agent-name/
+├── .claude-plugin/
+│   └── marketplace.json    ← REQUIRED for installation!
+├── SKILL.md
+├── scripts/
+│   ├── core/
+│   │   ├── fetch_[source].py
+│   │   ├── parse_[source].py
+│   │   └── analyze_[source].py
+│   ├── models/
+│   │   ├── forecasting.py
+│   │   └── ml_models.py
+│   └── utils/
+│       ├── cache_manager.py
+│       ├── rate_limiter.py
+│       └── validators.py
+├── references/
+│   ├── api/
+│   │   └── [api-name]-guide.md
+│   ├── methods/
+│   │   └── analysis-methods.md
+│   └── troubleshooting.md
+├── assets/
+│   ├── config.json
+│   └── metadata.json
+└── data/
+    ├── raw/
+    ├── processed/
+    ├── cache/
+    └── analysis/
+```
+
+**Choose based on**:
+- Number of scripts (1-2 → A, 3-5 → B, 6+ → C)
+- Analysis complexity
+- Prefer starting with B, expand to C if needed
+
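The script-count thresholds above can be written as a small helper. This is only a sketch: `choose_structure` is a hypothetical name, and it encodes the script count alone, leaving analysis complexity and the preference for starting with B to the author's judgment.

```python
def choose_structure(script_count: int) -> str:
    """Map a planned script count to directory option 'A', 'B', or 'C'."""
    if script_count < 1:
        raise ValueError("script_count must be at least 1")
    if script_count <= 2:
        return "A"  # Simple: 1-2 scripts
    if script_count <= 5:
        return "B"  # Organized: 3-5 scripts
    return "C"  # Complete: 6 or more scripts
```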
+### Step 3: Define Script Responsibilities
+
+**Principle**: Separation of Concerns
+
+**Typical scripts**:
+
+**1. fetch_[source].py**
+- **Responsibility**: API requests, authentication, rate limiting
+- **Input**: Query parameters (commodity, year, etc.)
+- **Output**: Raw JSON from API
+- **Does NOT**: Parse, transform, analyze
+- **Size**: 200-300 lines
+
+**2. parse_[source].py**
+- **Responsibility**: Parsing, cleaning, validation
+- **Input**: API JSON
+- **Output**: Structured DataFrame
+- **Does NOT**: Fetch, analyze
+- **Size**: 150-200 lines
+
+**3. analyze_[source].py**
+- **Responsibility**: All analyses (YoY, ranking, etc.)
+- **Input**: Clean DataFrame
+- **Output**: Dict with results
+- **Does NOT**: Fetch, parse
+- **Size**: 300-500 lines (all analyses)
+
+**Typical utils**:
+
+**cache_manager.py**:
+- Manage response cache
+- Differentiated TTLs (historical vs. current data)
+- ~100-150 lines
+
+**rate_limiter.py**:
+- Control rate limit
+- Persistent counter
+- Alerts
+- ~100-150 lines
+
+**validators.py**:
+- Data validations
+- Consistency checks
+- ~100-150 lines
+
+**unit_converter.py** (if needed):
+- Unit conversions
+- ~50-100 lines
+
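The separation of concerns described above can be illustrated with three small functions, one per stage. This is a sketch under stated assumptions: the fetch stage is stubbed with placeholder values instead of a real API call, and plain lists of dicts stand in for the DataFrame so the example needs no third-party packages.

```python
import json


def fetch(commodity: str, year: int) -> str:
    """Fetch stage: return raw JSON text. Stubbed with placeholder values, not real data."""
    canned = [
        {"commodity": commodity, "year": year - 1, "value": "100.0"},
        {"commodity": commodity, "year": year, "value": "110.0"},
    ]
    return json.dumps(canned)


def parse(raw_json: str) -> list[dict]:
    """Parse stage: decode, coerce types, and sort by year. No fetching, no analysis."""
    rows = json.loads(raw_json)
    cleaned = [
        {"commodity": r["commodity"], "year": int(r["year"]), "value": float(r["value"])}
        for r in rows
    ]
    return sorted(cleaned, key=lambda r: r["year"])


def analyze(rows: list[dict]) -> dict:
    """Analyze stage: year-over-year change between the two most recent rows."""
    previous, latest = rows[-2], rows[-1]
    yoy_pct = (latest["value"] - previous["value"]) / previous["value"] * 100
    return {"year": latest["year"], "yoy_pct": yoy_pct}


# Each stage consumes only the previous stage's output.
result = analyze(parse(fetch("CORN", 2024)))
```

Because each function touches only its own concern, the stub in `fetch` can later be swapped for a real request without changing `parse` or `analyze`.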
+### Step 4: Plan References
+
+**Typical files**:
+
+**api-guide.md** (~1500 words):
+- How to get API key
+- Main endpoints with examples
+- Important parameters
+- Response format
+- Limitations and quirks
+- API troubleshooting
+
+**analysis-methods.md** (~2000 words):
+- Each analysis explained
+- Mathematical formulas
+- Interpretations
+- Validations
+- Concrete examples
+
+**troubleshooting.md** (~1000 words):
+- Common problems
+- Step-by-step solutions
+- FAQs
+
+**domain-context.md** (optional, ~1000 words):
+- Domain context
+- Terminology
+- Important concepts
+- Benchmarks
+
+### Step 5: Plan Assets
+
+**config.json**:
+- API settings (URL, rate limits, timeouts)
+- Cache settings (TTLs, directories)
+- Analysis defaults
+- Validation thresholds
+
+**Typical structure**:
+```json
+{
+  "api": {
+    "base_url": "...",
+    "api_key_env": "VAR_NAME",
+    "rate_limit_per_day": 1000,
+    "timeout_seconds": 30
+  },
+  "cache": {
+    "enabled": true,
+    "dir": "data/cache",
+    "ttl_historical_days": 365,
+    "ttl_current_days": 7
+  },
+  "defaults": {
+    "param1": "value1"
+  },
+  "validation": {
+    "threshold1": 0.01
+  }
+}
+```
+
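A minimal sketch of how a script might load this config. It assumes that `api_key_env` holds the name of an environment variable containing the key, which the field name suggests but the original does not state; `load_config` and `resolve_api_key` are hypothetical helper names.

```python
import json
import os
from pathlib import Path


def load_config(path: Path) -> dict:
    """Read config.json and fail early if an expected top-level section is missing."""
    config = json.loads(path.read_text(encoding="utf-8"))
    for section in ("api", "cache", "defaults", "validation"):
        if section not in config:
            raise KeyError(f"config is missing section: {section}")
    return config


def resolve_api_key(config: dict) -> str:
    """Look up the API key in the environment variable named by api.api_key_env (assumed)."""
    env_name = config["api"]["api_key_env"]
    key = os.environ.get(env_name)
    if not key:
        raise RuntimeError(f"environment variable {env_name} is not set")
    return key
```

Keeping only the variable name in `config.json` means the key itself never has to be committed alongside the agent.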
+**metadata.json** (if needed):
+- Domain-specific metadata
+- Mappings (aliases)
+- Conversions (units)
+- Groupings (regions)
+
+**Example**:
+```json
+{
+  "commodities": {
+    "CORN": {
+      "aliases": ["corn", "maize"],
+      "unit": "BU",
+      "conversion_to_mt": 0.0254
+    }
+  },
+  "regions": {
+    "MIDWEST": ["IA", "IL", "IN", "OH"]
+  }
+}
+```
+
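One way the mappings, conversions, and groupings above might be used at runtime. The metadata is inlined so the sketch runs on its own; the function names are hypothetical, and reading `conversion_to_mt` as a multiplier from the native unit to metric tons is an assumption based on the field name.

```python
# Mirrors the metadata.json example above, inlined so the sketch is self-contained.
METADATA = {
    "commodities": {
        "CORN": {"aliases": ["corn", "maize"], "unit": "BU", "conversion_to_mt": 0.0254}
    },
    "regions": {"MIDWEST": ["IA", "IL", "IN", "OH"]},
}


def resolve_commodity(name: str, metadata: dict = METADATA) -> str:
    """Return the canonical commodity key for a name or alias (case-insensitive)."""
    wanted = name.strip().lower()
    for key, info in metadata["commodities"].items():
        if wanted == key.lower() or wanted in info["aliases"]:
            return key
    raise KeyError(f"unknown commodity: {name}")


def to_metric_tons(quantity: float, commodity: str, metadata: dict = METADATA) -> float:
    """Convert a quantity in the commodity's native unit to metric tons."""
    return quantity * metadata["commodities"][commodity]["conversion_to_mt"]


def expand_region(region: str, metadata: dict = METADATA) -> list[str]:
    """Return the state codes grouped under a region name."""
    return list(metadata["regions"][region.upper()])
```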
+### Step 6: Cache Strategy
+
+**Decision**: What to cache and for how long?
+
+**General rule**:
+- **Historical data** (year < current): Long-lived cache (365+ days; effectively permanent)
+- **Current year data**: Short cache (7 days, because figures may still be revised)
+- **Metadata** (commodity lists, states): Permanent cache
+
+**Implementation**:
+- Cache by key (parameter hash)
+- Check age before using
+- Fallback to expired cache if API fails
+
+**Example**:
+```python
+from datetime import datetime, timedelta
+
+
+def get_cache_ttl(year: int) -> timedelta:
+    """Determine cache TTL based on year"""
+    current_year = datetime.now().year
+
+    if year < current_year:
+        # Historical: cache for 1 year (effectively permanent)
+        return timedelta(days=365)
+    else:
+        # Current year: cache for 7 days (may be revised)
+        return timedelta(days=7)
+```
+
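The three implementation points above (key by parameter hash, check age before use, fall back to expired cache on API failure) could be combined roughly as follows. This is a sketch, not the agent's actual cache manager: `cache_key`, `cached_fetch`, the one-JSON-file-per-key layout, and the `saved_at` field are all assumptions.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Callable, Optional


def cache_key(params: dict) -> str:
    """Hash the query parameters; sorting keys makes the hash order-independent."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def cached_fetch(
    params: dict,
    fetch: Callable[[dict], dict],
    ttl_seconds: float,
    cache_dir: Path,
    now: Optional[float] = None,
) -> dict:
    """Return fresh cached data, else call fetch; on failure fall back to stale data."""
    now = time.time() if now is None else now
    path = cache_dir / f"{cache_key(params)}.json"
    entry = json.loads(path.read_text()) if path.exists() else None

    # Check age before using the cached entry.
    if entry is not None and now - entry["saved_at"] < ttl_seconds:
        return entry["data"]

    try:
        data = fetch(params)
    except Exception:
        if entry is not None:
            return entry["data"]  # expired, but still usable when the API is down
        raise

    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"saved_at": now, "data": data}))
    return data
```

The optional `now` argument exists only to make the age check easy to exercise in tests; in normal use it defaults to the current time.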
+### Step 7: Rate Limiting Strategy
+
+**Decision**: How to control rate limits?
+
+**Components**:
+1. **Persistent counter** (file/DB)
+2. **Pre-request verification**
+3. **Alerts** (when near limit)
+4. **Blocking** (when limit reached)
+
+**Implementation**:
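+```python
+import json
+import time
+from pathlib import Path
+
+
+class RateLimiter:
+    def __init__(self, max_requests: int, period_seconds: int):
+        self.max_requests = max_requests
+        self.period = period_seconds
+        self.counter_file = Path("data/cache/rate_limit_counter.json")
+
+    def allow_request(self) -> bool:
+        """Check if request is allowed"""
+        count = self._get_current_count()
+
+        if count >= self.max_requests:
+            return False
+
+        # Warn when near limit
+        if count > self.max_requests * 0.9:
+            print(f"⚠️ Rate limit: {count}/{self.max_requests} requests used")
+
+        return True
+
+    def record_request(self):
+        """Record that request was made"""
+        count = self._get_current_count()
+        self._save_count(count + 1)
+
+    def _load_state(self) -> dict:
+        """Assumed helper: read the persisted window, resetting it once the period elapses"""
+        try:
+            state = json.loads(self.counter_file.read_text())
+        except (FileNotFoundError, json.JSONDecodeError):
+            state = None
+        if state is None or time.time() - state["window_start"] >= self.period:
+            state = {"window_start": time.time(), "count": 0}
+        return state
+
+    def _get_current_count(self) -> int:
+        return self._load_state()["count"]
+
+    def _save_count(self, count: int):
+        state = {**self._load_state(), "count": count}
+        self.counter_file.parent.mkdir(parents=True, exist_ok=True)
+        self.counter_file.write_text(json.dumps(state))
+```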
+
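A sketch of how the "pre-request verification" and "blocking" components above might wrap an API call. Everything here is illustrative: `guarded_call`, `RateLimitExceeded`, and `InMemoryLimiter` are hypothetical names, and the in-memory limiter is a stand-in for a persistent one so the example runs on its own.

```python
from typing import Callable, Protocol


class Limiter(Protocol):
    """Any object offering the two methods used in the rate limiter above."""

    def allow_request(self) -> bool: ...
    def record_request(self) -> None: ...


class RateLimitExceeded(RuntimeError):
    """Raised when the limiter refuses a request (hypothetical name)."""


def guarded_call(limiter: Limiter, request: Callable[[], dict]) -> dict:
    """Verify before the request, block when the limit is reached, record on success."""
    if not limiter.allow_request():
        raise RateLimitExceeded("rate limit reached; request not sent")
    response = request()
    limiter.record_request()
    return response


class InMemoryLimiter:
    """Stand-in limiter for the demo; a real one would persist its counter."""

    def __init__(self, max_requests: int):
        self.max_requests = max_requests
        self.count = 0

    def allow_request(self) -> bool:
        return self.count < self.max_requests

    def record_request(self) -> None:
        self.count += 1
```

Recording only after a successful call is a design choice in this sketch; if the API also counts failed requests against the quota, the `record_request` call would need to move before the request.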
+### Step 8: Document Architecture
+
+Create a section in DECISIONS.md:
+
+````markdown
+## Phase 3: Architecture
+
+### Chosen Structure
+
+```
+[Directory tree]
+```
+
+**Justification**:
+- Separate scripts for modularity
+- Utils for reusable code
+- References for progressive disclosure
+- data/ to separate raw vs. processed
+
+### Defined Scripts
+
+**fetch_[source].py** (280 lines estimated):
+- Responsibility: [...]
+- Input/Output: [...]
+
+[For each script...]
+
+### Cache Strategy
+
+- Historical: Permanent cache
+- Current: 7 day cache
+- Justification: [historical data doesn't change]
+
+### Rate Limiting
+
+- Method: [persistent counter]
+- Limits: [1000/day]
+- Alerts: [>90% usage]
+````
+
+## Phase 3 Checklist
+
+- [ ] Agent name defined
+- [ ] Directory structure chosen (A/B/C)
+- [ ] Responsibilities of each script defined
+- [ ] References planned (which files, content)
+- [ ] Assets planned (which configs, structure)
+- [ ] Cache strategy defined (what, TTL)
+- [ ] Rate limiting strategy defined
+- [ ] Architecture documented in DECISIONS.md