Initial commit

2025-11-29 18:27:28 +08:00
commit 8db9c44dd8
79 changed files with 37715 additions and 0 deletions
--- a/skills/FrancyJGLisboa__agent-skill-creator/references/phase1-discovery.md
+++ b/skills/FrancyJGLisboa__agent-skill-creator/references/phase1-discovery.md
@@ -0,0 +1,454 @@
+# Phase 1: Discovery and API Research
+
+## Objective
+
+Research and **DECIDE** autonomously which API or data source to use for the agent.
+
+## Detailed Process
+
+### Step 1: Identify Domain
+
+From user input, extract the main domain:
+
+| User Input | Identified Domain |
+|------------------|---------------------|
+| "US crop data" | Agriculture (US) |
+| "stock market analysis" | Finance / Stock Market |
+| "global climate data" | Climate / Meteorology |
+| "economic indicators" | Economy / Macro |
+| "commodity data" | Trading / Commodities |
+
+### Step 2: Search Available APIs
+
+For the identified domain, use WebSearch to find public APIs:
+
+**Search queries**:
+```
+"[domain] API free public data"
+"[domain] government API documentation"
+"best API for [domain] historical data"
+"[domain] open data sources"
+```
+
+**Example (US agriculture)**:
+```
+WebSearch: "US agriculture API free historical data"
+WebSearch: "USDA API documentation"
+WebSearch: "agricultural statistics API United States"
+```
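+
+The query templates above can be filled in mechanically once the domain is known. A minimal sketch, assuming a hypothetical helper named `build_search_queries` (not part of any existing tool), whose output would then be passed to WebSearch one query at a time:
+
+```python
+# Query templates copied from the "Search queries" list in this step.
+QUERY_TEMPLATES = [
+    "{domain} API free public data",
+    "{domain} government API documentation",
+    "best API for {domain} historical data",
+    "{domain} open data sources",
+]
+
+
+def build_search_queries(domain: str) -> list[str]:
+    """Fill every template with the identified domain (hypothetical helper)."""
+    cleaned = domain.strip()
+    return [template.format(domain=cleaned) for template in QUERY_TEMPLATES]
+
+
+queries = build_search_queries("US agriculture")
+```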
+
+**Typical result**: 5-10 candidate APIs
+
+### Step 3: Research Documentation
+
+For each candidate API, use WebFetch to load:
+- Homepage/overview
+- Getting started guide
+- API reference
+- Rate limits and pricing
+
+**Extract information**:
+
+```markdown
+## API 1: [Name]
+
+**URL**: [base URL]
+**Docs**: [docs URL]
+
+**Authentication**:
+- Type: API key / OAuth / None
+- Cost: Free / Paid
+- How to obtain: [steps]
+
+**Available Data**:
+- Temporal coverage: [from when to when]
+- Geographic coverage: [countries, regions]
+- Metrics: [list]
+- Granularity: [daily, monthly, annual]
+
+**Limitations**:
+- Rate limit: [requests per day/hour]
+- Max records: [per request]
+- Throttling: [yes/no]
+
+**Quality**:
+- Source: [official government / private]
+- Reliability: [high/medium/low]
+- Update frequency: [frequency]
+
+**Documentation**:
+- Quality: [excellent/good/poor]
+- Examples: [many/few/none]
+- SDKs: [Python/R/None]
+
+**Ease of Use**:
+- Format: JSON / CSV / XML
+- Structure: [simple/complex]
+- Quirks: [any strange behavior?]
+```
+
+### Step 4: API Capability Inventory (NEW v2.0 - CRITICAL!)
+
+**OBJECTIVE:** Ensure the skill uses 100% of API capabilities, not just the basics!
+
+**LEARNING:** us-crop-monitor v1.0 used only CONDITION (1 of the 5 high-value NASS metrics).
+v2.0 had to add PROGRESS, YIELD, PRODUCTION, AREA (+3,500 lines of rework).
+
+**Process:**
+
+**Step 4.1: Complete Inventory**
+
+For the chosen API, catalog ALL data types:
+
+```markdown
+## Complete Inventory - {API Name}
+
+**Available Metrics/Endpoints:**
+
+| Endpoint/Metric | Returns       | Granularity   | Coverage  | Value |
+|-----------------|---------------|---------------|-----------|-------|
+| {metric1}       | {description} | {daily/weekly}| {geo}     | ⭐⭐⭐⭐⭐ |
+| {metric2}       | {description} | {monthly}     | {geo}     | ⭐⭐⭐⭐⭐ |
+| {metric3}       | {description} | {annual}      | {geo}     | ⭐⭐⭐⭐  |
+...
+
+**Real Example (NASS):**
+
+| Metric Type    | Data               | Frequency | Value    | Implement? |
+|----------------|--------------------|-----------|----------|------------|
+| CONDITION      | Quality ratings    | Weekly    | ⭐⭐⭐⭐⭐ | ✅ YES     |
+| PROGRESS       | % planted/harvested| Weekly    | ⭐⭐⭐⭐⭐ | ✅ YES     |
+| YIELD          | Bu/acre            | Monthly   | ⭐⭐⭐⭐⭐ | ✅ YES     |
+| PRODUCTION     | Total bushels      | Monthly   | ⭐⭐⭐⭐⭐ | ✅ YES     |
+| AREA           | Acres planted      | Annual    | ⭐⭐⭐⭐  | ✅ YES     |
+| PRICE          | $/bushel           | Monthly   | ⭐⭐⭐    | ⚪ v2.0    |
+```
+
+**Step 4.2: Coverage Decision**
+
+**GOLDEN RULE:**
+- If metric has ⭐⭐⭐⭐ or ⭐⭐⭐⭐⭐ value → Implement in v1.0
+- If API has 5 high-value metrics → Implement all 5!
+- Never leave >50% of API unused without strong justification
+
+**Step 4.3: Document Decision**
+
+In DECISIONS.md:
+```markdown
+## API Coverage Decision
+
+API {name} offers {N} types of metrics.
+
+**Implemented in v1.0 ({X} of {N}):**
+- {metric1} - {justification}
+- {metric2} - {justification}
+...
+
+**Not implemented ({Y} of {N}):**
+- {metricZ} - {why not} (planned for v2.0)
+
+**Coverage:** {X/N * 100}% = {evaluation}
+- If < 70%: Clearly explain why low coverage
+- If ≥ 70%: ✅ Good coverage
+```
+
+**Output of this step:** Exact list of all `get_*()` methods to implement
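+
+The coverage rule from the capability inventory (implement every metric rated four stars or higher, then report X of N as a percentage) can be sketched as follows. The `Metric` class, the `plan_coverage` function, and the numeric star ratings are illustrative assumptions; the 70% threshold and the NASS ratings come from the inventory tables in this document.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class Metric:
+    name: str
+    stars: int  # value rating from the inventory table, 1 to 5
+
+
+def plan_coverage(metrics, min_stars=4, good_coverage_pct=70.0):
+    """Split metrics into v1.0 scope and deferred scope, and compute coverage."""
+    implement = [m.name for m in metrics if m.stars >= min_stars]
+    deferred = [m.name for m in metrics if m.stars < min_stars]
+    pct = 100.0 * len(implement) / len(metrics) if metrics else 0.0
+    return {
+        "implement": implement,
+        "deferred": deferred,
+        "coverage_pct": round(pct, 1),
+        # Low coverage must be explained in DECISIONS.md.
+        "needs_justification": pct < good_coverage_pct,
+    }
+
+
+# Ratings taken from the NASS example table.
+nass = [
+    Metric("CONDITION", 5),
+    Metric("PROGRESS", 5),
+    Metric("YIELD", 5),
+    Metric("PRODUCTION", 5),
+    Metric("AREA", 4),
+    Metric("PRICE", 3),
+]
+plan = plan_coverage(nass)
+```
+
+With the NASS ratings, five of six metrics land in v1.0 scope and PRICE is deferred, which matches the "Implement?" column of the example table.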
+
+### Step 4: Compare Options
+
+Create comparison table:
+
+| API | Coverage | Cost | Rate Limit | Quality | Docs | Ease | Score |
+|-----|-----------|-------|------------|-----------|------|------------|-------|
+| API 1 | ⭐⭐⭐⭐⭐ | Free | 1000/day | Official | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | 9.2/10 |
+| API 2 | ⭐⭐⭐⭐ | $49/mo | Unlimited | Private | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | 7.8/10 |
+| API 3 | ⭐⭐⭐ | Free | 100/day | Private | ⭐⭐ | ⭐⭐⭐ | 5.5/10 |
+
+**Scoring criteria**:
+- Coverage (fit with need): 30% weight
+- Cost (prefer free): 20% weight
+- Rate limit (sufficient?): 15% weight
+- Quality (official > private): 15% weight
+- Documentation (facilitates implementation): 10% weight
+- Ease of use (format, structure): 10% weight
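+
+The weighted scoring can be sketched in a few lines. This assumes each criterion has already been rated on a 0-10 scale; the document does not specify how star ratings map to numbers, so that mapping, the dictionary keys, and the function name `weighted_score` are illustrative assumptions. Only the weights come from the list above.
+
+```python
+# Weights from the scoring criteria above; they sum to 1.0.
+WEIGHTS = {
+    "coverage": 0.30,
+    "cost": 0.20,
+    "rate_limit": 0.15,
+    "quality": 0.15,
+    "documentation": 0.10,
+    "ease_of_use": 0.10,
+}
+
+
+def weighted_score(ratings: dict[str, float]) -> float:
+    """Combine per-criterion ratings (assumed 0-10) into one 0-10 score."""
+    missing = WEIGHTS.keys() - ratings.keys()
+    if missing:
+        raise ValueError(f"missing ratings for: {sorted(missing)}")
+    total = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
+    return round(total, 2)
+
+
+example = weighted_score({
+    "coverage": 10,
+    "cost": 10,
+    "rate_limit": 8,
+    "quality": 10,
+    "documentation": 8,
+    "ease_of_use": 10,
+})
+```
+
+Raising an error on a missing criterion keeps an incomplete evaluation from silently producing a low score.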
+
+### Step 5: DECIDE
+
+**Consider user constraints**:
+- Mentioned "free"? → Eliminate paid options
+- Mentioned "10+ years historical data"? → Check coverage
+- Mentioned "real-time"? → Prioritize streaming APIs
+
+**Apply logic**:
+1. Eliminate APIs that violate constraints
+2. Of the remaining APIs, choose the one with the highest score
+3. If tie, prefer:
+   - Official > private
+   - Better documentation
+   - Easier to use
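+
+The decision logic above (filter by constraints, take the highest score, break ties in a fixed order) can be sketched as follows. The `Candidate` fields and the single `require_free` constraint are illustrative assumptions; other user constraints would be added as further filters.
+
+```python
+from dataclasses import dataclass
+from typing import Optional
+
+
+@dataclass
+class Candidate:
+    name: str
+    score: float       # weighted score out of 10
+    is_free: bool
+    is_official: bool  # official source vs private provider
+    docs_rating: int   # 1 to 5
+    ease_rating: int   # 1 to 5
+
+
+def choose_api(candidates, require_free=False) -> Optional[Candidate]:
+    """Return the best candidate, or None if every one violates a constraint."""
+    # 1. Eliminate APIs that violate constraints.
+    eligible = [c for c in candidates if c.is_free or not require_free]
+    if not eligible:
+        return None
+    # 2 and 3. Highest score wins; ties fall through to official status,
+    # then documentation, then ease of use (tuple comparison, left to right).
+    return max(
+        eligible,
+        key=lambda c: (c.score, c.is_official, c.docs_rating, c.ease_rating),
+    )
+
+
+candidates = [
+    Candidate("API 1", 9.2, True, True, 4, 5),
+    Candidate("API 2", 7.8, False, False, 5, 4),
+    Candidate("API 3", 5.5, True, False, 2, 3),
+]
+best = choose_api(candidates, require_free=True)
+```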
+
+**FINAL DECISION**:
+
+```markdown
+## Selected API: [API Name]
+
+**Score**: X.X/10
+
+**Justification**:
+- ✅ Coverage: [specific details]
+- ✅ Cost: [free/paid + details]
+- ✅ Rate limit: [number] requests/day (sufficient for [estimated usage])
+- ✅ Quality: [official/private + reliability]
+- ✅ Documentation: [quality + examples]
+- ✅ Ease of use: [format, structure]
+
+**Fit with requirements**:
+- Constraint 1 (e.g., free): ✅ Met
+- Constraint 2 (e.g., 10+ years history): ✅ Met (since [year])
+- Primary need (e.g., crop production): ✅ Covered
+
+**Alternatives Considered**:
+
+**API X**: Score 7.5/10
+- Rejected because: [specific reason]
+- Trade-off: [what we lose vs gain]
+
+**API Y**: Score 6.2/10
+- Rejected because: [reason]
+
+**Conclusion**:
+[API Name] is the best option because [1-2 sentence synthesis].
+```
+
+### Step 6: Research Technical Details
+
+After deciding, dive deep into documentation:
+
+**Load via WebFetch**:
+- Getting started guide
+- Complete API reference
+- Authentication guide
+- Rate limiting details
+- Best practices
+
+**Extract for implementation**:
+
+```markdown
+## Technical Details - [API]
+
+### Authentication
+
+**Method**: API key in header
+**Header**: `X-Api-Key: YOUR_KEY`
+**Obtaining key**:
+1. [step 1]
+2. [step 2]
+3. [step 3]
+
+### Main Endpoints
+
+**Endpoint 1**: [Name]
+- **URL**: `GET https://api.example.com/v1/endpoint`
+- **Parameters**:
+  - `param1` (required): [description, type, example]
+  - `param2` (optional): [description, type, default]
+- **Response** (200 OK):
+```json
+{
+  "data": [...],
+  "meta": {...}
+}
+```
+- **Errors**:
+  - 400: [when occurs, how to handle]
+  - 401: [when occurs, how to handle]
+  - 429: [rate limit, how to handle]
+
+**Example request**:
+```bash
+curl -H "X-Api-Key: YOUR_KEY" \
+  "https://api.example.com/v1/endpoint?param1=value"
+```
+
+[Repeat for all relevant endpoints]
+
+### Rate Limiting
+
+- Limit: [number] requests per [period]
+- Response headers:
+  - `X-RateLimit-Limit`: Total limit
+  - `X-RateLimit-Remaining`: Remaining requests
+  - `X-RateLimit-Reset`: Reset timestamp
+- Behavior when exceeded: [429 error, throttling, ban?]
+- Best practice: [how to implement rate limiting]
+
+### Quirks and Gotchas
+
+**Quirk 1**: Values come as strings with formatting
+- Example: `"2,525,000"` instead of `2525000`
+- Solution: Remove commas before converting
+
+**Quirk 2**: Suppressed data marked as "(D)"
+- Meaning: Withheld to avoid disclosing data for individual operations
+- Solution: Treat as NULL, signal to user
+
+**Quirk 3**: [other non-obvious behavior]
+- Solution: [how to handle]
+
+### Performance Tips
+
+- Historical data doesn't change → cache permanently
+- Recent data may be revised → short cache (7 days)
+- Use pagination parameters if large response
+- Make parallel requests when possible (respecting rate limit)
+```
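+
+Quirks 1 and 2 in the template above are typically handled in one parsing function. A minimal sketch, assuming the API returns every value as a string; the function name `parse_value` and the `SUPPRESSED_MARKERS` set are hypothetical, and other APIs may use different suppression codes.
+
+```python
+from typing import Optional
+
+# Markers meaning "value withheld"; "(D)" is the one named in Quirk 2.
+SUPPRESSED_MARKERS = {"(D)"}
+
+
+def parse_value(raw: str) -> Optional[float]:
+    """Convert a formatted string such as "2,525,000" to a float.
+
+    Returns None for suppressed values so callers can report them
+    to the user instead of treating them as zero.
+    """
+    text = raw.strip()
+    if text in SUPPRESSED_MARKERS:
+        return None
+    # Remove thousands separators before converting.
+    return float(text.replace(",", ""))
+```
+
+Returning `None` instead of `0` keeps suppressed cells out of sums and averages unless the caller explicitly decides how to treat them.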
+
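+The rate-limit headers listed in the template can drive a simple wait-before-retry decision. A minimal sketch, assuming `X-RateLimit-Reset` holds a Unix timestamp in seconds (some APIs send seconds remaining instead, so check the documentation); the function name and the `fallback_wait` default are hypothetical.
+
+```python
+import time
+
+
+def seconds_until_retry(status_code, headers, now=None, fallback_wait=1.0):
+    """Return how many seconds to wait before the next request (0.0 if none)."""
+    now = time.time() if now is None else now
+    remaining = headers.get("X-RateLimit-Remaining")
+    exhausted = remaining is not None and int(remaining) <= 0
+    if status_code != 429 and not exhausted:
+        return 0.0
+    reset = headers.get("X-RateLimit-Reset")
+    if reset is None:
+        # Limit hit but no reset time given: wait a fixed fallback.
+        return fallback_wait
+    # Assumption: the reset header is an absolute Unix timestamp.
+    return max(0.0, float(reset) - now)
+```
+
+A caller would sleep for the returned duration after each response, which also covers the case where the quota reaches zero before a 429 is ever returned.
+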
+### Step 7: Document for Later Use
+
+Save everything in `references/api-guide.md` of the agent to be created.
+
+## Discovery Examples
+
+### Example 1: US Agriculture
+
+**Input**: "US crop data"
+
+**Research**:
+```
+WebSearch: "USDA API agricultural data"
+→ Found: NASS QuickStats, ERS, FAS
+
+WebFetch: https://quickstats.nass.usda.gov/api
+→ Free, data since 1866, 1000/day rate limit
+
+WebFetch: https://www.ers.usda.gov/developer/
+→ Free, economic focus, less granular
+
+WebFetch: https://apps.fas.usda.gov/api
+→ International focus, not domestic
+```
+
+**Comparison**:
+| API | Coverage (US domestic) | Cost | Production Data | Score |
+|-----|---------------------------|-------|-------------------|-------|
+| NASS | ⭐⭐⭐⭐⭐ (excellent) | Free | ⭐⭐⭐⭐⭐ | 9.5/10 |
+| ERS | ⭐⭐⭐⭐ (good) | Free | ⭐⭐⭐ (economic) | 7.0/10 |
+| FAS | ⭐⭐ (international) | Free | ⭐⭐ (global) | 4.0/10 |
+
+**DECISION**: NASS QuickStats API
+- Best coverage for US domestic agriculture
+- Free with reasonable rate limit
+- Complete production, area, yield data
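+
+Once NASS QuickStats is chosen, a first request can be assembled as below. This is a sketch only: it assumes the `api_GET` endpoint and the parameter names `commodity_desc`, `statisticcat_desc`, `agg_level_desc`, `year`, and `format` as described in the QuickStats documentation, all of which should be verified against the current docs. The helper name `build_quickstats_url` is hypothetical, and no request is actually sent.
+
+```python
+from urllib.parse import urlencode
+
+# Assumed base endpoint for QuickStats queries; confirm in the API docs.
+QUICKSTATS_GET = "https://quickstats.nass.usda.gov/api/api_GET/"
+
+
+def build_quickstats_url(api_key, commodity, statistic, year):
+    """Build a national-level query URL (hypothetical helper)."""
+    params = {
+        "key": api_key,
+        "commodity_desc": commodity,     # e.g. "CORN"
+        "statisticcat_desc": statistic,  # e.g. "YIELD"
+        "agg_level_desc": "NATIONAL",
+        "year": year,
+        "format": "JSON",
+    }
+    return QUICKSTATS_GET + "?" + urlencode(params)
+
+
+url = build_quickstats_url("YOUR_KEY", "CORN", "YIELD", 2023)
+```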
+
+### Example 2: Stock Market
+
+**Input**: "technical stock analysis"
+
+**Research**:
+```
+WebSearch: "stock market API free historical data"
+→ Alpha Vantage, Yahoo Finance, IEX Cloud, Polygon.io
+
+WebFetch: Alpha Vantage docs
+→ Free, 5 requests/min, 500/day
+
+WebFetch: Yahoo Finance (yfinance)
+→ Free, unlimited but unofficial
+
+WebFetch: IEX Cloud
+→ Freemium, good docs, 50k free credits/month
+```
+
+**Comparison**:
+| API | Data | Cost | Rate Limit | Official | Score |
+|-----|-------|-------|------------|---------|-------|
+| Alpha Vantage | Complete | Free | 500/day | ⭐⭐⭐ | 8.0/10 |
+| Yahoo Finance | Complete | Free | Unlimited | ❌ Unofficial | 7.5/10 |
+| IEX Cloud | Excellent | Freemium | 50k/month | ⭐⭐⭐⭐ | 8.5/10 |
+
+**DECISION**: IEX Cloud (free tier)
+- Official and reliable
+- 50k free credits/month sufficient
+- Excellent documentation
+- Complete data (OHLCV)
+
+### Example 3: Global Climate
+
+**Input**: "global climate data"
+
+**Research**:
+```
+WebSearch: "weather API historical data global"
+→ NOAA, OpenWeather, Weather.gov, Meteostat
+
+[Research each one...]
+```
+
+**DECISION**: NOAA Climate Data Online (CDO) API
+- Official (US government)
+- Free
+- Global and historical coverage (1900+)
+- Rate limit: 1000/day
+
+## Decision Documentation
+
+Create a `DECISIONS.md` file in the agent:
+
+```markdown
+# Architecture Decisions
+
+## Date: [creation date]
+
+## Phase 1: API Selection
+
+### Chosen API
+
+**[API Name]**
+
+### Selection Process
+
+**APIs Researched**: [list]
+
+**Evaluation Criteria**:
+1. Data coverage (fit with need)
+2. Cost (preference for free)
+3. Rate limits (viability)
+4. Quality (official > private)
+5. Documentation (facilitates development)
+6. Ease of use (format, structure)
+
+### Comparison
+
+[Comparison table]
+
+### Final Justification
+
+[2-3 paragraphs explaining why this API was chosen]
+
+### Trade-offs
+
+**What we gain**:
+- [benefit 1]
+- [benefit 2]
+
+**What we lose** (vs alternatives):
+- [accepted limitation 1]
+- [accepted limitation 2]
+
+### Technical Details
+
+[Summary of endpoints, authentication, rate limits, etc]
+
+**Complete documentation**: See `references/api-guide.md`
+```
+
+## Phase 1 Checklist
+
+Before proceeding to Phase 2, verify:
+
+- [ ] Research completed (WebSearch + WebFetch)
+- [ ] Minimum 3 APIs compared
+- [ ] Decision made with clear justification
+- [ ] User constraints respected
+- [ ] Technical details extracted
+- [ ] DECISIONS.md created
+- [ ] Ready for analysis design