# Phase 1: Discovery and API Research

## Objective

Research and **DECIDE** autonomously which API or data source to use for the agent.

## Detailed Process

### Step 1: Identify Domain

From user input, extract the main domain:

| User Input | Identified Domain |
|------------------|---------------------|
| "US crop data" | Agriculture (US) |
| "stock market analysis" | Finance / Stock Market |
| "global climate data" | Climate / Meteorology |
| "economic indicators" | Economy / Macro |
| "commodity data" | Trading / Commodities |

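The mapping in the table above can be approximated with a simple keyword lookup. The following is a minimal sketch, not a prescribed implementation: the `DOMAIN_KEYWORDS` table and the `identify_domain` function are hypothetical names, and the keyword lists are assumptions chosen only to cover the example inputs.

```python
from typing import Optional

# Hypothetical keyword table (an assumption); extend it as new domains come up.
DOMAIN_KEYWORDS = {
    "Agriculture (US)": ["crop", "agriculture", "farm"],
    "Finance / Stock Market": ["stock", "equity", "shares"],
    "Climate / Meteorology": ["climate", "weather"],
    "Economy / Macro": ["economic", "gdp", "inflation"],
    "Trading / Commodities": ["commodity", "commodities", "futures"],
}


def identify_domain(user_input: str) -> Optional[str]:
    """Return the domain with the most keyword hits, or None if nothing matches."""
    text = user_input.lower()
    best_domain, best_hits = None, 0
    for domain, keywords in DOMAIN_KEYWORDS.items():
        hits = sum(1 for keyword in keywords if keyword in text)
        if hits > best_hits:
            best_domain, best_hits = domain, hits
    return best_domain


domain = identify_domain("US crop data")  # "Agriculture (US)"
```

A `None` result is a signal to ask the user for clarification rather than guess a domain.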
### Step 2: Search Available APIs

For the identified domain, use WebSearch to find public APIs:

**Search queries**:
```
"[domain] API free public data"
"[domain] government API documentation"
"best API for [domain] historical data"
"[domain] open data sources"
```

**Example (US agriculture)**:
```
WebSearch: "US agriculture API free historical data"
WebSearch: "USDA API documentation"
WebSearch: "agricultural statistics API United States"
```

**Typical result**: 5-10 candidate APIs

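The query templates above can be filled in programmatically. This is a small sketch under the assumption that the domain is available as a plain string; `QUERY_TEMPLATES` and `build_search_queries` are hypothetical names used only for illustration.

```python
# The four templates listed under "Search queries", with [domain] as a placeholder.
QUERY_TEMPLATES = [
    "{domain} API free public data",
    "{domain} government API documentation",
    "best API for {domain} historical data",
    "{domain} open data sources",
]


def build_search_queries(domain: str) -> list[str]:
    """Substitute the domain into every query template."""
    return [template.format(domain=domain) for template in QUERY_TEMPLATES]


queries = build_search_queries("US agriculture")
```

Each resulting string would then be passed to WebSearch as a separate query.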
### Step 3: Research Documentation

For each candidate API, use WebFetch to load:
- Homepage/overview
- Getting started guide
- API reference
- Rate limits and pricing

**Extract information**:

```markdown
## API 1: [Name]

**URL**: [base URL]
**Docs**: [docs URL]

**Authentication**:
- Type: API key / OAuth / None
- Cost: Free / Paid
- How to obtain: [steps]

**Available Data**:
- Temporal coverage: [from when to when]
- Geographic coverage: [countries, regions]
- Metrics: [list]
- Granularity: [daily, monthly, annual]

**Limitations**:
- Rate limit: [requests per day/hour]
- Max records: [per request]
- Throttling: [yes/no]

**Quality**:
- Source: [official government / private]
- Reliability: [high/medium/low]
- Update frequency: [frequency]

**Documentation**:
- Quality: [excellent/good/poor]
- Examples: [many/few/none]
- SDKs: [Python/R/None]

**Ease of Use**:
- Format: JSON / CSV / XML
- Structure: [simple/complex]
- Quirks: [any strange behavior?]
```

### Step 4: API Capability Inventory (NEW v2.0 - CRITICAL!)

**OBJECTIVE:** Ensure the skill uses 100% of API capabilities, not just the basics!

**LEARNING:** us-crop-monitor v1.0 used only CONDITION (1 of 5 NASS metrics).
v2.0 had to add PROGRESS, YIELD, PRODUCTION, AREA (+3,500 lines of rework).

**Process:**

**Step 4.1: Complete Inventory**

For the chosen API, catalog ALL data types:

```markdown
## Complete Inventory - {API Name}

**Available Metrics/Endpoints:**

| Endpoint/Metric | Returns | Granularity | Coverage | Value |
|-----------------|---------------|----------------|-----------|-------|
| {metric1} | {description} | {daily/weekly} | {geo} | ⭐⭐⭐⭐⭐ |
| {metric2} | {description} | {monthly} | {geo} | ⭐⭐⭐⭐⭐ |
| {metric3} | {description} | {annual} | {geo} | ⭐⭐⭐⭐ |
...

**Real Example (NASS):**

| Metric Type | Data | Frequency | Value | Implement? |
|----------------|---------------------|-----------|----------|------------|
| CONDITION | Quality ratings | Weekly | ⭐⭐⭐⭐⭐ | ✅ YES |
| PROGRESS | % planted/harvested | Weekly | ⭐⭐⭐⭐⭐ | ✅ YES |
| YIELD | Bu/acre | Monthly | ⭐⭐⭐⭐⭐ | ✅ YES |
| PRODUCTION | Total bushels | Monthly | ⭐⭐⭐⭐⭐ | ✅ YES |
| AREA | Acres planted | Annual | ⭐⭐⭐⭐ | ✅ YES |
| PRICE | $/bushel | Monthly | ⭐⭐⭐ | ⚪ v2.0 |
```

**Step 4.2: Coverage Decision**

**GOLDEN RULE:**
- If metric has ⭐⭐⭐⭐ or ⭐⭐⭐⭐⭐ value → Implement in v1.0
- If API has 5 high-value metrics → Implement all 5!
- Never leave >50% of API unused without strong justification

**Step 4.3: Document Decision**

In DECISIONS.md:
```markdown
## API Coverage Decision

API {name} offers {N} types of metrics.

**Implemented in v1.0 ({X} of {N}):**
- {metric1} - {justification}
- {metric2} - {justification}
...

**Not implemented ({Y} of {N}):**
- {metricZ} - {why not} (planned for v2.0)

**Coverage:** {X/N * 100}% = {evaluation}
- If < 70%: Clearly explain why coverage is low
- If ≥ 70%: ✅ Good coverage
```

**Output of this step:** Exact list of all `get_*()` methods to implement

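The golden rule and the coverage formula from the capability inventory step can be expressed as a small helper. This is a sketch under stated assumptions: `Metric` and `plan_coverage` are hypothetical names, the star ratings are copied from the NASS example table, and the 70% threshold is the one from the DECISIONS.md template.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    stars: int  # value rating from the inventory table, 1 to 5


def plan_coverage(metrics: list[Metric], min_stars: int = 4) -> dict:
    """Split metrics into v1.0 scope and deferred scope, and compute coverage."""
    implement = [m.name for m in metrics if m.stars >= min_stars]
    defer = [m.name for m in metrics if m.stars < min_stars]
    coverage_pct = 100.0 * len(implement) / len(metrics) if metrics else 0.0
    return {
        "implement": implement,
        "defer": defer,
        "coverage_pct": round(coverage_pct, 1),
        # Below 70% the template asks for an explicit explanation.
        "needs_justification": coverage_pct < 70.0,
    }


nass = [
    Metric("CONDITION", 5),
    Metric("PROGRESS", 5),
    Metric("YIELD", 5),
    Metric("PRODUCTION", 5),
    Metric("AREA", 4),
    Metric("PRICE", 3),
]
plan = plan_coverage(nass)  # 5 of 6 metrics implemented, coverage_pct == 83.3
```

The names in `plan["implement"]` map directly onto the `get_*()` methods to build, and `plan["defer"]` feeds the "Not implemented" list in DECISIONS.md.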
### Step 4: Compare Options

Create a comparison table:

| API | Coverage | Cost | Rate Limit | Quality | Docs | Ease | Score |
|-----|-----------|-------|------------|-----------|------|------------|-------|
| API 1 | ⭐⭐⭐⭐⭐ | Free | 1000/day | Official | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | 9.2/10 |
| API 2 | ⭐⭐⭐⭐ | $49/mo | Unlimited | Private | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | 7.8/10 |
| API 3 | ⭐⭐⭐ | Free | 100/day | Private | ⭐⭐ | ⭐⭐⭐ | 5.5/10 |

**Scoring criteria**:
- Coverage (fit with need): 30% weight
- Cost (prefer free): 20% weight
- Rate limit (sufficient?): 15% weight
- Quality (official > private): 15% weight
- Documentation (facilitates implementation): 10% weight
- Ease of use (format, structure): 10% weight

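The weights above combine into a single score as a weighted sum. The sketch below assumes each criterion has already been rated on a 0-10 scale; how stars or labels such as "Free" and "Official" translate into those numbers is not specified in this guide, so the ratings in the example are illustrative. `WEIGHTS` and `weighted_score` are hypothetical names.

```python
# Weights taken from the scoring criteria list; they sum to 1.0.
WEIGHTS = {
    "coverage": 0.30,
    "cost": 0.20,
    "rate_limit": 0.15,
    "quality": 0.15,
    "documentation": 0.10,
    "ease_of_use": 0.10,
}


def weighted_score(ratings: dict[str, float]) -> float:
    """Combine per-criterion ratings (each 0-10) into one 0-10 score."""
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    total = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return round(total, 1)


# Illustrative ratings (assumptions, not measured values).
example = weighted_score({
    "coverage": 10,
    "cost": 10,
    "rate_limit": 8,
    "quality": 10,
    "documentation": 8,
    "ease_of_use": 10,
})  # 3.0 + 2.0 + 1.2 + 1.5 + 0.8 + 1.0 = 9.5
```

Raising on a missing criterion keeps an incomplete evaluation from silently scoring lower than it should.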
### Step 5: DECIDE

**Consider user constraints**:
- Mentioned "free"? → Eliminate paid options
- Mentioned "10+ years historical data"? → Check coverage
- Mentioned "real-time"? → Prioritize streaming APIs

**Apply logic**:
1. Eliminate APIs that violate constraints
2. Of those remaining, choose the highest score
3. If tied, prefer:
   - Official > private
   - Better documentation
   - Easier to use

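The elimination and tie-break logic above can be sketched as a filter followed by a sort key. The `Candidate` fields and the `choose_api` function are hypothetical; only the "free" and historical-coverage constraints are modeled here, and the "real-time" preference is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    score: float        # overall score, 0-10
    is_free: bool
    history_years: int  # years of historical data available
    is_official: bool
    docs_rating: int    # 1-5
    ease_rating: int    # 1-5


def choose_api(
    candidates: list[Candidate],
    require_free: bool = False,
    min_history_years: int = 0,
) -> Optional[Candidate]:
    """Drop candidates that violate constraints, then pick the best remaining one."""
    eligible = [
        c for c in candidates
        if (c.is_free or not require_free)
        and c.history_years >= min_history_years
    ]
    if not eligible:
        return None
    # Tuple order encodes the tie-breaks: score, official, docs, ease of use.
    return max(
        eligible,
        key=lambda c: (c.score, c.is_official, c.docs_rating, c.ease_rating),
    )
```

Returning `None` when every candidate is eliminated makes it explicit that the constraints need to be revisited with the user.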
**FINAL DECISION**:

```markdown
## Selected API: [API Name]

**Score**: X.X/10

**Justification**:
- ✅ Coverage: [specific details]
- ✅ Cost: [free/paid + details]
- ✅ Rate limit: [number] requests/day (sufficient for [estimated usage])
- ✅ Quality: [official/private + reliability]
- ✅ Documentation: [quality + examples]
- ✅ Ease of use: [format, structure]

**Fit with requirements**:
- Constraint 1 (e.g., free): ✅ Met
- Constraint 2 (e.g., 10+ years history): ✅ Met (since [year])
- Primary need (e.g., crop production): ✅ Covered

**Alternatives Considered**:

**API X**: Score 7.5/10
- Rejected because: [specific reason]
- Trade-off: [what we lose vs gain]

**API Y**: Score 6.2/10
- Rejected because: [reason]

**Conclusion**:
[API Name] is the best option because [1-2 sentence synthesis].
```

### Step 6: Research Technical Details

After deciding, dive deep into the documentation:

**Load via WebFetch**:
- Getting started guide
- Complete API reference
- Authentication guide
- Rate limiting details
- Best practices

**Extract for implementation**:

````markdown
## Technical Details - [API]

### Authentication

**Method**: API key in header
**Header**: `X-Api-Key: YOUR_KEY`
**Obtaining key**:
1. [step 1]
2. [step 2]
3. [step 3]

### Main Endpoints

**Endpoint 1**: [Name]
- **URL**: `GET https://api.example.com/v1/endpoint`
- **Parameters**:
  - `param1` (required): [description, type, example]
  - `param2` (optional): [description, type, default]
- **Response** (200 OK):
  ```json
  {
    "data": [...],
    "meta": {...}
  }
  ```
- **Errors**:
  - 400: [when occurs, how to handle]
  - 401: [when occurs, how to handle]
  - 429: [rate limit, how to handle]

**Example request**:
```bash
curl -H "X-Api-Key: YOUR_KEY" \
  "https://api.example.com/v1/endpoint?param1=value"
```

[Repeat for all relevant endpoints]

### Rate Limiting

- Limit: [number] requests per [period]
- Response headers:
  - `X-RateLimit-Limit`: Total limit
  - `X-RateLimit-Remaining`: Remaining requests
  - `X-RateLimit-Reset`: Reset timestamp
- Behavior when exceeded: [429 error, throttling, ban?]
- Best practice: [how to implement rate limiting]

### Quirks and Gotchas

**Quirk 1**: Values come as strings with formatting
- Example: `"2,525,000"` instead of `2525000`
- Solution: Remove commas before converting

**Quirk 2**: Suppressed data marked as "(D)"
- Meaning: Withheld to avoid disclosing data for individual operations
- Solution: Treat as NULL, signal to user

**Quirk 3**: [other non-obvious behavior]
- Solution: [how to handle]

### Performance Tips

- Historical data doesn't change → cache permanently
- Recent data may be revised → short cache (7 days)
- Use pagination parameters if large response
- Make parallel requests when possible (respecting rate limit)
````
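The two value quirks in the template (comma-formatted numbers and the `(D)` suppression marker) can be handled in one parsing helper. This is a minimal sketch assuming integer values; `parse_value` and `SUPPRESSED_MARKERS` are hypothetical names, and an API that returns decimals would need `float` instead of `int`.

```python
from typing import Optional

# Markers that mean "value withheld"; "(D)" is the one named in the template.
SUPPRESSED_MARKERS = {"(D)"}


def parse_value(raw: str) -> Optional[int]:
    """Convert a formatted string such as "2,525,000" to an int.

    Returns None for suppressed values so callers can treat them as NULL.
    """
    text = raw.strip()
    if text in SUPPRESSED_MARKERS:
        return None
    return int(text.replace(",", ""))


production = parse_value("2,525,000")  # 2525000
withheld = parse_value("(D)")          # None
```

Returning `None` instead of `0` keeps suppressed values from distorting sums and averages, and lets the caller tell the user that data was withheld.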
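For the rate-limiting section of the template, one possible approach is to wait until the reset time advertised by the API and then retry. The sketch below makes several assumptions: `X-RateLimit-Reset` holds a Unix timestamp in seconds (some APIs send a relative delay instead), header names match exactly as written, and `send` is a caller-supplied function returning `(status, headers, body)`. `seconds_until_reset` and `fetch_with_retry` are hypothetical names.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def seconds_until_reset(
    headers: Mapping[str, str], now: float, default: float = 60.0
) -> float:
    """Seconds to wait, assuming X-RateLimit-Reset is a Unix timestamp."""
    try:
        reset_at = float(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        return default  # header missing or unparseable: fall back to a fixed wait
    return max(0.0, reset_at - now)


def fetch_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> Response:
    """Call send() and retry after a pause whenever the API answers 429."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts:
            sleep(seconds_until_reset(headers, clock()))
    # Still rate-limited after the last attempt: hand the 429 back to the caller.
    return status, headers, body
```

Passing `sleep` and `clock` as parameters keeps the retry logic testable without real delays.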
### Step 7: Document for Later Use

Save all findings in `references/api-guide.md` inside the agent being created.

## Discovery Examples

### Example 1: US Agriculture

**Input**: "US crop data"

**Research**:
```
WebSearch: "USDA API agricultural data"
→ Found: NASS QuickStats, ERS, FAS

WebFetch: https://quickstats.nass.usda.gov/api
→ Free, data since 1866, 1000/day rate limit

WebFetch: https://www.ers.usda.gov/developer/
→ Free, economic focus, less granular

WebFetch: https://apps.fas.usda.gov/api
→ International focus, not domestic
```

**Comparison**:

| API | Coverage (US domestic) | Cost | Production Data | Score |
|-----|---------------------------|-------|-------------------|-------|
| NASS | ⭐⭐⭐⭐⭐ (excellent) | Free | ⭐⭐⭐⭐⭐ | 9.5/10 |
| ERS | ⭐⭐⭐⭐ (good) | Free | ⭐⭐⭐ (economic) | 7.0/10 |
| FAS | ⭐⭐ (international) | Free | ⭐⭐ (global) | 4.0/10 |

**DECISION**: NASS QuickStats API
- Best coverage for US domestic agriculture
- Free with reasonable rate limit
- Complete production, area, yield data

### Example 2: Stock Market

**Input**: "technical stock analysis"

**Research**:
```
WebSearch: "stock market API free historical data"
→ Alpha Vantage, Yahoo Finance, IEX Cloud, Polygon.io

WebFetch: Alpha Vantage docs
→ Free, 5 requests/min, 500/day

WebFetch: Yahoo Finance (yfinance)
→ Free, unlimited but unofficial

WebFetch: IEX Cloud
→ Freemium, good docs, 50k free credits/month
```

**Comparison**:

| API | Data | Cost | Rate Limit | Official | Score |
|-----|-------|-------|------------|---------|-------|
| Alpha Vantage | Complete | Free | 500/day | ⭐⭐⭐ | 8.0/10 |
| Yahoo Finance | Complete | Free | Unlimited | ❌ Unofficial | 7.5/10 |
| IEX Cloud | Excellent | Freemium | 50k/month | ⭐⭐⭐⭐ | 8.5/10 |

**DECISION**: IEX Cloud (free tier)
- Official and reliable
- 50k free credits/month is sufficient
- Excellent documentation
- Complete data (OHLCV: open, high, low, close, volume)

### Example 3: Global Climate

**Input**: "global climate data"

**Research**:
```
WebSearch: "weather API historical data global"
→ NOAA, OpenWeather, Weather.gov, Meteostat

[Research each one...]
```

**DECISION**: NOAA Climate Data Online (CDO) API
- Official (US government)
- Free
- Global and historical coverage (1900+)
- Rate limit: 1000/day

## Decision Documentation

Create a `DECISIONS.md` file in the agent:

```markdown
# Architecture Decisions

## Date: [creation date]

## Phase 1: API Selection

### Chosen API

**[API Name]**

### Selection Process

**APIs Researched**: [list]

**Evaluation Criteria**:
1. Data coverage (fit with need)
2. Cost (preference for free)
3. Rate limits (viability)
4. Quality (official > private)
5. Documentation (facilitates development)
6. Ease of use (format, structure)

### Comparison

[Comparison table]

### Final Justification

[2-3 paragraphs explaining why this API was chosen]

### Trade-offs

**What we gain**:
- [benefit 1]
- [benefit 2]

**What we lose** (vs alternatives):
- [accepted limitation 1]
- [accepted limitation 2]

### Technical Details

[Summary of endpoints, authentication, rate limits, etc.]

**Complete documentation**: See `references/api-guide.md`
```

## Phase 1 Checklist

Before proceeding to Phase 2, verify:

- [ ] Research completed (WebSearch + WebFetch)
- [ ] Minimum 3 APIs compared
- [ ] Decision made with clear justification
- [ ] User constraints respected
- [ ] Technical details extracted
- [ ] DECISIONS.md created
- [ ] Ready for analysis design