Initial commit

37 skills/mcp-skill-creator/references/example-config.json Normal file
@@ -0,0 +1,37 @@
{
  "_comment": "Example configuration for creating an MCP-powered skill",

  "mcp_servers": [
    {
      "name": "puppeteer",
      "command": ["npx", "-y", "@modelcontextprotocol/server-puppeteer"]
    },
    {
      "name": "twitter",
      "command": ["npx", "-y", "@modelcontextprotocol/server-twitter"]
    },
    {
      "name": "reddit",
      "command": ["npx", "-y", "@modelcontextprotocol/server-reddit"]
    }
  ],

  "workflow_description": "When I research a new internet product:\n\n1. First, I visit the official website and read about their features, pricing, and positioning\n2. Then I search Twitter for recent mentions and community sentiment\n3. I also check Reddit discussions in relevant subreddits like r/SaaS and r/ProductHunt\n4. Finally, I combine all findings into a comprehensive markdown report with:\n - Executive summary\n - Key features and differentiators\n - Pricing analysis\n - Community sentiment\n - Pros and cons from users\n - My recommendations",

  "preferences": [
    "I prefer to see quantitative metrics when available (user counts, ratings, conversion rates)",
    "I value recent information (last 6 months) much more than old reviews",
    "I like to highlight contradictions between what the company claims and what users actually experience",
    "For pricing, I want to see how it compares to direct competitors"
  ],

  "sop": [
    "Always start with official sources to establish ground truth",
    "Cross-reference official claims against community feedback",
    "Look for red flags like frequent complaints about specific features or missing functionality",
    "Prioritize reviews from actual users over promotional content or affiliate marketing",
    "Include direct quotes from notable reviews or discussions",
    "End report with actionable recommendations based on research"
  ]
}

367 skills/mcp-skill-creator/references/mcp-best-practices.md Normal file
@@ -0,0 +1,367 @@
# MCP Code Execution Best Practices

This reference document provides detailed guidance on implementing efficient MCP integrations using code execution patterns, based on [Anthropic's MCP engineering blog post](https://www.anthropic.com/engineering/code-execution-with-mcp).

## Core Principles

### 1. Progressive Disclosure

**Problem**: Loading all MCP tool definitions upfront wastes context window space.

**Solution**: Present tools as code APIs on a filesystem, allowing models to load only what they need.

```
scripts/
├── tools/
│   ├── google-drive/
│   │   ├── getDocument.ts
│   │   ├── listFiles.ts
│   │   └── index.ts
│   └── salesforce/
│       ├── updateRecord.ts
│       └── index.ts
```

**Benefits**:
- Reduces initial context from 150,000 tokens to 2,000 tokens (98.7% reduction)
- Scales to thousands of tools without overwhelming the model
- Tools loaded on-demand as needed

**Implementation**:
```python
import os

# Agent explores filesystem
tools_available = os.listdir('scripts/tools/google-drive/')

# Agent reads only needed tool definitions
with open('scripts/tools/google-drive/getDocument.ts') as f:
    tool_code = f.read()
```

### 2. Context-Efficient Data Handling

**Problem**: Intermediate results flowing through the context window consume excessive tokens.

**Bad Example**:
```
# Without code execution - all data flows through context
TOOL CALL: gdrive.getSheet(sheetId: 'abc123')
→ returns 10,000 rows to model
→ model filters in context
→ passes filtered data to next tool
```

**Good Example**:
```python
# With code execution - filter in execution environment
sheet_data = await gdrive.getSheet({'sheetId': 'abc123'})

# Filter in execution environment (no context cost)
pending_orders = [
    row for row in sheet_data
    if row['Status'] == 'pending' and row['Amount'] > 1000
]

# Only return summary to model
print(f"Found {len(pending_orders)} high-value pending orders")
print(pending_orders[:5])  # Show first 5 for review
```

**Benefits**:
- Processes 10,000 rows but only sends 5 to the model
- Reduces token usage by 99.5%
- Faster execution, lower costs

### 3. Parallel Execution

**Problem**: Sequential tool calls waste time when operations are independent.

**Bad Example**:
```python
# Sequential execution
twitter_data = await x_com.search_tweets(query)
# Wait for Twitter...
reddit_data = await reddit.search_discussions(query)
# Wait for Reddit...
```

**Good Example**:
```python
import asyncio

# Parallel execution with asyncio.gather()
twitter_task = x_com.search_tweets(query)
reddit_task = reddit.search_discussions(query)
producthunt_task = producthunt.search(query)

# Execute all concurrently
results = await asyncio.gather(
    twitter_task,
    reddit_task,
    producthunt_task
)

twitter_data, reddit_data, ph_data = results
```

**Benefits**:
- 3x faster execution (if all APIs take similar time)
- Better user experience
- Efficient resource utilization
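
One caveat with `asyncio.gather()`: by default, the first exception from any task propagates and the results of the other tasks are lost. A minimal sketch of a more forgiving variant, using the same hypothetical `x_com`, `reddit`, and `producthunt` wrappers as above:

```python
import asyncio

# return_exceptions=True keeps partial results when one source fails
results = await asyncio.gather(
    x_com.search_tweets(query),
    reddit.search_discussions(query),
    producthunt.search(query),
    return_exceptions=True
)

# Separate successful payloads from failures before further processing
data = [r for r in results if not isinstance(r, Exception)]
errors = [r for r in results if isinstance(r, Exception)]
print(f"{len(data)} sources succeeded, {len(errors)} failed")
```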

### 4. Complex Control Flow

**Problem**: Implementing loops and conditionals via sequential tool calls is inefficient.

**Bad Example**:
```
# Agent alternates between tool calls and sleep
TOOL CALL: slack.getMessages()
→ no deployment message
SLEEP: 5 seconds
TOOL CALL: slack.getMessages()
→ no deployment message
SLEEP: 5 seconds
# ... repeat many times
```

**Good Example**:
```python
import asyncio
import time

# Implement control flow in code
async def wait_for_deployment(channel: str, timeout: int = 300):
    start_time = time.time()

    while time.time() - start_time < timeout:
        messages = await slack.getChannelHistory(channel, limit=10)

        # Return the matching message, not just the most recent one
        match = next(
            (m for m in messages if 'deployment complete' in m['text'].lower()),
            None
        )
        if match:
            return {'status': 'success', 'message': match}

        await asyncio.sleep(10)

    return {'status': 'timeout'}
```

**Benefits**:
- Single code execution instead of 60+ tool calls
- Faster time to first token
- More reliable error handling

### 5. Privacy-Preserving Operations

**Problem**: Sensitive data flowing through model context raises privacy concerns.

**Solution**: Keep sensitive data in the execution environment and only share summaries with the model.

```python
# Load sensitive customer data
customers = await gdrive.getSheet({'sheetId': 'customer_contacts'})

# Process PII in execution environment (never shown to model)
for customer in customers:
    await salesforce.updateRecord({
        'objectType': 'Lead',
        'recordId': customer['salesforce_id'],
        'data': {
            'Email': customer['email'],  # PII stays in execution env
            'Phone': customer['phone'],  # PII stays in execution env
            'Name': customer['name']     # PII stays in execution env
        }
    })

# Only summary goes to model
print(f"Updated {len(customers)} customer records")
print("✓ All contact information synchronized")
```

**Optional Enhancement**: Tokenize PII automatically in the MCP client:
```python
# What model sees (if PII is tokenized):
[
    {'email': '[EMAIL_1]', 'phone': '[PHONE_1]', 'name': '[NAME_1]'},
    {'email': '[EMAIL_2]', 'phone': '[PHONE_2]', 'name': '[NAME_2]'}
]

# Real data flows Google Sheets → Salesforce without entering model context
```
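
A minimal sketch of how such client-side tokenization could work. This is illustrative only: the `tokenize`/`detokenize` helpers and the in-memory mapping are assumptions, not part of the MCP SDK.

```python
# Hypothetical client-side PII tokenizer: real values never reach the model
_pii_map: dict[str, str] = {}

def tokenize(value: str, kind: str) -> str:
    """Replace a sensitive value with a stable placeholder like [EMAIL_1]."""
    if value not in _pii_map:
        _pii_map[value] = f"[{kind}_{len(_pii_map) + 1}]"
    return _pii_map[value]

def detokenize(text: str) -> str:
    """Restore real values before forwarding text to downstream tools."""
    for real, placeholder in _pii_map.items():
        text = text.replace(placeholder, real)
    return text

# Example: mask rows before they are shown to the model
customers = [{'email': 'a@example.com'}, {'email': 'b@example.com'}]
masked = [{'email': tokenize(c['email'], 'EMAIL')} for c in customers]
print(masked)  # [{'email': '[EMAIL_1]'}, {'email': '[EMAIL_2]'}]
# detokenize() is applied inside the execution environment when calling the next tool
```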

### 6. State Persistence and Skills

**Problem**: Agents cannot build on previous work without memory.

**Solution**: Use the filesystem to persist intermediate results and reusable functions.

**State Persistence**:
```python
# Save intermediate results
import json

intermediate_data = await fetch_and_process()

with open('./workspace/state.json', 'w') as f:
    json.dump(intermediate_data, f)

# Later execution picks up where it left off
with open('./workspace/state.json') as f:
    state = json.load(f)
```

**Skill Evolution**:
```python
# Save reusable function as a skill
# In ./skills/save_sheet_as_csv.py (underscores so it can be imported as a module)
import pandas as pd
from scripts.tools import gdrive

async def save_sheet_as_csv(sheet_id: str, output_path: str):
    """
    Reusable function to export a Google Sheet as CSV
    """
    data = await gdrive.getSheet({'sheetId': sheet_id})
    df = pd.DataFrame(data)
    df.to_csv(output_path, index=False)
    return output_path

# Later, in any workflow:
from skills.save_sheet_as_csv import save_sheet_as_csv

csv_path = await save_sheet_as_csv('abc123', './data/export.csv')
```

**Add a SKILL.md** to create a structured skill:
````markdown
---
name: sheet-csv-exporter
description: Export Google Sheets to CSV format
---

# Sheet CSV Exporter

Provides a reusable function for exporting Google Sheets to CSV files.

## Usage

```python
from skills.save_sheet_as_csv import save_sheet_as_csv

csv_path = await save_sheet_as_csv(
    sheet_id='your-sheet-id',
    output_path='./output/data.csv'
)
```
````

## Token Usage Comparison

| Approach | Token Usage | Latency | Privacy |
|----------|-------------|---------|---------|
| **Direct Tool Calls** | 150,000+ tokens (all tool definitions loaded) | High (sequential calls) | ⚠️ All data through context |
| **Code Execution with MCP** | 2,000 tokens (load on demand) | Low (parallel execution) | ✅ Data filtered/tokenized |

**Savings**: 98.7% token reduction, 3-5x faster execution

## When to Use Code Execution

✅ **Use code execution when**:
- Working with many MCP tools (>10 tools)
- Processing large datasets (>1000 rows)
- Need parallel API calls
- Workflow involves loops/conditionals
- Privacy concerns with sensitive data
- Building reusable workflows

❌ **Avoid code execution when**:
- Simple single tool call
- Small data amounts
- Quick ad-hoc tasks
- No performance concerns
- Execution environment unavailable

## Implementation Considerations

### Security
- Sandbox the execution environment properly
- Limit resource usage (CPU, memory, time); see the sketch below
- Monitor for malicious code patterns
- Validate all inputs
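
A minimal sketch of one way to apply time and memory limits when running generated workflow code. This assumes a POSIX system; the limit values and the `untrusted_script.py` name are illustrative, and this is not a complete sandbox on its own.

```python
import resource
import subprocess

def limited(cpu_seconds: int = 30, memory_bytes: int = 512 * 1024 * 1024):
    """Apply CPU and address-space limits to the child process (POSIX only)."""
    def set_limits():
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))
    return set_limits

# Run generated workflow code in a separate, limited process with a wall-clock timeout
result = subprocess.run(
    ['python', 'untrusted_script.py'],
    preexec_fn=limited(),
    capture_output=True,
    timeout=60,
)
print(result.returncode, result.stdout[:200])
```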

### Error Handling
```python
import logging

logger = logging.getLogger(__name__)

# Inside an async workflow function:
try:
    result = await mcp_tool(params)
except Exception as e:
    # Log the error for debugging
    logger.error(f"MCP tool failed: {e}")
    # Return a graceful fallback instead of crashing the workflow
    return {'error': str(e), 'status': 'failed'}
```

### Testing
- Test scripts in isolation
- Mock MCP tool responses (see the sketch below)
- Verify error handling
- Check performance gains
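
A minimal sketch of mocking a generated wrapper with pytest so tests never start a real MCP server. The module names (`my_workflow`, its imported `twitter` wrapper) are placeholders for your own layout, and the `pytest-asyncio` plugin is assumed:

```python
import pytest
from unittest.mock import AsyncMock, patch

from my_workflow import gather_mentions  # hypothetical workflow under test

@pytest.mark.asyncio  # provided by the pytest-asyncio plugin
async def test_gather_mentions_counts_results():
    fake_tweets = [{'text': 'Great product'}, {'text': 'Not bad'}]
    # Replace the generated MCP wrapper so no real server is spawned
    with patch('my_workflow.twitter.search_tweets', new=AsyncMock(return_value=fake_tweets)):
        result = await gather_mentions('ExampleApp')
    assert result['mention_count'] == 2
```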

## Examples from Production

### Example 1: Document Processing Pipeline
```python
import asyncio

# extract_contract_data() and aggregate_contract_summary() are helpers defined elsewhere
async def process_contracts(folder_id: str):
    """Process all contracts in a folder"""
    # 1. List all files (single MCP call)
    files = await gdrive.listFiles({'folderId': folder_id})

    # 2. Filter in execution environment
    pdf_files = [f for f in files if f['type'] == 'pdf']

    # 3. Parallel processing
    results = await asyncio.gather(*[
        extract_contract_data(f['id'])
        for f in pdf_files
    ])

    # 4. Aggregate and save
    summary = aggregate_contract_summary(results)

    # Only summary to model
    return {
        'total_contracts': len(pdf_files),
        'processed': len(results),
        'summary': summary[:500]  # Truncate for context
    }
```

### Example 2: Social Media Monitoring
```python
import asyncio

async def monitor_brand_mentions(brand: str):
    """Monitor a brand across multiple platforms"""
    # Parallel fetch from multiple sources
    twitter_task = x_com.search_tweets(f'"{brand}"')
    reddit_task = reddit.search(brand, subreddits=['technology'])
    hn_task = hackernews.search(brand)

    mentions = await asyncio.gather(
        twitter_task, reddit_task, hn_task
    )

    # Sentiment analysis in execution environment
    sentiment = analyze_sentiment_batch(mentions)

    # Filter and aggregate
    recent_mentions = filter_last_24h(mentions)
    key_insights = extract_key_insights(recent_mentions)

    return {
        'mention_count': len(recent_mentions),
        'sentiment': sentiment,
        'key_insights': key_insights,
        'platforms': {
            'twitter': len(mentions[0]),
            'reddit': len(mentions[1]),
            'hackernews': len(mentions[2])
        }
    }
```

## Further Reading

- [MCP Official Documentation](https://modelcontextprotocol.io/)
- [Anthropic MCP Engineering Blog](https://www.anthropic.com/engineering/code-execution-with-mcp)
- [Cloudflare Code Mode](https://blog.cloudflare.com/code-mode/)

497 skills/mcp-skill-creator/references/quick-start.md Normal file
@@ -0,0 +1,497 @@
# Quick Start Guide

This guide will help you create your first MCP-powered skill using the mcp-skill-creator meta-skill.

## Prerequisites

1. Python 3.10 or higher
2. MCP SDK installed:
   ```bash
   pip install mcp --break-system-packages
   ```
3. Node.js (for running MCP servers via npx)

## Overview

Creating an MCP-powered skill is a two-phase process:

**Phase 1 (Programmatic)**: Generate MCP infrastructure using provided scripts
- Introspect MCP servers to discover tools
- Generate type-safe Python wrappers

**Phase 2 (LLM-Driven)**: Create the skill following skill-creator principles
- Analyze workflow and identify optimization opportunities
- Write workflow scripts combining MCP tools
- Embed user preferences and SOPs into SKILL.md
- Package the final skill

## Step-by-Step Process

### Step 1: Gather Input from User

Collect the following information:

**MCP Servers** (required):
```json
{
  "mcp_servers": [
    {
      "name": "puppeteer",
      "command": ["npx", "-y", "@modelcontextprotocol/server-puppeteer"]
    },
    {
      "name": "twitter",
      "command": ["npx", "-y", "@modelcontextprotocol/server-twitter"]
    }
  ]
}
```

**Workflow Description** (required):
- User's step-by-step workflow
- Can be natural language, numbered list, or sequential narrative
- Example: "First I visit the official website, then check Twitter and Reddit, finally create a report"

**Preferences** (optional):
- "I prefer quantitative metrics over qualitative descriptions"
- "Recent information is more valuable than old"
- "Always cross-reference claims"

**SOPs** (optional):
- "Start with official sources before checking community feedback"
- "Cite all sources in the final report"
- "Highlight contradictions between official and community perspectives"

### Step 2: Generate MCP Infrastructure (Programmatic)

#### 2.1 Create MCP Configuration File

```bash
# Save MCP server config
cat > mcp_config.json << EOF
{
  "servers": [
    {
      "name": "puppeteer",
      "command": ["npx", "-y", "@modelcontextprotocol/server-puppeteer"]
    },
    {
      "name": "twitter",
      "command": ["npx", "-y", "@modelcontextprotocol/server-twitter"]
    }
  ]
}
EOF
```

#### 2.2 Introspect MCP Servers

```bash
python scripts/mcp_introspector.py mcp_config.json introspection.json
```

This discovers all available tools from the MCP servers and saves them to `introspection.json`.
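
To sanity-check the result, you can peek at the file from Python. The exact schema is defined by `mcp_introspector.py`, so treat the field names below as assumptions:

```python
import json

# Peek at what the introspector discovered
with open('introspection.json') as f:
    introspection = json.load(f)

# Assumed shape: { "<server>": { "tools": [ {"name": ..., "inputSchema": ...}, ... ] } }
for server, info in introspection.items():
    tool_names = [t['name'] for t in info.get('tools', [])]
    print(f"{server}: {len(tool_names)} tools -> {tool_names}")
```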

#### 2.3 Generate Tool Wrappers

```bash
mkdir -p my-skill/scripts
python scripts/generate_mcp_wrappers.py introspection.json my-skill
```

This creates:
- `my-skill/scripts/mcp_client.py` - Base MCP client
- `my-skill/scripts/tools/<server>/<tool>.py` - Type-safe wrappers for each tool
- `my-skill/scripts/tools/<server>/__init__.py` - Package initialization

**Generated structure**:
```
my-skill/
└── scripts/
    ├── mcp_client.py
    └── tools/
        ├── puppeteer/
        │   ├── __init__.py
        │   ├── fetch_page.py
        │   └── screenshot.py
        └── twitter/
            ├── __init__.py
            └── search_tweets.py
```
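
Each generated wrapper is a thin typed shim over the base client. The exact code depends on `generate_mcp_wrappers.py`; a wrapper such as `scripts/tools/twitter/search_tweets.py` might look roughly like this (the `call_tool` helper name is an assumption):

```python
# scripts/tools/twitter/search_tweets.py (illustrative sketch, not the actual generated code)
from scripts.mcp_client import call_tool  # assumed helper exposed by the generated base client

async def search_tweets(query: str, max_results: int = 50) -> list[dict]:
    """Call the 'search_tweets' tool on the 'twitter' MCP server."""
    return await call_tool(
        server='twitter',
        tool='search_tweets',
        arguments={'query': query, 'max_results': max_results},
    )
```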

### Step 3: Analyze Workflow (LLM-Driven)

Now use Claude to analyze the user's workflow description:

**Ask yourself**:
- What are the distinct workflow steps?
- Which steps are data fetching (candidates for parallelization)?
- Which steps are data processing (candidates for execution environment filtering)?
- Are there any loops, conditionals, or polling patterns?
- What intermediate state needs to be preserved?

**Example Analysis**:

User workflow: "Research products by checking official site, Twitter, Reddit, then create report"

Analysis:
- **Step 1**: Fetch official website (data fetch)
- **Step 2**: Search Twitter (data fetch - can parallelize with step 3)
- **Step 3**: Search Reddit (data fetch - can parallelize with step 2)
- **Step 4**: Aggregate data (data processing - filter in execution env)
- **Step 5**: Generate report (output)

Optimization opportunities:
- Parallel execution: Steps 2-3
- Data filtering: Step 4 (process 1000s of posts, return top 10)
- Context efficiency: Return summary, not raw data

### Step 4: Plan Skill Contents (LLM-Driven)

Based on the workflow analysis, decide what to include:

**Should you create a script?**
- ✅ Yes: Multi-step workflow, parallel execution opportunities, data filtering needed
- ❌ No: Single tool call, high variability, exploratory task

**For each script, plan**:
- Purpose and scope
- Which MCP tools it uses
- Optimization patterns (parallel, filtering, control flow)
- Parameters and return value

**Example Plan**:

Script: `scripts/workflows/product_research_pipeline.py`
- Purpose: Complete product research workflow
- MCP tools: puppeteer.fetch_page, twitter.search_tweets, reddit.search_discussions
- Optimizations:
  - Parallel execution of Twitter + Reddit searches
  - Data filtering: Extract top 10 insights from 1000+ posts
  - Return summary, not raw data
- Parameters: product_url, product_name
- Returns: {features, sentiment, highlights}

### Step 5: Implement the Skill (LLM-Driven)

#### 5.1 Create Workflow Scripts

Write `my-skill/scripts/workflows/product_research_pipeline.py`:

```python
import asyncio

from scripts.tools import puppeteer, twitter, reddit


async def product_research_pipeline(product_url: str, product_name: str):
    """
    Complete product research workflow

    Optimizations:
    - Parallel Twitter + Reddit searches (2x faster)
    - Data filtering in execution environment
    - Returns summary, not raw data
    """

    # Fetch official website
    official = await puppeteer.fetch_page(product_url)

    # Parallel social media research
    twitter_data, reddit_data = await asyncio.gather(
        twitter.search_tweets(f'"{product_name}"'),
        reddit.search_discussions(product_name, subreddits=['SaaS'])
    )

    # Process in execution environment (not in context)
    key_features = extract_features(official, top_n=10)
    sentiment = analyze_sentiment([twitter_data, reddit_data])
    highlights = extract_highlights(twitter_data + reddit_data, top_n=5)

    # Return summary only
    return {
        'key_features': key_features,
        'sentiment': sentiment,
        'highlights': highlights,
        'mention_count': len(twitter_data) + len(reddit_data)
    }


def extract_features(html: str, top_n: int) -> list:
    """Extract key features from website HTML"""
    # TODO: Implement feature extraction logic
    return []


def analyze_sentiment(social_data: list) -> dict:
    """Analyze sentiment from social media posts"""
    # TODO: Implement sentiment analysis
    return {'score': 0, 'summary': ''}


def extract_highlights(posts: list, top_n: int) -> list:
    """Extract most relevant highlights from posts"""
    # TODO: Implement highlight extraction
    return []
```

**Key principles**:
- Use `async`/`await` for IO-bound operations
- Combine related MCP calls into single scripts
- Filter/aggregate data in the execution environment
- Return summaries, not raw data
- Include helper functions for data processing

#### 5.2 Write SKILL.md

Create `my-skill/SKILL.md` embedding user preferences and SOPs:

````markdown
---
name: product-research-workflow
description: Automated product research integrating official sources and social platforms with emphasis on quantitative metrics and recent information
---

# Product Research Workflow

Efficiently research internet products by gathering data from official sources
and social platforms, following your standard research methodology.

## Workflow Overview

This skill implements your research process with built-in optimizations:

1. **Official Source Analysis**: Visit product website to extract key features
   and positioning (your SOP: always start with official sources)

2. **Social Intelligence Gathering**: Search Twitter and Reddit in parallel
   for community feedback (optimized: 2x faster than sequential)

3. **Cross-Reference Analysis**: Identify contradictions between official claims
   and community feedback (your preference: highlight discrepancies)

4. **Report Generation**: Create comprehensive report emphasizing quantitative
   metrics like ratings and user counts (your preference: quant > qual)

## Quick Start

```python
from scripts.workflows import product_research_pipeline

report = await product_research_pipeline(
    product_url='https://example.com',
    product_name='ExampleApp'
)
```

## Available Workflows

### product_research_pipeline

**Use when**: Researching any new internet product or SaaS tool

**Location**: `scripts/workflows/product_research_pipeline.py`

**Optimizations**:
- 2x faster via parallel social media gathering
- Context-efficient: processes 1000s of posts, returns top 10 insights
- Recent info prioritized (aligns with your preference)

**Usage**:
```python
from scripts.workflows import product_research_pipeline

result = await product_research_pipeline(
    product_url='https://product.com',
    product_name='ProductName'
)

# Result structure:
# {
#     'key_features': ['Feature 1', 'Feature 2', ...],
#     'sentiment': {'score': 0.75, 'summary': '...'},
#     'highlights': ['Highlight 1', ...],
#     'mention_count': 247
# }
```

## MCP Tools Available

### puppeteer

**Tools**: 5 available

**Location**: `scripts/tools/puppeteer/`

**Key tools**: fetch_page, screenshot, pdf_export

**Discovery**: Use `ls scripts/tools/puppeteer/` to see all tools

### twitter

**Tools**: 3 available

**Location**: `scripts/tools/twitter/`

**Key tools**: search_tweets, get_user_timeline

**Discovery**: Use `ls scripts/tools/twitter/` to see all tools

## Performance Notes

- **Token reduction**: 98.7% fewer tokens vs loading all tools upfront
- **Speed**: 2x faster via parallel execution
- **Context efficiency**: Processes large datasets, returns summaries

## Advanced Usage

For custom workflows, combine individual MCP tools:

```python
from scripts.tools import puppeteer, twitter

# Custom combination
official_data = await puppeteer.fetch_page(url)
tweets = await twitter.search_tweets(query)

# Your own processing logic...
```
````

**Critical**: Notice how preferences and SOPs are embedded into the workflow description, not as separate sections.

### Step 6: Package and Deliver

Once the skill is complete:

```bash
python /mnt/skills/public/skill-creator/scripts/package_skill.py my-skill
```

This creates a `my-skill.skill` file ready for distribution.

## Complete Example

Let's create a product research skill from start to finish:

### 1. Gather Input

User provides:
```
MCP Servers: puppeteer, twitter
Workflow: "Visit official site, check Twitter, create report"
Preferences: "Quantitative metrics preferred, recent info valued"
SOPs: "Always start with official sources"
```

### 2. Generate Infrastructure

```bash
# Create config
cat > mcp_config.json << 'EOF'
{
  "servers": [
    {"name": "puppeteer", "command": ["npx", "-y", "@modelcontextprotocol/server-puppeteer"]},
    {"name": "twitter", "command": ["npx", "-y", "@modelcontextprotocol/server-twitter"]}
  ]
}
EOF

# Introspect
python scripts/mcp_introspector.py mcp_config.json introspection.json

# Generate wrappers
mkdir -p product-research-skill
python scripts/generate_mcp_wrappers.py introspection.json product-research-skill
```

### 3-5. Create Skill (LLM)

Claude analyzes the workflow, creates the workflow script, and writes SKILL.md with embedded preferences.

### 6. Package

```bash
python /mnt/skills/public/skill-creator/scripts/package_skill.py product-research-skill
```

Done! You now have `product-research-skill.skill`.

## Tips for Success

### Writing Good Workflow Descriptions

✅ **Good**:
- "First I visit the official website, then check Twitter and Reddit, finally create a report"
- Numbered steps with clear actions
- Mentions data sources explicitly

❌ **Bad**:
- "I research products" (too vague)
- No clear sequence
- Missing data sources

### Embedding Preferences Effectively

✅ **Good**: Weave into workflow guidance
```markdown
This skill gathers quantitative metrics (ratings, user counts) from multiple
sources, prioritizing recent information over older reviews.
```

❌ **Bad**: Separate section
```markdown
## User Preferences
- Likes quantitative metrics
- Prefers recent info
```

### When to Create Scripts vs Guidance

**Create Scripts**:
- 3+ step workflows
- Parallel execution opportunities
- Data filtering needs
- Repeated code patterns

**Use Text Guidance**:
- Ad-hoc exploration
- High variability
- Simple tool usage
- Flexibility needed

## Troubleshooting

### MCP Connection Errors

```bash
# Test MCP server manually
npx -y @modelcontextprotocol/server-puppeteer

# Check output for errors
```

### Import Errors

```bash
# Ensure MCP SDK installed
pip install mcp --break-system-packages

# Verify Python path
export PYTHONPATH="${PYTHONPATH}:$(pwd)/my-skill"
```

### Generated Code Issues

- Review generated wrappers in `scripts/tools/`
- Customize as needed (they're templates)
- Test scripts before packaging (see the sketch below)
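
For example, a quick smoke test of the workflow script before packaging, assuming the `my-skill` layout above and `PYTHONPATH` set as shown under Import Errors (the test values and assertions are illustrative):

```python
# smoke_test.py - minimal sanity check, not a full test suite
import asyncio

# Import the function from its module; adjust if your package re-exports it
from scripts.workflows.product_research_pipeline import product_research_pipeline

async def main():
    result = await product_research_pipeline(
        product_url='https://example.com',
        product_name='ExampleApp'
    )
    # Expect the summary keys, even if the helper stubs still return empty values
    assert {'key_features', 'sentiment', 'highlights', 'mention_count'}.issubset(result)
    print('Smoke test passed:', result['mention_count'], 'mentions')

if __name__ == '__main__':
    asyncio.run(main())
```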

## Next Steps

- Read `references/mcp-best-practices.md` for advanced patterns
- Explore `references/example-config.json` for a complete example
- Customize the generated scripts for your specific needs
- Build a library of workflow skills!

## Getting Help

- MCP best practices: `references/mcp-best-practices.md`
- [MCP Documentation](https://modelcontextprotocol.io/)
- [Anthropic's MCP Blog](https://www.anthropic.com/engineering/code-execution-with-mcp)