# MCP Code Execution Best Practices

This reference document provides detailed guidance on implementing efficient MCP integrations using code execution patterns, based on [Anthropic's MCP engineering blog post](https://www.anthropic.com/engineering/code-execution-with-mcp).

## Core Principles

### 1. Progressive Disclosure

**Problem**: Loading all MCP tool definitions upfront wastes context window space.

**Solution**: Present tools as code APIs on a filesystem, allowing models to load only what they need.

```
scripts/
├── tools/
│   ├── google-drive/
│   │   ├── getDocument.ts
│   │   ├── listFiles.ts
│   │   └── index.ts
│   └── salesforce/
│       ├── updateRecord.ts
│       └── index.ts
```

**Benefits**:
- Reduces initial context from 150,000 tokens to 2,000 tokens (98.7% reduction)
- Scales to thousands of tools without overwhelming the model
- Tools are loaded on demand

**Implementation**:

```python
import os

# Agent explores the filesystem to discover available tools
tools_available = os.listdir('scripts/tools/google-drive/')

# Agent reads only the tool definitions it needs
with open('scripts/tools/google-drive/getDocument.ts') as f:
    tool_code = f.read()
```

### 2. Context-Efficient Data Handling

**Problem**: Intermediate results flowing through the context window consume excessive tokens.
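To put a rough number on this problem, the sketch below estimates how many tokens a 10,000-row result would occupy if it passed through context, compared with a five-row preview. The rows and column names are synthetic, and the four-characters-per-token ratio is a common rule of thumb assumed here for illustration, not a measurement from any particular tokenizer.

```python
import json

CHARS_PER_TOKEN = 4  # Assumption: rough rule of thumb, not a real tokenizer


def estimate_tokens(payload) -> int:
    """Approximate the token count of a JSON-serialized payload."""
    return len(json.dumps(payload)) // CHARS_PER_TOKEN


# Synthetic stand-in for a large spreadsheet result (hypothetical columns)
rows = [
    {"OrderId": i, "Status": "pending" if i % 2 else "shipped", "Amount": i * 10}
    for i in range(10_000)
]

full_cost = estimate_tokens(rows)         # every row passes through context
preview_cost = estimate_tokens(rows[:5])  # only a short preview is returned

print(f"Full result: ~{full_cost} tokens")
print(f"5-row preview: ~{preview_cost} tokens")
```

Swapping in a real tokenizer would change the absolute numbers, but the large gap between the two estimates should remain.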
**Bad Example**:

```
# Without code execution - all data flows through context
TOOL CALL: gdrive.getSheet(sheetId: 'abc123')
  → returns 10,000 rows to model
  → model filters in context
  → passes filtered data to next tool
```

**Good Example**:

```python
# With code execution - filter in execution environment
sheet_data = await gdrive.getSheet({'sheetId': 'abc123'})

# Filter in execution environment (no context cost)
pending_orders = [
    row for row in sheet_data
    if row['Status'] == 'pending' and row['Amount'] > 1000
]

# Only return summary to model
print(f"Found {len(pending_orders)} high-value pending orders")
print(pending_orders[:5])  # Show first 5 for review
```

**Benefits**:
- Processes 10,000 rows but sends only 5 to the model
- Cuts the rows reaching the model by more than 99% (5 of 10,000), with a comparable drop in token usage
- Faster execution, lower costs

### 3. Parallel Execution

**Problem**: Sequential tool calls waste time when operations are independent.

**Bad Example**:

```python
# Sequential execution
twitter_data = await x_com.search_tweets(query)        # Wait for Twitter...
reddit_data = await reddit.search_discussions(query)   # Wait for Reddit...
```

**Good Example**:

```python
import asyncio

# Parallel execution with asyncio.gather()
twitter_task = x_com.search_tweets(query)
reddit_task = reddit.search_discussions(query)
producthunt_task = producthunt.search(query)

# Execute all concurrently
results = await asyncio.gather(
    twitter_task,
    reddit_task,
    producthunt_task
)
twitter_data, reddit_data, ph_data = results
```

**Benefits**:
- Up to 3x faster execution for three independent calls (if all APIs take similar time)
- Better user experience
- Efficient resource utilization

### 4. Complex Control Flow

**Problem**: Implementing loops and conditionals via sequential tool calls is inefficient.

**Bad Example**:

```
# Agent alternates between tool calls and sleep
TOOL CALL: slack.getMessages() → no deployment message
SLEEP: 5 seconds
TOOL CALL: slack.getMessages() → no deployment message
SLEEP: 5 seconds
# ... repeat many times
```

**Good Example**:

```python
import asyncio
import time

# Implement control flow in code
async def wait_for_deployment(channel: str, timeout: int = 300):
    start_time = time.monotonic()
    while time.monotonic() - start_time < timeout:
        messages = await slack.getChannelHistory(channel, limit=10)
        # Return the message that actually announces the deployment
        for m in messages:
            if 'deployment complete' in m['text'].lower():
                return {'status': 'success', 'message': m}
        await asyncio.sleep(10)
    return {'status': 'timeout'}
```

**Benefits**:
- Single code execution instead of 60+ tool calls
- Faster time to first token
- More reliable error handling

### 5. Privacy-Preserving Operations

**Problem**: Sensitive data flowing through model context raises privacy concerns.

**Solution**: Keep sensitive data in the execution environment and share only summaries with the model.

```python
# Load sensitive customer data
customers = await gdrive.getSheet({'sheetId': 'customer_contacts'})

# Process PII in execution environment (never shown to model)
for customer in customers:
    await salesforce.updateRecord({
        'objectType': 'Lead',
        'recordId': customer['salesforce_id'],
        'data': {
            'Email': customer['email'],  # PII stays in execution env
            'Phone': customer['phone'],  # PII stays in execution env
            'Name': customer['name']     # PII stays in execution env
        }
    })

# Only summary goes to model
print(f"Updated {len(customers)} customer records")
print("✓ All contact information synchronized")
```

**Optional Enhancement**: Tokenize PII automatically in the MCP client:

```python
# What model sees (if PII is tokenized):
[
    {'email': '[EMAIL_1]', 'phone': '[PHONE_1]', 'name': '[NAME_1]'},
    {'email': '[EMAIL_2]', 'phone': '[PHONE_2]', 'name': '[NAME_2]'}
]
# Real data flows Google Sheets → Salesforce without entering model context
```

### 6. State Persistence and Skills

**Problem**: Agents cannot build on previous work without memory.

**Solution**: Use the filesystem to persist intermediate results and reusable functions.
**State Persistence**:

```python
import json

# Save intermediate results
intermediate_data = await fetch_and_process()
with open('./workspace/state.json', 'w') as f:
    json.dump(intermediate_data, f)

# Later execution picks up where it left off
with open('./workspace/state.json') as f:
    state = json.load(f)
```

**Skill Evolution**:

```python
# Save reusable function as a skill
# In ./skills/save_sheet_as_csv.py (underscores, so the module is importable)
import pandas as pd
from scripts.tools import gdrive

async def save_sheet_as_csv(sheet_id: str, output_path: str):
    """
    Reusable function to export Google Sheet as CSV
    """
    data = await gdrive.getSheet({'sheetId': sheet_id})
    df = pd.DataFrame(data)
    df.to_csv(output_path, index=False)
    return output_path

# Later, in any workflow:
from skills.save_sheet_as_csv import save_sheet_as_csv
csv_path = await save_sheet_as_csv('abc123', './data/export.csv')
```

**Add SKILL.md** to create a structured skill:

````markdown
---
name: sheet-csv-exporter
description: Export Google Sheets to CSV format
---

# Sheet CSV Exporter

Provides a reusable function for exporting Google Sheets to CSV files.

## Usage

```python
from skills.save_sheet_as_csv import save_sheet_as_csv

csv_path = await save_sheet_as_csv(
    sheet_id='your-sheet-id',
    output_path='./output/data.csv'
)
```
````

## Token Usage Comparison

| Approach | Token Usage | Latency | Privacy |
|----------|-------------|---------|---------|
| **Direct Tool Calls** | 150,000+ tokens (all tool definitions loaded) | High (sequential calls) | ⚠️ All data through context |
| **Code Execution with MCP** | 2,000 tokens (load on demand) | Low (parallel execution) | ✅ Data filtered/tokenized |

**Savings**: 98.7% token reduction, 3-5x faster execution

## When to Use Code Execution

✅ **Use code execution when**:
- Working with many MCP tools (>10 tools)
- Processing large datasets (>1000 rows)
- Making parallel API calls
- The workflow involves loops or conditionals
- Sensitive data raises privacy concerns
- Building reusable workflows

❌ **Avoid code execution when**:
- A single simple tool call is enough
- Data volumes are small
- The task is quick and ad hoc
- Performance is not a concern
- No execution environment is available

## Implementation Considerations

### Security

- Sandbox the execution environment properly
- Limit resource usage (CPU, memory, time)
- Monitor for malicious code patterns
- Validate all inputs

### Error Handling

```python
import logging

logger = logging.getLogger(__name__)

async def call_tool_safely(mcp_tool, params):
    try:
        return await mcp_tool(params)
    except Exception as e:
        # Log the error, then return a graceful fallback
        logger.error(f"MCP tool failed: {e}")
        return {'error': str(e), 'status': 'failed'}
```

### Testing

- Test scripts in isolation
- Mock MCP tool responses
- Verify error handling
- Check performance gains

## Examples from Production

### Example 1: Document Processing Pipeline

```python
import asyncio

async def process_contracts(folder_id: str):
    """Process all contracts in a folder"""
    # 1. List all files (single MCP call)
    files = await gdrive.listFiles({'folderId': folder_id})

    # 2. Filter in execution environment
    pdf_files = [f for f in files if f['type'] == 'pdf']

    # 3. Parallel processing
    results = await asyncio.gather(*[
        extract_contract_data(f['id'])
        for f in pdf_files
    ])

    # 4. Aggregate and save
    summary = aggregate_contract_summary(results)

    # Only summary to model
    return {
        'total_contracts': len(pdf_files),
        'processed': len(results),
        'summary': summary[:500]  # Truncate for context
    }
```

### Example 2: Social Media Monitoring

```python
import asyncio

async def monitor_brand_mentions(brand: str):
    """Monitor brand across multiple platforms"""
    # Parallel fetch from multiple sources
    twitter_task = x_com.search_tweets(f'"{brand}"')
    reddit_task = reddit.search(brand, subreddits=['technology'])
    hn_task = hackernews.search(brand)

    mentions = await asyncio.gather(
        twitter_task, reddit_task, hn_task
    )

    # Sentiment analysis in execution environment
    sentiment = analyze_sentiment_batch(mentions)

    # Filter and aggregate
    recent_mentions = filter_last_24h(mentions)
    key_insights = extract_key_insights(recent_mentions)

    return {
        'mention_count': len(recent_mentions),
        'sentiment': sentiment,
        'key_insights': key_insights,
        'platforms': {
            'twitter': len(mentions[0]),
            'reddit': len(mentions[1]),
            'hackernews': len(mentions[2])
        }
    }
```

## Further Reading

- [MCP Official Documentation](https://modelcontextprotocol.io/)
- [Anthropic MCP Engineering Blog](https://www.anthropic.com/engineering/code-execution-with-mcp)
- [Cloudflare Code Mode](https://blog.cloudflare.com/code-mode/)