Code Interpreter Guide
Complete guide to using Code Interpreter with the Assistants API.
What is Code Interpreter?
A built-in tool that executes Python code in a sandboxed environment, enabling:
- Data analysis and processing
- Mathematical computations
- Chart and graph generation
- File parsing (CSV, JSON, Excel, etc.)
- Data transformations
Setup
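import OpenAI from "openai";

// Reads the API key from the OPENAI_API_KEY environment variable by default
const openai = new OpenAI();

const assistant = await openai.beta.assistants.create({
  name: "Data Analyst",
  instructions: "You analyze data and create visualizations.",
  tools: [{ type: "code_interpreter" }],
  model: "gpt-4o",
});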
File Uploads
Upload Data Files
import fs from "fs";

const file = await openai.files.create({
  file: fs.createReadStream("data.csv"),
  purpose: "assistants",
});
Attach to Messages
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Analyze this sales data",
  attachments: [{
    file_id: file.id,
    tools: [{ type: "code_interpreter" }],
  }],
});
Supported File Formats
Data Files:
- .csv, .json, .xlsx - Tabular data
- .txt, .md - Text files
- .pdf, .docx, .pptx - Documents (text extraction)
Code Files:
- .py, .js, .ts, .java, .cpp - Source code
Images (for processing, not vision):
- .png, .jpg, .jpeg, .gif - Image manipulation
Archives:
- .zip, .tar - Compressed files
Size Limit: 512 MB per file
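The format list and size limit above can be checked locally before spending an upload. The following is a minimal sketch, not part of the SDK: `checkUpload` is a hypothetical helper, and the extension set simply mirrors the list in this guide rather than an authoritative list from the API.

```typescript
// Extensions copied from the "Supported File Formats" list in this guide.
const SUPPORTED_EXTENSIONS = new Set<string>([
  ".csv", ".json", ".xlsx", ".txt", ".md", ".pdf", ".docx", ".pptx",
  ".py", ".js", ".ts", ".java", ".cpp",
  ".png", ".jpg", ".jpeg", ".gif",
  ".zip", ".tar",
]);

// 512 MB per file, as stated in this guide.
const MAX_FILE_BYTES = 512 * 1024 * 1024;

// Hypothetical helper: returns a reason string when the file should not be
// uploaded, or null when it passes both checks.
function checkUpload(filename: string, sizeBytes: number): string | null {
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot).toLowerCase() : "";
  if (!SUPPORTED_EXTENSIONS.has(ext)) {
    return `Unsupported extension: ${ext || "(none)"}`;
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return `File is ${sizeBytes} bytes; limit is ${MAX_FILE_BYTES}`;
  }
  return null;
}
```

Only the final extension is inspected, so a name like `archive.tar.gz` is rejected by this sketch even though `.tar` is listed.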
Common Use Cases
1. Data Analysis
const thread = await openai.beta.threads.create({
  messages: [{
    role: "user",
    content: "Calculate the average, median, and standard deviation of the revenue column",
    attachments: [{
      file_id: csvFileId,
      tools: [{ type: "code_interpreter" }],
    }],
  }],
});
2. Data Visualization
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Create a line chart showing revenue over time",
});

// After run completes, download the generated image
const messages = await openai.beta.threads.messages.list(thread.id);
for (const content of messages.data[0].content) {
  if (content.type === 'image_file') {
    const imageData = await openai.files.content(content.image_file.file_id);
    const buffer = Buffer.from(await imageData.arrayBuffer());
    fs.writeFileSync('chart.png', buffer);
  }
}
3. File Conversion
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Convert this Excel file to CSV format",
  attachments: [{
    file_id: excelFileId,
    tools: [{ type: "code_interpreter" }],
  }],
});
Retrieving Outputs
Text Output
const messages = await openai.beta.threads.messages.list(thread.id);
const response = messages.data[0];
for (const content of response.content) {
  if (content.type === 'text') {
    console.log(content.text.value);
  }
}
Generated Files (Charts, CSVs)
for (const content of response.content) {
  if (content.type === 'image_file') {
    const fileId = content.image_file.file_id;
    const data = await openai.files.content(fileId);
    const buffer = Buffer.from(await data.arrayBuffer());
    fs.writeFileSync(`output_${fileId}.png`, buffer);
  }
}
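The loop above only picks up charts. Non-image outputs such as a generated CSV are usually referenced from a text part through a `file_path` annotation rather than an `image_file` part. The sketch below gathers both kinds of file IDs from one message's content. The type names and `collectGeneratedFileIds` are hypothetical, and the shapes are a simplified assumption of the message object, so check them against the SDK types you have installed.

```typescript
// Simplified, assumed shapes for the parts of a message this sketch reads.
type Annotation = { type: string; file_path?: { file_id: string } };
type ContentPart =
  | { type: "image_file"; image_file: { file_id: string } }
  | { type: "text"; text: { value: string; annotations?: Annotation[] } };

// Returns every file ID referenced by the message, images first-come,
// then any files linked from text annotations, in document order.
function collectGeneratedFileIds(content: ContentPart[]): string[] {
  const ids: string[] = [];
  for (const part of content) {
    if (part.type === "image_file") {
      ids.push(part.image_file.file_id);
    } else if (part.type === "text") {
      for (const annotation of part.text.annotations ?? []) {
        if (annotation.type === "file_path" && annotation.file_path) {
          ids.push(annotation.file_path.file_id);
        }
      }
    }
  }
  return ids;
}
```

Each returned ID can then be passed to `openai.files.content(...)` exactly as in the image example.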
Execution Logs
const runSteps = await openai.beta.threads.runs.steps.list(thread.id, run.id);
for (const step of runSteps.data) {
  if (step.step_details.type === 'tool_calls') {
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type === 'code_interpreter') {
        console.log('Code:', toolCall.code_interpreter.input);
        console.log('Output:', toolCall.code_interpreter.outputs);
      }
    }
  }
}
Python Environment
Available Libraries
The Code Interpreter sandbox includes common libraries:
- Data: pandas, numpy
- Math: scipy, sympy
- Plotting: matplotlib, seaborn
- ML: scikit-learn (limited)
- Utils: requests, PIL, csv, json
Note: Not all PyPI packages are available. Use the standard library where possible.
Environment Limits
- Execution Time: Counts toward the 10-minute run limit
- Memory: Limited (exact amount not documented)
- Disk Space: Files persist only for the duration of the run
- Network: No outbound internet access (so libraries such as requests cannot reach external hosts)
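Because execution counts toward the run limit, code that waits on a run should treat more than one status as "finished". The sketch below assumes the status names from the Assistants API run object, and assumes a run that hits the time limit ends as `expired`. `isTerminal` and `hasUsableOutput` are hypothetical helper names.

```typescript
// Assumed set of run statuses; verify against the SDK version in use.
type RunStatus =
  | "queued"
  | "in_progress"
  | "requires_action"
  | "cancelling"
  | "cancelled"
  | "failed"
  | "completed"
  | "incomplete"
  | "expired";

// Statuses after which the run will not make further progress.
const TERMINAL_STATUSES: ReadonlySet<RunStatus> = new Set<RunStatus>([
  "cancelled",
  "failed",
  "completed",
  "incomplete",
  "expired",
]);

// True when polling can stop.
function isTerminal(status: RunStatus): boolean {
  return TERMINAL_STATUSES.has(status);
}

// True only when the run finished normally and its messages are worth reading.
function hasUsableOutput(status: RunStatus): boolean {
  return status === "completed";
}
```

Separating the two checks keeps a timed-out run from being mistaken for a successful one that simply produced no output.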
Best Practices
1. Clear Instructions
// ❌ Vague
"Analyze the data"
// ✅ Specific
"Calculate the mean, median, and mode for each numeric column. Create a bar chart comparing these metrics."
2. Download Files Immediately
// Generated files are temporary - download right after completion
// (downloadFile is an application-defined helper, not part of the SDK)
if (run.status === 'completed') {
  const messages = await openai.beta.threads.messages.list(thread.id);
  // Download all image files immediately
  for (const message of messages.data) {
    for (const content of message.content) {
      if (content.type === 'image_file') {
        await downloadFile(content.image_file.file_id);
      }
    }
  }
}
3. Error Handling
const runSteps = await openai.beta.threads.runs.steps.list(thread.id, run.id);
for (const step of runSteps.data) {
  if (step.step_details.type === 'tool_calls') {
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type === 'code_interpreter') {
        const outputs = toolCall.code_interpreter.outputs;
        for (const output of outputs) {
          if (output.type === 'logs' && output.logs.includes('Error')) {
            console.error('Execution error:', output.logs);
          }
        }
      }
    }
  }
}
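Matching on the substring `Error` also flags harmless output such as a printed "Error rate" metric. A somewhat stricter sketch looks for the header Python prints before a traceback and reports only the final line, which normally names the exception. It assumes uncaught exceptions show up in `logs` outputs in standard traceback form; `findPythonErrors` and the type names are hypothetical.

```typescript
// Simplified, assumed shapes for the run-step fields this sketch reads.
type CodeOutput = { type: string; logs?: string };
type ToolCall = {
  type: string;
  code_interpreter?: { input: string; outputs: CodeOutput[] };
};
type RunStep = { step_details: { type: string; tool_calls?: ToolCall[] } };

const TRACEBACK_HEADER = "Traceback (most recent call last)";

// Returns the last line of each traceback found in the given run steps.
function findPythonErrors(steps: RunStep[]): string[] {
  const errors: string[] = [];
  for (const step of steps) {
    if (step.step_details.type !== "tool_calls") continue;
    for (const call of step.step_details.tool_calls ?? []) {
      if (call.type !== "code_interpreter" || !call.code_interpreter) continue;
      for (const output of call.code_interpreter.outputs) {
        if (output.type !== "logs" || !output.logs) continue;
        if (!output.logs.includes(TRACEBACK_HEADER)) continue;
        const lines = output.logs.trim().split("\n");
        errors.push(lines[lines.length - 1]);
      }
    }
  }
  return errors;
}
```

Pass `runSteps.data` from the snippet above. An empty result means no traceback was logged, which is not the same as the analysis being correct.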
Common Patterns
Pattern: Iterative Analysis
// 1. Upload data
const file = await openai.files.create({...});
// 2. Initial analysis
await sendMessage("What are the columns and data types?");
// 3. Follow-up based on results
await sendMessage("Show the distribution of the 'category' column");
// 4. Visualization
await sendMessage("Create a heatmap of correlations between numeric columns");
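The pattern above calls a `sendMessage` helper that this guide does not define. One possible shape is sketched here, under several assumptions: the client exposes `messages.create`, `runs.createAndPoll`, and `messages.list` as in recent versions of the Node SDK, the newest message is returned first, and the `ThreadsClient` interface is a hypothetical, narrowed view of the client written only for this example. Unlike the one-argument calls above, this version takes the client, thread ID, and assistant ID explicitly; a closure can bind them.

```typescript
// Hypothetical, narrowed view of the client surface this sketch relies on.
interface ThreadsClient {
  beta: {
    threads: {
      messages: {
        create(threadId: string, body: { role: "user"; content: string }): Promise<unknown>;
        list(threadId: string): Promise<{
          data: Array<{ content: Array<{ type: string; text?: { value: string } }> }>;
        }>;
      };
      runs: {
        createAndPoll(threadId: string, body: { assistant_id: string }): Promise<{ status: string }>;
      };
    };
  };
}

// Adds a user message, runs the assistant to completion, and returns the
// text of the newest message in the thread.
async function sendMessage(
  client: ThreadsClient,
  threadId: string,
  assistantId: string,
  content: string,
): Promise<string> {
  await client.beta.threads.messages.create(threadId, { role: "user", content });
  const run = await client.beta.threads.runs.createAndPoll(threadId, {
    assistant_id: assistantId,
  });
  if (run.status !== "completed") {
    throw new Error(`Run ended with status: ${run.status}`);
  }
  const messages = await client.beta.threads.messages.list(threadId);
  const latest = messages.data[0];
  if (!latest) return "";
  const texts: string[] = [];
  for (const part of latest.content) {
    if (part.type === "text" && part.text) texts.push(part.text.value);
  }
  return texts.join("\n");
}
```

To recover the one-argument form used above, wrap it: `const ask = (text: string) => sendMessage(openai, thread.id, assistant.id, text);`.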
Pattern: Multi-File Processing
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Merge these two CSV files on the 'id' column",
  attachments: [
    { file_id: file1Id, tools: [{ type: "code_interpreter" }] },
    { file_id: file2Id, tools: [{ type: "code_interpreter" }] },
  ],
});
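With more than a couple of files, writing each attachment by hand gets repetitive. A small sketch that builds the same `attachments` array from a list of file IDs follows; `toAttachments` and the `Attachment` type are hypothetical names introduced for this example.

```typescript
// Shape of one attachment entry as used in the examples in this guide.
type Attachment = {
  file_id: string;
  tools: Array<{ type: "code_interpreter" }>;
};

// Maps each file ID to an attachment that exposes it to Code Interpreter.
function toAttachments(fileIds: string[]): Attachment[] {
  return fileIds.map(
    (file_id): Attachment => ({
      file_id,
      tools: [{ type: "code_interpreter" }],
    }),
  );
}
```

The multi-file message above could then pass `attachments: toAttachments([file1Id, file2Id])`.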
Troubleshooting
Issue: Code Execution Fails
Symptoms: Run completes, but the logs contain no output or error
Solutions:
- Check file format compatibility
- Verify file isn't corrupted
- Ensure data is in expected format (headers, encoding)
- Try simpler request first to verify setup
Issue: Generated Files Not Found
Symptoms: image_file.file_id doesn't exist
Solutions:
- Download immediately after run completes
- Check run steps for actual outputs
- Verify code execution succeeded
Issue: Timeout on Large Files
Symptoms: Run exceeds 10-minute limit
Solutions:
- Split large files into smaller chunks
- Request specific analysis (not "analyze everything")
- Use sampling for exploratory analysis
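Sampling can also happen on the client, before the file is uploaded. The sketch below keeps the header row plus every `step`-th data row of a CSV held in memory. `sampleCsv` is a hypothetical helper, and it assumes a simple CSV with one record per line (no quoted fields containing line breaks).

```typescript
// Keeps the header line and every `step`-th data row (rows 0, step, 2*step, ...).
function sampleCsv(csvText: string, step: number): string {
  if (!Number.isInteger(step) || step < 1) {
    throw new RangeError("step must be a positive integer");
  }
  // Split on LF or CRLF and drop empty lines such as a trailing newline.
  const lines = csvText.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return "";
  const [header, ...rows] = lines;
  const sampled = rows.filter((_, index) => index % step === 0);
  return [header, ...sampled].join("\n");
}
```

Systematic sampling like this preserves row order, which suits time series. For data sorted by category, a random sample may be more representative.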
Example Prompts
Data Exploration:
- "Summarize this dataset: shape, columns, data types, missing values"
- "Show the first 10 rows"
- "What are the unique values in the 'status' column?"
Statistical Analysis:
- "Calculate descriptive statistics for all numeric columns"
- "Perform correlation analysis between price and quantity"
- "Detect outliers using the IQR method"
Visualization:
- "Create a histogram of the 'age' distribution"
- "Plot revenue trends over time with a moving average"
- "Generate a scatter plot of height vs weight, colored by gender"
Data Transformation:
- "Remove rows with missing values"
- "Normalize the 'sales' column to 0-1 range"
- "Convert dates to YYYY-MM-DD format"
Last Updated: 2025-10-25