# Code Interpreter Guide

Complete guide to using Code Interpreter with the Assistants API.

---

## What is Code Interpreter?

A built-in tool that executes Python code in a sandboxed environment, enabling:

- Data analysis and processing
- Mathematical computations
- Chart and graph generation
- File parsing (CSV, JSON, Excel, etc.)
- Data transformations

---

## Setup

```typescript
import fs from "fs";
import OpenAI from "openai";

// The client reads OPENAI_API_KEY from the environment by default
const openai = new OpenAI();

const assistant = await openai.beta.assistants.create({
  name: "Data Analyst",
  instructions: "You analyze data and create visualizations.",
  tools: [{ type: "code_interpreter" }],
  model: "gpt-4o",
});
```

---

## File Uploads

### Upload Data Files

```typescript
const file = await openai.files.create({
  file: fs.createReadStream("data.csv"),
  purpose: "assistants",
});
```

### Attach to Messages

```typescript
// `thread` is an existing thread, e.g. from openai.beta.threads.create()
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Analyze this sales data",
  attachments: [{
    file_id: file.id,
    tools: [{ type: "code_interpreter" }],
  }],
});
```

---

## Supported File Formats

**Data Files**:
- `.csv`, `.json`, `.xlsx` - Tabular data
- `.txt`, `.md` - Text files
- `.pdf`, `.docx`, `.pptx` - Documents (text extraction)

**Code Files**:
- `.py`, `.js`, `.ts`, `.java`, `.cpp` - Source code

**Images** (for processing, not vision):
- `.png`, `.jpg`, `.jpeg`, `.gif` - Image manipulation

**Archives**:
- `.zip`, `.tar` - Compressed files

**Size Limit**: 512 MB per file

---

## Common Use Cases

### 1. Data Analysis

```typescript
const thread = await openai.beta.threads.create({
  messages: [{
    role: "user",
    content: "Calculate the average, median, and standard deviation of the revenue column",
    attachments: [{
      file_id: csvFileId,
      tools: [{ type: "code_interpreter" }],
    }],
  }],
});
```

### 2. Data Visualization

```typescript
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Create a line chart showing revenue over time",
});

// After the run completes, download the generated image.
// messages.list returns the newest message first by default.
const messages = await openai.beta.threads.messages.list(thread.id);
for (const content of messages.data[0].content) {
  if (content.type === 'image_file') {
    const imageData = await openai.files.content(content.image_file.file_id);
    const buffer = Buffer.from(await imageData.arrayBuffer());
    fs.writeFileSync('chart.png', buffer);
  }
}
```

### 3. File Conversion

```typescript
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Convert this Excel file to CSV format",
  attachments: [{
    file_id: excelFileId,
    tools: [{ type: "code_interpreter" }],
  }],
});
```

---

## Retrieving Outputs

### Text Output

```typescript
const messages = await openai.beta.threads.messages.list(thread.id);
const response = messages.data[0];

for (const content of response.content) {
  if (content.type === 'text') {
    console.log(content.text.value);
  }
}
```

### Generated Files (Charts, CSVs)

```typescript
for (const content of response.content) {
  if (content.type === 'image_file') {
    const fileId = content.image_file.file_id;
    const data = await openai.files.content(fileId);
    const buffer = Buffer.from(await data.arrayBuffer());
    fs.writeFileSync(`output_${fileId}.png`, buffer);
  }
}
```

### Execution Logs

```typescript
const runSteps = await openai.beta.threads.runs.steps.list(thread.id, run.id);

for (const step of runSteps.data) {
  if (step.step_details.type === 'tool_calls') {
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type === 'code_interpreter') {
        console.log('Code:', toolCall.code_interpreter.input);
        console.log('Output:', toolCall.code_interpreter.outputs);
      }
    }
  }
}
```

---

## Python Environment

### Available Libraries

The Code Interpreter sandbox includes common libraries:

- **Data**: pandas, numpy
- **Math**: scipy, sympy
- **Plotting**: matplotlib, seaborn
- **ML**: scikit-learn (limited)
- **Utils**: requests, PIL, csv, json

**Note**: Not all PyPI packages are available. Use the standard library where possible.

### Environment Limits

- **Execution Time**: Counts toward the 10-minute run limit
- **Memory**: Limited (exact amount not documented)
- **Disk Space**: Files persist during the run only
- **Network**: No outbound internet access

---

## Best Practices

### 1. Clear Instructions

```typescript
// ❌ Vague
"Analyze the data"

// ✅ Specific
"Calculate the mean, median, and mode for each numeric column. Create a bar chart comparing these metrics."
```

### 2. Download Files Immediately

```typescript
// Generated files are temporary - download right after completion.
// `downloadFile` is a placeholder for your own download helper.
if (run.status === 'completed') {
  const messages = await openai.beta.threads.messages.list(thread.id);

  // Download all image files immediately
  for (const message of messages.data) {
    for (const content of message.content) {
      if (content.type === 'image_file') {
        await downloadFile(content.image_file.file_id);
      }
    }
  }
}
```

### 3. Error Handling

```typescript
const runSteps = await openai.beta.threads.runs.steps.list(thread.id, run.id);

for (const step of runSteps.data) {
  if (step.step_details.type === 'tool_calls') {
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type === 'code_interpreter') {
        const outputs = toolCall.code_interpreter.outputs;
        for (const output of outputs) {
          // Substring heuristic: Python tracebacks usually contain "Error"
          if (output.type === 'logs' && output.logs.includes('Error')) {
            console.error('Execution error:', output.logs);
          }
        }
      }
    }
  }
}
```

---

## Common Patterns

### Pattern: Iterative Analysis

```typescript
// `sendMessage` is a placeholder helper that is not defined in this guide.

// 1. Upload data
const file = await openai.files.create({ /* ... */ });

// 2. Initial analysis
await sendMessage("What are the columns and data types?");

// 3. Follow-up based on results
await sendMessage("Show the distribution of the 'category' column");

// 4. Visualization
await sendMessage("Create a heatmap of correlations between numeric columns");
```

### Pattern: Multi-File Processing

```typescript
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Merge these two CSV files on the 'id' column",
  attachments: [
    { file_id: file1Id, tools: [{ type: "code_interpreter" }] },
    { file_id: file2Id, tools: [{ type: "code_interpreter" }] },
  ],
});
```

---

## Troubleshooting

### Issue: Code Execution Fails

**Symptoms**: Run completes, but the logs show no output or error

**Solutions**:
- Check file format compatibility
- Verify the file isn't corrupted
- Ensure the data is in the expected format (headers, encoding)
- Try a simpler request first to verify the setup

### Issue: Generated Files Not Found

**Symptoms**: `image_file.file_id` doesn't exist

**Solutions**:
- Download immediately after the run completes
- Check the run steps for actual outputs
- Verify that code execution succeeded

### Issue: Timeout on Large Files

**Symptoms**: Run exceeds the 10-minute limit

**Solutions**:
- Split large files into smaller chunks
- Request a specific analysis (not "analyze everything")
- Use sampling for exploratory analysis

---

## Example Prompts

**Data Exploration**:
- "Summarize this dataset: shape, columns, data types, missing values"
- "Show the first 10 rows"
- "What are the unique values in the 'status' column?"

**Statistical Analysis**:
- "Calculate descriptive statistics for all numeric columns"
- "Perform correlation analysis between price and quantity"
- "Detect outliers using the IQR method"

**Visualization**:
- "Create a histogram of the 'age' distribution"
- "Plot revenue trends over time with a moving average"
- "Generate a scatter plot of height vs weight, colored by gender"

**Data Transformation**:
- "Remove rows with missing values"
- "Normalize the 'sales' column to 0-1 range"
- "Convert dates to YYYY-MM-DD format"

---

**Last Updated**: 2025-10-25
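
---

## Appendix: Reusable Output Helpers

The nested loops under Retrieving Outputs and Error Handling repeat the same traversal, so it may help to factor them into small pure functions. The following is a minimal sketch, not part of the SDK: `collectImageFileIds` and `collectErrorLogs` are hypothetical helper names, and the `*Like` types are simplified local stand-ins that cover only the fields used in this guide's examples, assuming the response shapes shown above.

```typescript
// Simplified local types (assumptions): only the fields this guide reads.
type ContentPart =
  | { type: "text"; text: { value: string } }
  | { type: "image_file"; image_file: { file_id: string } };

interface MessageLike {
  content: ContentPart[];
}

type InterpreterOutput =
  | { type: "logs"; logs: string }
  | { type: "image"; image: { file_id: string } };

type ToolCallLike =
  | {
      type: "code_interpreter";
      code_interpreter: { input: string; outputs: InterpreterOutput[] };
    }
  | { type: "file_search" | "function" };

type StepDetailsLike =
  | { type: "tool_calls"; tool_calls: ToolCallLike[] }
  | { type: "message_creation" };

interface RunStepLike {
  step_details: StepDetailsLike;
}

// Hypothetical helper: file IDs of every image_file part, in message order.
function collectImageFileIds(messages: MessageLike[]): string[] {
  const ids: string[] = [];
  for (const message of messages) {
    for (const part of message.content) {
      if (part.type === "image_file") {
        ids.push(part.image_file.file_id);
      }
    }
  }
  return ids;
}

// Hypothetical helper: log outputs that contain "Error", using the same
// substring heuristic as the Error Handling example.
function collectErrorLogs(steps: RunStepLike[]): string[] {
  const errors: string[] = [];
  for (const step of steps) {
    const details = step.step_details;
    if (details.type !== "tool_calls") continue;
    for (const call of details.tool_calls) {
      if (call.type !== "code_interpreter") continue;
      for (const output of call.code_interpreter.outputs) {
        if (output.type === "logs" && output.logs.includes("Error")) {
          errors.push(output.logs);
        }
      }
    }
  }
  return errors;
}
```

With helpers like these, the data returned by `messages.list` and `runs.steps.list` could be passed in after a run completes, and each returned file ID downloaded with `openai.files.content` as shown earlier. Keeping the traversal free of network calls also makes it easy to unit test against hand-built fixtures.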