Code Interpreter Guide

Complete guide to using Code Interpreter with the Assistants API.


What is Code Interpreter?

A built-in tool that executes Python code in a sandboxed environment, enabling:

  • Data analysis and processing
  • Mathematical computations
  • Chart and graph generation
  • File parsing (CSV, JSON, Excel, etc.)
  • Data transformations

Setup

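import OpenAI from "openai";

// The client reads OPENAI_API_KEY from the environment by default
const openai = new OpenAI();

const assistant = await openai.beta.assistants.create({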
  name: "Data Analyst",
  instructions: "You analyze data and create visualizations.",
  tools: [{ type: "code_interpreter" }],
  model: "gpt-4o",
});

File Uploads

Upload Data Files

import fs from "fs";

const file = await openai.files.create({
  file: fs.createReadStream("data.csv"),
  purpose: "assistants",
});

Attach to Messages

await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Analyze this sales data",
  attachments: [{
    file_id: file.id,
    tools: [{ type: "code_interpreter" }],
  }],
});

Supported File Formats

Data Files:

  • .csv, .json, .xlsx - Tabular data
  • .txt, .md - Text files
  • .pdf, .docx, .pptx - Documents (text extraction)

Code Files:

  • .py, .js, .ts, .java, .cpp - Source code

Images (for processing, not vision):

  • .png, .jpg, .jpeg, .gif - Image manipulation

Archives:

  • .zip, .tar - Compressed files

Size Limit: 512 MB per file
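
The format list and size limit above can be checked locally before uploading. The following is a minimal sketch, not part of the SDK: `checkUpload`, `SUPPORTED_EXTENSIONS`, and `MAX_FILE_BYTES` are hypothetical names, the extension set simply mirrors the list in this guide (the API's actual list may differ), and 512 MB is read as 512 × 1024 × 1024 bytes, which is an assumption.

```typescript
// Extensions copied from the list in this guide; the API's full list may differ.
const SUPPORTED_EXTENSIONS = new Set([
  ".csv", ".json", ".xlsx", ".txt", ".md", ".pdf", ".docx", ".pptx",
  ".py", ".js", ".ts", ".java", ".cpp",
  ".png", ".jpg", ".jpeg", ".gif",
  ".zip", ".tar",
]);

// Assumption: "512 MB" is interpreted as 512 * 2^20 bytes.
const MAX_FILE_BYTES = 512 * 1024 * 1024;

interface UploadCheck {
  ok: boolean;
  reason?: string;
}

// Hypothetical pre-flight check: rejects files this guide does not list
// as supported, or that exceed the per-file size limit.
function checkUpload(filename: string, sizeBytes: number): UploadCheck {
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot).toLowerCase() : "";
  if (!SUPPORTED_EXTENSIONS.has(ext)) {
    return { ok: false, reason: `Unsupported extension: ${ext || "(none)"}` };
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return { ok: false, reason: `File is ${sizeBytes} bytes; limit is ${MAX_FILE_BYTES}` };
  }
  return { ok: true };
}
```

In a Node script the size could come from `fs.statSync(path).size`, so a rejected file never costs an upload round trip.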


Common Use Cases

1. Data Analysis

const thread = await openai.beta.threads.create({
  messages: [{
    role: "user",
    content: "Calculate the average, median, and standard deviation of the revenue column",
    attachments: [{
      file_id: csvFileId,
      tools: [{ type: "code_interpreter" }],
    }],
  }],
});

2. Data Visualization

await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Create a line chart showing revenue over time",
});

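// Run the assistant and wait for it to finish (createAndPoll is an SDK helper)
const run = await openai.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: assistant.id,
});

// After the run completes, download the generated image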
const messages = await openai.beta.threads.messages.list(thread.id);
for (const content of messages.data[0].content) {
  if (content.type === 'image_file') {
    const imageData = await openai.files.content(content.image_file.file_id);
    const buffer = Buffer.from(await imageData.arrayBuffer());
    fs.writeFileSync('chart.png', buffer);
  }
}

3. File Conversion

await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Convert this Excel file to CSV format",
  attachments: [{
    file_id: excelFileId,
    tools: [{ type: "code_interpreter" }],
  }],
});

Retrieving Outputs

Text Output

const messages = await openai.beta.threads.messages.list(thread.id);
const response = messages.data[0];

for (const content of response.content) {
  if (content.type === 'text') {
    console.log(content.text.value);
  }
}

Generated Files (Charts, CSVs)

for (const content of response.content) {
  if (content.type === 'image_file') {
    const fileId = content.image_file.file_id;
    const data = await openai.files.content(fileId);
    const buffer = Buffer.from(await data.arrayBuffer());
    fs.writeFileSync(`output_${fileId}.png`, buffer);
  }
}
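
The loop above only picks up images. Non-image outputs such as a generated CSV are, as far as the Assistants API message format goes, referenced through `file_path` annotations on text content rather than through `image_file` parts. The sketch below gathers both kinds of file ID from a message's content. The types are simplified local stand-ins for the SDK's message types, and `collectGeneratedFiles` is a hypothetical helper name.

```typescript
// Simplified stand-ins for the SDK's message content types (assumption:
// only the fields used here are modelled).
type Annotation =
  | { type: "file_path"; text: string; file_path: { file_id: string } }
  | { type: "file_citation"; text: string };

type ContentPart =
  | { type: "image_file"; image_file: { file_id: string } }
  | { type: "text"; text: { value: string; annotations: Annotation[] } };

interface GeneratedFile {
  fileId: string;
  kind: "image" | "file";
  sandboxPath?: string; // e.g. "sandbox:/mnt/data/out.csv" for file_path annotations
}

// Hypothetical helper: lists every file the assistant produced in one message.
function collectGeneratedFiles(content: ContentPart[]): GeneratedFile[] {
  const files: GeneratedFile[] = [];
  for (const part of content) {
    if (part.type === "image_file") {
      files.push({ fileId: part.image_file.file_id, kind: "image" });
    } else {
      for (const annotation of part.text.annotations) {
        if (annotation.type === "file_path") {
          files.push({
            fileId: annotation.file_path.file_id,
            kind: "file",
            sandboxPath: annotation.text,
          });
        }
      }
    }
  }
  return files;
}
```

Each returned `fileId` can then be passed to `openai.files.content(fileId)` exactly as in the image example.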

Execution Logs

const runSteps = await openai.beta.threads.runs.steps.list(thread.id, run.id);

for (const step of runSteps.data) {
  if (step.step_details.type === 'tool_calls') {
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type === 'code_interpreter') {
        console.log('Code:', toolCall.code_interpreter.input);
        console.log('Output:', toolCall.code_interpreter.outputs);
      }
    }
  }
}

Python Environment

Available Libraries

The Code Interpreter sandbox includes common libraries:

  • Data: pandas, numpy
  • Math: scipy, sympy
  • Plotting: matplotlib, seaborn
  • ML: scikit-learn (limited)
  • Utils: PIL, csv, json, requests (of little use, since the sandbox has no outbound network access)

Note: Not all PyPI packages are available. Use the standard library where possible.

Environment Limits

  • Execution Time: Counts toward the 10-minute run limit
  • Memory: Limited (exact amount not documented)
  • Disk Space: Files persist during run only
  • Network: No outbound internet access

Best Practices

1. Clear Instructions

// ❌ Vague
"Analyze the data"

// ✅ Specific
"Calculate the mean, median, and mode for each numeric column. Create a bar chart comparing these metrics."

2. Download Files Immediately

// Generated files are temporary - download right after completion
if (run.status === 'completed') {
  const messages = await openai.beta.threads.messages.list(thread.id);
  // Download all image files immediately
  for (const message of messages.data) {
    for (const content of message.content) {
      if (content.type === 'image_file') {
        // downloadFile is your own helper; see "Generated Files" for the download logic
        await downloadFile(content.image_file.file_id);
      }
    }
  }
}
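
The `run.status` check above presumes something has already waited for the run to finish. A minimal polling sketch is shown below. `waitForRun` is a hypothetical helper, the `retrieve` callback is supplied by the caller, and the default timeout simply mirrors the 10-minute run limit mentioned in this guide.

```typescript
// Statuses at which polling should stop. "requires_action" is included
// because the run will not progress until the caller responds.
const SETTLED_STATUSES = new Set([
  "completed", "failed", "cancelled", "expired", "incomplete", "requires_action",
]);

interface RunLike {
  status: string;
}

// Hypothetical helper: calls `retrieve` until the run settles or the
// timeout elapses, sleeping `intervalMs` between attempts.
async function waitForRun<T extends RunLike>(
  retrieve: () => Promise<T>,
  intervalMs = 1000,
  timeoutMs = 10 * 60 * 1000,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const run = await retrieve();
    if (SETTLED_STATUSES.has(run.status)) return run;
    if (Date.now() >= deadline) {
      throw new Error(`Run still "${run.status}" after ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the call style used elsewhere in this guide, usage might look like `const run = await waitForRun(() => openai.beta.threads.runs.retrieve(thread.id, run0.id));`, where `run0` is the run you created. The argument order of `retrieve` depends on the SDK version, so treat that line as an assumption.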

3. Error Handling

const runSteps = await openai.beta.threads.runs.steps.list(thread.id, run.id);

for (const step of runSteps.data) {
  if (step.step_details.type === 'tool_calls') {
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type === 'code_interpreter') {
        const outputs = toolCall.code_interpreter.outputs;
        for (const output of outputs) {
          if (output.type === 'logs' && output.logs.includes('Error')) {
            console.error('Execution error:', output.logs);
          }
        }
      }
    }
  }
}
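
Matching the substring `'Error'` also flags harmless output such as a printed "Error rate: 0.02". A stricter variant, sketched below, looks for the standard Python traceback header and reports the final line of the log, which in a normal traceback names the exception. This assumes exceptions surface in the `logs` output as ordinary tracebacks. `findPythonErrors` and the `ToolOutput` type are illustrative names, not SDK exports.

```typescript
// Simplified stand-in for the code_interpreter output shapes used above.
type ToolOutput =
  | { type: "logs"; logs: string }
  | { type: "image"; image: { file_id: string } };

const TRACEBACK_HEADER = "Traceback (most recent call last)";

// Hypothetical helper: returns one "ExceptionType: message" string per
// log output that contains a Python traceback.
function findPythonErrors(outputs: ToolOutput[]): string[] {
  const errors: string[] = [];
  for (const output of outputs) {
    if (output.type !== "logs") continue;
    if (!output.logs.includes(TRACEBACK_HEADER)) continue;
    // The last non-empty line of a traceback is the exception summary.
    const lastLine = output.logs
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .pop();
    if (lastLine) errors.push(lastLine);
  }
  return errors;
}
```

Inside the loop above, `findPythonErrors(toolCall.code_interpreter.outputs)` could replace the substring test.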

Common Patterns

Pattern: Iterative Analysis

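// sendMessage is a placeholder for your own helper that adds a user
// message to the thread, runs the assistant, and waits for completion.

// 1. Upload data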
const file = await openai.files.create({...});

// 2. Initial analysis
await sendMessage("What are the columns and data types?");

// 3. Follow-up based on results
await sendMessage("Show the distribution of the 'category' column");

// 4. Visualization
await sendMessage("Create a heatmap of correlations between numeric columns");

Pattern: Multi-File Processing

await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Merge these two CSV files on the 'id' column",
  attachments: [
    { file_id: file1Id, tools: [{ type: "code_interpreter" }] },
    { file_id: file2Id, tools: [{ type: "code_interpreter" }] },
  ],
});
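
When the number of files varies, the `attachments` array can be built from a list of file IDs instead of being written out by hand. A small sketch follows; `toAttachments` is a hypothetical helper, and the `Attachment` type only models the two fields used in this guide.

```typescript
// Only the fields used in this guide are modelled.
interface Attachment {
  file_id: string;
  tools: { type: "code_interpreter" }[];
}

// Hypothetical helper: one Code Interpreter attachment per distinct file ID,
// preserving the order in which IDs first appear.
function toAttachments(fileIds: string[]): Attachment[] {
  const unique = Array.from(new Set(fileIds));
  return unique.map((file_id): Attachment => ({
    file_id,
    tools: [{ type: "code_interpreter" }],
  }));
}
```

The message call then becomes `attachments: toAttachments([file1Id, file2Id])`.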

Troubleshooting

Issue: Code Execution Fails

Symptoms: Run completes but no output/error in logs

Solutions:

  • Check file format compatibility
  • Verify file isn't corrupted
  • Ensure data is in expected format (headers, encoding)
  • Try simpler request first to verify setup

Issue: Generated Files Not Found

Symptoms: The file referenced by image_file.file_id cannot be downloaded because it no longer exists

Solutions:

  • Download immediately after run completes
  • Check run steps for actual outputs
  • Verify code execution succeeded

Issue: Timeout on Large Files

Symptoms: Run exceeds 10-minute limit

Solutions:

  • Split large files into smaller chunks
  • Request specific analysis (not "analyze everything")
  • Use sampling for exploratory analysis
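
The chunking and sampling suggestions can be applied before upload. The sketch below works on CSV text held in memory and is deliberately naive: it assumes the first line is the header and that no quoted field contains a line break. `splitCsv` and `sampleCsv` are hypothetical helper names.

```typescript
// Splits CSV text into a header line and non-empty data rows.
// Assumption: no quoted field contains an embedded newline.
function parseLines(csv: string): { header: string; rows: string[] } | null {
  const lines = csv.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return null;
  return { header: lines[0], rows: lines.slice(1) };
}

// Returns chunks of at most `rowsPerChunk` data rows, each with the header repeated.
function splitCsv(csv: string, rowsPerChunk: number): string[] {
  if (rowsPerChunk < 1) throw new Error("rowsPerChunk must be at least 1");
  const parsed = parseLines(csv);
  if (parsed === null) return [];
  const chunks: string[] = [];
  for (let i = 0; i < parsed.rows.length; i += rowsPerChunk) {
    const slice = parsed.rows.slice(i, i + rowsPerChunk);
    chunks.push([parsed.header].concat(slice).join("\n") + "\n");
  }
  return chunks;
}

// Keeps the header plus every `step`-th data row (systematic sampling).
function sampleCsv(csv: string, step: number): string {
  if (step < 1) throw new Error("step must be at least 1");
  const parsed = parseLines(csv);
  if (parsed === null) return "";
  const kept = parsed.rows.filter((_row, index) => index % step === 0);
  return [parsed.header].concat(kept).join("\n") + "\n";
}
```

Systematic sampling keeps the row order, which matters for time series. For a file that is sorted by some key, a random sample may be the better choice.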

Example Prompts

Data Exploration:

  • "Summarize this dataset: shape, columns, data types, missing values"
  • "Show the first 10 rows"
  • "What are the unique values in the 'status' column?"

Statistical Analysis:

  • "Calculate descriptive statistics for all numeric columns"
  • "Perform correlation analysis between price and quantity"
  • "Detect outliers using the IQR method"

Visualization:

  • "Create a histogram of the 'age' distribution"
  • "Plot revenue trends over time with a moving average"
  • "Generate a scatter plot of height vs weight, colored by gender"

Data Transformation:

  • "Remove rows with missing values"
  • "Normalize the 'sales' column to 0-1 range"
  • "Convert dates to YYYY-MM-DD format"

Last Updated: 2025-10-25