Zhongwei Li 0c577730d5 Initial commit

2025-11-30 08:25:15 +08:00

Code Interpreter Guide

Complete guide to using Code Interpreter with the Assistants API.

What is Code Interpreter?

A built-in tool that executes Python code in a sandboxed environment, enabling:

Data analysis and processing
Mathematical computations
Chart and graph generation
File parsing (CSV, JSON, Excel, etc.)
Data transformations

Setup

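import OpenAI from "openai";

// The client reads the API key from the OPENAI_API_KEY environment variable
const openai = new OpenAI();

const assistant = await openai.beta.assistants.create({
  name: "Data Analyst",
  instructions: "You analyze data and create visualizations.",
  tools: [{ type: "code_interpreter" }],
  model: "gpt-4o",
});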

File Uploads

Upload Data Files

import fs from "fs";

// purpose "assistants" makes the file usable by Assistants API tools
const file = await openai.files.create({
  file: fs.createReadStream("data.csv"),
  purpose: "assistants",
});

Attach to Messages

await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Analyze this sales data",
  attachments: [{
    file_id: file.id,
    tools: [{ type: "code_interpreter" }],
  }],
});

Supported File Formats

Data Files:

.csv, .json, .xlsx - Tabular data
.txt, .md - Text files
.pdf, .docx, .pptx - Documents (text extraction)

Code Files:

.py, .js, .ts, .java, .cpp - Source code

Images (for processing, not vision):

.png, .jpg, .jpeg, .gif - Image manipulation

Archives:

.zip, .tar - Compressed files

Size Limit: 512 MB per file
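A pre-upload check can catch unsupported extensions and oversized files before any API call is made. The sketch below is only an illustration: checkUploadable is a hypothetical helper, the extension set simply mirrors the list above (which may not be exhaustive), and the 512 MB limit is assumed to mean 512 * 1024 * 1024 bytes.

```javascript
// Extensions copied from the "Supported File Formats" list above.
const SUPPORTED_EXTENSIONS = new Set([
  ".csv", ".json", ".xlsx", ".txt", ".md", ".pdf", ".docx", ".pptx",
  ".py", ".js", ".ts", ".java", ".cpp",
  ".png", ".jpg", ".jpeg", ".gif",
  ".zip", ".tar",
]);

// Assumption: "512 MB" is read as 512 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 512 * 1024 * 1024;

// Hypothetical helper: returns a list of problems.
// An empty list means the file passes both checks.
function checkUploadable(fileName, sizeInBytes) {
  const problems = [];
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.has(ext)) {
    problems.push(`unsupported extension: ${ext || "(none)"}`);
  }
  if (sizeInBytes > MAX_FILE_BYTES) {
    problems.push(`file too large: ${sizeInBytes} bytes`);
  }
  return problems;
}
```

In practice the size would come from something like fs.statSync(path).size, and the check would run just before openai.files.create.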

Common Use Cases

1. Data Analysis

const thread = await openai.beta.threads.create({
  messages: [{
    role: "user",
    content: "Calculate the average, median, and standard deviation of the revenue column",
    attachments: [{
      file_id: csvFileId,
      tools: [{ type: "code_interpreter" }],
    }],
  }],
});

2. Data Visualization

await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Create a line chart showing revenue over time",
});

// Run the assistant (created in Setup) and wait for the run to finish
const run = await openai.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: assistant.id,
});

// Once the run has completed, download the generated image
const messages = await openai.beta.threads.messages.list(thread.id);
for (const content of messages.data[0].content) {
  if (content.type === 'image_file') {
    const imageData = await openai.files.content(content.image_file.file_id);
    const buffer = Buffer.from(await imageData.arrayBuffer());
    fs.writeFileSync('chart.png', buffer);
  }
}

3. File Conversion

await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Convert this Excel file to CSV format",
  attachments: [{
    file_id: excelFileId,
    tools: [{ type: "code_interpreter" }],
  }],
});

Retrieving Outputs

Text Output

const messages = await openai.beta.threads.messages.list(thread.id);
const response = messages.data[0];

for (const content of response.content) {
  if (content.type === 'text') {
    console.log(content.text.value);
  }
}

Generated Files (Charts, CSVs)

for (const content of response.content) {
  if (content.type === 'image_file') {
    const fileId = content.image_file.file_id;
    const data = await openai.files.content(fileId);
    const buffer = Buffer.from(await data.arrayBuffer());
    fs.writeFileSync(`output_${fileId}.png`, buffer);
  }
}
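The loop above only picks up image_file parts. Non-image outputs such as a generated CSV are typically referenced through file_path annotations on a text part instead. A minimal sketch that gathers both kinds of file ID from one message follows; collectGeneratedFileIds is a hypothetical name, and the message shape is assumed to match the Assistants API message object.

```javascript
// Hypothetical helper: collects file IDs from a single message object.
// - images: IDs from image_file content parts (charts, plots)
// - files:  IDs from file_path annotations on text parts (CSVs and similar)
function collectGeneratedFileIds(message) {
  const images = [];
  const files = [];
  for (const part of message.content) {
    if (part.type === "image_file") {
      images.push(part.image_file.file_id);
    } else if (part.type === "text") {
      // annotations may be absent or empty when the text links to no files
      for (const annotation of part.text.annotations ?? []) {
        if (annotation.type === "file_path") {
          files.push(annotation.file_path.file_id);
        }
      }
    }
  }
  return { images, files };
}
```

Each returned ID can then be passed to openai.files.content in the same way as the image download shown earlier.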

Execution Logs

const runSteps = await openai.beta.threads.runs.steps.list(thread.id, run.id);

for (const step of runSteps.data) {
  if (step.step_details.type === 'tool_calls') {
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type === 'code_interpreter') {
        console.log('Code:', toolCall.code_interpreter.input);
        console.log('Output:', toolCall.code_interpreter.outputs);
      }
    }
  }
}
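When the logs need to be stored or inspected rather than printed, the same traversal can return a flat summary. This is a sketch under the assumption that each Code Interpreter output is either a logs entry or an image entry; summarizeCodeInterpreterCalls is a hypothetical helper name.

```javascript
// Hypothetical helper: flattens run steps into one record per
// Code Interpreter call, keeping the executed code and its outputs.
function summarizeCodeInterpreterCalls(steps) {
  const calls = [];
  for (const step of steps) {
    if (step.step_details.type !== "tool_calls") continue;
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type !== "code_interpreter") continue;
      const logs = [];
      const imageFileIds = [];
      for (const output of toolCall.code_interpreter.outputs) {
        if (output.type === "logs") {
          logs.push(output.logs);
        } else if (output.type === "image") {
          imageFileIds.push(output.image.file_id);
        }
      }
      calls.push({ code: toolCall.code_interpreter.input, logs, imageFileIds });
    }
  }
  return calls;
}
```

It would be called as summarizeCodeInterpreterCalls(runSteps.data) with the runSteps list retrieved above.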

Python Environment

Available Libraries

The Code Interpreter sandbox includes common libraries:

Data: pandas, numpy
Math: scipy, sympy
Plotting: matplotlib, seaborn
ML: scikit-learn (limited)
Utils: requests, PIL, csv, json

Note: Not all PyPI packages are available. Use the standard library where possible.

Environment Limits

Execution Time: Counts toward the 10-minute run limit
Memory: Limited (exact amount not documented)
Disk Space: Files persist only for the duration of the run
Network: No outbound internet access

Best Practices

1. Clear Instructions

// ❌ Vague
"Analyze the data"

// ✅ Specific
"Calculate the mean, median, and mode for each numeric column. Create a bar chart comparing these metrics."

2. Download Files Immediately

// Generated files are temporary - download right after completion
if (run.status === 'completed') {
  const messages = await openai.beta.threads.messages.list(thread.id);
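  // Download every image file right away. downloadFile is a helper you define,
  // for example by wrapping openai.files.content as in Retrieving Outputs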
  for (const message of messages.data) {
    for (const content of message.content) {
      if (content.type === 'image_file') {
        await downloadFile(content.image_file.file_id);
      }
    }
  }
}

3. Error Handling

const runSteps = await openai.beta.threads.runs.steps.list(thread.id, run.id);

for (const step of runSteps.data) {
  if (step.step_details.type === 'tool_calls') {
    for (const toolCall of step.step_details.tool_calls) {
      if (toolCall.type === 'code_interpreter') {
        const outputs = toolCall.code_interpreter.outputs;
        for (const output of outputs) {
          // Heuristic only: matches any log text containing "Error"
          if (output.type === 'logs' && output.logs.includes('Error')) {
            console.error('Execution error:', output.logs);
          }
        }
      }
    }
  }
}
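Matching on the substring 'Error' also fires on harmless output such as a column named "Errors". A somewhat stricter check is to look for the standard Python traceback header and report the exception line. The sketch below assumes the traceback is the last thing in the log text; extractPythonError is a hypothetical helper.

```javascript
// Hypothetical helper: returns the final line of a Python traceback
// (for example "KeyError: 'revenue'"), or null if no traceback is present.
// Assumption: nothing is printed after the traceback in the same log.
function extractPythonError(logs) {
  const marker = "Traceback (most recent call last):";
  const start = logs.indexOf(marker);
  if (start === -1) return null;
  const lines = logs
    .slice(start)
    .split("\n")
    .map((line) => line.trimEnd())
    .filter((line) => line.length > 0);
  // In a standard traceback the last line names the exception and message.
  return lines[lines.length - 1];
}
```

Inside the loop above, this could replace the includes('Error') test: call extractPythonError(output.logs) and log the result when it is not null.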

Common Patterns

Pattern: Iterative Analysis

// 1. Upload data
const file = await openai.files.create({...});

// 2. Initial analysis (sendMessage is a helper you define: post a message, then run the assistant)
await sendMessage("What are the columns and data types?");

// 3. Follow-up based on results
await sendMessage("Show the distribution of the 'category' column");

// 4. Visualization
await sendMessage("Create a heatmap of correlations between numeric columns");
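The sendMessage helper used in this pattern is not defined in the guide. One possible shape is sketched below, assuming the v4 openai Node SDK method names (messages.create, runs.createAndPoll, messages.list); createSendMessage is a hypothetical factory, and the client is passed in so the helper can be exercised with a stub.

```javascript
// Hypothetical factory: binds a client, thread, and assistant, and returns
// a sendMessage(text) function that resolves to the newest thread message.
function createSendMessage(client, threadId, assistantId) {
  return async function sendMessage(text) {
    // 1. Add the user's message to the thread
    await client.beta.threads.messages.create(threadId, {
      role: "user",
      content: text,
    });
    // 2. Start a run and wait until it reaches a terminal state
    const run = await client.beta.threads.runs.createAndPoll(threadId, {
      assistant_id: assistantId,
    });
    if (run.status !== "completed") {
      throw new Error(`Run ended with status: ${run.status}`);
    }
    // 3. Messages are listed newest first by default
    const messages = await client.beta.threads.messages.list(threadId);
    return messages.data[0];
  };
}
```

Usage would look like const sendMessage = createSendMessage(openai, thread.id, assistant.id), after which the calls in the pattern above work as written.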

Pattern: Multi-File Processing

await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Merge these two CSV files on the 'id' column",
  attachments: [
    { file_id: file1Id, tools: [{ type: "code_interpreter" }] },
    { file_id: file2Id, tools: [{ type: "code_interpreter" }] },
  ],
});
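With more than a couple of files, the attachments array can be built from a list of file IDs rather than written out by hand. A small sketch, where buildCodeInterpreterAttachments is a hypothetical helper name:

```javascript
// Hypothetical helper: turns a list of uploaded file IDs into the
// attachments array expected by messages.create, enabling
// Code Interpreter for each file.
function buildCodeInterpreterAttachments(fileIds) {
  return fileIds.map((fileId) => ({
    file_id: fileId,
    tools: [{ type: "code_interpreter" }],
  }));
}
```

The message above could then pass attachments: buildCodeInterpreterAttachments([file1Id, file2Id]).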

Troubleshooting

Issue: Code Execution Fails

Symptoms: The run completes but returns no output, or the logs show an error

Solutions:

Check file format compatibility
Verify file isn't corrupted
Ensure data is in expected format (headers, encoding)
Try simpler request first to verify setup

Issue: Generated Files Not Found

Symptoms: The file referenced by image_file.file_id cannot be retrieved

Solutions:

Download immediately after run completes
Check run steps for actual outputs
Verify code execution succeeded

Issue: Timeout on Large Files

Symptoms: Run exceeds 10-minute limit

Solutions:

Split large files into smaller chunks
Request specific analysis (not "analyze everything")
Use sampling for exploratory analysis
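Splitting a large CSV before upload can be done locally. The sketch below is a simple line-based approach that repeats the header row in every chunk; splitCsvIntoChunks is a hypothetical helper, and it assumes no quoted field contains an embedded newline.

```javascript
// Hypothetical helper: splits CSV text into chunks of at most
// rowsPerChunk data rows, repeating the header line in each chunk.
// Assumption: fields do not contain embedded newlines.
function splitCsvIntoChunks(csvText, rowsPerChunk) {
  if (!Number.isInteger(rowsPerChunk) || rowsPerChunk < 1) {
    throw new RangeError("rowsPerChunk must be a positive integer");
  }
  const lines = csvText.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return [];
  const [header, ...rows] = lines;
  const chunks = [];
  for (let i = 0; i < rows.length; i += rowsPerChunk) {
    const chunkRows = rows.slice(i, i + rowsPerChunk);
    chunks.push([header, ...chunkRows].join("\n") + "\n");
  }
  return chunks;
}
```

Each chunk can be written to its own file and uploaded separately. For exploratory sampling, the first chunk alone is often enough to verify the setup.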

Example Prompts

Data Exploration:

"Summarize this dataset: shape, columns, data types, missing values"
"Show the first 10 rows"
"What are the unique values in the 'status' column?"

Statistical Analysis:

"Calculate descriptive statistics for all numeric columns"
"Perform correlation analysis between price and quantity"
"Detect outliers using the IQR method"

Visualization:

"Create a histogram of the 'age' distribution"
"Plot revenue trends over time with a moving average"
"Generate a scatter plot of height vs weight, colored by gender"

Data Transformation:

"Remove rows with missing values"
"Normalize the 'sales' column to 0-1 range"
"Convert dates to YYYY-MM-DD format"

Last Updated: 2025-10-25
