
Updating gpt-oss Model Files

Why Update Model Files?

The openai_harmony.HarmonyError: Unexpected token error is often caused by an outdated generation_config.json. HuggingFace periodically updates this file to fix token-parsing issues.

Current Configuration Files

gpt-oss-20b generation_config.json

The latest version includes:

{
  "bos_token_id": 199998,
  "do_sample": true,
  "eos_token_id": [
    200002,
    199999,
    200012
  ],
  "pad_token_id": 199999,
  "transformers_version": "4.55.0.dev0"
}

Key elements:

  • eos_token_id: Multiple EOS tokens including 200012 (tool call completion)
  • do_sample: Enabled for generation diversity
  • transformers_version: Indicates compatible transformers version
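The key elements above can be sanity-checked programmatically. A minimal sketch (the JSON is inlined for illustration; in practice you would read generation_config.json from your local model directory):

```python
import json

# The generation_config.json contents shown above, inlined for illustration
config_text = """
{
  "bos_token_id": 199998,
  "do_sample": true,
  "eos_token_id": [200002, 199999, 200012],
  "pad_token_id": 199999,
  "transformers_version": "4.55.0.dev0"
}
"""

config = json.loads(config_text)

# Token 200012 marks tool-call completion; if it is missing from eos_token_id,
# generation can run past the end of a tool call and trigger parsing errors.
assert 200012 in config["eos_token_id"]
print("eos_token_id:", config["eos_token_id"])
```

Running the same check against your on-disk file is a quick way to tell whether you already have the fixed configuration.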

gpt-oss-120b Critical Commit

Commit: 8b193b0ef83bd41b40eb71fee8f1432315e02a3e

  • Fixed generation_config.json
  • Confirmed by user andresC98 to resolve token parsing errors
  • Applied to gpt-oss-120b model
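If you want exactly this revision, huggingface_hub can fetch a single file pinned to a commit. A hedged sketch (assumes the huggingface_hub package is installed; the function name is invented, and the commit hash is the one listed above):

```python
# Commit listed above as fixing generation_config.json for gpt-oss-120b
COMMIT = "8b193b0ef83bd41b40eb71fee8f1432315e02a3e"

def fetch_fixed_config(repo_id: str = "openai/gpt-oss-120b") -> str:
    """Download generation_config.json pinned to the fix commit; returns the local path.

    Performs a network download when called.
    """
    from huggingface_hub import hf_hub_download  # requires huggingface_hub
    return hf_hub_download(
        repo_id=repo_id,
        filename="generation_config.json",
        revision=COMMIT,
    )
```

Pinning the revision guards against pulling a later commit that you have not vetted.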

How to Update Model Files

Method 1: Re-download with HuggingFace CLI

# Install or update huggingface-hub
pip install --upgrade huggingface-hub

# For gpt-oss-20b
huggingface-cli download openai/gpt-oss-20b --local-dir ./gpt-oss-20b

# For gpt-oss-120b
huggingface-cli download openai/gpt-oss-120b --local-dir ./gpt-oss-120b

Method 2: Manual Update via Web

  1. Visit the HuggingFace model page (https://huggingface.co/openai/gpt-oss-20b or https://huggingface.co/openai/gpt-oss-120b)

  2. Navigate to "Files and versions" tab

  3. Download latest generation_config.json

  4. Replace in your local model directory:

    # Find your model directory (varies by vLLM installation)
    # Common locations:
    # ~/.cache/huggingface/hub/models--openai--gpt-oss-20b/
    # ./models/gpt-oss-20b/
    
    # Replace the file
    cp ~/Downloads/generation_config.json /path/to/model/directory/
    

Method 3: Update with git (if model was cloned)

cd /path/to/model/directory
git pull origin main

Verification Steps

After updating:

  1. Check file contents:

    cat generation_config.json
    

    Verify it matches the current version shown above.

  2. Check modification date:

    ls -l generation_config.json
    

    Should be recent (after the commit date).

  3. Restart vLLM server:

    # Stop existing server
    # Start with correct flags (see tool-calling-setup.md)
    vllm serve openai/gpt-oss-20b \
      --tool-call-parser openai \
      --enable-auto-tool-choice
    
  4. Test tool calling:

    from openai import OpenAI
    
    client = OpenAI(base_url="http://localhost:8000/v1")
    
    response = client.chat.completions.create(
        model="openai/gpt-oss-20b",
        messages=[{"role": "user", "content": "What's the weather?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the weather",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string"}
                    }
                }
            }
        }]
    )
    
    # A successful parse produces a tool call rather than a HarmonyError
    print(response.choices[0].message.tool_calls)
    

Troubleshooting Update Issues

vLLM Not Picking Up Changes

Symptom: Updated files but still getting errors

Solutions:

  1. Clear vLLM cache:

    rm -rf ~/.cache/vllm/
    
  2. Restart vLLM with fresh model load:

    # Use --download-dir to force specific directory
    vllm serve openai/gpt-oss-20b \
      --download-dir /path/to/models \
      --tool-call-parser openai \
      --enable-auto-tool-choice
    
  3. Check vLLM is loading the correct model directory:

    • Look for model path in vLLM startup logs
    • Verify it matches where you updated files

File Permission Issues

# Ensure files are readable
chmod 644 generation_config.json

# Check ownership
ls -l generation_config.json

Multiple Model Copies

Problem: vLLM might be loading from a different location

Solution:

  1. Find all copies:

    find ~/.cache -name "generation_config.json" -path "*/gpt-oss*"
    
  2. Update all copies or remove duplicates

  3. Use explicit --download-dir flag when starting vLLM
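The find-and-compare loop above can also be scripted. A sketch, assuming all copies live under a single cache root (the helper name is made up for illustration):

```python
from pathlib import Path

def find_gpt_oss_configs(root: Path) -> list[Path]:
    """Return every generation_config.json under `root` whose path mentions gpt-oss."""
    return sorted(
        p for p in root.rglob("generation_config.json")
        if "gpt-oss" in str(p)
    )

# Example: scan the default HuggingFace cache for duplicate copies
# for path in find_gpt_oss_configs(Path.home() / ".cache" / "huggingface"):
#     print(path)
```

If this returns more than one path per model, vLLM may be reading a copy you did not update.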

Additional Files to Check

While generation_config.json is the primary fix, also verify these files are current:

config.json

Contains model architecture configuration

tokenizer_config.json

Token encoding settings, including special tokens

special_tokens_map.json

Maps special token strings to IDs
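A quick way to confirm all four files are present and parseable. A minimal sketch (the function name is invented; point it at your local model directory):

```python
import json
from pathlib import Path

CONFIG_FILES = (
    "generation_config.json",
    "config.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
)

def check_config_files(model_dir: Path) -> dict[str, str]:
    """Report each expected config file as 'ok', 'missing', or 'invalid json'."""
    report = {}
    for name in CONFIG_FILES:
        path = model_dir / name
        if not path.is_file():
            report[name] = "missing"
            continue
        try:
            json.loads(path.read_text())
            report[name] = "ok"
        except json.JSONDecodeError:
            report[name] = "invalid json"
    return report

# Example:
# print(check_config_files(Path("./gpt-oss-20b")))
```

Any file reported as missing or invalid is a candidate for the forced re-download shown below.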

To update all:

huggingface-cli download openai/gpt-oss-20b \
  --local-dir ./gpt-oss-20b \
  --force-download

When to Update

Update model files when:

  • Encountering token parsing errors
  • HuggingFace shows recent commits to model repo
  • vLLM error messages reference token IDs
  • After vLLM version upgrades
  • Community reports fixes via file updates

Cross-References

  • Known issues: See known-issues.md
  • vLLM configuration: See tool-calling-setup.md