Updating gpt-oss Model Files
Why Update Model Files?
openai_harmony.HarmonyError: Unexpected token errors are often caused by an outdated generation_config.json. HuggingFace periodically updates these files to fix token-parsing issues.
Current Configuration Files
gpt-oss-20b generation_config.json
Latest version includes:
{
"bos_token_id": 199998,
"do_sample": true,
"eos_token_id": [
200002,
199999,
200012
],
"pad_token_id": 199999,
"transformers_version": "4.55.0.dev0"
}
Key elements:
- eos_token_id: Multiple EOS tokens including 200012 (tool call completion)
- do_sample: Enabled for generation diversity
- transformers_version: Indicates compatible transformers version
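A toy decode loop illustrates why listing several IDs in eos_token_id matters: generation must stop at whichever terminator appears first, whether that is an end-of-text token or the tool-call completion token 200012. This is an illustrative sketch (the function name is ours), not vLLM's actual sampling code:

```python
# Toy illustration: stop generation at the first token found in eos_token_id.
# The IDs below come from the gpt-oss-20b generation_config.json shown above.
EOS_TOKEN_IDS = {200002, 199999, 200012}

def truncate_at_eos(token_ids, eos_ids=EOS_TOKEN_IDS):
    """Return the prefix of token_ids up to (excluding) the first EOS token."""
    out = []
    for tok in token_ids:
        if tok in eos_ids:  # any configured terminator ends generation
            break
        out.append(tok)
    return out

# A stream that ends with the tool-call completion token 200012:
print(truncate_at_eos([11, 22, 33, 200012, 44]))  # [11, 22, 33]
```

If 200012 were missing from the config, the loop would run past the tool-call boundary, which is exactly the failure mode the updated file fixes.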
gpt-oss-120b Critical Commit
Commit: 8b193b0ef83bd41b40eb71fee8f1432315e02a3e
- Fixed generation_config.json
- Confirmed by user andresC98 to resolve the token parsing errors
- Applied to gpt-oss-120b model
How to Update Model Files
Method 1: Re-download with HuggingFace CLI
# Install or update huggingface-hub
pip install --upgrade huggingface-hub
# For gpt-oss-20b
huggingface-cli download openai/gpt-oss-20b --local-dir ./gpt-oss-20b
# For gpt-oss-120b
huggingface-cli download openai/gpt-oss-120b --local-dir ./gpt-oss-120b
Method 2: Manual Update via Web
1. Visit the HuggingFace model page:
   - gpt-oss-20b: https://huggingface.co/openai/gpt-oss-20b
   - gpt-oss-120b: https://huggingface.co/openai/gpt-oss-120b
2. Navigate to the "Files and versions" tab
3. Download the latest generation_config.json
4. Replace it in your local model directory:
# Find your model directory (varies by vLLM installation)
# Common locations:
# ~/.cache/huggingface/hub/models--openai--gpt-oss-20b/
# ./models/gpt-oss-20b/
# Replace the file
cp ~/Downloads/generation_config.json /path/to/model/directory/
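The HuggingFace hub cache names model directories deterministically, replacing / in the repo id with -- and prefixing models--, so the location can be derived rather than hunted for. A minimal sketch (the helper name is ours):

```python
from pathlib import Path

def hub_cache_dir(repo_id, cache_root="~/.cache/huggingface/hub"):
    """Derive the hub cache directory for a repo id, e.g.
    'openai/gpt-oss-20b' -> ~/.cache/huggingface/hub/models--openai--gpt-oss-20b
    """
    folder = "models--" + repo_id.replace("/", "--")
    return Path(cache_root).expanduser() / folder

print(hub_cache_dir("openai/gpt-oss-20b").name)  # models--openai--gpt-oss-20b
```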
Method 3: Update with git (if model was cloned)
cd /path/to/model/directory
git pull origin main
Verification Steps
After updating:
1. Check file contents:
cat generation_config.json
Verify it matches the current version shown above.
2. Check the modification date:
ls -l generation_config.json
The timestamp should be recent (after the commit date).
3. Restart the vLLM server:
# Stop the existing server
# Start with correct flags (see tool-calling-setup.md)
vllm serve openai/gpt-oss-20b \
  --tool-call-parser openai \
  --enable-auto-tool-choice
4. Test tool calling:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1")
response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "What's the weather?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                }
            }
        }
    }],
)
print(response)
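The file check above can also be scripted. The sketch below (the helper name is ours) loads a local generation_config.json and flags the fields most often tied to the token-parsing errors:

```python
import json
import os
import tempfile
from pathlib import Path

def check_generation_config(path):
    """Return a list of problems found in a local generation_config.json."""
    cfg = json.loads(Path(path).read_text())
    problems = []
    eos = cfg.get("eos_token_id")
    if isinstance(eos, int):
        eos = [eos]
    if not eos or 200012 not in eos:
        problems.append("eos_token_id missing 200012 (tool call completion)")
    if cfg.get("pad_token_id") != 199999:
        problems.append("unexpected pad_token_id")
    return problems

# Demo against the current gpt-oss-20b values quoted above:
demo = {"bos_token_id": 199998, "do_sample": True,
        "eos_token_id": [200002, 199999, 200012],
        "pad_token_id": 199999}
with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "generation_config.json")
    Path(p).write_text(json.dumps(demo))
    print(check_generation_config(p))  # [] -> config looks current
```

An empty list means the file matches the expectations above; a non-empty list suggests a stale copy that should be re-downloaded.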
Troubleshooting Update Issues
vLLM Not Picking Up Changes
Symptom: Updated files but still getting errors
Solutions:
1. Clear the vLLM cache:
rm -rf ~/.cache/vllm/
2. Restart vLLM with a fresh model load:
# Use --download-dir to force a specific directory
vllm serve openai/gpt-oss-20b \
  --download-dir /path/to/models \
  --tool-call-parser openai \
  --enable-auto-tool-choice
3. Check that vLLM is loading the correct model directory:
   - Look for the model path in the vLLM startup logs
   - Verify it matches where you updated the files
File Permission Issues
# Ensure files are readable
chmod 644 generation_config.json
# Check ownership
ls -l generation_config.json
Multiple Model Copies
Problem: vLLM might be loading from a different location
Solution:
1. Find all copies:
find ~/.cache -name "generation_config.json" -path "*/gpt-oss*"
2. Update all copies or remove the duplicates
3. Use an explicit --download-dir flag when starting vLLM
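The duplicate hunt can be scripted as well: hash every generation_config.json under a cache root and group the paths by content, so disagreeing copies stand out. A sketch (the function name is ours), under the assumption that differing hashes mean at least one copy is stale:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def group_configs_by_content(root, pattern="**/generation_config.json"):
    """Map sha256 digest -> list of matching file paths under root."""
    groups = defaultdict(list)
    for path in Path(root).glob(pattern):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return dict(groups)

# More than one key in the result means the copies disagree,
# and at least one of them needs updating.
```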
Additional Files to Check
While generation_config.json is the primary fix, also verify these files are current:
- config.json: model architecture configuration
- tokenizer_config.json: token encoding settings, including special tokens
- special_tokens_map.json: maps special token strings to IDs
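These checks can be folded into one quick pass that confirms each file exists and parses as JSON (the function name is ours; a parse failure usually means a truncated download):

```python
import json
from pathlib import Path

REQUIRED_FILES = [
    "generation_config.json",
    "config.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
]

def audit_model_dir(model_dir):
    """Return {filename: 'ok' | 'missing' | 'invalid json'} for a model dir."""
    report = {}
    for name in REQUIRED_FILES:
        path = Path(model_dir) / name
        if not path.is_file():
            report[name] = "missing"
            continue
        try:
            json.loads(path.read_text())
            report[name] = "ok"
        except json.JSONDecodeError:
            report[name] = "invalid json"
    return report
```

Anything other than "ok" across the board is a reason to force a fresh download, as below.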
To update all:
huggingface-cli download openai/gpt-oss-20b \
--local-dir ./gpt-oss-20b \
--force-download
When to Update
Update model files when:
- Encountering token parsing errors
- HuggingFace shows recent commits to model repo
- vLLM error messages reference token IDs
- After vLLM version upgrades
- Community reports fixes via file updates
Cross-References
- Known issues: See known-issues.md
- vLLM configuration: See tool-calling-setup.md