# Updating gpt-oss Model Files

## Why Update Model Files?

The `openai_harmony.HarmonyError: Unexpected token` errors are often caused by outdated `generation_config.json` files. HuggingFace updates these files to fix token parsing issues.

## Current Configuration Files

### gpt-oss-20b generation_config.json

Latest version includes:
```json
{
  "bos_token_id": 199998,
  "do_sample": true,
  "eos_token_id": [
    200002,
    199999,
    200012
  ],
  "pad_token_id": 199999,
  "transformers_version": "4.55.0.dev0"
}
```

**Key elements**:
- **eos_token_id**: Multiple EOS tokens including 200012 (tool call completion)
- **do_sample**: Enabled for generation diversity
- **transformers_version**: Indicates compatible transformers version

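To confirm a local copy already has the multi-token EOS list, a quick check of `eos_token_id` works. A minimal sketch, assuming the file sits at a placeholder path:

```python
import json

# Placeholder path: point this at your local model directory
config_path = "./gpt-oss-20b/generation_config.json"

with open(config_path) as f:
    config = json.load(f)

eos = config.get("eos_token_id")
# eos may be a single int or a list; normalize to a list
eos_list = eos if isinstance(eos, list) else [eos]

# 200012 is the tool call completion token; a missing entry is the
# usual sign of an outdated file
if 200012 in eos_list:
    print("OK: eos_token_id includes 200012")
else:
    print("Outdated: eos_token_id is missing 200012 - update the file")
```
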
### gpt-oss-120b Critical Commit

**Commit**: 8b193b0ef83bd41b40eb71fee8f1432315e02a3e
- Fixed generation_config.json
- Confirmed by user andresC98 to resolve the token parsing errors
- Applied to the gpt-oss-120b model

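Since the fix landed in this specific commit, the download can be pinned to it. A sketch using `huggingface_hub` (the local directory is a placeholder; later commits should also include the fix):

```python
from huggingface_hub import snapshot_download

# Pin the download to the commit that fixed generation_config.json
snapshot_download(
    repo_id="openai/gpt-oss-120b",
    revision="8b193b0ef83bd41b40eb71fee8f1432315e02a3e",
    local_dir="./gpt-oss-120b",  # placeholder: choose your own path
)
```
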
## How to Update Model Files

### Method 1: Re-download with HuggingFace CLI

```bash
# Install or update huggingface-hub
pip install --upgrade huggingface-hub

# For gpt-oss-20b
huggingface-cli download openai/gpt-oss-20b --local-dir ./gpt-oss-20b

# For gpt-oss-120b
huggingface-cli download openai/gpt-oss-120b --local-dir ./gpt-oss-120b
```
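
If you only need the config and tokenizer files refreshed (and not the multi-gigabyte weights), `huggingface_hub` can restrict the download with a pattern. A sketch, assuming the local directory used in the commands above:

```python
from huggingface_hub import snapshot_download

# Refresh only the JSON config/tokenizer files, skipping the weights
snapshot_download(
    repo_id="openai/gpt-oss-20b",
    local_dir="./gpt-oss-20b",
    allow_patterns=["*.json"],
)
```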

### Method 2: Manual Update via Web

1. Visit the HuggingFace model page:
   - gpt-oss-20b: https://huggingface.co/openai/gpt-oss-20b
   - gpt-oss-120b: https://huggingface.co/openai/gpt-oss-120b

2. Navigate to the "Files and versions" tab

3. Download the latest `generation_config.json`

4. Replace it in your local model directory:
```bash
# Find your model directory (varies by vLLM installation)
# Common locations:
#   ~/.cache/huggingface/hub/models--openai--gpt-oss-20b/
#   ./models/gpt-oss-20b/

# Replace the file
cp ~/Downloads/generation_config.json /path/to/model/directory/
```
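
Downloading the single file can also be scripted: `hf_hub_download` fetches just `generation_config.json` (a sketch; the target directory is a placeholder):

```python
from huggingface_hub import hf_hub_download

# Fetch only generation_config.json into the local model directory
path = hf_hub_download(
    repo_id="openai/gpt-oss-20b",
    filename="generation_config.json",
    local_dir="./gpt-oss-20b",  # placeholder: your model directory
)
print(f"Updated file at: {path}")
```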

### Method 3: Update with git (if model was cloned)

```bash
cd /path/to/model/directory
git pull origin main
```

## Verification Steps

After updating:

1. **Check file contents**:
```bash
cat generation_config.json
```

Verify it matches the current version shown above.

2. **Check the modification date**:
```bash
ls -l generation_config.json
```

The timestamp should be recent (after the fix commit date).

3. **Restart vLLM server**:
```bash
# Stop existing server
# Start with correct flags (see tool-calling-setup.md)
vllm serve openai/gpt-oss-20b \
  --tool-call-parser openai \
  --enable-auto-tool-choice
```

4. **Test tool calling**:
```python
from openai import OpenAI

# vLLM ignores the API key by default, but the client requires one to be set
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "What's the weather?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                }
            }
        }
    }]
)

print(response)
```
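
If the update worked, the request returns without a `HarmonyError`, and `response.choices[0].message.tool_calls` will typically contain a `get_weather` call.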

## Troubleshooting Update Issues

### vLLM Not Picking Up Changes

**Symptom**: You updated the files but still get errors.

**Solutions**:
1. Clear the vLLM cache:
```bash
rm -rf ~/.cache/vllm/
```

2. Restart vLLM with a fresh model load:
```bash
# Use --download-dir to force a specific directory
vllm serve openai/gpt-oss-20b \
  --download-dir /path/to/models \
  --tool-call-parser openai \
  --enable-auto-tool-choice
```

3. Check that vLLM is loading the correct model directory:
   - Look for the model path in the vLLM startup logs
   - Verify it matches the location where you updated the files

### File Permission Issues

```bash
# Ensure files are readable
chmod 644 generation_config.json

# Check ownership
ls -l generation_config.json
```

### Multiple Model Copies

**Problem**: vLLM might be loading the model from a different location.

**Solution**:
1. Find all copies:
```bash
find ~/.cache -name "generation_config.json" -path "*/gpt-oss*"
```

2. Update all copies or remove the duplicates (the Python sketch after this list prints each copy's `eos_token_id`)

3. Use an explicit `--download-dir` flag when starting vLLM

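A Python equivalent of the `find` step that also inspects each copy (a sketch; it assumes the default HuggingFace cache location):

```python
import json
from pathlib import Path

# Scan the default HF cache for gpt-oss generation_config.json copies
cache = Path.home() / ".cache" / "huggingface"
for path in cache.rglob("generation_config.json"):
    if "gpt-oss" in str(path):
        eos = json.loads(path.read_text()).get("eos_token_id")
        print(f"{path}\n  eos_token_id = {eos}")
```
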
## Additional Files to Check

While `generation_config.json` is the primary fix, also verify these files are current:

### config.json
Contains model architecture configuration

### tokenizer_config.json
Token encoding settings, including special tokens

### special_tokens_map.json
Maps special token strings to IDs

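One way to sanity-check the tokenizer files is to decode the special token IDs listed in `generation_config.json`: a current tokenizer should map them to named special tokens rather than ordinary text fragments. A sketch, assuming `transformers` is installed and the model lives at a placeholder path:

```python
from transformers import AutoTokenizer

# Load the tokenizer from the local model directory (placeholder path)
tok = AutoTokenizer.from_pretrained("./gpt-oss-20b")

# These IDs come from generation_config.json above; each should decode
# to a named special token string
for token_id in (199998, 199999, 200002, 200012):
    print(token_id, "->", repr(tok.decode([token_id])))
```
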
**To update all**:
```bash
huggingface-cli download openai/gpt-oss-20b \
  --local-dir ./gpt-oss-20b \
  --force-download
```

## When to Update

Update model files when:
- You encounter token parsing errors
- HuggingFace shows recent commits to the model repo
- vLLM error messages reference token IDs
- You have upgraded vLLM
- The community reports fixes via file updates

## Cross-References

- Known issues: see known-issues.md
- vLLM configuration: see tool-calling-setup.md