# Updating gpt-oss Model Files

## Why Update Model Files?

The `openai_harmony.HarmonyError: Unexpected token` errors are often caused by outdated `generation_config.json` files. HuggingFace updates these files to fix token parsing issues.

## Current Configuration Files

### gpt-oss-20b generation_config.json

Latest version includes:

```json
{
  "bos_token_id": 199998,
  "do_sample": true,
  "eos_token_id": [
    200002,
    199999,
    200012
  ],
  "pad_token_id": 199999,
  "transformers_version": "4.55.0.dev0"
}
```

**Key elements**:
- **eos_token_id**: multiple EOS tokens, including 200012 (tool-call completion)
- **do_sample**: sampling enabled for generation diversity
- **transformers_version**: the compatible transformers version
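
A quick way to confirm that a local copy carries these elements is a small script. This is a minimal sketch: it only checks that the tool-call completion token (200012) appears among the EOS tokens, which is the element most often missing from stale configs:

```python
def eos_ids(config: dict) -> list:
    """Return eos_token_id as a list (configs may store an int or a list)."""
    ids = config.get("eos_token_id", [])
    return [ids] if isinstance(ids, int) else ids

def has_tool_call_eos(config: dict) -> bool:
    """True if the tool-call completion token (200012) is registered as EOS."""
    return 200012 in eos_ids(config)

# Example: the current gpt-oss-20b configuration shown above
current = {
    "bos_token_id": 199998,
    "do_sample": True,
    "eos_token_id": [200002, 199999, 200012],
    "pad_token_id": 199999,
}
print(has_tool_call_eos(current))  # True
```

To check a downloaded file, load it with `json.load(open(path))` and pass the resulting dict to `has_tool_call_eos`.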

### gpt-oss-120b Critical Commit

**Commit**: 8b193b0ef83bd41b40eb71fee8f1432315e02a3e
- Fixed generation_config.json
- Confirmed by user andresC98 to resolve token parsing errors
- Applied to the gpt-oss-120b model

## How to Update Model Files

### Method 1: Re-download with HuggingFace CLI

```bash
# Install or update huggingface-hub
pip install --upgrade huggingface-hub

# For gpt-oss-20b
huggingface-cli download openai/gpt-oss-20b --local-dir ./gpt-oss-20b

# For gpt-oss-120b
huggingface-cli download openai/gpt-oss-120b --local-dir ./gpt-oss-120b
```

### Method 2: Manual Update via Web

1. Visit the HuggingFace model page:
   - gpt-oss-20b: https://huggingface.co/openai/gpt-oss-20b
   - gpt-oss-120b: https://huggingface.co/openai/gpt-oss-120b

2. Navigate to the "Files and versions" tab

3. Download the latest `generation_config.json`

4. Replace it in your local model directory:

```bash
# Find your model directory (varies by vLLM installation)
# Common locations:
#   ~/.cache/huggingface/hub/models--openai--gpt-oss-20b/
#   ./models/gpt-oss-20b/

# Replace the file
cp ~/Downloads/generation_config.json /path/to/model/directory/
```

### Method 3: Update with git (if the model was cloned)

```bash
cd /path/to/model/directory
git pull origin main
```

## Verification Steps

After updating:

1. **Check file contents**:

```bash
cat generation_config.json
```

Verify it matches the current version shown above.

2. **Check the modification date**:

```bash
ls -l generation_config.json
```

The timestamp should be recent (after the commit date).

3. **Restart the vLLM server**:

```bash
# Stop the existing server, then start with the correct flags
# (see tool-calling-setup.md)
vllm serve openai/gpt-oss-20b \
  --tool-call-parser openai \
  --enable-auto-tool-choice
```

4. **Test tool calling**:

```python
from openai import OpenAI

# vLLM ignores the API key, but the client requires one to be set
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "What's the weather?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                }
            }
        }
    }]
)

print(response)
```
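
In a healthy response, the assistant message carries a structured tool call; when the config is stale, raw special tokens tend to leak into the text content instead. The helpers below sketch that distinction; the message dicts are illustrative examples, not real server output, and the leak check is only a heuristic:

```python
def has_tool_call(message: dict) -> bool:
    """True if an assistant message carries at least one structured tool call."""
    return bool(message.get("tool_calls"))

def leaks_special_tokens(message: dict) -> bool:
    """Heuristic: raw special-token markers in content suggest a parsing failure."""
    content = message.get("content") or ""
    return "<|" in content

# Illustrative examples of a healthy and a broken reply
healthy = {"role": "assistant", "content": None,
           "tool_calls": [{"type": "function",
                           "function": {"name": "get_weather",
                                        "arguments": '{"location": "Paris"}'}}]}
broken = {"role": "assistant", "content": "<|channel|>commentary ..."}

print(has_tool_call(healthy))        # True
print(leaks_special_tokens(broken))  # True
```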

## Troubleshooting Update Issues

### vLLM Not Picking Up Changes

**Symptom**: files were updated, but the errors persist

**Solutions**:

1. Clear the vLLM cache:

```bash
rm -rf ~/.cache/vllm/
```

2. Restart vLLM with a fresh model load:

```bash
# Use --download-dir to force a specific directory
vllm serve openai/gpt-oss-20b \
  --download-dir /path/to/models \
  --tool-call-parser openai \
  --enable-auto-tool-choice
```

3. Check that vLLM is loading the correct model directory:
   - Look for the model path in the vLLM startup logs
   - Verify it matches where you updated the files

### File Permission Issues

```bash
# Ensure the files are readable
chmod 644 generation_config.json

# Check ownership
ls -l generation_config.json
```

### Multiple Model Copies

**Problem**: vLLM might be loading the model from a different location

**Solution**:

1. Find all copies:

```bash
find ~/.cache -name "generation_config.json" -path "*/gpt-oss*"
```

2. Update all copies, or remove the duplicates

3. Use an explicit `--download-dir` flag when starting vLLM
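
The find-and-compare step can be scripted. This sketch hashes every copy so divergent (stale) duplicates stand out; the search root `~/.cache` is an assumption borrowed from the `find` command above, so adjust it to your setup:

```python
import hashlib
from pathlib import Path

def config_copies(root: Path) -> dict:
    """Map each gpt-oss generation_config.json under root to its SHA-256 digest."""
    return {
        path: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("generation_config.json")
        if "gpt-oss" in str(path)
    }

copies = config_copies(Path.home() / ".cache")
if len(set(copies.values())) > 1:
    print("Divergent copies found -- update or remove the stale ones:")
    for path, digest in copies.items():
        print(f"  {digest[:12]}  {path}")
```

If all digests match, the duplicates are at least consistent; a single differing hash pinpoints the copy vLLM may still be loading stale data from.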

## Additional Files to Check

While `generation_config.json` is the primary fix, also verify these files are current:

### config.json
Contains the model architecture configuration

### tokenizer_config.json
Token encoding settings, including special tokens

### special_tokens_map.json
Maps special token strings to IDs
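
A short script can confirm that all four files are present before restarting vLLM. The directory path here is a placeholder, and the check only tests for existence, not freshness:

```python
from pathlib import Path

# The four configuration files discussed above
REQUIRED = [
    "generation_config.json",
    "config.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
]

def missing_files(model_dir: Path) -> list:
    """Names of required config files absent from the model directory."""
    return [name for name in REQUIRED if not (model_dir / name).exists()]

# Placeholder path -- point this at your local model directory
missing = missing_files(Path("./gpt-oss-20b"))
if missing:
    print("Missing:", ", ".join(missing))
```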

**To update all of them at once**:

```bash
huggingface-cli download openai/gpt-oss-20b \
  --local-dir ./gpt-oss-20b \
  --force-download
```

## When to Update

Update model files when:
- Encountering token parsing errors
- HuggingFace shows recent commits to the model repo
- vLLM error messages reference specific token IDs
- Upgrading vLLM versions
- The community reports fixes delivered via file updates

## Cross-References

- Known issues: see known-issues.md
- vLLM configuration: see tool-calling-setup.md