---
datasets:
- nvidia/OpenCodeReasoning-2
- GetSoloTech/Code-Reasoning
base_model: GetSoloTech/GPT-OSS-Code-Reasoning-20B
library_name: mlx
tags:
- code-reasoning
- coding
- reasoning
- problem-solving
- algorithms
- python
- c++
- competitive-programming
- vllm
- mlx
pipeline_tag: text-generation
---

# GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx
This is an experimental quant that applies mixed precision to selected layers, encoded with group size 32.

Side effects of quantizing with the qx86-hi formula:
```
I needed Haskell code.

The q6 starts with Haskell, writes Python 10k tokens down the road, and finishes with React.
The q6-hi, encoded with group size 32, writes some Haskell and stops somewhere around 20k tokens.
The qx86-hi worked for 40k tokens, reasoning around the Haskell solution without skipping a beat.
```
From the original model card:

## Overview
```
Base model: openai/gpt-oss-20b
Objective: Supervised fine-tuning for competitive programming and algorithmic reasoning
Dataset: nvidia/OpenCodeReasoning-2 (OCR-2), combining the python and cpp splits.
Each sample reconstructs the upstream question and uses the dataset's r1_generation as the assistant response.
Context length: 4096 tokens
Training method: LoRA SFT via the TRL SFTTrainer
```
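As a rough illustration of that recipe, here is a minimal sketch of a LoRA SFT run with the TRL SFTTrainer. It is an assumption-laden sketch, not the authors' script: the LoRA hyperparameters, dataset config handling, and output path are illustrative.

```python
# Minimal sketch of the LoRA SFT recipe described above -- illustrative,
# not the authors' actual training script.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# OCR-2; the exact config names for the python/cpp mix are assumed
dataset = load_dataset("nvidia/OpenCodeReasoning-2", split="train")

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",
    train_dataset=dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
    # max_length matches the 4096-token context above
    # (older TRL versions call this max_seq_length)
    args=SFTConfig(max_length=4096, output_dir="gpt-oss-code-reasoning-sft"),
)
trainer.train()
```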
## Intended Use

```
Intended: Generating Python/C++ solutions and reasoning for competitive programming tasks
Out of scope: Safety-critical applications. May hallucinate or produce incorrect/inefficient code
```
This model [GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx](https://huggingface.co/GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx) was converted to MLX format from [GetSoloTech/GPT-OSS-Code-Reasoning-20B](https://huggingface.co/GetSoloTech/GPT-OSS-Code-Reasoning-20B) using mlx-lm version **0.26.4**.
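For reference, the conversion step looks roughly like the sketch below, using mlx_lm's Python convert API. The flags shown produce a uniform group-size-32 quant; the qx86 mixed-precision recipe (different bit widths for selected layers) is not captured by these options, and the q_bits value is an assumption.

```python
# Sketch of the MLX conversion -- not the exact qx86-hi recipe.
from mlx_lm import convert

convert(
    "GetSoloTech/GPT-OSS-Code-Reasoning-20B",
    mlx_path="GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx",
    quantize=True,
    q_group_size=32,  # the "hi" variant's group size
    q_bits=6,         # assumed base precision; qx86 mixes precisions per layer
)
```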
## Use with mlx

```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate

# Load the quantized model and tokenizer from the local path or the Hub
model, tokenizer = load("GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx")

prompt = "hello"

# Apply the model's chat template when one is defined
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
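Note that the long reasoning traces described earlier run to tens of thousands of tokens; `generate` accepts a `max_tokens` argument for this, and the value below is illustrative.

```python
# Raise the token budget for long reasoning traces (value is illustrative)
response = generate(model, tokenizer, prompt=prompt, max_tokens=40000, verbose=True)
```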