---
datasets:
- nvidia/OpenCodeReasoning-2
- GetSoloTech/Code-Reasoning
base_model: GetSoloTech/GPT-OSS-Code-Reasoning-20B
library_name: mlx
tags:
- code-reasoning
- coding
- reasoning
- problem-solving
- algorithms
- python
- c++
- competitive-programming
- vllm
- mlx
pipeline_tag: text-generation
---

# GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx

This is an experimental quantization with mixed-precision, selectively assigned layers: some layers are compressed to 6-bit, and all are quantized with group size 32.

Side effects of quantizing with the qx86-hi formula:

```bash
I needed Haskell code.
The q6 starts with Haskell, 10k tokens down the road writes Python, and finishes with React at 30k tokens.
The q6-hi, encoded with group size 32, writes some Haskell and stops around 24k tokens.
The qx86-hi worked for 43k tokens, reasoning around the Haskell solution without skipping a beat.
It didn't forget Python, it's just a bit more open-minded about other languages.
```

From the original model card:

## Overview

- Base model: openai/gpt-oss-20b
- Objective: supervised fine-tuning for competitive programming and algorithmic reasoning
- Dataset: nvidia/OpenCodeReasoning-2 (OCR-2), combining the python and cpp splits. Each sample reconstructs the upstream question and uses the dataset's r1_generation as the assistant response
- Context length: 4096 tokens
- Training method: LoRA SFT via TRL SFTTrainer

## Intended Use

- Intended: generating Python/C++ solutions and reasoning for competitive programming tasks
- Out of scope: safety-critical applications; the model may hallucinate or produce incorrect or inefficient code

This model [GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx](https://huggingface.co/GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx) was converted to MLX format from [GetSoloTech/GPT-OSS-Code-Reasoning-20B](https://huggingface.co/GetSoloTech/GPT-OSS-Code-Reasoning-20B) using mlx-lm version **0.26.4**.

## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
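Since the model is tuned for competitive programming, a more representative prompt looks like the sketch below. It assumes the same `load`/`generate` API as above; the problem statement and the `max_tokens` budget are illustrative and not taken from the original card.

```python
from mlx_lm import load, generate

model, tokenizer = load("GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx")

# Illustrative competitive-programming task (not from the original card)
problem = (
    "Given an array of n integers and an integer k, count the pairs (i, j) "
    "with i < j whose sum is divisible by k. Write a complete Python solution "
    "that reads from standard input and prints the answer."
)

messages = [{"role": "user", "content": problem}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Reasoning traces can run long, so allow a generous token budget
response = generate(model, tokenizer, prompt=prompt, max_tokens=4096, verbose=True)
```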
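For reference, fixed-precision MLX quantizations of the base model can be produced with mlx-lm's convert entry point. The sketch below shows a uniform 6-bit conversion with group size 32 (the q6-hi variant mentioned above); the actual qx86-hi recipe mixes per-layer bit widths and is not reproduced by these flags alone.

```bash
pip install mlx-lm

# Uniform 6-bit quantization, group size 32 (approximation only;
# qx86-hi assigns different precisions to selected layers)
mlx_lm.convert \
  --hf-path GetSoloTech/GPT-OSS-Code-Reasoning-20B \
  --mlx-path GPT-OSS-Code-Reasoning-20B-q6-hi-mlx \
  -q --q-bits 6 --q-group-size 32
```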