---
datasets:
- nvidia/OpenCodeReasoning-2
- GetSoloTech/Code-Reasoning
base_model: GetSoloTech/GPT-OSS-Code-Reasoning-20B
library_name: mlx
tags:
- code-reasoning
- coding
- reasoning
- problem-solving
- algorithms
- python
- c++
- competitive-programming
- vllm
- mlx
pipeline_tag: text-generation
---

# GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx
This is an experimental quant that applies mixed precision to selected layers, encoded with group size 32.

Side effects of quantizing with the qx86-hi formula:
```
I needed Haskell code.

The q6 starts with Haskell, writes Python 10k tokens down the road, and finishes with React.
The q6-hi, encoded with group size 32, writes some Haskell and stops somewhere around 20k tokens.
The qx86-hi worked for 40k tokens, reasoning around the Haskell solution without skipping a beat.
```
From the original model card:

## Overview
```
Base model: openai/gpt-oss-20b
Objective: Supervised fine-tuning for competitive programming and algorithmic reasoning
Dataset: nvidia/OpenCodeReasoning-2 (OCR-2), combining the python and cpp splits.
Each sample reconstructs the upstream question and uses the dataset's r1_generation as the assistant response.
Context length: 4096 tokens
Training method: LoRA SFT via the TRL SFTTrainer
```
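As a rough illustration of that recipe, here is a minimal sketch of a LoRA SFT run with the TRL SFTTrainer. It is an assumption-laden sketch, not the authors' script: the LoRA hyperparameters, dataset config handling, and output path are illustrative.

```python
# Minimal sketch of the LoRA SFT recipe described above -- illustrative,
# not the authors' actual training script.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# OCR-2; the exact config names for the python/cpp mix are assumed
dataset = load_dataset("nvidia/OpenCodeReasoning-2", split="train")

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",
    train_dataset=dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
    # max_length matches the 4096-token context above
    # (older TRL versions call this max_seq_length)
    args=SFTConfig(max_length=4096, output_dir="gpt-oss-code-reasoning-sft"),
)
trainer.train()
```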
## Intended Use

```
Intended: Generating Python/C++ solutions and reasoning for competitive programming tasks
Out of scope: Safety-critical applications. May hallucinate or produce incorrect/inefficient code
```
This model [GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx](https://huggingface.co/GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx) was converted to MLX format from [GetSoloTech/GPT-OSS-Code-Reasoning-20B](https://huggingface.co/GetSoloTech/GPT-OSS-Code-Reasoning-20B) using mlx-lm version **0.26.4**.
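For reference, the conversion step looks roughly like the sketch below, using mlx_lm's Python convert API. The flags shown produce a uniform group-size-32 quant; the qx86 mixed-precision recipe (different bit widths for selected layers) is not captured by these options, and the q_bits value is an assumption.

```python
# Sketch of the MLX conversion -- not the exact qx86-hi recipe.
from mlx_lm import convert

convert(
    "GetSoloTech/GPT-OSS-Code-Reasoning-20B",
    mlx_path="GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx",
    quantize=True,
    q_group_size=32,  # the "hi" variant's group size
    q_bits=6,         # assumed base precision; qx86 mixes precisions per layer
)
```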
## Use with mlx

```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate

# Load the quantized model and tokenizer from the local path or the Hub
model, tokenizer = load("GPT-OSS-Code-Reasoning-20B-qx86-hi-mlx")

prompt = "hello"

# Apply the model's chat template when one is defined
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
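Note that the long reasoning traces described earlier run to tens of thousands of tokens; `generate` accepts a `max_tokens` argument for this, and the value below is illustrative.

```python
# Raise the token budget for long reasoning traces (value is illustrative)
response = generate(model, tokenizer, prompt=prompt, max_tokens=40000, verbose=True)
```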