---
license: apache-2.0
base_model:
- nvidia/OpenReasoning-Nemotron-32B
datasets:
- HuggingFaceH4/ultrachat_200k
---
# OpenReasoning-Nemotron-32B-W8A8-INT8-Dynamic
## Method
Quantised using [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor.git) with the following recipe:
```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]
```
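
The recipe first applies SmoothQuant to migrate activation outliers into the weights, then GPTQ quantises all Linear layers to the W8A8 scheme (INT8 weights with dynamic per-token INT8 activations), leaving `lm_head` in full precision. For reference, below is a minimal end-to-end sketch of how such a recipe is typically applied with llm-compressor's `oneshot` API; the calibration sample count, sequence length, preprocessing, and output directory are assumptions, not the exact settings used to produce this checkpoint.

```python
# Illustrative sketch only: calibration settings and output paths are assumed,
# not the exact values used to create this model.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

MODEL_ID = "nvidia/OpenReasoning-Nemotron-32B"
NUM_CALIBRATION_SAMPLES = 512   # assumed
MAX_SEQUENCE_LENGTH = 2048      # assumed

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Calibration data: the dataset listed in the card metadata.
ds = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
ds = ds.shuffle(seed=42).select(range(NUM_CALIBRATION_SAMPLES))

def preprocess(example):
    # Render each conversation with the model's chat template.
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}

ds = ds.map(preprocess)

def tokenize(sample):
    return tokenizer(
        sample["text"],
        padding=False,
        max_length=MAX_SEQUENCE_LENGTH,
        truncation=True,
        add_special_tokens=False,
    )

ds = ds.map(tokenize, remove_columns=ds.column_names)

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)

model.save_pretrained("OpenReasoning-Nemotron-32B-W8A8-INT8-Dynamic", save_compressed=True)
tokenizer.save_pretrained("OpenReasoning-Nemotron-32B-W8A8-INT8-Dynamic")
```

The saved checkpoint uses the compressed-tensors format and can be loaded directly by vLLM for serving.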