keisawada committed on
Commit 620364f · verified · 1 Parent(s): 41c6fb7

Update README.md

Files changed (1)
  1. README.md +24 -5
README.md CHANGED
@@ -3,7 +3,6 @@ thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rin
 license: apache-2.0
 language:
 - ja
-- en
 tags:
 - qwen2
 - conversational
@@ -22,9 +21,13 @@ library_name: transformers
 
 This model is a 4-bit quantized model for [rinna/qwen2.5-bakeneko-32b-instruct](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct) using [AutoAWQ](https://github.com/casper-hansen/AutoAWQ). The quantized version is 4x smaller than the original model and thus requires less memory and provides faster inference.
 
-| Size | Continual Pre-Training | Instruction Tuning | DeepSeek-R1 Distillation
-| :- | :- | :- | :-
-| 32B | Qwen2.5 Bakeneko 32B [[HF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b) | Qwen2.5 Bakeneko 32B Instruct [[HF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct)[[AWQ]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-awq)[[GGUF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gguf)[[GPTQ int8]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gptq-int8)[[GPTQ int4]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gptq-int4)| DeepSeek R1 Distill Qwen2.5 Bakeneko 32B [[HF]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b)[[AWQ]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-awq)[[GGUF]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gguf)[[GPTQ int8]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gptq-int8)[[GPTQ int4]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gptq-int4)
+| Model Type | Model Name
+| :- | :-
+| Japanese Continual Pre-Training Model | Qwen2.5 Bakeneko 32B [[HF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b)
+| Instruction-Tuning Model | Qwen2.5 Bakeneko 32B Instruct [[HF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct)[[AWQ]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-awq)[[GGUF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gguf)[[GPTQ int8]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gptq-int8)[[GPTQ int4]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-gptq-int4)
+| DeepSeek R1 Distill Qwen2.5 Merged Reasoning Model | DeepSeek R1 Distill Qwen2.5 Bakeneko 32B [[HF]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b)[[AWQ]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-awq)[[GGUF]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gguf)[[GPTQ int8]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gptq-int8)[[GPTQ int4]](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b-gptq-int4)
+| QwQ Merged Reasoning Model | QwQ Bakeneko 32B [[HF]](https://huggingface.co/rinna/qwq-bakeneko-32b)[[AWQ]](https://huggingface.co/rinna/qwq-bakeneko-32b-awq)[[GGUF]](https://huggingface.co/rinna/qwq-bakeneko-32b-gguf)[[GPTQ int8]](https://huggingface.co/rinna/qwq-bakeneko-32b-gptq-int8)[[GPTQ int4]](https://huggingface.co/rinna/qwq-bakeneko-32b-gptq-int4)
+| QwQ Bakeneko Merged Instruction-Tuning Model | Qwen2.5 Bakeneko 32B Instruct V2 [[HF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2)[[AWQ]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2-awq)[[GGUF]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2-gguf)[[GPTQ int8]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2-gptq-int8)[[GPTQ int4]](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2-gptq-int4)
 
 See [rinna/qwen2.5-bakeneko-32b-instruct](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct) for details about model architecture and data.
 
@@ -33,11 +36,27 @@ See [rinna/qwen2.5-bakeneko-32b-instruct](https://huggingface.co/rinna/qwen2.5-b
 - [Xinqi Chen](https://huggingface.co/Keely0419)
 - [Kei Sawada](https://huggingface.co/keisawada)
 
+* **Release date**
+
+    February 13, 2025
+
 ---
 
 # Benchmarking
 
-Please refer to [rinna's LM benchmark page](https://rinnakk.github.io/research/benchmarks/lm/index.html).
+| Model | Japanese LM Evaluation Harness | Japanese MT-Bench (first turn) | Japanese MT-Bench (multi turn)
+| :- | :-: | :-: | :-:
+| [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) | 79.46 | - | -
+| [rinna/qwen2.5-bakeneko-32b](https://huggingface.co/rinna/qwen2.5-bakeneko-32b) | 79.18 | - | -
+| [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) | 78.29 | 8.13 | 7.54
+| [rinna/qwen2.5-bakeneko-32b-instruct](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct) | 79.62 | 8.17 | 7.66
+| [rinna/qwen2.5-bakeneko-32b-instruct-v2](https://huggingface.co/rinna/qwen2.5-bakeneko-32b-instruct-v2) | 77.92 | 8.86 | 8.53
+| [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) | 73.51 | 7.39 | 6.88
+| [rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b](https://huggingface.co/rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b) | 77.43 | 8.58 | 8.19
+| [Qwen/QwQ-32B](https://huggingface.co/Qwen/QwQ-32B) | 76.12 | 8.58 | 8.25
+| [rinna/qwq-bakeneko-32b](https://huggingface.co/rinna/qwq-bakeneko-32b) | 78.31 | 8.81 | 8.52
+
+For detailed benchmarking results, please refer to [rinna's LM benchmark page (Sheet 20250213)](https://rinnakk.github.io/research/benchmarks/lm/index.html).
 
 ---
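For readers who want to try the checkpoint described in this diff, here is a minimal loading sketch. It assumes the standard Hugging Face Transformers AWQ path (`transformers` with the `autoawq` kernels installed) and a CUDA GPU; the sample prompt and generation settings are illustrative and are not taken from the model card.

```python
# Minimal sketch: load rinna/qwen2.5-bakeneko-32b-instruct-awq with Transformers.
# Assumes `pip install transformers autoawq` and a CUDA GPU with enough VRAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/qwen2.5-bakeneko-32b-instruct-awq"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # AWQ kernels compute in fp16
    device_map="auto",          # spread layers across available GPUs
)

# Build the prompt with the model's built-in chat template.
# Sample Japanese question ("Who was Kitaro Nishida?"); this model targets Japanese.
messages = [{"role": "user", "content": "西田幾多郎とはどんな人物ですか?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

As a rough sanity check on the "4x smaller" claim above: 32B parameters at 16 bits is about 64 GB of weights, so a 4-bit AWQ checkpoint lands on the order of 16-20 GB once quantization scales and zero points are included.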