Safetensors
qwen3
Commit 9ea38bd by ehartford · verified · 1 parent: aa9e8d9

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -19,6 +19,8 @@ This model was made possible by excellent AMD mi300x compute generously provided
 
 **⚠️ Experimental Model**: This model is created through weight interpolation and duplication, and has not been further trained. Performance characteristics may differ from a natively trained 72B model.
 
+As is, this model underperforms Qwen3-32B. The intent is to create a target suitable for distillation from Qwen3-235B.
+
 ## Key Features
 
 - ✅ Full Qwen3-72B architecture (8192 hidden, 80 layers)