QuixiAI/Qwen3-72B-Embiggened


ehartford committed on Jun 14

Commit 9ea38bd (verified) · 1 Parent(s): aa9e8d9

Update README.md

Files changed (1)

README.md +2 -0


@@ -19,6 +19,8 @@ This model was made possible by excellent AMD mi300x compute generously provided
 **⚠️ Experimental Model**: This model is created through weight interpolation and duplication, and has not been further trained. Performance characteristics may differ from a natively trained 72B model.
+As is, this model underperforms Qwen3-32B.  The intent is to create a target suitable for distillation from Qwen3-235B.
 ## Key Features
 - ✅ Full Qwen3-72B architecture (8192 hidden, 80 layers)