This is DarwinLM pruned from Llama-3.1-8B. The model is masked: the pruned weights are set to 0, while the remaining weights are identical to those of the original model.

The shapes of all weight tensors are the same as in the original model.

# To use the model

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Shengkun/DarwinLM-4.6B-Llama3.1-8B-Pruned-Masked")
```
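Since the pruned weights are stored as zeros rather than removed, you can inspect the mask directly. The sketch below is a minimal, non-authoritative example: it assumes pruned entries are exactly 0.0 and uses the standard Llama projection-layer names found in `transformers`.

```python
import torch
from transformers import AutoModelForCausalLM

# Minimal sketch: count exact-zero entries in the projection weights to see
# the effect of the pruning mask (assumes pruned weights are stored as 0.0).
model = AutoModelForCausalLM.from_pretrained(
    "Shengkun/DarwinLM-4.6B-Llama3.1-8B-Pruned-Masked",
    torch_dtype=torch.float16,
)

total, zeros = 0, 0
for name, param in model.named_parameters():
    # Only look at the 2-D linear projection weights inside the decoder layers.
    if "proj" in name and param.dim() == 2:
        total += param.numel()
        zeros += (param == 0).sum().item()

print(f"Zeroed (masked) fraction of projection weights: {zeros / total:.2%}")
```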

# Evaluation (4.6B)

| Model | Method | Param. | SciQ | PIQA | WG | ArcE | ArcC | HS | LogiQA | BoolQ | MMLU | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama-3.1-8B | Dense | 8B | 96.3 | 81.2 | 74.3 | 81.4 | 58.2 | 81.7 | 31.1 | 84.0 | 65.2 | 72.8 |
| | Uniform | 4.5B | 29.1 | 53.6 | 51.7 | 26.0 | 23.6 | 27.1 | 25.5 | 62.1 | 25.7 | 36.1 |
| | ZipLM | 6B | 65.5 | 60.6 | 56.0 | 40.2 | 34.4 | 34.4 | 28.1 | 63.0 | 27.9 | 45.7 |
| | DarwinLM (one-shot) | 4.6B | 84.9 | 69.4 | 57.3 | 59.6 | 34.2 | 44.6 | 24.1 | 62.2 | 28.5 | 51.6 |
| OLMO (2.5T) | | 7B | 92.8 | 79.4 | 70.4 | 73.3 | 44.9 | 77.1 | 27.9 | 72.5 | 28.3 | 62.9 |
| DarwinLM (10.0B) | | 4.6B | 93.2 | 74.8 | 67.4 | 73.2 | 51.6 | 71.3 | 30.7 | 71.1 | 40.6 | 63.7 |
