This is DarwinLM, pruned from Llama-3.1-8B. The model is stored in masked form: the pruned weights are set to 0, while the remaining weights are identical to those of the original model. All weight tensors therefore keep the same shape as in the original model.
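Because the pruning is expressed as a mask, the sparsity can be checked directly from the checkpoint. A minimal sketch, assuming the pruned weights are stored as exact zeros (the counting logic below is illustrative, not part of the official release):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Shengkun/DarwinLM-4.6B-Llama3.1-8B-Pruned-Masked",
    torch_dtype=torch.bfloat16,
)

# Count exact zeros in the 2-D weight matrices to estimate the masked fraction.
total, zeros = 0, 0
for name, param in model.named_parameters():
    if param.dim() >= 2:  # skip biases and norm scales
        total += param.numel()
        zeros += (param == 0).sum().item()

print(f"Zeroed (pruned) fraction of weights: {zeros / total:.2%}")
```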
# To use the model

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Shengkun/DarwinLM-4.6B-Llama3.1-8B-Pruned-Masked")
```
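For text generation, a minimal end-to-end sketch (the prompt, dtype, and decoding settings below are illustrative, not from the official card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Shengkun/DarwinLM-4.6B-Llama3.1-8B-Pruned-Masked"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Greedy decoding on an illustrative prompt.
inputs = tokenizer("The theory of evolution states that", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```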
# Evaluation (4.6B)
| Model | Method | Param. | SciQ | PIQA | WG | ArcE | ArcC | HS | LogiQA | BoolQ | MMLU | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama-3.1-8B | Dense | 8B | 96.3 | 81.2 | 74.3 | 81.4 | 58.2 | 81.7 | 31.1 | 84.0 | 65.2 | 72.8 |
| | Uniform | 4.5B | 29.1 | 53.6 | 51.7 | 26.0 | 23.6 | 27.1 | 25.5 | 62.1 | 25.7 | 36.1 |
| | ZipLM | 6B | 65.5 | 60.6 | 56.0 | 40.2 | 34.4 | 34.4 | 28.1 | 63.0 | 27.9 | 45.7 |
| | DarwinLM (one-shot) | 4.6B | 84.9 | 69.4 | 57.3 | 59.6 | 34.2 | 44.6 | 24.1 | 62.2 | 28.5 | 51.6 |
| OLMO (2.5T) | | 7B | 92.8 | 79.4 | 70.4 | 73.3 | 44.9 | 77.1 | 27.9 | 72.5 | 28.3 | 62.9 |
| | DarwinLM (10.0B) | 4.6B | 93.2 | 74.8 | 67.4 | 73.2 | 51.6 | 71.3 | 30.7 | 71.1 | 40.6 | 63.7 |
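The card does not state the exact evaluation setup. A common way to reproduce zero-shot scores of this kind is EleutherAI's lm-evaluation-harness; the task names and settings below are assumptions, not the paper's verified configuration:

```python
# pip install lm-eval
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Shengkun/DarwinLM-4.6B-Llama3.1-8B-Pruned-Masked,dtype=bfloat16",
    tasks=["sciq", "piqa", "winogrande", "arc_easy", "arc_challenge",
           "hellaswag", "logiqa", "boolq", "mmlu"],
)

# Print the per-task metric dictionaries.
for task, metrics in results["results"].items():
    print(task, metrics)
```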