This is a DarwinLM model pruned from Mistral-Minitron-8B. The checkpoint is stored in masked form: the pruned weights are set to 0 while the remaining weights are identical to the original model, so every weight tensor keeps the same shape as in the original model.
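To make the masked format concrete, here is an illustrative plain-PyTorch sketch (not part of the release): pruning by masking multiplies a weight matrix by a 0/1 mask, so pruned entries become exact zeros while the tensor shape is unchanged.

```python
# Illustrative sketch: masking zeroes out pruned entries without changing shapes.
import torch

weight = torch.randn(8, 8)
mask = torch.ones(8, 8)
mask[:, 4:] = 0                 # hypothetical pruned columns
masked_weight = weight * mask   # pruned weights become exact zeros

assert masked_weight.shape == weight.shape
print((masked_weight == 0).float().mean())  # fraction of zeroed entries
```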
# To use the model

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Shengkun/DarwinLM-4B-Mistral-Minitron-8B-Pruned-Masked")
```
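A minimal, illustrative generation example building on the snippet above (it assumes the tokenizer is available from the same repository):

```python
# Illustrative sketch: run a short generation with the masked checkpoint.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Shengkun/DarwinLM-4B-Mistral-Minitron-8B-Pruned-Masked")

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```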
# Evaluation (4.8B pruned model)

| Method | Avg | SciQ | PIQA | WG | ARC-E | ARC-C | HS | LogiQA | BoolQ | MMLU |
|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-Minitron-8B | 72.4 | 96.5 | 80.4 | 79.5 | 83.1 | 63.0 | 62.4 | 33.6 | 83.9 | 69.4 |
| DarwinLM 4.8B | 53.6 | 83.1 | 69.8 | 58.9 | 64.5 | 35.7 | 48.9 | 24.1 | 64.7 | 33.2 |
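The scores above can in principle be reproduced with EleutherAI's lm-evaluation-harness; the following is a rough sketch only, with task names and the `simple_evaluate` API assumed from harness v0.4:

```python
# Rough sketch (not from the model card): scoring the checkpoint on the
# benchmarks above with EleutherAI's lm-evaluation-harness (v0.4 API assumed).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Shengkun/DarwinLM-4B-Mistral-Minitron-8B-Pruned-Masked",
    tasks=["sciq", "piqa", "winogrande", "arc_easy", "arc_challenge",
           "hellaswag", "logiqa", "boolq", "mmlu"],
)
print(results["results"])
```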