Llama-3.1-8B General Model

This is a fine-tuned Llama-3.1-8B model specialized for general instruction-following tasks. The checkpoint was released alongside the paper at https://arxiv.org/abs/2509.11167.
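The checkpoint can presumably be loaded like any other Llama-family causal language model. The following is a minimal sketch, assuming the Hugging Face `transformers` library (plus `torch`, and `accelerate` for `device_map="auto"`) and the repository ID `pmahdavi/Llama-3.1-8B-general`. The helper names `load_model` and `generate` are illustrative and not part of this repository. Whether the tokenizer ships a chat template is not stated here, so the sketch passes a plain text prompt.

```python
# Repository ID as it appears on the model page.
MODEL_ID = "pmahdavi/Llama-3.1-8B-general"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and model in BF16 (hypothetical helper)."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # the weights are published in BF16
        device_map="auto",           # requires the `accelerate` package
    )
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for a plain-text prompt (hypothetical helper)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Explain gradient accumulation in one sentence.")` downloads the weights on first use, so expect roughly 16 GB of disk and memory for the 8B parameters in BF16.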

Model Details

  • Base model: Llama-3.1-8B
  • Training dataset: tulu3_mixture_general
  • Learning rate: 5e-06
  • Effective batch size: 128
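The effective batch size is the number of samples that contribute to one optimizer step, that is, per-device batch size × number of devices × gradient accumulation steps. Only the product (128) is given above. The per-device batch size and device count in the sketch below are hypothetical values chosen for illustration, not the settings actually used for this checkpoint.

```python
# Values stated in the model details.
LEARNING_RATE = 5e-6
EFFECTIVE_BATCH_SIZE = 128


def grad_accum_steps(effective_batch_size: int,
                     per_device_batch_size: int,
                     num_devices: int) -> int:
    """Return the accumulation steps needed to reach the effective batch size."""
    samples_per_forward = per_device_batch_size * num_devices
    if effective_batch_size % samples_per_forward != 0:
        raise ValueError(
            "effective batch size must be a multiple of "
            "per_device_batch_size * num_devices"
        )
    return effective_batch_size // samples_per_forward


# Hypothetical hardware setup: 8 devices with 4 samples each per forward pass.
steps = grad_accum_steps(EFFECTIVE_BATCH_SIZE, per_device_batch_size=4, num_devices=8)
print(steps)  # 4, since 4 * 8 * 4 = 128
```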

Export Files

This repository includes export files for state averaging and other advanced techniques.
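The format of the export files is not described here. As a generic illustration of what state averaging can mean, the sketch below takes an element-wise mean over several state dictionaries that share the same keys. It is an assumption about the general technique, not the procedure from the accompanying paper, and `average_states` is a hypothetical helper. It works on any values that support `+` and `/`, such as floats, NumPy arrays, or PyTorch tensors.

```python
from typing import Any, Dict, List


def average_states(states: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Element-wise mean of several state dictionaries with identical keys."""
    if not states:
        raise ValueError("need at least one state dictionary")
    keys = set(states[0])
    for state in states[1:]:
        if set(state) != keys:
            raise ValueError("all state dictionaries must share the same keys")
    averaged = {}
    for name in states[0]:
        values = [state[name] for state in states]
        # Start the sum from the first value so tensor-like types are supported.
        averaged[name] = sum(values[1:], values[0]) / len(values)
    return averaged


# Toy example with scalar "parameters" standing in for tensors.
merged = average_states([{"w": 1.0, "b": 2.0}, {"w": 3.0, "b": 4.0}])
print(merged)  # {'w': 2.0, 'b': 3.0}
```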

Format: Safetensors. Model size: 8B params. Tensor type: BF16.

Repository: pmahdavi/Llama-3.1-8B-general