cluster_prune_Llama-3.2-1B-Instruct_prune_0.5-sigmoid

This model is a Direct Preference Optimization (DPO) fine-tune of Llama-3.2-1B-Instruct trained with the cluster pruning method.

Model Details

  • Base Model: Llama-3.2-1B-Instruct
  • Training Method: cluster
  • Pruning Ratio: 0.5 (per the model name)
  • Training Date: 2025-09-15

Training Configuration

This model was trained using Direct Preference Optimization (DPO) with the following characteristics:

  • Method: cluster
  • Pruning applied during training
  • Fine-tuned on preference data
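The "sigmoid" suffix in the model name matches the standard sigmoid (logistic) DPO loss, which compares the policy's log-probability margin between chosen and rejected responses against a frozen reference model. A minimal sketch of that loss on per-example log-probabilities (the exact training code for this model is not published; `beta` is the usual DPO temperature, shown here with a typical default of 0.1):

```python
import math

def dpo_sigmoid_loss(policy_chosen_logp: float,
                     policy_rejected_logp: float,
                     ref_chosen_logp: float,
                     ref_rejected_logp: float,
                     beta: float = 0.1) -> float:
    """Sigmoid DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    # Reward margin implied by the policy, relative to the reference model
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically this is -log(sigmoid(logits))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When policy and reference agree, the margin is 0 and the loss is log(2)
print(dpo_sigmoid_loss(-1.0, -2.0, -1.0, -2.0))  # ≈ 0.6931
```

The loss shrinks as the policy assigns relatively more probability to the chosen response than the reference does, which is what pushes the fine-tuned model toward the preferred completions.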

Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "5456es/cluster_prune_Llama-3.2-1B-Instruct_prune_0.5-sigmoid"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens bounds only the generated tokens (unlike max_length,
# which also counts the prompt tokens)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Training Data

This model was trained on preference data using the DPO algorithm.
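DPO trainers (for example TRL's `DPOTrainer`) typically expect each example as a prompt paired with a chosen and a rejected completion. A hypothetical record assuming that common layout (the actual dataset used for this model is not documented):

```python
import json

# Hypothetical preference record; field values are illustrative only.
record = json.loads("""
{
  "prompt": "Explain model pruning in one sentence.",
  "chosen": "Pruning removes redundant weights to shrink a model while preserving most of its accuracy.",
  "rejected": "idk"
}
""")

# A DPO trainer consumes exactly these three fields per example
print(sorted(record))  # ['chosen', 'prompt', 'rejected']
```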

Limitations

This model inherits the limitations of its base model and may have additional limitations due to the pruning process.

Citation

If you use this model, please cite the original DPO paper and the base model.

Model Stats

  • Downloads last month: 44
  • Format: Safetensors
  • Model size: 1.24B params
  • Tensor type: BF16