cluster_prune_Llama-3.2-1B-Instruct_prune_0.5-sigmoid
This model is a DPO (Direct Preference Optimization) fine-tuned version of Llama-3.2-1B-Instruct using the cluster method.
Model Details
- Base Model: Llama-3.2-1B-Instruct
- Training Method: cluster
- Pruning Ratio: 0.5 (as indicated by the model name)
- Training Date: 2025-09-15
Training Configuration
This model was trained using Direct Preference Optimization (DPO) with the following characteristics:
- Method: cluster
- Pruning applied during training
- Fine-tuned on preference data
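As a rough illustration of this setup, the sketch below shows how a DPO fine-tune of this kind could be run with the `trl` library. The dataset name, hyperparameters, and the point where cluster pruning is applied are placeholders and assumptions, not the exact recipe used for this model.

```python
# Hypothetical sketch of DPO fine-tuning with trl; not the exact recipe used here.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model = "meta-llama/Llama-3.2-1B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Preference dataset with "prompt", "chosen", "rejected" columns (placeholder name).
train_dataset = load_dataset("your/preference-dataset", split="train")

# Cluster-based pruning would be applied to the model at this point
# (method-specific, not shown here).

args = DPOConfig(
    output_dir="cluster_prune_Llama-3.2-1B-Instruct",
    beta=0.1,  # DPO temperature; assumed value
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,              # reference model is created automatically when not passed
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # use tokenizer=... on older trl versions
)
trainer.train()
```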
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "5456es/cluster_prune_Llama-3.2-1B-Instruct_prune_0.5-sigmoid"

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
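Because the base model is an instruct-tuned Llama 3.2 checkpoint, chat-style prompting through the tokenizer's chat template is likely the intended usage. The sketch below reuses the `tokenizer` and `model` loaded above and assumes the chat template is inherited unchanged from the base model.

```python
# Chat-style prompting via the tokenizer's chat template (assumed to be
# inherited from Llama-3.2-1B-Instruct).
messages = [
    {"role": "user", "content": "Explain Direct Preference Optimization in one sentence."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```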
Training Data
This model was trained on preference data using the DPO algorithm.
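DPO preference data pairs each prompt with a preferred ("chosen") and a dispreferred ("rejected") response. The record below is a purely illustrative example of that format, not a sample from the actual training set.

```python
# Illustrative (hypothetical) DPO preference record; not from the actual training data.
preference_example = {
    "prompt": "Summarize the benefits of model pruning.",
    "chosen": "Pruning removes redundant weights, shrinking the model and speeding up inference with little quality loss.",
    "rejected": "Pruning makes the model bigger and slower.",
}
```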
Limitations
This model inherits the limitations of its base model and may have additional limitations due to the pruning process.
Citation
If you use this model, please cite the original DPO paper and the base model.