cluster_prune_Llama-3.2-1B-Instruct_prune_0.5-sigmoid
This model is a DPO (Direct Preference Optimization) fine-tuned version of Llama-3.2-1B-Instruct using the cluster method.
Model Details
- Base Model: Llama-3.2-1B-Instruct
- Training Method: cluster
- Pruning Ratio: 0.5 (as indicated by the model name)
- Training Date: 2025-09-15
Training Configuration
This model was trained using Direct Preference Optimization (DPO) with the following characteristics:
- Method: cluster
- Pruning applied during training
- Fine-tuned on preference data
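As a rough illustration of this setup, the sketch below shows how a DPO fine-tune of this kind could be run with the `trl` library. The dataset name, hyperparameters, and the point where cluster pruning is applied are placeholders and assumptions, not the exact recipe used for this model.

```python
# Hypothetical sketch of DPO fine-tuning with trl; not the exact recipe used here.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model = "meta-llama/Llama-3.2-1B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Preference dataset with "prompt", "chosen", "rejected" columns (placeholder name).
train_dataset = load_dataset("your/preference-dataset", split="train")

# Cluster-based pruning would be applied to the model at this point
# (method-specific, not shown here).

args = DPOConfig(
    output_dir="cluster_prune_Llama-3.2-1B-Instruct",
    beta=0.1,  # DPO temperature; assumed value
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,              # reference model is created automatically when not passed
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # use tokenizer=... on older trl versions
)
trainer.train()
```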
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "5456es/cluster_prune_Llama-3.2-1B-Instruct_prune_0.5-sigmoid"

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
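Because the base model is an instruct-tuned Llama 3.2 checkpoint, chat-style prompting through the tokenizer's chat template is likely the intended usage. The sketch below reuses the `tokenizer` and `model` loaded above and assumes the chat template is inherited unchanged from the base model.

```python
# Chat-style prompting via the tokenizer's chat template (assumed to be
# inherited from Llama-3.2-1B-Instruct).
messages = [
    {"role": "user", "content": "Explain Direct Preference Optimization in one sentence."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```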
Training Data
This model was trained on preference data using the DPO algorithm.
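DPO preference data pairs each prompt with a preferred ("chosen") and a dispreferred ("rejected") response. The record below is a purely illustrative example of that format, not a sample from the actual training set.

```python
# Illustrative (hypothetical) DPO preference record; not from the actual training data.
preference_example = {
    "prompt": "Summarize the benefits of model pruning.",
    "chosen": "Pruning removes redundant weights, shrinking the model and speeding up inference with little quality loss.",
    "rejected": "Pruning makes the model bigger and slower.",
}
```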
Limitations
This model inherits the limitations of its base model and may have additional limitations due to the pruning process.
Citation
If you use this model, please cite the original DPO paper and the base model.