# Model Card for smolvlm-trl-dpo
This model is a DPO fine-tune of [HuggingFaceTB/SmolVLM-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM-Instruct) on the ChartQA dataset, trained with [TRL](https://github.com/huggingface/trl).
## Quick start
Since SmolVLM is a vision-language model, run it through the `image-text-to-text` pipeline and pass an image alongside the prompt:

```python
from transformers import pipeline

generator = pipeline("image-text-to-text", model="emretmrk/smolvlm-trl-dpo", device="cuda")
question = "Provide an intricate description of every entity in the image."
messages = [{"role": "user", "content": [
    {"type": "image", "url": "chart.png"},  # replace with the path or URL of your own image
    {"type": "text", "text": question},
]}]
output = generator(text=messages, max_new_tokens=1024, return_full_text=False)[0]
print(output["generated_text"])
```
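If you want more control over dtype and decoding, the checkpoint can also be loaded directly with `AutoModelForVision2Seq` and `AutoProcessor`. The sketch below follows the usage pattern of the base SmolVLM-Instruct card; `chart.png` is a placeholder image path, not a file shipped with this model.

```python
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "emretmrk/smolvlm-trl-dpo"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

# Build a chat prompt that references one image, then bundle prompt and pixels together
image = Image.open("chart.png")  # placeholder: any chart image
messages = [{"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "Describe the chart."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to("cuda")

generated_ids = model.generate(**inputs, max_new_tokens=1024)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```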
## Training procedure
The model was fine-tuned with Direct Preference Optimization (DPO), which trains on preference pairs (a chosen and a rejected answer per prompt), using the ChartQA dataset. A minimal sketch of the setup is shown below.
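The snippet is an illustration of a DPO run with TRL's `DPOTrainer` for a vision-language model; the dataset id, column layout, and hyperparameters are assumptions for illustration, not the exact configuration used to produce this checkpoint.

```python
from datasets import load_dataset
from transformers import AutoModelForVision2Seq, AutoProcessor
from trl import DPOConfig, DPOTrainer

# Base vision-language model and its processor
model_id = "HuggingFaceTB/SmolVLM-Instruct"
model = AutoModelForVision2Seq.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder id: a ChartQA-derived preference dataset with
# "images", "prompt", "chosen" and "rejected" columns
dataset = load_dataset("your-user/chartqa-dpo-pairs", split="train")

training_args = DPOConfig(output_dir="smolvlm-trl-dpo", bf16=True)
trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=processor,
)
trainer.train()
```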
### Framework versions
- TRL: 0.20.0
- Transformers: 4.54.1
- PyTorch: 2.6.0+cu124
- Datasets: 4.0.0
- Tokenizers: 0.21.2
## Model tree for emretmrk/smolvlm-trl-dpo

- Base model: [HuggingFaceTB/SmolVLM-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM-Instruct), whose language backbone derives from [HuggingFaceTB/SmolLM2-1.7B-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct) and [HuggingFaceTB/SmolLM2-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B).