# Model Card for smolvlm-trl-dpo
This model is a DPO fine-tune of [HuggingFaceTB/SmolVLM-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM-Instruct) on the ChartQA dataset, trained with [TRL](https://github.com/huggingface/trl).
## Quick start
Since SmolVLM is a vision-language model, run it through the `image-text-to-text` pipeline and pass an image alongside the prompt:

```python
from transformers import pipeline

generator = pipeline("image-text-to-text", model="emretmrk/smolvlm-trl-dpo", device="cuda")
question = "Provide an intricate description of every entity in the image."
messages = [{"role": "user", "content": [
    {"type": "image", "url": "chart.png"},  # replace with the path or URL of your own image
    {"type": "text", "text": question},
]}]
output = generator(text=messages, max_new_tokens=1024, return_full_text=False)[0]
print(output["generated_text"])
```
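If you want more control over dtype and decoding, the checkpoint can also be loaded directly with `AutoModelForVision2Seq` and `AutoProcessor`. The sketch below follows the usage pattern of the base SmolVLM-Instruct card; `chart.png` is a placeholder image path, not a file shipped with this model.

```python
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "emretmrk/smolvlm-trl-dpo"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

# Build a chat prompt that references one image, then bundle prompt and pixels together
image = Image.open("chart.png")  # placeholder: any chart image
messages = [{"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "Describe the chart."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to("cuda")

generated_ids = model.generate(**inputs, max_new_tokens=1024)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```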
## Training procedure
The model was fine-tuned with Direct Preference Optimization (DPO), which trains on preference pairs (a chosen and a rejected answer per prompt), using the ChartQA dataset. A minimal sketch of the setup is shown below.
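The snippet is an illustration of a DPO run with TRL's `DPOTrainer` for a vision-language model; the dataset id, column layout, and hyperparameters are assumptions for illustration, not the exact configuration used to produce this checkpoint.

```python
from datasets import load_dataset
from transformers import AutoModelForVision2Seq, AutoProcessor
from trl import DPOConfig, DPOTrainer

# Base vision-language model and its processor
model_id = "HuggingFaceTB/SmolVLM-Instruct"
model = AutoModelForVision2Seq.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder id: a ChartQA-derived preference dataset with
# "images", "prompt", "chosen" and "rejected" columns
dataset = load_dataset("your-user/chartqa-dpo-pairs", split="train")

training_args = DPOConfig(output_dir="smolvlm-trl-dpo", bf16=True)
trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=processor,
)
trainer.train()
```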
### Framework versions
- TRL: 0.20.0
- Transformers: 4.54.1
- PyTorch: 2.6.0+cu124
- Datasets: 4.0.0
- Tokenizers: 0.21.2
## Model tree for emretmrk/smolvlm-trl-dpo

- Base model: [HuggingFaceTB/SmolVLM-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM-Instruct), whose language backbone derives from [HuggingFaceTB/SmolLM2-1.7B-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct) and [HuggingFaceTB/SmolLM2-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B).