SHIFT-D3.5-Qwen2.5-VL-3B-ArtCurator-DE
This LoRA-fine-tuned model generates detailed descriptions of cultural-heritage artworks for blind users in Serbian.
Usage
from transformers import QwenForCausalLM, QwenProcessor
processor = QwenProcessor.from_pretrained("JoseferEins/SHIFT-D3.5-Qwen2.5-VL-3B-ArtCurator-SR")
model = QwenForCausalLM.from_pretrained("JoseferEins/SHIFT-D3.5-Qwen2.5-VL-3B-ArtCurator-SR")
inputs = processor(
images="path/to/artwork.jpg",
text="Опиши ову слику детаљно / Opiši ovu sliku detaljno.",
return_tensors="pt"
)
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))