Model Card
This is Qwen2-VL 2B, fine-tuned for OCR/HTR with Spanish language historical documents using data from neulab/PangeaInstruct. Each image has a red box around an area of text in the image. The model is asked to return the text inside.
For the training data see
- Pangea (task_data_vmultilingual_cc_news_es_curated.tar)
- apjanco/fmb_primera_muestra_redboxes
Model Details
This is the model card of a 🤗 transformers model that has been pushed on the Hub.
- Developed by: Andrew Janco
- Model type: Qwen2-VL
- Language(s) (NLP): Spanish
- License: MIT
- Finetuned from model [optional]: Qwen2-VL 2B
Uses
This model is part of experiments to extract text from historical handwritten documents.
- Downloads last month
- 13
Inference Providers
NEW
This model is not currently available via any of the supported third-party Inference Providers, and
the HF Inference API does not support transformers models with pipeline type image-text-to-text