Post
47
Try out the demo for Multimodal OCR featuring the implementation of models including
🤗Multimodal OCR Space : prithivMLmods/Multimodal-OCR
📦The models implemented in this Space are:
+ RolmOCR : reducto/RolmOCR
+ Qwen2VL OCR : prithivMLmods/Qwen2-VL-OCR-2B-Instruct
+ Qwen2VL OCR2 : prithivMLmods/Qwen2-VL-OCR2-2B-Instruct
RolmOCR
and Qwen2VL OCR
. The use case showcases image-to-text-to-text conversion and video understanding support for the RolmOCR model ! 🚀🤗Multimodal OCR Space : prithivMLmods/Multimodal-OCR
📦The models implemented in this Space are:
+ RolmOCR : reducto/RolmOCR
+ Qwen2VL OCR : prithivMLmods/Qwen2-VL-OCR-2B-Instruct
[ or ]
+ Qwen2VL OCR2 : prithivMLmods/Qwen2-VL-OCR2-2B-Instruct
Qwen2VL OCR supports only image-text-to-text in the space.