Scalable Vision Language Model Training via High Quality Data Curation Paper • 2501.05952 • Published Jan 10 • 1
ColQwen2 Models Collection Pre-trained checkpoints for the ColQwen2 model. • 4 items • Updated 29 days ago • 2
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 26 days ago • 359
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning Paper • 2502.01100 • Published 18 days ago • 15
Question Answering on Patient Medical Records with Private Fine-Tuned LLMs Paper • 2501.13687 • Published 29 days ago • 8
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 97
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model Paper • 2501.05122 • Published Jan 9 • 20
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception Paper • 2410.12628 • Published Oct 16, 2024 • 35
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 83
multilingual vision models Collection Some papers I read for understanding vision models and also adding multilingual capabilities to them • 14 items • Updated Dec 11, 2024 • 2
Maya: An Instruction Finetuned Multilingual Multimodal Model Paper • 2412.07112 • Published Dec 10, 2024 • 27
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published Dec 5, 2024 • 60
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 127
Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • May 22, 2024 • 25