SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 17 days ago • 187
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published Dec 5, 2024 • 60
Biomedical Collection Models for biomedical research applications, such as radiology report generation and biomedical language understanding. • 9 items • Updated Jan 8 • 10
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 73
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 109
Article OCR Processing and Text in Image Analysis with Florence-2-base and Qwen2-VL-2B By PandorAI1995 • Oct 18, 2024 • 14
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 11 days ago • 296
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 570
AI Paper of the Day Collection A collection of papers that I think are interesting, one added each day • 298 items • Updated 3 days ago • 36
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning Paper • 2408.07931 • Published Aug 15, 2024 • 21
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16, 2024 • 98