Article nanoVLM: The simplest repository to train your VLM in pure PyTorch By ariG23498 and 6 others • 17 days ago • 138
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • 26 days ago • 417
Summarization of Multimodal Presentations Collection Resources related to summarization of multimodal presentations. • 6 items • Updated Apr 25
Perception Encoder: The best visual embeddings are not at the output of the network Paper • 2504.13181 • Published Apr 17 • 34
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing Paper • 2206.15076 • Published Jun 30, 2022 • 4
Summarization of Multimodal Presentations with Vision-Language Models: Study of the Effect of Modalities and Structure Paper • 2504.10049 • Published Apr 14 • 3