Article You could have designed state of the art positional encoding Nov 25, 2024 • 127
Whisper Collection OpenAI Whisper speech recognition models in MLX format • 48 items • Updated Oct 1, 2024 • 26
What matters when building vision-language models? Paper • 2405.02246 • Published May 3, 2024 • 102
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6, 2024 • 91
Zero-Shot Detection and Segmentation Collection Demos of projects focused on zero-shot detection and segmentation. • 4 items • Updated Feb 7, 2024 • 3