Solving Inverse Problems with FLAIR 🎨 Restore and enhance images using text prompts • Running on Zero • 34
Article Open-Source Handwritten Signature Detection Model By samuellimabraz • Mar 14 • 114
Multimodal Autoregressive Pre-training of Large Vision Encoders Paper • 2411.14402 • Published Nov 21, 2024 • 47
AIMv2 Collection A collection of AIMv2 vision encoders covering several resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 80
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published Sep 25, 2024 • 122
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated 6 days ago • 222
LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. • 11 items • Updated May 1 • 60
google/siglip-so400m-patch14-384 Zero-Shot Image Classification • 0.9B • Updated Sep 26, 2024 • 2.08M • 576
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos Paper • 2409.02095 • Published Sep 3, 2024 • 37