
Andres Marafioti

andito

AI & ML interests

Multimodal models, VLM and TTS

Recent Activity

reacted to merve's post with 🤗 1 day ago
So many open releases at Hugging Face past week 🤯 recapping all here ⤵️ https://huggingface.co/collections/merve/march-21-releases-67dbe10e185f199e656140ae

👀 Multimodal
> Mistral AI released a 24B vision LM, both base and instruction FT versions, sota 🔥 (OS)
> with IBM we released SmolDocling, a sota 256M document parser with Apache 2.0 license (OS)
> SpatialLM is a new vision LM that outputs 3D bounding boxes, comes with 0.5B (QwenVL based) and 1B (Llama based) variants
> SkyWork released SkyWork-R1V-38B, new vision reasoning model (OS)

💬 LLMs
> NVIDIA released new Nemotron models in 49B and 8B with their post-training dataset
> LG released EXAONE, new reasoning models in 2.4B, 7.8B and 32B
> Dataset: Glaive AI released a new reasoning dataset of 22M+ examples
> Dataset: NVIDIA released new helpfulness dataset HelpSteer3
> Dataset: OpenManusRL is a new agent dataset based on ReAct framework (OS)
> Open-R1 team released OlympicCoder, new competitive coder model in 7B and 32B
> Dataset: GeneralThought-430K is a new reasoning dataset (OS)

🖼️ Image Generation/Computer Vision
> Roboflow released RF-DETR, new real-time sota object detector (OS) 🔥
> YOLOE is a new real-time zero-shot object detector with text and visual prompts 🥹
> Stability AI released Stable Virtual Camera, a new novel view synthesis model
> Tencent released Hunyuan3D-2mini, new small and fast 3D asset generation model
> ByteDance released InfiniteYou, new realistic photo generation model
> StarVector is a new 8B model that generates svg from images
> FlexWorld is a new model that expands 3D views (OS)

🎤 Audio
> Sesame released CSM-1B new speech generation model (OS)

🤖 Robotics
> NVIDIA released GR00T, new robotics model for generalized reasoning and skills, along with the dataset

*OS ones have Apache 2.0 or MIT license
liked a model 4 days ago
HuggingFaceM4/idefics-80b-instruct
liked a model 6 days ago
HuggingFaceTB/SmolLM2-135M

Organizations

Hugging Face, HuggingFaceM4, Huggingface Projects, Hugging Face H4, Hugging Face OSS Metrics, Hugging Face Smol Models Research, MLX Community, Distillation Hugs, Argilla Warehouse, Hugging Face FineVideo, smol-explorers, Hugging Face Science, Open R1, Smolvencoder

andito's activity

upvoted an article 21 days ago

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

• 69
upvoted an article about 1 month ago

SmolVLM2: Bringing Video Understanding to Every Device

• 215
upvoted 3 articles about 2 months ago

Open-source DeepResearch – Freeing our search agents

• 1.19k

Fixing Gradient Accumulation

• 52

We now support VLMs in smolagents!

• 97
upvoted an article 2 months ago

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

• 165
upvoted an article 2 months ago

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

• 31
upvoted 3 articles 5 months ago

Llama 3.2 in Keras

• 12

Welcome, Gradio 5

• 128

Tool Use, Unified

• 92
upvoted 2 articles 6 months ago

FineVideo: behind the scenes

• 30

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

• 74