Andres Marafioti

andito

AI & ML interests

Multimodal models, VLM and TTS

Recent Activity

liked a model about 13 hours ago

HuggingFaceTB/SmolLM2-135M-Instruct

updated a model about 15 hours ago

andito/nanoVLM

liked a Space about 18 hours ago

webml-community/conversational-webgpu

andito's activity

New activity in lerobot/smolvla_base 2 days ago

Add library name and link to code

#2 opened 3 days ago by

New activity in lerobot/svla_so100_sorting 4 days ago

added paper

#2 opened 4 days ago by

add citation

#1 opened 4 days ago by

commented 2 papers 4 days ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published 4 days ago • 74

New activity in google/paligemma-3b-mix-448 10 days ago

Fix inference code

#11 opened 10 days ago by

New activity in HuggingFaceTB/SmolVLM-Instruct 23 days ago

how to get smolvlm working in ollama?

#27 opened 4 months ago by

commented 2 papers about 1 month ago

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 108

commented a paper about 2 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 188

commented a paper 2 months ago

Slow-Fast Architecture for Video Multi-Modal Large Language Models

Paper • 2504.01328 • Published Apr 2 • 8

New activity in HuggingFaceTB/SmolVLM-Instruct 2 months ago

How many parameters are there in the model?

#26 opened 5 months ago by

commented 2 papers 2 months ago

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

Paper • 2504.00595 • Published Apr 1 • 36

commented 2 papers 3 months ago

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 108

New activity in HuggingFaceTB/SmolVLM-256M-Instruct 4 months ago

Add ONNX sample code

#8 opened 4 months ago by

Upload photo_2025-01-25_13-45-22.jpg

#5 opened 4 months ago by

There is an issue with AutoProcessor

#6 opened 4 months ago by

New activity in HuggingFaceTB/SmolVLM-500M-Instruct 5 months ago

Upload ONNX weights

#1 opened 5 months ago by