SomosNLP

non-profit

https://somosnlp.org/

SomosNLP_

somosnlp


AI & ML interests

Democratize NLP in Spanish and encourage its application to generate social impact 💛

Recent Activity

mariagrandury authored a paper 12 days ago

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

mariagrandury authored a paper 12 days ago

It's the same but not the same: Do LLMs distinguish Spanish varieties?

reddrex new activity 25 days ago

somosnlp/LingComp_QA:How use the dataset to train my model GPT


somosnlp's activity

DrishtiSharma

authored a paper 16 days ago

Behind Maya: Building a Multilingual Vision Language Model

Paper • 2505.08910 • Published 24 days ago • 1

reddrex

in somosnlp/LingComp_QA 25 days ago

How use the dataset to train my model GPT

#1 opened 25 days ago by luisaarias

ouhenio

updated a Space about 2 months ago

Mapa Blend-es

🌍

Check the collective progress of blend-es 😊

DrishtiSharma

authored a paper about 2 months ago

Robust and Fine-Grained Detection of AI Generated Texts

Paper • 2504.11952 • Published Apr 16 • 12

pcuenq

authored a paper about 2 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 188

osanseviero

authored a paper 2 months ago

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25 • 51

alvarobartt

posted an update 3 months ago


🔥 Agents can do anything! @microsoft Research just announced the release of Magma 8B!

Magma is a new Visual Language Model (VLM) with 8B parameters for multi-modal agents, designed to handle complex interactions across virtual and real environments, and it's MIT-licensed!

Magma comes with exciting new features such as:
- Introduces the Set-of-Mark and Trace-of-Mark techniques for fine-tuning
- Leverages a large amount of unlabeled video data to learn spatial-temporal grounding and planning
- Strong generalization and the ability to be fine-tuned for other agentic tasks
- SOTA in different multi-modal benchmarks spanning UI navigation, robotics manipulation, image / video understanding, and spatial understanding and reasoning
- Generates goal-driven visual plans and actions for agentic use cases

Model: microsoft/Magma-8B
Technical Report: Magma: A Foundation Model for Multimodal AI Agents (2502.13130)