AI & ML interests

Machine learning, deep learning, generative AI, LLMs

Recent Activity

salma-remyx
updated a Space about 11 hours ago
salma-remyx
in remyxai/SpaceOm 23 days ago

How to Train the Model?

#1 opened 23 days ago by qhz991029
salma-remyx
posted an update 24 days ago
I'm auto-generating Docker images to smoke-test new research repos 🔥
Shared to Docker Hub daily! 🐳

Today's featured paper and image:
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs

https://hub.docker.com/repository/docker/remyxai/2506.21862v1/general
salma-remyx
posted an update about 1 month ago
When multiple benchmarks yield conflicting model rankings, how do you know which model to trust?

In this Substack post, we explore that question in the context of spatial reasoning capabilities, as seen through three new benchmarks.

Read more: https://remyxai.substack.com/p/benchmark-fusion
salma-remyx
posted an update 3 months ago
SpaceThinker-Qwen2.5VL-3B shows a 3B VLM can compete with closed, frontier APIs in quantitative spatial reasoning, a key capability for embodied AI applications like drones and robotics.

Check out how it stacks up against Gemini and OpenAI on Q-Spatial-Bench in the model card, which includes .gguf weights, a Colab quickstart, and Docker images.


SpaceThinker adopts the Qwen2.5VL-3B architecture and is fine-tuned on the SpaceThinker dataset of synthetic spatial reasoning traces created with VQASynth.

This model builds on the SpaceLLaVA series of VLMs, which were fine-tuned on synthetic data for enhanced spatial reasoning, by adding test-time compute for multimodal thinking.

Model: remyxai/SpaceThinker-Qwen2.5VL-3B
Dataset: remyxai/SpaceThinker
Space: remyxai/SpaceThinker-Qwen2.5VL-3B
Code: https://github.com/remyxai/VQASynth
Discussion: open-r1/README#10
