- One-RL-to-See-Them-All/Orsta-7B
  Image-Text-to-Text • Updated • 626 • 7
- One-RL-to-See-Them-All/Orsta-32B-0321
  Image-Text-to-Text • Updated • 18
- One-RL-to-See-Them-All/Orsta-32B-0326
  Image-Text-to-Text • Updated • 125 • 4
- One RL to See Them All: Visual Triple Unified Reinforcement Learning
  Paper • 2505.18129 • Published • 59
Collections including paper arxiv:2505.18129
- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 29
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 13
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23
- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 28
- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 30
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 122
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
  Paper • 2412.12098 • Published • 5
- MLLM-as-a-Judge for Image Safety without Human Labeling
  Paper • 2501.00192 • Published • 31
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
  Paper • 2501.00958 • Published • 107
- Xmodel-2 Technical Report
  Paper • 2412.19638 • Published • 27
- HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
  Paper • 2412.18925 • Published • 105
- BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
  Paper • 2505.09568 • Published • 90
- Qwen3 Technical Report
  Paper • 2505.09388 • Published • 182
- GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
  Paper • 2505.11049 • Published • 59
- Emerging Properties in Unified Multimodal Pretraining
  Paper • 2505.14683 • Published • 129
- Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
  Paper • 2505.02567 • Published • 74
- TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations
  Paper • 2505.18125 • Published • 109
- Distilling LLM Agent into Small Models with Retrieval and Code Tools
  Paper • 2505.17612 • Published • 76
- One RL to See Them All: Visual Triple Unified Reinforcement Learning
  Paper • 2505.18129 • Published • 59
- CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
  Paper • 2504.13161 • Published • 92
- Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
  Paper • 2402.11984 • Published
- BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling
  Paper • 2503.06121 • Published • 5
- Timer: Transformers for Time Series Analysis at Scale
  Paper • 2402.02368 • Published