OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement Paper • 2503.17352 • Published Mar 21, 2025 • 23 upvotes
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published Mar 20, 2025 • 73 upvotes
CoMP: Continual Multimodal Pre-training for Vision Foundation Models Paper • 2503.18931 • Published Mar 2025 • 30 upvotes
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding Paper • 2503.13964 • Published Mar 18, 2025 • 19 upvotes
ViLBench: A Suite for Vision-Language Process Reward Modeling Paper • 2503.20271 • Published Mar 2025 • 7 upvotes
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme Paper • 2504.02587 • Published Apr 2025 • 30 upvotes
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement Paper • 2504.07934 • Published Apr 2025 • 17 upvotes